PHP text extraction: how do I extract the text content of a Word document with PHP?

Published: 2024/3/12 · php · 豆豆

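I want to extract the text content of a Word document with PHP.

I created a new Word document in Microsoft Word for Mac 2011.

Edit: I also tested with the same document created in Microsoft Word on Windows 7.

The content of the file is: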

The quick brown fox jumps over the lazy dog

I saved it to disk as a Word 97-2004 document (.doc).

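<?php
// Assumes PHPWord was installed with Composer; adjust the autoloader path if not.
require 'vendor/autoload.php';

$source = "word.doc";

// Load the binary Word 97-2004 file with the MsDoc reader.
$phpWord = \PhpOffice\PhpWord\IOFactory::load($source, 'MsDoc');

$text = '';

// Walk every top-level element of every section.
foreach ($phpWord->getSections() as $s) {
    foreach ($s->getElements() as $e) {
        if (get_class($e) === 'PhpOffice\PhpWord\Element\Text') {
            $text .= $e->getText();
        // TextBreak lives in the Element namespace, like the Text class in the dump below.
        } elseif (get_class($e) === 'PhpOffice\PhpWord\Element\TextBreak') {
            $text .= " \n";
        } else {
            throw new Exception('Unknown class type ' . get_class($e));
        }
    }
}

print $text;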

The output of this code is only part of the text:

The quick brown fox j

Is there something wrong with the code, or is this some kind of compatibility issue?
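
One quick way to see where the cut falls is to compare the expected and the returned string directly. This is a minimal sketch that uses only the two strings quoted above; the variable names are illustrative and nothing here calls PHPWord.

```php
<?php
// Expected document content and the text actually printed (both quoted above).
$expected = 'The quick brown fox jumps over the lazy dog';
$actual   = 'The quick brown fox j';

// If the output is a strict prefix of the expected text, nothing was
// reordered or corrupted; the tail was simply never returned.
$isPrefix = strncmp($expected, $actual, strlen($actual)) === 0;

printf(
    "expected %d bytes, got %d bytes, prefix: %s\n",
    strlen($expected),
    strlen($actual),
    $isPrefix ? 'yes' : 'no'
);

// The part of the sentence that was lost.
$missing = substr($expected, strlen($actual));
echo 'missing: "' . $missing . '"' . "\n";
```

A clean prefix like this points at text being cut off while the file is read, not at garbled decoding.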

Edit:

If I add a var_dump($els); just before foreach ($els as $e) {, the output is this:

array(1) {

[0]=>

object(PhpOffice\PhpWord\Element\Text)#1265 (14) {

["text":protected]=>

string(21) "The quick brown fox j"

["fontStyle":protected]=>

object(PhpOffice\PhpWord\Style\Font)#1267 (25) {

["aliases":protected]=>

array(1) {

["line-height"]=>

string(10) "lineHeight"

}

["type":"PhpOffice\PhpWord\Style\Font":private]=>

string(4) "text"

["name":"PhpOffice\PhpWord\Style\Font":private]=>

NULL

["hint":"PhpOffice\PhpWord\Style\Font":private]=>

NULL

["size":"PhpOffice\PhpWord\Style\Font":private]=>

NULL

["color":"PhpOffice\PhpWord\Style\Font":private]=>

NULL

["bold":"PhpOffice\PhpWord\Style\Font":private]=>

bool(false)

["italic":"PhpOffice\PhpWord\Style\Font":private]=>

bool(false)

["underline":"PhpOffice\PhpWord\Style\Font":private]=>

string(4) "none"

["superScript":"PhpOffice\PhpWord\Style\Font":private]=>

bool(false)

["subScript":"PhpOffice\PhpWord\Style\Font":private]=>

bool(false)

["strikethrough":"PhpOffice\PhpWord\Style\Font":private]=>

bool(false)

["doubleStrikethrough":"PhpOffice\PhpWord\Style\Font":private]=>

bool(false)

["smallCaps":"PhpOffice\PhpWord\Style\Font":private]=>

bool(false)

["allCaps":"PhpOffice\PhpWord\Style\Font":private]=>

bool(false)

["fgColor":"PhpOffice\PhpWord\Style\Font":private]=>

NULL

["scale":"PhpOffice\PhpWord\Style\Font":private]=>

NULL

["spacing":"PhpOffice\PhpWord\Style\Font":private]=>

NULL

["kerning":"PhpOffice\PhpWord\Style\Font":private]=>

NULL

["paragraph":"PhpOffice\PhpWord\Style\Font":private]=>

object(PhpOffice\PhpWord\Style\Paragraph)#1266 (26) {

["aliases":protected]=>

array(1) {

["line-height"]=>

string(10) "lineHeight"

}

["basedOn":"PhpOffice\PhpWord\Style\Paragraph":private]=>

string(6) "Normal"

["next":"PhpOffice\PhpWord\Style\Paragraph":private]=>

NULL

["alignment":"PhpOffice\PhpWord\Style\Paragraph":private]=>

string(0) ""

["indentation":"PhpOffice\PhpWord\Style\Paragraph":private]=>

NULL

["spacing":"PhpOffice\PhpWord\Style\Paragraph":private]=>

NULL

["lineHeight":"PhpOffice\PhpWord\Style\Paragraph":private]=>

NULL

["widowControl":"PhpOffice\PhpWord\Style\Paragraph":private]=>

bool(true)

["keepNext":"PhpOffice\PhpWord\Style\Paragraph":private]=>

bool(false)

["keepLines":"PhpOffice\PhpWord\Style\Paragraph":private]=>

bool(false)

["pageBreakBefore":"PhpOffice\PhpWord\Style\Paragraph":private]=>

bool(false)

["numStyle":"PhpOffice\PhpWord\Style\Paragraph":private]=>

NULL

["numLevel":"PhpOffice\PhpWord\Style\Paragraph":private]=>

int(0)

["tabs":"PhpOffice\PhpWord\Style\Paragraph":private]=>

array(0) {

}

["shading":"PhpOffice\PhpWord\Style\Paragraph":private]=>

NULL

["borderTopSize":protected]=>

NULL

["borderTopColor":protected]=>

NULL

["borderLeftSize":protected]=>

NULL

["borderLeftColor":protected]=>

NULL

["borderRightSize":protected]=>

NULL

["borderRightColor":protected]=>

NULL

["borderBottomSize":protected]=>

NULL

["borderBottomColor":protected]=>

NULL

["styleName":protected]=>

NULL

["index":protected]=>

NULL

["isAuto":"PhpOffice\PhpWord\Style\AbstractStyle":private]=>

bool(false)

}

["shading":"PhpOffice\PhpWord\Style\Font":private]=>

NULL

["rtl":"PhpOffice\PhpWord\Style\Font":private]=>

bool(false)

["styleName":protected]=>

NULL

["index":protected]=>

NULL

["isAuto":"PhpOffice\PhpWord\Style\AbstractStyle":private]=>

bool(false)

}

["paragraphStyle":protected]=>

object(PhpOffice\PhpWord\Style\Paragraph)#1266 (26) {

["aliases":protected]=>

array(1) {

["line-height"]=>

string(10) "lineHeight"

}

["basedOn":"PhpOffice\PhpWord\Style\Paragraph":private]=>

string(6) "Normal"

["next":"PhpOffice\PhpWord\Style\Paragraph":private]=>

NULL

["alignment":"PhpOffice\PhpWord\Style\Paragraph":private]=>

string(0) ""

["indentation":"PhpOffice\PhpWord\Style\Paragraph":private]=>

NULL

["spacing":"PhpOffice\PhpWord\Style\Paragraph":private]=>

NULL

["lineHeight":"PhpOffice\PhpWord\Style\Paragraph":private]=>

NULL

["widowControl":"PhpOffice\PhpWord\Style\Paragraph":private]=>

bool(true)

["keepNext":"PhpOffice\PhpWord\Style\Paragraph":private]=>

bool(false)

["keepLines":"PhpOffice\PhpWord\Style\Paragraph":private]=>

bool(false)

["pageBreakBefore":"PhpOffice\PhpWord\Style\Paragraph":private]=>

bool(false)

["numStyle":"PhpOffice\PhpWord\Style\Paragraph":private]=>

NULL

["numLevel":"PhpOffice\PhpWord\Style\Paragraph":private]=>

int(0)

["tabs":"PhpOffice\PhpWord\Style\Paragraph":private]=>

array(0) {

}

["shading":"PhpOffice\PhpWord\Style\Paragraph":private]=>

NULL

["borderTopSize":protected]=>

NULL

["borderTopColor":protected]=>

NULL

["borderLeftSize":protected]=>

NULL

["borderLeftColor":protected]=>

NULL

["borderRightSize":protected]=>

NULL

["borderRightColor":protected]=>

NULL

["borderBottomSize":protected]=>

NULL

["borderBottomColor":protected]=>

NULL

["styleName":protected]=>

NULL

["index":protected]=>

NULL

["isAuto":"PhpOffice\PhpWord\Style\AbstractStyle":private]=>

bool(false)

}

["phpWord":protected]=>

object(PhpOffice\PhpWord\PhpWord)#1247 (3) {

["sections":"PhpOffice\PhpWord\PhpWord":private]=>

array(1) {

[0]=>

object(PhpOffice\PhpWord\Element\Section)#1261 (16) {

["container":protected]=>

string(7) "Section"

["style":"PhpOffice\PhpWord\Element\Section":private]=>

object(PhpOffice\PhpWord\Style\Section)#1262 (28) {

["orientation":"PhpOffice\PhpWord\Style\Section":private]=>

string(8) "portrait"

["paper":"PhpOffice\PhpWord\Style\Section":private]=>

object(PhpOffice\PhpWord\Style\Paper)#1263 (8) {

["sizes":"PhpOffice\PhpWord\Style\Paper":private]=>

array(6) {

["A3"]=>

array(3) {

[0]=>

int(297)

[1]=>

int(420)

[2]=>

string(2) "mm"

}

["A4"]=>

array(3) {

[0]=>

int(210)

[1]=>

int(297)

[2]=>

string(2) "mm"

}

["A5"]=>

array(3) {

[0]=>

int(148)

[1]=>

int(210)

[2]=>

string(2) "mm"

}

["Folio"]=>

array(3) {

[0]=>

float(8.5)

[1]=>

int(13)

[2]=>

string(2) "in"

}

["Legal"]=>

array(3) {

[0]=>

float(8.5)

[1]=>

int(14)

[2]=>

string(2) "in"

}

["Letter"]=>

array(3) {

[0]=>

float(8.5)

[1]=>

int(11)

[2]=>

string(2) "in"

}

}

["size":"PhpOffice\PhpWord\Style\Paper":private]=>

string(2) "A4"

["width":"PhpOffice\PhpWord\Style\Paper":private]=>

int(11870)

["height":"PhpOffice\PhpWord\Style\Paper":private]=>

int(16787)

["styleName":protected]=>

NULL

["index":protected]=>

NULL

["aliases":protected]=>

array(0) {

}

["isAuto":"PhpOffice\PhpWord\Style\AbstractStyle":private]=>

bool(false)

}

["pageSizeW":"PhpOffice\PhpWord\Style\Section":private]=>

int(11906)

["pageSizeH":"PhpOffice\PhpWord\Style\Section":private]=>

int(16838)

["marginTop":"PhpOffice\PhpWord\Style\Section":private]=>

int(1417)

["marginLeft":"PhpOffice\PhpWord\Style\Section":private]=>

int(1417)

["marginRight":"PhpOffice\PhpWord\Style\Section":private]=>

int(1417)

["marginBottom":"PhpOffice\PhpWord\Style\Section":private]=>

int(1417)

["gutter":"PhpOffice\PhpWord\Style\Section":private]=>

int(0)

["headerHeight":"PhpOffice\PhpWord\Style\Section":private]=>

int(720)

["footerHeight":"PhpOffice\PhpWord\Style\Section":private]=>

int(720)

["pageNumberingStart":"PhpOffice\PhpWord\Style\Section":private]=>

NULL

["colsNum":"PhpOffice\PhpWord\Style\Section":private]=>

int(1)

["colsSpace":"PhpOffice\PhpWord\Style\Section":private]=>

int(720)

["breakType":"PhpOffice\PhpWord\Style\Section":private]=>

NULL

["lineNumbering":"PhpOffice\PhpWord\Style\Section":private]=>

NULL

["borderTopSize":protected]=>

NULL

["borderTopColor":protected]=>

NULL

["borderLeftSize":protected]=>

NULL

["borderLeftColor":protected]=>

NULL

["borderRightSize":protected]=>

NULL

["borderRightColor":protected]=>

NULL

["borderBottomSize":protected]=>

NULL

["borderBottomColor":protected]=>

NULL

["styleName":protected]=>

NULL

["index":protected]=>

NULL

["aliases":protected]=>

array(0) {

}

["isAuto":"PhpOffice\PhpWord\Style\AbstractStyle":private]=>

bool(false)

}

["headers":"PhpOffice\PhpWord\Element\Section":private]=>

array(0) {

}

["footers":"PhpOffice\PhpWord\Element\Section":private]=>

array(0) {

}

["elements":protected]=>

array(1) {

[0]=>

*RECURSION*

}

["phpWord":protected]=>

*RECURSION*

["sectionId":protected]=>

int(1)

["docPart":protected]=>

string(7) "Section"

["docPartId":protected]=>

int(1)

["elementIndex":protected]=>

int(1)

["elementId":protected]=>

NULL

["relationId":protected]=>

NULL

["nestedLevel":"PhpOffice\PhpWord\Element\AbstractElement":private]=>

int(0)

["parentContainer":"PhpOffice\PhpWord\Element\AbstractElement":private]=>

NULL

["mediaRelation":protected]=>

bool(false)

["collectionRelation":protected]=>

bool(false)

}

}

["collections":"PhpOffice\PhpWord\PhpWord":private]=>

array(5) {

["Bookmarks"]=>

object(PhpOffice\PhpWord\Collection\Bookmarks)#1248 (1) {

["items":"PhpOffice\PhpWord\Collection\AbstractCollection":private]=>

array(0) {

}

}

["Titles"]=>

object(PhpOffice\PhpWord\Collection\Titles)#1249 (1) {

["items":"PhpOffice\PhpWord\Collection\AbstractCollection":private]=>

array(0) {

}

}

["Footnotes"]=>

object(PhpOffice\PhpWord\Collection\Footnotes)#1250 (1) {

["items":"PhpOffice\PhpWord\Collection\AbstractCollection":private]=>

array(0) {

}

}

["Endnotes"]=>

object(PhpOffice\PhpWord\Collection\Endnotes)#1251 (1) {

["items":"PhpOffice\PhpWord\Collection\AbstractCollection":private]=>

array(0) {

}

}

["Charts"]=>

object(PhpOffice\PhpWord\Collection\Charts)#1252 (1) {

["items":"PhpOffice\PhpWord\Collection\AbstractCollection":private]=>

array(0) {

}

}

}

["metadata":"PhpOffice\PhpWord\PhpWord":private]=>

array(3) {

["DocInfo"]=>

object(PhpOffice\PhpWord\Metadata\DocInfo)#1253 (12) {

["creator":"PhpOffice\PhpWord\Metadata\DocInfo":private]=>

string(0) ""

["lastModifiedBy":"PhpOffice\PhpWord\Metadata\DocInfo":private]=>

string(0) ""

["created":"PhpOffice\PhpWord\Metadata\DocInfo":private]=>

int(1483515248)

["modified":"PhpOffice\PhpWord\Metadata\DocInfo":private]=>

int(1483515248)

["title":"PhpOffice\PhpWord\Metadata\DocInfo":private]=>

string(0) ""

["description":"PhpOffice\PhpWord\Metadata\DocInfo":private]=>

string(0) ""

["subject":"PhpOffice\PhpWord\Metadata\DocInfo":private]=>

string(0) ""

["keywords":"PhpOffice\PhpWord\Metadata\DocInfo":private]=>

string(0) ""

["category":"PhpOffice\PhpWord\Metadata\DocInfo":private]=>

string(0) ""

["company":"PhpOffice\PhpWord\Metadata\DocInfo":private]=>

string(0) ""

["manager":"PhpOffice\PhpWord\Metadata\DocInfo":private]=>

string(0) ""

["customProperties":"PhpOffice\PhpWord\Metadata\DocInfo":private]=>

array(0) {

}

}

["Protection"]=>

object(PhpOffice\PhpWord\Metadata\Protection)#1254 (1) {

["editing":"PhpOffice\PhpWord\Metadata\Protection":private]=>

NULL

}

["Compatibility"]=>

object(PhpOffice\PhpWord\Metadata\Compatibility)#1255 (1) {

["ooxmlVersion":"PhpOffice\PhpWord\Metadata\Compatibility":private]=>

int(12)

}

}

}

["sectionId":protected]=>

NULL

["docPart":protected]=>

string(7) "Section"

["docPartId":protected]=>

int(1)

["elementIndex":protected]=>

int(1)

["elementId":protected]=>

string(6) "5d531b"

["relationId":protected]=>

NULL

["nestedLevel":"PhpOffice\PhpWord\Element\AbstractElement":private]=>

int(0)

["parentContainer":"PhpOffice\PhpWord\Element\AbstractElement":private]=>

string(7) "Section"

["mediaRelation":protected]=>

bool(false)

["collectionRelation":protected]=>

bool(false)

}

}
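
The dump shows that the single Text element already holds only the 21-character string, which suggests the text is cut short while the MsDoc reader parses the file and not in the loop above. Independently of that, the loop throws on any element type other than Text or TextBreak. Below is a minimal sketch of a more tolerant walker. `extractText`, `StubText` and `StubContainer` are hypothetical names introduced here for illustration; only the `getElements()` and `getText()` methods are assumed from PHPWord. It cannot recover text that the reader never loaded.

```php
<?php
/**
 * Recursively collect text from a PHPWord-style element tree.
 * Duck-typed on getElements()/getText(), so unknown element types are
 * skipped instead of raising an exception.
 */
function extractText($node): string
{
    // Line breaks become newlines. instanceof does not trigger autoloading,
    // so this check is safe even when PHPWord is not installed.
    if ($node instanceof \PhpOffice\PhpWord\Element\TextBreak) {
        return "\n";
    }
    // Containers (sections, text runs, ...) expose getElements().
    if (method_exists($node, 'getElements')) {
        $out = '';
        foreach ($node->getElements() as $child) {
            $out .= extractText($child);
        }
        return $out;
    }
    // Leaf text elements expose getText().
    if (method_exists($node, 'getText')) {
        $t = $node->getText();
        if (is_string($t)) {
            return $t;
        }
        // Some elements may return a nested element instead of a string.
        return is_object($t) ? extractText($t) : '';
    }
    return '';
}

// Hypothetical stand-ins so the sketch runs without PHPWord.
class StubText
{
    private $text;
    public function __construct($text) { $this->text = $text; }
    public function getText() { return $this->text; }
}

class StubContainer
{
    private $elements;
    public function __construct(array $elements) { $this->elements = $elements; }
    public function getElements() { return $this->elements; }
}

$tree = new StubContainer([
    new StubText('The quick '),
    new StubContainer([new StubText('brown fox')]),
]);

echo extractText($tree), "\n"; // prints "The quick brown fox"
```

With a real document, the same function could be called once per section returned by `$phpWord->getSections()`.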
