Reading a Word question bank with Python: reading word.docx content with python-docx

Published: 2024/4/19

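This is my first blog post and I wasn't sure what to write about, so I'm recording the problems I run into while learning Python so I can look them up later. I'm a beginner and the writing isn't great; if you spot mistakes, please point them out!

Chinese encoding problems are always a headache. I wanted to read the contents of a Word file with Python, but open() kept raising errors. After searching online I found that Python has a package dedicated to reading .docx files, python-docx (it can only read .docx files, not .doc files), and it is very convenient to use.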

Installing python-docx:

pip install python-docx

(Note: this is not pip install docx! The docx package also installs, but importing it always fails with an error about a missing exceptions module.)
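To check which of the two packages is actually installed, the distribution names can be looked up with the standard library. This is only a minimal sketch: the helper name installed_version is made up for illustration, and it assumes Python 3.8 or later, where importlib.metadata is available.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a pip distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# "python-docx" is the package this post uses; "docx" is the one to avoid.
for name in ("python-docx", "docx"):
    version = installed_version(name)
    print(name, "->", version if version else "not installed")
```

If docx shows up as installed, removing it with pip uninstall docx before installing python-docx avoids the two packages competing for the same import name.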

Now python-docx can be used to read the text of a Word document.

The code is as follows:

from docx import Document

# Path to the .docx file to read
path = "C:\\Users\\Administrator\\Desktop\\word.docx"
document = Document(path)

# Print the text of each paragraph in document order
for paragraph in document.paragraphs:
    print(paragraph.text)

Running it prints the document text.
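For a question bank it is often handier to collect the text into a list than to print it. The sketch below is one possible way to do that while skipping empty paragraphs. The function names paragraph_texts and read_docx_text are hypothetical, and the import sits inside read_docx_text only so that the helper can be tried without python-docx installed. Note that document.paragraphs does not include text inside tables, which python-docx exposes separately through document.tables.

```python
def paragraph_texts(document):
    """Return the stripped text of every non-empty paragraph in *document*.

    *document* only needs a ``paragraphs`` attribute whose items have a
    ``text`` attribute, which is what python-docx's Document provides.
    """
    texts = []
    for paragraph in document.paragraphs:
        text = paragraph.text.strip()
        if text:  # skip blank paragraphs used only for spacing
            texts.append(text)
    return texts


def read_docx_text(path):
    """Open a .docx file with python-docx and return its paragraph texts."""
    from docx import Document  # installed via: pip install python-docx
    return paragraph_texts(Document(path))
```

With the file from the example above, this could be called as read_docx_text("C:\\Users\\Administrator\\Desktop\\word.docx").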

I then tried to use docx to read a .doc file.

The code is as follows:

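import os
import docx

# Look for .doc files in the current directory and try to open each one
# under the same name with a .docx extension (this is the attempt that fails)
for filename in os.listdir(os.getcwd()):
    if filename.endswith('.doc'):
        print(filename[:-4])
        doc = docx.Document(filename[:-4] + ".docx")
        for para in doc.paragraphs:
            print(para.text)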

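The result was an error: docx.opc.exceptions.PackageNotFoundError: Package not found. The .doc file still could not be read.

Quoting the first reply in the comments: "Changing the file extension does not change how the file is encoded, so the text cannot be read. The .doc file has to be saved as a .docx file first, and then python-docx can read its contents."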
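The reason renaming fails can be checked directly: a .docx file is a ZIP package of XML parts, while a legacy .doc file is a binary OLE compound file. The sketch below tells the two apart using only the standard library. The function names looks_like_docx and looks_like_legacy_doc are hypothetical, and testing for a word/document.xml entry is a heuristic that fits documents saved by Word, not a full validation.

```python
import zipfile

# Signature that begins every OLE compound file, the container used by .doc
OLE_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")


def looks_like_docx(path):
    """Return True if *path* is a ZIP archive containing word/document.xml."""
    if not zipfile.is_zipfile(path):
        return False
    with zipfile.ZipFile(path) as archive:
        return "word/document.xml" in archive.namelist()


def looks_like_legacy_doc(path):
    """Return True if *path* starts with the OLE compound file signature."""
    with open(path, "rb") as handle:
        return handle.read(len(OLE_MAGIC)) == OLE_MAGIC
```

A file for which looks_like_legacy_doc returns True, whatever its extension, needs to be saved again as .docx in Word before python-docx can open it.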

# Document also has methods for adding headings, page breaks, paragraphs, pictures, sections, tables and so on, described below

 |  add_heading(self, text='', level=1)
 |      Return a heading paragraph newly added to the end of the document,
 |      containing *text* and having its paragraph style determined by
 |      *level*. If *level* is 0, the style is set to `Title`. If *level* is
 |      1 (or omitted), `Heading 1` is used. Otherwise the style is set to
 |      `Heading {level}`. Raises |ValueError| if *level* is outside the
 |      range 0-9.
 |
 |  add_page_break(self)
 |      Return a paragraph newly added to the end of the document and
 |      containing only a page break.
 |
 |  add_paragraph(self, text='', style=None)
 |      Return a paragraph newly added to the end of the document, populated
 |      with *text* and having paragraph style *style*. *text* can contain
 |      tab (``\t``) characters, which are converted to the appropriate XML
 |      form for a tab. *text* can also include newline (``\n``) or carriage
 |      return (``\r``) characters, each of which is converted to a line
 |      break.
 |
 |  add_picture(self, image_path_or_stream, width=None, height=None)
 |      Return a new picture shape added in its own paragraph at the end of
 |      the document. The picture contains the image at
 |      *image_path_or_stream*, scaled based on *width* and *height*. If
 |      neither width nor height is specified, the picture appears at its
 |      native size. If only one is specified, it is used to compute
 |      a scaling factor that is then applied to the unspecified dimension,
 |      preserving the aspect ratio of the image. The native size of the
 |      picture is calculated using the dots-per-inch (dpi) value specified
 |      in the image file, defaulting to 72 dpi if no value is specified, as
 |      is often the case.
 |
 |  add_section(self, start_type=2)
 |      Return a |Section| object representing a new section added at the end
 |      of the document. The optional *start_type* argument must be a member
 |      of the :ref:`WdSectionStart` enumeration, and defaults to
 |      ``WD_SECTION.NEW_PAGE`` if not provided.
 |
 |  add_table(self, rows, cols, style=None)
 |      Add a table having row and column counts of *rows* and *cols*
 |      respectively and table style of *style*. *style* may be a paragraph
 |      style object or a paragraph style name. If *style* is |None|, the
 |      table inherits the default table style of the document.
 |
 |  save(self, path_or_stream)
 |      Save this document to *path_or_stream*, which can be either a path to
 |      a filesystem location (a string) or a file-like object.
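As a small illustration of the methods listed above, the sketch below builds a document with a heading, a paragraph, a page break and a table, then saves it. The function name build_sample_document, the sample text and the output file name are placeholders chosen for illustration; the import is placed inside the function only so the definition loads even where python-docx is missing.

```python
def build_sample_document(path):
    """Create a small .docx at *path* using the Document methods above."""
    from docx import Document  # installed via: pip install python-docx

    document = Document()  # start from python-docx's default template
    document.add_heading("Question bank", level=1)
    document.add_paragraph("1. Which module reads .docx files?")
    document.add_page_break()

    # A 2x2 table: one header row plus one data row
    table = document.add_table(rows=2, cols=2)
    table.cell(0, 0).text = "Question"
    table.cell(0, 1).text = "Answer"
    table.cell(1, 0).text = "1"
    table.cell(1, 1).text = "python-docx"

    document.save(path)
    return path
```

Calling build_sample_document("sample.docx") writes the file to the current directory, and it can then be opened again with Document("sample.docx") as in the reading example.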

python-docx has many other features that I am still learning. For details, see the official documentation: https://python-docx.readthedocs.io/en/latest/user/quickstart.html
