Decrypting PDF files with Python
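1. Use the pypdf family of libraries to batch-remove the passwords from PDF files. PyPDF4 is used here; PyPDF2, PyPDF3, and so on work along the same lines. The code is as follows: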
import os

from PyPDF4 import PdfFileReader, PdfFileWriter

res_path = "./resdir/"


def decrypt_pdf(srcfname, resfname, password):
    try:
        file = open(srcfname, 'rb')
    except Exception as err:
        print('file open failed! ' + str(err))
        return None
    # The with-block closes the source file on every exit path.
    with file:
        pdf_reader = PdfFileReader(file, strict=False)
        if not pdf_reader.isEncrypted:
            print('file is not encrypted, do nothing. file: %s' % srcfname)
            return None
        # decrypt() returns 0 on failure, 1 if the user password matched,
        # and 2 if the owner password matched.
        ret = pdf_reader.decrypt(password)
        if ret == 0:
            print("%s: password (%s) is wrong" % (srcfname, password))
            return None
        pdf_writer = PdfFileWriter()
        pdf_writer.appendPagesFromReader(pdf_reader)
        with open(resfname, 'wb') as res_file:
            pdf_writer.write(res_file)
    return None


def main():
    # Do not fail if the output directory is left over from an earlier run.
    os.makedirs(res_path, exist_ok=True)
    src_path = input(r"input pdf path (example: D:\pdf): ")
    password = input(r"input passwd (example: 123456): ")
    if src_path == "" or password == "":
        print('please input right path and password !!!')
        return
    for filename in os.listdir(src_path):
        # os.path.join adds the path separator if the user left it off.
        sfname = os.path.join(src_path, filename)
        rfname = os.path.join(res_path, filename)
        print("----- start decrypting file-----------")
        decrypt_pdf(sfname, rfname, password)
        print("----- end decrypting file-------------")


if __name__ == '__main__':
    main()

Environment: Python 3. Place this script in the same directory as the folder of PDF files to decrypt, then run it. The decrypted copies are written to ./resdir/.
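The loop in main() hands every directory entry to decrypt_pdf, including subfolders and files that are not PDFs. A minimal sketch of one way to filter the listing first, using only the standard library; the helper name iter_pdf_jobs is hypothetical and not part of the original script:

```python
import os


def iter_pdf_jobs(src_dir, dst_dir):
    """Yield (source, destination) path pairs for the PDF files in src_dir.

    Hypothetical helper for illustration; not part of the original script.
    """
    for name in sorted(os.listdir(src_dir)):
        src = os.path.join(src_dir, name)
        # Skip subfolders and anything without a .pdf extension.
        if not os.path.isfile(src) or not name.lower().endswith(".pdf"):
            continue
        yield src, os.path.join(dst_dir, name)
```

With such a helper, the loop in main() could iterate over iter_pdf_jobs(src_path, res_path) and pass each pair, together with the password, to decrypt_pdf.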
2. Problems encountered during decryption:
  File "/xxx/lib/python3.10/site-packages/PyPDF4/utils.py", line 237, in b_
    r = s.encode('latin-1')
UnicodeEncodeError: 'latin-1' codec can't encode character '\u02c6' in position 0: ordinal not in range(256)

This error occurs when the pypdf library parses PDF documents that contain Chinese text. The fix is to modify the utils.py file inside the library, as follows:
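Original code:

...
r = s.encode('latin-1')
if len(s) < 2:
    bc[s] = r
return r
...

Modified:

...
try:
    r = s.encode('latin-1')
except UnicodeEncodeError:
    # Fall back to UTF-8 for characters that latin-1 cannot represent.
    r = s.encode('utf-8')
if len(s) < 2:
    bc[s] = r
return r
...

After making this change, rerun the script above and the problem is resolved.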
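The error arises because latin-1 only covers code points 0 to 255, while '\u02c6' is code point 710. The fallback can be tried in isolation with the standard library alone. This is a sketch: encode_with_fallback is a hypothetical name chosen for illustration and is not a PyPDF4 function.

```python
def encode_with_fallback(s):
    """Encode s as latin-1, falling back to UTF-8 when that fails.

    Hypothetical stand-alone version of the patched logic; not a PyPDF4 API.
    """
    try:
        return s.encode("latin-1")
    except UnicodeEncodeError:
        # Characters above U+00FF do not exist in latin-1.
        return s.encode("utf-8")


print(encode_with_fallback("é"))       # b'\xe9'
print(encode_with_fallback("\u02c6"))  # b'\xcb\x86'
```

The fallback path produces bytes in a different encoding from the latin-1 path, so it should be read as a way to get past the exception rather than a guarantee that such strings are handled identically everywhere else in the library.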