How to Convert PDFs to Images
1. Required Python packages
PyPDF2
glob (part of the Python standard library, no installation needed)
pdf2image
numpy
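Before going further, it can help to confirm that the packages listed above are importable. The following is a minimal sketch, not part of the original article: the helper name missing_packages is hypothetical, and it only checks that a module can be found, not which version is installed.

```python
import importlib.util


def missing_packages(names):
    """Return the module names from `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Third-party modules used in this article; glob ships with Python.
    missing = missing_packages(["PyPDF2", "pdf2image", "numpy"])
    if missing:
        print("Missing packages, try: pip install " + " ".join(missing))
    else:
        print("All required packages are importable.")
```

Any names it reports can normally be installed with pip under the same spelling.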
2. Steps for converting with PyPDF2
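3. Install the poppler dependency
Poppler is a free software utility library for rendering Portable Document Format (PDF) documents. How to install it depends on the platform: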
3.1 macOS
If Homebrew is not installed yet, run the following command in Terminal:
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
Press the Enter/Return key and wait for the command to finish. If you are prompted for a password, type your Mac user's login password and press Enter. The password is not displayed as you type (for security reasons), but it is still being entered.
Then install poppler:
brew install poppler
Done! You can now use poppler.
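3.2 Windows
Download a Poppler for Windows build, extract it, and add its bin folder to PATH (alternatively, pass that folder to pdf2image through the poppler_path argument).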
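3.3 Linux
sudo apt-get install poppler-utils
pdf2image calls the pdftoppm and pdftocairo command-line tools, which Debian and Ubuntu ship in the poppler-utils package. Python bindings such as python-poppler or python3-poppler-qt4 are not required for it.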
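After installing poppler on any of the platforms above, a quick check that its command-line tools are reachable can save some debugging later. This is a small sketch using only the standard library; the function name find_poppler_tools is hypothetical, and it assumes that pdftoppm and pdftocairo are the executables pdf2image looks for on PATH.

```python
import shutil

# Command-line tools from poppler that pdf2image invokes.
POPPLER_TOOLS = ("pdftoppm", "pdftocairo")


def find_poppler_tools(tools=POPPLER_TOOLS):
    """Map each tool name to its full path, or None if it is not on PATH."""
    return {tool: shutil.which(tool) for tool in tools}


if __name__ == "__main__":
    for tool, location in find_poppler_tools().items():
        print(f"{tool}: {location or 'NOT FOUND on PATH'}")
```

If a tool is reported as not found, either fix PATH or pass the folder that contains the executables to pdf2image via its poppler_path argument.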
4. Python code for the conversion
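import glob
import os
import shutil
from time import time

from pdf2image import convert_from_bytes
# PyPDF2 >= 2.0 names. Older releases call these PdfFileReader / PdfFileWriter,
# with getPage() and addPage() instead of pages[] and add_page().
from PyPDF2 import PdfReader, PdfWriter


def pdf2image2(pdf_path, image_path):
    """Render the PDF at pdf_path and save it as a JPEG in image_path."""
    os.makedirs(image_path, exist_ok=True)
    with open(pdf_path, 'rb') as f:
        images = convert_from_bytes(f.read())
    # Name the image after the PDF, without its directory or extension.
    name = os.path.splitext(os.path.basename(pdf_path))[0]
    for image in images:  # the temporary PDF holds a single page
        image.save(os.path.join(image_path, name + '.jpg'), 'JPEG', quality=30)


def process_bar(no, total_length):
    """Print a simple in-place progress indicator such as '3|10'."""
    bar = '\r' + str(no) + '|' + str(total_length)
    print(bar, end='', flush=True)


def split_combine(path, pdf_writer):
    """Copy the first page of the PDF at path into pdf_writer."""
    pdf = PdfReader(path, strict=False)
    page = pdf.pages[0]  # first page
    pdf_writer.add_page(page)


def run():
    # Process every PDF file in the current directory.
    start_time = time()
    pdf_list = glob.glob('*.pdf')
    img_path = './img/'
    tmp_path = './tmp/'
    os.makedirs(img_path, exist_ok=True)
    os.makedirs(tmp_path, exist_ok=True)
    for i, pdf_file in enumerate(pdf_list):
        process_bar(i + 1, len(pdf_list))
        # Use a fresh writer per file so pages from earlier PDFs do not accumulate.
        pdf_writer = PdfWriter()
        split_combine(pdf_file, pdf_writer)
        tmp_pdf = os.path.join(tmp_path, pdf_file)
        with open(tmp_pdf, 'wb') as output_pdf:
            pdf_writer.write(output_pdf)
        pdf2image2(tmp_pdf, img_path)
    shutil.rmtree(tmp_path)
    end_time = time()
    print('\nelapsed seconds:', end_time - start_time)


if __name__ == '__main__':
    run()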
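The script above uses PyPDF2 to cut each PDF down to its first page before rendering. pdf2image's convert_from_path also accepts first_page and last_page arguments, so a page range can be rendered without a temporary file. The following is a sketch of that alternative, not the article's method: the function names jpeg_name and pdf_pages_to_jpeg, the "_p<N>" naming scheme, and the default dpi of 200 are all assumptions made for illustration.

```python
import os


def jpeg_name(pdf_path, page_no):
    """Build an output name such as 'report_p1.jpg' (hypothetical naming scheme)."""
    stem = os.path.splitext(os.path.basename(pdf_path))[0]
    return f"{stem}_p{page_no}.jpg"


def pdf_pages_to_jpeg(pdf_path, out_dir, first_page=1, last_page=1, dpi=200, quality=30):
    """Render pages first_page..last_page (1-based) of pdf_path as JPEG files."""
    # Imported here so that jpeg_name() stays usable on machines without poppler.
    from pdf2image import convert_from_path

    os.makedirs(out_dir, exist_ok=True)
    images = convert_from_path(
        pdf_path, dpi=dpi, first_page=first_page, last_page=last_page
    )
    saved = []
    for offset, image in enumerate(images):
        target = os.path.join(out_dir, jpeg_name(pdf_path, first_page + offset))
        image.save(target, "JPEG", quality=quality)
        saved.append(target)
    return saved
```

Calling pdf_pages_to_jpeg("report.pdf", "./img") would then write the first page to ./img/report_p1.jpg, assuming poppler is installed and report.pdf exists. Including the page number in the file name keeps pages from overwriting one another when a larger range is requested.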