
Merging PDFs with Python

Editor: Python

Contents

    • 1. Preface
    • 2. Installation
    • 3. Usage
    • 4. References

1. Preface

I recently wrote a crawler and needed to merge the paginated PDF files it downloaded into a single PDF file, so I wondered whether Python had a library that could do this. A quick search turned up PyPDF2.

2. Installation

Installation is simple:

pip install pypdf2
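
The class names used in this article come from the 1.26.0 documentation listed in the references, and newer PyPDF2 releases have renamed some of them, so it can be worth checking which version pip actually installed. A minimal sketch using only the standard library (Python 3.8+); the helper name installed_version is my own, not part of PyPDF2:

```python
from importlib import metadata  # standard library, Python 3.8+


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("PyPDF2")
if version is None:
    print("PyPDF2 is not installed")
else:
    print("PyPDF2 version:", version)
```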

3. Usage

The documentation shows that PyPDF2 provides many classes. Only two of them are needed for my purposes, namely PdfFileMerger and PdfFileReader:

import os

from PyPDF2 import PdfFileMerger, PdfFileReader  # the two classes needed

file_merger = PdfFileMerger(strict=False)  # initialize with non-strict checking
target_path = 'X:/XXX/temp'  # directory containing the PDFs to merge
path = 'X:/XXX/XXX'  # output directory for the merged file
# List the PDFs; os.listdir() returns names in arbitrary order, so sort them
pdf_lst = sorted(f for f in os.listdir(target_path) if f.endswith('.pdf'))
pdf_lst = [os.path.join(target_path, filename) for filename in pdf_lst]  # build full paths
for pdf in pdf_lst:
    file_merger.append(PdfFileReader(pdf), 'tag')  # second argument is the bookmark
file_merger.addMetadata(
    {'/Title': 'my title', '/Creator': 'creator', '/Subject': 'subjects'})  # fill in the PDF metadata
# 'merged.pdf' is a placeholder output file name
with open(os.path.join(path, 'merged.pdf'), 'wb+') as fa:
    file_merger.write(fa)  # write the merged PDF

The code above merges the PDF files.
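
Because the merged pages appear in the order the files are appended, the order of the file list matters. os.listdir() returns names in arbitrary order, and even a plain alphabetical sort places "page10.pdf" before "page2.pdf". A small sketch of a number-aware sort key follows; the file names are made up for illustration and natural_key is a hypothetical helper, not something PyPDF2 provides:

```python
import re


def natural_key(filename):
    """Split a name into text and integer chunks so 'page10' sorts after 'page2'."""
    return [int(chunk) if chunk.isdecimal() else chunk.lower()
            for chunk in re.split(r"(\d+)", filename)]


names = ["page10.pdf", "page2.pdf", "page1.pdf"]  # hypothetical file names
plain_order = sorted(names)                      # plain string sort
natural_order = sorted(names, key=natural_key)   # number-aware sort
print(plain_order)    # ['page1.pdf', 'page10.pdf', 'page2.pdf']
print(natural_order)  # ['page1.pdf', 'page2.pdf', 'page10.pdf']
```

Passing key=natural_key when sorting the list of PDF names would keep numbered pages in reading order.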

PdfFileMerger() takes one parameter, strict (bool), which determines whether the user should be warned of problems. It defaults to True. A problem was detected in the PDFs I was merging, and the merge failed unless I set strict=False.
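
If you would rather keep strict checking where it works, one option is to try a strict merge first and fall back to strict=False only when it fails. The following is only a sketch of that idea: merge_with_fallback and make_merger are hypothetical names, and make_merger is assumed to be something callable as make_merger(strict=...) that returns an object with an append() method, such as the PdfFileMerger class itself.

```python
def merge_with_fallback(make_merger, pdf_paths):
    """Append all files strictly; if that fails, retry with strict=False.

    Returns a (merger, strict_used) pair. `make_merger` is an assumed factory,
    for example the PdfFileMerger class, called as make_merger(strict=...).
    """
    last_error = None
    for strict in (True, False):
        merger = make_merger(strict=strict)
        try:
            for pdf_path in pdf_paths:
                merger.append(pdf_path)
        except Exception as error:  # broad on purpose: this is only a sketch
            last_error = error
            continue
        return merger, strict
    raise last_error  # both attempts failed
```

The returned flag tells the caller whether the lenient mode was needed, which is useful for logging which inputs were malformed.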

PdfFileMerger's append() method also takes parameters: the first is the file object and the second is the bookmark. To make browsing easier, my crawler also scraped the bookmark titles, and the second parameter is used to set those bookmarks in the merged file.
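
When no scraped titles are available, one simple choice is to derive each bookmark from the file name rather than using a fixed string. The sketch below assumes that; bookmark_title and append_with_bookmarks are hypothetical helpers, and merger is assumed to behave like PdfFileMerger, with append(fileobj, bookmark) accepting a path string as fileobj (which is how I read the 1.26.0 documentation).

```python
import os


def bookmark_title(pdf_path):
    """Use the file name without its extension as the bookmark title."""
    return os.path.splitext(os.path.basename(pdf_path))[0]


def append_with_bookmarks(merger, pdf_paths):
    """Append each PDF to `merger`, bookmarked with a title taken from its name.

    `merger` is assumed to expose append(fileobj, bookmark) like PdfFileMerger.
    """
    for pdf_path in pdf_paths:
        merger.append(pdf_path, bookmark_title(pdf_path))
```

With the variables from the merge code, this would be called as append_with_bookmarks(file_merger, pdf_lst).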

PdfFileMerger's addMetadata() method adds metadata, which makes later management of the PDF easier. Its argument must be a dictionary in which each key is a field name and each value is the metadata to set.
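
As a small illustration of the expected shape, the metadata can be built as an ordinary dictionary whose keys start with a slash, as in the merge code. build_metadata is a hypothetical helper and the values are placeholders:

```python
def build_metadata(title, creator, subject):
    """Return a metadata dictionary; each key is a field name starting with '/'."""
    return {"/Title": title, "/Creator": creator, "/Subject": subject}


info = build_metadata("my title", "creator", "subjects")
print(info)
# With a PdfFileMerger instance named file_merger, this would be passed as:
#     file_merger.addMetadata(info)
```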

4. References

Official documentation: PyPDF2 Documentation (PyPDF2 1.26.0 documentation, pythonhosted.org)

