
Python office automation: calculating and collating data from Excel and writing it to Word

Editor: Python

Preface

In the articles of the last few days, we explained how to extract specified data from a Word table and save it to Excel. Today, based on a real request from a reader, we explain how to use Python to compute and organize data from Excel and write it to Word. It is not hard; there are two main steps:

Use openpyxl to read the Excel file and get its contents

Use docx (the python-docx package) to read and write the Word file

Let's start!

Requirements

Let's first look at part of the Excel data we need to process. The data has been blurred for privacy reasons.

You can see that there is a lot of data, including duplicates. What we need to do is compute and organize the data in each column according to certain rules and have Python fill it into Word automatically. The general requirements are as follows.

The above is only part of the requirements; the real Word document needs much more data filled in!

Besides processing the data and writing it to the specified positions in Word in the required format, there is one more requirement: the name of the final Word file also has to be generated according to certain rules.

OK, the requirements analysis is done. Let's see how to solve this with Python!

Python implementation

First, we use Python to parse the Excel file:

from openpyxl import load_workbook
import os

# Get the path to the desktop
def GetDesktopPath():
    return os.path.join(os.path.expanduser("~"), 'Desktop')

path = GetDesktopPath() + '/Information/'  # folder path, reused later
workbook = load_workbook(filename=path + 'data.xlsx')
sheet = workbook.active  # get the active sheet

# The data range can be obtained in code, which is handy for batch loops
print(sheet.dimensions)
# A1:W10

openpyxl offers several ways to read cells:

cells = sheet['A1:A4']  # the 4 cells A1 to A4 (a tuple of rows, each row a tuple of cells)
cells = sheet['A']      # column A (a flat tuple of cells)
cells = sheet['A:C']    # columns A to C (a tuple of columns)
cells = sheet[5]        # row 5 (a flat tuple of cells)

# A range such as sheet['A1:A4'] returns nested tuples, so each item is
# itself a tuple and the cell has to be taken out of it to get the value
for row in sheet['A1:A4']:
    print(row[0].value)

# Get every cell in a range; iter_cols does the same column by column
for row in sheet.iter_rows(min_row=1, max_row=3, min_col=2, max_col=4):
    for cell in row:
        print(cell.value)
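As a small illustration of using the data range in a loop, the following sketch reads one column without hard-coding the last row. It builds a tiny workbook in memory so it can run without a file; the header names and values are made up, and `range_boundaries` comes from `openpyxl.utils`.

```python
from openpyxl import Workbook
from openpyxl.utils import range_boundaries

# Build a tiny in-memory workbook so the sketch runs without a file on disk.
# The header names and values are made up for illustration.
workbook = Workbook()
sheet = workbook.active
sheet.append(["PO No", "Quantity"])
sheet.append(["PO-001", 10])
sheet.append(["PO-002", 20])
sheet.append(["PO-001", 5])

# sheet.dimensions is a string such as "A1:B4"; range_boundaries turns it
# into numeric bounds in the order (min_col, min_row, max_col, max_row).
min_col, min_row, max_col, max_row = range_boundaries(sheet.dimensions)

# Read column A below the header row; values_only=True yields plain values
# instead of cell objects, so no .value lookup is needed.
po_numbers = [
    row[0]
    for row in sheet.iter_rows(
        min_row=min_row + 1, max_row=max_row, min_col=1, max_col=1, values_only=True
    )
]

print(sheet.dimensions)
print(po_numbers)
```

With the real workbook, the same pattern would replace fixed ranges such as `'C2:C10'` when the number of data rows changes from file to file.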

Once we understand the principle, we can parse out the data in the Excel file:

# SQE
SQE = sheet['Q2'].value

# Supplier & manufacturer
supplier = sheet['G2'].value

# PO numbers
C2_10 = sheet['C2:C10']  # returns a tuple of rows, each a tuple of cells
# Collect the values with a list comprehension
vC2_10 = [str(cell[0].value) for cell in C2_10]
# Deduplicate with set and join with "," for filling the Word table
order_num = ','.join(set(vC2_10))
# Deduplicate with set and join with "&" for naming the Word file
order_num_title = '&'.join(set(vC2_10))

# Product model
T2_10 = sheet['T2:T10']
vT2_10 = [str(cell[0].value) for cell in T2_10]
ptype = ','.join(set(vT2_10))

# Product description
P2_10 = sheet['P2:P10']
vP2_10 = [str(cell[0].value) for cell in P2_10]
info = ','.join(set(vP2_10))
info_title = '&'.join(set(vP2_10))

# Date
# Use the datetime library to get today's date and format it
import datetime
today = datetime.datetime.today()
time = today.strftime('%Y年%m月%d日')

# Inspection quantity
V2_10 = sheet['V2:V10']
vV2_10 = [int(cell[0].value) for cell in V2_10]
total_num = sum(vV2_10)  # total quantity

# Number of inspection boxes
W2_10 = sheet['W2:W10']
vW2_10 = [int(cell[0].value) for cell in W2_10]
box_num = sum(vW2_10)

# Generate the final Word file name
title = f'{order_num_title}-{supplier}-{total_num}-{info_title}-{time}-Inspection Report'
print(title)
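One caveat with `set`: the order of its items is arbitrary and, for strings, can change from one run to the next, so the joined order numbers may come out in a different order each time. The sketch below shows one possible alternative that keeps the first-seen order, plus a guard for characters that Windows does not accept in file names. The helper names `unique_in_order` and `safe_filename` and the sample values are hypothetical, not part of the original script.

```python
import re

def unique_in_order(values):
    """Drop duplicates while keeping first-seen order (dicts preserve insertion order)."""
    return list(dict.fromkeys(values))

def safe_filename(name, replacement="_"):
    """Replace characters that Windows does not allow in file names."""
    return re.sub(r'[\\/:*?"<>|]', replacement, name).strip()

# Made-up values standing in for a column read from Excel.
po_values = ["PO-002", "PO-001", "PO-002", "PO-003"]

order_num = ",".join(unique_in_order(po_values))
order_num_title = "&".join(unique_in_order(po_values))

# A made-up supplier name containing "/" shows why the guard can matter.
title = safe_filename(f"{order_num_title}-ACME/Ltd-35-Inspection Report")

print(order_num)
print(title)
```

Whether stable ordering matters depends on how the report is used; if the order is irrelevant, the `set` version is perfectly adequate.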

With the code above, we have successfully extracted the data from Excel, which concludes the Excel part. Next comes filling in the Word tables. The docx library reads .docx files, but the file the reader actually needs to process is in .doc format, so on Windows you can use the following code to convert .doc files to .docx. The prerequisite is that win32com is installed.

# pip install pypiwin32
from win32com import client

docx_path = path + 'Templates.docx'

# Convert a .doc file to .docx
def doc2docx(doc_path, docx_path):
    word = client.Dispatch("Word.Application")
    doc = word.Documents.Open(doc_path)
    doc.SaveAs(docx_path, 16)  # 16 = wdFormatDocumentDefault (.docx)
    doc.Close()
    word.Quit()
    print('\n doc file converted to docx \n')

if not os.path.exists(docx_path):
    # docx_path[:-1] drops the trailing "x", giving the path of the .doc file
    doc2docx(docx_path[:-1], docx_path)

On the Mac, however, there is no equally good solution; if you have any ideas, feel free to share them. Once the file is in .docx format, we can continue with the Word part:
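For the Mac case, one option that may be worth trying is LibreOffice's command-line converter. This is only a sketch under the assumption that LibreOffice is installed; the function names are hypothetical, and on macOS the `soffice` executable is usually inside the application bundle rather than on `PATH`, so its full path may need to be passed in.

```python
import shutil
import subprocess
from pathlib import Path

def build_convert_command(doc_path, soffice="soffice"):
    """Return the LibreOffice command line that converts doc_path to .docx
    in the same folder. Nothing is executed here."""
    doc_path = Path(doc_path)
    return [
        soffice, "--headless",
        "--convert-to", "docx",
        "--outdir", str(doc_path.parent),
        str(doc_path),
    ]

def doc2docx_libreoffice(doc_path, soffice="soffice"):
    """Run the conversion; raises if the soffice executable cannot be found."""
    if shutil.which(soffice) is None:
        raise FileNotFoundError(f"{soffice} (LibreOffice) was not found")
    subprocess.run(build_convert_command(doc_path, soffice), check=True)

# Only build and show the command; the path is a made-up example.
command = build_convert_command("/tmp/report/Templates.doc")
print(command)
```

Conversion fidelity for complex templates is not guaranteed, so the converted file should be checked by eye before filling it in. Once a .docx file exists, by either route, the Word part proceeds as follows.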

docx_path = path + 'Templates.docx'

from docx import Document

# Instantiate the document
document = Document(docx_path)

# Read all tables in the Word file
tables = document.tables
# print(len(tables))
# 15

Once you have identified the index of each table, you can fill it in. Tables are used much like sheets in openpyxl, but note that, as in native Python, indices start from 0:

tables[0].cell(1, 1).text = SQE
tables[1].cell(1, 1).text = supplier
tables[1].cell(2, 1).text = supplier
tables[1].cell(3, 1).text = ptype
tables[1].cell(4, 1).text = info
tables[1].cell(5, 1).text = order_num
tables[1].cell(7, 1).text = time

The code above fills in the first two tables in the Word file.

We continue to use Python to fill in the next table:

for i in range(2, 11):
    tables[6].cell(i, 0).text = str(sheet[f'T{i}'].value)
    tables[6].cell(i, 1).text = str(sheet[f'P{i}'].value)
    tables[6].cell(i, 2).text = str(sheet[f'C{i}'].value)
    tables[6].cell(i, 4).text = str(sheet[f'V{i}'].value)
    tables[6].cell(i, 5).text = str(sheet[f'V{i}'].value)
    tables[6].cell(i, 6).text = '0'
    tables[6].cell(i, 7).text = str(sheet[f'W{i}'].value)
    tables[6].cell(i, 8).text = '0'

tables[6].cell(12, 4).text = str(total_num)
tables[6].cell(12, 5).text = str(total_num)
tables[6].cell(12, 7).text = str(box_num)

Two details need attention here:

The data written to Word must be a string, so values read from Excel have to be converted with str.

The table may contain merged cells or similar structures, so the number of rows and columns you see may not be the real one; test it with code.
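On the first point, a plain `str()` call turns an empty Excel cell (which openpyxl returns as `None`) into the literal text "None". A small conversion helper along these lines may be useful; `to_text` is a hypothetical name, and the treatment of whole-number floats is an assumption about how the report should look.

```python
def to_text(value):
    """Convert an Excel cell value to the string written into Word."""
    # Empty cells come back as None; write an empty string instead of "None".
    if value is None:
        return ""
    # Show whole-number floats such as 12.0 as "12".
    if isinstance(value, float) and value.is_integer():
        return str(int(value))
    return str(value)

print(repr(to_text(None)))
print(to_text(12.0))
print(to_text("A-1"))
```

In the filling loop, `to_text(sheet[f'T{i}'].value)` would then take the place of `str(sheet[f'T{i}'].value)`.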

Following the method above, fill the data obtained from Excel into the corresponding positions in Word one by one. Finally, just save the file:

document.save(path + f'{title}.docx')
print('\n File generated ')

Conclusion

Looking back at the whole process, given the requirements and the document format, reading, parsing and writing this file is a fairly complex task, and both writing the code and thinking it through take a long time. So before using Python for office automation, we should ask ourselves: is there a lot of work to be done this time, or will Python free our hands in the long run? If not, the job can simply be done by hand, and automating it loses its point!

