程序師世界是廣大編程愛好者互助、分享、學習的平台,程序師世界有你更精彩!
首頁
編程語言
C語言|JAVA編程
Python編程
網頁編程
ASP編程|PHP編程
JSP編程
數據庫知識
MYSQL數據庫|SqlServer數據庫
Oracle數據庫|DB2數據庫
您现在的位置: 程式師世界 >> 編程語言 >  >> 更多編程語言 >> Python

[Excel report summary of interesting projects] Python+pandas+xlwings realizes batch extraction of table information, summarizes it into a table and sends it to the mailbox

編輯:Python

文章目錄

  • 前言
  • 一、Sub-function details
    • 1.GUI
    • 2.多線程
    • 3.Excel處理
      • 3.1 初始化xlwings
      • 3.2 osLibrary batch readsExcel表格
      • 3.3 pandasThe library reads table elements
      • 3.4 xlwingsInsert elements into the table
      • 3.5 pandasThe library classifies tables
      • 3.6 xlwingsGives a table appearance
    • 4. 發送郵件
  • 二、執行結果
    • 1.執行結果
    • 2.代碼


前言

Use the time before going to bed after showering,It took three or four days,Made this program as my internship assignment.
It can realize batch extraction of tables in the specified format,Sorted by time and summarized into a table file according to the specified format,And then automatically sent to the mailbox,There is also a simple one for the programGUI,Folders can be manually selected as workspaces.

一、Sub-function details

1.GUI

GUIIt is relatively simple to use and can be quickly createdpysimplegui,甚至比tkinterMuch simpler,The body is a progress bar

Then select the folder and click Run to start the thread that implements the function
Threads are mentioned here,沒錯!It must be implemented using multithreadinggui,並且pysimpleguiis occupying the main thread,This is also clearly stated in the official documentation(不得不說pysimpleguiThe documentation is well done)

def init_gui():
sg.theme('SystemDefaultForReal')
layout = [[sg.Text('當前文件夾:'), sg.Text('', key='text_path')],
[sg.Text('任務完成進度')],
[sg.ProgressBar(100, orientation='h', size=(50, 20), key='progressbar')],
[sg.Text('', key='progressname')],
[sg.FolderBrowse('打開文件夾', key='folder', target='text_path'), sg.Button('運行'), sg.Cancel()]]
window = sg.Window('執行進度', layout)
progress_bar = window['progressbar']
progress_name = window['progressname']
return progress_bar, progress_name, window

Updates to progress bar and current file:
需要在mainSet the loop body in ,循環體中

 progress_bar.update_bar(progress_point)
progress_name.update(progress_name_str)```

2.多線程

def work():
global complete_flag
pythoncom.CoInitialize()
workbook, worksheet, app = init_worksheet()
worksheet = read_excel(folder_path, worksheet, workbook, app)
sort_elem()
sendemail()
complete_flag = 2

To enhance the maintainability of the code,A little effort was put into packaging,The work function is encapsulated into this function according to the process
The following sentence is to solve in multi-threadingxlwings會報錯的問題

 pythoncom.CoInitialize()

We use protected mode to create threads for worker functions,Protected mode is when the main thread closes the thread

 worker_task = threading.Thread(target=work)
worker_task.setDaemon(True)

Suspend is essential in the main thread,Otherwise, the worker thread cannot run

 time.sleep(0.1)

3.Excel處理

We use to read datapandas,pandasProvides a more powerful and convenient query function,And for writing use isxlwings

3.1 初始化xlwings

首先是xlwings初始化,xlwings會打開Excel創建一個新的表格,We edit on the form

def init_worksheet():
global progress_point, progress_name_str
progress_point = 10
progress_name_str = 'Excel初始化中'
app = xw.App(visible=False, add_book=False)
workbook = app.books.add()
worksheet = workbook.sheets.add('工作記錄')
worksheet.range('A1').value = '報修時間'
worksheet.range('B1').value = '報修部門'
worksheet.range('C1').value = 'Repair class'
worksheet.range('D1').value = '報修內容'
worksheet.range('E1').value = 'Repair judgment'
worksheet.range('F1').value = 'Material replacement'
worksheet.range('G1').value = '維修人員'
return workbook, worksheet, app

3.2 osLibrary batch readsExcel表格

利用osThe library implements traversal of table files,Call the function in this function to insert the table element into the new table

def read_excel(folder_path, worksheet, workbook, app):
global progress_point, progress_name_str
file_list = os.listdir(folder_path)
per_point = 60 / len(file_list)
for i in file_list:
if i == 'total work order.xlsx':
continue
progress_point += per_point
progress_name_str = i
worksheet = read_elem(worksheet, i)
workbook.save(output_path)
workbook.close()
app.quit()
return worksheet

3.3 pandasThe library reads table elements

用pandasThe library will read the table file builddataframe,Extract by location,The extracted elements are inserted into a new table.
This step is to customize the processed form,This is also something I've always wanted to improve,但是感覺太麻煩了

def read_elem(worksheet, filename):
data = pd.DataFrame(pd.read_excel(folder_path + '\\' + filename))
time = data.iloc[1][1] # 行列
content = data.iloc[2][1]
department = data.iloc[0][1]
classes = data.iloc[0][3]
worker = data.iloc[0][5]
result = data.iloc[3][1]
goods = data.iloc[4][1]
worksheet = insert_elem(worksheet, time, department, classes, content, result, goods, worker)
return worksheet

3.4 xlwingsInsert elements into the table

We insert elements into a row according to their position,This is also the core of aggregation

def insert_elem(worksheet, time, department, classes, content, result, goods, worker):
global count
count += 1
worksheet.range('A' + str(count)).value = time
worksheet.range('B' + str(count)).value = department
worksheet.range('C' + str(count)).value = classes
worksheet.range('D' + str(count)).value = content
worksheet.range('E' + str(count)).value = result
worksheet.range('F' + str(count)).value = goods
worksheet.range('G' + str(count)).value = worker
return worksheet

3.5 pandasThe library classifies tables

The previously saved files are only intermediate files,Read the output file again,利用sort_values對時間進行排序

def sort_elem():
global progress_point, progress_name_str
progress_point = 90
progress_name_str = 'ExcelFile sorting'
data = pd.DataFrame(pd.read_excel(output_path))
data = data.sort_values(by='報修時間', ascending=True)
data.to_excel(output_path, index=False)
shape_excel()

3.6 xlwingsGives a table appearance

Read the sorted table file to set the table shape,If you don't set it, it will look bad,We can set the table properties as we want,There is no loop involved here,時間復雜度低.

def shape_excel():
app = xw.App(visible=False, add_book=False)
workbook = app.books.open(output_path)
worksheet = workbook.sheets[0]
value = worksheet.range('A1').expand('down')
value.column_width = 16 # 寬度
value = worksheet.range('D1').expand('down')
value.column_width = 20 # 寬度
value = worksheet.range('E1').expand('down')
value.column_width = 30 # 寬度
value = worksheet.range('F1').expand('down')
value.column_width = 16 # 寬度
value = worksheet.range('A1').expand('right')
value.row_height = 30 # 行高 磅數
value = worksheet.range('A2').expand('table')
value.row_height = 20 # 行高 磅數
workbook.save(output_path)
workbook.close()
app.quit()

4. 發送郵件

The information presented below is fictitious,This code borrows the method of some bloggers

def sendemail():
global progress_point, progress_name_str
progress_name_str = '郵件發送中'
host_server = 'smtp.126.com' # 126郵箱smtp服務器
sender_email = '[email protected]' # 發件人郵箱
pwd = 'Go to email to apply for onestmp的授權碼'
receiver = ['1********[email protected]'] # 收件人郵箱
mail_title = 'Communication Information Class7-8Summary of monthly work orders' # 郵件標題
mail_content = "大王,Please review this month's report!" # 郵件正文內容
msg = MIMEMultipart()
msg["Subject"] = Header(mail_title, 'utf-8')
msg["From"] = sender_email
msg["To"] = ";".join(receiver)
msg.attach(MIMEText(mail_content, 'plain')) # html/plain
attachment = MIMEApplication(open(output_path, 'rb').read())
attachment["Content-Type"] = 'application/octet-stream'
basename = "Communication Information Class7-8Summary of monthly work orders.xlsx" # Rename the attachment The file extension cannot be changed
attachment.add_header('Content-Disposition', 'attachment',
filename=('utf-8', '', basename)) # 注意:此處basename要轉換為gbk編碼,否則中文會有亂碼.
msg.attach(attachment)
try:
smtp = SMTP_SSL(host_server) # ssl登錄連接到郵件服務器
smtp.set_debuglevel(1) # 0是關閉,1是開啟debug
smtp.ehlo(host_server) # 跟服務器打招呼,告訴它我們准備連接,最好加上這行代碼
smtp.login(sender_email, pwd)
smtp.sendmail(sender_email, receiver, msg.as_string())
smtp.quit()
print("郵件發送成功")
progress_point = 100
progress_name_str = '任務完成'
except smtplib.SMTPException:
print("無法發送郵件")

二、執行結果

1.執行結果

The table below is purely fictitious
這樣的表格

Twenty sheets were tested

匯總結果

Email received
GUI界面如下

2.代碼

import numpy as np
import time
import pandas as pd
import xlwings as xw
import os
import os.path
import smtplib
import string
from smtplib import SMTP_SSL
from email.mime.text import MIMEText
from email.mime.multipart import MIMEMultipart
from email.header import Header
from email.mime.application import MIMEApplication # Used to add attachments
import PySimpleGUI as sg
import threading
import pythoncom
import win32
complete_flag = 0
count = 1
progress_point = 0
progress_name_str = ''
folder_path = ''
output_path = folder_path + r'\total work order.xlsx'
def init_gui():
sg.theme('SystemDefaultForReal')
layout = [[sg.Text('當前文件夾:'), sg.Text('', key='text_path')],
[sg.Text('任務完成進度')],
[sg.ProgressBar(100, orientation='h', size=(50, 20), key='progressbar')],
[sg.Text('', key='progressname')],
[sg.FolderBrowse('打開文件夾', key='folder', target='text_path'), sg.Button('運行'), sg.Cancel()]]
window = sg.Window('執行進度', layout)
progress_bar = window['progressbar']
progress_name = window['progressname']
return progress_bar, progress_name, window
def init_worksheet():
global progress_point, progress_name_str
progress_point = 10
progress_name_str = 'Excel初始化中'
app = xw.App(visible=False, add_book=False)
workbook = app.books.add()
worksheet = workbook.sheets.add('工作記錄')
worksheet.range('A1').value = '報修時間'
worksheet.range('B1').value = '報修部門'
worksheet.range('C1').value = 'Repair class'
worksheet.range('D1').value = '報修內容'
worksheet.range('E1').value = 'Repair judgment'
worksheet.range('F1').value = 'Material replacement'
worksheet.range('G1').value = '維修人員'
return workbook, worksheet, app
def insert_elem(worksheet, time, department, classes, content, result, goods, worker):
global count
count += 1
worksheet.range('A' + str(count)).value = time
worksheet.range('B' + str(count)).value = department
worksheet.range('C' + str(count)).value = classes
worksheet.range('D' + str(count)).value = content
worksheet.range('E' + str(count)).value = result
worksheet.range('F' + str(count)).value = goods
worksheet.range('G' + str(count)).value = worker
return worksheet
def read_elem(worksheet, filename):
data = pd.DataFrame(pd.read_excel(folder_path + '\\' + filename))
time = data.iloc[1][1] # 行列
content = data.iloc[2][1]
department = data.iloc[0][1]
classes = data.iloc[0][3]
worker = data.iloc[0][5]
result = data.iloc[3][1]
goods = data.iloc[4][1]
worksheet = insert_elem(worksheet, time, department, classes, content, result, goods, worker)
return worksheet
def read_excel(folder_path, worksheet, workbook, app):
global progress_point, progress_name_str
file_list = os.listdir(folder_path)
per_point = 60 / len(file_list)
for i in file_list:
if i == 'total work order.xlsx':
continue
progress_point += per_point
progress_name_str = i
worksheet = read_elem(worksheet, i)
workbook.save(output_path)
workbook.close()
app.quit()
return worksheet
def sort_elem():
global progress_point, progress_name_str
progress_point = 90
progress_name_str = 'ExcelFile sorting'
data = pd.DataFrame(pd.read_excel(output_path))
data = data.sort_values(by='報修時間', ascending=True)
data.to_excel(output_path, index=False)
shape_excel()
def shape_excel():
app = xw.App(visible=False, add_book=False)
workbook = app.books.open(output_path)
worksheet = workbook.sheets[0]
value = worksheet.range('A1').expand('down')
value.column_width = 16 # 寬度
value = worksheet.range('D1').expand('down')
value.column_width = 20 # 寬度
value = worksheet.range('E1').expand('down')
value.column_width = 30 # 寬度
value = worksheet.range('F1').expand('down')
value.column_width = 16 # 寬度
value = worksheet.range('A1').expand('right')
value.row_height = 30 # 行高 磅數
value = worksheet.range('A2').expand('table')
value.row_height = 20 # 行高 磅數
workbook.save(output_path)
workbook.close()
app.quit()
def sendemail():
global progress_point, progress_name_str
progress_name_str = '郵件發送中'
host_server = 'smtp.126.com' # 126郵箱smtp服務器
sender_email = '[email protected]' # 發件人郵箱
pwd = '**********'
receiver = ['1*******[email protected]'] # 收件人郵箱
mail_title = 'Communication Information Class7-8Summary of monthly work orders' # 郵件標題
mail_content = "大王,Please review this month's report!" # 郵件正文內容
msg = MIMEMultipart()
msg["Subject"] = Header(mail_title, 'utf-8')
msg["From"] = sender_email
msg["To"] = ";".join(receiver)
msg.attach(MIMEText(mail_content, 'plain')) # html/plain
attachment = MIMEApplication(open(output_path, 'rb').read())
attachment["Content-Type"] = 'application/octet-stream'
basename = "Communication Information Class7-8Summary of monthly work orders.xlsx" # Rename the attachment The file extension cannot be changed
attachment.add_header('Content-Disposition', 'attachment',
filename=('utf-8', '', basename)) # 注意:此處basename要轉換為gbk編碼,否則中文會有亂碼.
msg.attach(attachment)
try:
smtp = SMTP_SSL(host_server) # ssl登錄連接到郵件服務器
smtp.set_debuglevel(1) # 0是關閉,1是開啟debug
smtp.ehlo(host_server) # 跟服務器打招呼,告訴它我們准備連接,最好加上這行代碼
smtp.login(sender_email, pwd)
smtp.sendmail(sender_email, receiver, msg.as_string())
smtp.quit()
print("郵件發送成功")
progress_point = 100
progress_name_str = '任務完成'
except smtplib.SMTPException:
print("無法發送郵件")
def work():
global complete_flag
pythoncom.CoInitialize()
workbook, worksheet, app = init_worksheet()
worksheet = read_excel(folder_path, worksheet, workbook, app)
sort_elem()
sendemail()
complete_flag = 2
if __name__ == "__main__":
progress_bar, progress_name, window = init_gui()
worker_task = threading.Thread(target=work)
worker_task.setDaemon(True)
while True:
event, values = window.read(timeout=10)
time.sleep(0.1)
if event == 'Cancel' or complete_flag == 2:
break
if values['folder']:
folder_path = values['folder']
complete_flag = 1
if event == '運行' and complete_flag == 1:
worker_task.start()
if event == '運行' and complete_flag == 0:
sg.popup('Please select a folder before clicking Run', title='Error', auto_close=True, auto_close_duration=5, grab_anywhere=True)
progress_bar.update_bar(progress_point)
progress_name.update(progress_name_str)

  1. 上一篇文章:
  2. 下一篇文章:
Copyright © 程式師世界 All Rights Reserved