程序師世界是廣大編程愛好者互助、分享、學習的平台,程序師世界有你更精彩!
首頁
編程語言
C語言|JAVA編程
Python編程
網頁編程
ASP編程|PHP編程
JSP編程
數據庫知識
MYSQL數據庫|SqlServer數據庫
Oracle數據庫|DB2數據庫
您现在的位置: 程式師世界 >> 編程語言 >  >> 更多編程語言 >> Python

【趣味項目之Excel報表匯總】Python+pandas+xlwings實現批量提取表格信息匯總到表格並發送到郵箱

編輯:Python

文章目錄

  • 前言
  • 一、分功能詳解
    • 1.GUI
    • 2.多線程
    • 3.Excel處理
      • 3.1 初始化xlwings
      • 3.2 os庫批量讀取Excel表格
      • 3.3 pandas庫讀取表格元素
      • 3.4 xlwings插入元素到表格
      • 3.5 pandas庫給表格分類
      • 3.6 xlwings給設置表格外觀
    • 4. 發送郵件
  • 二、執行結果
    • 1.執行結果
    • 2.代碼


前言

利用洗完澡睡前的時間,花了三四天,做了這個程序來作為我的實習作業。
可以實現批量提取規定格式的表格,按時間排序後按指定格式匯總到表格文件,然後自動發送到郵箱,還為程序還搭了一個簡單的GUI,可以手動選擇文件夾來作為工作區。

一、分功能詳解

1.GUI

GUI使用了比較簡單可以快速建立的pysimplegui,甚至比tkinter還簡單多了,主體是一個進度條

再選擇文件夾後點擊運行即可啟動功能實現的線程
這裡提到線程,沒錯!必須使用多線程來實現gui,並且pysimplegui是占據主線程的,這一點在官方文檔也說的很明白(不得不說pysimplegui的文檔做的很不錯)

def init_gui():
sg.theme('SystemDefaultForReal')
layout = [[sg.Text('當前文件夾:'), sg.Text('', key='text_path')],
[sg.Text('任務完成進度')],
[sg.ProgressBar(100, orientation='h', size=(50, 20), key='progressbar')],
[sg.Text('', key='progressname')],
[sg.FolderBrowse('打開文件夾', key='folder', target='text_path'), sg.Button('運行'), sg.Cancel()]]
window = sg.Window('執行進度', layout)
progress_bar = window['progressbar']
progress_name = window['progressname']
return progress_bar, progress_name, window

對進度條和當前文件的更新:
需要在main中設置循環體,循環體中

 progress_bar.update_bar(progress_point)
progress_name.update(progress_name_str)```

2.多線程

def work():
global complete_flag
pythoncom.CoInitialize()
workbook, worksheet, app = init_worksheet()
worksheet = read_excel(folder_path, worksheet, workbook, app)
sort_elem()
sendemail()
complete_flag = 2

為了增強代碼的可維護性,在封裝上下了一點點功夫,將工作函數按流程封裝到這個函數中
下面這句是為了解決在多線程中xlwings會報錯的問題

 pythoncom.CoInitialize()

我們用保護模式為工作函數建立線程,保護模式就是主線程關閉這個線程就關閉

 worker_task = threading.Thread(target=work)
worker_task.setDaemon(True)

在主線程中必不可少的進行掛起,不然工作線程是沒法運行的

 time.sleep(0.1)

3.Excel處理

咱們讀取數據用的是pandas,pandas提供了比較強大方便的查詢功能,而進行寫入使用的是xlwings

3.1 初始化xlwings

首先是xlwings初始化,xlwings會打開Excel創建一個新的表格,我們在表格上進行編輯

def init_worksheet():
global progress_point, progress_name_str
progress_point = 10
progress_name_str = 'Excel初始化中'
app = xw.App(visible=False, add_book=False)
workbook = app.books.add()
worksheet = workbook.sheets.add('工作記錄')
worksheet.range('A1').value = '報修時間'
worksheet.range('B1').value = '報修部門'
worksheet.range('C1').value = '報修班組'
worksheet.range('D1').value = '報修內容'
worksheet.range('E1').value = '維修判定'
worksheet.range('F1').value = '材料更換'
worksheet.range('G1').value = '維修人員'
return workbook, worksheet, app

3.2 os庫批量讀取Excel表格

利用os庫實現對表格文件的遍歷,在此函數中調用函數讓表格元素插入新表格

def read_excel(folder_path, worksheet, workbook, app):
global progress_point, progress_name_str
file_list = os.listdir(folder_path)
per_point = 60 / len(file_list)
for i in file_list:
if i == '總工單.xlsx':
continue
progress_point += per_point
progress_name_str = i
worksheet = read_elem(worksheet, i)
workbook.save(output_path)
workbook.close()
app.quit()
return worksheet

3.3 pandas庫讀取表格元素

用pandas庫將讀取表格文件建立dataframe,再按位置提取出來,提取出來的元素插入新表格中。
這一步是要對處理的表格進行定制的,這也是我一直想改進的地方,但是感覺太麻煩了

def read_elem(worksheet, filename):
data = pd.DataFrame(pd.read_excel(folder_path + '\\' + filename))
time = data.iloc[1][1] # 行列
content = data.iloc[2][1]
department = data.iloc[0][1]
classes = data.iloc[0][3]
worker = data.iloc[0][5]
result = data.iloc[3][1]
goods = data.iloc[4][1]
worksheet = insert_elem(worksheet, time, department, classes, content, result, goods, worker)
return worksheet

3.4 xlwings插入元素到表格

咱按照位置給元素插到一排去,這也是匯總的核心了

def insert_elem(worksheet, time, department, classes, content, result, goods, worker):
global count
count += 1
worksheet.range('A' + str(count)).value = time
worksheet.range('B' + str(count)).value = department
worksheet.range('C' + str(count)).value = classes
worksheet.range('D' + str(count)).value = content
worksheet.range('E' + str(count)).value = result
worksheet.range('F' + str(count)).value = goods
worksheet.range('G' + str(count)).value = worker
return worksheet

3.5 pandas庫給表格分類

前面保存的文件只是中間文件,將已經輸出的文件再次讀取,利用sort_values對時間進行排序

def sort_elem():
global progress_point, progress_name_str
progress_point = 90
progress_name_str = 'Excel文件排序中'
data = pd.DataFrame(pd.read_excel(output_path))
data = data.sort_values(by='報修時間', ascending=True)
data.to_excel(output_path, index=False)
shape_excel()

3.6 xlwings給設置表格外觀

將排序好的表格文件讀取進行表格形狀的設置,如果不設置的話會比較不好看,我們可以按照自己想要的樣子進行設置表格屬性,這裡不涉及循環,時間復雜度低。

def shape_excel():
app = xw.App(visible=False, add_book=False)
workbook = app.books.open(output_path)
worksheet = workbook.sheets[0]
value = worksheet.range('A1').expand('down')
value.column_width = 16 # 寬度
value = worksheet.range('D1').expand('down')
value.column_width = 20 # 寬度
value = worksheet.range('E1').expand('down')
value.column_width = 30 # 寬度
value = worksheet.range('F1').expand('down')
value.column_width = 16 # 寬度
value = worksheet.range('A1').expand('right')
value.row_height = 30 # 行高 磅數
value = worksheet.range('A2').expand('table')
value.row_height = 20 # 行高 磅數
workbook.save(output_path)
workbook.close()
app.quit()

4. 發送郵件

以下涉及到的信息都是虛構的,這段代碼借鑒了一些博主的方法

def sendemail():
global progress_point, progress_name_str
progress_name_str = '郵件發送中'
host_server = 'smtp.126.com' # 126郵箱smtp服務器
sender_email = '[email protected]' # 發件人郵箱
pwd = '去郵箱申請一個stmp的授權碼'
receiver = ['1********[email protected]'] # 收件人郵箱
mail_title = '通信信息班7-8月工單匯總' # 郵件標題
mail_content = "大王,本月報表請過目!" # 郵件正文內容
msg = MIMEMultipart()
msg["Subject"] = Header(mail_title, 'utf-8')
msg["From"] = sender_email
msg["To"] = ";".join(receiver)
msg.attach(MIMEText(mail_content, 'plain')) # html/plain
attachment = MIMEApplication(open(output_path, 'rb').read())
attachment["Content-Type"] = 'application/octet-stream'
basename = "通信信息班7-8月工單匯總.xlsx" # 給附件重命名 不能改文件擴展名
attachment.add_header('Content-Disposition', 'attachment',
filename=('utf-8', '', basename)) # 注意:此處basename要轉換為gbk編碼,否則中文會有亂碼。
msg.attach(attachment)
try:
smtp = SMTP_SSL(host_server) # ssl登錄連接到郵件服務器
smtp.set_debuglevel(1) # 0是關閉,1是開啟debug
smtp.ehlo(host_server) # 跟服務器打招呼,告訴它我們准備連接,最好加上這行代碼
smtp.login(sender_email, pwd)
smtp.sendmail(sender_email, receiver, msg.as_string())
smtp.quit()
print("郵件發送成功")
progress_point = 100
progress_name_str = '任務完成'
except smtplib.SMTPException:
print("無法發送郵件")

二、執行結果

1.執行結果

下面表格純屬虛構
這樣的表格

試驗了二十張

匯總結果

郵箱收到了
GUI界面如下

2.代碼

import numpy as np
import time
import pandas as pd
import xlwings as xw
import os
import os.path
import smtplib
import string
from smtplib import SMTP_SSL
from email.mime.text import MIMEText
from email.mime.multipart import MIMEMultipart
from email.header import Header
from email.mime.application import MIMEApplication # 用於添加附件
import PySimpleGUI as sg
import threading
import pythoncom
import win32
complete_flag = 0
count = 1
progress_point = 0
progress_name_str = ''
folder_path = ''
output_path = folder_path + r'\總工單.xlsx'
def init_gui():
sg.theme('SystemDefaultForReal')
layout = [[sg.Text('當前文件夾:'), sg.Text('', key='text_path')],
[sg.Text('任務完成進度')],
[sg.ProgressBar(100, orientation='h', size=(50, 20), key='progressbar')],
[sg.Text('', key='progressname')],
[sg.FolderBrowse('打開文件夾', key='folder', target='text_path'), sg.Button('運行'), sg.Cancel()]]
window = sg.Window('執行進度', layout)
progress_bar = window['progressbar']
progress_name = window['progressname']
return progress_bar, progress_name, window
def init_worksheet():
global progress_point, progress_name_str
progress_point = 10
progress_name_str = 'Excel初始化中'
app = xw.App(visible=False, add_book=False)
workbook = app.books.add()
worksheet = workbook.sheets.add('工作記錄')
worksheet.range('A1').value = '報修時間'
worksheet.range('B1').value = '報修部門'
worksheet.range('C1').value = '報修班組'
worksheet.range('D1').value = '報修內容'
worksheet.range('E1').value = '維修判定'
worksheet.range('F1').value = '材料更換'
worksheet.range('G1').value = '維修人員'
return workbook, worksheet, app
def insert_elem(worksheet, time, department, classes, content, result, goods, worker):
global count
count += 1
worksheet.range('A' + str(count)).value = time
worksheet.range('B' + str(count)).value = department
worksheet.range('C' + str(count)).value = classes
worksheet.range('D' + str(count)).value = content
worksheet.range('E' + str(count)).value = result
worksheet.range('F' + str(count)).value = goods
worksheet.range('G' + str(count)).value = worker
return worksheet
def read_elem(worksheet, filename):
data = pd.DataFrame(pd.read_excel(folder_path + '\\' + filename))
time = data.iloc[1][1] # 行列
content = data.iloc[2][1]
department = data.iloc[0][1]
classes = data.iloc[0][3]
worker = data.iloc[0][5]
result = data.iloc[3][1]
goods = data.iloc[4][1]
worksheet = insert_elem(worksheet, time, department, classes, content, result, goods, worker)
return worksheet
def read_excel(folder_path, worksheet, workbook, app):
global progress_point, progress_name_str
file_list = os.listdir(folder_path)
per_point = 60 / len(file_list)
for i in file_list:
if i == '總工單.xlsx':
continue
progress_point += per_point
progress_name_str = i
worksheet = read_elem(worksheet, i)
workbook.save(output_path)
workbook.close()
app.quit()
return worksheet
def sort_elem():
global progress_point, progress_name_str
progress_point = 90
progress_name_str = 'Excel文件排序中'
data = pd.DataFrame(pd.read_excel(output_path))
data = data.sort_values(by='報修時間', ascending=True)
data.to_excel(output_path, index=False)
shape_excel()
def shape_excel():
app = xw.App(visible=False, add_book=False)
workbook = app.books.open(output_path)
worksheet = workbook.sheets[0]
value = worksheet.range('A1').expand('down')
value.column_width = 16 # 寬度
value = worksheet.range('D1').expand('down')
value.column_width = 20 # 寬度
value = worksheet.range('E1').expand('down')
value.column_width = 30 # 寬度
value = worksheet.range('F1').expand('down')
value.column_width = 16 # 寬度
value = worksheet.range('A1').expand('right')
value.row_height = 30 # 行高 磅數
value = worksheet.range('A2').expand('table')
value.row_height = 20 # 行高 磅數
workbook.save(output_path)
workbook.close()
app.quit()
def sendemail():
global progress_point, progress_name_str
progress_name_str = '郵件發送中'
host_server = 'smtp.126.com' # 126郵箱smtp服務器
sender_email = '[email protected]' # 發件人郵箱
pwd = '**********'
receiver = ['1*******[email protected]'] # 收件人郵箱
mail_title = '通信信息班7-8月工單匯總' # 郵件標題
mail_content = "大王,本月報表請過目!" # 郵件正文內容
msg = MIMEMultipart()
msg["Subject"] = Header(mail_title, 'utf-8')
msg["From"] = sender_email
msg["To"] = ";".join(receiver)
msg.attach(MIMEText(mail_content, 'plain')) # html/plain
attachment = MIMEApplication(open(output_path, 'rb').read())
attachment["Content-Type"] = 'application/octet-stream'
basename = "通信信息班7-8月工單匯總.xlsx" # 給附件重命名 不能改文件擴展名
attachment.add_header('Content-Disposition', 'attachment',
filename=('utf-8', '', basename)) # 注意:此處basename要轉換為gbk編碼,否則中文會有亂碼。
msg.attach(attachment)
try:
smtp = SMTP_SSL(host_server) # ssl登錄連接到郵件服務器
smtp.set_debuglevel(1) # 0是關閉,1是開啟debug
smtp.ehlo(host_server) # 跟服務器打招呼,告訴它我們准備連接,最好加上這行代碼
smtp.login(sender_email, pwd)
smtp.sendmail(sender_email, receiver, msg.as_string())
smtp.quit()
print("郵件發送成功")
progress_point = 100
progress_name_str = '任務完成'
except smtplib.SMTPException:
print("無法發送郵件")
def work():
global complete_flag
pythoncom.CoInitialize()
workbook, worksheet, app = init_worksheet()
worksheet = read_excel(folder_path, worksheet, workbook, app)
sort_elem()
sendemail()
complete_flag = 2
if __name__ == "__main__":
progress_bar, progress_name, window = init_gui()
worker_task = threading.Thread(target=work)
worker_task.setDaemon(True)
while True:
event, values = window.read(timeout=10)
time.sleep(0.1)
if event == 'Cancel' or complete_flag == 2:
break
if values['folder']:
folder_path = values['folder']
complete_flag = 1
if event == '運行' and complete_flag == 1:
worker_task.start()
if event == '運行' and complete_flag == 0:
sg.popup('請先選擇文件夾後再點擊運行', title='Error', auto_close=True, auto_close_duration=5, grab_anywhere=True)
progress_bar.update_bar(progress_point)
progress_name.update(progress_name_str)

  1. 上一篇文章:
  2. 下一篇文章:
Copyright © 程式師世界 All Rights Reserved