程序師世界是廣大編程愛好者互助、分享、學習的平台,程序師世界有你更精彩!
首頁
編程語言
C語言|JAVA編程
Python編程
網頁編程
ASP編程|PHP編程
JSP編程
數據庫知識
MYSQL數據庫|SqlServer數據庫
Oracle數據庫|DB2數據庫
您现在的位置: 程式師世界 >> 編程語言 >  >> 更多編程語言 >> Python

Python processes two excel and writes a new Excel based on the same fields

編輯:Python

In the work of data processing , I often meet many Excel file , Then the content is different , Associated fields , They need to be merged together , Then generate a new file and put it in a new excel in .

# 1. First, we need to encapsulate two classes

① One class is dedicated to reading Excel Of , Of course, it can also be encapsulated and written to Excel, If you want to add

② Second, we can encapsulate another class , Take two. Excel All the fields of the table are traversed , Then combine them into a dictionary of nested lists , Then insert

Create a new one

handle_excel.py
import openpyxl
from openpyxl import load_workbook
class EncapsulationExcel:
""" encapsulation Excel
"""
def __init__(self, filename, sheet=None):
self.filename, self.sheet = filename, sheet
def operation_excel(self):
""" operation Excel get data
:return:
"""
# 1. open Excel
wb = load_workbook(self.filename)
# 2. Locate the form
if self.sheet is None:
ws = wb.active
else:
ws = wb[self.sheet]
# 3. Get table data
values = tuple(ws.iter_rows(min_row=1, max_row=1, values_only=True))
sheet_head_tuple = values[0]
cases_list = [] # Data dictionary information , Store in list , This list is the nested dictionary list
for data in tuple(ws.iter_rows(min_row=2, values_only=True)):
cases_list.append(dict(zip(sheet_head_tuple, data)))
return cases_list
def write_result(self, row, col, result):
""" operation Excel data , Write data
"""
other_wb = load_workbook(self.filename)
if self.sheet is None:
other_ws = other_wb.active
else:
other_ws = other_wb[self.sheet]
# if isinstance(row, int) and (2 <= row <= other_ws.max_row):
if isinstance(row, int) and (row >= 2):
# row Said line ,column The column ,value Indicates the insertion value
other_ws.cell(row=row, column=col, value=result)
other_wb.save(self.filename)
else:
print(" The line number passed in is incorrect , Line number should be greater than 1 The integer of ")
class OperationExcel(object):
""" Handle Excel data , And write
"""
def __init__(self, sheet):
"""
Get from nested dictionary Excel The header
:param sheet: Define a sheet The name of
"""
self.wb = openpyxl.Workbook()
self.ws = self.wb.create_sheet(index=0, title=sheet)
def get_title(self, data):
"""
:param data: Pass in json Nested Dictionary of data
:return: Returns the titles of all dictionaries (keys)
"""
title_list = []
for dic in data:
for key in dic:
if key not in title_list:
title_list.append(key)
return title_list
def handle_data(self, title_list, data, filename):
"""
:param data: call get_title Returned title list
:param data: Pass in json Nested Dictionary of data
:return: Returns the titles of all dictionaries (keys)
"""
# 1. Insert the title of the list Excel The first line of
first_row = 1
for header in title_list:
col = title_list.index(header)
self.ws.cell(first_row, col + 1, header)
# 2. Put all the values , Filter by title , Insert excel
row = 2
for player in data:
for _key, _value in player.items():
col = title_list.index(_key)
self.ws.cell(row, col + 1, _value)
row += 1 # enter the next row
self.wb.save(filename)
self.wb.close()
if __name__ == '__main__':
filename = " Shanghai data .xlsx"
sheet = " Data cleaning "
data = [{' welcome ': 4, " immediately ": 3, " Sign in ": " success "}, {' welcome ': 5, " immediately ": 7, " sign out ": " success "}, {' welcome ': 9, " immediately ": 8, " register ": " Failure "}]
A = OperationExcel(sheet)
title_list = A.get_title(data)
# Data cleaning , write in Excel
A.handle_data(title_list=title_list,
data=data,
filename=filename)

# 2. We need to process data , Import the code of the file above

from handle_excel import EncapsulationExcel, OperationExcel
class ExcelEncapsulation(object):
def __init__(self, file_name_1, file_name_2):
""" Initialization data
"""
self.file_name1 = file_name_1
self.file_name2 = file_name_2
def write_excel_data(self, field_name_1, field_name_2, filename, sheetname):
one_excel = EncapsulationExcel(filename=self.file_name1)
two_excel = EncapsulationExcel(filename=self.file_name2)
A = one_excel.operation_excel()
B = two_excel.operation_excel()
list_data = []
for i in A:
for j in B:
if i[field_name_1] == j[field_name_2]:
j.update(i)
list_data.append(j)
C = OperationExcel(sheetname)
title_list = C.get_title(list_data)
# Data cleaning , write in Excel
C.handle_data(title_list=title_list,
data=list_data,
filename=filename)
if __name__ == '__main__':
# To compare Excel file name
file_name_one = " end 20220612 Payment certificate user information .xlsx"
file_name_two = ' Cost data _1654850871000.xlsx'
# Two to compare Excel The same field name of
field_name_one = ' Buy UIN'
field_name_two = 'uin'
# Create a new Excel The name of the file
file_name = " file name .xlsx"
# Create a new sheet name
sheet_name = " Data cleaning "
ExcelEncapsulation(file_name_one, file_name_two).write_excel_data(field_name_one, field_name_two, file_name,
sheet_name)

Execution completed , Generate new files

 


  1. 上一篇文章:
  2. 下一篇文章:
Copyright © 程式師世界 All Rights Reserved