
[Python Excel automation] Splitting and merging Excel files with pandas


As the saying goes, what has long been divided must unite, and what has long been united must divide; the same holds for Excel data tables. Splitting and merging Excel tables are routine office tasks. Doing them by hand is not difficult, but with large volumes of data the repetition quickly becomes exasperating. With Python's pandas library, splitting and merging Excel tables can be automated. The examples below share code snippets I use in practice. (If you know a better way, corrections and suggestions are welcome.)


Split: vertical split

Data exported from a data platform (such as a questionnaire platform) is usually a list-style table in which each row is one record, so with a lot of data the table gets very "long". Sometimes a summary table has to be split into separate Excel files according to the distinct values in one column.

[Figure: schematic of the vertical split]

Splitting one worksheet into multiple Excel files

import os

import pandas as pd


def to_excelByColName(sourceDf, colName, outPath, excelName):
    '''
    Vertical split: split one worksheet into multiple Excel files.
    Writes one Excel file per distinct value in the given column.
        sourceDf: the source DataFrame
        colName: name of the column to split on
        outPath: output directory (created if it does not exist)
        excelName: file-name suffix, including the .xlsx extension
    '''
    os.makedirs(outPath, exist_ok=True)
    colNameList = sourceDf[colName].drop_duplicates().tolist()
    for eachColName in colNameList:
        # str() guards against non-string values such as integer class numbers
        fileName = str(eachColName) + excelName
        subset = sourceDf[sourceDf[colName] == eachColName]
        subset.to_excel(os.path.join(outPath, fileName), index=False)

For example, split a summary table of 1,000 students in 20 classes into 20 Excel files, one per class.

Here sourceDf is the summary DataFrame, loaded beforehand. Calling the to_excelByColName function looks like this:

to_excelByColName(sourceDf=sourceDf, colName="class", outPath="./class data sheets", excelName="generated data sheet.xlsx")

[Figure: result of the vertical split into files]
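The same split can also be expressed with groupby, which walks the table once instead of filtering it once per value. The following is only a sketch under assumptions: the names plan_split and split_to_files and the sample students table are hypothetical and not part of the original code, and writing .xlsx files assumes an Excel engine such as openpyxl is installed.

```python
from pathlib import Path

import pandas as pd


def plan_split(df, col, suffix=".xlsx"):
    """Yield (file_name, sub_frame) pairs, one per distinct value in `col`."""
    # sort=False keeps the groups in order of first appearance
    for value, group in df.groupby(col, sort=False):
        yield f"{value}{suffix}", group


def split_to_files(df, col, out_dir):
    """Write each group to its own workbook and return the paths written."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for file_name, group in plan_split(df, col):
        target = out / file_name
        group.to_excel(target, index=False)  # requires an Excel engine
        written.append(target)
    return written


# Hypothetical sample data standing in for the students table
students = pd.DataFrame({
    "class": ["Class 1", "Class 2", "Class 1"],
    "name": ["Ann", "Bob", "Cid"],
})

# Preview which files would be written and how many rows each would hold
plan = {name: len(group) for name, group in plan_split(students, "class")}
print(plan)
```

Separating the planning step from the writing step makes it possible to inspect the file names before anything touches the disk. Note that groupby skips rows whose key is missing by default.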

Splitting one worksheet into multiple sheets of one file

def to_excelByColNameWithSheets(sourceDf, colName, outPath):
    '''
    Vertical split: split one worksheet into multiple sheets of one file.
    Writes one sheet per distinct value in the given column, all in a single Excel file.
        sourceDf: the source DataFrame
        colName: name of the column to split on
        outPath: output file path, including the .xlsx extension
    '''
    colNameList = sourceDf[colName].drop_duplicates().tolist()
    # The context manager saves and closes the file on exit;
    # ExcelWriter.save() is no longer available in pandas 2.x.
    with pd.ExcelWriter(outPath) as writer:
        for eachColName in colNameList:
            subset = sourceDf[sourceDf[colName] == eachColName]
            # Sheet names must be strings
            subset.to_excel(writer, sheet_name=str(eachColName))

For example, split a summary table of 1,000 students in 20 classes into 20 sheets of a single Excel file, one sheet per class.

Calling the to_excelByColNameWithSheets function looks like this:

to_excelByColNameWithSheets(sourceDf=sourceDf, colName="class", outPath="./class data sheets/generated data sheet.xlsx")

[Figure: the generated workbook]
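Because to_excelByColNameWithSheets takes its sheet names straight from the column values, it is worth remembering that Excel limits sheet names to 31 characters and rejects the characters [ ] : * ? / \ in them. A small helper can clean the values first; safe_sheet_name is a hypothetical name introduced here for illustration, not something from the original article.

```python
import re


def safe_sheet_name(value, max_len=31):
    """Turn an arbitrary cell value into a string Excel accepts as a sheet name."""
    # Replace the characters Excel forbids in sheet names with underscores
    name = re.sub(r"[\[\]:*?/\\]", "_", str(value)).strip()
    # Truncate to Excel's length limit; fall back to a placeholder if empty
    return name[:max_len] or "Sheet"


print(safe_sheet_name("6/1 [A]"))
print(len(safe_sheet_name("x" * 40)))
```

This sketch does not handle collisions: two long values that share their first 31 characters would map to the same sheet name, so a real script might need to append a counter.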

Split: horizontal split

Processing data sometimes calls for adding several auxiliary columns, which makes the table "wider" and wider. In the end only a few key columns are needed, which means splitting the data horizontally: extracting some columns and saving them as a separate table. A horizontal split only requires passing a list of column names to the DataFrame.

For example, to keep only the name and class fields of the table, write:

df1 = sourceDf[["name", "class"]]
df1.to_excel("names and classes only.xlsx")
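When the column list comes from elsewhere (a config file, user input), a misspelled name only surfaces as a KeyError from deep inside pandas. The sketch below wraps the selection in a small check; select_columns and the sample table are hypothetical names used for illustration.

```python
import pandas as pd


def select_columns(df, columns):
    """Return a copy holding only `columns`, failing early if any are missing."""
    missing = [c for c in columns if c not in df.columns]
    if missing:
        raise KeyError(f"columns not found: {missing}")
    return df.loc[:, columns].copy()


# Hypothetical wide table with one auxiliary column
wide = pd.DataFrame({
    "name": ["Ann", "Bob"],
    "class": ["Class 1", "Class 2"],
    "helper_rank": [1, 2],
})

narrow = select_columns(wide, ["name", "class"])
print(list(narrow.columns))
```

If the goal is only to write fewer columns, to_excel also accepts a columns argument, and read_excel accepts usecols to avoid loading unwanted columns in the first place.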

Merge: vertical merge

Data sets with the same structure can be concatenated vertically during processing so that they can be handled together.

[Figure: schematic of the vertical merge]

Merging multiple Excel files into one worksheet

def readExcelFilesByNames(fpath, fileNameList=None, header=0):
    '''
    Vertical merge: merge multiple Excel files into one worksheet.
    Reads the named Excel files under a directory and concatenates them into one DataFrame.
    Every file must use the same table layout.
        1. fpath: required; the directory holding the Excel files, without a file name
        2. fileNameList: names of the Excel files to read
        3. header: index of the row to use as the column header
    '''
    # The default is None rather than a mutable [] shared between calls
    frames = []
    for fileName in fileNameList or []:
        frames.append(pd.read_excel('/'.join([fpath, fileName]), header=header))
    # Concatenate once at the end instead of growing a DataFrame inside the loop
    return pd.concat(frames) if frames else pd.DataFrame()

For example, merge the Excel files of 20 classes into one data table.

Calling the readExcelFilesByNames function looks like this:

# Builds "Grade 6 Class 1 data sheet.xlsx" through "Grade 6 Class 20 data sheet.xlsx"
fileNameList = [f"Grade 6 Class {i} data sheet.xlsx" for i in range(1, 21)]
readExcelFilesByNames(fpath="./class data sheets", fileNameList=fileNameList)

[Figure: result of merging the tables]
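Typing out every file name gets tedious as the number of files grows. One alternative, sketched here under assumptions, is to collect the workbooks in a folder with pathlib and record which file each row came from. The names list_workbooks, merge_workbooks and the source_file column are hypothetical; reading .xlsx files assumes an Excel engine such as openpyxl is installed.

```python
from pathlib import Path

import pandas as pd


def list_workbooks(folder, pattern="*.xlsx"):
    """Return matching workbook paths in name order, skipping Excel lock files."""
    # Excel creates "~$name.xlsx" owner files while a workbook is open
    return sorted(
        p for p in Path(folder).glob(pattern) if not p.name.startswith("~$")
    )


def merge_workbooks(folder, header=0):
    """Read every workbook in `folder` and stack them into one DataFrame."""
    frames = []
    for path in list_workbooks(folder):
        frame = pd.read_excel(path, header=header)  # requires an Excel engine
        frame["source_file"] = path.name            # remember where each row came from
        frames.append(frame)
    return pd.concat(frames, ignore_index=True) if frames else pd.DataFrame()
```

A call such as merge_workbooks("./class data sheets") would then stand in for the explicit file list. The sort is plain string order, so "Class 10" comes before "Class 2"; that only affects row order in the merged table.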

Merging multiple sheets into one worksheet

def readExcelBySheetsNames(fpath, header=0, prefixStr="", sheetNameStr="sheetName", prefixNumStr="prefixNum"):
    '''
    Vertical merge: merge multiple sheets into one worksheet.
    Reads every sheet of an Excel file and returns them concatenated as one DataFrame.
    Every sheet must use the same table layout.
        1. fpath: required; path to the Excel file, including the file name
        2. Two columns are added to ease later processing:
            sheetName: the name of the sheet each row came from
            prefixNum: a running sheet number, prefixed with prefixStr
        3. header: index of the row to use as the column header
    '''
    frames = []
    # The context manager closes the workbook even if parsing fails
    with pd.ExcelFile(fpath) as xl:
        # xl.sheet_names lists every sheet in the file
        for num, sheetName in enumerate(xl.sheet_names, start=1):
            data = xl.parse(sheetName, header=header)
            # Add the sheet-name column and the counter column
            data[sheetNameStr] = sheetName
            data[prefixNumStr] = prefixStr + str(num)
            # Note: dropna() discards every row that has any empty cell
            frames.append(data.dropna())
    # Concatenate once at the end instead of growing a DataFrame inside the loop
    return pd.concat(frames) if frames else pd.DataFrame()

Calling readExcelBySheetsNames looks like this:

readExcelBySheetsNames(fpath="./class data sheets/summary data sheet.xlsx", sheetNameStr="sheet name", prefixNumStr="sheet number")

[Figure: result of merging the sheets]
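pandas can also load every sheet in one call: read_excel with sheet_name=None returns a dictionary mapping sheet names to DataFrames. The sketch below builds on that; combine_sheets and read_all_sheets are hypothetical names, and unlike the function above this version keeps rows with empty cells and adds no counter column.

```python
import pandas as pd


def combine_sheets(sheets, sheet_col="sheetName"):
    """Stack a {sheet_name: DataFrame} mapping, tagging each row with its sheet."""
    frames = [frame.assign(**{sheet_col: name}) for name, frame in sheets.items()]
    return pd.concat(frames, ignore_index=True) if frames else pd.DataFrame()


def read_all_sheets(path, header=0):
    """Read every sheet of one workbook and return them as a single DataFrame."""
    # sheet_name=None asks read_excel for a dict holding every sheet
    return combine_sheets(pd.read_excel(path, sheet_name=None, header=header))


# In-memory stand-in for the dict that read_excel would return
demo = {
    "Class 1": pd.DataFrame({"name": ["Ann", "Bob"]}),
    "Class 2": pd.DataFrame({"name": ["Cid"]}),
}
combined = combine_sheets(demo)
print(combined)
```

Keeping the concatenation in its own function means it can be tried on plain DataFrames, as in the demo, without needing a workbook on disk.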

Merge: horizontal merge

Merging different Excel worksheets horizontally mostly means joining them on certain columns (such as name or ID number). In pandas this is done with the merge method. It is very convenient to use, but a full treatment would run long, so the details are left for a follow-up article.

DataFrame.merge(right, how='inner', on=None, left_on=None, right_on=None, left_index=False, right_index=False, sort=False, suffixes=('_x', '_y'), copy=True, indicator=False, validate=None)
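As a brief illustration of a few of these parameters, the sketch below joins two small tables on a name column. The roster and scores tables are made-up sample data, not from the original article.

```python
import pandas as pd

# Made-up sample tables that share a "name" column
roster = pd.DataFrame({
    "name": ["Ann", "Bob", "Cid"],
    "class": ["Class 1", "Class 1", "Class 2"],
})
scores = pd.DataFrame({
    "name": ["Ann", "Bob", "Dee"],
    "score": [90, 85, 70],
})

# how="left" keeps every roster row; indicator=True adds a "_merge" column
# showing where each row was found; validate raises if a name repeats in either table
merged = roster.merge(
    scores, on="name", how="left", indicator=True, validate="one_to_one"
)
print(merged)

# Roster entries that have no matching score
unmatched = merged.loc[merged["_merge"] == "left_only", "name"].tolist()
print(unmatched)
```

The indicator column is a cheap way to audit a join: rows marked left_only are the records that failed to match, which in practice often points to typos or stray whitespace in the key column.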

Conclusion

The approach to handling Excel files in Python described in this article is built on the pandas library and targets list-style data tables. List-style data tables are described in detail in the following article:

https://www.cnblogs.com/wansq/p/15869594.html

Splitting a data table is mainly about saving (writing) files; from the program's point of view it is the output stage.

Merging data tables is mainly about opening (reading) files; from the program's point of view it is the input stage.

The code above pays off when large numbers of tables have to be split and merged repeatedly; for an occasional, small job, clicking with the mouse may well be faster.

No technique is inherently good or bad; what matters is using it flexibly.

