Processing data with Python: Excel-style features

Editor: Python

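I. Data source

1. Reading data

import pandas as pd
# Read the workbook into a DataFrame; replace the path with your own file.
JL_data = pd.read_excel(io='path/data_source.xlsx')

# read_excel already returns a DataFrame, so this wrapping step is optional.
data = pd.DataFrame(JL_data)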

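2. Creating data

1) Rows and columns

import numpy as np
import pandas as pd
# The shape passed to size must match the number of index and column labels (2 x 2 here).
data = pd.DataFrame(
    np.random.randint(low=0, high=6, size=(2, 2)),
    columns=['col1', 'col2'],
    index=['row1', 'row2']
)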

2) Columns

# Build a DataFrame from a dict that maps column names to values.
data = pd.DataFrame({'key1': list('aabba'),
                     'key2': ['one', 'two', 'one', 'two', 'one'],
                     'data1': np.random.randn(5),
                     'data2': np.random.randn(5)})

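II. Data access

1. Single column: data["lng"]

2. Multiple columns:

xl = data.iloc[:, 1:]  # Select all rows, from the second column through the last.
df = data.iloc[1:]  # Select the second row through the last, keep all columns, and store the result in a new variable.

3. Single value:

data["lng"][0] == data.loc[0, "lng"]  # Both forms return the value in row 0 of column "lng" (with the default integer index).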
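The access patterns above can be tried on a small table. This is a minimal sketch; the sample values and the "name" and "lat" columns are made up for illustration and do not come from the original data.

```python
import pandas as pd

# Hypothetical sample data; only "lng" appears in the article.
data = pd.DataFrame({
    "name": ["a", "b", "c"],
    "lng": [116.4, 121.5, 113.3],
    "lat": [39.9, 31.2, 23.1],
})

single_col = data["lng"]            # one column, returned as a Series
from_second_col = data.iloc[:, 1:]  # all rows, second column onward
from_second_row = data.iloc[1:]     # second row onward, all columns
single_value = data.loc[0, "lng"]   # row label 0, column "lng"
```

iloc selects by position, while loc selects by label, so the two only line up when the index is the default 0, 1, 2, ... range.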

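III. Feature construction

Build new features from the existing data.

1. Calculation: arithmetic between two columns

data['new'] = data['col1'] OP data['col2'], where OP is one of: addition +, subtraction -, multiplication *, division /, remainder %, floor division //, power **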
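As a small illustration of column arithmetic, the sketch below applies each operator to two made-up columns, "a" and "b" (hypothetical names, not from the original data). Each operation is applied row by row.

```python
import pandas as pd

# Hypothetical sample columns.
data = pd.DataFrame({"a": [7, 8, 9], "b": [2, 4, 3]})

data["sum"] = data["a"] + data["b"]         # addition
data["diff"] = data["a"] - data["b"]        # subtraction
data["product"] = data["a"] * data["b"]     # multiplication
data["ratio"] = data["a"] / data["b"]       # true division (float result)
data["remainder"] = data["a"] % data["b"]   # remainder
data["quotient"] = data["a"] // data["b"]   # floor division
data["power"] = data["a"] ** data["b"]      # exponentiation
```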

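2. Defining indicators

1) A keyword as the indicator

import re
data['new'] = ''
for j in range(len(data['field1'])):
    # Find every occurrence of the keyword in this row and join the matches into one string.
    matches = re.findall(r'keyword', str(data['field1'][j]))
    data.loc[j, 'new'] = ','.join(matches)
data['new']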
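The same keyword extraction can also be written without an explicit loop by using the pandas string methods. This is a sketch of an alternative, not the original author's approach; the sample text and the keyword "apple" are made up.

```python
import re

import pandas as pd

# Hypothetical sample text column.
data = pd.DataFrame({"field1": ["red apple", "green pear", "apple pie and apple jam"]})

keyword = "apple"  # illustrative keyword
# re.escape treats the keyword literally, even if it contains regex metacharacters.
data["new"] = data["field1"].str.findall(re.escape(keyword)).str.join(",")
# A simple True/False flag is often enough as an indicator.
data["has_keyword"] = data["field1"].str.contains(keyword, regex=False)
```

Rows without a match get an empty string in "new" and False in "has_keyword".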

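2) The content after a specific symbol as the indicator

Use a regular expression with a lookbehind, replacing special_symbol with the actual delimiter:

(?<=special_symbol).*$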
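To show how the lookbehind pattern behaves, the sketch below assumes the symbol is a colon and uses made-up strings. Note that Series.str.extract needs a capture group, so the pattern gains parentheses there.

```python
import re

import pandas as pd

# Hypothetical sample data; ":" stands in for the special symbol.
data = pd.DataFrame({"field1": ["id:1001", "code:AB-7", "no separator"]})

pattern = r"(?<=:).*$"  # everything after the first ":" up to the end of the string

# On a single string with the re module.
match = re.search(pattern, "id:1001")
value = match.group(0) if match else None

# On a whole column; rows without the symbol become NaN.
data["new"] = data["field1"].str.extract(r"(?<=:)(.*)$", expand=False)
```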

3) A value range as the indicator

data['new'] = ''
data.loc[data['field1'] - data['field2'] > 0, 'new'] = 'class1'
data.loc[data['field1'] - data['field2'] <= 0, 'new'] = 'class2'
data['new'].value_counts()
data['new']
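A two-way split like the one above can also be expressed with numpy.where, and more than two ranges with pandas.cut. Both are alternatives swapped in for illustration, not the original code; the sample values, the "band" column, and the bin edges are assumptions.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data.
data = pd.DataFrame({"field1": [5, 2, 9, 4], "field2": [3, 2, 10, 1]})
diff = data["field1"] - data["field2"]

# Two classes: positive difference versus zero or negative.
data["new"] = np.where(diff > 0, "class1", "class2")
counts = data["new"].value_counts()

# Three ranges: (-inf, 0], (0, 2], (2, inf); intervals are closed on the right.
data["band"] = pd.cut(diff, bins=[-np.inf, 0, 2, np.inf], labels=["low", "mid", "high"])
```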

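IV. Pivot tables

Flexible: the analysis and calculations can be customized freely.

Clear structure that makes the data easy to understand.

Highly practical: a powerful tool for building reports.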

1. Introduction to pivot_table

pivot_table(data, values=None, index=None, columns=None, aggfunc='mean', fill_value=None, margins=False, dropna=True, margins_name='All')

The four most important parameters of pivot_table are index, values, columns, and aggfunc. This article explains pivot operations around these four parameters.

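1) index: the grouping field(s), equivalent to the rows of an Excel pivot table. Several fields can be passed to form multiple levels.

pd.pivot_table(data, index=['field1'])
pd.pivot_table(data, index=['field1', 'field2'])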

2) columns: like index, it sets grouping fields, but along the columns. It is not required; it is an optional way to split the data.

pd.pivot_table(df, index=['field1'], columns=['field2'])

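3) values: the metrics to aggregate within the index groups. With the default aggfunc='mean' these must be numeric fields.

pd.pivot_table(data, index=['field1', 'field2'], values=['metric1', 'metric2', 'metric3'])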
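The index, columns, and values parameters can be tried together on a small table. This is a minimal sketch with made-up data; "region", "product", "sales", and "qty" are hypothetical column names. Both calls rely on the default aggregation, which is the mean.

```python
import pandas as pd

# Hypothetical sales records.
df = pd.DataFrame({
    "region": ["east", "east", "west", "west", "east"],
    "product": ["A", "B", "A", "B", "A"],
    "sales": [10, 20, 30, 40, 50],
    "qty": [1, 2, 3, 4, 5],
})

# One row per region, mean of each value column.
by_region = pd.pivot_table(df, index=["region"], values=["sales", "qty"])

# Regions as rows, products as columns, mean sales in the cells.
by_region_product = pd.pivot_table(
    df, index=["region"], columns=["product"], values="sales"
)
```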

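4) aggfunc: sets the function(s) applied when aggregating the data.

When aggfunc is not set, it defaults to aggfunc='mean' and computes the mean. A list of functions can be passed to compute several aggregates at once:

pd.pivot_table(df, index=['field1', 'field2'], values=['metric1', 'metric2', 'metric3'], aggfunc=[np.sum, np.mean])

Other options include min, sum, max, count, mean, and median.

5) fill_value replaces missing values in the result, for example fill_value=0. Setting margins=True (or margins=1) adds summary totals.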
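The sketch below illustrates a list of aggregation functions, fill_value, and margins on made-up data (the column names are hypothetical). The functions are passed as strings here, which pandas accepts in place of the NumPy functions.

```python
import pandas as pd

# Hypothetical sales records; the west region has no sales of product B.
df = pd.DataFrame({
    "region": ["east", "east", "west"],
    "product": ["A", "B", "A"],
    "sales": [10, 20, 30],
})

# Several aggregates at once; the result has one column per function.
multi = pd.pivot_table(df, index="region", values="sales", aggfunc=["sum", "mean"])

# Empty cells become 0, and an "All" row and column hold the totals.
table = pd.pivot_table(
    df, index="region", columns="product", values="sales",
    aggfunc="sum", fill_value=0, margins=True, margins_name="All",
)
```

Without fill_value, the missing west/B cell would be NaN rather than 0.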

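6) Types

aggfunc can also be a dict that maps each column to its aggregation function. If the dict's contents do not match values, the dict reportedly takes precedence.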
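A dict-valued aggfunc might look like the sketch below, which uses made-up data and hypothetical column names. The dict keys are kept identical to values here, so the example does not depend on how a mismatch is resolved.

```python
import pandas as pd

# Hypothetical sales records.
df = pd.DataFrame({
    "region": ["east", "east", "west"],
    "sales": [10, 20, 30],
    "qty": [1, 2, 3],
})

# A different aggregation per column: total sales, largest quantity.
table = pd.pivot_table(
    df, index="region", values=["sales", "qty"],
    aggfunc={"sales": "sum", "qty": "max"},
)
```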

To convert a pandas DataFrame to an array: df = df.values

2. groupby

1) Aggregation

data.groupby('field')
data.groupby(['field1', 'field2'])

Group totals broadcast back to every row: data.groupby(['field1', 'field2'])['metric1'].transform('sum')

data = data.groupby(['field1', 'field2'])[['metric1', 'metric2']].sum().sort_values(['field1', 'field2'], ascending=False).reset_index()
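The difference between transform and a plain aggregation is easier to see on a small table. This is a minimal sketch with made-up values; "group_total" and "summary" are illustrative names.

```python
import pandas as pd

# Hypothetical sample data.
data = pd.DataFrame({
    "field1": ["x", "x", "y", "y"],
    "field2": ["p", "p", "p", "q"],
    "metric1": [1, 2, 3, 4],
    "metric2": [10, 20, 30, 40],
})

# transform keeps one value per original row: each row gets its group's total.
data["group_total"] = data.groupby(["field1", "field2"])["metric1"].transform("sum")

# sum collapses each group to one row; sort descending, then restore a plain index.
summary = (
    data.groupby(["field1", "field2"])[["metric1", "metric2"]]
    .sum()
    .sort_values(["field1", "field2"], ascending=False)
    .reset_index()
)
```

Selecting several columns after groupby needs a list inside the brackets, i.e. double square brackets.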

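2) Grouping with a dict

mapping = {'field1': 'class1', 'field2': 'class1', 'field3': 'class2'}

# Group the columns of data by mapping and average within each group.
# Older pandas versions also accept data.groupby(mapping, axis=1).mean().
data = data.T.groupby(mapping).mean().T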
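Grouping columns through a dict can be checked on a tiny table. This sketch uses made-up numbers and transposes the frame so the column names become row labels that groupby can map; the variable name "result" is illustrative.

```python
import pandas as pd

# Hypothetical sample data with three numeric columns.
data = pd.DataFrame({
    "field1": [1.0, 2.0],
    "field2": [3.0, 4.0],
    "field3": [5.0, 6.0],
})
mapping = {"field1": "class1", "field2": "class1", "field3": "class2"}

# Transpose, group the former column names by their class, average, transpose back.
result = data.T.groupby(mapping).mean().T
```

Each row of result holds the per-class mean of that row's original columns.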

V. Saving data

Save the processed result to an Excel file:

data.to_excel(r'path/data_structure.xlsx')
