
How to turn multiple rows into one row in pandas (concatenating and aggregating text)

Editor: Python

1. Scenario introduction

When summarizing data, we sometimes need to convert it from its pre-processing form, with several rows per name, into its post-processing form, with one row per name and the text values joined together.
Before processing:

After processing:

2. Implementation method

Prepare the data in advance:

import pandas as pd

# Prepare the data: one row per (name, managed area) pair
df = pd.DataFrame({
    ' full name ': ['A', 'A', 'B', 'B', 'C', 'C', 'C'],
    ' department ': ['Sales department', 'Sales department', 'Sales department', 'Sales department',
                     'Personnel department', 'Personnel department', 'Personnel department'],
    ' Management area ': ['South China', 'North China', 'Central China', 'East China',
                          'South China', 'North China', 'Central China'],
})
df

2.1 Basic implementation

# Group by name and join the unique values of each text column with a comma
result = df.groupby(' full name ').agg(
    Department=(' department ', lambda x: ','.join(x.unique())),
    Managed_area=(' Management area ', lambda x: ','.join(x.unique())),
).reset_index()
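
To illustrate why the aggregation calls unique() before joining, here is a minimal sketch on hypothetical toy data (the frame `toy` and its column names are assumptions made for this example, not the article's dataset). It compares joining the de-duplicated values with joining the raw values.

```python
import pandas as pd

# Hypothetical toy data, chosen so that group "A" contains repeated values.
toy = pd.DataFrame({
    "name": ["A", "A", "A", "B"],
    "dept": ["Sales", "Sales", "Sales", "HR"],
    "area": ["South", "North", "South", "East"],
})

# Join only the distinct values; Series.unique() keeps order of first appearance.
with_unique = toy.groupby("name").agg(
    dept=("dept", lambda x: ",".join(x.unique())),
    area=("area", lambda x: ",".join(x.unique())),
).reset_index()

# Join every value as-is, so repeats are carried into the result.
without_unique = toy.groupby("name").agg(
    dept=("dept", lambda x: ",".join(x)),
    area=("area", lambda x: ",".join(x)),
).reset_index()

print(with_unique)
print(without_unique)
```

In this sketch, group A yields "Sales" and "South,North" with unique(), but "Sales,Sales,Sales" and "South,North,South" without it.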

2.2 Further exploration

The approach in 2.1 produces the expected result, but when many fields need to be aggregated we have to write an anonymous function for each one, which is inconvenient. How can we avoid this?

def string_concat(column_name, sep=','):
    # Join the unique values of a column into a single string
    return sep.join(column_name.unique())

result = df.groupby(' full name ').agg(
    Department=(' department ', string_concat),
    Managed_area=(' Management area ', string_concat),
).reset_index()
result
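
When the number of text columns grows, even listing each named aggregation by hand becomes repetitive. One possible way around that, sketched here on hypothetical toy data (`toy`, `text_columns`, and the column names are assumptions for illustration), is to build the keyword arguments for agg() with a dict comprehension so that one named function is reused for every column.

```python
import pandas as pd

def string_concat(column, sep=","):
    # Join the unique values of a column into a single string.
    return sep.join(column.unique())

# Hypothetical toy data with several text columns to aggregate.
toy = pd.DataFrame({
    "name": ["A", "A", "B"],
    "dept": ["Sales", "Sales", "HR"],
    "area": ["South", "North", "East"],
    "level": ["L1", "L2", "L1"],
})

# Columns to aggregate; each output column keeps the name of its input column.
text_columns = ["dept", "area", "level"]

result = toy.groupby("name").agg(
    **{col: (col, string_concat) for col in text_columns}
).reset_index()

print(result)
```

Adding another text column then only requires extending text_columns.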

2.3 A further improvement

Section 2.2 removes the need to write anonymous functions repeatedly. In practice, however, different columns may need different separators, and a single shared function does not handle that conveniently. How can we solve this?

def custom_str_cat(sep='|'):
    # Build an aggregation function that joins unique values with `sep`
    def str_cat(column_name):
        return sep.join(column_name.unique())
    # Always return the inner function, so an empty separator also works
    return str_cat

result = df.groupby(' full name ').agg(
    Department=(' department ', custom_str_cat('*')),
    Managed_area=(' Management area ', custom_str_cat()),
    Managed_area_2=(' Management area ', custom_str_cat('—')),
).reset_index()

The result after implementation is as follows:
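
Separately from the closure used in this section, the same per-column separator can probably also be bound with functools.partial from the standard library. The following is a minimal sketch on hypothetical toy data (`toy` and its column names are assumptions), not the author's method. Each partial is applied to a different column here to keep the example simple.

```python
from functools import partial

import pandas as pd

def string_concat(column, sep=","):
    # Join the unique values of a column into a single string.
    return sep.join(column.unique())

# Hypothetical toy data for illustration.
toy = pd.DataFrame({
    "name": ["A", "A", "B"],
    "dept": ["Sales", "Ops", "HR"],
    "area": ["South", "North", "East"],
})

# partial() fixes the `sep` argument, giving one aggregation function per separator.
result = toy.groupby("name").agg(
    dept=("dept", partial(string_concat, sep="*")),
    area=("area", partial(string_concat, sep="|")),
).reset_index()

print(result)
```

The closure and partial() serve the same purpose: both fix the separator in advance and hand pandas a function that takes only the column.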

3. Closing notes

The above covers aggregate functions for text, equivalent to the string_agg function in PostgreSQL, the wm_concat function in Oracle, and the GROUP_CONCAT function in MySQL. How, then, do you turn one row into multiple rows in pandas, that is, convert the processed data in this article back into its pre-processing form? You can refer to this blog post: https://blog.csdn.net/qq_41780234/article/details/121623812?spm=1001.2014.3001.5502
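
For orientation, one common way to do the reverse conversion is to split the joined string and then call DataFrame.explode. The sketch below uses hypothetical toy data (`aggregated` and its column names are assumptions) and is not necessarily the method described in the linked post.

```python
import pandas as pd

# Hypothetical aggregated data: one row per name, areas joined by commas.
aggregated = pd.DataFrame({
    "name": ["A", "B"],
    "area": ["South,North", "East"],
})

# Split each joined string into a list, then give every list element its own row.
long_form = (
    aggregated
    .assign(area=aggregated["area"].str.split(","))
    .explode("area")
    .reset_index(drop=True)
)

print(long_form)
```

The separator passed to str.split must match the one used when the text was joined.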

