
Similarities and differences between itertools.groupby and pandas groupby


Background

I recently ran into a bug in old code that had worked fine for years but broke under a new business requirement. After some troubleshooting, I found that itertools.groupby does not behave the way I had assumed. At the very least, it differs from the pandas groupby I was familiar with. I could not find a comparison of the two online, so I wrote this one.

Similarities between itertools.groupby and pandas.groupby

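# itertools.groupby
from itertools import groupby
a = groupby([1, 1, 2, 3])
for i, j in a:
    # i is the group key, j is an iterator over that group's members
    print(i, ' ', len(list(j)))  # group sizes: 1 -> 2, 2 -> 1, 3 -> 1
# pandas.groupby
import pandas
b = pandas.Series([0, 1, 2, 3], index=[1, 1, 2, 3], name='b').groupby(level=0)
b.apply(len)  # group sizes: 1 -> 2, 2 -> 1, 3 -> 1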

Both functions group data by a key, and when the data is sorted, the two produce the same groups.

Differences between itertools.groupby and pandas.groupby

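# itertools.groupby
from itertools import groupby
a = groupby([1, 2, 3, 1])
for i, j in a:
    # the key 1 shows up twice, once for each separate run
    print(i, ' ', len(list(j)))  # group sizes: 1 -> 1, 2 -> 1, 3 -> 1, 1 -> 1
# pandas.groupby
import pandas
b = pandas.Series([0, 1, 2, 3], index=[1, 2, 3, 1], name='b').groupby(level=0)
b.apply(len)  # group sizes: 1 -> 2, 2 -> 1, 3 -> 1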

As you can see, when the data is not sorted, the two produce different groups.
The pandas.groupby result is the same as in the sorted case. itertools.groupby, on the other hand, only merges adjacent equal values, much like collapsing consecutive duplicates. If equal values are separated by other values, they end up in separate groups, so the grouping result is completely different.
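
The run-based behaviour is easier to see when each group is materialised as a list. The following is a minimal sketch; the helper name `runs` is made up for illustration and is not part of either library.

```python
from itertools import groupby

def runs(values):
    """Return (key, members) pairs in the order itertools.groupby yields them."""
    return [(key, list(group)) for key, group in groupby(values)]

# Unsorted input: the two 1s are not adjacent, so they form two separate groups.
unsorted_runs = runs([1, 2, 3, 1])
# Sorted input: equal values are adjacent, so each key appears exactly once.
sorted_runs = runs(sorted([1, 2, 3, 1]))

print(unsorted_runs)  # [(1, [1]), (2, [2]), (3, [3]), (1, [1])]
print(sorted_runs)    # [(1, [1, 1]), (2, [2]), (3, [3])]
```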

Summary

When the data is not sorted, or more precisely when equal values are not adjacent, itertools.groupby can produce unexpected results.
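
One common way to avoid the surprise is to sort by the grouping key before calling itertools.groupby, or to accumulate the groups in a dictionary when sorting is not desirable. The sketch below illustrates both options; the function names `group_sizes_sorted` and `group_sizes_dict` are hypothetical helpers, not part of any library.

```python
from collections import defaultdict
from itertools import groupby

def group_sizes_sorted(values, key=None):
    """Sort by the grouping key first so equal keys become adjacent."""
    ordered = sorted(values, key=key)
    return {k: len(list(g)) for k, g in groupby(ordered, key=key)}

def group_sizes_dict(values, key=None):
    """Count group members in a dict; no sorting, keys keep first-seen order."""
    sizes = defaultdict(int)
    for value in values:
        sizes[value if key is None else key(value)] += 1
    return dict(sizes)

data = [1, 2, 3, 1]
print(group_sizes_sorted(data))  # {1: 2, 2: 1, 3: 1}
print(group_sizes_dict(data))    # {1: 2, 2: 1, 3: 1}
```

Both variants give one group per distinct key, which matches what pandas groupby reports for the same data.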

