
Python Pandas – GroupBy

Published: 2023/12/1 | 豆豆



The GroupBy method groups rows of data together so that an aggregate function can be called on each group. It lets you group rows based on the values in a column and then compute an aggregate, such as a sum or a mean, for every group.

Consider an example with three groups of IDs (1, 2, and 3), each with several values. We can group the rows by the ID column and combine each group's values with an aggregate function. Here the values are summed, which produces one total per ID.
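The ID example described above can be sketched in a few lines. The column names `ID` and `Value` and the numbers themselves are made up for illustration; they are assumptions, not data from the original article.

```python
import pandas as pd

# Hypothetical data: three IDs (1, 2, 3), each with several values.
ids = pd.DataFrame({
    'ID':    [1, 1, 2, 2, 2, 3],
    'Value': [10, 20, 5, 5, 5, 7],
})

# Group the rows by ID, then sum the values inside each group.
totals = ids.groupby('ID')['Value'].sum()
print(totals)
```

The result has one entry per distinct ID, indexed by that ID.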

GroupBy with Pandas

Create a DataFrame from a dictionary:

import numpy as np
import pandas as pd

data = {'company': ['Google', 'Microsoft', 'FB', 'Google', 'FB'],
        'person': ['Molly', 'Nathaniel', 'Sriansh', 'Carl', 'Sarah'],
        'Sales': [200, 123, 130, 144, 122]}

df = pd.DataFrame(data)
print(df)

Output:

     company     person  Sales
0     Google      Molly    200
1  Microsoft  Nathaniel    123
2         FB    Sriansh    130
3     Google       Carl    144
4         FB      Sarah    122

The following examples illustrate the groupby() method.

Example 1: Group by 'company'

# returns the GroupBy object
print(df.groupby('company'))
'''
<pandas.core.groupby.generic.DataFrameGroupBy object at 0x7f1721585350>
'''

by_company = df.groupby('company')

# invoke an aggregate function; numeric_only=True skips the string column
print(by_company.mean(numeric_only=True))
'''
           Sales
company
FB         126.0
Google     172.0
Microsoft  123.0
'''

In the above example the person column does not appear in the result: it holds strings, and a mean cannot be computed for strings. Older pandas releases silently dropped such non-numeric columns; since pandas 2.0, mean() raises a TypeError on them unless you pass numeric_only=True or select the numeric columns first.
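One way to avoid depending on how a given pandas version treats non-numeric columns is to select the column to aggregate explicitly. The following is a minimal sketch that rebuilds the same data as above; the variable names `mean_sales` and `mean_frame` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'company': ['Google', 'Microsoft', 'FB', 'Google', 'FB'],
    'person': ['Molly', 'Nathaniel', 'Sriansh', 'Carl', 'Sarah'],
    'Sales': [200, 123, 130, 144, 122],
})

# Selecting 'Sales' first means the string column never reaches mean().
mean_sales = df.groupby('company')['Sales'].mean()
print(mean_sales)

# Alternative: keep a DataFrame result and ask pandas to skip
# non-numeric columns.
mean_frame = df.groupby('company').mean(numeric_only=True)
print(mean_frame)
```

Both forms give the same numbers; the first returns a Series, the second a DataFrame with a single Sales column.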

Below are some more examples of aggregate functions:

print(by_company.sum(numeric_only=True))
'''
Output:
           Sales
company
FB           252
Google       344
Microsoft    123
'''

print(by_company.std(numeric_only=True))
'''
Output:
               Sales
company
FB          5.656854
Google     39.597980
Microsoft        NaN
'''
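Several aggregates can also be requested in one call with agg(), which accepts a list of function names. This is a small sketch using the same data; the variable name `summary` is illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'company': ['Google', 'Microsoft', 'FB', 'Google', 'FB'],
    'person': ['Molly', 'Nathaniel', 'Sriansh', 'Carl', 'Sarah'],
    'Sales': [200, 123, 130, 144, 122],
})

# One column per requested aggregate, one row per company.
summary = df.groupby('company')['Sales'].agg(['sum', 'mean', 'std'])
print(summary)
```

Microsoft has a single row, so its standard deviation is NaN, just as in the separate std() call.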

Note that the result of these aggregations is a DataFrame by default, as illustrated below:

std = by_company.std(numeric_only=True)
print(type(std))
'''
Output:
<class 'pandas.core.frame.DataFrame'>
'''
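The return type depends on what is selected before aggregating. As a rough sketch with the same data: selecting one column by name yields a Series, while selecting with a list of names keeps a DataFrame.

```python
import pandas as pd

df = pd.DataFrame({
    'company': ['Google', 'Microsoft', 'FB', 'Google', 'FB'],
    'person': ['Molly', 'Nathaniel', 'Sriansh', 'Carl', 'Sarah'],
    'Sales': [200, 123, 130, 144, 122],
})

# A single column name gives a Series indexed by company.
std_series = df.groupby('company')['Sales'].std()
print(type(std_series))

# A list of column names gives a DataFrame, even with one column.
std_frame = df.groupby('company')[['Sales']].std()
print(type(std_frame))
```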

Hence, we can apply any DataFrame operation to the result, for example:

print(by_company.std(numeric_only=True).loc['FB'])
'''
Output:
Sales    5.656854
Name: FB, dtype: float64
'''

All of the above steps can be combined into a single expression:

print(df.groupby('company').sum(numeric_only=True).loc['FB'])
'''
Output:
Sales    252
Name: FB, dtype: int64
'''

Some more aggregate functions:

print(df.groupby('company').count())
'''
Output:
           person  Sales
company
FB              2      2
Google          2      2
Microsoft       1      1
'''

print(df.groupby('company').max())
'''
Output:
              person  Sales
company
FB           Sriansh    130
Google         Molly    200
Microsoft  Nathaniel    123
'''

print(df.groupby('company').min())
'''
Output:
              person  Sales
company
FB             Sarah    122
Google          Carl    144
Microsoft  Nathaniel    123
'''
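Note that max() and min() are computed separately for each column: the person value is the alphabetical extreme and the Sales value is the numeric extreme. In this data the two happen to come from the same row, but in general they need not. If the goal is the whole row with the highest sale per company, one possible approach is idxmax(), sketched here with the same data (the name `top_rows` is illustrative).

```python
import pandas as pd

df = pd.DataFrame({
    'company': ['Google', 'Microsoft', 'FB', 'Google', 'FB'],
    'person': ['Molly', 'Nathaniel', 'Sriansh', 'Carl', 'Sarah'],
    'Sales': [200, 123, 130, 144, 122],
})

# idxmax() returns, for each company, the row label of its largest sale.
best_labels = df.groupby('company')['Sales'].idxmax()

# Use those labels to pull the complete rows out of the original frame.
top_rows = df.loc[best_labels]
print(top_rows)
```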

Using GroupBy with the describe() method

The describe() method returns a set of summary statistics for each group at once: count, mean, standard deviation, minimum, quartiles, and maximum.

print(df.groupby('company').describe())
'''
Output:
          Sales                   ...
          count   mean        std ...    50%    75%    max
company                           ...
FB          2.0  126.0   5.656854 ...  126.0  128.0  130.0
Google      2.0  172.0  39.597980 ...  172.0  186.0  200.0
Microsoft   1.0  123.0        NaN ...  123.0  123.0  123.0

[3 rows x 8 columns]
'''

The layout of the description can be changed with the transpose() method, which turns the statistics into rows and the companies into columns:

print(df.groupby('company').describe().transpose())
'''
Output:
company              FB     Google  Microsoft
Sales count    2.000000    2.00000        1.0
      mean   126.000000  172.00000      123.0
      std      5.656854   39.59798        NaN
      min    122.000000  144.00000      123.0
      25%    124.000000  158.00000      123.0
      50%    126.000000  172.00000      123.0
      75%    128.000000  186.00000      123.0
      max    130.000000  200.00000      123.0
'''
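Because describe() also returns a DataFrame, individual statistics can be looked up by label. The sketch below selects the Sales column first so that the result has plain, single-level column names; the variable name `desc` is illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'company': ['Google', 'Microsoft', 'FB', 'Google', 'FB'],
    'person': ['Molly', 'Nathaniel', 'Sriansh', 'Carl', 'Sarah'],
    'Sales': [200, 123, 130, 144, 122],
})

# One row per company, one column per statistic.
desc = df.groupby('company')['Sales'].describe()
print(desc.columns.tolist())

# Look up a single statistic for a single company.
print(desc.loc['Google', '75%'])

# .T is shorthand for transpose(): statistics become the rows.
print(desc.T)
```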

Translated from: https://www.includehelp.com/python/python-pandas-groupby.aspx
