A Very Detailed Guide to Core Series Operations

Published: 2023/12/20 | Programming Q&A | 32 | 豆豆


Contents

Introduction
1 Creation
- 1.1 Creating from a dictionary
- 1.2 Creating from a NumPy array
- 1.3 Creating from a scalar
2 Data access
- 2.1 Access by position
- 2.2 Access by index label
- 2.3 Access by slice
- 2.4 Access by Boolean mask
3 Index operations
- Grouping by data
- 3.1 Index attributes
- 3.2 Accessing the index
4 Basic operations
- 4.1 Adding data
- 4.2 Deleting data
- 4.3 Modifying data
- 4.4 Viewing data
- 4.5 Reindexing
- 4.6 Data alignment
5 Statistics
- 5.1 Function overview
- 5.2 Code demonstration
6 Notes
References

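Introduction

Pandas is a very powerful library for working with two-dimensional data. Its two-dimensional structure, the DataFrame, is made up of multiple one-dimensional Series. A Series has the following components:

Data: can be a list, a dictionary, or a scalar value.
index: the labels of the values. Labels must be hashable and are usually unique (pandas does not enforce uniqueness). The index must have the same length as the data.
dtype: the data type of the Series.

This article introduces the basic usage of Series.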
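As a quick illustration of the three components above, here is a minimal sketch. The sample values and labels are made up for this example; the attributes used (`index`, `dtype`) and `to_numpy()` are standard pandas.

```python
import pandas as pd

# Hypothetical sample data: three values labelled 'a', 'b', 'c'.
s = pd.Series([10, 20, 30], index=['a', 'b', 'c'])

values = s.to_numpy()    # the data, as a NumPy array
labels = list(s.index)   # the index labels
dtype = s.dtype          # the data type of the values

print(values)   # [10 20 30]
print(labels)   # ['a', 'b', 'c']
print(dtype)    # an integer dtype, typically int64
```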

1 Creation

1.1 Creating from a dictionary

The example below shows that:

A Series can be created directly from a dictionary.
The keys become the index of the Series and the values become its data.
The values may have different types (unlike a typical NumPy array); the Series then has dtype object.

import numpy as np
import pandas as pd

data = {'1': 1, '2': 2, '3': 3, '4': 'hello', '5': 'python', 'list': [1, 2]}
s1 = pd.Series(data)
print(s1, type(s1))

# Output
1            1
2            2
3            3
4        hello
5       python
list    [1, 2]
dtype: object <class 'pandas.core.series.Series'>
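A related point that may be useful: when a dictionary is combined with an explicit `index=` argument, pandas selects and orders the dictionary entries by those labels. The sketch below uses made-up keys; it is only meant to illustrate the key-to-index mapping described above.

```python
import pandas as pd

data = {'a': 1, 'b': 2, 'c': 3}

# index= selects and orders the dictionary keys;
# 'z' is not a key, so its value is missing (NaN).
s = pd.Series(data, index=['c', 'a', 'z'])
print(s)
```

Because of the missing entry, the resulting Series holds floats rather than integers.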

1.2 Creating from a NumPy array

The data comes first: it is the first parameter of pd.Series(). The example below uses np.random.rand(5) to generate an ndarray of length 5 and initializes the Series with it. index is the index, and its length must match the data. name is the name of the Series, used for display. Neither of these two parameters is required.

import numpy as np
import pandas as pd

# The three arguments are the data, the index, and the name of the Series
s = pd.Series(np.random.rand(5), index=list('abcde'), name='test')
print(s, type(s))

# Output
a    0.478839
b    0.517298
c    0.854202
d    0.543885
e    0.032623
Name: test, dtype: float64 <class 'pandas.core.series.Series'>

1.3 Creating from a scalar

A scalar is a single constant value. As shown below, the number 3 can be used to create a Series of length 5. The length is determined by the index: since the data is only the scalar 3, it is repeated 5 times.

import numpy as np
import pandas as pd

s = pd.Series(3, index=list('abcde'))
print(s)

# Output
a    3
b    3
c    3
d    3
e    3
dtype: int64

2 Data access

Data access covers the ways to read values out of a Series. Because a Series is one-dimensional, values are normally accessed by index label or by position (offset).

2.1 Access by position

Access by position is the most common approach: a Series can be used like an array and indexed with a subscript, which starts at 0. In the example below the Series has the default integer index, so the label 2 and the position 2 coincide.

import numpy as np
import pandas as pd

s = pd.Series(np.random.rand(5))
print(s, '\n')
print('s[2]:', s[2], type(s[2]), s[2].dtype)

# Output
0    0.949404
1    0.400692
2    0.660859
3    0.295815
4    0.680184
dtype: float64

s[2]: 0.6608588265235231 <class 'numpy.float64'> float64

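2.2 Access by index label

Access by index label uses the index of the Series to look up the corresponding data. You can think of the Series as a dictionary whose values are accessed by key. It is more powerful than a dictionary, though: besides a single key, you can pass a list of keys and get several values at once.

Note that accessing a single key that does not exist raises a KeyError. With a list of keys, older pandas versions returned NaN for a missing label and emitted a FutureWarning, as in the output below. Since pandas 1.0 a missing label in the list raises a KeyError as well, and .reindex() is the suggested alternative.

import numpy as np
import pandas as pd

s = pd.Series(np.random.rand(5), index=list('abcde'))
print(s)
print('-' * 10, '\n')
print("s['a']:", s['a'], '\n')
print("--- s[['b','e', 'f']] ---")
# 'f' is not in the index: pandas < 1.0 prints NaN with a FutureWarning (shown below);
# pandas >= 1.0 raises KeyError on this line.
print(s[['b', 'e', 'f']])

# Output (pandas < 1.0)
a    0.977675
b    0.128278
c    0.110421
d    0.413023
e    0.568087
dtype: float64
----------

s['a']: 0.9776748201255117

--- s[['b','e', 'f']] ---
b    0.128278
e    0.568087
f         NaN
dtype: float64

# 'f' does not exist; the first call emits a warning but still runs
C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\series.py:1152: FutureWarning:
Passing list-likes to .loc or [] with any missing label will raise
KeyError in the future, you can use .reindex() as an alternative.
See the documentation here:
https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#deprecate-loc-reindex-listlike
  return self.loc[key]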
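For reading labels that may be missing, a minimal sketch of two alternatives that avoid the exception: `.get()` for a single key and `.reindex()` for a list of keys. The sample values here are made up for the illustration.

```python
import pandas as pd

s = pd.Series([1.0, 2.0, 3.0], index=list('abc'))

# .get() returns a default instead of raising for a missing single key.
missing_single = s.get('f', default=None)

# .reindex() returns NaN for labels that are not in the index.
picked = s.reindex(['b', 'c', 'f'])

# Plain [] with a missing label raises KeyError.
try:
    s['f']
    raised = False
except KeyError:
    raised = True

print(missing_single)   # None
print(picked)
print(raised)           # True
```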

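2.3 Access by slice

A slice by index label includes the end label; a slice by position excludes the end position, just like a list slice.

import numpy as np
import pandas as pd

s1 = pd.Series(np.random.rand(5), list('abcde'))
print(s1, '\n')
print(s1['a':'c'], '\n')   # slicing by index label includes the end label
print(s1[0:2], '\n')       # slicing by position works like a list slice: the end is excluded

# Output
a    0.634454
b    0.132619
c    0.211219
d    0.559798
e    0.424643
dtype: float64

a    0.634454
b    0.132619
c    0.211219
dtype: float64

a    0.634454
b    0.132619
dtype: float64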
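The same two kinds of slice can be written with the explicit accessors `.loc` (labels) and `.iloc` (positions), which leaves no doubt about which rule applies. A small sketch with made-up values:

```python
import pandas as pd

s = pd.Series([10, 20, 30, 40, 50], index=list('abcde'))

by_label = s.loc['a':'c']    # label slice: both ends included
by_position = s.iloc[0:2]    # position slice: end excluded, like a list

print(by_label.tolist())     # [10, 20, 30]
print(by_position.tolist())  # [10, 20]
```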

2.4 Access by Boolean mask

A comparison on a Series produces a new Series of Booleans, which can then be used to select values.
.isnull() and .notnull() test for missing values. None denotes a missing value and NaN denotes an invalid number; both are treated as missing.

import numpy as np
import pandas as pd

s = pd.Series([0.2, 0.5, None])
print(s, '\n')
print(s > 50, '\n')
print(s.isnull(), '\n')
print(s.notnull(), '\n')
print(s[s > 50])

# Output
0    0.2
1    0.5
2    NaN
dtype: float64

0    False
1    False
2    False
dtype: bool

0    False
1    False
2     True
dtype: bool

0     True
1     True
2    False
dtype: bool

Series([], dtype: float64)
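Since the comparison above selects nothing, here is a sketch in which the mask does pick out values. It also combines two conditions with `&` and uses `dropna()` as a shortcut for removing missing values. The numbers are made up for the example.

```python
import pandas as pd

s = pd.Series([0.2, 0.5, None, 0.9])

# Combine conditions with & (and parentheses), not the keyword 'and'.
mask = s.notnull() & (s > 0.3)
selected = s[mask]

# dropna() removes the missing values directly.
cleaned = s.dropna()

print(selected.tolist())   # [0.5, 0.9]
print(cleaned.tolist())    # [0.2, 0.5, 0.9]
```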

3 Index operations

Besides operations on the data, operations on the index also matter. Below are some common ones.

Grouping by data

3.1 Index attributes

Besides the data, we can also access the index itself and its type. The code below shows three common index types: RangeIndex, a plain Index of labels, and DatetimeIndex.

import numpy as np
import pandas as pd

s = pd.Series(np.random.rand(5), index=range(5))
print('type(s.index): ', type(s.index), '\n')
print('s.index:', s.index, '\n')

s = pd.Series(np.random.rand(5), index=list('abcde'))
print('type(s.index): ', type(s.index), '\n')
print('s.index:', s.index, '\n')

s = pd.Series(np.random.rand(5), index=pd.date_range('2018-01-01', periods=5))
print('type(s.index): ', type(s.index), '\n')
print('s.index:', s.index, '\n')

# Output
type(s.index):  <class 'pandas.core.indexes.range.RangeIndex'>

s.index: RangeIndex(start=0, stop=5, step=1)

type(s.index):  <class 'pandas.core.indexes.base.Index'>

s.index: Index(['a', 'b', 'c', 'd', 'e'], dtype='object')

type(s.index):  <class 'pandas.core.indexes.datetimes.DatetimeIndex'>

s.index: DatetimeIndex(['2018-01-01', '2018-01-02', '2018-01-03', '2018-01-04',
               '2018-01-05'],
              dtype='datetime64[ns]', freq='D')

3.2 Accessing the index

An index is an ordered sequence, so like a list it supports access by position and by slice.

import numpy as np
import pandas as pd

s = pd.Series(np.random.rand(5), index=range(5))
print('type(s)', type(s), '\n')
print(s, '\n')

# Show the index
print(s.index, '\n')

# Show the part of the index in the range [1, 3)
print(s.index[1:3], '\n')

# The selected part of the index can be used directly to look up the data in s
print(s[s.index[1:3]])

# Iterate over the index and print the value for each label
for label in s.index:
    print(s[label])

# Output
type(s) <class 'pandas.core.series.Series'>

0    0.138492
1    0.285440
2    0.280471
3    0.245737
4    0.996996
dtype: float64

RangeIndex(start=0, stop=5, step=1)

RangeIndex(start=1, stop=3, step=1)

1    0.285440
2    0.280471
dtype: float64
0.13849193313381447
0.2854401610542934
0.280470887359729
0.2457365359030208
0.996996040313859
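The loop over the index can also be written with `items()`, which yields each label together with its value. The sketch below, with made-up labels and values, also shows a membership test on the index and a conversion to a plain list.

```python
import pandas as pd

s = pd.Series([1.5, 2.5, 3.5], index=['x', 'y', 'z'])

# items() yields (label, value) pairs.
pairs = [(label, value) for label, value in s.items()]

has_y = 'y' in s.index       # membership test on the labels
labels = s.index.tolist()    # the index as a plain Python list

print(pairs)    # [('x', 1.5), ('y', 2.5), ('z', 3.5)]
print(has_y)    # True
print(labels)   # ['x', 'y', 'z']
```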

4 Basic operations

4.1 Adding data

Data can be added by assigning to a new index label or by concatenating another Series.

import numpy as np
import pandas as pd

s1 = pd.Series(np.random.rand(2))
print('s1')
print(s1)

s1[3] = 100      # add by assigning to a new index label
s1['a'] = 200
print('\ns1 after adding two values')
print(s1, '\n')

s2 = pd.Series(np.random.rand(2), index=['value1', 'value2'])
print('\ns2')
print(s2)

# Series.append() was removed in pandas 2.0; pd.concat() replaces it
s3 = pd.concat([s1, s2])
print('\npd.concat([s1, s2])')
print(s3)

# Output
s1
0    0.570088
1    0.835804
dtype: float64

s1 after adding two values
0      0.570088
1      0.835804
3    100.000000
a    200.000000
dtype: float64


s2
value1    0.089712
value2    0.399171
dtype: float64

pd.concat([s1, s2])
0           0.570088
1           0.835804
3         100.000000
a         200.000000
value1      0.089712
value2      0.399171
dtype: float64

4.2 Deleting data

import numpy as np
import pandas as pd

s = pd.Series(np.random.rand(5), index=list('abcde'))
print('s')
print(s)

del s['a']   # delete in place with del
print("\nDelete one value: del s['a']")
print(s, '\n')

s1 = s.drop(['c', 'd'])   # .drop() returns a new Series; pass a list to delete several labels
print("\nDelete several values: s1 = s.drop(['c','d']) ")
print(s1)

# Output
s
a    0.687421
b    0.938094
c    0.391408
d    0.667542
e    0.245056
dtype: float64

Delete one value: del s['a']
b    0.938094
c    0.391408
d    0.667542
e    0.245056
dtype: float64


Delete several values: s1 = s.drop(['c','d'])
b    0.938094
e    0.245056
dtype: float64

4.3 Modifying data

Data is modified by assigning to a selected position or label. A single value or several values can be changed at once.

import numpy as np
import pandas as pd

s = pd.Series(np.random.rand(5), index=list('abcde'))
print(s, '\n')
s.iloc[1] = 100        # assign by position (s[1] on a string index is deprecated in recent pandas)
print(s, '\n')
s[['c', 'd']] = 200    # assign several labels at once
print(s)

# Output
a    0.317819
b    0.359241
c    0.662112
d    0.087609
e    0.940697
dtype: float64

a      0.317819
b    100.000000
c      0.662112
d      0.087609
e      0.940697
dtype: float64

a      0.317819
b    100.000000
c    200.000000
d    200.000000
e      0.940697
dtype: float64
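As a compact summary of the assignment styles, here is a sketch that modifies one value by position, one by label, and several through a Boolean mask. The starting values are made up for the example.

```python
import pandas as pd

s = pd.Series([1.0, 2.0, 3.0, 4.0], index=list('abcd'))

s.iloc[1] = 100     # by position
s.loc['c'] = 200    # by label
s[s < 2] = 0        # by Boolean mask: every value below 2

print(s.tolist())   # [0.0, 100.0, 200.0, 4.0]
```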

4.4 Viewing data

Similar to the Linux head and tail commands, s.head(n) and s.tail(n) return the first and the last n values.

import numpy as np
import pandas as pd

s = pd.Series(np.random.rand(10))
print(s.head(2), '\n')
print(s.tail(3))

# Output
0    0.140628
1    0.768699
dtype: float64

7    0.255628
8    0.535300
9    0.324614
dtype: float64

4.5 Reindexing

.reindex(new_labels, fill_value=...) returns a Series arranged according to the new labels. Labels that are not in the original index get NaN by default; fill_value sets the value used for them instead.

import numpy as np
import pandas as pd

s = pd.Series(np.random.rand(3), index=['a', 'b', 'c'])
print(s, '\n')
s1 = s.reindex(['c', 'b', 'a', 'A'], fill_value=100)
print(s1)

# Output
a    0.692466
b    0.757568
c    0.181863
dtype: float64

c      0.181863
b      0.757568
a      0.692466
A    100.000000
dtype: float64
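To see the default next to `fill_value`, the sketch below reindexes the same Series twice. It also checks that `reindex()` returns a new Series and leaves the original untouched. The values are made up.

```python
import pandas as pd

s = pd.Series([1, 2, 3], index=['a', 'b', 'c'])

default_fill = s.reindex(['c', 'a', 'A'])               # 'A' becomes NaN
custom_fill = s.reindex(['c', 'a', 'A'], fill_value=0)  # 'A' becomes 0

print(default_fill)
print(custom_fill.tolist())   # [3, 1, 0]
print(list(s.index))          # ['a', 'b', 'c'], the original is unchanged
```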

4.6 Data alignment

Data alignment means that an operation between two Series, such as addition, is matched up by index label rather than by position. A label that appears in only one of the two gives NaN in the result.

import numpy as np
import pandas as pd

s1 = pd.Series(np.random.rand(3), index=['a', 'b', 'c'])
s2 = pd.Series(np.random.rand(3), index=['a', 'c', 'A'])
print(s1, '\n')
print(s2, '\n')
print(s1 + s2)

# Output
a    0.414064
b    0.599441
c    0.579188
dtype: float64

a    0.163382
c    0.095508
A    0.521609
dtype: float64

A         NaN
a    0.577446
b         NaN
c    0.674696
dtype: float64
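If NaN for unmatched labels is not what you want, the `add()` method accepts a `fill_value` that stands in for the missing side. A minimal sketch with made-up values, using the same labels as the example above:

```python
import pandas as pd

s1 = pd.Series([1.0, 2.0, 3.0], index=['a', 'b', 'c'])
s2 = pd.Series([10.0, 20.0, 30.0], index=['a', 'c', 'A'])

plain = s1 + s2                     # NaN where a label is missing on one side
filled = s1.add(s2, fill_value=0)   # treat the missing side as 0

print(plain)
print(filled)
```

With `fill_value=0`, label `b` keeps its value from `s1` and label `A` keeps its value from `s2`.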

5 Statistics

5.1 Function overview

The commonly used statistical functions are listed in the table below:

Function	Meaning

aggregate()	Aggregation; applies one or more functions, including user-defined ones.
all()	Logical AND over all elements
any()	Logical OR over all elements
idxmin()	Index label of the minimum value
idxmax()	Index label of the maximum value
count()	Count of values; missing values (None/NaN) are not counted.
cumsum()	Cumulative sum
cumprod()	Cumulative product
cov()	Covariance with another Series
corr()	Correlation coefficient with another Series
describe()	Descriptive statistics; returns several common statistics at once.
groupby()	Grouping
kurt()	Kurtosis
max()	Maximum
mean()	Mean
median()	Median
min()	Minimum
mode()	Mode
pct_change()	Fractional change of each element relative to the previous one
quantile()	Arbitrary quantile
size	Number of elements, including missing values (an attribute, not a method)
skew()	Skewness
std()	Standard deviation
sum()	Sum
value_counts()	Frequency count: groups equal values and returns the number in each group.
var()	Variance

5.2 Code demonstration

The code below demonstrates the main functions.
Note: some calls (commented out below) did not run as written and need further investigation.

import numpy as np
import pandas as pd

data = [1, 2, 3, 4, 5, 5, 6, 8, 1, 3, 5, 2, 5, 2]
s = pd.Series(data)
print(s)

#print('s.aggregate()', s.aggregate(3), '\n')
print('s.all()', s.all(), '\n')
print('s.any()', s.any(), '\n')
print('s.idxmin()', s.idxmin(), '\n')
print('s.idxmax()', s.idxmax(), '\n')
print('s.count()', s.count(), '\n')
print('s.cumsum()', s.cumsum(), '\n')
print('s.cumprod()', s.cumprod(), '\n')
#print('s.cov()', s.cov(), '\n')
#print('s.corr()', s.corr(), '\n')
print('s.describe()', s.describe(), '\n')
#print('s.groupby()', s.groupby(5), '\n')
print('s.kurt()', s.kurt(), '\n')
print('s.max()', s.max(), '\n')
print('s.mean()', s.mean(), '\n')
print('s.median()', s.median(), '\n')
print('s.min()', s.min(), '\n')
print('s.mode()', s.mode(), '\n')
print('s.pct_change()', s.pct_change(), '\n')
print('s.quantile()', s.quantile(), '\n')
#print('s.size()', s.size(), '\n')
print('s.skew()', s.skew(), '\n')
print('s.std()', s.std(), '\n')
print('s.sum()', s.sum(), '\n')
print('s.value_counts()', s.value_counts(), '\n')
print('s.var()', s.var(), '\n')

# Output
0     1
1     2
2     3
3     4
4     5
5     5
6     6
7     8
8     1
9     3
10    5
11    2
12    5
13    2
dtype: int64
s.all() True

s.any() True

s.idxmin() 0

s.idxmax() 7

s.count() 14

s.cumsum() 0      1
1      3
2      6
3     10
4     15
5     20
6     26
7     34
8     35
9     38
10    43
11    45
12    50
13    52
dtype: int64

s.cumprod() 0           1
1           2
2           6
3          24
4         120
5         600
6        3600
7       28800
8       28800
9       86400
10     432000
11     864000
12    4320000
13    8640000
dtype: int64

s.describe() count    14.000000
mean      3.714286
std       2.054210
min       1.000000
25%       2.000000
50%       3.500000
75%       5.000000
max       8.000000
dtype: float64

s.kurt() -0.33190548058712066

s.max() 8

s.mean() 3.7142857142857144

s.median() 3.5

s.min() 1

s.mode() 0    5
dtype: int64

s.pct_change() 0          NaN
1     1.000000
2     0.500000
3     0.333333
4     0.250000
5     0.000000
6     0.200000
7     0.333333
8    -0.875000
9     2.000000
10    0.666667
11   -0.600000
12    1.500000
13   -0.600000
dtype: float64

s.quantile() 3.5

s.skew() 0.4487734149006034

s.std() 2.054210364052382

s.sum() 52

s.value_counts() 5    4
2    3
3    2
1    2
8    1
6    1
4    1
dtype: int64

s.var() 4.21978021978022
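The commented-out calls in the demonstration appear to fail because of how they are called rather than because the functions are broken. The sketch below shows one way each of them can be called. The second Series `other` and the choice of grouping key are assumptions made for this illustration, not part of the original article.

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4, 5, 5, 6, 8, 1, 3, 5, 2, 5, 2])
other = s * 2 + 1   # hypothetical second Series, a linear function of s

# aggregate() (alias agg()) takes a function, a function name, or a list of them.
summary = s.agg(['min', 'max', 'sum'])

# cov() and corr() compare two Series, so they need another Series as argument.
covariance = s.cov(other)
correlation = s.corr(other)

# groupby() needs a grouping key; here the values themselves are the key.
group_sizes = s.groupby(s).size()

# size is an attribute, not a method: no parentheses.
n = s.size

print(summary)
print(covariance, correlation)
print(group_sizes)
print(n)   # 14
```

Grouping a Series by its own values and taking `size()` gives the same counts as `value_counts()`, ordered by value instead of by frequency.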

6 Notes

Adding a missing value (None/NaN) to any value yields a missing value (NaN).
count() and similar functions do not count missing values (None/NaN).
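Both notes can be checked with a few lines. This is a minimal sketch with made-up values; the `skipna` parameter of `sum()` is included to show where the default skipping of missing values can be switched off.

```python
import pandas as pd

s = pd.Series([1.0, None, 3.0])

shifted = s + 10               # the missing entry stays missing (NaN)
n_valid = s.count()            # 2: missing values are not counted
n_total = s.size               # 3: all elements, missing or not
total = s.sum()                # 4.0: sum() skips NaN by default
strict = s.sum(skipna=False)   # NaN: the missing value propagates

print(shifted.tolist())        # [11.0, nan, 13.0]
print(n_valid, n_total, total, strict)
```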

References

[1] pandas time series operations: pd.date_range(), https://blog.csdn.net/missyougoon/article/details/83958749
[2] pd.Series usage, https://www.cnblogs.com/sparkingplug/p/11409365.html
[3] Pandas time series: generating dates in a given range, https://blog.csdn.net/bqw18744018044/article/details/80920356
