scipy csr_matrix csc_matrix

Published: 2023/11/28


202113224

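data_cat = sparse.hstack((data_cat, data_))
Horizontal concatenation: the two sparse matrices are joined column-wise.
df_tfidf = pd.DataFrame(data_cat.toarray())
Converts the sparse (CSR) matrix to a DataFrame.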
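
The two lines above can be tried end to end on toy data. This is a minimal sketch: the matrices `left` and `right` are made-up stand-ins for `data_cat` and `data_`, and passing `format="csr"` is a choice made here to keep the result format explicit.

```python
import numpy as np
import pandas as pd
from scipy import sparse

# Two small sparse blocks with the same number of rows (illustrative data).
left = sparse.csr_matrix(np.array([[1, 0], [0, 2]]))
right = sparse.csr_matrix(np.array([[0, 3], [4, 0]]))

# hstack joins the blocks column-wise; format="csr" requests a CSR result.
stacked = sparse.hstack((left, right), format="csr")

# toarray() builds a full dense array, so this is only practical
# when the matrix is small enough to fit in memory.
df = pd.DataFrame(stacked.toarray())
print(df)
```

Because `toarray()` materializes every zero, the DataFrame step gives up the memory savings of the sparse format.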

20210222

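Data format of csr_matrix

The data format above is converted into the form below.
In the form below, the 0 is the row number.
0, 1, 2, 3 are the indices.
The third column is the actual value.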
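
The note above appears to describe a listing of row number, index, and value for each stored entry. Assuming that reading, the following sketch produces such triples for a small made-up matrix by going through the COO view of a CSR matrix; the variable names are illustrative.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Illustrative dense input; zeros are dropped when converting to CSR.
dense = np.array([[1, 0, 2],
                  [0, 0, 3],
                  [4, 5, 6]])
m = csr_matrix(dense)

# The COO view exposes one (row, column, value) triple per stored entry.
coo = m.tocoo()
triples = sorted(zip(coo.row.tolist(), coo.col.tolist(), coo.data.tolist()))

for row, col, value in triples:
    print(row, col, value)
```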

Overview
When doing scientific computing in Python, you often need to compress a sparse np.array. This is where sparse.csr_matrix (CSR: Compressed Sparse Row matrix) and sparse.csc_matrix (CSC: Compressed Sparse Column matrix) from the scipy library come in.
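
To give a rough sense of what "compressing" means here, the sketch below compares the bytes held by a dense array with the bytes held by the three arrays behind its CSR and CSC forms. The 1000 x 1000 diagonal matrix and the helper `stored_bytes` are assumptions made for illustration.

```python
import numpy as np
from scipy.sparse import csr_matrix, csc_matrix

# A mostly-zero dense array: only the diagonal is nonzero.
dense = np.zeros((1000, 1000))
np.fill_diagonal(dense, 1.0)

csr = csr_matrix(dense)
csc = csc_matrix(dense)

def stored_bytes(m):
    """Bytes used by the data, indices and indptr arrays of a CSR/CSC matrix."""
    return m.data.nbytes + m.indices.nbytes + m.indptr.nbytes

dense_bytes = dense.nbytes
csr_bytes = stored_bytes(csr)
csc_bytes = stored_bytes(csc)
print(dense_bytes, csr_bytes, csc_bytes)
```

The saving depends on how sparse the array is; for a matrix with few zeros, the index arrays can make the sparse form larger than the dense one.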

scipy.sparse.csr_matrix

Official API description (the first few, easier-to-understand constructor forms are omitted)
csr_matrix((data, indices, indptr), [shape=(M, N)]) 
is the standard CSR representation where the column indices for row i are stored in indices[indptr[i]:indptr[i+1]] and their corresponding values are stored in data[indptr[i]:indptr[i+1]]. If the shape parameter is not supplied, the matrix dimensions are inferred from the index arrays.

# Example walkthrough
>>> import numpy as np
>>> from scipy.sparse import csr_matrix
>>> indptr = np.array([0, 2, 3, 6])
>>> indices = np.array([0, 2, 2, 0, 1, 2])
>>> data = np.array([1, 2, 3, 4, 5, 6])
>>> csr_matrix((data, indices, indptr), shape=(3, 3)).toarray()
array([[1, 0, 2],
       [0, 0, 3],
       [4, 5, 6]])
# Compressed by row
# For row i, the columns holding nonzero values are indices[indptr[i]:indptr[i+1]] and the values are data[indptr[i]:indptr[i+1]]
# In this example:
# Row 0: the nonzero columns are indices[indptr[0]:indptr[1]] = indices[0:2] = [0,2]
# the values are data[indptr[0]:indptr[1]] = data[0:2] = [1,2], so row 0 has 1 in column 0 and 2 in column 2
# Row 1: the nonzero columns are indices[indptr[1]:indptr[2]] = indices[2:3] = [2]
# the values are data[indptr[1]:indptr[2]] = data[2:3] = [3], so row 1 has 3 in column 2
# Row 2: the nonzero columns are indices[indptr[2]:indptr[3]] = indices[3:6] = [0,1,2]
# the values are data[indptr[2]:indptr[3]] = data[3:6] = [4,5,6], so row 2 has 4 in column 0, 5 in column 1 and 6 in column 2
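
The row-by-row reading above can be written out as a short loop. This is a sketch rather than how scipy does it internally: the helper `csr_to_dense` is a hypothetical name, and it assumes no (row, column) pair appears twice in the input arrays.

```python
import numpy as np
from scipy.sparse import csr_matrix

indptr = np.array([0, 2, 3, 6])
indices = np.array([0, 2, 2, 0, 1, 2])
data = np.array([1, 2, 3, 4, 5, 6])

def csr_to_dense(data, indices, indptr, shape):
    """Expand CSR arrays into a dense array, one row at a time."""
    dense = np.zeros(shape, dtype=data.dtype)
    for i in range(shape[0]):
        # Entries of row i live in the half-open slice [indptr[i], indptr[i+1]).
        start, stop = indptr[i], indptr[i + 1]
        dense[i, indices[start:stop]] = data[start:stop]
    return dense

manual = csr_to_dense(data, indices, indptr, (3, 3))
reference = csr_matrix((data, indices, indptr), shape=(3, 3)).toarray()
print(manual)
```

Note that `indptr` has one more element than the number of rows, and its last element equals the number of stored values.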

scipy.sparse.csc_matrix

Official API description (the first few, easier-to-understand constructor forms are omitted)
csc_matrix((data, indices, indptr), [shape=(M, N)]) 
is the standard CSC representation where the row indices for column i are stored in indices[indptr[i]:indptr[i+1]] and their corresponding values are stored in data[indptr[i]:indptr[i+1]]. If the shape parameter is not supplied, the matrix dimensions are inferred from the index arrays.

# Example walkthrough
>>> import numpy as np
>>> from scipy.sparse import csc_matrix
>>> indptr = np.array([0, 2, 3, 6])
>>> indices = np.array([0, 2, 2, 0, 1, 2])
>>> data = np.array([1, 2, 3, 4, 5, 6])
>>> csc_matrix((data, indices, indptr), shape=(3, 3)).toarray()
array([[1, 0, 4],
       [0, 0, 5],
       [2, 3, 6]])
# Compressed by column
# For column i, the rows holding nonzero values are indices[indptr[i]:indptr[i+1]] and the values are data[indptr[i]:indptr[i+1]]
# In this example:
# Column 0: the nonzero rows are indices[indptr[0]:indptr[1]] = indices[0:2] = [0,2]
# the values are data[indptr[0]:indptr[1]] = data[0:2] = [1,2], so column 0 has 1 in row 0 and 2 in row 2
# Column 1: the nonzero rows are indices[indptr[1]:indptr[2]] = indices[2:3] = [2]
# the values are data[indptr[1]:indptr[2]] = data[2:3] = [3], so column 1 has 3 in row 2
# Column 2: the nonzero rows are indices[indptr[2]:indptr[3]] = indices[3:6] = [0,1,2]
# the values are data[indptr[2]:indptr[3]] = data[3:6] = [4,5,6], so column 2 has 4 in row 0, 5 in row 1 and 6 in row 2
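
The same decoding works column by column. The sketch below uses a hypothetical helper `csc_to_dense` (again assuming no duplicate entries) and also checks an observation that follows from the two examples: feeding identical `data`, `indices` and `indptr` arrays to `csr_matrix` and `csc_matrix` yields matrices that are transposes of each other.

```python
import numpy as np
from scipy.sparse import csr_matrix, csc_matrix

indptr = np.array([0, 2, 3, 6])
indices = np.array([0, 2, 2, 0, 1, 2])
data = np.array([1, 2, 3, 4, 5, 6])

def csc_to_dense(data, indices, indptr, shape):
    """Expand CSC arrays into a dense array, one column at a time."""
    dense = np.zeros(shape, dtype=data.dtype)
    for j in range(shape[1]):
        # Entries of column j live in the half-open slice [indptr[j], indptr[j+1]).
        start, stop = indptr[j], indptr[j + 1]
        dense[indices[start:stop], j] = data[start:stop]
    return dense

manual = csc_to_dense(data, indices, indptr, (3, 3))
as_csc = csc_matrix((data, indices, indptr), shape=(3, 3)).toarray()
as_csr = csr_matrix((data, indices, indptr), shape=(3, 3)).toarray()
print(manual)
```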

