pandas in Python: Introduction, Installation, and a Detailed Usage Guide
Contents
Introduction to pandas
Installing pandas
Using pandas
1. Function reference
2. Lessons from practice
3. Plotting operations
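Recommended articles
pandas in Python: introduction, installation, and a detailed usage guide
pandas in Python: DataFrame two-dimensional tables, with an introduction, common functions, and common examples
Python: reading, writing, and copying files of various types such as txt and csv (wrapped in custom functions)
pandas in Python: the most commonly used DataFrame functions for tabular data files (printing basic information)
pandas in Python: custom function wrappers (concatenating data table files horizontally or vertically; counting the distinct categories in each column and their frequencies)
pandas in Python: a collection of examples of common DataFrame functions (printing basic information; dictionaries, statistics, mapping, and more)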
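Introduction to pandas
Among Python's scientific computing libraries, pandas is the one best suited to data science work. Together with Scikit-learn, it provides nearly all the tools a data scientist needs. pandas is an open-source, easy-to-use library of data structures and data analysis tools for the Python programming language. It is a third-party package, not part of the Python standard library, so it has to be installed separately.
Ask engineers who apply machine learning in production which stage of the work consumes the most time, and most will answer, somewhat wearily, "data preprocessing." In practice, most industry teams do not invest much effort in researching brand-new machine learning models; they apply existing, well-established models to a specific project and a specific dataset. As a result, most of the time goes into processing data, and often into cleaning it, especially when the data is still fairly raw. pandas was created to address this: it is a Python toolkit for data processing and analysis that implements a large number of functions for reading, writing, cleaning, filling, and analyzing data. It saves developers a great deal of preprocessing code and leaves them more energy for the machine learning task itself.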
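To make the read, clean, fill, and analyze steps concrete, here is a minimal sketch. The data, the column names, and the choice of mean imputation are all assumptions made up for illustration; they do not come from the original article.

```python
import io

import pandas as pd

# Hypothetical raw data: one missing age and one duplicated row.
raw = io.StringIO(
    "name,city,age\n"
    "Ann,Paris,30\n"
    "Bob,Berlin,\n"
    "Cid,Paris,50\n"
    "Cid,Paris,50\n"
)

df = pd.read_csv(raw)                           # read
df = df.drop_duplicates()                       # clean: drop the repeated row
df["age"] = df["age"].fillna(df["age"].mean())  # fill: impute with the column mean
mean_age_by_city = df.groupby("city")["age"].mean()  # analyze

print(df)
print(mean_age_by_city)
```

Mean imputation is only one of several filling strategies; whether it is appropriate depends on the data.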
pandas: powerful Python data analysis toolkit
Installing pandas
pip install pandas
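After installation, one quick way to confirm that the package is importable is to print its version. This is a small check added here for convenience, not a step from the original article.

```python
import pandas as pd

# pandas exposes its version string as a module attribute.
version = pd.__version__
print("pandas version:", version)
```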
Using pandas
1. Function reference
Pickling
| read_pickle(path[, compression]) | Load pickled pandas object (or any object) from file. |
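As a rough illustration of `read_pickle`, the sketch below writes a small DataFrame with `DataFrame.to_pickle` and loads it back. The file name and the sample data are assumptions chosen for the example.

```python
import os
import tempfile

import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": ["x", "y", "z"]})

# Work inside a temporary directory so nothing is left on disk.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "frame.pkl")  # hypothetical file name
    df.to_pickle(path)                      # serialize the DataFrame
    restored = pd.read_pickle(path)         # load it back

print(restored.equals(df))
```

Pickle files can execute arbitrary code when loaded, so `read_pickle` should only be used on files from a trusted source.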
Flat File
| read_table(filepath_or_buffer[, sep, …]) | (DEPRECATED) Read general delimited file into DataFrame. |
| read_csv(filepath_or_buffer[, sep, …]) | Read a comma-separated values (csv) file into DataFrame. |
| read_fwf(filepath_or_buffer[, colspecs, …]) | Read a table of fixed-width formatted lines into DataFrame. |
| read_msgpack(path_or_buf[, encoding, iterator]) | Load msgpack pandas object from the specified file path (removed in pandas 1.0). |
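The flat-file readers accept either a path or a file-like buffer. The sketch below feeds the same small table to `read_csv` (comma-separated and tab-separated) and to `read_fwf`; the sample text and the column boundaries passed as `colspecs` are assumptions made up for this example.

```python
import io

import pandas as pd

# Comma-separated input.
csv_df = pd.read_csv(io.StringIO("id,price\n1,9.5\n2,12.0\n"))

# Tab-separated input: same reader, different separator.
tsv_df = pd.read_csv(io.StringIO("id\tprice\n1\t9.5\n2\t12.0\n"), sep="\t")

# Fixed-width input: each column occupies exactly five characters.
fwf_text = (
    "id   price\n"
    "1    9.5  \n"
    "2    12.0 \n"
)
# colspecs lists half-open (start, end) character ranges, one per column.
fwf_df = pd.read_fwf(io.StringIO(fwf_text), colspecs=[(0, 5), (5, 10)])

print(csv_df)
print(fwf_df)
```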
Clipboard
| read_clipboard([sep]) | Read text from clipboard and pass to read_csv. |
Excel
| read_excel(io[, sheet_name, header, names, …]) | Read an Excel file into a pandas DataFrame. |
| ExcelFile.parse([sheet_name, header, names, …]) | Parse specified sheet(s) into a DataFrame. |
| ExcelWriter(path[, engine, date_format, …]) | Class for writing DataFrame objects into Excel sheets; the default is to use xlwt for xls and openpyxl for xlsx. |
JSON
| read_json([path_or_buf, orient, typ, dtype, …]) | Convert a JSON string to pandas object. |
| json_normalize(data[, record_path, meta, …]) | Normalize semi-structured JSON data into a flat table. |
| build_table_schema(data[, index, …]) | Create a Table schema from data. |
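A short sketch of the three JSON helpers follows. It assumes a pandas version (1.0 or later) in which `json_normalize` is available at the top level as `pd.json_normalize`; the sample records are invented for the example.

```python
import io

import pandas as pd
from pandas.io.json import build_table_schema

# Flat records: read_json turns a JSON array of objects into rows.
records_json = '[{"id": 1, "name": "Ann"}, {"id": 2, "name": "Bob"}]'
flat = pd.read_json(io.StringIO(records_json), orient="records")

# Nested records: json_normalize flattens inner objects into dotted columns.
nested = [
    {"id": 1, "user": {"name": "Ann", "city": "Paris"}},
    {"id": 2, "user": {"name": "Bob", "city": "Berlin"}},
]
normalized = pd.json_normalize(nested)

# Describe the flat frame as a Table Schema dictionary.
schema = build_table_schema(flat)

print(normalized.columns.tolist())
print([field["name"] for field in schema["fields"]])
```

Wrapping the JSON text in `io.StringIO` avoids passing a literal string directly, which newer pandas releases discourage.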
HTML
| read_html(io[, match, flavor, header, …]) | Read HTML tables into a list of DataFrame objects. |
HDFStore: PyTables (HDF5)
| read_hdf(path_or_buf[, key, mode]) | Read from the store, close it if we opened it. |
| HDFStore.put(key, value[, format, append]) | Store object in HDFStore. |
| HDFStore.append(key, value[, format, …]) | Append to Table in file. |
| HDFStore.get(key) | Retrieve pandas object stored in file. |
| HDFStore.select(key[, where, start, stop, …]) | Retrieve pandas object stored in file, optionally based on where criteria. |
| HDFStore.info() | Print detailed information on the store. |
| HDFStore.keys() | Return a (potentially unordered) list of the keys corresponding to the objects stored in the HDFStore. |
| HDFStore.groups() | Return a list of all the top-level nodes (that are not themselves a pandas storage object). |
| HDFStore.walk([where]) | Walk the pytables group hierarchy for pandas objects. |
Feather
| read_feather(path[, columns, use_threads]) | Load a feather-format object from the file path. |
Parquet
| read_parquet(path[, engine, columns]) | Load a parquet object from the file path, returning a DataFrame. |
SAS
| read_sas(filepath_or_buffer[, format, …]) | Read SAS files stored as either XPORT or SAS7BDAT format files. |
SQL
| read_sql_table(table_name, con[, schema, …]) | Read SQL database table into a DataFrame. |
| read_sql_query(sql, con[, index_col, …]) | Read SQL query into a DataFrame. |
| read_sql(sql, con[, index_col, …]) | Read SQL query or database table into a DataFrame. |
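The SQL readers can be tried without a database server by using the standard-library `sqlite3` module. The sketch below is one possible setup: the table name `orders` and its contents are made up for the example. `read_sql_table` is left out because it requires SQLAlchemy rather than a plain `sqlite3` connection.

```python
import sqlite3

import pandas as pd

# In-memory SQLite database, populated from a DataFrame.
con = sqlite3.connect(":memory:")
pd.DataFrame({"id": [1, 2, 3], "qty": [5, 0, 7]}).to_sql(
    "orders", con, index=False  # "orders" is a hypothetical table name
)

# Parameterized query; sqlite3 uses "?" placeholders.
in_stock = pd.read_sql_query(
    "SELECT id, qty FROM orders WHERE qty > ?", con, params=(0,)
)

# read_sql dispatches to read_sql_query when given a SQL string.
everything = pd.read_sql("SELECT * FROM orders", con)

con.close()

print(in_stock)
```

Passing values through `params` rather than formatting them into the SQL string keeps the query safe from injection.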
Google BigQuery
| read_gbq(query[, project_id, index_col, …]) | Load data from Google BigQuery. |
STATA
| read_stata(filepath_or_buffer[, …]) | Read Stata file into DataFrame. |
| StataReader.data(**kwargs) | (DEPRECATED) Reads observations from Stata file, converting them into a dataframe. |
| StataReader.data_label() | Returns data label of Stata file. |
| StataReader.value_labels() | Returns a dict, associating each variable name a dict, associating each value its corresponding label. |
| StataReader.variable_labels() | Returns variable labels as a dict, associating each variable name with corresponding label. |
| StataWriter.write_file() |
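For the Stata entries, a round trip might look like the sketch below. It writes with `DataFrame.to_stata` instead of calling `StataWriter` directly, and it assumes a pandas version in which `read_stata(..., iterator=True)` returns a `StataReader` usable as a context manager. The file name, data, and label text are illustrative only.

```python
import os
import tempfile

import pandas as pd

df = pd.DataFrame({"id": [1, 2, 3], "score": [0.5, 0.75, 1.0]})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "scores.dta")  # hypothetical file name
    # Write a .dta file and attach a descriptive label to one variable.
    df.to_stata(path, write_index=False, variable_labels={"score": "Test score"})

    # Read the observations back into a DataFrame.
    restored = pd.read_stata(path)

    # Open the file as a StataReader to inspect metadata instead of data.
    with pd.read_stata(path, iterator=True) as reader:
        labels = reader.variable_labels()

print(restored)
print(labels)
```

Integer columns may come back with a narrower integer type than they were written with, so comparing values is more reliable than comparing dtypes.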
2. Lessons from practice
Learning Python with pandas: DataFrame two-dimensional tables, with an introduction, common functions, and common examples