Building Interactive Maps in Python: The Beginner’s Guide
Welcome to The Beginner’s Guide to Building Interactive Maps in Python
In this post, I would like to show you how to create interactive climate maps from historical climate data, so that you can visualize, examine, and explore the data. Data visualization plays an important role in representing data: it presents your analysis in a form that is easier to understand. It is especially easy to get lost when working with large datasets, and that is where the power of data visualization shows. In this exercise, we will work with climate data from Kaggle and build two interactive climate maps. The first one will show the climate change of each country, and the second one will show the temperature change over time. Let’s get started, we have a lot to do!
Kaggle is the world’s largest data science community with powerful tools and resources to help you achieve your data science goals.
Table of Contents:
- Plotly
- Understanding the Data
- Data Cleaning
- Data Preprocessing
- Data Visualization
Plotly
Plotly is a Python graphing library that makes interactive, publication-quality graphs. It can produce line plots, scatter plots, area charts, bar charts, error bars, box plots, histograms, heatmaps, subplots, multiple-axes charts, polar charts, and bubble charts. It is also an open-source library.
To learn more about Plotly: Plotly Graphing Library
Understanding the Data
The Berkeley Earth Surface Temperature Study combines 1.6 billion temperature reports from 16 pre-existing archives. It is nicely packaged and allows for slicing into interesting subsets (for example by country). They publish the source data and the code for the transformations they applied.
The dataset can be found at the following link: Climate Data
The data folder includes the following datasets:
- Global Average Land Temperature by Country
- Global Average Land Temperature by State
- Global Land Temperatures By Major City
- Global Land Temperatures By City
- Global Land and Ocean-and-Land Temperatures
We will be working with the “Global Average Land Temperature by Country” dataset. It fits our goal best: we are going to build interactive climate maps, and having the data already broken down by country will make our life much easier.
Libraries
We will need three main libraries to get started. When we come to visualization, I will ask you to import a couple more sub-libraries, which are also known as library components. For now, we are going to import the following libraries:
import numpy as np
import pandas as pd
import plotly as py
If you don’t have these libraries, don’t worry. It is super easy to install them, as you can see below:
pip install numpy pandas plotly
Read Data
df = pd.read_csv("data/GlobalLandTemperaturesByCountry.csv")
print(df.head())
[Figure: head]
print(df.tail())
[Figure: tail]
# Checking the null values in each column
df.isnull().sum()
[Figure: nulls]
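The null check above can be tried on a tiny stand-in table before loading the real file. The following is a minimal sketch: the `sample` dataframe and its values are made up for illustration and only mimic the column names of the Kaggle file. It also computes the share of missing values per column, which is often easier to judge than a raw count.

```python
import numpy as np
import pandas as pd

# Made-up sample that mimics the columns of the Kaggle file (assumption)
sample = pd.DataFrame({
    "dt": ["2000-01-01", "2000-02-01", "2000-03-01", "2000-04-01"],
    "AverageTemperature": [1.5, np.nan, 3.0, np.nan],
    "AverageTemperatureUncertainty": [0.3, np.nan, 0.2, 0.4],
    "Country": ["Norway", "Norway", "Norway", "Norway"],
})

# Number of missing values per column
null_counts = sample.isnull().sum()

# Share of missing values per column, as a percentage
null_share = (sample.isnull().mean() * 100).round(1)

print(null_counts)
print(null_share)
```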
Data Cleaning
Data science is largely about understanding the data, and data cleaning is a very important part of this process. How valuable the data is depends on how much we can get from it, and preparing the data well will make your analysis results more accurate.
Let’s start the cleaning process by dropping the “AverageTemperatureUncertainty” column, because we don’t need it.
df = df.drop("AverageTemperatureUncertainty", axis=1)
Then, let’s rename the columns so they look better. As you can see below, we are using a method called rename. Isn’t it cool how easy it is to rename a column?
df = df.rename(columns={'dt':'Date'})
df = df.rename(columns={'AverageTemperature':'AvTemp'})
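As a variation on the steps above, the drop and both renames can be chained into a single expression, since `rename` accepts a dictionary with several old-to-new pairs. This is only a sketch on a made-up one-row `sample` dataframe; the names `sample` and `cleaned` are illustrative and not part of the original code.

```python
import pandas as pd

# Made-up one-row sample with the original column names (assumption)
sample = pd.DataFrame({
    "dt": ["2000-01-01"],
    "AverageTemperature": [1.5],
    "AverageTemperatureUncertainty": [0.3],
    "Country": ["Norway"],
})

# Drop the unused column, then rename both columns in one call
cleaned = (
    sample
    .drop(columns="AverageTemperatureUncertainty")
    .rename(columns={"dt": "Date", "AverageTemperature": "AvTemp"})
)

print(cleaned.columns.tolist())
```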
Lastly for the data cleaning, let’s drop the rows with null values so that they don’t affect our analysis. As we checked earlier, around 32,000 rows have a null value in the average temperature column. In total we have around 577,000 rows, so dropping them is not a big deal. In other cases, though, there are a couple of other methods for handling null values.
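For comparison with dropping rows, here is a rough sketch of two other common ways to handle null values in pandas: filling gaps with a group mean, and linear interpolation. This post does not use either of them; the `sample` data is made up, and the column name `AvTemp` follows the rename done earlier.

```python
import numpy as np
import pandas as pd

# Made-up sample with one gap per country (assumption)
sample = pd.DataFrame({
    "Country": ["Norway", "Norway", "Norway", "Chile", "Chile", "Chile"],
    "AvTemp": [1.0, np.nan, 3.0, 10.0, np.nan, 14.0],
})

# Option 1: fill gaps with the mean temperature of the same country
country_mean = sample.groupby("Country")["AvTemp"].transform("mean")
filled_mean = sample["AvTemp"].fillna(country_mean)

# Option 2: linear interpolation between neighbouring rows of each country
filled_interp = sample.groupby("Country")["AvTemp"].transform(
    lambda s: s.interpolate()
)

print(filled_mean.tolist())
print(filled_interp.tolist())
```

Which option is appropriate depends on the analysis; for this tutorial, dropping the rows is the simpler choice.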
df = df.dropna()
Now, let’s have a look at our dataframe. I will print the first 10 rows using the head method.
df.head(10)
[Figure: result]
Data Preprocessing
This step is also known as data manipulation: we filter the data so that we can focus on a specific analysis. Data preprocessing and filtering are a must, especially when working with big datasets. For example, our historical climate data contains temperatures for all 12 months of every year from 1744 to 2013, which is a very wide range. Using data filtering techniques, we will focus on a smaller range, from 2000 to 2002.
Comparison Operators
- <
- >
- <=
- >=
- ==
- !=
We will use these operators to compare a specific value to the values in a column. The result is a series of booleans: True where the comparison holds, and False where it does not.
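To make the boolean result concrete, here is a small sketch on made-up data (the `sample` dataframe and the thresholds are illustrative). It shows a single comparison, and then two comparisons combined with `&` into one mask. The date comparison relies on the dates being ISO-formatted strings (YYYY-MM-DD), which sort in chronological order.

```python
import pandas as pd

# Made-up sample (assumption)
sample = pd.DataFrame({
    "Date": ["1999-12-01", "2000-06-01", "2001-06-01", "2002-06-01"],
    "AvTemp": [-2.0, 15.5, 16.1, 15.9],
})

# Comparing a column to a value gives one boolean per row
is_warm = sample["AvTemp"] > 15.8
print(is_warm.tolist())

# Two comparisons combined with & into a single mask;
# each comparison needs its own parentheses
in_range = (sample["Date"] > "2000-01-01") & (sample["Date"] <= "2002-01-01")
print(sample[in_range])
```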
Grouping by
In this step, we group the dataframe by the Country and Date columns, and sort the values by date from latest to earliest.
df_countries = df.groupby(['Country', 'Date']).sum().reset_index().sort_values('Date', ascending=False)
[Figure: result]
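The same chain can be tried on a few made-up rows to see what each step does. In this sketch (the `sample` data is illustrative), every country and date pair occurs only once, so `sum()` leaves the temperatures unchanged; the grouping mainly reorders the columns and rows.

```python
import pandas as pd

# Made-up sample with one row per country and month (assumption)
sample = pd.DataFrame({
    "Date": ["2000-01-01", "2000-02-01", "2000-01-01", "2000-02-01"],
    "AvTemp": [-3.0, -2.5, 17.0, 16.5],
    "Country": ["Norway", "Norway", "Chile", "Chile"],
})

grouped = (
    sample.groupby(["Country", "Date"])   # one group per country and date
    .sum()                                # add up AvTemp within each group
    .reset_index()                        # turn the group keys back into columns
    .sort_values("Date", ascending=False) # latest dates first
)

print(grouped)
```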
Masking by the date range
start_date = '2000-01-01'
end_date = '2002-01-01'
mask = (df_countries['Date'] > start_date) & (df_countries['Date'] <= end_date)
df_countries = df_countries.loc[mask]
df_countries.head(10)
[Figure: result]
As you can see above, the dataframe is looking great: it is sorted by date and limited to the selected date range, with one row per country and month. We can find the average temperature of each country in each month by looking at this dataframe. Here comes the fun part, which is data visualization. Are you ready?
Data Visualization
Components of Plotly
Before we start, as mentioned earlier, there are a couple of sub-libraries to import for the data visualization. These sub-libraries are also known as components.
# Plotly Components
import plotly.express as px
import plotly.graph_objs as go
from plotly.subplots import make_subplots
from plotly.offline import download_plotlyjs, init_notebook_mode, plot, iplot
Climate Change Map
Perfect. Now run the following code and you will see the magic happen.
# Creating the visualization
fig = go.Figure(data=go.Choropleth(
    locations=df_countries['Country'],
    locationmode='country names',
    z=df_countries['AvTemp'],
    colorscale='Reds',
    marker_line_color='black',
    marker_line_width=0.5,
))
fig.update_layout(
    title_text='Climate Change',
    title_x=0.5,
    geo=dict(
        showframe=False,
        showcoastlines=False,
        projection_type='equirectangular'
    )
)
fig.show()
[Figure: climate change interactive map]
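One thing to keep in mind with a static map like this is that the filtered dataframe holds several rows per country (one per month), while the map colors each country once. As a possible variation, not the approach used in this post, the data could first be collapsed to one mean value per country. The sketch below shows only the pandas step on made-up data; `sample` and `country_mean` are illustrative names.

```python
import pandas as pd

# Made-up sample with two months per country (assumption)
sample = pd.DataFrame({
    "Country": ["Norway", "Norway", "Chile", "Chile"],
    "Date": ["2000-02-01", "2000-03-01", "2000-02-01", "2000-03-01"],
    "AvTemp": [-3.0, -1.0, 17.0, 15.0],
})

# One row per country, holding the mean temperature over the period
country_mean = sample.groupby("Country", as_index=False)["AvTemp"].mean()

print(country_mean)
```

The resulting `country_mean['Country']` and `country_mean['AvTemp']` columns could then be passed as `locations` and `z` in the same way as above.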
Climate Change by Timeline
# Manipulating the original dataframe
df_countrydate = df_countries.groupby(['Date', 'Country']).sum().reset_index()
# Creating the visualization
fig = px.choropleth(
    df_countrydate,
    locations="Country",
    locationmode="country names",
    color="AvTemp",
    hover_name="Country",
    animation_frame="Date"
)
fig.update_layout(
    title_text='Average Temperature Change',
    title_x=0.5,
    geo=dict(
        showframe=False,
        showcoastlines=False,
    )
)
fig.show()
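The animation above uses one frame per month. If you later widen the date range, a possible variation is to average the data per year first, which gives one frame per year. The sketch below shows that preparation step on made-up data; the `Year` column and the `yearly` dataframe are illustrative names, not part of the original code.

```python
import pandas as pd

# Made-up sample, deliberately out of chronological order (assumption)
sample = pd.DataFrame({
    "Date": ["2001-01-01", "2000-01-01", "2000-07-01", "2001-07-01"],
    "Country": ["Chile", "Chile", "Chile", "Chile"],
    "AvTemp": [18.0, 17.0, 9.0, 10.0],
})

# Extract the year from the date strings
sample["Year"] = pd.to_datetime(sample["Date"]).dt.year

# Mean temperature per year and country, sorted so frames play in order
yearly = (
    sample.groupby(["Year", "Country"], as_index=False)["AvTemp"]
    .mean()
    .sort_values("Year")
)

print(yearly)
```

A dataframe like `yearly` could be passed to `px.choropleth` with `animation_frame="Year"` in place of `"Date"`.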
Results
Both images show the same map. In the first one, you can see the change in average temperature. In the second one, I am hovering over some countries, which shows more detailed information about each of them.
[Figure: interactive map 1]
[Figure: interactive map 2]
Thank you for reading this post. I hope you enjoyed it and learned something new today. Feel free to contact me through my blog if you have any questions while implementing the code. I will be more than happy to help. You can find more posts I’ve published related to Python and Machine Learning. Stay safe and happy coding!
I am Behic Guven, and I love sharing stories on creativity, programming, motivation, and life.
Follow my blog and Towards Data Science to stay inspired.
Translated from: https://towardsdatascience.com/building-interactive-maps-in-python-the-beginners-guide-5711dd66257e