
Exploratory Data Analysis: A Case Study on the Titanic Data Set (Part 2)

Published: 2023/11/29

Data is simply useless until you know what it’s trying to tell you.

With this quote we’ll continue our quest to find the hidden secrets of the Titanic. ‘The Unsinkable’, as its designers and makers claimed it to be, proved that even the best of human engineering may sometimes fail when nature puts it to the test.

In the last article, we saw the different attributes of the data and had a quick glance at what the data looked like. If you haven’t read Part 1 of this blog, I recommend reading it before continuing. In this article we’ll look at the relationship of each attribute to the survival of the passenger, and continue our quest to find out whether you would’ve survived the sinking of the Titanic or not.

1. Correlation of Passenger Class with Survival

There are 3 passenger classes on the ship. Let’s find out the number of passengers in each class.
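The original code for this step is not preserved here, so what follows is only a minimal sketch of how such a count could be done with pandas. The tiny data frame is made up for illustration; the column names `Pclass` and `Survived` are assumed to follow the usual Kaggle Titanic layout.

```python
import pandas as pd

# Hypothetical miniature sample, NOT the real Titanic data.
# In practice the frame would be loaded from the training file instead.
df = pd.DataFrame({
    "Pclass":   [1, 1, 2, 3, 3, 3, 2, 3],
    "Survived": [1, 1, 0, 0, 1, 0, 1, 0],
})

# Number of passengers in each class, ordered by class label.
class_counts = df["Pclass"].value_counts().sort_index()
print(class_counts)
```

`sort_index()` is only there so the classes print as 1, 2, 3 rather than in order of frequency.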


Now, let’s find out the total number of survivors from each class.
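One possible way to get the survivors and the survival percentage per class is a grouped aggregation. This is a sketch on made-up rows, assuming `Survived` is coded as 1 for survived and 0 otherwise; the output column names are my own choice.

```python
import pandas as pd

# Hypothetical miniature sample, NOT the real Titanic data.
df = pd.DataFrame({
    "Pclass":   [1, 1, 2, 3, 3, 3, 2, 3],
    "Survived": [1, 1, 0, 0, 1, 0, 1, 0],
})

# Passengers and survivors per class; summing a 0/1 column counts the survivors.
per_class = df.groupby("Pclass")["Survived"].agg(passengers="count", survivors="sum")

# Survival percentage per class, rounded to two decimals.
per_class["survival_pct"] = (100 * per_class["survivors"] / per_class["passengers"]).round(2)
print(per_class)
```

On the full data set, the same table is where percentages like the ones quoted in this section would come from.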


As you can see, Upper Class passengers fared better than the other two classes, with a survival percentage of around 62.96%.

The survival percentage of Middle Class passengers is around 47.28%, which is better than that of the Lower Class but worse than that of the Upper Class.

The Lower Class was hit the hardest, with a survival percentage of just 24.23%, which is significantly lower than that of the other two classes.

The results indicate that surviving the sinking of the Titanic was strongly associated with the class you belonged to, which points to discrimination based on class.

2. Correlation of Gender with Survival

Let’s start by printing the number of passengers of each gender.


Now, let’s find out the survival percentage of the passengers belonging to each gender.

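A cross-tabulation is one way to get both numbers at once. The sketch below uses made-up rows and assumes a `Sex` column holding the strings `female` and `male`.

```python
import pandas as pd

# Hypothetical miniature sample, NOT the real Titanic data.
df = pd.DataFrame({
    "Sex":      ["female", "male", "male", "female", "male", "female", "male", "male"],
    "Survived": [1, 0, 0, 1, 1, 0, 0, 0],
})

# Number of passengers of each gender.
gender_counts = df["Sex"].value_counts()

# Row-normalised cross-tab: each row sums to 100, column 1 is the survival percentage.
survival_pct = pd.crosstab(df["Sex"], df["Survived"], normalize="index") * 100
print(gender_counts)
print(survival_pct.round(2))
```

`normalize="index"` divides each row by its own total, so the two genders can be compared even though their passenger counts differ.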


The data suggests that women were given the highest priority when lives were being saved. About 74.2% of the women survived, compared with 18.89% of the men. (How pure these gentlemen were! 😢)

3. Correlation of Age with Survival

Now, let’s look at the effect of age on survival. But first, let’s take a quick look at some statistics for age, along with the values that are missing from the data set.
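A quick way to get both the summary statistics and the number of missing values is sketched below. The ages are made up, and missing entries are assumed to be stored as NaN, which is how pandas reads empty cells by default.

```python
import pandas as pd

# Hypothetical miniature sample, NOT the real Titanic data; two ages are missing.
nan = float("nan")
df = pd.DataFrame({"Age": [22.0, 38.0, nan, 35.0, nan, 54.0, 2.0, 27.0]})

# count, mean, std, min, quartiles and max; NaN values are skipped.
age_stats = df["Age"].describe()

# Number of passengers whose age is not recorded.
missing_ages = df["Age"].isna().sum()

print(age_stats)
print("Missing ages:", missing_ages)
```

Note that `count` in the `describe()` output only covers the non-missing values, so it plus the missing count should add up to the number of rows.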


There are a total of 177 missing values, i.e. the ages of 177 passengers are missing from the data set. These missing values may cause problems during prediction and hence need to be addressed.

Now, let’s visualize the data by plotting some histograms.

kde = True adds a kernel density estimate curve to the histogram, and the rug is the set of small markings that show the exact points at which the data were recorded.


Now, let’s check the survival in each group by plotting the following graph with kde. The y-axis denotes the probability density from the kernel density estimation, and the area under the kde curve over an interval of the x-axis gives the probability of a value falling in that interval.
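To make the "area under the curve" idea concrete, here is a small hand-rolled Gaussian kernel density estimate in NumPy. It is a sketch for intuition, not what the plotting library does internally: the ages are made up, the bandwidth of 5 years is an arbitrary assumption, and the helper name `gaussian_kde_density` is hypothetical.

```python
import numpy as np

def gaussian_kde_density(samples, grid, bandwidth):
    """Evaluate a Gaussian kernel density estimate of `samples` at each point of `grid`."""
    samples = np.asarray(samples, dtype=float)
    # One standardised distance per (grid point, sample) pair.
    z = (grid[:, None] - samples[None, :]) / bandwidth
    kernels = np.exp(-0.5 * z**2) / np.sqrt(2 * np.pi)
    # Average of the kernels, rescaled so the curve integrates to 1.
    return kernels.sum(axis=1) / (len(samples) * bandwidth)

ages = [2, 4, 18, 19, 22, 24, 26, 27, 28, 30, 35, 38, 40, 54, 62]  # made-up ages
grid = np.linspace(-40, 110, 3001)
density = gaussian_kde_density(ages, grid, bandwidth=5.0)

step = grid[1] - grid[0]
# Riemann-sum approximation of the area under the whole curve (should be close to 1).
total_area = density.sum() * step
# Area over the interval [20, 30): estimated probability of an age in that range.
in_twenties = (grid >= 20) & (grid < 30)
prob_20_to_30 = density[in_twenties].sum() * step

print(round(total_area, 4), round(prob_20_to_30, 4))
```

The height of the curve at a single age is a density, not a probability; only areas over a range of ages can be read as probabilities.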


The following plot shows the distribution of gender in each age group.


Now, let’s compare survival in each of these groups using a kde plot.


We can also understand what’s represented in these histograms as follows:

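One way to read a histogram as numbers is to bin the ages and count passengers and survivors per bin. This is a sketch under assumptions: the rows are made up and the 10-year bin edges are my own choice, not necessarily the ones behind the original plots.

```python
import pandas as pd

# Hypothetical miniature sample, NOT the real Titanic data.
df = pd.DataFrame({
    "Age":      [2, 4, 18, 19, 22, 24, 26, 35, 38, 40, 54, 62],
    "Survived": [1, 1, 0, 1, 0, 0, 1, 1, 0, 0, 1, 0],
})

# Assumed 10-year age groups; each interval is open on the left, closed on the right.
bins = [0, 10, 20, 30, 40, 50, 60, 70]
df["AgeGroup"] = pd.cut(df["Age"], bins=bins)

# Passengers and survivors per age group; observed=False keeps empty groups in the table.
by_age = df.groupby("AgeGroup", observed=False)["Survived"].agg(
    passengers="count", survivors="sum"
)
print(by_age)
```

The `passengers` column matches the bar heights of an age histogram with the same bins, and `survivors` is the part of each bar that survived.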


4. Correlation of the No. of Siblings/Spouses of the Passenger with Survival

Let’s start by understanding the distribution of values of this attribute.


Now, let’s plot the histogram describing the survival of passengers by their number of Siblings/Spouses.


The inference from the above histogram can be derived using the following code:
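The original snippet did not survive in this copy, so here is a minimal sketch of what such a summary could look like. The rows are made up, and `SibSp` is assumed to be the column holding the number of siblings/spouses aboard.

```python
import pandas as pd

# Hypothetical miniature sample, NOT the real Titanic data.
df = pd.DataFrame({
    "SibSp":    [0, 0, 0, 1, 1, 2, 0, 1],
    "Survived": [0, 1, 0, 1, 1, 0, 0, 0],
})

# For each SibSp value: how many passengers, how many survived, and the survival rate.
sibsp_summary = df.groupby("SibSp")["Survived"].agg(["count", "sum", "mean"])
sibsp_summary.columns = ["passengers", "survivors", "survival_rate"]
print(sibsp_summary)
```

Because `Survived` is 0/1, its mean within a group is directly the fraction of that group that survived.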


5. Correlation of the No. of Parents/Children with Survival

The distribution of the number of Parents/Children is as follows:


Here are two different plots showing the survival of passengers by their number of Parents/Children. The first one uses ‘distplot’ and the second one uses ‘countplot’.
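A count plot split by survival is essentially a picture of a two-way count table, so the same information can be sketched without any plotting. The rows are made up, and `Parch` is assumed to be the column with the number of parents/children aboard.

```python
import pandas as pd

# Hypothetical miniature sample, NOT the real Titanic data.
df = pd.DataFrame({
    "Parch":    [0, 0, 1, 2, 0, 1, 0, 2],
    "Survived": [0, 1, 1, 0, 0, 1, 0, 1],
})

# Rows: number of parents/children; columns: 0 = did not survive, 1 = survived.
# Each cell is the height of one bar in the corresponding count plot.
parch_counts = df.groupby(["Parch", "Survived"]).size().unstack(fill_value=0)
print(parch_counts)
```

`fill_value=0` makes combinations that never occur show up as 0 instead of NaN.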


6. Correlation of Fare with Survival

Now, let’s try to understand whether there was any regularity in the fare and whether it has any relation to survival. The code describes the distribution of the fare.


Let’s plot the distribution of Fare, split by Survival.


Let’s check whether the passengers were charged uniformly or not. If not, let’s try to understand which factors decided the fare for the tickets.

To check whether ‘Gender’ was a factor in deciding the fare of the tickets, here’s the plot for each port of embarkation, followed by the inference from it.
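The numbers behind such a plot can be sketched as a pivot table of mean fare by port and gender. The fares below are made up; `Embarked` is assumed to use the codes S, C and Q for Southampton, Cherbourg and Queenstown.

```python
import pandas as pd

# Hypothetical miniature sample, NOT the real Titanic data.
df = pd.DataFrame({
    "Embarked": ["S", "S", "C", "C", "Q", "Q", "S", "C"],
    "Sex":      ["male", "female", "male", "female", "male", "female", "male", "female"],
    "Fare":     [8.0, 26.0, 30.0, 80.0, 7.5, 7.75, 10.0, 60.0],
})

# Mean fare for every (port, gender) combination.
mean_fare = df.pivot_table(values="Fare", index="Embarked", columns="Sex", aggfunc="mean")
print(mean_fare)
```

Comparing the two columns row by row shows, for each port, whether one gender paid more on average.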


Thus, as per the data, the mean fare charged for women was significantly higher in Cherbourg and Southampton.

To check whether ‘Embarkation’, ‘Class’ and ‘Age’ were factors in deciding the fare of the tickets, here’s the plot for each port of embarkation and class, split by ‘Survival’ status, followed by the inference from it.
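As a rough numerical counterpart to the plot, one could look at how much the fare varies within each (port, class) group and how strongly it correlates with age. This is only a sketch: the rows are made up, and they are deliberately constructed so that the fare depends only on port and class, which is not a claim about the real data.

```python
import pandas as pd

# Hypothetical miniature sample, NOT the real Titanic data.
# Fares are constructed to depend only on port and class, purely for illustration.
df = pd.DataFrame({
    "Embarked": ["S", "S", "S", "C", "C", "C", "S", "C"],
    "Pclass":   [1, 3, 3, 1, 3, 1, 1, 3],
    "Age":      [40.0, 22.0, 30.0, 35.0, 18.0, 50.0, 28.0, 45.0],
    "Fare":     [52.0, 8.0, 8.0, 80.0, 14.0, 80.0, 52.0, 14.0],
})

# Mean and spread of the fare inside each (port, class) group.
# A small spread suggests the group itself largely determines the fare.
fare_by_port_class = df.groupby(["Embarked", "Pclass"])["Fare"].agg(["mean", "std"])

# Pearson correlation between age and fare over all passengers.
age_fare_corr = df["Age"].corr(df["Fare"])

print(fare_by_port_class)
print("Age/Fare correlation:", round(age_fare_corr, 3))
```

On real data the within-group spread will not be zero, so it is the relative size of the numbers that matters.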


Thus, it is evident from the data that tickets were priced mostly on the basis of Pclass and the point of Embarkation but not on the basis of Age.


7. Correlation of Embarkation with Survival

So far, we have seen the description of the data with numerical attributes. Here’s a look at the description of the categorical data.
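A minimal sketch of describing only the non-numeric columns is shown below. The rows are made up, and selecting columns with `select_dtypes(exclude="number")` is just one way to do it.

```python
import pandas as pd

# Hypothetical miniature sample, NOT the real Titanic data.
df = pd.DataFrame({
    "Sex":      ["male", "female", "male", "male", "female", "male", "female", "male"],
    "Embarked": ["S", "C", "S", "Q", "S", "C", "S", "S"],
    "Age":      [22.0, 38.0, 26.0, 35.0, 28.0, 54.0, 2.0, 27.0],
})

# Keep only the non-numeric columns, then summarise them.
# For categorical data, describe() reports count, unique, top (most frequent value) and freq.
categorical_summary = df.select_dtypes(exclude="number").describe()
print(categorical_summary)
```

`top` and `freq` together tell you the most common category in each column and how often it appears.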


Here’s a plot showing the proportion of passengers who survived from each port of Embarkation.


And now here’s the pair-plot of the attributes that we have discussed so far.


As you might have noticed, we’ve ignored Passenger_Id, Name of the Passenger, Ticket and Cabin No., as they play little to no role in determining the survival of the passenger.

Thus, we tried to understand the data by visualizing it with various techniques and uncovered various mysteries related to the Titanic. In the next article we’ll look at the types of data and why some types of data need to be converted into a specific format before various Machine Learning models can be fit on them. Thank you for joining me throughout this journey of exploration. I hope you’ve enjoyed the experience of being a detective! 🕵


Translated from: https://medium.com/@bapreetam/exploratory-data-analysis-a-case-study-on-titanic-data-set-part-2-96a9f3df963a
