A Complete Data Science Portfolio Project
In this article, I would like to showcase what might be my simplest data science project ever.
I have spent hours training much more complex models in the past, and have struggled to find the right parameters when building machine learning pipelines.
Despite its simplicity, if I could only display one project on my resume, it would be this one.
Let me explain why.
Does the package determine the value of the gift?
As a child, I would always get excited about holidays because I could get gifts. (Just humour me here; I do have a point, I promise.) My aunt once gave me a beautiful dress, perhaps more beautiful than any other gift I received that day.
Here’s the thing though — I didn’t even want to open it. She had shabbily wrapped it with newspaper, and the gift seemed to have lost half its value before I even saw what was inside.
To answer the question above, no. The package by no means determines the value of the gift.
However, it can greatly influence your expectation of what’s inside and can change the way you perceive it.
The machine learning models you spend weeks training are great. Demonstrate that. Don’t let them die in your Jupyter Notebook.
Recruiters have hundreds of resumes to read. It is almost impossible for them to read through all your code on GitHub and understand all your projects.
To stand out, you need to do something slightly different. Create an interface they can interact with. Maybe a live dashboard they can play around with.
Even if it's not the best dashboard or interface out there, it will create interest, because you created something they can actually use.
I wanted to do exactly that, which is why I came up with this portfolio project. In the next few sections, I will explain exactly what I did without going too much into the technical detail.
Aim
I aimed to display skills in the following areas:
- Data Collection
- Data Wrangling
- Data Visualization
- Machine Learning
- Web Development
In order to do so, I created the following components in my project:
- Front-End Interface
- Movie Dashboard
- Movie Recommender System
I will explain and demonstrate each component in detail.
Note: If you don’t want to read the entire article and just want to see the final product, scroll down to the ‘Links’ section.
Front-End Interface
In the past, I would create projects and let the code sit in my GitHub repository. Occasionally, I would write an article on Medium explaining a project.
Here, I took a different approach.
I created a web page that explains the different components of my project. I wrote briefly about how users can interact with the systems I created, and added links to my code and Medium article.
The entire project can be understood and accessed through just one page, which makes it so much easier for people to engage with.
You can check the site out here. View it on a laptop or PC for a better UI experience.
Movie Dashboard
Next, I created a movie dashboard with Tableau.
The steps involved:
Data Collection
I had to collect data from several different sources. I also wanted to visualize the Bechdel scores of these movies (a measure of female representation in Hollywood), so I used an API to get that data.
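To give a rough idea of what collecting one score through an API can look like, here is a minimal sketch. The article does not say which endpoint or response fields were used, so the URL in `BECHDEL_ENDPOINT` and the field names `title`, `year`, and `rating` are assumptions made for illustration, and the helper names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the original project only says "an API" was used. This URL is
# illustrative and may not match the endpoint the project actually called.
BECHDEL_ENDPOINT = "https://bechdeltest.com/api/v1/getMovieByImdbId"


def build_url(imdb_id: str) -> str:
    """Build the request URL for a single movie, identified by its IMDb id."""
    return f"{BECHDEL_ENDPOINT}?{urlencode({'imdbid': imdb_id})}"


def parse_record(payload: str) -> dict:
    """Keep only the fields a dashboard would need.

    Assumption: the JSON response carries "title", "year" and "rating" keys.
    """
    raw = json.loads(payload)
    return {
        "title": raw["title"].strip(),
        "year": int(raw["year"]),
        "bechdel_score": int(raw["rating"]),
    }


def fetch_score(imdb_id: str, timeout: float = 10.0) -> dict:
    """Request one movie over the network and parse the response."""
    with urlopen(build_url(imdb_id), timeout=timeout) as response:
        return parse_record(response.read().decode("utf-8"))
```

Keeping the parsing step separate from the network call makes it possible to check the field handling on a saved response before requesting hundreds of movies.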
Data Wrangling
I cleaned the data and merged the datasets together. Once I was done, I could finally visualize it!
我清理了數(shù)據(jù)并將數(shù)據(jù)集合并在一起。 完成后,我終于可以將其可視化!
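The cleaning and merging step can be sketched roughly as follows. This is not the project's actual code: the column names (`title`, `year`, `bechdel_score`) and the choice to join on a normalized title plus release year are assumptions, and in practice a library such as pandas would usually do this work.

```python
def normalize_title(title: str) -> str:
    """Lower-case a title and collapse whitespace so near-identical titles match."""
    return " ".join(title.lower().split())


def merge_on_title(movies: list, scores: list) -> list:
    """Left-join score records onto movie records by (normalized title, year).

    Movies without a matching score keep a bechdel_score of None.
    """
    score_by_key = {
        (normalize_title(s["title"]), s["year"]): s["bechdel_score"]
        for s in scores
    }
    merged = []
    for movie in movies:
        key = (normalize_title(movie["title"]), movie["year"])
        merged.append({**movie, "bechdel_score": score_by_key.get(key)})
    return merged


# Hypothetical sample rows, only to show the shape of the data.
movies = [
    {"title": "Alien", "year": 1979, "genre": "sci-fi"},
    {"title": "Notting  Hill", "year": 1999, "genre": "romance"},
]
scores = [{"title": "alien ", "year": 1979, "bechdel_score": 3}]
merged = merge_on_title(movies, scores)
```

A left join keeps every movie in the result, so titles with no score still show up in the dashboard and can be filtered there.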
Data Visualization
Surprisingly, this took up a huge portion of my time compared to other parts of this project.
I spent two days trying to create a visually appealing dashboard.
I first created one with a Python Dash app. I wasn’t too satisfied with the layout, so I tried creating a Shiny web app in R instead.
It turned out better than my Dash app, and I loved the functionality. However, I simply didn’t find the design appealing.
Finally, I decided to use Tableau. This only took me about an hour to create. If you want to get started with Tableau, you can read this tutorial I created.
You can view my dashboard here. View it on a laptop or PC for a better UI experience.
Recommender System
Finally, machine learning!
I created a simple recommendation system with the same data I used for the dashboard and deployed it with a Dash app.
Just enter a movie name, and it uses the back-end recommendation system to generate movie suggestions for you.
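The article does not describe how the recommender works internally, so the following is only a sketch of one simple approach that fits the description: a content-based lookup that ranks other movies by how much their genre tags overlap with the movie the user typed in (Jaccard similarity). The catalog, the genre tags, and the function names are all hypothetical.

```python
def jaccard(a: set, b: set) -> float:
    """Share of tags the two sets have in common (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(title: str, catalog: dict, top_n: int = 3) -> list:
    """Return up to top_n titles whose genre tags overlap most with `title`.

    The lookup is case-insensitive; an unknown title yields an empty list.
    """
    lookup = {name.lower(): name for name in catalog}
    key = title.strip().lower()
    if key not in lookup:
        return []
    query = lookup[key]
    scored = [
        (jaccard(catalog[query], genres), name)
        for name, genres in catalog.items()
        if name != query
    ]
    # Highest similarity first; ties are broken alphabetically.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for score, name in scored[:top_n] if score > 0]


# Hypothetical catalog, only to illustrate the input format.
CATALOG = {
    "Alien": {"horror", "sci-fi"},
    "Aliens": {"action", "horror", "sci-fi"},
    "Blade Runner": {"sci-fi", "thriller"},
    "Notting Hill": {"comedy", "romance"},
}

suggestions = recommend("alien", CATALOG)
```

With this sample catalog, `suggestions` is `["Aliens", "Blade Runner"]`. A function like `recommend` is the kind of back end a Dash text input could call each time the user submits a movie name.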
Actually, this recommendation system was created when I was just starting to learn machine learning.
I found the code in my Jupyter Notebook, and decided to clean it up a bit to create this simple application.
You can take a look at the recommendation system here. View it on a laptop or PC for a better UI experience.
That’s it!
Links
Front-End Interface
Movie Dashboard
Recommender System
Code (I apologize that the code is pretty messy; I will clean it up and re-upload it soon.)
I hope you enjoyed this article and found the tips above helpful. Jupyter Notebooks are great, but don’t let your projects just sit there.
Use your creativity to create something other people can interact with.
I’ve seen some incredible projects on GitHub with only one star. On the other hand, I’ve also seen some really simple projects gain a lot of attention just because of how they were presented.
Most importantly though, create projects you like to work on and do what you feel is enjoyable!
Translated from: https://towardsdatascience.com/a-complete-data-science-portfolio-project-ebbced35ea84