Recommending a Movie Using Collaborative Filtering

Published: 2023/11/29

ALSO, ARE RECOMMENDER SYSTEMS INFLUENCING OUR TASTE?

An excerpt on creating a movie recommender system similar to those used by OTT platforms.

INTRODUCTION

Formally defined, a recommender system is a system that seeks to predict or filter items according to the user's preferences. The demand for good recommender systems is soaring, especially since the onset of the Covid-19 lockdowns, which forced everyone to stay home and watch movies from their favourite genre, actor, or director. You get the idea. This is where a recommender system plays an important role: it offers users content they are more likely to watch, rather than leaving them to search for something that interests them, which would hurt the user experience.

The essence of a recommender system lies in its recommendation engine. There are two types of recommendation engine:

Content-based filtering engine: It provides recommendations by matching the description of a movie against a user profile built from the interests the user has provided. It has an explicit understanding of why something is recommended. You might have seen this in apps that ask about your preferences as soon as you sign up. This is what those questions are for.

Collaborative filtering engine: It is a method of making automatic predictions about the interests of a user by collecting preference or taste information from the activity of the current user along with many other users with similar activity (collaborating). The underlying assumption of the collaborative filtering approach is that if person A has the same opinion as person B on one issue, A is more likely to share B's opinion on a different issue than the opinion of a randomly chosen person. It does not need any explicit understanding of why something is recommended. You might have noticed on one of your OTT platforms that, when you open a particular movie, there is an array of movies under the heading "people who watched this movie also watched". This is the approach behind it.
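The assumption behind collaborative filtering can be illustrated with a toy example. This is only a sketch: the watch histories, the Jaccard overlap measure, and the function names are assumptions made for illustration, not something taken from the article's code.

```python
# Hypothetical watch histories: user -> set of movies the user liked.
liked = {
    "A": {"Inception", "Interstellar", "Tenet"},
    "B": {"Inception", "Interstellar", "Dunkirk"},
    "C": {"Frozen", "Moana"},
}


def jaccard(a, b):
    """Overlap between two sets of liked movies (0 = nothing shared, 1 = identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(user, liked):
    """Suggest movies liked by the most similar other user that `user` has not liked yet."""
    others = [u for u in liked if u != user]
    nearest = max(others, key=lambda u: jaccard(liked[user], liked[u]))
    return sorted(liked[nearest] - liked[user])


suggestions = recommend("A", liked)
print(suggestions)
```

User A shares two movies with B and none with C, so B's remaining movie is suggested to A. No description of the movies themselves is needed, which is the difference from content-based filtering.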

Equipped with these basics, let's dive into creating a movie recommender system using collaborative filtering.

We start by importing the required libraries. We will be using scikit-surprise, which includes an SVD (Singular Value Decomposition) algorithm. SVD allows us to extract and untangle information, which is really helpful in creating a recommender system.

This topic involves a lot of statistical data analysis. Resources to learn more about scikit-surprise and SVD:

The first thing one must do before creating a model is to observe the data. This gives us a lot of insight into what type of data it is and how we can get the most out of it.

As we observe the data, we see that the timestamp column is not needed for this model, so it is best to remove it.

It is always good practice to check for NaNs in your dataset. Luckily, we don't have any.
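The two cleaning steps above can be sketched in plain Python. The rows and the column names `userId`, `movieId`, and `rating` are assumptions for illustration (only the timestamp column is named in the text), and the article itself most likely performs these steps on a data frame.

```python
import math

# Hypothetical ratings rows; only the "timestamp" column is named in the article.
rows = [
    {"userId": 1, "movieId": 10, "rating": 4.0, "timestamp": 964982703},
    {"userId": 1, "movieId": 20, "rating": 3.5, "timestamp": 964981247},
    {"userId": 2, "movieId": 10, "rating": 5.0, "timestamp": 964982224},
]

# Drop the timestamp column, since the model does not use it.
cleaned = [{k: v for k, v in row.items() if k != "timestamp"} for row in rows]


def is_missing(value):
    """Treat None and float NaN as missing values."""
    return value is None or (isinstance(value, float) and math.isnan(value))


# Count missing values across all remaining columns.
missing_count = sum(is_missing(v) for row in cleaned for v in row.values())
print(missing_count)
```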

Now comes the main part of this model: Exploratory Data Analysis

To start, we look at the number of movies and users in the dataset.

Now we find the sparsity of the data. Sparsity is the percentage of user-movie pairs that have no rating. Not every user rates every movie, so sparsity is the number of missing ratings divided by the total number of possible ratings (number of users times number of movies). Sparsity for this data is 98%. Usually, the lower the sparsity, the better, but in the case of collaborative filtering, anything below 99% is manageable.

Sparsity (%) = (Number of missing values / Total number of values) * 100
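As a worked example of the sparsity formula, here is a minimal sketch on a tiny, made-up ratings list. It assumes that the total number of values is the number of users times the number of movies and that each user-movie pair appears at most once.

```python
# Hypothetical (user, movie, rating) triples.
ratings = [
    (1, 10, 4.0),
    (1, 20, 3.5),
    (2, 10, 5.0),
    (3, 30, 2.0),
]

n_users = len({user for user, _, _ in ratings})
n_movies = len({movie for _, movie, _ in ratings})

# Every user could in principle rate every movie.
total_values = n_users * n_movies
missing_values = total_values - len(ratings)

sparsity = missing_values / total_values * 100
print(f"{n_users} users, {n_movies} movies, sparsity = {sparsity:.2f}%")
```

With 3 users and 3 movies there are 9 possible ratings, of which 5 are missing, so the sparsity is about 55.56%.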

Now we try to visualize the ratings distribution.

Most of the ratings are between 3 and 5, and the ratings range from 0.5 to 5.

FEATURE ENGINEERING

Now comes the next essential part of the system: feature engineering. I believe feature engineering is as important as building the model, as it helps the model learn from the data and converge better.

Here we reduce the dimensions by removing data that carries too little signal, such as movies with fewer than 3 ratings or users who rated fewer than 3 movies, as it is difficult to recommend anything with so little data to analyse.
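The filtering step can be sketched as follows. This is an assumption about how the thresholds are applied: counts are taken once on the original data and both conditions are checked in a single pass; the article does not say whether it repeats the filter until the counts stabilise. The data and names are hypothetical.

```python
from collections import Counter

MIN_RATINGS = 3  # threshold stated in the text


def filter_sparse(ratings, min_ratings=MIN_RATINGS):
    """Keep ratings whose movie and user each have at least `min_ratings` ratings."""
    per_movie = Counter(movie for _, movie, _ in ratings)
    per_user = Counter(user for user, _, _ in ratings)
    return [
        (user, movie, rating)
        for user, movie, rating in ratings
        if per_movie[movie] >= min_ratings and per_user[user] >= min_ratings
    ]


# Users 1-3 each rate movies 10, 20 and 30.
ratings = [(u, m, 4.0) for u in (1, 2, 3) for m in (10, 20, 30)]
ratings.append((4, 10, 2.0))  # user 4 has rated only one movie
ratings.append((1, 40, 5.0))  # movie 40 has only one rating

kept = filter_sparse(ratings)
print(len(ratings), "->", len(kept))
```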

Now let's start creating the model.

To start, we create a Surprise Dataset for training. We use the Reader class that we imported to specify the expected rating scale, which we found during our exploratory data analysis, and then load our data through the imported Dataset class.

Now, as we are using our whole dataset for training, we create an anti-test set, which consists of all the user-movie pairs that have no rating, and on which we can test.
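The idea of an anti-test set can be shown without any library. The sketch below is a plain-Python illustration with made-up data and a hypothetical helper name, not the Surprise implementation (Surprise offers this through the `build_anti_testset` method of a trainset).

```python
def unrated_pairs(ratings):
    """Return every (user, movie) pair that does not appear in `ratings`."""
    users = sorted({user for user, _, _ in ratings})
    movies = sorted({movie for _, movie, _ in ratings})
    rated = {(user, movie) for user, movie, _ in ratings}
    return [(u, m) for u in users for m in movies if (u, m) not in rated]


# Hypothetical (user, movie, rating) triples.
ratings = [
    (1, 10, 4.0),
    (1, 20, 3.5),
    (2, 10, 5.0),
    (3, 30, 2.0),
]

anti_testset = unrated_pairs(ratings)
print(anti_testset)
```

These are exactly the pairs for which a recommender has to estimate a rating, since the user has not rated those movies yet.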

We create our SVD model, which untangles the information for us to complete the recommender model.
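To give a feel for what an SVD-style recommender learns, here is a minimal sketch of matrix factorization trained with stochastic gradient descent. It is a simplified stand-in written for illustration, not the Surprise implementation; the data, the function name `train_mf`, and all hyperparameter values are assumptions.

```python
import math
import random


def train_mf(ratings, n_factors=2, n_epochs=200, lr=0.01, reg=0.02, seed=0):
    """Learn biases and latent factors so that mu + bu + bi + p.q approximates each rating."""
    rng = random.Random(seed)
    users = sorted({u for u, _, _ in ratings})
    movies = sorted({m for _, m, _ in ratings})
    mu = sum(r for _, _, r in ratings) / len(ratings)  # global mean rating
    bu = {u: 0.0 for u in users}
    bi = {m: 0.0 for m in movies}
    p = {u: [rng.gauss(0, 0.1) for _ in range(n_factors)] for u in users}
    q = {m: [rng.gauss(0, 0.1) for _ in range(n_factors)] for m in movies}

    for _ in range(n_epochs):
        for u, m, r in ratings:
            pred = mu + bu[u] + bi[m] + sum(a * b for a, b in zip(p[u], q[m]))
            err = r - pred
            # Gradient steps with L2 regularization.
            bu[u] += lr * (err - reg * bu[u])
            bi[m] += lr * (err - reg * bi[m])
            for f in range(n_factors):
                puf, qmf = p[u][f], q[m][f]
                p[u][f] += lr * (err * qmf - reg * puf)
                q[m][f] += lr * (err * puf - reg * qmf)

    def predict(u, m):
        """Estimate a rating, clipped to the 0.5-5 scale found during the data analysis."""
        est = mu + bu.get(u, 0.0) + bi.get(m, 0.0)
        if u in p and m in q:
            est += sum(a * b for a, b in zip(p[u], q[m]))
        return min(5.0, max(0.5, est))

    return predict


# Hypothetical ratings: u1 and u2 share one taste, u3 and u4 another.
ratings = [
    ("u1", "m1", 5.0), ("u1", "m2", 4.5), ("u1", "m3", 1.0),
    ("u2", "m1", 4.5), ("u2", "m2", 5.0), ("u2", "m3", 1.5),
    ("u3", "m1", 1.0), ("u3", "m3", 5.0), ("u3", "m4", 4.5),
    ("u4", "m3", 4.5), ("u4", "m4", 5.0), ("u4", "m2", 1.0),
]

predict = train_mf(ratings)


def rmse(predict_fn):
    """Root mean square error of `predict_fn` on the training ratings."""
    return math.sqrt(sum((r - predict_fn(u, m)) ** 2 for u, m, r in ratings) / len(ratings))


global_mean = sum(r for _, _, r in ratings) / len(ratings)
train_rmse = rmse(predict)
baseline_rmse = rmse(lambda u, m: global_mean)
print(f"baseline RMSE {baseline_rmse:.3f}, factorization RMSE {train_rmse:.3f}")
print("estimate for u1 on m4:", round(predict("u1", "m4"), 2))
```

The last line estimates a rating for a pair that never appears in the training data, which is what the recommender needs for the anti-test set.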

We then evaluate our model with two metrics, Root Mean Square Error (RMSE) and Mean Absolute Error (MAE). MAE is the average absolute difference between the predicted and the actual ratings, while RMSE is the square root of the average squared difference, so it penalizes large errors more heavily.
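Both metrics are easy to compute by hand. The sketch below uses made-up predicted and actual ratings purely to show the arithmetic.

```python
import math


def mae(predicted, actual):
    """Mean Absolute Error: average of |prediction - actual|."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def rmse(predicted, actual):
    """Root Mean Square Error: square root of the average squared difference."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))


# Hypothetical predictions and the ratings users actually gave.
predicted = [3.5, 4.0, 2.0, 5.0]
actual = [4.0, 4.0, 1.0, 3.0]

mae_value = mae(predicted, actual)
rmse_value = rmse(predicted, actual)
print(f"MAE = {mae_value:.3f}, RMSE = {rmse_value:.3f}")
```

Here the errors are 0.5, 0, 1 and 2, so MAE is 0.875 while RMSE is about 1.146: the single error of 2 pulls RMSE up more than it pulls MAE.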

Predicting

The prediction gives us a recommended movie ID for the user with ID 1.
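Turning estimated ratings into a recommendation amounts to picking the highest estimate per user. A minimal sketch, assuming the predictions are available as (user, movie, estimated rating) triples; the values and the helper name `top_n` are made up for illustration.

```python
from collections import defaultdict


def top_n(predictions, n=1):
    """Return the `n` movies with the highest estimated rating for each user."""
    by_user = defaultdict(list)
    for user, movie, estimate in predictions:
        by_user[user].append((estimate, movie))
    return {
        user: [movie for _, movie in sorted(scored, reverse=True)[:n]]
        for user, scored in by_user.items()
    }


# Hypothetical estimates for movies the users have not rated yet.
predictions = [
    (1, 30, 4.2),
    (1, 40, 3.1),
    (1, 50, 4.8),
    (2, 30, 2.5),
]

recommendations = top_n(predictions, n=1)
print(recommendations[1])
```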

This finishes our recommender system’s job.

Now, let's discuss something debatable.

Are recommender systems influencing our taste in movies and taking control away from us?

Photo by Juan Rumimpunu on Unsplash

My father, who has no connection to computer science, asked me this one fine morning. He was going through his favourite video streaming service and observed that he was seeing videos related to only a few areas. It made him feel that his choices were being influenced by the service and that he was unable to come across anything new.

I explained this to him using my own words and understanding:

He had been watching the same kind of videos over and over every day, which built a profile saying that he is interested only in that particular topic. That was the reason he was shown videos from that topic only.

But does that mean you have no control over it?

The answer is NO.

You still have control. If the engine recommends a topic you are not interested in, just let the engine know that you are not interested. Yes, you have that option. Expand your viewing horizons with diverse content. A recommender system is there just to help you, not to control you. In the end, it is up to the viewer whether to watch or not.

Let's share our views on this and spread some knowledge. Let's learn and grow as a community, because all we are left with is people, memories and knowledge.

Thank you.

Translated from: https://medium.com/swlh/recommending-a-movie-using-collaborative-filtering-6dab1b8f4472
