Relationship management is one of the determining factors of business health. A key part of it is the ability to identify when a customer is likely to cancel a service, so that the company can take initiatives that maximize customer retention.
Therefore, projects that identify customers prone to churn have become a frequent concern for organizations, as the cost of retention is usually lower than the cost of acquisition.
Although it has gained the attention of many companies, there is no magic formula to solve the churn problem. In addition, the solution can have numerous complexities, like identifying the churn reason to apply different retention strategies.
Challenges
Is the cost of acquiring new customers greater than the cost of retention?
It is essential to track the financial and strategic cost of both acquiring and retaining customers, since for some companies the cost of acquisition may be 5x higher than the cost of retention.
What type of churn will be treated?
It is important to highlight that churn for a product or service can occur in several ways, such as:
Voluntary: when the customer chooses to cancel the service due to dissatisfaction or preference for a competitor.
Silent: when a customer stops using the service for a long period without generating any cost, as with an unused credit card that has no monthly fee.
Involuntary: when the customer does not intend to cancel the service but, through negligence, ends up with a plan that is not renewed or is canceled for reasons such as irregular use or missed payments.
How much do your experts know about the problem?
A skilled team is essential for judging whether the project can be executed internally or needs outsourced help. Personalized solutions and well-prepared professionals help to overcome the challenges of the problem and to obtain rich, applicable results.
Do you have a database that allows you to extract information about the business and its customers?
A solid database makes project execution much more feasible and generates robust and reliable results. It is a fundamental step toward obtaining customer knowledge and, consequently, understanding how to map and develop your solution. This brings us to the next question:
How well do you know your clients?
It is also necessary to diagnose how your actions reflect on customers and, for that, you need to gather the information that defines their individual profile and behavior. This analysis is the key to identifying whether or not they are prone to churn.
Ways to solve
When it comes to solving the problem, there are a few more challenges to be overcome by the team of experts. The first is combining technical knowledge with business understanding, since exploratory analysis and feature engineering must take the organizational model into account to be successful.
After consolidating the features and incorporating business insights, it is time to start modeling. At this stage you may encounter imbalanced data: when you split the base into people who churned and people who stayed with the service, you may find a much higher proportion of loyal customers.
The biggest problem with imbalanced data is that, if it is not addressed, machine learning algorithms tend to perform well only on the majority class. This generates many false negatives, because the model is inclined to classify customers who are likely to leave as loyal.
Techniques to deal with imbalanced data
At this point, it is necessary to use techniques that address the imbalanced dataset and improve the model's ability to pick out churn behavior. Some of the most common are oversampling, undersampling, SMOTE and ADASYN. None of them is a one-size-fits-all fix, which is why each problem must be treated according to its specifics.
Undersampling and oversampling are the more elementary techniques: the former reduces the more heavily represented class, and the latter expands the less represented one.
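As a rough illustration of the two elementary techniques, the sketch below resamples a toy customer base using only the Python standard library. The function names and the customer ids are made up for this example, and the code assumes the majority class is at least as large as the minority class.

```python
import random

def random_undersample(majority, minority, seed=0):
    """Drop majority rows at random until both classes have the same size."""
    rng = random.Random(seed)
    return rng.sample(majority, len(minority)), list(minority)

def random_oversample(majority, minority, seed=0):
    """Duplicate minority rows at random (with replacement) to match the majority."""
    rng = random.Random(seed)
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    return list(majority), list(minority) + extra

# Toy data (hypothetical ids): 95 loyal customers and 5 churners.
loyal = [f"loyal_{i}" for i in range(95)]
churned = [f"churn_{i}" for i in range(5)]

under_loyal, under_churned = random_undersample(loyal, churned)
over_loyal, over_churned = random_oversample(loyal, churned)

print(len(under_loyal), len(under_churned))  # 5 5
print(len(over_loyal), len(over_churned))    # 95 95
```

In practice, resampling is usually applied to the training split only, so that the evaluation data keeps the real class proportions.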
SMOTE and ADASYN are more complex and generate synthetic samples of the minority class. The two strategies are similar, but ADASYN uses the density distribution to decide where to create the synthetic elements.
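The core idea behind SMOTE, interpolating between a minority sample and one of its nearest minority neighbours, can be sketched in a few lines. This is a simplified illustration, not the reference algorithm: the function name `smote_like`, the toy feature vectors and the default `k=3` are assumptions of this example, and ADASYN's density weighting is not shown. It needs at least two distinct minority points.

```python
import random

def smote_like(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between minority samples."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of `base` by squared Euclidean distance.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position on the segment base -> neighbour
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbour)))
    return synthetic

# Toy churner feature vectors (hypothetical: monthly usage, support tickets).
churners = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.5), (1.2, 2.2)]
new_points = smote_like(churners, n_new=6)
print(len(new_points))  # 6
```

For real projects, the imbalanced-learn library ships maintained `SMOTE` and `ADASYN` implementations, which are a safer choice than a hand-written version.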
Understand the performance of your churn solution
The churn model must be built around the expected responses, with attention to performance and to how the output should be presented. When measuring model performance, it is important to choose the right evaluation metric. Accuracy, for example, can give a false impression of a stunning model when the result comes only from correctly classifying the majority class, in which there is no churn.
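A small made-up example shows how accuracy can mislead on imbalanced data. The labels below are invented for illustration: 95 loyal customers, 5 churners, and a "model" that always predicts loyal.

```python
# 1 = churned, 0 = stayed. Toy labels: 95 loyal customers and 5 churners.
y_true = [0] * 95 + [1] * 5
# A degenerate model that labels every customer as loyal.
y_pred = [0] * 100

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
churners_found = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))

print(accuracy)        # 0.95
print(churners_found)  # 0
```

The model scores 95% accuracy while identifying no churner at all, which is exactly the situation the paragraph above warns about.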
Such evaluation can be centered on how much the solution improves your current retention strategy. If we assume that retention actions are currently applied to random clients, we can evaluate how much the sample indicated by the model improves the selection of clients prone to churn.
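One common way to quantify this improvement over random selection is lift: the churn rate among the customers the model ranks highest, divided by the overall churn rate. The article does not name a specific measure, so treat the sketch below as one possible reading; the function name, scores and labels are hypothetical, and the code assumes at least one churner in the data.

```python
def lift(y_true, scores, top_fraction):
    """Churn rate in the top-scored fraction divided by the overall churn rate."""
    n_top = max(1, int(len(scores) * top_fraction))
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    top_rate = sum(label for _, label in ranked[:n_top]) / n_top
    base_rate = sum(y_true) / len(y_true)  # what random targeting would achieve
    return top_rate / base_rate

# Toy data: 10 customers, 2 of whom churned (1 = churned).
y_true = [1, 0, 0, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.2, 0.1, 0.8, 0.3, 0.4, 0.05, 0.15, 0.6, 0.25]

top20_lift = lift(y_true, scores, top_fraction=0.2)
print(top20_lift)
```

In this toy case both customers in the top 20% are churners, so the targeted group has five times the churn rate that random targeting would reach.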
Traditional evaluation metrics, like precision and recall, can also be quite useful. The former is the number of correct churn indications over the total number of indications, while the latter is the percentage of churned clients correctly classified over the total number of churned clients. Another metric is the F1-score, which can be described as:
F1 = 2 * (precision * recall) / (precision + recall)
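The three metrics can be computed directly from the counts of true positives, false positives and false negatives. The sketch below uses invented labels, and returning 0.0 when a denominator is zero is a convention chosen for this example.

```python
def precision_recall_f1(y_true, y_pred):
    """Compute precision, recall and F1 for the churn class (label 1)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * (precision * recall) / (precision + recall)
    return precision, recall, f1

# Toy labels: 4 real churners; the model flags 4 customers, 3 of them correctly.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(precision, recall, f1)  # 0.75 0.75 0.75
```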
Understanding the results
In order to choose the metric to be used, it is crucial to understand the operational cost of retaining a customer relative to the expected future revenue from that customer (lifetime value, LTV).
Customers with a high LTV may justify a higher expense for retention, while customers with a low LTV may not justify the investment to retain them.
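One way to make this trade-off concrete is a back-of-the-envelope expected value per customer. The formula and every number below are assumptions made for illustration, not figures from the article: `save_rate` is a hypothetical probability that a retention offer actually keeps a customer who would otherwise leave.

```python
def expected_gain(churn_probability, ltv, offer_cost, save_rate=0.3):
    """Expected net value of sending a retention offer to one customer.

    Assumed model: the offer only pays off if the customer would have churned
    (churn_probability) and the offer succeeds (save_rate), in which case the
    customer's lifetime value is preserved.
    """
    return churn_probability * save_rate * ltv - offer_cost

# Two hypothetical customers with the same churn risk and the same offer cost.
high_ltv_gain = expected_gain(churn_probability=0.5, ltv=2000, offer_cost=50)
low_ltv_gain = expected_gain(churn_probability=0.5, ltv=200, offer_cost=50)

print(high_ltv_gain > 0)  # True: the offer is worth making
print(low_ltv_gain > 0)   # False: the offer costs more than it is expected to return
```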
Once the parameters for retaining a customer are known, the retention operation can be delimited, including how much tolerance there is for wrongly classified customers. This tolerance is directly related to the penalty for generating false positives, that is, loyal customers classified as churners.
If the cost of the retention operation is low, you can choose to flag more customers and thus capture the majority of real churners. However, this will result in more false positives. Likewise, if the cost is high, it is essential to focus on the precision of the selected group, in order to avoid unnecessary expenses.
In classification models, the default threshold for classifying a client as a churner is a predicted probability of leaving the service above 50%. This limit can be changed to suit the business: for example, if higher precision is required, we can flag as churn only clients with a probability above 70%.
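Moving the threshold is a one-line change once the model outputs probabilities. The sketch below uses invented probabilities and a hypothetical helper, `flag_churners`, to compare the default 50% cut-off with the stricter 70% one.

```python
def flag_churners(probabilities, threshold=0.5):
    """Flag a customer as a churner when the predicted probability exceeds the threshold."""
    return [p > threshold for p in probabilities]

# Hypothetical churn probabilities predicted for six customers.
probs = [0.95, 0.72, 0.65, 0.55, 0.40, 0.10]

default_flags = flag_churners(probs)                # default 50% threshold
strict_flags = flag_churners(probs, threshold=0.7)  # stricter 70% threshold

print(sum(default_flags))  # 4
print(sum(strict_flags))   # 2
```

Raising the threshold shrinks the flagged group to the customers the model is most confident about, which typically trades recall for precision.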
The model
The expected output can influence the strategy used to solve the problem. In addition to classification algorithms, which give binary responses, there are approaches based on survival and hybrid models. Survival analysis models do not classify customers as prone to churn or not; instead, the generated response is a curve that can be used to track each client's probability of churning over time.
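To give a feel for what a survival curve looks like, the sketch below implements the Kaplan-Meier estimator, a simple non-parametric method swapped in here for illustration. It produces one curve for the whole customer base rather than the per-client curves described above, and the tenure data is invented. Customers who have not churned yet are treated as censored observations.

```python
def kaplan_meier(durations, churned):
    """Return [(time, survival_probability)] at each time where a churn occurred."""
    event_times = sorted({t for t, e in zip(durations, churned) if e})
    survival, curve = 1.0, []
    for t in event_times:
        # Customers still active (not yet churned or censored) just before time t.
        at_risk = sum(d >= t for d in durations)
        events = sum(d == t and e for d, e in zip(durations, churned))
        survival *= 1 - events / at_risk
        curve.append((t, survival))
    return curve

# Toy tenures in months; churned=False means the customer is still active.
durations = [2, 3, 3, 5, 8, 8]
churned = [True, True, False, True, False, False]

curve = kaplan_meier(durations, churned)
for month, prob in curve:
    print(month, round(prob, 3))
```

Each point on the curve is the estimated share of customers still active after that many months.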
To handle survival analysis problems that involve complex, non-linear risk functions, models have been developed that extend binary classification and transform its results into survival analysis. Such models are known as hybrid models; some of them are RF-SRC, DeepSurv and WTTE-RNN.
Conclusion
In summary, churn modeling is vital for companies that want to retain customers and reduce costs. Its success depends on several aspects, ranging from knowledge of the customer base to the complexity and robustness of the model. In case of any doubts, feel free to contact me!
Translated from: https://towardsdatascience.com/unraveling-churn-and-its-challenges-a207276ff4a9