Fri, 11 Aug 2023, 09:00 AM – 06:00 PM IST
Dhruv Nigam
At Dream11, we have built a customer lifetime value (CLTV) model to predict each user's future lifetime value. There are two broad areas where a forward-looking estimate of customer value can help.
Personalization
User-level customer lifetime value predictions enable us to personalize our platform for each user, including personalized marketing campaigns, discounts, and recommendations. For example, users with a high predicted value might be offered higher discounts to improve retention, since we are likely to recover the cost of those discounts.
A proxy metric for long-term value
Experiments are central to our culture. However, we can only run each experiment for a finite time, and we are often interested in how short-term interventions affect long-term metrics. In extreme cases, certain features might improve short-term metrics while degrading long-term customer value. The change in predicted future lifetime value helps establish whether a treatment had a meaningful long-term impact on the target group versus the control group.
Technical challenges faced
In a fast-moving e-commerce setting where user behavior is dynamic and the business is inherently seasonal, getting unbiased long-term estimates at the user level is almost impossible. However, the most important use cases for CLTV estimates do not necessitate accurate predictions at the user level. Some, like personalization, can work reasonably well as long as we can predict the cross-sectional ranks of users. Other use cases, where we want to assess the impact of a feature on long-term value, rely only on cohort-level estimates. These can be much more accurate because averaging reduces variance, assuming the errors of the user-level estimates are i.i.d.
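The variance-reduction argument can be illustrated with a small simulation. This is only a sketch under stated assumptions: user-level prediction errors are modeled as i.i.d. zero-mean Gaussian noise, and the function name, cohort sizes, and noise scale are hypothetical choices for illustration, not values from the talk. Under that assumption, the spread of the cohort-mean error should shrink roughly in proportion to one over the square root of the cohort size.

```python
import numpy as np


def cohort_error_spread(n_users, n_trials=2000, user_noise_sd=1.0, seed=0):
    """Std. dev. of the cohort-mean prediction error across simulated cohorts.

    Assumes (for illustration only) i.i.d. zero-mean Gaussian user-level errors.
    """
    rng = np.random.default_rng(seed)
    # Each row is one simulated cohort; each entry is one user's prediction error.
    errors = rng.normal(0.0, user_noise_sd, size=(n_trials, n_users))
    # Average the errors within each cohort, then measure the spread across cohorts.
    return errors.mean(axis=1).std()


# Hypothetical cohort sizes: a single user, a small cohort, a larger cohort.
spread_by_size = {n: cohort_error_spread(n) for n in (1, 100, 2500)}
for n, sd in spread_by_size.items():
    print(f"cohort size {n:>5}: cohort-mean error sd ~ {sd:.3f}")
```

If the errors are correlated across users (for example, a shared seasonal bias), the i.i.d. assumption fails and the cohort-level error will not shrink this quickly.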
One of the biggest challenges we faced was the distribution of our target variable. We chose the future cumulative Contest Entry Amount (CEA, roughly equivalent to revenue) over 360 days as our target variable. CEA, being a monetary metric, is highly skewed. Its distribution exhibits two key features:
These distributional quirks are not unique to Dream11. Almost any e-commerce business with non-contractual (that is, non-subscription) customer engagement will display similar distributions.
Conventional evaluation metrics and loss functions fall short of capturing the nuances of this distribution, which we call a zero-inflated log-normal distribution. We address these challenges using custom loss functions like Tweedie loss and evaluation metrics like the normalized Gini index.
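To make the two ideas concrete, here is a minimal NumPy sketch of a mean Tweedie deviance (for a variance power between 1 and 2) and a normalized Gini index. It is not the implementation used at Dream11: the function names, the power of 1.5, and the toy data are all assumptions made for illustration. The Tweedie deviance stays finite when the target is exactly zero, and the normalized Gini depends only on how the predictions rank users, which matches the rank-based use cases described earlier.

```python
import numpy as np


def tweedie_deviance(y_true, y_pred, power=1.5):
    """Mean Tweedie deviance for 1 < power < 2 (needs y_true >= 0, y_pred > 0)."""
    y = np.asarray(y_true, dtype=float)
    mu = np.asarray(y_pred, dtype=float)
    p = power
    dev = 2.0 * (
        np.power(y, 2 - p) / ((1 - p) * (2 - p))
        - y * np.power(mu, 1 - p) / (1 - p)
        + np.power(mu, 2 - p) / (2 - p)
    )
    return dev.mean()


def gini(y_true, y_score):
    """Area between the cumulative-gain curve and the random-ordering diagonal."""
    y = np.asarray(y_true, dtype=float)
    # Sort users from highest to lowest score; a stable sort keeps ties in input order.
    order = np.argsort(-np.asarray(y_score, dtype=float), kind="stable")
    cum_share = np.cumsum(y[order]) / y.sum()
    n = len(y)
    return cum_share.sum() / n - (n + 1) / (2 * n)


def normalized_gini(y_true, y_pred):
    """Gini of the predicted ranking divided by the Gini of a perfect ranking."""
    return gini(y_true, y_pred) / gini(y_true, y_true)


# Toy, made-up data: many zeros and a long right tail.
y = np.array([0.0, 0.0, 0.0, 5.0, 40.0, 300.0])
well_ranked = np.array([0.5, 1.0, 2.0, 8.0, 30.0, 200.0])  # preserves the order of y
mis_ranked = np.array([3.0, 0.1, 9.0, 1.0, 50.0, 20.0])    # scrambles the order of y
constant = np.full(len(y), y.mean())                       # predicts the mean for all

print("Tweedie deviance, well ranked:", tweedie_deviance(y, well_ranked))
print("Tweedie deviance, constant   :", tweedie_deviance(y, constant))
print("Normalized Gini, well ranked :", normalized_gini(y, well_ranked))
print("Normalized Gini, mis-ranked  :", normalized_gini(y, mis_ranked))
```

Note that the Gini function here breaks ties by input order, so it is not meaningful for constant predictions; production code would typically handle ties explicitly.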
Dream11 currently has close to 80 million paying users. The scale of data to be processed for feature engineering, model training, and inference presents another challenge. We used Spark to scale training and inference to meet these requirements.
Adoption and usability
We demonstrate how, using this single model, we were able to: