The Fifth Elephant 2023 Monsoon

On AI, industrial applications of ML, and MLOps

Dhruv Nigam

@dhruvn

Predicting customer lifetime value in a non-contractual digital commerce setting

Submitted Jun 20, 2023

At Dream11, we have built a customer lifetime value (CLTV) model to predict each user's future lifetime value. There are two broad areas where a forward-looking estimate of customer value can help.

Personalization

Having user-level customer lifetime value predictions enables us to personalize our platform for each user, including personalized marketing campaigns, discounts, and recommendations. For example, users with a high predicted value might be offered higher discounts to improve retention, since we are likely to recover the cost of those discounts.

A proxy metric for long-term value

Experiments are central to our culture. However, we can only run each experiment for a finite time. Often we are interested in how short-term interventions affect long-term metrics. In extreme cases, certain features might improve short-term metrics while degrading long-term customer value. The change in predicted future lifetime value helps establish whether a treatment had a meaningful long-term impact on the treatment group versus the control group.
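As a rough illustration of this idea, the sketch below compares mean predicted lifetime value between a treatment and a control group. The function name, the Welch-style standard error, and the sample numbers are all assumptions made for illustration; the proposal does not describe how the comparison is actually computed.

```python
import math
import statistics


def predicted_ltv_lift(treatment_preds, control_preds):
    """Return (difference in mean predicted LTV, its standard error, z-score).

    Hypothetical helper: uses an unequal-variance (Welch-style) standard error.
    """
    diff = statistics.fmean(treatment_preds) - statistics.fmean(control_preds)
    se = math.sqrt(
        statistics.variance(treatment_preds) / len(treatment_preds)
        + statistics.variance(control_preds) / len(control_preds)
    )
    return diff, se, diff / se


# Made-up predicted 360-day values for a handful of users per group.
treatment = [120.0, 0.0, 45.0, 300.0, 80.0, 15.0]
control = [100.0, 0.0, 40.0, 250.0, 60.0, 10.0]

diff, se, z = predicted_ltv_lift(treatment, control)
print(f"lift={diff:.2f} se={se:.2f} z={z:.2f}")
```

With heavy-tailed monetary predictions, a normal approximation like this is only reasonable for large groups; a bootstrap would be a natural alternative.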

Technical challenges faced

In a fast-moving e-commerce setting where user behavior is dynamic and the business is inherently seasonal, getting unbiased long-term estimates at the user level is almost impossible. However, the most important use cases for CLTV estimates do not require accurate predictions at the user level. Some, like personalization, work reasonably well as long as we can predict the cross-sectional ranks of users. Others, such as assessing the impact of a feature on long-term value, rely only on cohort-level estimates, which can be much more accurate: assuming i.i.d. errors in the user-level estimates, averaging over a cohort reduces the variance of the error.
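The variance-reduction argument can be checked with a small simulation. This is a minimal sketch, assuming zero-mean Gaussian user-level errors with a made-up standard deviation of 100; under the i.i.d. assumption the error of a cohort mean should shrink roughly as 1/sqrt(n). The function and parameter names are hypothetical.

```python
import random
import statistics


def cohort_mean_error_sd(cohort_size, user_error_sd=100.0, n_trials=1000, seed=0):
    """Empirical SD of the cohort-mean prediction error under i.i.d. user errors."""
    rng = random.Random(seed)
    cohort_errors = []
    for _ in range(n_trials):
        # Each user's prediction error is drawn independently (the i.i.d. assumption).
        errors = [rng.gauss(0.0, user_error_sd) for _ in range(cohort_size)]
        cohort_errors.append(statistics.fmean(errors))
    return statistics.pstdev(cohort_errors)


for n in (1, 100, 400):
    simulated = cohort_mean_error_sd(n)
    theoretical = 100.0 / n ** 0.5  # sigma / sqrt(n)
    print(f"cohort size {n}: simulated {simulated:.2f}, theoretical {theoretical:.2f}")
```

If user-level errors are correlated (for example, all users are over-predicted during an off-season), the reduction will be smaller than this idealized case suggests.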

One of the biggest challenges we faced was the distribution of our target variable. We chose the future cumulative Contest Entry Amount (CEA, roughly equivalent to revenue) over 360 days as our target variable. CEA, being a monetary metric, is highly skewed. Its distribution exhibits three key features:

  • A fat tail, reflecting a few users who contribute abnormally high CEA.
  • A concentrated probability mass at 0, representing users who have churned out of the system and contribute exactly zero CEA.
  • Seasonality. Being in the fantasy sports space, user activity is heavily clustered around major sporting events.
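To make the first two features concrete, the sketch below draws a synthetic target from a zero-inflated log-normal distribution. The churn probability and log-normal parameters are invented for illustration and are not Dream11 figures; seasonality is not modeled here.

```python
import random


def sample_cea(rng, p_churn=0.6, mu=5.0, sigma=1.5):
    """Draw one synthetic 360-day CEA value (all parameters are made up)."""
    if rng.random() < p_churn:
        return 0.0  # churned user: contributes exactly zero
    return rng.lognormvariate(mu, sigma)  # active user: heavy right tail


rng = random.Random(42)
values = sorted(sample_cea(rng) for _ in range(100_000))

zero_share = sum(v == 0.0 for v in values) / len(values)
top_1pct_share = sum(values[-1000:]) / sum(values)  # top 1% of 100,000 users

print(f"share of users with zero CEA: {zero_share:.3f}")
print(f"share of total CEA from top 1% of users: {top_1pct_share:.3f}")
```

Even in this toy setup the median user contributes nothing while a small group of users accounts for a large share of the total, which is why a mean-squared-error fit tends to be dominated by a handful of observations.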

These distributional quirks are not unique to Dream11. Almost any e-commerce business with non-contractual (that is, non-subscription) customer engagement will display similar distributions.

Conventional evaluation metrics and loss functions fall short of capturing the nuances of this distribution, which we call a zero-inflated log-normal distribution. We address these challenges using custom loss functions like Tweedie loss and evaluation metrics like the normalized Gini index.
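The proposal names these two tools without giving formulas. The sketch below uses the standard Tweedie unit deviance for a variance power between 1 and 2 and a commonly used definition of the normalized Gini index (Gini of the prediction-ordered actuals divided by the Gini of a perfect ordering). The function names, the power of 1.5, and the sample data are assumptions, not the authors' implementation.

```python
def mean_tweedie_deviance(actual, predicted, power=1.5):
    """Mean Tweedie deviance for 1 < power < 2; predictions must be positive."""
    if not 1.0 < power < 2.0:
        raise ValueError("this sketch only covers 1 < power < 2")
    total = 0.0
    for y, mu in zip(actual, predicted):
        if y < 0 or mu <= 0:
            raise ValueError("need actual >= 0 and predicted > 0")
        total += 2.0 * (
            y ** (2.0 - power) / ((1.0 - power) * (2.0 - power))
            - y * mu ** (1.0 - power) / (1.0 - power)
            + mu ** (2.0 - power) / (2.0 - power)
        )
    return total / len(actual)


def gini(actual, scores):
    """Gini of `actual` when users are ordered by `scores`, highest first."""
    n = len(actual)
    total = sum(actual)
    if total <= 0:
        raise ValueError("sum of actual values must be positive")
    # Ties in score are broken by original position to keep the order deterministic.
    order = sorted(range(n), key=lambda i: (-scores[i], i))
    cumulative = 0.0
    gini_sum = 0.0
    for i in order:
        cumulative += actual[i]
        gini_sum += cumulative / total
    gini_sum -= (n + 1) / 2.0
    return gini_sum / n


def normalized_gini(actual, predicted):
    """1.0 means the predictions rank users exactly as the actual values do."""
    return gini(actual, predicted) / gini(actual, actual)


# Made-up data: three churned users and a heavy tail.
actual = [0.0, 0.0, 0.0, 50.0, 400.0, 5000.0]
good = [1.0, 2.0, 5.0, 60.0, 300.0, 2000.0]   # right ranking, wrong scale
bad = [300.0, 60.0, 2000.0, 5.0, 2.0, 1.0]    # mostly inverted ranking

for name, preds in (("good", good), ("bad", bad)):
    print(name, normalized_gini(actual, preds), mean_tweedie_deviance(actual, preds))
```

Normalized Gini depends only on how predictions rank users, which suits the ranking-based use cases, while the Tweedie deviance accepts exact zeros in the target and penalizes errors relative to the scale of the prediction.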

Dream11 currently has close to 80 million paying users. The scale of data to be processed for feature engineering, model training, and inference presents another challenge. We used Spark to scale training and inference to meet these requirements.

Adoption and usability

We demonstrate how, using this single model, we were able to:

  • personalize treatments, specifically marketing and promotions, driving higher retention and ROI
  • establish the quality of marketing channels (Google, Facebook, etc.) based on the lifetime value of users acquired through them, enabling quick feedback for optimizing acquisition spending
  • quantify the long-term impact of new product features beyond the experimentation phase
