There are three phases in the lifecycle of an AI application: research, application, and the aftermath of the application.
- Assess capabilities, determining the new frontiers for AI.
- Find a use for the application.
- Learn how to run it, monitor it and update it with time.
The three tracks at the 2023 Monsoon edition of The Fifth Elephant will cover this lifecycle.
The 2023 Monsoon edition is curated by:
- Nischal HP, Vice President of Data Engineering and Data Science at Scoutbee. Nischal curated the MLOps conference which was held online between 23 and 27 July 2021.
- Sumod Mohan, Founder and CEO at AutoInfer. Sumod curated the Anthill Inside 2019 edition, held in Bangalore on 23 November.
- AI and Research - covers research, findings, and solutions for challenges on building models in various areas such as fraud detection, forecasting, and analytics. This track delves into the latest methodologies for handling challenges such as large-scale data processing, distributed computing, and optimizing model performance.
- Industrial applications of ML - covers implementation of AI in the industry, with more focus on the AI models, the issues in training, gathering data, and so forth. ML is being used at scale in industries such as automotive, mechanical, manufacturing, and agriculture. This track focuses on the challenges in this space, as we see innovation coming out of these industries in the pursuit of using ML on a second-to-second basis.
- AI and Product - covers strategies for building AI products to scale and mitigating challenges. This track provides insights on incorporating AI tools and forecasting techniques to improve model training, developing a working model architecture, and using data in the business context.
The Fifth Elephant 2023 Monsoon edition will be held in person. Attendance is open to The Fifth Elephant members only. Purchase a membership to attend the conference in person. If you have questions about participation, post a comment here.
- Data/MLOps engineers who want to learn about state-of-the-art tools and techniques, especially from domains such as automobile, agri-tech and mechanical industries.
- Data scientists who want a deeper understanding of model deployment/governance.
- Architects who are building ML workflows that scale.
- Tech founders who are building products that require AI or ML.
- Product managers who want to learn about the process of building AI/ML products.
- Directors, VPs and senior tech leadership who are building AI/ML teams.
Sponsorship slots are open for:
- Infrastructure (GPU, CPU and cloud providers) and developer productivity tool makers who want to evangelise their offering to developers and decision-makers.
- Companies seeking tech branding among AI and ML developers.
- Venture Capital (VC) firms and investors who want to scan the landscape of innovations and innovators in AI and who want to source leads for investment in the AI and ML space.
Predicting customer lifetime value in a non-contractual digital commerce setting
At Dream11, we have built a customer lifetime value (CLTV) model to predict each user's future lifetime value. There are two broad areas where having a future-looking estimate of customer value can help.
Having user-level CLTV predictions lets us personalize our platform for each user, including personalized marketing campaigns, discounts, and recommendations. For example, users with a high predicted value might be offered higher discounts to improve retention, since we are likely to recover the cost of those discounts.
A proxy metric for long-term value
Experiments are central to our culture. However, we can only run each experiment for a finite time. Often we are interested in how short-term interventions affect long-term metrics. In extreme cases, certain features might improve short-term metrics while degrading long-term customer value. The change in predicted future lifetime value helps establish whether a treatment had a meaningful long-term impact on the target group versus the control group.
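A minimal sketch of such a comparison, assuming we already have a predicted CLTV for each user in the treatment and control groups. The numbers and the function name `predicted_cltv_lift` are hypothetical and only illustrate the idea; they are not Dream11's implementation.

```python
import math
import statistics


def predicted_cltv_lift(treatment, control):
    """Return (difference in mean predicted CLTV, standard error of that difference)."""
    diff = statistics.fmean(treatment) - statistics.fmean(control)
    # Standard error for the difference of two independent sample means.
    se = math.sqrt(
        statistics.variance(treatment) / len(treatment)
        + statistics.variance(control) / len(control)
    )
    return diff, se


# Hypothetical predicted CLTV values for the users in each group.
treatment = [120.0, 0.0, 45.0, 300.0, 80.0, 10.0]
control = [100.0, 0.0, 40.0, 250.0, 60.0, 5.0]

diff, se = predicted_cltv_lift(treatment, control)
print(f"lift in mean predicted CLTV = {diff:.2f} (standard error {se:.2f})")
```

A lift that is small relative to its standard error would suggest the treatment's long-term effect cannot be distinguished from noise at this sample size.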
Technical challenges faced
In a fast-moving e-commerce setting where user behavior is dynamic and the business is inherently seasonal, getting unbiased long-term estimates at the user level is almost impossible. However, the most important use cases for CLTV estimates do not require accurate user-level predictions. Some, like personalization, work reasonably well as long as we can predict the cross-sectional ranks of users. Others, such as assessing the impact of a feature on long-term value, rely only on cohort-level estimates, which can be much more accurate: assuming i.i.d. errors in the user-level estimates, averaging over a cohort reduces the variance.
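The variance-reduction argument can be illustrated with a small simulation. This is a sketch under stated assumptions: user-level prediction errors are drawn as i.i.d. zero-mean Gaussians with a hypothetical standard deviation of 100, and the cohort size of 400 is arbitrary. Under these assumptions the spread of the cohort-mean error shrinks roughly as 1/sqrt(n).

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible


def cohort_mean_error(n_users, sigma=100.0):
    """Average of n_users i.i.d. zero-mean user-level prediction errors."""
    errors = [random.gauss(0.0, sigma) for _ in range(n_users)]
    return statistics.fmean(errors)


def error_spread(n_users, trials=1000):
    """Standard deviation of the cohort-mean error across simulated cohorts."""
    means = [cohort_mean_error(n_users) for _ in range(trials)]
    return statistics.pstdev(means)


user_level = error_spread(1)      # error of a single user's estimate
cohort_level = error_spread(400)  # error of a 400-user cohort average

print(f"user-level error spread:   {user_level:.1f}")
print(f"cohort-level error spread: {cohort_level:.1f}")
```

If the errors are correlated across users (for example, a shared seasonal bias), the i.i.d. assumption breaks and the reduction would be smaller than this sketch suggests.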
One of the biggest challenges we faced was the distribution of our target variable. We chose the future cumulative Contest Entry Amount (CEA, roughly equivalent to revenue) over 360 days as our target variable. CEA, being a monetary metric, is highly skewed. Its distribution exhibits three key features:
- A fat tail, indicating a few users who contribute abnormally high CEA.
- A concentrated probability mass at 0, representing users who have churned out of the system and contribute exactly zero CEA.
- Seasonality. Being in the fantasy sports space, user activity is heavily clustered around major sporting events.
These distributional quirks are not unique to Dream11. Almost any e-commerce business with non-contractual (non-subscription) customer engagement will display similar distributions.
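To make the shape of such a target concrete, here is a small simulation of a zero-inflated log-normal variable. All parameters (`p_churn`, `mu`, `sigma`) are hypothetical and chosen only for illustration, and the sketch covers the point mass at zero and the fat tail but not seasonality.

```python
import random

random.seed(7)  # fixed seed so the sketch is reproducible


def sample_cea(n_users, p_churn=0.4, mu=3.0, sigma=1.5):
    """Draw a synthetic target: exactly zero for churned users, log-normal otherwise."""
    return [
        0.0 if random.random() < p_churn else random.lognormvariate(mu, sigma)
        for _ in range(n_users)
    ]


cea = sample_cea(10_000)

# Share of users contributing exactly zero (the point mass at 0).
zero_share = sum(1 for x in cea if x == 0.0) / len(cea)

# Share of the total contributed by the top 1% of users (the fat tail).
top_1pct = sorted(cea, reverse=True)[: len(cea) // 100]
top_share = sum(top_1pct) / sum(cea)

print(f"share of users at exactly zero: {zero_share:.2%}")
print(f"share of total from top 1% of users: {top_share:.2%}")
```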
Conventional evaluation metrics and loss functions fall short of capturing the nuances of this distribution, which we call a zero-inflated log-normal distribution. We address these challenges using custom loss functions like Tweedie loss and evaluation metrics like the normalized Gini index.
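As a rough illustration of these two tools, the sketch below implements a commonly used form of the normalized Gini index (a rank-based metric that equals 1 for a perfect ordering) and the mean Tweedie deviance for a power parameter between 1 and 2. The power of 1.5 and the sample values are assumptions made for the example; the source does not state which settings were used.

```python
def _gini(actual, predicted):
    """Gini coefficient of `actual` when users are ordered by `predicted`, descending."""
    order = sorted(range(len(actual)), key=lambda i: (-predicted[i], i))
    total = sum(actual)
    running, area = 0.0, 0.0
    for i in order:
        running += actual[i]
        area += running / total  # cumulative share of actual value captured so far
    n = len(actual)
    return (area - (n + 1) / 2.0) / n


def normalized_gini(actual, predicted):
    """Gini of the model's ordering divided by the Gini of the perfect ordering."""
    return _gini(actual, predicted) / _gini(actual, actual)


def mean_tweedie_deviance(actual, predicted, power=1.5):
    """Mean Tweedie deviance for 1 < power < 2; predictions must be strictly positive."""
    p = power
    total = 0.0
    for y, mu in zip(actual, predicted):
        total += 2.0 * (
            y ** (2 - p) / ((1 - p) * (2 - p))
            - y * mu ** (1 - p) / (1 - p)
            + mu ** (2 - p) / (2 - p)
        )
    return total / len(actual)


# Hypothetical targets (note the exact zeros) and two sets of predictions.
actual = [0.0, 0.0, 10.0, 100.0]
good = [1.0, 2.0, 8.0, 90.0]   # ranks users correctly
bad = [90.0, 8.0, 2.0, 1.0]    # ranks users in reverse

print("normalized Gini (good):", normalized_gini(actual, good))
print("normalized Gini (bad): ", normalized_gini(actual, bad))
print("Tweedie deviance (good):", mean_tweedie_deviance(actual, good))
print("Tweedie deviance (bad): ", mean_tweedie_deviance(actual, bad))
```

The normalized Gini index depends only on how users are ranked, which matches the personalization use case, while the Tweedie deviance accepts exact zeros in the target and so suits a distribution with a point mass at zero.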
Dream11 currently has close to 80 million paying users. The scale of data to be processed for feature engineering, model training, and inference presents another challenge. We used Spark to scale training and inference to meet these requirements.
Adoption and usability
We demonstrate how, using this single model, we were able to:
- personalize treatments, specifically marketing and promotions, driving higher retention and ROI.
- establish the quality of marketing channels (Google, Facebook, etc.) based on the lifetime value of users acquired through them, enabling quick feedback for optimizing acquisition spending.
- quantify the long-term impact of new product features beyond the experimentation phase.