Anthill Inside 2019

On infrastructure for AI and ML: from managing training data to data storage, cloud strategy and costs of developing ML models

Feature selection and engineering using genetic algorithms and genetic programming

Submitted by SIDHARTH KUMAR (@sidkumar) on Tuesday, 30 April 2019

Section: Full talk
Technical level: Advanced
Session type: Lecture

Abstract

While feature selection is almost a solved problem in data science, feature engineering is still quite a mystery. In this talk I will outline a method that I use for feature engineering, with the goal of providing a generalized framework that tackles feature engineering and feature selection simultaneously.

Outline

The first few slides will cover the application of genetic algorithms (GA) to feature selection. The next couple of slides will cover advancements made to GAs through the use of a multi-dimensional covariance map, a method that I developed. The final slides will cover genetic programming (GP) and how the multi-dimensional covariance map can be used to improve the convergence of GPs.
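For readers new to the first topic in the outline, the following is a minimal sketch of a plain genetic algorithm for feature selection. It is not the speaker's method and does not include the multi-dimensional covariance map, which the proposal does not describe. The names `ga_select`, `toy_fitness` and `INFORMATIVE` are hypothetical, and the toy score stands in for what would normally be a cross-validated model score on the selected columns.

```python
import random

def ga_select(n_features, fitness, pop_size=30, generations=60, seed=0):
    """Evolve 0/1 masks over features; return the fittest mask found."""
    rng = random.Random(seed)
    # Each individual is a bitmask: 1 keeps the feature, 0 drops it.
    pop = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(pop_size)]
    for _ in range(generations):
        # Elitism: the current best mask survives unchanged.
        next_pop = [max(pop, key=fitness)[:]]
        while len(next_pop) < pop_size:
            # Tournament selection: best of 3 random individuals, twice.
            p1 = max(rng.sample(pop, 3), key=fitness)
            p2 = max(rng.sample(pop, 3), key=fitness)
            # Uniform crossover: each bit comes from either parent.
            child = [rng.choice(pair) for pair in zip(p1, p2)]
            # Bit-flip mutation with probability 1/n_features per bit.
            child = [bit ^ 1 if rng.random() < 1.0 / n_features else bit
                     for bit in child]
            next_pop.append(child)
        pop = next_pop
    return max(pop, key=fitness)

# Assumption for illustration only: features 0 and 2 carry signal.
INFORMATIVE = {0, 2}

def toy_fitness(mask):
    """Reward informative features, penalize each noise feature kept."""
    hits = sum(1 for i, bit in enumerate(mask) if bit and i in INFORMATIVE)
    noise = sum(1 for i, bit in enumerate(mask) if bit and i not in INFORMATIVE)
    return hits - 0.5 * noise

best = ga_select(8, toy_fitness)
print(best)
```

In practice the fitness function is the expensive part, since each evaluation typically means training and scoring a model on the selected subset.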

Requirements

A good understanding of machine learning fundamentals

Speaker bio

I’m currently a principal data scientist at Intuit.

A public but slightly dated bio is available here: https://www.analyticsvidhya.com/datahack-summit-2018/speakers/sidharth-kumar/

An informal writeup on me is available here: http://humansofanalytics.com/stories/sidharth-kumar-data-science-savant-machine-learning-aficionado-and-ardent-chess-player/

Comments

  • Abhishek Balaji (@booleanbalaji) Reviewer 3 months ago

    Hi Sidharth,

    Thank you for submitting a proposal. To evaluate your talk, we need detailed slides and a preview video. Your slides must take the following points into consideration:

    • Problem statement/context, which the audience can relate to and understand. The problem statement has to be a problem (based on this context) that can be generalized for all.
    • What were the tools/options available in the market to solve this problem? How did you evaluate these, and what metrics did you use for the evaluation? Why did you decide to build your own ML model?
    • Why did you pick the option that you did?
    • Explain what the situation was before the solution you picked/built, and how the fraud/ghosting changed after you implemented it. Show before-after scenario comparisons & metrics.
    • What compromises/trade-offs did you have to make in this process?
    • What are the privacy, regulatory and ethical considerations when building this solution?
    • What is the one takeaway that you want participants to go back with at the end of this talk? What is it that participants should learn/be cautious about when solving similar problems?

    As next steps, we’d need to see the detailed and/or updated slides by 21 May, in order to close the decision on your proposal. If we don’t receive an update by 21 May, we’d have to move the proposal for consideration for a future conference.

  • Sidharth Kumar 3 months ago

    Hi Abhishek,
    It’s unlikely that I’ll be able to submit my slides before 31st May, and therefore, I withdraw my proposal.
    Best,
    Sid

    • Abhishek Balaji (@booleanbalaji) Reviewer 2 months ago

      Noted. I’d highly encourage you to find some time to submit the slides when you can. We’ll move the proposal for consideration under a future event.
