Making Data Science Work session 1

Setting up Machine Learning (ML) Projects for Success

Comments

  • Zainab Bawa (@zainabbawa) Crew a month ago (edited a month ago)

    I have known Goda to be an operations research engineer and many people have told me how much experience she has with this. How is this role positioned in the current org structure of data science?

  • Harshit Gupta (@harshit2115) a month ago

    Hey,
    Thanks for this amazing meetup. I’ve a question.
    I am an intern at a credit lending company, and we are working on new fraud and underwriting models, as COVID-19 has heavily shifted the distribution of the data that our previous models were trained on. So are all pre-COVID models junk now? How can we create new models that account for COVID-19 factors?

    • Zainab Bawa (@zainabbawa) Crew a month ago

      Shared your question.

      • Harshit Gupta (@harshit2115) a month ago

        Thanks a lot.

  • Mani Sarkar (@neomatrix369) a month ago (edited a month ago)

    Some notes and comments taken down during the talk:

    Really refreshing to hear the comments from Srujana about care and rigour. If only every intellectual individual in our field would do this.

    So few care about it, let alone support someone who wishes to take that journey.
    Very good questions as starting points: why, what, who, when and how! The 5 whys are so important to answer.

    It's a methodology that we don’t take into consideration:
    S - Specific, M - Measurable, A - Achievable, R - Relevant, T - Time-bound
    https://en.wikipedia.org/wiki/SMART_criteria

    Very good point: translating business problems into technical solutions that humans can use to solve the problem and track it.

    Would like to share this with everyone, since there have been comments on data and data quality: Talk Slides: Do we know our data, Talk Video: Do we know our data.

    Logging and instrumentation are very important: measure and log the metrics where it's necessary. Overlogging isn’t an issue, but underlogging can be less useful.

    Very good point made about quick iterations.

    Weights and Biases (wandb.com) is a great tool to measure at every point in your pipeline, even if it's not model creation. Any kind of measurement can be logged.
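
    To make the "measure at every point in your pipeline" idea concrete, here is a minimal sketch using only the Python standard library. It is not the Weights and Biases API: the names `log_metric`, `timed_stage` and `clean`, and the in-memory `metrics` list, are hypothetical and used only for illustration. A real setup would send these records to a tracking tool or a log file.

```python
import time
from contextlib import contextmanager

# Hypothetical in-memory metric store (an assumption for this sketch);
# a real pipeline would forward these records to a tracking tool.
metrics = []

def log_metric(stage, name, value):
    """Record one measurement for a named pipeline stage."""
    metrics.append({"stage": stage, "name": name, "value": value})

@contextmanager
def timed_stage(stage):
    """Log how long a pipeline stage takes, even if it raises."""
    start = time.perf_counter()
    try:
        yield
    finally:
        log_metric(stage, "seconds", time.perf_counter() - start)

def clean(rows):
    """Toy cleaning step: drop missing values and log the row counts."""
    with timed_stage("clean"):
        kept = [r for r in rows if r is not None]
        log_metric("clean", "rows_in", len(rows))
        log_metric("clean", "rows_dropped", len(rows) - len(kept))
    return kept

cleaned = clean([1.0, None, 3.0, None, 5.0])
```

    The point is that a non-model step such as data cleaning gets the same treatment as training: counts and timings are recorded every run, so a sudden jump in dropped rows shows up in the logs before it shows up in the model.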

    Having alert systems in place is a good thing to do. I think many systems do it. ScribbleData allows that, and I think with W&B in place you can also do that. (W&B = Weights and Biases = wandb.com)

    Clean and manageable code behind the core components and the pipeline itself should always be kept in mind, because we spend most of our time trying to read, understand and change something, either when fixing a bug or when adding new functionality.
