Harnessing Implementation Patterns in Data Science
Submitted by Jaskaran Singh on Thursday, 23 August 2018
Transforming data science and big data implementations into generic, reusable blueprints for generating data pipelines, saving developers cost and time, accompanied by a generic CI/CD (continuous integration and continuous deployment) pipeline for deploying them to any cloud in minutes.
Data science and big data work includes patterns for processing and analysing data that are specific to sectors such as retail, finance and e-commerce. The focus is on generating generic, reusable blueprints from these patterns, which developers can harness to build complex pipelines while saving cost and time. These patterns include ingestion from various sources, feature engineering, and generic data science models for regression and classification use cases. There are a number of use cases, such as cloud migration, ingestion, propensity modelling and sentiment analysis, where the implementation is the same but the inputs or factors differ. I know the limitations and bottlenecks developers face while developing these use cases. But no one has yet tried to transform these use cases into blueprints or templates that other developers can use. These blueprints would directly affect business owners, since they would save cost and, above all, time in implementing these pipelines.
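As a rough illustration of the idea that the implementation stays the same while the inputs differ, a blueprint can be sketched as a template with a fixed stage order (ingestion, feature engineering, model) and pluggable stages. This is only a minimal sketch in Python, not the actual blueprints from the talk: `PipelineBlueprint`, `retail_orders`, `add_value_per_visit`, `propensity_score` and the sample records are all hypothetical names and data chosen for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List

Record = Dict[str, float]


@dataclass
class PipelineBlueprint:
    """Hypothetical reusable template: the stage order is fixed, the stages are pluggable."""
    ingest: Callable[[], Iterable[Record]]        # where the data comes from
    model: Callable[[Record], float]              # scoring step (regression, classification, ...)
    features: List[Callable[[Record], Record]] = field(default_factory=list)

    def run(self) -> List[float]:
        scores = []
        for record in self.ingest():
            # Apply every feature engineering step in order.
            for transform in self.features:
                record = transform(record)
            scores.append(self.model(record))
        return scores


# Hypothetical retail use case: only the inputs and factors are specific to it.
def retail_orders() -> List[Record]:
    return [
        {"basket_value": 120.0, "visits": 4.0},
        {"basket_value": 20.0, "visits": 1.0},
    ]


def add_value_per_visit(record: Record) -> Record:
    return {**record, "value_per_visit": record["basket_value"] / record["visits"]}


def propensity_score(record: Record) -> float:
    # Toy rule standing in for a trained propensity model.
    return 1.0 if record["value_per_visit"] >= 30.0 else 0.0


retail_pipeline = PipelineBlueprint(
    ingest=retail_orders,
    model=propensity_score,
    features=[add_value_per_visit],
)
scores = retail_pipeline.run()
```

A finance or e-commerce use case would reuse `PipelineBlueprint` unchanged and supply its own ingestion, feature and model functions.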
There are no software requirements as such, except Java and Maven, because the talk involves blueprints that can be written in any framework or language.
I have been developing use cases for the retail and e-commerce sectors for quite a while now.
I have a number of open source contributions to my name, and many of the plugins and APIs I have written are available on Maven Central. I have worked on cloud, DevOps, data science and analytics, and have knowledge of all of these domains.
I have spoken on this topic at a few conferences before, where my work and research were well appreciated. Examples include PDC (Pune Data Conference).