How often have you struggled to understand someone else’s code, or even your own? The cause is usually poor or missing documentation and a failure to follow best practices. Here I will demonstrate some data science best practices for R and Python, two of the most widely used programming languages for data science, which help in building sustainable data products:
- Integrated Development Environment (RStudio, PyCharm)
- Coding best practices (Google’s R Style Guide and Hadley’s Style Guide, PEP 8)
- Linter (lintr, Pylint)
- Documentation – Code (Roxygen2, reStructuredText), README/Instruction Manual (RMarkdown, Jupyter Notebook)
- Unit testing (testthat, unittest)
- Packaging
- Version control (Git)
In the long term, these best practices significantly reduce technical debt, foster collaboration, and promote the building of more sustainable data products in any organization.
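To make a few of the Python practices above concrete, here is a minimal sketch that is not taken from the talk itself. It combines PEP 8 naming and layout, a reStructuredText docstring, and a `unittest` test case. The function `normalize_counts` and its behaviour are hypothetical, chosen only for illustration.

```python
import unittest


def normalize_counts(counts):
    """Scale raw counts so that they sum to one.

    Hypothetical example function, used only to illustrate the practices.

    :param counts: non-negative numeric values, e.g. read counts per gene
    :type counts: list[float]
    :returns: the proportions, in the same order as ``counts``
    :rtype: list[float]
    :raises ValueError: if ``counts`` is empty or sums to zero
    """
    total = sum(counts)
    if not counts or total == 0:
        raise ValueError("counts must be non-empty and sum to a non-zero value")
    return [value / total for value in counts]


class NormalizeCountsTest(unittest.TestCase):
    """Unit tests for normalize_counts."""

    def test_proportions_sum_to_one(self):
        result = normalize_counts([2, 3, 5])
        self.assertAlmostEqual(sum(result), 1.0)

    def test_empty_input_is_rejected(self):
        with self.assertRaises(ValueError):
            normalize_counts([])
```

Saved in a file, the tests can be run with `python -m unittest`, and the same file can be checked with Pylint. An R equivalent would typically pair Roxygen2 comments with testthat, following the same idea.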
- Why Data Science Best Practices?
- Why R & Python
- Data Science Best Practices
  - Integrated Development Environment (RStudio, PyCharm)
  - Coding best practices (Google’s R Style Guide and Hadley’s Style Guide, PEP 8)
  - Linter (lintr, Pylint)
  - Documentation – Code (Roxygen2, reStructuredText), README/Instruction Manual (RMarkdown, Jupyter Notebook)
  - Unit testing (testthat, unittest)
  - Packaging
  - Version control (Git)
- Conclusion
I hold a BTech in Electrical Engineering from IIT Kanpur, Executive General Management from IIM Bangalore, and a PhD in Bioinformatics / Computational Biology from UCD, Ireland. After my PhD, I worked as the Head of Software Development at Genome Life Sciences, Chennai. I have more than 10 years of experience in Genomic Data Science at international research organizations including UMH, Alicante (Spain), IGIB, Delhi, and Monsanto, Bangalore. I currently lead the Data Analytics platform at Monsanto (a subsidiary of Bayer), Bangalore.