Introduction to R for Data Science [Workshop]
Submitted Apr 15, 2019
R is one of the most popular programming languages in data science. Known for its simplicity and an environment that is easy to get started with, R has been the language of choice for many non-programmers, and its rich ecosystem lets it handle a variety of data science tasks. The objective of this workshop is to help you get started with R so that you can move forward in your data science journey. As we move into a world of language-agnostic developers, even if you already know one language, knowing another like R adds an extra feather to your cap.
Outline
Workshop Outline
- Introduction to R & RStudio
- RStudio Overview
- Basics of R Programming
- Data wrangling and Visualization using Tidyverse
- Documentation and Reporting using R Markdown
- Sample R Projects
Duration of the workshop: 3 hours (R basics) + ~2 hours (R for data analysis)
Background knowledge required to participate in the workshop: This material is designed so that even non-programmers (statisticians and economists) can get started with R.
What concepts/technologies should participants be familiar with in order to attend the workshop?: Some familiarity with any programming language would help.
Target audience: who should attend the workshop?: SAS users and data scientists who want to add R to their existing tech stack.
Who should NOT attend this workshop?: Anyone who has read an R book, or even part of one, does not need to attend, as the material may seem very redundant.
Why attend this workshop? What will participants learn from attending this workshop? How will they benefit?: The data science tech stack is vast, and each tool has its own advantages. Having a language like R in your toolkit is really valuable. For example, R has a rich set of Bayesian tools, and its DSLs are extensive, customizable, and useful. Participants will learn to get started with R, laying the foundation for further work such as NLP with R or automated dashboarding and reporting using R.
Detailed workshop plan:
- Introduction to R & RStudio
  * What’s R
  * What’s RStudio
  * Why R
  * Demo of R
- RStudio Overview
  * RStudio Panes
  * RStudio Toolbar
  * RStudio Best Practices
- Basics of R Programming
  * Programming concepts like:
    - Variables
    - Data Structures
    - Iteration
    - Control Flows
    - Conditions and more
- Data wrangling and Visualization using Tidyverse
  * What’s tidyverse and what does it constitute
  * Data Analysis / Wrangling (mostly tidyr and dplyr)
  * Data Visualization (ggplot2)
- Documentation and Reporting using R Markdown
  * What’s RMarkdown
  * Why RMarkdown
  * Creating Documentation / Reporting
  * Publishing RMarkdown
- Sample R Projects
  * Sample R projects (Industry use-case)
Requirements:
  * R and RStudio are required to be installed
  * Basic system config of 2+ GB RAM, any OS
  * The set of packages mentioned in the GitHub repo should be installed
  * Download the GitHub repo that contains the data (along with the code and presentation)
Requirements
Best suited to those with some prior programming experience, but also open to beginners, especially those who want to do data science.
Speaker bio
Abdul Majed is an Analytics Consultant helping organizations make some sense of their massive data, which they often do not know what to do with. Married to R (but dating Python). Always amazed by open source and its contributors, and trying to be one of them.
Organizer @ Bengaluru R User Group (BRUG)
Contributed to open source by publishing packages on CRAN and PyPI
Writer @ Towards Data Science and DataScience+
Links
- https://towardsdatascience.com/@amrwrites
- https://github.com/amrrs
- https://kaggle.com/nulldata/kernels
Slides
https://amrrs.github.io/r_beginners_workshop/presentation.html#1