Finding order in the chaos: machine learning for web text analytics using R

This submission has been added to the schedule

Submitted Jun 3, 2013

Section: Workshops
Technical level: Beginner

Participants will gain an understanding of the following, working in R throughout:

A short, plain-English introduction to the ideas that underlie text mining.
How to import large volumes of unstructured text data and apply basic cleanup procedures.
How to apply more advanced natural language processing methods to the data.
How to convert unstructured text into data structures suitable for machine learning and visualization.
How to apply unsupervised learning methods (clustering, Latent Dirichlet Allocation) to identify topics in web documents.
How to apply supervised learning methods to the data for classification.
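
To make the steps in this list concrete, here is a minimal sketch in base R. It is not the workshop's own material: the four-sentence toy corpus, the stop-word list, the class labels, and the helper names tokenise, count_terms, and cosine are all assumptions made for illustration. Hierarchical clustering stands in for the unsupervised step and a nearest-centroid rule for the supervised step. In practice, add-on packages such as tm and topicmodels cover cleanup and Latent Dirichlet Allocation more thoroughly.

```r
# Toy corpus standing in for scraped web pages (hypothetical data).
docs <- c(
  "Cricket fans cheered as the team won the match.",
  "The team lost the cricket match after a batting collapse.",
  "The bank raised loan interest rates again.",
  "Interest rates on bank loan products fell this year."
)

# Tiny illustrative stop-word list (an assumption, not a standard list).
stop_words <- c("the", "a", "as", "on", "after", "this", "again")

# Basic cleanup: lower-case, replace non-letters with spaces, split on
# whitespace, then drop empty strings and stop words.
tokenise <- function(text, stop_words) {
  text <- tolower(text)
  text <- gsub("[^a-z ]", " ", text)
  words <- strsplit(trimws(text), " +")[[1]]
  words[nzchar(words) & !words %in% stop_words]
}

# Count how often each vocabulary term occurs; unknown words are ignored.
count_terms <- function(words, vocab) {
  tabulate(match(words[words %in% vocab], vocab), nbins = length(vocab))
}

tokens <- lapply(docs, tokenise, stop_words = stop_words)
vocab <- sort(unique(unlist(tokens)))

# Document-term matrix: one row per document, one column per term.
dtm <- t(vapply(tokens, count_terms, integer(length(vocab)), vocab = vocab))
dimnames(dtm) <- list(paste0("doc", seq_along(docs)), vocab)

# Unsupervised step: hierarchical clustering on Euclidean distances
# between the count vectors, cut into two groups.
clusters <- cutree(hclust(dist(dtm)), k = 2)

# Supervised step: nearest-centroid classification by cosine similarity.
# The labels are hand-assigned for this toy example.
labels <- factor(c("sports", "sports", "finance", "finance"))
centroids <- rowsum(dtm, labels) / as.vector(table(labels))

cosine <- function(a, b) sum(a * b) / sqrt(sum(a^2) * sum(b^2))

new_vec <- count_terms(tokenise("The cricket team lost", stop_words), vocab)
scores <- apply(centroids, 1, cosine, b = new_vec)
predicted <- names(which.max(scores))

print(clusters)
print(predicted)
```

Raw term counts and Euclidean distance are the simplest possible choices here. On real web text, weighting schemes such as TF-IDF and a proper stop-word list usually matter far more than they do on four short sentences.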

Outline

Do buzzwords like social data mining or sentiment analysis give you the feeling of putting ‘the cart before the horse’? Fundamental text mining methods are the real ‘workhorses’ behind these buzzwords. This workshop aims to build an understanding of those fundamentals in a ‘learning by doing’ fashion.

The Internet, the information beast, consists largely of unstructured text data. The R environment provides an excellent set of tools for dealing with it. We will take up a realistic problem, finding topics in web documents, and touch upon a number of relevant machine learning methods using R.

We will also cover some relevant and interesting business problems which can be tackled using these methods.

Requirements

Laptop with working R installation
Internet connection (to download the relevant R packages)

Speaker bio

An avid R user, I work on applying machine learning methods to digital advertising at Sokrati Inc. I have prior experience applying these methods to problems in the telecom and banking sectors. I hold a master’s in Operations Research from IIT, Mumbai.

The Fifth Elephant 2013
