Jul 2016
28 Thu 08:30 AM – 06:25 PM IST
29 Fri 08:30 AM – 06:15 PM IST
30 Sat 08:45 AM – 05:00 PM IST
31 Sun 08:15 AM – 06:00 PM IST
Vipul Gupta
Larger datasets generally lead to better prediction models. However, experimenting with large datasets in a test environment to check a model's accuracy is not always feasible, mainly because of limited resources such as main memory and CPU power. This talk will show how such experiments can be run on small nodes (such as a modern laptop) by leveraging streaming systems like Spark, and how streaming systems can be applied to machine learning problems in test environments with limited resources.
The audience can expect to understand the benefits of using streaming systems to set up, train and test their models in a small environment (even a single node), and eventually to deploy those models in production environments with abundant system resources.
Testing machine learning algorithms requires large amounts of training data, which is essential for finding a good prediction model.
Large datasets: far larger than the memory of a node
Streaming systems to the rescue: Spark as an example
A logistic regression or K-means clustering example to demonstrate the concept
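
The idea in the outline above can be sketched without a cluster. The following is a minimal illustration in plain Python, not Spark and not the speaker's demo: it trains a logistic regression model with mini-batch gradient descent while holding only one batch in memory at a time. The function names (`stream_batches`, `train_streaming`, `accuracy`), the synthetic data, the labelling rule and the learning rate are all assumptions made for this example.

```python
import math
import random


def stream_batches(n_rows, batch_size, seed=0):
    """Yield synthetic mini-batches of ((x1, x2), label) rows one at a time.

    This generator stands in for a streaming source: the full dataset is
    never materialised, only the current batch is held in memory.
    """
    rng = random.Random(seed)
    batch = []
    for _ in range(n_rows):
        x1, x2 = rng.uniform(-1, 1), rng.uniform(-1, 1)
        label = 1 if x1 + x2 > 0 else 0  # hypothetical ground-truth rule
        batch.append(((x1, x2), label))
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # emit the final, possibly shorter, batch
        yield batch


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, x):
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def train_streaming(batches, n_features=2, lr=0.5):
    """Update the model once per batch, then discard the batch."""
    weights = [0.0] * n_features
    bias = 0.0
    for batch in batches:
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in batch:
            err = predict_proba(weights, bias, x) - y
            for i, xi in enumerate(x):
                grad_w[i] += err * xi
            grad_b += err
        n = len(batch)
        # Gradient step on the average log-loss gradient of this batch.
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def accuracy(weights, bias, batches):
    """Evaluate on a stream, again one batch at a time."""
    correct = total = 0
    for batch in batches:
        for x, y in batch:
            pred = 1 if predict_proba(weights, bias, x) >= 0.5 else 0
            correct += int(pred == y)
            total += 1
    return correct / total


# Train on 20,000 streamed rows, then evaluate on a separately seeded stream.
weights, bias = train_streaming(stream_batches(20000, 100))
print(weights, bias, accuracy(weights, bias, stream_batches(2000, 100, seed=1)))
```

Memory use here depends on the batch size rather than on the total number of rows, which is what makes a laptop-sized test environment workable. A streaming system such as Spark applies the same per-batch update pattern to data arriving from files or a message queue.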