Jul 2019
22 Mon
23 Tue
24 Wed
25 Thu 09:15 AM – 05:45 PM IST
26 Fri 09:20 AM – 05:30 PM IST
27 Sat
28 Sun
Nandan Thakur
Data science starts with data cleaning. When developers work with text, they often clean it up first: sometimes by replacing keywords (“Javascript” with “JavaScript”), and sometimes by checking whether a keyword (“JavaScript”) is mentioned in a document. Datasets keep growing, from tens of thousands to millions of documents, and cleaning them with regular expressions can take days (about 5 days for 20K keywords and 3 million documents). FlashText, a very fast keyword library, cuts that computation time to a few minutes (about 15 minutes for the same 20K keywords and 3 million documents). FlashText is efficient at both extracting keywords and replacing them in sentences, and it is built on the Aho-Corasick algorithm and the trie data structure.
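The trie-based matching mentioned above can be illustrated with a minimal pure-Python sketch. This is not FlashText's actual source code: the names `build_trie`, `find_matches`, `extract_keywords`, `replace_keywords`, and the `END` sentinel are hypothetical, and the sketch assumes case-insensitive matching on whole words only. The point is that the text is scanned once, character by character, so the cost does not grow with the number of keywords the way a loop over many regular expressions does.

```python
import string

# Characters treated as part of a word; anything else counts as a boundary.
WORD_CHARS = set(string.ascii_letters + string.digits + "_")
END = "_end_"  # hypothetical sentinel key marking the end of a keyword


def build_trie(keywords):
    """Build a character trie from a {keyword: clean_name} mapping."""
    root = {}
    for keyword, clean_name in keywords.items():
        node = root
        for ch in keyword.lower():
            node = node.setdefault(ch, {})
        node[END] = clean_name
    return root


def find_matches(text, trie):
    """Return (start, end, clean_name) for the longest keyword at each word start."""
    matches = []
    i, n = 0, len(text)
    while i < n:
        at_word_start = i == 0 or text[i - 1] not in WORD_CHARS
        best = None
        if at_word_start:
            node, j = trie, i
            # Walk the trie for as long as the text keeps matching.
            while j < n and text[j].lower() in node:
                node = node[text[j].lower()]
                j += 1
                ends_on_boundary = j == n or text[j] not in WORD_CHARS
                if END in node and ends_on_boundary:
                    best = (i, j, node[END])  # remember the longest match so far
        if best:
            matches.append(best)
            i = best[1]  # continue scanning after the matched keyword
        else:
            i += 1
    return matches


def extract_keywords(text, trie):
    """List the clean names of all keywords found in the text."""
    return [clean_name for _, _, clean_name in find_matches(text, trie)]


def replace_keywords(text, trie):
    """Replace every matched keyword with its clean name."""
    pieces, last = [], 0
    for start, end, clean_name in find_matches(text, trie):
        pieces.append(text[last:start])
        pieces.append(clean_name)
        last = end
    pieces.append(text[last:])
    return "".join(pieces)


trie = build_trie({
    "javascript": "JavaScript",
    "java": "Java",
    "big apple": "New York",
})
sample = "I moved to big apple to write Javascript and java, not javascripts."
print(extract_keywords(sample, trie))
print(replace_keywords(sample, trie))
```

Because matches must start and end on a word boundary, "javascripts" is left untouched, which is the kind of whole-word behaviour that is awkward to get right across thousands of regular expressions.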
[0-3mins]: Brief introduction about myself. Introduction to FlashText and a comparison of FlashText vs. regular expression performance.
[3-8mins]: How is FlashText so blazingly fast?
[8-10mins]: When to Use FlashText?
[10-12mins]: Installing FlashText.
[12-15mins]: Use case 1: Code – Searching for words in a text document.
[15-18mins]: Use case 2: Code – Replacing words in a text document.
[18-20mins]: End Notes and Feedback for Future Talks.
Not a workshop
I am a perpetual, quick learner, keen to explore the realm of data analytics and data science. I am deeply excited about the times we live in and the rate at which data is being generated and transformed into an asset. I am well versed in domains such as Natural Language Processing, Machine Learning, and Signal Processing, and have a keen interest in learning interdisciplinary concepts involving Machine Learning.
https://drive.google.com/open?id=1WZ6MU80Qoz5znd89p9aSzTKxAor4Mo6zMvF2qPKqRyA