In 2014, infrastructure components such as Hadoop, Berkeley Data Stack and other commercial tools have stabilized and are thriving. The challenges have moved higher up the stack, from data collection and storage to data analysis and its presentation to users. The focus of this year’s conference is analytics: the infrastructure that powers analytics and how analytics is done.
Talks will cover various forms of analytics including real-time and opportunity analytics, and technologies and models used for analyzing data.
Proposals will be reviewed using six criteria:
Domain diversity – proposals will be selected from different domains: medical, insurance, banking, online transactions, retail. If there is more than one proposal from a domain, the one that best meets the editorial criteria will be chosen.
Novelty – what has been done beyond the obvious.
Insights – what insights the proposal shares with the audience that they did not know earlier.
Practical versus theoretical – we are looking for applied knowledge. If the proposal covers material that can be looked up online, it will not be considered.
Conceptual versus tools-centric – tell us why, not how. Tell the audience the philosophy underlying your use of an application, not how the application was used.
Presentation skills – the proposer’s presentation skills will be reviewed carefully, and assistance will be provided to ensure that the material is communicated to the audience in the most precise and effective manner.
For queries about proposals / submissions, write to firstname.lastname@example.org
Data Collection and Transport – e.g., Opendatatoolkit, Scribe, Kafka, RabbitMQ.
Data Storage, Caching and Management – distributed storage (such as Gluster, HDFS), hardware-specific storage (such as SSD or memory), databases (PostgreSQL, MySQL, Infobright) and caching/storage systems (Memcached, Cassandra, Redis, etc.).
Data Processing, Querying and Analysis – Oozie, Azkaban, scikit-learn, Mahout, Impala, Hive, Tez, etc.
Big data and security
Big data and internet of things
Data Usage and BI (Business Intelligence) in different sectors.
Please note: the technology stacks mentioned above indicate the latest technologies that will be of interest to the community. Talks should not be on the technologies per se, but on how these have been used and implemented in various sectors, enterprises and contexts.
Data sciences (is) in fashion @ Myntra
Ever dreamt of walking into a store that has been designed just for you? A store where the shelves have been stacked with only your fashion preferences in mind, and where a sales rep understands what you wear and what’s missing in your wardrobe? Myntra is fast transforming itself into such a hyper-personalized (1:1) store, and this transformation is being powered solely through analytics over big data. This talk discusses the challenges of delivering personalization at Internet scale. We present an overview of the machine learning techniques and big data technologies used to develop the system.
Fashion is a hard category to sell online, given that most purchases happen on impulse. It differs fundamentally from selling categories like mobiles, which are driven more by reviews and ratings. Data sciences help bring a differentiating angle to selling fashion and can be applied to a variety of fashion e-tailing problems: product rankings (the extreme form of which is a personalised store for every user), store organisation and navigation, better customer engagement, better offer creation, better merchandising decisions, etc. We at Myntra are working on these specific problems and would love to talk about the approaches that have worked for us.
We are personalising the customer experience in multiple ways:
- by personalising customer communications through various channels
- by tailoring the website to customers’ preferences
- by creating unique customer-specific offers
We will also talk about the data platform that powers all these efforts.
Divya Alok, Devashish, Debdoot
Data scientists working at Myntra. We look at tons of data and actively work on creating data products used by our customers, both external and internal.