The Fifth Elephant 2015

A conference on data, machine learning, and distributed and parallel computing

Machine Learning, Distributed and Parallel Computing, and High-performance Computing are the themes for this year’s edition of Fifth Elephant.

The deadline for submitting a proposal is 15th June 2015.

We are looking for talks and workshops from academics and practitioners who are in the business of making sense of data, big and small.

Track 1: Discovering Insights and Driving Decisions

This track is about general, novel, fundamental, and advanced techniques for making sense of data and driving decisions from data. This could encompass applications of the following ML paradigms:

  • Statistical Visualizations
  • Unsupervised Learning
  • Supervised Learning
  • Semi-Supervised Learning
  • Active Learning
  • Reinforcement Learning
  • Monte Carlo techniques and probabilistic programming
  • Deep Learning

These paradigms may be applied across various data modalities such as multivariate data, text, speech, time series, images, video, and transactions.

Track 2: Speed at Scale

This track is about tools and processes for collecting, indexing, and processing vast amounts of data. The theme includes:

  • Distributed and Parallel Computing
  • Real Time Analytics and Stream Processing
  • MapReduce and Graph Computing frameworks
  • Kafka, Spark, Hadoop, MPI
  • Stories of parallelizing sequential programs
  • Cost/Security/Disaster Management of Data

Commitment to Open Source

HasGeek believes in open source as the binding force of our community. If you are describing a codebase for developers to work with, we’d like it to be available under a permissive open source license. If your software is commercially licensed or available under a combination of commercial and restrictive open source licenses (such as the various forms of the GPL), please consider picking up a sponsorship. We recognize that there are valid reasons for commercial licensing, but ask that you support us in return for giving you an audience. Your session will be marked on the schedule as a sponsored session.

Workshops

If you are interested in conducting a hands-on session on any of the topics falling under the themes of the two tracks described above, please submit a proposal under the workshops section. We also need you to tell us about your past experience in teaching and/or conducting workshops.

Hosted by

The Fifth Elephant - known as one of the best data science and Machine Learning conferences in Asia - has transitioned into a year-round forum for conversations about data and ML engineering, data science in production, and data security and privacy practices.

Abhijit Pratap Singh

@sabhi

Harnessing the power of the Erlang VM at Housing

Submitted Jun 15, 2015

RoR and Django have ensured we remain productive in the face of rapidly changing product requirements at Housing. However, we ran into memory and speed issues when we had to scale throughput and interface with other services in our SOA. This talk describes how we rewrote some core parts of our infrastructure to ride on the coattails of the awesome Erlang VM.

Outline

I will briefly introduce the audience to Erlang/Elixir, the actor model, and its benefits.
We will then discuss the following three use cases as examples of where we used the Erlang VM to overcome limitations of speed and memory in our core infrastructure:

  1. Ruby APIs that implement search with Elasticsearch have been the backbone of our real-time search infrastructure. After experiencing a few server-side scalability bottlenecks caused by inefficient Unicorn workers, we decided to port some of the feasible and critical code to Elixir, which exploits the soft real-time capabilities of the Erlang VM. In a Service Oriented Architecture, the various Ruby APIs doing network IO were a great fit for the Erlang VM.

  2. CouchDB is another piece of software written in Erlang. It plays a significant role in storing custom JSON data and enables real-time streaming search through CouchDB Views. We injected an Elixir evaluation runtime into CouchDB so that we can write views in Elixir, which results in performant View generation times while preserving developer productivity.

  3. RabbitMQ has been a critical component of our backend infrastructure. Topic exchanges decouple the source logic from the consumer logic, which lets us write generic publishers that can simply be embedded as a library in other services.
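
To make the decoupling in the third use case concrete, here is a minimal in-process sketch of how AMQP-style topic routing behaves. It is written in Python for illustration only and is not the code used at Housing. It does not talk to RabbitMQ. The class name TopicExchange, the function topic_matches, and the queue names and routing keys are all hypothetical. The only rule taken from AMQP is the pattern syntax: words are separated by dots, `*` matches exactly one word, and `#` matches zero or more words.

```python
from typing import List, Tuple


def topic_matches(pattern: str, routing_key: str) -> bool:
    """Return True if an AMQP-style topic pattern matches a routing key.

    '*' stands for exactly one dot-separated word, '#' for zero or more.
    """
    p = pattern.split(".")
    k = routing_key.split(".")

    def match(i: int, j: int) -> bool:
        if i == len(p):
            # Pattern exhausted: succeed only if the key is exhausted too.
            return j == len(k)
        if p[i] == "#":
            # '#' may absorb zero or more of the remaining words.
            return any(match(i + 1, n) for n in range(j, len(k) + 1))
        if j < len(k) and p[i] in ("*", k[j]):
            return match(i + 1, j + 1)
        return False

    return match(0, 0)


class TopicExchange:
    """Hypothetical stand-in for a broker-side topic exchange."""

    def __init__(self) -> None:
        self._bindings: List[Tuple[str, str]] = []  # (pattern, queue name)

    def bind(self, queue_name: str, pattern: str) -> None:
        self._bindings.append((pattern, queue_name))

    def route(self, routing_key: str) -> List[str]:
        """Return the queues that would receive a message, in binding order."""
        targets: List[str] = []
        for pattern, queue_name in self._bindings:
            if queue_name not in targets and topic_matches(pattern, routing_key):
                targets.append(queue_name)
        return targets


if __name__ == "__main__":
    exchange = TopicExchange()
    # Consumers declare what they care about; names are made up.
    exchange.bind("search_indexer", "listing.*.updated")
    exchange.bind("audit_log", "listing.#")

    # The publisher only picks a routing key and never names a consumer.
    for key in ("listing.flat.updated", "listing.flat.price.changed", "user.created"):
        print(key, "->", exchange.route(key))
```

The publisher side only chooses a routing key, so new consumers can be added by binding another pattern without touching the publishing code. That is the property that makes a generic, embeddable publisher library possible.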

Finally, I will shed some light on production issues that can occur on the Erlang VM and ways to fix them.

Requirements

No requirements

Speaker bio

Abhijit Pratap Singh is a Software Developer at Housing.com. He has been working at Housing.com for the past two years. He has played an active part in developing the product backend for Housing.com.

Pranav Rao is a Software Developer at Housing.com. He was the primary developer behind the geo-services backend at Housing.com, working on search and storage of other geo-entities.

Slides

http://i.imgur.com/a0vYx1e.png?1

