JuliaCon India 2015

The first Indian conference on the Julia programming language

Tanmay K. Mohapatra


Crunching Big Data with Julia

Submitted Sep 18, 2015

An introduction to the big data infrastructure available in Julia, which can be used to read and write HDFS files and to run parallel Julia programs on a Yarn cluster.


This talk will use Elly.jl to demonstrate a big data workflow in Julia. Elly is a Hadoop HDFS and Yarn client. It is a pure Julia implementation with no dependencies on libhdfs. It provides:

  • A Julia ClusterManager interface, making it possible to use the standard Julia parallel constructs on a Yarn cluster: addprocs, @parallel, @spawn, pmap, etc.
  • Lower level APIs to write native Yarn applications.
  • A familiar Julia IO API for accessing HDFS files.

We shall use Elly and a few associated Julia packages to process a few example datasets.

Speaker bio

Tanmay K.M., Julia contributor. https://github.com/tanmaykm


