The Fifth Elephant 2013

An Event on Big Data and Cloud Computing

Lakshman Prasad

@becomingguru

Interactive analysis of data live, using Pandas, Matplotlib and IPython

Submitted May 1, 2013

The session is a live-coding walkthrough in which various datasets are analysed with Pandas and plotted on the spot in an IPython notebook.

There has been a surge in the development of SciPy tools, and their adoption has grown sharply in recent times because they can be used both for interactive analysis and in production.

The session aims to give the audience a short tour of data analysis and visualisation with these scientific Python tools by analysing different datasets live.

Outline

One of the datasets to be used is parsed from usesthis.com, which documents the hardware and software people use to get their work done. (The site owner has given permission for this.)

The audience will be part of the whole process: parsing the data, converting it into NumPy arrays, and analysing it to answer various questions.
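As a rough idea of what that process could look like, here is a minimal sketch. It assumes the parsed interviews end up as one record per person and tool; the field names (`person`, `category`, `tool`) and the rows themselves are hypothetical placeholders, not the real usesthis.com data or the speaker's actual code.

```python
import numpy as np
import pandas as pd

# Hypothetical, hand-written records standing in for parsed interviews.
# The real fields and values on usesthis.com may look quite different.
records = [
    {"person": "A", "category": "software", "tool": "vim"},
    {"person": "A", "category": "hardware", "tool": "laptop"},
    {"person": "B", "category": "software", "tool": "vim"},
    {"person": "B", "category": "software", "tool": "git"},
    {"person": "C", "category": "software", "tool": "git"},
    {"person": "C", "category": "software", "tool": "vim"},
]

# One row per (person, tool) mention.
df = pd.DataFrame(records)

# Count how many distinct people mention each tool, most common first.
tool_counts = (
    df.groupby("tool")["person"]
    .nunique()
    .sort_values(ascending=False)
)

# The counts are backed by a plain NumPy array, ready for numeric work.
counts_array = tool_counts.to_numpy()

print(tool_counts)
```

Once the data is in this shape, questions such as "which tools are mentioned most often?" become one-line group-by operations.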

Another dataset will be the names of people recorded in the US Social Security database since 1880, with 3 million published name records.
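A possible first step with such a dataset is sketched below. It assumes one comma-separated file per year with `name,sex,births` columns; that layout, the column names, and the tiny made-up rows are assumptions for illustration only and are not real figures from the database.

```python
import io

import pandas as pd

# Made-up rows in an assumed "name,sex,births" layout, keyed by year.
# In practice each entry would be a file read from disk.
sample_files = {
    1880: "Mary,F,10\nJohn,M,8\nAnna,F,5\n",
    1881: "Mary,F,12\nJohn,M,9\nAnna,F,4\n",
}

frames = []
for year, text in sample_files.items():
    # The sample has no header row, so column names are supplied here.
    frame = pd.read_csv(io.StringIO(text), names=["name", "sex", "births"])
    frame["year"] = year
    frames.append(frame)

# Stack the per-year frames into one table.
names = pd.concat(frames, ignore_index=True)

# Total births per year, split by sex.
total_births = names.pivot_table(
    values="births", index="year", columns="sex", aggfunc="sum"
)

print(total_births)
```

In a notebook with Matplotlib installed, calling `total_births.plot()` on the result would draw one line per sex, which is the kind of live plotting the session describes.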

Speaker bio

The speaker has been working with Python for years, and the SciPy tools have always interested him. He recently took the time to dive into them and has been pleased with what he has learnt so far, which he can’t wait to share!
