A lazy programmer's guide to web scraping

Oct 2017
Sat 7, 09:00 AM – 10:00 AM IST
Sun 8, 09:00 AM – 10:00 AM IST

IIIT Hyderabad, Hyderabad

Submitted Aug 10, 2017

Technical level: Beginner

The volume of data available on the web keeps growing, and that data can be put to use in many ways. Web scraping helps you obtain it efficiently and saves a lot of time by turning data collection into an automated task.

The talk focuses on getting started with web scraping using Python. Scraping in Python can be done in several ways; the aim of this talk is to give attendees enough detail that, by the end, they can judge for themselves which approach to take and which libraries and tools to use for the problems they intend to solve. The talk will cover useful scraping libraries and tools, along with tricks and techniques for scraping even hard-to-scrape sites effectively. Hard-to-scrape sites are those that build the DOM with JavaScript, need authentication, require CAPTCHAs, rely on cookies, etc.
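To make the basic idea concrete, here is a minimal sketch of pulling links out of a page using only Python's standard library. It is an illustration, not necessarily the approach or library shown in the talk; the `LinkCollector` class, the `extract_links` helper and the sample HTML are all made up for this example.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (href, text) pairs for every <a href=...> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the <a> tag currently open, if any
        self._text = []     # text fragments seen inside that tag

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_links(html):
    """Return a list of (href, link text) tuples found in the HTML string."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical page content; a real scraper would download this first.
SAMPLE_HTML = """
<html><body>
  <a href="/talks/1">Web scraping</a>
  <a href="/talks/2">Automation</a>
</body></html>
"""

links = extract_links(SAMPLE_HTML)
```

Third-party parsers usually make this shorter, but the flow is the same: get the HTML, parse it, and keep only the pieces you need.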

The talk will also walk through some real-world code examples to give attendees a sense of what it takes to extract the data they need. It closes with some scraping ethics, so that no one ends up getting anyone into trouble.
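One common courtesy when scraping is to respect a site's robots.txt. The sketch below shows how that check might look with the standard library's `urllib.robotparser`; the robots.txt content, the `lazy-scraper` user agent and the example.com URLs are assumptions made up for illustration, and the talk itself may cover ethics differently.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt; a real scraper would fetch it from the target site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 5
"""


def build_rules(robots_txt):
    """Parse robots.txt text into a RobotFileParser without any network access."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    return rules


rules = build_rules(ROBOTS_TXT)

# Ask before fetching: is this user agent allowed to request this URL?
allowed_public = rules.can_fetch("lazy-scraper", "https://example.com/talks/")
allowed_private = rules.can_fetch("lazy-scraper", "https://example.com/private/data")

# Seconds the site asks crawlers to wait between requests (None if unspecified).
delay = rules.crawl_delay("lazy-scraper")
```

A scraper can skip any URL for which `can_fetch` returns False and sleep for `delay` seconds between requests.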

Outline

This is the basic outline of the talk; it may change somewhat before the final talk is delivered.

What is web scraping?
Why should you scrape?
Things that might come in handy
How it’s done
Comparing Parsers
Preserving the data
Code Examples
What to use and when to use it
Scraping Hacks
Ethics of Scraping
Q/A and General Discussion

Requirements

Prerequisites:
1. Basic HTML and CSS knowledge.
2. Knowledge of the HTTP methods GET and POST.
3. Familiarity with the Python language.
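As a quick refresher on the second prerequisite, the sketch below builds one GET and one POST request with the standard library, without sending them. The URLs, parameter names and the `build_get`/`build_post` helpers are hypothetical and exist only for this example.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_get(url, params):
    """GET: parameters travel in the URL's query string."""
    return Request(url + "?" + urlencode(params), method="GET")


def build_post(url, form):
    """POST: parameters travel in the request body as form-encoded bytes."""
    body = urlencode(form).encode("utf-8")
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    return Request(url, data=body, headers=headers, method="POST")


# Hypothetical endpoints, used only to show the shape of each request.
get_req = build_get("https://example.com/search", {"q": "python"})
post_req = build_post("https://example.com/login", {"user": "alice", "password": "secret"})
```

Passing either object to `urllib.request.urlopen` would actually send it. A search page is a typical GET target, while a login form is a typical POST target.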

Speaker bio

Hello world!
I am a web developer and an open source enthusiast. I am also a Python lover and use it to automate everything I can.
As an open source developer, I am an active member of various local user groups that support and promote open source.
Recently I spoke at PyDelhi Conf 2017 about automation using Python; my talk was titled “A lazy programmer’s guide to automation”.

Slides

https://docs.google.com/presentation/d/1vH8iglKUqzzydG0NK_lW0TtghFxu6U29KHrOGlHNmEk/pub?start=false&loop=false&delayms=5000&slide=id.p

PyConf Hyderabad 2017