PyCon Pune 2017

A conference on the Python programming language

Mohan Prakash

@mpduty

Creating spiders and crawling websites using Scrapy

Submitted Nov 29, 2016

It is often necessary to extract data from websites for purposes such as data mining, content optimisation, data analysis, and archival of historical data. Scrapy provides an application framework for extracting structured data by crawling websites. This talk will cover the basics of Scrapy, its installation, and live examples of writing spiders (classes that define how a website will be scraped) for beginners.

Outline

What is Scrapy (2 minutes)
Dependencies and requirements (5 minutes)
Installation (2 minutes)
Creating a new Scrapy Project (2 minutes)
Example of a spider that crawls a site and extracts data (5 minutes)
Command line method of exporting the scraped data (5 minutes)
Making the spider follow the links (5 minutes)
Using spider arguments (3 minutes)
Examples (3 minutes)

Speaker bio

Python and Java trainer
Managing Trustee of Peoples Education Trust
Contributor to the Fedora Project
Pursuing PhD in Data Mining

