JSFoo 2014

JavaScript as the centerpiece of a complex web stack

Using PhantomJS for web process automation, testing and content scraping

Submitted by Shani Mahadeva (@satyashani) on Tuesday, 26 August 2014

Technical level

Intermediate

Section

Full talk

Status

Submitted

Total votes:  +2

Objective

To explain how PhantomJS, in conjunction with CasperJS, can be used for unobtrusive content scraping from websites and for automating web form filling, testing, crawling, etc.

Description

PhantomJS and CasperJS have been around for quite some time now, but I don’t see them used much in mainstream production, nor do I see many talks about them. There are occasions when you need to collect data from a client’s website for analysis, or from the web for linguistic analysis. While Perl or Python is generally used for this, PhantomJS is a much better fit: an increasing number of websites today load their content via Ajax, and because PhantomJS is a headless browser that runs the page’s JavaScript, that content is available once the page has loaded and can be easily parsed with jQuery. Similarly, for functional testing of websites and for generating screenshots for various devices, PhantomJS is the way to go. Another application is web process automation, such as filling in a large number of forms, link collection, pirated content detection, etc.

Requirements

Any operating system, PhantomJS, CasperJS, and any good editor.
Basic knowledge of JavaScript and an understanding of variable scope.

Speaker bio

I’m a senior programmer at a research company based in New York that analyzes the content of online LMS platforms. For them, I sometimes have to scrape public content from new clients’ websites. We do all our functional testing using PhantomJS. I also use PhantomJS to protect client content, such as music files, from internet piracy.

Comments

  • 2
    Sugandha Naolekar (@sugandha) 4 years ago

    Yes, this will be very helpful, especially for crawling data from government websites.
