Web Scraping with Processing

After our workshop with Tom on web scraping, we were asked to go and try scraping some data ourselves. We looked at writing a scraper in Python, which I found a little hard to get my head around. As I have worked in Processing before, it seemed logical to try to replicate a scraper using P5.

The following code can be pasted into Processing and used to scrape the HTML from a given URL. It prints the number of lines scraped, followed by each line in turn.

String[] lines;
void setup() {
  lines = loadStrings("");  // Input chosen URL here
}
void draw() {
  println("There are " + lines.length + " lines");  // Prints "There are X lines"
  for (int i = 0; i < lines.length; i++) {  // Steps through each scraped line, with 'i' as the line index
    println(lines[i]);  // Prints line 'i'
  }
  delay(1000);  // Waits one second before draw() runs again
}

My idea is to use this web scraper on a live-updating webpage, hence the timer (delay) at the bottom instead of a stop command.
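
For a live page, the scraper would need to download the page again on each pass rather than only once at the start. The following is a rough sketch of how that might look, not a finished tool. The URL is a hypothetical placeholder, and previousCount is a name I have made up for illustration. It assumes that comparing line counts is enough to notice an update, which will miss changes that leave the number of lines the same.

```
// Hypothetical placeholder URL: swap in the page you want to watch
String url = "https://example.com";
// Made-up helper variable: line count from the previous pass (-1 means no pass yet)
int previousCount = -1;

void draw() {
  String[] lines = loadStrings(url);  // Re-downloads the page on every pass
  if (lines == null) {
    // loadStrings() returns null if the page could not be loaded
    println("Could not load " + url);
  } else if (lines.length != previousCount) {
    // Only report when the number of lines has changed since the last pass
    println("There are now " + lines.length + " lines");
    previousCount = lines.length;
  }
  delay(1000);  // Waits one second before the next pass
}
```

Because draw() repeats for as long as the sketch is running, this keeps checking the page until the sketch is stopped by hand.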

