I would like to scrape a list of pages that will all have similar HTML. The goal is to scrape all of the ‘Things to Do’ data from TripAdvisor.
We start with a list of countries, which I will provide.
Example:
http://www.tripadvisor.com/Attractions-g291982-Activities-Costa_Rica.html
http://www.tripadvisor.com/Attractions-g153339-Activities-Canada.html
http://www.tripadvisor.com/Attractions-g150768-Activities-Mexico.html
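For illustration, here is a minimal sketch of how the country list could be read into structured records. It assumes every URL follows the same shape as the three examples, and that the number after `g` is some kind of location ID (that reading is a guess from the URLs, not confirmed). The names `COUNTRY_URL`, `Country` and `parse_country_url` are hypothetical.

```python
import re
from typing import NamedTuple

# Pattern inferred from the three example URLs only; other pages may differ.
COUNTRY_URL = re.compile(
    r"/Attractions-g(?P<geo_id>\d+)-Activities-(?P<name>[^./]+)\.html$"
)


class Country(NamedTuple):
    geo_id: str  # assumed to be a location ID
    name: str    # country name with underscores turned into spaces
    url: str     # the original URL, unchanged


def parse_country_url(url: str) -> Country:
    """Split one country URL into its ID and readable name."""
    match = COUNTRY_URL.search(url)
    if match is None:
        raise ValueError(f"Unrecognized country URL: {url}")
    return Country(match["geo_id"], match["name"].replace("_", " "), url)


if __name__ == "__main__":
    urls = [
        "http://www.tripadvisor.com/Attractions-g291982-Activities-Costa_Rica.html",
        "http://www.tripadvisor.com/Attractions-g153339-Activities-Canada.html",
        "http://www.tripadvisor.com/Attractions-g150768-Activities-Mexico.html",
    ]
    for url in urls:
        print(parse_country_url(url))
```

Rejecting URLs that do not match, rather than skipping them silently, makes it easier to spot a malformed entry in the supplied list.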
Within each country, we scrape the URL of each city, and visit…
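The city-link step could look roughly like the sketch below. It uses only the Python standard library and works on HTML that has already been downloaded. It assumes city links on a country page use the same `/Attractions-g…` URL shape as the country pages themselves; that is an assumption, and the real markup would need to be checked. The names `AttractionLinkParser` and `extract_city_links` are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class AttractionLinkParser(HTMLParser):
    """Collect <a href> targets that look like attraction listing pages."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # Assumption: city pages share the "/Attractions-g" URL prefix.
        if href and "/Attractions-g" in href:
            absolute = urljoin(self.base_url, href)
            # Skip the country page itself and any repeated links.
            if absolute != self.base_url and absolute not in self.links:
                self.links.append(absolute)


def extract_city_links(html, base_url):
    """Return unique absolute attraction URLs found in html, in page order."""
    parser = AttractionLinkParser(base_url)
    parser.feed(html)
    return parser.links
```

Downloading the pages is left out on purpose, so the extraction logic can be tested against saved HTML before any requests are made.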
