Metadata And Profile Crawler

A crawler script that retrieves the metadata or profile from a list of webpages. The crawler must retrieve the title, the description or profile (from the metadata or from the body text), and the keywords.

The values must be similar to those displayed by a search engine such as Google or Bing. The webpages to scrape are on Wikipedia, Twitter, Facebook, Last.fm, IMDb, Myspace, and a generic website (assuming the description metadata is set).
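For the generic-website case, a minimal sketch of the extraction step might look like the following. It uses only Python's standard-library html.parser and assumes the page HTML has already been downloaded as a string. The names MetadataParser and extract_metadata are hypothetical, and the fallback to the Open Graph og:description tag is an assumption rather than part of the brief. Site-specific pages such as Twitter or Facebook profiles would likely need their own rules.

```python
from html.parser import HTMLParser


class MetadataParser(HTMLParser):
    """Hypothetical helper: collects the first <title> and the <meta> tags of a page."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self._title_done = False
        self._title_parts = []
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title" and not self._title_done:
            self._in_title = True
        elif tag == "meta":
            # name= is the classic form; property= is used by Open Graph tags.
            key = (attrs.get("name") or attrs.get("property") or "").lower()
            content = attrs.get("content")
            # Keep the first value seen for each key.
            if key and content and key not in self.meta:
                self.meta[key] = content.strip()

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self._title_done = True

    def handle_data(self, data):
        if self._in_title:
            self._title_parts.append(data)


def extract_metadata(html):
    """Return title, description and keywords found in an HTML string."""
    parser = MetadataParser()
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace inside the title.
    title = " ".join("".join(parser._title_parts).split())
    # Assumption: fall back to og:description when no plain description is set.
    description = parser.meta.get("description") or parser.meta.get("og:description", "")
    keywords = [k.strip() for k in parser.meta.get("keywords", "").split(",") if k.strip()]
    return {"title": title, "description": description, "keywords": keywords}
```

Calling extract_metadata(page_html) on each downloaded page returns a dictionary with the keys title, description, and keywords. An empty description signals that the crawler would have to fall back to the body text.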

For Twitter, the crawler must retrieve the person’s profile description.

F…
