URL Scraper Desktop Utility

I need a lightweight desktop utility that does the following:

1. Allow the user to enter a search query; after clicking “Start”, the utility will query Google.com search results using Google’s web search API (http://code.google.com/apis/ajaxsearch/web.html) and gather the first 500-1000 URLs from the search results. Using the API allows gathering this information without violating Google’s terms. (A sketch of the API call appears after this list.)

2. The utility should also gather the Google PageRank of each domain (not each page URL). An example script that can do this is here: http://www.hm2k.com/projects/pagerank. I’m not sure how to do this from a desktop utility. (A sketch of the toolbar query appears after this list.)

3. The utility will then take each domain in the query results and find its number of “Unique Visitors” from Compete.com. Here’s an example query that shows unique visitors for Digg: http://siteanalytics.compete.com/digg.com/ (a scraping sketch appears after this list).

4. The results (URL, domain, PageRank, unique visitors) should be displayed in table format, sorted from highest to lowest unique visitors, with a count of the URLs, and include a way to export to a .csv file.

5. The user should be able to filter by a minimum PageRank (1-10) and/or minimum unique visitors for inclusion in the table and the .csv file. (The export sketch at the end of this list also shows this filtering.)
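
For step 1, here is a minimal sketch in Python against the documented AJAX web search endpoint. Note that this API pages results 8 at a time and, in practice, caps total results well below 500, so reaching 500-1000 URLs may take several related queries; the function name `google_result_urls` is my own.

```python
import json
import urllib.parse
import urllib.request

SEARCH_API = "http://ajax.googleapis.com/ajax/services/search/web"

def google_result_urls(query, max_pages=8):
    """Fetch result URLs from the Google AJAX web search API.

    rsz=large returns 8 results per page; `start` pages through them.
    """
    urls = []
    for start in range(0, max_pages * 8, 8):
        params = urllib.parse.urlencode(
            {"v": "1.0", "q": query, "rsz": "large", "start": start})
        with urllib.request.urlopen(f"{SEARCH_API}?{params}") as resp:
            data = json.load(resp)
        results = (data.get("responseData") or {}).get("results") or []
        if not results:  # no further pages
            break
        urls.extend(r["unescapedUrl"] for r in results)
    return urls
```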
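
For step 2, the linked script works by querying the Google toolbar endpoint with a per-URL checksum. A sketch, assuming a `toolbar_checksum` callable that implements the checksum (“ch”) algorithm from the linked hm2k.com script; the hash itself is omitted here.

```python
import re
import urllib.parse
import urllib.request

def domain_pagerank(domain, toolbar_checksum):
    """Look up a domain's toolbar PageRank (0-10), or None on failure.

    `toolbar_checksum` is an assumed helper computing the 'ch' value;
    the endpoint replies with a line like 'Rank_1:1:5'.
    """
    query = f"info:{domain}"
    url = ("http://toolbarqueries.google.com/tbr?client=navclient-auto"
           f"&features=Rank&ch={toolbar_checksum(query)}"
           f"&q={urllib.parse.quote(query)}")
    with urllib.request.urlopen(url) as resp:
        body = resp.read().decode("ascii", "replace")
    m = re.search(r"Rank_\d+:\d+:(\d+)", body)
    return int(m.group(1)) if m else None
```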
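
For step 3, absent an official feed, one option is to fetch the site-analytics page and pull the figure out of the HTML. A rough sketch; the regex is a guess at the page markup, not a confirmed selector, and would need adjusting against the live page (or replacing with Compete’s API if access is available).

```python
import re
import urllib.request
from urllib.parse import urlparse

def domain_of(url):
    """Reduce a result URL to its bare domain, e.g. 'digg.com'."""
    return urlparse(url).netloc.lower().removeprefix("www.")

def compete_unique_visitors(domain):
    """Scrape a 'Unique Visitors' figure from Compete.com's
    site-analytics page for a domain; returns None if not found."""
    url = f"http://siteanalytics.compete.com/{domain}/"
    req = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urllib.request.urlopen(req) as resp:
        html = resp.read().decode("utf-8", "replace")
    # Assumed markup: a number somewhere after the 'Unique Visitors' label.
    m = re.search(r"Unique Visitors[^0-9]*([\d,]+)", html, re.IGNORECASE)
    return int(m.group(1).replace(",", "")) if m else None
```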
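
Steps 4 and 5 reduce to plain list handling once the rows are collected. A sketch of the filtering, sorting, and CSV export, assuming each row is a (url, domain, pagerank, uniques) tuple with None for missing values.

```python
import csv

def filter_rows(rows, min_pagerank=None, min_uniques=None):
    """Keep rows that meet the optional minimum PageRank and
    unique-visitor thresholds."""
    return [r for r in rows
            if (min_pagerank is None or (r[2] or 0) >= min_pagerank)
            and (min_uniques is None or (r[3] or 0) >= min_uniques)]

def export_csv(rows, path):
    """Write rows to a .csv file, sorted highest to lowest unique visitors."""
    rows = sorted(rows, key=lambda r: r[3] or 0, reverse=True)
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["URL", "Domain", "PageRank", "Unique Visitors"])
        writer.writerows(rows)
```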

The code must be fully documented. The developer must deliver the source code as well as a compiled program.

I need this software delivered 4 days after the project starts. Anyone showing a working demo (a video or limited version is fine) will be given preference.
