Python Script To Scrape Website

We are looking for someone to develop a Python script that scrapes the Craigslist housing listings posted at http://denver.craigslist.org/hhh/ and grabs each listing's title, address (if available), images, housing category, and description. The script should then convert this information into the XML format shown in the attached document.
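As a rough sketch of the output step only, assuming the scraped fields for one listing are already collected in a dict: the element names below (`listing`, `title`, `images`, and so on) are placeholders invented for illustration, since the real layout is the one defined in the attached document.

```python
import xml.etree.ElementTree as ET


def listing_to_xml(listing):
    """Serialize one scraped listing (a dict) to an XML string.

    All element names are hypothetical placeholders; the required
    schema is the one shown in the attached document.
    """
    root = ET.Element("listing")
    for field in ("title", "address", "category", "description"):
        value = listing.get(field)
        if value:  # the address is optional, so missing fields are skipped
            ET.SubElement(root, field).text = value
    images = ET.SubElement(root, "images")
    for url in listing.get("images", []):
        ET.SubElement(images, "image").text = url
    return ET.tostring(root, encoding="unicode")
```

Using `xml.etree.ElementTree` rather than string concatenation means special characters in titles and descriptions are escaped automatically.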

Some key aspects to note:
- The address will have to be geocoded into latitude and longitude.

- Housing category codes will have to be read and converted to our category codes (see the attached document for the conversions).

- The description text of the housing listing (misc) may have to be broken up by length into sections; see the “{tabend nameoftab}” tags in the attached document.
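The category conversion and the length-based splitting could be sketched roughly as follows. This is only an illustration under stated assumptions: the entries in `CATEGORY_MAP`, the `OTHER` fallback, and the 500-character limit are all made up, and the real conversions and section layout (including where the tabend tags go) come from the attached document.

```python
import textwrap

# Hypothetical mapping from source housing codes to in-house codes;
# the real conversions are listed in the attached document.
CATEGORY_MAP = {"apa": "APARTMENT", "roo": "ROOM"}


def convert_category(source_code, default="OTHER"):
    """Translate a source category code, falling back to a default."""
    return CATEGORY_MAP.get(source_code, default)


def split_description(text, max_len=500):
    """Break text into pieces of at most max_len characters.

    textwrap.wrap splits on word boundaries and collapses runs of
    whitespace (including newlines) into single spaces.
    """
    return textwrap.wrap(text, width=max_len)
```

For example, `split_description(description, max_len=500)` returns a list of sections, each of which could then be written out in whatever form the attached format requires.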
