Data Mining

The project involves extracting a set of short data tables from a single public information website and entering them into an Excel or .csv file. Excel will be easier, for reasons we will explain. The data has to be extracted manually because the site is protected by a CAPTCHA barrier that blocks automated access, so a crawler will not work. Extracting one table (which becomes one row on the Excel sheet) involves entering a number on a portal page, which leads to a single additional page containing the table. That page loads fairly quickly (about 2 seconds). If you are working steadily, the CAPTCHA comes up approximately every 10 minutes and takes about 10 seconds to complete.

This work has to be done carefully. Your results will be reviewed periodically, and only results with an error rate below 0.1% will be accepted for payment. We suggest payment of $5 per 200 tables. There are 5,000 tables (pages) to mine in this project.
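
For bidders who want to check the numbers, the figures above can be worked through in a short Python sketch. It rests on assumptions the posting does not state: that the error rate is counted per table (rather than per cell or per field), and that "less than 0.1%" is a strict limit. The function names are illustrative only.

```python
from fractions import Fraction
import math

TOTAL_TABLES = 5000                 # from the posting
TABLES_PER_PAYMENT = 200            # from the posting
PAY_PER_BATCH_USD = 5               # suggested rate from the posting
MAX_ERROR_RATE = Fraction(1, 1000)  # 0.1%; results must stay strictly below this


def batches(total: int, per_batch: int) -> int:
    """Number of payment batches, rounding a partial batch up."""
    return -(-total // per_batch)


def total_pay(total: int, per_batch: int, pay_per_batch: int) -> int:
    """Total payout in dollars at the suggested rate."""
    return batches(total, per_batch) * pay_per_batch


def max_errors_allowed(n_tables: int, rate: Fraction) -> int:
    """Largest error count e such that e / n_tables is strictly below rate.

    Assumption: one wrong table counts as one error.
    """
    return math.ceil(rate * n_tables) - 1


if __name__ == "__main__":
    print("batches:", batches(TOTAL_TABLES, TABLES_PER_PAYMENT))
    print("total pay (USD):", total_pay(TOTAL_TABLES, TABLES_PER_PAYMENT, PAY_PER_BATCH_USD))
    print("max errors, whole project:", max_errors_allowed(TOTAL_TABLES, MAX_ERROR_RATE))
    print("max errors, one 200-table batch:", max_errors_allowed(TABLES_PER_PAYMENT, MAX_ERROR_RATE))
```

Under those assumptions, the whole project is 25 batches for $125 in total, with at most 4 erroneous tables across all 5,000. If the review is applied to each 200-table batch separately, no errors at all would fit under the threshold.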

Thanks! We look forward to reviewing your bids.
