Scraping Of Website Data W/ Java Into Text Files

What we need:
A data extraction guru. Someone who has expertise in screen scraping and parsing, from static pages and dynamic forms. We need someone with the ability to (a) dynamically navigate to the “account” pages of an airline website, and (b) extract the data we need and save the text into a file on our local directory.

Note: we need the script to automate the scraping rather than rely on manual cut and paste. We have already accomplished this for some of our accounts, but we want to find a reliable teammate (hopefully you) who can do this for future accounts.

Our Goal:
A working script for one website account and, hopefully, a long-term relationship with you to extend the code to handle other website accounts. Once we have worked together on this first trial project, we would like to continue paying you for many similar and more advanced scraping projects.

What we will give you:
A login and password for a travel website. Details of the website to be scraped will be given to the winning bidder.

What the script must do:
We need a script that can log in to an airline website and pull user data. The script will then need to navigate to the “account” pages on the site, the pages that include key data like name, number of recent trips, etc. The script should then scrape each page and output the data into our local directory as a simple text dump of those HTML pages. Please send any questions via PM.
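To make the login-then-navigate step concrete, here is a minimal sketch using the standard `java.net.http.HttpClient` (Java 11+). Everything site-specific is an assumption: the class name `AccountFetcher`, the form field names `username` and `password`, and the idea that the site accepts a plain form POST are placeholders, since the actual website details are only given to the winning bidder. A site that builds its forms with JavaScript would likely need a different approach.

```java
import java.io.IOException;
import java.net.CookieManager;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

final class AccountFetcher {

    // Encode form fields as application/x-www-form-urlencoded.
    static String formEncode(Map<String, String> fields) {
        return fields.entrySet().stream()
            .map(e -> URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
                + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8))
            .collect(Collectors.joining("&"));
    }

    // A client that keeps cookies, so the session set at login
    // is sent again when the account page is requested.
    static HttpClient newSessionClient() {
        return HttpClient.newBuilder()
            .cookieHandler(new CookieManager())
            .followRedirects(HttpClient.Redirect.NORMAL)
            .build();
    }

    // Log in, then fetch one account page and return its raw HTML.
    // loginUri, accountUri and the field names are hypothetical.
    static String loginAndFetch(HttpClient client, URI loginUri, URI accountUri,
                                String user, String password)
            throws IOException, InterruptedException {
        Map<String, String> form = new LinkedHashMap<>();
        form.put("username", user);     // assumed field name
        form.put("password", password); // assumed field name

        HttpRequest login = HttpRequest.newBuilder(loginUri)
            .header("Content-Type", "application/x-www-form-urlencoded")
            .POST(HttpRequest.BodyPublishers.ofString(formEncode(form)))
            .build();
        HttpResponse<String> loginResponse =
            client.send(login, HttpResponse.BodyHandlers.ofString());
        if (loginResponse.statusCode() >= 400) {
            throw new IOException("Login failed with HTTP " + loginResponse.statusCode());
        }

        HttpRequest account = HttpRequest.newBuilder(accountUri).GET().build();
        return client.send(account, HttpResponse.BodyHandlers.ofString()).body();
    }
}
```

The cookie-keeping client is the important design choice here: without it, the second request would arrive without the session established by the login.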

Coding Method:
We want this coded in Java – and we are interested in working with someone who is confident in doing this.

We need the script to run on the command line, where we (the admin) will enter a username and password, and the script returns all the data we’ve requested. When we run the script, it should create a directory on our local computer and dump the text version of the HTML pages into that directory.
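As a rough illustration of the command-line behaviour described above, the sketch below takes a username and password as arguments, creates an output directory, and writes a text version of each page into it. The class name `AccountDump`, the directory name `scrape-output`, and the `.txt` file naming are assumptions. `fetchAccountPages` is a hypothetical placeholder that returns canned HTML so the example runs without a network, and `htmlToText` is a naive regex-based tag stripper, not a real HTML parser.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;

final class AccountDump {

    // Naive HTML-to-text conversion: drop script/style blocks, strip tags,
    // decode a few common entities, and collapse whitespace.
    static String htmlToText(String html) {
        return html
            .replaceAll("(?is)<(script|style)[^>]*>.*?</\\1>", " ")
            .replaceAll("<[^>]+>", " ")
            .replace("&nbsp;", " ")
            .replace("&lt;", "<")
            .replace("&gt;", ">")
            .replace("&amp;", "&")
            .replaceAll("\\s+", " ")
            .trim();
    }

    // Create the directory if needed and write <name>.txt into it.
    static Path dumpPage(Path dir, String name, String html) throws IOException {
        Files.createDirectories(dir);
        Path file = dir.resolve(name + ".txt");
        Files.writeString(file, htmlToText(html));
        return file;
    }

    // Hypothetical placeholder: a real version would log in and fetch pages.
    static Map<String, String> fetchAccountPages(String user, String password) {
        return Map.of("account",
            "<html><body><h1>Account</h1><p>Name: " + user + "</p></body></html>");
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("Usage: java AccountDump <user> <password>");
            System.exit(1);
        }
        Path outputDir = Path.of("scrape-output"); // assumed directory name
        for (Map.Entry<String, String> page
                : fetchAccountPages(args[0], args[1]).entrySet()) {
            Path written = dumpPage(outputDir, page.getKey(), page.getValue());
            System.out.println("Wrote " + written);
        }
    }
}
```

With Java 11 or later this could be run directly as `java AccountDump.java someuser somepassword`, which would leave a single `account.txt` file inside `scrape-output`.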

We expect this first project to be a quick one, requiring less than 15 hours of work. A reasonable bid, good feedback, and experience will be the deciding factors in the selection. If this project seems difficult or scraping is new to you, it is probably best to skip our project. Otherwise, we are an excellent team looking for a long-term partner who can be our scraping ninja.
