I need a program written to extract certain text from multiple pages on a website. All of the pages have exactly the same layout. It can work in either of two ways: I enter a single URL and it returns the data, or I supply a .txt file with all of the URLs (one URL per line) and it batch processes them.
Each piece of data needs to be returned in tab-delimited form (.csv).
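To illustrate the two input modes described above, here is a minimal Python sketch. The function name load_urls is hypothetical, and it assumes the .txt file is UTF-8 text with one URL per line; blank lines are skipped.

```python
from pathlib import Path


def load_urls(source: str) -> list[str]:
    """Return the URLs to process.

    `source` is either a single URL or the path to a .txt file
    containing one URL per line (assumed UTF-8).
    """
    source = source.strip()
    # A single URL typed in directly is returned as a one-item list.
    if source.lower().startswith(("http://", "https://")):
        return [source]
    # Otherwise treat the argument as a file path and read it line by line.
    lines = Path(source).read_text(encoding="utf-8").splitlines()
    return [line.strip() for line in lines if line.strip()]
```

Either mode then feeds the same list into the rest of the program, so single-URL and batch runs share one code path.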
Take this URL, for instance:
http://www.rabbitsreviews.com/s9319/Elegant-Angel.html
I need this URL's information to be displayed EXACTLY as it appears in example-elegant-angel.xlsx
Some things to note:
Video Formats
Sometimes there is 1 format, sometimes 5. In this example, it has 3 formats. Each format needs its own column.
Independent Biller(s)
I need ONLY the names extracted, not the URLs. Each biller needs its own column, as in the example.
Customer Service
Each one needs its own column. An entry may be a phone number, an email address, or a URL. If it is an email address, I need the mailto address. If it is a URL, I need the actual link target and not the displayed text, because the displayed URL is cut off on the page.
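The link rules in the notes above (biller names only, mailto addresses, full link targets instead of truncated text) can be sketched in Python with the standard-library HTML parser. This is only an illustration: the real page markup is not shown here, so the names LinkCollector, collect_links, biller_name, and contact_value are hypothetical, and the sketch assumes each entry is either an `<a>` link or plain text such as a phone number.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (href, visible text) pairs for every <a> element."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None  # None means "not currently inside an <a>"
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href") or ""
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def collect_links(fragment: str):
    """Parse an HTML fragment and return its links as (href, text) pairs."""
    parser = LinkCollector()
    parser.feed(fragment)
    parser.close()
    return parser.links


def biller_name(href: str, text: str) -> str:
    """Billers: keep the visible name only and drop the URL."""
    return text


def contact_value(href: str, text: str) -> str:
    """Customer service: prefer the link target over the displayed text."""
    if href.lower().startswith(("mailto:", "http://", "https://")):
        return href  # full mailto address or untruncated URL
    return text  # plain text such as a phone number
```

Whether "the mailto address" should keep the `mailto:` prefix is ambiguous in the notes; this sketch keeps the href unchanged, and stripping the prefix would be a one-line change.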
**************************************************************************************************************************
You can either put each page's info on a separate ROW of a single Excel document (so there is only 1 Excel document), OR return each page's info in its own Excel document (so each URL has its own document).
As you should know, an Excel .csv file can also be opened as a .txt file, just in tab-delimited form. However you choose to handle that is fine. I just need each piece in its own column.
Here is an example of what the URLs will look like (whether I have to process them one at a time, or your program batch processes them all). There will be about 5,000 URLs to do.
You can click through these to get a better understanding of the subtle variations between pages.
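As a rough sketch of the one-row-per-page option, the Python below pads each variable-length group to a fixed number of columns so that every piece always lands in the same column, then writes tab-delimited rows. The column order and the limits MAX_BILLERS and MAX_CONTACTS are assumptions made for illustration (the real layout is whatever example-elegant-angel.xlsx shows); MAX_FORMATS = 5 is taken from the note that a page has between 1 and 5 formats.

```python
import csv

MAX_FORMATS = 5   # "sometimes 1 format, sometimes 5"
MAX_BILLERS = 3   # hypothetical limit, adjust to the real data
MAX_CONTACTS = 3  # hypothetical limit, adjust to the real data


def pad(values, width):
    """Pad a list with empty strings so it always fills `width` columns."""
    values = list(values)
    if len(values) > width:
        raise ValueError(f"{len(values)} values do not fit in {width} columns")
    return values + [""] * (width - len(values))


def build_row(site_name, formats, billers, contacts):
    """Assemble one row; the column order here is an assumed layout."""
    return (
        [site_name]
        + pad(formats, MAX_FORMATS)
        + pad(billers, MAX_BILLERS)
        + pad(contacts, MAX_CONTACTS)
    )


def write_rows(rows, out):
    """Write rows as tab-delimited text to an open file-like object."""
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerows(rows)
```

Opening the output with `open(path, "w", newline="", encoding="utf-8")` and passing it to write_rows gives a file that Excel or a text editor can read; writing one file per URL instead only changes where write_rows is called.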
http://www.rabbitsreviews.com/s9319/Elegant-Angel.html
http://www.rabbitsreviews.com/s8260/Movie-Box-.html
http://www.rabbitsreviews.com/s404/Videos-Z.html
http://www.rabbitsreviews.com/s41/Scoreland.html
http://www.rabbitsreviews.com/s2959/DDF-Busty.html
http://www.rabbitsreviews.com/s2702/Kelly-Madison.html
http://www.rabbitsreviews.com/s5490/Juggmaster.html
http://www.rabbitsreviews.com/s4968/Boobs-Garden.html
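With about 5,000 URLs, a batch run should keep going when a single page fails. The Python sketch below shows one possible way to do that. The names fetch_html and process_batch are hypothetical, `parse` stands in for whatever function turns a page's HTML into a row, and the one-second pause between requests is an assumed courtesy delay, not a requirement of the site.

```python
import time
import urllib.request


def fetch_html(url, timeout=30):
    """Download one page and decode it to text."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")


def process_batch(urls, parse, fetch=fetch_html, delay=1.0):
    """Fetch and parse every URL, collecting failures instead of stopping.

    `parse(url, html)` is a caller-supplied function that returns one row.
    Returns (rows, failures), where failures is a list of (url, message).
    """
    rows, failures = [], []
    for index, url in enumerate(urls):
        if index and delay:
            time.sleep(delay)  # pause between requests, not before the first
        try:
            rows.append(parse(url, fetch(url)))
        except Exception as exc:  # record the problem and move on
            failures.append((url, str(exc)))
    return rows, failures
```

The failures list can be written to a second file so that only the pages that failed need to be retried.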
If you have any questions, please feel free to ask.