We are looking for a PHP web scraper script
that will scan the product pages of several websites, extract the data, and insert or update it in a MySQL DB.
The scraper shall perform the following steps once a day as a cron job:
1. Scan the site and prepare a list of all its product names or SKUs.
2. Compare the list with the existing DB.
3. Add new products to the DB.
4. Download all product pictures into one directory, naming each picture after the SKU number and the picture number.
5. Update the DB fields that have changed for each existing product.
6. Translate all text fields into six languages with Google Translate and add the translations to the DB.
7. Change the IP address or proxy server on a daily basis, in order to avoid being blocked by the scanned site's administrator.
8. Generate a category code for each new category.
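As a rough illustration of steps 2, 3 and 5, the comparison between the scraped list and the existing DB could look like the sketch below. The function name diffProducts and the array shapes (both arrays keyed by SKU, each value an array of fields) are assumptions made for this example, not part of the requirements. Values should be normalized to the same type (for example strings) before they are compared.

```php
<?php
declare(strict_types=1);

/**
 * Split freshly scraped products into new, changed and unchanged SKUs.
 *
 * Hypothetical helper: $scraped and $existing are both keyed by SKU,
 * and each value is an associative array of field name => value.
 */
function diffProducts(array $scraped, array $existing): array
{
    $result = ['new' => [], 'changed' => [], 'unchanged' => []];

    foreach ($scraped as $sku => $fields) {
        if (!array_key_exists($sku, $existing)) {
            $result['new'][$sku] = $fields;      // step 3: insert as a new product
            continue;
        }

        // Keep only the fields whose scraped value differs from the stored one.
        $delta = [];
        foreach ($fields as $name => $value) {
            if (!array_key_exists($name, $existing[$sku]) || $existing[$sku][$name] !== $value) {
                $delta[$name] = $value;
            }
        }

        if ($delta !== []) {
            $result['changed'][$sku] = $delta;   // step 5: update only these fields
        } else {
            $result['unchanged'][] = (string) $sku;
        }
    }

    return $result;
}

// Example call with made-up data:
$diff = diffProducts(
    ['A1' => ['title' => 'Lamp', 'price' => '9.99'], 'B2' => ['title' => 'Cable', 'price' => '1.50']],
    ['A1' => ['title' => 'Lamp', 'price' => '8.99']]
);
// $diff['new'] contains B2; $diff['changed'] is ['A1' => ['price' => '9.99']].
```

Returning only the changed fields keeps the daily UPDATE statements small and makes it easy to set the "updated" timestamp only when something really changed.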
The DB should contain the following fields:
Category name
Category code
Category parent code
Product Title
SKU
Price (USD)
Quantity
Shipping costs (free shipping or not)
Overview / Description (text)
Specification
Dimensions
Shipping Weight
Related products list
Status (in stock, out of stock)
Delivery time (number of days until delivery, if provided by the site)
Product original web page link
Date/time added to the DB
Date/time of the last update in the DB
Total number of small pictures
Total number of large pictures
Small pictures path
Large pictures path
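One possible way to map the fields above to MySQL tables is sketched below. All table names, column names, types and lengths are assumptions chosen for illustration. The source_site column is an addition that is not in the field list; it is there only because SKUs from different sites might collide. The translated text fields from step 6 are not covered here.

```php
<?php
declare(strict_types=1);

/**
 * Return CREATE TABLE statements for a possible MySQL layout.
 * Every identifier and type below is an assumption made for this sketch.
 */
function schemaStatements(): array
{
    $categories = <<<'SQL'
CREATE TABLE IF NOT EXISTS categories (
    category_code        VARCHAR(32)  NOT NULL PRIMARY KEY,
    category_name        VARCHAR(255) NOT NULL,
    category_parent_code VARCHAR(32)  NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4
SQL;

    $products = <<<'SQL'
CREATE TABLE IF NOT EXISTS products (
    source_site      VARCHAR(64)   NOT NULL,
    sku              VARCHAR(64)   NOT NULL,
    category_code    VARCHAR(32)   NOT NULL,
    title            VARCHAR(512)  NOT NULL,
    price_usd        DECIMAL(10,2) NOT NULL,
    quantity         INT           NULL,
    free_shipping    TINYINT(1)    NOT NULL DEFAULT 0,
    description      TEXT          NULL,
    specification    TEXT          NULL,
    dimensions       VARCHAR(128)  NULL,
    shipping_weight  VARCHAR(64)   NULL,
    related_products TEXT          NULL,
    status           ENUM('in stock', 'out of stock') NOT NULL,
    delivery_days    SMALLINT      NULL,
    source_url       VARCHAR(2048) NOT NULL,
    added_at         DATETIME      NOT NULL,
    updated_at       DATETIME      NULL,
    small_pic_count  SMALLINT      NOT NULL DEFAULT 0,
    large_pic_count  SMALLINT      NOT NULL DEFAULT 0,
    small_pic_path   VARCHAR(512)  NULL,
    large_pic_path   VARCHAR(512)  NULL,
    PRIMARY KEY (source_site, sku)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4
SQL;

    return [$categories, $products];
}
```

With an open PDO connection in a variable such as $pdo, the tables could be created by looping over schemaStatements() and passing each statement to $pdo->exec().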
All pictures shall be saved to one directory, with each file named after the SKU and the picture number.
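A minimal sketch of such a naming rule is shown below. The function name pictureFileName, the underscore separator, the two-digit picture number and the replacement of unsafe characters are all assumptions; the requirement only says that the name is based on the SKU and the picture number.

```php
<?php
declare(strict_types=1);

/**
 * Build a picture file name from the SKU and the picture number.
 * Hypothetical helper; the exact format is an assumption.
 */
function pictureFileName(string $sku, int $number, string $extension = 'jpg'): string
{
    if ($number < 1) {
        throw new InvalidArgumentException('Picture numbers start at 1.');
    }

    // Replace every run of characters other than letters, digits, dot,
    // underscore and hyphen, so the SKU is safe to use in a file name.
    $safeSku = preg_replace('/[^A-Za-z0-9._-]+/', '_', $sku);

    return sprintf('%s_%02d.%s', $safeSku, $number, strtolower($extension));
}

echo pictureFileName('SKU 12345', 3, 'JPG'), PHP_EOL; // prints SKU_12345_03.jpg
```

If small and large pictures end up in the same directory, a size marker would have to be added to the name as well so that the two sets do not overwrite each other.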
The script will be executed on the following sites:
http://www.dealextreme.com/
http://www.szprice.com/
http://www.sw-box.com/
http://www.everbuying.com/ – only free shipping products
http://www.emixt.com/
http://www.dhgate.com/ – only free shipping products
Each site should have its own script file, so that scanned sites can easily be added and removed.
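One way to get this one-file-per-site layout is sketched below: a shared interface plus a loader that picks up every PHP file in a directory. The names SiteScraper, siteName, scrapeProducts and loadScrapers, and the idea that each site file returns an object, are assumptions made for this example.

```php
<?php
declare(strict_types=1);

/** Contract that every per-site script would implement (hypothetical). */
interface SiteScraper
{
    public function siteName(): string;

    /** @return array<string, array<string, mixed>> products keyed by SKU */
    public function scrapeProducts(): array;
}

/**
 * Load every scraper found in $directory.
 *
 * Each *.php file there is expected to return a SiteScraper instance,
 * for example a file sites/dealextreme.php containing:
 *
 *     return new class implements SiteScraper {
 *         public function siteName(): string { return 'dealextreme'; }
 *         public function scrapeProducts(): array { return []; }
 *     };
 *
 * Files that return anything else are skipped.
 */
function loadScrapers(string $directory): array
{
    $scrapers = [];

    foreach (glob(rtrim($directory, '/') . '/*.php') ?: [] as $file) {
        $scraper = require $file;
        if ($scraper instanceof SiteScraper) {
            $scrapers[$scraper->siteName()] = $scraper;
        }
    }

    return $scrapers;
}
```

With this layout, adding a site means dropping one more file into the directory, and removing a site means deleting its file; the cron entry point only has to loop over loadScrapers().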