Clickbank Xml Feed Mysql Cron

I need a php script that will get a 3.5MB .ZIP file from an outside url at Clickbank.com containing a 16 MB XML feed with approximately 10,000 unique items, unzip the file contents (1 .xml file and 1 .dtd file) and convert the data to and create/populate 2 MySQL tables.

This script will be run daily in CRON so no user intervention is allowed. It has to be fully automatic.

Please make sure to have error-checking in the script, so if something goes wrong in a preceeding step, NO database are TABLES emptied, but an e-mail is sent warning something went wrong.

ALL TABLES(2) in the database will be repopulated entirely each time the script is run.

For Each Record of the table “t_cb_raw” where “commission” is less than “40” THEN do NOT include that record in the newly created new table named “t_cb_noduplicates”, must be able to change this value within the script.
“t_cb_raw” is the Local CBMarketFeed DB after download of the CBMarketFeed.

t_cb_noduplicates-ftitle = tcbraw-Title
t_cb_noduplicates-fdesc = tcbraw-Description
t_cb_noduplicates-fvendorid = tcbraw-Id
t_cb_noduplicates-flink = http://”tcbraw-Id”.”tcbraw-Id”.hop.clickbank.net/
( Table t_cb_noduplicates-flink value needs to be UNIQUE, once first unique “tcbraw-Id” is encountered, no further duplicates allowed )
t_cb_noduplicates-popularityrank = tcbraw-PopularityRank
t_cb_noduplicates-recurring = tcbraw-HasRecurringProducts
t_cb_noduplicates-gravity = tcbraw-Gravity
t_cb_noduplicates-earnedpersale = tcbraw-EarnedPerSale
t_cb_noduplicates-percentpersale = tcbraw-PercentPerSale
t_cb_noduplicates-totalearningspersale = tcbraw-TotalEarningsPerSale
t_cb_noduplicates-totalrebillamount = tcbraw-TotalRebillAmt
t_cb_noduplicates-referred = tcbraw-Referred
t_cb_noduplicates-commission = tcbraw-Commission
t_cb_noduplicates-added = tcbraw-ActiveDate
t_cb_noduplicates-fcat = tcbraw-Name
t_cb_noduplicates-ftime = time() ; The time that the data was retrieved (seconds since 1970)

“CREATE TABLE t_cb_noduplicates (“.
” fid bigint(20) NOT NULL auto_increment,”.
” ftitle varchar(255) NOT NULL,”.
” fdesc longtext NOT NULL,”.
” fvendorID varchar(32) NOT NULL,”.
” flink varchar(255) NOT NULL,”.
” fpopularityrank INT NOT NULL,”.
” frecurring BOOL NOT NULL,”.
” fgravity FLOAT NOT NULL DEFAULT ‘0’,”.
” fearnedpersale FLOAT NOT NULL DEFAULT ‘0’,”.
” fpercentpersale FLOAT NOT NULL DEFAULT ‘0’,”.
” ftotalearningspersale FLOAT NOT NULL DEFAULT ‘0’,”.
” ftotalrebillamount FLOAT NOT NULL DEFAULT ‘0’,”.
” freferred FLOAT NOT NULL DEFAULT ‘0’,”.
” fcommission INT NOT NULL,”.
” fadded TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,”.
” fcat varchar(32) NOT NULL,”.
” ftime int(11) DEFAULT ‘0’ NOT NULL,”.
” PRIMARY KEY (fid),”.
” UNIQUE fid (fid),”.
” KEY fhtime (fhtime),”.
” KEY fcat (fcat)”.
” );” ;

On successful completion of each step the BEGIN & END EVENT will be logged.

On JOB Completion -OR- if there is an ERROR an e-mail is sent containing the results of the BEGIN & END EVENT LOG showing Steps Completed, and in case of ERROR, the ERROR.send an email showing the start and end times for the process (from downloading the new feed, to when the table was finished updating with the new data) as follows;

Example LogFile/Process Email:
2010-09-15 10:04pm started download clickbank feed
2010-09-15 10:09pm finished download clickbank feed
2010-09-15 10:10pm started parse clickbank feed
2010-09-15 10:14pm finished parse clickbank feed
2010-09-15 10:14pm started replace of tcbraw table
2010-09-15 10:29pm finished replace of tcbraw table
2010-09-15 10:29pm started query-replace of t_cb_noduplicates table
2010-09-15 10:35pm finished query-replace of t_cb_noduplicates table

Find the clickbank xml feed zip file at http://www.clickbank.com/feeds/marketplace_feed_v1.xml.zip

Platform: Linux-based web server

The PHP script must run on a Linux-based web server CRON using the following:

PHP version 5.2.14
MySQL version 5.1.48
eXtend 2.0 Control Panel

I will be running the script with a cron job, but it must also be able to be fired off by hitting it with a browser.

Leave a Reply

Your email address will not be published. Required fields are marked *