I am looking for a vertical search engine with a simple
three-column page that contains:
1- A web script that scrapes the news from 15 (Arabic) newspapers
automatically every morning and stores it in the SQL database on the server,
using open source Lucene/Nutch 1.2 or any script you are experienced with.
2- A powerful control panel.
3- A search box to search the data stored in the SQL database.
The search results will show the latest news first, each with:
A- The title, with a link to the original source (opens in a new window).
B- The scrape date.
C- A description (about 200 characters).
D- The search keyword highlighted.
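
To make item 3 more concrete, here is a rough sketch of the search behaviour I have in mind (latest news first, a description of about 200 characters, highlighted keyword, link opening in a new window). It is only an illustration in Python with SQLite; the table name `news`, its columns, and the function names are assumptions made for the example, not a required design, and a Lucene/Nutch-based solution would look different.

```python
import html
import re
import sqlite3

# Hypothetical schema: scraped_at is stored as ISO text (YYYY-MM-DD)
# so that sorting it as a string also sorts it by date.
SCHEMA = """CREATE TABLE IF NOT EXISTS news (
    id INTEGER PRIMARY KEY,
    title TEXT NOT NULL,
    url TEXT NOT NULL,
    scraped_at TEXT NOT NULL,
    body TEXT NOT NULL
)"""


def make_snippet(body, keyword, width=200):
    """Return about `width` characters of body, centred on the first match."""
    match = re.search(re.escape(keyword), body, flags=re.IGNORECASE) if keyword else None
    start = max(0, match.start() - width // 2) if match else 0
    return body[start:start + width]


def highlight(text, keyword):
    """HTML-escape text and wrap every occurrence of keyword in <mark>."""
    if not keyword:
        return html.escape(text)
    parts = re.split(f"({re.escape(keyword)})", text, flags=re.IGNORECASE)
    # With one capture group, odd indices hold the matched keyword.
    return "".join(
        f"<mark>{html.escape(part)}</mark>" if i % 2 else html.escape(part)
        for i, part in enumerate(parts)
    )


def search_news(conn, keyword, limit=20):
    """Return matching articles, newest first, ready for display."""
    # Escape LIKE wildcards so the keyword is matched literally.
    escaped = keyword.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    pattern = f"%{escaped}%"
    rows = conn.execute(
        "SELECT title, url, scraped_at, body FROM news "
        "WHERE title LIKE ? ESCAPE '\\' OR body LIKE ? ESCAPE '\\' "
        "ORDER BY scraped_at DESC LIMIT ?",
        (pattern, pattern, limit),
    ).fetchall()
    results = []
    for title, url, scraped_at, body in rows:
        link = (f'<a href="{html.escape(url, quote=True)}" target="_blank" '
                f'rel="noopener">{highlight(title, keyword)}</a>')
        results.append({
            "title": title,
            "link_html": link,            # A- title linking to the source
            "scraped_at": scraped_at,     # B- scrape date
            "snippet_html": highlight(make_snippet(body, keyword), keyword),  # C, D
        })
    return results
```

A plain `LIKE` query is used here only to keep the sketch short; a full-text index (Lucene, or the database's own full-text search) would be the more realistic choice for 15 newspapers of daily content.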
If possible, the spider should be able to exclude parts of the pages by selector:
(div id) (div class) (span id) (span class) … etc.
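
As an illustration of the exclusion I mean, below is a minimal sketch using only Python's standard `html.parser`. The class and function names (`FilteredTextExtractor`, `extract_text`) and the example ids and classes are hypothetical. It assumes reasonably well-formed HTML, because it counts opening and closing tags to know when an excluded block ends; a real spider would likely rely on a more forgiving parser or on Nutch's own parsing plugins.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class FilteredTextExtractor(HTMLParser):
    """Collect page text, skipping elements with an excluded id or class."""

    def __init__(self, exclude_ids=(), exclude_classes=()):
        super().__init__(convert_charrefs=True)
        self.exclude_ids = set(exclude_ids)
        self.exclude_classes = set(exclude_classes)
        self.skip_depth = 0   # > 0 while inside an excluded element
        self.chunks = []

    def _is_excluded(self, tag, attrs):
        attrs = dict(attrs)
        classes = set((attrs.get("class") or "").split())
        return (
            tag in ("script", "style")              # never part of the article text
            or attrs.get("id") in self.exclude_ids
            or bool(classes & self.exclude_classes)
        )

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self.skip_depth:
            self.skip_depth += 1                    # nested inside an excluded block
        elif self._is_excluded(tag, attrs):
            self.skip_depth = 1                     # entering an excluded block

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and not self.skip_depth:
            self.chunks.append(text)


def extract_text(page_html, exclude_ids=(), exclude_classes=()):
    """Return the visible text of page_html without the excluded parts."""
    parser = FilteredTextExtractor(exclude_ids, exclude_classes)
    parser.feed(page_html)
    parser.close()
    return " ".join(parser.chunks)
```

For example, `extract_text(page, exclude_ids={"share"}, exclude_classes={"ad"})` would drop any element with `id="share"` or with `ad` among its classes, together with everything nested inside it. Ideally the control panel would let me set these exclusion lists per newspaper.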
If you have done a similar project before, I would
appreciate it if you could show it to me.