E-Mail Extractor is a web spider (PHP script) that crawls the web and extracts e-mail addresses from webpages. You just enter the start webpage URL, select a crawling mode, set the maximum number of URLs to crawl and launch the spider. It then walks the web and collects e-mail addresses.
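As a rough illustration of the extraction step, the sketch below pulls e-mail addresses out of a page's HTML with a regular expression. The function name extract_emails and the pattern are assumptions made for this example, not code taken from the script, and the pattern is deliberately simple rather than a full address validator.

```php
<?php
// Hypothetical helper (not taken from the script): collect the unique
// e-mail addresses found in a page's HTML using a simple pattern.
function extract_emails($html)
{
    // Decode entities such as &#64; so an encoded "@" is matched too.
    $text = html_entity_decode($html, ENT_QUOTES, 'UTF-8');

    // Local part, "@", domain labels, then a TLD of at least two letters.
    $pattern = '/[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/';
    preg_match_all($pattern, $text, $matches);

    // Lower-case and de-duplicate; array_values() reindexes from 0.
    return array_values(array_unique(array_map('strtolower', $matches[0])));
}

$html = '<a href="mailto:Info@Example.com">Info@Example.com</a> or sales&#64;example.org';
print_r(extract_emails($html)); // info@example.com and sales@example.org
```

Lower-casing before de-duplication treats Info@Example.com and info@example.com as one address, which is usually what a collected list needs.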
Features
- Start webpage URL: spider starts working with this page.
- 2 crawling modes: same domain URLs only and all URLs.
- Maximum number of URLs: the spider can crawl either a limited or an unlimited number of URLs.
- Crawling statistics: list of URLs, e-mails and error log.
- AJAX-ed interface: modern jQuery-driven interface.
- Clean code: clean PHP and JavaScript code can be used for study purposes.
- CURL and fsockopen supported: spider can work through either fsockopen or CURL.
- Easy termination and resumption: close the browser to stop the spider, open the resumption URL to continue crawling.
- Easy to install: edit inc/config.php.
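The two crawling modes and the URL limit can be pictured with the sketch below. The function names is_same_domain and select_urls, and the convention that a limit of 0 means "unlimited", are assumptions made for this example. The script's own implementation may differ. The sketch also assumes a PHP version where parse_url() accepts a component argument (5.1.2 or later).

```php
<?php
// Hypothetical sketch of the "same domain URLs only" mode: keep a link
// only if its host matches the host of the start page.
function is_same_domain($startUrl, $candidateUrl)
{
    $startHost = parse_url($startUrl, PHP_URL_HOST);
    $candidateHost = parse_url($candidateUrl, PHP_URL_HOST);
    if (!$startHost || !$candidateHost) {
        return false; // relative or malformed URLs are not handled here
    }
    // Host names are case-insensitive, so compare them lower-cased.
    return strtolower($startHost) === strtolower($candidateHost);
}

// Hypothetical queue filter: apply the crawling mode and the URL limit.
// In this sketch $maxUrls = 0 stands for "unlimited".
function select_urls($startUrl, array $links, $sameDomainOnly, $maxUrls)
{
    $selected = array();
    foreach ($links as $link) {
        if ($maxUrls > 0 && count($selected) >= $maxUrls) {
            break; // limit reached
        }
        if ($sameDomainOnly && !is_same_domain($startUrl, $link)) {
            continue; // skip links that leave the start domain
        }
        $selected[] = $link;
    }
    return $selected;
}
```

For example, select_urls('http://www.website.com/', $links, true, 100) would keep at most 100 links that stay on www.website.com, while passing false and 0 would keep every link.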
Limitations
- Spider doesn’t extract e-mails from images.
- Spider doesn’t extract e-mails from password-protected areas of websites.
- Some websites may block spiders.
- Make sure that using spiders does not violate your hosting provider’s TOS/TAC.
Requirements
- PHP version 5.0 or greater
- MySQL version 5.0 or greater
Installation
Let’s imagine that you have the website http://www.website.com/ and you want to install the script there.
- Create the folder email-extractor (or use any other name) in the root of your domain. Once created, it can be reached at the URL: http://www.website.com/email-extractor/
- Make sure that the folder email-extractor has permissions 0755; index.php and ajax.php have permissions 0644.
- Edit inc/config.php and set the MySQL database parameters.
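If shell or FTP access is inconvenient, the permissions from the second step could also be set from PHP. The helper below is a minimal sketch: the name apply_permissions is hypothetical, and it assumes the PHP process owns the folder and files, since chmod() fails otherwise.

```php
<?php
// Hypothetical helper: apply the permissions from the installation steps
// (0755 for the folder, 0644 for index.php and ajax.php).
function apply_permissions($dir)
{
    // Octal literals: a leading 0 is required, 755 would be wrong.
    $ok = chmod($dir, 0755);
    foreach (array('index.php', 'ajax.php') as $file) {
        $path = $dir . '/' . $file;
        if (is_file($path)) {
            $ok = chmod($path, 0644) && $ok;
        }
    }
    return $ok; // true only if every chmod() call succeeded
}

// Example call (path is an assumption for illustration):
// apply_permissions('/path/to/webroot/email-extractor');
```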