E-Mail Extractor is a web spider (PHP script) that crawls the web and extracts e-mail addresses from webpages. You just enter a start webpage URL, select a crawling mode, set the maximum number of URLs to crawl, and launch the spider. It then walks the web and collects e-mail addresses.
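The extraction step can be illustrated with a short sketch. This is written in Python for illustration only and is not the script's own PHP code; the pattern `EMAIL_RE` and the function `extract_emails` are hypothetical names, and the regular expression is deliberately simplified, so it will miss some valid addresses.

```python
import re

# Simplified pattern for illustration; it does not cover every valid address.
EMAIL_RE = re.compile(
    r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)*\.[A-Za-z]{2,}"
)

def extract_emails(page_text):
    """Return unique addresses found in page_text, in order of first appearance."""
    seen = set()
    found = []
    for match in EMAIL_RE.findall(page_text):
        address = match.lower()  # treat Info@... and info@... as the same address
        if address not in seen:
            seen.add(address)
            found.append(address)
    return found

sample = '<a href="mailto:Info@example.com">info@example.com</a> or sales@example.org.'
print(extract_emails(sample))
```

Lowercasing before deduplication is a design choice of this sketch, so that the same address found in a link and in the visible text is reported once.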
Features
- Start webpage URL: the spider starts crawling from this page.
- Two crawling modes: same-domain URLs only, or all URLs.
- Maximum number of URLs: the spider can crawl either a limited or an unlimited number of URLs.
- Crawling statistics: list of URLs, e-mails and error log.
- AJAX-ed interface: modern jQuery-driven interface.
- Clean code: clean PHP and JavaScript code can be used for study purposes.
- CURL and fsockopen supported: spider can work through either fsockopen or CURL.
- Easy termination and resumption: close the browser to stop the spider, open the resumption URL to continue crawling.
- Easy to install: edit inc/config.php.
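How the start URL, the two crawling modes, and the URL limit interact can be sketched as a breadth-first walk. This is a Python illustration under stated assumptions, not the script's PHP implementation: `crawl` and `get_links` are hypothetical names, "same domain" is assumed to mean an identical host name (so subdomains count as different), and a toy in-memory link graph stands in for real HTTP requests.

```python
from collections import deque
from urllib.parse import urlparse

def crawl(start_url, get_links, same_domain_only=True, max_urls=None):
    """Breadth-first walk from start_url; returns the URLs visited, in order.

    get_links is a caller-supplied function (hypothetical) mapping a URL to
    the list of absolute URLs found on that page. max_urls=None means no limit.
    """
    start_host = urlparse(start_url).netloc
    queue = deque([start_url])
    seen = {start_url}
    visited = []
    while queue and (max_urls is None or len(visited) < max_urls):
        url = queue.popleft()
        visited.append(url)
        for link in get_links(url):
            if same_domain_only and urlparse(link).netloc != start_host:
                continue  # "same domain" mode skips external hosts
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return visited

# Toy in-memory link graph standing in for real HTTP fetches.
LINKS = {
    "http://www.website.com/": ["http://www.website.com/about", "http://other.example/"],
    "http://www.website.com/about": ["http://www.website.com/"],
    "http://other.example/": ["http://other.example/contact"],
}

def toy_get_links(url):
    return LINKS.get(url, [])

same_domain = crawl("http://www.website.com/", toy_get_links)
all_urls = crawl("http://www.website.com/", toy_get_links,
                 same_domain_only=False, max_urls=3)
print(same_domain)
print(all_urls)
```

In same-domain mode the walk never leaves www.website.com; in all-URLs mode it follows the external link and stops once three URLs have been visited.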
Limitations
- Spider doesn’t extract e-mails from images.
- Spider doesn’t extract e-mails from password-protected areas of websites.
- Some websites may block spiders.
- Make sure that running spiders does not violate your hosting provider's TOS/TAC.
Requirements
- PHP version 5.0 or greater
- MySQL version 5.0 or greater
Installation
Let’s imagine that you have a website at http://www.website.com/ and you want to install the script there.
- Create a folder email-extractor (or use any other name) in the root of your domain. Once created, it can be reached at the URL: http://www.website.com/email-extractor/
- Make sure that the folder email-extractor has permissions 0755; index.php and ajax.php have permissions 0644.
- Edit inc/config.php and set the MySQL database parameters.
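The permission step above can be checked with a small sketch. It assumes a POSIX host and uses Python only for illustration; in practice the same modes are usually set with chmod or an FTP client. The helper names `apply_permissions` and `mode_of` are hypothetical, and the demonstration runs in a throwaway directory with empty placeholder files rather than a real web root.

```python
import os
import stat
import tempfile

def apply_permissions(install_dir):
    """Set 0755 on the folder and 0644 on index.php and ajax.php."""
    os.chmod(install_dir, 0o755)
    for name in ("index.php", "ajax.php"):
        os.chmod(os.path.join(install_dir, name), 0o644)

def mode_of(path):
    """Return the permission bits of path as an octal string such as '0755'."""
    return format(stat.S_IMODE(os.stat(path).st_mode), "04o")

# Demonstration in a throwaway directory with empty placeholder files.
with tempfile.TemporaryDirectory() as web_root:
    install_dir = os.path.join(web_root, "email-extractor")
    os.makedirs(install_dir)
    for name in ("index.php", "ajax.php"):
        open(os.path.join(install_dir, name), "w").close()
    apply_permissions(install_dir)
    folder_mode = mode_of(install_dir)
    file_modes = [mode_of(os.path.join(install_dir, n))
                  for n in ("index.php", "ajax.php")]

print(folder_mode, file_modes)
```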