Useful snippets to protect your WordPress blog against scrapers

Force your WordPress blog to break out of frames

Some scrapers display your blog in a frame to take advantage of your content, and show their own ads in another frame to make a few bucks. This code forces your blog to break out of the frames, so visitors see only your blog, not the scraper's site.

Just paste the code below into your functions.php file, save it, and you’re done.

// Break Out of Frames for WordPress
function break_out_of_frames() {
	if (!is_preview()) {
		echo "\n<script type=\"text/javascript\">";
		echo "\n<!--";
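		// Compare top with self: counting parent.frames would also fire on unframed pages that embed their own iframes.
		echo "\nif (window.top !== window.self) { window.top.location.href = window.self.location.href; }";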
		echo "\n-->";
		echo "\n</script>\n\n";
	}
}
add_action('wp_head', 'break_out_of_frames');

Source: http://wp-mix.com/break-out-of-frames-wordpress/
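
The JavaScript check only helps when scripts run in the visitor's browser. As a complement, the server can ask browsers not to render the page inside another site's frame by sending the X-Frame-Options header. The sketch below is not part of the original snippet: the myblog_ function names are hypothetical, and the hook is only registered when WordPress's add_action() is available. Drop the opening <?php tag when pasting into an existing functions.php file.

```php
<?php
// Hypothetical helper: the header line asking browsers to refuse framing by other origins.
function myblog_frame_header() {
	return 'X-Frame-Options: SAMEORIGIN';
}

// Send the header unless output has already started.
function myblog_send_frame_header() {
	if (!headers_sent()) {
		header(myblog_frame_header());
	}
}

// Register on WordPress's send_headers action when running inside WordPress.
if (function_exists('add_action')) {
	add_action('send_headers', 'myblog_send_frame_header');
}
```

SAMEORIGIN still lets your own site frame its pages; whether that is the right policy depends on how your theme and plugins use frames.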

Protect your blog against image hotlinking

Most scrapers simply republish your RSS feed on their site, which means they also hotlink your original images and consume your server bandwidth for their own websites. You can turn this against them by serving a replacement image that tells readers the article was stolen from another blog.

Create a small image saying something like “This article has been stolen from www.yoursite.com” and upload it to your blog server. Then edit your .htaccess file (located in your WordPress root directory) and append this code to it:

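RewriteEngine On
# Replace mysite\.com with your blog's domain
RewriteCond %{HTTP_REFERER} !^https?://(.+\.)?mysite\.com/ [NC]
# Let requests with an empty referer through (direct visits, some proxies)
RewriteCond %{HTTP_REFERER} !^$
# Never rewrite the replacement image itself
RewriteCond %{REQUEST_URI} !/images/nohotlink\.jpg$
# Replace /images/nohotlink.jpg with your "don't hotlink" image URL
RewriteRule .*\.(jpe?g|gif|bmp|png)$ /images/nohotlink.jpg [L]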

Source: http://www.wprecipes.com/how-to-protect-your-wordpress-blog-from-hotlinking
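
To make the logic of the two referer conditions easier to follow, here is a rough PHP equivalent. It is only an illustration of the decision the rewrite rules make, not a replacement for them; the function name myblog_is_hotlink and the sample domains are hypothetical.

```php
<?php
// Hypothetical helper mirroring the .htaccess conditions:
// returns true when an image request should get the replacement image.
function myblog_is_hotlink($referer, $own_domain) {
	// An empty referer is allowed through (direct visits, some proxies).
	if ($referer === null || $referer === '') {
		return false;
	}
	$host = parse_url($referer, PHP_URL_HOST);
	// A referer with no parsable host is treated as foreign.
	if (!is_string($host)) {
		return true;
	}
	$host = strtolower($host);
	$own = strtolower($own_domain);
	// The domain itself is allowed.
	if ($host === $own) {
		return false;
	}
	// Subdomains such as www.mysite.com are allowed too.
	$suffix = '.' . $own;
	return substr($host, -strlen($suffix)) !== $suffix;
}

var_dump(myblog_is_hotlink('https://www.mysite.com/post/', 'mysite.com')); // bool(false)
var_dump(myblog_is_hotlink('http://scraper.example/page', 'mysite.com'));  // bool(true)
```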

Automatically add a link to your post title

As the majority of content thieves use automatic scraping tools, they’ll scrape all of your content, including the post title. A good way to discourage scrapers is to wrap each post title in a link, so every stolen post automatically links back to your original.

To do so in WordPress, open your single.php file and locate where the title is displayed. Then replace that code with the following:

<h1>
  <a href="<?php the_permalink(); ?>"><?php the_title(); ?></a>
</h1>

Source: http://www.catswhoblog.com/how-to-protect-your-blog-from-content-thieves

Automatically add a link to your original posts using RSS feed

Another useful way to fight back against content theft is to automatically insert a copyright notice with a backlink to the original post in each RSS item. That way, scrapers who automatically republish your RSS feed on their own sites will also publish your copyright notice and backlink!

Simply add the code below to your functions.php file. The copyright notice can be customized on line 4.

// add custom feed content
function add_feed_content($content) {
	if(is_feed()) {
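		// get_bloginfo() returns the name; bloginfo() would echo it immediately instead of appending it.
		$content .= '<p>This article is copyright &copy; '.date('Y').'&nbsp;'.get_bloginfo('name').'</p>';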
	}
	return $content;
}
add_filter('the_excerpt_rss', 'add_feed_content');
add_filter('the_content', 'add_feed_content');

Source: http://digwp.com/2012/10/customizing-wordpress-feeds/
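
The snippet above appends the copyright notice but no link. A variant that also links back to the original post could look like the sketch below. It assumes the filter runs inside the loop so that get_permalink() points at the current post; the myblog_ function names and the wording of the notice are hypothetical. Drop the opening <?php tag when pasting into an existing functions.php file.

```php
<?php
// Hypothetical helper: builds the notice with a backlink to the original post.
function myblog_feed_notice($blog_name, $permalink, $year) {
	// Escape the values, without double-encoding entities already present in the blog name.
	$name = htmlspecialchars($blog_name, ENT_QUOTES, 'UTF-8', false);
	$url = htmlspecialchars($permalink, ENT_QUOTES, 'UTF-8', false);
	return '<p>This article is copyright &copy; ' . (int) $year . '&nbsp;' . $name
		. '. Originally published at <a href="' . $url . '">' . $url . '</a>.</p>';
}

// Append the notice to feed content only.
function myblog_add_feed_backlink($content) {
	if (is_feed()) {
		$content .= myblog_feed_notice(get_bloginfo('name'), get_permalink(), date('Y'));
	}
	return $content;
}

// Register the filters when running inside WordPress.
if (function_exists('add_filter')) {
	add_filter('the_excerpt_rss', 'myblog_add_feed_backlink');
	add_filter('the_content', 'myblog_add_feed_backlink');
}
```

Use either this variant or the snippet above, not both, otherwise each feed item gets two notices.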

Create a custom RSS feed

While the technique above is good, it only displays a small notice at the bottom of your posts. You might want a more in-depth solution that lets you limit the number of characters appearing in each RSS feed item.

Here is a ready-to-use WordPress page template that you can easily customize to fit your specific needs.

<?php
/*
Template Name: Custom Feed
*/

$numposts = 5;

function yoast_rss_date( $timestamp = null ) {
  $timestamp = ($timestamp==null) ? time() : $timestamp;
  echo date(DATE_RSS, $timestamp);
}

function yoast_rss_text_limit($string, $length, $replacer = '...') { 
  $string = strip_tags($string);
  if(strlen($string) > $length) 
    return (preg_match('/^(.*)\W.*$/', substr($string, 0, $length+1), $matches) ? $matches[1] : substr($string, 0, $length)) . $replacer;   
  return $string; 
}

$posts = query_posts('showposts='.$numposts);

// Index of the last post returned; the query may yield fewer than $numposts
$lastpost = count($posts) - 1;

header("Content-Type: application/rss+xml; charset=UTF-8");
echo '<?xml version="1.0"?>';
?><rss version="2.0">
<channel>
  <title>Yoast E-mail Update</title>
  <link>http://yoast.com/</link>
  <description>The latest blog posts from Yoast.com.</description>
  <language>en-us</language>
  <pubDate><?php yoast_rss_date( strtotime($posts[$lastpost]->post_date_gmt) ); ?></pubDate>
  <lastBuildDate><?php yoast_rss_date( strtotime($posts[$lastpost]->post_date_gmt) ); ?></lastBuildDate>
  <managingEditor>[email protected]</managingEditor>
<?php foreach ($posts as $post) { ?>
  <item>
    <title><?php echo get_the_title($post->ID); ?></title>
    <link><?php echo get_permalink($post->ID); ?></link>
    <description><?php echo '<![CDATA['.yoast_rss_text_limit($post->post_content, 500).'<br/><br/>Keep on reading: <a href="'.get_permalink($post->ID).'">'.get_the_title($post->ID).'</a>'.']]>';  ?></description>
    <pubDate><?php yoast_rss_date( strtotime($post->post_date_gmt) ); ?></pubDate>
    <guid><?php echo get_permalink($post->ID); ?></guid>
  </item>
<?php } ?>
</channel>
</rss>

Source: http://yoast.com/custom-rss-feeds-wordpress/
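
To see what the text-limit helper in the template does, the standalone sketch below copies its logic under a hypothetical name, demo_text_limit, and runs it on a made-up string. It is meant for experimenting outside WordPress, not as an addition to the template.

```php
<?php
// Standalone copy of the template's text-limit logic (hypothetical name).
function demo_text_limit($string, $length, $replacer = '...') {
	// Remove HTML tags so markup does not count toward the limit.
	$string = strip_tags($string);
	if (strlen($string) > $length) {
		// Take one extra character, then cut back to the last non-word character
		// so the excerpt does not end in the middle of a word.
		$head = substr($string, 0, $length + 1);
		$cut = preg_match('/^(.*)\W.*$/', $head, $matches) ? $matches[1] : substr($string, 0, $length);
		return $cut . $replacer;
	}
	return $string;
}

echo demo_text_limit('<p>Scrapers copy whole posts from feeds.</p>', 20), "\n";
// Scrapers copy whole...
echo demo_text_limit('Short text', 20), "\n";
// Short text
```

Because the helper counts bytes with strlen(), the limit is approximate for text containing multibyte characters.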
