I recently got this question and stared at it. Long and hard. There were so many things wrong with the question that I didn't know where to start. But since this is a blog post, don't fear. I've figured out where to start by now. I hope that by the end of the post you know where to start, too. Duplicate content is a subject that everyone asks about, but I find that few people truly understand what it is or what search engines do with it, much less how to avoid the duplicate content penalty.

The first thing that made me stare at the question was how it was worded. Here is an excerpt from that e-mail:

We want to start put lots of content in our blog and hope those articles we put will show up in search result (and we can catch long-tailed keyword search). It looks like it is quicker to establish content partners and just use other people's content. Will the content still show up in the search result if it is exactly the same as the content in another website? If not, how much percentage difference should we have? 80% the same?
Gee, the whole approach is so wrong. Taking other people's content and trying to change it just enough to fool the search engines—where do we start?

First, I'll try to answer the questions. SEO gurus describe this situation as the "duplicate content penalty," but the phrase is somewhat of a misnomer, because the search engines are not really penalizing your site—they are just showing content that they believe is unique (by removing duplicates). So, if your content is substantially the same as another page on the Web, Google and the other engines won't show them all—they'll just show one version, or they'll show one ranked substantially higher than another that is somewhat different, but too similar to show close together in the results. This makes sense, because searchers don't want a page full of search results that all have similar pages in them.

Fine, you might say. How do I make sure that mine is the one that is shown? The best answer is to make your content original, because if someone else rips it off, the search engines will probably show yours. They try to show the version that was posted first or the one from the best site, and because rip-off sites steal so much content, the search engines usually detect correctly whose content it is. If they get it wrong, you can assert your rights under the U.S. Digital Millennium Copyright Act (DMCA) with Google or Bing (or whatever engine you are struggling with). Likewise, if you are the one pasting content from other sites, Google will probably know not to show your version, and the real owners can assert their DMCA rights against you. (Most countries outside the U.S. afford you similar protection for your copyrighted material.)

To get to the burning question, no one knows how high a percentage of matching copy it takes for your pages to be hidden in the results, but some search gurus claim it is as little as 30% of the copy on the page. Having said that, I think it is the wrong question to be asking.
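To make "percentage the same" concrete, here is a minimal Python sketch of one textbook way to score the overlap between two pieces of copy: split each into overlapping three-word "shingles" and compare the sets (Jaccard similarity). This is only an illustration. No one outside the search engines knows how they actually measure duplication, and the function names, the shingle size, and the sample sentences are all assumptions made up for this example.

```python
import re


def shingles(text, k=3):
    """Return the set of k-word shingles (overlapping word sequences) in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < k:
        # Text shorter than one shingle: treat the whole text as a single unit.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard_similarity(text_a, text_b, k=3):
    """Share of distinct shingles that the two texts have in common (0.0 to 1.0)."""
    a, b = shingles(text_a, k), shingles(text_b, k)
    if not a and not b:
        return 1.0  # two empty texts are trivially identical
    return len(a & b) / len(a | b)


# Hypothetical sample copy, invented for this illustration.
original = "Search engines try to show only one copy of any page they consider a duplicate."
reworded = "Search engines try to show only one version of any page they consider a duplicate."
unrelated = "Write original content that shows off what you know."

print(f"reworded:  {jaccard_similarity(original, reworded):.0%}")
print(f"unrelated: {jaccard_similarity(original, unrelated):.0%}")
```

In this example, changing one word out of 15 alters three of the 13 three-word shingles, so even a light rewording moves the score well away from 100%. That is one reason a single "safe" percentage is so hard to pin down: the number depends entirely on how the overlap is measured.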

Your goal shouldn't be to grab other people's content and post it for your own benefit. First, if that content is protected by copyright, then what you are doing is illegal unless you have received permission from the copyright owner. Second, your site is likely to become an eclectic mess of opinions and writing styles from people who aren't you. In your haste to create lots of content, you probably aren't creating very good content. Your branding and your expertise should be out front, and you do that by creating original content that shows off what you know and explains why people should buy from you. It might sound harder at first, but it works a lot better in the end because you'll get not only the search rankings and the search traffic you crave, but the customers, too. You see, getting search rankings isn't your end game. Even if you manage to fool search engines with your purloined content, you are unlikely to win customers that way.

So, if you are trying to do something at low cost, go ahead and hire people to write your content. But don't just accept whatever they crank out. That is cheap but not usually very effective. Instead, make sure that they are writing content that shows off what you know. Give them some of the ideas and inspect everything they do to ensure that it meets your quality standards as well as your brand message. It might sound hard to do, but I have found that it's easier in the long run because you don't have to undo any content that reflects badly on you, and because good content can work for a very long time.


July 19, 2010





Mike is an expert in search marketing, search technology, social media, publishing, text analytics, and web metrics, who regularly makes speaking appearances.

Mike's previous appearances include Text Analytics World, Rutgers Business School, SEMRush webinar, ClickZ Live.

Mike also founded and writes for Biznology, is the co-author of Outside-In Marketing (with James Mathewson) and the best-selling Search Engine Marketing, Inc. (now in its 3rd edition), and is the sole author of Do It Wrong Quickly, named by the Miami Herald as one of the 11 best business books of 2007.






Comments (21)

Hey Mike,

Duplicate content is a pet hate of mine. I usually get clients who kindly offer to write the content for me and send it over, so that all I have to do is copy and paste the text into their site and it's done. But most of the time I find myself staring at a whole host of duplicate content issues via Copyscape. So I have to go back to the client and explain why their content cannot be used and that I would have to rewrite it all for them.

At least I guess I have a good idea of what they want on their website so it doesn't usually take too long to rewrite it and make it original.

Thanks for sharing this post.

Thanks, Damian. Your clients are lucky that you take the time to do it right--it works out a lot better for them in the end.

Nice article Mike. As someone who's in the "learning" phase of SEO/SEM techniques I certainly learned something from it.

What about the case of an e-commerce site, where the retail product pages use a lot of copy that comes from the product manufacturer? Will those pages be affected by the duplicate content penalty? If so, it seems strange to have to write original copy for each product when you don't manufacture it.

Thanks, Bob. It might not make sense, but think about it from the searcher's point of view. Why would the searcher want to see page after page with the same copy? Is the only difference the price? Retailers need to find some way to differentiate, if not by rewriting the copy, then by adding something unique—Amazon adds reviews, for example. If you don't have some important piece of unique information to add, Google has no reason to show you, and searchers have no reason to find you. I know this makes things harder for marketers, but you could be in worse businesses than one where you need to merely outwork your competitors.

Hi Mike
Thanks for such a great topic. Yes, I definitely agree with Bob, and thanks for explaining it. Many CMSs today, like Magento and Joomla, already have some SEO features, such as meta tags, which serve the purpose of SEO. WordPress and Magento are SEO-optimized to a very good extent.

Many people get so caught up in SEO tactics that they forget the most important thing is to generate genuinely unique, useful and relevant content. Instead of trying to manipulate the sophisticated search engine algorithms it is much easier to just play ball and create a compelling site that is also friendly to crawlers.

Perhaps baked into Google's algorithm is a unique quality score.

The variables being:
- Unique content across the website
- Size of the website
- Links to deep-pages
- Uniqueness of content per page against other content on the web.

These are all excellent comments. The point is that if you do what works for searchers, you'll find Google figures out ways to reward it. If you don't, then not.

Thanks for a great article Mike. I've been researching to write something on this topic for my clients and most of what I see relates to copying content from other sources. What is your view on a company duplicating content on its own sites that target different countries? For example a company with a .com, a co.uk and a com.au that wants to put the same product and company information on each site with only minor changes such as addresses or currencies. I agree with all you've said on Google wanting to see unique content that is genuinely useful to searchers. If Google is going to favour the .com, com.au and co.uk in their respective markets by showing the relevant site in SERPs in each country then I would have thought it would not matter that the pages were nearly the same? If the content was already written in the most user-friendly manner would it really be necessary to try and completely re-write a company history page or a product description page?

Hello Mike, thanks for the post.

Let me tell you about a bit of a challenge with duplication of content: a large group of e-commerce sites, 10 of them, with exactly the same product range (40k products each), same platform, same servers, same content, same templates.

It is not an option to trim down the number of sites, as each of them has hundreds of thousands of unique customers gathered through years of operating in the catalogue business. I am thinking of different templates, programmatic change of content, IP ranges, etc. Any thoughts?

Thanks,
Adam

Hi Roy, I think that you need not worry in your situation, but the proof's in the pudding.

People talk about the Google search engine, but the truth is that Google has dozens of search engines, with one in each country being one example. As you point out, the content should be fine as long as it is unique within that country's search engine and your pages are properly indexed for that country's search engine. The real answer is in your search results. If you see the right country pages coming up in each country's search results page for the keywords you need, then you have nothing to worry about.

Hi Adam. As important as search marketing is, we can't act like it is the tail that should wag the dog. As you point out, whatever you are doing is working quite well to attract audiences across all the properties, so there is no reason to consolidate the sites into one.

Your description certainly sounds like a classic duplicate content situation for search marketing, but the larger question is about what problem you are trying to solve. So long as one of your sites is coming up in the search results for each keyword, I wonder if it matters which one. You might want to control which of the sites comes up, for which you can use a robots.txt file or a robots meta tag so that only one of the sites is indexed. That way you know which site you are optimizing for search and which ones won't come up.

But I sense that what you really want to know is, "How can I get all of the sites to come up in search?" Unless you do the work to make the content unique across each site, you probably can't. You can fool around with your content to try to see whether you can trick Google into showing multiple sites, but I'd argue that it is a fool's errand in the long run. If it actually works, it isn't good for the searcher and it isn't good for Google, so eventually it won't work.

I'd focus on things that will last a while rather than some short-term trick. If each of these sites reaches a different audience, perhaps there is a way of distinguishing the copy so that it appeals to the different target segments with different messages and benefits. If not, then I'd be working on some other marketing efforts besides tricking the search engines to give your duplicate content a pass. That probably wasn't the answer you wanted. Sorry.
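For readers who want to try the robots.txt approach mentioned in the reply above, here is a minimal Python sketch that checks what a given robots.txt file would let a crawler fetch, using the standard library's `urllib.robotparser`. The two robots.txt bodies, the domains, and the helper name `allows_crawling` are hypothetical and invented for illustration. Keep in mind that robots.txt governs crawling, so treat this as a way to sanity-check your rules rather than a guarantee of what ends up in any engine's index.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a secondary site that should stay out of search.
SECONDARY_SITE_ROBOTS = """\
User-agent: *
Disallow: /
"""

# Hypothetical robots.txt for the one site chosen to appear in search.
# An empty Disallow line means nothing is blocked.
PRIMARY_SITE_ROBOTS = """\
User-agent: *
Disallow:
"""


def allows_crawling(robots_txt, url, user_agent="Googlebot"):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Example domains are placeholders, not real sites from the discussion.
print(allows_crawling(PRIMARY_SITE_ROBOTS, "https://www.example.com/products/widget"))
print(allows_crawling(SECONDARY_SITE_ROBOTS, "https://shop.example.net/products/widget"))
```

The first check should report that the primary site can be crawled and the second that the secondary site cannot. Running a check like this before deploying makes it harder to block the wrong site by accident.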

Thanks for clearing that up Mike.
Everything I write for my own sites and for my clients is original content, but, as with spam, there is the problem of seeing my headlines and titles co-opted.

At first it was aggravating, but then as I would read the article or blog post it started to get funny. Even my test headlines for obscure directories were being sandwiched between another title from Ezine articles.

I'll keep testing my headlines and titles, and writing original content but this was the best explanation about the duplicate content mess I've seen.

Hi Mike
Great post.
Google Caffeine has got Internet marketers worried, and the forums are buzzing with stories.
I have a question for you.
We have all seen these article directory submitters, submitting one article to over 100 sites. Who will Google go after: the article site, for having loads of duplicate articles, or will it track the backlink to the other website and take action there?
Or is the article directory too powerful to be hurt in any way?

Or look at it this way: with a football story about England's World Cup disappointment, Google could see that there are thousands of sites with duplicate content.

I think there are a lot of grey areas for the term duplicate content.

Good information Mike!
I have been wondering about this for a long time, and finally I got some decent answers. I have 1 question though (just to see if I understand this right):
- If I write 1 article and post that one article on 100 article directories linking back to my blog or website, will the website be penalized by Google in any way?

Andre, thanks for the feedback. I always hesitate to give definitive answers to questions like these, because no one really knows what the search engine algorithms do, and it's certain that different search engines do different things in the same situation. So, having gotten my weasel-words disclaimer out of the way, I believe that your site won't be penalized in this situation.

To me, the word "penalized" implies that you're losing ranking power that you'd otherwise have. There's no reason for a search engine to do that. On the other hand, if you are asking whether getting link-backs to your site for exactly the same article in 100 places is as valuable as getting those same 100 links from those same sites for different articles, I'd say that the unique articles would be more valuable. If you consider that a penalty, that's up to you. I don't.

One way to think about this is that Google and other search engines tend to give more credit to links that appear to arise spontaneously. Links from 100 places for the exact same article don't seem all that spontaneous, so they might help, but not nearly as much as those same links would help from 100 different articles.

It's entirely possible that other experts might read the same ranking algorithm tea leaves and disagree with me, but that is my opinion.

Thanks for clearing that up Mike.

Duplicate content is such a heated debate with so much bad information floating around that this article is definitely a breath of fresh air. People always look for short cuts instead of treating their online endeavors as a true business. And with so many cost effective ways to get unique content, it just makes no sense to take the cookie cutter approach.

Many thanks, Mike, for taking the time to reply to my question. It's reassuring to know that you agree and that Google still seems to be running on common sense for most things.
Keep up the good writing - your articles are very helpful and insightful.
Cheers
Roy

Thanks for the kind feedback, Roy. I am in the midst of my annual August vacation but will be back knocking out Search Engine Guide pieces in September. Good luck with your efforts.

Hi Mike
Thanks for such a great topic. Yes, I definitely agree with you, and thanks for explaining it. Many CMSs today, like WordPress, Joomla, and Drupal, already have some SEO features, such as meta tags, which serve the purpose of SEO. WordPress and Joomla are SEO-optimized to a very good extent.



Search Engine Guide > Mike Moran > How do I avoid the duplicate content penalty?