Over the past several weeks I've been writing about duplicate content. Today's post wraps up the series that started with my theories on duplicate content penalties, where I explained the different types of duplicate content that the search engines find. Over the course of the series (see links to all posts at the end) I discussed various aspects of duplicate content, how it happens (sometimes inadvertently) and how it can be corrected.

In this installment I'll provide one of the best permanent fixes for a kind of inadvertent duplicate content that is common on business websites. The implications can be pretty significant depending on the size of your site. I've held off on this last post for several weeks for no reason other than that Scott Allen beat me to the punch with his own article detailing how 3 lines of code can improve your rankings. So if this article sounds familiar, you know why. Just know that I wrote mine first! :)

www. vs. no www.

Real quick, go to your browser and type in yoursite.com. Does the URL in the browser's address bar change to a) http://yoursite.com or b) http://www.yoursite.com?

Now type in www.yoursite.com. Does the URL in your browser change to a) http://www.yoursite.com or b) http://yoursite.com?

In both of those instances, if you answered A then you have potential duplication issues. Here is an example of one of my articles on Gooruze.com which shows the potential duplication:

Duplicate WWW issue

Take away the www. from the URL and, lo and behold, you see the exact same article:

Duplicate WWW issue

You can see how this can become a problem, with virtually every article having its www. or non-www. twin.

Which version gets accessed depends on how each person typed the address into the address bar to begin with (or which link they followed). Did they type the www. or not? You may have; I may have not. If I then bookmark the site or link to it from another site, and you do the same, we're both sending the search engines to two different URLs, both of which have the same content. If a search engine spider starts crawling from either of those links, literally hundreds of articles will be indexed, half of which are pure duplicates.
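To see what the redirect fix accomplishes, here is a quick sketch in Python (a hypothetical illustration, not any SEO tool's actual logic) that normalizes the two host variants to a single canonical www. form, the same mapping the 301 redirect performs server-side:

```python
from urllib.parse import urlsplit, urlunsplit

def canonicalize(url: str) -> str:
    """Map www. and non-www. variants of a URL to one canonical (www.) form."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if not host.startswith("www."):
        host = "www." + host
    return urlunsplit((parts.scheme, host, parts.path, parts.query, parts.fragment))

# Two links to the "same" article that search engines treat as different URLs:
links = [
    "http://site.com/articles/duplicate-content",
    "http://www.site.com/articles/duplicate-content",
]

# Both collapse to a single canonical URL, which is exactly what the
# server-side 301 rule enforces for every visitor and spider.
print({canonicalize(u) for u in links})
```

Without that collapse, every bookmark and inbound link is split across two addresses, and so is the link juice.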

Fortunately, this issue isn't as bad as many other duplicate content issues, because most of the search engines have gotten pretty good at figuring out that those pages are the same, given a bit of time. In most cases the search engines will eventually treat the two versions, with or without the www., as the same page(s). But it doesn't happen right away. In fact it can take several months or more, depending on the site, for the engines to tie the two together. While some are content to wait it out, the real danger is that you are splitting your link flow and incoming link juice between two URLs while you wait.

The less you have to make the search engines think the better. Even if you're confident that the search engines have already made the connection between the www. and non-www. versions being one and the same, you never know what might change that in the future. The best strategy then is to be proactive in "fixing" this kind of duplication.

If you're running on an Apache server, the fix is relatively simple: add this bit of code to the .htaccess file in the root directory of your server:

RewriteEngine On
# If the request came in on the bare domain (dot escaped, case-insensitive)...
RewriteCond %{HTTP_HOST} ^site\.com$ [NC]
# ...send a permanent (301) redirect to the same path on the www. version
RewriteRule ^(.*)$ http://www.site.com/$1 [R=301,L]

Don't ask me to explain it; all I know is that it works! If your site is on any other kind of server, you'll have to contact your web host for a fix. The .htaccess file is pretty finicky, so be sure to back it up before making any changes. Once you get the updated version uploaded, give it a shot. If you type in site.com, the address should redirect to http://www.site.com. Now do the same thing with an inner page of your site: type in site.com/page and you should be redirected to http://www.site.com/page. There you go. All set.
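If the rule is working, a request to the bare domain should come back with a 301 status and a Location header pointing at the www. version. Roughly, the exchange looks like this (hypothetical domain, other headers trimmed):

```http
GET /page HTTP/1.1
Host: site.com

HTTP/1.1 301 Moved Permanently
Location: http://www.site.com/page
```

The 301 ("Moved Permanently") status is what tells the search engines to consolidate the two URLs, rather than treating the redirect as temporary.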

This article is part of a series on duplicate content. Follow the links below to read more:

  1. Theories in Duplicate Content Penalties
  2. How Poor Product Categorization Creates Duplicate Content and Frustrates Your Shoppers
  3. Redirecting Alternate Domains to Prevent Duplicate Content
  4. Preventing Secure & Non-Secure Site Duplication
  5. Why Session ID's And Search Engines Don't Get Along (Hint: It's a Duplicate Content Thing)
  6. What Does a Title Tag, Title Tag and Title Tag Have In Common?
  7. How to Create Printer Friendly Pages Without Creating Duplicate Content
  8. How to Use Your WWW. to Prevent Duplicate Content

May 20, 2008





Stoney deGeyter is the President of Pole Position Marketing, a leading search engine optimization and marketing firm helping businesses grow since 1998. Stoney is a frequent speaker at website marketing conferences and has published hundreds of helpful SEO, SEM and small business articles.

If you'd like Stoney deGeyter to speak at your conference, seminar, workshop or provide in-house training to your team, contact him via his site or by phone at 866-685-3374.

Stoney pioneered the concept of Destination Search Engine Marketing which is the driving philosophy of how Pole Position Marketing helps clients expand their online presence and grow their businesses. Stoney is Associate Editor at Search Engine Guide and has written several SEO and SEM e-books including E-Marketing Performance; The Best Damn Web Marketing Checklist, Period!; Keyword Research and Selection, Destination Search Engine Marketing, and more.

Stoney has five wonderful children and spends his free time reviewing restaurants and other things to do in Canton, Ohio.





Comments (9)

I believe that duplicate content will at worst be ignored. But the real loss is that visitors can link to a variety of URLs that are actually the same page. In this way the pages lose "link juice," and that is where the duplicate content may really hurt you.

optimizer... as you know, that only sets the preferred version of your website for Google. Believe it or not, that's not the only search engine. Even more unbelievable might be the fact that that doesn't always work either.

Best option is always to do the job properly yourself. Take the responsibility on your own shoulders because that way if Google is having a bad day your website won't show it.

Hey Stoney,

Thank you for the tip. I found that I have the same thing going on with my website; gonna fix it straight away.

I never understood why I could see two different entries for the same URL in Google Analytics.

Thank you, man.

Stoney,
I just discovered your blog and have been bookmarking your articles like crazy. Thanks for all the excellent info. My site had the same issue, and I've just fixed it. I never understood why it was accessible both with and without the www, and I also didn't realize it was penalizing my SEO rankings.

My new blog site, www.stewartdesignweb.com, didn't seem to have this issue the way my static site did. Is that because WordPress takes care of it automatically?

@Amy It's very possible that this is built into WordPress. I'll have to look into that.

OK, so we are running Apache as CGI and cannot use .htaccess to do the above. Would a CNAME record in DNS do the same? Or how about an HTTP redirect?

@Dannette, sorry, I'm not familiar enough with other platforms to be able to provide an answer.

Many thanks for the tip. I've just tried it, but it didn't work. Any ideas? (Yes, I changed site.com to my own domain name.)

@Jersey - it only works on Apache servers, so that could be the problem. Otherwise, talk to your web host.


