Search Engine Optimization

Article provided by:
Orbidex
© 2001 Orbidex.


Never Ignore the 404
By Eric Lander - October 04, 2001

Those who work in site optimization should already be aware that the information within log files is vital to any such effort. Yet while many optimizers spend hours crunching the paths of traffic flow, few focus on the errors their site encounters.

Any time a user attempts to pull up a page on a website that is unavailable, the server responds to that user with an error. The message sent back is called a 404 Document Error – HTTP's “Not Found” response – or “404” for short.
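
There is nothing mysterious about the exchange itself: the browser asks for a page and the server answers with a 404 status line. A stripped-down request and response might look like the following (the URL and headers here are illustrative only):

    GET /old-page.html HTTP/1.1
    Host: www.example.com

    HTTP/1.1 404 Not Found
    Content-Type: text/html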

If a visitor to any website encounters such an error, the fault lies with poor site management. In other words, if your site’s content pages are the source of such errors – get busy.

We all know that a search engine’s agent comes into a site, follows all of the links it can find within that site, and uses what it crawls to form an evaluation of the content’s worth. (Note that this is emphasized by engines like Google, which weigh links heavily.)

Now, what might happen when an agent sent out to review your site encounters dead links, or these 404 messages? No one aside from those creating the algorithms behind our most popular engines can know for sure, but we can all speculate. Whether listings are demoted in rank, removed, or banned remains unknown – but the risk is real.

Fixing the issues that cause 404 errors is actually rather simple. If you are familiar with how to code your own website, invest in a tool like Linkbot Pro or another link validation utility. With such a program, spotting and solving the source of these errors becomes an effortless task.
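
If you prefer to roll your own check, the short Python sketch below shows the general idea. It assumes you already have a list of the URLs your pages link to (the addresses shown are placeholders); it requests each one and flags anything that answers with an error such as a 404.

    import urllib.request
    import urllib.error

    # Placeholder list of URLs to verify - swap in the links from your own pages.
    urls = [
        "http://www.example.com/",
        "http://www.example.com/articles/old-page.html",
    ]

    for url in urls:
        try:
            # A HEAD request fetches only the status, not the whole page.
            request = urllib.request.Request(url, method="HEAD")
            response = urllib.request.urlopen(request)
            print(url, "OK", response.status)
        except urllib.error.HTTPError as error:
            # Broken links show up here; error.code is 404 for missing pages.
            print(url, "ERROR", error.code)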

But what if some of a site’s 404 errors are not caused by broken internal links? Then it is time to investigate…

The Log Files
If you have, or can gain, access to your log files – do it. While many log file formats exist, a 404 error should be easy to spot in any of them. From there, identify the source from which the error was generated.
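
As a rough illustration, if your server writes an Apache-style “combined” format log (the file name access.log here is an assumption), a few lines of Python will list every 404 together with the referring page that sent the visitor there:

    # Minimal sketch: tally 404 hits and their referrers from a combined-format log.
    from collections import Counter

    hits = Counter()

    with open("access.log") as log:
        for line in log:
            parts = line.split('"')
            # In the combined format, parts[1] is the request line,
            # parts[2] holds the status code, and parts[3] is the referrer.
            if len(parts) < 4:
                continue
            fields = parts[2].split()
            if fields and fields[0] == "404":
                requested = parts[1].split()[1] if len(parts[1].split()) > 1 else parts[1]
                hits[(requested, parts[3])] += 1

    for (requested, referrer), count in hits.most_common():
        print(count, requested, "referred by", referrer)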

Too often, a search engine’s listings point to documents within sites that are out of date or have been removed. If this is happening with your site, try to retain that traffic by providing content at the URL that is receiving the attention. Chances are, it is not just one engine pointing there.

If you can, avoid using a “META refresh” trick to push viewers to the new page’s URL. The reason: if an engine’s agent comes by that URL again, it will show little interest in a redirection page.

The Most Common 404s
Hard as it may be to believe, webmasters the world over have seen search engines and analysis agents looking for two very common files. While there may be more, these two are the most frequently requested sources of 404 errors.

1) favicon.ico
This file is non-essential to any website’s success, but providing it does show that a site is well tended. The file is an icon that Microsoft’s Internet Explorer uses to identify a website when it is placed in a user’s “Favorites” folder. (Those who browsed the web before IE’s hold on the general public may recognize the term “bookmark”.)
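
Placing a favicon.ico file in the site’s root directory is usually enough to quiet these requests. If the icon lives elsewhere, a single line in each page’s <head> tells the browser where to look (the path shown is only an example):

    <link rel="shortcut icon" href="/images/favicon.ico">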

If you are interested in generating an icon for your site, and in solving the source of these pesky 404 errors, consider the following sites:

http://www.favicon.com/
http://www.xoomhacker.com/favicon.html
http://global-positioning.com/favicon/

2) robots.txt
You have a website and have submitted it to the search engines. In looking at your site’s traffic statistics, though, you see search engine agents like Scooter, GoogleBot, and T-Rex all coming in, reviewing one page, then leaving. Quite often, the reason can be attributed to the lack of a robots.txt file.
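
For reference, a robots.txt is just a plain text file placed at the root of the site. A minimal, wide-open file – one that tells every robot it may crawl everything – is only two lines (a restrictive file would list paths after the Disallow: field):

    User-agent: *
    Disallow: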

Again, many resources exist that can better explain how to create, set up, and maintain a working robots.txt:

http://www.robotstxt.org/wc/faq.html
http://www.robotstxt.org/wc/norobots-rfc.html

The process of creating and establishing a working robots.txt file is quite simple. Even if you already have a robots.txt on your website, make sure that it is coded properly. Start by reviewing it with a validation tool like the one linked below:

http://www.searchengineworld.com/cgi-bin/robotcheck.cgi

Keeping It Up…
While at first this may seem like a one-time fix for a website, it is not. As other SEO efforts continue (link popularity campaigns, news and article additions, content revisions), more misdirected links are bound to appear. Keeping a constant eye on these and other errors is an ongoing task for webmasters and optimizers alike.

Once you have managed to fix the majority of your issues, guard against new errors before you even notice them. Do this by creating a page to display to users whenever the server returns a 404 error. Again, how this type of document is set up will depend on the server software running a particular website.

For example, the IIS control panels on Windows NT and Windows 2000 servers let you simply point to a file to be served when such an error is encountered. Other systems will require a different approach; the best bet is to consult the software’s documentation.
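
On Apache servers, to give one common non-IIS example, the same result comes from a single ErrorDocument directive in the server configuration or in an .htaccess file (the file name notfound.html is only an illustration):

    ErrorDocument 404 /notfound.html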