Search Engine Guide

Article provided with permission by
Rank Write Roundtable.
© 2001 Rank Write Roundtable.


Question Marks, Equal Signs, and Other Strange Characters
By Jill Whalen - April 05, 2001 (From the Rank Write Roundtable Newsletter)

~~~High Search Engine Rankings~~~

From: Prather, Don

I have a question for you that I hope you can resolve for me. I have read that most major search engines cannot spider sites that are dynamic in content, that is, sites whose URLs contain, let's say, a string of code with question marks or hash marks. Is this true? And if it is, what happens when the spider encounters these obstacles? Does it index just that page or does it drop the page entirely? And last, is there a way to get around this?

Thank you for taking the time to read this.

Don Prather
Webmaster wana-be


~~~Jill's Response~~~

Don,

This is a question that is on the mind of many a Webmaster these days, since more and more sites are generating their content dynamically. We did touch upon this topic way back in Issue 009, which you may want to refer to for more information. Since we have a couple thousand more subscribers (we're currently at 2846!) than we had when that issue came out, it won't hurt to revisit this topic today.

Not all dynamically generated pages are a problem for the search engines. It's the pages that have question marks and equal signs (and other *strange* characters) in their URLs that cause many search engine spiders to "lock up" and go no further into your site. From what I understand, the reason for this is that these characters signal to the spider that an effectively infinite number of URL variations could exist for that page. If the spider tries to crawl them all, it can get caught in what's known as a "spider trap" and put a huge load on the engine's server. To avoid this, many of the search engines have programmed their spiders to simply ignore these pages altogether.
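To make the idea concrete, here is a minimal Python sketch of the kind of check a cautious spider might apply before following a link. The character set and the function names are assumptions made for illustration; the engines do not publish their actual rules, and each one may behave differently.

```python
# Hypothetical set of "spider-stopping" characters. Real engines do not
# publish their rules, so this list is an assumption for illustration.
SPIDER_STOPPING_CHARS = "?=&"


def looks_dynamic(url):
    """Return True if the URL contains any character a cautious spider might skip."""
    return any(ch in url for ch in SPIDER_STOPPING_CHARS)


def crawlable(urls):
    """Keep only the URLs that such a spider would be willing to follow."""
    return [u for u in urls if not looks_dynamic(u)]


if __name__ == "__main__":
    links = [
        "http://www.example.com/catalog.cfm?item=42",
        "http://www.example.com/catalog/item/42.html",
    ]
    for link in crawlable(links):
        print(link)
```

You could run a check like this over your own site's links to see which pages a spider of this type would likely pass over.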

Since dynamic pages are becoming more common, the search engine powers that be are aware that they need to make some changes and start indexing this type of content, if at all possible. We are beginning to see more dynamic pages indexed in many of the engines, and I believe we'll see even more in the future. However, if getting listed were a high priority for your Web site, I personally would not leave it to chance (and certainly wouldn't leave it to the search engines!).

There are many options out there that allow you to change your URL formatting to one that can be spidered by the search engines. If you use ColdFusion for your dynamically generated pages, there is a workaround that helps you create spider-friendly URLs. You can learn more about it at: [http://forums.allaire.com/devconf/Index.cfm?Message_ID=18401]. There are also workarounds for those using Microsoft Active Server Pages (ASP). A software product that removes the question marks from your ASP URLs can be found at: [http://www.alphasierrapapa.com/products/portalpagefilter/]. And if you use an Apache server, there is a rewrite module for this purpose that you can read about at: [http://httpd.apache.org/docs/mod/mod_rewrite.html].

(Please note, the above links were found at the Search Engine Watch Paid Subscription area. For more ideas and rewrite modules regarding dynamically generated pages, it's well worth a subscription and a visit to: http://www.searchenginewatch.com/subscribers/more/dynamic.html).
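The general idea behind these URL workarounds can be sketched in a few lines of Python. This is only an illustration of the technique, not how any of the products mentioned above actually work: the functions `to_friendly` and `from_friendly` are hypothetical names, and the choice to turn each name/value pair into two path segments is an assumption.

```python
from urllib.parse import parse_qsl, quote, unquote, urlencode, urlsplit


def to_friendly(url):
    """Move query-string parameters into the path, so no '?' or '=' remains.

    /catalog.cfm?category=books&item=42 -> /catalog.cfm/category/books/item/42
    """
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query)
    # Flatten [(name, value), ...] into [name, value, name, value, ...].
    segments = [quote(piece, safe="") for pair in pairs for piece in pair]
    path = parts.path.rstrip("/")
    if segments:
        path += "/" + "/".join(segments)
    prefix = f"{parts.scheme}://{parts.netloc}" if parts.netloc else ""
    return prefix + path


def from_friendly(path, script):
    """Recover the original query-string form on the server side.

    `script` is the known path of the dynamic page, e.g. "/catalog.cfm".
    """
    rest = path[len(script):].strip("/")
    pieces = [unquote(p) for p in rest.split("/")] if rest else []
    pairs = list(zip(pieces[0::2], pieces[1::2]))
    query = urlencode(pairs)
    return script + ("?" + query if query else "")


if __name__ == "__main__":
    friendly = to_friendly("http://www.example.com/catalog.cfm?category=books&item=42")
    print(friendly)
    print(from_friendly("/catalog.cfm/category/books/item/42", "/catalog.cfm"))
```

In a real setup the reverse mapping would be done by the Web server or the application itself (for example, by a rewrite rule), so that visitors and spiders only ever see the friendly form.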

Even if you can't or don't want to change your URLs to be spider-friendly, you can use Inktomi's Paid Inclusion program to submit your dynamically generated URLs with question marks. This won't help you for every engine, but Inktomi powers many major search engines these days, so you'll have some decent reach if you choose to go that route.

Another idea for those with dynamically generated pages is to create static HTML "copies" of your dynamic pages. These should have the same look and feel as your dynamic pages, but not have any spider-stopping characters in their URLs. If you choose to go this route, be sure to place links to these pages on the main page of your site and on each of the new static pages. If you don't do this, the search engines will think these pages are simply doorway/gateway pages and either ignore them or give them a poor ranking. Again, please read Rank Write Issue 009, as referenced above, for more information on creating these types of pages.
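As a rough sketch of the static-copy approach, the Python below derives a spider-safe file name for each dynamic URL and builds the block of links that would go on the main page and on every static page. The `/static/` directory, the naming scheme, and the function names are all assumptions for illustration; saving the rendered HTML of each page to those files is left out.

```python
import re
from urllib.parse import urlsplit


def static_name(url):
    """Turn a dynamic URL into a file name with no spider-stopping characters."""
    parts = urlsplit(url)
    raw = parts.path + ("_" + parts.query if parts.query else "")
    # Collapse every run of non-alphanumeric characters into a single hyphen.
    slug = re.sub(r"[^A-Za-z0-9]+", "-", raw).strip("-").lower()
    return (slug or "index") + ".html"


def link_block(urls, base="/static/"):
    """Build an HTML list linking to every static copy.

    Place this on the main page and on each static page so the copies
    are linked into the site rather than standing alone.
    """
    items = []
    for url in urls:
        name = static_name(url)
        items.append(f'<li><a href="{base}{name}">{name}</a></li>')
    return "<ul>\n" + "\n".join(items) + "\n</ul>"


if __name__ == "__main__":
    pages = [
        "http://www.example.com/catalog.cfm?category=books&item=42",
        "http://www.example.com/catalog.cfm?category=music&item=7",
    ]
    print(link_block(pages))
```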

There's also a tool at Search Mechanics that supposedly gives you the ability to crawl your own dynamic content and create a static page that references it all. I haven't personally tried it, but it sounds like an interesting concept! (You'll need Shockwave in order to access their product.)

And lastly, there is a good discussion of dynamic pages in I-Search Issue #288.

Hope this info is helpful!

Jill


~~~Send Us Your Questions~~~

If you have questions about online copywriting or search engine optimization (or both!), just zip us an email to questions@rankwrite.com. We've had some folks ask if their question was "too basic" to be printed - and you don't have to worry about that! There are no "stupid" search engine optimization or copywriting questions, so ask away!