Website architecture is one of the most important aspects of creating a search engine friendly website. Below are just a few questions I was asked recently on the topic of navigation, site structure, site maps and page size.

If I have a relatively small [site] with a flat/linear file structure and each page has links on it to every other page that the spiders can follow does it benefit me at all to have a site map?

If you only have a five to ten page site where every page is of equal value to the rest, then a site map is unnecessary. However, once you get beyond that, or begin having pages that are sub-pages of a section of your site, it's a good idea to create a site map. Even if all the pages are represented in the navigation, the site map will help the user know what information your site contains without having to go look for it.

One primary benefit of the site map is that it gives the visitor a bird's-eye view of the entire site. In most instances the navigation itself won't accomplish this, though it can, depending on how it's laid out. Even so, a visitor who goes looking for a site map usually does so because they can't find something they think your site should have. The site map gives them the complete picture: if they cannot find it there, they know there is no need to keep looking through the rest of the site.
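To make the bird's-eye view concrete, here is a minimal Python sketch that turns a nested description of a site into the kind of indented outline a site map page presents. The page names and the `render_site_map` helper are hypothetical, made up purely for illustration.

```python
# Hypothetical site structure: each key is a page title,
# each value holds that page's sub-pages.
SITE = {
    "Home": {},
    "Services": {"SEO": {}, "Pay Per Click": {}},
    "About": {"Contact": {}},
}


def render_site_map(pages, depth=0):
    """Return the outline as a list of lines, indenting sub-pages under their section."""
    lines = []
    for title, children in pages.items():
        lines.append("  " * depth + "- " + title)
        lines.extend(render_site_map(children, depth + 1))
    return lines


if __name__ == "__main__":
    print("\n".join(render_site_map(SITE)))
```

Every page appears exactly once, grouped under its section, which is what lets a visitor decide quickly whether the site has what they are after.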

If I have a relatively small site is there any benefit for ranking by having a silo or pyramid structure rather than a flat structure? Do the spiders always prefer silo structures or does it only make a difference if the site is over a certain number of pages, and if so, what is that number?

I think in this case you defer to the visitor over the search engine. Organize your pages in a way that makes sense and will help the visitor easily find the information they are looking for. There is nothing wrong with silo-ing your site structure, even for small sites. Just make sure you're not adding additional layers for the sake of adding layers. Create a structure that makes sense from the visitor perspective. If you do that then it will make sense from the search engine perspective as well.

Do the robots really produce a 404 error if they can't find a robots.txt file on my site, and should I make one just for that reason even if it's a small site that otherwise wouldn't need it?

Any time a search engine comes to your site, it will look for the robots.txt file (at least the engines that honor the robots.txt protocol will). If the file isn't there, that request shows up as a 404 in your server logs. Generally, not having a robots.txt file won't hurt you in any way, but I do recommend adding one anyway.

Without the file in place, the search engines are free to interpret its absence any way they want. The major engines treat a missing file as permission to crawl the whole site, but nothing obliges every spider to read it that way. With the file in place, you are letting the engines know that you have made a definitive declaration of what they can and cannot do in regard to spidering the site.
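As a rough illustration of what such a declaration looks like, the sketch below feeds two small robots.txt files to Python's standard `urllib.robotparser` module and asks whether a crawler may fetch a given URL. The crawler name `ExampleBot`, the example.com URLs, the `/private/` directory and the `allowed` helper are all assumptions invented for the example.

```python
from urllib.robotparser import RobotFileParser

# A minimal robots.txt that declares the whole site open to every crawler.
OPEN_SITE = """\
User-agent: *
Disallow:
"""

# The same idea, but keeping crawlers out of one hypothetical directory.
PARTLY_CLOSED = """\
User-agent: *
Disallow: /private/
"""


def allowed(robots_txt, url, agent="ExampleBot"):
    """Report whether the given rules let `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    print(allowed(OPEN_SITE, "http://www.example.com/about.html"))
    print(allowed(PARTLY_CLOSED, "http://www.example.com/private/notes.html"))
```

For a small site with nothing to hide, the two-line "open" version is all the file needs to contain.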

I have heard that pages should not be more than 100 or 150 k. Is that true? Does it affect ranking besides obviously affecting download times? Is k the same as kb (I know I could probably figure that one out)? How can I tell how big my pages are?

It's always best to keep your code as streamlined as possible. I don't think there is a hard and fast rule anymore as to how big a page can be. However, if the engines are finding pages that have significantly more code than content, then it very well may have a negative effect. Large pages with lots of content generally won't be a problem, but if the content-to-code ratio is skewed too far toward code, then that represents an issue that's worth fixing. If too many of these pages exist on your site, then it's highly likely that many of your pages won't get fully spidered and/or many pages of your site could be left out of the index.
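If you want a rough feel for how a page stacks up, the Python sketch below divides the length of the visible text by the length of the whole HTML source. To be clear, this is my own crude proxy, not a formula any search engine has published, and the `TextCollector` and `content_ratio` names are invented for the example.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collect visible text, ignoring the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def content_ratio(html):
    """Return visible text length divided by total source length (0.0 to 1.0)."""
    collector = TextCollector()
    collector.feed(html)
    collector.close()
    # Collapse runs of whitespace so indentation doesn't count as content.
    text = " ".join("".join(collector.chunks).split())
    return len(text) / len(html) if html else 0.0
```

A page that is mostly prose scores close to 1.0, while a page buried in nested markup and inline scripts scores close to 0. Treat the number as a prompt to look at the source, not as a threshold.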

Typically when referring to file size someone might say 150k when they mean 150 kb. However, if they say it's 150k kb then that suggests it's 150 thousand kb. Just about any web file manager such as Dreamweaver can tell you how big the page is. If the file is saved on your computer you can also look at the file size through Windows Explorer. When considering page size, search engines generally are just looking at the code, not necessarily at any additional downloadable items on the page (images, Flash files, etc.).
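If you would rather check from a script, a few lines of Python will report the size of a saved page. This is only a sketch: the `page_size_kb` helper is a name I made up, and it assumes 1 kb means 1,024 bytes.

```python
import os


def page_size_kb(path):
    """Return the size of a saved page in kilobytes (1 kb taken as 1024 bytes)."""
    return os.path.getsize(path) / 1024
```

Calling `page_size_kb("index.html")` on a page saved to your computer (the file name here is just a placeholder) gives the size of the HTML alone, without images or other downloads, which is the part discussed above.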

I use iWeb to make my site and it is limited in some things that it can do. It uses iframes to insert HTML content (or you can mess with it yourself). The way I understand it, the frames are actually an entirely separate page that is imposed on top of the page you put it on. I have heard frames are bad for search rankings. Are these frames the same as iframes?

I don't have any experience with iWeb so I can't answer any question in regard to that. Traditionally frames are not a good way to go for a site, but there are many ways to have the benefit of frames without actually having frames. For one, if you need separate scrolling areas, this can be accomplished with CSS. For another, if you want to have one file for navigation instead of putting the same navigation code on every page, the use of server-side includes (SSI) is the way to go. Includes allow you to create one file then globally include that file throughout the site. Update that one file and every page that pulls the include file in shows the updated information.
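On a real site the web server does the including for you (Apache, for example, recognizes directives written as `<!--#include virtual="/nav.html" -->`). The Python sketch below only simulates that substitution so you can see the idea at work. The page contents, the `/nav.html` path and the `expand_includes` helper are hypothetical.

```python
import re

# Matches directives such as <!--#include virtual="/nav.html" -->
INCLUDE = re.compile(r'<!--#include\s+(?:virtual|file)="([^"]+)"\s*-->')


def expand_includes(page, fragments):
    """Replace each include directive with the named fragment's contents."""
    return INCLUDE.sub(lambda match: fragments[match.group(1)], page)


# Hypothetical shared navigation and two pages that pull it in.
FRAGMENTS = {"/nav.html": '<ul><li><a href="/">Home</a></li></ul>'}
PAGES = {
    "index.html": '<!--#include virtual="/nav.html" -->\n<h1>Welcome</h1>',
    "about.html": '<!--#include virtual="/nav.html" -->\n<h1>About</h1>',
}

if __name__ == "__main__":
    for name, source in PAGES.items():
        print(name, "->", expand_includes(source, FRAGMENTS))
```

Change the one entry in `FRAGMENTS` and both pages come out with the new navigation. Because the server assembles the page before sending it, spiders see a single ordinary HTML document rather than a frame.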

These questions are just a drop in the bucket of what you need to know regarding site architecture and navigation, but I do hope they provided you with some valuable insight. I have answered a number of other website architecture questions in the past that will give you even more information on this topic. You can also check out my architecture checklist.

April 21, 2009

Stoney deGeyter is the President of Pole Position Marketing, a leading search engine optimization and marketing firm helping businesses grow since 1998. Stoney is a frequent speaker at website marketing conferences and has published hundreds of helpful SEO, SEM and small business articles.

If you'd like Stoney deGeyter to speak at your conference, seminar, workshop or provide in-house training to your team, contact him via his site or by phone at 866-685-3374.

Stoney pioneered the concept of Destination Search Engine Marketing which is the driving philosophy of how Pole Position Marketing helps clients expand their online presence and grow their businesses. Stoney is Associate Editor at Search Engine Guide and has written several SEO and SEM e-books including E-Marketing Performance; The Best Damn Web Marketing Checklist, Period!; Keyword Research and Selection, Destination Search Engine Marketing, and more.

Stoney has five wonderful children and spends his free time reviewing restaurants and other things to do in Canton, Ohio.


Some interesting points. One thing I don't agree with is the code to content ratio. I think this may have been an issue in the early days, but today the internet is fast and the engines are sophisticated. One PHP command instantly removes all code from a page, no matter what the code-to-content ratio is, and I imagine the big 3 engines to be a little more sophisticated yet than a single PHP command. Having decent Web Design Best Practices is a good way to avoid some of the common architecture traps. No point in everyone learning the same lessons over long periods of time, all the hard way, when a simple checklist can ensure quality across the board.

@ Andreas - You make a good point, though I'm not fully convinced. We know that the engines download the entire page because we can see that when viewing a cached version. The engines also look at the code to determine layout segmentation, headers, footers, navigation blocks, etc. Also consider that the search engines want to be able to mimic the actions of real people, and if a page takes a significant time to download because it's overly convoluted then the engines would want to be aware of that and factor it into the results. It may not be a significant factor but it can still play a role.

I have such big problems with Google Webmaster Tools. I submitted my sitemap in the form of RSS and built a sitemap page with links to all my posts, but Google, special as always, gives 37 errors in total, saying something about robot restrictions. I tried to follow every SEO and sitemap advice but it did not work. I hope your advice will help.

Regarding website architecture, duplicate content and the famous canonical tag: first things first, build a fine structure and forget the rest.
