Fast Search & Transfer (FAST) is the first search technology company to enable indexing of Flash content in AllTheWeb and partner sites. It probably won't be the last.

In July, I wrote about the push to index file formats beyond HTML, showing that search engines were indexing formats such as Image, Video, MP3, and PDF files, and demonstrating how these files can be optimized for search engines. I predicted FAST would eventually index Flash and other multimedia files.

Well, it didn't take long. FAST just announced AllTheWeb and its partner sites will now index Macromedia Flash content and applications, a breakthrough that allows many popular sites to get expanded visibility in search engines.

As the Web matures, more and more designers and Web site developers are using Flash to create Web sites that go beyond the limits of HTML. That's because users are demanding more sophistication with animations, multimedia features, and advanced menu options. In fact, Macromedia Director Param Singh states that "Macromedia Flash content is included in nine of the top ten most visited websites."

Up until now, it's been difficult to get Flash content indexed. We've advised our clients to use Flash along with HTML text links to make a page crawlable by search engine spiders. This also makes Flash pages readable by users who don't have a Flash plug-in, as many users are still resistant to downloading plug-ins.

While a few search engines (like Google) include some Flash sites in their databases, these pages are not regularly indexed by search engine crawlers. So the bulk of Flash pages do not appear in search engine databases unless they also include optimized HTML.

Expanded visibility for Flash sites is good news for designers and users alike because it means that AllTheWeb, Lycos, InfoSpace, and other partner sites using FAST technology can readily display links to a cache of Flash content that wasn't available to users before.

Advanced Search Options

AllTheWeb alone serves over 100 million users per month. Its users can now further refine their searches in Macromedia Flash content and applications through the use of the site's Advanced Features section.

There are a couple of new options in AllTheWeb's Advanced Features section:

  • A new Embedded Content option where you can specify whether to include or exclude embedded content in such files as: Images, Audio, Video, RealVideo & RealAudio, Macromedia Flash, Java applets, JavaScript, and VBScript.

  • A new Result Restrictions option where you can specify the document depth at the directory level when querying your search terms. That is, you can ask for search terms that appear at the home page level and above or below, all the way to the 10th level directory of the site architecture. Alternatively, you can search "all document depths."
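
To make the idea of document depth concrete, here is a minimal sketch of how a directory level could be computed from a URL. FAST has not published how it counts depth, so the rule below (home page level is 0, each directory adds one) and the function names `directory_depth` and `within_depth` are assumptions for illustration only.

```python
from urllib.parse import urlsplit

def directory_depth(url):
    """Count directory levels below the site root (assumed rule, not FAST's)."""
    path = urlsplit(url).path
    segments = [s for s in path.split("/") if s]
    # Treat a trailing segment containing a dot as a file name, not a directory.
    if segments and "." in segments[-1] and not path.endswith("/"):
        segments.pop()
    return len(segments)

def within_depth(urls, max_depth):
    """Keep only the URLs at or above the requested directory level."""
    return [u for u in urls if directory_depth(u) <= max_depth]

urls = [
    "http://www.example.com/index.html",
    "http://www.example.com/products/",
    "http://www.example.com/products/flash/demo.swf",
]
shallow = within_depth(urls, 1)
```

With these sample URLs, a restriction to depth 1 keeps the home page and the products directory and drops the Flash file two levels down.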

"This new, advanced search feature from FAST presents a winning solution for the millions of people looking to perform highly-specialized searches for relevant information contained with Macromedia Flash content on the millions of sites which include our technology," said Macromedia's Singh. It seems to me that these new advanced search options can certainly improve the user experience -- provided that ordinary users become aware of them and use them.

Flash Q&A

When we first heard about this new functionality in our office, Operations Manager Nicole Falsey wasn't sure she got it right. So she queried Peter Gorman, Director, Corporate Communications for Fast Search & Transfer, who answered her question thusly.

Question: Is FAST able to go into a site's embedded .swf files to crawl and index the text within that file just like search engines do with HTML documents?

Answer: Yes, FAST crawls links as they appear within the document and treats Flash files like HTML when converted. FAST uses the Flash Search Engine SDK, which basically converts the Flash app into an HTML file. Click here for detailed information on the Flash Search Engine SDK.

Nicole immediately downloaded the Flash Search Engine SDK from the site and was able to convert .swf files into HTML. Not only does this tool allow FAST to index Flash .swf content, it may allow us to analyze our clients' Flash files and assist them in optimizing their sites. This will become important as other search engines follow FAST's lead in indexing Flash content.

Converting .swf Files to HTML

Once she downloaded the Macromedia app, Nicole ran it on a couple of .swf Flash files downloaded from Macromedia's web site. Below is a screen shot of one of those files so you can see the links and text that are viewable through a browser.

Now, when you run the swf2html application from a DOS prompt and type in the command to convert the .swf file to HTML, you get the following, which shows that there are other .swf files embedded within the first .swf file, plus text links and regular text in the first .swf file. Since many .swf files are embedded in each other, we're assuming FAST would just keep converting all of them until it finished and had a completely indexed site.
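
Our assumption that a crawler keeps converting embedded files until none are left can be sketched as a simple queue-based traversal. This is an illustration only, not FAST's crawler: the `SAMPLE_SITE` dictionary is a hypothetical stand-in for converter output (it is not real swf2html output), and `convert_all` is a name invented here.

```python
from collections import deque

# Hypothetical converter results: each .swf maps to the text found in it and
# the other .swf files embedded in it. Real converter output will differ.
SAMPLE_SITE = {
    "intro.swf": {"text": "Welcome to our site", "embedded": ["menu.swf", "news.swf"]},
    "menu.swf": {"text": "Products Services Contact", "embedded": ["news.swf"]},
    "news.swf": {"text": "Latest announcements", "embedded": ["intro.swf"]},
}

def convert_all(start, convert):
    """Convert `start` and every .swf reachable from it, each exactly once."""
    seen = {start}
    queue = deque([start])
    pages = {}
    while queue:
        name = queue.popleft()
        result = convert(name)
        pages[name] = result["text"]
        for child in result["embedded"]:
            # Skip files already queued so circular embedding cannot loop forever.
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return pages

pages = convert_all("intro.swf", SAMPLE_SITE.__getitem__)
```

The `seen` set matters because Flash files can reference each other in a circle, as `news.swf` and `intro.swf` do in the sample data.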

We wondered how FAST pulls the title and description for the search engine listings for Flash pages. We assumed these elements would come from the original HTML document's <TITLE> and meta description tags, with any Flash on the page converted to HTML and added to the index alongside the HTML page content.

But what if the Flash site is not embedded in an HTML document and is just a .swf file? Where will FAST get its listing title and description?

We got answers from FAST engineer Rolf Michelsen, who confirmed our assumptions were correct. "When the Flash file is embedded as part of a HTML document, we use the document title and various heuristics to extract title and teaser for our search results. The heuristics for extracting a teaser may use the meta description tag if present," said Michelsen.

"When indexing a stand-alone Flash file, we extract title and teaser directly from the Flash file -- basically trying to compose a teaser from the first few sentences of text extracted from the Flash file," concluded Michelsen.

So there you have it from the horse's mouth. This makes the critical point that designers should always embed their .swf Flash files in an HTML document and add <TITLE>, meta description, and meta keywords tags to ensure their Flash-based pages are indexed in the search engines with a title and description they approve of. Otherwise, the listing may show "No Title," and the description will be the first text indexed in the file, which may not be advantageous.
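
The two cases Michelsen describes can be sketched roughly as follows. This is not FAST's code: the sample page, the Flash text, the two-sentence teaser length, and the names `HeadParser`, `first_sentences`, and `make_listing` are all assumptions made for illustration, as is the "No Title" fallback for a stand-alone file.

```python
from html.parser import HTMLParser

class HeadParser(HTMLParser):
    """Collect the <TITLE> text and the meta description from an HTML page."""
    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)  # the parser lowercases tag and attribute names
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.description = attrs.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data

def first_sentences(text, count=2):
    """Build a teaser from the first few sentences (naive split on periods)."""
    parts = [p.strip() for p in text.split(".") if p.strip()]
    return ". ".join(parts[:count]) + "." if parts else ""

def make_listing(flash_text, wrapper_html=None):
    teaser = first_sentences(flash_text)
    if wrapper_html is None:
        # Stand-alone .swf: only the Flash text is available (assumed fallback).
        return {"title": "No Title", "teaser": teaser}
    parser = HeadParser()
    parser.feed(wrapper_html)
    return {"title": parser.title.strip() or "No Title",
            "teaser": parser.description or teaser}

WRAPPER = """<HTML><HEAD>
<TITLE>Acme Widgets - Product Tour</TITLE>
<META NAME="description" CONTENT="Take an animated tour of the Acme widget line.">
</HEAD><BODY>
<EMBED SRC="tour.swf" WIDTH="550" HEIGHT="400">
</BODY></HTML>"""
FLASH_TEXT = "Loading. Click to skip intro. Welcome to Acme."

embedded = make_listing(FLASH_TEXT, WRAPPER)
standalone = make_listing(FLASH_TEXT)
```

In this sample, the embedded version gets the title and description the designer wrote, while the stand-alone version is listed with a teaser that begins "Loading. Click to skip intro." That is exactly the kind of listing a designer would not choose.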
September 18, 2002

Paul J. Bruemmer has provided search engine marketing expertise and consulting services to prominent American businesses since 1995. As Director of Search Marketing at Red Door Interactive, he is responsible for strategizing and implementing search engine marketing activities within Red Door's Internet Presence Management (IPM) services.

