This week’s non-issue: Google indexes SWF files

It’s probably the number one topic on Flash blogs today: O’Reilly launches InsideRIA and has an article about how Google indexes SWF files. As usual when it comes to SEO and Flash, almost no one understands what it means, but almost everyone talks about it as the New Big Thing. Guess what? It’s not. I’ve said quite a few things on the subject (try the Search Engine Optimization category in the list to the right), but I think Geoff Stearns has put it best:

It doesn’t matter.

From “A modern approach to Flash SEO” on the Deconcept blog.

That is the bottom line. It doesn’t matter if Google can index the SWF, because it won’t do you any good.

It is interesting that Google now uses Adobe’s Search Engine SDK, not because it will change anything, but because of why they switched. I have commented on Google’s approach to SWF indexing before (Google and Flash) and found that they don’t understand the problem; this is just further proof of that.

The article that started it all has this to say about Google’s new approach:

This is great news for RIAs and especially Flash/Flex folks who care about search engine ranking.

and

This is also good because we can avoid techniques that could look like cloaking to Googlebot which will hurt your search engine listings.

Both quotes from Google’s Indexing Flash Text with Adobe’s SDK on InsideRIA.

These two statements are false. Firstly, there is no good news here: Google has changed from its home-brewed, mistargeted effort to Adobe’s mistargeted effort. It will do almost nothing for Flash site rankings, and absolutely nothing for Flex applications. Secondly, it will have no effect on the techniques currently employed to achieve indexable Flash sites, because the cloaking technique is the only technique that works. More to the point, the SWF indexing technique does not work, and cannot work for anything but the simplest SWF files.

The reasons are very simple:

  • Google doesn’t know about the majority of SWF files. Because the overwhelming majority of Flash and Flex sites and applications embed the SWF using JavaScript of some sort (SWFObject or Adobe’s embed code, for example) Google will not even find them. Because progressive enhancement/graceful degradation is the best approach to SEO and accessibility, and because of the issues with IE and lack of a standard way to embed Flash, this is not going to change.
  • SWF files have no semantic structure. Even if Google can extract text from SWF files, how could it ever know in which order the text appears? How does it know that the text is not only a placeholder that will have parts of it replaced when the application runs? How can it associate a header with a body? These are only a few of the issues that make the data in a SWF unusable without actually running the application.
  • SWF files contain very little content. Only the simplest Flash sites contain static text. Most load their content from XML files on the server, or at least dynamically generate or change text fields. Unless the SWF is actually run there is no way of knowing how it will appear to the user.
  • SWF files load other SWF files. A not uncommon technique is to have a bootstrap SWF that loads a bigger SWF containing the actual site. Once again: without actually running the application there is no way to know how the application appears to the user.
  • Any application that requires the user to log in cannot be indexed. Most Flex applications are just that: applications. They let users work with their data, data which is not public and should never ever be indexed. It’s not like Google is allowed past the login screen on Facebook. Asking that Google index a Flex application is like asking Google Desktop to index Flex Builder.
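
The first point can be illustrated with a minimal sketch, assuming a crawler that parses HTML but does not execute JavaScript. The page, the file name site.swf and the NaiveCrawler class are all made up for illustration, and this is not a description of how Googlebot works. It only shows that a SWF embedded by script never appears in the markup such a crawler reads, while the fallback HTML does.

```python
from html.parser import HTMLParser

# Hypothetical page: the SWF is embedded by a script, and the div holds
# the fallback HTML content (progressive enhancement).
PAGE = """
<html><body>
<div id="content"><h1>Products</h1><p>Our product list.</p></div>
<script type="text/javascript">
  swfobject.embedSWF("site.swf", "content", "800", "600", "9.0.0");
</script>
</body></html>
"""


class NaiveCrawler(HTMLParser):
    """Collects what a crawler that does not run JavaScript would see."""

    def __init__(self):
        super().__init__()
        self.swf_urls = []       # SWFs referenced directly in the markup
        self.text = []           # text found outside <script> elements
        self._in_script = False

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self._in_script = True
        # Look for SWFs in <object data>, <embed src> or <param value>.
        for name, value in attrs:
            if name in ("data", "src", "value") and value and value.endswith(".swf"):
                self.swf_urls.append(value)

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_script = False

    def handle_data(self, data):
        # Script source is never executed, so it is simply skipped.
        if not self._in_script and data.strip():
            self.text.append(data.strip())


def crawl(html):
    """Return (swf_urls, text) as seen without running any JavaScript."""
    crawler = NaiveCrawler()
    crawler.feed(html)
    crawler.close()
    return crawler.swf_urls, crawler.text
```

For this page, crawl(PAGE) finds no SWF URLs at all and returns only the fallback text, which is exactly the content progressive enhancement puts there for search engines.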

In short: without running the SWF file there is no way of knowing how it will appear to the user. It cannot be known which text appears (or if it appears at all), nor in what order or in what context. But since Google can’t find the SWF files to start with, it shouldn’t bother with those questions and should instead work on something useful, like how to make progressive enhancement/graceful degradation easier and more Google-compatible. For more discussion about these issues, read my previous article on the subject: Google and Flash.
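
The gap between the text stored in a file and the text a user sees can be shown with a toy model. Everything here is an assumption made for illustration: the field names, the placeholder strings and the XML are invented, and nothing in it reads a real SWF file or uses Adobe’s SDK.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the text fields stored inside a SWF file.
STATIC_FIELDS = {"title": "Title goes here", "body": "Loading..."}

# Hypothetical XML that the application loads from the server at runtime.
CONTENT_XML = (
    "<page>"
    "<title>Spring catalogue</title>"
    "<body>New arrivals this week.</body>"
    "</page>"
)


def static_extract(fields):
    """What a tool gets by reading text out of the file without running it.

    The file carries no reading order, so the strings are simply sorted
    here to make the result deterministic.
    """
    return sorted(fields.values())


def run_application(fields, xml_source):
    """What the user sees: each field replaced by the loaded content."""
    page = ET.fromstring(xml_source)
    rendered = dict(fields)
    for element in page:
        if element.tag in rendered:
            rendered[element.tag] = element.text
    return [rendered["title"], rendered["body"]]
```

In this model, static extraction yields only the placeholders, and none of the text the user actually reads appears in it.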

I really hope that the article is not a measure of the future quality of InsideRIA, because this just shows that the author understands as little as Google about the subject.

6 Responses to “This week’s non-issue: Google indexes SWF files”

  1. Ahmet Says:

    I agree with all the points you just mentioned, apart from the “It doesn’t matter”. It does matter to some people, considering that one of the worst criticisms of SWF content is the lack of indexing and accessibility. We should really work out a solution to offer more discoverability to multimedia content (http://www.technologyreview.com/Infotech/18847/), whether it be text, images or videos.

    Accepting the present situation is just not the way to solve this, is it?

  2. Theo Says:

    There is a perfectly good solution: progressive enhancement/graceful degradation. Talking about indexing SWFs only diverts attention from that.

  3. zedia.net Says:

    Is it just me, or have we known for a long time that Google was indexing only static text in Flash files? If I knew it, I surely wasn’t the only one. I don’t understand this whole buzz lately.

  4. Theo Says:

    RTFA: Google has changed their method of reading static text.

    As if that would change anything.

  5. laflex.org » Blog Archive » Google Indexing Flash Text with Adobe Search SDK Says:

    [...] Check out Theo Hultberg’s post on Iconara about why this is a non-issue and, in fact, why it is the wrong approach to indexing Flash [...]

  6. » Flash et Google : les SWF pas si indexables que ça  Tendances Web, CSS, Ergonomie, Marketing Internet - Blog de Smile Interactive Says:

    [...] the skeptical specialists began to think about the subject. And disenchantment set in: none of this has anything [...]
