A List Apart

Issue № 115

Much Ado About Smart Tags

Published in Browsers, Industry

We believe in total empowerment of the user to decide what content they want to look at.
Microsoft Product Manager Shawn Sanford, as quoted by NewsBytes
Microsoft thinks they can improve my writing. This makes me want to get a gun and go to war.
Dave Winer, Scripting.com
Smart Tags can be developed by anybody, are completely under the user’s control, and can do some very useful things.
Executive Editor David Coursey, writing for ZDNet AnchorDesk
This is exactly like what Microsoft did in the past…leveraging what they have on the desktop into another market…
Gartner analyst Michael Silver, as quoted by News.com

The dustup surrounding Microsoft’s new smart tags rivals those touched off by Hailstorm and the Pentium III identification number. The Wall Street Journal’s Walter Mossberg charges that extending the smart tags to Internet Explorer 6 effectively gives Microsoft the ability to edit any page on the web without the author’s knowledge. SiliconValley.com’s Dan Gillmor is concerned that smart tags may be amenable to nefarious uses, such as covert user tracking. Microsoft and its supporters, like ZDNet’s David Coursey, insist that smart tags offer a great deal of utility, and that users can turn them off tag-by-tag or even disable the technology altogether if they wish. Meanwhile, many users and webheads would prefer to let the market decide while they get back to debating the virtues of WYSIWYGs vs. hand-coding, or whether Tom Cruise is gay.

The free market isn’t

As tempting as the last option is, it isn’t prudent. Microsoft owns over 80% of the market for desktop operating systems, browsers and office suites. If we “wait and see,” the technology will be ubiquitous by the time we reach a conclusion.

Some have argued that users can switch software if they don’t like smart tags, but switching operating systems isn’t practical for most people. Changing to Mac OS requires an expensive hardware purchase, and installing Linux isn’t yet something the average user can accomplish unassisted. Even if it were, many – if not most – businesses use software that runs only on Windows. If your job requires the full functionality of Access, Exchange, Microsoft Project or Visio, you won’t be able to do it on anything but a Windows PC.

Switching browsers is a dubious proposition as well. ALA has documented the troubles with Netscape Navigator 4.x, and Netscape 6 is still not stable enough for day-to-day surfing. Even if it were, most users are still stuck on a 56 Kbps modem, so downloading and installing a new browser is time-consuming and intimidating. Opera 5 includes ads in the free version; to eliminate them, users must pay – something they are unlikely to do when Internet Explorer remains free. Moreover, many sites the average user relies on function erratically in both Netscape 6 and Opera 5. While that wouldn’t be the case if all developers followed the W3C standards, the fact is they don’t and Jane Consumer doesn’t care why. She just wants to use her online banking service today.

In any event, switching browsers only eliminates the smart tags from web pages. The technology will be built directly into Windows XP. Even when used in Office applications, if the user is connected to the Internet, smart tags can contact an external server when the user clicks on them. Doing so provides an opportunity for user tracking or the download of additional code without the user’s knowledge. Moreover, while smart tags are currently only enabled in Office XP and Internet Explorer 6, there is nothing stopping Microsoft from enabling them throughout Windows, perhaps even in third party applications.

Just not upgrading isn’t an option, either. Even users who decide to eschew Office XP, Internet Explorer 6 and Windows XP will eventually be forced to upgrade when they buy a new computer or when other applications decide to stop supporting “outdated” versions of Windows. This is a serious concern for those using Windows NT, 9x and ME since all future Windows releases will be based on significantly different Windows 2000 code.

If smart tags are a Bad Thing, then they should be stopped sooner rather than later. As Bruce Tognazzini points out in his article about an unwanted “upgrade” to the Replay TV service, the advent of time-limited licenses and on-the-fly software updates is making software upgrades mandatory. If we want to keep the camel out of the tent, we had best kick his nose out now. The question is, might we be better off having a camel around to fetch our newspaper?

Slicker than snot on a doorknob

In Office XP (the only smart tag implementation currently in general release), smart tags are a pretty hip idea. A “smart tag” consists of two things: a list of words that it will recognize (a smart tag “recognizer”) and a list of actions that the user can invoke when the recognizer “recognizes” one of those words.

For example, a smart tag from Adobe might include a list of words like: “Photoshop, Illustrator, InDesign, AfterEffects.” If you type “Photoshop” into Word 2002 (the version of Word included in Office XP) or open a document with the word “Photoshop” in it, the smart tag engine will match it with the Adobe smart tag recognizer’s list and paint a dotted purple underline below the word. Rest your cursor over the underlined word and a button will appear (the smart tag icon), hovering over the page next to the word. Click on the button, and a “smart tag action menu” appears. The smart tag action menu contains a list of custom actions defined by smart tags that recognize the word in addition to the standard “Smart Tag Options…” and “Remove this Smart Tag” actions. A custom action could launch Adobe’s website in a new instance of Internet Explorer, it could launch Photoshop or it could get a quote for Adobe’s stock price from whatever source the smart tag author chose.
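The recognizer-plus-actions pairing can be pictured as a small data structure. What follows is a minimal Python sketch, not Microsoft's actual smart tag API: the `SmartTag` class, its fields, the `action_menu` helper and the placeholder URL are all hypothetical names I chose for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class SmartTag:
    """Hypothetical model: a term list (the "recognizer") plus named actions."""
    name: str
    terms: set[str]
    actions: dict[str, str] = field(default_factory=dict)  # caption -> target

    def recognizes(self, word: str) -> bool:
        # Case-insensitive match against the term list.
        return word.lower() in {t.lower() for t in self.terms}


def action_menu(word: str, installed: list[SmartTag]) -> list[str]:
    """Collect the custom actions of every installed tag that recognizes
    `word`, then append the two standard entries the article mentions."""
    menu = []
    for tag in installed:
        if tag.recognizes(word):
            menu.extend(f"{tag.name}: {caption}" for caption in tag.actions)
    if menu:
        menu += ["Remove this Smart Tag", "Smart Tag Options..."]
    return menu


adobe = SmartTag(
    name="Adobe",
    terms={"Photoshop", "Illustrator", "InDesign", "AfterEffects"},
    actions={"Visit website": "https://example.com/adobe"},  # placeholder URL
)

print(action_menu("Photoshop", [adobe]))  # custom action plus standard entries
print(action_menu("widget", [adobe]))     # unrecognized word: empty menu
```

Note that nothing in the menu itself has to say where an action came from; in this sketch I prefix each caption with the tag's name, which is a design choice of mine rather than something the source describes.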

Doesn’t sound terribly useful, does it? It probably isn’t, but smart tags can do much more. Smart tags can be set up to recognize words that match an expression, rather than a simple list. For example, the Lexis-Nexis smart tag recognizes the names of court cases, like Roe v. Wade. If you type that text into Word 2002, and have the Lexis-Nexis smart tag installed and enabled, a smart tag will offer the option to turn the text Roe v. Wade into a full legal citation: Roe v. Wade, 410 U.S. 113, 93 S.Ct. 705, 35 L.Ed.2d 147 (1973). As a former law student, I can attest to the value of something like this.
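To make the idea of an expression-based recognizer concrete, here is a rough Python sketch using a regular expression. It is not how the Lexis-Nexis smart tag works (I have no view into its code); the `CASE_NAME` pattern and the `find_case_names` helper are hypothetical, and the pattern is deliberately naive.

```python
import re

# Hypothetical pattern: "<Party> v. <Party>", where each party is one or more
# capitalized words. Real case names are far messier than this.
PARTY = r"[A-Z][a-z]+(?: [A-Z][a-z]+)*"
CASE_NAME = re.compile(rf"\b{PARTY} v\. {PARTY}")


def find_case_names(text: str) -> list[str]:
    """Return every substring that looks like a court case name."""
    return [m.group(0) for m in CASE_NAME.finditer(text)]


sample = "The court relied on Roe v. Wade and later on Miranda v. Arizona."
print(find_case_names(sample))  # ['Roe v. Wade', 'Miranda v. Arizona']
```

Even this toy version shows why pattern matching is both more powerful and more error-prone than a word list: a capitalized word sitting directly before the case name (as in "In Roe v. Wade") would be swallowed into the match.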

Better yet, smart tags can be coded to automate custom applications. You could, for example, define a smart tag to recognize the phrase “annual widget sales.” If a user types the phrase “annual widget sales” (or opens a document containing that phrase) and invokes the smart tag’s action, the smart tag could query the company database and insert the number of widgets your company has sold so far this year.
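The widget scenario might look something like the following Python sketch. Everything here is assumed for the sake of illustration: the in-memory SQLite table stands in for the company database, the sales figures are made up, and `widget_sales_action` is a hypothetical name. A real smart tag of this kind would be compiled code talking to Office, which this sketch does not attempt to reproduce.

```python
import sqlite3

PHRASE = "annual widget sales"


def widget_sales_action(conn: sqlite3.Connection, year: int) -> str:
    """Hypothetical action: total the units sold in `year` and return
    the text that would replace the recognized phrase."""
    (total,) = conn.execute(
        "SELECT COALESCE(SUM(units), 0) FROM sales WHERE year = ?", (year,)
    ).fetchone()
    return f"{PHRASE}: {total}"


# Stand-in for the company database, with made-up figures.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (year INTEGER, units INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)", [(2001, 120), (2001, 80), (2000, 500)]
)

document = "Please review annual widget sales before Friday."
if PHRASE in document:  # the "recognizer" step
    document = document.replace(PHRASE, widget_sales_action(conn, 2001))
print(document)  # Please review annual widget sales: 200 before Friday.
```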

Microsoft already has a smart tag that recognizes proper names, like “Larry Love,” and can insert them into your Outlook address book. You could take it a step further and create one that inserts such names into your company’s corporate contact database. The possibilities are almost limitless.

You can read more about how to use smart tags at the Microsoft Office Help site, take a look at some third party smart tags at the Microsoft Office Smart Tag site, or see a Flash demo of smart tags (after launching the demo, select “Simplifying Productivity” and then click on square number 2).

Where’s the beef?

So long as smart tags are confined to Office, things appear fine (though there may still be cause for concern; more on that later). However, Microsoft has added smart tag recognition to Internet Explorer 6 as well. If you have a smart tag on your system that recognizes the word “office” and offers a link to Microsoft’s Office site, and you are using Internet Explorer 6 with smart tags enabled, every time you view a page containing the word “office” the smart tag engine will add the purple dots. Hover over them and click the smart tag button, and the smart tag will helpfully offer to launch the Microsoft Office site in a new instance of Internet Explorer.

As Paul Thurrott of Paul Thurrott’s WinInfo found out, this sort of thing can get old in a hurry. While typing up his review of Office XP (using Word 2002, of course), Mr. Thurrott typed the word “nice.” Up popped a smart tag offering to book a flight to Nice, France using Microsoft’s Expedia website. When he typed the word “long,” up popped a smart tag from ESPN offering more information on Oakland Athletics centerfielder Terrence Long. As Thurrott put it, “Folks, this is lame.”

Robin Gross, attorney for the Electronic Frontier Foundation, told NewsBytes it’s more than lame, it may be illegal. According to Ms. Gross, adding links to web pages may create a derivative work by modifying the content of the original. Doing so without permission violates a site owner’s copyright.

Even if courts decide that smart tags don’t cross the derivative work threshold, Microsoft may still be on the wrong side of deceptive trade regulations. If the links appear to be placed on the site by the author, users could think that the author is endorsing services or points of view that were actually linked by Microsoft without the author’s knowledge. If they do, Microsoft may be held legally accountable.

That may sound a bit far-fetched, but keep in mind that while you or I may discern dotted purple smart tags from garden variety hyperlinks, most users may not. I teach basic Internet skills at the New York Public Library. Most of my students have trouble with the distinction between the operating system and an application, let alone the distinction between an underline created by the browser because the author added a link and one created by a smart tag. These people aren’t dumb by any stretch of the imagination. They are simply grappling with unfamiliar technology.

Over there

Even if U.S. courts find smart tags acceptable, courts in other countries may not. In a post to Arts & Farces, Michael Fraase says smart tags may run afoul of moral rights.

Moral rights are a bundle of rights that remain with the author of a creative work even after she sells the copyright to it. The concept originated in France, where it is believed that a creative work necessarily incorporates the character of the author. Unlike copyrights, which are a property right that can be transferred, moral rights are intrinsic to the author and her work. They cannot be transferred.

Smart tags, says Fraase, may well violate the moral right of integrity. This right prohibits the distortion of a work in a way that would damage the reputation of the author. Since smart tags could link to anything from advertising to extremist points of view, they are quite likely to damage the reputation of an author insomuch as the user associates those links with the original author of the page they appear on.

While another moral right – the right to respond to criticism – might be seen as permitting smart tags as criticism or responses to such, this reasoning is flawed. Smart tags do not necessarily contain criticism. Moreover, the right to respond to criticism extends to the author to reply to criticism in the same forum where the criticism was leveled. Since the author has no way of knowing which smart tags appear on his page, he has no way to respond to the “criticism” of his work. While smart tags might allow an author to link his response from a critical page, there would be no way to ensure that readers of the page saw the response unless the owner of the critical page embedded the smart tag in that page. Even so, readers using other browsers or browsing with smart tags disabled would not see the response.

Caesar, there’s a barbarian at the gate; he’s here to sack the city

Even if smart tags don’t violate copyright or deceptive trade laws, they still violate the integrity of the web. Part of the appeal of the web is that it allows anyone to publish anything, to take their thoughts, feelings and opinions and put them before the world with no censors or marketroids in the way. By adding smart tags to web pages, Microsoft is interposing itself between authors and their audience. Microsoft told Walter Mossberg “the feature will spare users from ‘under-linked’ sites.” Microsoft is in effect deciding how authors should write, and how developers should build, websites.

Worse, Microsoft’s decisions may be at odds with the intent of the site’s author or developer. If an Internet Explorer 6 user visits Travelocity and looks at a page with information on visiting Nice, France, the smart tag that aggravated Thurrott will link the word “Nice” to Microsoft’s Expedia site. With smart tags, Microsoft is able to insert their ads right into competitors’ sites.

Microsoft is crossing the Rubicon of journalistic and artistic integrity. Editors and authors no longer have final authority over what their sites say; Microsoft and its partners do. For a preview of what the web may look like for Internet Explorer 6 users who also have Office XP or Windows XP installed, take a look at InteractiveWeek’s Connie Guglielmo’s preview. With smart tags, Microsoft is effectively extending its role from being a supplier of tools people use to view content to being the executive editor and creative director of every site on the web.

Where does Microsoft want your audience to go today?

Aside from issues of integrity, smart tags might even impact web developers’ livelihoods. How anxious will your clients be to spend thousands of dollars on a website that will be peppered with ads for competing products from Microsoft and its partners? How do you, as a developer or designer, feel about spending days, even weeks, building a site that looks just so, only to have Microsoft pepper it with dotted purple links? By usurping our control over our websites, Microsoft is abusing its monopoly power.

Looking down the road a bit, how long before Microsoft decides to extend that power even further? Pocket PC and Ultimate TV, Microsoft’s platforms for PDAs and set-top boxes respectively, are gaining steam. What happens if Microsoft extends smart tags to them? Add voice and image recognition technology and Microsoft will be able to insert ads into audio and video streams, MP3s, TV shows, even your own DVDs and CDs.

When I hear Alabama3’s Rev. D. Wayne Love sing that he “ain’t goin’ to Goa,” an ad for plane tickets to India from Expedia might pop up on the screen of my iPaq – not at all what the good Reverend had in mind, I’m sure. Nor would he ever see dime one for having his lyric turned into an ad, even if I ignored his advice and bought a trip to Goa.

Microsoft could even extend the technology to your desktop. Name a file resume.doc and Microsoft could make it a smart tag linking to HotJobs, and collect a nice fee for the referral.

Open is as open does

Microsoft’s response is that they’ve made smart tags an open standard. Anyone with a Windows PC and Office XP can develop them. In fact, if you only want to create an XML-based smart tag you don’t even need those. Any text editor will do.

That’s well and good, but smart tags must be downloaded to a user’s hard drive, much like a plug-in for a web browser. Also like browser plug-ins, companies that have access to users’ hard drives will have no trouble distributing their smart tags. Companies that build PCs, like Compaq or Sony, can install them when they install the operating system and bundled software; companies that sell software, like Adobe or Macromedia, can install their smart tags when the users install the companies’ software; online services and ISPs, like AOL or AT&T, can do the same. Other companies will run into the same sort of trouble that plug-in developers have had: users don’t want to wait for the download, or don’t trust the code they’re being asked to install.

The same large companies that currently control your TV, cell phone and PC desktop will have a major advantage in exploiting smart tags. Smaller companies or companies that don’t routinely have access to users’ hard drives will be stuck trying to persuade consumers to wait around for 100+ KB downloads. At a time when 56 Kbps modems are still the most widespread connection method, that isn’t an easy task.

Worse, smart tags can be executable code. The compelling ones, like the Lexis-Nexis smart tag I described earlier, almost all are. That means a smart tag presents the same security risks that an ActiveX control does; they can be spyware, trojan horses or even destructive viruses. How anxious would you be to download Ma & Pa Beasley’s Super Smart Tag if you’ve never heard of Ma & Pa Beasley? Even if the thing would defrag my hard drive, sort and answer my e-mail and regrow my thinning hair, I wouldn’t touch it. Ma & Pa might be fine people, but how am I to know?

Even if a smaller company can develop something appealing enough to entice users to wait, and convince them that the code is safe to download, larger companies are almost certain to take the idea and put an end to the smaller company’s success. The reality is, if a company wants their smart tag widely distributed, they’ll have to pay Microsoft or some other company with access to users’ hard drives for the privilege.

A hollow victory

Decentralization listserv member Robert Scoble has informed that list that Microsoft has backed away from their original smart tag implementation. Among other items, Microsoft has promised that they won’t ship any smart tags “in the box.”

Putting aside for a moment the fact that, as Gillmor points out, the source for this information is an anonymous Microsoft employee rather than an official statement, it is still scant comfort. For one thing, it begs the question: which box? The “box” Internet Explorer 6 comes in? That still leaves Windows XP and Office XP – both products with monopoly-level market share – for Microsoft to load up with smart tags. Since smart tags are shared among applications, that means Microsoft still has their smart tags on a majority of desktops.

Even if Microsoft removes all default smart tags from Windows XP, Internet Explorer 6 and future copies of Office XP, they still have a back door that other companies won’t: the “More Smart Tags” button. In Office, if a user clicks this button they are presented with a list of smart tags offered by Microsoft, some of which are from Microsoft itself and others from their partners. That still constitutes a significant advantage over other companies and developers who do not have their smart tags listed right in the Office interface. Whether a similar button will be present in Internet Explorer 6 is unclear, but even if it is not the monopoly market share held by Office XP still constitutes an unfair advantage for Microsoft.

Microsoft will likely counter that the source for the “More Smart Tags” button can be changed. This is true, but their documentation indicates it is the network administrator who must change it. While a consumer OS like Windows XP may allow end-users to change their own “More Smart Tags” source, the fact that this activity requires an administrator in commercial versions indicates that it will be anything but an easy, intuitive process. Moreover, most users rarely change most system defaults. Having the default “More Smart Tags” list come from Microsoft is a major advantage any way you slice it.

NOOP out

Microsoft has attempted to allay the fears of designers, authors and developers by assuring us there will be a meta tag to disable smart tags on any page in which it is included. After I spent days searching in vain for the tag syntax, Scoble pointed me to a page containing the Internet Explorer 6 FAQ. The tag looks like this:

<meta name="MSSmartTagsPreventParsing" content="TRUE">

So you just put the tag on every page you code and the problem is solved, right? Wrong. Even if the tag worked as advertised, Gross told NewsBytes it wouldn’t affect the legal issues with smart tags. By requiring site owners to write a meta tag to disable smart tags, Gross says, Microsoft is unfairly putting the burden on site owners to prevent their copyright from being violated.
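To get a feel for that burden, consider what it takes just to confirm that every page on a site carries the opt-out. The Python sketch below checks HTML for the meta tag using the standard library's `html.parser`. It assumes the opt-out takes the form of a `meta` element named `MSSmartTagsPreventParsing` with a `content` value of `TRUE`; the `OptOutFinder` class, the `has_opt_out` helper and the sample pages are hypothetical.

```python
from html.parser import HTMLParser


class OptOutFinder(HTMLParser):
    """Records whether a page carries the smart tag opt-out meta tag."""

    def __init__(self) -> None:
        super().__init__()
        self.opted_out = False

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        # Assumption: the opt-out requires content="TRUE" (case-insensitive).
        if (attr.get("name", "").lower() == "mssmarttagspreventparsing"
                and attr.get("content", "").lower() == "true"):
            self.opted_out = True


def has_opt_out(html: str) -> bool:
    finder = OptOutFinder()
    finder.feed(html)
    return finder.opted_out


# Made-up pages standing in for a real site.
pages = {
    "index.html": '<html><head><meta name="MSSmartTagsPreventParsing" '
                  'content="TRUE"></head><body>Home</body></html>',
    "about.html": "<html><head><title>About</title></head></html>",
}
missing = [name for name, html in pages.items() if not has_opt_out(html)]
print(missing)  # ['about.html']
```

A check like this has to be rerun every time a page is added or a template changes, which is rather the point Gross is making.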

The tag may not even work. Sjoerd Visscher claims to have found an option in Internet Explorer to always display smart tags on web pages, presumably even if an author includes the meta tag on his site.

The option may or may not be on by default – if it is even there to begin with. Scoble’s anonymous source has assured him that this option will not be in the shipping version of Internet Explorer. Even if the source proves inaccurate, if the feature is off to begin with, it is unlikely that web users who often don’t even know how to change their screen resolution will find and check it. However, there is nothing stopping Microsoft from adding the override feature back in and turning it on by default in a future release, or even after the installation of a service pack, security patch or hotfix.

The users have us right where Microsoft wants them

Microsoft is quick to point out that users are in complete control of the smart tags on their systems. Users can disable the smart tags engine entirely, or disable individual tags. As vice president for Windows XP development Chris Jones told the Wall Street Journal, “Smart Tags represent another step in personalizing the web and helping bring it to life for individuals by allowing them to get the information they want in the way they want it.”

Microsoft would like to paint this as an issue of personal freedom; the freedom of surfers to choose how to view the web. One Microsoft employee on the Decentralization listserv likens smart tags to carrying around a copy of Consumer Reports while shopping. He tries to paint the issue as a power struggle between publishers and their audience: publishers trying to control what users see, users trying to get more information.

That’s a nice story, but it simply doesn’t wash. Eliminating smart tags in no way prevents users from gaining access to alternative sources of information. In fact, the real problem with smart tags isn’t the information they present at all. It’s the way that information is presented that is the problem.

The Microsoft employee on the Decentralization list likens smart tags to walking around a store with the owner of a competing store – whom you invited along – whispering in your ear. The trouble with that analogy is that if you’ve invited someone along to whisper in your ear, you know the information they whisper is coming from them, not from the owner of the store you’re in. With smart tags, the source of the information is unclear. In fact, it isn’t even clear that the user has invited this particular person to whisper in their ear. A more accurate analogy would be if the competing store owner went around and placed official-looking labels on every package detailing how much cheaper or better the products in their competing store were before you arrived. Once you get to the store, how are you to know it wasn’t the owner of the store you are in who placed the labels on the packages?

In fact, Microsoft told Mossberg that the tags would be disabled by default, and Scoble’s source has reiterated that intent. Ordinarily I’d accept that, but IT industry ‘zine The Register found smart tags on by default in one recent build of Windows XP. While Microsoft claims that this was done in order to get more testers to use the code so they could get more feedback, in the case of smart tags Microsoft has been less than forthcoming on several details. I’m not so sure they’re playing straight with this one, either.

An error of the third kind

In any event, letting the users decide is the right answer to the wrong problem. If the issue is solely how annoying smart tags turn out to be, or whether they are a security risk, then users’ voices carry a great deal of weight. The problem is, neither of those is the biggest problem with smart tags.

The biggest trouble with smart tags is that they edit your content before the user ever sees it, and they do it in such a way that the user may never know.

The user can certainly decide to get a second opinion while browsing my site, and the user has every right to browse with whatever tools he chooses. Second opinions need to be chosen, though, and they need to be clearly distinct from the original page. Smart tags are not. They are integrated right into the page, with no visible indication whether they’re there because the author wanted them there, because Microsoft wanted them there or because some other company wanted them there. Without digging around in XML files or options boxes, the user can never be sure which smart tag is offering which action.

They started it!

Microsoft and their supporters also insist that Alexa, FlySwat, NBCi’s QuickClicks, Comet Cursor and the now-defunct Third Voice all offer similar functionality. Alexa is even integrated right into Netscape, they say. They are missing the point.

Most of those products do indeed offer links to web pages relevant to, sometimes even competing with, the web page in the browser window. However, Alexa exists outside the browser window. The user must click a button on the browser, then get options from a menu that extends from the browser, not from the web page.

Comet Cursor Search operates in a similar manner: the user clicks a button on the browser, selects a cursor and then uses that cursor to click on the word they’re interested in. It is obvious that any suggestions, commentary or links are coming from the cursor, not the web page.

FlySwat and QuickClicks (an NBCi-branded version of FlySwat) operate more like smart tags. Once a user clicks on a browser button, they underline words that then become links. Click on the links and up pops a menu of related items or pages. The difference is in the details: the user has to click a button to get the links to appear – a positive action asking for more information, and an action directed at the browser, not the page. More importantly, the pop-up menu includes the FlySwat or NBCi logo. It is apparent that the choices in the menu are from FlySwat or NBCi.

Smart tags offer no such indication that they come from a third party, let alone which one. They are integrated right into the page, and once they are on they require no further user input to appear; those dotted lines will show up as if the author had placed them there herself.

Playing metadumb

Third Voice is a bit different than the others. Rather than providing links to more information, Third Voice allowed any user with the Third Voice plug-in to post and read comments attached to Web pages. These comments took the form of yellow boxes of text, not unlike Post-it notes, tacked on to the Web page.

Microsoft and its supporters seem to believe that this is what smart tags do: annotate existing pages. The Microsoft employee on the Decentralization listserv insists that they are metadata, like what one would get from an annotation service.

This is patently false. Smart tags don’t add data, they add links to data. Moreover, smart tags aren’t blocks of text tacked on to the Web page, they are links embedded right into the page. Literally. According to Scoble, when one copies a word highlighted by a smart tag and pastes it into another Microsoft application, such as Word 2002 or FrontPage, the code for the smart tag is pasted along with the word. That isn’t adding annotations, that’s altering the very code of the page.

Also, Third Voice allowed even the most non-technical user to post their comments for everyone who had the Third Voice plug-in to see. Smart tags require the use of specialized tools to create them or knowledge of how to write an XML file – specialized knowledge that the average user just doesn’t have. Moreover, even once the user creates their smart tag they must get it distributed. The Microsoft employee on the Decentralization listserv even says as much:

Smart tags are very much like the first stages of the web in that anyone can publish metadata, so long as they learn the technical side, and they need to advertise in search engines, etc. to get people aware of their metadata.

Requiring a user to “learn the technical side” and advertise doesn’t sound much like a populist tool for individuals to use when talking back to large corporations. It sounds like another way for corporations to talk to users, quite possibly without even identifying themselves.

Smart tag technology will undoubtedly evolve, and there are already tools that minimize the technical savvy required to develop smart tags, such as the Excel 2002-based application described in the MSDN article Advanced Smart Tag Tools. This does not, however, change the fact that smart tags must be installed or embedded in a document for them to be visible. Most users don’t have the expertise or the patience to install XML smart tags manually, and will be (or at least should be) wary of running installer programs from companies and individuals they are not familiar with.

Jurassic Park syndrome

There might be a way for smaller companies to get their smart tags out into the world: embed them into their Web pages. In Adding Smart Tags to Web Pages, Microsoft explains how to embed a smart tag into a Web page. It’s a trivial process: add an XML entity and an <object> tag to the page, put the files for the smart tag on your server and you’re all set; anyone visiting your page with Internet Explorer 6 will get your smart tag.

Once users have downloaded a smart tag that was embedded in a Web page, will it turn up on other pages, or even in Office XP? Microsoft’s documentation isn’t clear on this point. Decentralization listserv poster Eric Moore says they will only work in the page in which they are embedded. If so, then embedding smart tags in your web pages won’t do much to distribute them; you’ll still have to either convince users that they’re safe and worth the wait, or pay one of the big companies to distribute them for you.

If embedded smart tags are available outside the web page they’re embedded in – and Moore doesn’t specify whether the limits on smart tags apply to all smart tags or only XML smart tags, nor does he address willful abuse of the technology by shady marketers – the situation is even more grim. Smart tags could be embedded not only in web pages, but also in e-mails. Unsolicited commercial e-mail – spam – for example. Imagine trying to read F. Scott Fitzgerald’s The Great Gatsby online and having every mention of Daisy sport a smart tag inviting you to SEE DAISY FUENTES NUDE!!!

In fact, even if embedded smart tags require the user to explicitly save them to their hard disk before they will be available on other documents or web pages (the most likely scenario), it is probably only a matter of time before some clever spammeister creates an e-mail virus or trojan horse that installs smart tags on users’ computers.

While Microsoft points out that DLL-based smart tags cannot be embedded in Office documents or e-mails, this offers protection only against DLL-based smart tags. XML-based smart tags – the variety that can create lists of links associated with recognized words – can be sent along with any web page, Office document or e-mail (provided Word is used as the e-mail editor in Outlook). A spammer or other marketer wishing to distribute their advertising links could still send along their XML-based smart tags, and for those using default security settings they will show up at least in the documents in which they are embedded. In addition, if the smart tag includes a download URL, they provide the option to “Check for new actions….” Selecting that option might well pass trackable information to an unknown server, or perhaps even download a DLL (the Smart Tag Installation and Security for Microsoft Office XP white paper isn’t entirely clear on the matter).

To make matters worse, the default settings for smart tags allow users to download and install DLL smart tags with no warning as to their origin or the fact that they are executable code. In an ideal world, users would be wary of such downloads, or have anti-virus software installed. This is not, however, an ideal world. Users routinely execute code that is known to be dangerous, such as cute e-mail joke programs. These programs are a vector for spreading the “zombie” programs used in Distributed Denial of Service attacks, among other things. Why would users be any more careful with smart tags than they are with other executable code?

Before you write the security concerns off as being a problem that only affects the careless users themselves, remember that DDoS attacks have taken down sites like Yahoo!, Amazon and even Microsoft itself. They represent a significant threat to the stability of the web itself.

To paraphrase Dr. Ian Malcolm, Microsoft was so preoccupied with whether or not they could, they didn’t stop to think if they should.

Your hard drive is on a third party server

When Gillmor asked Microsoft about potential privacy concerns regarding smart tags, Microsoft responded that smart tags couldn’t be used to track users. The lists of terms and actions reside on the user’s own computer, a Microsoft representative assured him, so there was no way to track smart tag users until they click the tag and visit the site to which the tag points.

Except that’s not exactly the case. According to the MSDN article Developing Simple Smart Tags, an XML smart tag term list – the list of terms that tell the recognizer which words to underline – can be set to automatically update itself from an external server. Nowhere in the article does it say the server must belong to the same domain that the web page came from (in fact, that wouldn’t make sense since the whole point of smart tags is that they’re available on any web page or Office document containing a word from the tag’s term list) or from the server the smart tag points to. Nor does the article give any indication the user is alerted when the smart tag contacts a third party server. The article also doesn’t say whether that server is then allowed to set a cookie, or what sort of access it has to browser environment variables, such as which page the user is on.

If the server containing the updated term list has any access to browser environment variables whatever, tracking becomes simple. The smart tag determines whether it needs an update by comparing the value of the lastcheckpoint field in the XML smart tag list with the lastcheckpoint field in an update.xml file on the server specified in the smart tag XML file. If the lastcheckpoint value of the update.xml file is greater than the lastcheckpoint value of the smart tag XML list, the smart tag downloads a new list. It would be a simple matter of setting the value of lastcheckpoint in that new list to a number smaller than the one in the update.xml file to force the smart tag to update itself every time it is invoked, thereby providing a means for companies to track when, where and possibly even by whom their smart tags are being used. So much for smart tag lists being stored on the user’s own hard drive.
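
The update check described above can be sketched in a few lines of Python. This is a toy model, not Microsoft’s code: the function names and the way the values are passed around are assumptions made for illustration, and only the lastcheckpoint comparison itself comes from the MSDN article’s description.

```python
# Toy model of the lastcheckpoint comparison described above.
# Only the comparison rule comes from the article; every name and
# the overall structure here are hypothetical.


def needs_update(local_checkpoint: int, server_checkpoint: int) -> bool:
    """The list is refreshed when the server's value is greater."""
    return server_checkpoint > local_checkpoint


def simulate(invocations: int, server_checkpoint: int, new_list_checkpoint: int) -> int:
    """Count how many times the client fetches a new list.

    new_list_checkpoint is the lastcheckpoint value written inside the
    list that the server hands back.
    """
    local_checkpoint = 0
    fetches = 0
    for _ in range(invocations):
        if needs_update(local_checkpoint, server_checkpoint):
            fetches += 1  # each fetch is a request the server can log
            local_checkpoint = new_list_checkpoint
    return fetches


# A well-behaved server stamps the new list with its own checkpoint,
# so the client fetches once and then stays quiet.
honest_fetches = simulate(invocations=5, server_checkpoint=10, new_list_checkpoint=10)

# A tracking server stamps the new list with a smaller value, so the
# comparison succeeds again on every invocation.
tracking_fetches = simulate(invocations=5, server_checkpoint=10, new_list_checkpoint=9)
```

In this model the well-behaved server is contacted once, while the tracking server is contacted on every invocation, which is all it would need to build a usage log.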

Microsoft counters by saying the smart tags only update if the user clicks on them and selects an action. Even so, the basic problem remains: the user is downloading a file to their hard drive from an unknown server, ostensibly with no warning.

That’s some funny looking XML ya got there

Not to worry, says Microsoft, these are just plain XML files, not executable code or anything. Even if that were true, it is still a potential security risk. Nowhere in the documentation does Microsoft say what sort of precautions have been taken to ensure that what the smart tag downloads is only XML. In fact, in the Advanced Smart Tag Tools article, Microsoft gives instructions for including a term list as a binary file, rather than a plain text list. This binary file will then be updated whenever the XML file is.

While good reasons exist for using a binary file instead of text, such as smaller file size, the chief advantage seems to be that end users typically cannot view or edit the binary file. Users can’t tell which terms a smart tag is set to recognize, nor can they change them.

More worrisome is the fact that Microsoft seems to view these files as completely harmless. Smart tags can automatically contact an unknown server and download a binary file to a standard location on the user’s hard drive with no user intervention or, according to what Microsoft has said so far, any sort of check to ensure that the file is benign. Bear in mind that anyone can create a smart tag, so even if you trust Microsoft and their partners not to slip a trojan horse spyware application onto your computer, there’s no guarantee that a less trustworthy company won’t do that or worse.

Oh, you mean that executable code

Getting back to Microsoft’s assertion that smart tags can’t download executable code, if that is true then why is there an MSDN article entitled Instructions for Developing a Smart Tag DLL? In fact, in virtually every article describing how to develop smart tags Microsoft emphasizes how much more powerful DLL-based smart tags are compared to XML-based ones. For those of you unfamiliar with Windows, DLLs (Dynamic Link Libraries) are files that contain compiled code. While I’m no Windows programmer, my understanding is that while they are typically components rather than complete programs, they are most definitely executable.

Microsoft is, as yet, mum on whether these can or cannot be used for nefarious purposes. Judging by the documentation, there is nothing preventing DLL smart tags from updating themselves without warning the user that they were downloading new code. This leads to some nasty possibilities.

Windows programming is based on events. When a user does something – presses a key, clicks a mouse button, whatever – or a piece of code executes, an event is fired. Sometimes more than one. Just as with DHTML coding, Windows programmers use those events to invoke their code. I am not at all convinced that there are no events fired when a smart tag recognizes a word that a clever programmer could use to invoke some action or other buried in the smart tag DLL’s code. Such an action would be running on a user’s computer, meaning it would have full access to everything that the user does: password files, browser environment variables, address books…everything.

I may not trust Microsoft entirely, but I certainly trust they wouldn’t do anything as nasty as making a smart tag that grabbed users’ passwords and e-mailed them to a Microsoft server. That just wouldn’t be smart. But as I pointed out above, anyone can develop smart tags – including DLL-based smart tags. I’m not so sure I trust DoubleClick not to make a smart tag that recognizes a long list of words and, when one of those words is recognized, sends a unique ID and the URL of the page I’m on back to DoubleClick. I am sure I do not trust script kiddies to refrain from altering an otherwise useful smart tag to search my hard drive for credit card numbers and send them to a server for later pickup whenever it is invoked.

Smart tags may or may not be a major threat to privacy or security. We won’t really know that until the technology has been deployed among average users; users who do things like fail to install anti-virus software on their computers, or use the word “password” for their password. We won’t really know until script kiddies have spent hours finding ways to use it to victimize those users. By then, though, some disgruntled 19-year-old may already have used them to install bots on thousands of computers in order to launch a DDoS attack on major websites, or to steal the identities of web users. In other words, we won’t really know until it is too late.

Take ’em or leave ’em

Microsoft is quick to point out that whatever shortcomings smart tags may have, the user can disable them one at a time or en masse. There’s just one catch: if you disable smart tags in one Office application, they are disabled across all Office applications. The same goes for individual smart tags; you cannot have a smart tag disabled in Word 2002 but enabled in Excel 2002. That makes smart tags in Office an all-or-nothing affair. Since the same smart tag that lets you convert State of Arizona v. Miranda into a full legal citation may also offer to send you to the Lexis-Nexis subscription site whenever it encounters a legal term, users may be forced to choose between a useful function and keeping their personal documents free of advertising.

Whether the all-or-nothing proposition extends to Internet Explorer isn’t clear. Since Microsoft says they intend to ship Internet Explorer 6 with smart tags off by default, one would presume it does not. However, some of the admittedly handy new features touted in Office XP, such as paste options that let you select which styles to apply to pasted text, are implemented as smart tags. If the all-or-nothing restriction includes or is extended to include Internet Explorer, Microsoft will have a great deal of leverage in persuading users to keep smart tags on. Users will have to accept Microsoft’s links on every page they browse, or lose significant portions of the functionality they paid for when they bought Office XP.

Embrace and, well, you know

If the potential security risks of smart tags don’t disturb you, the forcible appropriation of the web by Microsoft should. In the Adding Smart Tags to Web Pages article, Microsoft states:

Web pages have traditionally relied on hyperlinks to associate text with individual resources on a file system, Web server, or e-mail server. With the introduction of smart tags in Microsoft® Office XP, you can enhance this type of hyperlinking behavior by associating text with a shortcut menu containing associations with not only multiple resources but custom applications as well.

Microsoft has extended the hyperlink, and rather than using XLink, the W3C’s proposed extended linking model, they’ve gone off in their own direction – a direction that limits the most useful functionality (anything beyond what could be accomplished with an ordinary hyperlink) to those created using Windows-only technology. Smart tags aren’t about empowering the user. They are about turning the web into yet another proprietary platform for Microsoft to dominate.

As John Robb observes, Microsoft is at war with AOL. AOL has a consistent, simple interface to their offerings. Smart tags are the beginning of a similar interface for Microsoft’s offerings, from Yahoo! to Disinformation.

Too little, too late

As mentioned before, Scoble reported that an anonymous Microsoft employee has said that Microsoft has modified its smart tag implementation in the following ways:

  1. smart tags off by default (they were on by default to exercise the code, get Watson crashes)
  2. the user-overrides-authored-meta-tag option removed
  3. no recognizers shipping in box. So until you do have a recognizer installed, when you click on the smart tag button, you get navigated to a gallery from which you can download desired recognizers.

While turning smart tags off by default is good as it requires a conscious user action to enable them, it still doesn’t address the full issue. The smart tag option will be right on the browser toolbar, making it extremely easy for users to enable them. That isn’t bad in and of itself, but when combined with the fact that smart tags are embedded directly into the page and offer no indication of where they come from, having a simple show/hide button that turns smart tags on – and leaves them on – makes it seem to a naive user all too much like the tags were there all along. It still is unclear whether the tags were put there by the author or added after the fact.

Removing the meta tag override option is also good, but as mentioned before it is unfair for Microsoft to force authors to do extra work to prevent Microsoft from editing their pages. This is like saying that if you leave your car door unlocked, then it is OK for anyone to steal your stereo. It isn’t. In many places, you may well expect that some petty crook will steal the stereo from an unlocked car, but if the crook is caught he can and will be punished. Microsoft isn’t a petty crook. They are the largest software company in the world, and have monopolies on operating systems, browsers and office suites. Microsoft shouldn’t be allowed to force authors to do extra work to make Microsoft obey the rules.

Finally, as noted earlier, the promise not to ship any smart tags “in the box” is deceptive. Microsoft will still have the smart tags they want distributed available for download directly from the Windows interface in 4 clicks or so. Moreover, they have not said which box it is that won’t have any smart tags in it.

What site owners really want

Making smart tags acceptable is really quite simple. Microsoft need only address the fundamental problem with them – that they alter a page without explicit notification to the user of who is doing the altering – and I’ll be as enthusiastic about them as I am about any other cool Internet Explorer for Windows-only technology. What I recommend is:

  • Smart tags must always be invisible by default on any page retrieved from a web server (or pulled from the user’s cache). If Microsoft wants to offer a meta tag, they can offer one to make smart tags visible by default.
  • Smart tags should become visible only when the user issues a “show smart tags” command, either by clicking a button, selecting a menu item or invoking a keyboard shortcut. Smart tags could also become visible when a user clicks or highlights a word for which they have a recognizer.
  • Smart tag actions should be labeled with the source company’s name or logo to show who is offering the user the functionality or information.
  • With default settings in place, smart tags should notify the user every time they contact an external server.
  • P3P support should be added to the technology to allow users to disallow any smart tags that attempt to track them.

That’s it. Nothing less than the above will suffice. Whatever technical obstacles Microsoft may claim prevent them from doing the above should also prevent them from shipping the technology in Internet Explorer. If they had the time and expertise to develop them in the first place, they have the time and expertise to implement them correctly.

Whaddaya gonna do abouddit?

So the next question is: what can we do to head smart tags off at the pass? First, we can include the meta tag listed above on all our sites. While users may be able to override it, after the controversy smart tags have caused Microsoft is unlikely to switch the override option on by default, at least for now. Most users probably either won’t find the option or won’t want to check it anyway, so the meta tag will help keep our sites looking and reading the way we intended when we coded them.
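
For sites with many static pages, adding the tag by hand gets tedious. The following is a minimal Python sketch of one way to insert it, assuming the opt-out tag is the MSSmartTagsPreventParsing meta tag that Microsoft published; check the exact markup against the version given earlier in this article before relying on it. The function name is made up for illustration.

```python
import re

# Assumed form of the opt-out tag; verify against Microsoft's documentation.
OPT_OUT_TAG = '<meta name="MSSmartTagsPreventParsing" content="TRUE">'

# Matches an opening <head> tag (with or without attributes) but not <header>.
HEAD_OPEN = re.compile(r"<head(\s[^>]*)?>", flags=re.IGNORECASE)


def add_opt_out_tag(html: str) -> str:
    """Return html with the opt-out tag placed right after <head>.

    Pages that already carry the tag, or that have no <head> element,
    are returned unchanged.
    """
    if "mssmarttagspreventparsing" in html.lower():
        return html
    match = HEAD_OPEN.search(html)
    if match is None:
        return html
    end = match.end()
    return html[:end] + "\n" + OPT_OUT_TAG + html[end:]


page = "<html><head><title>My site</title></head><body></body></html>"
tagged_page = add_opt_out_tag(page)
```

Running the function twice on the same page leaves it with a single copy of the tag, so it is safe to apply to a whole directory of files repeatedly.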

Secondly, some site owners, such as ALA contributor and head lemur Alan Herrell, are angry enough that they are blocking Internet Explorer 6, or even all Microsoft browsers, from their Web servers. Herrell is redirecting users to a page explaining his decision. This is a radical option that, while appealing in a 1960s stick-it-to-the-man sort of way, is untenable even for many personal sites given Microsoft’s dominance in the web browser market.

More realistic is Dave Winer’s Microsoft Free Fridays campaign. Winer suggests blocking Microsoft browsers from your site on Fridays, instead redirecting users to a page on your site that explains what has happened and why. It’s a good proposal for personal sites, at least. There are indeed alternative browsers visitors can use to view your site, and most of the week even Internet Explorer 6 users can view your site unmolested.
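
The decision logic behind a Friday-only block is small enough to sketch. The Python below is an illustration under stated assumptions, not Winer’s implementation: it assumes Microsoft browsers can be spotted by the “MSIE” token in the User-Agent header, and the function names are hypothetical. Some non-Microsoft browsers also include that token, so a real deployment would need a more careful check.

```python
from datetime import date

FRIDAY = 4  # date.weekday() counts Monday as 0


def is_microsoft_browser(user_agent: str) -> bool:
    """Rough check: Internet Explorer identifies itself with an 'MSIE' token."""
    return "MSIE" in user_agent


def should_redirect(user_agent: str, today: date) -> bool:
    """Redirect to the explanation page only on Fridays, only for MSIE."""
    return today.weekday() == FRIDAY and is_microsoft_browser(user_agent)


ie6 = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
other = "Mozilla/5.0 (X11; U; Linux i686; en-US) Gecko/20010628"

blocked_on_friday = should_redirect(ie6, date(2001, 6, 29))    # a Friday
blocked_on_thursday = should_redirect(ie6, date(2001, 6, 28))  # a Thursday
other_on_friday = should_redirect(other, date(2001, 6, 29))
```

A server-side script would call a check like this on each request and send matching visitors to the explanation page instead of the requested one.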

Finally, you can add a pop-up window to your site that informs Internet Explorer 6 users that they are using an insecure browser. Even if smart tags prove to be no threat to privacy or the security of the user’s system, they are most definitely defacing the web pages the user visits as much as if Microsoft had hacked the sites and inserted links directly in the HTML code.

Bad things happening to a good browser

Microsoft seems genuinely astonished by the reaction to smart tags. As group product manager for Microsoft’s Windows Client Shawn Sanford told NewsBytes, “Everybody tends to focus on the negative side of this like we’re going to expose (users) to a lot of bad content ... I think we’re going to expose people to a lot of good content.”

They’ve missed the point entirely. It’s not Microsoft’s job to expose users to content while they’re on our sites. It’s our job as authors, designers and developers. We don’t want Microsoft “saving users from underlinked sites” as one representative told Mossberg. If users feel our sites are “underlinked,” then it is our job to correct it, not Microsoft’s.

It is doubly unfortunate that Microsoft has so completely missed the boat, because the controversy is overshadowing what is shaping up to be a very good browser. Improved standards support, P3P support and doctype switching are all significant features that will benefit users and developers in the long run. In an effort to avoid the inherent problems with smart tags, many users and developers may miss out on those features. Instead of a tool to empower users and developers alike, smart tags make Internet Explorer 6 a tool for further empowering Microsoft at everyone’s expense.

That’s a shame.
