PCG FAQ

We wrote this FAQ to answer the many questions we receive on this topic from our clients and other inquiring minds in the many electronic communities we frequent.

This FAQ is taken from Word 97 Annoyances (by Woody Leonhard, Lee Hudspeth, and T.J. Lee; ISBN 1-56592-308-1; O'Reilly and Associates). For additional information on this title, visit our site's page http://www.primeconsulting.com/annoyances/.

(Annoyances, Word 97) Hidden Symbols and Their Nuances in Word 97

Q: I'm mystified by how Word handles symbol characters. Can you shed some light on this subject?

A: Here's our take on symbol characters in Word. Try this annoying experiment.

  1. Start with a clean document. Type abc and hit Enter. Select the paragraph you typed and change the font to Symbol. You'll see the Greek letters alpha, beta, and gamma. That's great.

  2. Start a new paragraph. Change the font to Symbol. Type abc again and hit Enter a couple of times. You'll see alpha, beta, and gamma. So far, so good.

  3. Now select that alpha-beta-gamma paragraph and change the font to Times New Roman. See how your alpha, beta, and gamma have turned to rectangles?

  4. Select the final paragraph mark in the document. Click Insert, Symbol, move to the Symbol font and put a lower-case beta into your document.

  5. Click Edit, Find, and look for the character b. Note how Word doesn't find any b's, even though you typed two of them into your document. Arguably, the Insert/Symbol beta is also a b.

  6. Select the Insert/Symbol beta. Copy it (Ctrl+C). Click Edit, Find, and paste (Ctrl+V) the beta into the Find box. Note how what you see in the Find box is a rectangle. Word finds three betas in the document, one of which is a Times New Roman-formatted rectangle.

What's happening? Good question. Behind the scenes, Word is converting your Symbol font letters into Unicode characters. As discussed in Chapter 6, Word almost always stores characters as numbers between 0 and 255. They're commonly called ANSI character numbers or ASCII character numbers. (There's a slight, largely academic, difference between the two.) That's enough to accommodate most of the common European characters.
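
One way to picture why Find treats a typed b and a beta as different things is to look at the character numbers involved. The Python sketch below is an illustration only, not a description of Word's internals: it compares a plain b with the standard Unicode Greek small letter beta and checks which of them fits in the 0 to 255 range. The helper name fits_in_one_byte is made up for this example, and Python's ord simply reports a character's number.

```python
# Illustration only: character numbers for a plain "b" and a Unicode Greek beta.
latin_b = "b"
greek_beta = "\u03b2"  # GREEK SMALL LETTER BETA in standard Unicode


def fits_in_one_byte(ch):
    """Return True if the character's number falls in the 0-255 range."""
    return 0 <= ord(ch) <= 255


print(ord(latin_b), fits_in_one_byte(latin_b))        # 98 True
print(ord(greek_beta), fits_in_one_byte(greek_beta))  # 946 False
print(latin_b == greek_beta)                          # False
```

A search that compares character numbers has no reason to match 98 against 946, however similar the two may look on screen in a particular font.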

Unicode represents the computer industry's latest, best effort to bring all languages into the electronic age. It covers not only European languages but Asian and Middle Eastern languages as well. While it can't possibly contain all Chinese pictographs, for example, Unicode allows for 65,536 character numbers--more than enough to take a good stab at most widely recognized languages.

Microsoft is slowly trying to bring Unicode support into all its applications, and we Word users, as usual, get stuck on the bleeding edge. The odd behavior you saw in our experiment represents just the tip of the polyglot iceberg. You'll find similar oddities when working with any of the Wingdings fonts, Marlett, MS Extra, MS Outlook, Monotype Sorts, or Bookshelf Symbol 3, among others.

It gets even stranger:

What does this mean to you as a Word user? And if you're a VBA/Word programmer trying to pick up character values from a document, you're in for some hairy programming shenanigans. (In particular, look at the AscW and ChrW functions.)
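
Part of the hairiness is that AscW hands back a 16-bit signed Integer, so character numbers of 32768 and above come back negative. The Python sketch below only imitates that behavior so you can see the arithmetic; chrw_like, ascw_like, and unsigned are hypothetical helper names, not part of VBA or Word, and the sample value 0xF062 is just a convenient number above 32767.

```python
# Python stand-ins for VBA's ChrW and AscW (hypothetical helper names).


def chrw_like(code):
    """Mimic ChrW: accept -32768..65535 and return a single character."""
    if not -32768 <= code <= 65535:
        raise ValueError("character code out of range")
    return chr(code & 0xFFFF)  # fold negative codes back into 0..65535


def ascw_like(ch):
    """Mimic AscW: number of the first character as a signed 16-bit value."""
    code = ord(ch[0])
    if code > 0xFFFF:
        raise ValueError("outside the 16-bit range this sketch covers")
    return code - 0x10000 if code >= 0x8000 else code


def unsigned(code):
    """Turn a signed AscW-style result back into a 0..65535 number."""
    return code & 0xFFFF


sample = chrw_like(0xF062)
print(ascw_like("b"))               # 98
print(ascw_like(sample))            # -3998
print(unsigned(ascw_like(sample)))  # 61538, which is 0xF062
```

In VBA itself, the usual equivalent of unsigned is to mask the AscW result with &HFFFF& before comparing it with anything.
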
While the transition to Unicode is laudable, this particular implementation stinks. Maybe someday Microsoft will find a way to make Unicode and ASCII live together in peace. In the meantime, you're going to be stuck with all sorts of inexplicable and marginally explicable behavior.

If you frequently use Edit/Find with symbols, and want to be able to see what you're searching for, look in macros8.dot for a routine called FindSymbol. This routine runs much more slowly than the built-in Edit/Find, but it does allow you to put symbols in the "Find" box without cutting and pasting your heart away.

If you have a document with Unicode characters in it, and want to convert them to "real" ANSI characters, there's a monstrous 50-line VBA/Word program from Microsoft that purports to do so. Look on the Web for www.microsoft.com/kb/articles/q160/0/22.htm.
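
The general idea of such a conversion can be sketched in a few lines of Python. This is not the Microsoft program, and it rests on an assumption this FAQ does not state: that Word keeps symbol-font characters in Unicode's Private Use Area at 0xF000 plus the font's 0 to 255 character number, as is commonly reported. SYMBOL_OFFSET, to_ansi_number, and ansi_numbers are names invented for the example.

```python
# Assumption (not from this FAQ): symbol-font characters sit in the Unicode
# Private Use Area at 0xF000 + the font's 0-255 character number.
SYMBOL_OFFSET = 0xF000


def to_ansi_number(ch):
    """Return a 0-255 character number for ch, or None if there isn't one."""
    code = ord(ch)
    if 0xF000 <= code <= 0xF0FF:
        return code - SYMBOL_OFFSET  # strip the assumed private-use offset
    if code <= 0xFF:
        return code                  # already in the one-byte range
    return None                      # a genuine Unicode character; no mapping


def ansi_numbers(text):
    """Map every character in text through to_ansi_number."""
    return [to_ansi_number(ch) for ch in text]


print(ansi_numbers("a\uf062\u03b2"))  # [97, 98, None]
```

Under that assumption, a beta placed with Insert/Symbol and a b typed in the Symbol font both come out as 98, while a true Unicode beta has no one-byte equivalent at all.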

* If you installed all of Word, ANSIValue is both a procedure and a module inside the macros8.dot template that sits in the folder:

c:\Program Files\Microsoft Office\Office\Macros

ANSIValue returns the character number associated with a specific selected character in a document.