You should use Unicode on your web site, because you do not know how long you will write only in English. One day you may want to write the correct accents for Běijīng, or make sure that your readers know that you are writing about the province Shǎnxī and not the neighbouring Shānxī. And what about writing Łódź the way they do themselves in that city?
If you have built your web site around a non-unicode encoding, you cannot do that.
And if you are a mathematician you may one day want to type ε or ∀ or ∃ or ∈.
And even if you just type normal boring English text, you may come into situations where you want to use a curved apostrophe in "I’m" or real curved quotation marks like ‘Hello’ or “Hello”. And what about writing that someone is 5′3″ tall?
With the non-unicode "Western" encoding, you cannot do any of that. You can of course use the straight vertical characters ' and ". I do so myself all the time. However, it is strictly speaking not correct. The sad thing with the Western encoding is that you cannot write it correctly even if you want to.
With Unicode you can write not only that but much more.
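The claim is easy to check. Here is a minimal Python sketch, assuming that "Western" means ISO-8859-1 (Latin-1); other "Western" code pages, such as Windows-1252, do contain the curved quotes but still lack the rest. The helper name `can_encode` is made up for illustration, and the sample strings are the ones used in this post.

```python
def can_encode(text: str, encoding: str) -> bool:
    """Return True if every character of text exists in the given encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# Sample strings taken from the post above.
samples = ["Běijīng", "Shǎnxī", "Łódź", "∀ε ∃x ∈ A", "I’m", "5′3″"]

# For each sample: (fits in Latin-1?, fits in UTF-8?)
results = {
    s: (can_encode(s, "iso-8859-1"), can_encode(s, "utf-8"))
    for s in samples
}
```

Every sample maps to `(False, True)`: none of them fits in Latin-1, all of them fit in UTF-8. The straight `'` and `"` do of course fit in both.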
Sunday, 27 January 2008
UTF8 problems - none
Over at Sitepoint, there is a strange article about which encoding one should use for web pages. It is strange, because it is written as if there were a real choice. The whole article should have been replaced by this simple sentence:
Use UTF-8. Always. And nothing else.
It is true that other encodings still are used and understood, but there is no real advantage with any of them.
The "problems" Sitepoint lists for UTF-8 are:
- Not all editors or publishing tools support it.
- Some browsers don't understand the BOM, and will output it as text. Some editors won't allow us to omit the BOM.
- Some ancient browsers don't support UTF-8.
The second and third points are about browser limitations. Even if you manage to find a browser that is old enough not to be able to handle a BOM, it is just a blank line at the beginning of the file that may be wrong. However, modern browsers handle this well.
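For the curious, here is a small Python sketch of what the BOM actually is and how a tool can get rid of it. The helper `strip_bom` and the sample page are made up for illustration; `codecs.BOM_UTF8` and the `utf-8-sig` codec are part of the Python standard library. The last line assumes a reader that wrongly treats the file as Latin-1, which is presumably what "output it as text" refers to.

```python
import codecs

BOM = codecs.BOM_UTF8  # the three bytes EF BB BF


def strip_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 byte order mark, if there is one."""
    return data[len(BOM):] if data.startswith(BOM) else data


# A hypothetical page saved by an editor that insists on writing a BOM.
page = BOM + "<p>Łódź</p>".encode("utf-8")

# Decoding as plain "utf-8" keeps the BOM as the invisible character U+FEFF;
# the "utf-8-sig" codec drops it.
with_bom = page.decode("utf-8")
without_bom = page.decode("utf-8-sig")

# Assumption: a reader that takes the bytes for Latin-1 shows them as text.
as_latin1 = page[:3].decode("iso-8859-1")
```

Here `with_bom` starts with U+FEFF, `without_bom` is the clean `<p>Łódź</p>`, and `as_latin1` is the three characters `ï»¿`.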
One could add a fourth point: for non-Latin alphabets, text files get larger in UTF-8 than in the native encodings. However, so much of most modern HTML pages is JavaScript, tags and CSS, and then images and media, that the size of the actual text is insignificant in 99.99% of all cases.
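To put numbers on the fourth point, here is a rough Python sketch. The sample phrases and the choice of "native" encodings (KOI8-R for Russian, Shift JIS for Japanese) are my own assumptions for illustration, and the helper name `sizes` is made up.

```python
def sizes(text: str, native: str) -> tuple[int, int]:
    """Return the byte length of text in its native encoding and in UTF-8."""
    return len(text.encode(native)), len(text.encode("utf-8"))


# Hypothetical sample phrases with one plausible legacy encoding each.
samples = {
    "English": ("Hello, world", "iso-8859-1"),
    "Russian": ("Привет, мир", "koi8-r"),
    "Japanese": ("こんにちは世界", "shift_jis"),
}

# Maps each language to (bytes in native encoding, bytes in UTF-8).
report = {lang: sizes(text, enc) for lang, (text, enc) in samples.items()}
```

The result is (12, 12) for the English phrase, (11, 20) for the Russian one and (14, 21) for the Japanese one. So the text itself can grow by 50% to almost 100%, but tags, CSS and scripts are plain ASCII and take exactly the same space in either encoding.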
Today there is no good reason for anyone to use anything but UTF-8.
Wednesday, 23 January 2008
Why would I think about posterity? What has posterity ever done for me?
The first computer I did not buy was an Amstrad CPC back in the eighties. It was the absolutely cheapest one on the market at the time and included monitor, CPU and printer in one marvellous package. The reason I did not buy it was luck. I stood in the shop ready to pay for it, when it suddenly struck me to ask the shopkeeper if it would be easy to transfer files from this PC to other ones. Sheepishly he had to admit that it was impossible. The Amstrad CPC used Hitachi's 3" floppy disk drive, which no one else was using. Whatever one typed on an Amstrad CPC had to be retyped by hand on other computers if one wanted to preserve the data.
My most trusty computer was a PowerBook 170. However, it became increasingly difficult to get any files off it, as it did not have USB, Ethernet or Wi-Fi, and as most modern computers do not have floppy disk drives. It served me for about 10 years before I gave it up due to the compatibility problems.
The question of future compatibility is still surprisingly ignored in the world of high tech.
Hardware is rarely a problem any more, as most computers handle Wi-Fi and USB memory sticks. When you buy a new PC or PDA or telephone, there are usually no big problems transferring as many files as you like.
But the problem with file format remains.
Apple bluntly tries to sell its iWork suite, in spite of the fact that only iWork applications can read iWork files - much like Amstrad did in the eighties. One can, admittedly, export files from iWork in more readable formats, like PDF or Word documents, but it is completely and utterly impossible to set a widely used file format as the default.
This is nothing new of course. Many fine applications have used obscure file formats, which locked the users in - Mellel, egword and iWork's predecessor AppleWorks, just to mention a few.
AppleWorks is not even fully backward compatible with itself. The last version was not able to write in the formats of the earlier versions. And as Apple no longer sells the program, there is no legal way for your sister to access that pile of old AppleWorks documents you have on your hard disk, unless you have a spare license of the program to give her.
Microsoft Office probably has the most used file formats in the world, but with Service Pack 3 of Microsoft Office 2003, they suddenly decided not to support old files any more. With some documented hacking of the registry, one can still activate access to the old "unsecure" file formats, but if you are not careful, you may disable the OS in the process.
So what is the best file format to choose, if you want to guarantee posterity a chance to read your text?
Word documents of version 2003 are probably still one of the best bets. There are so many Word documents out there, and so many free and open source programs that support them, that they are unlikely to become impossible to read any time soon.
RTF is probably a reasonably good bet, if you avoid pictures and if you only type in Western languages. RTF files created by Mac OS X with Chinese and Japanese may fail in some applications due to encoding problems. (Shun RTFD, which Apple seems to claim is a "variant" of RTF. No application on any operating system but Mac OS can read RTFD files.)
The ISO approved ODF format is all well and good, but it has not yet gained enough momentum to tell whether it will last.
The only file format that is promoted for long-term archiving is PDF/A. It is approved by ISO for this purpose. However, not many applications are able to create PDF/A files out of the box, and it is difficult to guarantee that there will be applications that can read it in 50 years' time.
My take is that the best format for long term archival is simple unformatted text files. However, even with text, things are not that simple.
Saturday, 19 January 2008
Do you feel counted?
If you do not feel as counted as you should today, it is because I "upgraded" the design of some of my blogs. Even though Blogspot is Google and Google Analytics is Google, the upgrade managed to remove all trace of you, dear reader.
But do not worry if you do not feel as counted as you like. Our highly skilled technicians (a couple of lost cockroaches and a kitten I'm thinking about buying) will without doubt have solved the problem shortly.
Thursday, 17 January 2008
Thin air
Apple has come out with a new laptop. Everyone who sees it gets ecstatic. (By "everyone" I mean "I", btw.) It is slim. It is beautiful. It is portable. It is useless. For me, that is.
I really hope Apple sells a lot of them, especially the model with solid state memory. I want it to succeed, and I would really like to have one myself - but I will not buy one. Is it the price? No.
It is the hard disk. It is simply too small. I want to live omnia mea mecum, and I cannot possibly fit all my files on only 80 GB, and even less on the 64 GB that the solid state model offers. I would have to have an external hard disk as well. And an external optical drive. And it would all be heavier than a simple standard MacBook.
That is the biggest problem I see with it.
The second problem is smaller, but looks bigger. That is, the screen looks bigger. Not only that but it is too big. I do not want a 13.3" screen. 12" is more than enough. It is not because it is lighter, but because it fits better wherever one puts it. I do not need those extra 28 square inches, so why would I carry them around?
Update I: Gizmodo has a comparison between the Macbook Air and a number of other really small laptops. It seems increasingly ridiculous to just look at one dimension - how thick it is.
Update II: This is probably an excellent case for wait and see. The prices of solid state memories seem to be dropping like stones. In six months, either the price of the solid state MBA will have dropped by half or the amount of memory will have doubled or both. Perhaps.
Saturday, 12 January 2008
MS Word may one day grow up
Microsoft Word has been around for more than 20 years now, and I still cannot take it quite seriously. It does a lot of things well, but if it were to be sold as a real word processor, methinks it would need to implement at least the following features.
Language
- Right to left writing (Mac OS X version).
- Automatic conversion of kanji and hanzi to phonetics with the phonetic guides (as is already supported in Word for Windows).
- Multi-language spell checking (one setting that accepts words from any of several languages).
- One version for all languages, so the user freely can switch UI, formats, features and dictionaries.
- Recordable AppleScripts (Mac OS X version).
- AppleScript code completion in macro editor (Mac OS X version).
- VBA scripting in Mac OS X version for compatibility reasons.
- Open and save (not export) Open Document files.
- Working Autosave that does not clutter the harddisk with old backups.
- Ligatures, glyph variants and support for other OpenType and TrueType font features.
- High definition graphics.
- Handling of standard vector graphics like PDF and EPS.
- Text wraps around curved objects.
- Decent graphics editing.
- Fixed line height with changing fonts.
- Furigana format modification.
- Page spread view to adjust pictures spanning two pages.
- Layers to switch on and off certain objects for viewing, printing and export.
- Full screen view. (Mac OS X version)
- Snap objects to alignment guides.
- Support for Mac OS X Services.
- Support for Mac OS X dictionaries.
Pages may one day grow up
There is an excellent blog called Pages FAQ, where you can go if you want to know more about what Apple's word processor Pages is like.
This blog entry here is about what it is not like. Pages has been around for three years now, and I still cannot take it quite seriously. It does a lot of things well, but if it were to be sold as a word processor not only "for the rest of us", but "for all", methinks it would need to implement at least the following features.
Language
- Right to left writing (Arabic, Hebrew).
- Vertical writing and furigana (Japanese).
- Grammar and spell checking in many more languages.
- AppleScript access to content of tables.
- Recordable AppleScripts.
- Customizable menus, toolbars and keyboard shortcuts for AppleScripts.
- Save (instead of "export") to Word, ODT, RTF and text format.
- Save and open RTF files with tables and images.
- Autosave.
- Document comparison (diff).
- Multiple document versions.
- Relative hyperlinks to other files.
- Ability to mix landscape and portrait sections.
- Drop Caps.
- Intelligent caps formatting (like Title Case).
- Multiple tables of contents.
- Numbered table and picture captions.
- Displays of the same file in multiple windows.
- Split window view.
- Page spread view to adjust pictures spanning two pages.
- Collapsible outlines.
- Layers to switch on and off certain objects for viewing, printing and export.
- Full screen view.
- "Normal" view without formatting.
- Equation Editor.
- Organisation charts.
- Bibliography.
- Version control system integration.
- Form design.
Friday, 11 January 2008
Italicized Arial Unicode MS? Never on a Mac!
In the main text processing applications on Mac OS X, like TextEdit, Keynote and Pages, one cannot italicize a font unless the font designer went to the trouble of creating an italic typeface. Some fonts you therefore cannot italicize are Arial Unicode MS, Comic Sans MS, plenty of Apple's own older fonts like New York or Geneva, and most East Asian fonts like Hiragino.
Why?
You can do it without problems in AppleWorks, MacWrite, MS Word, OpenOffice and in about any Windows or Linux application.
I have three theories, and nothing to really back up any of them.
1. "Apple wants to protect the users from ugly fake italics and bolds."
This can be a message to send to users and readers. "It is an improvement." "You will no longer see bad italics." However, I think this message is getting increasingly obsolete. It might have been valid when screen resolutions were lower, but with higher resolutions on screens and printers, it is very rare that one sees an unacceptably bad fake italic or bold rendering.
2. "Apple wants to protect intellectual property rights for font designers."
This is a message to send to font designers. However, it rings a little hollow as one of the biggest font designers, Adobe, themselves produce software to warp their own and others' fonts alike.
3. "Apple does not manage to make acceptable fake italics."
This is a wild guess, but it is possible that Apple had problems getting fake italics right in the first version of Mac OS X. It may have been performance problems with display postscript and getting the pixels right on low resolution screens and things like that. If so, Apple gave up, and promoted the lack of this functionality as a feature.
Personally I cannot see any good reason to prevent italics or bold of fonts that lack the type faces. I have never met anyone who has promoted Mac OS with the words "and thank God, you cannot italicise all fonts" or "it is great not to be able to add bold words to some texts".
If someone had asked me, "do you want to remove the ability to italicize fonts without italic type faces from your Windows PC", I would have said "no" with considerably raised eyebrows, and I have a feeling that others would use their eyebrows in the same way.
I have more than once been in the middle of typing notes in, for example, Hiragino Mincho Pro, when I suddenly felt like adding italics just as a reminder to myself. (6 eggs, 1 loaf of bread, 200g butter but not salted.) With TextEdit today, I have to change font to add the italics. Open the font panel, click on another font, make sure the typefaces are visible, make sure one of them is "italic" or "oblique". Change the font of the rest of the text to match it. Adjust the text so page breaks and image wraps work smoothly with the new font... And I thought computers were there to try to make our life easier.
But I do see some advantages as well. I appreciate the confidence I can have that the italics I use for publication have gone through the approval of a font designer. I am grateful that I do not have to see ugly italics in printed or displayed public texts.
It is just that the advantages are insignificant compared to the drawbacks with Apple's solution.

Apple's font Skia deformed by Adobe Illustrator.
Sunday, 6 January 2008
Market forces and boredom
Once upon a time, Slashdot was an excellent source of information about what is going on in the world of high tech. However, as the number of readers grew, they needed more money to keep the site running, and they had to rely on commercial interests.
Commercial interests want ad revenues, so they want even more readers. Unfortunately, you get more readers with heated discussions, and a discussion is rarely heated unless the subject is something a lot of people already know enough about to have an opinion on - in other words, it is old news.
Consequently, the "news" at Slashdot is often found in one of the following categories:
- Comparisons between two known systems (Operating systems, standards, ...)
- Provocative closed-ended questions about known subjects ("Is...?")
- Superlatives about known subjects (biggest..., smallest ever...)
- Lists about known subjects (top ten...)
- Scientific news from popular sources (Link to CNN or Yahoo News but never to Science or Nature)
The end of history
People who have lived most of their active lives before the 1990s may think that the computer age will remove our respect for the old.
They are right in the sense that the school books of 1910 will not bring us much new knowledge.
However, the history we are building right now is impressive. Each and every one of us can keep more and more data from the past: all the music you listened to as a child, the school essays you write from now on, each bank statement, each travel receipt...
It was interesting to study the letters between Abélard and Héloïse. With the arrival of the telephone, that kind of study became more difficult, because the number of written conversations went down. However, with computers, the number of written records goes up and up. The sky is the limit. Together with our patience.
Thursday, 3 January 2008
Hardware problem
Tech support told me I had a hardware problem. I promptly put the computer in boiling water for seven days and seven nights, and finally it is starting to get a little softer.
Wednesday, 2 January 2008
The reason MS sucks is that it listens to its customers
Year after year Apple has come out with much better products than Microsoft. At any given moment, Mac OS has been better than the current DOS or Windows version.
Why?
What is Apple's secret?
How can they, with a much smaller budget, produce much better products? For more than twenty years?
The answer is that life is much more difficult for Microsoft. MS woos the big corporations. All the big corporations want all their requests fulfilled all the time, and that is pretty difficult to keep up with. If a big corporation still has DOS applications, Microsoft has to make sure that DOS applications still work.
In contrast, Apple only has to program for the moment. If old applications no longer work it is not much of a concern of theirs. There are plenty of potential buyers who happily accept that shortcoming, as everything else works so well. No support for 68000 processors any more? No worry! No support for classic Mac OS? No worry! No support for SCSI or diskettes? No worry! They can kick out old standards much quicker than MS can.
If 20% of all Apple users were to quit the platform each year, there would still be a potential market of 90% of all computer users to switch over. If 20% of all Microsoft users were to quit the platform each year, it would be a catastrophe, as there would hardly be anyone to win over.
While Mac OS is constantly cleansed of legacy code, Windows has a much larger overhang of old mouldy code that Microsoft cannot remove without annoying some of its biggest clients.
That's why I use Mac OS - because Apple is strong enough to say no to its customers.