APFS and Unicode Normalization

One key feature of macOS High Sierra is the arrival of Apple File System (APFS) as the default file system format. The iOS 10.3 update migrated the iOS file system from HFS+ to APFS, an amazingly smooth transition that was celebrated at WWDC last week.

The WWDC APFS developer session video is well worth your time if you have access. I am familiar with font encoding issues but was completely unaware of the Unicode Normalization file system issue that developers outside the ASCII bubble have been worried about. The best blog to read about APFS and Unicode issues is The Eclectic Light Company by Howard Oakley. His take on AI is great too.

I particularly enjoyed reading his explanation of Unicode file naming and the limits of having the file system handle normalization. There will be two flavors of APFS: native normalization will be the default for iOS 11, while the default for macOS High Sierra is normalization-insensitive. This should work well. However, the basic encoding issue that affects all systems everywhere remains:

it is time for the Unicode Consortium to map indistinguishable characters to the same encodings, so that each visually distinguishable character is represented by one, and only one, encoding.

That is a stark challenge, and one that I am sure will never even be started. But until we do, today’s minor running sores will only fester and grow.
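
To make the issue concrete, here is a minimal Python sketch of what normalization can and cannot do for file names. It is only an illustration using the standard unicodedata module, not a description of how APFS works internally; the same_name helper and the sample file name are hypothetical.

```python
import unicodedata

# Two spellings of the same visible file name "café.txt".
composed = "caf\u00e9.txt"     # é as a single code point (NFC form)
decomposed = "cafe\u0301.txt"  # e followed by COMBINING ACUTE ACCENT (NFD form)


def same_name(a: str, b: str) -> bool:
    """Hypothetical normalization-insensitive comparison.

    Both names are converted to NFD before comparing, so canonically
    equivalent spellings are treated as the same name.
    """
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


# A plain comparison sees two different names.
print(composed == decomposed)
# The lengths differ too: one code point versus two for the accented letter.
print(len(composed), len(decomposed))
# A normalization-insensitive comparison treats them as one name.
print(same_name(composed, decomposed))

# The limit: Latin "A" and Cyrillic "А" look the same in most fonts,
# but they are not canonically equivalent, so normalization keeps them apart.
latin_a = "A"
cyrillic_a = "\u0410"
print(same_name(latin_a, cyrillic_a))
```

The first pair shows the case a file system can handle by normalizing. The second pair is the kind of indistinguishable character Oakley is talking about: no normalization form merges them, so only a change in the encodings themselves would.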

I have heard similar complaints about the Unicode Consortium from Japanese font developers over the years. Unicode has done many good things, but like any human organization it has its agendas and politics. For some, the Unicode Consortium working method is too top-down for comfort. Sometimes grand plans don’t work out, like IVS (Ideographic Variation Sequences).

As Oakley points out, getting a big new effort off the ground is too much to ask of the Unicode Consortium.