Two-letter vs three-letter language codes?

I’ve been testing my font against Pablo Impallari’s font testing page. One thing that jumped out at me was that my Catalan localizations weren’t getting rendered properly.

Sniffing around I realized that there’s a difference between three-letter language codes and two-letter ones. The two-letter ones apparently are an older standard. Impallari is using the older two-letter codes on his page.

I figured that I’d copy my .loclCAT glyphs and add .loclCA ones for backwards-compatibility, but Glyphs isn’t compiling the feature and seems to just be ignoring the new glyphs.

Is this intentional behavior? Should I just not worry about users using the older two-letter codes in the wild?

Sniffing around more, I realize that OpenType doesn’t use the ISO codes, so now I’m doubly confused.

Right. The reason for this is that OT ‘language system’ codes are really a bit misnamed: they actually represent particular orthographic preferences around the shape or behaviour of characters, which only sometimes map directly to language per se. So when the initial set of OT codes was defined, nobody saw a need to base the language system codes on ISO language codes. It’s a decision I sort of regret now, given that mapping to document language codes has turned out to be the most common implementation of OT language system support. What we should have done is define a mechanism that would have allowed ISO language codes to be used in the fonts, while also enabling non-ISO codes for language systems that don’t map directly to languages. [An example of the latter is the four-character IPPH and APPH language system codes for phonetic notation systems.]

The thing to remember is that

  • language system codes you put in your font, and
  • language tags used in HTML, XML and CSS

are two separate namespaces. It is up to the browser (with the help of the OS and rendering system) to map between the two.

Note also that, strictly speaking, the language tags used in HTML and related formats are not ISO 639 codes as such (either 2- or 3-letter) but BCP 47 language tags, which take their primary language subtags from ISO 639. For more info, see https://www.w3.org/International/articles/language-tags/.
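
To make the two-namespace point concrete, here is a rough Python sketch of the kind of lookup a browser or shaping engine has to do between a document's BCP 47 tag and a font's OpenType language system tag. It is only an illustration: the table is a tiny hand-picked subset, the function name ot_language_tag is made up for this example, and the "dflt" fallback stands in for "use the script's default language system". Real engines ship much larger tables and handle script and region subtags as well.

```python
# Hypothetical, partial lookup table (an assumption for illustration):
# BCP 47 primary language subtag -> OpenType language system tag.
BCP47_TO_OT = {
    "ca": "CAT",  # Catalan
    "ro": "ROM",  # Romanian
    "tr": "TRK",  # Turkish
    "nl": "NLD",  # Dutch
    "pl": "PLK",  # Polish
}

# Stand-in for "no specific language system; use the script default".
DEFAULT_LANGSYS = "dflt"


def ot_language_tag(bcp47_tag: str) -> str:
    """Return a 4-character OpenType language system tag for a BCP 47 tag.

    Only the primary language subtag is considered, so "ca" and "ca-ES"
    both resolve to the same entry. Unknown languages fall back to the
    default placeholder.
    """
    primary = bcp47_tag.split("-")[0].lower()
    tag = BCP47_TO_OT.get(primary, DEFAULT_LANGSYS)
    # OpenType tags are always four characters, padded with trailing spaces.
    return tag.ljust(4)


if __name__ == "__main__":
    for lang in ("ca", "ca-ES", "RO", "xx"):
        print(repr(lang), "->", repr(ot_language_tag(lang)))
```

This is also why adding .loclCA glyphs does nothing: the font only ever needs the OT tag (CAT for Catalan), and turning lang="ca" into that tag is the renderer's job, not the font's.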

Thanks both for the info! So to clarify, you’re saying that the failure to map lang="ca" to .loclCAT in my font is a failure of the browser (in this case, Safari)?

Have you tested any other browsers?

In recent testing I found that Firefox maps HTML lang attributes to the font’s locl features, while Safari and Chrome don’t.

For me, it worked in Chrome but not in Safari. Thanks all for the explanation!