Bugs with non-Latins

Bendy · November 7, 2016, 8:56am

I’m still experiencing a couple of bugs that are preventing Glyphs from exporting properly-produced fonts. I know I’ve mentioned them before, but I’ve been struggling with them for quite a while, and they seem like important issues, so I’m hoping this will prompt some attention. My experience comes from working on Burmese, Cham, Khmer, Lao and Thai. I’m on version 231 build 897 for most of my current work, though I now have four versions installed for different projects.

Kerning, mark positioning and mark-to-mark features are essential for these scripts. However, as Glyphs auto-generates these features and hides them behind the scenes, it’s not possible to edit them. Unfortunately Glyphs does not include script and language tags in these features. This means any language-tagged text (such as HTML lang=“th”) loses all kerning and mark positioning.

Glyph properties (called ‘info for selection’). When marks need to attach to a base, that base needs to be classed in category ‘letter’ and in subcategory ‘spacing’, otherwise the mark feature does not write lookups to combine the mark with the base. In all these scripts, ligatures, numerals and punctuation also need to carry marks, which they cannot do in Glyphs currently unless they are manually classed as spacing letters.

An associated problem is that for most of these scripts, spacing vowel glyphs are classed inconsistently and wrongly by default (unless I have a buggy GlyphData xml?) For example, Thai uni0E32 is equivalent to Lao 0EB2 and Khmer 17B5 and Burmese 102C. They all represent a long -aa- vowel and behave identically. However, the Thai and Lao versions are classed as Letter/Other, Khmer is classed as Mark/Spacing and Burmese is Mark/Spacing Combining. None of these is actually correct as the character is always spacing and never a mark.

What are the other subcategories meant for? Do they affect the exported fonts?

GeorgSeifert · November 9, 2016, 9:20pm

The different shapers for each script behave very differently (and users practice is not consistent, too). Some need script tags, other will not work without. I need to specify that for each script. Can you send me more info about the requirements of each script?

Until now, only letters needed marks. I’ll change that.

That is mostly based on the unicode properties. I need to correct that for each script, too. If you send me all things you find? Or could be go through the list together?

Bendy · November 9, 2016, 9:45pm

I’m trying to think of the most general approch, rather than specific behaviours for each script. Would it be possible to compile duplicate lookups in kerning and mark/mkmk so they work for dflt and for the associated non-Latin? If that’s not possible, Thai, Lao, Burmese, Khmer should all include mark/mkmk positioning, Cham tends to be just mark, and Thai, Lao, Burmese and Cham require kerning (for Khmer it’s possible to do without, though I would always want the option.)

There’s also the question of what to do when non-Latin scripts such as Georgian are used for minority dialects which use Latin diacritics such as macron; mark and mkmk features are needed as Unicode doesn’t include precomposed combinations.

Great. Now that I’ve classed all the other things as letters to get the mark feature working, is there any problem with not converting them back to what they should be? I mean, this info doesn’t get compiled into the GDEF table or anything, does it?

This was why I was making the spreadsheets for you, and our discussions about the best naming scheme for all these scripts to ensure consistency. Maybe it’s easier to talk it through though? Unicode is a mess.

GeorgSeifert · November 9, 2016, 9:54pm

This is done already for some scripts.

That is already implemented. All letter that take marks need to be in the Base class in the GDEF table. So the effect is the same.

any time

Bendy · April 4, 2017, 4:12pm

Georg, what is the best way to give you the information you need?

Also please please how can we make sure the mark and mkmk features work when a text’s language is set? This is really essential.

mekkablue · April 4, 2017, 8:48pm

You mean in the app where the font is used? That depends. They should always work in browsers, but apps like InD are a little more picky.

GeorgSeifert · April 4, 2017, 9:04pm

I changed it quite recently that all lookups are registered with the corresponding script and without. That should make sure that the features work regardless of the setting in the app. If that still don’t work as expected, please have a look at the feature code and see if you can fix it and tell me your findings. Or, if you can’t figure it out, tell me regardless and we fix it together.

Bendy · April 4, 2017, 9:20pm

Ok, I’ll make sure we’re all using the latest versions and see. Testing today in Firefox showed all marks in wrong positions when language was set to Thai.

Mark · April 5, 2017, 12:31pm

I think we are dealing with the same problem (and project) here so I want to join the party .
I worked on the Thai design along with a some HTML test sheets (mainly in Chrome) and I could manage to get the mark positioning correct. However now that I read the thread, I tried to add a HTML lang=“th” (which should be to correct language tag for Thai) and that scrumbles the marks all over the place.

So now I am confused about what to do.

Will the user not apply the language tag in web anyway?
As far as my feature code goes, Indesign does a good job with setting the marks correctly (when Adobe World-Ready […] Composer is on).

@GeorgSeifert The project is one you already know, I could send you a .glyphs file if you wish.

GeorgSeifert · April 7, 2017, 9:06am

If you could send me the ,glyphs file and the html file.

GeorgSeifert · May 13, 2017, 9:32am

That should be fixed by now.

Avantino · September 6, 2017, 1:26pm

Further to what Ben mentioned above; Manual settings of certain glyphs by the “Info for Selection” cannot be retained as user definitions…
Everytime those glyphs get Update Glyphs info command they loose user settings !
My suggestion to introduce a lock button to prevent those changes.

Bendy · August 3, 2018, 8:29am

And also please a CANCEL button.