Auto-feature suggestion for Greek

Tosche · March 12, 2016, 1:38am

Greek OpenType features suggestion:
Greek-specific features are absent from most fonts, but there should be three substitutions. The first two should ideally be handled by the text layout engine, but only InDesign is smart enough to do so. The last one works around the misuse of periodcentered caused by antiquated keyboard layouts.

Capital letters with tonos have to be substituted with normal letters:
The letters Ά Έ Ή Ί Ό Ύ Ώ only occur at the beginning of a word in lowercase-dominant text. In an all-cap setting, all tonos caps need to become tonos-less. Adding dedicated tonos-less alternates is not necessary unless you want to resolve the PDF text extraction issue.
The code logic should be the following, and be written in calt, case, and/or ccmp (given that InDesign’s all-cap option does exactly what it’s supposed to do, support in case feature may not be necessary):
If a tonos capital is followed by another capital letter, suppress
If a tonos capital is preceded by another capital letter, suppress
There are single-letter words with tonos (e.g. ώ). You need to check for them across word spaces and punctuation.
As an additional rule, if there is a diphthong whose first letter has a tonos, the second letter (either ι or υ) should have a dieresis when converted to all cap. (άυλες > ΑΫΛΕΣ)
άι έι όι ύι become ΑΪ ΕΪ ΟΪ ΥΪ in all-cap
άυ έυ ήυ όυ ώυ become ΑΫ ΕΫ ΗΫ ΟΫ ΩΫ in all-cap
Some of the vowel combinations that are not listed here do exist (e.g. ήι ώι), but they are not diphthongs and dieresis insertion should not happen.
On the Greek keyboard layout, anoteleia is not accessible, so periodcentered is used instead. When periodcentered is preceded by a Greek letter, it should be substituted with anoteleia. In dictionary use, a letter-periodcentered-letter sequence is probably a legitimate case of periodcentered and therefore should not be substituted. For the graphical difference between periodcentered and anoteleia, see this link:
http://leonidas.org/greek-type-design/examples-of-ano-teleia-use/

GeorgSeifert · March 12, 2016, 10:30am

That sounds like a good idea.

Tosche · March 14, 2016, 8:32pm

My suggestion (in ccmp?)

lookup grek_dieresisCapInsertion {
	script grek;
	sub [Alphatonos Epsilontonos Omicrontonos Upsilontonos] Iota' by Iotadieresis;
	sub [Alphatonos Epsilontonos Etatonos Omicrontonos Omegatonos] Upsilon' by Upsilondieresis;
	sub [alphatonos.sc epsilontonos.sc omicrontonos.sc upsilontonos.sc] iota.sc' by iotadieresis.sc;
	sub [alphatonos.sc epsilontonos.sc etatonos.sc omicrontonos.sc omegatonos.sc] upsilon.sc' by upsilondieresis.sc;
} grek_dieresisCapInsertion;

lookup grek_tonosCap {
	script grek;
	# tonos suppression of single-letter word
	sub @grekCapsTonos' @nonLetter @nonLetter @nonLetter @nonLetter @nonLetter @grekCaps by @grekCapsTonosless;
	sub @grekCapsTonos' @nonLetter @nonLetter @nonLetter @nonLetter @grekCaps by @grekCapsTonosless;
	sub @grekCapsTonos' @nonLetter @nonLetter @nonLetter @grekCaps by @grekCapsTonosless;
	sub @grekCapsTonos' @nonLetter @nonLetter @grekCaps by @grekCapsTonosless;
	sub @grekCapsTonos' @nonLetter @grekCaps by @grekCapsTonosless;
	sub @grekCaps @nonLetter @nonLetter @nonLetter @nonLetter @nonLetter @grekCapsTonos' by @grekCapsTonosless;
	sub @grekCaps @nonLetter @nonLetter @nonLetter @nonLetter @grekCapsTonos' by @grekCapsTonosless;
	sub @grekCaps @nonLetter @nonLetter @nonLetter @grekCapsTonos' by @grekCapsTonosless;
	sub @grekCaps @nonLetter @nonLetter @grekCapsTonos' by @grekCapsTonosless;
	sub @grekCaps @nonLetter @grekCapsTonos' by @grekCapsTonosless;
	# tonos suppression within a word
	sub @grekCapsTonos' @grekSmcp by @grekCapsTonosless;
	sub @grekCaps @grekCapsTonos' by @grekCapsTonosless;
	sub @grekCapsTonos' @grekCaps by @grekCapsTonosless;
} grek_tonosCap;

lookup grek_anoteleia {
	script grek;
	ignore sub @grekAll periodcentered' @grekAll;
	sub @grekAll periodcentered' by anoteleia;
} grek_anoteleia;

and in smcp

sub [alphatonos.sc epsilontonos.sc omicrontonos.sc upsilontonos.sc] iota.sc' by iotadieresis.sc;
sub [alphatonos.sc epsilontonos.sc etatonos.sc omicrontonos.sc omegatonos.sc] upsilon.sc' by upsilondieresis.sc;

GeorgSeifert · March 14, 2016, 10:03pm

Thanks for the code. This helps a lot.

Isn’t the long lookahead and lookbehind a bit dangerous? It might apply in normal mixed-case settings, too.

Tosche · March 14, 2016, 11:45pm

It could be, but it of course doesn’t include lowercase in the search range (only trying to find the first uppercase letter and skipping non-letter junk). I guess it should search for two uppercase letters in sequence over non-letter junk, not just one, assuming the previous or following word consists of multiple letters?
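
A minimal sketch of that stricter check, for illustration only: the tonos on a single-letter word is suppressed only when the neighbouring word shows at least two capitals in a row. The class contents (especially @nonLetter), the lookup name and the choice of ccmp are assumptions, and only one non-letter glyph between the words is handled here.

```fea
languagesystem DFLT dflt;
languagesystem grek dflt;

# Assumed class contents; adjust to the glyph set of the font.
@grekCapsTonos = [Alphatonos Epsilontonos Etatonos Iotatonos Omicrontonos Upsilontonos Omegatonos];
@grekCapsTonosless = [Alpha Epsilon Eta Iota Omicron Upsilon Omega];
@grekCaps = [Alpha Beta Gamma Delta Epsilon Zeta Eta Theta Iota Kappa Lambda Mu Nu Xi Omicron Pi Rho Sigma Tau Upsilon Phi Chi Psi Omega];
@nonLetter = [space comma period];

lookup grek_tonosCapStrict {
	# single-letter word followed by a word that starts with two capitals
	sub @grekCapsTonos' @nonLetter @grekCaps @grekCaps by @grekCapsTonosless;
	# single-letter word preceded by a word that ends in two capitals
	sub @grekCaps @grekCaps @nonLetter @grekCapsTonos' by @grekCapsTonosless;
} grek_tonosCapStrict;

feature ccmp {
	lookup grek_tonosCapStrict;
} ccmp;
```

This would still fire next to an all-caps acronym in otherwise mixed-case text, so it narrows the risk rather than removing it.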

Mark · November 8, 2016, 2:37pm

What do you Glyphs guys propose for dealing with a Greek UC-only font? In particular, I am referring to case conversion. Is it best to just add the UC glyphs as components to the LC glyphs and then rely on the user’s grammar responsibility during typesetting? And/or is there some backstage magic happening in Glyphs during export or auto feature writing that can be messed up by not having real LC? Any experiences with uppercase-only Greek?

Tosche · November 8, 2016, 6:20pm

I do have experience with UC only Greek. I simply copied tonos-less letters to tonos glyph slot, and duplicated that to lowercase. I still made dieresis conversion happen.
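
A rough sketch of what the dieresis conversion might look like in such a UC-only font, assuming the lowercase slots (alphatonos, iota and so on, in Glyphs-style names) hold capital shapes and that iotadieresis and upsilondieresis are drawn with a visible dieresis. The lookup name and the choice of ccmp are assumptions.

```fea
languagesystem DFLT dflt;
languagesystem grek dflt;

lookup grek_dieresisUCOnly {
	# άι έι όι ύι > ΑΪ ΕΪ ΟΪ ΥΪ
	sub [alphatonos epsilontonos omicrontonos upsilontonos] iota' by iotadieresis;
	# άυ έυ ήυ όυ ώυ > ΑΫ ΕΫ ΗΫ ΟΫ ΩΫ
	sub [alphatonos epsilontonos etatonos omicrontonos omegatonos] upsilon' by upsilondieresis;
} grek_dieresisUCOnly;

feature ccmp {
	lookup grek_dieresisUCOnly;
} ccmp;
```

Because the lowercase slots show capitals, these rules apply to ordinary lowercase input; the capital slots would need the equivalent rules with the uppercase glyph names.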

Mark · November 9, 2016, 10:23am

Thanks Tosche, I will take this into account.

reesederksen · September 24, 2025, 9:37pm

It’s been almost nine years since this thread was last active; is it still recommended to build Greek UC fonts the way Toshi describes? (by placing tonos-less letters in the tonos glyph slots) Or has a new standard been established?

Tosche · September 24, 2025, 9:48pm

Thanks for the reminder. A certain Leonidas in our community told me the above code needs revising, and it’s time that I correct it. Give me a bit of time please.

reesederksen · September 25, 2025, 1:56pm

No problem! Take the time you need.

reesederksen · November 4, 2025, 6:45pm

Just checking in to see if there are any developments on this!

Tosche · April 27, 2026, 5:07pm

Thanks for the wait (or apologies if it was too late). The code below was passed on to me by my former colleague Emilios Theofanous, who in turn got it from Irene Vlachou, if I’m not mistaken. I wouldn’t be surprised if there are edge cases, but I’m personally not aware of any, and it’s much better than the above anyway:

#—————————————————————————

# Greek Features

# —————————————————————————

# Recover Dieresis: for cases like μπόι / σέικερ to be ΜΠΟΪ / ΣΕΪΚΕΡ

lookup Dieresis {
sub [Alphatonos Epsilontonos Omicrontonos Omegatonos] [Iota Upsilon]' by [Iotadieresis Upsilondieresis];
} Dieresis;

# Special suppression and recovery cases for Capital

@GrCaps = [Alpha Beta Gamma Delta Epsilon Zeta Eta Theta Iota Kappa Lambda Mu Nu Xi Omicron Pi Rho Sigma Tau Upsilon Phi Chi Psi Omega Alphatonos Epsilontonos Etatonos Iotatonos Omicrontonos Upsilontonos Omegatonos Iotadieresis Upsilondieresis];
@UCtonos = [Alphatonos Epsilontonos Etatonos Iotatonos Omicrontonos Upsilontonos Omegatonos];
@UCtonosless = [Alpha Epsilon Eta Iota Omicron Upsilon Omega];

lookup GrCapitals {
sub @UCtonos' @GrCaps by @UCtonosless; # For cases like Έλενα / καλημέρα to be ΕΛΕΝΑ / ΚΑΛΗΜΕΡΑ
sub [@GrCaps Iotadieresis] @UCtonos' by @UCtonosless; # for cases like υϊός to be ΥΪΟΣ
sub @GrCaps iotadieresistonos' @GrCaps by Iotadieresis; # for cases like καΐκι to be ΚΑΪΚΙ
sub @GrCaps upsilondieresistonos' @GrCaps by Upsilondieresis; # for cases like εξαΰλωση to be ΕΞΑΫΛΩΣΗ
} GrCapitals;

SCarewe · April 27, 2026, 5:36pm

Excellent resource, thank you Toshi!

reesederksen · April 28, 2026, 3:04pm

Amazing, thanks Toshi!

michaelrafailyk · April 29, 2026, 7:52am

So, for example, Alphatonos will be substituted with just Alpha. Question: in the text (for copy/paste or when changing fonts), the Alphatonos Unicode value will still be there, right?

I remember a solution using a tonos-less Alphatonos.001 that contains a component of Alpha, but I couldn’t understand why that was required if the substitution happens at font level and doesn’t change the Unicode values in the text.

GeorgSeifert · April 29, 2026, 8:07am

There are some (rare) cases where only the glyph names are stored in a PDF (.ps file > Distiller; that is the reason we care about production glyph names at all, as they are needed to restore the characters in those cases). In those cases, you would get “Alpha” if it was substituted. But in most PDFs produced in the last 20 years, the original characters are preserved, so the names don’t matter.
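
For those rare cases, here is a hedged sketch of the alternate-glyph approach mentioned above: substitute to tonos-less alternates whose names still begin with the original glyph name. Under the Adobe Glyph List Specification, everything from the first period in a glyph name is dropped when mapping the name back to a character, so Alphatonos.001 still resolves to Ά. Only Alphatonos.001 is named in this thread; the other .001 names, the class and lookup names, and the choice of ccmp are assumptions.

```fea
languagesystem DFLT dflt;
languagesystem grek dflt;

@GrCapsPlain = [Alpha Beta Gamma Delta Epsilon Zeta Eta Theta Iota Kappa Lambda Mu Nu Xi Omicron Pi Rho Sigma Tau Upsilon Phi Chi Psi Omega Iotadieresis Upsilondieresis];
@UCtonos = [Alphatonos Epsilontonos Etatonos Iotatonos Omicrontonos Upsilontonos Omegatonos];
# Hypothetical alternates: each holds only the tonos-less base letter as a component.
@UCtonosAlt = [Alphatonos.001 Epsilontonos.001 Etatonos.001 Iotatonos.001 Omicrontonos.001 Upsilontonos.001 Omegatonos.001];
# The alternates are part of the context so that already-substituted neighbours still match.
@GrCapsAny = [@GrCapsPlain @UCtonos @UCtonosAlt];

lookup GrCapitalsAlt {
	sub @UCtonos' @GrCapsAny by @UCtonosAlt;
	sub @GrCapsAny @UCtonos' by @UCtonosAlt;
} GrCapitalsAlt;

feature ccmp {
	lookup GrCapitalsAlt;
} ccmp;
```

The rendering is the same as substituting to plain Alpha; the only difference is which name ends up in the output, so the exported production names would need to keep the same base-plus-suffix pattern.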