Dutch IJ Ligatures

The tutorial for Dutch ij and IJ describes a procedure for producing ligature designs of all ij variations.

My question is: why should I add the glyphs
to the font instead of
iacute_jacute.loclNLD?

I understand that Dutch has ij with dots and ij with acutes on both letters. Why would I want mixed forms like íj?

I also noticed that the mentioned locl feature for language NLD with
Iacute_Jacute.loclNLD and
won’t be generated automatically, though.

Could somebody please explain? Thank you in advance!

You cannot type jacute. That is why.

Please read the tutorial carefully. It is not about creating the ‘mixed form’ you mention, quite the opposite. It describes what the problem is, and why this is a solution that works.

iacute_j.loclNLD is actually ij with two acutes.
Just rename your iacute_jacute.loclNLD to iacute_j.loclNLD and Glyphs will do the rest.

Aaah! OK. Now I get it. Thank you, mekkablue and FAlthausen.

I wonder why you would sub the ligature from ‘iacute+j’, and not ‘iacute+j+acutecomb’? I don’t feel comfortable building such hacks into a (retail) font.

You are right. It currently breaks the ‘proper’ Unicode solution for an accented uppercase IJ, Iacute + J + acutecomb, where the J may end up with two acutes on top of each other because the J is replaced by Jacute: Iacute + Jacute + acutecomb. Admittedly, it is a pretty rare case, but must be dealt with nevertheless. (The lowercase is not a problem because ccmp replaces the j before a combining top accent with jdotless.)

So this is the solution that accommodates both, the way it is typed with current keyboards (where combining accents are generally inaccessible), and the way it is supposed to be entered according to Unicode. I passed it on to Georg, perhaps it will end up in one of the upcoming updates:

lookup NLD {
    language NLD;
    sub iacute j' by jacute;
    ignore sub J' [@MarkscombCase @Markscomb];
    sub Iacute J' by Jacute;
} NLD;

To use it now, deactivate feature automation for locl, remove the NLD lines, and add these lines at the end of the locl feature code. This solution assumes the presence of class definitions for @MarkscombCase and @Markscomb in ccmp, and that ccmp comes before locl.
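For reference, the two classes that the lookup relies on might look roughly like the sketch below. The glyph lists are assumptions for illustration only. The actual classes should list whichever combining top marks (and their .case variants) exist in your font, and if ccmp already defines them, there is no need to define them again.

```fea
# Hypothetical class definitions; adjust the glyph lists to your font.
# Lowercase-height combining top marks:
@Markscomb = [dieresiscomb gravecomb acutecomb circumflexcomb tildecomb];
# Their cap-height variants:
@MarkscombCase = [dieresiscomb.case gravecomb.case acutecomb.case circumflexcomb.case tildecomb.case];
```

With classes like these in place, the ignore rule leaves a J alone whenever a combining mark follows it, so Iacute + J + acutecomb is not turned into Iacute + Jacute + acutecomb.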

I don’t think we should cheat, just because glyphs aren’t easily typed on a keyboard. It is better to make a dedicated keyboard shortcut for the combining acute.

There really is no need for a dedicated jacute glyph, assuming CCMP already swaps ‘j’ for ‘jdotless’ when it is followed by a combining mark. If you, however, want to kern Jacute/jacute against other things, I suggest the following code in CCMP:

sub J acutecomb by Jacute;
sub j acutecomb by jacute;

This can also be handled with contextual kerning.

If your font has a special Dutch IJ/ij dlig, I would suggest the following DLIG feature:

lookup NLD {
    language NLD;
    sub i j by i_j.dlig_loclNLD;
    sub iacute j acutecomb by iacute_j_acutecomb.dlig_loclNLD;
    sub I J by I_J.dlig_loclNLD;
    sub Iacute J acutecomb by Iacute_J_acutecomb.dlig_loclNLD;
    # possibly, you could also include the old IJ ij digraphs like so:
    sub ij acutecomb by iacute_j_acutecomb.dlig_loclNLD;
    sub IJ acutecomb by Iacute_J_acutecomb.dlig_loclNLD;
} NLD;

Unfortunately, there is a bug in Adobe InDesign that shows the discretionary ligature feature in brackets (i.e. not included in the font) when this is the only DLIG entry.

Some Dutch don’t want a language tag at all there, because they are not always available in publishing tools, but IMO you should not expect proper typography from a tool that doesn’t support it.

Well, ‘cheating’. It is enabling backwards-compatibility. You can regard the contextual jacute as a representation form for Dutch íj (iacute-j). For the way texts have been encoded in the past thirty years (and probably will be for a long time to come), it is the only working solution. Changing the text base seems unlikely, and since it is including historic transcriptions and legal content, often not even possible. (In these cases, transcripts include a note that the acute on the j in print could not be recreated in the text. There are some examples on dbnl.)

There is a similar encoding problem, and a similar solution in place for Romanian/Moldovan. The only difference is that, for these languages, new codepoints have been created. Not so for Dutch.

I agree with you, if only out of pure pragmatism: ÍJ is possible in other languages, where there must not be an acute on the J. There is even the single case of the accented proper name FÍJI/Fíji in a Dutch-language context. I once added an ignore rule just for that very case. But what I would rather advise in cases like these is using a non-joiner on the user’s side.
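An ignore rule for that one case could be sketched as follows. The exact rule is not given in this thread, so this is an assumption: it reuses the contextual jacute/Jacute substitution discussed above, and the ignore lines have to come before the substitutions they are meant to block, in the same lookup.

```fea
# Sketch only: exempt the proper name Fíji/FÍJI from the Dutch substitution.
# Goes in locl under "language NLD;", in the same lookup as the sub rules.
ignore sub F iacute j' i;   # Fíji keeps its plain j
ignore sub F Iacute J' I;   # FÍJI keeps its plain J
sub iacute j' by jacute;
sub Iacute J' by Jacute;
```

This only covers the single name. Any other word that needs íj with a plain j in Dutch-tagged text would still need a non-joiner between the two letters.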

The assumption that all Dutch text must be following the current official spelling is problematic.

See for example Zijn dat kooplieden of zijn kooplieden dat?, De Nieuwe Taalgids. Jaargang 54 - dbnl where the author clearly made the choice of only putting the acute on the first letter of digraphs: dat zijn zíj níet, de méisjes wél, En díe kerels, dát zijn kóoplieden ; Beschouwingen en opmerkingen door L. van Deyssel., De Nieuwe Gids. Jaargang 48 - dbnl with zíjn, níet, méeste, díe, dóor, zíen, íets; Zes idyllische gedichten, Verzameld werk. Deel 1. Lyrische poëzie, Karel van de Woestijne - dbnl with Zóo, vréemdeling, níet, híj; or Taal van alle tijden door prof. dr. K. Heeroma, Onze Taal. Jaargang 39 - dbnl on the first letter of diphthongs with míjn, bóeken, níet, úitgesproken but óók and géén; or even [Blinkend licht splinterde fijn], Verzen, Herman Gorter - dbnl where the grave is used with , zìj; or Klemtoonverschuiving Is het Nederlands een ‘hangmat’-taal?, Onze Taal. Jaargang 61 - dbnl with many more; and so on.

On dbnl.org, there are actually between 4 and 5 times more pages with zíjn and níet, with the acute on the first letter only, than pages with zíjn and níét, with two acutes on niet but not on zijn. This could mean that the majority of zíjn on dbnl.org were intentional, if we assume the níet were. This is still true even when we exclude the pages with niét, with the acute on the second letter, which seems like a typo at best. On the web the proportions are different, it looks like about half that ratio, but it is still a significant portion of text where it seems intentional. Ironically, officielebekendmakingen.nl has a roughly 50/50 ratio.

Looking at books, there’s still a significant portion, see Naakt model by C.S. Adama van Scheltema published in 1917 with ìs, wíj, àndere, ónzen; or Twee brieven uit Westerbork by Etty Hillesum published in 1943 with hóe, nóu, Wíj, zíj, zíjn, díe, wàt, jíj.
You can see the authors of The Ethics and Religious Philosophy of Etty Hillesum, published in 2014, using Hillesum’s spelling of wíj. Note that the Brill typeface doesn’t substitute it to look like wíj́ whether it’s tagged as Dutch or not. Maybe it’s onto something.
It’s not just old books or old text. See also ‘Indische is een gevoel’ by Marlene de Vries published in 2009 at Amsterdam University Press with zíjn, díe, níet, líegt, íjskoud, zíel, dóen, júist; or the edition of Joe Speedboat by Tommy Wieringa published in 2008 at De Bezige Bij with rúikt, móest, móet, bedóelde, jíj, híj, míj; Villa Serena by Marion Pauw published in 2009 at Archipel with híj, jíj, níet, schíetschijf; and so on. They follow spelling rules, just not the current official ones.

All these are lost to the reader when shown in a font that substitutes íj.

The use of the acute as a stress mark on the first two letters of vowels written with multiple letters only settled as a government-sanctioned rule in the 1995-1996 spelling. Jan Renkema mentions the different rules in Schrijfwijzer, 1989, and says they are replaced by the acute rule in later editions. It doesn’t seem to be in the 1954 spelling. Before 1995 the grave was also in use for short vowels, and there was some variation in marking stress. Even after 1995, some editors intentionally only put the stress mark on the first letter of what they consider to be diphthongs, like níet, or of digraphs, like vóor.

Unicode had j + combining acute standardized in its encoding in 1992. Dutch spelling rules settled on the current ones for marked stress in 1995-1996. It’s a bother Dutch data was input in Dutch ASCII encodings or ISO 8859 encodings, or with keyboard layouts that didn’t support the j acute. While we know in many cases the users probably want the j acute, certainly when there’s a note like you mention or when they use it, we also know in a significant number of cases they did not want it.

Even today, the Onze Taal spelling recommendations slightly differ from the Taalunie official spelling. They may agree on the stress mark rules but disagree on other topics. Some style guides like the one of the Financieele Dagblad (see Dutch IJ with dots - Page 3 — TypeDrawers) disagree on the stress mark rules. Some users do stick to the pre-1996 rules they learned.

What if tomorrow Taalunie or Onze Taal decide that the marked stress rule can be more tolerant or different?

The user should decide what rules they follow. Unless we know what the user decided, it’s problematic to guess.