are missing under Hebrew when you generate a new font, add Hebrew under Languages, and look at any of the subsets Basic, Punctuation, Composites, Diacritic Marks, and Cantillation and Extras. These would best be categorized under Basic, since double-vav and double-yud function very much as base characters. Although Hebrew is listed under “Languages”, it should be regarded as the Hebrew script, which has been used for many languages from antiquity to the present, much as the Latin script is used for Latin as well as English and others. Note that the character
U+FB1F [ײַ] HEBREW_LIGATURE_YIDDISH_YOD_YOD_PATAH
is present, specifically in the subset Cantillation and Extras. There, it’s called “yodyodpatah-hb”.
Thanks for the input, I will try and add those to the sidebar. Would it make sense to have a subgroup ‘Yiddish’, next to ‘Basic’ for instance?
For now, you can either use Window > Glyph Info for finding and adding glyphs that are not in the sidebar, or just Glyphs > Add Glyphs…, and paste vavvav-hb vavyod-hb yodyod-hb to add those three.
Shall we remove yodyodpatah-hb from the sidebar? It is a representation form, it seems, and should not be used anyway, at least as a character. But does it make sense to have it in a Hebrew font?
That is a misnomer we have had since version 1, in the vain attempt of making things sound simpler. We will probably change that in a future version of the app. Technically you will also find languages and language groups in the foldouts, but that is beside the point. And it gets worse: Script is wrong too, because the section is not limited to writing systems. Math symbols, emojis and musical notation are not scripts, either.
Would it make sense to have a subgroup ‘Yiddish’, next to ‘Basic’ for instance?
For purposes of the Hebrew script, it would make sense to include the three so-called Yiddish digraphs in Basic. This helps prevent them from being marginalized while avoiding the need to introduce another subset.
The double-yod and double-vav digraphs function very much like base characters and have historically been used for languages other than Yiddish, with a variety of diacritic combinations. The vav-yod digraph functions as a base character as well in certain contexts, but to a lesser degree.
The double-yod digraph is also used in Hebrew with a qamats centered underneath in a well-attested liturgical practice, where יי serves as a written substitute for the Tetragrammaton when the form carries a prefix (e.g., ל or ו).
Shall we remove yodyodpatah-hb from the sidebar? It is a representation form, it seems, and should not be used anyway, at least as a character. But does it make sense to have it in a Hebrew font?
I would not recommend removing yodyodpatah-hb. It does belong in a Hebrew (or more generally Hebrew-script) font. However, it would be more appropriately placed under Composites rather than Cantillation and Extras.
Precomposed characters are needed in certain computing environments and for specific use cases, but I won’t go into detail here, as there are many presentation forms of the same type.
I also noticed that the Diacritic Marks set is missing U+05BF [ֿ] HEBREW POINT RAFE.
This should be included. It is used in Ladino and Yiddish, was used in historical Hebrew-language texts, and is also found in some modern liturgical contexts as an accent marker.
Not convinced, sorry. Fighting marginalization of letters, as noble as the cause may be, is not the point of the Basic set. And a subset does not marginalize, just as we are not marginalizing Aacute because it is not in Latin Basic. The question is: is it possible to make a font that supports exclusively Ivrit (Modern Hebrew), and what do I need to add for supporting Yiddish? A set can indicate that I have my coverage intact, and encourage expansion.
Can you point me to reliable resources for Yiddish written in Hebrew script?
Yiddish is always written in the Hebrew script. The definitive resource on standard Yiddish orthography is this book; a previous edition can be found scanned here. But FYI, this reference work is written in Yiddish and is not available in translation.
I’m going to leave a more practical comment in my next message. (As a new user, I hit my message limit on links and embedded images.)
Here is one reputable resource in English about the Yiddish writing system. Under Table of Contents, if you click on How Yiddish Reading is Taught in YiddishPOP, you’ll see in the first paragraph that the sample words meaning ‘Read’ and ‘Write’ both contain the “double yod” digraph (לײען, שרײַב). Even more relevant is the section Letter Index, where you’ll see that װ, ױ, ײ are represented as single Unicode characters.
…are regularly used in digital Yiddish text. So, if Glyphs were to have a “Yiddish” subgroup inside “Hebrew,” these three should be located there. But arguably, they could also belong to the “Basic” group, as they are letter forms included in the main Unicode Hebrew block, and they function as singleton (digraph) letters with respect to combining diacritics. We see this in ײַ (a HEBREW_LIGATURE_YIDDISH_DOUBLE_YOD followed by a HEBREW_POINT_PATAH), which is used on every standard Yiddish website today. Here’s an example from a reputable academic journal.
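To make the "digraph plus combining mark" point concrete, here is a minimal Python sketch that lists the code points in ײַ as it is normally typed. The helper name `describe` is made up for illustration; only the standard `unicodedata` module is used.

```python
import unicodedata

def describe(text):
    """Return one 'U+XXXX NAME' string per code point in text."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]

# The double-yod digraph followed by a combining patah: two code points.
yod_yod_patah = "\u05F2\u05B7"

for line in describe(yod_yod_patah):
    print(line)
```

The digraph is a single letter-like character, and the patah attaches to it as a combining mark, which is the behaviour a font would need to support through mark positioning.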
Finally: This character…
U+FB1F [ײַ] HEBREW_LIGATURE_YIDDISH_YOD_YOD_PATAH
…is one of the Alphabetic Presentation Forms. Any good Yiddish typeface should cover it. For consistency, I would recommend that all five of these characters…
…be moved to the subgroup “Composites.” In fact, I note that the “Composites” subgroup in Glyphs already includes ~9 other composite characters used in Yiddish. There’s nothing that makes these 5 (including the final one, ײַ) different from those 9 others.
Honestly, I am reluctant about the composites. Are there keyboard layouts that still produce the presentation forms directly (just as fi and fl remain on many Latin Mac keyboards)?
So: if we have rafe-hb with the right anchors in it, can’t we rely on mark positioning? Then I would collect presentation forms as ‘Legacy Composites’.
Or would a font with mark positioning but without the composites still fail in common Yiddish app usage contexts?
What I have so far (for now without presentation forms) for a Yiddish glyphset:
@mekkablue Thanks for your attention to detail and questions!
Your table is missing the final letter allographs (required for Hebrew and all Hebrew-scripted languages; they’re already in the Glyphs “Basic” subgroup). But it’s otherwise correct.
So, here is the full table for standard Yiddish:
| Char | Unicode | Unicode Name |
| --- | --- | --- |
| א | U+05D0 | HEBREW LETTER ALEF |
| ב | U+05D1 | HEBREW LETTER BET |
| ג | U+05D2 | HEBREW LETTER GIMEL |
| ד | U+05D3 | HEBREW LETTER DALET |
| ה | U+05D4 | HEBREW LETTER HE |
| ו | U+05D5 | HEBREW LETTER VAV |
| ז | U+05D6 | HEBREW LETTER ZAYIN |
| ח | U+05D7 | HEBREW LETTER HET |
| ט | U+05D8 | HEBREW LETTER TET |
| י | U+05D9 | HEBREW LETTER YOD |
| ך | U+05DA | HEBREW LETTER FINAL KAF |
| כ | U+05DB | HEBREW LETTER KAF |
| ל | U+05DC | HEBREW LETTER LAMED |
| ם | U+05DD | HEBREW LETTER FINAL MEM |
| מ | U+05DE | HEBREW LETTER MEM |
| ן | U+05DF | HEBREW LETTER FINAL NUN |
| נ | U+05E0 | HEBREW LETTER NUN |
| ס | U+05E1 | HEBREW LETTER SAMEKH |
| ע | U+05E2 | HEBREW LETTER AYIN |
| ף | U+05E3 | HEBREW LETTER FINAL PE |
| פ | U+05E4 | HEBREW LETTER PE |
| ץ | U+05E5 | HEBREW LETTER FINAL TSADI |
| צ | U+05E6 | HEBREW LETTER TSADI |
| ק | U+05E7 | HEBREW LETTER QOF |
| ר | U+05E8 | HEBREW LETTER RESH |
| ש | U+05E9 | HEBREW LETTER SHIN |
| ת | U+05EA | HEBREW LETTER TAV |
| װ | U+05F0 | HEBREW LIGATURE YIDDISH DOUBLE VAV |
| ױ | U+05F1 | HEBREW LIGATURE YIDDISH VAV YOD |
| ײ | U+05F2 | HEBREW LIGATURE YIDDISH DOUBLE YOD |
| ִ | U+05B4 | HEBREW POINT HIRIQ |
| ַ | U+05B7 | HEBREW POINT PATAH |
| ָ | U+05B8 | HEBREW POINT QAMATS |
| ּ | U+05BC | HEBREW POINT DAGESH OR MAPIQ |
| ֿ | U+05BF | HEBREW POINT RAFE |
| ׁ | U+05C1 | HEBREW POINT SHIN DOT |
| ׂ | U+05C2 | HEBREW POINT SIN DOT |
Note: U+05C1 HEBREW POINT SHIN DOT is not actually used in YIVO Yiddish orthography, but it is used in some other standardized spelling systems and so I included it in the table.
If you wanted to create a new “Yiddish” subgroup, then it should include everything in this table from װ U+05F0 downwards.
FYI, standard Yiddish also needs the following punctuation symbols:
| Char | Unicode | Unicode Name |
| --- | --- | --- |
| ־ | U+05BE | HEBREW PUNCTUATION MAQAF |
| ׳ | U+05F3 | HEBREW PUNCTUATION GERESH |
| ״ | U+05F4 | HEBREW PUNCTUATION GERSHAYIM |
| “ | U+201C | LEFT DOUBLE QUOTATION MARK |
| „ | U+201E | DOUBLE LOW-9 QUOTATION MARK |
The first three are also used in Hebrew and are currently in the Glyphs Hebrew “Punctuation” subgroup. But the last two should be included in any font with Yiddish coverage.
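As a rough way to check a font against these requirements, here is a small Python sketch. It assumes you already have the set of code points your font maps (for example, read from its cmap); the names `YIDDISH_REQUIRED` and `missing_for_yiddish` are made up for illustration, and the code points come from the two tables above.

```python
# Code points taken from the letter, diacritic and punctuation tables above.
YIDDISH_LETTERS = set(range(0x05D0, 0x05EB)) | {0x05F0, 0x05F1, 0x05F2}
YIDDISH_MARKS = {0x05B4, 0x05B7, 0x05B8, 0x05BC, 0x05BF, 0x05C1, 0x05C2}
YIDDISH_PUNCTUATION = {0x05BE, 0x05F3, 0x05F4, 0x201C, 0x201E}
YIDDISH_REQUIRED = YIDDISH_LETTERS | YIDDISH_MARKS | YIDDISH_PUNCTUATION

def missing_for_yiddish(font_codepoints):
    """Return the required code points absent from font_codepoints, sorted."""
    return sorted(YIDDISH_REQUIRED - set(font_codepoints))

# Hypothetical font that covers only the 27 basic Hebrew letters.
basic_only = set(range(0x05D0, 0x05EB))
for cp in missing_for_yiddish(basic_only):
    print(f"U+{cp:04X}")
```

An empty result would mean the font maps everything in the tables; it says nothing about whether anchors and mark positioning are set up correctly.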
Some writers also use these (here is one prominent example), but they are definitely more marginal:
| Char | Unicode | Unicode Name |
| --- | --- | --- |
| ′ | U+2032 | PRIME |
| ″ | U+2033 | DOUBLE PRIME |
| ⸗ | U+2E17 | DOUBLE OBLIQUE HYPHEN |
Yes, there are some keyboard layouts that use the presentation forms directly. My own Yiddish Klal, which is widely used by Yiddish writers on macOS, does not, but I offer a variant that outputs the presentation forms (called “Yiddish Klal Ligatur”). The reason is that a number of word processing programs, including Google Docs for many years, have/had a bug that only allowed one-char-per-keystroke input, and people requested I create Yiddish Klal Ligatur. I also know folks who type in those presentation forms for their design projects, and my colleagues and I have used the presentation forms when training text-to-speech and speech-to-text models. However, you are correct that a font with anchor points and mark positioning should work for the vast majority of usage contexts, and in that sense the Composites could be renamed “Legacy Composites.”
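For text that was typed with presentation forms, Unicode normalization already maps them to the base-plus-mark sequences, because the Hebrew presentation forms have canonical decompositions and are excluded from recomposition. A minimal Python sketch (the function name is made up for illustration):

```python
import unicodedata

def decompose_presentation_forms(text):
    """Normalize text to NFC; Hebrew presentation forms such as U+FB1F
    come out as base letter plus combining mark."""
    return unicodedata.normalize("NFC", text)

# U+FB1F typed as a single precomposed character.
typed = "\uFB1F"
normalized = decompose_presentation_forms(typed)
print([f"U+{ord(ch):04X}" for ch in normalized])
```

Text normalized this way only needs the base glyphs, the marks and working mark positioning in the font.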
Thanks for the insightful posting. I guess for now, we’ll add the presentation forms. They should eventually phase out though.
Is it a good idea to decompose presentation forms in ccmp? We do this for Latin presentation forms like fi automatically, so you do not need to add an fi.sc, for instance.
A separate “Yiddish” subgroup makes sense. I would include all Yiddish-specific characters there, including presentation forms. That is also where I would keep yodyodpatah-hb. Maybe there should also be a small “Historical Characters” subsection.
Also, do not forget this cute historical character: ׯ (uni05EF), as well as FB2E, FB2F, FB4C, FB4D and FB4E.
I would not remove yodyodpatah-hb entirely. Even if it is not recommended for modern text, it is still useful for compatibility, OCR, older texts and fuller Yiddish support. I also use it as a yod-yod smiley י‿י in some of my designs.
Okay, so this is what I have now as the Yiddish subgroup. I took out the 27 basic characters, because they are in the ‘Basic’ subset already:
| Character | Unicode | Glyph name | Unicode name |
| --- | --- | --- | --- |
| װ | U+05F0 | vavvav-hb | HEBREW LIGATURE YIDDISH DOUBLE VAV |
| ױ | U+05F1 | vavyod-hb | HEBREW LIGATURE YIDDISH VAV YOD |
| ײ | U+05F2 | yodyod-hb | HEBREW LIGATURE YIDDISH DOUBLE YOD |
| ִ | U+05B4 | hiriq-hb | HEBREW POINT HIRIQ |
| ַ | U+05B7 | patah-hb | HEBREW POINT PATAH |
| ָ | U+05B8 | qamats-hb | HEBREW POINT QAMATS |
| ּ | U+05BC | dagesh-hb | HEBREW POINT DAGESH OR MAPIQ |
| ֿ | U+05BF | rafe-hb | HEBREW POINT RAFE |
| ׁ | U+05C1 | shindot-hb | HEBREW POINT SHIN DOT |
| ׂ | U+05C2 | sindot-hb | HEBREW POINT SIN DOT |
| “ | U+201C | quotedblleft | LEFT DOUBLE QUOTATION MARK |
| „ | U+201E | quotedblbase | DOUBLE LOW-9 QUOTATION MARK |
| ⸗ | U+2E17 | doubleobliquehyphen | DOUBLE OBLIQUE HYPHEN |
| אַ | U+FB2E | alefpatah-hb | HEBREW LETTER ALEF WITH PATAH |
| אָ | U+FB2F | alefqamats-hb | HEBREW LETTER ALEF WITH QAMATS |
| בֿ | U+FB4C | betrafe-hb | HEBREW LETTER BET WITH RAFE |
| כֿ | U+FB4D | kafrafe-hb | HEBREW LETTER KAF WITH RAFE |
| פֿ | U+FB4E | perafe-hb | HEBREW LETTER PE WITH RAFE |
| שׂ | U+FB2B | shinsindot-hb | HEBREW LETTER SHIN WITH SIN DOT |
| כּ | U+FB3B | kafdagesh-hb | HEBREW LETTER KAF WITH DAGESH |
| תּ | U+FB4A | tavdagesh-hb | HEBREW LETTER TAV WITH DAGESH |
| בּ | U+FB31 | betdagesh-hb | HEBREW LETTER BET WITH DAGESH |
| יִ | U+FB1D | yodhiriq-hb | HEBREW LETTER YOD WITH HIRIQ |
| ײַ | U+FB1F | yodyodpatah-hb | HEBREW LIGATURE YIDDISH YOD YOD PATAH |
| וּ | U+FB35 | vavdagesh-hb | HEBREW LETTER VAV WITH DAGESH |
Did I overlook something?
Notes:
I left out the prime/double prime glyphs for now because I could not confirm them in typographically proper use. Is this people trying to type proper quotes?
The punctuation glyphs maqaf-hb, geresh-hb, gershayim-hb make up the ‘Punctuation’ subgroup already, so I left them out here as well.
I still wonder if we should do a Hebrew ccmp like this one:
lookup ccmp_hebrew {
    sub yodhiriq-hb by yod-hb hiriq-hb;
    sub betdagesh-hb by bet-hb dagesh-hb;
    # and so on, you get the idea
} ccmp_hebrew;
Correct me if I am wrong, but there should not be a visible difference between a precomposed betdagesh-hb and a bet-hb + dagesh-hb, right? So, I would decompose all U+FBxx glyphs into their equivalents in the ccmp feature.
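If it helps, the full set of rules could be generated from the Unicode canonical decompositions rather than typed by hand. This is only a sketch in plain Python: the base-letter names that do not appear in this thread (alef-hb, kaf-hb, pe-hb, shin-hb, tav-hb, vav-hb) are assumed to follow the same naming pattern, and `ccmp_rules` is a made-up helper, not part of any Glyphs API.

```python
import unicodedata

# Glyph names for the characters involved. Base-letter names not quoted in
# the thread are assumptions that follow the same "-hb" pattern.
GLYPH_NAMES = {
    0x05D0: "alef-hb", 0x05D1: "bet-hb", 0x05D5: "vav-hb", 0x05D9: "yod-hb",
    0x05DB: "kaf-hb", 0x05E4: "pe-hb", 0x05E9: "shin-hb", 0x05EA: "tav-hb",
    0x05F2: "yodyod-hb",
    0x05B4: "hiriq-hb", 0x05B7: "patah-hb", 0x05B8: "qamats-hb",
    0x05BC: "dagesh-hb", 0x05BF: "rafe-hb", 0x05C2: "sindot-hb",
    0xFB1D: "yodhiriq-hb", 0xFB1F: "yodyodpatah-hb", 0xFB2B: "shinsindot-hb",
    0xFB2E: "alefpatah-hb", 0xFB2F: "alefqamats-hb", 0xFB31: "betdagesh-hb",
    0xFB35: "vavdagesh-hb", 0xFB3B: "kafdagesh-hb", 0xFB4A: "tavdagesh-hb",
    0xFB4C: "betrafe-hb", 0xFB4D: "kafrafe-hb", 0xFB4E: "perafe-hb",
}

def ccmp_rules(codepoints):
    """Build one 'sub X by Y Z;' rule per code point that has a canonical
    decomposition in the Unicode database."""
    rules = []
    for cp in codepoints:
        parts = unicodedata.decomposition(chr(cp)).split()
        if not parts or parts[0].startswith("<"):
            continue  # no decomposition, or only a compatibility mapping
        targets = " ".join(GLYPH_NAMES[int(p, 16)] for p in parts)
        rules.append(f"sub {GLYPH_NAMES[cp]} by {targets};")
    return rules

# All presentation forms (U+FBxx) from the Yiddish subgroup table.
presentation_forms = sorted(cp for cp in GLYPH_NAMES if cp >= 0xFB00)

print("lookup ccmp_hebrew {")
for rule in ccmp_rules(presentation_forms):
    print("    " + rule)
print("} ccmp_hebrew;")
```

The printed lookup could then be pasted into the ccmp feature. Whether the decomposed rendering matches the precomposed glyph still depends on the anchors in the base letters and marks.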
Also, I removed these from ‘Cantillation and Extras’, since they are now at home in the ‘Yiddish’ subgroup: