Many newish Arabic characters in Unicode are missing glyph info

It seems that the Arabic Extended-B and Arabic Extended-C blocks are missing entirely. Many characters from the Arabic Extended-A block are also missing (e.g. 08C8–08D2).

Thanks for the reminder.
Here is a preliminary XML for the missing glyphs:

	<glyph unicode="0870" name="alefAttachedfatha-ar" category="Letter" script="arabic" production="uni0870" direction="RTL" description="ARABIC LETTER ALEF WITH ATTACHED FATHA" />
	<glyph unicode="0871" name="alefAttachedfathatopright-ar" category="Letter" script="arabic" production="uni0871" direction="RTL" description="ARABIC LETTER ALEF WITH ATTACHED TOP RIGHT FATHA" />
	<glyph unicode="0872" name="alefRightmiddlestroke-ar" category="Letter" script="arabic" production="uni0872" direction="RTL" description="ARABIC LETTER ALEF WITH RIGHT MIDDLE STROKE" />
	<glyph unicode="0873" name="alefLeftmiddlestroke-ar" category="Letter" script="arabic" production="uni0873" direction="RTL" description="ARABIC LETTER ALEF WITH LEFT MIDDLE STROKE" />
	<glyph unicode="0874" name="alefAttachedkasra-ar" category="Letter" script="arabic" production="uni0874" direction="RTL" description="ARABIC LETTER ALEF WITH ATTACHED KASRA" />
	<glyph unicode="0875" name="alefAttachedkasrabottomright-ar" category="Letter" script="arabic" production="uni0875" direction="RTL" description="ARABIC LETTER ALEF WITH ATTACHED BOTTOM RIGHT KASRA" />
	<glyph unicode="0876" name="alefAttachedrounddotabove-ar" category="Letter" script="arabic" production="uni0876" direction="RTL" description="ARABIC LETTER ALEF WITH ATTACHED ROUND DOT ABOVE" />
	<glyph unicode="0877" name="alefAttachedrounddotright-ar" category="Letter" script="arabic" production="uni0877" direction="RTL" description="ARABIC LETTER ALEF WITH ATTACHED RIGHT ROUND DOT" />
	<glyph unicode="0878" name="alefAttachedrounddotleft-ar" category="Letter" script="arabic" production="uni0878" direction="RTL" description="ARABIC LETTER ALEF WITH ATTACHED LEFT ROUND DOT" />
	<glyph unicode="0879" name="alefAttachedrounddotbelow-ar" category="Letter" script="arabic" production="uni0879" direction="RTL" description="ARABIC LETTER ALEF WITH ATTACHED ROUND DOT BELOW" />
	<glyph unicode="087A" name="alefDotabove-ar" decompose="alef-ar, dotabove-ar" category="Letter" script="arabic" production="uni087A" direction="RTL" description="ARABIC LETTER ALEF WITH DOT ABOVE" />
	<glyph unicode="087B" name="alefAttachedfathatoprightDotabove-ar" decompose="alefAttachedfathatopright-ar, dotabove-ar" category="Letter" script="arabic" production="uni087B" direction="RTL" description="ARABIC LETTER ALEF WITH ATTACHED TOP RIGHT FATHA AND DOT ABOVE" />
	<glyph unicode="087C" name="alefRightmiddlestrokeDotabove-ar" decompose="alefRightmiddlestroke-ar, dotabove-ar" category="Letter" script="arabic" production="uni087C" direction="RTL" description="ARABIC LETTER ALEF WITH RIGHT MIDDLE STROKE AND DOT ABOVE" />
	<glyph unicode="087D" name="alefAttachedkasrabottomrightDotabove-ar" category="Letter" script="arabic" production="uni087D" direction="RTL" description="ARABIC LETTER ALEF WITH ATTACHED BOTTOM RIGHT KASRA AND DOT ABOVE" />
	<glyph unicode="087E" name="alefAttachedfathatoprightRingleft-ar" category="Letter" script="arabic" production="uni087E" direction="RTL" description="ARABIC LETTER ALEF WITH ATTACHED TOP RIGHT FATHA AND LEFT RING" />
	<glyph unicode="087F" name="alefRightmiddlestrokeRingleft-ar" category="Letter" script="arabic" production="uni087F" direction="RTL" description="ARABIC LETTER ALEF WITH RIGHT MIDDLE STROKE AND LEFT RING" />
	<glyph unicode="0880" name="alefAttachedkasrabottomrightRingleft-ar" category="Letter" script="arabic" production="uni0880" direction="RTL" description="ARABIC LETTER ALEF WITH ATTACHED BOTTOM RIGHT KASRA AND LEFT RING" />
	<glyph unicode="0881" name="alefAttachedhamzaright-ar" category="Letter" script="arabic" production="uni0881" direction="RTL" description="ARABIC LETTER ALEF WITH ATTACHED RIGHT HAMZA" />
	<glyph unicode="0882" name="alefAttachedhamzaleft-ar" category="Letter" script="arabic" production="uni0882" direction="RTL" description="ARABIC LETTER ALEF WITH ATTACHED LEFT HAMZA" />
	<glyph unicode="0883" name="tatweelOverstruckhamza-ar" category="Letter" script="arabic" production="uni0883" direction="RTL" description="ARABIC TATWEEL WITH OVERSTRUCK HAMZA" />
	<glyph unicode="0884" name="tatweelOverstruckwaw-ar" category="Letter" script="arabic" production="uni0884" direction="RTL" description="ARABIC TATWEEL WITH OVERSTRUCK WAW" />
	<glyph unicode="0885" name="tatweelTwodotsbelow-ar" decompose="tatweel-ar, twodotsbelow-ar" category="Letter" script="arabic" production="uni0885" direction="RTL" description="ARABIC TATWEEL WITH TWO DOTS BELOW" />
	<glyph unicode="0886" name="yehThin-ar" category="Letter" script="arabic" production="uni0886" direction="RTL" description="ARABIC LETTER THIN YEH" />
	<glyph unicode="0887" name="baselineRounddot-ar" category="Letter" script="arabic" production="uni0887" direction="RTL" description="ARABIC BASELINE ROUND DOT" />
	<glyph unicode="0888" name="raisedRounddot-ar" category="Symbol" subCategory="Modifier" script="arabic" production="uni0888" direction="RTL" description="ARABIC RAISED ROUND DOT" />
	<glyph unicode="0889" name="noonVinverted-ar" decompose="noon-ar, vinvertedabove-ar" category="Letter" script="arabic" production="uni0889" direction="RTL" description="ARABIC LETTER NOON WITH INVERTED SMALL V" />
	<glyph unicode="088A" name="hahVinvertedbelow-ar" category="Letter" script="arabic" production="uni088A" direction="RTL" description="ARABIC LETTER HAH WITH INVERTED SMALL V BELOW" />
	<glyph unicode="088B" name="tahDotbelow-ar" decompose="tah-ar, dotbelow-ar" category="Letter" script="arabic" production="uni088B" direction="RTL" description="ARABIC LETTER TAH WITH DOT BELOW" />
	<glyph unicode="088C" name="tahThreedotsbelow-ar" decompose="tah-ar, threedotsbelow-ar" category="Letter" script="arabic" production="uni088C" direction="RTL" description="ARABIC LETTER TAH WITH THREE DOTS BELOW" />
	<glyph unicode="088D" name="kehehTwodotsverticalbelow-ar" decompose="keheh-ar, twodotsverticalbelow-ar" category="Letter" script="arabic" production="uni088D" direction="RTL" description="ARABIC LETTER KEHEH WITH TWO DOTS VERTICALLY BELOW" />
	<glyph unicode="088E" name="verticaltail-ar" category="Letter" script="arabic" production="uni088E" direction="RTL" description="ARABIC VERTICAL TAIL" />
	<glyph unicode="0890" name="poundabove-ar" category="Other" subCategory="Format" script="arabic" production="uni0890" direction="LTR" description="ARABIC POUND MARK ABOVE" />
	<glyph unicode="0891" name="piastreabove-ar" category="Other" subCategory="Format" script="arabic" production="uni0891" direction="LTR" description="ARABIC PIASTRE MARK ABOVE" />
	<glyph unicode="0898" name="aljuzhigh-ar" category="Mark" subCategory="Nonspacing" script="arabic" production="uni0898" direction="RTL" description="ARABIC SMALL HIGH WORD AL-JUZ" />
	<glyph unicode="0899" name="ishmaamlow-ar" category="Mark" subCategory="Nonspacing" script="arabic" production="uni0899" direction="RTL" description="ARABIC SMALL LOW WORD ISHMAAM" />
	<glyph unicode="089A" name="imaalalow-ar" category="Mark" subCategory="Nonspacing" script="arabic" production="uni089A" direction="RTL" description="ARABIC SMALL LOW WORD IMAALA" />
	<glyph unicode="089B" name="tasheellow-ar" category="Mark" subCategory="Nonspacing" script="arabic" production="uni089B" direction="RTL" description="ARABIC SMALL LOW WORD TASHEEL" />
	<glyph unicode="10EFD" name="saktalow-ar" category="Mark" subCategory="Nonspacing" script="arabic" production="u10EFD" direction="RTL" description="ARABIC SMALL LOW WORD SAKTA" />
	<glyph unicode="10EFE" name="qasrlow-ar" category="Mark" subCategory="Nonspacing" script="arabic" production="u10EFE" direction="RTL" description="ARABIC SMALL LOW WORD QASR" />
	<glyph unicode="10EFF" name="maddalow-ar" category="Mark" subCategory="Nonspacing" script="arabic" production="u10EFF" direction="RTL" description="ARABIC SMALL LOW WORD MADDA" />
	<glyph unicode="089C" name="maddawaajib-ar" category="Mark" subCategory="Nonspacing" script="arabic" production="uni089C" direction="RTL" description="ARABIC MADDA WAAJIB" />
	<glyph unicode="089D" name="alefmokhassasabove-ar" category="Mark" subCategory="Nonspacing" script="arabic" production="uni089D" direction="RTL" description="ARABIC SUPERSCRIPT ALEF MOKHASSAS" />
	<glyph unicode="089E" name="maddadoubled-ar" category="Mark" subCategory="Nonspacing" script="arabic" production="uni089E" direction="RTL" description="ARABIC DOUBLED MADDA" />
	<glyph unicode="089F" name="halfmaddaovermadda-ar" category="Mark" subCategory="Nonspacing" script="arabic" production="uni089F" direction="RTL" description="ARABIC HALF MADDA OVER MADDA" />
	<glyph unicode="08B5" name="qafDotbelowNodotsabove-ar" decompose="qafDotless-ar, dotbelow-ar" category="Letter" script="arabic" production="uni08B5" direction="RTL" description="ARABIC LETTER QAF WITH DOT BELOW AND NO DOTS ABOVE" />
	<glyph unicode="08C8" name="graf-ar" category="Letter" script="arabic" production="uni08C8" direction="RTL" description="ARABIC LETTER GRAF" />
	<glyph unicode="08C9" name="yehSmallFarsi-ar" category="Letter" subCategory="Modifier" script="arabic" production="uni08C9" direction="RTL" description="ARABIC SMALL FARSI YEH" />
	<glyph unicode="08CA" name="yehSmallFarsiabove-ar" category="Mark" subCategory="Nonspacing" script="arabic" production="uni08CA" direction="RTL" description="ARABIC SMALL HIGH FARSI YEH" />
	<glyph unicode="08CB" name="yehbarreeTwodotsbelowabove-ar" category="Mark" subCategory="Nonspacing" script="arabic" production="uni08CB" direction="RTL" description="ARABIC SMALL HIGH YEH BARREE WITH TWO DOTS BELOW" />
	<glyph unicode="08CC" name="sahAbove-ar" category="Mark" subCategory="Nonspacing" script="arabic" production="uni08CC" direction="RTL" description="ARABIC SMALL HIGH WORD SAH" />
	<glyph unicode="08CD" name="zahAbove-ar" category="Mark" subCategory="Nonspacing" script="arabic" production="uni08CD" direction="RTL" description="ARABIC SMALL HIGH ZAH" />
	<glyph unicode="08CE" name="largerounddotabove-ar" category="Mark" subCategory="Nonspacing" script="arabic" production="uni08CE" direction="RTL" description="ARABIC LARGE ROUND DOT ABOVE" />
	<glyph unicode="08CF" name="largerounddotbelow-ar" category="Mark" subCategory="Nonspacing" script="arabic" production="uni08CF" direction="RTL" description="ARABIC LARGE ROUND DOT BELOW" />
	<glyph unicode="08D0" name="sukunbelow-ar" category="Mark" subCategory="Nonspacing" script="arabic" production="uni08D0" direction="RTL" description="ARABIC SUKUN BELOW" />
	<glyph unicode="08D1" name="largecirclebelow-ar" category="Mark" subCategory="Nonspacing" script="arabic" production="uni08D1" direction="RTL" description="ARABIC LARGE CIRCLE BELOW" />
	<glyph unicode="08D2" name="largerounddotinsidecirclebelow-ar" category="Mark" subCategory="Nonspacing" script="arabic" production="uni08D2" direction="RTL" description="ARABIC LARGE ROUND DOT INSIDE CIRCLE BELOW" />

Does this make sense?
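
A quick way to sanity-check a draft like this is to confirm that every name listed in a `decompose` attribute is also defined as a glyph. The following is only a minimal sketch: the `<glyphData>` wrapper, the sample entries, and the helper name `unresolved_components` are assumptions made for illustration, not part of the actual GlyphData.xml tooling.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample in the style of the draft entries, wrapped in a root
# element so the fragment parses on its own. The last entry deliberately
# references a component without its "-ar" suffix.
SAMPLE = """
<glyphData>
  <glyph unicode="0627" name="alef-ar" category="Letter" script="arabic" production="uni0627" />
  <glyph name="dotabove-ar" category="Mark" subCategory="Nonspacing" script="arabic" />
  <glyph unicode="0871" name="alefAttachedfathatopright-ar" category="Letter" script="arabic" production="uni0871" />
  <glyph unicode="087A" name="alefDotabove-ar" decompose="alef-ar, dotabove-ar" category="Letter" script="arabic" production="uni087A" />
  <glyph unicode="087B" name="alefAttachedfathatoprightDotabove-ar" decompose="alefAttachedfathatopright, dotabove-ar" category="Letter" script="arabic" production="uni087B" />
</glyphData>
"""


def unresolved_components(xml_text, known_names=()):
    """Return (glyph name, component) pairs whose component is not defined.

    known_names can list glyph names that are defined elsewhere, for
    example in the stock glyph data, and should count as resolved.
    """
    root = ET.fromstring(xml_text)
    names = set(known_names) | {g.get("name") for g in root.iter("glyph")}
    problems = []
    for glyph in root.iter("glyph"):
        decompose = glyph.get("decompose")
        if not decompose:
            continue
        # The decompose attribute is a comma-separated list of glyph names.
        for part in decompose.split(","):
            part = part.strip()
            if part and part not in names:
                problems.append((glyph.get("name"), part))
    return problems


if __name__ == "__main__":
    for glyph_name, component in unresolved_components(SAMPLE):
        print(f"{glyph_name}: unknown component {component!r}")
```

When checking only the new entries, pass the existing glyph names via `known_names` so that components such as `tatweel-ar` are not reported as missing.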

Looks good.
poundabove-ar and piastreabove-ar are currency symbols (but they behave like endofayah-ar).

I also forgot U+061D from the main Arabic block.

I can put them in Symbol/Currency. That shouldn’t influence the behavior in the font, just the ordering in Glyphs.

	<glyph unicode="061D" name="endoftext-ar" category="Punctuation" script="arabic" production="uni061D" description="ARABIC END OF TEXT MARK" />

Did you check the names? Some are quite a mouthful.

And can you think of a way to add decomposition to the “alefAttache…” glyphs? Just adding the components would probably move them to the wrong anchor/position.

I’ll share my GlyphData.xml file with you. It has my choice of glyph names and also changes some of the existing glyph names; feel free to reuse whatever you want from it. It includes some decompositions, too.

It just feels natural to have them with the other currency symbols.

Do we need to distinguish between alefFatha-ar and alefAttachedfatha-ar? Can there be an alef with a fatha that is not attached? And will users understand that they need to draw the letter with an attached fatha?

Unicode wouldn’t encode a new character like this, as it would be indistinguishable from a regular alef + fatha. So if a font has one, it would be a ligature and would be named alef_fatha-ar (or fatha_alef-ar).
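
To illustrate that naming pattern, here is a small sketch that builds a ligature name from its component glyph names: the base names are joined with underscores and the shared script suffix is kept once at the end. The helper `ligature_name` is hypothetical and assumes that all components carry the same suffix.

```python
def ligature_name(components):
    """Join component glyph names into a ligature-style name.

    Assumes every component uses the same script suffix (for example
    "-ar"); the suffix is stripped from each part and appended once.
    """
    bases = []
    suffixes = set()
    for name in components:
        # Split "alef-ar" into the base "alef" and the suffix "-ar".
        base, sep, suffix = name.partition("-")
        bases.append(base)
        suffixes.add(sep + suffix)
    if len(suffixes) != 1:
        raise ValueError("components must share exactly one script suffix")
    return "_".join(bases) + suffixes.pop()


if __name__ == "__main__":
    print(ligature_name(["alef-ar", "fatha-ar"]))  # alef_fatha-ar
    print(ligature_name(["fatha-ar", "alef-ar"]))  # fatha_alef-ar
```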

A glyph name is not enough to design a glyph and people shouldn’t be designing based on glyph names only (it happens and the result is almost always comically wrong), so no amount of verbose naming is going to fix that.