Unicode variation sequences


#1

I have a question about variation sequence support in Glyphs 2.6.1.

Glyph MJ030274, which is encoded as a CJK compatibility ideograph at U+FA4C, can also be represented by variation sequences <U+793E,U+FE00> (SVS) and <U+793E,U+E0103> (IVS).

Is there a way to support both variation sequences without duplicating the glyph?


#2

AFAIK that is not supposed to be handled in the font; it is a matter for the layout engine.


#3

Unicode Variation Sequence support implies that a Format 14 subtable is being created. See the Unicode VS FAQ (which was linked from the release notes) or Ken Lunde’s OpenType ‘cmap’ Table Ramblings.


#4

I’ll have a look.


#5

This capability is important for CJK (all regions) and emoji support.

FontCreator and FontForge can do it.

Thank you.


#6

Are there more glyphs like this?


#7

First off, I looked at the Glyphs export output and I see that the cmap format 14 subtable is being created. Having this capability at all is a great step forward for me. Thank you so much for your work!

To answer your question, this issue technically affects all characters with defined Unicode variation sequences. “Default” and “non-default” UVSes are stored in separate tables within the format 14 subtable, and it seems that Glyphs only handles non-default UVSes.
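
As a rough illustration of the difference, here is how a renderer might resolve a variation sequence. This is a simplified sketch, not how any particular engine is implemented: the dictionaries and the `resolve` function are made up for the example, and a real format 14 subtable stores default UVSes as code point ranges rather than sets.

```python
# Hypothetical, simplified model of cmap lookups; all names and data are illustrative.
BASE_CMAP = {0x793E: "uni793E", 0xFA4C: "uniFA4C"}

# Default UVSes: the sequence is rendered with the base character's own glyph,
# so only the base code point is recorded per selector.
DEFAULT_UVS = {0xE0102: {0x793E}}

# Non-default UVSes: the sequence maps to an explicitly named glyph.
# Two sequences may point at the same glyph, so no duplicate is needed.
NON_DEFAULT_UVS = {
    0xFE00: {0x793E: "uniFA4C"},
    0xE0103: {0x793E: "uniFA4C"},
}


def resolve(base, selector):
    """Return the glyph name for <base, selector>, or None if unsupported."""
    if base in DEFAULT_UVS.get(selector, set()):
        # Default UVS: fall back to the ordinary cmap entry for the base.
        return BASE_CMAP.get(base)
    # Non-default UVS: use the explicit glyph, if one is listed.
    return NON_DEFAULT_UVS.get(selector, {}).get(base)


print(resolve(0x793E, 0xE0102))
print(resolve(0x793E, 0xFE00))
print(resolve(0x793E, 0xE0103))
```

In this model both <U+793E,U+FE00> and <U+793E,U+E0103> resolve to the already encoded `uniFA4C`, which is the behavior I am after.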

The case I mentioned above represents one of the ~91 Japan-source CJK compatibility ideographs. These cases are the most problematic examples, because they have three Unicode representations for the same glyph:

  • the code point for the compatibility ideograph
  • the code point for the corresponding unified ideograph plus a standardized variation selector (the standardized variation sequence, which keeps the compatibility character distinct after normalization)
  • the code point for the corresponding unified ideograph plus an ideographic variation selector (for the ideographic variation sequence, as defined by the Adobe Japan or Moji Joho collection)
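
The three representations can be compared with a few lines of Python, using only the standard `unicodedata` module. This is a minimal sketch; the variable names and the `nfc` helper are mine. It shows why the variation sequences matter: normalization folds the compatibility ideograph into the unified ideograph, while the sequences pass through unchanged.

```python
import unicodedata

compat = "\uFA4C"            # CJK compatibility ideograph
svs = "\u793E\uFE00"         # unified ideograph + standardized variation selector
ivs = "\u793E\U000E0103"     # unified ideograph + ideographic variation selector


def nfc(text):
    """Normalize text to NFC."""
    return unicodedata.normalize("NFC", text)


# U+FA4C has a canonical decomposition to U+793E, so NFC replaces it.
print(nfc(compat) == "\u793E")

# Variation selectors are not altered by normalization, so both sequences survive.
print(nfc(svs) == svs)
print(nfc(ivs) == ivs)
```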

Here’s some TTX cmap data for comparison. (For clarity, I deleted the format 4 Microsoft-platform subtable data.) Glyphs maps the “default” UVS (the one corresponding to the base character’s default glyph) only when I duplicate the glyph, so I end up with one duplicate of uni793E and two duplicates of uniFA4C.

Glyphs 2.6.2 (1242) output

<cmap>
  <tableVersion version="0"/>
  <cmap_format_4 platformID="0" platEncID="3" language="0">
    <map code="0x20" name="space"/><!-- SPACE -->
    <map code="0x793e" name="uni793E"/><!-- CJK UNIFIED IDEOGRAPH-793E -->
    <map code="0xfa4c" name="uniFA4C"/><!-- CJK COMPATIBILITY IDEOGRAPH-FA4C -->
  </cmap_format_4>
  <cmap_format_14 platformID="0" platEncID="5">
    <map uv="0x793e" uvs="0xfe00" name="uni793E.uv001"/>
    <map uv="0x793e" uvs="0xe0102" name="uni793E.uv019"/>
    <map uv="0x793e" uvs="0xe0103" name="uni793E.uv020"/>
    <map uv="0x793e" uvs="0xe0104" name="uni793E.uv021"/>
  </cmap_format_14>
</cmap>

target ideal output

<cmap>
  <tableVersion version="0"/>
  <cmap_format_4 platformID="0" platEncID="3" language="0">
    <map code="0x20" name="space"/><!-- SPACE -->
    <map code="0x793e" name="uni793E"/><!-- CJK UNIFIED IDEOGRAPH-793E -->
    <map code="0xfa4c" name="uniFA4C"/><!-- CJK COMPATIBILITY IDEOGRAPH-FA4C -->
  </cmap_format_4>
  <cmap_format_14 platformID="0" platEncID="5">
    <map uv="0x793e" uvs="0xe0102"/>
    <map uv="0x793e" uvs="0xfe00" name="uniFA4C"/>
    <map uv="0x793e" uvs="0xe0103" name="uniFA4C"/>
    <map uv="0x793e" uvs="0xe0104" name="uni793E.uv021"/>
  </cmap_format_14>
</cmap>
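
To check a dump like the ones above without reading it by eye, something along these lines could work. It is only a sketch under a few assumptions: it uses the standard library XML parser on a trimmed copy of the target output, it treats a format 14 entry without a `name` attribute as a default UVS, and the function name `classify_uvs` is my own.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the target cmap dump (comments and the space glyph removed).
TTX = """
<cmap>
  <cmap_format_4 platformID="0" platEncID="3" language="0">
    <map code="0x793e" name="uni793E"/>
    <map code="0xfa4c" name="uniFA4C"/>
  </cmap_format_4>
  <cmap_format_14 platformID="0" platEncID="5">
    <map uv="0x793e" uvs="0xe0102"/>
    <map uv="0x793e" uvs="0xfe00" name="uniFA4C"/>
    <map uv="0x793e" uvs="0xe0103" name="uniFA4C"/>
    <map uv="0x793e" uvs="0xe0104" name="uni793E.uv021"/>
  </cmap_format_14>
</cmap>
"""


def classify_uvs(ttx_xml):
    """Split format 14 entries into default, shared-glyph and separate-glyph lists."""
    root = ET.fromstring(ttx_xml.strip())
    # Glyph names that already have a code point in the format 4 subtable.
    encoded = {m.get("name") for m in root.iterfind("cmap_format_4/map")}
    default, shared, separate = [], [], []
    for m in root.iterfind("cmap_format_14/map"):
        seq = (int(m.get("uv"), 16), int(m.get("uvs"), 16))
        name = m.get("name")
        if name is None:
            default.append(seq)             # default UVS: uses the base glyph
        elif name in encoded:
            shared.append((seq, name))      # reuses an encoded glyph, no duplicate
        else:
            separate.append((seq, name))    # needs its own unencoded glyph
    return default, shared, separate


default, shared, separate = classify_uvs(TTX)
for (uv, uvs), name in shared:
    print(f"<U+{uv:04X},U+{uvs:04X}> -> {name}")
```

Run against the Glyphs 2.6.2 output instead, the same function would put nothing in the default or shared lists, since every entry there points at its own `.uvNNN` glyph.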