Unicode variation selector U+FE01

I want to implement the variation selector U+FE01, which selects a different glyph when it is typed after a particular Unicode character.

This is apparently part of the Unicode 14 standard, and is described here in the original proposal:

https://www.unicode.org/L2/L2020/20275r-math-calligraphic.pdf (PDF)

I’m not sure how to define/add the replacements - is it possible?

You should just add a “.uv002” suffix to the alternate glyph (the number is the index of the selector, counting from FE00 = 1, so FE01 = 2). But the code that builds the variation sequences is (wrongly) limiting it to Chinese characters. I can remove that limit, and it seems to work fine with, e.g., normal Latin letters.
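As a minimal sketch of the numbering rule described above (FE00 = 1, FE01 = 2, and so on), the suffix can be derived from the selector's code point. The function name `uv_suffix` is hypothetical, and the three-digit zero padding is only inferred from the “.uv002” example in this thread; this is not taken from the Glyphs source code.

```python
def uv_suffix(selector: int) -> str:
    """Return a glyph-name suffix such as '.uv002' for a selector in U+FE00..U+FE0F.

    Assumption: the suffix is '.uv' plus a 1-based, zero-padded index,
    as suggested by the '.uv002' example for U+FE01.
    """
    if not 0xFE00 <= selector <= 0xFE0F:
        raise ValueError(f"U+{selector:04X} is not in the range U+FE00..U+FE0F")
    index = selector - 0xFE00 + 1  # FE00 -> 1, FE01 -> 2, ...
    return f".uv{index:03d}"


if __name__ == "__main__":
    for vs in (0xFE00, 0xFE01):
        print(f"U+{vs:04X} -> {uv_suffix(vs)}")
```

Under these assumptions, U+FE00 maps to “.uv001” and U+FE01 to “.uv002”.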

Cool, thanks, Georg. Since the alternate glyphs already have a name/code, is it possible for me to define “.uv002” aliases to them?

Which glyphs would you like to switch?

I’m switching from Aboldscript-math (which is at u1D4D09C) to Aboldscript-math.alt. But I realise now that my question was silly - I can easily rename .alt to .uv002. :blush:

[Edited to correct the Unicodes.]

The Unicode Standard, section 23.4 Variation Selectors, allows only specific variation sequences that are defined in three documents. For characters other than emoji and ideographs, StandardizedVariants.txt in the Unicode Character Database is relevant. That file has no entries for 1D4D0, so no variation sequences are allowed for it.

The Standard says, “The use of variation selectors is not intended as a general extension mechanism for the character encoding.”

That was the reason I added the limitation to ideographs. But it seems to work (at least in my initial testing) with any glyph.

And I wonder why it is limited at all.

These are part of the Unicode 14.0 standard.

The ones you show are, and some more. But there’s no variation sequence for 1D4D0 in Unicode 14.

Unicode is meant to be a standard that works across platforms and versions. If I send you an email that contains the character U+1D4D0, no matter which operating systems you and I use, and whether you look at the email now or in ten years, you should see either 𝓐 or some indication that you don’t have a font for it. You should not see 𝓑.

Variation sequences are similar. When I send you U+1D4AC U+FE01, you should see 𝒬︁ in a roundhand style, not a chancery style, assuming you have fonts supporting both.

When the Unicode Standard has not defined a variation sequence yet, such as for U+1D4D0 U+FE02, it’s like an unassigned code point – available to be defined at some point in the future to mean whatever the Unicode Technical Committee finds appropriate. If you use it for something else, you’ll be out of luck.

The recommended solutions for showing alternate glyphs where the Unicode Standard doesn’t specify variation sequences are font features such as stylistic sets (ss01-ss20) or character variants (cv01-cv99), or possibly code points in the Unicode private use areas.

Ah yes, I searched for the wrong Unicode number, sorry.

Here’s the full list in Unicode 14:

# Mathematical alphabet script variants

1D49C FE00; chancery style; #  MATHEMATICAL SCRIPT CAPITAL A
212C FE00;  chancery style; #  SCRIPT CAPITAL B
1D49E FE00; chancery style; #  MATHEMATICAL SCRIPT CAPITAL C
1D49F FE00; chancery style; #  MATHEMATICAL SCRIPT CAPITAL D
2130 FE00;  chancery style; #  SCRIPT CAPITAL E
2131 FE00;  chancery style; #  SCRIPT CAPITAL F
1D4A2 FE00; chancery style; #  MATHEMATICAL SCRIPT CAPITAL G
210B FE00;  chancery style; #  SCRIPT CAPITAL H
2110 FE00;  chancery style; #  SCRIPT CAPITAL I
1D4A5 FE00; chancery style; #  MATHEMATICAL SCRIPT CAPITAL J
1D4A6 FE00; chancery style; #  MATHEMATICAL SCRIPT CAPITAL K
2112 FE00;  chancery style; #  SCRIPT CAPITAL L
2133 FE00;  chancery style; #  SCRIPT CAPITAL M
1D4A9 FE00; chancery style; #  MATHEMATICAL SCRIPT CAPITAL N
1D4AA FE00; chancery style; #  MATHEMATICAL SCRIPT CAPITAL O
1D4AB FE00; chancery style; #  MATHEMATICAL SCRIPT CAPITAL P
1D4AC FE00; chancery style; #  MATHEMATICAL SCRIPT CAPITAL Q
211B FE00;  chancery style; #  SCRIPT CAPITAL R
1D4AE FE00; chancery style; #  MATHEMATICAL SCRIPT CAPITAL S
1D4AF FE00; chancery style; #  MATHEMATICAL SCRIPT CAPITAL T
1D4B0 FE00; chancery style; #  MATHEMATICAL SCRIPT CAPITAL U
1D4B1 FE00; chancery style; #  MATHEMATICAL SCRIPT CAPITAL V
1D4B2 FE00; chancery style; #  MATHEMATICAL SCRIPT CAPITAL W
1D4B3 FE00; chancery style; #  MATHEMATICAL SCRIPT CAPITAL X
1D4B4 FE00; chancery style; #  MATHEMATICAL SCRIPT CAPITAL Y
1D4B5 FE00; chancery style; #  MATHEMATICAL SCRIPT CAPITAL Z

1D49C FE01; roundhand style; #  MATHEMATICAL SCRIPT CAPITAL A
212C FE01;  roundhand style; #  SCRIPT CAPITAL B
1D49E FE01; roundhand style; #  MATHEMATICAL SCRIPT CAPITAL C
1D49F FE01; roundhand style; #  MATHEMATICAL SCRIPT CAPITAL D
2130 FE01;  roundhand style; #  SCRIPT CAPITAL E
2131 FE01;  roundhand style; #  SCRIPT CAPITAL F
1D4A2 FE01; roundhand style; #  MATHEMATICAL SCRIPT CAPITAL G
210B FE01;  roundhand style; #  SCRIPT CAPITAL H
2110 FE01;  roundhand style; #  SCRIPT CAPITAL I
1D4A5 FE01; roundhand style; #  MATHEMATICAL SCRIPT CAPITAL J
1D4A6 FE01; roundhand style; #  MATHEMATICAL SCRIPT CAPITAL K
2112 FE01;  roundhand style; #  SCRIPT CAPITAL L
2133 FE01;  roundhand style; #  SCRIPT CAPITAL M
1D4A9 FE01; roundhand style; #  MATHEMATICAL SCRIPT CAPITAL N
1D4AA FE01; roundhand style; #  MATHEMATICAL SCRIPT CAPITAL O
1D4AB FE01; roundhand style; #  MATHEMATICAL SCRIPT CAPITAL P
1D4AC FE01; roundhand style; #  MATHEMATICAL SCRIPT CAPITAL Q
211B FE01;  roundhand style; #  SCRIPT CAPITAL R
1D4AE FE01; roundhand style; #  MATHEMATICAL SCRIPT CAPITAL S
1D4AF FE01; roundhand style; #  MATHEMATICAL SCRIPT CAPITAL T
1D4B0 FE01; roundhand style; #  MATHEMATICAL SCRIPT CAPITAL U
1D4B1 FE01; roundhand style; #  MATHEMATICAL SCRIPT CAPITAL V
1D4B2 FE01; roundhand style; #  MATHEMATICAL SCRIPT CAPITAL W
1D4B3 FE01; roundhand style; #  MATHEMATICAL SCRIPT CAPITAL X
1D4B4 FE01; roundhand style; #  MATHEMATICAL SCRIPT CAPITAL Y
1D4B5 FE01; roundhand style; #  MATHEMATICAL SCRIPT CAPITAL Z
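To check whether a given sequence is in a list like the one above, the lines can be parsed into a lookup table. This is a rough sketch: `SAMPLE` holds only a few lines copied from the list (not the whole StandardizedVariants.txt file), and the names `parse_sequences` and `SAMPLE` are made up for illustration.

```python
# A few lines in the same format as the list above (excerpt only).
SAMPLE = """\
# Mathematical alphabet script variants

1D49C FE00; chancery style; #  MATHEMATICAL SCRIPT CAPITAL A
212C FE00;  chancery style; #  SCRIPT CAPITAL B
1D49C FE01; roundhand style; #  MATHEMATICAL SCRIPT CAPITAL A
212C FE01;  roundhand style; #  SCRIPT CAPITAL B
"""


def parse_sequences(text: str) -> dict:
    """Map (base, selector) code point pairs to their description."""
    sequences = {}
    for line in text.splitlines():
        data = line.split("#", 1)[0].strip()  # drop the trailing comment
        if not data:
            continue  # blank line or comment-only line
        fields = [field.strip() for field in data.split(";")]
        code_points = tuple(int(cp, 16) for cp in fields[0].split())
        if len(code_points) != 2 or len(fields) < 2:
            continue  # not a "base selector; description" line
        sequences[code_points] = fields[1]
    return sequences


if __name__ == "__main__":
    table = parse_sequences(SAMPLE)
    for base, selector in ((0x1D49C, 0xFE01), (0x1D4D0, 0xFE01)):
        style = table.get((base, selector), "not defined in this excerpt")
        print(f"U+{base:04X} U+{selector:04X}: {style}")
```

With this excerpt, U+1D49C U+FE01 is found as roundhand style, while U+1D4D0 U+FE01 is not found, which matches the point made earlier in the thread.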

Hi Georg - sorry to revive this old topic…

You said:

But the code that is building the variation sequences is (wrongly) limiting it to Chinese chars. I can remove that limit and it seems to work fine with e.g.: normal latin letters.

Did this get into a Glyphs release?

I later learned that they are not supposed to be used outside of Chinese fonts. I don’t fully remember the details.

That link above (https://www.unicode.org/L2/L2020/20275r-math-calligraphic.pdf) suggested that it might be implemented in Unicode 14. This document:

shows them highlighted in blue, so I think they were implemented in the 14.0 release.

It’s not a major problem if they’re not implemented in Glyphs; saves some work… :grinning:

Are you talking about variation selectors generally? FE00 is specified for many Myanmar characters (though I personally prefer to use the locl feature instead of variation sequences).

Screenshot 2022-08-30 at 11.51.16

So these must be the same ones, but used for a different range of glyphs?

They are used for scripts other than Han, both in official Unicode sequences and in many unofficial PUA sets.

Variation sequences are currently allowed for some mathematical variants, Myanmar, Phags-pa, Manichaean, Mongolian, CJK, and emoji. Ideographic variation sequences have moved to using VS17 (U+E0100) and above. There are still some ideographic VS using VS1, VS2, and VS3 for compatibility. More info in the Variation Sequences FAQ.
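For reference, the VS numbering mentioned above spans two blocks: VS1 to VS16 sit at U+FE00..U+FE0F, and VS17 to VS256 at U+E0100..U+E01EF. A small sketch of that mapping, with the hypothetical helper name `vs_number`:

```python
from typing import Optional


def vs_number(code_point: int) -> Optional[int]:
    """Return N for variation selector VS-N, or None if not a variation selector."""
    if 0xFE00 <= code_point <= 0xFE0F:
        return code_point - 0xFE00 + 1  # VS1..VS16
    if 0xE0100 <= code_point <= 0xE01EF:
        return code_point - 0xE0100 + 17  # VS17..VS256
    return None


if __name__ == "__main__":
    for cp in (0xFE00, 0xFE02, 0xE0100, 0xE01EF, 0x0041):
        print(f"U+{cp:04X} -> {vs_number(cp)}")
```

This only covers the two blocks named here; other mechanisms, such as the Mongolian free variation selectors, are deliberately left out.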

It does look like that change to remove limiting variation sequences to Chinese characters was implemented back in February, as of build 3113.
