General Punctuation | |
---|---|
Range | U+2000..U+206F (112 code points) |
Plane | BMP |
Scripts | Common (109 char.), Inherited (2 char.) |
Symbol sets | Punctuation, Spaces, Format controls |
Assigned | 111 code points |
Unused | 1 reserved code point; 6 deprecated |
Unicode version history | |
1.0.0 (1991) | 67 (+67) |
1.1 (1993) | 76 (+9) |
3.0 (1999) | 83 (+7) |
3.2 (2002) | 95 (+12) |
4.0 (2003) | 97 (+2) |
4.1 (2005) | 106 (+9) |
5.1 (2008) | 107 (+1) |
6.3 (2013) | 111 (+4) |
Unicode documentation | |
Code chart ∣ Web page | |
Note: [1] [2] |
General Punctuation is a Unicode block containing punctuation, spacing, and formatting characters for use with all scripts and writing systems. Included are the defined-width spaces, joining formats, directional formats, smart quotes, archaic and novel punctuation such as the interrobang, and invisible mathematical operators.
Additional punctuation characters are in the Supplemental Punctuation block and scattered across dozens of other Unicode blocks.
General Punctuation [1] [2] [3] Official Unicode Consortium code chart (PDF) | ||||||||||||||||
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F | |
U+200x | NQ SP | MQ SP | EN SP | EM SP | 3/M SP | 4/M SP | 6/M SP | F SP | P SP | TH SP | H SP | ZW SP | ZW NJ | ZW J | LRM | RLM |
U+201x | ‐ | NB ‑ | ‒ | – | — | ― | ‖ | ‗ | ‘ | ’ | ‚ | ‛ | “ | ” | „ | ‟ |
U+202x | † | ‡ | • | ‣ | ․ | ‥ | … | ‧ | L SEP | P SEP | LRE | RLE | LRO | RLO | NNB SP | |
U+203x | ‰ | ‱ | ′ | ″ | ‴ | ‵ | ‶ | ‷ | ‸ | ‹ | › | ※ | ‼ | ‽ | ‾ | ‿ |
U+204x | ⁀ | ⁁ | ⁂ | ⁃ | ⁄ | ⁅ | ⁆ | ⁇ | ⁈ | ⁉ | ⁊ | ⁋ | ⁌ | ⁍ | ⁎ | ⁏ |
U+205x | ⁐ | ⁑ | ⁒ | ⁓ | ⁔ | ⁕ | ⁖ | ⁗ | ⁘ | ⁙ | ⁚ | ⁛ | ⁜ | ⁝ | ⁞ | MM SP |
U+206x | WJ | ƒ() | × | , | + | LRI | RLI | FSI | PDI | I SS | A SS | I AFS | A AFS | NA DS | NO DS | |
Several characters in this block are normally rendered without a directly visible glyph.
Eleven whitespace characters differ by horizontal width: the ten from U+2002 through U+200B (the fixed-width en space or 1⁄2 em, em space, 1⁄3 em, 1⁄4 em and 1⁄6 em spaces, figure space and punctuation space; the variable-width thin space or 1⁄5 em and hair space; and the fixed zero-width space) plus U+205F (medium mathematical space, 2⁄9 em). U+2000 and U+2001 (en quad and em quad) are effectively aliases of U+2002 and U+2003, respectively. Two further characters prohibit line breaks: U+202F (narrow no-break space) is a non-breaking variant of U+2009 or U+2004, and U+2060 (the ill-termed word joiner) is a non-breaking variant of U+200B.
The three zero-width characters U+200B through U+200D (zero-width space, non-joiner and joiner) differ in how they affect ligation and shaping of adjacent letters, such as contextual forms in Arabic.
Eleven invisible characters control the directionality of text unless higher-level markup overrides them: U+200E and U+200F (left-to-right and right-to-left marks), U+202A through U+202E (embeddings, pop and overrides), and U+2066 through U+2069 (isolates). U+2028 and U+2029 are explicit line and paragraph separators.
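The groupings described above can be inspected programmatically. The following is a minimal sketch, assuming Python's standard `unicodedata` module and whatever Unicode version the interpreter bundles (3.4 or later is assumed, so that the isolates added in Unicode 6.3 are known). The helper names `group_by_category` and `bidi_controls` are illustrative and not part of any standard API.

```python
import unicodedata

BLOCK_START, BLOCK_END = 0x2000, 0x206F  # General Punctuation range

def group_by_category(start=BLOCK_START, end=BLOCK_END):
    """Map each general category (e.g. 'Zs', 'Cf') to the code points that carry it."""
    groups = {}
    for cp in range(start, end + 1):
        groups.setdefault(unicodedata.category(chr(cp)), []).append(cp)
    return groups

def bidi_controls():
    """Return {code point: bidi class} for the eleven directional characters."""
    cps = [0x200E, 0x200F, *range(0x202A, 0x202F), *range(0x2066, 0x206A)]
    return {cp: unicodedata.bidirectional(chr(cp)) for cp in cps}

groups = group_by_category()
# 'Cn' marks unassigned (reserved) code points; everything else is assigned.
assigned = sum(len(cps) for cat, cps in groups.items() if cat != "Cn")
print(f"assigned code points: {assigned}")
for cat in sorted(groups):
    print(cat, " ".join(f"U+{cp:04X}" for cp in groups[cat]))
for cp, cls in bidi_controls().items():
    print(f"U+{cp:04X} {cls}")
```

In this sketch the space separators appear under `Zs`, the zero-width and directional controls under `Cf`, and the line and paragraph separators under `Zl` and `Zp`; the exact grouping reflects the Unicode data shipped with the interpreter in use.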
The General Punctuation block contains two emoji: U+203C and U+2049. [3] [4]
Four standardized variation sequences are defined for the block, selecting emoji-style (U+FE0F, VS16) or text presentation (U+FE0E, VS15) for each of the two emoji; both default to text presentation. [5]
U+ | 203C | 2049 |
base code point | ‼ | ⁉ |
base+VS15 (text) | ‼︎ | ⁉︎ |
base+VS16 (emoji) | ‼️ | ⁉️ |
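The sequences in the table can be built by appending a variation selector to the base character. A minimal sketch in Python follows; the function name `with_presentation` is hypothetical, and whether the result actually renders as text or emoji depends on the font and platform.

```python
VS15 = "\uFE0E"  # variation selector-15: request text presentation
VS16 = "\uFE0F"  # variation selector-16: request emoji presentation
EMOJI_BASES = ("\u203C", "\u2049")  # double exclamation mark, exclamation question mark

def with_presentation(base, emoji):
    """Append VS16 (emoji style) or VS15 (text style) to a base character."""
    return base + (VS16 if emoji else VS15)

# The four sequences: each base in text style, then in emoji style.
sequences = [with_presentation(b, e) for b in EMOJI_BASES for e in (False, True)]
for seq in sequences:
    print(" ".join(f"U+{ord(ch):04X}" for ch in seq), seq)
```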
The following Unicode-related documents record the purpose and process of defining specific characters in the General Punctuation block:
Version | Final code points [a] | Count | UTC ID | L2 ID | WG2 ID | Document |
---|---|---|---|---|---|---|
1.0.0 | U+2000..202E, 2030..203E, 2040..2044 | 67 | (to be determined) | |||
L2/11-438 [b] [c] | N4182 | Edberg, Peter (2011-12-22), Emoji Variation Sequences (Revision of L2/11-429) | ||||
L2/17-086 | Burge, Jeremy; et al. (2017-03-27), Add ZWJ, VS-16, Keycaps & Tags to Emoji_Component | |||||
L2/17-103 | Moore, Lisa (2017-05-18), "E.1.7 Add ZWJ, VS-16, Keycaps & Tags to Emoji_Component", UTC #151 Minutes | |||||
1.1 | U+203F, 2045..2046 | 3 | (to be determined) | |||
U+206A..206F | 6 | (to be determined) | ||||
UTC/1992-xxx | Freytag, Asmus (1992-05-12), "C. Bidi", Unconfirmed minutes for UTC Meeting #52, May 8, 1992 at Xerox | |||||
L2/01-275 | Davis, Mark (2001-07-16), New Properties (ReservedForCf, Deprecated, Discouraged) | |||||
L2/01-301 | Whistler, Ken (2001-08-01), "Alternate format controls inherited from 10646", Analysis of Character Deprecation in the Unicode Standard | |||||
L2/01-326 | Davis, Mark (2001-08-15), New Properties: Reserved_Cf_Code_Point & Deprecated | |||||
L2/01-295R | Moore, Lisa (2001-11-06), "Motion 88-M13", Minutes from the UTC/L2 meeting #88 | |||||
3.0 | U+202F, 2048..2049 | 3 | L2/97-288 | N1603 | Umamaheswaran, V. S. (1997-10-24), "8.18", Unconfirmed Meeting Minutes, WG 2 Meeting # 33, Heraklion, Crete, Greece, 20 June – 4 July 1997 | |
L2/98-088 | N1711 | The Working Meeting on Mongolian Encoding Attended by Representatives of China and Mongolia, 1998-02-15 | ||||
L2/98-104 | N1734 | Whistler, Ken (1998-03-20), Comments on the Mongolian Encoding Proposal, WG2 N1711 | ||||
L2/98-252 (pdf, txt) | N1833RM (pdf, doc) | Moore, Richard (1998-05-04), Feedback on Ken Whistler's Comments on Mongolian Encoding: N 1734 | ||||
L2/98-251 (pdf, html, txt) | N1808 (pdf, doc) | Reply to "Proposal WG2 N1734" Raised at the Seattle Meeting Regarding "Proposal WG 2 N1711", 1998-07-09 | ||||
L2/98-281R (pdf, html) | Aliprand, Joan (1998-07-31), "Mongolian (IV.A)", Unconfirmed Minutes – UTC #77 & NCITS Subgroup L2 # 174 JOINT MEETING, Redmond, WA -- July 29-31, 1998 | |||||
N1862 | Revision of N1711 - Mongolian, 1998-09-17 | |||||
N1865 | US Position - Mongolian (N1711, N1734 and N1808), 1998-09-18 | |||||
N1918 | Paterson, Bruce (1998-10-28), Text for Combined PDAM registration and consideration ballot - SC2 N 3208 | |||||
L2/99-010 | N1903 (pdf, html, doc) | Umamaheswaran, V. S. (1998-12-30), "8.1.3", Minutes of WG 2 meeting 35, London, U.K.; 1998-09-21--25 | ||||
L2/99-075.1 | N1973 | Irish Comments on SC 2 N 3208, 1999-01-19 | ||||
L2/99-075 | N1972 (pdf, html, doc) | Summary of Voting on SC 2 N 3208, PDAM ballot on WD for ISO/IEC 10646-1/Amd. 29: Mongolian, 1999-02-12 | ||||
N2020 | Paterson, Bruce (1999-04-05), FPDAM 29 Text - Mongolian | |||||
L2/99-113 | Text for FPDAM ballot of ISO/IEC 10646, Amd. 29 - Mongolian, 1999-04-06 | |||||
L2/99-232 | N2003 | Umamaheswaran, V. S. (1999-08-03), "6.1.3 PDAM29 – Mongolian script", Minutes of WG 2 meeting 36, Fukuoka, Japan, 1999-03-09--15 | ||||
L2/99-304 | N2126 | Paterson, Bruce (1999-10-01), Revised Text for FDAM ballot of ISO/IEC 10646-1/FDAM 29, AMENDMENT 29: Mongolian | ||||
L2/99-381 | Final text for ISO/IEC 10646-1, FDAM 29 -- Mongolian, 1999-12-07 | |||||
L2/00-010 | N2103 | Umamaheswaran, V. S. (2000-01-05), "6.4.4", Minutes of WG 2 meeting 37, Copenhagen, Denmark: 1999-09-13—16 | ||||
L2/07-209 | Whistler, Ken (2007-07-05), UTR 14 and U+202F NARROW NO-BREAK SPACE | |||||
L2/11-438 [b] [c] | N4182 | Edberg, Peter (2011-12-22), Emoji Variation Sequences (Revision of L2/11-429) | ||||
L2/15-187 | Moore, Lisa (2015-08-11), "B.14.5", UTC #144 Minutes | |||||
L2/16-258 | N4752R2 | Eck, Greg (2016-09-19), Mongolian Base Forms, Positional Forms, & Variant Forms | ||||
L2/16-259 | N4753 | Eck, Greg; Rileke, Orlog Ou (2016-09-20), WG2 #65 Mongolian Discussion Points | ||||
L2/16-266 | N4763 | Anderson, Deborah; Whistler, Ken; McGowan, Rick; Pournader, Roozbeh; Glass, Andrew; Iancu, Laurențiu; Moore, Lisa (2016-09-26), "1. Mongolian", Comments on Mongolian, Small Khitan, and other WG2 #65 documents | ||||
L2/16-297 | N4769 | Anderson, Deborah (2016-10-27), Mongolian ad hoc report | ||||
U+204A | 1 | L2/98-214 | N1747 | Everson, Michael (1998-05-25), Contraction characters for the UCS | ||
L2/98-281R (pdf, html) | Aliprand, Joan (1998-07-31), "Characters from ISO 5426-2 (IV.C.5-6)", Unconfirmed Minutes – UTC #77 & NCITS Subgroup L2 # 174 JOINT MEETING, Redmond, WA -- July 29-31, 1998 | |||||
L2/98-292R (pdf, html, Figure 1) | "2.6", Comments on proposals to add characters from ISO standards developed by ISO/TC 46/SC 4, 1998-08-19 | |||||
L2/98-292 | N1840 | "2.6", Comments on proposals to add characters from ISO standards developed by ISO/TC 46/SC 4, 1998-08-25 | ||||
L2/98-301 | N1847 | Everson, Michael (1998-09-12), Responses to NCITS/L2 and Unicode Consortium comments on numerous proposals | ||||
L2/98-372 | N1884R2 (pdf, doc) | Whistler, Ken; et al. (1998-09-22), Additional Characters for the UCS | ||||
L2/98-329 | N1920 | Combined PDAM registration and consideration ballot on WD for ISO/IEC 10646-1/Amd. 30, AMENDMENT 30: Additional Latin and other characters, 1998-10-28 | ||||
L2/99-010 | N1903 (pdf, html, doc) | Umamaheswaran, V. S. (1998-12-30), "8.1.5.1", Minutes of WG 2 meeting 35, London, U.K.; 1998-09-21--25 | ||||
U+204B..204D | 3 | L2/98-215 | N1748 | Everson, Michael (1998-05-25), Additional signature mark characters for the UCS | ||
L2/98-281R (pdf, html) | Aliprand, Joan (1998-07-31), "Signature Marks (IV.C.7)", Unconfirmed Minutes – UTC #77 & NCITS Subgroup L2 # 174 JOINT MEETING, Redmond, WA -- July 29-31, 1998 | |||||
L2/98-292R (pdf, html, Figure 1) | "2.7", Comments on proposals to add characters from ISO standards developed by ISO/TC 46/SC 4, 1998-08-19 | |||||
L2/98-292 | N1840 | "2.7", Comments on proposals to add characters from ISO standards developed by ISO/TC 46/SC 4, 1998-08-25 | ||||
L2/98-301 | N1847 | Everson, Michael (1998-09-12), Responses to NCITS/L2 and Unicode Consortium comments on numerous proposals | ||||
L2/98-372 | N1884R2 (pdf, doc) | Whistler, Ken; et al. (1998-09-22), Additional Characters for the UCS | ||||
L2/98-329 | N1920 | Combined PDAM registration and consideration ballot on WD for ISO/IEC 10646-1/Amd. 30, AMENDMENT 30: Additional Latin and other characters, 1998-10-28 | ||||
L2/99-010 | N1903 (pdf, html, doc) | Umamaheswaran, V. S. (1998-12-30), "8.1.5.1", Minutes of WG 2 meeting 35, London, U.K.; 1998-09-21--25 | ||||
3.2 | U+2047, 2051 | 2 | L2/99-238 | Consolidated document containing 6 Japanese proposals, 1999-07-15 | ||
N2092 | Addition of forty eight characters, 1999-09-13 | |||||
L2/99-365 | Moore, Lisa (1999-11-23), Comments on JCS Proposals | |||||
L2/00-024 | Shibano, Kohji (2000-01-31), JCS proposal revised | |||||
L2/99-260R | Moore, Lisa (2000-02-07), "JCS Proposals", Minutes of the UTC/L2 meeting in Mission Viejo, October 26-28, 1999 | |||||
L2/00-098, L2/00-098-page5 | N2195 | Rationale for non-Kanji characters proposed by JCS committee, 2000-03-15 | ||||
L2/00-119 [d] | N2191R | Whistler, Ken; Freytag, Asmus (2000-04-19), Encoding Additional Mathematical Symbols in Unicode | ||||
L2/00-234 | N2203 (rtf, txt) | Umamaheswaran, V. S. (2000-07-21), "8.18, 8.20", Minutes from the SC2/WG2 meeting in Beijing, 2000-03-21 -- 24 | ||||
L2/00-115R2 | Moore, Lisa (2000-08-08), "Motion 83-M11", Minutes Of UTC Meeting #83 | |||||
L2/00-297 | N2257 | Sato, T. K. (2000-09-04), JIS X 0213 symbols part-1 | ||||
L2/00-342 | N2278 | Sato, T. K.; Everson, Michael; Whistler, Ken; Freytag, Asmus (2000-09-20), Ad hoc Report on Japan feedback N2257 and N2258 | ||||
L2/01-050 | N2253 | Umamaheswaran, V. S. (2001-01-21), "7.16 JIS X0213 Symbols", Minutes of the SC2/WG2 meeting in Athens, September 2000 | ||||
U+204E..2050, 2057, 205F, 2061..2062 | 7 | L2/00-005R2 | Moore, Lisa (2000-02-14), "Motion 82-M11", Minutes of UTC #82 in San Jose | |||
L2/00-119 [d] | N2191R | Whistler, Ken; Freytag, Asmus (2000-04-19), Encoding Additional Mathematical Symbols in Unicode | ||||
L2/00-234 | N2203 (rtf, txt) | Umamaheswaran, V. S. (2000-07-21), "8.18", Minutes from the SC2/WG2 meeting in Beijing, 2000-03-21 -- 24 | ||||
L2/00-115R2 | Moore, Lisa (2000-08-08), "Motion 83-M11", Minutes Of UTC Meeting #83 | |||||
U+2052, 2063 | 2 | L2/01-142 [d] | N2336 | Beeton, Barbara; Freytag, Asmus; Ion, Patrick (2001-04-02), Additional Mathematical Symbols | ||
L2/01-156 | N2356 | Freytag, Asmus (2001-04-03), Additional Mathematical Characters (Draft 10) | ||||
L2/01-344 | N2353 (pdf, doc) | Umamaheswaran, V. S. (2001-09-09), "7.7 Mathematical Symbols", Minutes from SC2/WG2 meeting #40 -- Mountain View, April 2001 | ||||
U+2060 | 1 | L2/99-260R | Moore, Lisa (2000-02-07), "Unicode in Markup Languages", Minutes of the UTC/L2 meeting in Mission Viejo, October 26-28, 1999 | |||
L2/00-005R2 | Moore, Lisa (2000-02-14), "Zero Width Grapheme Break/Join", Minutes of UTC #82 in San Jose, Action Item for Arnold Winkler: As the zero width grapheme break/join proposal was withdrawn, re-open Action Item 81-12 (for Mark Davis to prepare a proposal for WG2 for the Zero Width Word Joiner.) | |||||
L2/00-258 | N2235 | Davis, Mark (2000-08-09), Proposal for addition of ZERO WIDTH WORD JOINER | ||||
L2/00-369 | Whistler, Ken (2000-10-06), "e. (ZERO WIDTH) WORD JOINER", WG2 in Vouliagmeni (Athens) | |||||
L2/01-050 | N2253 | Umamaheswaran, V. S. (2001-01-21), "7.7 Proposal for addition of ZERO WIDTH WORDJOINER", Minutes of the SC2/WG2 meeting in Athens, September 2000 | ||||
4.0 | U+2053..2054 | 2 | L2/02-141 | N2419 | Everson, Michael; et al. (2002-03-20), Uralic Phonetic Alphabet characters for the UCS | |
L2/02-192 | Everson, Michael (2002-05-02), Everson's Reply on UPA | |||||
N2442 | Everson, Michael; Kolehmainen, Erkki I.; Ruppel, Klaas; Trosterud, Trond (2002-05-21), Justification for placing the Uralic Phonetic Alphabet in the BMP | |||||
L2/02-291 | Whistler, Ken (2002-05-31), WG2 report from Dublin | |||||
L2/02-292 | Whistler, Ken (2002-06-03), Early look at WG2 consent docket | |||||
L2/02-166R2 | Moore, Lisa (2002-08-09), "Scripts and New Characters - UPA", UTC #91 Minutes | |||||
L2/02-253 | Moore, Lisa (2002-10-21), "Consensus 92-C2", UTC #92 Minutes | |||||
4.1 | U+2055 | 1 | L2/03-151R | Constable, Peter; Lloyd-Williams, James; Lloyd-Williams, Sue; Chowdhury, Shamsul Islam; Ali, Asaddar; Sadique, Mohammed; Chowdhury, Matiar Rahman (2003-05-10), Revised Proposal for Encoding Syloti Nagri Script in the BMP | ||
L2/03-136 | Moore, Lisa (2003-08-18), "Scripts and New Characters - Syloti Nagri Script", UTC #95 Minutes | |||||
U+2056, 2058..2059 | 3 | L2/03-282R | N2610R | Everson, Michael; Cleminson, Ralph (2003-09-04), Final proposal for encoding the Glagolitic script in the UCS | ||
L2/03-324 | N2642 | Pantelia, Maria (2003-10-06), Proposal to encode additional Greek editorial and punctuation characters in the UCS | ||||
U+205A..205C | 3 | L2/03-157 | Pantelia, Maria (2003-05-19), Additional Beta Code Characters not in Unicode (WIP) | |||
L2/03-193R | N2612-7 | Pantelia, Maria (2003-06-11), Proposal to encode additional Punctuation Characters in the UCS | ||||
U+205D | 1 | L2/02-312R | Pantelia, Maria (2002-11-07), Proposal to encode additional Greek editorial and punctuation characters in the UCS | |||
L2/03-324 | N2642 | Pantelia, Maria (2003-10-06), Proposal to encode additional Greek editorial and punctuation characters in the UCS | ||||
U+205E | 1 | L2/03-354 | N2655 | Freytag, Asmus (2003-10-10), Proposal -- Symbols used in Dictionaries | ||
L2/03-356R2 | Moore, Lisa (2003-10-22), "Consensus 97-C15", UTC #97 Minutes | |||||
5.1 | U+2064 | 1 | L2/07-011R | N3198R | Freytag, Asmus; Beeton, Barbara; Ion, Patrick; Sargent, Murray; Carlisle, David; Pournader, Roozbeh (2007-01-15), 29 Additional Mathematical and Symbol Characters | |
L2/07-015 | Moore, Lisa (2007-02-08), "Mathematical Characters and Symbols (C.4)", UTC #110 Minutes | |||||
L2/07-268 | N3253 (pdf, doc) | Umamaheswaran, V. S. (2007-07-26), "M50.16", Unconfirmed minutes of WG 2 meeting 50, Frankfurt-am-Main, Germany; 2007-04-24/27 | ||||
6.3 | U+2066..2069 | 4 | L2/12-186R | Lanin, Aharon; Davis, Mark; Pournader, Roozbeh (2012-07-24), A Proposal for Bidi Isolates in Unicode | ||
L2/12-290 | N4310 | Lanin, Aharon; Davis, Mark; Pournader, Roozbeh (2012-07-31), Proposal for Four Characters for Bidi | ||||
L2/12-239 | Moore, Lisa (2012-08-14), "Consensus 132-C12", UTC #132 Minutes | |||||
L2/13-040 | Pournader, Roozbeh; Lanin, Aharon (2013-01-29), Fasttracking Arabic Letter Mark (ALM) | |||||
L2/13-125 | N4447 | Constable, Peter (2013-06-10), Unicode Liaison Report to WG2 | ||||