Braille Patterns

Range: U+2800..U+28FF (256 code points)
Plane: BMP
Scripts: Braille
Assigned: 256 code points
Unused: 0 reserved code points
Source standards: ISO 11548-1
Unicode version history: 3.0 (1999): 256 (+256)
Note: [1] [2]

The Unicode block Braille Patterns (U+2800..U+28FF) contains all 256 possible patterns of an 8-dot braille cell, thereby including the complete 6-dot cell range. [3] In Unicode, a braille cell does not have a letter or meaning defined. For example, Unicode does not define U+2817 to be "R".

Symbols, not letters

In Unicode, braille characters are not defined as belonging to any other script; they are defined as the Braille script. [3] That is, the patterns are available as symbols, without connection to an alphabetic letter or a number. This is because the same symbol can be used in multiple scripts, e.g. as a Latin character, a Vietnamese character, a Chinese character and a digit. For example, although U+2813 ⠓ BRAILLE PATTERN DOTS-125 represents the letter "H" in basic braille, its Unicode definition makes no reference to "H", and it is just as valid representing Korean ᄐ t- or Japanese ri.

For this reason (a dot pattern is not a letter), Unicode declares that, strictly speaking, braille patterns are 'symbols', not 'letters'. Their General Category is "So" (Symbol, other), not "Lo" (Letter, other). Beyond that declaration, however, braille is treated as a script in several places. For example, the "Script" character property of the 256 braille code points is "Brai", the ISO 15924 code for braille. This lets users and programs that search by script find the block.
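The General Category claim can be checked against the character database bundled with a programming language. The following is a minimal sketch using Python's standard `unicodedata` module; the result depends on the Unicode version shipped with the interpreter, and the Script property is not checked here because this sketch assumes only the standard library, which does not appear to expose it.

```python
import unicodedata

ch = "\u2813"  # BRAILLE PATTERN DOTS-125

# General Category of a single pattern: "So" (Symbol, other).
print(unicodedata.category(ch))

# Because the pattern is a symbol, it is not reported as alphabetic.
print(ch.isalpha())

# Collect the categories of all 256 code points in the block.
categories = {unicodedata.category(chr(cp)) for cp in range(0x2800, 0x2900)}
print(categories)
```

Note that even U+2800 (the blank pattern) is a symbol in this classification, not a space character.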

Identifying, naming and ordering

[Figure: Braille dot numbering]

The coding is in accordance with ISO/TR 11548-1 Communication aids for blind persons. [3] Unicode uses the standard dot-numbering 1 to 8. Historically only the 6-dot cell was used in braille. The lower two dots were added later, which explains the irregular numbering 1-2-3-7 in the left column and 4-5-6-8 in the right column. Where dots 7 and 8 are not raised, there is no distinction between 6-dot and 8-dot definitions.

The Unicode name of a specific pattern mentions the raised dots: U+2813 ⠓ BRAILLE PATTERN DOTS-125 has dots 1, 2 and 5 raised. By exception, the pattern with no dots raised is named U+2800 BRAILLE PATTERN BLANK. [4]

In the 8-dot cell each dot can individually be raised or not. That creates 2^8 = 256 different patterns. By mapping each of the eight dots to a bit in a byte (dot 1 to the least significant bit, dot 8 to the most significant bit), and by defining "0"/"1" for not raised/raised per bit, every specific pattern generates an identifying binary number. So the pattern with dots 1-2-5 raised would yield 00010011₂, equivalent to 13₁₆ or 19₁₀.

The mapping can also be computed by adding together the hexadecimal values of the raised dots (dot 1 = 1, dot 2 = 2, dot 3 = 4, dot 4 = 8, dot 5 = 10, dot 6 = 20, dot 7 = 40, dot 8 = 80). So the pattern with dots 1-2-5 raised would yield 1₁₆ + 2₁₆ + 10₁₆ = 13₁₆. Whether computed directly in hexadecimal, or indirectly via binary, the result is added to 2800₁₆, the offset for the Braille Patterns Unicode block.
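The bit mapping described above can be sketched in a few lines of Python. The function name `dots_to_char` and the constant `BRAILLE_BASE` are illustrative choices, not part of any standard API.

```python
BRAILLE_BASE = 0x2800  # first code point of the Braille Patterns block


def dots_to_char(dots):
    """Return the braille pattern character with the given dots (1-8) raised."""
    value = 0
    for dot in dots:
        if not 1 <= dot <= 8:
            raise ValueError(f"dot number out of range: {dot}")
        value |= 1 << (dot - 1)  # dot n maps to bit n-1 of the byte
    return chr(BRAILLE_BASE + value)


ch = dots_to_char([1, 2, 5])
print(f"U+{ord(ch):04X}")  # U+2813
```

Using a bitwise OR rather than addition means that a dot listed twice does not change the result.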

| Braille symbol | ⠓ | ⣇ | ⣿ |
| --- | --- | --- | --- |
| Unicode character | U+2813 | U+28C7 | U+28FF |
| Name | BRAILLE PATTERN DOTS-125 | BRAILLE PATTERN DOTS-12378 | BRAILLE PATTERN DOTS-12345678 |
| Dot numbers available | 1 2 3 4 5 6 7 8 | 1 2 3 4 5 6 7 8 | 1 2 3 4 5 6 7 8 |
| Dot raised=1 | 1 1 0 0 1 0 0 0 | 1 1 1 0 0 0 1 1 | 1 1 1 1 1 1 1 1 |
| Reverse order (lowest value right, as in decimal notation) | 0 0 0 1 0 0 1 1₂ | 1 1 0 0 0 1 1 1₂ | 1 1 1 1 1 1 1 1₂ |
| Hex value of dots | 10+2+1₁₆ | 80+40+4+2+1₁₆ | 80+40+20+10+8+4+2+1₁₆ |
| Total hexadecimal value | 13₁₆ | C7₁₆ | FF₁₆ |
| Into block, offset U+2800₁₆ | 2800₁₆+13₁₆=U+2813 | 2800₁₆+C7₁₆=U+28C7 | 2800₁₆+FF₁₆=U+28FF |
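The worked examples in the table can also be read in the opposite direction: given a code point, recover the raised dots and rebuild the character name. The sketch below does this and compares the result with the name reported by Python's `unicodedata` module; the helper names `raised_dots` and `pattern_name` are hypothetical.

```python
import unicodedata


def raised_dots(ch):
    """List the raised dot numbers (1-8) of a braille pattern character."""
    offset = ord(ch) - 0x2800
    if not 0 <= offset <= 0xFF:
        raise ValueError("not in the Braille Patterns block")
    # Bit n-1 of the offset is set when dot n is raised.
    return [bit + 1 for bit in range(8) if offset & (1 << bit)]


def pattern_name(ch):
    """Rebuild the Unicode character name from the raised dots."""
    dots = raised_dots(ch)
    if not dots:
        return "BRAILLE PATTERN BLANK"  # the one exception to the naming rule
    return "BRAILLE PATTERN DOTS-" + "".join(str(d) for d in dots)


for cp in (0x2800, 0x2813, 0x28C7, 0x28FF):
    ch = chr(cp)
    print(f"U+{cp:04X}", raised_dots(ch), pattern_name(ch) == unicodedata.name(ch))
```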

There is no regular mapping to the braille ASCII numbering.

Colloquial names

The Unicode names of braille dot patterns are not the same as what many English speakers would use colloquially. In particular, Unicode names use the word dots in the plural even when only one dot is listed: thus Unicode says braille pattern dots-5 when most English-speaking users of braille would simply say "braille dot 5" or just "dot 5".

In addition, some English-speaking users of braille use the word "and" when listing only two dots. Thus braille pattern dots-45 would be spoken as "braille dots 4 and 5". The word "and" is not always used when listing many dots, however.

Block

Braille was added to the Unicode Standard in September 1999 with the release of version 3.0.

When the patterns are punched, the filled (black) dots are the ones to punch.

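The Unicode block for braille is U+2800..U+28FF:

Braille Patterns [1]
Official Unicode Consortium code chart (PDF)
        0 1 2 3 4 5 6 7 8 9 A B C D E F
U+280x  ⠀ ⠁ ⠂ ⠃ ⠄ ⠅ ⠆ ⠇ ⠈ ⠉ ⠊ ⠋ ⠌ ⠍ ⠎ ⠏
U+281x  ⠐ ⠑ ⠒ ⠓ ⠔ ⠕ ⠖ ⠗ ⠘ ⠙ ⠚ ⠛ ⠜ ⠝ ⠞ ⠟
U+282x  ⠠ ⠡ ⠢ ⠣ ⠤ ⠥ ⠦ ⠧ ⠨ ⠩ ⠪ ⠫ ⠬ ⠭ ⠮ ⠯
U+283x  ⠰ ⠱ ⠲ ⠳ ⠴ ⠵ ⠶ ⠷ ⠸ ⠹ ⠺ ⠻ ⠼ ⠽ ⠾ ⠿
(end of 6-dot cell patterns)
U+284x  ⡀ ⡁ ⡂ ⡃ ⡄ ⡅ ⡆ ⡇ ⡈ ⡉ ⡊ ⡋ ⡌ ⡍ ⡎ ⡏
U+285x  ⡐ ⡑ ⡒ ⡓ ⡔ ⡕ ⡖ ⡗ ⡘ ⡙ ⡚ ⡛ ⡜ ⡝ ⡞ ⡟
U+286x  ⡠ ⡡ ⡢ ⡣ ⡤ ⡥ ⡦ ⡧ ⡨ ⡩ ⡪ ⡫ ⡬ ⡭ ⡮ ⡯
U+287x  ⡰ ⡱ ⡲ ⡳ ⡴ ⡵ ⡶ ⡷ ⡸ ⡹ ⡺ ⡻ ⡼ ⡽ ⡾ ⡿
U+288x  ⢀ ⢁ ⢂ ⢃ ⢄ ⢅ ⢆ ⢇ ⢈ ⢉ ⢊ ⢋ ⢌ ⢍ ⢎ ⢏
U+289x  ⢐ ⢑ ⢒ ⢓ ⢔ ⢕ ⢖ ⢗ ⢘ ⢙ ⢚ ⢛ ⢜ ⢝ ⢞ ⢟
U+28Ax  ⢠ ⢡ ⢢ ⢣ ⢤ ⢥ ⢦ ⢧ ⢨ ⢩ ⢪ ⢫ ⢬ ⢭ ⢮ ⢯
U+28Bx  ⢰ ⢱ ⢲ ⢳ ⢴ ⢵ ⢶ ⢷ ⢸ ⢹ ⢺ ⢻ ⢼ ⢽ ⢾ ⢿
U+28Cx  ⣀ ⣁ ⣂ ⣃ ⣄ ⣅ ⣆ ⣇ ⣈ ⣉ ⣊ ⣋ ⣌ ⣍ ⣎ ⣏
U+28Dx  ⣐ ⣑ ⣒ ⣓ ⣔ ⣕ ⣖ ⣗ ⣘ ⣙ ⣚ ⣛ ⣜ ⣝ ⣞ ⣟
U+28Ex  ⣠ ⣡ ⣢ ⣣ ⣤ ⣥ ⣦ ⣧ ⣨ ⣩ ⣪ ⣫ ⣬ ⣭ ⣮ ⣯
U+28Fx  ⣰ ⣱ ⣲ ⣳ ⣴ ⣵ ⣶ ⣷ ⣸ ⣹ ⣺ ⣻ ⣼ ⣽ ⣾ ⣿
Notes
1. As of Unicode version 15.0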
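Because every code point in the block is assigned and the order is purely numeric, the chart can be regenerated programmatically. The following is a minimal sketch; the generator name `chart_rows` is an illustrative choice, and how the glyphs look depends entirely on the font in use.

```python
def chart_rows():
    """Yield (label, characters) for each 16-character row of the block."""
    for high in range(0x280, 0x290):  # U+280x through U+28Fx
        label = f"U+{high:03X}x"
        chars = [chr((high << 4) | low) for low in range(16)]
        yield label, chars


print("        " + " ".join("0123456789ABCDEF"))
for label, chars in chart_rows():
    print(label + "  " + " ".join(chars))
```

The first four rows (U+2800..U+283F) are the patterns with dots 7 and 8 lowered, which is why the 6-dot range ends after the U+283x row.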

Font differences

When showing braille graphically in printed instruction manuals, it can be useful to indicate the dots that are not punched, especially if a single braille cell of only one or two punched dots is shown out of context: in this case it might otherwise be difficult to judge the vertical alignment of the dots and tell the difference between, say, dots 2 and 4 versus dots 3 and 5.

The current Unicode charts, and some fonts, use empty circles to indicate dots that are not punched. This does not always render very clearly: if the circle outlines are printed heavily then it can be difficult to tell at a glance whether the dot is filled in or not. The braille package for LaTeX (and several printed publications such as the printed manual for the new international braille music code) show unpunched dots as very small dots (much smaller than the filled-in dots) rather than circles, and this tends to print better.

Some braille fonts do not indicate unpunched dots at all. Additionally, some Linux braille fonts (e.g. GNU Unifont and the DejaVu fonts) use small squares instead of small circles to indicate dots. [5]

Other uses

The Braille Pattern characters are commonly [6] used in terminal applications as a way to draw multiple pixels per character.
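Since each cell holds a grid of 2 × 4 dots, one character can stand for eight on/off pixels. The sketch below illustrates the idea with a hypothetical `render` function; it is not taken from any particular terminal library, and the pixel-to-dot table simply follows the dot numbering (1-2-3-7 in the left column, 4-5-6-8 in the right).

```python
# Bit for each pixel position inside one cell, indexed as PIXEL_BITS[row][col].
PIXEL_BITS = (
    (0x01, 0x08),  # dots 1 and 4
    (0x02, 0x10),  # dots 2 and 5
    (0x04, 0x20),  # dots 3 and 6
    (0x40, 0x80),  # dots 7 and 8
)


def render(pixels, width, height):
    """Render a set of (x, y) pixels as lines of braille characters."""
    cols = (width + 1) // 2   # each character is 2 pixels wide
    rows = (height + 3) // 4  # and 4 pixels tall
    cells = [[0] * cols for _ in range(rows)]
    for x, y in pixels:
        if 0 <= x < width and 0 <= y < height:
            cells[y // 4][x // 2] |= PIXEL_BITS[y % 4][x % 2]
    return ["".join(chr(0x2800 + value) for value in row) for row in cells]


# A diagonal line across an 8x8 pixel canvas, drawn in 4x2 characters.
for line in render({(i, i) for i in range(8)}, 8, 8):
    print(line)
```

Empty cells are emitted as U+2800 rather than as an ASCII space so that every cell has the same character class; whether the columns line up still depends on the terminal font.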

History

The following Unicode-related documents record the purpose and process of defining specific characters in the Braille Patterns block:

| Version | Final code points [a] | Count | UTC ID | L2 ID | WG2 ID | Document |
| --- | --- | --- | --- | --- | --- | --- |
| 3.0 | U+2800..28FF | 256 | | X3L2/90-064 | | Extracts from "A Proposal for Funding the Programs of Braille Research and Literacy", 1990-01-01 |
| | | | | X3L2/91-085 | | TC 173 Proposals for new work items for Braille Coding, 1991-03-20 |
| | | | | X3L2/92-039 | | Bishop, Avery (1991-10-29), The long awaited draft reply on Braille symbol encoding |
| | | | | | N1093 | Shibano, Kohji (1994-12-26), Braille Letters |
| | | | | | N1203 | Umamaheswaran, V. S.; Ksar, Mike (1995-05-03), "6.1.3.3", Unconfirmed minutes of SC2/WG2 Meeting 27, Geneva |
| | | | | X3L2/95-114 | N1279 | Braille letters (addition request), 1995-10-27 |
| | | | | | N1303 | Umamaheswaran, V. S.; Ksar, Mike (1996-01-26), "8.14 Braille", Minutes of Meeting 29, Tokyo |
| | | | | X3L2/95-125 | | Duran, Peter (1990-05-24), A Proposal for Funding the Programs of Braille Research and Literacy |
| | | | UTC/1996-002 | | | Aliprand, Joan; Hart, Edwin; Greenfield, Steve (1996-03-05), "Braille", UTC #67 Minutes |
| | | | UTC/1996-007 | | | Hart, Edwin (1996-03-07), Contribution on Encoding Braille in ISO/IEC 10646 |
| | | | | | N1342 | Sato, Takayuki K. (1996-03-19), Braille letters (confirmation of request) |
| | | | | | N1339 | Ksar, Mike (1996-03-28), Liaison Letter on Braille to ISO/TC137 Secretariat |
| | | | | | N1345 | Hart, Edwin (1996-04-01), Initial comments on encoding Braille into ISO/IEC 10646 |
| | | | | | N1353 | Umamaheswaran, V. S.; Ksar, Mike (1996-06-25), "8.9", Draft minutes of WG2 Copenhagen Meeting # 30 |
| | | | UTC/1996-027.2 | | | Greenfield, Steve (1996-07-01), "K. Braille Proposal", UTC #69 Minutes (PART 2) |
| | | | | | N1409R | Braille Symbols, 1996-08-12 |
| | | | | | N1453 | Ksar, Mike; Umamaheswaran, V. S. (1996-12-06), "8.9", WG 2 Minutes - Quebec Meeting 31 |
| | | | | L2/97-047 | N1541 | Everson, Michael (1997-03-11), Proposed pDAM text for Braille |
| | | | | | N1588 | DIS 11 548-1 - Communication aids for blind persons Part 1: Braille identifiers and shift marks - General guidelines, 1997-06-23 |
| | | | | | N1588.1 | DIS 11 548-2 - Communication aids for blind persons Part 2: Latin alphabet based character sets |
| | | | | L2/97-157 | N1612 | Report of ad-hoc group on Braille encoding, 1997-07-01 |
| | | | | L2/97-288 | N1603 | Umamaheswaran, V. S. (1997-10-24), "8.4", Unconfirmed Meeting Minutes, WG 2 Meeting # 33, Heraklion, Crete, Greece, 20 June - 4 July 1997 |
| | | | | L2/98-136 | N1770 | Paterson, Bruce (1998-04-06), Revised Text of 10646-1/FPDAM 16: Amendment 16: Braille Patterns |
| | | | | L2/98-286 | N1703 | Umamaheswaran, V. S.; Ksar, Mike (1998-07-02), "6.2.2 FPDAM-16 - Braille patterns", Unconfirmed Meeting Minutes, WG 2 Meeting #34, Redmond, WA, USA; 1998-03-16--20 |

  a. Proposed code points and character names may differ from final code points and names

References

  1. "Unicode character database". The Unicode Standard. Retrieved 2016-07-09.
  2. "Enumerated Versions of The Unicode Standard". The Unicode Standard. Retrieved 2016-07-09.
  3. Unicode Chapter 15, section 15.10
  4. Unicode chart U+2800, braille patterns
  5. "Braille Pattern Dots-1358 (U+2895) Font Support". fileformat.info. Retrieved December 7, 2014.
  6. "Iirelu/Braillify". GitHub. 22 July 2021.