Backslash

\
Backslash
In Unicode: U+005C \ REVERSE SOLIDUS
See also:
U+29F5 ⧵ REVERSE SOLIDUS OPERATOR
U+29F9 ⧹ BIG REVERSE SOLIDUS
U+FE68 ﹨ SMALL REVERSE SOLIDUS
U+FF3C ＼ FULLWIDTH REVERSE SOLIDUS
The backslash \ is a mark used mainly in computing and mathematics. It is the mirror image of the common slash /. It is a relatively recent mark, first documented in the 1930s. It is sometimes called a hack, whack, escape (from C/UNIX), reverse slash, slosh, downwhack, backslant, backwhack, bash, reverse slant, reverse solidus, and reversed virgule. [1] [2]

History

A Teletype Wheatstone Perforator keyboard from the 1930s, with backslash at the end of the third row
Teletype ASR-33 keyboard layout with ASCII character set, prior to June 14, 1966, with backslash on shift+L

As of November 2022, efforts to identify either the origin of this character or its purpose before the 1960s have not been successful. The earliest known reference found to date is a 1937 maintenance manual from the Teletype Corporation with a photograph showing the keyboard of its Kleinschmidt keyboard perforator WPE-3 using the Wheatstone system. [3] [4] The symbol was called the "diagonal key", [5] and given a (non-standard) Morse code of        . [6] [lower-alpha 1]

In June 1960, IBM published an "Extended character set standard" that includes the symbol at 0x19. [4] In September 1961, Bob Bemer (IBM) proposed to the X3.2 standards committee that [, ] and \ be made part of the proposed standard, describing the backslash as a "reverse division operator" and citing its prior use by Teletype in telecommunications. In particular, he said, the \ was needed so that the ALGOL Boolean operators ∧ (logical conjunction) and ∨ (logical disjunction) could be composed using /\ and \/ respectively. The Committee adopted these changes into the draft American Standard (subsequently called ASCII) at its November 1961 meeting. [4]

The /\ and \/ compositions were used as operators for min and max in early versions of the C programming language supplied with Unix V6 [7] and V7. [8]

Usage

Programming languages

In many programming languages such as C, Perl, PHP, Python, Unix scripting languages, and many file formats such as JSON, the backslash is used as an escape character, to indicate that the character following it should be treated specially (if it would otherwise be treated literally), or literally (if it would otherwise be treated specially). For instance, inside a C string literal the sequence \n produces a newline byte instead of an 'n', and the sequence \" produces an actual double quote rather than the special meaning of the double quote ending the string. An actual backslash is produced by a double backslash \\.
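As an illustrative sketch, the same three escapes can be observed in Python, one of the languages listed above. The variable names are arbitrary and chosen only for this example.

```python
# Escape sequences inside ordinary (non-raw) string literals.
newline = "a\nb"      # \n is one newline character, not '\' followed by 'n'
quote = "say \"hi\""  # \" is a literal double quote; it does not end the string
slash = "C:\\temp"    # \\ is one actual backslash

print(len(newline))   # 3
print(quote)          # say "hi"
print(len(slash))     # 7
```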

Regular expression languages use it the same way, changing subsequent literal characters into metacharacters and vice versa. For instance, \||b searches for either '|' or 'b': the first bar is escaped and searched for literally, while the second is not escaped and acts as an "or".
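The \||b pattern can be tried with Python's re module, used here only as one example of a regular expression engine. A raw string literal (r"...") is used so that the backslash reaches the regex engine unchanged; the sample input is made up.

```python
import re

# In \||b the first bar is escaped (a literal '|');
# the second bar is unescaped and means "or".
pattern = re.compile(r"\||b")

matches = pattern.findall("a|b|c")
print(matches)  # ['|', 'b', '|']
```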

Outside quoted strings, the only common use of backslash is to ignore ("escape") a newline immediately after it. In this context it may be called a "continued line" [9] as the current line continues into the next one. Some software replaces the backslash+newline with a space. [10]
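A minimal sketch of a continued line, shown in Python, which follows this convention. The backslash must be the last character on the physical line.

```python
# The backslash immediately before the newline joins the two
# physical lines into one logical line.
total = 1 + 2 + \
        3 + 4

print(total)  # 10
```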

To support computers that lacked the backslash character, the C trigraph ??/ was added, which is equivalent to a backslash. Since this can escape the next character, which may itself be a ?, the primary modern use may be for code obfuscation. Support for trigraphs was removed from C++ in C++17 and from C in C23.

In Visual Basic (and some other BASIC dialects) the backslash is used as an operator symbol to indicate integer division. [11] This rounds toward zero.

The ALGOL 68 programming language uses the "\" as its Decimal Exponent Symbol. ALGOL 68 has the choice of four Decimal Exponent Symbols: e, E, \, or ⏨. Examples: 6.0221415e23, 6.0221415E23, 6.0221415\23 or 6.0221415⏨23. [12]

In APL \ is called Expand when used to insert fill elements into arrays, and Scan when used to produce prefix reduction (cumulative fold).
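The two APL uses can be approximated in Python as a rough analogue. The sketch assumes a Boolean mask and a space as the fill element; expand is a hypothetical helper written for this illustration, not a library function.

```python
from itertools import accumulate
from operator import add

# Scan: +\ 1 2 3 4 in APL yields the running sums 1 3 6 10.
running = list(accumulate([1, 2, 3, 4], add))

def expand(mask, items, fill=" "):
    """Hypothetical helper: copy the next item where mask is 1, insert fill where it is 0."""
    it = iter(items)
    return [next(it) if m else fill for m in mask]

# Expand: 1 0 1 1 \ 'abc' in APL inserts a fill element at the 0 position.
expanded = "".join(expand([1, 0, 1, 1], "abc"))

print(running)   # [1, 3, 6, 10]
print(expanded)  # a bc
```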

In PHP version 5.3 and higher, the backslash is used to indicate a namespace. [13]

In Haskell, the backslash is used both to introduce special characters and to introduce lambda functions (since it is a reasonable approximation in ASCII of the Greek letter lambda, λ). [14]

Filenames

MS-DOS 2.0, released in 1983, copied the idea of a hierarchical file system from Unix and thus used the (forward) slash as the directory separator. [15] Possibly on the insistence of IBM, [16] [17] Microsoft added the backslash to allow paths to be typed at the command line interpreter prompt, while retaining compatibility with MS-DOS 1.0, in which / was the command-line option indicator. (Typing "DIR/W" gave the "wide" option to the "DIR" command, so some other method was needed if one actually wanted to run a program called W inside a directory called DIR.) Except for COMMAND.COM, all other parts of the operating system accept both characters in a path, but the Microsoft convention remains to use a backslash, and APIs that return paths use backslashes. [18] In some versions, the option character can be changed from / to - via SWITCHAR, which allows COMMAND.COM to preserve / in the command name.

The Microsoft Windows family of operating systems inherited the MS-DOS behavior and so still support either character – but individual Windows programs and sub-systems may, wrongly, only accept the backslash as a path delimiter, or may misinterpret a forward slash if it is used as such. Some programs will only accept forward slashes if the path is placed in double-quotes. [19] The failure of Microsoft's security features to recognize unexpected-direction slashes in local and Internet paths, while other parts of the operating system still act upon them, has led to some serious lapses in security. Resources that should not be available have been accessed with paths using particular mixes, such as http://example.net/secure\private.aspx. [20] [21]
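The "accept either, return backslashes" convention can be seen in Python's pathlib, whose PureWindowsPath class models Windows path rules on any host system. This is a sketch of the convention only, not of the Windows API itself, and the path shown is hypothetical.

```python
from pathlib import PureWindowsPath

# Both separators are accepted on input (hypothetical example path).
forward = PureWindowsPath("C:/Users/alice/report.txt")
backward = PureWindowsPath(r"C:\Users\alice\report.txt")

print(forward == backward)  # True
print(str(forward))         # C:\Users\alice\report.txt
```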

Text markup

The backslash is used in the TeX typesetting system and in RTF files to begin markup tags.

In USFM, [22] the backslash is used to mark format features for editing Bible translations.

In caret notation, ^\ represents the control character 0x1C, file separator. [23] This is entirely a coincidence and has nothing to do with its use in file paths.
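Caret notation pairs each control character with the printable character whose code differs from it by 0x40, which is why \ (0x5C) lands on 0x1C. A minimal sketch, where caret_to_code is a hypothetical helper name:

```python
def caret_to_code(ch):
    """Code of the control character written ^ch in caret notation (ch XOR 0x40)."""
    return ord(ch) ^ 0x40

print(hex(caret_to_code("\\")))  # 0x1c (file separator)
print(hex(caret_to_code("?")))   # 0x7f (delete)
```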

Mathematics

A backslash-like symbol is used for the set difference. [24]

The backslash is also sometimes used to denote the right coset space. [25]

Especially when describing computer algorithms, it is common to define backslash so that a\b is equivalent to ⌊a/b⌋. [citation needed] This is integer division that rounds down, not towards zero. In Wolfram Mathematica the backslash is used this way for integer divide. [26]
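The difference between rounding down and rounding toward zero only shows for negative quotients. A short Python sketch, in which // is Python's floor-division operator and math.trunc stands in for division that rounds toward zero:

```python
import math

a, b = -7, 2

floored = a // b               # rounds down (toward negative infinity)
truncated = math.trunc(a / b)  # rounds toward zero

print(floored)    # -4
print(truncated)  # -3
print(7 // 2, math.trunc(7 / 2))  # 3 3
```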

In MATLAB and GNU Octave the backslash is used for left matrix divide, while the (forward) slash is for right matrix divide. [27]

Confusion with ¥ and other characters

In the Japanese encodings ISO 646-JP (a 7-bit code based on ASCII), JIS X 0201 (an 8-bit code), and Shift JIS (a multi-byte encoding which is 8-bit for ASCII), the code point 0x5C that would be used for backslash in ASCII is instead rendered as a yen sign ¥. Due to extensive use of the 005C code point to represent the yen sign, even today some fonts such as MS Mincho render the backslash character as a ¥, so the characters at Unicode code points 00A5 (¥) and 005C (\) both render as ¥ when these fonts are selected. Computer programs still treat 005C as a backslash in these environments but display it as a yen sign, causing confusion, especially in MS-DOS filenames. [28]

Several other ISO 646 versions also replace backslash with other characters, including ₩ (Korean), Ö (German, Swedish), Ø (Danish, Norwegian), ç (French) and Ñ (Spanish), leading to similar problems, though with less lasting impact compared to the yen sign.

In 1991, RFC 1345 suggested // as a unique two-character mnemonic that might be used in internet standards as "a practical way of identifying [this] character, without reference to a coded character set and its code in [that] coded character set". [29] Consequently, this style may be seen in early Internet Engineering Task Force documents.

Notes

  1. This is the inverse of the code for the slash symbol (       ).


References

  1. Macquarie Dictionary (3rd edition)
  2. Raymond, Eric S. "ASCII". The Jargon File. Archived from the original on 2019-01-18. Retrieved 2007-07-28.
  3. Bulletin 125, issue 2: Description and Adjustments of the Teletype Wheatstone Perforator (PDF). Teletype Corporation. May 1938 [August 1937]. p. ii. Archived (PDF) from the original on 2021-02-12. Retrieved 2021-03-15 via navy-radio.com.
  4. Fischer, Eric (20 June 2000). "The Evolution of Character Codes, 1874–1968" (PDF). pp. 14–15 (Table 35). Archived (PDF) from the original on 30 November 2022. Retrieved 30 November 2022. Referencing Bemer, R.W.; Buchholz, W. (June 1, 1960). An Extended Character Set Standard (Technical report). Product Development Laboratory, Data Systems Division, International Business Machines Corporation. TR 00.721. Computer Standards Collection, Archives Center, National Museum of American History, Smithsonian Institution, box 1.
  5. "Bulletin 188: Teletype automatic perforator set" (PDF). Teletype Corporation. August 1945. p. 6. Archived (PDF) from the original on 2021-12-19. Retrieved 2021-12-19 via navy-radio.com.
  6. "Bulletin 1025, issue 3: Parts for Teletype Wheatstone Perforator" (PDF). Teletype Corporation. July 1945. p. 9. Archived (PDF) from the original on 2021-12-19. Retrieved 2021-12-19 via navy-radio.com.
  7. "C compiler source". bitsavers.org. 1975. Archived from the original on 2011-08-15. Retrieved 2009-10-11.
  8. "C compiler source". mit.edu. 1979. Archived from the original on 2013-12-03. Retrieved 2009-10-11.
  9. "The C Preprocessor". GNU.org. Archived from the original on 2021-08-03. Retrieved 2021-01-04.
  10. "3.1.1 Splitting long lines". GNU make manual. Archived from the original on June 5, 2018. Retrieved July 28, 2019.
  11. "Arithmetic Operators in Visual Basic". Visual Basic Language Features: Operators and Expressions. MSDN. 10 July 2012. Archived from the original on 17 September 2012. Retrieved 7 October 2012.
  12. "Revised Report on the Algorithmic Language Algol 68". Acta Informatica. 5 (1–3): 1–236. September 1973. doi:10.1007/BF00265077. S2CID 2490556. Archived from the original on 2020-07-26. Retrieved 2020-02-10.
  13. "Namespaces overview". php.net. Archived from the original on 2012-10-16. Retrieved 2012-10-10.
  14. O'Sullivan, Stewart, and Goerzen, Real World Haskell, ch. 4: anonymous (lambda) functions, p.99
  15. "Why is the DOS path character "\"?". Blogs.msdn.com. 2005-06-24. Archived from the original on 2022-11-19. Retrieved 2013-06-16.
  16. Necasek, Michal (24 May 2019). "Why Does Windows Really Use Backslash as Path Separator?". OS/2 Museum. Archived from the original on 24 May 2019. Retrieved 28 May 2019.
  17. Larson, Chris; Immerwahr, Don (1983), MS-DOS v1.25 and v2.0 Source Code, README.txt, Microsoft, archived from the original on 21 November 2022, retrieved 21 November 2022, The user manual contains some significant errors. Most of these are due to last minute changes to achieve a greater degree of compatibility with IBM's implementation of MS-DOS (PC DOS). This includes the use of "\" instead of "/" as the path separator, and "/" instead of "-" as the switch character. For transporting of batch files across machines, Microsoft encourages the use of "\" and "/" respectively in the U.S. market. [...] We apologize for any inconveniences these changes may have caused your technical publications staff.
  18. "Path.GetFullPath Method". .NET Framework Class Library. Microsoft. Archived from the original on 21 December 2008. Retrieved 2009-01-02.
  19. "When did Windows start accepting forward slash as a path separator?". Bytes.com. Archived from the original on 6 February 2009. Retrieved 2009-01-02.
  20. Kaplan, Simone (2004). "Microsoft Probes Flaw in ASP.NET". DevSource, sponsored by Microsoft. Ziff Davis Enterprise Holdings Inc. Archived from the original on 2013-01-21. Retrieved 2009-06-14.
  21. Burnett, Mark (2004). "Security Holes That Run Deep". SecurityFocus. Archived from the original on 2021-02-02. Retrieved 2009-06-14.
  22. "USFM – Unified Standard Format Markers". paratext.org. 18 April 2017. Archived from the original on 13 August 2020. Retrieved 30 June 2013.
  23. "C0 control characters set, 1987". pd.infn.it. Archived from the original on 2019-09-06. Retrieved 2021-03-14.
  24. "Quantities and units – Part 2: Mathematical signs and symbols to be used in the natural sciences and technology". ISO 80000-2:2009. International Organization for Standardization. 15 April 2008. Archived from the original on 2014-03-26. Retrieved 2012-10-07.
  25. "Definition:Coset Space". ProofWiki. Archived from the original on 26 February 2021. Retrieved 1 February 2017.
  26. "Integer Division". mathworld.wolfram.com. Archived from the original on 2020-02-21. Retrieved 2020-03-10.
  27. Eaton, John W.; David Bateman; Søren Hauberg (February 2011). "GNU Octave: A high-level interactive language for numerical computations" (PDF). Free Software Foundation. p. 145. Archived (PDF) from the original on 24 October 2012. Retrieved 7 October 2012.
  28. "When is a backslash not a backslash?". Blogs.msdn.com. Archived from the original on 2016-03-23. Retrieved 2013-06-16.
  29. Simonsen, Keld (1991-12-26). "RFC 1345". Tools.ietf.org. Archived from the original on 2016-06-13. Retrieved 2013-06-16.