Why is there a back tick ` but no mirrored version of it (the ' is not)?
Why is there no degree symbol in it?
Why did they make most of the first 32 symbols useless codes instead of also symbols?
Why did they have to start the whole "newline vs carriage return" thing, why not just a single newline character from the beginning?
Why is there no pilcrow, paragraph symbol, dagger and double dagger in it?
Why no symbols for not equals, subset, intersection, union, (no) element of, AND, OR in it?
Why is it 7-bit and not 8-bit? Who uses 7 bits, honestly.
Programming languages would have had some nicer symbols available if some of the above were done...
Why the first 32. Symmetry. Look at the table here: https://commons.wikimedia.org/wiki/File:ASCII_Code_Chart-Qui...
Four groups of 32, each group differing in only one bit position.
Why were the first 32 devoted to "control" codes? Because ASCII appeared in the days of mechanical paper teletype interfaces to computers, so there was a need for codes to control the teletype. Additionally, the intent was to use some of the other control codes for protocol "control" (e.g., XON (control-q) and XOFF (control-s) for flow control).
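A minimal sketch of that layout, assuming nothing beyond the standard 7-bit ASCII values (the helper names `group` and `ctrl` are made up for illustration). The two high bits pick one of the four blocks of 32, and holding Control clears those two bits, which is how control-q and control-s end up as XON and XOFF:

```python
def group(ch: str) -> int:
    """Return which block of 32 (0 to 3) a 7-bit ASCII character falls in."""
    return ord(ch) >> 5


def ctrl(ch: str) -> int:
    """Code produced by Control+<ch>: keep only the low five bits."""
    return ord(ch) & 0x1F


XON = ctrl("Q")   # DC1, resume transmission
XOFF = ctrl("S")  # DC3, pause transmission

# Same low five bits, four different blocks: DC1, '1', 'Q', 'q'.
print(group("\x11"), group("1"), group("Q"), group("q"))
print(hex(XON), hex(XOFF))
# Upper and lower case differ only in the 0x20 bit.
print(chr(ord("A") ^ 0x20))
```

So it takes two bits to name a block, but a letter and its lower-case form differ in one bit (0x20), and a letter and its control code differ in one bit (0x40).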
> Why did they have to start the whole "newline vs carriage return" thing,
Because when you have a mechanical paper teletype printer as your interface to the computer, the teletype printer needs to be told to do two things:
1) return the print carriage to the left margin
2) roll the paper up one line
Having the ability to do both independently (for a paper printer) allows for simulating some otherwise impossible effects (e.g., bold face).
> Why is it 7-bit and not 8-bit?
To allow bit 8 to be used for a parity bit for data transmission purposes.
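As a rough sketch of what that looks like on the wire, assuming even parity (odd parity was also used; the choice here is only for illustration, and the function names are hypothetical):

```python
def add_even_parity(code: int) -> int:
    """Put an even-parity bit in bit 8 (mask 0x80) of a 7-bit code."""
    if not 0 <= code < 128:
        raise ValueError("not a 7-bit code")
    parity = bin(code).count("1") & 1  # 1 if the count of set bits is odd
    return code | (parity << 7)


def check_even_parity(byte: int) -> bool:
    """True if the 8-bit value has an even number of set bits."""
    return bin(byte).count("1") % 2 == 0


sent = add_even_parity(ord("C"))   # 'C' is 0x43, three bits set
print(hex(sent))
print(check_even_parity(sent))          # intact byte
print(check_even_parity(sent ^ 0x04))   # one bit flipped in transit
```

A single flipped bit is detected; two flipped bits cancel out and go unnoticed, which is the usual limitation of a lone parity bit.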
> Why is there no pilcrow, paragraph symbol, dagger and double dagger in it?
> Why no symbols for not equals, subset, intersection, union, (no) element of, AND, OR in it?
Likely because, after choosing to make a 7-bit code (to allow bit 8 to be parity), there is only so much room left in only 128 slots.
That's impossible. If you have four groups, they have to differ in at least TWO bit positions.
A better example is underlining.
To that extent, it's very much worthwhile to have a control code for a carriage return - setting the cursor to the start of the line without advancing to a new one. This lets you, for example, tell an electronic typewriter to double-strike a line for a boldface effect, or strikethrough a line of text.
And once you have that, you might as well have the control character that advances the output to the next line not move the cursor at all, since the operator can just send a carriage return if they do want to start at the beginning of the next line instead of where they left off on the previous one.
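To make the overstrike idea concrete, here is a toy model of a printing terminal, a sketch under simplifying assumptions (no backspace, tabs, or margins; `teletype` is a made-up name). CR only resets the column, LF only advances the paper, and every other character is struck on top of whatever is already at that position:

```python
def teletype(stream: str) -> list[list[str]]:
    """Simulate a paper printer.

    Returns page[row][col] as the string of every character struck
    at that position, in order.
    """
    page: list[list[str]] = [[]]
    row = col = 0
    for ch in stream:
        if ch == "\r":          # carriage return: back to the left margin
            col = 0
        elif ch == "\n":        # line feed: roll the paper, keep the column
            row += 1
            while len(page) <= row:
                page.append([])
        else:                   # strike the character and advance one column
            line = page[row]
            while len(line) <= col:
                line.append("")
            line[col] += ch
            col += 1
    return page


# Double-strike "bold", then underline "ab", then LF without CR.
page = teletype("bold\rbold\r\nab\r__\n!")
for line in page:
    print(line)
```

The last line shows the consequence of a bare LF: the `!` lands in column 2 of the third line, not at the left margin.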
With regards to bitness, computers weren't always strictly power-of-2 based (quick example: the PDP-8 was a 12-bit machine). While today, memory is cheap enough that "rounding up" 30 bits to 32, or 7 bits to 8, is definitely worth it in terms of how it simplifies your other logic, back when ASCII was developed that wasn't really the case.
There is no degree symbol because nobody ever proposed one during the standards process. Most of the punctuation came from what appeared on US typewriters at the time. Likewise pilcrow, paragraph, etc.
The first 32 characters are controls because one of the major proponents of the code was Teletype, a division of AT&T. Nobody understood what network protocols were going to turn out to look like and existing protocols were very heavy on in-band signaling. It was an attempt to eradicate the worst features of the Baudot code that was previously in use, where every character had multiple shift modes and multiple protocol interpretations.
The newline vs. carriage return thing is also an artifact of AT&T's involvement. Most US computing organizations didn't care about controls at all and wanted fixed-size records. European computing organizations wanted a single newline. The compromise in ASCII-1968 was that LF could be interpreted as CRLF if sender and receiver agreed, which became the Multics convention and thence into Unix and C.
7-bit because computers at the time universally used 6-bit characters and nobody thought the computer people would actually use the control characters, only the middle 64 characters of the code. (No lower case either.) IBM threw a wrench in the works when they went to 8-bit bytes with the 360 and others followed.
My attempt to tell the ASCII story several years ago: https://docs.google.com/file/d/0B6gxjm4UN7VjZnFmNlIzQmJoRDg/...
The 8th bit allowed for various character sets, encoded in Extended ASCII. The one most people know of is ISO-8859-1, also known as Latin-1.
It's more complicated than that, I wrote about it once at length here:
http://www.randomtechnicalstuff.blogspot.com.au/2009/05/unic...
If ASCII were 8 bits to start with, we probably would have had zillions of incompatible extensions that used Control-N and Control-O (shift out and shift in) to switch into and out of 'non-ASCII' mode. (Here's an idea to sort-of improve upon HTML and XML: instead of < and >, use shift in and shift out to delineate nodes. That way, we wouldn't need that < stuff.)
Alternatively, the nine bit byte might have won the battle.
> Why no symbols for not equals, subset, intersection, union, (no) element of, AND, OR in it?
Which characters would you like to replace?
http://en.wikipedia.org/wiki/ASCII#ASCII_printable_character...
With instead some characters useful for programming languages and basic text (and the problem is Unicode is a bit too big, confusing and redundant for programming languages...)
American Standard Code for Information Interchange http://www.wps.com/projects/codes/X3.4-1963/index.html
Revised U.S.A. Standard Code for Information Interchange http://www.wps.com/J/codes/Revised-ASCII/index.html
Eric Fischer, The Evolution of Character Codes, 1874-1968 http://www.pobox.com/~enf/ascii/ascii.pdf
Tom Jennings, An annotated history of some character codes or ASCII: American Standard Code for Information Infiltration http://www.wps.com/J/codes/
R W Bemer, The 1960 Survey of Coded Character Sets: The Reasons for ASCII, http://www.trailing-edge.com/~bobbemer/SURVEY.HTM
R W Bemer, Design of an improved transmission/data processing code http://dx.doi.org/10.1145/366532.366538
Charles E. MacKenzie, Coded Character Sets: History and Development 978-0201144604
To all the other history cited, there was also an early digital telephony multiplexing ("T1") practice of the day called "robbed bit signalling"[1] [2]. You got 7 bits clean, 8 bits - not so much.
----------
An interesting trivium: the IRC protocol lets users choose their own nicknames, and demands that nicknames be compared in a case-insensitive manner. For the purposes of IRC, you must treat {|} as the 'lower-case' versions of [\], because IRC was invented in Scandinavia and it did indeed use those character-codes for accented characters.
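A small sketch of such a comparison, using the four pairs listed in RFC 2812 section 2.2 ({}|^ as the lower case of []\~). The function names are hypothetical, and real servers may advertise a different case mapping, so treat this as an illustration only:

```python
# Fold A-Z plus the four "Scandinavian" pairs: [ ] \ ~  ->  { } | ^
_IRC_LOWER = str.maketrans(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ[]\\~",
    "abcdefghijklmnopqrstuvwxyz{}|^",
)


def irc_lower(name: str) -> str:
    """Lower-case a nickname or channel name the way the RFC describes."""
    return name.translate(_IRC_LOWER)


def same_nick(a: str, b: str) -> bool:
    """Compare two nicknames for equivalence under that folding."""
    return irc_lower(a) == irc_lower(b)


print(same_nick("Foo[bar]", "foo{BAR}"))
print(same_nick("a\\b", "A|B"))
print(same_nick("a_b", "a-b"))
```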
http://tools.ietf.org/html/rfc2812#section-2.2
" 2.2 Character codes
No specific character set is specified. The protocol is based on a
set of codes which are composed of eight (8) bits, making up an
octet. Each message may be composed of any number of these octets;
however, some octet values are used for control codes, which act as
message delimiters.
Regardless of being an 8-bit protocol, the delimiters and keywords
are such that protocol is mostly usable from US-ASCII terminal and a
telnet connection.
Because of IRC's Scandinavian origin, the characters {}|^ are
considered to be the lower case equivalents of the characters []\~,
respectively. This is a critical issue when determining the
equivalence of two nicknames or channel names."
A /\ B \/ C
I wonder why that notation was abandoned in programming languages.
> Similarly, the twospaces version of counting demonstrated that vertical space is more important than indentation to programmers when judging whether or not statements belong to the same loop body. Programmers often group blocks of related statements together using vertical white space, but our results indicate that this seemingly superficial space can cause even experienced programmers to internalize the wrong program.
http://arxiv.org/abs/1304.5257
Of course, whether or not curly braces significantly help in this situation would require another experiment. Anecdotally though, I do feel like it requires less mental effort to structure code that I'm reading.