Commit 1265a0c2 authored by Robert Griesemer

- essentially reverted my change of yesterday with respect to char/string syntax

- fixed indentation in many places
- fixed a couple of typos

SVN=116120
parent 75bbce9e
The Go Programming Language The Go Programming Language
---- ----
(April 17, 2008) (April 18, 2008)
This document is an informal specification/proposal for a new systems programming This document is an informal specification/proposal for a new systems programming
language. language.
...@@ -194,12 +194,14 @@ Notation ...@@ -194,12 +194,14 @@ Notation
The syntax is specified using Extended Backus-Naur Form (EBNF). The syntax is specified using Extended Backus-Naur Form (EBNF).
In particular: In particular:
- "" encloses lexical symbols (a backslash precedes a literal quote within a symbol) - | separates alternatives (least binding strength)
- | separates alternatives
- () groups - () groups
- [] specifies an option (0 or 1 times) - [] specifies an option (0 or 1 times)
- {} specifies repetition (0 to n times) - {} specifies repetition (0 to n times)
Lexical symbols are enclosed in double quotes '''' (the
double quote symbol is written as ''"'').
A production may be referenced from various places in this document A production may be referenced from various places in this document
but is usually defined close to its first use. Productions and code but is usually defined close to its first use. Productions and code
examples are indented. examples are indented.
...@@ -356,7 +358,7 @@ point value that is constrained only upon assignment. ...@@ -356,7 +358,7 @@ point value that is constrained only upon assignment.
fractional_lit = { dec_digit } ( dec_digit "." | "." dec_digit ) fractional_lit = { dec_digit } ( dec_digit "." | "." dec_digit )
{ dec_digit } [ exponent ] . { dec_digit } [ exponent ] .
exponential_lit = dec_digit { dec_digit } exponent . exponential_lit = dec_digit { dec_digit } exponent .
exponent = ( "e" | "E" ) [ sign ] dec_digit { dec_digit } exponent = ( "e" | "E" ) [ sign ] dec_digit { dec_digit } .
07 07
0xFF 0xFF
...@@ -373,15 +375,15 @@ Strings behave like arrays of bytes, with the following properties: ...@@ -373,15 +375,15 @@ Strings behave like arrays of bytes, with the following properties:
contents of a string. contents of a string.
- No internal pointers: it is illegal to create a pointer to an inner - No internal pointers: it is illegal to create a pointer to an inner
element of a string. element of a string.
- They can be indexed: given string s1, s1[i] is a byte value. - They can be indexed: given string "s1", "s1[i]" is a byte value.
- They can be concatenated: given strings s1 and s2, s1 + s2 is a value - They can be concatenated: given strings "s1" and "s2", "s1 + s2" is a value
combining the elements of s1 and s2 in sequence. combining the elements of "s1" and "s2" in sequence.
- Known length: the length of a string s1 can be obtained by the function/ - Known length: the length of a string "s1" can be obtained by the function/
operator len(s1). The length of a string is the number of bytes within. operator "len(s1)". The length of a string is the number of bytes within.
Unlike in C, there is no terminal NUL byte. Unlike in C, there is no terminal NUL byte.
- Creation 1: a string can be created from an integer value by a conversion; - Creation 1: a string can be created from an integer value by a conversion;
the result is a string containing the UTF-8 encoding of that code point. the result is a string containing the UTF-8 encoding of that code point.
string('x') yields "x"; string(0x1234) yields the equivalent of "\u1234" "string('x')" yields "x"; "string(0x1234)" yields the equivalent of "\u1234"
- Creation 2: a string can by created from an array of integer values (maybe - Creation 2: a string can by created from an array of integer values (maybe
just array of bytes) by a conversion just array of bytes) by a conversion
a [3]byte; a[0] = 'a'; a[1] = 'b'; a[2] = 'c'; string(a) == "abc"; a [3]byte; a[0] = 'a'; a[1] = 'b'; a[2] = 'c'; string(a) == "abc";
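Note on this hunk: the string properties listed above (indexing, concatenation, len, and the two conversions) can be tried out in present-day Go. This is only a sketch. It uses modern syntax rather than the 2008 draft's, and the helper names stringBasics, fromCodePoint and fromBytes are made up for illustration.

```go
package main

import "fmt"

// stringBasics returns the first byte of s1, the byte length of s1,
// and the concatenation of s1 and s2.
func stringBasics(s1, s2 string) (first byte, n int, joined string) {
	return s1[0], len(s1), s1 + s2
}

// fromCodePoint converts an integer code point to a string holding
// its UTF-8 encoding ("Creation 1" in the draft).
func fromCodePoint(r rune) string {
	return string(r)
}

// fromBytes converts an array of bytes to a string ("Creation 2").
// Modern Go converts from a slice, hence a[:].
func fromBytes(a [3]byte) string {
	return string(a[:])
}

func main() {
	first, n, joined := stringBasics("héllo", " world")
	// len counts bytes, not characters: "é" occupies two bytes in UTF-8.
	fmt.Println(first, n, joined)
	fmt.Println(fromCodePoint('x'), len(fromCodePoint(0x1234)))
	fmt.Println(fromBytes([3]byte{'a', 'b', 'c'}))
}
```

The length of "héllo" comes out as 6 rather than 5, which matches the draft's statement that the length of a string is the number of bytes within.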
...@@ -390,38 +392,36 @@ Strings behave like arrays of bytes, with the following properties: ...@@ -390,38 +392,36 @@ Strings behave like arrays of bytes, with the following properties:
Character and string literals Character and string literals
---- ----
Character and string literals are almost the same as in C, but with Character and string literals are almost the same as in C, with the
UTF-8 required. This section is precise but can be skipped on first following differences:
reading.
Character and string literals are similar to C except: - The encoding is UTF-8
- Octal character escapes are always 3 digits (\077 not \77)
- Hexadecimal character escapes are always 2 digits (\x07 not \x7)
- Strings are UTF-8 and represent Unicode
- `` strings exist; they do not interpret backslashes - `` strings exist; they do not interpret backslashes
- Octal character escapes are always 3 digits ("\077" not "\77")
The rules are: - Hexadecimal character escapes are always 2 digits ("\x07" not "\x7")
char_lit = "'" ( utf8_char_no_single_quote | "\" esc_seq ) "'" . This section is precise but can be skipped on first reading. The rules are:
esc_seq = char_lit = "'" ( unicode_value | byte_value ) "'" .
"a" | "b" | "f" | "n" | "r" | "t" | "v" | "\" | "'" | "\"" | unicode_value = utf8_char | little_u_value | big_u_value | escaped_char .
oct_digit oct_digit oct_digit | byte_value = octal_byte_value | hex_byte_value .
"x" hex_digit hex_digit | octal_byte_value = "\" oct_digit oct_digit oct_digit .
"u" hex_digit hex_digit hex_digit hex_digit | hex_byte_value = "\" "x" hex_digit hex_digit .
"U" hex_digit hex_digit hex_digit hex_digit little_u_value = "\" "u" hex_digit hex_digit hex_digit hex_digit .
big_u_value = "\" "U" hex_digit hex_digit hex_digit hex_digit
hex_digit hex_digit hex_digit hex_digit . hex_digit hex_digit hex_digit hex_digit .
escaped_char = "\" ( "a" | "b" | "f" | "n" | "r" | "t" | "v" | "\" | "'" | """ ) .
A unicode_value takes one of four forms: A unicode_value takes one of four forms:
* The UTF-8 encoding of a Unicode code point. Since Go source * The UTF-8 encoding of a Unicode code point. Since Go source
text is in UTF-8, this is the obvious translation from input text is in UTF-8, this is the obvious translation from input
text into Unicode characters. text into Unicode characters.
* The usual list of C backslash escapes: \n \t etc. * The usual list of C backslash escapes: "\n", "\t", etc.
* A `little u' value, such as \u12AB. This represents the Unicode * A `little u' value, such as "\u12AB". This represents the Unicode
code point with the corresponding hexadecimal value. It always code point with the corresponding hexadecimal value. It always
has exactly 4 hexadecimal digits. has exactly 4 hexadecimal digits.
* A `big U' value, such as \U00101234. This represents the * A `big U' value, such as "\U00101234". This represents the
Unicode code point with the corresponding hexadecimal value. Unicode code point with the corresponding hexadecimal value.
It always has exactly 8 hexadecimal digits. It always has exactly 8 hexadecimal digits.
...@@ -457,8 +457,8 @@ Double-quoted strings have the usual properties; back-quoted strings ...@@ -457,8 +457,8 @@ Double-quoted strings have the usual properties; back-quoted strings
do not interpret backslashes at all. do not interpret backslashes at all.
string_lit = raw_string_lit | interpreted_string_lit . string_lit = raw_string_lit | interpreted_string_lit .
raw_string_lit = "`" { utf8_char_no_back_quote } "`" . raw_string_lit = "`" { utf8_char } "`" .
interpreted_string_lit = "\"" { utf8_char_no_double_quote | "\\" esc_seq } "\"" . interpreted_string_lit = """ { unicode_value | byte_value } """ .
A string literal has type 'string'. Its value is constructed by A string literal has type 'string'. Its value is constructed by
taking the byte values formed by the successive elements of the taking the byte values formed by the successive elements of the
...@@ -769,7 +769,7 @@ a method indicates the type of the struct by declaring a receiver of type ...@@ -769,7 +769,7 @@ a method indicates the type of the struct by declaring a receiver of type
the declaration the declaration
func (p *Point) distance(float scale) float { func (p *Point) distance(scale float) float {
return scale * (p.x*p.x + p.y*p.y); return scale * (p.x*p.x + p.y*p.y);
} }
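Note on this hunk: the corrected parameter order (name before type) is the form that compiles today. A minimal sketch in modern Go follows. It assumes a Point struct with fields x and y, which this hunk does not show, and uses float64 in place of the draft's float type.

```go
package main

import "fmt"

// Point is assumed here; the hunk shows only the method.
type Point struct {
	x, y float64
}

// distance mirrors the method in the diff: it scales the squared
// distance from the origin (no square root is taken, as in the draft).
func (p *Point) distance(scale float64) float64 {
	return scale * (p.x*p.x + p.y*p.y)
}

func main() {
	p := &Point{3, 4}
	fmt.Println(p.distance(2)) // 2 * (9 + 16) = 50
}
```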
...@@ -1732,5 +1732,5 @@ TODO ...@@ -1732,5 +1732,5 @@ TODO
- TODO: type switch? - TODO: type switch?
- TODO: words about slices - TODO: words about slices
- TODO: what is nil? do we type-test by a nil conversion or something else? - TODO: I (gri) would like to say that sizeof(int) == sizeof(pointer), always.