ref: 8df4cede33ff87c7cc7ecdeba5d0edb47a55b8a3
parent: 0893bfe6bde2b9498a32354894b003adc13a9ea9
author: Ori Bernstein <ori@eigenstate.org>
date: Sun Jan 15 16:25:51 EST 2017
Move type inference up.
--- a/doc/lang.txt
+++ b/doc/lang.txt
@@ -13,10 +13,10 @@
3.1. Summary
4. SYNTAX
4.1. Declarations
- 4.2. Literal Values
- 4.3. Control Constructs and Blocks
- 4.4. Expressions
- 4.5. Data Types
+ 4.2. Data Types
+ 4.3. Literal Values
+ 4.4. Control Constructs and Blocks
+ 4.5. Expressions
4.6. Type Inference
4.7. Generics
4.8. Traits
@@ -48,16 +48,31 @@
Syntax is defined using an informal variant of EBNF.
token: /regex/ | "quoted" | <english description>
- prod: prodname ":" [ expr ]
+ prod: prodname ":" expr*
expr: alt ( "|" alt )*
alt: term term*
- term: prodname | token | group | opt | rep
+ term: prod | token | group | opt | rep
group: "(" expr ")" .
opt: "[" expr "]" .
rep: zerorep | onerep
- zerorep: expr "*"
- onerep: expr "+"
+ zerorep: term "*"
+ onerep: term "+"
+ Whitespace and comments are omitted in this description.
+
+ To put it in words, /regex/ defines a regular expression that would
+ match a single token in the input. "quoted" would match a single
+ string. <english description> contains an informal description of what
+ characters would match.
+
+ Productions are defined by any number of expressions, in which
+ expressions are '|' separated sequences of terms.
+
+ Terms are productions or tokens, and may come with a repeat
+ specifier. Wrapping a term in "[]" denotes that the term is repeated
+ 0 or 1 times. Suffixing it with a '*' denotes 0 or more repetitions,
+ and '+' denotes 1 or more repetitions.
+
2.2. As-If Rule:
Anything specified here may be treated however the compiler wishes,
@@ -186,10 +201,150 @@
-> a + b + c
}
- 4.2. Literal Values
- 4.2.1. Atomic Literals:
+ 4.2. Data Types:
+
+ The language defines a number of built in primitive types. These
+ are not keywords, and in fact live in a separate namespace from
+ the variable names. Yes, this does mean that you could, if you want,
+ define a variable named 'int'.
+
+ There are no implicit conversions within the language. All types
+ must be explicitly cast if you want to convert, and the casts must
+ be of compatible types, as will be described later.
+
+ 4.2.1. Primitive types:
+
+ void
+ bool char
+ int8 uint8
+ int16 uint16
+ int32 uint32
+ int64 uint64
+ int uint
+ long ulong
+ float32 float64
+
+ These types are as you would expect. 'void' represents a
+ lack of type, although for the sake of genericity, you can
+ assign between void types, return values of void, and so on.
+ This allows generics to not have to somehow work around void
+ being a toxic type. The void value is named `void`.
+
+ It is interesting to note that these types are not keywords,
+ but are instead merely predefined identifiers in the type
+ namespace.
+
+ bool is a type that can only hold true and false. It can be
+ assigned, tested for equality, and used in the various boolean
+ operators.
+
+ char is a 32 bit integer type, and is guaranteed to be able
+ to hold exactly one codepoint. It can be assigned integer
+ literals, tested against, compared, and used in all the
+ other usual numeric operations.
+
+ The various [u]intXX types hold, as expected, signed and
+ unsigned integers of the named sizes respectively.
+ Similarly, floats hold floating point values with the
+ indicated precision.
+
+ var x : int declare x as an int
+ var y : float32 declare y as a 32 bit float
+
+
+ 4.2.2. Composite types:
+
+ pointer
+ slice array
+
+ Pointers are, as expected, values that hold the address
+ of the pointed-to value. They are declared by
+ appending a '#' to the base type. Pointer arithmetic is
+ not allowed.
+
+ Arrays are a group of N values, where N is part of the type.
+ Arrays of different sizes are incompatible. Arrays in
+ Myrddin, unlike many other languages, are passed by value.
+ They are declared by appending a '[SIZE]' to the base type.
+
+ Slices are similar to arrays in many contemporary languages.
+ They are reference types that store the length of their
+ contents. They are declared by appending a '[,]' to the base
+ type.
+
+ foo# type: pointer to foo
+ foo[123] type: array of 123 foo
+ foo[,] type: slice of foo
+
+ 4.2.3. Aggregate types:
+
+ tuple struct
+ union
+
+ Tuples are the traditional product type. They are declared
+ by putting the comma separated list of types within square
+ brackets.
+
+ Structs are aggregations of types with named members. They
+ are declared by putting the word 'struct' before a block of
+ declaration cores (ie, declarations without the storage type
+ specifier).
+
+ Unions are the traditional sum type. They consist of a tag
+ (a keyword prefixed with a '`' (backtick)) indicating their
+ current contents, and a type to hold. They are declared by
+ placing the keyword 'union' before a list of tag-type pairs.
+ They may also omit the type, in which case, the tag is
+ sufficient to determine which option was selected.
+
+ [int, int, char] a tuple of 2 ints and a char
+
+ struct a struct containing an int named
+ a : int 'a', and a char named 'b'.
+ b : char
+ ;;
+
+ union a union containing one of
+ `Thing int int or float32. The values are not
+ `Other float32 named, but they are tagged.
+ ;;
+
+
+ 4.2.4. Magic types:
+
+ tyvar typaram
+ tyname
+
+ A tyname is a named type, similar to a typedef in C, however
+ it genuinely creates a new type, and not an alias. There are
+ no implicit conversions, but a tyname will inherit all
+ constraints of its underlying type.
+
+ A typaram is a parametric type. It is used in generics as
+ a placeholder for a type that will be substituted in later.
+ It is an identifier prefixed with '@'. These are only valid
+ within generic contexts, and may not appear elsewhere.
+
+ A tyvar is an internal implementation detail that currently
+ leaks out in error messages during type inference, and is a
+ major cause of confusing error messages. It should not be in
+ this manual, except that the current incarnation of the
+ compiler will make you aware of it. It looks like '@$type',
+ and is a variable that holds an incompletely inferred type.
+
+ type mine = int creates a tyname named
+ 'mine', equivalent to int.
+
+
+ @foo creates a type parameter
+ named '@foo'.
+
+ 4.3. Literal Values
+
+ 4.3.1. Atomic Literals:
+
literal: strlit | chrlit | floatlit |
boollit | voidlit | intlit |
funclit | seqlit | tuplit
@@ -264,7 +419,7 @@
eg: true, false
- 4.2.2. Sequence and Tuple Literals:
+ 4.3.2. Sequence and Tuple Literals:
seqlit: "[" structelts | arrayelts "]"
tuplit: "(" tuplelts ")"
@@ -313,7 +468,7 @@
(1,), (1,'b',"three")
- 4.2.3. Function Literals:
+ 4.3.3. Function Literals:
funclit: "{" arglist "\n" blockbody "}"
arglist: (ident [":" type])*
@@ -354,7 +509,7 @@
}
- 4.2.4: Labels:
+ 4.3.4. Labels:
label: ":" ident
goto: "goto" ident
@@ -371,7 +526,7 @@
the ':' is not part of the label name.
- 4.3. Blocks:
+ 4.4. Blocks:
block: blockbody ";;"
blockbody: (decl | stmt | tydef | "\n")*
@@ -389,7 +544,7 @@
limited to within the block, and any attempts to access the associated
storage (via pointer, for example) is not valid.
- 4.3. Control Constructs:
+ 4.5. Control Constructs:
ifstmt: "if" cond "\n" blockbody
("elif" blockbody)*
@@ -513,7 +668,7 @@
;;
- 4.4. Expressions:
+ 4.6. Expressions:
Myrddin expressions are relatively similar to expressions in C. The
operators are listed below in order of precedence, and a short
@@ -609,147 +764,9 @@
on overflow. Right shift expressions fill with the sign bit on
signed types, and fill with zeros on unsigned types.
- 4.5. Data Types:
- The language defines a number of built in primitive types. These
- are not keywords, and in fact live in a separate namespace from
- the variable names. Yes, this does mean that you could, if you want,
- define a variable named 'int'.
+ 4.7. Type Inference:
- There are no implicit conversions within the language. All types
- must be explicitly cast if you want to convert, and the casts must
- be of compatible types, as will be described later.
-
- 4.5.1. Primitive types:
-
- void
- bool char
- int8 uint8
- int16 uint16
- int32 uint32
- int64 uint64
- int uint
- long ulong
- float32 float64
-
- These types are as you would expect. 'void' represents a
- lack of type, although for the sake of genericity, you can
- assign between void types, return values of void, and so on.
- This allows generics to not have to somehow work around void
- being a toxic type. The void value is named `void`.
-
- It is interesting to note that these types are not keywords,
- but are instead merely predefined identifiers in the type
- namespace.
-
- bool is a type that can only hold true and false. It can be
- assigned, tested for equality, and used in the various boolean
- operators.
-
- char is a 32 bit integer type, and is guaranteed to be able
- to hold exactly one codepoint. It can be assigned integer
- literals, tested against, compared, and all the other usual
- numeric types.
-
- The various [u]intXX types hold, as expected, signed and
- unsigned integers of the named sizes respectively.
- Similarly, floats hold floating point types with the
- indicated precision.
-
- var x : int declare x as an int
- var y : float32 declare y as a 32 bit float
-
-
- 4.5.2. Composite types:
-
- pointer
- slice array
-
- Pointers are, as expected, values that hold the address of
- the pointed to value. They are declared by appending a '#'
- to the type. Pointer arithmetic is not allowed. They are
- declared by appending a '#' to the base type
-
- Arrays are a group of N values, where N is part of the type.
- Arrays of different sizes are incompatible. Arrays in
- Myrddin, unlike many other languages, are passed by value.
- They are declared by appending a '[SIZE]' to the base type.
-
- Slices are similar to arrays in many contemporary languages.
- They are reference types that store the length of their
- contents. They are declared by appending a '[,]' to the base
- type.
-
- foo# type: pointer to foo
- foo[123] type: array of 123 foo
- foo[,] type: slice of foo
-
- 4.5.3. Aggregate types:
-
- tuple struct
- union
-
- Tuples are the traditional product type. They are declared
- by putting the comma separated list of types within square
- brackets.
-
- Structs are aggregations of types with named members. They
- are declared by putting the word 'struct' before a block of
- declaration cores (ie, declarations without the storage type
- specifier).
-
- Unions are the traditional sum type. They consist of a tag
- (a keyword prefixed with a '`' (backtick)) indicating their
- current contents, and a type to hold. They are declared by
- placing the keyword 'union' before a list of tag-type pairs.
- They may also omit the type, in which case, the tag is
- sufficient to determine which option was selected.
-
- [int, int, char] a tuple of 2 ints and a char
-
- struct a struct containing an int named
- a : int 'a', and a char named 'b'.
- b : char
- ;;
-
- union a union containing one of
- `Thing int int or char. The values are not
- `Other float32 named, but they are tagged.
- ;;
-
-
- 4.5.4. Magic types:
-
- tyvar typaram
- tyname
-
- A tyname is a named type, similar to a typedef in C, however
- it genuinely creates a new type, and not an alias. There are
- no implicit conversions, but a tyname will inherit all
- constraints of its underlying type.
-
- A typaram is a parametric type. It is used in generics as
- a placeholder for a type that will be substituted in later.
- It is an identifier prefixed with '@'. These are only valid
- within generic contexts, and may not appear elsewhere.
-
- A tyvar is an internal implementation detail that currently
- leaks in error messages out during type inference, and is a
- major cause of confusing error messages. It should not be in
- this manual, except that the current incarnation of the
- compiler will make you aware of it. It looks like '@$type',
- and is a variable that holds an incompletely inferred type.
-
- type mine = int creates a tyname named
- 'mine', equivalent to int.
-
-
- @foo creates a type parameter
- named '@foo'.
-
-
- 4.6. Type Inference:
-
The myrddin type system is a system similar to the Hindley Milner
system, however, types are not implicitly generalized. Instead, type
schemes (type parameters, in Myrddin lingo) must be explicitly provided
@@ -833,7 +850,7 @@
< <= > >=
- 4.7. Packages and Uses:
+ 4.8. Packages and Uses:
pkg use