NameDateSize

..19-Oct-202032

Allocation.hsH A D29-Jul-20209.1 KiB

CodeGen.hsH A D19-Oct-202030.1 KiB

Core.hsH A D02-Oct-20203.6 KiB

Desugar.hsH A D28-Apr-20206.3 KiB

PrettyPrint.hsH A D28-Apr-2020148

README.mdH A D28-Apr-20206.2 KiB

Surface.hsH A D28-Apr-20202.4 KiB

TypeCheck.hsH A D02-Oct-202013.6 KiB

Util.hsH A D19-Oct-20201.3 KiB

README.md

1# Dargent Grammar
2
3This is the dargent layout grammar (as it currently is)
4
5```
6; General categories of tokens, not specified in this grammar
7<nat>
8<var-name>
9<layout-name>
10<variant-case-name>
11
12; Dargent grammar
13
14<layout-size> ::= <nat> ("b" / "B")
15
16<layout-size-expr> ::= *[ <layout-size> "+" ] <layout-size>
17
18<layout-expr> ::= <layout-expr-types> [ "at" <layout-size-expr> ]
19
20<layout-expr-types> ::= "record"  "{" [ *[ <record-expr>  "," ] <record-expr>  ] "}"
21                      | "variant" "(" <layout-expr> ")" "{" [ *[ <variant-expr> "," ] <variant-expr> ] "}"
22                      | <layout-name>
23                      | <layout-size-expr>
24
25<record-expr> ::= <var-name> ":" <layout-expr>
26
27<variant-expr> ::= <variant-case-name> "(" <nat> ")" ":" <layout-expr>
28```
29Note that `b` specifies location in bits, and `B` in bytes.
30
31Layouts are attached to types (currently, they can only be attached to records) by declaring the type, then writing the keyword `layout` and then a description of the layout in the above grammar.
32
33For example, a record which stores its fields in reverse order is specified as follows.
34```
35type Foo = { a : U32 , b : U64 } layout record { a : 4B at 8B , b : 8B at 0B }
36```
37
38
39# In Progress
40* Add support for attaching layouts to types to the surface syntax, and then add lots of tests to the compiler to test custom layouts
41
42# Todo
43
44## Important
45* Change the Eq instance of DataLayout to ignore SourcePos information
46  * Maybe should use the trees that grow approach to add this information?
47
48## Other
49* Fix antiquoted-C test pass_static_array -
50  * Need to make a way of specifying which additional getters/setters should be generated for the antiquoted C. Currently, there is no need for the getter in the Cogent code, so it isn't generated. But it is needed in the C code to replace the code `args.p2->f` with `d?_get_f(args.p2)`.
51* Think about whether we need to add syntax into the language to allow explicit specification of the size of a layout
52* Refactor the code generation code into several largely independent files with explicit imports (to improve readability of code)
53  * Particularly the code which does type generation
54  * The StrlType thing doesn't really make sense - I feel like custom equality on normal Cogent types which ignores whether fields are/aren't taken would suffice (there should be an equivalence relation on cogent types such that each equivalence class gets mapped to a unique C type)
55* Check what kind of machine code is being generated
56  * Make sure to try non-default layouts and all kinds of types embedded in the boxed record
57* Add a compiler flag to set the variable which decides the compilation architecture
58  * In future this could be decided automatically based on the architecture that gcc is compiling for
59    * Partha said could use `gcc -dumpmachine` that shows the architecture that the compiler is built for
60    * use `#if` in Cogent code and   `$escstm:()` in the antiquoted C code.
61  * Also figure out how to switch on the architecture in a `#IF ... #ELSE ... #ENDIF` C preprocessor statement, so we can have different layouts for different architectures
62  * Potentially add Cogent surface syntax for switching on architecture/architecture dependent offsets - to be desugared to the existing core layout structures.
63* More testing
64  * Create more antiquoted-C tests
65    * Some which actually print out all the bits of the data structure after setting to check that the correct bits are being set (the existing tests only check that setting and then getting gets what was set)
66    * Some which check that the *default layouts* actually match up with the old layouts for boxed records
67    * Try a large scale test (like BilbyFS, except that won't work as BilbyFS has unboxed abstract types in boxed records)
68* Implement embedded  `TProduct _ _` (the type of `Tuple e1 e2` expressions, ie. pairs)
69* Core type checker for data layouts after monomorphisation is only partially implemented, must be finished and integrated with rest of compiler
70* Figure out how to handle endianness
71  * Should be possible to expand the dargent syntax to allow decomposing the layout of primitive types into an array of bitranges, each of which doesn't have to be contiguous in memory (like in the core syntax after splitting into aligned bit ranges in the CodeGen stage).
72  * Opposite endianess would be a layout like `[1B at 3B, 1B at 2B, 1B at 1B, 1B]` for `U32`.
73* Add fixed size arrays to types allowed to be included in boxed records (using while loops in the code generation) - this may be non-trivial
74
75
76# Proposed new surface Syntax
77`{ fi : Ti }` - boxed record with default layout
78`#{ fi : Ti }` - unboxed record
79`{ fi : Ti } with layout { fi : Li }` - boxed record with custom layout
80
81* Backwards compatible
82* Need to add the `$` type operator to the parser
83* Layout syntax has already been added to the parser, but I have some proposals for changes before it is made public:
84
85* `record { fi : Li }` should become `{fi : Li}`
86* `variant (Ltag) { Ai(vi) Li }` should become
87```
88Ltag
89| v1 -> A1 L1
90| v2 -> A2 L2
91```
92
93eg.
94
95```
96< Success U32 | Fail () > $
97(1b at 4B
98  | 0 -> Success (4B at 0B)
99  | 1 -> Fail 0B)
100```
101
102# 2. Layout polymorphism
103`f: forall l. { a : U8 } $ l -> Bool`
104* Add syntax for layout variables
105* Add syntax for insantiating layout variables - layout application.
106* `f { a : 8b at 8b } 0`
107
108Need predicate that a layout and type match!
109
110# 3. Layout inference
111`{ fi : Ti }` now is a shorthand for `forall l. { fi : Ti } $ l` everywhere except in C FFI declarations, in which case it is a shorthand for the boxed type with the default layout.
112
113This is good because existing functions, such as
114`f : { a : U8 } -> Bool` will still work in the existing code as is, because the implicit layout variable will be inferred to be the default layout by layout inference. But, we additionally gain the ability to call the function `f` on boxed values `{ a: U8}` with non-default layouts.
115
116The only way we get boxed records into cogent is through the C FFI, and so we should restrict the layout polymorphism allowed on the C FFI??? Because the layout has to be monomorphised eventually.
117
118# Nested pattern matching:
119* How does put/take work - eg. record in record. take a field of the record in the record. Can I take the record itself?
120* Extend syntax
121* Improves performance
122
123