diff --git a/docs/types/infinite_types.md b/docs/types/infinite_types.md
new file mode 100644
index 0000000000000000000000000000000000000000..0c4da9f55c4da91f1f9fa0ede0a180fa847202cf
--- /dev/null
+++ b/docs/types/infinite_types.md
@@ -0,0 +1,281 @@
+# Dealing with "Infinite Types" in a Value-Based Language
+
+## The Problem
+
+In many languages it is illegal to construct a `struct` with members that, when expanded, end up having a field that refers to the struct itself by value. This makes sense for two reasons. Firstly, one could never actually construct such a `struct`: writing out a literal for it would take an infinite amount of typing. Secondly, even if the struct could be constructed, it would take up an infinite amount of memory. As a practical example: a binary tree can conceptually grow to an infinite size, and such an infinite tree cannot be constructed.
+
+And so, in low-level programming languages one replaces these members with pointers (or something equivalent) in order to deal with both points. The first point no longer holds, because the pointer can be `null` (in the example: this would terminate a node in the tree, making it a leaf of the tree). And secondly the size is no longer infinite, and is calculable, because a pointer always has a fixed size.
+
+In Reowolf we have value-based semantics. To have reasonable cache coherency, quick `struct`/`union` member access, and to prevent memory (de)allocation overhead, we want to put as many values as possible onto a stack-like construct. For this reason we would like to precalculate the alignment and offset of each type and its members.
+
+Point one above still stands: if one wishes to construct a type that never terminates, then the compiler should throw an error.
+An example of which would be:
+
+```pdl
+// Simple truly infinite type
+struct Infinite {
+    Infinite member,
+}
+
+// Infinite type of cycle length 2
+struct Left { Right other }
+struct Right { Left other }
+
+// Infinite type of cycle length 3
+struct A { B b }
+struct B { C c }
+struct C { A a }
+
+// Etcetera
+
+// But we can also do this using unions
+union A { B(B), C(C) }
+union B { A(A), C(C) }
+union C { A(A), B(B) }
+```
+
+If one wishes to express a type whose value causes it to terminate somewhere, the only option in this language is to use `union`s to indicate the optionality of a certain member. For example:
+
+```pdl
+// One option, allows setting branches individually
+union Option<T> { Some(T), None }
+struct U32TreeExample {
+    u32 value,
+    Option<U32TreeExample> left,
+    Option<U32TreeExample> right,
+}
+
+// Another option, must be a well-formed binary tree
+union U32TreeLink {
+    Leaf,
+    Node(U32TreeLink, U32TreeLink)
+}
+struct U32TreeNode {
+    u32 value,
+    U32TreeLink link,
+}
+
+// Another kind of thing
+union List<T> {
+    End,
+    Entry(T, List<T>)
+}
+```
+
+These are all valid types, and we should be able to express them, but naively calculating their size causes one to run into issues.
+
+## Onwards to an Algorithm
+
+So far one can see that `struct`s and `union`s are the elements that might constitute an infinite type. All of the other types (user-defined `enum`s and all of the builtins like `u32`, `string`, `s64`, etc.) do not have any members and have a fixed size. Whenever there is a union in the type, we might have a type that is not truly infinite, but may terminate somewhere.
+
+Imagine a graph containing all user-defined `struct` and `union` types. The edges in this graph are directional, and indicate that one type is embedded in the other. It is important to construct this graph on a per-member basis. A `struct` member may have one edge running towards it, coming from the member's type.
+A `union` member can have multiple edges running towards it, one for each type embedded in that variant. Note that types may refer to themselves.
+
+Having constructed this graph, one can visualise the potentially infinite types by loops in the graph. A struct is potentially infinite if one of its members is part of a loop. A union is potentially infinite if all of its members are part of a loop. A potentially infinite type is a truly infinite type if the relevant loops all contain potentially infinite types. Or, conversely: a potentially infinite type is not actually infinite if the relevant loops contain a union that is not potentially infinite.
+
+More practically: a potentially infinite struct (implying that at least one of its members is part of a type loop) is not infinite if each of its "looped members" is part of a loop which contains a non-infinite union. Likewise, a potentially infinite union (implying that all of its members are part of a loop) is not actually infinite if, for one of its members that contains a set of loops (because a union can embed multiple types per variant), each of those loops contains a non-infinite union.
+
+And finally, in human terms: a type is not actually infinite if, given enough time, a programmer can express a literal of that type. So one may type (using the examples above):
+
+```pdl
+auto thing = List::Entry(3, List::Entry(2, List::Entry(1, List::End)));
+auto flong = U32TreeExample{
+    value: 0,
+    left: Option::Some(U32TreeExample{
+        value: 1,
+        left: Option::None,
+        right: Option::None
+    }),
+    right: Option::None
+};
+```
+
+But one simply cannot express:
+
+```pdl
+auto cannot_construct = A::B(B::C(C::A(A::C(C::B( /* onwards, to infinity */ )))));
+```
+
+Now we arrive at several conditions to end up at a concrete algorithm. Firstly, types whose literals are expressible should be constructible in the language. Secondly, for the performance reasons mentioned above we would like to specify the memory layout for each type.
+So that includes the size and alignment of the type itself, and the offset (with implicit alignment and size) of each of its members. Thirdly, because we're going to have to put pointers somewhere in the aforementioned type loops (necessarily so: we have variably sized values, so we need to perform some kind of allocation somewhere, hence we need pointers somewhere), we need these pointers to be placed consistently throughout the code. This last point is a design decision: we could decide to have some types include pointers to members, and some types exclude pointers to members. But that would require special conversion functions for when two values of the same type, but with a different memory layout, are converted into one another. This comes on top of the fact that each of the types containing a pointer will require a constructor and a destructor. Finally, although this point looks ahead a bit at the implementation, we desire that subsequent compilations of the same files, but potentially in a different order, result in the same layout for each of the types.
+
+We may observe that only unions may break up potentially infinite types. So for each loop that contains a non-infinite union we need at least one union that is implemented using some kind of pointer. There are multiple ways we can fix the byte size of the union, thereby breaking the type loop:
+
+1. Make each instance of the union a pointer to the data in that union: both the tag and the potentially embedded values. This has the upside of being relatively simple to implement. Furthermore the union itself will just have a fixed size. The downside is that for a tag lookup we have to follow the pointer, generally causing a cache miss in the case that we just want to check the tag.
+
+2. Keep the tag in the union value, but dynamically allocate all of the contents. This way the size of a union value becomes a tag and a pointer.
+   On 32-bit systems this is most likely fine; on 64-bit systems (which are the norm, if we're not talking about embedded computers) alignment will generally cause the union size to bloat to 16 bytes. This kind of implementation has two subcases to consider.
+
+   - Allocate as much memory as the largest union variant requires. This way changing the union variant will not force a reallocation; instead we can reuse the allocated memory.
+
+   - Allocate as much memory as the current variant requires. This way we will not waste any memory, at the cost of having to reallocate when the variant is changed.
+
+3. Keep the tag in the union value, with the same up- and downsides as the previous point. Then dynamically allocate the members of the variants (that is, if we have `union Foo { Bar(CyclicType, u64, u64), Qux }`, then we will just make `CyclicType` a pointer). The obvious downside is that changing variants might cause (de)allocations. And if we have multiple members that cause a loop then we need to allocate each of them. The upside is that when we're only accessing the non-pointer members, they're likely already in the cache.
+
+There are many more tiny variants here that are not explicitly stated. I'll document my reasons for choosing the variant I think works best:
+
+- The whole idea of breaking the type cycles is to not allocate an infinite amount of memory, so always allocating the union is naturally out of the question. Furthermore, type loops will generally only occur in particular datastructures that are potentially infinite: trees, linked lists, hashmaps, etc.
+
+- The contract when using a union is that the byte size of a union value is the size of the largest member plus the size of the tag (plus padding). So allocating the union just once, with enough size to fit all possible allocated variants, seems like a good idea to me. I suspect that in most cases these unions will express something along the lines of: there is something, or there is nothing.
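The 16-byte figure mentioned under option 2 follows from simple alignment arithmetic. A minimal sketch in Rust (not pdl, and not the compiler's actual code; the function names are mine):

```rust
/// Round `offset` up to the next multiple of `align` (`align` must be a
/// power of two, as alignments always are).
fn align_up(offset: usize, align: usize) -> usize {
    (offset + align - 1) & !(align - 1)
}

/// Stack footprint of a union value under the "inline tag, heap-allocated
/// contents" scheme of option 2: the pointer is placed after the tag at the
/// next pointer-aligned offset, and the total is padded to the union's
/// alignment (which is the pointer's alignment here).
fn tag_and_pointer_size(tag_size: usize, ptr_size: usize, ptr_align: usize) -> usize {
    let ptr_offset = align_up(tag_size, ptr_align);
    align_up(ptr_offset + ptr_size, ptr_align)
}
```

On a 64-bit machine a 4-byte tag thus bloats the union to `tag_and_pointer_size(4, 8, 8) == 16` bytes, while on a 32-bit machine the same union fits in `tag_and_pointer_size(4, 4, 4) == 8` bytes.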
+
+Finally, as a note: all of this type-loop checking has to be performed per monomorphization. A polymorphic type specification itself does not need to be checked for type loops.
+
+We also have to consider the requirement that multiple compilations, where the AST is constructed in a different order, and hence where the types are "discovered" in a different order, result in the same types. As a rather simple solution, instead of marking just one union in each type loop as pointerlike, all of the unions in the type loops will be implemented using some kind of pointer scheme.
+
+## Concluding on the Algorithm
+
+As an initial conclusion:
+
+- All non-infinite unions in type loops are implemented using dynamic allocation, to break the type loops and to prevent potentially infinite types from being truly infinite.
+- The tag of the union will be a non-allocated value associated with the union. Considering that most of the time these kinds of unions will represent an option-like type (there is one more entry in the linked list, there is one more node in the tree, etc.), I think it is nice to not have cache misses when checking whether a value is present or absent.
+- Considering that we cannot simply always allocate (because then we would still end up with an infinite amount of allocated memory), but I do not currently want to implement very complicated logic in the compiler for handling memory allocations, I will initially implement the following scheme: all union variants that do not contain any types that are part of a type loop will be stored on the stack, contributing to the byte size of the union. All variants that do contain such types will be fully allocated on the heap in one allocation. The size of the allocated region will be the size of the largest variant that requires dynamic allocation.
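The two judgements the conclusion relies on, "can a literal of this type be expressed at all" and "which unions sit on a type loop and must become pointerlike", can be sketched end-to-end. The following is a heavily simplified Rust model, not the compiler's actual type table; `TypeDef`, `expressible_types` and `pointerlike_unions` are hypothetical names:

```rust
use std::collections::{HashMap, HashSet};

// Simplified stand-in for the type table: a struct embeds all of its members
// by value, a union embeds the types of each variant by value, and everything
// else (builtins, enums) is a Leaf with no embedded types.
#[derive(Clone)]
enum TypeDef {
    Leaf,
    Struct { members: Vec<String> },
    Union { variants: Vec<Vec<String>> },
}

fn embedded(def: &TypeDef) -> Vec<String> {
    match def {
        TypeDef::Leaf => Vec::new(),
        TypeDef::Struct { members } => members.clone(),
        TypeDef::Union { variants } => variants.iter().flatten().cloned().collect(),
    }
}

/// A type is expressible iff a finite literal can be written for it: leaves
/// always are, a struct is once all of its members are, and a union is once
/// at least one variant embeds only expressible types. Least fixpoint.
fn expressible_types(table: &HashMap<String, TypeDef>) -> HashSet<String> {
    let mut done: HashSet<String> = HashSet::new();
    let mut changed = true;
    while changed {
        changed = false;
        for (name, def) in table {
            if done.contains(name) { continue; }
            let ok = match def {
                TypeDef::Leaf => true,
                TypeDef::Struct { members } => members.iter().all(|m| done.contains(m)),
                TypeDef::Union { variants } =>
                    variants.iter().any(|v| v.iter().all(|m| done.contains(m))),
            };
            if ok {
                done.insert(name.clone());
                changed = true;
            }
        }
    }
    done
}

/// Every union that lies on a loop in the embedding graph; per the decision
/// above all of these become pointerlike, regardless of which single union
/// would have sufficed to break the loop.
fn pointerlike_unions(table: &HashMap<String, TypeDef>) -> HashSet<String> {
    // `from` reaches `target` if `target` is (transitively) embedded in it
    let reaches = |from: &String, target: &String| -> bool {
        let (mut seen, mut todo) = (HashSet::new(), vec![from.clone()]);
        while let Some(t) = todo.pop() {
            if &t == target { return true; }
            if seen.insert(t.clone()) {
                if let Some(def) = table.get(&t) { todo.extend(embedded(def)); }
            }
        }
        false
    };

    let mut result = HashSet::new();
    for (name, def) in table {
        if let TypeDef::Union { .. } = def {
            // a union is part of a type loop iff an embedded type reaches it
            if embedded(def).iter().any(|e| reaches(e, name)) {
                result.insert(name.clone());
            }
        }
    }
    result
}
```

A type then passes compilation iff it is expressible, and every union reported by `pointerlike_unions` receives the tag-plus-allocation layout described above.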
+
+## Rough Specification of the Algorithm
+
+I might be missing some details here; those will probably pop up while I'm implementing this:
+
+- Perform the basic pre-compilation for symbol discovery and definition discovery. This constructs the AST such that we can parse the AST to determine what fields are present on each `struct` and what variants are used in each `union`.
+
+- For each type added to the type table (including monomorphs where we have all of the polymorphic variables fully specified), we will traverse all of the members and their types. For each type we will check if it was already checked not to be cyclic. If it was already completely checked, then all types up until the most recently checked type cannot be cyclic through that member (another member might still make them cyclic). If the type was not completely checked, but was previously encountered in our search for loops, then all of the types from the first encounter of that type onwards are considered part of a type loop.
+
+- Note that this algorithm is recursive in the sense that for structs we need to check each member for type loops, and for unions we need to check each embedded type in each variant. So when we encounter a type loop for a specific set of members, we walk back and mark each of the unions along the way as requiring allocation. If there were no such unions then we immediately return an error. We keep all of the unions that require allocation around in a list.
+
+- While we're doing the above, if we encounter a union that has a member which is not part of a type loop, then we mark that union as non-infinite.
+
+- After all of the above is done, we need to check all of the unions we have encountered. Necessarily, if we started with a particular type and recursively checked all of the embedded types, then we will discover *all* type loops.
+  If none of the unions is non-infinite, then we throw an error (which might be kind of hard to phrase, because what kind of error do we show in the case of the `union A; union B; union C;` example?). With the algorithm above we now know that, if there are type loops, each of them contains at least one union. With the condition that there is at least one non-infinite union, we now know that the type is expressible. The constructed list of unions only contains unions that were part of a type loop.
+
+  With that knowledge, to guarantee consistent compilation, we decide to make each union in each type loop, whether it is infinite or non-infinite, a pointerlike union.
+
+- A pointerlike union's value will consist of a tag, plus reserved space (with the proper alignment) for the largest non-pointerlike union variant if that is larger than the size of a pointer on the particular machine. If it is smaller, then we just need a tag and a properly aligned pointer. So the union value will have a fixed size that does not depend on the types that were part of the type cycle. I'll call the variants that were not part of type cycles "value variants", and the variants that *were* part of type cycles "pointer variants".
+
+Then, during runtime, as an initial implementation (we could be smarter if we know the old value and the new value, but that would take a lot of static analysis, and we're not yet at that point in the compiler), we will check the tags to see whether the old and new variants are pointer variants. If the old and new variants are both pointer variants, or both value variants, then we do not (re)allocate. If we go from a value variant to a pointer variant then we allocate. If we go from a pointer variant to a value variant then we deallocate.
+
+## Some Examples of the Algorithm
+
+I wrote these before I started writing all of the above; I'll keep them here as some concrete examples of the algorithm.
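The runtime rule described above is a simple two-by-two decision table. A sketch in Rust (the enum and function names are mine, not the compiler's):

```rust
/// What the runtime has to do with a pointerlike union's heap allocation
/// when a new variant is written over an old one (the initial, conservative
/// scheme: no static analysis of old/new values).
#[derive(Debug, PartialEq)]
enum AllocAction {
    Keep,       // value -> value, or pointer -> pointer: reuse what is there
    Allocate,   // value -> pointer: the new variant needs heap storage
    Deallocate, // pointer -> value: the old heap storage is released
}

fn on_variant_change(old_is_pointer: bool, new_is_pointer: bool) -> AllocAction {
    match (old_is_pointer, new_is_pointer) {
        (false, true) => AllocAction::Allocate,
        (true, false) => AllocAction::Deallocate,
        _ => AllocAction::Keep,
    }
}
```

Reusing the allocation in the pointer-to-pointer case is valid precisely because the allocated region is always sized for the largest pointer variant.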
+
+### Truly Infinite Example with Structs
+
+```pdl
+// Most simple case
+struct SelfReferential { SelfReferential field }
+
+// Slightly more complicated case
+struct DualA { DualB b }
+struct DualB { DualA a }
+```
+
+In some sense we would be able to compute the sizes of these datastructures: we simply make their fields pointers. Let's assume 64-bit systems for all of these examples. For the first case we would then have that `SelfReferential` has a size of 8 bytes. However, we would always need to allocate its member `field`, so this would always take up an infinite amount of memory. We disallow this case because we would never be able to construct a literal of these kinds of datatypes.
+
+### Truly Infinite Example with Unions
+
+```pdl
+struct S { UOne one, UTwo two }
+union UOne { Struct(S), One(UOne), Two(UTwo) }
+union UTwo { One(UOne), Two(UTwo) }
+```
+
+Again, here we have a case where we can compute the sizes. We make the unions have pointerlike contents. So the unions will be 16 bytes (8 bytes for the tag plus padding for pointer alignment, then an 8-byte pointer). Hence the struct will be 32 bytes. But once more we can never construct a literal, and instantiating an element would always take up an infinite amount of memory.
+
+### Slightly Ambiguous Example
+
+With the following example we get one union which is non-infinite.
+
+```pdl
+struct SEnd { u32 some_value }
+struct SMiddle {
+    SEnd this_ends,
+    UOne union_one,
+    UTwo union_two,
+}
+union UOne { One(UOne), Two(UTwo), Struct(SMiddle) }
+union UTwo { One(UOne), Two(UTwo), End(u32) }
+```
+
+If we draw the type graph, we get something like:
+
+```pdl
++--------+        +---------+        +-------+        +-------+
+| struct |        | struct  | <----- | union | <----- | union |
+| SEnd   | -----> | SMiddle | -----> | UOne  | -----> | UTwo  |
++--------+        +---------+        +-------+        +-------+
+                                      |     ^          |     ^
+                                      |     |          |     |
+                                      \-----/          \-----/
+```
+
+For this case we can actually always construct a literal of `SMiddle`.
+Viewing the literal as a tree (each node being a type instance), we can terminate the tree branches using `UTwo::End`. However, if we were to make just `UTwo` pointerlike, we could not compute the size of `UOne` (it contains a reference to itself, so it would grow infinitely in size). This hints that we should make `UOne` pointerlike as well.
+
+### Slightly Ambiguous Example, Version 2
+
+Not really ambiguous, but just to show that not all unions which are part of a connected type graph need to be turned into pointerlike unions.
+
+```pdl
+union Value {
+    Unsigned(u64),
+    Signed(s64),
+    Undefined
+}
+struct Node {
+    Value value,
+    Children children
+}
+
+union Children {
+    One(Node),
+    Two(Node, Node),
+    None
+}
+```
+
+We get a cycle between `Node` and `Children`, so we would want to implement `Children` as being pointerlike. But we don't have to make `Value` pointerlike. In terms of the described algorithm: `Value` is not part of a type loop, so it is not considered a candidate for being pointerlike.
+
+### Slightly Ambiguous Example, Version 3
+
+This one contains a part of the type graph that is not part of a type loop, so the associated union should not be pointerlike.
+
+```pdl
+// Parts that are outside of the type loop
+struct SOutside {
+    UOutside next,
+}
+union UOutside {
+    Next(SInside)
+}
+
+// Parts that are inside the type loop
+struct SInside {
+    UInside next,
+}
+union UInside {
+    Next(SInside),
+    NoNext,
+}
+```
+
+Here `UInside` should become pointerlike, but `UOutside` should remain a normal value.
+
+### Which Union Breaks the Cycle?
+
+```pdl
+struct S { UOne one }
+union UOne { Two(UTwo), Nope }
+union UTwo { Struct(S), Nope }
+```
+
+Here we see that we have a type loop to which both unions contribute. We could lay out either `UOne` or `UTwo` as pointerlike; both choices would allow us to calculate the size of the types and make the type expressible. However, we need consistent compilation.
+For this reason, and this reason only (efficiency-wise it would be better to lay out just one of the unions as a pointer), we have to make both unions pointerlike. Perhaps in the future some other consistent metric can be applied.
+
+### Does it even Matter?
+
+```pdl
+// Obviously doesn't use `T`
+enum Something<T> { A, B }
+
+// So this should be fine
+struct Foo {
+    Something<Foo> some_member,
+}
+```
+
+Here we have a struct `Foo` that has a reliance on `Something`, but this shouldn't cause a type loop, because `Foo` doesn't actually appear in `Something`. So whatever algorithm we come up with should resolve member types not by iterating over each individual type, but by attempting to instantiate a monomorph of that type, and then checking the members of that monomorph.
\ No newline at end of file
diff --git a/src/protocol/ast.rs b/src/protocol/ast.rs index ac14cd4da6bc831eaa42b9634a9685e8e0df4e7c..b94f81be548e9f95787e334020dfb98457e8d578 --- a/src/protocol/ast.rs +++ b/src/protocol/ast.rs @@ -494,7 +494,28 @@ pub enum ConcreteTypePart { Input, Output, // User defined type with any number of nested types - Instance(DefinitionId, u32), + Instance(DefinitionId, u32), // instance of data type + Function(DefinitionId, u32), // instance of function + Component(DefinitionId, u32), // instance of a connector +} + +impl ConcreteTypePart { + fn num_embedded(&self) -> u32 { + use ConcreteTypePart::*; + + match self { + Void | Message | Bool | + UInt8 | UInt16 | UInt32 | UInt64 | + SInt8 | SInt16 | SInt32 | SInt64 | + Character | String => + 0, + Array | Slice | Input | Output => + 1, + Instance(_, num_embedded) => *num_embedded, + Function(_, num_embedded) => *num_embedded, + Component(_, num_embedded) => *num_embedded, + } + } } #[derive(Debug, Clone, Eq, PartialEq)] @@ -508,6 +529,139 @@ impl Default for ConcreteType { } } +impl ConcreteType { + /// Returns an iterator over the subtrees that are type arguments (e.g.
an + /// array element's type, or a polymorphic type's arguments) to the + /// provided parent type (specified by its index in the `parts` array). + pub(crate) fn embedded_iter<'a>(&'a self, parent_part_idx: usize) -> ConcreteTypeIter<'a> { + let num_embedded = self.parts[parent_part_idx].num_embedded(); + return ConcreteTypeIter{ + concrete: self, + idx_embedded: 0, + num_embedded, + part_idx: parent_part_idx + 1, + } + } + + /// Given the starting position of a type tree, determine the exclusive + /// ending index. + pub(crate) fn subtree_end_idx(&self, start_idx: usize) -> usize { + let mut depth = 1; + let num_parts = self.parts.len(); + debug_assert!(start_idx < num_parts); + + for part_idx in start_idx..self.parts.len() { + let depth_change = self.parts[part_idx].num_embedded() as i32 - 1; + depth += depth_change; + debug_assert!(depth >= 0); + + if depth == 0 { + return part_idx + 1; + } + } + + debug_assert!(false, "incorrectly constructed ConcreteType instance"); + return 0; + } + + /// Construct a human-readable name for the type. Because this performs + /// a string allocation don't use it for anything other than displaying the + /// type to the user.
+ pub(crate) fn display_name(&self, heap: &Heap) -> String { + fn display_part(parts: &[ConcreteTypePart], heap: &Heap, mut idx: usize, target: &mut String) -> usize { + use ConcreteTypePart as CTP; + use crate::protocol::parser::token_parsing::*; + + let cur_idx = idx; + idx += 1; // increment by 1, because it always happens + + match parts[cur_idx] { + CTP::Void => { target.push_str("void"); }, + CTP::Message => { target.push_str(KW_TYPE_MESSAGE_STR); }, + CTP::Bool => { target.push_str(KW_TYPE_BOOL_STR); }, + CTP::UInt8 => { target.push_str(KW_TYPE_UINT8_STR); }, + CTP::UInt16 => { target.push_str(KW_TYPE_UINT16_STR); }, + CTP::UInt32 => { target.push_str(KW_TYPE_UINT32_STR); }, + CTP::UInt64 => { target.push_str(KW_TYPE_UINT64_STR); }, + CTP::SInt8 => { target.push_str(KW_TYPE_SINT8_STR); }, + CTP::SInt16 => { target.push_str(KW_TYPE_SINT16_STR); }, + CTP::SInt32 => { target.push_str(KW_TYPE_SINT32_STR); }, + CTP::SInt64 => { target.push_str(KW_TYPE_SINT64_STR); }, + CTP::Character => { target.push_str(KW_TYPE_CHAR_STR); }, + CTP::String => { target.push_str(KW_TYPE_STRING_STR); }, + CTP::Array | CTP::Slice => { + idx = display_part(parts, heap, idx, target); + target.push_str("[]"); + }, + CTP::Input => { + target.push_str(KW_TYPE_IN_PORT_STR); + target.push('<'); + idx = display_part(parts, heap, idx, target); + target.push('>'); + }, + CTP::Output => { + target.push_str(KW_TYPE_OUT_PORT_STR); + target.push('<'); + idx = display_part(parts, heap, idx, target); + target.push('>'); + }, + CTP::Instance(definition_id, num_poly_args) | + CTP::Function(definition_id, num_poly_args) | + CTP::Component(definition_id, num_poly_args) => { + let definition = &heap[definition_id]; + target.push_str(definition.identifier().value.as_str()); + + if num_poly_args != 0 { + target.push('<'); + for poly_arg_idx in 0..num_poly_args { + if poly_arg_idx != 0 { + target.push(','); + } + idx = display_part(parts, heap, idx, target); + } + target.push('>'); + } + } + } + + idx + } +
+ let mut name = String::with_capacity(128); + let _final_idx = display_part(&self.parts, heap, 0, &mut name); + debug_assert_eq!(_final_idx, self.parts.len()); + + return name; + } +} + +#[derive(Debug)] +pub struct ConcreteTypeIter<'a> { + concrete: &'a ConcreteType, + idx_embedded: u32, + num_embedded: u32, + part_idx: usize, +} + +impl<'a> Iterator for ConcreteTypeIter<'a> { + type Item = &'a [ConcreteTypePart]; + + fn next(&mut self) -> Option<Self::Item> { + if self.idx_embedded == self.num_embedded { + return None; + } + + // Retrieve the subtree of interest + let start_idx = self.part_idx; + let end_idx = self.concrete.subtree_end_idx(start_idx); + + self.idx_embedded += 1; + self.part_idx = end_idx; + + return Some(&self.concrete.parts[start_idx..end_idx]); + } +} + #[derive(Debug, Clone, Copy)] pub enum Scope { Definition(DefinitionId), @@ -733,7 +887,7 @@ impl StructDefinition { } } -#[derive(Debug, Clone)] +#[derive(Debug, Clone, Copy)] pub enum EnumVariantValue { None, Integer(i64), @@ -766,17 +920,11 @@ impl EnumDefinition { } } -#[derive(Debug, Clone)] -pub enum UnionVariantValue { - None, - Embedded(Vec), -} - #[derive(Debug, Clone)] pub struct UnionVariantDefinition { pub span: InputSpan, pub identifier: Identifier, - pub value: UnionVariantValue, + pub value: Vec, // if empty, then union variant does not contain any embedded types } #[derive(Debug, Clone)] diff --git a/src/protocol/ast_printer.rs b/src/protocol/ast_printer.rs index 95a47ea43f4b4996e1fdb4e701237dd1cd475b49..4e566cdc23d634459b11530513dee8b50d57f704 --- a/src/protocol/ast_printer.rs +++ b/src/protocol/ast_printer.rs @@ -330,16 +330,13 @@ impl ASTWriter { self.kv(indent4).with_s_key("Name") .with_identifier_val(&variant.identifier); - match &variant.value { - UnionVariantValue::None => { - self.kv(indent4).with_s_key("Value").with_s_val("None"); - } - UnionVariantValue::Embedded(embedded) => { - self.kv(indent4).with_s_key("Values"); - for embedded in embedded { -
self.kv(indent4+1).with_s_key("Value") - .with_custom_val(|v| write_parser_type(v, heap, embedded)); - } + if variant.value.is_empty() { + self.kv(indent4).with_s_key("Value").with_s_val("None"); + } else { + self.kv(indent4).with_s_key("Values"); + for embedded in &variant.value { + self.kv(indent4+1).with_s_key("Value") + .with_custom_val(|v| write_parser_type(v, heap, embedded)); } } } @@ -922,7 +919,9 @@ fn write_concrete_type(target: &mut String, heap: &Heap, def_id: DefinitionId, t idx = write_concrete_part(target, heap, def_id, t, idx + 1); } target.push('>'); - } + }, + CTP::Function(_, _) => todo!("AST printer for ConcreteTypePart::Function"), + CTP::Component(_, _) => todo!("AST printer for ConcreteTypePart::Component"), } idx + 1 diff --git a/src/protocol/parser/mod.rs b/src/protocol/parser/mod.rs index e7d5ffba6263c89f97bf3bde91d7697047b454dd..e23b69169d47007e1d58c8c94af6f6786a6a0822 100644 --- a/src/protocol/parser/mod.rs +++ b/src/protocol/parser/mod.rs @@ -50,10 +50,23 @@ pub struct Module { pub phase: ModuleCompilationPhase, } +// TODO: This is kind of wrong. Because when we're producing bytecode we would +// like the bytecode itself to not have the notion of the size of a pointer +// type. But until I figure out what we do want I'll just set everything +// to a 64-bit architecture. 
+pub struct TargetArch { + pub array_size_alignment: (usize, usize), + pub slice_size_alignment: (usize, usize), + pub string_size_alignment: (usize, usize), + pub port_size_alignment: (usize, usize), + pub pointer_size_alignment: (usize, usize), +} + pub struct PassCtx<'a> { heap: &'a mut Heap, symbols: &'a mut SymbolTable, pool: &'a mut StringPool, + arch: &'a TargetArch, } pub struct Parser { @@ -73,6 +86,7 @@ pub struct Parser { pass_typing: PassTyping, // Compiler options pub write_ast_to: Option, + pub(crate) arch: TargetArch, } impl Parser { @@ -90,6 +104,13 @@ impl Parser { pass_validation: PassValidationLinking::new(), pass_typing: PassTyping::new(), write_ast_to: None, + arch: TargetArch { + array_size_alignment: (3*8, 8), // pointer, length, capacity + slice_size_alignment: (2*8, 8), // pointer, length + string_size_alignment: (3*8, 8), // pointer, length, capacity + port_size_alignment: (3*4, 4), // two u32s: connector + port ID + pointer_size_alignment: (8, 8), + } }; parser.symbol_table.insert_scope(None, SymbolScope::Global); @@ -167,6 +188,7 @@ impl Parser { heap: &mut self.heap, symbols: &mut self.symbol_table, pool: &mut self.string_pool, + arch: &self.arch, }; // Advance all modules to the phase where all symbols are scanned @@ -189,21 +211,25 @@ impl Parser { for module_idx in 0..self.modules.len() { let mut ctx = visitor::Ctx{ heap: &mut self.heap, - module: &mut self.modules[module_idx], + modules: &mut self.modules, + module_idx, symbols: &mut self.symbol_table, types: &mut self.type_table, + arch: &self.arch, }; self.pass_validation.visit_module(&mut ctx)?; } // Perform typechecking on all modules let mut queue = ResolveQueue::new(); - for module in &mut self.modules { + for module_idx in 0..self.modules.len() { let mut ctx = visitor::Ctx{ heap: &mut self.heap, - module, + modules: &mut self.modules, + module_idx, symbols: &mut self.symbol_table, types: &mut self.type_table, + arch: &self.arch, }; PassTyping::queue_module_definitions(&mut 
                ctx, &mut queue);
            };
@@ -211,9 +237,11 @@ impl Parser {
             let top = queue.pop().unwrap();
             let mut ctx = visitor::Ctx{
                 heap: &mut self.heap,
-                module: &mut self.modules[top.root_id.index as usize],
+                modules: &mut self.modules,
+                module_idx: top.root_id.index as usize,
                 symbols: &mut self.symbol_table,
                 types: &mut self.type_table,
+                arch: &self.arch,
             };
             self.pass_typing.handle_module_definition(&mut ctx, &mut queue, top)?;
         }
diff --git a/src/protocol/parser/pass_definitions.rs b/src/protocol/parser/pass_definitions.rs
index 2fe843f6a6a6c434b3a374966a6c6f2620dc0766..7a48893e72677a23ba5654dfbc30215d74d61db1 100644
--- a/src/protocol/parser/pass_definitions.rs
+++ b/src/protocol/parser/pass_definitions.rs
@@ -222,10 +222,10 @@ impl PassDefinitions {
             &mut types_section, "an embedded type", Some(&mut close_pos)
         )?;
         let value = if has_embedded {
-            UnionVariantValue::Embedded(types_section.into_vec())
+            types_section.into_vec()
         } else {
             types_section.forget();
-            UnionVariantValue::None
+            Vec::new()
         };
 
         Ok(UnionVariantDefinition{
diff --git a/src/protocol/parser/pass_typing.rs b/src/protocol/parser/pass_typing.rs
index 6d58f7e3853adc9af739ec5c80d7f938617a5e03..46b17c2d98ac567766dd264cc04bf251aa068be1 100644
--- a/src/protocol/parser/pass_typing.rs
+++ b/src/protocol/parser/pass_typing.rs
@@ -559,14 +559,15 @@ impl InferenceType {
 
     /// Performs the conversion of the inference type into a concrete type.
     /// By calling this function you must make sure that no unspecified types
-    /// (e.g. Unknown or IntegerLike) exist in the type.
+    /// (e.g. Unknown or IntegerLike) exist in the type. Will not clear or check
+    /// if the supplied `ConcreteType` is empty, will simply append to the parts
+    /// vector.
     fn write_concrete_type(&self, concrete_type: &mut ConcreteType) {
         use InferenceTypePart as ITP;
         use ConcreteTypePart as CTP;
 
         // Make sure inference type is specified but concrete type is not yet specified
         debug_assert!(!self.parts.is_empty());
-        debug_assert!(concrete_type.parts.is_empty());
         concrete_type.parts.reserve(self.parts.len());
 
         let mut idx = 0;
@@ -807,22 +808,14 @@ impl DefinitionType {
 }
 
 pub(crate) struct ResolveQueueElement {
+    // Note that using the `definition_id` and the `monomorph_idx` one may
+    // query the type table for the full procedure type, thereby retrieving
+    // the polymorphic arguments to the procedure.
     pub(crate) root_id: RootId,
     pub(crate) definition_id: DefinitionId,
-    pub(crate) monomorph_types: Vec<ConcreteType>,
     pub(crate) reserved_monomorph_idx: i32,
 }
 
-impl PartialEq for ResolveQueueElement {
-    fn eq(&self, other: &Self) -> bool {
-        return
-            self.root_id == other.root_id &&
-            self.definition_id == other.definition_id &&
-            self.monomorph_types == other.monomorph_types;
-    }
-}
-impl Eq for ResolveQueueElement {}
-
 pub(crate) type ResolveQueue = Vec<ResolveQueueElement>;
 
 #[derive(Clone)]
@@ -927,24 +920,36 @@ impl PassTyping {
 
     // TODO: @cleanup Unsure about this, maybe a pattern will arise after
     // a while.
     pub(crate) fn queue_module_definitions(ctx: &mut Ctx, queue: &mut ResolveQueue) {
-        debug_assert_eq!(ctx.module.phase, ModuleCompilationPhase::ValidatedAndLinked);
-        let root_id = ctx.module.root_id;
+        debug_assert_eq!(ctx.module().phase, ModuleCompilationPhase::ValidatedAndLinked);
+        let root_id = ctx.module().root_id;
         let root = &ctx.heap.protocol_descriptions[root_id];
         for definition_id in &root.definitions {
             let definition = &ctx.heap[*definition_id];
-            let should_add_to_queue = match definition {
-                Definition::Function(definition) => definition.poly_vars.is_empty(),
-                Definition::Component(definition) => definition.poly_vars.is_empty(),
-                Definition::Enum(_) | Definition::Struct(_) | Definition::Union(_) => false,
+            let first_concrete_part = match definition {
+                Definition::Function(definition) => {
+                    if definition.poly_vars.is_empty() {
+                        Some(ConcreteTypePart::Function(*definition_id, 0))
+                    } else {
+                        None
+                    }
+                }
+                Definition::Component(definition) => {
+                    if definition.poly_vars.is_empty() {
+                        Some(ConcreteTypePart::Component(*definition_id, 0))
+                    } else {
+                        None
+                    }
+                },
+                Definition::Enum(_) | Definition::Struct(_) | Definition::Union(_) => None,
             };
 
-            if should_add_to_queue {
-                let reserved_idx = ctx.types.reserve_procedure_monomorph_index(definition_id, None);
+            if let Some(first_concrete_part) = first_concrete_part {
+                let concrete_type = ConcreteType{ parts: vec![first_concrete_part] };
+                let reserved_idx = ctx.types.reserve_procedure_monomorph_index(definition_id, concrete_type);
                 queue.push(ResolveQueueElement{
                     root_id,
                     definition_id: *definition_id,
-                    monomorph_types: Vec::new(),
                     reserved_monomorph_idx: reserved_idx,
                 })
             }
@@ -954,15 +959,26 @@ impl PassTyping {
     pub(crate) fn handle_module_definition(
         &mut self, ctx: &mut Ctx, queue: &mut ResolveQueue, element: ResolveQueueElement
     ) -> VisitorResult {
-        // Visit the definition
-        debug_assert_eq!(ctx.module.root_id, element.root_id);
         self.reset();
+        debug_assert_eq!(ctx.module().root_id, element.root_id);
         debug_assert!(self.poly_vars.is_empty());
+
+        // Prepare for visiting the definition
         self.reserved_idx = element.reserved_monomorph_idx;
-        self.poly_vars = element.monomorph_types;
-        self.visit_definition(ctx, element.definition_id)?;
 
-        // Keep resolving types
+        let proc_base = ctx.types.get_base_definition(&element.definition_id).unwrap();
+        if proc_base.is_polymorph {
+            let proc_monos = proc_base.definition.procedure_monomorphs();
+            let proc_mono = &(*proc_monos)[element.reserved_monomorph_idx as usize];
+
+            for poly_arg in proc_mono.concrete_type.embedded_iter(0) {
+                self.poly_vars.push(ConcreteType{ parts: Vec::from(poly_arg) });
+            }
+        }
+
+        // Visit the definition, setting up the type resolving process, then
+        // (attempt to) resolve all types
+        self.visit_definition(ctx, element.definition_id)?;
         self.resolve_types(ctx, queue)?;
         Ok(())
     }
@@ -1391,10 +1407,23 @@ impl PassTyping {
 
         // Helper for transferring polymorphic variables to concrete types and
         // checking if they're completely specified
-        fn poly_inference_to_concrete_type(
-            ctx: &Ctx, expr_id: ExpressionId, inference: &Vec<InferenceType>
-        ) -> Result<Vec<ConcreteType>, ParseError> {
-            let mut concrete = Vec::with_capacity(inference.len());
+        fn inference_type_to_concrete_type(
+            ctx: &Ctx, expr_id: ExpressionId, inference: &Vec<InferenceType>,
+            first_concrete_part: ConcreteTypePart,
+        ) -> Result<ConcreteType, ParseError> {
+            // Prepare storage vector
+            let mut num_inference_parts = 0;
+            for inference_type in inference {
+                num_inference_parts += inference_type.parts.len();
+            }
+
+            let mut concrete_type = ConcreteType{
+                parts: Vec::with_capacity(1 + num_inference_parts),
+            };
+            concrete_type.parts.push(first_concrete_part);
+
+            // Go through all polymorphic arguments and add them to the concrete
+            // types.
             for (poly_idx, poly_type) in inference.iter().enumerate() {
                 if !poly_type.is_done {
                     let expr = &ctx.heap[expr_id];
@@ -1410,19 +1439,17 @@ impl PassTyping {
                     };
                     let poly_vars = ctx.heap[definition].poly_vars();
                     return Err(ParseError::new_error_at_span(
-                        &ctx.module.source, expr.operation_span(), format!(
+                        &ctx.module().source, expr.operation_span(), format!(
                             "could not fully infer the type of polymorphic variable '{}' of this expression (got '{}')",
                             poly_vars[poly_idx].value.as_str(), poly_type.display_name(&ctx.heap)
                         )
                     ));
                 }
 
-                let mut concrete_type = ConcreteType::default();
                 poly_type.write_concrete_type(&mut concrete_type);
-                concrete.push(concrete_type);
             }
 
-            Ok(concrete)
+            Ok(concrete_type)
         }
 
         // Inference is now done. But we may still have uninferred types. So we
@@ -1437,7 +1464,7 @@ impl PassTyping {
             } else {
                 let expr = &ctx.heap[infer_expr.expr_id];
                 return Err(ParseError::new_error_at_span(
-                    &ctx.module.source, expr.full_span(), format!(
+                    &ctx.module().source, expr.full_span(), format!(
                         "could not fully infer the type of this expression (got '{}')",
                         expr_type.display_name(&ctx.heap)
                     )
@@ -1458,27 +1485,34 @@ impl PassTyping {
             // storage of the struct type whose field it is selecting.
             match &ctx.heap[extra_data.expr_id] {
                 Expression::Call(expr) => {
-                    if expr.method != Method::UserFunction && expr.method != Method::UserComponent {
+                    // Check if it is not a builtin function. If not, then
+                    // construct the first part of the concrete type.
+                    let first_concrete_part = if expr.method == Method::UserFunction {
+                        ConcreteTypePart::Function(expr.definition, extra_data.poly_vars.len() as u32)
+                    } else if expr.method == Method::UserComponent {
+                        ConcreteTypePart::Component(expr.definition, extra_data.poly_vars.len() as u32)
+                    } else {
                         // Builtin function
                         continue;
-                    }
+                    };
 
                     let definition_id = expr.definition;
-                    let poly_types = poly_inference_to_concrete_type(ctx, extra_data.expr_id, &extra_data.poly_vars)?;
+                    let concrete_type = inference_type_to_concrete_type(
+                        ctx, extra_data.expr_id, &extra_data.poly_vars, first_concrete_part
+                    )?;
 
-                    match ctx.types.get_procedure_monomorph_index(&definition_id, &poly_types) {
+                    match ctx.types.get_procedure_monomorph_index(&definition_id, &concrete_type) {
                         Some(reserved_idx) => {
                             // Already typechecked, or already put into the resolve queue
                             infer_expr.field_or_monomorph_idx = reserved_idx;
                         },
                         None => {
                             // Not typechecked yet, so add an entry in the queue
-                            let reserved_idx = ctx.types.reserve_procedure_monomorph_index(&definition_id, Some(poly_types.clone()));
+                            let reserved_idx = ctx.types.reserve_procedure_monomorph_index(&definition_id, concrete_type);
                             infer_expr.field_or_monomorph_idx = reserved_idx;
                             queue.push(ResolveQueueElement{
                                 root_id: ctx.heap[definition_id].defined_in(),
                                 definition_id,
-                                monomorph_types: poly_types,
                                 reserved_monomorph_idx: reserved_idx,
                             });
                         }
@@ -1491,9 +1525,11 @@ impl PassTyping {
                         Literal::Struct(lit) => lit.definition,
                         _ => unreachable!(),
                     };
-
-                    let poly_types = poly_inference_to_concrete_type(ctx, extra_data.expr_id, &extra_data.poly_vars)?;
-                    let mono_index = ctx.types.add_data_monomorph(&definition_id, poly_types);
+                    let first_concrete_part = ConcreteTypePart::Instance(definition_id, extra_data.poly_vars.len() as u32);
+                    let concrete_type = inference_type_to_concrete_type(
+                        ctx, extra_data.expr_id, &extra_data.poly_vars, first_concrete_part
+                    )?;
+                    let mono_index = ctx.types.add_data_monomorph(ctx.modules, ctx.heap, ctx.arch, definition_id, concrete_type)?;
                     infer_expr.field_or_monomorph_idx = mono_index;
                 },
                 Expression::Select(_) => {
@@ -1520,7 +1556,6 @@ impl PassTyping {
         };
 
         let target = ctx.types.get_procedure_expression_data_mut(&definition_id, self.reserved_idx);
-        debug_assert!(target.poly_args == self.poly_vars);
         debug_assert!(target.expr_data.is_empty()); // makes sure we never queue something twice
 
         target.expr_data.reserve(self.expr_types.len());
@@ -1993,7 +2028,7 @@ impl PassTyping {
                     struct_def
                 } else {
                     return Err(ParseError::new_error_at_span(
-                        &ctx.module.source, select_expr.field_name.span, format!(
+                        &ctx.module().source, select_expr.field_name.span, format!(
                             "Can only apply field access to structs, got a subject of type '{}'",
                             subject_type.display_name(&ctx.heap)
                         )
@@ -2015,7 +2050,7 @@ impl PassTyping {
                 if struct_def_id.is_none() {
                     let ast_struct_def = ctx.heap[type_def.ast_definition].as_struct();
                     return Err(ParseError::new_error_at_span(
-                        &ctx.module.source, select_expr.field_name.span, format!(
+                        &ctx.module().source, select_expr.field_name.span, format!(
                             "this field does not exist on the struct '{}'",
                             ast_struct_def.identifier.value.as_str()
                         )
@@ -2033,7 +2068,7 @@ impl PassTyping {
                 },
                 Err(()) => {
                     return Err(ParseError::new_error_at_span(
-                        &ctx.module.source, select_expr.field_name.span, format!(
+                        &ctx.module().source, select_expr.field_name.span, format!(
                             "Can only apply field access to structs, got a subject of type '{}'",
                             subject_type.display_name(&ctx.heap)
                         )
@@ -2459,9 +2494,9 @@ impl PassTyping {
             let cast_expr = &ctx.heap[id];
             let subject_expr = &ctx.heap[cast_expr.subject];
             return Err(ParseError::new_error_str_at_span(
-                &ctx.module.source, cast_expr.full_span, "invalid casting operation"
+                &ctx.module().source, cast_expr.full_span, "invalid casting operation"
             ).with_info_at_span(
-                &ctx.module.source, subject_expr.full_span(), format!(
+                &ctx.module().source, subject_expr.full_span(), format!(
                     "cannot cast the argument type '{}' to the cast type '{}'",
                     subject_type.display_name(&ctx.heap),
                     expr_type.display_name(&ctx.heap)
@@ -2609,12 +2644,12 @@ impl PassTyping {
         if infer_res == DualInferenceResult::Incompatible {
             let var_decl = &ctx.heap[var_id];
             return Err(ParseError::new_error_at_span(
-                &ctx.module.source, var_decl.identifier.span, format!(
+                &ctx.module().source, var_decl.identifier.span, format!(
                     "Conflicting types for this variable, previously assigned the type '{}'",
                     var_data.var_type.display_name(&ctx.heap)
                 )
             ).with_info_at_span(
-                &ctx.module.source, var_expr.identifier.span, format!(
+                &ctx.module().source, var_expr.identifier.span, format!(
                     "But inferred to have incompatible type '{}' here",
                     expr_type.display_name(&ctx.heap)
                 )
@@ -2664,12 +2699,12 @@ impl PassTyping {
 
             let link_decl = &ctx.heap[linked_id];
             return Err(ParseError::new_error_at_span(
-                &ctx.module.source, var_decl.identifier.span, format!(
+                &ctx.module().source, var_decl.identifier.span, format!(
                     "Conflicting types for this variable, assigned the type '{}'",
                     var_data.var_type.display_name(&ctx.heap)
                 )
             ).with_info_at_span(
-                &ctx.module.source, link_decl.identifier.span, format!(
+                &ctx.module().source, link_decl.identifier.span, format!(
                     "Because it is incompatible with this variable, assigned the type '{}'",
                     link_data.var_type.display_name(&ctx.heap)
                 )
@@ -2828,10 +2863,10 @@ impl PassTyping {
                 )
             };
             return Err(ParseError::new_error_str_at_span(
-                &ctx.module.source, outer_span,
+                &ctx.module().source, outer_span,
                 "failed to fully resolve the types of this expression"
             ).with_info_at_span(
-                &ctx.module.source, span, format!(
+                &ctx.module().source, span, format!(
                     "because the {} signature has been resolved to '{}', but the expression has been resolved to '{}'",
                     span_name, signature_display_type, expression_display_type
                 )
@@ -3499,7 +3534,10 @@ impl PassTyping {
         for concrete_part in concrete_type {
             match concrete_part {
                 CTP::Void => parser_type.push(ITP::Void),
-                CTP::Message => parser_type.push(ITP::Message),
+                CTP::Message => {
+                    parser_type.push(ITP::Message);
+                    parser_type.push(ITP::UInt8)
+                },
                 CTP::Bool => parser_type.push(ITP::Bool),
                 CTP::UInt8 => parser_type.push(ITP::UInt8),
                 CTP::UInt16 => parser_type.push(ITP::UInt16),
@@ -3519,6 +3557,8 @@ impl PassTyping {
                 CTP::Input => parser_type.push(ITP::Input),
                 CTP::Output => parser_type.push(ITP::Output),
                 CTP::Instance(id, num) => parser_type.push(ITP::Instance(*id, *num)),
+                CTP::Function(_, _) => unreachable!("function type during concrete to inference type conversion"),
+                CTP::Component(_, _) => unreachable!("component type during concrete to inference type conversion"),
             }
         }
     }
@@ -3540,12 +3580,12 @@ impl PassTyping {
         let arg_type = &self.expr_types[arg_expr_idx as usize].expr_type;
 
         return ParseError::new_error_at_span(
-            &ctx.module.source, expr.operation_span(), format!(
+            &ctx.module().source, expr.operation_span(), format!(
                 "incompatible types: this expression expected a '{}'",
                 expr_type.display_name(&ctx.heap)
            )
         ).with_info_at_span(
-            &ctx.module.source, arg_expr.full_span(), format!(
+            &ctx.module().source, arg_expr.full_span(), format!(
                 "but this expression yields a '{}'",
                 arg_type.display_name(&ctx.heap)
             )
@@ -3566,15 +3606,15 @@ impl PassTyping {
         let arg2_type = &self.expr_types[arg2_idx as usize].expr_type;
 
         return ParseError::new_error_str_at_span(
-            &ctx.module.source, expr.operation_span(),
+            &ctx.module().source, expr.operation_span(),
             "incompatible types: cannot apply this expression"
         ).with_info_at_span(
-            &ctx.module.source, arg1.full_span(), format!(
+            &ctx.module().source, arg1.full_span(), format!(
                 "Because this expression has type '{}'",
                 arg1_type.display_name(&ctx.heap)
             )
         ).with_info_at_span(
-            &ctx.module.source, arg2.full_span(), format!(
+            &ctx.module().source, arg2.full_span(), format!(
                 "But this expression has type '{}'",
                 arg2_type.display_name(&ctx.heap)
             )
@@ -3589,7 +3629,7 @@ impl PassTyping {
         let expr_type = &self.expr_types[expr_idx as usize].expr_type;
 
         return ParseError::new_error_at_span(
-            &ctx.module.source, expr.full_span(), format!(
+            &ctx.module().source, expr.full_span(), format!(
                 "incompatible types: got a '{}' but expected a '{}'",
                 expr_type.display_name(&ctx.heap),
                 InferenceType::partial_display_name(&ctx.heap, template)
@@ -3665,7 +3705,7 @@ impl PassTyping {
             Expression::Call(expr) => {
                 let (poly_var, func_name) = get_poly_var_and_definition_name(ctx, poly_var_idx, poly_data.definition_id);
                 return ParseError::new_error_at_span(
-                    &ctx.module.source, expr.func_span, format!(
+                    &ctx.module().source, expr.func_span, format!(
                         "Conflicting type for polymorphic variable '{}' of '{}'",
                         poly_var, func_name
                     )
@@ -3674,7 +3714,7 @@ impl PassTyping {
             Expression::Literal(expr) => {
                 let (poly_var, type_name) = get_poly_var_and_definition_name(ctx, poly_var_idx, poly_data.definition_id);
                 return ParseError::new_error_at_span(
-                    &ctx.module.source, expr.span, format!(
+                    &ctx.module().source, expr.span, format!(
                         "Conflicting type for polymorphic variable '{}' of instantiation of '{}'",
                         poly_var, type_name
                     )
@@ -3683,7 +3723,7 @@ impl PassTyping {
             Expression::Select(expr) => {
                 let (poly_var, struct_name) = get_poly_var_and_definition_name(ctx, poly_var_idx, poly_data.definition_id);
                 return ParseError::new_error_at_span(
-                    &ctx.module.source, expr.full_span, format!(
+                    &ctx.module().source, expr.full_span, format!(
                         "Conflicting type for polymorphic variable '{}' while accessing field '{}' of '{}'",
                         poly_var, expr.field_name.value.as_str(), struct_name
                     )
@@ -3729,7 +3769,7 @@ impl PassTyping {
         ) {
             return construct_main_error(ctx, poly_data, poly_idx, expr)
                 .with_info_at_span(
-                    &ctx.module.source, expr.full_span(), format!(
+                    &ctx.module().source, expr.full_span(), format!(
                         "The {} inferred the conflicting types '{}' and '{}'",
                         expr_return_name,
                         InferenceType::partial_display_name(&ctx.heap, section_a),
@@ -3751,7 +3791,7 @@ impl PassTyping {
                 // Same argument
                 let arg = &ctx.heap[expr_args[arg_a_idx]];
                 return error.with_info_at_span(
-                    &ctx.module.source, arg.full_span(), format!(
+                    &ctx.module().source, arg.full_span(), format!(
                         "This argument inferred the conflicting types '{}' and '{}'",
                         InferenceType::partial_display_name(&ctx.heap, section_a),
                         InferenceType::partial_display_name(&ctx.heap, section_b)
@@ -3761,12 +3801,12 @@ impl PassTyping {
                 let arg_a = &ctx.heap[expr_args[arg_a_idx]];
                 let arg_b = &ctx.heap[expr_args[arg_b_idx]];
                 return error.with_info_at_span(
-                    &ctx.module.source, arg_a.full_span(), format!(
+                    &ctx.module().source, arg_a.full_span(), format!(
                         "This argument inferred it to '{}'",
                         InferenceType::partial_display_name(&ctx.heap, section_a)
                     )
                 ).with_info_at_span(
-                    &ctx.module.source, arg_b.full_span(), format!(
+                    &ctx.module().source, arg_b.full_span(), format!(
                         "While this argument inferred it to '{}'",
                         InferenceType::partial_display_name(&ctx.heap, section_b)
                     )
@@ -3780,13 +3820,13 @@ impl PassTyping {
             let arg = &ctx.heap[expr_args[arg_a_idx]];
             return construct_main_error(ctx, poly_data, poly_idx, expr)
                 .with_info_at_span(
-                    &ctx.module.source, arg.full_span(), format!(
+                    &ctx.module().source, arg.full_span(), format!(
                         "This argument inferred it to '{}'",
                         InferenceType::partial_display_name(&ctx.heap, section_arg)
                     )
                 )
                 .with_info_at_span(
-                    &ctx.module.source, expr.full_span(), format!(
+                    &ctx.module().source, expr.full_span(), format!(
                         "While the {} inferred it to '{}'",
                         expr_return_name,
                         InferenceType::partial_display_name(&ctx.heap, section_ret)
@@ -3802,7 +3842,7 @@ impl PassTyping {
             let arg = &ctx.heap[expr_args[arg_idx]];
             return construct_main_error(ctx, poly_data, poly_idx, expr)
                 .with_info_at_span(
-                    &ctx.module.source, arg.full_span(), format!(
+                    &ctx.module().source, arg.full_span(), format!(
                         "The polymorphic variable has type '{}' (which might have been partially inferred) while the argument inferred it to '{}'",
                         InferenceType::partial_display_name(&ctx.heap, poly_section),
                         InferenceType::partial_display_name(&ctx.heap, arg_section)
@@ -3814,7 +3854,7 @@ impl PassTyping {
         if let Some((poly_idx, poly_section, ret_section)) = has_explicit_poly_mismatch(&poly_data.poly_vars, &poly_data.returned) {
             return construct_main_error(ctx, poly_data, poly_idx, expr)
                 .with_info_at_span(
-                    &ctx.module.source, expr.full_span(), format!(
+                    &ctx.module().source, expr.full_span(), format!(
                         "The polymorphic variable has type '{}' (which might have been partially inferred) while the {} inferred it to '{}'",
                         InferenceType::partial_display_name(&ctx.heap, poly_section),
                         expr_return_name,
diff --git a/src/protocol/parser/pass_validation_linking.rs b/src/protocol/parser/pass_validation_linking.rs
index 57c1e2b94ccad88bb44718d92541dd09f09d89ae..dca88025f2fd36f261e77a8f9e67e650ac8d4023 100644
--- a/src/protocol/parser/pass_validation_linking.rs
+++ b/src/protocol/parser/pass_validation_linking.rs
@@ -132,9 +132,9 @@ macro_rules! assign_and_replace_next_stmt {
 
 impl Visitor for PassValidationLinking {
     fn visit_module(&mut self, ctx: &mut Ctx) -> VisitorResult {
-        debug_assert_eq!(ctx.module.phase, ModuleCompilationPhase::TypesAddedToTable);
+        debug_assert_eq!(ctx.module().phase, ModuleCompilationPhase::TypesAddedToTable);
 
-        let root = &ctx.heap[ctx.module.root_id];
+        let root = &ctx.heap[ctx.module().root_id];
         let section = self.definition_buffer.start_section_initialized(&root.definitions);
         for definition_idx in 0..section.len() {
             let definition_id = section[definition_idx];
@@ -142,7 +142,7 @@ impl Visitor for PassValidationLinking {
         }
         section.forget();
 
-        ctx.module.phase = ModuleCompilationPhase::ValidatedAndLinked;
+        ctx.module_mut().phase = ModuleCompilationPhase::ValidatedAndLinked;
         Ok(())
     }
 
     //--------------------------------------------------------------------------
@@ -343,15 +343,15 @@ impl Visitor for PassValidationLinking {
             // Nested synchronous statement
             let old_sync_span = ctx.heap[self.in_sync].span;
             return Err(ParseError::new_error_str_at_span(
-                &ctx.module.source, cur_sync_span, "Illegal nested synchronous statement"
+                &ctx.module().source, cur_sync_span, "Illegal nested synchronous statement"
             ).with_info_str_at_span(
-                &ctx.module.source, old_sync_span, "It is nested in this synchronous statement"
+                &ctx.module().source, old_sync_span, "It is nested in this synchronous statement"
             ));
         }
 
         if !self.def_type.is_primitive() {
             return Err(ParseError::new_error_str_at_span(
-                &ctx.module.source, cur_sync_span,
+                &ctx.module().source, cur_sync_span,
                 "synchronous statements may only be used in primitive components"
             ));
         }
@@ -375,7 +375,7 @@ impl Visitor for PassValidationLinking {
         let stmt = &ctx.heap[id];
         if !self.def_type.is_function() {
             return Err(ParseError::new_error_str_at_span(
-                &ctx.module.source, stmt.span,
+                &ctx.module().source, stmt.span,
                 "return statements may only appear in function bodies"
             ));
         }
@@ -404,9 +404,9 @@ impl Visitor for PassValidationLinking {
             let goto_stmt = &ctx.heap[id];
             let sync_stmt = &ctx.heap[self.in_sync];
             return Err(
-                ParseError::new_error_str_at_span(&ctx.module.source, goto_stmt.span, "goto may not escape the surrounding synchronous block")
-                    .with_info_str_at_span(&ctx.module.source, target.label.span, "this is the target of the goto statement")
-                    .with_info_str_at_span(&ctx.module.source, sync_stmt.span, "which will jump past this statement")
+                ParseError::new_error_str_at_span(&ctx.module().source, goto_stmt.span, "goto may not escape the surrounding synchronous block")
+                    .with_info_str_at_span(&ctx.module().source, target.label.span, "this is the target of the goto statement")
+                    .with_info_str_at_span(&ctx.module().source, sync_stmt.span, "which will jump past this statement")
             );
         }
 
@@ -420,7 +420,7 @@ impl Visitor for PassValidationLinking {
         if !self.def_type.is_composite() {
             let new_stmt = &ctx.heap[id];
             return Err(ParseError::new_error_str_at_span(
-                &ctx.module.source, new_stmt.span,
+                &ctx.module().source, new_stmt.span,
                 "instantiating components may only be done in composite components"
             ));
         }
@@ -465,10 +465,13 @@ impl Visitor for PassValidationLinking {
         match self.expr_parent {
             // Look at us: lying through our teeth while providing error messages.
             ExpressionParent::ExpressionStmt(_) => {},
-            _ => return Err(ParseError::new_error_str_at_span(
-                &ctx.module.source, assignment_expr.full_span,
-                "assignments are statements, and cannot be used in expressions"
-            )),
+            _ => {
+                let assignment_span = assignment_expr.full_span;
+                return Err(ParseError::new_error_str_at_span(
+                    &ctx.module().source, assignment_span,
+                    "assignments are statements, and cannot be used in expressions"
+                ))
+            },
         }
 
         let left_expr_id = assignment_expr.left;
@@ -494,14 +497,14 @@ impl Visitor for PassValidationLinking {
         // Check for valid context of binding expression
         if let Some(span) = self.must_be_assignable {
             return Err(ParseError::new_error_str_at_span(
-                &ctx.module.source, span, "cannot assign to the result from a binding expression"
+                &ctx.module().source, span, "cannot assign to the result from a binding expression"
             ));
         }
 
         if self.in_test_expr.is_invalid() {
             let binding_expr = &ctx.heap[id];
             return Err(ParseError::new_error_str_at_span(
-                &ctx.module.source, binding_expr.full_span,
+                &ctx.module().source, binding_expr.full_span,
                 "binding expressions can only be used inside the testing expression of 'if' and 'while' statements"
             ));
         }
@@ -510,10 +513,10 @@ impl Visitor for PassValidationLinking {
             let binding_expr = &ctx.heap[id];
             let previous_expr = &ctx.heap[self.in_binding_expr];
             return Err(ParseError::new_error_str_at_span(
-                &ctx.module.source, binding_expr.full_span,
+                &ctx.module().source, binding_expr.full_span,
                 "nested binding expressions are not allowed"
             ).with_info_str_at_span(
-                &ctx.module.source, previous_expr.operator_span,
+                &ctx.module().source, previous_expr.operator_span,
                 "the outer binding expression is found here"
             ));
         }
@@ -551,10 +554,10 @@ impl Visitor for PassValidationLinking {
                     let binding_expr = &ctx.heap[id];
                     let parent_expr = &ctx.heap[seeking_parent.as_expression()];
                     return Err(ParseError::new_error_str_at_span(
-                        &ctx.module.source, binding_expr.full_span,
+                        &ctx.module().source, binding_expr.full_span,
                         "only the logical-and operator (&&) may be applied to binding expressions"
                     ).with_info_str_at_span(
-                        &ctx.module.source, parent_expr.operation_span(),
+                        &ctx.module().source, parent_expr.operation_span(),
                         "this was the disallowed operation applied to the result from a binding expression"
                     ));
                 }
@@ -584,7 +587,7 @@ impl Visitor for PassValidationLinking {
             _ => {
                 let binding_expr = &ctx.heap[id];
                 return Err(ParseError::new_error_str_at_span(
-                    &ctx.module.source, binding_expr.operator_span,
+                    &ctx.module().source, binding_expr.operator_span,
                     "the left hand side of a binding expression may only be a variable or a literal expression"
                 ));
             },
@@ -610,7 +613,7 @@ impl Visitor for PassValidationLinking {
 
         if let Some(span) = self.must_be_assignable {
             return Err(ParseError::new_error_str_at_span(
-                &ctx.module.source, span, "cannot assign to the result from a conditional expression"
+                &ctx.module().source, span, "cannot assign to the result from a conditional expression"
             ))
         }
 
@@ -640,7 +643,7 @@ impl Visitor for PassValidationLinking {
 
         if let Some(span) = self.must_be_assignable {
             return Err(ParseError::new_error_str_at_span(
-                &ctx.module.source, span, "cannot assign to the result from a binary expression"
+                &ctx.module().source, span, "cannot assign to the result from a binary expression"
             ))
         }
 
@@ -667,7 +670,7 @@ impl Visitor for PassValidationLinking {
 
         if let Some(span) = self.must_be_assignable {
             return Err(ParseError::new_error_str_at_span(
-                &ctx.module.source, span, "cannot assign to the result from a unary expression"
+                &ctx.module().source, span, "cannot assign to the result from a unary expression"
             ))
         }
 
@@ -715,7 +718,7 @@ impl Visitor for PassValidationLinking {
         if let Some(span) = self.must_be_assignable {
             // TODO: @Slicing
             return Err(ParseError::new_error_str_at_span(
-                &ctx.module.source, span, "assignment to slices should be valid in the final language, but is currently not implemented"
+                &ctx.module().source, span, "assignment to slices should be valid in the final language, but is currently not implemented"
             ));
         }
 
@@ -768,7 +771,7 @@ impl Visitor for PassValidationLinking {
 
         if let Some(span) = self.must_be_assignable {
             return Err(ParseError::new_error_str_at_span(
-                &ctx.module.source, span, "cannot assign to a literal expression"
+                &ctx.module().source, span, "cannot assign to a literal expression"
             ))
         }
 
@@ -796,7 +799,7 @@ impl Visitor for PassValidationLinking {
                         let literal = ctx.heap[id].value.as_struct();
                         let ast_definition = &ctx.heap[literal.definition];
                         return Err(ParseError::new_error_at_span(
-                            &ctx.module.source, field_span, format!(
+                            &ctx.module().source, field_span, format!(
                                 "This field does not exist on the struct '{}'",
                                 ast_definition.identifier().value.as_str()
                             )
@@ -806,8 +809,9 @@ impl Visitor for PassValidationLinking {
 
                     // Check if specified more than once
                     if specified[field.field_idx] {
+                        let field_span = field.identifier.span;
                         return Err(ParseError::new_error_str_at_span(
-                            &ctx.module.source, field.identifier.span,
+                            &ctx.module().source, field_span,
                             "This field is specified more than once"
                         ));
                     }
@@ -835,8 +839,9 @@ impl Visitor for PassValidationLinking {
                         format!("not all fields are specified, [{}] are missing", not_specified)
                     };
 
+                    let literal_span = literal.parser_type.full_span;
                     return Err(ParseError::new_error_at_span(
-                        &ctx.module.source, literal.parser_type.full_span, msg
+                        &ctx.module().source, literal_span, msg
                     ));
                 }
 
@@ -868,7 +873,7 @@ impl Visitor for PassValidationLinking {
                     let literal = ctx.heap[id].value.as_enum();
                     let ast_definition = ctx.heap[literal.definition].as_enum();
                     return Err(ParseError::new_error_at_span(
-                        &ctx.module.source, literal.parser_type.full_span, format!(
+                        &ctx.module().source, literal.parser_type.full_span, format!(
                             "the variant '{}' does not exist on the enum '{}'",
                             literal.variant.value.as_str(), ast_definition.identifier.value.as_str()
                         )
@@ -889,7 +894,7 @@ impl Visitor for PassValidationLinking {
                     let literal = ctx.heap[id].value.as_union();
                     let ast_definition = ctx.heap[literal.definition].as_union();
                     return Err(ParseError::new_error_at_span(
-                        &ctx.module.source, literal.parser_type.full_span, format!(
+                        &ctx.module().source, literal.parser_type.full_span, format!(
                             "the variant '{}' does not exist on the union '{}'",
                             literal.variant.value.as_str(), ast_definition.identifier.value.as_str()
                         )
@@ -905,7 +910,7 @@ impl Visitor for PassValidationLinking {
                     let literal = ctx.heap[id].value.as_union();
                     let ast_definition = ctx.heap[literal.definition].as_union();
                     return Err(ParseError::new_error_at_span(
-                        &ctx.module.source, literal.parser_type.full_span, format!(
+                        &ctx.module().source, literal.parser_type.full_span, format!(
                             "The variant '{}' of union '{}' expects {} embedded values, but {} were specified",
                             literal.variant.value.as_str(), ast_definition.identifier.value.as_str(),
                             union_variant.embedded.len(), literal.values.len()
@@ -953,7 +958,7 @@ impl Visitor for PassValidationLinking {
 
         if let Some(span) = self.must_be_assignable {
             return Err(ParseError::new_error_str_at_span(
-                &ctx.module.source, span, "cannot assign to the result from a cast expression"
+                &ctx.module().source, span, "cannot assign to the result from a cast expression"
             ))
         }
 
@@ -977,7 +982,7 @@ impl Visitor for PassValidationLinking {
 
         if let Some(span) = self.must_be_assignable {
             return Err(ParseError::new_error_str_at_span(
-                &ctx.module.source, span, "cannot assign to the result from a call expression"
+                &ctx.module().source, span, "cannot assign to the result from a call expression"
             ))
         }
 
@@ -987,42 +992,48 @@ impl Visitor for PassValidationLinking {
         match &mut call_expr.method {
             Method::Get => {
                 if !self.def_type.is_primitive() {
+                    let call_span = call_expr.func_span;
                     return Err(ParseError::new_error_str_at_span(
-                        &ctx.module.source, call_expr.func_span,
+                        &ctx.module().source, call_span,
                         "a call to 'get' may only occur in primitive component definitions"
                     ));
                 }
                 if self.in_sync.is_invalid() {
+                    let call_span = call_expr.func_span;
                     return Err(ParseError::new_error_str_at_span(
-                        &ctx.module.source, call_expr.func_span,
+                        &ctx.module().source, call_span,
                         "a call to 'get' may only occur inside synchronous blocks"
                     ));
                 }
             },
             Method::Put => {
                 if !self.def_type.is_primitive() {
+                    let call_span = call_expr.func_span;
                     return Err(ParseError::new_error_str_at_span(
-                        &ctx.module.source, call_expr.func_span,
+                        &ctx.module().source, call_span,
                         "a call to 'put' may only occur in primitive component definitions"
                     ));
                 }
                 if self.in_sync.is_invalid() {
+                    let call_span = call_expr.func_span;
                     return Err(ParseError::new_error_str_at_span(
-                        &ctx.module.source, call_expr.func_span,
+                        &ctx.module().source, call_span,
                         "a call to 'put' may only occur inside synchronous blocks"
                     ));
                 }
             },
             Method::Fires => {
                 if !self.def_type.is_primitive() {
+                    let call_span = call_expr.func_span;
                     return Err(ParseError::new_error_str_at_span(
-                        &ctx.module.source, call_expr.func_span,
+                        &ctx.module().source, call_span,
                         "a call to 'fires' may only occur in primitive component definitions"
                     ));
                 }
                 if self.in_sync.is_invalid() {
+                    let call_span = call_expr.func_span;
                     return Err(ParseError::new_error_str_at_span(
-                        &ctx.module.source, call_expr.func_span,
+                        &ctx.module().source, call_span,
                         "a call to 'fires' may only occur inside synchronous blocks"
                     ));
                 }
@@ -1031,14 +1042,16 @@ impl Visitor for PassValidationLinking {
             Method::Length => {},
             Method::Assert => {
                 if self.def_type.is_function() {
+                    let call_span = call_expr.func_span;
                     return Err(ParseError::new_error_str_at_span(
-                        &ctx.module.source, call_expr.func_span,
+                        &ctx.module().source, call_span,
                         "assert statement may only occur in components"
                     ));
                 }
                 if self.in_sync.is_invalid() {
+                    let call_span = call_expr.func_span;
                     return Err(ParseError::new_error_str_at_span(
-                        &ctx.module.source, call_expr.func_span,
+                        &ctx.module().source, call_span,
                         "assert statements may only occur inside synchronous blocks"
                     ));
                 }
@@ -1051,15 +1064,17 @@ impl Visitor for PassValidationLinking {
 
         if expected_wrapping_new_stmt {
             if !self.expr_parent.is_new() {
+                let call_span = call_expr.func_span;
                 return Err(ParseError::new_error_str_at_span(
-                    &ctx.module.source, call_expr.func_span,
+                    &ctx.module().source, call_span,
                     "cannot call a component, it can only be instantiated by using 'new'"
                 ));
             }
         } else {
             if self.expr_parent.is_new() {
+                let call_span = call_expr.func_span;
                 return Err(ParseError::new_error_str_at_span(
-                    &ctx.module.source, call_expr.func_span,
+                    &ctx.module().source, call_span,
                     "only components can be instantiated, this is a function"
                 ));
             }
         }
@@ -1076,8 +1091,9 @@ impl Visitor for PassValidationLinking {
         let num_provided_args = call_expr.arguments.len();
         if num_provided_args != num_expected_args {
             let argument_text = if num_expected_args == 1 { "argument" } else { "arguments" };
+            let call_span = call_expr.full_span;
             return Err(ParseError::new_error_at_span(
-                &ctx.module.source, call_expr.full_span, format!(
+                &ctx.module().source, call_span, format!(
                     "expected {} {}, but {} were provided",
                     num_expected_args, argument_text, num_provided_args
                 )
@@ -1118,7 +1134,7 @@ impl Visitor for PassValidationLinking {
             // then this may be the thing we're binding to.
             if self.in_binding_expr.is_invalid() || !self.in_binding_expr_lhs {
                 return Err(ParseError::new_error_str_at_span(
-                    &ctx.module.source, var_expr.identifier.span, "unresolved variable"
+                    &ctx.module().source, var_expr.identifier.span, "unresolved variable"
                 ));
             }
 
@@ -1158,10 +1174,10 @@ impl Visitor for PassValidationLinking {
             if !is_valid_binding {
                 let binding_expr = &ctx.heap[self.in_binding_expr];
                 return Err(ParseError::new_error_str_at_span(
-                    &ctx.module.source, var_expr.identifier.span,
+                    &ctx.module().source, var_expr.identifier.span,
                     "illegal location for binding variable: binding variables may only be nested under a binding expression, or a struct, union or array literal"
                 ).with_info_at_span(
-                    &ctx.module.source, binding_expr.operator_span, format!(
+                    &ctx.module().source, binding_expr.operator_span, format!(
                         "'{}' was interpreted as a binding variable because the variable is not declared and it is nested under this binding expression",
                         var_expr.identifier.value.as_str()
                     )
@@ -1416,9 +1432,9 @@ impl PassValidationLinking {
             if local.identifier == parameter.identifier {
                 return Err(
                     ParseError::new_error_str_at_span(
-                        &ctx.module.source, local.identifier.span, "Local variable name conflicts with parameter"
+                        &ctx.module().source, local.identifier.span, "Local variable name conflicts with parameter"
                     ).with_info_str_at_span(
-                        &ctx.module.source, parameter.identifier.span, "Parameter definition is found here"
+                        &ctx.module().source, parameter.identifier.span, "Parameter definition is found here"
                     )
                 );
             }
@@ -1442,9 +1458,9 @@ impl PassValidationLinking {
                     // Collision within this scope
                     return Err(
                         ParseError::new_error_str_at_span(
-                            &ctx.module.source, local.identifier.span, "Local variable name conflicts with another variable"
+                            &ctx.module().source, local.identifier.span, "Local variable name conflicts with another variable"
                         ).with_info_str_at_span(
-                            &ctx.module.source, other_local.identifier.span, "Previous variable is found here"
+                            &ctx.module().source, other_local.identifier.span, "Previous variable is found here"
                         )
                     );
                 }
@@ -1467,10 +1483,10 @@ impl PassValidationLinking {
             let ident = &ctx.heap[id].identifier;
             if let Some(symbol) = ctx.symbols.get_symbol_by_name(cur_scope, &ident.value.as_bytes()) {
                 return Err(ParseError::new_error_str_at_span(
-                    &ctx.module.source, ident.span,
+                    &ctx.module().source, ident.span,
                     "local variable declaration conflicts with symbol"
                 ).with_info_str_at_span(
-                    &ctx.module.source, symbol.variant.span_of_introduction(&ctx.heap), "the conflicting symbol is introduced here"
+                    &ctx.module().source, symbol.variant.span_of_introduction(&ctx.heap), "the conflicting symbol is introduced here"
                 ));
             }
         }
@@ -1488,9 +1504,9 @@ impl PassValidationLinking {
                     // Collision
                     return Err(
                         ParseError::new_error_str_at_span(
-                            &ctx.module.source, local.identifier.span, "Local variable name conflicts with another variable"
+                            &ctx.module().source, local.identifier.span, "Local variable name conflicts with another variable"
                         ).with_info_str_at_span(
-                            &ctx.module.source, other_local.identifier.span, "Previous variable is found here"
+                            &ctx.module().source, other_local.identifier.span, "Previous variable is found here"
                         )
                     );
                 }
@@ -1572,9 +1588,9 @@ impl PassValidationLinking {
             if other_label.label == label.label {
                 // Collision
                 return Err(ParseError::new_error_str_at_span(
-                    &ctx.module.source, label.label.span, "label name is used more than once"
+                    &ctx.module().source, label.label.span, "label name is used more than once"
                 ).with_info_str_at_span(
-                    &ctx.module.source, other_label.label.span, "the other label is found here"
+                    &ctx.module().source, other_label.label.span, "the other label is found here"
                 ));
             }
         }
@@ -1615,9 +1631,9 @@ impl PassValidationLinking {
             let local = &ctx.heap[*local_id];
             if local.relative_pos_in_block > relative_scope_pos && local.relative_pos_in_block < label.relative_pos_in_block {
                 return Err(
-                    ParseError::new_error_str_at_span(&ctx.module.source, identifier.span, "this target label skips
over a variable declaration") - .with_info_str_at_span(&ctx.module.source, label.label.span, "because it jumps to this label") - .with_info_str_at_span(&ctx.module.source, local.identifier.span, "which skips over this variable") + ParseError::new_error_str_at_span(&ctx.module().source, identifier.span, "this target label skips over a variable declaration") + .with_info_str_at_span(&ctx.module().source, label.label.span, "because it jumps to this label") + .with_info_str_at_span(&ctx.module().source, local.identifier.span, "which skips over this variable") ); } } @@ -1628,7 +1644,7 @@ impl PassValidationLinking { scope = &block.scope_node.parent; if !scope.is_block() { return Err(ParseError::new_error_str_at_span( - &ctx.module.source, identifier.span, "could not find this label" + &ctx.module().source, identifier.span, "could not find this label" )); } @@ -1673,18 +1689,18 @@ impl PassValidationLinking { // present underneath this particular labeled while statement if !self.has_parent_while_scope(ctx, target_stmt.this) { return Err(ParseError::new_error_str_at_span( - &ctx.module.source, label.span, "break statement is not nested under the target label's while statement" + &ctx.module().source, label.span, "break statement is not nested under the target label's while statement" ).with_info_str_at_span( - &ctx.module.source, target.label.span, "the targeted label is found here" + &ctx.module().source, target.label.span, "the targeted label is found here" )); } target_stmt.this } else { return Err(ParseError::new_error_str_at_span( - &ctx.module.source, label.span, "incorrect break target label, it must target a while loop" + &ctx.module().source, label.span, "incorrect break target label, it must target a while loop" ).with_info_str_at_span( - &ctx.module.source, target.label.span, "The targeted label is found here" + &ctx.module().source, target.label.span, "The targeted label is found here" )); } }, @@ -1693,7 +1709,7 @@ impl PassValidationLinking { // nested 
within that while statement if self.in_while.is_invalid() { return Err(ParseError::new_error_str_at_span( - &ctx.module.source, span, "Break statement is not nested under a while loop" + &ctx.module().source, span, "Break statement is not nested under a while loop" )); } @@ -1711,9 +1727,9 @@ impl PassValidationLinking { debug_assert!(!self.in_sync.is_invalid()); let sync_stmt = &ctx.heap[self.in_sync]; return Err( - ParseError::new_error_str_at_span(&ctx.module.source, span, "break may not escape the surrounding synchronous block") - .with_info_str_at_span(&ctx.module.source, target_while.span, "the break escapes out of this loop") - .with_info_str_at_span(&ctx.module.source, sync_stmt.span, "And would therefore escape this synchronous block") + ParseError::new_error_str_at_span(&ctx.module().source, span, "break may not escape the surrounding synchronous block") + .with_info_str_at_span(&ctx.module().source, target_while.span, "the break escapes out of this loop") + .with_info_str_at_span(&ctx.module().source, sync_stmt.span, "And would therefore escape this synchronous block") ); } } diff --git a/src/protocol/parser/type_table.rs b/src/protocol/parser/type_table.rs index b1ccf33d7a3a837f173707f977a0195f4776ca79..6acf92b60bc3cebb569418508f605236a7255dde 100644 --- a/src/protocol/parser/type_table.rs +++ b/src/protocol/parser/type_table.rs @@ -1,3 +1,41 @@ +/** + * type_table.rs + * + * The type table is a lookup from AST definition (which contains just what the + * programmer typed) to a type with additional information computed (e.g. the + * byte size and offsets of struct members). The type table should be considered + * the authoritative source of information on types by the compiler (not the + * AST itself!). + * + * The type table operates in two modes: one is where we just look up the type, + * check its fields for correctness and mark whether it is polymorphic or not. + * The second one is where we compute byte sizes, alignment and offsets. 
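The cycle-breaking idea described in this comment has a direct analogue in Rust itself; as a hedged illustration (Rust, not PDL, and not the type table's own code), routing the recursion through a pointer is exactly what makes the stack size computable:

```rust
// A type that embeds itself by value has no finite size. Routing the
// recursion through a pointer (Box) fixes this: the recursive variant now
// stores a fixed-size pointer, so the enum's stack size is computable.
// This mirrors the union-variant heap allocation described above.
enum List {
    End,
    Entry(u32, Box<List>),
}

fn main() {
    // Finite, pointer-based stack size regardless of how long a value grows.
    println!("size of List: {}", std::mem::size_of::<List>());

    let list = List::Entry(1, Box::new(List::Entry(2, Box::new(List::End))));

    // Walk the list to show that values remain fully usable.
    let mut count = 0;
    let mut cur = &list;
    while let List::Entry(_, next) = cur {
        count += 1;
        cur = &**next; // step through the pointer to the next node
    }
    println!("entries: {}", count);
}
```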
+ * + * The basic algorithm for type resolving and computing byte sizes is to + * recursively try to lay out each member type of a particular type. This is + * done in a stack-like fashion, where each embedded type pushes a breadcrumb + * onto the stack. We may discover a cycle in embedded types (we call this a + * "type loop"), after which the type table attempts to break the type loop by + * making specific types heap-allocated. Upon doing so we know their stack + * size, because it is now based on pointers. Hence breaking type loops is + * required for computing the byte size of types. + * + * The reason for these type shenanigans is that PDL is a value-based + * language, but we would still like to be able to express recursively defined + * types like trees or linked lists. Hence we need to insert pointers somewhere + * to break these cycles. + * + * We will insert these pointers into the variants of unions. Note, however, + * that we can only compute the stack size of a union once we've looked at + * *all* of its variants. Hence we perform an initial pass where we detect + * type loops, a second pass where we compute the stack sizes of everything, + * and a third pass where we actually compute the size of the heap allocations + * for unions. + * + * As a final bit of global documentation: non-polymorphic types will always + * have one "monomorph" entry. This contains the non-polymorphic type's memory + * layout. + */ + use std::fmt::{Formatter, Result as FmtResult}; use std::collections::HashMap; @@ -29,6 +67,13 @@ impl TypeClass { TypeClass::Component => "component", } } + + pub(crate) fn is_data_type(&self) -> bool { + match self { + TypeClass::Enum | TypeClass::Union | TypeClass::Struct => true, + TypeClass::Function | TypeClass::Component => false, + } + } } impl std::fmt::Display for TypeClass { @@ -52,6 +97,136 @@ pub struct DefinedType { pub(crate) is_polymorph: bool, } +impl DefinedType { + /// Returns the number of monomorphs that are instantiated.
Remember that + /// during the type loop detection, and the memory layout phase we will + /// pre-allocate monomorphs which are not yet fully laid out in memory. + pub(crate) fn num_monomorphs(&self) -> usize { + use DefinedTypeVariant as DTV; + match &self.definition { + DTV::Enum(def) => def.monomorphs.len(), + DTV::Union(def) => def.monomorphs.len(), + DTV::Struct(def) => def.monomorphs.len(), + DTV::Function(_) | DTV::Component(_) => unreachable!(), + } + } + /// Returns the index at which a monomorph occurs. Will only check the + /// polymorphic arguments that are in use (none of the, in rust lingo, + /// phantom types). If the type is not polymorphic and its memory has been + /// layed out, then this will always return `Some(0)`. + pub(crate) fn get_monomorph_index(&self, concrete_type: &ConcreteType) -> Option { + use DefinedTypeVariant as DTV; + + // Helper to compare two types, while disregarding the polymorphic + // variables that are not in use. + let concrete_types_match = |type_a: &ConcreteType, type_b: &ConcreteType, check_if_poly_var_is_used: bool| -> bool { + let mut a_iter = type_a.embedded_iter(0).enumerate(); + let mut b_iter = type_b.embedded_iter(0); + + while let Some((section_idx, a_section)) = a_iter.next() { + let b_section = b_iter.next().unwrap(); + + if check_if_poly_var_is_used && !self.poly_vars[section_idx].is_in_use { + continue; + } + + if a_section != b_section { + return false; + } + } + + return true; + }; + + // Check check if type is polymorphic to some degree at all + if cfg!(debug_assertions) { + if let ConcreteTypePart::Instance(definition_id, num_poly_args) = concrete_type.parts[0] { + assert_eq!(definition_id, self.ast_definition); + assert_eq!(num_poly_args as usize, self.poly_vars.len()); + } else { + assert!(false, "concrete type {:?} is not a user-defined type", concrete_type); + } + } + + match &self.definition { + DTV::Enum(definition) => { + // Special case, enum is never a "true polymorph" + 
debug_assert!(!self.is_polymorph); + if definition.monomorphs.is_empty() { + return None + } else { + return Some(0) + } + }, + DTV::Union(definition) => { + for (monomorph_idx, monomorph) in definition.monomorphs.iter().enumerate() { + if concrete_types_match(&monomorph.concrete_type, concrete_type, true) { + return Some(monomorph_idx); + } + } + }, + DTV::Struct(definition) => { + for (monomorph_idx, monomorph) in definition.monomorphs.iter().enumerate() { + if concrete_types_match(&monomorph.concrete_type, concrete_type, true) { + return Some(monomorph_idx); + } + } + }, + DTV::Function(definition) => { + for (monomorph_idx, monomorph) in definition.monomorphs.iter().enumerate() { + if concrete_types_match(&monomorph.concrete_type, concrete_type, false) { + return Some(monomorph_idx) + } + } + } + DTV::Component(definition) => { + for (monomorph_idx, monomorph) in definition.monomorphs.iter().enumerate() { + if concrete_types_match(&monomorph.concrete_type, concrete_type, false) { + return Some(monomorph_idx) + } + } + } + } + + // Nothing matched + return None; + } + + /// Retrieves size and alignment of the particular type's monomorph if it + /// has been layed out in memory. + pub(crate) fn get_monomorph_size_alignment(&self, idx: usize) -> Option<(usize, usize)> { + use DefinedTypeVariant as DTV; + let (size, alignment) = match &self.definition { + DTV::Enum(def) => { + debug_assert!(idx == 0); + (def.size, def.alignment) + }, + DTV::Union(def) => { + let monomorph = &def.monomorphs[idx]; + (monomorph.stack_size, monomorph.stack_alignment) + }, + DTV::Struct(def) => { + let monomorph = &def.monomorphs[idx]; + (monomorph.size, monomorph.alignment) + }, + DTV::Function(_) | DTV::Component(_) => { + // Type table should never be able to arrive here during layout + // of types. Types may only contain function prototypes. 
+ unreachable!("retrieving size and alignment of procedure type"); + } + }; + + if size == 0 && alignment == 0 { + // The "marker" for when the type has not been layed out yet. Even + // for zero-size types we will set alignment to `1` to simplify + // alignment calculations. + return None; + } else { + return Some((size, alignment)); + } + } +} + pub enum DefinedTypeVariant { Enum(EnumType), Union(UnionType), @@ -78,6 +253,13 @@ impl DefinedTypeVariant { } } + pub(crate) fn as_struct_mut(&mut self) -> &mut StructType { + match self { + DefinedTypeVariant::Struct(v) => v, + _ => unreachable!("Cannot convert {} to struct variant", self.type_class()) + } + } + pub(crate) fn as_enum(&self) -> &EnumType { match self { DefinedTypeVariant::Enum(v) => v, @@ -85,32 +267,24 @@ impl DefinedTypeVariant { } } - pub(crate) fn as_union(&self) -> &UnionType { + pub(crate) fn as_enum_mut(&mut self) -> &mut EnumType { match self { - DefinedTypeVariant::Union(v) => v, - _ => unreachable!("Cannot convert {} to union variant", self.type_class()) + DefinedTypeVariant::Enum(v) => v, + _ => unreachable!("Cannot convert {} to enum variant", self.type_class()) } } - pub(crate) fn data_monomorphs(&self) -> &Vec { - use DefinedTypeVariant::*; - + pub(crate) fn as_union(&self) -> &UnionType { match self { - Enum(v) => &v.monomorphs, - Union(v) => &v.monomorphs, - Struct(v) => &v.monomorphs, - _ => unreachable!("cannot get data monomorphs from {}", self.type_class()), + DefinedTypeVariant::Union(v) => v, + _ => unreachable!("Cannot convert {} to union variant", self.type_class()) } } - pub(crate) fn data_monomorphs_mut(&mut self) -> &mut Vec { - use DefinedTypeVariant::*; - + pub(crate) fn as_union_mut(&mut self) -> &mut UnionType { match self { - Enum(v) => &mut v.monomorphs, - Union(v) => &mut v.monomorphs, - Struct(v) => &mut v.monomorphs, - _ => unreachable!("cannot get data monomorphs from {}", self.type_class()), + DefinedTypeVariant::Union(v) => v, + _ => unreachable!("Cannot convert {} 
to union variant", self.type_class()) } } @@ -140,17 +314,12 @@ pub struct PolymorphicVariable { is_in_use: bool, // a polymorphic argument may be defined, but not used by the type definition } -/// Data associated with a monomorphized datatype -pub struct DataMonomorph { - pub poly_args: Vec, -} - /// Data associated with a monomorphized procedure type. Has the wrong name, /// because it will also be used to store expression data for a non-polymorphic /// procedure. (in that case, there will only ever be one) pub struct ProcedureMonomorph { // Expression data for one particular monomorph - pub poly_args: Vec, + pub concrete_type: ConcreteType, pub expr_data: Vec, } @@ -161,7 +330,12 @@ pub struct ProcedureMonomorph { /// variants to be equal to one another. pub struct EnumType { pub variants: Vec, - pub monomorphs: Vec, + pub monomorphs: Vec, + pub minimum_tag_value: i64, + pub maximum_tag_value: i64, + pub tag_type: ConcreteType, + pub size: usize, + pub alignment: usize, } // TODO: Also support maximum u64 value @@ -170,12 +344,25 @@ pub struct EnumVariant { pub value: i64, } +pub struct EnumMonomorph { + pub concrete_type: ConcreteType, +} + /// `UnionType` is the algebraic datatype (or sum type, or discriminated union). /// A value is an element of the union, identified by its tag, and may contain /// a single subtype. +/// For potentially infinite types (i.e. a tree, or a linked list) only unions +/// can break the infinite cycle. So when we lay out these unions in memory we +/// will reserve enough space on the stack for all union variants that do not +/// cause "type loops" (i.e. a union `A` with a variant containing a struct +/// `B`). And we will reserve enough space on the heap (and store a pointer in +/// the union) for all variants which do cause type loops (i.e. a union `A` +/// with a variant to a struct `B` that contains the union `A` again). 
pub struct UnionType { pub variants: Vec<UnionVariant>, - pub monomorphs: Vec<DataMonomorph>, + pub monomorphs: Vec<UnionMonomorph>, + pub tag_type: ConcreteType, + pub tag_size: usize, } pub struct UnionVariant { @@ -184,9 +371,42 @@ pub struct UnionVariant { pub tag_value: i64, } +pub struct UnionMonomorph { + pub concrete_type: ConcreteType, + pub variants: Vec<UnionMonomorphVariant>, + // stack_size is the size of the union on the stack, including the tag + pub stack_size: usize, + pub stack_alignment: usize, + // heap_size contains the allocated size of the union in case it is used + // to break a type loop. If it is 0, then the union doesn't require + // allocation and lives entirely on the stack. + pub heap_size: usize, + pub heap_alignment: usize, +} + +pub struct UnionMonomorphVariant { + pub lives_on_heap: bool, + pub embedded: Vec<UnionMonomorphEmbedded>, +} + +pub struct UnionMonomorphEmbedded { + pub concrete_type: ConcreteType, + // Note that the meaning of the offset (and alignment) depends on whether + // the variant lives on the stack or on the heap. If it lives on the stack + // then the offset is from the start of the union value (so the first + // embedded type lives at a non-zero offset, because the union tag sits at + // the front). If it lives on the heap then the offset is from the start of + // the allocated memory region (so the first embedded type lives at offset + // 0). + pub size: usize, + pub alignment: usize, + pub offset: usize, +} + +/// `StructType` is a generic C-like struct type (or record type, or product +/// type).
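The stack footprint described by these fields (tag in front, each variant's payload aligned after it, heap-allocated variants contributing only a pointer) can be sketched as follows. This is an illustrative simplification, not the compiler's actual layout code, and it assumes a 64-bit target:

```rust
// Illustrative sketch of a tagged union's stack footprint: the tag sits at
// the front, every stack-stored variant's payload starts after it (suitably
// aligned), and the union reserves the maximum over all variants. Variants
// that live on the heap contribute only a pointer, which is what breaks
// type loops. Assumed: 8-byte pointers (64-bit target).
const PTR_SIZE: usize = 8;
const PTR_ALIGN: usize = 8;

fn align_up(x: usize, a: usize) -> usize {
    (x + a - 1) / a * a
}

/// variants: (payload_size, payload_alignment, lives_on_heap)
/// Returns (stack_size, stack_alignment).
fn union_stack_layout(tag_size: usize, variants: &[(usize, usize, bool)]) -> (usize, usize) {
    let mut size = tag_size;
    let mut align = tag_size.max(1);
    for &(v_size, v_align, lives_on_heap) in variants {
        let (v_size, v_align) = if lives_on_heap {
            (PTR_SIZE, PTR_ALIGN) // heap-allocated variant stores only a pointer
        } else {
            (v_size, v_align)
        };
        // Payload starts after the tag, aligned to the payload's alignment.
        size = size.max(align_up(tag_size, v_align) + v_size);
        align = align.max(v_align);
    }
    (align_up(size, align), align)
}
```

For example, a union with a 1-byte tag, one in-place `(4, 4)` variant, and one heap-allocated variant ends up sized by the pointer-carrying variant.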
pub struct StructType { pub fields: Vec, - pub monomorphs: Vec, + pub monomorphs: Vec, } pub struct StructField { @@ -194,6 +414,22 @@ pub struct StructField { pub parser_type: ParserType, } +pub struct StructMonomorph { + pub concrete_type: ConcreteType, + pub fields: Vec, + pub size: usize, + pub alignment: usize, +} + +pub struct StructMonomorphField { + pub concrete_type: ConcreteType, + pub size: usize, + pub alignment: usize, + pub offset: usize, +} + +/// `FunctionType` is what you expect it to be: a particular function's +/// signature. pub struct FunctionType { pub return_types: Vec, pub arguments: Vec, @@ -228,63 +464,61 @@ pub struct MonomorphExpression { // Type table //------------------------------------------------------------------------------ -// TODO: @cleanup Do I really need this, doesn't make the code that much cleaner -struct TypeIterator { - breadcrumbs: Vec<(RootId, DefinitionId)> +// Programmer note: keep this struct free of dynamically allocated memory +#[derive(Clone)] +struct TypeLoopBreadcrumb { + definition_id: DefinitionId, + monomorph_idx: usize, + next_member: usize, + next_embedded: usize, // for unions, the index into the variant's embedded types } -impl TypeIterator { - fn new() -> Self { - Self{ breadcrumbs: Vec::with_capacity(32) } - } - - fn reset(&mut self, root_id: RootId, definition_id: DefinitionId) { - self.breadcrumbs.clear(); - self.breadcrumbs.push((root_id, definition_id)) - } - - fn push(&mut self, root_id: RootId, definition_id: DefinitionId) { - self.breadcrumbs.push((root_id, definition_id)); - } - - fn contains(&self, root_id: RootId, definition_id: DefinitionId) -> bool { - for (stored_root_id, stored_definition_id) in self.breadcrumbs.iter() { - if *stored_root_id == root_id && *stored_definition_id == definition_id { return true; } - } +#[derive(Clone)] +struct MemoryBreadcrumb { + definition_id: DefinitionId, + monomorph_idx: usize, + next_member: usize, + next_embedded: usize, + first_size_alignment_idx: 
usize, +} - return false - } +#[derive(Debug, PartialEq, Eq)] +enum TypeLoopResult { + TypeExists, + PushBreadcrumb(DefinitionId, ConcreteType), + TypeLoop(usize), // index into vec of breadcrumbs at which the type matched +} - fn top(&self) -> Option<(RootId, DefinitionId)> { - self.breadcrumbs.last().map(|(r, d)| (*r, *d)) - } +enum MemoryLayoutResult { + TypeExists(usize, usize), // (size, alignment) + PushBreadcrumb(MemoryBreadcrumb), +} - fn pop(&mut self) { - debug_assert!(!self.breadcrumbs.is_empty()); - self.breadcrumbs.pop(); - } +// TODO: @Optimize, initial memory-unoptimized implementation +struct TypeLoopEntry { + definition_id: DefinitionId, + monomorph_idx: usize, + is_union: bool, } -/// Result from attempting to resolve a `ParserType` using the symbol table and -/// the type table. -enum ResolveResult { - Builtin, - PolymoprhicArgument, - /// ParserType points to a user-defined type that is already resolved in the - /// type table. - Resolved(RootId, DefinitionId), - /// ParserType points to a user-defined type that is not yet resolved into - /// the type table. - Unresolved(RootId, DefinitionId) +struct TypeLoop { + members: Vec } pub struct TypeTable { /// Lookup from AST DefinitionId to a defined type. Considering possible /// polymorphs is done inside the `DefinedType` struct. lookup: HashMap, - /// Iterator over `(module, definition)` tuples used as workspace to make sure - /// that each base definition of all a type's subtypes are resolved. - iter: TypeIterator, + /// Breadcrumbs left behind while trying to find type loops. Also used to + /// determine sizes of types when all type loops are detected. + type_loop_breadcrumbs: Vec, + type_loops: Vec, + /// Stores all encountered types during type loop detection. Used afterwards + /// to iterate over all types in order to compute size/alignment. + encountered_types: Vec, + /// Breadcrumbs and temporary storage during memory layout computation. 
+ memory_layout_breadcrumbs: Vec, + size_alignment_stack: Vec<(usize, usize)>, } impl TypeTable { @@ -292,15 +526,24 @@ impl TypeTable { pub(crate) fn new() -> Self { Self{ lookup: HashMap::new(), - iter: TypeIterator::new(), + type_loop_breadcrumbs: Vec::with_capacity(32), + type_loops: Vec::with_capacity(8), + encountered_types: Vec::with_capacity(32), + memory_layout_breadcrumbs: Vec::with_capacity(32), + size_alignment_stack: Vec::with_capacity(64), } } + /// Iterates over all defined types (polymorphic and non-polymorphic) and + /// add their types in two passes. In the first pass we will just add the + /// base types (we will not consider monomorphs, and we will not compute + /// byte sizes). In the second pass we will compute byte sizes of + /// non-polymorphic types, and potentially the monomorphs that are embedded + /// in those types. pub(crate) fn build_base_types(&mut self, modules: &mut [Module], ctx: &mut PassCtx) -> Result<(), ParseError> { // Make sure we're allowed to cast root_id to index into ctx.modules debug_assert!(modules.iter().all(|m| m.phase >= ModuleCompilationPhase::DefinitionsParsed)); debug_assert!(self.lookup.is_empty()); - debug_assert!(self.iter.top().is_none()); if cfg!(debug_assertions) { for (index, module) in modules.iter().enumerate() { @@ -308,20 +551,51 @@ impl TypeTable { } } - // Use context to guess hashmap size + // Use context to guess hashmap size of the base types let reserve_size = ctx.heap.definitions.len(); self.lookup.reserve(reserve_size); + // Resolve all base types for definition_idx in 0..ctx.heap.definitions.len() { let definition_id = ctx.heap.definitions.get_id(definition_idx); - self.resolve_base_definition(modules, ctx, definition_id)?; + let definition = &ctx.heap[definition_id]; + + match definition { + Definition::Enum(_) => self.build_base_enum_definition(modules, ctx, definition_id)?, + Definition::Union(_) => self.build_base_union_definition(modules, ctx, definition_id)?, + Definition::Struct(_) => 
self.build_base_struct_definition(modules, ctx, definition_id)?, + Definition::Function(_) => self.build_base_function_definition(modules, ctx, definition_id)?, + Definition::Component(_) => self.build_base_component_definition(modules, ctx, definition_id)?, + } } debug_assert_eq!(self.lookup.len(), reserve_size, "mismatch in reserved size of type table"); // NOTE: Temp fix for builtin functions - for module in modules { + for module in modules.iter_mut() { module.phase = ModuleCompilationPhase::TypesAddedToTable; } + // Go through all types again, lay out all types that are not + // polymorphic. This might cause us to lay out types that are monomorphs + // of polymorphic types. + for definition_idx in 0..ctx.heap.definitions.len() { + let definition_id = ctx.heap.definitions.get_id(definition_idx); + let poly_type = self.lookup.get(&definition_id).unwrap(); + + // Here we explicitly want to instantiate types which have no + // polymorphic arguments (even if it has phantom polymorphic + // arguments) because otherwise the user will see very weird + // error messages. + if poly_type.definition.type_class().is_data_type() && poly_type.poly_vars.is_empty() && poly_type.num_monomorphs() == 0 { + self.detect_and_resolve_type_loops_for( + modules, ctx.heap, + ConcreteType{ + parts: vec![ConcreteTypePart::Instance(definition_id, 0)] + }, + )?; + self.lay_out_memory_for_encountered_types(ctx.arch); + } + } + Ok(()) } @@ -335,22 +609,12 @@ impl TypeTable { /// Returns the index into the monomorph type array if the procedure type /// already has a (reserved) monomorph. 
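The "lay out memory" step that runs after the base types are built boils down, for struct monomorphs, to the usual C-style offset computation from each field's (size, alignment) pair. A minimal sketch with illustrative names (not the type table's actual API):

```rust
// Round an offset up to the next multiple of `alignment`.
fn align_up(offset: usize, alignment: usize) -> usize {
    debug_assert!(alignment > 0); // zero-size types still report alignment 1
    (offset + alignment - 1) / alignment * alignment
}

/// fields: (size, alignment) per field, in declaration order.
/// Returns (field offsets, total size, struct alignment).
fn lay_out(fields: &[(usize, usize)]) -> (Vec<usize>, usize, usize) {
    let mut offset = 0;
    let mut max_align = 1;
    let mut offsets = Vec::with_capacity(fields.len());
    for &(size, align) in fields {
        // Each field starts at the next properly aligned offset.
        offset = align_up(offset, align);
        offsets.push(offset);
        offset += size;
        max_align = max_align.max(align);
    }
    // Total size is rounded up so the struct can be placed in arrays.
    (offsets, align_up(offset, max_align), max_align)
}

fn main() {
    // e.g. a struct holding { u8, u32, u16 }
    let (offsets, size, align) = lay_out(&[(1, 1), (4, 4), (2, 2)]);
    println!("offsets {:?}, size {}, align {}", offsets, size, align);
}
```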
- pub(crate) fn get_procedure_monomorph_index(&self, definition_id: &DefinitionId, types: &Vec) -> Option { + pub(crate) fn get_procedure_monomorph_index(&self, definition_id: &DefinitionId, types: &ConcreteType) -> Option { let def = self.lookup.get(definition_id).unwrap(); - if def.is_polymorph { - let monos = def.definition.procedure_monomorphs(); - return monos.iter() - .position(|v| v.poly_args == *types) - .map(|v| v as i32); - } else { - // We don't actually care about the types - let monos = def.definition.procedure_monomorphs(); - if monos.is_empty() { - return None - } else { - return Some(0) - } - } + let monos = def.definition.procedure_monomorphs(); + return monos.iter() + .position(|v| v.concrete_type == *types) + .map(|v| v as i32); } /// Returns a mutable reference to a procedure's monomorph expression data. @@ -374,132 +638,102 @@ impl TypeTable { /// monomorph may NOT exist yet (because the reservation implies that we're /// going to be performing typechecking on it, and we don't want to /// check the same monomorph twice) - pub(crate) fn reserve_procedure_monomorph_index(&mut self, definition_id: &DefinitionId, types: Option>) -> i32 { + pub(crate) fn reserve_procedure_monomorph_index(&mut self, definition_id: &DefinitionId, concrete_type: ConcreteType) -> i32 { let def = self.lookup.get_mut(definition_id).unwrap(); - if let Some(types) = types { - // Expecting a polymorphic procedure - let monos = def.definition.procedure_monomorphs_mut(); - debug_assert!(def.is_polymorph); - debug_assert!(def.poly_vars.len() == types.len()); - debug_assert!(monos.iter().find(|v| v.poly_args == types).is_none()); - - let mono_idx = monos.len(); - monos.push(ProcedureMonomorph{ poly_args: types, expr_data: Vec::new() }); - - return mono_idx as i32; - } else { - // Expecting a non-polymorphic procedure - let monos = def.definition.procedure_monomorphs_mut(); - debug_assert!(!def.is_polymorph); - debug_assert!(def.poly_vars.is_empty()); - 
debug_assert!(monos.is_empty()); - - monos.push(ProcedureMonomorph{ poly_args: Vec::new(), expr_data: Vec::new() }); + let mono_types = def.definition.procedure_monomorphs_mut(); + debug_assert!(def.is_polymorph == (concrete_type.parts.len() != 1)); + debug_assert!(!mono_types.iter().any(|v| v.concrete_type == concrete_type)); + + let mono_idx = mono_types.len(); + mono_types.push(ProcedureMonomorph{ + concrete_type, + expr_data: Vec::new(), + }); - return 0; - } + return mono_idx as i32; } /// Adds a datatype polymorph to the type table. Will not add the /// monomorph if it is already present, or if the type's polymorphic /// variables are all unused. - pub(crate) fn add_data_monomorph(&mut self, definition_id: &DefinitionId, types: Vec) -> i32 { - let def = self.lookup.get_mut(definition_id).unwrap(); - if !def.is_polymorph { - // Not a polymorph, or polyvars are not used in type definition - return 0; + /// TODO: Fix signature + pub(crate) fn add_data_monomorph( + &mut self, modules: &[Module], heap: &Heap, arch: &TargetArch, definition_id: DefinitionId, concrete_type: ConcreteType + ) -> Result { + debug_assert_eq!(definition_id, get_concrete_type_definition(&concrete_type)); + + // Check if the monomorph already exists + let poly_type = self.lookup.get_mut(&definition_id).unwrap(); + if let Some(idx) = poly_type.get_monomorph_index(&concrete_type) { + return Ok(idx as i32); } - let monos = def.definition.data_monomorphs_mut(); - if let Some(index) = monos.iter().position(|v| v.poly_args == types) { - // We already know about this monomorph - return index as i32; - } + // Doesn't exist, so instantiate a monomorph and determine its memory + // layout. 
+ self.detect_and_resolve_type_loops_for(modules, heap, concrete_type)?; + debug_assert_eq!(self.encountered_types[0].definition_id, definition_id); + let mono_idx = self.encountered_types[0].monomorph_idx; + self.lay_out_memory_for_encountered_types(arch); - let index = monos.len(); - monos.push(DataMonomorph{ poly_args: types }); - return index as i32; + return Ok(mono_idx as i32); } - /// This function will resolve just the basic definition of the type, it - /// will not handle any of the monomorphized instances of the type. - fn resolve_base_definition<'a>(&'a mut self, modules: &[Module], ctx: &mut PassCtx, definition_id: DefinitionId) -> Result<(), ParseError> { - // Check if we have already resolved the base definition - if self.lookup.contains_key(&definition_id) { return Ok(()); } - - let root_id = ctx.heap[definition_id].defined_in(); - self.iter.reset(root_id, definition_id); + //-------------------------------------------------------------------------- + // Building base types + //-------------------------------------------------------------------------- - while let Some((root_id, definition_id)) = self.iter.top() { - // We have a type to resolve - let definition = &ctx.heap[definition_id]; + /// Builds the base type for an enum. Will not compute byte sizes + fn build_base_enum_definition(&mut self, modules: &[Module], ctx: &mut PassCtx, definition_id: DefinitionId) -> Result<(), ParseError> { + debug_assert!(!self.lookup.contains_key(&definition_id), "base enum already built"); + let definition = ctx.heap[definition_id].as_enum(); + let root_id = definition.defined_in; - let can_pop_breadcrumb = match definition { - // Bit ugly, since we already have the definition, but we need - // to work around rust borrowing rules... 
- Definition::Enum(_) => self.resolve_base_enum_definition(modules, ctx, root_id, definition_id), - Definition::Union(_) => self.resolve_base_union_definition(modules, ctx, root_id, definition_id), - Definition::Struct(_) => self.resolve_base_struct_definition(modules, ctx, root_id, definition_id), - Definition::Component(_) => self.resolve_base_component_definition(modules, ctx, root_id, definition_id), - Definition::Function(_) => self.resolve_base_function_definition(modules, ctx, root_id, definition_id), - }?; + // Determine enum variants + let mut enum_value = -1; + let mut variants = Vec::with_capacity(definition.variants.len()); - // Otherwise: `ingest_resolve_result` has pushed a new breadcrumb - // that we must follow before we can resolve the current type - if can_pop_breadcrumb { - self.iter.pop(); + for variant in &definition.variants { + if enum_value == i64::MAX { + let source = &modules[definition.defined_in.index as usize].source; + return Err(ParseError::new_error_str_at_span( + source, variant.identifier.span, + "this enum variant has an integer value that is too large" + )); } - } - // We must have resolved the type - debug_assert!(self.lookup.contains_key(&definition_id), "base type not resolved"); - Ok(()) - } + enum_value += 1; + if let EnumVariantValue::Integer(explicit_value) = variant.value { + enum_value = explicit_value; + } - /// Resolve the basic enum definition to an entry in the type table. It will - /// not instantiate any monomorphized instances of polymorphic enum - /// definitions. If a subtype has to be resolved first then this function - /// will return `false` after calling `ingest_resolve_result`. 
- fn resolve_base_enum_definition(&mut self, modules: &[Module], ctx: &mut PassCtx, root_id: RootId, definition_id: DefinitionId) -> Result { - debug_assert!(ctx.heap[definition_id].is_enum()); - debug_assert!(!self.lookup.contains_key(&definition_id), "base enum already resolved"); - - let definition = ctx.heap[definition_id].as_enum(); + variants.push(EnumVariant{ + identifier: variant.identifier.clone(), + value: enum_value, + }); + } - let mut enum_value = -1; + // Determine tag size let mut min_enum_value = 0; let mut max_enum_value = 0; - let mut variants = Vec::with_capacity(definition.variants.len()); - for variant in &definition.variants { - enum_value += 1; - match &variant.value { - EnumVariantValue::None => { - variants.push(EnumVariant{ - identifier: variant.identifier.clone(), - value: enum_value, - }); - }, - EnumVariantValue::Integer(override_value) => { - enum_value = *override_value; - variants.push(EnumVariant{ - identifier: variant.identifier.clone(), - value: enum_value, - }); - } + if !variants.is_empty() { + min_enum_value = variants[0].value; + max_enum_value = variants[0].value; + for variant in variants.iter().skip(1) { + min_enum_value = min_enum_value.min(variant.value); + max_enum_value = max_enum_value.max(variant.value); } - if enum_value < min_enum_value { min_enum_value = enum_value; } - else if enum_value > max_enum_value { max_enum_value = enum_value; } } - // Ensure enum names and polymorphic args do not conflict - self.check_identifier_collision( + let (tag_type, size_and_alignment) = Self::variant_tag_type_from_values(min_enum_value, max_enum_value); + + // Enum names and polymorphic args do not conflict + Self::check_identifier_collision( modules, root_id, &variants, |variant| &variant.identifier, "enum variant" )?; - // Because we're parsing an enum, the programmer cannot put the - // polymorphic variables inside the variants. 
But the polymorphic - // variables might still be present as "marker types" - self.check_poly_args_collision(modules, ctx, root_id, &definition.poly_vars)?; + // Polymorphic arguments cannot appear as embedded types, because + // they can only consist of integer variants. + Self::check_poly_args_collision(modules, ctx, root_id, &definition.poly_vars)?; let poly_vars = Self::create_polymorphic_variables(&definition.poly_vars); self.lookup.insert(definition_id, DefinedType { @@ -508,124 +742,109 @@ impl TypeTable { definition: DefinedTypeVariant::Enum(EnumType{ variants, monomorphs: Vec::new(), + minimum_tag_value: min_enum_value, + maximum_tag_value: max_enum_value, + tag_type, + size: size_and_alignment, + alignment: size_and_alignment }), poly_vars, is_polymorph: false, }); - Ok(true) + return Ok(()); } - /// Resolves the basic union definiton to an entry in the type table. It - /// will not instantiate any monomorphized instances of polymorphic union - /// definitions. If a subtype has to be resolved first then this function - /// will return `false` after calling `ingest_resolve_result`. - fn resolve_base_union_definition(&mut self, modules: &[Module], ctx: &mut PassCtx, root_id: RootId, definition_id: DefinitionId) -> Result { - debug_assert!(ctx.heap[definition_id].is_union()); - debug_assert!(!self.lookup.contains_key(&definition_id), "base union already resolved"); - + /// Builds the base type for a union. Will compute byte sizes. 
+ fn build_base_union_definition(&mut self, modules: &[Module], ctx: &mut PassCtx, definition_id: DefinitionId) -> Result<(), ParseError> { + debug_assert!(!self.lookup.contains_key(&definition_id), "base union already built"); let definition = ctx.heap[definition_id].as_union(); + let root_id = definition.defined_in; - // Make sure all embedded types are resolved - for variant in &definition.variants { - match &variant.value { - UnionVariantValue::None => {}, - UnionVariantValue::Embedded(embedded) => { - for parser_type in embedded { - let resolve_result = self.resolve_base_parser_type(modules, ctx, root_id, parser_type, false)?; - if !self.ingest_resolve_result(modules, ctx, resolve_result)? { - return Ok(false) - } - } - } - } - } - - // If here then all embedded types are resolved - - // Determine the union variants - let mut tag_value = -1; + // Check all variants and their embedded types let mut variants = Vec::with_capacity(definition.variants.len()); + let mut tag_counter = 0; for variant in &definition.variants { - tag_value += 1; - let embedded = match &variant.value { - UnionVariantValue::None => { Vec::new() }, - UnionVariantValue::Embedded(embedded) => { - // Type should be resolvable, we checked this above - embedded.clone() - }, - }; + for embedded in &variant.value { + Self::check_member_parser_type( + modules, ctx, root_id, embedded, false + )?; + } variants.push(UnionVariant{ identifier: variant.identifier.clone(), - embedded, - tag_value, - }) + embedded: variant.value.clone(), + tag_value: tag_counter, + }); + tag_counter += 1; } - // Ensure union names and polymorphic args do not conflict - self.check_identifier_collision( + let mut max_tag_value = 0; + if tag_counter != 0 { + max_tag_value = tag_counter - 1 + } + + let (tag_type, tag_size) = Self::variant_tag_type_from_values(0, max_tag_value); + + // Make sure there are no conflicts in identifiers + Self::check_identifier_collision( modules, root_id, &variants, |variant| &variant.identifier, 
"union variant" )?; - self.check_poly_args_collision(modules, ctx, root_id, &definition.poly_vars)?; + Self::check_poly_args_collision(modules, ctx, root_id, &definition.poly_vars)?; - // Construct polymorphic variables and mark the ones that are in use + // Construct internal representation of union let mut poly_vars = Self::create_polymorphic_variables(&definition.poly_vars); - for variant in &variants { - for parser_type in &variant.embedded { - Self::mark_used_polymorphic_variables(&mut poly_vars, parser_type); + for variant in &definition.variants { + for embedded in &variant.value { + Self::mark_used_polymorphic_variables(&mut poly_vars, embedded); } } + let is_polymorph = poly_vars.iter().any(|arg| arg.is_in_use); - // Insert base definition in type table - self.lookup.insert(definition_id, DefinedType { + self.lookup.insert(definition_id, DefinedType{ ast_root: root_id, ast_definition: definition_id, definition: DefinedTypeVariant::Union(UnionType{ variants, monomorphs: Vec::new(), + tag_type, + tag_size, }), poly_vars, - is_polymorph, + is_polymorph }); - Ok(true) + return Ok(()); } - /// Resolves the basic struct definition to an entry in the type table. It - /// will not instantiate any monomorphized instances of polymorphic struct - /// definitions. - fn resolve_base_struct_definition(&mut self, modules: &[Module], ctx: &mut PassCtx, root_id: RootId, definition_id: DefinitionId) -> Result { - debug_assert!(ctx.heap[definition_id].is_struct()); - debug_assert!(!self.lookup.contains_key(&definition_id), "base struct already resolved"); - + /// Builds base struct type. Will not compute byte sizes. 
+ fn build_base_struct_definition(&mut self, modules: &[Module], ctx: &mut PassCtx, definition_id: DefinitionId) -> Result<(), ParseError> { + debug_assert!(!self.lookup.contains_key(&definition_id), "base struct already built"); let definition = ctx.heap[definition_id].as_struct(); + let root_id = definition.defined_in; - // Make sure all fields point to resolvable types - for field_definition in &definition.fields { - let resolve_result = self.resolve_base_parser_type(modules, ctx, root_id, &field_definition.parser_type, false)?; - if !self.ingest_resolve_result(modules, ctx, resolve_result)? { - return Ok(false) - } - } - - // All fields types are resolved, construct base type + // Check all struct fields and construct internal representation let mut fields = Vec::with_capacity(definition.fields.len()); - for field_definition in &definition.fields { + + for field in &definition.fields { + Self::check_member_parser_type( + modules, ctx, root_id, &field.parser_type, false + )?; + fields.push(StructField{ - identifier: field_definition.field.clone(), - parser_type: field_definition.parser_type.clone(), - }) + identifier: field.field.clone(), + parser_type: field.parser_type.clone(), + }); } - // And make sure no conflicts exist in field names and/or polymorphic args - self.check_identifier_collision( + // Make sure there are no conflicting variables + Self::check_identifier_collision( modules, root_id, &fields, |field| &field.identifier, "struct field" )?; - self.check_poly_args_collision(modules, ctx, root_id, &definition.poly_vars)?; + Self::check_poly_args_collision(modules, ctx, root_id, &definition.poly_vars)?; - // Construct representation of polymorphic arguments + // Construct base type in table let mut poly_vars = Self::create_polymorphic_variables(&definition.poly_vars); for field in &fields { Self::mark_used_polymorphic_variables(&mut poly_vars, &field.parser_type); @@ -641,62 +860,56 @@ impl TypeTable { monomorphs: Vec::new(), }), poly_vars, - 
is_polymorph, + is_polymorph }); - Ok(true) + return Ok(()) } - /// Resolves the basic function definition to an entry in the type table. It - /// will not instantiate any monomorphized instances of polymorphic function - /// definitions. - fn resolve_base_function_definition(&mut self, modules: &[Module], ctx: &mut PassCtx, root_id: RootId, definition_id: DefinitionId) -> Result { - debug_assert!(ctx.heap[definition_id].is_function()); - debug_assert!(!self.lookup.contains_key(&definition_id), "base function already resolved"); - + /// Builds base function type. + fn build_base_function_definition(&mut self, modules: &[Module], ctx: &mut PassCtx, definition_id: DefinitionId) -> Result<(), ParseError> { + debug_assert!(!self.lookup.contains_key(&definition_id), "base function already built"); let definition = ctx.heap[definition_id].as_function(); + let root_id = definition.defined_in; - // Check the return type + // Check and construct return types and argument types. debug_assert_eq!(definition.return_types.len(), 1, "not one return type"); // TODO: @ReturnValues - let resolve_result = self.resolve_base_parser_type(modules, ctx, root_id, &definition.return_types[0], definition.builtin)?; - if !self.ingest_resolve_result(modules, ctx, resolve_result)? { - return Ok(false) - } - - // Check the argument types - for param_id in &definition.parameters { - let param = &ctx.heap[*param_id]; - let resolve_result = self.resolve_base_parser_type(modules, ctx, root_id, ¶m.parser_type, definition.builtin)?; - if !self.ingest_resolve_result(modules, ctx, resolve_result)? 
{
-            return Ok(false)
-        }
+        for return_type in &definition.return_types {
+            Self::check_member_parser_type(
+                modules, ctx, root_id, return_type, definition.builtin
+            )?;
         }
-        // Construct arguments to function
         let mut arguments = Vec::with_capacity(definition.parameters.len());
-        for param_id in &definition.parameters {
-            let param = &ctx.heap[*param_id];
+        for parameter_id in &definition.parameters {
+            let parameter = &ctx.heap[*parameter_id];
+            Self::check_member_parser_type(
+                modules, ctx, root_id, &parameter.parser_type, definition.builtin
+            )?;
+            arguments.push(FunctionArgument{
-                identifier: param.identifier.clone(),
-                parser_type: param.parser_type.clone(),
-            })
+                identifier: parameter.identifier.clone(),
+                parser_type: parameter.parser_type.clone(),
+            });
         }
-        // Check conflict of argument and polyarg identifiers
-        self.check_identifier_collision(
+        // Check conflict of identifiers
+        Self::check_identifier_collision(
             modules, root_id, &arguments, |arg| &arg.identifier, "function argument"
         )?;
-        self.check_poly_args_collision(modules, ctx, root_id, &definition.poly_vars)?;
+        Self::check_poly_args_collision(modules, ctx, root_id, &definition.poly_vars)?;
-        // Construct polymorphic arguments
+        // Construct internal representation of function type
        let mut poly_vars = Self::create_polymorphic_variables(&definition.poly_vars);
-        Self::mark_used_polymorphic_variables(&mut poly_vars, &definition.return_types[0]);
+        for return_type in &definition.return_types {
+            Self::mark_used_polymorphic_variables(&mut poly_vars, return_type);
+        }
         for argument in &arguments {
             Self::mark_used_polymorphic_variables(&mut poly_vars, &argument.parser_type);
         }
+        let is_polymorph = poly_vars.iter().any(|arg| arg.is_in_use);
-        // Construct entry in type table
         self.lookup.insert(definition_id, DefinedType{
             ast_root: root_id,
             ast_definition: definition_id,
@@ -706,188 +919,114 @@ impl TypeTable {
                 monomorphs: Vec::new(),
             }),
             poly_vars,
-            is_polymorph,
+            is_polymorph
         });
-        Ok(true)
+        return
Ok(());
     }

-    /// Resolves the basic component definition to an entry in the type table.
-    /// It will not instantiate any monomorphized instances of polymorphic
-    /// component definitions.
-    fn resolve_base_component_definition(&mut self, modules: &[Module], ctx: &mut PassCtx, root_id: RootId, definition_id: DefinitionId) -> Result<bool, ParseError> {
-        debug_assert!(ctx.heap[definition_id].is_component());
-        debug_assert!(!self.lookup.contains_key(&definition_id), "base component already resolved");
+    /// Builds base component type.
+    fn build_base_component_definition(&mut self, modules: &[Module], ctx: &mut PassCtx, definition_id: DefinitionId) -> Result<(), ParseError> {
+        debug_assert!(!self.lookup.contains_key(&definition_id), "base component already built");
-        let definition = ctx.heap[definition_id].as_component();
-        let component_variant = definition.variant;
-
-        // Check argument types
-        for param_id in &definition.parameters {
-            let param = &ctx.heap[*param_id];
-            let resolve_result = self.resolve_base_parser_type(modules, ctx, root_id, &param.parser_type, false)?;
-            if !self.ingest_resolve_result(modules, ctx, resolve_result)?
{
-            return Ok(false)
-        }
-    }
+        let definition = &ctx.heap[definition_id].as_component();
+        let root_id = definition.defined_in;
-        // Construct argument types
+        // Check the argument types
         let mut arguments = Vec::with_capacity(definition.parameters.len());
-        for param_id in &definition.parameters {
-            let param = &ctx.heap[*param_id];
+        for parameter_id in &definition.parameters {
+            let parameter = &ctx.heap[*parameter_id];
+            Self::check_member_parser_type(
+                modules, ctx, root_id, &parameter.parser_type, false
+            )?;
+            arguments.push(FunctionArgument{
-                identifier: param.identifier.clone(),
-                parser_type: param.parser_type.clone()
-            })
+                identifier: parameter.identifier.clone(),
+                parser_type: parameter.parser_type.clone(),
+            });
         }
-        // Check conflict of argument and polyarg identifiers
-        self.check_identifier_collision(
-            modules, root_id, &arguments, |arg| &arg.identifier, "component argument"
+        // Check conflict of identifiers
+        Self::check_identifier_collision(
+            modules, root_id, &arguments, |arg| &arg.identifier, "connector argument"
         )?;
-        self.check_poly_args_collision(modules, ctx, root_id, &definition.poly_vars)?;
+        Self::check_poly_args_collision(modules, ctx, root_id, &definition.poly_vars)?;
-        // Construct polymorphic arguments
+        // Construct internal representation of component
         let mut poly_vars = Self::create_polymorphic_variables(&definition.poly_vars);
         for argument in &arguments {
             Self::mark_used_polymorphic_variables(&mut poly_vars, &argument.parser_type);
         }
-        let is_polymorph = poly_vars.iter().any(|v| v.is_in_use);
+        let is_polymorph = poly_vars.iter().any(|arg| arg.is_in_use);
-        // Construct entry in type table
         self.lookup.insert(definition_id, DefinedType{
             ast_root: root_id,
             ast_definition: definition_id,
             definition: DefinedTypeVariant::Component(ComponentType{
-                variant: component_variant,
+                variant: definition.variant,
                 arguments,
-                monomorphs: Vec::new(),
+                monomorphs: Vec::new()
             }),
             poly_vars,
-            is_polymorph,
+            is_polymorph
         });
-        Ok(true)
-    }
-
-
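The cycle detection used by `ingest_resolve_result` below keeps a stack of "breadcrumbs" for the definitions currently being resolved: encountering a definition that is already on the stack means the type embeds itself by value. A minimal standalone sketch of that idea follows; the `HashMap`-based graph and all names here are illustrative assumptions, not the compiler's actual data structures:

```rust
use std::collections::HashMap;

// Hypothetical model: each type definition lists the types it embeds by value.
// Returns the members of a type loop reachable from `start`, if one exists.
fn find_type_cycle(
    embeds: &HashMap<&'static str, Vec<&'static str>>,
    start: &'static str,
) -> Option<Vec<&'static str>> {
    // Breadcrumb stack: (definition name, index of the next member to visit).
    let mut stack: Vec<(&'static str, usize)> = vec![(start, 0)];
    while let Some(&(name, child_idx)) = stack.last() {
        let children = &embeds[name];
        if child_idx == children.len() {
            // All embedded members resolved: pop the breadcrumb.
            stack.pop();
            continue;
        }
        stack.last_mut().unwrap().1 += 1;
        let child = children[child_idx];
        if let Some(pos) = stack.iter().position(|(n, _)| *n == child) {
            // Revisiting a definition that is still on the stack: type loop.
            return Some(stack[pos..].iter().map(|(n, _)| *n).collect());
        }
        stack.push((child, 0));
    }
    None
}
```

Checking membership of the breadcrumb stack (rather than a global visited set) is what distinguishes a true cycle from a type that is merely embedded twice along different branches.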
/// Takes a ResolveResult and returns `true` if the caller can happily - /// continue resolving its current type, or `false` if the caller must break - /// resolving the current type and exit to the outer resolving loop. In the - /// latter case the `result` value was `ResolveResult::Unresolved`, implying - /// that the type must be resolved first. - fn ingest_resolve_result(&mut self, modules: &[Module], ctx: &PassCtx, result: ResolveResult) -> Result { - match result { - ResolveResult::Builtin | ResolveResult::PolymoprhicArgument => Ok(true), - ResolveResult::Resolved(_, _) => Ok(true), - ResolveResult::Unresolved(root_id, definition_id) => { - if self.iter.contains(root_id, definition_id) { - // Cyclic dependency encountered - // TODO: Allow this - let module_source = &modules[root_id.index as usize].source; - let mut error = ParseError::new_error_str_at_span( - module_source, ctx.heap[definition_id].identifier().span, - "Evaluating this type definition results in a cyclic type" - ); - - for (breadcrumb_idx, (root_id, definition_id)) in self.iter.breadcrumbs.iter().enumerate() { - let msg = if breadcrumb_idx == 0 { - "The cycle started with this definition" - } else { - "Which depends on this definition" - }; - - let module_source = &modules[root_id.index as usize].source; - error = error.with_info_str_at_span(module_source, ctx.heap[*definition_id].identifier().span, msg); - } - - Err(error) - } else { - // Type is not yet resolved, so push IDs on iterator and - // continue the resolving loop - self.iter.push(root_id, definition_id); - Ok(false) - } - } - } + Ok(()) } - /// Each type may consist of embedded types. If this type does not have a - /// fixed implementation (e.g. an input port may have an embedded type - /// indicating the type of messages, but it always exists in the runtime as - /// a port identifier, so it has a fixed implementation) then this function - /// will traverse the embedded types to ensure all of them are resolved. 
- /// - /// Hence if one checks a particular parser type for being resolved, one may - /// get back a result value indicating an embedded type (with a different - /// DefinitionId) is unresolved. - fn resolve_base_parser_type( - &mut self, modules: &[Module], ctx: &PassCtx, root_id: RootId, - parser_type: &ParserType, is_builtin: bool - ) -> Result { - // Note that as we iterate over the elements of a + /// Will check if the member type (field of a struct, embedded type in a + /// union variant) is valid. + fn check_member_parser_type( + modules: &[Module], ctx: &PassCtx, base_definition_root_id: RootId, + member_parser_type: &ParserType, allow_special_compiler_types: bool + ) -> Result<(), ParseError> { use ParserTypeVariant as PTV; - // Result for the very first time we resolve a type (i.e the outer type - // that we're actually looking up) - let mut resolve_result = None; - let mut set_resolve_result = |v: ResolveResult| { - if resolve_result.is_none() { resolve_result = Some(v); } - }; - - for element in parser_type.elements.iter() { + for element in &member_parser_type.elements { match element.variant { + // Special cases PTV::Void | PTV::InputOrOutput | PTV::ArrayLike | PTV::IntegerLike => { - if is_builtin { - set_resolve_result(ResolveResult::Builtin); - } else { - unreachable!("compiler-only ParserTypeVariant within type definition"); + if !allow_special_compiler_types { + unreachable!("compiler-only ParserTypeVariant in member type"); } }, + // Builtin types, always valid PTV::Message | PTV::Bool | PTV::UInt8 | PTV::UInt16 | PTV::UInt32 | PTV::UInt64 | PTV::SInt8 | PTV::SInt16 | PTV::SInt32 | PTV::SInt64 | PTV::Character | PTV::String | - PTV::Array | PTV::Input | PTV::Output => { - // Nothing to do: these are builtin types or types with a - // fixed implementation - set_resolve_result(ResolveResult::Builtin); - }, + PTV::Array | PTV::Input | PTV::Output | + // Likewise, polymorphic variables are always valid + PTV::PolymorphicArgument(_, _) => {}, + // 
Types that are not constructable, or types that are not + // allowed (and checked earlier) PTV::IntegerLiteral | PTV::Inferred => { - // As we're parsing the type definitions where these kinds - // of types are impossible/disallowed to express: unreachable!("illegal ParserTypeVariant within type definition"); }, - PTV::PolymorphicArgument(_, _) => { - set_resolve_result(ResolveResult::PolymoprhicArgument); - }, - PTV::Definition(embedded_id, _) => { - let definition = &ctx.heap[embedded_id]; + // Finally, user-defined types + PTV::Definition(definition_id, _) => { + let definition = &ctx.heap[definition_id]; if !(definition.is_struct() || definition.is_enum() || definition.is_union()) { - let module_source = &modules[root_id.index as usize].source; + let source = &modules[base_definition_root_id.index as usize].source; return Err(ParseError::new_error_str_at_span( - module_source, element.element_span, "expected a datatype (struct, enum or union)" - )) + source, element.element_span, "expected a datatype (a struct, enum or union)" + )); } - if self.lookup.contains_key(&embedded_id) { - set_resolve_result(ResolveResult::Resolved(definition.defined_in(), embedded_id)) - } else { - return Ok(ResolveResult::Unresolved(definition.defined_in(), embedded_id)) - } + // Otherwise, we're fine } } } - // If here then all types in the embedded type's tree were resolved. 
-        debug_assert!(resolve_result.is_some(), "faulty logic in ParserType resolver");
-        return Ok(resolve_result.unwrap())
+        // If here, then all elements check out
+        return Ok(());
     }

     /// Go through a list of identifiers and ensure that all identifiers have
     /// unique names
     fn check_identifier_collision<T, F: Fn(&T) -> &Identifier>(
-        &self, modules: &[Module], root_id: RootId, items: &[T], getter: F, item_name: &'static str
+        modules: &[Module], root_id: RootId, items: &[T], getter: F, item_name: &'static str
     ) -> Result<(), ParseError> {
         for (item_idx, item) in items.iter().enumerate() {
             let item_ident = getter(item);
@@ -911,7 +1050,7 @@ impl TypeTable {
     /// arguments all have unique names, and the arguments do not conflict with
     /// any symbols defined at the module scope.
     fn check_poly_args_collision(
-        &self, modules: &[Module], ctx: &PassCtx, root_id: RootId, poly_args: &[Identifier]
+        modules: &[Module], ctx: &PassCtx, root_id: RootId, poly_args: &[Identifier]
     ) -> Result<(), ParseError> {
         // Make sure polymorphic arguments are unique and none of the
         // identifiers conflict with any imported scopes
@@ -949,6 +1088,787 @@ impl TypeTable {
         Ok(())
     }

+    //--------------------------------------------------------------------------
+    // Detecting type loops
+    //--------------------------------------------------------------------------
+
+    /// Internal function that will detect type loops and check if they're
+    /// resolvable. If so then the appropriate union variants will be marked as
+    /// "living on heap".
If not then a `ParseError` will be returned + fn detect_and_resolve_type_loops_for(&mut self, modules: &[Module], heap: &Heap, concrete_type: ConcreteType) -> Result<(), ParseError> { + use DefinedTypeVariant as DTV; + + debug_assert!(self.type_loop_breadcrumbs.is_empty()); + debug_assert!(self.type_loops.is_empty()); + debug_assert!(self.encountered_types.is_empty()); + + // Push the initial breadcrumb + let initial_breadcrumb = self.check_member_for_type_loops(&concrete_type); + if let TypeLoopResult::PushBreadcrumb(definition_id, concrete_type) = initial_breadcrumb { + self.handle_new_breadcrumb_for_type_loops(definition_id, concrete_type); + } else { + unreachable!(); + } + + // Enter into the main resolving loop + while !self.type_loop_breadcrumbs.is_empty() { + // Because we might be modifying the breadcrumb array we need to + let breadcrumb_idx = self.type_loop_breadcrumbs.len() - 1; + let mut breadcrumb = self.type_loop_breadcrumbs[breadcrumb_idx].clone(); + + let poly_type = self.lookup.get(&breadcrumb.definition_id).unwrap(); + let poly_type_definition_id = poly_type.ast_definition; + + let resolve_result = match &poly_type.definition { + DTV::Enum(_) => { + TypeLoopResult::TypeExists + }, + DTV::Union(definition) => { + let monomorph = &definition.monomorphs[breadcrumb.monomorph_idx]; + let num_variants = monomorph.variants.len(); + + let mut union_result = TypeLoopResult::TypeExists; + + 'member_loop: while breadcrumb.next_member < num_variants { + let mono_variant = &monomorph.variants[breadcrumb.next_member]; + let num_embedded = mono_variant.embedded.len(); + + while breadcrumb.next_embedded < num_embedded { + let mono_embedded = &mono_variant.embedded[breadcrumb.next_embedded]; + union_result = self.check_member_for_type_loops(&mono_embedded.concrete_type); + + if union_result != TypeLoopResult::TypeExists { + // In type loop or new breadcrumb pushed, so + // break out of the resolving loop + break 'member_loop; + } + + breadcrumb.next_embedded += 
1; + } + + breadcrumb.next_embedded = 0; + breadcrumb.next_member += 1 + } + + union_result + }, + DTV::Struct(definition) => { + let monomorph = &definition.monomorphs[breadcrumb.monomorph_idx]; + let num_fields = monomorph.fields.len(); + + let mut struct_result = TypeLoopResult::TypeExists; + while breadcrumb.next_member < num_fields { + let mono_field = &monomorph.fields[breadcrumb.next_member]; + struct_result = self.check_member_for_type_loops(&mono_field.concrete_type); + + if struct_result != TypeLoopResult::TypeExists { + // Type loop or breadcrumb pushed, so break out of + // the resolving loop + break; + } + + breadcrumb.next_member += 1; + } + + struct_result + }, + DTV::Function(_) | DTV::Component(_) => unreachable!(), + }; + + // Handle the result of attempting to resolve the current breadcrumb + match resolve_result { + TypeLoopResult::TypeExists => { + // We finished parsing the type + self.type_loop_breadcrumbs.pop(); + }, + TypeLoopResult::PushBreadcrumb(definition_id, concrete_type) => { + // We recurse into the member type. + self.type_loop_breadcrumbs[breadcrumb_idx] = breadcrumb; + self.handle_new_breadcrumb_for_type_loops(definition_id, concrete_type); + }, + TypeLoopResult::TypeLoop(first_idx) => { + // Because we will be modifying breadcrumbs within the + // type-loop handling code, put back the modified breadcrumb + self.type_loop_breadcrumbs[breadcrumb_idx] = breadcrumb; + + // We're in a type loop. 
Add the type loop + let mut loop_members = Vec::with_capacity(self.type_loop_breadcrumbs.len() - first_idx); + let mut contains_union = false; + + for breadcrumb_idx in first_idx..self.type_loop_breadcrumbs.len() { + let breadcrumb = &mut self.type_loop_breadcrumbs[breadcrumb_idx]; + let mut is_union = false; + + let entry = self.lookup.get_mut(&breadcrumb.definition_id).unwrap(); + match &mut entry.definition { + DTV::Union(definition) => { + // Mark the currently processed variant as requiring heap + // allocation, then advance the *embedded* type. The loop above + // will then take care of advancing it to the next *member*. + let monomorph = &mut definition.monomorphs[breadcrumb.monomorph_idx]; + let variant = &mut monomorph.variants[breadcrumb.next_member]; + variant.lives_on_heap = true; + breadcrumb.next_embedded += 1; + is_union = true; + contains_union = true; + }, + _ => {}, // else: we don't care for now + } + + loop_members.push(TypeLoopEntry{ + definition_id: breadcrumb.definition_id, + monomorph_idx: breadcrumb.monomorph_idx, + is_union + }); + } + + let new_type_loop = TypeLoop{ members: loop_members }; + if !contains_union { + // No way to (potentially) break the union. So return a + // type loop error. This is because otherwise our + // breadcrumb resolver ends up in an infinite loop. + return Err(construct_type_loop_error( + self, &new_type_loop, modules, heap + )); + } + + self.type_loops.push(new_type_loop); + } + } + } + + // All breadcrumbs have been cleared. So now `type_loops` contains all + // of the encountered type loops, and `encountered_types` contains a + // list of all unique monomorphs we encountered. + + // The next step is to figure out if all of the type loops can be + // broken. A type loop can be broken if at least one union exists in the + // loop and that union ended up having variants that are not part of + // a type loop. 
+ fn type_loop_source_span_and_message<'a>( + modules: &'a [Module], heap: &Heap, defined_type: &DefinedType, monomorph_idx: usize, index_in_loop: usize + ) -> (&'a InputSource, InputSpan, String) { + // Note: because we will discover the type loop the *first* time we + // instantiate a monomorph with the provided polymorphic arguments + // (not all arguments are actually used in the type). We don't have + // to care about a second instantiation where certain unused + // polymorphic arguments are different. + let monomorph_type = match &defined_type.definition { + DTV::Union(definition) => &definition.monomorphs[monomorph_idx].concrete_type, + DTV::Struct(definition) => &definition.monomorphs[monomorph_idx].concrete_type, + DTV::Enum(_) | DTV::Function(_) | DTV::Component(_) => + unreachable!(), // impossible to have an enum/procedure in a type loop + }; + + let type_name = monomorph_type.display_name(&heap); + let message = if index_in_loop == 0 { + format!( + "encountered an infinitely large type for '{}' (which can be fixed by \ + introducing a union type that has a variant whose embedded types are \ + not part of a type loop, or do not have embedded types)", + type_name + ) + } else if index_in_loop == 1 { + format!("because it depends on the type '{}'", type_name) + } else { + format!("which depends on the type '{}'", type_name) + }; + + let ast_definition = &heap[defined_type.ast_definition]; + let ast_root_id = ast_definition.defined_in(); + + return ( + &modules[ast_root_id.index as usize].source, + ast_definition.identifier().span, + message + ); + } + + fn construct_type_loop_error(table: &TypeTable, type_loop: &TypeLoop, modules: &[Module], heap: &Heap) -> ParseError { + let first_entry = &type_loop.members[0]; + let first_type = table.lookup.get(&first_entry.definition_id).unwrap(); + let (first_module, first_span, first_message) = type_loop_source_span_and_message( + modules, heap, first_type, first_entry.monomorph_idx, 0 + ); + let mut parse_error = 
ParseError::new_error_at_span(first_module, first_span, first_message);
+
+        for member_idx in 1..type_loop.members.len() {
+            let entry = &type_loop.members[member_idx];
+            let entry_type = table.lookup.get(&entry.definition_id).unwrap();
+            let (module, span, message) = type_loop_source_span_and_message(
+                modules, heap, entry_type, entry.monomorph_idx, member_idx
+            );
+            parse_error = parse_error.with_info_at_span(module, span, message);
+        }
+
+        parse_error
+    }
+
+        for type_loop in &self.type_loops {
+            let mut can_be_broken = false;
+            debug_assert!(!type_loop.members.is_empty());
+
+            for entry in &type_loop.members {
+                if entry.is_union {
+                    let base_type = self.lookup.get(&entry.definition_id).unwrap();
+                    let monomorph = &base_type.definition.as_union().monomorphs[entry.monomorph_idx];
+
+                    debug_assert!(!monomorph.variants.is_empty()); // otherwise it couldn't be part of the type loop
+                    let has_stack_variant = monomorph.variants.iter().any(|variant| !variant.lives_on_heap);
+                    if has_stack_variant {
+                        can_be_broken = true;
+                    }
+                }
+            }
+
+            if !can_be_broken {
+                // Construct a type loop error
+                return Err(construct_type_loop_error(self, type_loop, modules, heap));
+            }
+        }
+
+        // If here, then all type loops have been resolved and we can lay out
+        // all of the members
+        self.type_loops.clear();
+
+        return Ok(());
+    }
+
+    /// Checks if the specified type needs to be resolved (i.e. we need to push
+    /// a breadcrumb), is already resolved (i.e. we can continue with the next
+    /// member of the currently considered type) or is in the process of being
+    /// resolved (i.e. we're in a type loop). Because of borrowing rules we
+    /// don't do any modifications of internal types here. Hence: if we
+    /// return `PushBreadcrumb` then call `handle_new_breadcrumb_for_type_loops`
+    /// to take care of storing the appropriate types.
+ fn check_member_for_type_loops(&self, definition_type: &ConcreteType) -> TypeLoopResult { + use ConcreteTypePart as CTP; + + // We're only interested in user-defined types, so exit if it is a + // builtin of some sort. + debug_assert!(!definition_type.parts.is_empty()); + let definition_id = match &definition_type.parts[0] { + CTP::Instance(definition_id, _) | + CTP::Function(definition_id, _) | + CTP::Component(definition_id, _) => { + *definition_id + }, + _ => { + return TypeLoopResult::TypeExists + }, + }; + + let base_type = self.lookup.get(&definition_id).unwrap(); + if let Some(mono_idx) = base_type.get_monomorph_index(&definition_type) { + // Monomorph is already known. Check if it is present in the + // breadcrumbs. If so, then we are in a type loop + for (breadcrumb_idx, breadcrumb) in self.type_loop_breadcrumbs.iter().enumerate() { + if breadcrumb.definition_id == definition_id && breadcrumb.monomorph_idx == mono_idx { + return TypeLoopResult::TypeLoop(breadcrumb_idx); + } + } + + return TypeLoopResult::TypeExists; + } + + // Type is not yet known, so we need to insert it into the lookup and + // push a new breadcrumb. + return TypeLoopResult::PushBreadcrumb(definition_id, definition_type.clone()); + } + + /// Handles the `PushBreadcrumb` result for a `check_member_for_type_loops` + /// call. 
+ fn handle_new_breadcrumb_for_type_loops(&mut self, definition_id: DefinitionId, definition_type: ConcreteType) { + use DefinedTypeVariant as DTV; + + let base_type = self.lookup.get_mut(&definition_id).unwrap(); + let mut is_union = false; + let monomorph_idx = match &mut base_type.definition { + DTV::Enum(definition) => { + debug_assert!(definition.monomorphs.is_empty()); + definition.monomorphs.push(EnumMonomorph{ + concrete_type: definition_type, + }); + 0 + }, + DTV::Union(definition) => { + // Create all the variants with their concrete types + let mut mono_variants = Vec::with_capacity(definition.variants.len()); + for poly_variant in &definition.variants { + let mut mono_embedded = Vec::with_capacity(poly_variant.embedded.len()); + for poly_embedded in &poly_variant.embedded { + let mono_concrete = Self::construct_concrete_type(poly_embedded, &definition_type); + mono_embedded.push(UnionMonomorphEmbedded{ + concrete_type: mono_concrete, + size: 0, + alignment: 0, + offset: 0 + }); + } + + mono_variants.push(UnionMonomorphVariant{ + lives_on_heap: false, + embedded: mono_embedded, + }) + } + + let mono_idx = definition.monomorphs.len(); + definition.monomorphs.push(UnionMonomorph{ + concrete_type: definition_type, + variants: mono_variants, + stack_size: 0, + stack_alignment: 0, + heap_size: 0, + heap_alignment: 0 + }); + + is_union = true; + mono_idx + }, + DTV::Struct(definition) => { + let mut mono_fields = Vec::with_capacity(definition.fields.len()); + for poly_field in &definition.fields { + let mono_concrete = Self::construct_concrete_type(&poly_field.parser_type, &definition_type); + mono_fields.push(StructMonomorphField{ + concrete_type: mono_concrete, + size: 0, + alignment: 0, + offset: 0 + }) + } + + let mono_idx = definition.monomorphs.len(); + definition.monomorphs.push(StructMonomorph{ + concrete_type: definition_type, + fields: mono_fields, + size: 0, + alignment: 0 + }); + + mono_idx + }, + DTV::Function(_) | DTV::Component(_) => { + 
unreachable!("pushing type resolving breadcrumb for procedure type")
+ },
+ };
+
+ self.encountered_types.push(TypeLoopEntry{
+ definition_id,
+ monomorph_idx,
+ is_union,
+ });
+
+ self.type_loop_breadcrumbs.push(TypeLoopBreadcrumb{
+ definition_id,
+ monomorph_idx,
+ next_member: 0,
+ next_embedded: 0,
+ });
+ }
+
+ /// Constructs a concrete type out of a parser type for a struct field or
+ /// union embedded type. It will do this by looking up the polymorphic
+ /// variables in the supplied concrete type. The assumption is that the
+ /// polymorphic variable's indices correspond to the subtrees in the
+ /// concrete type.
+ fn construct_concrete_type(member_type: &ParserType, container_type: &ConcreteType) -> ConcreteType {
+ use ParserTypeVariant as PTV;
+ use ConcreteTypePart as CTP;
+
+ // TODO: Combine with code in pass_typing.rs
+ fn parser_to_concrete_part(part: &ParserTypeVariant) -> Option<ConcreteTypePart> {
+ match part {
+ PTV::Void => Some(CTP::Void),
+ PTV::Message => Some(CTP::Message),
+ PTV::Bool => Some(CTP::Bool),
+ PTV::UInt8 => Some(CTP::UInt8),
+ PTV::UInt16 => Some(CTP::UInt16),
+ PTV::UInt32 => Some(CTP::UInt32),
+ PTV::UInt64 => Some(CTP::UInt64),
+ PTV::SInt8 => Some(CTP::SInt8),
+ PTV::SInt16 => Some(CTP::SInt16),
+ PTV::SInt32 => Some(CTP::SInt32),
+ PTV::SInt64 => Some(CTP::SInt64),
+ PTV::Character => Some(CTP::Character),
+ PTV::String => Some(CTP::String),
+ PTV::Array => Some(CTP::Array),
+ PTV::Input => Some(CTP::Input),
+ PTV::Output => Some(CTP::Output),
+ PTV::Definition(definition_id, num) => Some(CTP::Instance(*definition_id, *num)),
+ _ => None
+ }
+ }
+
+ let mut parts = Vec::with_capacity(member_type.elements.len()); // usually a correct estimation, might not be
+ for member_part in &member_type.elements {
+ // Check if we have a regular builtin type
+ if let Some(part) = parser_to_concrete_part(&member_part.variant) {
+ parts.push(part);
+ continue;
+ }
+
+ // Not builtin, but if all code is working correctly, we only care
+ // about the
polymorphic argument at this point. + if let PTV::PolymorphicArgument(_container_definition_id, poly_arg_idx) = member_part.variant { + debug_assert_eq!(_container_definition_id, get_concrete_type_definition(container_type)); + + let mut container_iter = container_type.embedded_iter(0); + for _ in 0..poly_arg_idx { + container_iter.next(); + } + + let poly_section = container_iter.next().unwrap(); + parts.extend(poly_section); + + continue; + } + + unreachable!("unexpected type part {:?} from {:?}", member_part, member_type); + } + + return ConcreteType{ parts }; + } + + //-------------------------------------------------------------------------- + // Determining memory layout for types + //-------------------------------------------------------------------------- + + fn lay_out_memory_for_encountered_types(&mut self, arch: &TargetArch) { + use DefinedTypeVariant as DTV; + + // Just finished type loop detection, so we're left with the encountered + // types only + debug_assert!(self.type_loops.is_empty()); + debug_assert!(!self.encountered_types.is_empty()); + debug_assert!(self.memory_layout_breadcrumbs.is_empty()); + debug_assert!(self.size_alignment_stack.is_empty()); + + // Push the first entry (the type we originally started with when we + // were detecting type loops) + let first_entry = &self.encountered_types[0]; + self.memory_layout_breadcrumbs.push(MemoryBreadcrumb{ + definition_id: first_entry.definition_id, + monomorph_idx: first_entry.monomorph_idx, + next_member: 0, + next_embedded: 0, + first_size_alignment_idx: 0, + }); + + // Enter the main resolving loop + 'breadcrumb_loop: while !self.memory_layout_breadcrumbs.is_empty() { + let cur_breadcrumb_idx = self.memory_layout_breadcrumbs.len() - 1; + let mut breadcrumb = self.memory_layout_breadcrumbs[cur_breadcrumb_idx].clone(); + + let poly_type = self.lookup.get(&breadcrumb.definition_id).unwrap(); + match &poly_type.definition { + DTV::Enum(definition) => { + // Size should already be computed + 
debug_assert!(definition.size != 0 && definition.alignment != 0); + }, + DTV::Union(definition) => { + // Retrieve size/alignment of each embedded type. We do not + // compute the offsets or total type sizes yet. + let mono_type = &definition.monomorphs[breadcrumb.monomorph_idx]; + let num_variants = mono_type.variants.len(); + while breadcrumb.next_member < num_variants { + let mono_variant = &mono_type.variants[breadcrumb.next_member]; + + if mono_variant.lives_on_heap { + // To prevent type loops we made this a heap- + // allocated variant. This implies we cannot + // compute sizes of members at this point. + } else { + let num_embedded = mono_variant.embedded.len(); + while breadcrumb.next_embedded < num_embedded { + let mono_embedded = &mono_variant.embedded[breadcrumb.next_embedded]; + match self.get_memory_layout_or_breadcrumb(arch, &mono_embedded.concrete_type) { + MemoryLayoutResult::TypeExists(size, alignment) => { + self.size_alignment_stack.push((size, alignment)); + }, + MemoryLayoutResult::PushBreadcrumb(new_breadcrumb) => { + self.memory_layout_breadcrumbs[cur_breadcrumb_idx] = breadcrumb; + self.memory_layout_breadcrumbs.push(new_breadcrumb); + continue 'breadcrumb_loop; + } + } + + breadcrumb.next_embedded += 1; + } + } + + breadcrumb.next_member += 1; + breadcrumb.next_embedded = 0; + } + + // If here then we can at least compute the stack size of + // the type, we'll have to come back at the very end to + // fill in the heap size/alignment/offset of each heap- + // allocated variant. 
+ let mut max_size = definition.tag_size; + let mut max_alignment = definition.tag_size; + + let poly_type = self.lookup.get_mut(&breadcrumb.definition_id).unwrap(); + let definition = poly_type.definition.as_union_mut(); + let mono_type = &mut definition.monomorphs[breadcrumb.monomorph_idx]; + let mut size_alignment_idx = breadcrumb.first_size_alignment_idx; + + for variant in &mut mono_type.variants { + // We're doing stack computations, so always start with + // the tag size/alignment. + let mut variant_offset = definition.tag_size; + let mut variant_alignment = definition.tag_size; + + if variant.lives_on_heap { + // Variant lives on heap, so just a pointer + let (ptr_size, ptr_align) = arch.pointer_size_alignment; + align_offset_to(&mut variant_offset, ptr_align); + + variant_offset += ptr_size; + variant_alignment = variant_alignment.max(ptr_align); + } else { + // Variant lives on stack, so walk all embedded + // types. + for embedded in &mut variant.embedded { + let (size, alignment) = self.size_alignment_stack[size_alignment_idx]; + embedded.size = size; + embedded.alignment = alignment; + size_alignment_idx += 1; + + align_offset_to(&mut variant_offset, alignment); + embedded.offset = variant_offset; + + variant_offset += size; + variant_alignment = variant_alignment.max(alignment); + } + }; + + max_size = max_size.max(variant_offset); + max_alignment = max_alignment.max(variant_alignment); + } + + mono_type.stack_size = max_size; + mono_type.stack_alignment = max_alignment; + self.size_alignment_stack.truncate(breadcrumb.first_size_alignment_idx); + }, + DTV::Struct(definition) => { + // Retrieve size and alignment of each struct member. 
We'll
+ // compute the offsets once all of those are known
+ let mono_type = &definition.monomorphs[breadcrumb.monomorph_idx];
+ let num_fields = mono_type.fields.len();
+ while breadcrumb.next_member < num_fields {
+ let mono_field = &mono_type.fields[breadcrumb.next_member];
+
+ match self.get_memory_layout_or_breadcrumb(arch, &mono_field.concrete_type) {
+ MemoryLayoutResult::TypeExists(size, alignment) => {
+ self.size_alignment_stack.push((size, alignment))
+ },
+ MemoryLayoutResult::PushBreadcrumb(new_breadcrumb) => {
+ self.memory_layout_breadcrumbs[cur_breadcrumb_idx] = breadcrumb;
+ self.memory_layout_breadcrumbs.push(new_breadcrumb);
+ continue 'breadcrumb_loop;
+ },
+ }
+
+ breadcrumb.next_member += 1;
+ }
+
+ // Compute offsets and size of total type
+ let mut cur_offset = 0;
+ let mut max_alignment = 1;
+
+ let poly_type = self.lookup.get_mut(&breadcrumb.definition_id).unwrap();
+ let definition = poly_type.definition.as_struct_mut();
+ let mono_type = &mut definition.monomorphs[breadcrumb.monomorph_idx];
+ let mut size_alignment_idx = breadcrumb.first_size_alignment_idx;
+
+ for field in &mut mono_type.fields {
+ let (size, alignment) = self.size_alignment_stack[size_alignment_idx];
+ field.size = size;
+ field.alignment = alignment;
+ size_alignment_idx += 1;
+
+ align_offset_to(&mut cur_offset, alignment);
+ field.offset = cur_offset;
+
+ cur_offset += size;
+ max_alignment = max_alignment.max(alignment);
+ }
+
+ mono_type.size = cur_offset;
+ mono_type.alignment = max_alignment;
+ self.size_alignment_stack.truncate(breadcrumb.first_size_alignment_idx);
+ },
+ DTV::Function(_) | DTV::Component(_) => {
+ unreachable!();
+ }
+ }
+
+ // If here, then we completely laid out the current type. So move
+ // to the next breadcrumb
+ self.memory_layout_breadcrumbs.pop();
+ }
+
+ debug_assert!(self.size_alignment_stack.is_empty());
+
+ // If here then all types have been laid out. 
What remains is to
+ // compute the sizes/alignment/offsets of the heap variants of the
+ // unions we have encountered.
+ for entry in &self.encountered_types {
+ if !entry.is_union {
+ continue;
+ }
+
+ // First pass, use buffer to store size/alignment to prevent
+ // borrowing issues.
+ let poly_type = self.lookup.get(&entry.definition_id).unwrap();
+ let definition = poly_type.definition.as_union();
+ let mono_type = &definition.monomorphs[entry.monomorph_idx];
+
+ for variant in &mono_type.variants {
+ if !variant.lives_on_heap {
+ continue;
+ }
+
+ debug_assert!(!variant.embedded.is_empty());
+
+ for embedded in &variant.embedded {
+ match self.get_memory_layout_or_breadcrumb(arch, &embedded.concrete_type) {
+ MemoryLayoutResult::TypeExists(size, alignment) => {
+ self.size_alignment_stack.push((size, alignment));
+ },
+ _ => unreachable!(),
+ }
+ }
+ }
+
+ // Second pass, apply the size/alignment values in our buffer
+ let poly_type = self.lookup.get_mut(&entry.definition_id).unwrap();
+ let definition = poly_type.definition.as_union_mut();
+ let mono_type = &mut definition.monomorphs[entry.monomorph_idx];
+
+ let mut max_size = 0;
+ let mut max_alignment = 1;
+ let mut size_alignment_idx = 0;
+
+ for variant in &mut mono_type.variants {
+ if !variant.lives_on_heap {
+ continue;
+ }
+
+ let mut variant_offset = 0;
+ let mut variant_alignment = 1;
+
+ for embedded in &mut variant.embedded {
+ let (size, alignment) = self.size_alignment_stack[size_alignment_idx];
+ embedded.size = size;
+ embedded.alignment = alignment;
+ size_alignment_idx += 1;
+
+ align_offset_to(&mut variant_offset, alignment);
+ embedded.offset = variant_offset;
+
+ variant_offset += size;
+ variant_alignment = variant_alignment.max(alignment);
+ }
+
+ max_size = max_size.max(variant_offset);
+ max_alignment = max_alignment.max(variant_alignment);
+ }
+
+ if max_size != 0 {
+ // At least one entry lives on the heap
+ mono_type.heap_size = max_size;
+ mono_type.heap_alignment =
max_alignment;
+ }
+
+ // All buffered size/alignment pairs for this entry have been
+ // consumed, so reset the buffer for the next encountered union.
+ self.size_alignment_stack.clear();
+ }
+
+ // And now, we're actually, properly, done
+ self.encountered_types.clear();
+ }
+
+ fn get_memory_layout_or_breadcrumb(&self, arch: &TargetArch, concrete_type: &ConcreteType) -> MemoryLayoutResult {
+ use ConcreteTypePart as CTP;
+
+ // Before we do any fancy shenanigans, we need to check if the concrete
+ // type actually requires laying out memory.
+ debug_assert!(!concrete_type.parts.is_empty());
+ let (builtin_size, builtin_alignment) = match concrete_type.parts[0] {
+ CTP::Void => (0, 1),
+ CTP::Message => arch.array_size_alignment,
+ CTP::Bool => (1, 1),
+ CTP::UInt8 => (1, 1),
+ CTP::UInt16 => (2, 2),
+ CTP::UInt32 => (4, 4),
+ CTP::UInt64 => (8, 8),
+ CTP::SInt8 => (1, 1),
+ CTP::SInt16 => (2, 2),
+ CTP::SInt32 => (4, 4),
+ CTP::SInt64 => (8, 8),
+ CTP::Character => (4, 4),
+ CTP::String => arch.string_size_alignment,
+ CTP::Array => arch.array_size_alignment,
+ CTP::Slice => arch.array_size_alignment,
+ CTP::Input => arch.port_size_alignment,
+ CTP::Output => arch.port_size_alignment,
+ CTP::Instance(definition_id, _) => {
+ // Special case where we explicitly return to simplify the
+ // return case for the builtins.
+ let entry = self.lookup.get(&definition_id).unwrap(); + let monomorph_idx = entry.get_monomorph_index(concrete_type).unwrap(); + + if let Some((size, alignment)) = entry.get_monomorph_size_alignment(monomorph_idx) { + // Type has been layed out in memory + return MemoryLayoutResult::TypeExists(size, alignment); + } else { + return MemoryLayoutResult::PushBreadcrumb(MemoryBreadcrumb{ + definition_id, + monomorph_idx, + next_member: 0, + next_embedded: 0, + first_size_alignment_idx: self.size_alignment_stack.len(), + }); + } + }, + CTP::Function(_, _) | CTP::Component(_, _) => { + todo!("storage for 'function pointers'"); + } + }; + + return MemoryLayoutResult::TypeExists(builtin_size, builtin_alignment); + } + + /// Returns tag concrete type (always a builtin integer type), the size of + /// that type in bytes (and implicitly, its alignment) + fn variant_tag_type_from_values(min_val: i64, max_val: i64) -> (ConcreteType, usize) { + debug_assert!(min_val <= max_val); + + let (part, size) = if min_val >= 0 { + // Can be an unsigned integer + if max_val <= (u8::MAX as i64) { + (ConcreteTypePart::UInt8, 1) + } else if max_val <= (u16::MAX as i64) { + (ConcreteTypePart::UInt16, 2) + } else if max_val <= (u32::MAX as i64) { + (ConcreteTypePart::UInt32, 4) + } else { + (ConcreteTypePart::UInt64, 8) + } + } else { + // Must be a signed integer + if min_val >= (i8::MIN as i64) && max_val <= (i8::MAX as i64) { + (ConcreteTypePart::SInt8, 1) + } else if min_val >= (i16::MIN as i64) && max_val <= (i16::MAX as i64) { + (ConcreteTypePart::SInt16, 2) + } else if min_val >= (i32::MIN as i64) && max_val <= (i32::MAX as i64) { + (ConcreteTypePart::SInt32, 4) + } else { + (ConcreteTypePart::SInt64, 8) + } + }; + + return (ConcreteType{ parts: vec![part] }, size); + } + //-------------------------------------------------------------------------- // Small utilities //-------------------------------------------------------------------------- @@ -963,10 +1883,26 @@ impl TypeTable { } fn 
mark_used_polymorphic_variables(poly_vars: &mut Vec, parser_type: &ParserType) { - for element in & parser_type.elements { + for element in &parser_type.elements { if let ParserTypeVariant::PolymorphicArgument(_, idx) = &element.variant { poly_vars[*idx as usize].is_in_use = true; } } } +} + +#[inline] fn align_offset_to(offset: &mut usize, alignment: usize) { + debug_assert!(alignment > 0); + let alignment_min_1 = alignment - 1; + *offset += alignment_min_1; + *offset &= !(alignment_min_1); +} + +#[inline] fn get_concrete_type_definition(concrete: &ConcreteType) -> DefinitionId { + if let ConcreteTypePart::Instance(definition_id, _) = concrete.parts[0] { + return definition_id; + } else { + debug_assert!(false, "passed {:?} to the type table", concrete); + return DefinitionId::new_invalid() + } } \ No newline at end of file diff --git a/src/protocol/parser/visitor.rs b/src/protocol/parser/visitor.rs index 7fe2688704ed60a76c8c5e4ec21ac200cc3d2e37..c30b5129c9d77c333103e4eb3f96768645067890 100644 --- a/src/protocol/parser/visitor.rs +++ b/src/protocol/parser/visitor.rs @@ -14,11 +14,25 @@ pub(crate) const STMT_BUFFER_INIT_CAPACITY: usize = 256; pub(crate) const EXPR_BUFFER_INIT_CAPACITY: usize = 256; /// General context structure that is used while traversing the AST. +/// TODO: Revise, visitor abstraction is starting to get in the way of programming pub(crate) struct Ctx<'p> { pub heap: &'p mut Heap, - pub module: &'p mut Module, + pub modules: &'p mut [Module], + pub module_idx: usize, // currently considered module pub symbols: &'p mut SymbolTable, pub types: &'p mut TypeTable, + pub arch: &'p crate::protocol::TargetArch, +} + +impl<'p> Ctx<'p> { + /// Returns module `modules[module_idx]` + pub(crate) fn module(&self) -> &Module { + &self.modules[self.module_idx] + } + + pub(crate) fn module_mut(&mut self) -> &mut Module { + &mut self.modules[self.module_idx] + } } /// Visitor is a generic trait that will fully walk the AST. 
The default
@@ -29,9 +43,10 @@ pub(crate) trait Visitor { // Entry point fn visit_module(&mut self, ctx: &mut Ctx) -> VisitorResult { let mut def_index = 0;
+ let module_root_id = ctx.modules[ctx.module_idx].root_id;
 loop {
 let definition_id = {
- let root = &ctx.heap[ctx.module.root_id];
+ let root = &ctx.heap[module_root_id];
 if def_index >= root.definitions.len() {
 return Ok(())
 }
diff --git a/src/protocol/tests/eval_silly.rs b/src/protocol/tests/eval_silly.rs
index 62cb0ca3ae138c85373515dfe4c54f0428c30b3d..db7203cd559ffd171bf75c785788e4da7ecc27e4 100644
--- a/src/protocol/tests/eval_silly.rs
+++ b/src/protocol/tests/eval_silly.rs
@@ -135,14 +135,14 @@ fn test_slicing_magic() { f.call_ok(Some(Value::Bool(true))); }).for_struct("Holder", |s| { s .assert_num_monomorphs(8)
- .assert_has_monomorph("u8")
- .assert_has_monomorph("u16")
- .assert_has_monomorph("u32")
- .assert_has_monomorph("u64")
- .assert_has_monomorph("s8")
- .assert_has_monomorph("s16")
- .assert_has_monomorph("s32")
- .assert_has_monomorph("s64");
+ .assert_has_monomorph("Holder<u8>")
+ .assert_has_monomorph("Holder<u16>")
+ .assert_has_monomorph("Holder<u32>")
+ .assert_has_monomorph("Holder<u64>")
+ .assert_has_monomorph("Holder<s8>")
+ .assert_has_monomorph("Holder<s16>")
+ .assert_has_monomorph("Holder<s32>")
+ .assert_has_monomorph("Holder<s64>");
 });
}
diff --git a/src/protocol/tests/mod.rs b/src/protocol/tests/mod.rs
index 3578fece67601e6869de95537a1f340b6a977788..70eb7af70543577dd8a83fc315e8ce7a29221c1e 100644
--- a/src/protocol/tests/mod.rs
+++ b/src/protocol/tests/mod.rs
@@ -13,16 +13,17 @@ mod utils; mod lexer;
-mod parser_validation;
-mod parser_inference;
-mod parser_monomorphs;
-mod parser_imports;
 mod parser_binding;
+mod parser_imports;
+mod parser_inference;
 mod parser_literals;
-mod eval_operators;
+mod parser_monomorphs;
+mod parser_types;
+mod parser_validation;
+mod eval_binding;
 mod eval_calls;
 mod eval_casting;
-mod eval_binding;
+mod eval_operators;
 mod eval_silly;
 pub(crate) use utils::{Tester}; // the testing 
harness diff --git a/src/protocol/tests/parser_monomorphs.rs b/src/protocol/tests/parser_monomorphs.rs index c38ada857574258cfd6076e26734c3af70b5cec6..a09c038446f1a6a9d08be59b83663768d8265558 100644 --- a/src/protocol/tests/parser_monomorphs.rs +++ b/src/protocol/tests/parser_monomorphs.rs @@ -11,7 +11,8 @@ fn test_struct_monomorphs() { "no polymorph", "struct Integer{ s32 field }" ).for_struct("Integer", |s| { s - .assert_num_monomorphs(0); + .assert_num_monomorphs(1) + .assert_has_monomorph("Integer"); }); Tester::new_single_source_expect_ok( @@ -28,11 +29,11 @@ fn test_struct_monomorphs() { } " ).for_struct("Number", |s| { s - .assert_has_monomorph("s8") - .assert_has_monomorph("s16") - .assert_has_monomorph("s32") - .assert_has_monomorph("s64") + .assert_has_monomorph("Number") .assert_has_monomorph("Number") + .assert_has_monomorph("Number") + .assert_has_monomorph("Number") + .assert_has_monomorph("Number>") .assert_num_monomorphs(5); }).for_function("instantiator", |f| { f .for_variable("a", |v| {v.assert_concrete_type("Number");} ) @@ -49,11 +50,12 @@ fn test_enum_monomorphs() { func do_it() -> s32 { auto a = Answer::Yes; return 0; } " ).for_enum("Answer", |e| { e - .assert_num_monomorphs(0); + .assert_num_monomorphs(1) + .assert_has_monomorph("Answer"); }); // Note for reader: because the enum doesn't actually use the polymorphic - // variable, we expect to have 0 polymorphs: the type only has to be laid + // variable, we expect to have 1 monomorph: the type only has to be laid // out once. 
Tester::new_single_source_expect_ok( "single polymorph", @@ -68,7 +70,8 @@ fn test_enum_monomorphs() { } " ).for_enum("Answer", |e| { e - .assert_num_monomorphs(0); + .assert_num_monomorphs(1) + .assert_has_monomorph("Answer"); }); } @@ -81,7 +84,8 @@ fn test_union_monomorphs() { func do_it() -> s32 { auto a = Trinary::Value(true); return 0; } " ).for_union("Trinary", |e| { e - .assert_num_monomorphs(0); + .assert_num_monomorphs(1) + .assert_has_monomorph("Trinary"); }); // TODO: Does this do what we want? Or do we expect the embedded monomorph @@ -100,11 +104,12 @@ fn test_union_monomorphs() { } " ).for_union("Result", |e| { e - .assert_num_monomorphs(4) - .assert_has_monomorph("s8;bool") - .assert_has_monomorph("bool;s8") - .assert_has_monomorph("Result;Result") - .assert_has_monomorph("s16;s64"); + .assert_num_monomorphs(5) + .assert_has_monomorph("Result") + .assert_has_monomorph("Result") + .assert_has_monomorph("Result,Result>") + .assert_has_monomorph("Result") + .assert_has_monomorph("Result"); }).for_function("instantiator", |f| { f .for_variable("d", |v| { v .assert_parser_type("auto") diff --git a/src/protocol/tests/parser_types.rs b/src/protocol/tests/parser_types.rs new file mode 100644 index 0000000000000000000000000000000000000000..0f3ebb06c522e83c8f351694c8b8337779d5f999 --- /dev/null +++ b/src/protocol/tests/parser_types.rs @@ -0,0 +1,87 @@ +/// parser_types.rs +/// +/// Tests for (infinite) type construction + +use super::*; + +#[test] +fn test_invalid_struct_type_loops() { + Tester::new_single_source_expect_err( + "self-referential struct", + "struct Foo { Foo foo }" + ).error(|e| { e + .assert_occurs_at(0, "Foo {") + .assert_msg_has(0, "infinitely large type"); + }); +} + +#[test] +fn test_invalid_union_type_loops() { + Tester::new_single_source_expect_err( + "self-referential union", + "union Foo { One(Foo) }" + ).error(|e| { e + .assert_occurs_at(0, "Foo {") + .assert_msg_has(0, "infinitely large type"); + }); + + 
Tester::new_single_source_expect_err(
+ "two-cycle union",
+ "
+ union UOne { Two(UTwo) }
+ union UTwo { One(UOne) }
+ "
+ ).error(|e| { e
+ .assert_occurs_at(0, "UOne")
+ .assert_msg_has(0, "infinitely large type");
+ });
+}
+
+#[test]
+fn test_valid_loops() {
+ // Linked-list like thing
+ Tester::new_single_source_expect_ok(
+ "linked list",
+ "
+ struct Entry { u32 number, Link next }
+ union Link { Next(Entry), None }
+ "
+ ).for_struct("Entry", |s| {
+ s.assert_size_alignment("Entry", 24, 8); // [4 'number', 4 'pad', x 'union tag', 8-x 'padding', 8 pointer]
+ }).for_union("Link", |u| {
+ u.assert_size_alignment("Link", 16, 8, 24, 8);
+ });
+
+ // Tree-like thing, version 1
+ Tester::new_single_source_expect_ok(
+ "tree, optional children",
+ "
+ union Option<T> { Some(T), None }
+ struct Node {
+ u32 value,
+ Option<Node> left,
+ Option<Node> right,
+ }
+ "
+ ).for_struct("Node", |s| {
+ s.assert_size_alignment("Node", 40, 8);
+ }).for_union("Option", |u| {
+ u.assert_size_alignment("Option<Node>", 16, 8, 40, 8);
+ });
+
+ // Tree-like thing, version 2
+ Tester::new_single_source_expect_ok(
+ "tree, with left/right link",
+ "
+ union Option<T> { Some(T), None }
+ struct Link { Node left, Node right }
+ struct Node { u32 value, Option<Link> link }
+ "
+ ).for_struct("Node", |s| {
+ s.assert_size_alignment("Node", 24, 8);
+ }).for_struct("Link", |s| {
+ s.assert_size_alignment("Link", 48, 8);
+ }).for_union("Option", |u| {
+ u.assert_size_alignment("Option<Link>", 16, 8, 48, 8);
+ });
+}
\ No newline at end of file
diff --git a/src/protocol/tests/utils.rs b/src/protocol/tests/utils.rs
index 753c1f10d5a66b3962e03d16b9b9eee460627c7f..6407afea6458747e10b2a7e6e534aee15e3aa619 100644
--- a/src/protocol/tests/utils.rs
+++ b/src/protocol/tests/utils.rs
@@ -5,7 +5,7 @@ use crate::protocol::{ input_source::*, parser::{ Parser,
- type_table::{TypeTable, DefinedTypeVariant},
+ type_table::*,
 symbol_table::SymbolTable,
 token_parsing::*,
},
@@ -149,13 +149,17 @@ impl AstOkTester { pub(crate) fn 
for_struct(self, name: &str, f: F) -> Self { let mut found = false; for definition in self.heap.definitions.iter() { - if let Definition::Struct(definition) = definition { - if definition.identifier.value.as_str() != name { + if let Definition::Struct(ast_definition) = definition { + if ast_definition.identifier.value.as_str() != name { continue; } // Found struct with the same name - let tester = StructTester::new(self.ctx(), definition); + let definition_id = ast_definition.this.upcast(); + let type_entry = self.types.get_base_definition(&definition_id).unwrap(); + let type_definition = type_entry.definition.as_struct(); + + let tester = StructTester::new(self.ctx(), ast_definition, type_definition); f(tester); found = true; break @@ -201,7 +205,9 @@ impl AstOkTester { } // Found union with the same name - let tester = UnionTester::new(self.ctx(), definition); + let definition_id = definition.this.upcast(); + let base_type = self.types.get_base_definition(&definition_id).unwrap(); + let tester = UnionTester::new(self.ctx(), definition, &base_type.definition.as_union()); f(tester); found = true; break; @@ -257,25 +263,26 @@ impl AstOkTester { pub(crate) struct StructTester<'a> { ctx: TestCtx<'a>, - def: &'a StructDefinition, + ast_def: &'a StructDefinition, + type_def: &'a StructType, } impl<'a> StructTester<'a> { - fn new(ctx: TestCtx<'a>, def: &'a StructDefinition) -> Self { - Self{ ctx, def } + fn new(ctx: TestCtx<'a>, ast_def: &'a StructDefinition, type_def: &'a StructType) -> Self { + Self{ ctx, ast_def, type_def } } pub(crate) fn assert_num_fields(self, num: usize) -> Self { assert_eq!( - num, self.def.fields.len(), + num, self.ast_def.fields.len(), "[{}] Expected {} struct fields, but found {} for {}", - self.ctx.test_name, num, self.def.fields.len(), self.assert_postfix() + self.ctx.test_name, num, self.ast_def.fields.len(), self.assert_postfix() ); self } pub(crate) fn assert_num_monomorphs(self, num: usize) -> Self { - let (is_equal, num_encountered) = 
has_equal_num_monomorphs(self.ctx, num, self.def.this.upcast());
+        let (is_equal, num_encountered) = has_equal_num_monomorphs(self.ctx, num, self.ast_def.this.upcast());
         assert!(
             is_equal, "[{}] Expected {} monomorphs, but got {} for {}",
             self.ctx.test_name, num, num_encountered, self.assert_postfix()
@@ -283,20 +290,32 @@ impl<'a> StructTester<'a> {
         self
     }
 
-    /// Asserts that a monomorph exist, separate polymorphic variable types by
-    /// a semicolon.
     pub(crate) fn assert_has_monomorph(self, serialized_monomorph: &str) -> Self {
-        let (has_monomorph, serialized) = has_monomorph(self.ctx, self.def.this.upcast(), serialized_monomorph);
+        let (has_monomorph, serialized) = has_monomorph(self.ctx, self.ast_def.this.upcast(), serialized_monomorph);
         assert!(
-            has_monomorph, "[{}] Expected to find monomorph {}, but got {} for {}",
+            has_monomorph.is_some(), "[{}] Expected to find monomorph {}, but got {} for {}",
             self.ctx.test_name, serialized_monomorph, &serialized, self.assert_postfix()
         );
 
         self
     }
 
+    pub(crate) fn assert_size_alignment(mut self, monomorph: &str, size: usize, alignment: usize) -> Self {
+        self = self.assert_has_monomorph(monomorph);
+        let (mono_idx, _) = has_monomorph(self.ctx, self.ast_def.this.upcast(), monomorph);
+        let mono_idx = mono_idx.unwrap();
+
+        let mono = &self.type_def.monomorphs[mono_idx];
+        assert!(
+            mono.size == size && mono.alignment == alignment,
+            "[{}] Expected (size,alignment) of ({}, {}), but got ({}, {}) for {}",
+            self.ctx.test_name, size, alignment, mono.size, mono.alignment, self.assert_postfix()
+        );
+
+        self
+    }
+
     pub(crate) fn for_field(self, name: &str, f: F) -> Self {
         // Find field with specified name
-        for field in &self.def.fields {
+        for field in &self.ast_def.fields {
             if field.field.value.as_str() == name {
                 let tester = StructFieldTester::new(self.ctx, field);
                 f(tester);
@@ -314,9 +333,9 @@ impl<'a> StructTester<'a> {
     fn assert_postfix(&self) -> String {
         let mut v = String::new();
         v.push_str("Struct{ name: ");
-        v.push_str(self.def.identifier.value.as_str());
+        v.push_str(self.ast_def.identifier.value.as_str());
         v.push_str(", fields: [");
-        for (field_idx, field) in self.def.fields.iter().enumerate() {
+        for (field_idx, field) in self.ast_def.fields.iter().enumerate() {
             if field_idx != 0 { v.push_str(", "); }
             v.push_str(field.field.value.as_str());
         }
@@ -384,7 +403,7 @@ impl<'a> EnumTester<'a> {
     pub(crate) fn assert_has_monomorph(self, serialized_monomorph: &str) -> Self {
         let (has_monomorph, serialized) = has_monomorph(self.ctx, self.def.this.upcast(), serialized_monomorph);
         assert!(
-            has_monomorph, "[{}] Expected to find monomorph {}, but got {} for {}",
+            has_monomorph.is_some(), "[{}] Expected to find monomorph {}, but got {} for {}",
             self.ctx.test_name, serialized_monomorph, serialized, self.assert_postfix()
         );
         self
@@ -406,25 +425,26 @@ pub(crate) struct UnionTester<'a> {
     ctx: TestCtx<'a>,
-    def: &'a UnionDefinition,
+    ast_def: &'a UnionDefinition,
+    type_def: &'a UnionType,
 }
 
 impl<'a> UnionTester<'a> {
-    fn new(ctx: TestCtx<'a>, def: &'a UnionDefinition) -> Self {
-        Self{ ctx, def }
+    fn new(ctx: TestCtx<'a>, ast_def: &'a UnionDefinition, type_def: &'a UnionType) -> Self {
+        Self{ ctx, ast_def, type_def }
     }
 
     pub(crate) fn assert_num_variants(self, num: usize) -> Self {
         assert_eq!(
-            num, self.def.variants.len(),
+            num, self.ast_def.variants.len(),
             "[{}] Expected {} union variants, but found {} for {}",
-            self.ctx.test_name, num, self.def.variants.len(), self.assert_postfix()
+            self.ctx.test_name, num, self.ast_def.variants.len(), self.assert_postfix()
         );
         self
     }
 
     pub(crate) fn assert_num_monomorphs(self, num: usize) -> Self {
-        let (is_equal, num_encountered) = has_equal_num_monomorphs(self.ctx, num, self.def.this.upcast());
+        let (is_equal, num_encountered) = has_equal_num_monomorphs(self.ctx, num, self.ast_def.this.upcast());
         assert!(
             is_equal, "[{}] Expected {} monomorphs, but got {} for {}",
             self.ctx.test_name, num, num_encountered, self.assert_postfix()
@@ -433,20 +453,40 @@ impl<'a> UnionTester<'a> {
     }
 
     pub(crate) fn assert_has_monomorph(self, serialized_monomorph: &str) -> Self {
-        let (has_monomorph, serialized) = has_monomorph(self.ctx, self.def.this.upcast(), serialized_monomorph);
+        let (has_monomorph, serialized) = has_monomorph(self.ctx, self.ast_def.this.upcast(), serialized_monomorph);
         assert!(
-            has_monomorph, "[{}] Expected to find monomorph {}, but got {} for {}",
+            has_monomorph.is_some(), "[{}] Expected to find monomorph {}, but got {} for {}",
             self.ctx.test_name, serialized_monomorph, serialized, self.assert_postfix()
         );
         self
     }
 
+    pub(crate) fn assert_size_alignment(
+        mut self, serialized_monomorph: &str,
+        stack_size: usize, stack_alignment: usize, heap_size: usize, heap_alignment: usize
+    ) -> Self {
+        self = self.assert_has_monomorph(serialized_monomorph);
+        let (mono_idx, _) = has_monomorph(self.ctx, self.ast_def.this.upcast(), serialized_monomorph);
+        let mono_idx = mono_idx.unwrap();
+        let mono = &self.type_def.monomorphs[mono_idx];
+        assert!(
+            stack_size == mono.stack_size && stack_alignment == mono.stack_alignment &&
+            heap_size == mono.heap_size && heap_alignment == mono.heap_alignment,
+            "[{}] Expected (stack | heap) (size, alignment) of ({}, {} | {}, {}), but got ({}, {} | {}, {}) for {}",
+            self.ctx.test_name,
+            stack_size, stack_alignment, heap_size, heap_alignment,
+            mono.stack_size, mono.stack_alignment, mono.heap_size, mono.heap_alignment,
+            self.assert_postfix()
+        );
+        self
+    }
+
     fn assert_postfix(&self) -> String {
         let mut v = String::new();
         v.push_str("Union{ name: ");
-        v.push_str(self.def.identifier.value.as_str());
+        v.push_str(self.ast_def.identifier.value.as_str());
         v.push_str(", variants: [");
-        for (variant_idx, variant) in self.def.variants.iter().enumerate() {
+        for (variant_idx, variant) in self.ast_def.variants.iter().enumerate() {
             if variant_idx != 0 { v.push_str(", "); }
             v.push_str(variant.identifier.value.as_str());
         }
@@ -868,56 +908,51 @@ fn has_equal_num_monomorphs(ctx: TestCtx, num: usize, definition_id: DefinitionId
     (num_on_type == num, num_on_type)
 }
 
-fn has_monomorph(ctx: TestCtx, definition_id: DefinitionId, serialized_monomorph: &str) -> (bool, String) {
+fn has_monomorph(ctx: TestCtx, definition_id: DefinitionId, serialized_monomorph: &str) -> (Option<usize>, String) {
     use DefinedTypeVariant::*;
 
     let type_def = ctx.types.get_base_definition(&definition_id).unwrap();
 
     // Note: full_buffer is just for error reporting
     let mut full_buffer = String::new();
-    let mut has_match = false;
-
-    let serialize_monomorph = |monomorph: &Vec<ConcreteType>| -> String {
-        let mut buffer = String::new();
-        for (element_idx, element) in monomorph.iter().enumerate() {
-            if element_idx != 0 {
-                buffer.push(';');
-            }
-            serialize_concrete_type(&mut buffer, ctx.heap, definition_id, element);
-        }
-
-        buffer
-    };
+    let mut has_match = None;
 
     full_buffer.push('[');
 
-    let mut append_to_full_buffer = |buffer: String| {
+    let mut append_to_full_buffer = |concrete_type: &ConcreteType, mono_idx: usize| {
         if full_buffer.len() != 1 {
             full_buffer.push_str(", ");
         }
         full_buffer.push('"');
-        full_buffer.push_str(&buffer);
+
+        let first_idx = full_buffer.len();
+        serialize_concrete_type(&mut full_buffer, ctx.heap, definition_id, concrete_type);
+        if &full_buffer[first_idx..] == serialized_monomorph {
+            has_match = Some(mono_idx);
+        }
+
         full_buffer.push('"');
     };
 
     match &type_def.definition {
-        Enum(_) | Union(_) | Struct(_) => {
-            let monomorphs = type_def.definition.data_monomorphs();
-            for monomorph in monomorphs.iter() {
-                let buffer = serialize_monomorph(&monomorph.poly_args);
-                if buffer == serialized_monomorph {
-                    has_match = true;
-                }
-                append_to_full_buffer(buffer);
+        Enum(definition) => {
+            for (mono_idx, mono) in definition.monomorphs.iter().enumerate() {
+                append_to_full_buffer(&mono.concrete_type, mono_idx);
+            }
+        },
+        Union(definition) => {
+            for (mono_idx, mono) in definition.monomorphs.iter().enumerate() {
+                append_to_full_buffer(&mono.concrete_type, mono_idx);
+            }
+        },
+        Struct(definition) => {
+            for (mono_idx, mono) in definition.monomorphs.iter().enumerate() {
+                append_to_full_buffer(&mono.concrete_type, mono_idx);
            }
        },
        Function(_) | Component(_) => {
            let monomorphs = type_def.definition.procedure_monomorphs();
-            for monomorph in monomorphs.iter() {
-                let buffer = serialize_monomorph(&monomorph.poly_args);
-                if buffer == serialized_monomorph {
-                    has_match = true;
-                }
-                append_to_full_buffer(buffer);
+            for (mono_idx, mono) in monomorphs.iter().enumerate() {
+                append_to_full_buffer(&mono.concrete_type, mono_idx);
            }
        }
    }
@@ -1051,7 +1086,9 @@ fn serialize_concrete_type(buffer: &mut String, heap: &Heap, def: DefinitionId,
                 idx = serialize_recursive(buffer, heap, poly_vars, concrete, idx + 1);
             buffer.push('>');
         },
-        CTP::Instance(definition_id, num_sub) => {
+        CTP::Instance(definition_id, num_sub) |
+        CTP::Function(definition_id, num_sub) |
+        CTP::Component(definition_id, num_sub) => {
             let definition_name = heap[*definition_id].identifier();
             buffer.push_str(definition_name.value.as_str());
             if *num_sub != 0 {
@@ -1062,7 +1099,7 @@ fn serialize_concrete_type(buffer: &mut String, heap: &Heap, def: DefinitionId,
                 }
                 buffer.push('>');
             }
-        }
+        },
     }
 
     idx