diff --git a/notes_max.md b/notes_max.md index fcfb3e8b6d9fe970f0a2548a7741642c64bbf4e9..6ee6773d472ccc029713f344aeeb97437850743d 100644 --- a/notes_max.md +++ b/notes_max.md @@ -54,45 +54,77 @@ For now we will use the C++-like trait/interfaceless polymorphism: once we know Within functions and connectors we may employ polymorphic types. The specification of the polymorphic arguments is not required if they can be inferred. Polymorphic arguments may be partially specified by using some kind of throwaway `_` or `auto` type. When we are monomorphizing a type we must be able to fully determine all of the polymorphic arguments; if we can't, then we throw a compiler error. -## Conclusions +## Type Inference - Monomorphization -- - Type definitions may contain polymorphic arguments. These are simple identifiers. The polymorphic arguments are embedded in definitions, they are erased after type inference on the type's use (e.g. function call or struct instantiation) has finished. +There are two parts to type inference: determining polymorphic variables and determining the `auto` variables. Among the polymorphic types we have: `struct`s, `enum`s, `union`s, `component`s and `function`s. The first three are datatypes and have no body to infer. The latter two are, for lack of a better word, procedural types or proctypes. -- - While instantiating polymorphic types (e.g. `T something = call_a_func(3, 5)` or the inferred variant `let something = call_a_func(3, 5)`) we actually need to resolve these types. This implies: - - - Memory declarations will have their type embedded within the memory declaration itself. If we do not know the type yet, or only partially know the type, then we need to mark the type as being inferred. - - Expressions will have their type inferred from the constraints the expression imposes on its arguments, and the types of the arguments themselves.
Within an expression tree type information may flow from the root towards the leaves or from the leaves towards the root. - - This implies that type information in the AST must be fully specified. If the type information is not yet fully known then it must be embedded in some kind of datastructure that is used for type inference. +We may only infer the types within a proctype if its polymorphic arguments are already inferred. Hence the type inference algorithm for a proctype does not work on the polymorphic types in its body. The type inference algorithm needs to work on all of the `auto` types in the body (potentially those of polymorphic types without explicitly specified polymorphic arguments). + +If the type inference algorithm determines the arguments to a polymorphic type within the body of a proctype then we need to (recursively) instantiate a monomorph of that type. This process is potentially recursive in two ways: + +1. **for datatypes**: when one instantiates a monomorph we suddenly know all of the embedded types within a datatype. At this point we may determine whether the type is potentially recursive or not. We may use this fact to mark struct members or union variants as pointerlike. We can at this point lay out the size and the offsets of the type. -- - A function definition with polyargs seems to behave a bit differently: During the initial definition parsing we may want to make sure that each type in the definition either resolves to a concrete type (optionally with polymorphic arguments) or to one of the polyargs. This would also provide information to the type inference algorithm. +2. **for proctypes**: when one instantiates a monomorph we can restart the inference process for the body of the function. During this process we may arrive at new monomorphs for datatypes or new monomorphs for function types. 
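The two recursive cases above amount to a worklist over "monomorphs still to be instantiated". A minimal sketch of that idea (the `Type` enum and names here are hypothetical stand-ins, not the AST types used in this codebase; a `seen` set keeps the recursion finite):

```rust
use std::collections::{HashSet, VecDeque};

// Hypothetical, simplified type representation: a type only appears here
// once all of its polymorphic arguments are fully determined.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
enum Type {
    Byte,
    Array(Box<Type>),
    // A struct monomorph: definition name plus fully inferred poly args.
    Struct(&'static str, Vec<Type>),
}

// Instantiating one monomorph may require instantiating monomorphs of the
// types embedded within it; process them breadth-first, deduplicating so
// that a monomorph is instantiated at most once.
fn collect_monomorphs(root: Type) -> Vec<Type> {
    let mut seen = HashSet::new();
    let mut queue = VecDeque::new();
    let mut out = Vec::new();
    queue.push_back(root);
    while let Some(ty) = queue.pop_front() {
        if !seen.insert(ty.clone()) {
            continue; // already instantiated this monomorph
        }
        // Queue the embedded types that need their own monomorphs.
        match &ty {
            Type::Array(inner) => queue.push_back((**inner).clone()),
            Type::Struct(_, args) => {
                for arg in args {
                    queue.push_back(arg.clone());
                }
            }
            Type::Byte => {}
        }
        out.push(ty);
    }
    out
}

fn main() {
    // Instantiating Array<SomeStruct<byte>> forces monomorphs for the
    // array, the struct instance, and the embedded byte: three in total.
    let root = Type::Array(Box::new(Type::Struct("SomeStruct", vec![Type::Byte])));
    assert_eq!(collect_monomorphs(root).len(), 3);
}
```

For proctypes the same loop would, instead of only queueing embedded types, re-run inference on the monomorphized body, which is what makes the process recursive in the second sense described above.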
-## Conclusions - Parsing +## Type Inference - The Algorithm -During parsing we want to embed all of the information the programmer has provided to the compiler in the AST. So polyargs go inside type definitions, types that are supposed to be inferred are embedded in the AST, types that depend on polyargs also go into the AST. So we may have: +So as determined above: when we perform type inference for a function we create a lookup for the assigned polymorphic variables and assign these where used in the expression trees. After this we end up with expression trees with `auto` types scattered here and there, potentially embedded within concrete types (e.g. `in>>`, an input port from which we get arrays of a monomorphized struct `SomeStruct` whose second argument should be inferred). -- Partially concrete types with embedded inferred types (e.g. `let[] list_of_things` is an `Array` or `Result thing = Result::Ok(5)` or `let thing = Result::Ok(5)`). -- Direct references to polyarg types (e.g. `T list_of_things` or `Result thing = ...` or `T[][] array_of_arrays_yes_sir` and `Tuple2 = construct_a_thing()`) -- Fully inferred types (e.g. `auto thing = 5` or `auto result = call_a_thing()`) -- Completely specified types. +Since memory statements have already been converted into variable declarations (present in the appropriate scope) we only truly need to care about the expression trees that reside within a body. The one exception is the place where those expression trees are used: the expression tree used as the condition in an `if` statement must have a boolean return type. We have the following types of expressions, together with the constraints they impose on the inference algorithm (note that I will say that types "match" a lot; for now this disregards implicit casting, which is later fixed by inserting casting expressions that allow for (builtin) type conversion): + +- **assignment**: + Assignment expression (e.g. `a = 5 + 2` or `a <<= 2`).
Where we make the distinction between: + + - **=**: We require that the RHS type matches the type of the LHS. If any side is known (partially) then the other side may be inferred. The return type is the type of the LHS. + - **\*=**, **/=**, **%=**, **+=**, **-=**: We require that the LHS is of the appropriate number type and the RHS is of equal type. The return type is the type of the LHS. + - **<<=**, **>>=**: The LHS is of an integer type and the RHS is of any integer type. The return type is the type of the LHS. + - **&=**, **|=**, **^=**: The LHS and the RHS are of equal integer type. The return type is the type of the LHS. + +- **conditional**: + C-like ternary operator (e.g. `something() ? true : false`). We require that the test-expression is of boolean type. We require that the true-expression and the false-expression are of equal type. The return type is of the same type as the true/false-expression return type. + +- **binary**: + Binary operator, where we make the distinction between: + - **@** (concatenate): *I might kick this one out*, requires that the LHS and RHS are arrays with equal subtype (`Array<T>`), or LHS and RHS are strings. The result is an array with the same subtype (`Array<T>`), or a string. + - **||**, **&&**: Both LHS and RHS must be booleans. The result is a boolean. + - **|**, **&**, **^**: Both LHS and RHS must be of equal integer type. The return type is the type of the LHS. + - **==**, **!=**: Both LHS and RHS must be of the same type. The return type is a boolean. + - **<**, **>**, **<=**, **>=**: Both LHS and RHS must be of equal numeric type. The return type is a boolean. + - **<<**, **>>**: The LHS is of an integer type and the RHS is of any integer type. The return type is the type of the LHS. + - **+**, **-**, **\***, **/**, **%**: The LHS and RHS are of equal integer type. The return type is the type of the LHS.
+ +- **unary**: + Unary operator, where we make the distinction between: + - **+**, **-**: Argument may be any numeric type, the return type is the same as the argument. + - **!**: Argument must be of boolean type, the return type is also boolean. + - **~**: Argument must be of any integer type, the return type is the same as the argument. + - **--**, **++**: Argument must be of any integer type. The return type is the same as the argument. + +- **indexing expression**: + Indexes into an array or a string (e.g. `variable[i]`). The index must always be a `usize`/`isize` integer type (with the appropriate implicit casts). The subject must always be of type `Array<T>` or string and the result is of type `T`. + +- **slicing expression**: + Slices an array or a string (e.g. `some_array[5..calculate_the_end()]`). The beginning and ending indices must always be of **equal** `usize`/`isize` integer type. The subject must always be of type `Array<T>` or string, the return type is of the same type as the subject. + +- **select expression**: + Selects a member from a struct (e.g. `some_struct.some_field`). This expression can only be evaluated if we know the type of the subject (which must be a struct). If the field exists on that struct then we may determine the return type to be the type of that field. + +- **array expression**: + Constructs an array by explicitly specifying the members of that array. Let's assume for now that you cannot specify a string in this silly fashion. Then all of the array elements must have the same type `T`, and the result type of the expression is `Array<T>`. -## Conclusions - Type Inference +- **call expression**: + Calls a particular function (e.g. `some_function(some_arg)`). For each non-polymorphic function parameter the supplied argument must match the declared parameter type. A non-polymorphic return type fixes the output type of the expression.
So we will arrive at a fully specified type: every polymorphic type will have its polyargs specified. Likewise every inferred type will be fully specified as well. + In the polymorphic case the inference works the other way around: once we know the output type of the expression that is used as a polymorphic variable, then we can fix that particular polymorphic argument and feed that information to all places where that polymorphic argument is used. -All of this hints at having to specify two classes of types: `ParserType` and a `Type`. The `ParserType` can be: + The same is true for the return type: if the return type of the call is polymorphic then the place where this return type is used determines the type of the polymorphic argument. -- A builtin type (with embedded `ParserType`s where appropriate) -- An inferred type -- A polyarg type -- A symbolic type (with embedded `ParsedType`s where appropriate) +- **constant expression**: *This requires a lot of rework at some point.* + A constant expression contains a literal of a specific type. Hence the output type of the literal is the type of the literal itself. In case of literal `struct`, `enum` and `union` instances the output type is that exact datatype. -While the final `Type` can be: + For the embedded types we have two cases: if the embedded type is not polymorphic and some expression is used at that position, then the embedded type must match the output type of the expression. If the embedded type is polymorphic then the output type of the expression determines the polymorphic argument. -- A builtin type (with embedded types where appropriate) -- A known user-defined type (with embedded types where appropriate) + In case of `string` literals the output type is also a `string`. However, in the case of integer literals we can only determine that the output type is some integer, the place where the literal is employed determines the integer type. 
It is valid to use this for byteshifting, but also for operating on floating-point types. In the case of decimal literals we can use these for operations on floating point types. But again: whether it is a `float` or a `double` depends on the place where the literal is used. -Note that we cannot typecheck polymorphic connectors and functions until we monomorphize them. \ No newline at end of file +- **variable expression**: + Refers to some variable declared somewhere. Hence the return type is the same as the type of the variable. \ No newline at end of file diff --git a/src/protocol/ast.rs b/src/protocol/ast.rs index 03c0c16a470160e0af118fd5a909495f38d34fbb..808bf838ab760e31342e87290ec309670474fc37 100644 --- a/src/protocol/ast.rs +++ b/src/protocol/ast.rs @@ -475,6 +475,12 @@ impl Index for Heap { } } +impl IndexMut for Heap { + fn index_mut(&mut self, index: ParserTypeId) -> &mut Self::Output { + &mut self.parser_types[index] + } +} + impl Index for Heap { type Output = Variable; fn index(&self, index: VariableId) -> &Self::Output { @@ -1096,7 +1102,7 @@ pub struct SymbolicParserType { #[derive(Debug, Clone, serde::Serialize, serde::Deserialize)] pub enum SymbolicParserTypeVariant { Definition(DefinitionId), - PolyArg((DefinitionId, u32)), // index of polyarg in the definition + PolyArg(usize), // index of polyarg in the definition } #[derive(Debug, Clone, PartialEq, Eq, serde::Serialize, serde::Deserialize)] @@ -1416,6 +1422,12 @@ impl Definition { _ => panic!("Unable to cast `Definition` to `Component`"), } } + pub fn is_function(&self) -> bool { + match self { + Definition::Function(_) => true, + _ => false, + } + } pub fn as_function(&self) -> &Function { match self { Definition::Function(result) => result, @@ -1557,68 +1569,6 @@ impl SyntaxElement for Function { self.position } } -// TODO: @remove ??? 
-// #[derive(Debug, Clone, serde::Serialize, serde::Deserialize)] -// pub enum Signature { -// Component(ComponentSignature), -// Function(FunctionSignature), -// } -// -// impl Signature { -// pub fn from_definition(h: &Heap, def: DefinitionId) -> Signature { -// // TODO: Fix this -// match &h[def] { -// Definition::Component(com) => Signature::Component(ComponentSignature { -// identifier: com.identifier.clone(), // TODO: @fix -// arity: Signature::convert_parameters(h, &com.parameters), -// }), -// Definition::Function(fun) => Signature::Function(FunctionSignature { -// return_type: h[fun.return_type].the_type.clone(), -// identifier: fun.identifier.clone(), // TODO: @fix -// arity: Signature::convert_parameters(h, &fun.parameters), -// }), -// _ => panic!("cannot retrieve signature (for StructDefinition or EnumDefinition)") -// } -// } -// fn convert_parameters(h: &Heap, params: &Vec) -> Vec { -// let mut result = Vec::new(); -// for &param in params.iter() { -// result.push(h[h[param].type_annotation].the_type.clone()); -// } -// result -// } -// fn identifier(&self) -> &Identifier { -// match self { -// Signature::Component(com) => &com.identifier, -// Signature::Function(fun) => &fun.identifier, -// } -// } -// pub fn is_component(&self) -> bool { -// match self { -// Signature::Component(_) => true, -// Signature::Function(_) => false, -// } -// } -// pub fn is_function(&self) -> bool { -// match self { -// Signature::Component(_) => false, -// Signature::Function(_) => true, -// } -// } -// } -// -// #[derive(Debug, Clone, serde::Serialize, serde::Deserialize)] -// pub struct ComponentSignature { -// pub identifier: Identifier, -// pub arity: Vec, -// } -// -// #[derive(Debug, Clone, serde::Serialize, serde::Deserialize)] -// pub struct FunctionSignature { -// pub return_type: Type, -// pub identifier: Identifier, -// pub arity: Vec, -// } #[derive(Debug, Clone, serde::Serialize, serde::Deserialize)] pub enum Statement { diff --git
a/src/protocol/inputsource.rs b/src/protocol/inputsource.rs index 8e25f7d6e655d53d4d8926da1dc864e07a52160a..0be0127d9d4ddbf1d8b5fcd70adce61fabf40ed9 100644 --- a/src/protocol/inputsource.rs +++ b/src/protocol/inputsource.rs @@ -15,19 +15,19 @@ pub struct InputSource { } static STD_LIB_PDL: &'static [u8] = b" -primitive forward(in i, out o) { +primitive forward(in i, out o) { while(true) synchronous put(o, get(i)); } -primitive sync(in i, out o) { +primitive sync(in i, out o) { while(true) synchronous if(fires(i)) put(o, get(i)); } -primitive alternator(in i, out l, out r) { +primitive alternator(in i, out l, out r) { while(true) { synchronous if(fires(i)) put(l, get(i)); synchronous if(fires(i)) put(r, get(i)); } } -primitive replicator(in i, out l, out r) { +primitive replicator(in i, out l, out r) { while(true) synchronous { if(fires(i)) { msg m = get(i); @@ -36,7 +36,7 @@ primitive replicator(in i, out l, out r) { } } } -primitive merger(in l, in r, out o) { +primitive merger(in l, in r, out o) { while(true) synchronous { if(fires(l)) put(o, get(l)); else if(fires(r)) put(o, get(r)); diff --git a/src/protocol/lexer.rs b/src/protocol/lexer.rs index ca6752201592a4531edefc8629fa3256315e1765..5a87f118da17ea7f45d6a5fb4169057a966b923c 100644 --- a/src/protocol/lexer.rs +++ b/src/protocol/lexer.rs @@ -369,45 +369,51 @@ impl Lexer<'_> { /// the type specification. When done the input position will be at the end /// of the type specifications (hence may be at whitespace). fn consume_type2(&mut self, h: &mut Heap, allow_inference: bool) -> Result { - // Small helper function to convert in/out polymorphic arguments + // Small helper function to convert in/out polymorphic arguments. 
Not + // pretty, but return boolean is true if the error is due to inference + // not being allowed let reduce_port_poly_args = | heap: &mut Heap, port_pos: &InputPosition, args: Vec, - | -> Result { + | -> Result { match args.len() { - 0 => Ok(heap.alloc_parser_type(|this| ParserType{ + 0 => if allow_inference { + Ok(heap.alloc_parser_type(|this| ParserType{ this, pos: port_pos.clone(), variant: ParserTypeVariant::Inferred - })), + })) + } else { + Err(true) + }, 1 => Ok(args[0]), - _ => Err(()) + _ => Err(false) } }; // Consume the type let pos = self.source.pos(); let parser_type_variant = if self.has_keyword(b"msg") { - self.consume_keyword(b"msg"); + self.consume_keyword(b"msg")?; ParserTypeVariant::Message } else if self.has_keyword(b"boolean") { - self.consume_keyword(b"boolean"); + self.consume_keyword(b"boolean")?; ParserTypeVariant::Bool } else if self.has_keyword(b"byte") { - self.consume_keyword(b"byte"); + self.consume_keyword(b"byte")?; ParserTypeVariant::Byte } else if self.has_keyword(b"short") { - self.consume_keyword(b"short"); + self.consume_keyword(b"short")?; ParserTypeVariant::Short } else if self.has_keyword(b"int") { - self.consume_keyword(b"int"); + self.consume_keyword(b"int")?; ParserTypeVariant::Int } else if self.has_keyword(b"long") { - self.consume_keyword(b"long"); + self.consume_keyword(b"long")?; ParserTypeVariant::Long } else if self.has_keyword(b"str") { - self.consume_keyword(b"str"); + self.consume_keyword(b"str")?; ParserTypeVariant::String } else if self.has_keyword(b"auto") { if !allow_inference { @@ -417,25 +423,37 @@ impl Lexer<'_> { )); } - self.consume_keyword(b"auto"); + self.consume_keyword(b"auto")?; ParserTypeVariant::Inferred } else if self.has_keyword(b"in") { // TODO: @cleanup: not particularly neat to have this special case // where we enforce polyargs in the parser-phase - self.consume_keyword(b"in"); + // TODO: @hack, temporarily allow inferred port values + self.consume_keyword(b"in")?; let poly_args = 
self.consume_polymorphic_args(h, allow_inference)?; let poly_arg = reduce_port_poly_args(h, &pos, poly_args) - .map_err(|_| ParseError2::new_error( - &self.source, pos, "'in' type only accepts up to 1 polymorphic argument" - ))?; + .map_err(|infer_error| { + let msg = if infer_error { + "Type inference is not allowed here" + } else { + "Type 'in' only allows for 1 polymorphic argument" + }; + ParseError2::new_error(&self.source, pos, msg) + })?; ParserTypeVariant::Input(poly_arg) } else if self.has_keyword(b"out") { - self.consume_keyword(b"out"); + // TODO: @hack, temporarily allow inferred port values + self.consume_keyword(b"out")?; let poly_args = self.consume_polymorphic_args(h, allow_inference)?; let poly_arg = reduce_port_poly_args(h, &pos, poly_args) - .map_err(|_| ParseError2::new_error( - &self.source, pos, "'out' type only accepts up to 1 polymorphic argument" - ))?; + .map_err(|infer_error| { + let msg = if infer_error { + "Type inference is not allowed here" + } else { + "Type 'out' only allows for 1 polymorphic argument" + }; + ParseError2::new_error(&self.source, pos, msg) + })?; ParserTypeVariant::Output(poly_arg) } else { // Must be a symbolic type @@ -655,86 +673,6 @@ impl Lexer<'_> { } } - // fn consume_primitive_type(&mut self) -> Result { - // if self.has_keyword(b"in") { - // self.consume_keyword(b"in")?; - // Ok(PrimitiveType::Input) - // } else if self.has_keyword(b"out") { - // self.consume_keyword(b"out")?; - // Ok(PrimitiveType::Output) - // } else if self.has_keyword(b"msg") { - // self.consume_keyword(b"msg")?; - // Ok(PrimitiveType::Message) - // } else if self.has_keyword(b"boolean") { - // self.consume_keyword(b"boolean")?; - // Ok(PrimitiveType::Boolean) - // } else if self.has_keyword(b"byte") { - // self.consume_keyword(b"byte")?; - // Ok(PrimitiveType::Byte) - // } else if self.has_keyword(b"short") { - // self.consume_keyword(b"short")?; - // Ok(PrimitiveType::Short) - // } else if self.has_keyword(b"int")
{ - // self.consume_keyword(b"int")?; - // Ok(PrimitiveType::Int) - // } else if self.has_keyword(b"long") { - // self.consume_keyword(b"long")?; - // Ok(PrimitiveType::Long) - // } else if self.has_keyword(b"auto") { - // // TODO: @types - // return Err(self.error_at_pos("inferred types using 'auto' are reserved, but not yet implemented")); - // } else { - // let identifier = self.consume_namespaced_identifier()?; - // Ok(PrimitiveType::Symbolic(PrimitiveSymbolic{ - // identifier, - // definition: None - // })) - // } - // } - // fn has_array(&mut self) -> bool { - // let backup_pos = self.source.pos(); - // let mut result = false; - // match self.consume_whitespace(false) { - // Ok(_) => result = self.has_string(b"["), - // Err(_) => {} - // } - // self.source.seek(backup_pos); - // return result; - // } - // fn consume_type(&mut self) -> Result { - // let primitive = self.consume_primitive_type()?; - // let array; - // if self.has_array() { - // self.consume_string(b"[]")?; - // array = true; - // } else { - // array = false; - // } - // Ok(Type { primitive, array }) - // } - // fn create_type_annotation_input(&self, h: &mut Heap) -> Result { - // let position = self.source.pos(); - // let the_type = Type::INPUT; - // let id = h.alloc_type_annotation(|this| TypeAnnotation { this, position, the_type }); - // Ok(id) - // } - // fn create_type_annotation_output(&self, h: &mut Heap) -> Result { - // let position = self.source.pos(); - // let the_type = Type::OUTPUT; - // let id = h.alloc_type_annotation(|this| TypeAnnotation { this, position, the_type }); - // Ok(id) - // } - // fn consume_type_annotation(&mut self, h: &mut Heap) -> Result { - // let position = self.source.pos(); - // let the_type = self.consume_type()?; - // let id = h.alloc_type_annotation(|this| TypeAnnotation { this, position, the_type }); - // Ok(id) - // } - // fn consume_type_annotation_spilled(&mut self) -> Result<(), ParseError2> { - // self.consume_type()?; - // Ok(()) - // } - // 
Parameters fn consume_parameter(&mut self, h: &mut Heap) -> Result { @@ -1514,19 +1452,14 @@ impl Lexer<'_> { result } fn has_label(&mut self) -> bool { - /* To prevent ambiguity with expression statements consisting - only of an identifier, we look ahead and match the colon - that signals a labeled statement. */ + // To prevent ambiguity with expression statements consisting only of an + // identifier or a namespaced identifier, we look ahead and match on the + // *single* colon that signals a labeled statement. let backup_pos = self.source.pos(); let mut result = false; - match self.consume_identifier_spilled() { - Ok(_) => match self.consume_whitespace(false) { - Ok(_) => { - result = self.has_string(b":"); - } - Err(_) => {} - }, - Err(_) => {} + if self.consume_identifier_spilled().is_ok() { + // next character is ':', second character is NOT ':' + result = Some(b':') == self.source.next() && Some(b':') != self.source.lookahead(1) } self.source.seek(backup_pos); return result; diff --git a/src/protocol/parser/mod.rs b/src/protocol/parser/mod.rs index f4d25bd932e5aaa4234cbbeb06b6ca04031cb1de..cfae1b9fce0e6870ffbba6f0c1667b8c4708aac9 100644 --- a/src/protocol/parser/mod.rs +++ b/src/protocol/parser/mod.rs @@ -188,8 +188,8 @@ impl Parser { // All imports in the AST are now annotated. We now use the symbol table // to construct the type table. 
- let type_ctx = TypeCtx::new(&symbol_table, &self.heap, &self.modules); - let type_table = TypeTable::new(&type_ctx)?; + let mut type_ctx = TypeCtx::new(&symbol_table, &mut self.heap, &self.modules); + let type_table = TypeTable::new(&mut type_ctx)?; Ok((symbol_table, type_table)) } diff --git a/src/protocol/parser/type_table.rs b/src/protocol/parser/type_table.rs index f7dd586908f6c00ca4d8a62d9b4af19472e61536..4857ae973485b8e07f9d2c75a710e25b62aa5e5e 100644 --- a/src/protocol/parser/type_table.rs +++ b/src/protocol/parser/type_table.rs @@ -228,24 +228,24 @@ impl TypeIterator { } } -#[derive(PartialEq, Eq)] -enum SpecifiedTypeVariant { - // No subtypes - Message, - Bool, - Byte, - Short, - Int, - Long, - String, - // Always one subtype - ArrayOf, - InputOf, - OutputOf, - // Variable number of subtypes, depending on the polymorphic arguments on - // the definition - InstanceOf(DefinitionId, usize) -} +// #[derive(PartialEq, Eq)] +// enum SpecifiedTypeVariant { +// // No subtypes +// Message, +// Bool, +// Byte, +// Short, +// Int, +// Long, +// String, +// // Always one subtype +// ArrayOf, +// InputOf, +// OutputOf, +// // Variable number of subtypes, depending on the polymorphic arguments on +// // the definition +// InstanceOf(DefinitionId, usize) +// } // #[derive(Eq)] // struct SpecifiedType { @@ -332,10 +332,22 @@ enum SpecifiedTypeVariant { // } // } +/// Result from attempting to resolve a `ParserType` using the symbol table and +/// the type table. enum ResolveResult { + /// ParserType is a builtin type BuiltIn, - PolyArg, + /// ParserType points to a polymorphic argument, contains the index of the + /// polymorphic argument in the outermost definition (e.g. we may have + /// structs nested three levels deep, but in the innermost struct we can + /// only use the polyargs that are specified in the type definition of the + /// outermost struct). + PolyArg(usize), + /// ParserType points to a user-defined type that is already resolved in the + /// type table. 
Resolved((RootId, DefinitionId)), + /// ParserType points to a user-defined type that is not yet resolved into + /// the type table. Unresolved((RootId, DefinitionId)) } @@ -353,12 +365,12 @@ pub(crate) struct TypeTable { pub(crate) struct TypeCtx<'a> { symbols: &'a SymbolTable, - heap: &'a Heap, + heap: &'a mut Heap, modules: &'a [LexedModule] } impl<'a> TypeCtx<'a> { - pub(crate) fn new(symbols: &'a SymbolTable, heap: &'a Heap, modules: &'a [LexedModule]) -> Self { + pub(crate) fn new(symbols: &'a SymbolTable, heap: &'a mut Heap, modules: &'a [LexedModule]) -> Self { Self{ symbols, heap, modules } } } @@ -366,7 +378,7 @@ impl<'a> TypeCtx<'a> { impl TypeTable { /// Construct a new type table without any resolved types. Types will be /// resolved on-demand. - pub(crate) fn new(ctx: &TypeCtx) -> Result { + pub(crate) fn new(ctx: &mut TypeCtx) -> Result { // Make sure we're allowed to cast root_id to index into ctx.modules if cfg!(debug_assertions) { for (index, module) in ctx.modules.iter().enumerate() { @@ -382,9 +394,12 @@ impl TypeTable { parser_type_iter: VecDeque::with_capacity(64), }; - for root in ctx.heap.protocol_descriptions.iter() { - for definition_id in &root.definitions { - table.resolve_base_definition(ctx, *definition_id)?; + // TODO: @cleanup Rework this hack + for root_idx in 0..ctx.modules.len() { + let last_definition_idx = ctx.heap[ctx.modules[root_idx].root_id].definitions.len(); + for definition_idx in 0..last_definition_idx { + let definition_id = ctx.heap[ctx.modules[root_idx].root_id].definitions[definition_idx]; + table.resolve_base_definition(ctx, definition_id)?; } } @@ -403,7 +418,7 @@ impl TypeTable { /// This function will resolve just the basic definition of the type, it /// will not handle any of the monomorphized instances of the type. 
- fn resolve_base_definition<'a>(&'a mut self, ctx: &TypeCtx, definition_id: DefinitionId) -> Result<(), ParseError2> { + fn resolve_base_definition<'a>(&'a mut self, ctx: &mut TypeCtx, definition_id: DefinitionId) -> Result<(), ParseError2> { // Check if we have already resolved the base definition if self.lookup.contains_key(&definition_id) { return Ok(()); } @@ -415,10 +430,11 @@ impl TypeTable { let definition = &ctx.heap[definition_id]; let can_pop_breadcrumb = match definition { - Definition::Enum(definition) => self.resolve_base_enum_definition(ctx, root_id, definition), - Definition::Struct(definition) => self.resolve_base_struct_definition(ctx, root_id, definition), - Definition::Component(definition) => self.resolve_base_component_definition(ctx, root_id, definition), - Definition::Function(definition) => self.resolve_base_function_definition(ctx, root_id, definition), + // TODO: @cleanup Borrow rules hax + Definition::Enum(_) => self.resolve_base_enum_definition(ctx, root_id, definition_id), + Definition::Struct(_) => self.resolve_base_struct_definition(ctx, root_id, definition_id), + Definition::Component(_) => self.resolve_base_component_definition(ctx, root_id, definition_id), + Definition::Function(_) => self.resolve_base_function_definition(ctx, root_id, definition_id), }?; // Otherwise: `ingest_resolve_result` has pushed a new breadcrumb @@ -437,8 +453,11 @@ impl TypeTable { /// not instantiate any monomorphized instances of polymorphic enum /// definitions. If a subtype has to be resolved first then this function /// will return `false` after calling `ingest_resolve_result`. 
- fn resolve_base_enum_definition(&mut self, ctx: &TypeCtx, root_id: RootId, definition: &EnumDefinition) -> Result { - debug_assert!(!self.lookup.contains_key(&definition.this.upcast()), "base enum already resolved"); + fn resolve_base_enum_definition(&mut self, ctx: &mut TypeCtx, root_id: RootId, definition_id: DefinitionId) -> Result { + debug_assert!(ctx.heap[definition_id].is_enum()); + debug_assert!(!self.lookup.contains_key(&definition_id), "base enum already resolved"); + + let definition = ctx.heap[definition_id].as_enum(); // Check if the enum should be implemented as a classic enumeration or // a tagged union. Keep track of variant index for error messages. Make @@ -519,13 +538,12 @@ impl TypeTable { let mut poly_args = self.create_initial_poly_args(&definition.poly_vars); for variant in &variants { if let Some(embedded) = variant.parser_type { - self.check_embedded_type_and_modify_poly_args(ctx, &mut poly_args, root_id, embedded)?; + self.check_and_resolve_embedded_type_and_modify_poly_args(ctx, &mut poly_args, root_id, embedded)?; } } let is_polymorph = poly_args.iter().any(|arg| arg.is_in_use); // Insert base definition in type table - let definition_id = definition.this.upcast(); self.lookup.insert(definition_id, DefinedType { ast_definition: definition_id, definition: DefinedTypeVariant::Union(UnionType{ @@ -596,8 +614,11 @@ impl TypeTable { /// Resolves the basic struct definition to an entry in the type table. It /// will not instantiate any monomorphized instances of polymorphic struct /// definitions. 
- fn resolve_base_struct_definition(&mut self, ctx: &TypeCtx, root_id: RootId, definition: &StructDefinition) -> Result { - debug_assert!(!self.lookup.contains_key(&definition.this.upcast()), "base struct already resolved"); + fn resolve_base_struct_definition(&mut self, ctx: &mut TypeCtx, root_id: RootId, definition_id: DefinitionId) -> Result { + debug_assert!(ctx.heap[definition_id].is_struct()); + debug_assert!(!self.lookup.contains_key(&definition_id), "base struct already resolved"); + + let definition = ctx.heap[definition_id].as_struct(); // Make sure all fields point to resolvable types for field_definition in &definition.fields { @@ -625,12 +646,11 @@ impl TypeTable { // Construct representation of polymorphic arguments let mut poly_args = self.create_initial_poly_args(&definition.poly_vars); for field in &fields { - self.check_embedded_type_and_modify_poly_args(ctx, &mut poly_args, root_id, field.parser_type)?; + self.check_and_resolve_embedded_type_and_modify_poly_args(ctx, &mut poly_args, root_id, field.parser_type)?; } let is_polymorph = poly_args.iter().any(|arg| arg.is_in_use); - let definition_id = definition.this.upcast(); self.lookup.insert(definition_id, DefinedType{ ast_definition: definition_id, definition: DefinedTypeVariant::Struct(StructType{ @@ -648,8 +668,12 @@ impl TypeTable { /// Resolves the basic function definition to an entry in the type table. It /// will not instantiate any monomorphized instances of polymorphic function /// definitions. 
-    fn resolve_base_function_definition(&mut self, ctx: &TypeCtx, root_id: RootId, definition: &Function) -> Result {
-        debug_assert!(!self.lookup.contains_key(&definition.this.upcast()), "base function already resolved");
+    fn resolve_base_function_definition(&mut self, ctx: &mut TypeCtx, root_id: RootId, definition_id: DefinitionId) -> Result {
+        debug_assert!(ctx.heap[definition_id].is_function());
+        debug_assert!(!self.lookup.contains_key(&definition_id), "base function already resolved");
+
+        let definition = ctx.heap[definition_id].as_function();
+        let return_type = definition.return_type;
         // Check the return type
         let resolve_result = self.resolve_base_parser_type(
@@ -688,19 +712,18 @@ impl TypeTable {
         // Construct polymorphic arguments
         let mut poly_args = self.create_initial_poly_args(&definition.poly_vars);
-        self.check_embedded_type_and_modify_poly_args(ctx, &mut poly_args, root_id, definition.return_type)?;
+        self.check_and_resolve_embedded_type_and_modify_poly_args(ctx, &mut poly_args, root_id, definition.return_type)?;
         for argument in &arguments {
-            self.check_embedded_type_and_modify_poly_args(ctx, &mut poly_args, root_id, argument.parser_type)?;
+            self.check_and_resolve_embedded_type_and_modify_poly_args(ctx, &mut poly_args, root_id, argument.parser_type)?;
         }
         let is_polymorph = poly_args.iter().any(|arg| arg.is_in_use);
         // Construct entry in type table
-        let definition_id = definition.this.upcast();
         self.lookup.insert(definition_id, DefinedType{
             ast_definition: definition_id,
             definition: DefinedTypeVariant::Function(FunctionType{
-                return_type: definition.return_type,
+                return_type,
                 arguments,
             }),
             poly_args,
@@ -715,8 +738,12 @@ impl TypeTable {
     /// Resolves the basic component definition to an entry in the type table.
     /// It will not instantiate any monomorphized instancees of polymorphic
     /// component definitions.
-    fn resolve_base_component_definition(&mut self, ctx: &TypeCtx, root_id: RootId, definition: &Component) -> Result {
-        debug_assert!(!self.lookup.contains_key(&definition.this.upcast()), "base component already resolved");
+    fn resolve_base_component_definition(&mut self, ctx: &mut TypeCtx, root_id: RootId, definition_id: DefinitionId) -> Result {
+        debug_assert!(ctx.heap[definition_id].is_component());
+        debug_assert!(!self.lookup.contains_key(&definition_id), "base component already resolved");
+
+        let definition = ctx.heap[definition_id].as_component();
+        let component_variant = definition.variant;
         // Check argument types
         for param_id in &definition.parameters {
@@ -748,17 +775,16 @@ impl TypeTable {
         // Construct polymorphic arguments
         let mut poly_args = self.create_initial_poly_args(&definition.poly_vars);
         for argument in &arguments {
-            self.check_embedded_type_and_modify_poly_args(ctx, &mut poly_args, root_id, argument.parser_type)?;
+            self.check_and_resolve_embedded_type_and_modify_poly_args(ctx, &mut poly_args, root_id, argument.parser_type)?;
         }
         let is_polymorph = poly_args.iter().any(|v| v.is_in_use);
         // Construct entry in type table
-        let definition_id = definition.this.upcast();
         self.lookup.insert(definition_id, DefinedType{
             ast_definition: definition_id,
             definition: DefinedTypeVariant::Component(ComponentType{
-                variant: definition.variant,
+                variant: component_variant,
                 arguments,
             }),
             poly_args,
@@ -777,7 +803,7 @@ impl TypeTable {
     /// that the type must be resolved first.
     fn ingest_resolve_result(&mut self, ctx: &TypeCtx, result: ResolveResult) -> Result {
         match result {
-            ResolveResult::BuiltIn | ResolveResult::PolyArg => Ok(true),
+            ResolveResult::BuiltIn | ResolveResult::PolyArg(_) => Ok(true),
             ResolveResult::Resolved(_) => Ok(true),
             ResolveResult::Unresolved((root_id, definition_id)) => {
                 if self.iter.contains(root_id, definition_id) {
@@ -862,9 +888,9 @@ impl TypeTable {
             // Check if the symbolic type is one of the definition's
             // polymorphic arguments. If so then we can halt the
             // execution
-            for poly_arg in poly_vars.iter() {
+            for (poly_arg_idx, poly_arg) in poly_vars.iter().enumerate() {
                 if poly_arg.value == symbolic.identifier.value {
-                    set_resolve_result(ResolveResult::PolyArg);
+                    set_resolve_result(ResolveResult::PolyArg(poly_arg_idx));
                     continue 'resolve_loop;
                 }
             }
@@ -969,67 +995,97 @@ impl TypeTable {
     /// This function will also make sure that if the embedded type has
     /// polymorphic variables itself, that the number of polymorphic variables
     /// matches the number of arguments in the associated definition.
-    fn check_embedded_type_and_modify_poly_args(
-        &mut self, ctx: &TypeCtx, poly_args: &mut [PolyArg], root_id: RootId, embedded_type_id: ParserTypeId,
+    ///
+    /// Finally, for all embedded types (which includes function/component
+    /// arguments and return types) in type definitions we will modify the AST
+    /// when the embedded type is a polymorphic variable or points to another
+    /// user-defined type.
+    fn check_and_resolve_embedded_type_and_modify_poly_args(
+        &mut self, ctx: &mut TypeCtx, poly_args: &mut [PolyArg], root_id: RootId, embedded_type_id: ParserTypeId,
     ) -> Result<(), ParseError2> {
+        use ParserTypeVariant as PTV;
+
         self.parser_type_iter.clear();
         self.parser_type_iter.push_back(embedded_type_id);
         'type_loop: while let Some(embedded_type_id) = self.parser_type_iter.pop_back() {
-            let embedded_type = &ctx.heap[embedded_type_id];
-            if let ParserTypeVariant::Symbolic(symbolic) = &embedded_type.variant {
-                // Check if it matches any of the polymorphic arguments
-                for (_poly_arg_idx, poly_arg) in poly_args.iter_mut().enumerate() {
-                    if poly_arg.identifier.value == symbolic.identifier.value {
-                        poly_arg.is_in_use = true;
-                        // TODO: Set symbolic value as polyarg in heap
-                        // TODO: If we allow higher-kinded types in the future,
-                        //  then we can't continue here, but must resolve the
-                        //  polyargs as well
-                        debug_assert!(symbolic.poly_args.is_empty());
-                        continue 'type_loop;
-                    }
-                }
-
-                // Must match a definition
-                // TODO: Set symbolic value as definition in heap
-                let symbol = ctx.symbols.resolve_namespaced_symbol(root_id, &symbolic.identifier);
-                debug_assert!(symbol.is_some(), "could not resolve symbolic parser type when determining poly args");
-                let (symbol, ident_iter) = symbol.unwrap();
-                debug_assert_eq!(ident_iter.num_remaining(), 0, "no exact symbol match when determining poly args");
-                let (_root_id, definition_id) = symbol.as_definition().unwrap();
-
-                // Must be a struct, enum, or union
-                let defined_type = self.lookup.get(&definition_id).unwrap();
-                if cfg!(debug_assertions) {
-                    let type_class = defined_type.definition.type_class();
-                    debug_assert!(
-                        type_class == TypeClass::Struct || type_class == TypeClass::Enum || type_class == TypeClass::Union,
-                        "embedded type's class is not struct, enum or union"
-                    );
-                }
+            let embedded_type = &mut ctx.heap[embedded_type_id];
 
-                if symbolic.poly_args.len() != defined_type.poly_args.len() {
-                    // Mismatch in number of polymorphic arguments. This is not
-                    //  allowed in type definitions (no inference allowed).
-                    let module_source = &ctx.modules[root_id.index as usize].source;
-                    let number_args_msg = if defined_type.poly_args.is_empty() {
-                        String::from("is not polymorphic")
-                    } else {
-                        format!("accepts {} polymorphic arguments", defined_type.poly_args.len())
-                    };
+            match &mut embedded_type.variant {
+                PTV::Message | PTV::Bool |
+                PTV::Byte | PTV::Short | PTV::Int | PTV::Long |
+                PTV::String => {
+                    // Builtins, no modification/iteration required
+                },
+                PTV::IntegerLiteral | PTV::Inferred => {
+                    // TODO: @hack Allowed for now so we can continue testing
+                    //  the parser/lexer
+                    // debug_assert!(false, "encountered illegal parser type during ParserType/PolyArg modification");
+                    // unreachable!();
+                },
+                PTV::Array(subtype_id) |
+                PTV::Input(subtype_id) |
+                PTV::Output(subtype_id) => {
+                    // Outer type is fixed, but inner type might be symbolic
+                    self.parser_type_iter.push_back(*subtype_id);
+                },
+                PTV::Symbolic(symbolic) => {
+                    for (poly_arg_idx, poly_arg) in poly_args.iter_mut().enumerate() {
+                        if poly_arg.identifier.value == symbolic.identifier.value {
+                            poly_arg.is_in_use = true;
+                            // TODO: If we allow higher-kinded types in the future,
+                            //  then we can't continue here, but must resolve the
+                            //  polyargs as well
+                            debug_assert!(symbolic.poly_args.is_empty(), "got polymorphic arguments to a polymorphic variable");
+                            debug_assert!(symbolic.variant.is_none(), "symbolic parser type's variant already resolved");
+                            symbolic.variant = Some(SymbolicParserTypeVariant::PolyArg(poly_arg_idx));
+                            continue 'type_loop;
+                        }
+                    }
 
-                    return Err(ParseError2::new_error(
-                        module_source, symbolic.identifier.position,
-                        &format!(
-                            "The type '{}' {}, but {} polymorphic arguments were provided",
-                            String::from_utf8_lossy(&symbolic.identifier.value),
-                            number_args_msg, symbolic.poly_args.len()
-                        )
-                    ));
+                    // Must match a definition
+                    let symbol = ctx.symbols.resolve_namespaced_symbol(root_id, &symbolic.identifier);
+                    debug_assert!(symbol.is_some(), "could not resolve symbolic parser type when determining poly args");
+                    let (symbol, ident_iter) = symbol.unwrap();
+                    debug_assert_eq!(ident_iter.num_remaining(), 0, "no exact symbol match when determining poly args");
+                    let (_root_id, definition_id) = symbol.as_definition().unwrap();
+
+                    // Must be a struct, enum, or union
+                    let defined_type = self.lookup.get(&definition_id).unwrap();
+                    if cfg!(debug_assertions) {
+                        let type_class = defined_type.definition.type_class();
+                        debug_assert!(
+                            type_class == TypeClass::Struct || type_class == TypeClass::Enum || type_class == TypeClass::Union,
+                            "embedded type's class is not struct, enum or union"
+                        );
+                    }
+
+                    if symbolic.poly_args.len() != defined_type.poly_args.len() {
+                        // Mismatch in number of polymorphic arguments. This is
+                        //  not allowed in type definitions (no inference is
+                        //  allowed within type definitions, only in bodies of
+                        //  functions/components).
+                        let module_source = &ctx.modules[root_id.index as usize].source;
+                        let number_args_msg = if defined_type.poly_args.is_empty() {
+                            String::from("is not polymorphic")
+                        } else {
+                            format!("accepts {} polymorphic arguments", defined_type.poly_args.len())
+                        };
+
+                        return Err(ParseError2::new_error(
+                            module_source, symbolic.identifier.position,
+                            &format!(
+                                "The type '{}' {}, but {} polymorphic arguments were provided",
+                                String::from_utf8_lossy(&symbolic.identifier.value),
+                                number_args_msg, symbolic.poly_args.len()
+                            )
+                        ));
+                    }
+
+                    self.parser_type_iter.extend(&symbolic.poly_args);
+                    debug_assert!(symbolic.variant.is_none(), "symbolic parser type's variant already resolved");
+                    symbolic.variant = Some(SymbolicParserTypeVariant::Definition(definition_id));
+                }
-
-                self.parser_type_iter.extend(&symbolic.poly_args);
             }
         }
diff --git a/src/runtime/tests.rs b/src/runtime/tests.rs
index 03eba43af3465abf484aeede0228df3a8889bd45..dfdd0474cdebe27e744fa68041d2035addf5b9ad 100644
--- a/src/runtime/tests.rs
+++ b/src/runtime/tests.rs
@@ -1314,7 +1314,7 @@ fn for_msg_byte() {
 
 #[test]
 fn eq_causality() {
-    let test_log_path = Path::new("./logs/eq_no_causality");
+    let test_log_path = Path::new("./logs/eq_causality");
     let pdl = b"
     primitive eq(in a, in b, out c) {
         msg ma = null;
@@ -1368,11 +1368,11 @@ fn eq_causality() {
 fn eq_no_causality() {
     let test_log_path = Path::new("./logs/eq_no_causality");
     let pdl = b"
-    composite eq(in a, in b, out c) {
+    composite eq(in a, in b, out c) {
         channel leftfirsto -> leftfirsti;
         new eqinner(a, b, c, leftfirsto, leftfirsti);
     }
-    primitive eqinner(in a, in b, out c, out leftfirsto, in leftfirsti) {
+    primitive eqinner(in a, in b, out c, out leftfirsto, in leftfirsti) {
         msg ma = null;
         msg mb = null;
         while(true) synchronous {
@@ -1397,7 +1397,7 @@ fn eq_no_causality() {
             }
         }
     }
-    primitive quick_test(in a, in b) {
+    primitive quick_test(in a, in b) {
         msg ma = null;
         while(true) synchronous {
             if (fires(a)) {