# Previous Consensus Algorithm
## Introduction
The previous consensus algorithm (the one within Reowolf 1.0 and 1.1) had support for speculative execution. This means that the user may (directly or indirectly) fork the execution of a component, at which point that one execution becomes two executions. At some point the component will have to choose which particular execution is committed to memory. This is one reason for the existence of a `sync` block: a block of code within which one may fork, and at the end of which the component must choose the single execution that is committed to memory.
With speculative execution we may have multiple components that are all forking their execution and sending/receiving messages. So we do not end up with one component that has to choose its final execution, but with all components having to choose their final execution. Note that one component's execution may restrict the validity of another component's execution. As an example, suppose the following components and their executions:
- Component A: has two executions:
  - Execution A1: Component A has sent a message to component B.
  - Execution A2: Component A has received a message from component B.
- Component B: has three executions:
  - Execution B1: Component B has received a message from component A, then sent a message back to component A.
  - Execution B2: Component B has received a message from component A.
  - Execution B3: Component B has sent two messages to component A.
Without delving into too much detail, one may see that the only valid solution to this problem is the combination of `A1` and `B2`.
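For illustration, these executions could arise from PDL code along the following lines. This is only a sketch: the component and port names are made up, it assumes that `fork ... or fork ...` branches may be chained, and it uses the `sync`/`fork`/`put`/`get` constructs discussed in the next section.
```
primitive comp_a(out<u32> to_b, in<u32> from_b) {
    sync {
        fork {
            put(to_b, 1);     // execution A1: send one message to B
        } or fork {
            get(from_b);      // execution A2: receive one message from B
        }
    }
}

primitive comp_b(in<u32> from_a, out<u32> to_a) {
    sync {
        fork {
            get(from_a);      // execution B1: receive a message from A ...
            put(to_a, 2);     // ... then send one back
        } or fork {
            get(from_a);      // execution B2: only receive a message from A
        } or fork {
            put(to_a, 3);     // execution B3: send two messages to A
            put(to_a, 4);
        }
    }
}
```
Matching sent messages against received messages, the only combination in which every `put` is paired with a `get` on the other side is indeed `A1` together with `B2`.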
## Component Execution Tree and Execution Traces
Components execute PDL code, which may contain calls to `fork`, `put`, and `get`. A `fork` explicitly forks the execution of the code. A `put` sends a message to a particular component, and a `get` receives a message from a component and forks (as explained later).
As the component enters a `sync` block, it has only one possible execution. But as stated above there are reasons for the execution to split up. These individual executions may themselves split up later, thereby forming a so-called "execution tree":
```
                              +-----+     +------+
                        +---->| put |---->| sync |
                        |     +-----+     | end  |
+-------+     +------+  |                 +------+
| sync  |---->| fork |--+
| start |     +------+  |     +-----+     +------+
+-------+               +---->| get |--+->| sync |
                              +-----+  |  | end  |
                                       |  +------+
                                       |
                                       |  +------+
                                       +->| sync |
                                       |  | end  |
                                       |  +------+
                                       |
                                       +-> ...
```
This corresponds to the following PDL code:
```
primitive some_component(out<u32> tx, in<u32> rx) {
    sync {
        fork {
            put(tx, 5);
        } or fork {
            get(rx, 1);
        }
    }
}
```
We can now see why the execution tree is called a "tree". There are several things to note about the execution tree. Firstly, some executions have completed and form a complete trace: starting from the "sync start", a complete trace is represented by a path that runs all the way to a "sync end". Conversely, there is one trace that is incomplete: it is still waiting at the `get` for a message. We'll call a place where the execution splits into multiple branches/executions a "branching point".
Note that the branching points can in the *general case* only be discovered at runtime. Any code may have control flow points like `if` statements, or `while` loops. Consider the following code:
```
primitive some_component(out<u32> tx, bool which_way) {
    sync {
        if (which_way) {
            fork {
                put(tx, 1);
            } or fork {
                put(tx, 2);
            }
        } else {
            put(tx, 3);
        }
    }
}
```
Depending on the value of `which_way` we produce two different execution trees (of which we can determine all traces). The compiler cannot decide at compile time which execution tree will be generated.
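Spelled out for this example, the two possible execution trees contain the following traces:
```
which_way == true  (two traces):
    sync start --> fork --> put(tx, 1) --> sync end
    sync start --> fork --> put(tx, 2) --> sync end

which_way == false (one trace):
    sync start --> put(tx, 3) --> sync end
```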
Note that the `get` branching points may have an arbitrary number of forked executions arising from them. We'll call them "waiting points". In the *general case* we cannot figure out how many forked executions arise from a `get` branching point. The reason may be illustrated by the following simple example:
```
primitive sender(out<u32> tx, u32 num_forks) {
    sync {
        auto fork_counter = 0;
        while (fork_counter < num_forks) {
            fork {
                put(tx, fork_counter);
            } or fork { } // empty case
            fork_counter = fork_counter + 1; // advance so the loop terminates
        }
        put(tx, num_forks);
    }
}

primitive receiver(in<u32> rx) {
    u32[] values = {};
    sync {
        bool keep_going = true;
        while (keep_going) {
            auto new_value = get(rx);
            values @= { new_value }; // append
            fork {
                keep_going = false;
            } or fork { }
        }
    }
}
```
If the sender is connected to the receiver, then the sender will send anywhere between `1` and `num_forks` messages, depending on a user-supplied parameter which we cannot know at compile time. Considered in isolation, the receiver can generate an infinite number of forked executions. We can analyze that the receiver will have at most `num_forks + 1` forked executions arising from its `get` branching point, but the compiler cannot.
For this reason a `get` branching point is a "waiting point": the runtime will always need to keep a copy of the component's memory and execution state at the moment it encounters a `get` instruction. It might just be that another component sends it another message, such that it needs to produce a new forked execution.
A `get` operation is also a "blocking operation": in the *general case* the component needs to know the value produced by the `get` operation in order to continue its execution (perhaps more specifically: the first time a `read` operation is performed on the variable that will store the transmitted message). In the simplest case the received message contains a boolean that is used in the test expression of an `if` statement: we'll need to have actually received that boolean before we can decide which control flow path to take. Speculating on the contents of messages is too computationally expensive to be taken seriously.
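As a small (hypothetical) illustration of that simplest case, the following sketch reuses only the PDL constructs from the examples above, with made-up names: the component branches on a received boolean, so the `if` statement cannot be evaluated before the message has actually arrived.
```
primitive branch_on_message(in<bool> decider, out<u32> tx) {
    sync {
        // Execution blocks here: the boolean must have been received
        // before the test expression below can be evaluated.
        auto go_left = get(decider);
        if (go_left) {
            put(tx, 1);
        } else {
            put(tx, 2);
        }
    }
}
```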