Reduce, Reuse, Recycle
Both code and data structures in uFork are immutable. This makes it safe to share these structures among actors. We can take advantage of commonality to reduce our memory footprint, and also to pass information efficiently between actors (without making copies).
When two or more actors need access to the same mutable state, we define a “state holder” actor for that state. Since shared data is immutable, the state-holder maintains the mutable state privately, and the sharing actors coordinate by sending messages to the state-holder. We will define a “storage cell” actor as a simple state-holder example.
Common Tail-Sequences
In our previous examples,
you may have noticed
that the code for many actors
ends with the some of the same instructions.
Almost all actors end with
an end commit instruction.
Many actors preceed this with
an actor send instruction.
A common pattern is to send to a customer
provided as msg 1.
These patterns are so common,
they are provided in the
std.asm
module (usually imported with an std prefix).
; Common Tail-Sequences
cust_send: ; msg
msg 1 ; msg cust
send_msg: ; msg cust
actor send ; --
sink_beh: ; _ <- _
commit:
end commit
It may come as a surprise that these instructions may not be sequential in memory. In fact, there is no way to tell if they are sequential or not! This is because uFork memory is organized into linked structures called quad-cells. Quad-cells are the minimum addressable unit. They have 4 value fields. Each value is a fixnum, a capability (actor address), or a pointer to a quad-cell. There is no conversion between these types, thus you can't determine the numeric value of an address. The code fragment above is illustrated by this diagram:
cust_send ---> [#instr_t, "msg", 1, k]
|
+--------------------+
|
V
send_msg ----> [#instr_t, "actor", "send", k]
|
|
sink_beh ------+---------------------------+
|
V
commit ------> [#instr_t, "end", "commit", #?]
This code fragment consists of 3 instructions
and occupies 3 quad-cells.
This first field of each quad
designates the type of the quad,
in this case #instr_t
for executable instructions.
The labels cust_send, send_msg,
sink_beh, and commit
denote pointers to these instruction quads.
The final field of each quad
contains a pointer to the next instruction
in the code sequence (the continuation),
regardless of where it resides in memory.
By default,
the assembler translates a sequence of instructions
into a linked-list of quad-cells.
However, the continuation
can always be specified explicitly
by including an extra argument
at the end of an instruction.
For example,
cust_send could be written as:
cust_send: ; msg
msg 1 send_msg ; msg cust
A more common approach is
to use a ref pseudo-instruction
to provide the continuation,
as shown here:
cust_send: ; msg
msg 1 ; msg cust
ref send_msg
A ref can also be used
to introduce a named constant.
The following code associates the label E_OK
with the fixnum value 0.
E_OK: ; not an error
ref 0
A ref occupies no storage.
It's just a reference to another value.
You should use ref std.commit
instead of end commit
to avoid creating redundant instruction-quads.
The send_msg and cust_send
labels allow sharing of longer common tail-sequences.
Watch for opportunities to use them.
Static Data
We will use the ref mechanism
to define a set of operation-codes,
and then use the dict_t data constructor
to define a dictionary of code-addresses.
read_op:
ref 0
write_op:
ref 1
CAS_op:
ref 2
op_table:
dict_t read_op read
dict_t write_op write
dict_t CAS_op CAS
ref #nil
The dict_t data constructor
defines statically-initialized instances
of type #dict_t.
It takes a key, a value,
and an optional next pointer.
As with instructions,
if the final next pointer is missing,
it defaults to the next statement in the source code.
The #nil value
represents the empty dictionary,
which terminates the sequence.
Note that the named constants
(#?, #nil, #f and #t)
are pointers to read-only memory.
Method Dispatch Table
We begin defining the “storage cell” behavior
by using the staically-defined op_table
to dispatch to the code specified by the op
selector in the incoming message.
beh:
cell_beh: ; value <- cust,op,args
push op_table ; op_table
msg 2 ; op_table op
dict get ; op_code
dup 1 ; op_code op_code
typeq #instr_t ; op_code is_instr(op_code)
if_not std.abort ; op_code
jump ; --
If the dictionary lookup produces a code-pointer,
we jump to the selected method.
Otherwise, the actor transaction is aborted.
The message is discarded
and there is no effect on the actor or the system.
On some platforms,
an “auditor” may be notified
and the anomaly can be reported.
Read Method
The method code is executed in the same transaction,
with the same message and the same private state
as the method dispatch.
The “read” method simply sends the current value
to the cust provided in the message.
read: ; value <- cust,#read,_
state 0 ; value
ref std.cust_send
Write Method
The “write” method updates the cell's private state
with a new value' from the message,
and sends its own address to the provided cust.
write: ; value <- cust,#write,value'
msg -2 ; value'
push cell_beh ; value' cell_beh
actor become ; --
actor self ; SELF
ref std.cust_send
Sending the actor's address to the customer is the “sync signal” pattern. It allows the customer to continue processing, knowing that the write has been completed.
Compare-and-Swap Method
You might think that “read” and “write” are all the operations you need for a mutable storage cell. However, consider trying to do something as simple as incrementing a shared counter. This requires a read-modify-write cycle. With only a read/write API, the cell's value could change between the read of the original value and the write of the modified value. This is called a “data race”. One mechanism to avoid this kind of hazard is to add a “compare-and-swap” operation to the API.
CAS: ; value <- cust,#CAS,old,new
msg 3 ; old
state 0 ; old value
cmp eq ; old==value
if_not read ; --
msg -3 ; new
push cell_beh ; new cell_beh
actor become ; --
ref read
The “CAS” method request message provides
the expected old value
and the desired new value.
If old matches the current value,
then the cell's private state is updated to new.
In either case, “CAS” acts like “read”
(literally reusing the same code),
sending value to cust.
Note that, by definition, this is the value
before any updates.
If the “CAS” cust receives
the old value they expected,
they know the operation was successful.
Otherwise, they can use the returned value
as old and retry the operation,
possibly with a recalculated new.
With actors, nothing blocks
and everyone makes progress,
even when there is contention.
The one-at-a-time transactional nature
of actor event handling ensures consistency.
Usage Example
An actor with cas_add behavior
uses a cell's “CAS” method
to safely update the cell's value.
The actor's state includes
the expected old value
(initialized to #?),
the increment to add,
and the cell holding the value.
It receives the cell's
updated old' value
as the customer of a “CAS” operation.
cas_add: ; old,inc,cell <- old'
state 1 ; old
msg 0 ; old old'
cmp eq ; old==old'
if std.commit ; --
If the old' value received
matches the expected old value,
the operation was successful
and there is no more work to do.
state 0 ; old,inc,cell
part 2 ; cell inc old
drop 1 ; cell inc
msg 0 ; cell inc old'
Otherwise,
we replace the old value
with the updated old'.
dup 2 ; cell inc old' inc old'
alu add ; cell inc old' new=inc+old'
pick 2 ; cell inc old' new old'
push CAS_op ; cell inc old' new old' #CAS
actor self ; cell inc old' new old' #CAS cust=SELF
pair 3 ; cell inc old' cust,#CAS,old',new
state -2 ; cell inc old' cust,#CAS,old',new cell
actor send ; cell inc old'
Then we recalculate new=inc+old'
and prepare another “CAS” request
for the cell,
designating ourself as the customer.
pair 2 ; old',inc,cell
push cas_add ; old',inc,cell cas_add
actor become ; --
ref std.commit
Finally, we update our private state and commit the transaction.
Complete Module
The complete module is available in the uFork library at https://ufork.org/lib/cell.asm.
Open in Playground
.import
dev: "https://ufork.org/lib/dev.asm"
std: "https://ufork.org/lib/std.asm"
lib: "https://ufork.org/lib/lib.asm"
...
beh:
cell_beh: ; value <- cust,op,args
...
boot: ; _ <- {caps}
...
test: ; judge <- {caps}
...
.export
beh
read_op
write_op
CAS_op
boot
test
The module exports beh,
rather than cell_beh,
so when the module is imported
with a namespace prefix of cell
it can be referenced as cell.beh
The module also defines demonstration and testing procedures which are a subject for future tutorials.