Thursday, August 30, 2007

Defunc'ed Join Calculus, now in Haskell


>module J where

This module defines alternative semantics for join calculus (JC) by means of providing an interpreter.

Major differences from standard semantics are:
1. "Kinda closure-converted/defunctionalized" form of object program
- we do not have nested definitions defining scope of names, all pieces of code are "kinda combinators".
2. Making code closed required introducing notion of object instances (data Obj).
- these objects are just identities (plus "vtable"), and could have been defined as just (Int, Code).
- however, this module models identity of the object as history of reactions (joins) that lead to its creation.
3. As all state is carried by messages, the history must be carried by them, as well.
- as a side-effect, we get unlimited stack-, err, graph-trace of the executed program!
- not that this is practical :)

>import Data.List as List
>import Data.Maybe (listToMaybe)

Let us model monadic JC, i.e., messages carrying just one channel as their body.
(see http://citeseer.ist.psu.edu/fournet95reflexive.html, Chapter 5).

Messages are the main part of the system, carrying all its mutable state.

"cause" is Nothing for primordial messages.

>data Msg = Msg {to::Channel, body::Channel, cause::Maybe Join} -- ("from" is redundant, as both causes have it as their "to")
> deriving (Eq, Ord)
>instance Show(Msg) where
> show (Msg t b Nothing) = (show t)++"("++(show b)++")"
> show (Msg t b (Just j)) = (show j)++" -> "++(show t)++"("++(show b)++")"

Join is just a couple of messages that reacted.

Forward reference: Messages of Join have equal "to", left one has L port id, right one has R.

>data Join = Join Msg Msg
> deriving (Eq, Ord)
>instance Show(Join) where
> show (Join m1 m2) = "<"++(show m1)++", "++(show m2)++">"

Channels are destinations of messages (and in standard monadic JC the only payload of messages).
Channels are identified by a pair of object identity and a port.

>data Channel = Ch {obj::Obj, port::PortId}
> deriving (Eq, Ord)
>instance Show(Channel) where
> show (Ch o p) = (show o)++""++(show p)

As this is core JC, each definition/code/object has just two channels, identified by their port ids - Left or Right.

>data PortId = L | R
> deriving (Show, Eq, Ord)

Obj is in fact a pair of object identity and procedure for handling messages ("kinda vtable").
Identity is formed by NewId qualified by birthjoin (think birthday).

"birthjoin" is Nothing for primordial objects.
TODO: "code" should not be a part of Eq's "key"

>data Obj = Obj {id::NewId, code::Code, birthjoin::Maybe Join}
> deriving (Eq, Ord)
>instance Show(Obj) where
> show (Obj i _ (Just j)) = (show j)++"->"++i
> show (Obj i _ Nothing) = i

NewId differentiates multiple objects created during reaction to the same join.
In more bytecode-like implementation this would be integer.

>type NewId = String

Code is the procedure to react to a join by creating new objects and sending new messages.
Code is closed, and the only input (capabilities!) it has are:
1. Its left and right channels - via ThisChannel
2. Messages in them (parts of the join causing the reaction) - via ThisMsg
3. Channels in newly created objects - via NewChannel

The reaction is a list of codes for new objects, keyed by their NewIds,
and a list of to/body references for new messages.
TODO: introduce datatypes for these pairs.

>data Code = Code [(NewId, Code)] [(Ref, Ref)]
> deriving (Show, Eq, Ord)

Reference - means for constructing capabilities.

>data Ref = ThisChannel PortId
> | ThisMsg PortId
> | NewChannel NewId PortId
> deriving (Show, Eq, Ord)

Config of the system is currently just a list of messages (order is important to define deterministic scheduling).

>type Config = [Msg]

Context is "a config with a hole". It is assumed that in this implementation hole is at the end of list.

>type Context = [Msg]

We follow the usual practice:
1. Find in the config a redex and its surrounding context.
2. Reduce redex.
3. Plug the result back.

One might use Danvy's refocusing here (link?),
but I am reluctant to optimize for speed just right now.

Just find the first(!) pair of messages having the same object and having left and right channels as destinations:

>findRedex :: Config -> Maybe (Join, Context)
>findRedex ms = listToMaybe $ [(Join m1 m2, ms \\ [m1, m2])| m1 <- ms, m2 <- ms, (obj $ to m1) == (obj $ to m2), (port $ to m1) == L, (port $ to m2) == R]

Basically translate capabilities to channels:

>reduce :: Join -> [Msg]
>reduce j@(Join l r) = (List.map s2m) sends where
> Code news sends = code this
> this = obj $ to l
> s2m (to, body) = Msg (r2c to) (r2c body) $ Just j
> r2c (ThisChannel p) = Ch this p
> r2c (ThisMsg L) = body l
> r2c (ThisMsg R) = body r
> r2c (NewChannel i p) = Ch (i2o i) p
> i2o i = case lookup i news of
> Just c -> Obj i c $ Just j
> Nothing -> error ("broken ref "++(show i))

Put result back into the hole:

>plugBack :: Context -> [Msg] -> Config
>plugBack c ms = c ++ ms

TODO: externalise scheduling out of findRedex/plugBack (probably use lazy evaluation - link?).

Now combine all three functions to make a single step:

>step :: Config -> Maybe Config
>step c = case findRedex c of
> Nothing -> Nothing
> Just (j@(Join l r), c2) -> Just $ plugBack c2 $ reduce j

And now the most dangerous part, as it is potentially non-terminating (unlike step):

>run :: Config -> Config
>run c = case step c of
> Nothing -> c
> Just c2 -> run c2

A few examples:

>c1 = Code [] [(ThisMsg L, ThisMsg R)]
>c2 = Code [] [(ThisMsg L, ThisChannel R), (ThisChannel L, ThisMsg R)]
>--c3 = Code [("x", c2)] [(ThisMsg L, NewChannel "x" L), (ThisMsg R, NewChannel "x" R)]

>ms1 = [Msg (Ch o1 L) (Ch o2 R) Nothing, Msg (Ch o1 R) (Ch o2 R) Nothing, Msg (Ch o2 L) (Ch o1 L) Nothing]
> where o1 = Obj "a" c1 Nothing ; o2 = Obj "b" c2 Nothing

Try "step ms1" and "run ms1" to get an idea.

The next installment should probably include a little compiler from LC terms to Config.

1 comment:

Andris Birkmanis said...

Careful readers might have noticed, that the interpreter does not (directly) support core JC. Particularly, as there are no closures, nested defs will be problematic. However, there is still a chance that terms of core JC can be efficiently encoded in "closureless" core JC - would make a nice theorem.