Thursday, August 30, 2007

Defunc'ed Join Calculus, now in Haskell


>module J where

This module defines alternative semantics for join calculus (JC) by means of providing an interpreter.

Major differences from standard semantics are:
1. "Kinda closure-converted/defunctionalized" form of object program
- we do not have nested definitions defining scope of names, all pieces of code are "kinda combinators".
2. Making code closed required introducing notion of object instances (data Obj).
- these objects are just identities (plus "vtable"), and could have been defined as just (Int, Code).
- however, this module models identity of the object as history of reactions (joins) that lead to its creation.
3. As all state is carried by messages, the history must be carried by them, as well.
- as a side-effect, we get unlimited stack-, err, graph-trace of the executed program!
- not that this is practical :)

>import Data.List as List
>import Data.Maybe (listToMaybe)

Let us model monadic JC, i.e., messages carrying just one channel as their body.
(see http://citeseer.ist.psu.edu/fournet95reflexive.html, Chapter 5).

Messages are the main part of the system, carrying all its mutable state.

"cause" is Nothing for primordial messages.

>data Msg = Msg {to::Channel, body::Channel, cause::Maybe Join} -- ("from" is redundant, as both causes have it as their "to")
> deriving (Eq, Ord)
>instance Show(Msg) where
> show (Msg t b Nothing) = (show t)++"("++(show b)++")"
> show (Msg t b (Just j)) = (show j)++" -> "++(show t)++"("++(show b)++")"

Join is just a couple of messages that reacted.

Forward reference: Messages of Join have equal "to", left one has L port id, right one has R.

>data Join = Join Msg Msg
> deriving (Eq, Ord)
>instance Show(Join) where
> show (Join m1 m2) = "<"++(show m1)++", "++(show m2)++">"

Channels are destinations of messages (and in standard monadic JC the only payload of messages).
Channels are identified by a pair of object identity and a port.

>data Channel = Ch {obj::Obj, port::PortId}
> deriving (Eq, Ord)
>instance Show(Channel) where
> show (Ch o p) = (show o)++""++(show p)

As this is core JC, each definition/code/object has just two channels, identified by their port ids - Left or Right.

>data PortId = L | R
> deriving (Show, Eq, Ord)

Obj is in fact a pair of object identity and procedure for handling messages ("kinda vtable").
Identity is formed by NewId qualified by birthjoin (think birthday).

"birthjoin" is Nothing for primordial objects.
TODO: "code" should not be a part of Eq's "key"

>data Obj = Obj {id::NewId, code::Code, birthjoin::Maybe Join}
> deriving (Eq, Ord)
>instance Show(Obj) where
> show (Obj i _ (Just j)) = (show j)++"->"++i
> show (Obj i _ Nothing) = i

NewId differentiates multiple objects created during reaction to the same join.
In more bytecode-like implementation this would be integer.

>type NewId = String

Code is the procedure to react to a join by creating new objects and sending new messages.
Code is closed, and the only input (capabilities!) it has are:
1. Its left and right channels - via ThisChannel
2. Messages in them (parts of the join causing the reaction) - via ThisMsg
3. Channels in newly created objects - via NewChannel

The reaction is a list of codes for new objects, keyed by their NewIds,
and a list of to/body references for new messages.
TODO: introduce datatypes for these pairs.

>data Code = Code [(NewId, Code)] [(Ref, Ref)]
> deriving (Show, Eq, Ord)

Reference - means for constructing capabilities.

>data Ref = ThisChannel PortId
> | ThisMsg PortId
> | NewChannel NewId PortId
> deriving (Show, Eq, Ord)

Config of the system is currently just a list of messages (order is important to define deterministic scheduling).

>type Config = [Msg]

Context is "a config with a hole". It is assumed that in this implementation hole is at the end of list.

>type Context = [Msg]

We follow the usual practice:
1. Find in the config a redex and its surrounding context.
2. Reduce redex.
3. Plug the result back.

One might use Danvy's refocusing here (link?),
but I am reluctant to optimize for speed just right now.

Just find the first(!) pair of messages having the same object and having left and right channels as destinations:

>findRedex :: Config -> Maybe (Join, Context)
>findRedex ms = listToMaybe $ [(Join m1 m2, ms \\ [m1, m2])| m1 <- ms, m2 <- ms, (obj $ to m1) == (obj $ to m2), (port $ to m1) == L, (port $ to m2) == R]

Basically translate capabilities to channels:

>reduce :: Join -> [Msg]
>reduce j@(Join l r) = (List.map s2m) sends where
> Code news sends = code this
> this = obj $ to l
> s2m (to, body) = Msg (r2c to) (r2c body) $ Just j
> r2c (ThisChannel p) = Ch this p
> r2c (ThisMsg L) = body l
> r2c (ThisMsg R) = body r
> r2c (NewChannel i p) = Ch (i2o i) p
> i2o i = case lookup i news of
> Just c -> Obj i c $ Just j
> Nothing -> error ("broken ref "++(show i))

Put result back into the hole:

>plugBack :: Context -> [Msg] -> Config
>plugBack c ms = c ++ ms

TODO: externalise scheduling out of findRedex/plugBack (probably use lazy evaluation - link?).

Now combine all three functions to make a single step:

>step :: Config -> Maybe Config
>step c = case findRedex c of
> Nothing -> Nothing
> Just (j@(Join l r), c2) -> Just $ plugBack c2 $ reduce j

And now the most dangerous part, as it is potentially non-terminating (unlike step):

>run :: Config -> Config
>run c = case step c of
> Nothing -> c
> Just c2 -> run c2

A few examples:

>c1 = Code [] [(ThisMsg L, ThisMsg R)]
>c2 = Code [] [(ThisMsg L, ThisChannel R), (ThisChannel L, ThisMsg R)]
>--c3 = Code [("x", c2)] [(ThisMsg L, NewChannel "x" L), (ThisMsg R, NewChannel "x" R)]

>ms1 = [Msg (Ch o1 L) (Ch o2 R) Nothing, Msg (Ch o1 R) (Ch o2 R) Nothing, Msg (Ch o2 L) (Ch o1 L) Nothing]
> where o1 = Obj "a" c1 Nothing ; o2 = Obj "b" c2 Nothing

Try "step ms1" and "run ms1" to get an idea.

The next installment should probably include a little compiler from LC terms to Config.

Monday, August 27, 2007

Programming Languages: Application and Interpretation

I have to find time to read this textbook - if only to verify my background.

Friday, August 24, 2007

"Defunctionalization" of Join Calculus

(See Defunctionalization at Work for a background on the defunctionalization technique, The reflexive CHAM and the join-calculus for a background on join calculus.) What I do is not really defunctionalization, but I miss the right term for that.

Lambda calculus can be encoded in join calculus (JC). Algebraic datatypes can be encoded in either. A decent functional PL (think ML) can be encoded in JC. Yesterday I had an idea of looking at the reversal of this encoding - take a JC term, and tell if it is an image of ML term under this encoding (similar in spirit to recovering user-lambdas from brutally CPSed terms). Better yet, take a JC term and split it into functional and (emmm...) superfunctional pieces.
Then I wandered into implementation ideas, and suddenly decided I need to explore some kind of first order JC. I mean, definitions in JC nest and form structures very similar in spirit to closures. What if I closure-convert JC semantics, tinker it a bit, and them put closures back?

I analyzed pages 5-6 of the JC paper, and started scribbling. Spoiled by UML, I didn't use sets or functions, only informal diagrams. "I'll validate it later", thought I.

Everything went fine, until I stumbled upon necessity to construct new instances. Need for elegant generation of fresh names and manifestedly safe references to them was what stopped me... Will have to check literature for inspiration.

An Accidental Simula User: Luca Cardelli

A personal view of the history behind the calculus of objects!

Accidentally, just last night I considered playing with defunctionalized Join Calculus (a good topic for the next post).

Saturday, August 18, 2007

Analyzing the Environment Structure of Higher-Order Languages using Frame Strings

Besides tackling a really fundamental issue, the paper has an interesting preface.