Friday, April 28, 2006

A Language-Based Approach to Unifying Events and Threads

Looks like great minds think alike :-)
I have to read this paper to see if I am reinventing a wheel.

Some examples from E

I had ten minutes before lunch, and decided to code up a couple of examples from E in Haskell:

type Cont a = a -> STM ()

-- Blocking caretaker - requests are blocked until the gate is open
-- Note that the gate can be opened/closed repeatedly.
makeCaretaker :: Cont a -> STM (Cont a, Cont Bool)
makeCaretaker target = do
enabled <- newTVar True
return (\m -> do {e <- readTVar enabled ; if e then target m else retry}, writeTVar enabled)

-- Overwritable promise.
makePromise :: STM (Cont a, Cont (Cont a))
makePromise = do
queue <- newTChan
target <- newTVar Nothing
return (\m -> do
t <- readTVar target
case t of
Nothing -> writeTChan queue m
Just r -> r m
, writeTVar target . Just)

Note that both caretaker and promise allow the controller to change its mind - either by repeatedly opening/closing the gate, or by re-resolving the reference. Also, when closed, the caretaker does not signal errors or swallow messages - it keeps the senders blocked. OTOH, promise stores messages in a queue.
Of course, this can be changed (I leave this as an exercise for readers :-) ).
I may consider creating a small library of such patterns and see whether something DSLish comes of it...
But before that I must decide on verification procedure - should I use unit tests or proofs...

Thursday, April 27, 2006


I am trying to manually code up some simple programs using a -> STM () as a continuation type...
And, well, I've stumbled upon some problem with types. Consider a type for a cons list expressed in CPS:

type List a = ((STM (), (a, List a) -> STM ()) -> STM ())

The first STM () gets executed in case of Nil, the second in case of Cons (getting a pair of car and cdr).

Unfortunately, this type fails to check in Haskell. I have to grok yet why, and how to work around it. I would hate to use data:

data List a = List ((STM (), (a, List a) -> STM ()) -> STM ())

as this would reduce clarity by boxing unboxing. Compare:

makeNil :: List a
makeNil (caseNil, _) = caseNil


makeNil :: List a
makeNil = List $ \(caseNil, _)->caseNil

Then again, if I shoot for terseness, then nothing beats:

makeNil :: List a
makeNil = List fst

Envisioning code generation for algebraic data types, I arrive to:

type Cont a = a -> STM ()

type CaseNil = Cont ()
type CaseCons a = Cont (a, List a)
data List a = List {unList :: Cont (CaseNil, CaseCons a)}

makeNil :: () -> List a
makeNil payLoad = List $ \(caseNil, _) -> caseNil payLoad

makeCons :: (a, List a) -> List a
makeCons payLoad = List $ \(_, caseCons) -> caseCons payLoad

Hmm, I should consider using currying, after all...

Wednesday, April 26, 2006


The ideas presented so far are not even close to supporting join calculus (however simple it is).
Fluid still falls short in several areas: it is (practically) first-order (so messages do not carry references), it does not support creaition of objects at runtime, it is data flow driven (call it pull-only, or lazy, if you please).
The first problem is caused by the third, which is caused by input devices being the only active sources of messages.
This was caused by the decision to represent outputs of instances as transactionally lazy (STM a)'s, which promised better composability.
Basically, the choice (for a very simple case with one input and one output) is between:
STM i -> STM o -- transaction transformer paradigm, precludes "push", as only the receiving side of the reference is active
STM i -> (o -> STM ()) -> STM () -- hey, looks like monadic bind!

BTW, I suspect that relationship between these two types can be seen as expression of some property of negation in some logic :-)

Tuesday, April 25, 2006

Devices/distribution = chicken/egg?

Obviously, I/O devices and distribution (multiple vats) can be expressed using each other.
For various reasons I prefer to view devices as primitive, and implement distribution via stubs/scions (this is consistent with E paradigm).

What is not consistent with E paradigm, though, is that a vat ceases to be a single thread - it is a family of threads, namely one thread for (transactional) instances, and one thread for each device. I cannot understand the implications at the moment (mostly for robustness), but this is something to keep an eye on. OTOH, I can always bind this family to a single OS thread, which means I will match E perfectly, right? Must ensure this way of running vats is enforced, though.

Some code functions+devices

Here is some Haskell code to illustrate my previous post (I cheat a bit with tuples in unary cases):

-- AND join in workflow terminology.
binaryJoin :: (STM a, STM b) -> STM (a, b)
binaryJoin (sa, sb) = do
a <- sa
b <- sb
return (a, b)

-- Very simple code function.
inc :: Num a => STM a -> STM a
inc sa = do
a <- sa
return (a+1)

-- Code that delegates to other code.
add :: Num a => (STM a, STM a) -> STM a
add inputs = do
(a, b) <- binaryJoin inputs
return (a+b)

This code covers so called instances in terminology of Mark's thesis.
What about devices? For output the answer looks simple: code for output devices has type:
(STM i1, ..., STM im) -> IO (), but such object would probably require a separate thread to run (can we say that's because, unlike STM, IO is not composable?). Thus, I suggest splitting (code for) output devices into a transactional matcher part (if needed at all) of type (STM i1, ..., STM im) -> STM a and a reactive handler part of type a -> IO (). This gives the framework runtime a chance to run matchers transactionally, and only when one of them matches, run the handler.
So, we have (pseudo-Haskell):
data OutputDevice = forall a . OutputDevice ((STM i1, ..., STM im) -> STM a) (a -> IO ())

What about input devices? The first idea is to type them:
(a -> STM ()) -> IO ()
with the contract that the device performs some input, then applies the (parameterised) transaction to it, then runs it.
Why not just IO a? Deal, I can minimize input device's responsibility by saying the framework will "apply and run", while the device will just read and return, so the type is:
IO a

Well, by its nature (blocking input) each input device will be run on a separate thread, injecting its values into the transactional thread via TChan'nels. I don't like this (especially when we dealt with multithreading issues of output devices), but currently see no other alternative. I have to explore the Haskell standard library to see, whether the input is mostly blocking.
If nothing better can be done with input devices, then maybe it makes sense to run output device on separate threads as well, and let STM runtime to handle blocking (so output devices will be typed (STM i1, ..., STM im) -> IO (), or even a -> IO ())...

Some thoughts

Ok, I decided to stop fighting Haskell on the particular technical issues, and instead sit back and think a little.

What is an object? From the implementors point of view, of course :-)
Let's assume in E tradition that this is something that upon receiving a message can change its state and post a set of messages along the references it holds.

Nothing interesting so far. It becomes better when we notice that multiple objects often share some logic, which we might call "code" or "class" or "prototype". Let's call it code. What is this code? In Haskell, I postulate, it is natural to see this code as a function from message and object state to set of deliveries and new state. But what is the exact type of this function? Here some experience with join calculus suggest a nice trick: both the state and the messages can be modeled uniformly!
My current proposal is to say that object code is a function of type:
(STM i1, ..., STM im) -> (STM o1, ..., STM on)
The inputs are transactions that represent reading either public "facets" or private "state", and the outputs are transactions that represent either sending messages to other objects or updating private state (in a roundabout way - the transaction must be executed to obtain the "written" value - this simplifies composability of multiple code functions).
(Of course this type can be curried, but I like its simmetry).
is and os stand for (possibly polymorphic) types (not sure yet how to better handle multiple messages on the same ouput - and then, maybe on the same input, as well?).
Note that none of the output types mention the identity of the receiver - they are indeed types of messages, not of deliveries.

So how and who attaches the receivers to the messages? I say it is responsibility of (the instance of) the object that uses this "code". Note that I ended up with three parties: authors of the code, users of the code, and authors of the framework (that's me!) that are responsible for helping users to use the code easier. Now if only this third party knew how to achieve that :-)

Monday, April 24, 2006

No Data.Typeable for polymorphic types :(

Alas, after spending some time with Data.Typeable I understood it is (for some reason) limited to monomorphic types.
Thus I hesitate to use it as a way to glue (user) delivery handling functions to the (system) framework.
Hmm, should I invert the logic, and instead of functions from (reading) STMs to Effect use tuples of (writing) STMs and Effect?

Probably I will get smarter after lunch :)

Friday, April 21, 2006

The first problem

Well, I have the first subset of runtime working - objects with a single facet (so state or concurrency are not possible).
That was surprisingly easy, despite I needed to understand existential types.

Now I have the first problem - to introduce multi-faceted lexical compounds I need to write a function taking a function from several (let's say n) STM Delivery to Effect and returning exactly n TChan Delivery and an Effect...
A wrong type:
close :: ([STM Delivery] -> Effect) -> ([TChan Delivery], Effect)
as it does not constrain the lengths of the lists to be equal (and worse, operationally close does not know how many channels to create given just a function from a list to effect).

When I figure out how to do this (either by adding explicit n to the left side of the type, or by some GADT trick), I will generalize the problem to parametric (Delivery o a)...


Ok, I said I will "try and create a toy implementation of E-like system", and so I started this blog to document my thinking and its changes.

First of all, I am happy to start working on this, as my previous experience with Haskell was limited to school assignments - now I can finally use all the cool features, including type classes, existential types, composable transactions, and more :-)

I quickly rolled a parser for join calculus using Parsec, not that this was needed at the moment, but it was fun and a good warm-up.

Now I am pondering over the ways to skin... err, to modularize the runtime.
One idea is to have multiple languages interoperating in my vats (word "vat" is used without permission of E authors :-) ), so I want to separate semantics of the process definition language from that of the "robustness framework".

I will start from defining the notions of delivery (or message), object (the target of deliveries), and effect - what happens upon delivery. I need at least IO effects to model output devices (in terms of E thesis), and also (STM [Delivery]) effects to allow objects send messages to other objects (transactionally - in the same vat). I think I will need forall to allow multiple implementations of objects to cooperate... Time to learn some Haskell-fu :-)

BTW, if you wonder: Fluid is the code name of this project.