Friday, August 18, 2006

"It's machines all the way down" language

Seeing how concurrency support steadily becomes a must-have for a programming language, it's very sad to observe that details of scheduling concurrent processes are almost always hidden from the programmer.

{Rant about resource-aware programs}.

Would not it be good to allow the programmer to explicitly provide schedulers when needed? And resource management in general.

The idea of what I am thinking about goes as follows:

Execution history of the program forms a tree, where root represents start of the execution (initiated "supernaturally"), edges represent causality, branching represents concurrent execution, some leaves are final and represent completed (quescent??) execution, while other leaves are "growing points" and represent schedulable resumptions or input requests (there is actually not much difference between these later two, you can think about them as tokens in Petri nets). The tree is unfolded by executing resumptions by (potentially) multiple evaluators (machines?), guided by program-provided schedulers (constraints?). Each resumption is a function from evaluator's input to a tree (which can be a final leaf, an input request or a new resumption (growing point leaves), or a more complicated structure). Whenever an evaluator chooses to execute a resumption, it runs it on an input determined by the resumption's position in the tree, and then grafts the resulting tree under the resumption (probably recording the input as well, which may be useful when input is not _uniquely_ determined by the path).

Virtual machines
To make execution concurrent, resumption may return a "VM" node containing a virtual machine - with its own execution tree and a scheduler. The key is to be able to define the scheduler in the same way as "user" programs.

Another option is to return a "conc" node containing multiple resumptions - these will merge into the pool (err, tree) of resumptions controlled by the current (virtual) machine, and, respectively, scheduled using its scheduler.

Schedulers control only fairness of resource consumption, while all synchronisation issues must be dealt with by using "promise".

Issue: in this design, program is able to consume only one resource at a time - namely, the machine that executes it. How to model multiple resources? And why?
Also: explore potential for join of multiple resumptions - it's dual to the previous - one resource is simultaneously (and even atomically) consumed by multiple processes.
Also: where trust/ownership/responsibility comes into picture?

Also: vau?

1 comment:

Andris Birkmanis said...

So tree was a bad idea, now I am thinking about directed graphs of general form (including loops). See "cobs" for details.