Dynamic Pipelines and Debugging DSL Transformations

I’ve been working on a dynamic transformation library for MetaSharp for the last couple of weeks. I think I’m finally past the prototype phase and am almost ready to make it real. There is still a lot of work before it’s actually usable, but I think the infrastructure is in place to let me solve the rest of the problems. Here are some samples of how I have it working so far:

A Simple Eval

int z = pipeline.Eval<int>("x * y", new { x = 3, y = 7 });

Delegate Compilation With a More Complex Context

public class MathContext
{
    public double x;
    public double y;

    public double sin(double d)
    {
        return Math.Sin(d);
    }

    public double cos(double d)
    {
        return Math.Cos(d);
    }
}
Func<MathContext, double> equation = pipeline.Compile<MathContext, double>("sin(x) * cos(y)");
double result = equation(new MathContext { x = 3, y = 7 });

I also want to let you add white-listed namespaces to the pipeline, so that “new” objects can be created and static methods can be called from those namespaces. And of course the actual code can be more complex than one-line expressions; the new expression trees in .NET 4 support statements as well. Hopefully we’ll get better support for dynamic types in the future, but I may have some tricks up my sleeve for doing that as is, we’ll see.
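
Just to sketch the kind of API I have in mind for white-listing (AllowNamespace is a hypothetical name here, nothing that exists yet):

// Hypothetical white-listing API, for illustration only.
pipeline.AllowNamespace("System");

// With "System" white-listed, expressions could call static methods and
// construct new objects from that namespace:
int pick = pipeline.Eval<int>(
    "Math.Max(x, new Random(seed).Next(10))",
    new { x = 3, seed = 42 });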

Anyway, one of the things I needed to do to get this working was to create a more robust object as the result of a transformation. Previously you’d get some object in and then simply yield whatever you wanted to transform it into. This was OK for the CodeDom, where you could stuff the original AST nodes into a metadata dictionary on any CodeObject, but it doesn’t work for the DLR objects. At some point you need to know where a node came from in order to understand how to apply it in a subsequent transformation. For example, a method invocation node needs to know whether the thing it is invoking is a property, field, or method reference, and do something different in each case.

So now when you yield transformed objects, the pipeline stuffs them into a Transform<T>, which contains the original node, the yielded transformation nodes, and all child Transforms. The end result is three trees: the original AST, the new AST, and a tree showing the relationship between the two.
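
To make that concrete, here is a minimal sketch of the shape I mean; the member names are illustrative rather than MetaSharp’s actual API:

using System.Collections.Generic;

public class Transform<T>
{
    public Transform(object original)
    {
        this.Original = original;
        this.Results = new List<T>();
        this.Children = new List<Transform<T>>();
    }

    // The original AST node that was visited.
    public object Original { get; private set; }

    // The nodes yielded when Original was transformed.
    public IList<T> Results { get; private set; }

    // Transforms of Original's children; together these form the third tree
    // that relates the original AST to the transformed AST.
    public IList<Transform<T>> Children { get; private set; }
}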

This is necessary to actually do transformations, but one of the side effects I want to play around with is that the relationship tree (the tree of Transform<T> objects) could be really interesting to visualize. I want to build a Debugger Visualizer where you can pop open a dialog and see a visual representation of the transformation. I’m envisioning something like this:

[Image: mockup of the transformation debugger visualizer]

As your mouse moves over a node in the original tree, you’d see the nodes it was transformed into. In this image I have an AST representing a Song DSL. When run through the pipeline it transforms into CodeDom objects (or whatever, but CodeDom in this case). Here you can see that the Note is being transformed into objects that would generate code like “this.Play(key, octave, duration)”. As you move up to the Bar you’d see a more complete representation of the generated tree, and as you move up to the Song you’d see the entire tree.

That is the ultimate goal; in reality this transformation takes several steps, and it would be more complicated than shown here. But now that we have all of the Transform trees, you could track these relationships and display them visually like this. A tool like that could be invaluable for debugging complex transformations!
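
As a rough sketch of the kind of traversal such a visualizer would do (using the hypothetical Transform<T> shape above), you could walk the relationship tree and dump which original node produced which transformed nodes:

using System;

public static class TransformPrinter
{
    public static void Print<T>(Transform<T> transform, int depth)
    {
        string indent = new string(' ', depth * 2);

        // One line per original node, listing the nodes it transformed into.
        Console.WriteLine(
            "{0}{1} -> [{2}]",
            indent,
            transform.Original,
            string.Join(", ", transform.Results));

        foreach (Transform<T> child in transform.Children)
        {
            Print(child, depth + 1);
        }
    }
}

A real visualizer would render this as linked trees instead of text, but the data it needs is all there in the Transform<T> tree.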

Competing With Your Own Product

This is more of a business-focused subject than a strictly programming-related topic, and as such I feel obligated to add a disclaimer: I’m not really qualified to talk about this subject with any authority, but it’s a thought I’ve been having for a while, so I thought I’d just throw it out there. Also, these are totally my opinions and not necessarily the opinions of my employer. With that out of the way, I’ll get to what I’m really trying to say.

It seems like there is a pretty consistent pattern in the software world: someone creates something really clever and innovative, and then, after a short time, as the implementation begins to mature, the ideas of how it should work become well known, yet the actual application gets bogged down with backwards-compatibility concerns and increasing complexity, slowing its velocity.

Maintaining that compatibility and reusing that code base becomes a necessity for retaining current users, so you end up stuck between a rock and a hard place as you try to innovate and change without changing too much too fast.

What’s really interesting is that your competitors, not burdened with backwards compatibility or existing codebases, are free to create their own implementation of what they envision to be a more ideal solution to the problem your application is trying to solve… and they have a tendency to actually do it much better.

The cycle is almost Darwinian, and it takes quite a special application to resist the inevitable undertow over time. The classic application I think about when I’m pondering these ideas is Lotus Notes, though I think it’s true of nearly every piece of software ever created. As far as I understand it, Lotus shipped some of the first document editing and spreadsheet applications, and then Office came not too long after. And while it’s only my opinion, I think it’s clear which is really the king. My limited experience with Lotus Notes was of a worn-down, buggy, ugly, highly idiosyncratic application not intended for use by mere mortals.

You could potentially make the same argument for Internet Explorer: first there was Netscape Navigator, then there was Internet Explorer, and now there is Firefox. While what is “better” is still largely subjective, it’s easy to see the pattern: competitors, free from backwards compatibility, are able to innovate very quickly and overtake their more aged competition.

So the main point of this post is to suggest that it’s important to identify when an application’s velocity is suffering, and that becoming your own competitor might be necessary for survival. I don’t mean that your current application should be dropped suddenly, but that it could be healthy to start up a completely parallel effort free from all of the malaise affecting your current application. If your competitor can do it then so can you… in fact, if you don’t, it could be fatal. While your aged application fades gracefully into maintenance mode, you begin to divert resources fully towards the successor (Darwinian metaphors galore!).

I think a couple of reasons it may be hard for companies to come to this conclusion are that A) they take it as a sign of weakness, and B) they make the mistake of thinking that their software is their most valuable asset. My arguments to these two points are related: I believe that it’s actually the developers of the software who are the real assets, and by creating your own competing application you can reuse the truly important part of the software: the developers. Bringing all of the domain knowledge with you and starting from a clean slate could only result in amazing things, and it’s not a sign of weakness to show intelligent, proactive development for the future. After all, if you don’t do it, some other company will.

Obviously, from a pragmatic perspective you can’t afford to do this for every release. Likewise, why bother for a thriving, well-liked application in its prime? I think the key is that dying, slow-moving, bogged-down applications need to know when to let go and start over.

From a more micro perspective, I think the DRY principle is related and brings up some interesting thoughts. As a programmer, the DRY principle has been hammered into my head since the very beginning of my education, but at some point you have to conclude that reuse can result in decreased value when the thing you’re trying to reuse is done poorly. I often think about the DRY principle in terms of the output of a given candidate for reuse. For example, the thought process goes: “if we have libraryX and its task is to do X, then from now on, whenever we need to do X, we can reuse this library.” This sounds good in principle, but how libraryX does X is just as important as the result. You are not repeating yourself if you do X differently.

The DRY principle says Do Not Repeat Yourself, which does not necessarily mean Do Reuse Yourself.

I would love to hear the thoughts of others on this topic.

Dynamic Pipeline in MetaSharp

I’ve been working on a prototype of a Dynamic Pipeline for MetaSharp using the new LINQ Expressions in .NET 4. I’m pretty darn excited about it; here is what I have working so far:

[Test]
public void BinaryVariablesTest()
{
    var result = this.pipeline.Eval<int>("x * y", new { x = 3, y = 7 });
    Assert.AreEqual(21, result);
}

What I’m doing here is compiling the string as MetaSharp code, then using a node visitor to generate the appropriate LINQ expressions. The trick is the context parameter being passed in. This object contains all of the properties and methods allowed to be called from the expression; essentially it’s the “this” parameter. To get this to work you can use the handy-dandy ConstantExpression. Here is how I’m doing it:

public class ReferenceExpressionVisitor
    : TransformVisitor<linq.Expression, ReferenceExpression>
{
    protected override IEnumerable<linq.Expression> Visit(
        IEnumerable<linq.Expression> items,
        ReferenceExpression visitedObject)
    {
        // Grab the context object (the "this" parameter) that was passed into the pipeline.
        IDynamicContextService parameterService =
            this.ServiceProvider.Locate<IDynamicContextService>();

        // Bind the reference by name to a property or field on the context
        // object, which is baked into the expression tree as a constant.
        yield return linq.Expression.PropertyOrField(
            linq.Expression.Constant(parameterService.Context),
            visitedObject.Name);
    }
}

The constant allows us to call properties or fields on the context object directly. You should even be able to attach lambda expressions to fields so you can call those as well. I have to do some restructuring in order to enable everything like method calls and whatnot, but it should all be doable. Very sweet. The dynamic pipeline will also need some kind of caching provider; clearly you wouldn’t want to rebuild these expressions every single time you execute them. Alternatively, I could just give back a delegate instead of executing it directly and let the caller do the caching. Now that I think about it, that sounds easier and more flexible.
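
Here is a minimal sketch of that caller-side caching idea, assuming Compile hands back a delegate like in the earlier MathContext example; the ExpressionCache type and the compile callback are just illustrative, not part of MetaSharp:

using System;
using System.Collections.Concurrent;

public static class ExpressionCache
{
    // One compiled delegate per distinct expression string.
    private static readonly ConcurrentDictionary<string, Func<MathContext, double>> cache =
        new ConcurrentDictionary<string, Func<MathContext, double>>();

    public static Func<MathContext, double> GetOrCompile(
        string code,
        Func<string, Func<MathContext, double>> compile)
    {
        // Only the first caller pays the compilation cost; everyone else
        // reuses the cached delegate.
        return cache.GetOrAdd(code, compile);
    }
}

From the caller’s side it would look something like:

var equation = ExpressionCache.GetOrCompile(
    "sin(x) * cos(y)",
    code => pipeline.Compile<MathContext, double>(code));
double result = equation(new MathContext { x = 3, y = 7 });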

What’s cool about this isn’t that you can compile expressions dynamically (which is pretty cool, but others are doing that already), but that you will be able to dynamically compile DSLs… which reduce to dynamic expressions. Imagine this:

when truck is late:
  apply discount(.1);

when truck is early:
  apply bonus(.1);

when truck cost > $10,000:
  notify

You compile your DSL into MetaSharp nodes, transform those into Common language nodes, then transform those into dynamic LINQ expressions. Very sweet! You might transform this into something like:

if(truck.late)
{
    context.discount(.1);
}
if(truck.early)
{
    context.bonus(.1);
}
if(truck.cost > 10000)
{
    context.notify();
}

Here you’d have a custom ShippingContext object with all of the methods and properties you wanted to expose in your DSL. This would be really handy for building all sorts of systems with rapidly changing rules, and it could potentially even let business analysts author their own rules.
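
To give a feel for the pieces involved, here’s a sketch of a hypothetical ShippingContext plus roughly the kind of LINQ expression tree you’d need to emit for the first rule. The Truck and ShippingContext shapes are assumptions for illustration only; the Expression.* calls are plain .NET 4 expression tree APIs, not MetaSharp:

using System;
using System.Linq.Expressions;

public class Truck
{
    public bool late;
    public bool early;
    public decimal cost;
}

public class ShippingContext
{
    public Truck truck;

    public void discount(decimal rate) { /* apply a discount */ }
    public void bonus(decimal rate) { /* apply a bonus */ }
    public void notify() { /* send a notification */ }
}

public static class RuleSketch
{
    // Hand-builds the equivalent of: if (truck.late) context.discount(0.1m);
    public static Action<ShippingContext> BuildLateRule()
    {
        ParameterExpression context =
            Expression.Parameter(typeof(ShippingContext), "context");

        // context.truck.late
        Expression truckIsLate = Expression.Field(
            Expression.Field(context, "truck"), "late");

        // context.discount(0.1m)
        Expression applyDiscount = Expression.Call(
            context, "discount", null, Expression.Constant(0.1m));

        Expression rule = Expression.IfThen(truckIsLate, applyDiscount);

        return Expression.Lambda<Action<ShippingContext>>(rule, context).Compile();
    }
}

The other two rules would follow the same pattern, and the whole rule set could be wrapped in a single Expression.Block before compiling.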