Design Patterns as External DSLs

I’ve been thinking about how you might layer DSLs into ever-increasing levels of abstraction, and how you might combine multiple DSLs with ease. These are problems I definitely don’t have the answers to just yet, but while looking at how Axum is coming along I can’t help but feel like it’s a language designed specifically to help you write code that conforms to a specific design pattern. I mean, it’s still .NET under the hood (I think), which is no more or less async-safe than what you or I could write manually, but it constrains you such that you cannot create bad asynchronous code (or at least it makes that much harder!).

This is really fascinating to me, since intuitively it feels like a really good idea, but I don’t quite get how it can coexist with other types of development or DSLs right off the bat. I also think it reinforces the idea that constraint, at times, can be more powerful than flexibility. One of the things about “dynamic” languages that I’m not totally convinced about is how they can feel like the wild-wild-west, anything-goes type of programming. Sometimes having limiting constructs can actually be more powerful, and I don’t think that is appreciated enough.

Anyway, I was just trying to picture a world where you had one DSL where you designed the various models for your application and another DSL where you consumed them in a specific design pattern. This seems feasible, but I’m still trying to figure out how you might “glue” them together. Probably using templates somehow. Food for thought; if anyone has any insight or ideas, leave a comment!

Out with Code Generation and in with Transformation

As I’ve been playing around with DSLs for the past couple of years I’ve been focused on Code Generation as my primary strategy. This is all well and good, and I think that code generation still serves its purpose in the greater world of DSLs, but it’s not quite good enough. I would like to start using the word Transformation as a more generalized form of code generation and manipulation. What I used to refer to as Code Generation I will now simply call Textual Transformation. The other main form of Transformation is an AST Transformation. The Groovy folks have adopted this to be synonymous with Compile-time Meta Programming, and the Boo folks would call it a Syntactic Macro.

In order to promote the DRY principle and really allow N levels of arbitrary transformations, I’ve been busy changing MetaSharp to adopt the Pipeline pattern for the compilation process (according to that Wikipedia article, what I have now is more of a pseudo-pipeline, since each step is done synchronously). The end result is pretty simple actually.

[Figure: diagram of the MetaSharp compilation pipeline]

The Pipeline has a series of steps and a collection of services. Each step depends on certain services and may alter / create certain services. In this way each step can be completely re-usable for different compilation scenarios. For example the MetaCompilePipeline has three steps:

  1. MetaSharpCompileStep
  2. CodeDomTransformStep
  3. CodeDomCodeGenerateStep

Which is to say, if you want to compile MetaSharp code inside of a project of a different language, your pipeline needs to perform those three steps. First, compile the code into MetaSharp AST nodes. Second, transform those nodes into CodeDom objects. Third, use a CodeDomProvider to generate code based on those CodeDom objects. The MetaTemplatePipeline is the same as the above with one extra step at the beginning, for transforming the code into something else first.
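To make that shape concrete, here is a rough sketch of what such a pipeline could look like. The IPipelineStep interface and the dictionary-of-services representation are my own simplification, not MetaSharp’s actual types:

using System.Collections.Generic;

// Hypothetical sketch of the pipeline shape described above.
public interface IPipelineStep
{
    // Each step reads the services it depends on and may add or replace services.
    void Execute(IDictionary<string, object> services);
}

public class Pipeline
{
    private readonly List<IPipelineStep> steps = new List<IPipelineStep>();
    private readonly Dictionary<string, object> services = new Dictionary<string, object>();

    public Pipeline Add(IPipelineStep step)
    {
        this.steps.Add(step);
        return this;
    }

    public void Run()
    {
        // Pseudo-pipeline: each step runs synchronously against the shared services.
        foreach (var step in this.steps)
        {
            step.Execute(this.services);
        }
    }
}

// Composing something like the MetaCompilePipeline out of reusable steps:
// new Pipeline()
//     .Add(new MetaSharpCompileStep())     // source text -> MetaSharp AST nodes
//     .Add(new CodeDomTransformStep())     // AST nodes -> CodeDom objects
//     .Add(new CodeDomCodeGenerateStep())  // CodeDom -> source via a CodeDomProvider
//     .Run();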

The point here, though, is that the key to this whole process is the idea of Transformation. In fact the whole theory behind MetaSharp is simply to be a transformation tool. Each step is simply transforming the results of the previous step into something else. This is powerful because your DSL can consist of arbitrary levels of transformation: literally, your DSL could transform into a lower-level DSL, which transforms into an even lower-level DSL, and so on, all the way down to machine code.

Transformation isn’t a new concept; it’s been around forever. At the very root of any software is essentially a bunch of 1’s and 0’s, but we haven’t written raw 1’s and 0’s as our software for a long time. The compiler has always been a way for us to transform slightly more complex concepts into lower-level ones. Even extremely low-level machine code is a step above raw 1’s and 0’s. General purpose programming languages themselves consist of constructs that transform into much more verbose machine code or IL.

Taking transformation to the next level of abstraction is necessary for us to effectively create DSLs. If there were a tool to help us easily perform those transformations, it would go a long way towards making external DSL authoring more realistic, which is what I’m hoping to do with MetaSharp.

So to me, at this point, Code Generation is just another form of Transformation, which I will be calling “Textual Transformation” from now on. It has its pros and cons, which I hope to discuss further in other posts. However, my point today is simply to convey the idea of Transformation as more general and more important to the DSL world than Code Generation alone, and also to consciously force myself to update my lexicon.

MetaSharp Vision for the Future

I was just having some ideas and wanted to put them down somewhere, partly for myself and partly to get some feedback.

One of the next things I want to do is to convert the compile-to-CodeDom parts of MetaSharp into a Visitor pattern, so that I can use the same system to compile to CodeDom, generate MetaSharp, transform the AST, or whatever else I want. This will bring a lot of flexibility and power to the whole system.
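To sketch what I mean (with made-up node and visitor names, not MetaSharp’s actual types), the idea is a generic visitor over the AST, where CodeDom, MetaSharp source and AST rewriting are each just a different visitor implementation:

using System.CodeDom;

// Hypothetical AST node and visitor shapes, for illustration only.
public abstract class Node
{
    public abstract T Accept<T>(INodeVisitor<T> visitor);
}

public interface INodeVisitor<T>
{
    T VisitBinaryExpression(BinaryExpressionNode node);
    // ...one Visit method per node type...
}

public class BinaryExpressionNode : Node
{
    public Node Left { get; set; }
    public Node Right { get; set; }
    public string Operator { get; set; }

    public override T Accept<T>(INodeVisitor<T> visitor)
    {
        return visitor.VisitBinaryExpression(this);
    }
}

// One visitor targets CodeDom; others could emit MetaSharp source or rewrite the AST.
public class CodeDomVisitor : INodeVisitor<CodeObject>
{
    public CodeObject VisitBinaryExpression(BinaryExpressionNode node)
    {
        // Full operator mapping omitted; "+" maps to Add and so on.
        var op = node.Operator == "+"
            ? CodeBinaryOperatorType.Add
            : CodeBinaryOperatorType.GreaterThan;

        return new CodeBinaryOperatorExpression(
            (CodeExpression)node.Left.Accept(this),
            op,
            (CodeExpression)node.Right.Accept(this));
    }
}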

I was thinking about a post by Ayende Rahien the other day called M is to DSL as Drag and Drop is to Programming, and specifically I was thinking about the quote “If you want to show me a DSL, show me one that has logic, not one that is a glorified serialization format.” What I took this to mean is that there is no logic in such a DSL. A DSL can still be declarative, but one with logic will often have concepts like less-than or greater-than or equal-to; it’s certainly not limited to these, but they are fairly common. To me his complaint (which is valid) is that with an external DSL, no matter how easy it is to write a grammar, it is still hard to express logic with a grammar, and furthermore it is just as hard to translate that logic into something executable.

With an internal DSL, such as you get with Boo, you can easily just author keywords for your DSL and you get all of the logical operators for free, which is very nice of Boo. But unfortunately, with an internal DSL you don’t just get the logical operators for free, you are forced to get them. With an internal DSL you can do less work to get it working, but you are not operating in a constrained universe. This has trade-offs, but let’s certainly not dismiss it; there are plenty of use cases where this is the preferred way of doing it.

However there are some distinct benefits of an external DSL, the major tradeoff being the effort required to implement it. The main benefit is that you can constrain your universe such that only allowable logic can happen in the correct spots. It’s like a sandboxed language, which I like to call a constrained universe. And believe it or not constraint can actually be freeing.

So my sudden flash of insight this morning was when I realized that, with MGrammar, you can choose to import grammars defined in other assemblies and use the syntax and tokens defined there. So when you choose to use MetaSharp by adding a reference to the assembly, you can actually also import the MetaSharp.Lang grammar and easily make use of the BinaryExpression syntax in your own DSL (or anything else). Then I was also thinking that you could probably make use of the same AST serialization tools and (soon to be) AST transformation visitors to build your own DSLs without a lot of the extra work. Using that type of system you could probably transform directly into executable code without using the templating at all, haha! Simply transform your custom AST nodes into standard supported nodes, or write your own visitor that can handle your custom nodes, as sketched below. Your custom visitor could probably also tap into the templating system, so you could write the AST transformation as a MetaSharp template if you desired as well.
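As a rough illustration of that last idea, continuing the made-up node types from the visitor sketch above (again, not MetaSharp’s real extension points), a custom node from your own grammar could lower itself into the standard nodes so the built-in visitors never need to know about it:

// Hypothetical custom node from a user-defined grammar.
public class GreaterThanNode : Node
{
    public Node Left { get; set; }
    public Node Right { get; set; }

    public override T Accept<T>(INodeVisitor<T> visitor)
    {
        // Lower the custom node into a standard node that existing visitors understand.
        var standard = new BinaryExpressionNode
        {
            Left = this.Left,
            Right = this.Right,
            Operator = ">"
        };
        return standard.Accept(visitor);
    }
}

The lowering could just as easily live in a dedicated transformation visitor instead of on the node itself; the point is only that custom nodes and standard nodes can flow through the same pipeline.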

This would put MetaSharp into the role of being an extensible compiler system, where custom external DSLs can opt in to standard language grammar where appropriate, rather than not even being able to opt out as with current internal DSLs. This is a powerful idea and I think it is well within my grasp.

LambdaExpression.CompileToMethod … not nearly as cool as I had hoped.

I was messing around with the new dynamic expressions in C# 4 (found in System.Linq.Expressions). One new thing I noticed was the method CompileToMethod on LambdaExpression. The first parameter is a MethodBuilder, so I got really excited: finally, an easy way to create dynamic methods on Types!

Wrong. It turns out the method has to be static, and it also cannot accept, as a parameter, the Type the method is being built on. This basically negates the entire purpose. At that point you might as well simply compile it to a delegate and use that instead. Bummer.
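For reference, here is roughly what does work, on .NET Framework 4 where CompileToMethod exists: a static MethodBuilder whose body never touches the type being built. The assembly, type and method names here are just placeholders:

using System;
using System.Linq.Expressions;
using System.Reflection;
using System.Reflection.Emit;

static class CompileToMethodDemo
{
    static void Main()
    {
        var assembly = AppDomain.CurrentDomain.DefineDynamicAssembly(
            new AssemblyName("DynamicAsm"), AssemblyBuilderAccess.Run);
        var module = assembly.DefineDynamicModule("DynamicModule");
        var typeBuilder = module.DefineType("MathHelpers", TypeAttributes.Public);

        // CompileToMethod only accepts a static MethodBuilder.
        var methodBuilder = typeBuilder.DefineMethod(
            "Add",
            MethodAttributes.Public | MethodAttributes.Static,
            typeof(int),
            new[] { typeof(int), typeof(int) });

        var x = Expression.Parameter(typeof(int), "x");
        var y = Expression.Parameter(typeof(int), "y");
        var lambda = Expression.Lambda<Func<int, int, int>>(Expression.Add(x, y), x, y);

        // The body cannot reference the type being built, so no "this" and no instance state.
        lambda.CompileToMethod(methodBuilder);

        var mathHelpers = typeBuilder.CreateType();
        var sum = (int)mathHelpers.GetMethod("Add").Invoke(null, new object[] { 2, 3 });
        Console.WriteLine(sum); // prints 5
    }
}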

Here is an explanation of why:

After much investigation, it turns out this is a won’t-fix for CLR 4.0:

You can’t use expression trees to model instance methods or constructors. First problem: you can’t create the “this” parameter, because all you have in hand is a TypeBuilder, but that is not the final type. We can create a delegate with the TypeBuilder, but CLR won’t let us create a LambdaExpression with that delegate (because the type is not finished). You can workaround that by making the “this” parameter be of type object, then you end up with meaningless casts. Worse, calls to your own methods don’t work, because MethodBuilder doesn’t implement enough of reflection for expression trees to do their normal sanity checks.

DynamicMethods run into their own problems. When emitting into a DynamicMethod, we can’t add our Closure argument to the signature of the DynamicMethod. Without a closure, DynamicMethods run into some serious limitations:

  • They can’t have nested lambdas at all (due to a CLR limitation: you can’t ldftn to a DynamicMethod).
  • Some things that are O(1) or amortized O(1) become O(N), such as RuntimeVariables and Switch on strings. This is really sneaky for the user, who won’t expect things to suddenly be slower.

This needs work that we plan for ETs v3, and the design around the support will likely change.

One potential workaround I have contemplated, but haven’t gotten up the energy to try, would be to generate a matching interface first and then pass it around as a parameter to the static method. You’d have to give access to all fields through explicitly implemented properties though, destroying encapsulation in the process. So suppose you wanted to generate something like this:

public class Test
{
    private int foo;

    public int Bar()
    {
        return this.foo;
    }
}

Instead you would generate something like this:

public interface ITest
{
    int foo { get; set; }

    int Bar();
}

public class Test : ITest
{
    private int foo;

    int ITest.foo
    {
        get { return this.foo; }
        set { this.foo = value; }
    }

    public int Bar()
    {
        return Test.Bar(this);
    }

    public static int Bar(ITest test)
    {
        return test.foo;
    }
}

The benefits of using the Linq expressions to build your method bodies quickly diminish with this sort of workaround, however; you might as well just go back to the ILGenerator (*shudders*). But it would probably work, because the static Bar method only needs to accept a parameter of a type, the interface, that has already been created by the time the expression is compiled.
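Here is a rough, untested sketch of how that might play out, assuming .NET Framework 4: the interface is baked into a real Type first, so the expression tree only ever sees finished types, and the remaining instance members would still have to be emitted by hand:

using System;
using System.Linq.Expressions;
using System.Reflection;
using System.Reflection.Emit;

static class InterfaceWorkaroundSketch
{
    static void Main()
    {
        var assembly = AppDomain.CurrentDomain.DefineDynamicAssembly(
            new AssemblyName("DynamicAsm"), AssemblyBuilderAccess.Run);
        var module = assembly.DefineDynamicModule("DynamicModule");

        // 1. Define and finish the ITest interface so it becomes a real Type.
        //    (Only the getter is shown; the setter would be defined the same way.)
        var itestBuilder = module.DefineType(
            "ITest",
            TypeAttributes.Public | TypeAttributes.Interface | TypeAttributes.Abstract);
        var getFoo = itestBuilder.DefineMethod(
            "get_foo",
            MethodAttributes.Public | MethodAttributes.Abstract | MethodAttributes.Virtual |
            MethodAttributes.SpecialName | MethodAttributes.HideBySig,
            typeof(int), Type.EmptyTypes);
        var fooProperty = itestBuilder.DefineProperty(
            "foo", PropertyAttributes.None, typeof(int), null);
        fooProperty.SetGetMethod(getFoo);
        var itest = itestBuilder.CreateType(); // finished, safe for expression trees

        // 2. Build the static Bar(ITest) body with expression trees; it only ever
        //    references the finished interface, never the unfinished Test type.
        var testBuilder = module.DefineType("Test", TypeAttributes.Public);
        var staticBar = testBuilder.DefineMethod(
            "Bar",
            MethodAttributes.Public | MethodAttributes.Static,
            typeof(int), new[] { itest });

        var test = Expression.Parameter(itest, "test");
        var lambda = Expression.Lambda(Expression.Property(test, "foo"), test);
        lambda.CompileToMethod(staticBar);

        // 3. The instance Bar() thunk and the explicit ITest.foo implementation would
        //    still have to be emitted with ILGenerator, which is the painful part.
        var finished = testBuilder.CreateType();
        Console.WriteLine(finished.GetMethod("Bar", new[] { itest }) != null); // True
    }
}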