MetaSharp – A CodeDom based Template Engine using MGrammar

I’ve been working on a tangential project related to NBusiness for a couple of weeks now and I just wanted to take a moment to get a few of my thoughts out. The project I have been working on I am tentatively calling “MetaSharp” for now. It’s been fun and educational, and hopefully it will have real usefulness when it is done. I wanted to have a fully working example before I publicly posted the code (since it’s basically prototype quality right now), but if anyone is interested in seeing what I have so far, feel free to ask and I’ll hook you up somehow.

I’ll try to start at the beginning to justify my rationale for creating this strange project. I’ve been working on NBusiness for quite a while now and while I’ve really had NBusiness “working” almost all along I have never quite been able to get it where I want it to be (complete). If I had to sum up the entire process of working on NBusiness into one sentence it would be “creating a DSL is hard”. That’s an understatement frankly. Let me see if I can lay out the various layers required for DSL creation.
·         Domain Objects
·         Parser
·         Compiler
·         Template Engine
·         Build Integration
·         Tooling Support
The first three items are actually relatively easy and pretty fun. This is what we all know how to do: write code to parse strings and stick values into objects. No problem. It turns out that the next three layers, which really provide the fit, finish and ultimate usability of your DSL, are not easy at all. Build integration isn’t really that bad, actually, but tooling integration can be a real bear. In the case of a DSL you really want syntax highlighting and IntelliSense and nice IDE integration for file templates and things like that. Maybe a few additional context menus in your IDE and such. I have been trying to integrate into Visual Studio, and I can officially say that I have sunk well over half my time into that aspect alone; it has been one of the hardest things I have ever tried to do. Visual Studio is also architected such that I had to completely redo my parser and compiler to be compatible with its needs. Very painful.
But what is really hanging me up now is what I consider to be a large gap in the .NET DSL world: a suitable templating engine. By templating engine I mean something that can take metadata and translate it into code.
I mean, we have a bunch out there, but they’re all (as far as I know) effectively giant string builders. They suffer from Tag Soup and are bound strongly to a specific language implementation. For NBusiness I want to support side by side integration with any .NET language: C# or VB or Python or whatever. Re-creating all of these templates for every language is not an option. It’s too much upfront work and too much long term maintenance. I absolutely need templates that are based on the CodeDom so I can be language agnostic… But if you’ve ever tried to use the CodeDom you know how hard it is to work with. Because of this, users are very unlikely to actually make their own templates (which is almost always necessary), and when they do it is a very painful process. So I’ve been stuck in this conundrum for quite a while: how can you make a template engine that is based on the CodeDom but has the ease of use of a string builder?
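To make the contrast concrete, here is a minimal sketch in Python (not the CodeDom itself) of why a code object model helps. The template builds a small language-neutral tree once, and separate renderers turn it into C#-like or VB-like text. All of the names here (`FieldDecl`, `ClassDecl`, `render_csharp`, `render_vb`) are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FieldDecl:
    """Language-neutral description of a private field (hypothetical)."""
    type_name: str
    name: str


@dataclass
class ClassDecl:
    """Language-neutral description of a class (hypothetical)."""
    name: str
    fields: List[FieldDecl] = field(default_factory=list)


def render_csharp(cls: ClassDecl) -> str:
    """Render the neutral tree as C#-like source text."""
    lines = [f"public class {cls.name}", "{"]
    for fld in cls.fields:
        lines.append(f"    private {fld.type_name} {fld.name};")
    lines.append("}")
    return "\n".join(lines)


def render_vb(cls: ClassDecl) -> str:
    """Render the same tree as VB-like source text."""
    lines = [f"Public Class {cls.name}"]
    for fld in cls.fields:
        lines.append(f"    Private {fld.name} As {fld.type_name}")
    lines.append("End Class")
    return "\n".join(lines)


# One tree, built once, rendered for two target languages.
customer = ClassDecl("Customer", [FieldDecl("String", "_name")])
print(render_csharp(customer))
print(render_vb(customer))
```

A string-builder template would need one copy of the template per target language; here only the renderers are language specific.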
Enter MGrammar. Using MGrammar I have found a way to define a DSL for generating code. This DSL turns out to be a full-fledged programming language in and of itself, with the caveat of being restricted to only that which is CLS compliant. I have combined this DSL with the capability to create templates (to extend the language, similar to macros in Boo) and databinding similar to what you have in XAML. The end result allows you to do something similar to this:
namespace Example:
    import System;
    template One:
        public class {Binding Name}:
            {SequenceBinding Items, Template=Two}
    template Two:
        private field {Binding Type} _{Binding Name};
        public property {Binding Type} {Binding Name}:
                return this._{Binding Name};
                this._{Binding Name} = value;
(This is just an example, the end result might not actually be exactly this syntax)
Which, when compiled, will generate a class called OneTemplate that inherits from Template and returns a CodeTypeDeclaration object from its Generate method. Extensions such as the BindingExtension shown here can be custom objects that extend behaviors, but in this case it binds the name of the class to the Name property (or Name sequence node of an MGraph tree) of the provided metadata.
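As a rough illustration of the binding idea, here is a Python sketch. It is not MetaSharp, and it works on plain strings purely to keep the example short, where the real engine would produce CodeDom objects. It resolves `{Binding X}` placeholders against metadata and applies a second template once per item, the way the sequence binding is described above. The function names, the regular expression and the dictionary-shaped metadata are all assumptions.

```python
import re

# Matches placeholders of the form {Binding Key} (assumed syntax).
BINDING = re.compile(r"\{Binding (\w+)\}")


def bind(template: str, metadata: dict) -> str:
    """Replace each {Binding Key} with the matching metadata value."""
    return BINDING.sub(lambda m: str(metadata[m.group(1)]), template)


def bind_sequence(template: str, items: list) -> list:
    """Apply one template to every item, like a sequence binding."""
    return [bind(template, item) for item in items]


ONE = "public class {Binding Name}"
TWO = "private {Binding Type} _{Binding Name};"

# Hypothetical metadata; the real input would come from a parsed DSL.
metadata = {
    "Name": "Song",
    "Items": [
        {"Type": "string", "Name": "Title"},
        {"Type": "int", "Name": "Tempo"},
    ],
}

header = bind(ONE, metadata)
members = bind_sequence(TWO, metadata["Items"])
print(header)
for member in members:
    print("    " + member)
```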
Technically you could write your entire project in pure MetaSharp code but more likely you will write all of your static classes in your rich language of choice and simply use MetaSharp to define templates. Since this is all compiling down to CodeDom objects I have cooked up some MSBuild tasks that simply translate those objects into the code of the project the files exist in. You could share this same file in a VB or C# project and it would compile to the same thing in both assemblies.
Currently I am working on a prototype using the Song example from the MGrammar sample code that will allow you to write songs that generate song classes using templates like these. It’s almost working… the CSharpCodeProvider is throwing a random NullReferenceException with no useful error messages. That is one reason why a DSL like this is helpful: it should be able to abstract away the pain of working directly with the CodeDom.

Template DSL with MGrammar

I have been quiet for a few days, mostly due to this new toy I have been playing with called MGrammar. I haven’t slept this poorly in quite a while. It is a tool you can use to create custom DSLs (Domain Specific Languages). It is not to be confused with the M language, which is a specific DSL created using MGrammar. I think of M as being a general purpose DSL for defining a general domain, while you can optionally create a more specific DSL for your specific domain with a little extra effort. Needless to say it has been fun to learn about and will be a handy new tool in my toolbox.
At first I was a little disappointed with it, to be honest. After all, it does seem like just another parser generator, and I think that in reality it probably is just that. It has some interesting features though: instead of generating files it generates an object model in memory, it is .NET, and it has a syntax that is actually comprehensible by mere mortals. There is one thing it is lacking, though, that left me a little disappointed, and that is a clear way to translate your MGrammar nodes into another form. I will say that since I first started working with MGrammar I have learned that there is a complementary tool called MSchema that you can use to translate your tree into concrete classes. It is helpful but not quite what I was hoping for. Frankly, what I want is something that can create those concrete classes instead of just transmuting nodes into them.
So after thinking about it for a while I have concluded that this is a problem that only really needs to be solved once… not ironically with a special DSL. So I have begun working on a template DSL, which roughly translates into a general purpose language-agnostic programming language. Sounds weird I know but I have been thinking about it a lot and I don’t think it’s totally crazy.
So the idea is this, you create your custom DSL. The example they give us in the MGrammar source code is a “Song” DSL where you can create source code such as this:
- - - D
C C# F G
E E - D
A E - E
G F - E
When you parse this, the result is a tree structure of nodes with the note values in them. This then gets translated into Console.Beep calls or Thread.Sleep calls to make some fun music. The process of the conversion in this example isn’t exactly pretty and certainly isn’t scalable. What would be nice, though, is to have this be translated into the creation of classes. In a lot of circumstances you would want exactly that: generated classes. You may also want this DSL translated into other forms such as SQL or, in this case, an image of the notes printed on staff paper, but my template DSL will not solve those problems. Those are other DSLs waiting to happen.
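For a sense of what that conversion involves, here is a small Python sketch. It is not the MGrammar sample itself. It splits song text into bars (one per line) and maps each token to a play-or-rest action, roughly what the Console.Beep and Thread.Sleep translation does. The frequency table, the 250 ms duration and treating a dash as a rest are my assumptions.

```python
# Approximate frequencies in Hz; only the notes used in the song are listed.
FREQUENCIES = {"C": 262, "C#": 277, "D": 294, "E": 330, "F": 349, "G": 392, "A": 440}
REST_TOKENS = {"-", "\u2013"}  # plain hyphen or en dash (assumed to mean a rest)
DURATION_MS = 250  # assumed length of every note and rest


def parse_song(text):
    """Return a list of bars, each bar a list of note tokens."""
    return [line.split() for line in text.splitlines() if line.strip()]


def to_actions(bars):
    """Flatten bars into ('beep', hz, ms) and ('sleep', ms) tuples."""
    actions = []
    for bar in bars:
        for token in bar:
            if token in REST_TOKENS:
                actions.append(("sleep", DURATION_MS))
            else:
                actions.append(("beep", FREQUENCIES[token], DURATION_MS))
    return actions


song = """
- - - D
C C# F G
"""
bars = parse_song(song)
actions = to_actions(bars)
print(bars)
print(actions[:4])
```

Hard-coding the mapping from nodes to actions like this is exactly the part that does not scale, which is what the templates are meant to replace.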
But what I want is something that we can use to easily translate this DSL into real code. Here is an example of my prototype template DSL for this Song:
namespace MySong:
      import System
      import System.Collections.Generic
      template Main:
            return [|
            Song s = new Song()
            bars ${Bars}
            return s |]
      template Bars:
            return [|
            s.Bars.Add(new Bar {
                  note ${Bars.One},
                  note ${Bars.Two},
                  note ${Bars.Three},
                  note ${Bars.Four} }) |]
      template Note:
            return [| new Note(${Note.Value}) |]
Here I’m declaring three templates (template is itself a template declared in the framework); one of them is called Main. When compiling this file you will end up with a CodeDom object that defines a namespace, some imports and three classes inheriting from Template (MainTemplate, BarsTemplate and NoteTemplate). Each of these templates will implement a method that executes the body of the template. Code defined in ‘[| … |]’ blocks is itself building CodeDom objects.
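One way to picture the generated classes is the Python sketch below. It is only an analogy: the real output would be CodeDom objects, while here each template returns nested tuples standing in for code nodes. The `Template` base class, its `generate` method and the tuple shapes are all assumptions.

```python
class Template:
    """Stand-in for the framework's Template base class (hypothetical)."""

    def generate(self, data):
        raise NotImplementedError


class NoteTemplate(Template):
    def generate(self, note):
        # Mirrors: new Note(${Note.Value})
        return ("new", "Note", [("literal", note)])


class BarsTemplate(Template):
    def generate(self, bar):
        # Mirrors: s.Bars.Add(new Bar { note ..., note ... })
        notes = [NoteTemplate().generate(n) for n in bar]
        return ("call", "s.Bars.Add", [("new", "Bar", notes)])


class MainTemplate(Template):
    def generate(self, bars):
        # Mirrors: Song s = new Song(), then one statement per bar, then return s
        body = [("declare", "Song", "s", ("new", "Song", []))]
        body += [BarsTemplate().generate(bar) for bar in bars]
        body.append(("return", "s"))
        return body


# One bar of four notes produces a declaration, one Add call and a return.
tree = MainTemplate().generate([["C", "C#", "F", "G"]])
for node in tree:
    print(node)
```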
So this DSL turns out to be a programming language in itself. It is not a very complicated one, though, and I don’t have to actually deal with the IL anywhere since it’s generating CodeDom objects. This language should be able to live side by side with any project type as well; the same templates can work for VB or C# or F# or whatever. It will be very simple in the sense that it will only implement CLS compliant features exposed by the CodeDom, except for the meta-programming constructs (and of course non-CLS compliant constructs such as “using” statements can themselves be templates).

[Diagram: DSL flow]

This diagram shows my general idea of how this template engine would be used to translate your DSL into code. From there, once you have the CodeDom object, you could easily compile it directly into an assembly or generate source code from it in any .NET language. I will probably create a simple MSBuild task that you can apply to files in your project that will generate code to be included in the project’s built assembly. Additionally, there would be another task to embed your templates into the assembly so they can be shared by reference.

While working with NBusiness I have been in the process of creating templates and I have had the chance to work with a few template engines. They each have pros and cons but none of them ever suited my needs; they’re typically glorified string builders with horribly ugly Tag Soup. While we still have tag soup in this language, it seems a lot less ugly. This is basically the same approach that Boo takes, except DSLs in Boo are internal and the metadata they are transforming is the actual Boo AST. Here we have an external DSL transforming the metadata of another external DSL into your CLR language of choice. The only form of tag soup is the meta blocks ( [| … |] ), and it’s code interleaved with meta code rather than meta code interleaved with code (like ASP WebForms), a subtle but distinct difference.
Using templates like this you can recursively construct code with ever more general code. Perhaps it would be nice to have a text writing DSL so you can go from DSL down to DSL then eventually into a CodeDom template but that is another DSL for another time. When I am done here I will change the NBusiness parser into an MGrammar DSL then run it through this template engine instead of my current template scheme.
So I realize this sounds like a crap load of work and probably a little crazy. Additionally, I’m not totally convinced there is nothing exactly like this already, but on the other hand I can actually comprehend this and I can see what gap it would fill in my current projects. I have been working on the MGrammar for this sort of language already and I have been pleasantly surprised how easy it is. I have been trying to use the NewLine character to indicate the end of a statement rather than a semicolon (or something similar), but this turns out to be pretty tough in some cases, so I might have to go the semicolon route (I already compromised enough to use the “: … end” instead of tabs, however). Currently I am able to define namespaces, imports, class declarations and constructors (with parameters) and translate them into CodeDom objects. If this idea generates enough interest or comes along far enough to be usable I will probably post it to Codeplex; for now, however, I will just work on it in my own repository to spare everyone my painful research details.
I have been working on creating a DSL of my own (NBusiness) for quite a while now and have seen the dire need for a simple grammar parser (again, comprehensible by mere mortals) and a corresponding template engine and I can say with some confidence that this tool will be extremely useful for me.
Just for fun here is how a “using” template might look (purely hypothetical at this point):
namespace Common:
      import System
      import System.Collections.Generic
      template Using:
            CodeExpression parameter = Using.Parameters[0]
            CodeStatementCollection body = Using.Statements
            return [|
                  end |]