Getting a CodeDomProvider in an MSBuild Task

Trying to get the correct CodeDomProvider inside of an MSBuild task wasn’t as easy as I would have liked. Actually, the code itself is pretty simple, but finding out how to do it was difficult. Not many people seem to be doing this, so the blogosphere and forums are fairly sparse on related information. After messing around with it for a couple of hours, I think I finally found a pretty reliable way to do it (that is, one that doesn’t feel like a dirty hack).

The two key bits of information are the <Language /> property that each language defines in its own common targets file (by convention) and the CodeDomProvider.GetAllCompilerInfo() method. Here is my targets file.

<!-- Reference the assembly where our tasks are defined -->
<UsingTask TaskName="MetaSharp.MSBuild.TemplateTask" AssemblyFile="$(MSBuildExtensionsPath)\MetaSharp\MetaSharp.MSBuild.dll" />
<!-- Compile target (this is the target that calls the compiler task) -->
<Target Name="BeforeBuild">
    <Message Text="Building: @(MetaSharpTemplate)" />
    <TemplateTask Templates="@(MetaSharpTemplate)" Language="$(Language)">
      <Output TaskParameter="Generated" ItemName="Compile" />
    </TemplateTask>
</Target>

The key here is the Language="$(Language)" attribute on the task. And here is my task, with a LINQ query to find the correct CodeDomProvider.

using System;
using System.CodeDom.Compiler;
using System.Linq;
using Microsoft.Build.Framework;
using Microsoft.Build.Utilities;

namespace MetaSharp.MSBuild
{
      public class TemplateTask : Task
      {
            // Properties
            public ITaskItem[] Templates { get; set; }
            public ITaskItem Language { get; set; }
            // Must be marked [Output] so the <Output TaskParameter="Generated" /> element can read it.
            [Output]
            public ITaskItem[] Generated { get; set; }

            public override bool Execute()
            {
                  base.Log.LogMessage(MessageImportance.High, "Building MetaSharp templates.");
                  if (this.Language == null || string.IsNullOrEmpty(this.Language.ItemSpec))
                  {
                        base.Log.LogError("You must have a Language property defined somewhere in your project files to specify which CodeDomProvider to use (e.g. <Language>C#</Language>)");
                        return false;
                  }
                  CodeDomProvider provider = FindProvider(this.Language.ItemSpec);
                  // The template processing that uses the provider is not shown in this post.
                  return provider != null && !base.Log.HasLoggedErrors;
            }

            private CodeDomProvider FindProvider(string language)
            {
                  // Compare the MSBuild Language value with every alias of every registered provider.
                  CodeDomProvider[] providers = (from info in CodeDomProvider.GetAllCompilerInfo()
                                                 from l in info.GetLanguages()
                                                 where l.ToUpperInvariant() == language.ToUpperInvariant()
                                                 select info.CreateProvider()).ToArray();
                  CodeDomProvider provider = null;
                  if (providers.Length == 0)
                        Log.LogError("Unable to find a valid CodeDomProvider for this project type. Try adding a valid Language property to your msbuild project file");
                  else if (providers.Length > 1)
                        // It would be surprising if this ever happened...
                        Log.LogError("Found multiple valid CodeDomProviders for this Language type. Try adding a less ambiguous Language property to your msbuild project file");
                  else provider = providers[0];
                  return provider;
            }
      }
}
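
If you just want to see which language names are registered on your machine, or you don’t need the ambiguity check, something like the following may be enough. This is only a sketch: ProviderLookup and TryCreate are names I made up for illustration, while GetAllCompilerInfo, GetLanguages, IsDefinedLanguage and CreateProvider are members of the framework’s CodeDom classes. It assumes the .NET Framework, or the System.CodeDom package on newer runtimes.

```csharp
using System;
using System.CodeDom.Compiler;

public static class ProviderLookup
{
    // Hypothetical helper: returns a provider for the language name,
    // or null when the name is empty or no provider is registered for it.
    public static CodeDomProvider TryCreate(string language)
    {
        if (string.IsNullOrEmpty(language))
            return null;

        return CodeDomProvider.IsDefinedLanguage(language)
            ? CodeDomProvider.CreateProvider(language)
            : null;
    }

    public static void Main()
    {
        // Print the aliases each registered provider answers to, one provider per line.
        foreach (CompilerInfo info in CodeDomProvider.GetAllCompilerInfo())
            Console.WriteLine(string.Join(", ", info.GetLanguages()));

        // Report whether a lookup by the value C# projects use for $(Language) succeeds.
        Console.WriteLine(TryCreate("C#") != null);
    }
}
```

The LINQ version in the task does the same lookup by hand, which is what lets it report the "multiple providers" case separately.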

For those of you who are curious, MetaSharp is the tentative name of the CodeDom DSL I have been working on using MGrammar. It’s coming along pretty well. This task will build MetaSharp code files to be compiled along with the project that contains them. There are lots of details to be shored up, but the basic use cases are working right now. When I have things a little more polished I will probably create another post with some samples.

Template DSL with MGrammar

I have been quiet for a few days, mostly due to a new toy I have been playing with called MGrammar. I haven’t slept this poorly in quite a while. It is a tool you can use to create custom DSLs (Domain Specific Languages). It is not to be confused with the M language, which is a specific DSL created using MGrammar. I think of M as a general-purpose DSL for defining a general domain, while with a little extra effort you can create a more specific DSL for your own domain. Needless to say, it has been fun to learn about and will be a handy new tool in my toolbox.
At first I was a little disappointed with it, to be honest. After all, it does seem like just another parser generator, and I think that in reality it probably is just that. It has some interesting features though: instead of generating files it generates an object model in memory, it is .NET, and it has a syntax that is actually comprehensible by mere mortals. There is one thing it is lacking that left me a little disappointed, and that is a clear way to translate your MGrammar nodes into another form. Since I first started working with MGrammar I have learned that there is a complementary tool called MSchema that you can use to translate your tree into concrete classes. It is helpful but not quite what I was hoping for. Frankly, what I want is something that can create those concrete classes instead of just transmuting nodes into them.
So after thinking about it for a while I have concluded that this is a problem that only really needs to be solved once… and, not ironically, with a special DSL. So I have begun working on a template DSL, which roughly translates into a general-purpose, language-agnostic programming language. Sounds weird, I know, but I have been thinking about it a lot and I don’t think it’s totally crazy.
So the idea is this: you create your custom DSL. The example they give us in the MGrammar source code is a “Song” DSL where you can create source code such as this:

- - - D
C C# F G
E E - D
A E - E
G F - E

When you parse this, the result is a tree structure of nodes with the note values in them. This then gets translated into Console.Beep calls or Thread.Sleep calls to make some fun music. The process of the conversion in this example isn’t exactly pretty and certainly isn’t scalable. What would be nice, though, is to have this translated into the creation of classes. In a lot of circumstances you would want exactly that: generated classes. You may also want this DSL translated into other forms such as SQL or, in this case, an image of the notes printed on staff paper, but my template DSL will not solve those problems. Those are other DSLs waiting to happen.
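
To give a feel for that kind of direct translation, here is a rough sketch of turning the song text into beeps and sleeps. It is not the sample’s actual code: SongPlayer and its methods are hypothetical, it splits the text by hand instead of walking an MGrammar tree, and it assumes every note is in the octave where A is 440 Hz. Console.Beep with a frequency and duration only works on Windows, so the sketch plays the song only there.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public static class SongPlayer
{
    static readonly string[] Names =
        { "C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B" };

    // Frequency in Hz for a note name; 0 stands for a rest ("-").
    public static int Frequency(string note)
    {
        if (note == "-") return 0;
        int index = Array.IndexOf(Names, note);
        if (index < 0) throw new ArgumentException("Unknown note: " + note);
        // Each semitone is a factor of 2^(1/12); A sits at index 9.
        return (int)Math.Round(440.0 * Math.Pow(2.0, (index - 9) / 12.0));
    }

    // One array of frequencies per bar, where each line of text is a bar.
    public static List<int[]> Parse(string song)
    {
        var bars = new List<int[]>();
        foreach (string line in song.Split(new[] { '\n' }, StringSplitOptions.RemoveEmptyEntries))
        {
            string[] notes = line.Split(new[] { ' ', '\r' }, StringSplitOptions.RemoveEmptyEntries);
            bars.Add(Array.ConvertAll(notes, n => Frequency(n)));
        }
        return bars;
    }

    public static void Play(List<int[]> bars, int noteMs)
    {
        foreach (int[] bar in bars)
            foreach (int hz in bar)
            {
                if (hz == 0) Thread.Sleep(noteMs);   // a rest is just silence
                else Console.Beep(hz, noteMs);       // Windows-only overload
            }
    }

    public static void Main()
    {
        List<int[]> bars = Parse("- - - D\nC C# F G\nE E - D\nA E - E\nG F - E");
        foreach (int[] bar in bars)
            Console.WriteLine(string.Join(" ", bar));
        if (Environment.OSVersion.Platform == PlatformID.Win32NT)
            Play(bars, 300);
    }
}
```

Everything about the output is hard-wired into Play, which is exactly the part that doesn’t scale once you want something other than beeps.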
But what I want is something that we can use to easily translate this DSL into real code. Here is an example of my prototype template DSL for this Song:

namespace MySong:
      import System
      import System.Collections.Generic
      template Main:
            return [|
            Song s = new Song()
            bars ${Bars}
            return s |]
      template Bars:
            return [|
            s.Bars.Add(new Bar {
                  note ${Bars.One},
                  note ${Bars.Two},
                  note ${Bars.Three},
                  note ${Bars.Four} }) |]
      template Note:
            return [| new Note(${Note.Value}) |]

Here I’m declaring three templates, one of which is called Main (“template” is itself a template declared in the framework). When compiling this file you will end up with a CodeDom object that defines a namespace, some imports and three classes inheriting from Template (MainTemplate, BarsTemplate and NoteTemplate). Each of these templates will implement a method that executes the body of the template. Code defined in ‘[| … |]’ blocks is itself building CodeDom objects.
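
Written by hand, the CodeDom object for that file would have roughly the following shape. Treat it as a sketch only: the "Execute" method name, its return type and the placeholder body are my assumptions for illustration, and Template is assumed to be a base class supplied by the framework rather than anything defined here.

```csharp
using System;
using System.CodeDom;

public static class TemplateCodeDomSketch
{
    // Builds a namespace with two imports and three classes deriving from "Template".
    public static CodeCompileUnit Build()
    {
        var ns = new CodeNamespace("MySong");
        ns.Imports.Add(new CodeNamespaceImport("System"));
        ns.Imports.Add(new CodeNamespaceImport("System.Collections.Generic"));

        foreach (string name in new[] { "Main", "Bars", "Note" })
        {
            var type = new CodeTypeDeclaration(name + "Template") { IsClass = true };
            type.BaseTypes.Add(new CodeTypeReference("Template"));

            // Hypothetical method that would execute the template body.
            var method = new CodeMemberMethod
            {
                Name = "Execute",
                Attributes = MemberAttributes.Public | MemberAttributes.Override,
                ReturnType = new CodeTypeReference(typeof(CodeObject))
            };
            // Placeholder body: the real one would build the CodeDom for the [| ... |] block.
            method.Statements.Add(new CodeMethodReturnStatement(new CodePrimitiveExpression(null)));

            type.Members.Add(method);
            ns.Types.Add(type);
        }

        var unit = new CodeCompileUnit();
        unit.Namespaces.Add(ns);
        return unit;
    }

    public static void Main()
    {
        CodeNamespace ns = Build().Namespaces[0];
        foreach (CodeTypeDeclaration type in ns.Types)
            Console.WriteLine(ns.Name + "." + type.Name);
    }
}
```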
So this DSL turns out to be a programming language in itself, though not a very complicated one, and I don’t have to deal with the IL anywhere since it’s generating CodeDom objects. This language should be able to live side by side with any project type as well; the same templates can work for VB or C# or F# or whatever. It will be very simple in the sense that it will only implement CLS-compliant features exposed by the CodeDom, except for the meta-programming constructs (and of course non-CLS-compliant constructs such as “using” statements can themselves be templates).

[Figure: DSL flow diagram]

This diagram shows my general idea of how this template engine would translate your DSL into code. From there, once you have the CodeDom object, you could easily compile it directly into an assembly or generate source code from it in any .NET language. I will probably create a simple MSBuild task that you can apply to files in your project that will generate code to be included in the project’s built assembly. Additionally, there would be another task to embed your templates into the assembly so they can be shared by reference.
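
The "any .NET language" part is what the CodeDom providers already give you. As a minimal sketch (GenerateSourceSketch and the tiny Song class are made up for illustration), the same CodeCompileUnit can be written out as C# or VB source just by asking for a different provider. On the .NET Framework the same provider can also compile the unit with CompileAssemblyFromDom; as far as I know, newer runtimes only support the source generation half.

```csharp
using System;
using System.CodeDom;
using System.CodeDom.Compiler;
using System.IO;

public static class GenerateSourceSketch
{
    // Renders the compile unit as source text in the requested language.
    public static string Generate(CodeCompileUnit unit, string language)
    {
        using (CodeDomProvider provider = CodeDomProvider.CreateProvider(language))
        using (var writer = new StringWriter())
        {
            provider.GenerateCodeFromCompileUnit(unit, writer, new CodeGeneratorOptions());
            return writer.ToString();
        }
    }

    // A tiny stand-in for what a template would produce: namespace MySong, class Song.
    public static CodeCompileUnit BuildSample()
    {
        var ns = new CodeNamespace("MySong");
        ns.Imports.Add(new CodeNamespaceImport("System"));
        ns.Types.Add(new CodeTypeDeclaration("Song") { IsClass = true });

        var unit = new CodeCompileUnit();
        unit.Namespaces.Add(ns);
        return unit;
    }

    public static void Main()
    {
        CodeCompileUnit unit = BuildSample();
        Console.WriteLine(Generate(unit, "CSharp"));
        Console.WriteLine(Generate(unit, "VisualBasic"));
    }
}
```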

While working with NBusiness I have been in the process of creating templates and I have had the chance to work with a few template engines. They all have pros and cons, but none of them ever suited my needs; they’re typically glorified string builders with horribly ugly Tag Soup. While we still have tag soup in this language, it seems a lot less ugly, and this is basically the same approach that Boo takes, except DSLs in Boo are internal and the metadata they are transforming is the actual Boo AST. Here we have an external DSL transforming the metadata of another external DSL into your CLR language of choice. The only form of tag soup is the meta blocks ( [| … |] ), and it’s code interleaved with meta code rather than meta code interleaved with code (like ASP WebForms), a subtle but distinct difference.
Using templates like this you can recursively construct code with ever more general code. Perhaps it would be nice to have a text-writing DSL so you can go from DSL down to DSL, then eventually into a CodeDom template, but that is another DSL for another time. When I am done here I will change the NBusiness parser into an MGrammar DSL, then run it through this template engine instead of my current template scheme.
So I realize this sounds like a crap load of work and probably a little crazy. Additionally, I’m not totally convinced there is nothing exactly like this already, but on the other hand I can actually comprehend this and I can see what gap it would fill in my current projects. I have been working on the MGrammar for this sort of language already and I have been pleasantly surprised how easy it is. I have been trying to use the NewLine character to indicate the end of a statement rather than a semicolon (or something similar), but this turns out to be pretty tough in some cases, so I might have to go the semicolon route (I already compromised enough by using “: … end” instead of tabs, however). Currently I am able to define namespaces, imports, class declarations and constructors (with parameters) and translate them into CodeDom objects. If this idea generates enough interest or comes along far enough to be usable I will probably post it to Codeplex. For now, however, I will just work on it in my own repository to spare everyone my painful research details.
I have been working on creating a DSL of my own (NBusiness) for quite a while now and have seen the dire need for a simple grammar parser (again, comprehensible by mere mortals) and a corresponding template engine, and I can say with some confidence that this tool will be extremely useful for me.
Just for fun here is how a “using” template might look (purely hypothetical at this point):

namespace Common:
      import System
      import System.Collections.Generic
      template Using:
            CodeExpression parameter = Using.Parameters[0]
            CodeStatementCollection body = Using.Statements
            return [|
                  end |]