MetaSharp mentioned on “M” Language Gallery

http://msdn.microsoft.com/en-us/oslo/cc749619.aspx

MetaSharp

A simple Common Language Specification (CLS)-compliant, general purpose programming language for MetaSharp, a fully extensible, pipelined transformation engine that can be used for templated textual transformations (code generation), AST transformations and any combination thereof.
Justin Chase

There are a bunch of other cool grammars and Oslo-related projects on there as well. Check ’em out.

Creating an internal DSL with MetaSharp

MetaSharp includes a CLS-compliant language that is built with the same system and patterns you would use to create an external DSL. Part of creating your own DSL in MetaSharp is declaring Node objects to represent your parsed graph. MetaSharp converts your parsed MGrammar graph into your strongly typed AST (Nodes), which you can then transform however you wish.

Declaring your AST nodes requires a certain design pattern, and after some prompting from Attila the Hun I have created a DSL specifically for creating nodes. Here is a node written by hand, before the DSL:

//-----------------------------------------------------------------------
// <copyright company="MetaSharp">
//     Copyright (c) MetaSharp. All rights reserved.
// </copyright>
//-----------------------------------------------------------------------
namespace MetaSharp.Lang.Ast.Standard
{
    using System.Collections.Generic;
    using MetaSharp.Lang.Ast.Common;

    /// <summary>
    /// A foreach statement node.
    /// </summary>
    public class ForEachStatement : Statement
    {
        /// <summary>
        /// The VariableType NodeProperty.
        /// </summary>
        public static readonly NodeProperty VariableTypeProperty = 
            NodeProperty.Register<ForEachStatement>(n => n.VariableType);

        /// <summary>
        /// The VariableName NodeProperty.
        /// </summary>
        public static readonly NodeProperty VariableNameProperty = 
            NodeProperty.Register<ForEachStatement>(n => n.VariableName);

        /// <summary>
        /// The Expression NodeProperty.
        /// </summary>
        public static readonly NodeProperty ExpressionProperty = 
            NodeProperty.Register<ForEachStatement>(n => n.Expression);

        /// <summary>
        /// The Statements NodeProperty.
        /// </summary>
        public static readonly NodeProperty StatementsProperty = 
            NodeProperty.Register<ForEachStatement>(n => n.Statements);

        /// <summary>
        /// Gets or sets the variable type.
        /// </summary>
        public TypeReference VariableType
        {
            get { return (TypeReference)this.GetValue(VariableTypeProperty); }
            set { this.SetValue(ForEachStatement.VariableTypeProperty, value); }
        }

        /// <summary>
        /// Gets or sets the variable name.
        /// </summary>
        public string VariableName
        {
            get { return (string)this.GetValue(VariableNameProperty); }
            set { this.SetValue(ForEachStatement.VariableNameProperty, value); }
        }

        /// <summary>
        /// Gets or sets the enumeration expression.
        /// </summary>
        public Expression Expression
        {
            get { return (Expression)this.GetValue(ExpressionProperty); }
            set { this.SetValue(ForEachStatement.ExpressionProperty, value); }
        }

        /// <summary>
        /// Gets or sets the statements.
        /// </summary>
        public IEnumerable<Statement> Statements
        {
            get { return (IEnumerable<Statement>)this.GetValue(StatementsProperty); }
            set { this.SetValue(ForEachStatement.StatementsProperty, value); }
        }
    }
}

And here it is as a DSL:

//-----------------------------------------------------------------------
// <copyright company="MetaSharp">
//     Copyright (c) MetaSharp. All rights reserved.
// </copyright>
//-----------------------------------------------------------------------
namespace MetaSharp.Lang.Ast.Standard:

    import System.Collections.Generic;
    import MetaSharp.Lang.Ast.Common;
    import MetaSharp.Transformation;

    /// <summary>
    /// A foreach statement node.
    /// </summary>
    node ForEachStatement as Statement:
    
        /// <summary>
        /// The VariableType NodeProperty.
        /// </summary>
        property VariableType as TypeReference;

        /// <summary>
        /// The VariableName NodeProperty.
        /// </summary>
        property VariableName as string;

        /// <summary>
        /// The Expression NodeProperty.
        /// </summary>
        property Expression as Expression;

        /// <summary>
        /// The Statements NodeProperty.
        /// </summary>
        property Statements as IEnumerable<Statement>;

    end
end

It went from 76 lines to 38, so that’s a win in my book. Plus, most of the remaining lines are much shorter. The only downside is the lack of IntelliSense and syntax highlighting, but I have reason to believe that is a solvable problem in general if you’re using MGrammar as your parser, since it’s already possible in Intellipad.
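
For anyone wondering what the hand-written pattern relies on, here is a minimal sketch of how a Register call that takes a lambda could pull the property name out of an expression tree. The NodeProperty, Node and VariableDeclaration types below are hypothetical stand-ins written purely for illustration; the real MetaSharp classes may be implemented quite differently.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;

// Hypothetical stand-in for MetaSharp's NodeProperty; the real class may differ.
public sealed class NodeProperty
{
    private NodeProperty(Type owner, string name, Type propertyType)
    {
        Owner = owner;
        Name = name;
        PropertyType = propertyType;
    }

    public Type Owner { get; private set; }
    public string Name { get; private set; }
    public Type PropertyType { get; private set; }

    // Reads the property name out of a lambda such as n => n.VariableName.
    public static NodeProperty Register<TNode>(Expression<Func<TNode, object>> accessor)
    {
        Expression body = accessor.Body;

        // Value-typed properties are boxed, which wraps the member access in a Convert node.
        var unary = body as UnaryExpression;
        if (unary != null && unary.NodeType == ExpressionType.Convert)
        {
            body = unary.Operand;
        }

        var member = body as MemberExpression;
        if (member == null)
        {
            throw new ArgumentException("Expected a property access such as n => n.Name.", "accessor");
        }

        return new NodeProperty(typeof(TNode), member.Member.Name, member.Type);
    }
}

// Hypothetical base class that stores one value per registered NodeProperty.
public abstract class Node
{
    private readonly Dictionary<NodeProperty, object> values = new Dictionary<NodeProperty, object>();

    public object GetValue(NodeProperty property)
    {
        object value;
        return values.TryGetValue(property, out value) ? value : null;
    }

    public void SetValue(NodeProperty property, object value)
    {
        values[property] = value;
    }
}

// A node declared by hand, following the same shape as ForEachStatement above.
public class VariableDeclaration : Node
{
    public static readonly NodeProperty NameProperty =
        NodeProperty.Register<VariableDeclaration>(n => n.Name);

    public string Name
    {
        get { return (string)this.GetValue(NameProperty); }
        set { this.SetValue(NameProperty, value); }
    }
}

public static class Program
{
    public static void Main()
    {
        var node = new VariableDeclaration { Name = "x" };
        Console.WriteLine(VariableDeclaration.NameProperty.Name + " = " + node.Name); // prints "Name = x"
    }
}
```

Every property needs the same static field plus getter/setter pair, which is exactly the kind of repetition a node DSL can generate for you.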

To use this DSL all you have to do is import MetaSharp.Lang.targets into your .csproj file.

<Import Project="$(NodeBuilderBinPath)\MetaSharp.Lang.targets" />

Then in your project you simply add your items to the project as Nodes. Like so:

image

This will generate a file for you at compile time in your project’s language (so this should work in VB as well), and that file will get compiled along with the assembly. Next I want to build a Pipeline DSL; then the process of building your own DSL will all be done in DSLs as well!

dynamic C# in unit tests

I’ve been writing some unit tests lately that require quite a bit of casting. This gets tiring pretty fast, so I decided to give the dynamic keyword a try. I’ll show the before and after examples and let you decide.

Here is the non-dynamic version.

[Test]
public void InvokeWithAddPlusInvokeTest()
{
    string code = "f(1 + 2) + g()";
    var b1 = (BinaryExpression)this.pipeline.Compile(code);

    var f = (MethodInvokeExpression)b1.Left;
    var g = (MethodInvokeExpression)b1.Right;

    var b2 = (BinaryExpression)f.Parameters.Single();

    var one = (PrimitiveExpression)b2.Left;
    var two = (PrimitiveExpression)b2.Right;

    Assert.That(b1.Operator == BinaryOperator.Add);
    Assert.That(b2.Operator == BinaryOperator.Add);

    Assert.That(((ReferenceExpression)f.Target).Name == "f");
    Assert.That(((ReferenceExpression)g.Target).Name == "g");
    Assert.That(one.Value == "1");
    Assert.That(two.Value == "2");
}

In this test I am compiling an expression into an AST and digging around to verify that the correct nodes were created in the right places in the tree.

Here is the dynamic version.

[Test]
public void InvokeWithAddPlusInvokeTest()
{
    string code = "f(1 + 2) + g()";
    dynamic b1 = this.pipeline.Compile(code);
    dynamic b2 = Enumerable.Single(b1.Left.Parameters);

    Assert.That(b1.Operator == BinaryOperator.Add);
    Assert.That(b2.Operator == BinaryOperator.Add);

    Assert.That(b1.Left.Target.Name == "f");
    Assert.That(b1.Right.Target.Name == "g");
    Assert.That(b2.Left.Value == "1");
    Assert.That(b2.Right.Value == "2");
}

A lot shorter, that’s for sure. The only downside is that if I change the nodes I will no longer get compile-time errors… but I will get unit test failures, so in theory this shouldn’t matter. I also no longer get IntelliSense, so I either have to know the structure of the objects or use the debugger to figure it out. Still, the simplicity is looking good.
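
To make that trade-off concrete, here is a small self-contained sketch. The node classes are hypothetical and only mimic the shape of the nodes used in the tests above (they are not the MetaSharp types), and BuildTree stands in for the call to pipeline.Compile. It shows that the dynamic walk needs no casts, and that a misspelled member still compiles and only fails when the line actually runs.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.CSharp.RuntimeBinder;

// Hypothetical, minimal node types used only for this illustration.
public abstract class Expr { }
public class Primitive : Expr { public string Value; }
public class Reference : Expr { public string Name; }
public class Binary : Expr { public Expr Left; public Expr Right; }
public class Invoke : Expr
{
    public Expr Target;
    public List<Expr> Parameters = new List<Expr>();
}

public static class Program
{
    // Hand-built tree for "f(1 + 2) + g()", standing in for pipeline.Compile(code).
    public static Expr BuildTree()
    {
        var f = new Invoke { Target = new Reference { Name = "f" } };
        f.Parameters.Add(new Binary
        {
            Left = new Primitive { Value = "1" },
            Right = new Primitive { Value = "2" }
        });
        var g = new Invoke { Target = new Reference { Name = "g" } };
        return new Binary { Left = f, Right = g };
    }

    public static void Main()
    {
        dynamic root = BuildTree();
        dynamic inner = Enumerable.Single(root.Left.Parameters);

        // Members are resolved against the run-time types, so no casts are needed.
        string left = root.Left.Target.Name;
        string right = root.Right.Target.Name;
        string one = inner.Left.Value;
        Console.WriteLine(left + " " + right + " " + one); // prints "f g 1"

        try
        {
            // A misspelled member compiles fine and only fails when executed.
            object unused = root.Left.Target.Nmae;
        }
        catch (RuntimeBinderException ex)
        {
            Console.WriteLine("Run-time failure: " + ex.Message);
        }
    }
}
```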

Reflection vs. code generation – by Martin Fowler

Here is an interesting article by Michael J. Rettig with Martin Fowler on the subject of reflection vs. code generation.

http://www.javaworld.com/javaworld/jw-11-2001/jw-1102-codegen.html?

Here is a nice quote to sum it up (with which I couldn’t agree more).

Active code generation gives us all the benefits of reflection, but none of its limitations. Reflection will still be used, but only during the code generation process, and not during runtime.

Here is another endorsement he gives after many examples and justifications.

The benefits of reflection are obvious. When coupled with code generation it becomes an invaluable and, more importantly, a safe tool. There is often no other way to escape many seemingly redundant tasks. As for code generation: the more I work with it, the more I like it. With every refactoring and increase in functionality, the code becomes clearer and more understandable. However, runtime reflection has the opposite effect. The more I increase its functionality, the more it increases in complexity. So, in the future, if you feel you need to conquer a complicated problem using reflection, just remember one rule: don’t do it at runtime.

I agree completely, except I would go one step further and consider “code generation” just another form of transformation at compile time. At least that is what I’m hoping to accomplish with MetaSharp.

Staged Pipelines

In an effort to make the MetaSharp pipelines more powerful I’m about to add the concepts of stages and connectors. I’ve been thinking about it a bit and I drew up some diagrams to help me express how the pattern should work.

At a high level it’s pretty simple: every pipeline has multiple stages, and every stage has multiple steps. Each stage has one or many input connectors and one or many output connectors, which connect it to the next stage of the pipeline.

image

With this in mind there are four possible types of stages, defined by their input and output connectors. Stages must be chained together with matching input and output connections. You want multiple types because some operations simply cannot be done simultaneously, while others are completely isolated and are perfectly acceptable to run asynchronously.

image

Many to Many

For each input value a complete inner pipeline of steps is created, meaning each input value from the previous stage will be processed by the same steps. The inner pipelines run asynchronously and should not communicate with each other. The stage completes when all inner pipelines have finished running.

image

One to One

This type of stage will accept one input value and produce one output value. It will create exactly one chain of steps and execute synchronously.

image

One to Many

This type of stage will accept one input value and have exactly one chain of steps but will produce many output values.

image

Many to One

This type of stage will accept many values and run them all through exactly one chain of steps.

image

From this I should be able to make any type of compilation pipeline imaginable. For example a typical pipeline might be something like this:

  • Parse files
  • Combine AST
  • Resolve References
  • Generate Assembly

In which case you might end up with the following stages:

  • M:M, Parse files all at once.
  • M:1, Combine the ASTs into one tree.
  • 1:1, Resolve and transform the tree.
  • 1:1, Transform into IL.

You could also quite easily imagine that last stage transforming into multiple objects or multiple files (making it 1:M). The good news is that I don’t think this will be that complicated. The pipeline simply deals with connecting stages, and each stage has a very simple strategy for processing its steps. The real work will lie in implementing the stages, but even then each stage is completely modular and singularly focused.
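
As a rough illustration of the idea, here is a sketch of the four stage types as plain generic functions, chained into the example pipeline above. None of this is MetaSharp code: the Stage class, its method names and the toy “parse”, “combine”, “resolve” and “generate” steps are all made up for the example, and PLINQ is just one possible way to run the M:M inner chains concurrently.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stage helpers; each one captures a single input/output strategy.
public static class Stage
{
    // M:M, every input runs through its own copy of the chain, concurrently.
    // ToArray blocks, so the stage completes only when every chain has finished.
    public static TOut[] ManyToMany<TIn, TOut>(IEnumerable<TIn> inputs, Func<TIn, TOut> chain)
    {
        return inputs.AsParallel().AsOrdered().Select(chain).ToArray();
    }

    // M:1, all inputs go through exactly one chain that produces one value.
    public static TOut ManyToOne<TIn, TOut>(IEnumerable<TIn> inputs, Func<IEnumerable<TIn>, TOut> chain)
    {
        return chain(inputs);
    }

    // 1:1, one input, one chain, one output, executed synchronously.
    public static TOut OneToOne<TIn, TOut>(TIn input, Func<TIn, TOut> chain)
    {
        return chain(input);
    }

    // 1:M, one input and one chain, but many outputs.
    public static TOut[] OneToMany<TIn, TOut>(TIn input, Func<TIn, IEnumerable<TOut>> chain)
    {
        return chain(input).ToArray();
    }
}

public static class Program
{
    // Toy pipeline: source strings stand in for files, token lists stand in for ASTs.
    public static string Compile(IEnumerable<string> sources)
    {
        // M:M, "parse" each source independently by splitting it into tokens.
        string[][] parsed = Stage.ManyToMany(sources, s => s.Split(' '));

        // M:1, combine the per-file results into one tree.
        List<string> combined = Stage.ManyToOne(parsed, trees => trees.SelectMany(t => t).ToList());

        // 1:1, "resolve" the tree (upper-casing stands in for a real transformation).
        List<string> resolved = Stage.OneToOne(combined, tree => tree.Select(t => t.ToUpperInvariant()).ToList());

        // 1:1, "generate" the final output.
        return Stage.OneToOne(resolved, tree => string.Join(";", tree));
    }

    public static void Main()
    {
        Console.WriteLine(Compile(new[] { "f g", "h" })); // prints "F;G;H"
    }
}
```

Because each stage only has to agree with its neighbours on the shape of the connection, swapping the last 1:1 stage for a 1:M one would not touch any of the earlier stages.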