Object Matching in MetaSharp

I’ve been delaying this feature in MetaSharp for a while because I wasn’t entirely sure how I wanted it to work and it’s a very important feature. I also haven’t seen any other examples in the wild that matched what I’m trying to do exactly so I was slightly unsure of the syntax.

But I have finally dug in and got it working! It’s by no means set in stone, so if anyone has any good suggestions I will likely incorporate them. Feel free to send me suggestions.

If you’re wondering what I mean by Object Matching I’ll try to explain, first what it means and second why I think it’s important.

Object Matching

Most parsers in the world today are concerned with general purpose programming languages and, as a result, parsing text and transforming it into executable code. The patterns around this problem set are well known and have many implementations. However, the interesting thing is that they are largely geared towards accepting only text as an input and therefore rely on multiple patterns to solve the problems of compilers.

In MetaSharp there is a now Pattern for matching complex objects as well as characters. What this means is that your input can actually be streams of object trees instead of just strings.

For comparison, suppose you wanted to match a social security number from a string using MetaSharp. Here is how you might do it:

SSN = Number#3 “-” Number#2 “-” Number#4;

If your string input was “123-45-6789” then this SSN rule would match that. This is not all that different from regular expressions, though there are some notable differences.

But suppose what you actually had was a Customer object, which has an SSN property? With object matching to you can match on the entire object instead of just the string.

PersonWithSsn = Person { Ssn = SSN };

In this case the curly braces indicate an object match, where the object type is of Type Person and possesses the property Ssn with a value that matches the SSN rule. This is a pretty trivial example but the value of this becomes more apparent when you consider large object graphs and you have the desire to transform that object graph into something else.

Transforming Object Graphs

This is actually a very common operation in software and the standard two design patterns for doing this are the State Machine and Visitor patterns. In fact both techniques are required as multiple steps in order to implement a basic compiler. Using Object Matching patterns you can effectively create a Visitor pattern in the same grammar language as needed to create a State machine.

This is the real benefit of object matching, being able to use the same grammar to create State Machines and Visitors means that you can create every step in a compiler using a single language, tool or skill.

For example, in MetaSharp I have a sample application that plays songs using Console.Beep. It’s pretty silly but its a nice integration test for me. In this sample I have two songs and after parsing the DSL text what I used to do was to pass the AST resulting from that parse into a T4 template, which would then generate some code that would get compiled into linq expressions.

Here is an example of the T4 template I am referring to:
http://metasharp.codeplex.com/SourceControl/…/DynamicSong.template

<#@ template language=”C#” inherits=”SongMetaTemplate” hostspecific=”true” debug=”true” #>
<#@ assembly name=”DynamicSongBuilder” #>
<#@ import namespace=”DynamicSongBuilder” #>
<# foreach(var song in this.Songs) { #>
    <# foreach(var b in song.Bars) { #>
        <# foreach(var n in b.Notes) { #>
            play(<#=song.Duration #>, “<#=n.Key #>”, <#=n.Octave #>);
        <# }
    }
} #>

This works and is a legitimate technique and I don’t think it’s very popular to say this but string transformations like this have a couple serious drawbacks.

  1. The general purpose languages are hard coded, if you wanted to make this reusable you’d have to rewrite it for every language you wanted to support.
  2. You lose contextual information about your original object.
    • C# is a little better at this than some by giving you the #line directive but it’s still not ideal.
  3. It’s slow. With T4 you either get an unwanted assembly loaded into memory or you have to create a separate AppDomain which is very slow and cludgy.
  4. Tag Soup (http://www.codinghorror.com/blog/2008/07/web-development-as-tag-soup.html)

These drawbacks are usually acceptable. You’re usually only creating a website once, for example. And your views typically don’t get reused as part of another application. Being the top most part of the stack, it’s ok to lose a little context. But in a world where your intention is to create reusable grammars and inheritable grammars it’s very non-ideal.

Here is how you would implement the same transformation using a grammar with Object Matching:

grammar DynamicSongTransform:

Main = SongTransform;

SongTransform
= Song { b:Bars = [BarTransform*] }
-> BlockStatement { Statements = b.Flatten() };

BarTransform
= Bar { n:Notes = [NoteTransform*] }
-> n;

NoteTransform
= n:Note { }
-> InvokeExpression {
Target = ReferenceExpression { Name = "play" },
Parameters = [
PrimitiveExpression { Value = n.Parent.Parent.Duration.ToNumber() },
PrimitiveExpression { Value = n.Key },
PrimitiveExpression { Value = n.Octave.ToNumber() }
]
};
end

This grammar accepts a Object Stream containing a single Song object. That Song object will contain a collection of Bar object which in turn contains a collection of Note objects.

As a refresher the ‘->’ token indicates a projection. Which means the pattern to match is on the left side of the projection and if the match is successful then the right hand side is what is produced. If no projection is specified then the matched value is projected by default.

In the projection of the NoteTransform we are projecting an InvokeExpression, which has a Target and some parameters. This then gets passed down into a DynamicCompile step i created which transforms these objects into Linq expressions, which in turn translates those expressions into IL! In this way every step of our DSL is re-usable. This will work for VB or C# developers. It will also work by passing in a textual DSL or by passing in a Song AST created by other means. You could create a different textual DSL that produces the same AST and this will still work and we have debugging context data all the way to the end. Also, the general purpose code transformers are all implemented by MetaSharp so you don’t have to do that every time.

My theory is that creating a custom DSL should be as simple as creating multiple grammars and snapping them together. The next step will be to implement a general purpose UI that will let you use attributes to annotate your grammars to give hints about highlighting and other IDE related features. Keep an eye out for basic VS and NuGet integration in the near future!