Object Matching in MetaSharp

I’ve been delaying this feature in MetaSharp for a while because I wasn’t entirely sure how I wanted it to work and it’s a very important feature. I also haven’t seen any other examples in the wild that matched what I’m trying to do exactly so I was slightly unsure of the syntax.

But I have finally dug in and got it working! It’s by no means set in stone, so if anyone has any good suggestions I will likely incorporate them. Feel free to send me suggestions.

If you’re wondering what I mean by Object Matching I’ll try to explain, first what it means and second why I think it’s important.

Object Matching

Most parsers in the world today are concerned with general purpose programming languages and, as a result, parsing text and transforming it into executable code. The patterns around this problem set are well known and have many implementations. However, the interesting thing is that they are largely geared towards accepting only text as an input and therefore rely on multiple patterns to solve the problems of compilers.

In MetaSharp there is a now Pattern for matching complex objects as well as characters. What this means is that your input can actually be streams of object trees instead of just strings.

For comparison, suppose you wanted to match a social security number from a string using MetaSharp. Here is how you might do it:

SSN = Number#3 “-” Number#2 “-” Number#4;

If your string input was “123-45-6789” then this SSN rule would match that. This is not all that different from regular expressions, though there are some notable differences.

But suppose what you actually had was a Customer object, which has an SSN property? With object matching to you can match on the entire object instead of just the string.

PersonWithSsn = Person { Ssn = SSN };

In this case the curly braces indicate an object match, where the object type is of Type Person and possesses the property Ssn with a value that matches the SSN rule. This is a pretty trivial example but the value of this becomes more apparent when you consider large object graphs and you have the desire to transform that object graph into something else.

Transforming Object Graphs

This is actually a very common operation in software and the standard two design patterns for doing this are the State Machine and Visitor patterns. In fact both techniques are required as multiple steps in order to implement a basic compiler. Using Object Matching patterns you can effectively create a Visitor pattern in the same grammar language as needed to create a State machine.

This is the real benefit of object matching, being able to use the same grammar to create State Machines and Visitors means that you can create every step in a compiler using a single language, tool or skill.

For example, in MetaSharp I have a sample application that plays songs using Console.Beep. It’s pretty silly but its a nice integration test for me. In this sample I have two songs and after parsing the DSL text what I used to do was to pass the AST resulting from that parse into a T4 template, which would then generate some code that would get compiled into linq expressions.

Here is an example of the T4 template I am referring to:
http://metasharp.codeplex.com/SourceControl/…/DynamicSong.template

<#@ template language=”C#” inherits=”SongMetaTemplate” hostspecific=”true” debug=”true” #>
<#@ assembly name=”DynamicSongBuilder” #>
<#@ import namespace=”DynamicSongBuilder” #>
<# foreach(var song in this.Songs) { #>
    <# foreach(var b in song.Bars) { #>
        <# foreach(var n in b.Notes) { #>
            play(<#=song.Duration #>, “<#=n.Key #>”, <#=n.Octave #>);
        <# }
    }
} #>

This works and is a legitimate technique and I don’t think it’s very popular to say this but string transformations like this have a couple serious drawbacks.

  1. The general purpose languages are hard coded, if you wanted to make this reusable you’d have to rewrite it for every language you wanted to support.
  2. You lose contextual information about your original object.
    • C# is a little better at this than some by giving you the #line directive but it’s still not ideal.
  3. It’s slow. With T4 you either get an unwanted assembly loaded into memory or you have to create a separate AppDomain which is very slow and cludgy.
  4. Tag Soup (http://www.codinghorror.com/blog/2008/07/web-development-as-tag-soup.html)

These drawbacks are usually acceptable. You’re usually only creating a website once, for example. And your views typically don’t get reused as part of another application. Being the top most part of the stack, it’s ok to lose a little context. But in a world where your intention is to create reusable grammars and inheritable grammars it’s very non-ideal.

Here is how you would implement the same transformation using a grammar with Object Matching:

grammar DynamicSongTransform:

Main = SongTransform;

SongTransform
= Song { b:Bars = [BarTransform*] }
-> BlockStatement { Statements = b.Flatten() };

BarTransform
= Bar { n:Notes = [NoteTransform*] }
-> n;

NoteTransform
= n:Note { }
-> InvokeExpression {
Target = ReferenceExpression { Name = "play" },
Parameters = [
PrimitiveExpression { Value = n.Parent.Parent.Duration.ToNumber() },
PrimitiveExpression { Value = n.Key },
PrimitiveExpression { Value = n.Octave.ToNumber() }
]
};
end

This grammar accepts a Object Stream containing a single Song object. That Song object will contain a collection of Bar object which in turn contains a collection of Note objects.

As a refresher the ‘->’ token indicates a projection. Which means the pattern to match is on the left side of the projection and if the match is successful then the right hand side is what is produced. If no projection is specified then the matched value is projected by default.

In the projection of the NoteTransform we are projecting an InvokeExpression, which has a Target and some parameters. This then gets passed down into a DynamicCompile step i created which transforms these objects into Linq expressions, which in turn translates those expressions into IL! In this way every step of our DSL is re-usable. This will work for VB or C# developers. It will also work by passing in a textual DSL or by passing in a Song AST created by other means. You could create a different textual DSL that produces the same AST and this will still work and we have debugging context data all the way to the end. Also, the general purpose code transformers are all implemented by MetaSharp so you don’t have to do that every time.

My theory is that creating a custom DSL should be as simple as creating multiple grammars and snapping them together. The next step will be to implement a general purpose UI that will let you use attributes to annotate your grammars to give hints about highlighting and other IDE related features. Keep an eye out for basic VS and NuGet integration in the near future!

Author: justinmchase

I'm a Software Developer from Minnesota.

2 thoughts on “Object Matching in MetaSharp”

  1. Hi!

    First of all, great work on meta#, the grammar syntax seems to be elegant and powerful, just the way it should be.
    Second, I almost asked an embarrassingly dumb question here, wondering how I could pass a string to a rule, project it into objects, then pass those to another rule and do some fancy object matching. Turns out I forgot that “inheritance” wasn’t just plain old inheritance here, but was also the way to chain grammars after one another. This part is now merely included as a monument to my stupidity. Good.
    Third, I’m still not clear on the syntax of the Switch pattern. I didn’t see any examples in the docs, and the one in the blog post you made on the subject doesn’t compile. Well, what’s supposed to be a simplified version of it doesn’t compile. This:

    Main = BlueColor;
    BlueColor = c:PrimaryColor (Color { Name = “blue” })(c) -> c;
    PrimaryColor
    = “R” -> new Color(“red”)
    | “G” -> new Color(“green”)
    | “B” -> new Color(“blue”)
    ; // Color ctor sets “Name” property

    Could you explain a bit?

    1. Hello!

      The “switch” pattern basically indicates that you’re switching the input of the stream. Usually you pass in a stream and all of the rules match against that stream, with the switch you can actually apply a rule against production output. In that example you have there (which isn’t quite right, I think the final syntax you have to specify a rule rather than have an inline rule like that) it is passing the value in the variable “c” into the second rule. One of the reasons its useful is for situations where you want to do some extra checking on objects that are produced.

      If I recall the place I needed this was for event handlers in my general purpose language. For example the code:
      x.y.z += foo;

      Naively the pattern is something like this:
      Expression “+=” Expression

      But in reality this doesn’t work because expressions can have methods or binary expressions, etc:
      x.y.z() += 1 + 2;

      That obviously doesn’t work. But since the expression rule is recursive and needs to allow all of these cases the solution is to instead have a select pattern to verify that the object is actually a ReferenceExpression:

      EventAssignmentExpression = left:Expression IsReference(left) “+=” right:Expression IsReference(right)
      IsReference = ReferenceExpression { }

      If IsReference fails to match then the pattern fails and you get an error. So the idea is that you “switch” from the main input to some other input and do additional work on it.

%d bloggers like this: