Dynamic Pattern Matching in .net

For my most recent batch of work on MetaSharp I have been working on a completely dynamic version of the Grammar grammar. The reason I’ve been working on it is that I want to start working UI portions of the tool and in order to do that I needed a way to be able to dynamically compile grammars and run them.

What’s interesting is that thus far the Grammar language has been static, meaning you compile it at build-time and it results in a class that inherits from something using the standard .net Type system. So doing things dynamically is tricky because one of my goals was to still be able to “inherit” from a grammar without creating any new .net types. I managed to get this working. This has been a very interesting discovery process for me and I think it has really given me a few revelations about dynamism in type system, I’ll try to share a few of them here.

  • The only difference between dynamic and static is simply when you compile; during run-time or before run-time.
  • A dynamic language is simply expression trees + custom Type system.
  • Your Type system can be anything, it doesn’t need to actually be classes and structs, etc.

Here is an example of some unit tests I have been writing to verify certain functionality, which should hopefully give you a sampling of what it looks like to do pattern matching dynamically as well as some of the basics of MetaSharp.

Simple ‘And’ rule test

This rule says ‘a’ AND ‘b’. The CompileRule is actually adding the extra namespace and grammar declaration stuff automatically and compiling that grammar into a DynamicParser object and is returned as an IParser. Calling parse actually returns an object of Type dynamic but I have an explicit cast below as well.

[Test]
public void AndMatchTest()
{
    var dynamicParser = DynamicParser.CompileRule("'a' 'b'");

    var result = (dynamic)dynamicParser.Parse("ab");
    Assert.AreEqual('a', (char)result[0]);
    Assert.AreEqual('b', (char)result[1]);
}

Value projections

Here I’m projecting an actual double, not a string representation of a double.

[Test]
public void DoubleProjectionTest()
{
    var dynamicParser = DynamicParser.CompileRule("default -> 7.11");
    var result = (dynamic)dynamicParser.Parse("");
    Assert.AreEqual(7.11, (double)result);
}

Dynamic projections

This projection is returning a DynamicNode object. The Identifier is used instead of the Type Name. Properties can be dynamically added. This also has a variable assignment on the left hand side “a:” lets you use the a variable on the right hand side.

[Test]
public void CharacterMatchObjectProjectionTest()
{
    var dynamicParser = DynamicParser.CompileRule("a:'a' -> Test { Value = a }");
    var result = (dynamic)dynamicParser.Parse("a");
    Assert.AreEqual("Test", result.Identifier);
    Assert.AreEqual('a', (char)result.Value);
}

Matching multiple items

+ means One Or More. Here we’re matching an arbitrary number of a’s. This would fail if there were 0 a’s though.

[Test]
public void OneOrMoreMatchTest()
{
    var dynamicParser = DynamicParser.CompileRule("'a'+");
    var result = (dynamic)dynamicParser.Parse("a");
    Assert.AreEqual('a', (char)result[0]);

    result = dynamicParser.Parse("aaa");
    Assert.AreEqual('a', (char)result[0]);
    Assert.AreEqual('a', (char)result[1]);
    Assert.AreEqual('a', (char)result[2]);
}

Calling custom extensions

AsString() is actually a static method defined on Common. For now I am only searching for extensions in a very explicit whitelist. This will probably change into an Assembly whitelist eventually though.

[Test]
public void ExtensionReferenceProjectionTest()
{
    var dynamicParser = DynamicParser.CompileRule(
        @"'a' -> ""abc"".AsString()",
        typeof(MetaSharp.Transformation.Extensions.Common));

    var result = (dynamic)dynamicParser.Parse("a");
    Assert.AreEqual("abc", (string)result);
}

Static inheritance from a dynamic grammar

This one is a little crazy. Here we are actually doing the full grammar code not the abbreviated version and we are inheriting (that’s what the ‘<’ means) from BasicParser which is a static Type as you can tell by the white list. In grammar inheritance you can optionally add a Main rule which will be added to a queue of processors for the specified input. In this case BasicParser inherits from BasicInterleave which inherits from BasicTokenizer. So the input stream will be tokenized, interleaved then parsed by this dynamic grammar. Each step parses the output of the previous step. If you don’t have a Main rule you will be skipped but you can still provide rules for inheriting grammars to reference as well as override rules of grammars you are inheriting from.

[Test]
public void GrammarStaticInheritanceTest()
{
    var grammar = new GrammarGrammar();
    var unit = (GrammarUnitNode)grammar.Parse(@"
        namespace N:
            import MetaSharp.Transformation.Extensions;
            import MetaSharp.Transformation.Parsing.Common;

            grammar X < BasicParser:
                Main = a:(""a"" ""b"" ""c"") -> a.Flatten();
            end
        end");

    var parser = unit.DynamicCompile(
        "N.X",
        typeof(MetaSharp.Transformation.Extensions.Common),
        typeof(MetaSharp.Transformation.Parsing.Common.BasicParser));

    var result = (dynamic)parser.Parse("a b c");
    Assert.AreEqual("a", (string)result[0]);
    Assert.AreEqual("b", (string)result[1]);
    Assert.AreEqual("c", (string)result[2]);
}

Dynamic inheritance

This is essentially the same thing except in this case one of the dynamic grammars is inheriting from another dynamic grammar. You can create both and use them both.

[Test]
public void GrammarDynamicInheritanceDualNamespaceTest()
{
    var grammar = new GrammarGrammar();
    var unit = (GrammarUnitNode)grammar.Parse(@"
        namespace A:
            grammar X:
                Main = 'a';
            end
        end
        namespace B:
            import A;

            grammar Y < X:
            end
        end");

    var parser = unit.DynamicCompile("A.X");
    var result = (dynamic)parser.Parse("a");
    Assert.AreEqual('a', (char)result);

    parser = unit.DynamicCompile("B.Y");
    result = (dynamic)parser.Parse("a");
    Assert.AreEqual('a', (char)result);
}

Rule references and parameterized rules

You can also create as many rules as you want and reference them, including parameterized rules like below.

[Test]
public void ParameterizedRuleReferenceTest()
{
    var parser = DynamicParser.Compile(
        @"namespace N:
            import MetaSharp.Transformation.Extensions;
            grammar X:
                Main = (Expr('a'..'z') | ' ')+;
                Expr(a) = str:a+ -> str.AsString();
            end
        end",
        "N.X",
        typeof(MetaSharp.Transformation.Extensions.Common));

    var result = (dynamic)parser.Parse("hello world");
    Assert.AreEqual("hello", (string)result[0]);
    Assert.AreEqual(' ', (char)result[1]);
    Assert.AreEqual("world", (string)result[2]);
}

Direct left recursion

Direct left recursion is when, on the left hand side of a rule body you are referencing itself directly. This results in recursion but is completely solvable by PatternMatching grammars. Below the Expr rule is an example of DLR because it is referencing itself. DLR is a useful way to solve problems of forward lookup, such as binary expressions like below.

DLR is direct as opposed to indirect left recursion, where you reference a Rule which then references that Rule back directly. Indirect Left Recursion is also a solvable problem but I was not able to solve it for situations of cases of multiple nested indirect DLR. Also it results in a very confusing grammar which has diminishing returns in value. For now you get errors in the cases of indirect left recursion.

[Test]
public void LeftRecursiveReferenceTest()
{
    var parser = DynamicParser.Compile(
        @"namespace N:
            import MetaSharp.Transformation.Extensions;
            grammar X:
                Main = Expr;
                Expr = l:Expr '+' r:Expr -> Binary { Left = l, Right = r, Operator = '+' }
                        | n:Num -> n.ToNumber();
                Num = '0'..'9';
            end
        end",
        "N.X",
        typeof(MetaSharp.Transformation.Extensions.Common));

    var result = (dynamic)parser.Parse("1+2+3+4");
    Assert.AreEqual("Binary", result.Identifier);
    Assert.AreEqual("Binary", result.Right.Identifier);
    Assert.AreEqual("Binary", result.Right.Right.Identifier);
    Assert.AreEqual(1, (int)result.Left);
    Assert.AreEqual(2, (int)result.Right.Left);
    Assert.AreEqual(3, (int)result.Right.Right.Left);
    Assert.AreEqual(4, (int)result.Right.Right.Right);
}

Overriding rules

Y is overriding the Expr rule to project true instead of false. Parsing using Y will now result in a new result compared to parsing with X. You can override rules on static grammars as well.

[Test]
public void RuleDynamicOverrideDynamicTest()
{
    var unit = DynamicParser.Parse(
        @"namespace N:
            grammar X:
                Main = Expr;
                Expr = any -> false;
            end
            grammar Y < X:
                override Expr = any -> true;
            end
        end");

    var parser = unit.DynamicCompile("N.X");
    var result = parser.Parse("1");
    Assert.IsFalse((bool)result);

    parser = unit.DynamicCompile("N.Y");
    result = parser.Parse("1");
    Assert.IsTrue((bool)result);
}

There is more too of course but this is probably too long already, it’s also not 100% complete yet. I will be considering it complete (sans bugs) when I can dynamically compile the Grammar grammar using itself! Next up:

  • Silverlight
  • VS integration
  • Polish and an actual release