List<T> is dead, long live LINQ

Well, ok, maybe not completely dead. It has it’s uses but for the most part I find myself using an actual list less and less. I do end up using ObservableCollection<T> for ViewModels still, of course, but that can’t be avoided.

I do find myself slowly banishing it from my daily coding habits however, thanks to LINQ. I am finding that the covariant interface IEnumerable<T> is almost always a better choice as a return value for methods and properties and with the help of the higher order set functions provided along with LINQ there is nothing you can’t do without List<T> that you could have done with it.

Also, I would like to mention that when I am talking about LINQ I am talking exclusively about the extension methods found in System.Linq.Enumerable not the actual internal DSL syntax in C# / VB. I almost never use that syntax since it can’t express all of the functions I need and is sometimes hard to debug.

I did have to add one extension method to help ease general use of LINQ however.

public static class EnumerableExtensions
{
    public static IEnumerable<T> ToEnumerable<T>(this T @object)
    {
        if (@object == null)
            return System.Linq.Enumerable.Empty<T>();

        var enumerable = @object as IEnumerable;
        if (enumerable == null)
            return new[] { (T)@object };

        return enumerable.Cast<T>();
    }
}

This allows you to ensure than any object can be used as an IEnumerable<T>. With this in hand you can replace any method on List<T>. Here is an example of a bunch of methods on List<T> and their equivalent LINQ translations.

var a = 1;
var b = 2;
var c = 3;

var list = new List<int>();
var set = Enumerable.Empty<int>();

list.Add(a);
set = set.Concat(a.ToEnumerable());

list.Remove(a);
set = set.Where(i => i != a);

list.AddRange(new[] { a, b });
set = set.Concat(a.ToEnumerable()).Concat(b.ToEnumerable());

list.Sort();
set = set.OrderBy(i => i);

list.Contains(a);
set.Any(i => i == a);

var count1 = list.Count;
var count2 = set.Count();

list.FindAll(i => i == a);
set.Where(i => i == a);

list.IndexOf(a);
set.TakeWhile(i => i == a).Count();

list.FindLast(i => i == a);
set.Last(i => i == a);

list.GetRange(0, 2);
set.Skip(0).Take(2);

list.Insert(1, c);
set = set.Take(1).Concat(c.ToEnumerable()).Concat(set.Skip(1));

list.Reverse();
set = set.Reverse();

list.TrueForAll(i => i == 0);
set.All(i => i == 0);

Note that in order to get some of the features of List<T> you have to compose various LINQ functions together, such as Insert. At first this may seem overly verbose but the benefit is an incredible flexibility. There is nearly no limit on what you can express with LINQ because of this composability (also you could create extensions that simply wrap up the complex composition with a neater signature). List<T> has a few more methods that I will leave up to you to translate but LINQ has many more functions and function composition possibilities that I would highly recommend discovering.

Executing the above code and printing list and set results in this output.

list: { 2 3 1 }
set: { 2 3 1 }

When you use LINQ your set is immutable at every point in time, which is very helpful in case you would like to try to use the Parallel Extensions (aka PLINQ). One other interesting aspect of the LINQ extensions is deferred execution. This means that none of my set operations above are even executed until I iterate over the set in my print function. This allows me to change the value of ‘a’, reiterate and receive a different result. This can be confusing if you’re unaware of it but it can be a huge win if you understand it well. This is all possible because of the functional nature of LINQ and (as best as I can understand) is an example of a Monad in .NET. And, of course, coroutines in C# compliment LINQ very nicely as well (aka yield return). I think it would be very interesting to experiment with a language where every variable is essentially treated as a set, NULL is not possible but empty sets are. An object is simply a set of 1. It would be interesting.

Another very cool aspect of LINQ is Lambda Expressions and how they are related to Expression Trees and the DLR. It’s extremely powerful and they are intimately tied together.

Also, do check out IEnumerable<T>’s mirrored twin: IObservable<T>.

Object Matching in MetaSharp

I’ve been delaying this feature in MetaSharp for a while because I wasn’t entirely sure how I wanted it to work and it’s a very important feature. I also haven’t seen any other examples in the wild that matched what I’m trying to do exactly so I was slightly unsure of the syntax.

But I have finally dug in and got it working! It’s by no means set in stone, so if anyone has any good suggestions I will likely incorporate them. Feel free to send me suggestions.

If you’re wondering what I mean by Object Matching I’ll try to explain, first what it means and second why I think it’s important.

Object Matching

Most parsers in the world today are concerned with general purpose programming languages and, as a result, parsing text and transforming it into executable code. The patterns around this problem set are well known and have many implementations. However, the interesting thing is that they are largely geared towards accepting only text as an input and therefore rely on multiple patterns to solve the problems of compilers.

In MetaSharp there is a now Pattern for matching complex objects as well as characters. What this means is that your input can actually be streams of object trees instead of just strings.

For comparison, suppose you wanted to match a social security number from a string using MetaSharp. Here is how you might do it:

SSN = Number#3 “-” Number#2 “-” Number#4;

If your string input was “123-45-6789” then this SSN rule would match that. This is not all that different from regular expressions, though there are some notable differences.

But suppose what you actually had was a Customer object, which has an SSN property? With object matching to you can match on the entire object instead of just the string.

PersonWithSsn = Person { Ssn = SSN };

In this case the curly braces indicate an object match, where the object type is of Type Person and possesses the property Ssn with a value that matches the SSN rule. This is a pretty trivial example but the value of this becomes more apparent when you consider large object graphs and you have the desire to transform that object graph into something else.

Transforming Object Graphs

This is actually a very common operation in software and the standard two design patterns for doing this are the State Machine and Visitor patterns. In fact both techniques are required as multiple steps in order to implement a basic compiler. Using Object Matching patterns you can effectively create a Visitor pattern in the same grammar language as needed to create a State machine.

This is the real benefit of object matching, being able to use the same grammar to create State Machines and Visitors means that you can create every step in a compiler using a single language, tool or skill.

For example, in MetaSharp I have a sample application that plays songs using Console.Beep. It’s pretty silly but its a nice integration test for me. In this sample I have two songs and after parsing the DSL text what I used to do was to pass the AST resulting from that parse into a T4 template, which would then generate some code that would get compiled into linq expressions.

Here is an example of the T4 template I am referring to:
http://metasharp.codeplex.com/SourceControl/…/DynamicSong.template

<#@ template language=”C#” inherits=”SongMetaTemplate” hostspecific=”true” debug=”true” #>
<#@ assembly name=”DynamicSongBuilder” #>
<#@ import namespace=”DynamicSongBuilder” #>
<# foreach(var song in this.Songs) { #>
    <# foreach(var b in song.Bars) { #>
        <# foreach(var n in b.Notes) { #>
            play(<#=song.Duration #>, “<#=n.Key #>”, <#=n.Octave #>);
        <# }
    }
} #>

This works and is a legitimate technique and I don’t think it’s very popular to say this but string transformations like this have a couple serious drawbacks.

  1. The general purpose languages are hard coded, if you wanted to make this reusable you’d have to rewrite it for every language you wanted to support.
  2. You lose contextual information about your original object.
    • C# is a little better at this than some by giving you the #line directive but it’s still not ideal.
  3. It’s slow. With T4 you either get an unwanted assembly loaded into memory or you have to create a separate AppDomain which is very slow and cludgy.
  4. Tag Soup (http://www.codinghorror.com/blog/2008/07/web-development-as-tag-soup.html)

These drawbacks are usually acceptable. You’re usually only creating a website once, for example. And your views typically don’t get reused as part of another application. Being the top most part of the stack, it’s ok to lose a little context. But in a world where your intention is to create reusable grammars and inheritable grammars it’s very non-ideal.

Here is how you would implement the same transformation using a grammar with Object Matching:

grammar DynamicSongTransform:

Main = SongTransform;

SongTransform
= Song { b:Bars = [BarTransform*] }
-> BlockStatement { Statements = b.Flatten() };

BarTransform
= Bar { n:Notes = [NoteTransform*] }
-> n;

NoteTransform
= n:Note { }
-> InvokeExpression {
Target = ReferenceExpression { Name = "play" },
Parameters = [
PrimitiveExpression { Value = n.Parent.Parent.Duration.ToNumber() },
PrimitiveExpression { Value = n.Key },
PrimitiveExpression { Value = n.Octave.ToNumber() }
]
};
end

This grammar accepts a Object Stream containing a single Song object. That Song object will contain a collection of Bar object which in turn contains a collection of Note objects.

As a refresher the ‘->’ token indicates a projection. Which means the pattern to match is on the left side of the projection and if the match is successful then the right hand side is what is produced. If no projection is specified then the matched value is projected by default.

In the projection of the NoteTransform we are projecting an InvokeExpression, which has a Target and some parameters. This then gets passed down into a DynamicCompile step i created which transforms these objects into Linq expressions, which in turn translates those expressions into IL! In this way every step of our DSL is re-usable. This will work for VB or C# developers. It will also work by passing in a textual DSL or by passing in a Song AST created by other means. You could create a different textual DSL that produces the same AST and this will still work and we have debugging context data all the way to the end. Also, the general purpose code transformers are all implemented by MetaSharp so you don’t have to do that every time.

My theory is that creating a custom DSL should be as simple as creating multiple grammars and snapping them together. The next step will be to implement a general purpose UI that will let you use attributes to annotate your grammars to give hints about highlighting and other IDE related features. Keep an eye out for basic VS and NuGet integration in the near future!

Dynamic Pattern Matching in .net

For my most recent batch of work on MetaSharp I have been working on a completely dynamic version of the Grammar grammar. The reason I’ve been working on it is that I want to start working UI portions of the tool and in order to do that I needed a way to be able to dynamically compile grammars and run them.

What’s interesting is that thus far the Grammar language has been static, meaning you compile it at build-time and it results in a class that inherits from something using the standard .net Type system. So doing things dynamically is tricky because one of my goals was to still be able to “inherit” from a grammar without creating any new .net types. I managed to get this working. This has been a very interesting discovery process for me and I think it has really given me a few revelations about dynamism in type system, I’ll try to share a few of them here.

  • The only difference between dynamic and static is simply when you compile; during run-time or before run-time.
  • A dynamic language is simply expression trees + custom Type system.
  • Your Type system can be anything, it doesn’t need to actually be classes and structs, etc.

Here is an example of some unit tests I have been writing to verify certain functionality, which should hopefully give you a sampling of what it looks like to do pattern matching dynamically as well as some of the basics of MetaSharp.

Simple ‘And’ rule test

This rule says ‘a’ AND ‘b’. The CompileRule is actually adding the extra namespace and grammar declaration stuff automatically and compiling that grammar into a DynamicParser object and is returned as an IParser. Calling parse actually returns an object of Type dynamic but I have an explicit cast below as well.

[Test]
public void AndMatchTest()
{
    var dynamicParser = DynamicParser.CompileRule("'a' 'b'");

    var result = (dynamic)dynamicParser.Parse("ab");
    Assert.AreEqual('a', (char)result[0]);
    Assert.AreEqual('b', (char)result[1]);
}

Value projections

Here I’m projecting an actual double, not a string representation of a double.

[Test]
public void DoubleProjectionTest()
{
    var dynamicParser = DynamicParser.CompileRule("default -> 7.11");
    var result = (dynamic)dynamicParser.Parse("");
    Assert.AreEqual(7.11, (double)result);
}

Dynamic projections

This projection is returning a DynamicNode object. The Identifier is used instead of the Type Name. Properties can be dynamically added. This also has a variable assignment on the left hand side “a:” lets you use the a variable on the right hand side.

[Test]
public void CharacterMatchObjectProjectionTest()
{
    var dynamicParser = DynamicParser.CompileRule("a:'a' -> Test { Value = a }");
    var result = (dynamic)dynamicParser.Parse("a");
    Assert.AreEqual("Test", result.Identifier);
    Assert.AreEqual('a', (char)result.Value);
}

Matching multiple items

+ means One Or More. Here we’re matching an arbitrary number of a’s. This would fail if there were 0 a’s though.

[Test]
public void OneOrMoreMatchTest()
{
    var dynamicParser = DynamicParser.CompileRule("'a'+");
    var result = (dynamic)dynamicParser.Parse("a");
    Assert.AreEqual('a', (char)result[0]);

    result = dynamicParser.Parse("aaa");
    Assert.AreEqual('a', (char)result[0]);
    Assert.AreEqual('a', (char)result[1]);
    Assert.AreEqual('a', (char)result[2]);
}

Calling custom extensions

AsString() is actually a static method defined on Common. For now I am only searching for extensions in a very explicit whitelist. This will probably change into an Assembly whitelist eventually though.

[Test]
public void ExtensionReferenceProjectionTest()
{
    var dynamicParser = DynamicParser.CompileRule(
        @"'a' -> ""abc"".AsString()",
        typeof(MetaSharp.Transformation.Extensions.Common));

    var result = (dynamic)dynamicParser.Parse("a");
    Assert.AreEqual("abc", (string)result);
}

Static inheritance from a dynamic grammar

This one is a little crazy. Here we are actually doing the full grammar code not the abbreviated version and we are inheriting (that’s what the ‘<’ means) from BasicParser which is a static Type as you can tell by the white list. In grammar inheritance you can optionally add a Main rule which will be added to a queue of processors for the specified input. In this case BasicParser inherits from BasicInterleave which inherits from BasicTokenizer. So the input stream will be tokenized, interleaved then parsed by this dynamic grammar. Each step parses the output of the previous step. If you don’t have a Main rule you will be skipped but you can still provide rules for inheriting grammars to reference as well as override rules of grammars you are inheriting from.

[Test]
public void GrammarStaticInheritanceTest()
{
    var grammar = new GrammarGrammar();
    var unit = (GrammarUnitNode)grammar.Parse(@"
        namespace N:
            import MetaSharp.Transformation.Extensions;
            import MetaSharp.Transformation.Parsing.Common;

            grammar X < BasicParser:
                Main = a:(""a"" ""b"" ""c"") -> a.Flatten();
            end
        end");

    var parser = unit.DynamicCompile(
        "N.X",
        typeof(MetaSharp.Transformation.Extensions.Common),
        typeof(MetaSharp.Transformation.Parsing.Common.BasicParser));

    var result = (dynamic)parser.Parse("a b c");
    Assert.AreEqual("a", (string)result[0]);
    Assert.AreEqual("b", (string)result[1]);
    Assert.AreEqual("c", (string)result[2]);
}

Dynamic inheritance

This is essentially the same thing except in this case one of the dynamic grammars is inheriting from another dynamic grammar. You can create both and use them both.

[Test]
public void GrammarDynamicInheritanceDualNamespaceTest()
{
    var grammar = new GrammarGrammar();
    var unit = (GrammarUnitNode)grammar.Parse(@"
        namespace A:
            grammar X:
                Main = 'a';
            end
        end
        namespace B:
            import A;

            grammar Y < X:
            end
        end");

    var parser = unit.DynamicCompile("A.X");
    var result = (dynamic)parser.Parse("a");
    Assert.AreEqual('a', (char)result);

    parser = unit.DynamicCompile("B.Y");
    result = (dynamic)parser.Parse("a");
    Assert.AreEqual('a', (char)result);
}

Rule references and parameterized rules

You can also create as many rules as you want and reference them, including parameterized rules like below.

[Test]
public void ParameterizedRuleReferenceTest()
{
    var parser = DynamicParser.Compile(
        @"namespace N:
            import MetaSharp.Transformation.Extensions;
            grammar X:
                Main = (Expr('a'..'z') | ' ')+;
                Expr(a) = str:a+ -> str.AsString();
            end
        end",
        "N.X",
        typeof(MetaSharp.Transformation.Extensions.Common));

    var result = (dynamic)parser.Parse("hello world");
    Assert.AreEqual("hello", (string)result[0]);
    Assert.AreEqual(' ', (char)result[1]);
    Assert.AreEqual("world", (string)result[2]);
}

Direct left recursion

Direct left recursion is when, on the left hand side of a rule body you are referencing itself directly. This results in recursion but is completely solvable by PatternMatching grammars. Below the Expr rule is an example of DLR because it is referencing itself. DLR is a useful way to solve problems of forward lookup, such as binary expressions like below.

DLR is direct as opposed to indirect left recursion, where you reference a Rule which then references that Rule back directly. Indirect Left Recursion is also a solvable problem but I was not able to solve it for situations of cases of multiple nested indirect DLR. Also it results in a very confusing grammar which has diminishing returns in value. For now you get errors in the cases of indirect left recursion.

[Test]
public void LeftRecursiveReferenceTest()
{
    var parser = DynamicParser.Compile(
        @"namespace N:
            import MetaSharp.Transformation.Extensions;
            grammar X:
                Main = Expr;
                Expr = l:Expr '+' r:Expr -> Binary { Left = l, Right = r, Operator = '+' }
                        | n:Num -> n.ToNumber();
                Num = '0'..'9';
            end
        end",
        "N.X",
        typeof(MetaSharp.Transformation.Extensions.Common));

    var result = (dynamic)parser.Parse("1+2+3+4");
    Assert.AreEqual("Binary", result.Identifier);
    Assert.AreEqual("Binary", result.Right.Identifier);
    Assert.AreEqual("Binary", result.Right.Right.Identifier);
    Assert.AreEqual(1, (int)result.Left);
    Assert.AreEqual(2, (int)result.Right.Left);
    Assert.AreEqual(3, (int)result.Right.Right.Left);
    Assert.AreEqual(4, (int)result.Right.Right.Right);
}

Overriding rules

Y is overriding the Expr rule to project true instead of false. Parsing using Y will now result in a new result compared to parsing with X. You can override rules on static grammars as well.

[Test]
public void RuleDynamicOverrideDynamicTest()
{
    var unit = DynamicParser.Parse(
        @"namespace N:
            grammar X:
                Main = Expr;
                Expr = any -> false;
            end
            grammar Y < X:
                override Expr = any -> true;
            end
        end");

    var parser = unit.DynamicCompile("N.X");
    var result = parser.Parse("1");
    Assert.IsFalse((bool)result);

    parser = unit.DynamicCompile("N.Y");
    result = parser.Parse("1");
    Assert.IsTrue((bool)result);
}

There is more too of course but this is probably too long already, it’s also not 100% complete yet. I will be considering it complete (sans bugs) when I can dynamically compile the Grammar grammar using itself! Next up:

  • Silverlight
  • VS integration
  • Polish and an actual release

ObservableObject for WPF MVVM

For those of you who know what INotifyPropertyChanged is and also know what kind of a maintenance burden the OnPropertyChanged(“Property”) calls are I would like to share my favorite implementation of the ObservableObject.

public class ObservableObject : INotifyPropertyChanged
{
    public event PropertyChangedEventHandler PropertyChanged;

    protected virtual void SetAndNotify<T>(ref T field, T value, Expression<Func<T>> property)
    {
        if (!object.ReferenceEquals(field, value))
        {
            field = value;
            this.OnPropertyChanged(property);
        }
    }

    protected virtual void OnPropertyChanged<T>(Expression<Func<T>> changedProperty)
    {
        if (PropertyChanged != null)
        {
            string name = ((MemberExpression)changedProperty.Body).Member.Name;
            PropertyChanged(this, new PropertyChangedEventArgs(name));
        }
    }
}

Instead of using strings for property change notifications you use lambda expressions. This is beneficial because refactoring tools will pick this up instead of having to manually change the magic strings. It also will give you compile time errors if you have gotten something wrong. It also lets you maintain 1 statement setters in properties instead of multiple statements. Using the ObservableObject looks something like this:

public class Customer : ObservableObject
{
    private string name;

    public string Name
    {
        get { return this.name; }
        set { this.SetAndNotify(ref this.name, value, () => this.Name); }
    }
}

Simple and effective.

UnitDriven for Windows Phone 7

I have extended UnitDriven to provide support for (the current beta release of) Windows Phone 7. This means you can write a unit test that runs on .NET, Silverlight and Windows Phone. This is an Alpha release for now, hopefully if we get a few people to use it I can make a more stable release when the final version of the  WindowsPhone7 SDK is released.

Download: http://unitdriven.codeplex.com/releases/view/50214

Here is a screenshot of UnitDriven running tests on the WindowsPhone7 emulator.

UnitDrivenPhone

One interesting thing is that all I had to do to support this was link files from the Silverlight version of UnitDriven into a new Phone project. It all compiled and ran on the first try.

However, even though it ran it wasn’t actually usable. I had to create new versions of the Views to accomodate the smaller screen size and default layout differences (buttons are relatively bigger for example). Also the scroll bars are only visible while scrolling and you have to click and drag on a Circle or the text to actually do the scrolling.

Please feel free to comment on the UnitDriven forums if you have any comments or questions!