Programming and Scaling

tele-TASK Video: Programming and Scaling.

If you’ve heard me talk about DSLs but just haven’t quite been sold on the idea yet, watch this video. In fact, watch it anyway. Dr. Alan Kay gives a very inspirational and interesting speech about the past, present and future of Computer Science, technological innovation and creativity. The grand finale ties all of his ideas together in a beautiful example of the power of domain specific languages.

I found myself nodding throughout this entire presentation, and even though I didn’t know where it was going I could see how it applied to my own personal research in meta#. Thank you Dr. Kay, I may never need to spend my time explaining the whys of DSLs again; I will simply point people to this presentation.

Does process kill developer passion?

http://radar.oreilly.com/2011/05/process-kills-developer-passion.html

In general I agree with most of what he says, such as:

In short, you’re spending a lot of your time on process, and less and less
actually coding the applications.

… having to shoehorn in shims to make unit tests work has reduced the
readability of the code.

Disaffected programmers write poor code, and poor code makes management add
more process in an attempt to “make” their programmers write good code. That
just makes morale worse, and so on.

The blind application of process best practices across all development is
turning what should be a creative process into chartered accountancy with a side
of prison.

And as an aside, if you’re going to say you’re practicing agile development,
then practice agile development! A project where you decide before you
start a product cycle the features that must be in the product, the ship date,
and the assigned resources is a waterfall project.

However, I strongly disagree with this:

But, for example, maybe junior (or specialized) developers should be writing
the unit tests, leaving the more seasoned developers free to concentrate on the
actual implementation of the application.

But I would like to say that I really do love TDD. As I work on this new version of MetaSharp I am driving it with tests as best I can. Tests are critical for verifying that your code is actually correct and that some new feature doesn’t break something you have already done. That being said, TDD is really only useful in a project where you already know where you are going.

When I first started MetaSharp there was a lot of experimentation and plenty of dead ends, and when I wrote a lot of tests it was mostly just wasted effort to undo and redo them; it was a pain in the ass, frankly. But after a lot of prototyping and experimenting I finally decided that I knew where I wanted to be and started over. In this new iteration I have been writing as many tests as I can without slowing my momentum down too much, and the thing is, when you know the domain and where you need to end up, the tests do not slow you down at all. It’s excellent in that scenario.

So if you go into a coding phase with a prototyping mentality then, meh, maybe TDD is more of a hindrance, but seriously, be prepared to throw that code away: without extensive tests the quality just won’t be high enough, and it’s not a foundation to build too much on top of.

But TDD is really only part of the story. I’m not in a position to dictate the development process on my team at work, so I’m trying to analyze it and figure out what I like and don’t like, because you can learn about as much from what doesn’t work as from what does. I feel like we’re essentially waterfall even though we use scrum terminology; there are some things managers just can’t live without!

If I had it my way, if I had developers working under me, I would see my role as essentially a buffer. I would be as much of a dev as I could manage, but the difference would be that I would also buffer my devs from unnecessary meetings from higher up. I would be the one to gather information from external groups and filter down what is important to whom. I would gather my knowledge of their progress by asking them in person, one at a time, and by being active in the daily process of work items. I would encourage them to communicate directly and continually with each other rather than set up a mandatory scrum meeting. To me scrum is like a “spray and pray” information dispersal method: it’s a waste of time for most people in the room every time someone else is speaking. I would encourage pair programming, and I would be the one, as much as possible, to maintain the database of work items and keep the devs’ pipelines full.

Also, integration sprints? A waste of time; continuous integration should fix that. Planning sprints? The dev lead should be doing that continuously. At some point bugs end up at the top of the developers’ pipelines simply because of their importance, and therefore you are continuously fixing them rather than dedicating one period of time to bugs and another to features. In fact the whole idea of a sprint seems arbitrary to me. Just always be working on what’s next. At each step, with each feature, your application should be working. Bugs simply coming to the top of the pipeline is basically equivalent to an integration sprint in my mind.

Code reviews? I despise gated commits. Code reviews should be done post-commit: the dev manager should just look at the list of commits and start reviewing them. Peers shouldn’t really need to do formal code reviews because they should be in constant communication with the people they are working closely with. If there is a smell in something that was committed, then start a discussion; there is no reason it couldn’t be fixed post-commit. I’m assuming we’re using good source control tools that allow us to merge and branch confidently.

I could go on and on but I’m still working out these ideas, I have never really said these thoughts out loud before, except perhaps over a pint of beer with a friend.

Computation is Pattern Matching

This is my favorite single slide from the emerging languages conference.

[Image: the “pick two” slide]

I think this sums up quite nicely the struggle developers today regularly encounter. This image comes from Jonathan Edwards’ Coherence/Subtext talk slides. He went on to propose some very interesting ideas on how to solve this problem by managing pointers and memory in a new scheme. I’m not really able to comment on his recommendations because I didn’t fully understand them. But what I would like to propose is that there might actually be another way to solve this same problem.

I believe that Pattern Calculus (or Pattern Matching) is a unification of these three problems. It’s a superset of lambda calculus and data structures.


Pattern Calculus: Computing with Functions and Structures
Barry Jay
Springer, 2009. XVII, 213 p., 58 illus. Hardcover.
ISBN: 978-3-540-89184-0

I have yet to read this entire book, but the foreword and introduction alone provide some deep gems, such as:

This book develops a new programming style, based on pattern matching, from pure calculus to typed calculus to programming language. It can be viewed as a sober technical development whose worth will be assessed in time by the programming community. However, it actually makes a far grander claim, that *the pattern matching style subsumes the other main styles within it*. This is possible because it is the first to fully resolve the tension between functions and data structures that has limited expressive power till now. This introduction lays out the general argument, and then surveys the contents of the book, at the level of the parts, chapters and results.

The pattern calculus is a new foundation for computation, in which the expressive power of functions and of data structures are fruitfully combined within pattern matching functions. The best of existing foundations focus on either functions (in the λ-calculus) or on data structures (in Turing machines) or compromise on both (as in object-orientation). By contrast, a small typed pattern calculus supports *all the main programming styles, including functional, imperative, object-oriented and query-based styles*.

The pattern calculus is the result of a profound re-examination of a 50-year development. It attempts to provide a unifying approach, bridging the gaps between different programming styles and paradigms according to a new slogan – *computation is pattern matching*.

It is surprising how this elementary principle allows one to uniformly and elegantly cover the various programming paradigms, not only concerning execution but also typing, which itself is also realized following the idea of pattern matching.

(emphasis mine)

Code itself can be thought of as data when viewed from inside a compiler. A compiler can transform that data into something executable, and this process can be repeated any number of times. Alessandro Warth, the author of OMeta, contributes to this idea in his PhD dissertation:

OMeta’s key insight is the realization that all of the passes in a traditional compiler are essentially pattern matching operations:

  • a lexical analyzer finds patterns in a stream of characters to produce a stream of
    tokens;
  • a parser matches a stream of tokens against a grammar (which itself is a collection
    of productions, or patterns) to produce abstract syntax trees (ASTs);
  • a typechecker pattern-matches on ASTs to produce ASTs annotated with types;
  • more generally, visitors pattern-match on ASTs to produce other ASTs;
  • finally, a (naive) code generator pattern-matches on ASTs to produce code.
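
To make this concrete, here is a minimal sketch of the last two bullets (my own illustration with hypothetical node types, not code from OMeta or MetaSharp): a naive visitor that pattern-matches on a toy AST and projects Add(Num, Num) nodes into Num nodes.

// Toy AST for illustration only.
abstract class Expr { }
class Num : Expr { public int Value; public Num(int v) { Value = v; } }
class Add : Expr { public Expr Left, Right; public Add(Expr l, Expr r) { Left = l; Right = r; } }

static class Folder
{
    // One compiler tier: match a pattern in the structure, project a new structure.
    public static Expr Fold(Expr e)
    {
        Add add = e as Add;
        if (add == null)
            return e;                            // leaves pass through untouched

        Expr left = Fold(add.Left);
        Expr right = Fold(add.Right);

        Num ln = left as Num, rn = right as Num;
        if (ln != null && rn != null)
            return new Num(ln.Value + rn.Value); // pattern matched: rewrite
        return new Add(left, right);             // no match: rebuild as-is
    }
}

Folding Add(Num(1), Add(Num(2), Num(3))) yields Num(6). In this view a lexer, parser, typechecker or code generator is just more functions of this same shape: match one structure, project another.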

And I agree with these guys. I have been working on MetaSharp for a while now, and on some of my more ponderous days I have seen to the bottom of the rabbit hole and come to the conclusion that they are right. I’m able to see a world where an application is based entirely upon pattern matching principles. Starting with a core of patterns you can represent and transform to and from any design pattern imaginable, including those needed to create traditional compilers. Essentially this gives you the ability to author an entire application as tiers of patterns and projections.

I would like to contribute to the area of pattern matching by talking about the promise of scalability. In traditional general purpose programming, as your application grows larger it becomes more and more difficult to maintain. The sheer volume of code makes broad changes infeasible even with refactoring tools, and the application becomes harder and harder to grasp in its entirety. Also, design patterns implied in the code become harder to enforce as developers encounter new areas for the first time. Many challenges arise as an application grows.

Roman Ivantsov gave an interesting talk on ERP software and the scalability challenges it faces at the 2008 Lang.NET Symposium. Basically, ERP programs are humongous applications: complex, costly and risky. What I took away from Roman’s talk, however, is that we need a new approach to address applications of this scale. We cannot develop these applications purely in general purpose languages and succeed.

And again, I believe him. I think he is correct. As applications get bigger they can actually take so long to develop that they are obsolete before they are actually done. This is a problem.

Maintenance cost in volume, for traditional applications:

[Image: scalability diagram, a pyramid]

Which is to say, as your application becomes larger and you add more and more tiers, the complexity also grows; the cost is shaped like a pyramid. There are many things that go into maintainability, but in a general sense it might be said that maintainability is a function of code size and complexity.

Pattern matching offers an alternative to this traditional conundrum. Its maintainability illustration is more columnar.

Maintenance cost in volume, for pattern matching based applications:

[Image: scalability diagram for pattern matching, a column]

Which is to say, at each tier of an application the complexity could be about the same as at the previous tier, thus increasing the functionality of an application for the same volume (cost) of maintainability. Interestingly enough, for small applications, or applications with only a single tier, the cost is probably higher. But as you grow you begin to reap the benefits.

This is currently a hypothesis of mine; it has yet to be proven. Also, given the state of pattern matching as it exists today, the above promises definitely do not hold yet. However, I don’t think it needs to be this way. I think it is possible to solve this problem and usher in a new era of software development paradigms.

Silverlight 4 Runs Natively in .NET

This is a big deal.

I just learned this the other day and found surprisingly little information about it online. It was announced at PDC so it’s not a secret but it seems like a big deal to me. The feature is otherwise known as “assembly portability”.

When doing Silverlight development, one of the most frustrating things currently is the inability to run unit tests out of the browser. This results in a break in continuous integration and a frustrating manual step in your testing process. Well, no more!

This is only the beginning of the implications, however. If you’re writing any application with a business logic layer, you should probably be writing it in Silverlight exclusively now. No more duplicating projects and creating linked file references. You can literally just create one project, compile it, and run it in both runtimes. Incredible.

Of course there are some limitations. But in this case I almost feel like the limitations are actually benefits. The thing is, you are likely to have features in one environment that are different or inaccessible in another. For example, file system access: in .NET you can easily access the file system, but you might not be able to access it directly in Silverlight in the same way.

Enter System.ComponentModel.Composition, otherwise known as MEF. By making your application logic composable you can solve all of the problems of framework differences and make your project eminently unit-test friendly and better in general (if you buy into the principles of IoC, at least).

For example, in Silverlight you cannot get a FileStream directly; you must make a call to OpenFileDialog, which will give you the FileStream if the user allows it. This is all well and good, but when running in .NET or in unit tests you may want to access the file system directly or supply a mock stream instead. The solution is to make your calls to retrieve streams composable (otherwise known as dependency injection): create a service interface, and create service implementations for different environments.

For example, suppose you have the following (contrived) method in a Silverlight class:

public void DoWork()
{
    var dto = new SimpleDTO { Id = 100, Foo = "Hello World!" };

    // Round-trip the DTO through whatever stream the environment provides.
    Stream stream = Open();
    Save(stream, dto);

    stream = Open();
    dto = Load<SimpleDTO>(stream);

    Console.WriteLine("{0} : {1}", dto.Id, dto.Foo);
}

The process of opening a stream in Silverlight is different from the way it must be done when running in .NET, so we make it composable.

public Stream Open()
{
    // Resolve whichever IStateService implementation the current runtime exported.
    var stateService = container.GetExport<IStateService>().Value;
    stateService.State.Seek(0, SeekOrigin.Begin);
    return stateService.State;
}

Instead of opening the stream ourselves we import, via MEF, an exported service that does know how to do it. To do this we simply need access to a container.

private CompositionContainer container;

public ComposableObject(CompositionContainer container)
{
    this.container = container;
}

Our constructor accepts a CompositionContainer as a parameter, which gives us access to all of the composable parts configured for our runtime. Keep in mind this is all Silverlight code at this point. And here is the IStateService.

public interface IStateService : IDisposable
{
    Stream State { get; }
}

The following code snippets are plain-old-dot-net-console-application snippets. First off, here is a snapshot of my solution explorer so you can see how things are structured.

[Image: Solution Explorer, with the Console Application project referencing the Silverlight 4 Class Library project]

You can see that I have created a reference from a Console Application project directly to a Silverlight 4 Class Library project. Visual Studio gives me a yellow banger, presumably because of the framework differences but it builds just fine. My program loads and calls my Silverlight library just this easily:

using System;
using System.ComponentModel.Composition.Hosting;
using SilverlightClassLibrary1;

class Program
{
    static void Main(string[] args)
    {
        // Build a container from the parts exported by this assembly.
        var catalog = new AssemblyCatalog(typeof(Program).Assembly);
        using (var container = new CompositionContainer(catalog))
        {
            var co = new ComposableObject(container);
            co.DoWork();
        }

        Console.ReadKey(true);
    }
}

The bits at the beginning create a CompositionContainer from a catalog of the current assembly. MEF allows you to load containers from all sorts of sources, however, including entire directories full of assemblies, so you can have an easy add-in system. The ComposableObject is the one defined in my Silverlight assembly! No interop nastiness, no AppDomain hassles; it just loads the dll as if it were true .NET code!
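
For instance (a sketch of my own, and the add-in path is made up), a DirectoryCatalog can be aggregated with the assembly catalog so that any assembly dropped into a folder contributes exports too:

// Combine this assembly's parts with any assemblies found in an add-in folder.
var catalog = new AggregateCatalog(
    new AssemblyCatalog(typeof(Program).Assembly),
    new DirectoryCatalog(@".\AddIns"));    // hypothetical add-in directory

using (var container = new CompositionContainer(catalog))
{
    var co = new ComposableObject(container);
    co.DoWork();
}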

Next, all I have to do is create an implementation of IStateService and export it.

[Export(typeof(IStateService))]
public class ParallelProcessingService : IStateService
{
    private Stream state;

    public Stream State
    {
        get
        {
            // The .NET-side implementation: lazily create a real file on disk.
            if (state == null)
                state = File.Create("state.dat");
            return state;
        }
    }

    public void Dispose()
    {
        if (state != null)
            state.Dispose();
    }
}

Now when I run this application, my Silverlight code will use MEF to load an exported IStateService instance for me. Running this code will then access the file system directly, even though I’m running a Silverlight class library.

So what you should do is create a Class Library with all of your logic, composed in a similar fashion to the above. Then in your Silverlight application you simply implement and export all of the Silverlight-specific code as services. You do the same for your unit testing in .NET projects, and you’ll be able to run the exact same assembly in both places.
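
For completeness, the Silverlight-side export might look something like this (a sketch under my assumptions, with a made-up class name, since Silverlight will only hand you a stream through OpenFileDialog):

[Export(typeof(IStateService))]
public class DialogStateService : IStateService
{
    private Stream state;

    public Stream State
    {
        get
        {
            if (state == null)
            {
                // Silverlight only grants file access in response to a user action.
                var dialog = new OpenFileDialog();
                if (dialog.ShowDialog() == true)
                    state = dialog.File.OpenRead();
            }
            return state;
        }
    }

    public void Dispose()
    {
        if (state != null)
            state.Dispose();
    }
}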

The bonus to this, of course, is that you’ll also be able to swap out logic that you actually want to be different in different locations. For example, if you’re creating a business application you could put all of your business logic into a single assembly that runs on both the client and the server. However, what that logic does and how it does it might differ between the two. You may need a database call to determine whether a particular value of your business object is unique: on the client you want to make an asynchronous web request back to the server, but on the server you want to call the database directly. Since it’s the same object and assembly in both locations, you achieve this by making the ValidateUnique rule itself composable; the object applies the rule the same way everywhere while the rule’s implementation varies.
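
A sketch of what I mean (the names here are hypothetical, not from a real project): the rule resolves an imported service, and the client and server assemblies each export their own implementation.

public interface IUniquenessService
{
    bool IsUnique(string propertyName, object value);
}

// Shared rule, compiled once, running on both client and server.
public class ValidateUniqueRule
{
    private CompositionContainer container;

    public ValidateUniqueRule(CompositionContainer container)
    {
        this.container = container;
    }

    public bool Validate(string propertyName, object value)
    {
        // The client's export would call back to the server; the server's
        // export would query the database directly. The rule doesn't care.
        var service = container.GetExport<IUniquenessService>().Value;
        return service.IsUnique(propertyName, value);
    }
}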

In fact this technique can be very pervasive and powerful. Running on multiple frameworks requires you to be composable, which may also inadvertently force you into some good practices in general.

One other thing to note: I had to set CopyLocal=True for some of my references in the Silverlight Class Library to get it to run correctly in .NET. Since those assemblies aren’t in the GAC by default, they won’t be loaded unless they tag along with your assembly.

[Image: reference properties showing CopyLocal=True]

I didn’t test this out myself, but you wouldn’t want those files appearing in the .xap file for your Silverlight application. I’m pretty sure it would be smart enough to exclude them, but double check.