I’ve talked quite a bit about domain-specific languages (DSLs) in other articles, and I freely admit that I am very excited about the concept at the moment, but I’ve not found much on the web that really gets to the nub of why they might be important to the future of software development.
There’s a pretty good article in the Linux Journal and Martin Fowler’s post covers the basics, but nearly all the others whiz through the subject, saying little more than that Yacc and VBA in Excel are good enough examples for us to be satisfied with.
Well.. yes.. but I think there’s a fundamentally deeper concept that can change the way you work on software, and with it change your chances of success, agility, enjoyment and all manner of good things. So here is my token contribution to the subject.
A domain-specific language is really no more than a semantic source-code contrivance geared toward making a specific problem set (the domain) easier to solve. If the language available in Microsoft Excel were COBOL, it would be pretty hard to customise your spreadsheets, so instead you get something tailored to the environment and the kind of things you are likely to want to do.
That’s the bit that’s been said many times before.
But I think to really get DSLs, you have to recognise that what they do is to mimic how human beings talk to each other in particular circumstances. We’re used to this because we inhabit a ‘human domain’, with a vastly tailorable language, creating jargons, creoles, pidgins, patois, vernaculars and argots as our business, community, audience, locality, class or culture demands. A DSL is simply derived from the wider norms (such as they are) and made useful in a unique environment.
In a sense, all code you write leans towards becoming a kind of a DSL, because you’ll very quickly start knocking up useful little utilities, functions, methods etc to make your life easy. But I think DSLs as a recognised concept are important because they formalise and move some of this thought process up front, making future development faster.
Certain properties of the underlying language help to give a fluidity to make this derivation easier. If you understand these, not only can you choose a better language to write your own DSL in, you will also be able to write (or extend) a better DSL. These features are:
Noun/Verb Separation
Your language will perform a host of domain-specific operations (verbs) on your domain-specific nouns (data objects). Verbs are to nouns in a DSL pretty much what methods are to instances in classic object-orientation, except that in OO the methods operate on their own object’s public and private data elements. In a DSL, verbs live their lives largely unconnected with objects so that, in theory at least, a verb can operate on any noun (of course, you have to code for that eventuality, but it does make future use much more fluid than having to deal with the intricacies of inheritance).
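As a rough sketch of that separation in plain Ruby, the verbs can live in a module of their own and accept whatever noun they are handed. The Article and Comment nouns and the describe and publish verbs are made up for illustration and do not come from any library.

```ruby
# Hypothetical nouns: plain data objects with no behaviour of their own.
Article = Struct.new(:title)
Comment = Struct.new(:author)

# Hypothetical verbs: free-standing methods, not tied to any one class.
module Verbs
  module_function

  # "Coding for that eventuality": work out what kind of noun we were
  # handed by asking what it responds to, with a fallback for anything else.
  def describe(noun)
    if noun.respond_to?(:title)
      "article '#{noun.title}'"
    elsif noun.respond_to?(:author)
      "comment by #{noun.author}"
    else
      noun.to_s
    end
  end

  # A second verb built on the first; it never needs to know the noun's class.
  def publish(noun)
    "Published #{describe(noun)}"
  end
end

puts Verbs.publish(Article.new("Why DSLs matter"))
puts Verbs.publish(Comment.new("Sam"))
```

Neither noun inherits from anything, yet both can be published, and a third noun only needs to answer to title or author to join in.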
Logic/Data Separate and Connected
But sometimes you want your data and the logic associated with it conjoined. A DSL noun is also a regular object with methods. So, for example, if you have a noun that represents an event, it’s an object that holds time values privately, with methods to represent those times in various forms (local time, time event occurred, time event noted, etc). A DSL allows data to be separate from logic where it needs to be, and bound to it where it doesn’t.
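The event noun described above might look something like this in Ruby. It is only a sketch: the class name, the method names and the choice of UTC formatting are my assumptions for the example.

```ruby
# Hypothetical event noun: the raw times are held privately and only
# exposed through methods that present them in a particular form.
class Event
  FORMAT = "%Y-%m-%d %H:%M:%S"

  def initialize(occurred_at, noted_at)
    @occurred_at = occurred_at
    @noted_at = noted_at
  end

  # When the event actually happened, shown in UTC.
  def time_occurred
    @occurred_at.getutc.strftime(FORMAT)
  end

  # When the event was recorded, shown in UTC.
  def time_noted
    @noted_at.getutc.strftime(FORMAT)
  end

  # The time of the event in the local time zone of the machine running this.
  def local_time
    @occurred_at.getlocal.strftime(FORMAT)
  end

  # Seconds that passed between the event happening and being noted.
  def delay
    @noted_at - @occurred_at
  end
end

event = Event.new(Time.utc(2024, 1, 1, 12, 0, 0), Time.utc(2024, 1, 1, 12, 1, 30))
puts event.time_occurred
puts event.delay
```

Here the logic that belongs with the data (formatting, measuring the delay) stays bound to the noun, while verbs elsewhere remain free to act on it.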
Weak typing
Mother-tongue languages used for generating domain-specific languages are often better weakly (or at least dynamically) typed, rather than strongly typed. This helps when verbs need to act on multiple types of items but won’t know beforehand what those items are. Instead, they can react dynamically to those types at run time, or run code bound to the noun object to perform the task, rather than operate the equivalent of a massive case statement.
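To illustrate the "run code bound to the noun" option, here is a minimal Ruby sketch. The NewsItem and Quote nouns and the summarise verb are hypothetical names chosen for the example.

```ruby
# Two unrelated hypothetical nouns that happen to answer to the same message.
class NewsItem
  def initialize(headline)
    @headline = headline
  end

  def summary
    "News: #{@headline}"
  end
end

class Quote
  def initialize(speaker)
    @speaker = speaker
  end

  def summary
    "Quote from #{@speaker}"
  end
end

# The verb declares no types and contains no case statement; it simply
# trusts the noun to know how to summarise itself at run time.
def summarise(noun)
  noun.summary
end

puts summarise(NewsItem.new("DSLs are everywhere"))
puts summarise(Quote.new("Ada"))
```

Adding a new noun means writing one summary method on it; the verb itself does not change.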
Loose syntax
DSLs tend to have a rather loose syntax, not forcing their users into overuse of brackets and semi-colons. I would concede that this feature only really makes the source code look like it’s being spoken by a human, but no doubt there’s a minor benefit in maintenance if the code is a little bit easier to read. The danger is of course that if syntactic demands are low, the scope for variations between programmers is great (kind of the Perl mantra that there are lots of ways to do everything).
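Ruby gives a feel for this, since parentheses around arguments are optional. In the small sketch below, the remind verb is hypothetical; the point is only that the same call can be written strictly or loosely.

```ruby
# Hypothetical verb taking one plain argument and two named ones.
def remind(who, about:, at:)
  "#{who}: #{about} at #{at}"
end

# Strict form, with brackets.
strict = remind("editor", about: "deadline", at: "17:00")

# Loose form, which reads closer to a spoken instruction.
loose = remind "editor", about: "deadline", at: "17:00"

puts strict == loose
```

Both forms do exactly the same thing, which is also the danger: two programmers can write the same instruction in visibly different ways.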
Interpreted, not compiled
DSLs also tend to be interpreted rather than compiled. I used to feel uncomfortable with saying this was even a feature, but honestly, it may just be one of the things that makes a DSL a DSL. If a language is interpreted on the fly then it can create more of itself as it goes. This is one of the features of Lisp, and to some extent Ruby (in that both are largely written in themselves). Having a language be able to write and execute itself can get enormously powerful very quickly, and can lead to complex ideas being carried out by quite small chunks of code. The specifics of this are for another day though, as it needs an article all to itself.
Rails is an excellent example of a DSL for the domain of database-driven web site creation, with Ruby being the mother tongue. If you want to display a link to another page on your site in Rails you just type:
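As a small taste of a language creating more of itself, the Ruby sketch below generates one method per verb at run time with define_method instead of having each written out by hand. EventLog and its verb names are hypothetical.

```ruby
# Hypothetical log whose verbs are generated while the class is being read.
class EventLog
  attr_reader :entries

  def initialize
    @entries = []
  end

  # For each name in the list, define a method of that name at run time.
  %w[started paused stopped].each do |verb|
    define_method(verb) do |what|
      @entries << "#{what} #{verb}"
      self # returning self lets calls be chained like a sentence
    end
  end
end

log = EventLog.new
log.started("timer").paused("timer")
puts log.entries.inspect
```

Extending the vocabulary is a matter of adding a word to the list, not writing another method.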
link_to <link_text>, :controller => <controller_name>, :action => <action_name>
which makes for a much less brittle instruction than embedding the actual HTML itself with an <a href=/<controller_name>/<action_name>..
But, in creating a database-driven web site in Rails, your domain is actually a bit narrower than the one Rails is aimed at. For example, a personal blog site has demands over and above that met by the Rails package. In addition to just ‘link’ being a noun in the vocabulary you also need to think in terms of ‘articles’ and ‘archives’ and ‘comments’.
You could wrap the Rails link_to helper in a verb-plus-noun helper of your own, like link_to_article, which can handle any commonly used specifics that apply to your own brand of links. Then, whenever you decide that, say, the class="blah" attribute of the anchor tag needs to change, you can do it in one place by passing in a different :class => "<class_name>" parameter.
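A sketch of such a wrapper follows. So that it runs without Rails, it includes a crude stand-in for link_to that only mimics the call shape shown earlier; it is not the real Rails implementation, and the module name, the "articles" controller and the ARTICLE_LINK_CLASS constant are my assumptions.

```ruby
module BlogHelpers
  module_function

  # The one place to change when article links need a different class.
  ARTICLE_LINK_CLASS = "blah"

  # Stand-in for Rails' link_to, NOT the real thing: it builds an anchor
  # tag from the same kind of options so the example is self-contained.
  def link_to(text, options = {})
    css = options[:class] ? %( class="#{options[:class]}") : ""
    %(<a href="/#{options[:controller]}/#{options[:action]}"#{css}>#{text}</a>)
  end

  # Hypothetical verb-plus-noun helper wrapping link_to for articles.
  def link_to_article(text, action)
    link_to text, :controller => "articles", :action => action,
                  :class => ARTICLE_LINK_CLASS
  end
end

puts BlogHelpers.link_to_article("Read more", "show")
```

Every view that calls link_to_article picks up a new class name the moment the constant changes.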
You can already see that building a DSL is a very neat way to embody DRY principles into your code without having to think about it all the time.
Clearly, this example is overly trivial; in reality it takes a good deal of effort to design a DSL that supports DRY, appropriate granularity and proper reuse.
It can all get a bit sticky if you get it badly wrong. As, in fact, I just did :)
If you used a link_to_article construct for all your postings, but then wanted to have a different kind of posting with a different display characteristic, you would have something of a dilemma: either extend link_to_article to take optional parameters indicating the new type (whilst setting a default to keep all the pre-existing article code working), or add a link_to_other_thing helper, which could start to introduce repetition.
Looking back at the noun/verb separation thing, you can see that article and other_thing are actually two nouns, and both are acted on by the link verb. So here we have identified a possible area of future change (remember DSLs are all about making that easier): we are very likely to need link_to_yet_another_thing in the future (news, quotes, blogrolls).
Given this, it makes sense to add a touch of inversion of control to the helper, and pass in the class of object we need to link to. That way, the instruction stays as something akin to the original link_to (i.e. a pure verb) all through the code, and we need change only the internals of link_to to support new classes of noun.
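One way this could look, sketched in plain Ruby rather than real Rails code: the verb receives the noun object and looks up how to link to its class in a table kept inside the helper. BlogPost, NewsFlash, LinkVerbs and the LINK_RULES table are all hypothetical names.

```ruby
# Hypothetical nouns; they carry data only.
BlogPost  = Struct.new(:id, :title)
NewsFlash = Struct.new(:id, :title)

module LinkVerbs
  module_function

  # The internals of the verb: one entry per class of noun it supports.
  LINK_RULES = {
    BlogPost  => { :controller => "articles", :class => "article" },
    NewsFlash => { :controller => "news",     :class => "news" }
  }

  # A pure verb: callers pass the noun and never say how it is linked.
  def link_to(noun)
    rule = LINK_RULES.fetch(noun.class) # raises KeyError for unknown nouns
    href = "/#{rule[:controller]}/show/#{noun.id}"
    %(<a href="#{href}" class="#{rule[:class]}">#{noun.title}</a>)
  end
end

puts LinkVerbs.link_to(BlogPost.new(7, "Why DSLs"))
puts LinkVerbs.link_to(NewsFlash.new(3, "Site relaunch"))
```

Supporting quotes or blogrolls later would mean adding a row to the table; none of the calling code would need to change.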
I’m currently designing a DSL to manage real-time events (hence my earlier example), and making notes on my copious mistakes to publish at a later date. What has been a real revelation to me is that once you have even a semi-functional domain-specific language, the most complex of ideas become surprisingly easy to realise. As in English, once I have a collection of nouns and verbs (and more subtly, adverbs and adjectives) I can conjure up new and complex sentences at will that would hitherto never have occurred to me - and yet they would be understood by anyone familiar with the grammar that defines the language.
I also happen to believe that, despite the emphasis on the ‘up front’ aspects of DSLs, they are essential to make Agile work. For a start there isn’t that much up front to do once you understand the building blocks. Any mistakes I have made I can see are down to the fact that my coding brain was allowed to get too rusty after being concerned with management claptrap for too long. Secondly a DSL is surprisingly stable once you start to get it right - as in fact spoken language is. So, when your customer is chopping and changing their mind all over the place, it’s great fun to freak them out by serenely smiling back at them and saying ‘oh yes, I’ll have that ready to show you in a few minutes’.