This Week in D February 14, 2016

Welcome to This Week in D! Each week, we'll summarize what's been going on in the D community and write brief advice columns to help you get the most out of the D Programming Language.

The D Programming Language is a general purpose programming language that offers modern convenience, modeling power, and native efficiency with a familiar C-style syntax.

This Week in D has an RSS feed.

This Week in D is edited by Adam D. Ruppe. Contact me with any questions, comments, or contributions.

Statistics

In the community

Community announcements

See more at the announce forum.

Tip of the Week

This week, I want to explain away some confusion about string mixins.

A string mixin injects generated code into your program at compile time, typically with the help of some string-generating code. It is invoked with the mixin keyword followed by a parenthesized expression that yields a string:

void foo() {
   int a, b;
   int c = mixin("a + b"); // same as if you wrote: int c = (a+b);
}

A static string is of limited value, but with compile-time function evaluation (CTFE), you can generate interesting code:

string generateCode(string name, string[] members) {
    string code;
    code ~= "struct "~name~" {\n";
    foreach(member; members)
        code ~= "\tint " ~ member ~ ";\n";
    code ~= "}\n";
    return code;
}

mixin(generateCode("Test", ["a", "b"])); // injects code:
/*
   struct Test {
       int a;
       int b;
   }
*/
// as if you wrote that yourself
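
Since generateCode is an ordinary function, one way to check what it produces is to print the string while compiling and then assert on the resulting type. This is only a sketch that reuses the generateCode function from above; pragma(msg) and static assert are standard D features, and the main function is added purely for illustration.

```d
string generateCode(string name, string[] members) {
    string code;
    code ~= "struct " ~ name ~ " {\n";
    foreach (member; members)
        code ~= "\tint " ~ member ~ ";\n";
    code ~= "}\n";
    return code;
}

// Print the generated source during compilation, handy for debugging.
pragma(msg, generateCode("Test", ["a", "b"]));

// Inject the generated struct declaration.
mixin(generateCode("Test", ["a", "b"]));

// From here on, Test is an ordinary struct with two fields.
static assert(is(Test == struct));
static assert(Test.tupleof.length == 2);

void main() {
    Test t;
    t.a = 1;
    t.b = 2;
    assert(t.a + t.b == 3);
}
```

The pragma(msg) line costs nothing at run time, so it can be dropped in temporarily whenever a generated string does not compile the way you expected.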

OK, so we know what it is, but what is it not?

A string mixin is NOT source code injection. Consider the following:

    int[] arr = [mixin("1, 2, 3")];
    // versus
    int[] arr = [1, 2, 3];

If it were source code injection, you'd expect those two lines to be equivalent, but it isn't, and they aren't. The first array will have only one element! What's going on here?

The spec says: "The AssignExpression must evaluate at compile time to a constant string. The text contents of the string must be compilable as a valid Expression, and is compiled as such." This is compiler-writer speak that basically says it is treated as if you wrote:

    CodeNode node = compile_this_source("the string you are mixing in");
    node.injectHere();

or a bit less abstractly, think of the result if you ran:

    auto result = the string you are mixing in; // this is the AssignExpression magic
    // now replace the mixin() expression with the variable `result`

That's not what really happens, but would give a similar result.

See, the code inside the string is compiled separately from its context. In the array example, the mixin doesn't know it is inside [] brackets. It just sees 1, 2, 3 and compiles it as stand-alone code, giving a result as if it was the code on the right-hand side of the equals sign in an assignment.

1,2,3, in that stand-alone context, is compiled as a CommaExpression - yes, the dreaded comma operator, which evaluates the first part, discards its result, and moves on to the next part. It is rarely used; you mostly see it in C-style for loops and a handful of other places where you'd probably use semicolon-separated expression statements if the grammar allowed it.

So standing alone, 1,2,3 simply yields 3, which is then inserted into the brackets, giving [3] in that example, a single element array, not [1, 2, 3].

Source code injection, on the other hand, would be parsed differently because commas can mean other things in other contexts. In an array literal, you don't see comma expressions (unless they are in parentheses): commas instead separate elements. In a function call, they separate arguments to the function. These different meanings of the comma require the context to be known to the parser, and in a mixin, unless the context is itself part of the mixin string, it just isn't known.
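
The same effect can be seen with plain arithmetic, which is a convenient way to try it out because newer D compilers may reject code that uses the value of a comma expression. The sketch below is not from the spec; the add function and the variable names are made up for illustration. It shows a mixin being compiled as one complete expression, and then shows the context being supplied inside the string itself.

```d
// Hypothetical helper, only here so the mixin has something to call.
int add(int x, int y) { return x + y; }

void main() {
    // The string is compiled on its own as one expression, so this
    // behaves like (1 + 2) * 3, not like the pasted text 1 + 2 * 3.
    int viaMixin = mixin("1 + 2") * 3;
    int written = 1 + 2 * 3;
    assert(viaMixin == 9);
    assert(written == 7);

    // With the brackets inside the string, the parser knows it is in an
    // array literal, so the commas separate elements.
    int[] arr = mixin("[1, 2, 3]");
    assert(arr.length == 3);

    // Likewise, commas inside a call written in the string separate arguments.
    int sum = mixin("add(1, 2)");
    assert(sum == 3);
}
```

A C-style textual macro expanding to 1 + 2 would give 7 in the first case, which is exactly the difference between pasting text and inserting an already-parsed expression.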

String mixins are actually closer to AST macros than they are to textual macros. They return a parsed block of code which is inserted into the compiler's internal data structures, not a string which is pasted into your source.

String mixins are also closer to eval in dynamic languages than they are to code pasting. eval also fires up a separate copy of the interpreter to run the code string independently of lexical context (albeit using the context of existing dynamic variables, etc.) rather than just pasting a string into the source. Of course, the huge difference between mixin and eval is that mixin doesn't actually *run* the code. It just *compiles* it, and it is a wholly compile-time construct. eval, in languages like PHP or JavaScript, on the other hand, does actually run the code.
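
To make the compile-versus-run distinction concrete, here is a small sketch. The function names twice, binary, and apply are invented for this example. The strings are all known while compiling; the code they turn into runs later, like any other code in the function.

```d
// The string is compiled once, right here, as if the body were: return x * 2;
// The multiplication itself happens at run time, on whatever x is passed in.
int twice(int x) {
    return mixin("x * 2");
}

// Hypothetical helper evaluated through CTFE to build the expression text.
string binary(string lhs, string op, string rhs) {
    return lhs ~ " " ~ op ~ " " ~ rhs;
}

// op is a template argument, so it is known at compile time and
// each instantiation gets its own compiled expression.
int apply(string op)(int a, int b) {
    return mixin(binary("a", op, "b"));
}

void main() {
    assert(twice(21) == 42);
    assert(apply!"+"(2, 3) == 5);
    assert(apply!"*"(2, 3) == 6);

    // Unlike eval, a string that only exists at run time cannot be mixed in:
    //     string s = readUserInput(); // hypothetical run-time source
    //     int r = mixin(s);           // rejected: s is not a compile-time constant
}
```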


I hope I didn't confuse you even more than before. But really, just remember that a string mixin is NOT pasting in the code and you will get some helpful insights.

Next week, I may talk about some tips to keep your mixin strings as small as possible, because they can get ugly quickly...

Learn more about D

To learn more about D and what's happening in D: