ECMAScript 6 Template Strings

This blog post looks at the implementation of ES6 Template Strings in Mozilla’s SpiderMonkey JavaScript Engine.

Template Strings without Substitutions

The simplest form of a template string is

`hello`

Note the back-tick ( ` ) character.

That’s just plain boring. It is functionally equivalent to the string literal “hello”.

Something a bit more interesting is

`hello
there`

Note that we cannot split string literals into multiple lines in this manner. The simplest way of replicating this behavior using string literals is

"hello\n\
there"

which is just painful, especially as the number of lines grows. The first application of template strings, as illustrated above, is to provide support for elegant multi-line strings.

In SpiderMonkey, the behavior is very similar to that for string literals. The main change is in the tokenizer. When we hit an unescaped end-of-line character within a template string (line feed, ‘\n’ or U+000A; carriage return, ‘\r’ or U+000D; line separator, U+2028; or paragraph separator, U+2029), instead of yelling at the user, we just pass the character along. We also normalize line endings: a ‘\r’ or a ‘\r\n’ is replaced by a single ‘\n’.
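
The effect of this on the resulting string values can be checked from script. The snippet below is a small sketch, not part of the SpiderMonkey patch; it uses eval only as a convenient way to build source text that contains a specific raw line terminator.

```javascript
// A line break inside a template string becomes a "\n" in the value.
var multi = `hello
there`;
console.log(multi === "hello\nthere"); // true

// eval is used only to construct source text with a chosen line terminator.
var withCR = eval("`a\rb`");      // raw carriage return in the source
var withCRLF = eval("`a\r\nb`");  // raw CR LF pair in the source
var withLS = eval("`a\u2028b`");  // raw line separator in the source

console.log(withCR === "a\nb");     // true: CR is normalized to LF
console.log(withCRLF === "a\nb");   // true: CR LF becomes a single LF
console.log(withLS === "a\u2028b"); // true: U+2028 is passed along as-is
```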

Substitutions

If multi-line is all template strings can do, that’d be a pretty unambitious intern project, wouldn’t it? They are a lot cooler than that. We can do something like

var a = 5;
var b = 10;

`hey ${a + b} there` // hey 15 there

This is equivalent to doing

"hey " + (a + b) + " there"

Nice syntactic sugar. Note the ‘${’, which marks where the substitution starts, and the ‘}’, which marks where it ends. The literals on either side of a substitution can be empty, and a single template string can contain multiple substitutions. Here are some examples to illustrate what’s possible:

`${"hey"} ${"there"} what are ${"you"} doing`
  // hey there what are you doing

`hey ${`there ${4} are`} you`
  // hey there 4 are you
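
One caveat on the equivalence with concatenation: it holds for primitive values, but a substitution converts its value with ToString, while ‘+’ goes through the default primitive conversion. For an ordinary object that defines both toString and valueOf, the two can disagree. The object named odd below is made up purely for illustration.

```javascript
var a = 5;
var b = 10;

var viaTemplate = `hey ${a + b} there`;
var viaConcat = "hey " + (a + b) + " there";
console.log(viaTemplate === viaConcat); // true

// Hypothetical object whose toString and valueOf return different strings.
var odd = {
    toString: function () { return "from toString"; },
    valueOf: function () { return "from valueOf"; }
};

var oddTemplate = `${odd}`; // substitution prefers toString
var oddConcat = "" + odd;   // "+" prefers valueOf
console.log(oddTemplate);   // from toString
console.log(oddConcat);     // from valueOf
```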

The tokenizer in SpiderMonkey chugs along as before. Within the template string, when it sees a ‘${’, it returns the string before the ‘${’ as a token. The parser then parses an expression and expects a ‘}’ after it. After that, it expects the template string to continue. More substitutions may follow in the same way before the closing ‘`’ terminates the template string. Nesting is automatically taken care of, as the expression within can be another template string.
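
To make the splitting concrete, here is a toy scanner written in JavaScript. It is not SpiderMonkey’s tokenizer (which is C++ and hands the expression to the real parser); splitTemplate is a hypothetical helper that only counts braces, so it assumes the expressions contain no string literals, comments, or nested template strings with braces in them.

```javascript
// Toy sketch: split the body of a template string (the text between the
// back-ticks) into literal chunks and the source text of each substitution.
function splitTemplate(body) {
    var literals = [];
    var expressions = [];
    var current = "";
    var i = 0;
    while (i < body.length) {
        var ch = body[i];
        if (ch === "\\") {
            // Keep an escape sequence together so "\${" is not a substitution.
            current += body.slice(i, i + 2);
            i += 2;
        } else if (ch === "$" && body[i + 1] === "{") {
            literals.push(current); // the string before the "${"
            current = "";
            i += 2;
            var start = i;
            var depth = 1;
            // Assumption: braces inside the expression are balanced.
            while (i < body.length && depth > 0) {
                if (body[i] === "{") depth++;
                else if (body[i] === "}") depth--;
                i++;
            }
            if (depth !== 0) throw new SyntaxError("unterminated substitution");
            expressions.push(body.slice(start, i - 1));
        } else {
            current += ch;
            i++;
        }
    }
    literals.push(current); // the string after the last substitution
    return { literals: literals, expressions: expressions };
}

var parts = splitTemplate("hey ${a + b} there");
console.log(parts.literals);    // [ 'hey ', ' there' ]
console.log(parts.expressions); // [ 'a + b' ]
```

Note that there is always one more literal than there are expressions, even when some of the literals are empty.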

The parser now has alternating literals and expressions (note that any number of the literals can be empty strings). It creates a list of such nodes and passes them along to the bytecode emitter. The bytecode emitter pushes the first two nodes in the list onto the stack. It then uses the new JSOP_TOSTRING opcode (added expressly for this feature!) to convert the result of the expression to a string. The two strings on the stack are then JSOP_ADDed to leave a single string. The next node is pushed onto the stack and the process repeats, until finally, there is a single string on the stack, and the list of nodes is exhausted.
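
The emitter’s loop can be simulated in a few lines of JavaScript. This is only a sketch of the steps described above: concatTemplate is a made-up name, the array stands in for the interpreter stack, and String() and ‘+’ stand in for JSOP_TOSTRING and JSOP_ADD.

```javascript
// nodes alternates: literal, substitution value, literal, ...
function concatTemplate(nodes) {
    var stack = [];
    stack.push(nodes[0]); // the first literal
    for (var i = 1; i < nodes.length; i++) {
        stack.push(nodes[i]);
        if (i % 2 === 1) {
            // Odd positions hold substitution results: convert to string
            // (stand-in for JSOP_TOSTRING).
            stack.push(String(stack.pop()));
        }
        // Combine the top two strings (stand-in for JSOP_ADD).
        var right = stack.pop();
        var left = stack.pop();
        stack.push(left + right);
    }
    return stack.pop(); // a single string is left on the stack
}

var a = 5;
var b = 10;
var result = concatTemplate(["hey ", a + b, " there"]);
console.log(result); // hey 15 there
```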

Tagged Templates

Tagged templates are of the form

var b = 10;

func`a${b}c${"d"}`;

The expression before the first ‘`’ is a function that the template string is passed to. In the case above, func is passed three arguments: the second one is the result of the expression ‘b’, and the third one is the string literal “d”. The first argument is a little more complicated. It is called the call site object, and it encodes the literal parts of the template string.

I could get into its semantics, but it is easier on everyone to just use an example for this sort of thing:

function func(a, b) { return a;}
  // returns the call site object

var a = func`a\n${b}c`;
  // 'a' now contains the call site object

a[0]     // a\n
a[1]     // c
a.raw[0] // a\\n
a.raw[1] // c

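a[0] and a[1] are called cooked strings, and a.raw is an array of raw strings. In a cooked string, escape sequences have already been interpreted, so the ‘\n’ in a[0] is a single newline character. Raw strings contain the string parts as they were typed in the source: in a.raw[0], the ‘\n’ is represented by 2 characters, a ‘\’ and an ‘n’.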
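
The difference is easy to verify by comparing lengths. In this sketch, getCallSite is just an illustrative tag that returns its first argument; String.raw is the standard built-in tag that joins the raw strings.

```javascript
// Illustrative tag: hands back the call site object it receives.
function getCallSite(callSite) {
    return callSite;
}

var site = getCallSite`a\n${0}c`;

console.log(site.length);            // 2 (two literal parts)
console.log(site[0] === "a\n");      // true: cooked, "a" plus a newline
console.log(site[0].length);         // 2
console.log(site.raw[0] === "a\\n"); // true: raw, "a", "\" and "n"
console.log(site.raw[0].length);     // 3
console.log(site[1] === site.raw[1]); // true: "c" has no escapes

// The built-in String.raw tag returns the raw form directly.
console.log(String.raw`a\n` === "a\\n"); // true
```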

Call site objects are compile time constructs.

var cso = [];
cso[0] = func`a`;
for (var i = 1; i <= 2; i++) {
    cso[i] = func`a`;
}

cso[0] === cso[1] // false
cso[1] === cso[2] // true

The above code example illustrates how a code location gets just one call site object, irrespective of how many times the particular line of code is executed.
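
One thing this identity makes possible is caching work per code location. The following is a hedged sketch rather than anything SpiderMonkey does internally: the tag named cached, the WeakMap, and the counter are all made up for illustration.

```javascript
var cache = new WeakMap(); // keyed by call site object
var buildCount = 0;        // how many times the "expensive" step ran

// Hypothetical tag: does some per-template work once per code location.
function cached(callSite) {
    var entry = cache.get(callSite);
    if (entry === undefined) {
        buildCount++;
        entry = callSite.raw.join("|"); // stand-in for expensive processing
        cache.set(callSite, entry);
    }
    return entry;
}

var results = [];
for (var i = 0; i < 3; i++) {
    results.push(cached`x${i}y`); // same location, same call site object
}

console.log(buildCount); // 1
console.log(results);    // [ 'x|y', 'x|y', 'x|y' ]
```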

SpiderMonkey’s parser is where most of the fun is. After parsing an expression that could be called, instead of just checking for an open parenthesis ‘(’, we also check whether a template string follows, identified by the ‘`’. Just as before, we obtain the individual tokens within the template string, which are alternating literals and expressions. Instead of getting just the cooked strings, we obtain the raw strings as well. We construct an entity (let’s call it the call site entity) that holds all these strings together. We then construct a list which has the call site entity and all the substitutions.

The call site object is created at compile time from the call site entity. It is identified by an index into a list of objects. We have a shiny new opcode, JSOP_CALLSITEOBJ, that takes an index and pushes the corresponding call site object onto the stack.

These are all the pieces needed to get this working. The bytecode emitter gets the list described above (the call site entity followed by the substitutions). It sees that this is a tagged template, and starts pushing stuff onto the stack. It first pushes the callee, followed by “this”, followed by the call site object (using JSOP_CALLSITEOBJ). It then pushes the substitutions in order. Finally, it emits JSOP_CALL, and the function is invoked!
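
From the script’s point of view, the result is an ordinary call whose arguments are the call site object followed by the substitution values. The sketch below uses a made-up tag, record, to capture what it receives, including “this” when the tag is a property access.

```javascript
// Illustrative tag: records its "this" value and its arguments.
function record() {
    return { self: this, args: Array.prototype.slice.call(arguments) };
}

var b = 10;
var r = record`a${b}c${"d"}`;

console.log(r.args.length); // 3
console.log(r.args[0][0], r.args[0][1]); // a c
console.log(r.args[0].length); // 3 (the last literal is the empty string)
console.log(r.args[1]);     // 10
console.log(r.args[2]);     // d

// When the tag is a property access, the object is passed as "this",
// just as in obj.tag(...).
var obj = { tag: record };
console.log(obj.tag`x`.self === obj); // true
```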

That’s all well and good, but what’s the point of all this? An example helps illustrate how tagged templates can be used.

function log(callSite, ...values) {

    var args = [], N = values.length;
    for (var i = 0; i < N; i++)
        args.push(callSite[i], values[i]);
    args.push(callSite[N]);

    console.log(...args);
}

var week = 7;
var month = 31;

log`A week has ${week} days. A month has ${month} days.`;
  // A week has 7 days. A month has 31 days.
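
Since the tag sees the literal parts and the substituted values separately, it can also treat them differently. As a further sketch (the names html and escapeHtml are my own, and the escaping is deliberately minimal), here is a tag that escapes only the substituted values:

```javascript
// Minimal escaping for illustration; not a complete HTML sanitizer.
function escapeHtml(text) {
    return String(text)
        .replace(/&/g, "&amp;")
        .replace(/</g, "&lt;")
        .replace(/>/g, "&gt;")
        .replace(/"/g, "&quot;");
}

// Literal parts are trusted and kept as-is; substitutions are escaped.
function html(callSite) {
    var result = callSite[0];
    for (var i = 1; i < arguments.length; i++) {
        result += escapeHtml(arguments[i]) + callSite[i];
    }
    return result;
}

var userInput = '<b>"hi" & bye</b>';
var markup = html`<p>${userInput}</p>`;
console.log(markup);
  // <p>&lt;b&gt;&quot;hi&quot; &amp; bye&lt;/b&gt;</p>
```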

Template strings have been pushed to mozilla-central and are expected to be included in Firefox 34.

Thanks to Jason Orendorff (jorendorff) for mentoring me on this.

Useful Links

  1. The harmony page, though outdated, provides many use cases: http://wiki.ecmascript.org/doku.php?id=harmony:quasis
  2. More use cases: http://jaysoo.ca/2014/03/20/i18n-with-es6-template-strings/
  3. Even more use cases: http://www.slideshare.net/domenicdenicola/es6-the-awesome-parts (the second part of the slide-set)
  4. A link to the draft: http://people.mozilla.org/~jorendorff/es6-draft.html
  5. Buglist: 1021368 1024748 1031397 1038498