In-depth Source Maps

Web development these days involves a lot of languages. HTML, CSS and
JavaScript are ubiquitous. In order to reduce the number of requests
and the size of the individual responses for these resources, the files
involved are often concatenated and minified. Although this is great
news for the resource footprint, the corresponding files become an
unintelligible jumble of characters.

Furthermore, there are a lot of languages that compile down to CSS and
JavaScript. For CSS, Less, Sass and Stylus are examples. For
JavaScript there are CoffeeScript, ClojureScript and TypeScript, just to
name a few. These tools offer benefits over the languages they
compile down to, but at a cost. There now is a discrepancy between
what was developed and what is interpreted by the browser.

Source maps can help in these situations by providing a map back into
the original files.

1 Mary Had A Little Lamb

Source maps are language and technology agnostic. So we are going to
use the following original source, found in mary.txt.

Mary had a little lamb,
his fleece was white as snow.

We will generate code for this text by removing punctuation and taking
the first letter of each word, wrapping lines after four
characters. The result will be written to the file out.txt.

Mhal
lhfw
was
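
The generation step described above can be sketched in a few lines of
Python. This is only an illustration: the function name generate and
the constant SOURCE are made up for this example, and the text of
mary.txt is embedded as a string instead of being read from disk.

```python
import string

# Contents of mary.txt, embedded here so the sketch is self-contained.
SOURCE = "Mary had a little lamb,\nhis fleece was white as snow.\n"


def generate(source, width=4):
    """Keep the first letter of every word and wrap after `width` characters."""
    # Remove punctuation, then split on any whitespace (including newlines).
    cleaned = source.translate(str.maketrans("", "", string.punctuation))
    initials = "".join(word[0] for word in cleaned.split())
    lines = [initials[i:i + width] for i in range(0, len(initials), width)]
    return "\n".join(lines) + "\n"


# Prints the three lines shown above as the content of out.txt.
print(generate(SOURCE), end="")
```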

2 Objective

We will create a source map that will relate segments in the
generated code back to the original source. We need to encode the
following information for a given segment of the generated code.

  1. Which source file does it refer to?
  2. What line is the source element on?
  3. In which column does the source element start?
  4. Optionally, describe the segment?

3 Mapping

The mapping between the generated code and the source files is
described in a mappings string. A mappings string is built up from
groups. For each line in the generated code there is a corresponding
group. The groups are separated by “;”.

Each group is built up from segments. The segments are separated by
“,”. A segment is a logical unit for which an origin can be
traced. In our example they correspond with the characters in the
generated code.

For a segment the following information is recorded

  1. Zero-based column index of segment in the generated source
  2. Reference to the original source file
  3. Zero-based line in the source file the segment originated from
  4. Zero-based column in the line the segment originated from
  5. Optionally, name the segment. We will not provide this information.
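
The group and segment structure can be made visible by splitting a
mappings string on its separators. The sketch below uses the mappings
string of the CoffeeScript example in section 5 as input. The function
name split_mappings is made up for this illustration.

```python
# Mappings string taken from hello.js.map in section 5.
MAPPINGS = (
    ";AACA;AAAA,MAAA,KAAA;;AAAA,EAAA,KAAA,GAAQ,SAAC,IAAD,GAAA;"
    "WAAU,OAAO,CAAC,GAAR,CAAY,QAAA,GAAW,IAAvB,EAAV;EAAA,CAAR,CAAA;;"
    "AAAA,EAEA,KAAA,CAAM,aAAN,CAFA,CAAA;AAAA"
)


def split_mappings(mappings):
    """Return one list of encoded segments per line of generated code."""
    # An empty group means the generated line has no mapped segments.
    return [group.split(",") if group else [] for group in mappings.split(";")]


for line_number, segments in enumerate(split_mappings(MAPPINGS)):
    print(line_number, segments)
```

Empty groups, such as the very first one, belong to generated lines for
which nothing is mapped.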

The following table provides all the information to find the original
element.

Segment  Generated column  File      Source line  Source column
M        0                 mary.txt  0            0
h        1                 mary.txt  0            5
a        2                 mary.txt  0            9
l        3                 mary.txt  0            11
l        0                 mary.txt  0            18
h        1                 mary.txt  1            0
f        2                 mary.txt  1            4
w        3                 mary.txt  1            11
w        0                 mary.txt  1            15
a        1                 mary.txt  1            21
s        2                 mary.txt  1            24
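
The table can also be derived mechanically. The following sketch walks
over the words of the source and records where each first letter comes
from and where it ends up. The name segments and the layout of the
tuples are assumptions made for this illustration.

```python
import re

# The two lines of mary.txt.
SOURCE_LINES = ["Mary had a little lamb,", "his fleece was white as snow."]


def segments(source_lines, width=4):
    """Return (letter, generated line, generated column, source line, source column)."""
    rows = []
    count = 0  # number of letters emitted so far
    for source_line, text in enumerate(source_lines):
        for match in re.finditer(r"\w+", text):
            # Generated code wraps after `width` characters.
            generated_line, generated_column = divmod(count, width)
            rows.append((match.group()[0], generated_line, generated_column,
                         source_line, match.start()))
            count += 1
    return rows


for row in segments(SOURCE_LINES):
    print(*row)
```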

3.1 Tricks

In order to save space several tricks are employed. The first is
referencing the original file by index in a known list of source
files.

The second trick is used to restrict the growth of the numbers
involved. Instead of giving the absolute numbers, all numbers are
given relative to the preceding value of the same field, unless the
field occurs for the first time. The column in the generated code is
the exception: it starts over at the beginning of every generated
line.

Together these tricks transform the table into the following form.

Segment  Gen. column  ΔGen. column  File      File index  ΔFile index  Line  ΔLine  Source column  ΔSource column
M        0            0             mary.txt  0           0            0     0      0              0
h        1            1             mary.txt  0           0            0     0      5              5
a        2            1             mary.txt  0           0            0     0      9              4
l        3            1             mary.txt  0           0            0     0      11             2
l        0            0             mary.txt  0           0            0     0      18             7
h        1            1             mary.txt  0           0            1     1      0              -18
f        2            1             mary.txt  0           0            1     0      4              4
w        3            1             mary.txt  0           0            1     0      11             7
w        0            0             mary.txt  0           0            1     0      15             4
a        1            1             mary.txt  0           0            1     0      21             6
s        2            1             mary.txt  0           0            1     0      24             3
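
The relative numbers in the table can be computed from the absolute
ones as sketched below. The name to_deltas and the tuple layout are
assumptions for this illustration, and the rows are assumed to be
sorted by their position in the generated code.

```python
# (generated line, generated column, file index, source line, source column)
ABSOLUTE = [
    (0, 0, 0, 0, 0), (0, 1, 0, 0, 5), (0, 2, 0, 0, 9), (0, 3, 0, 0, 11),
    (1, 0, 0, 0, 18), (1, 1, 0, 1, 0), (1, 2, 0, 1, 4), (1, 3, 0, 1, 11),
    (2, 0, 0, 1, 15), (2, 1, 0, 1, 21), (2, 2, 0, 1, 24),
]


def to_deltas(rows):
    """Group rows per generated line and make every field relative."""
    groups = []
    prev_gen_col = prev_file = prev_line = prev_col = 0
    for gen_line, gen_col, file_index, src_line, src_col in rows:
        while len(groups) <= gen_line:
            groups.append([])
            prev_gen_col = 0  # the generated column starts over on each line
        groups[gen_line].append((gen_col - prev_gen_col,
                                 file_index - prev_file,
                                 src_line - prev_line,
                                 src_col - prev_col))
        prev_gen_col, prev_file = gen_col, file_index
        prev_line, prev_col = src_line, src_col
    return groups


for group in to_deltas(ABSOLUTE):
    print(group)
```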

3.2 Encoding

Each segment will be encoded by a sequence of four numbers. Each
number is encoded as a Base64 VLQ (variable-length quantity). The sign
of the number is stored in the least significant bit. The resulting
value is split into groups of five bits, least significant group
first, and every group except the last gets a sixth bit, the
continuation bit, set. Each six-bit group is then written as a single
Base64 digit. All will become clear in an example.

3.3 Result

For the first segment, M, we want to encode the following numbers
0 0 0 0. Each zero fits in a single six-bit group, 000000, which is
the Base64 digit A. The segment is therefore encoded as AAAA.

The next segment is a bit more interesting. The numbers to encode are
1 0 0 5. Shifting in the sign bit turns 1 into 2 and 5 into 10, which
in binary is

000010 000000 000000 001010

When this is written in Base64 digits it results in CAAK.

A number that does not fit in five bits takes more than one digit. For
-18 the value including the sign bit is 100101. Its lowest five bits,
00101, get the continuation bit set, giving 100101 (l). The remaining
bit follows as 000001 (B).

Going through all the segments, repeating the first two, we have the
following encodings.

0 0 0 0
000000 000000 000000 000000
AAAA

1 0 0 5
000010 000000 000000 001010
CAAK

1 0 0 4
000010 000000 000000 001000
CAAI

1 0 0 2
000010 000000 000000 000100
CAAE

0 0 0 7
000000 000000 000000 001110
AAAO

1 0 1 -18
000010 000000 000010 100101 000001
CAClB

1 0 0 4
000010 000000 000000 001000
CAAI

1 0 0 7
000010 000000 000000 001110
CAAO

0 0 0 4
000000 000000 000000 001000
AAAI

1 0 0 6
000010 000000 000000 001100
CAAM

1 0 0 3
000010 000000 000000 000110
CAAG

Stringing them together, respecting the segment and group boundaries
results in:

AAAA,CAAK,CAAI,CAAE;AAAO,CAClB,CAAI,CAAO;AAAI,CAAM,CAAG

4 Specification

The source map specification is described in a Google Document. It
offers improvements over older versions, both in the size of the
resulting source map and in the ease of using the data a source map
provides.

A source map is a JSON file. The properties of the file are
described below.

The current version of the specification is 3. To ensure backwards
compatibility the version used needs to be specified in the property
version.

"version": 3

The next property is file. It designates the name of the file that
holds the generated code.

"file": "out.txt"

The property sourceRoot informs a client of the source map of the
common root of all the source files involved with this source map.

"sourceRoot": ""

Multiple source files can end up in a single generated file. The
property sources is used to name the files that were used in
generating the code.

"sources": ["mary.txt"]

Providers of source maps can use the next property to give the
content of the corresponding source files. This is optional. When no
source content is present for a file, its entry should be null.

"sourcesContent": [null]

Sometimes it is useful to provide a name for a certain segment. The
names property lists these names, so that a segment can refer to one
of them by index. We do not use it now.

"names": []

The mappings property gives the actual mapping between the source
files and the generated code.

"mappings": "AAAA,CAAK,CAAI,CAAE;AAAO,CAClB,CAAI,CAAO;AAAI,CAAM,CAAG"

When put together this becomes the following JSON.

{
    "version": 3,
    "file": "out.txt",
    "sourceRoot": "",
    "sources": ["mary.txt"],
    "sourcesContent": [null],
    "names": [],
    "mappings": "AAAA,CAAK,CAAI,CAAE;AAAO,CAClB,CAAI,CAAO;AAAI,CAAM,CAAG"
}
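
The whole file can be assembled with a short script. What follows is a
minimal sketch of Base64 VLQ encoding, not a reference implementation.
The names encode_vlq and encode_mappings are made up for this
illustration, and the relative numbers are copied from the table in
section 3.1.

```python
import json

BASE64_DIGITS = ("ABCDEFGHIJKLMNOPQRSTUVWXYZ"
                 "abcdefghijklmnopqrstuvwxyz0123456789+/")

# Relative numbers per segment, one list per line of generated code.
DELTAS = [
    [(0, 0, 0, 0), (1, 0, 0, 5), (1, 0, 0, 4), (1, 0, 0, 2)],
    [(0, 0, 0, 7), (1, 0, 1, -18), (1, 0, 0, 4), (1, 0, 0, 7)],
    [(0, 0, 0, 4), (1, 0, 0, 6), (1, 0, 0, 3)],
]


def encode_vlq(value):
    """Encode one integer as Base64 VLQ digits."""
    # The sign moves into the least significant bit.
    vlq = (abs(value) << 1) | (1 if value < 0 else 0)
    digits = ""
    while True:
        group = vlq & 0b11111       # lowest five bits
        vlq >>= 5
        if vlq:
            group |= 0b100000       # continuation bit: more groups follow
        digits += BASE64_DIGITS[group]
        if not vlq:
            return digits


def encode_mappings(groups):
    """Join segments with "," and groups with ";"."""
    return ";".join(
        ",".join("".join(encode_vlq(n) for n in segment) for segment in group)
        for group in groups
    )


source_map = {
    "version": 3,
    "file": "out.txt",
    "sourceRoot": "",
    "sources": ["mary.txt"],
    "sourcesContent": [None],
    "names": [],
    "mappings": encode_mappings(DELTAS),
}
print(json.dumps(source_map, indent=4))
```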

5 Tools

One should not make a source map by hand. Instead you should let
tools generate the source map. It is not feasible to describe for each
language how to generate a source map, so we will concentrate on
CoffeeScript.

The following CoffeeScript file is used to demonstrate how to
generate a source map.

greet = (name) -> console.log "Hello " + name

greet('InfoSupport')

The following command will generate a JavaScript file with a
corresponding source map.

coffee --compile --map hello.coffee

This will create the following hello.js file.

// Generated by CoffeeScript 1.8.0
(function() {
  var greet;

  greet = function(name) {
    return console.log("Hello " + name);
  };

  greet('InfoSupport');

}).call(this);

//# sourceMappingURL=hello.js.map

The last line points to the source map hello.js.map, which is

{
  "version": 3,
  "file": "hello.js",
  "sourceRoot": "",
  "sources": [
    "hello.coffee"
  ],
  "names": [],
  "mappings": ";AACA;AAAA,MAAA,KAAA;;AAAA,EAAA,KAAA,GAAQ,SAAC,IAAD,GAAA;WAAU,OAAO,CAAC,GAAR,CAAY,QAAA,GAAW,IAAvB,EAAV;EAAA,CAAR,CAAA;;AAAA,EAEA,KAAA,CAAM,aAAN,CAFA,CAAA;AAAA"
}
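
To see what such a generated mappings string contains, it can be
decoded back into absolute positions. The sketch below is one possible
way to do that. The names decode_segment and decode_mappings are made
up for this illustration, and segments that carry a fifth field (a
name index) have that field ignored.

```python
BASE64_DIGITS = ("ABCDEFGHIJKLMNOPQRSTUVWXYZ"
                 "abcdefghijklmnopqrstuvwxyz0123456789+/")

# Mappings string copied from hello.js.map above.
MAPPINGS = (
    ";AACA;AAAA,MAAA,KAAA;;AAAA,EAAA,KAAA,GAAQ,SAAC,IAAD,GAAA;"
    "WAAU,OAAO,CAAC,GAAR,CAAY,QAAA,GAAW,IAAvB,EAAV;EAAA,CAAR,CAAA;;"
    "AAAA,EAEA,KAAA,CAAM,aAAN,CAFA,CAAA;AAAA"
)


def decode_segment(segment):
    """Decode the Base64 VLQ digits of one segment into integers."""
    values, shift, accumulator = [], 0, 0
    for character in segment:
        digit = BASE64_DIGITS.index(character)
        accumulator |= (digit & 0b11111) << shift
        if digit & 0b100000:        # continuation bit: more digits follow
            shift += 5
        else:
            sign = -1 if accumulator & 1 else 1
            values.append(sign * (accumulator >> 1))
            shift = accumulator = 0
    return values


def decode_mappings(mappings):
    """Yield (generated line, generated column, file index, source line, source column)."""
    file_index = source_line = source_column = 0
    for generated_line, group in enumerate(mappings.split(";")):
        generated_column = 0        # starts over on every generated line
        for segment in filter(None, group.split(",")):
            fields = decode_segment(segment)
            generated_column += fields[0]
            if len(fields) >= 4:    # segments with one field have no source
                file_index += fields[1]
                source_line += fields[2]
                source_column += fields[3]
                yield (generated_line, generated_column,
                       file_index, source_line, source_column)


print(decode_segment("AACA"))  # [0, 0, 1, 0]
for position in decode_mappings(MAPPINGS):
    print(position)
```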

6 Conclusion

Source maps bridge the gap between the languages and tools that
aid development and resource management, and the unintelligible files
that get served to web clients.