Julian Gonggrijp

Introducing Modular Underscore

Underscore has been JavaScript’s unofficial standard functional programming library for a decade (together with its major fork, Lodash). Its most recent significant development is the move to ECMAScript 6 modules (ESM). Underscore 1.11 is the first version to be fully modular.

With the new modularity, you can now create a custom build of Underscore with an even smaller footprint. At the same time, we still provide the standard UMD build, which is perfect if you want to get started quickly. The UMD bundle is easy to work with and it has great cache retention if you load it from a CDN.

Underscore was originally created by Jeremy Ashkenas and he is still helping with the releases. Besides Underscore, he created several other awesome JavaScript libraries and tools, notably Backbone and CoffeeScript. If you haven’t heard about them yet, you should check them out.

I am not Jeremy Ashkenas, but I modularized Underscore. In this article, I will discuss the ins and outs of modular Underscore, and I will try to answer all questions you may have. The article is written Q&A style, so you can read it once, then come back later and use it as a reference.

General

What is _?

Besides a bunch of essential functions, Underscore exports an object named _, which is central to its functionality. Hence the name “Underscore”. _ is all of the following at the same time:

The role of namespace handle is probably the best known. Before the advent of ES6 modules, having such a namespace handle was a necessity. Many other libraries from that era make their namespace handle double as a useful function as well; jQuery’s $ is a famous example. While there is no longer a strong incentive to have a multi-purpose namespace handle, it continues to be useful, so we keep it around in Underscore. In an ES6 module context, it is still the default export.

Arguably, the role of special value wrapping function is much more interesting. Thanks to this function, you can do _.map(x, f), _(x).map(f) or _.chain(x).map(f).value(). These expressions are equivalent, and they share a single underlying implementation of the map function. The same applies to all other Underscore functions. This is made possible by having a standalone definition of each function, and then adding all functions to _ with a single call to _.mixin. It is a very elegant design; please take a peek at (the bottom of) the annotated source if you’d like to admire the details.
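To illustrate the idea, here is a heavily simplified sketch of how one standalone function can serve all three call styles. It is not Underscore's actual source; the bodies of map, chain and mixin below are stripped-down stand-ins written for this example only.

```javascript
// Standalone definition of a function; it knows nothing about _.
function map(list, transform) {
    const result = [];
    for (let i = 0; i < list.length; i++) result.push(transform(list[i]));
    return result;
}

// _ is a function that wraps a value, and also a namespace object.
function _(value) {
    if (value instanceof _) return value;
    if (!(this instanceof _)) return new _(value);
    this._wrapped = value;
    this._chain = false;
}

_.prototype.value = function() {
    return this._wrapped;
};

// Start a chain: wrapper methods keep returning wrappers until value().
function chain(value) {
    const instance = _(value);
    instance._chain = true;
    return instance;
}

// Attach every function twice: as a static function on _ and as a
// wrapper method that passes the wrapped value as the first argument.
function mixin(functions) {
    Object.keys(functions).forEach(function(name) {
        const func = _[name] = functions[name];
        _.prototype[name] = function(...args) {
            const result = func(this._wrapped, ...args);
            return this._chain ? chain(result) : result;
        };
    });
    return _;
}

mixin({ map, chain });

const double = x => 2 * x;
const a = _.map([1, 2, 3], double);
const b = _([1, 2, 3]).map(double);
const c = _.chain([1, 2, 3]).map(double).value();
console.log(a, b, c); // three times [2, 4, 6]
```

The point of the sketch is that map is written once, without any reference to _, and only the single mixin call connects it to the wrapper.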

What is the monolithic interface?

The _ object with its magical bells and whistles is made out of all the parts of the Underscore library. As such, _ represents the whole library at once; this is what I mean by the monolithic interface. Historically, this used to be the only interface to the library.

Any module that provides _ depends on all of Underscore. If your project imports something from such a module, even if you do not import _ itself, then your project effectively imports all of Underscore. I refer to this as “using the monolithic interface” in a wide sense.

Don’t worry though! With modular Underscore, you can have all the tricks that _ provides, such as chaining, and still be selective about which parts you want to import. Basically, you can create your own variant of the monolithic interface. I’ll cover the details below.

I should add that Underscore is both lightweight and widely adopted. Using the standard monolithic interface is a wise choice in many cases. More on this below as well.

What is the modular interface?

As of version 1.11, every individual part of Underscore can be imported from a separate module. Selectively importing these parts directly, instead of relying on a module that gives you the monolithic interface, amounts to using the modular interface.

There is a separate module for each Underscore function. Some of these functions, notably mixin and chain, can be used to create your own customized monolithic interface.

Keep in mind that the modular interface brings powerful flexibility but also responsibility. The standard monolithic interface is well designed and coherent. If you choose to omit parts of it, you might end up with something that doesn’t do what you want.

I will detail the options and the pitfalls below.

Why tiny modules?

In general, spreading the source code of a project over multiple modules promotes code reuse. If I can pick the code that I am interested in from your library, without having to import code that I am not interested in, then I am more likely to use the code that you already wrote, instead of reinventing the wheel. Everyone saves time and we get our flying cars sooner.

When I modularized Underscore, I went for the most extreme option: I put every function in a separate module. I did this in order to maximize the potential for code reuse. I want users to be able to select exactly the code that they need; no more, no less. I am already working on a library that will exploit this myself.

What about treeshaking?

In theory, I would not need to go all the way down to individual function modules in order to get perfect selectivity. Tools like Rollup can isolate individual functions from a larger module, a “trick” known as treeshaking. In practice, however, static analysis of JavaScript is hard. With everything being mutable and unsafely typed, it can be hard to tell whether one piece of code might influence another, especially without human intelligence, and the tools are far from perfect as a result.
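To make the difficulty a bit more concrete, here is a small made-up sketch; none of these names come from Underscore. A tool would have to evaluate the string expressions in order to know that shout is still in use, so it generally has to keep the code to be safe.

```javascript
// A registry that is filled in at runtime under computed keys.
const registry = {};

function register(name, func) {
    registry[name] = func;
}

function shout(text) {
    return text.toUpperCase() + '!';
}

// The key under which shout is stored is computed, not spelled out.
register('sh' + 'out', shout);

// Elsewhere, the function is looked up through another computed key.
// No code below the register call mentions shout by name.
const key = ['sh', 'out'].join('');
console.log(registry[key]('hello')); // HELLO!
```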

At an early stage of my modularization effort, all of the exported functions were still pooled together in a single ES module. At this stage, I tested treeshaking with _.map. It sort of worked, in the sense that the result was only about half as large as the whole Underscore library. When I further split this module into individual function modules, however, the result for _.map shrank by another factor of 3. It turns out that _.map, and the other Underscore functions that it depends on, together make up only one sixth of the library.

\( \frac{1}{2} - \frac{1}{6} = \frac{1}{3} \), so there is a whole third of the Underscore code that Rollup is not able to treeshake. The only way to avoid needlessly importing that code is to manually isolate it in a separate module, which is exactly what I did.

What build variants can I choose from?

The source code consists of ESM modules. These modules live in the modules/ package subdirectory and you can use them directly. From the source, we build several variants in order to cater to a wide range of use cases:

“Monolithic bundle” in the first two build variants means both that the variant provides the monolithic interface, and that it is entirely contained in a single file.

Naturally, only the modular variants support modular usage. All variants support monolithic usage, but for efficiency, it is best to use the monolithic bundles.

Why have monolithic build variants?

Modules are useful, but the monolithic bundles have some compelling advantages as well (besides backwards compatibility):

What are the benefits of a CDN?

CDNs are awesome, especially for open source libraries. In a nutshell, websites often use the same resources—such as Underscore—and by referencing those resources from a common URL, instead of hosting their own copies, they enjoy collaborative caching. For the Underscore 1.11 UMD bundle, you can use one of these CDN URLs:

Underscore is widely used (some numbers), so if you use one of the above URLs, it is almost certain that a visitor of your site already has that copy of Underscore in her browser cache. Even when this isn’t the case, CDNs still reduce the amount of network traffic required, which means less latency and lower carbon emissions.

What makes this even more awesome is that visits to your site also contribute to cache retention on other sites. Not only do you benefit from the caching effect, you also help to amplify the effect for other sites. Win-win!

Best practices

How to reduce code size?

Most projects use only a subset of the Underscore functions. In principle, this means that your users are downloading a bit more code than they need in order to run your software. Underscore is quite small and there is the collaborative caching effect, so it doesn’t always matter, but sometimes it does. There are two things you can do to shake weight.

Your first option is straightforward: don’t ship the parts of Underscore you don’t use. In other words, use a selective, custom Underscore. While straightforward, this option is not for everyone. More on this in the next sections.

Somewhat paradoxically, your other option is using more of Underscore. If you are loading the standard monolithic interface anyway, you may as well embrace it, and use as much of Underscore’s functional goodness as you can. Using more of Underscore’s 100+ functions, and adopting a more functional style in general, can help to make your code terser and more maintainable at the same time. This is an important factor that enables trusty Backbone to pack so much functionality in so little space.

These options can be combined. Most Underscore functions depend on other Underscore functions internally, so even if you are selective, some functions that you don’t use might slip through. You might as well see whether you can find a way to benefit from those functions.

Standard or custom Underscore?

In general, standard Underscore saves effort, being off-the-shelf, while a custom Underscore tends to reduce code size. For server-side, mobile and desktop applications, this will be the main tradeoff to inform your choice.

On the client side, where every session starts with downloading the application from the internet, caching effects are going to be the deciding factor. If cache retention of your custom Underscore is poor, then you’ll end up generating more network traffic, not less, regardless of how much you’re reducing code size. If cache retention is very good, however, you might make significant savings.
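The tradeoff can be sketched as a back-of-the-envelope calculation. The file sizes and cache hit rates below are made up purely for illustration; they are not measurements of Underscore or of any CDN.

```javascript
// Expected bytes transferred per visit: a cache hit costs nothing,
// a cache miss costs the full (compressed) file size.
function expectedTransfer(sizeInBytes, cacheMissRate) {
    return sizeInBytes * cacheMissRate;
}

// All numbers are hypothetical.
// Standard build: larger file, but missing from the cache in only 10% of visits.
const standard = expectedTransfer(7000, 0.1);
// Custom build with poor cache retention: smaller file, missing in 50% of visits.
const customPoorCache = expectedTransfer(3000, 0.5);
// Custom build with the same cache retention as the standard build.
const customGoodCache = expectedTransfer(3000, 0.1);

console.log({ standard, customPoorCache, customGoodCache });
```

With these assumed numbers, the smaller custom build only wins when its cache retention keeps up with that of the standard build.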

When you have a client-side application with lots of frequently returning users, and you can use a CDN, this is a convincing case for going with a customized Underscore. Otherwise, I recommend that you use standard Underscore. Cameron Beccario’s Earth is a very nice early showcase of a site using a customized Underscore.

If you are creating a library that depends on Underscore, you can leave the choice to the users of your library. More on this in a later section.

Monolithic or modular imports?

As I mentioned before, you can be selective and still have a monolithic interface (with chaining and other tricks), by creating your own customized Underscore, or by reusing a custom Underscore that somebody else already created. Essentially, there are two independent choices that you can make.

The first choice is about the pool of functions that you draw from. This might be the standard pool that Underscore provides, or a selective, customized library. In addition to either of those, you may draw from any number of extension libraries, like Underscore-contrib or underscore.string. An extension library just adds more functions to whichever Underscore you chose as your base.

The second choice is about the way in which you import functions from that pool. In monolithic usage, you import all functions as well as the _ object from a single entry point. In modular usage, you import each function from its own, separate module. When composing your own custom Underscore, modular is the way to go. In all other cases, you should use monolithic imports.

Why reserve modular imports for customization?

When both an application and its dependencies use monolithic imports (whether from standard or customized Underscores), it is easy to ensure that you need to load only a single Underscore to serve all of them at the same time. You just have to configure your build tool so that all imports from any monolithic Underscore-like interface alias to the same underlying library (as long as the function names don’t conflict—I’m looking at you, Lodash!).

If any of the parties involved are using the modular interface directly, it is almost impossible to avoid loading the same code multiple times. That means greater code size, larger network transfers, greater memory consumption and ultimately, greater energy consumption. This is bad for the environment.

I should mention that extension libraries are a bit of a grey area. On the one hand, you probably want to enable other developers to incorporate your extension functions in their customized Underscore builds. This requires that you put each function in a separate module, and each such module should use the modular interface. On the other hand, if you create a bundle, this bundle should use the monolithic interface, just like any other library that depends on Underscore. While I have not tried this yet myself, I think it should be possible to convert from the former to the latter with the bundling tool. More on this in a future article.

How to give users of your library a choice?

If you are maintaining a library that depends on Underscore, some of your users probably also use Underscore independently, either directly or through other libraries. Often, such users will want to include only a single copy of Underscore, especially on the client side. They may not always want to use the same flavor of Underscore (standard or custom) that your library depends on, so it is nice to give them the freedom to inject a different Underscore in your library.

The most important thing to do is to use monolithic imports: import all Underscore functions from a single entry point. Likewise for extension libraries. This makes it easy for users to alias that entry point to a different flavor of Underscore. This will work for them, as long as that other flavor has all the functions your library needs.

As an additional service, consider listing all the functions that your library needs in your documentation. This helps your users to assess which functions are required and which can be left out, if they decide to use a custom Underscore.

For easy aliasing, it is also helpful if you document the module identifier through which you import Underscore, especially when it is not the default 'underscore'. For example, if your library is importing from an internal ./lib/custom-underscore.js, your users can alias your-package/lib/custom-underscore.js to a different Underscore as desired.
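As a sketch of what such aliasing could look like on the user's side, here is a configuration fragment for webpack, assuming that the library imports from the default 'underscore' identifier. The path src/custom-underscore.js is a hypothetical location of the user's own flavor; other build tools have comparable alias options.

```javascript
// webpack.config.js (fragment)
const path = require('path');

module.exports = {
    resolve: {
        alias: {
            // The trailing $ requests an exact match, so deeper imports
            // such as 'underscore/modules/map.js' are left alone.
            // The target path is a hypothetical custom build.
            underscore$: path.resolve(__dirname, 'src/custom-underscore.js'),
        },
    },
};
```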

Summary of recommendations

The following table summarizes my recommendations for each use case.

| What you create | Conditions | Base Underscore | Imports |
| --- | --- | --- | --- |
| custom Underscore | | standard | modular |
| Underscore extension library | | standard | modular in source, monolithic in bundle |
| other library | | standard (leave choice to users) | monolithic |
| client-side application | general | standard (CDN) | monolithic |
| client-side application | great caching and savings | custom (CDN) | monolithic |
| other application | minimize effort | standard | monolithic |
| other application | minimize size | custom | monolithic |

How-tos

How do I use the monolithic interface?

The monolithic interface is what you have always used before Underscore 1.11. The old ways to import it still work, but there are some new options.

The available syntaxes depend on your target environment, although tools can convert between them to some extent:

In general, I recommend writing your imports in the static ESM syntax for new projects, and then converting it to one of the other syntaxes with your build tool if needed.

import _, { map, filter } from 'underscore';
// You can also still do _.map, _.filter etcetera.

Be warned that the following import statements are not equivalent, although conversion tools might emulate them in the same way. You should avoid the second form in new projects.

import _ from 'underscore';      // default export
import * as _ from 'underscore'; // module alias

In the following examples, your.cdn.com is a placeholder for whatever CDN you decide to use. Common options were listed above.

Dynamic ESM in the browser, if you are OK with only supporting new browsers:

const {
    'default': _,
    map,
    filter,
} = await import(
    'https://your.cdn.com/underscore@1.11.0/underscore-esm.js'
);

Note that if you use the ESM build, while you also have a dependency that uses the UMD build, your application runtime will end up with two independent copies of Underscore. Customizations to _.iteratee, _.templateSettings or _.partial.placeholder in one instance will not be seen by code that uses the other instance.
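The effect can be mimicked without any module loader. In the sketch below, loadLibraryCopy is a made-up stand-in for "loading one build of the library"; it only models the fact that each build carries its own settings object.

```javascript
// Hypothetical stand-in for loading a build: every call returns a
// fresh, independent copy with its own settings.
function loadLibraryCopy() {
    return {
        templateSettings: { interpolate: /<%=([\s\S]+?)%>/g }
    };
}

const fromEsmBuild = loadLibraryCopy(); // what your application imports
const fromUmdBuild = loadLibraryCopy(); // what a dependency requires

// Customize one copy...
fromEsmBuild.templateSettings.interpolate = /\{\{(.+?)\}\}/g;

// ...and the other copy still has the original setting.
console.log(fromEsmBuild.templateSettings.interpolate.source);
console.log(fromUmdBuild.templateSettings.interpolate.source);
```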

All the remaining options below use the UMD build.

AMD syntax:

define(['underscore'], function(_) {
    // Use _.map etcetera as usual.
});

In your require.js config, set the path for underscore to https://your.cdn.com/underscore@1.11.0/underscore.js.

CommonJS syntax:

var _ = require('underscore');
// Use _.map etcetera as usual.
// Or you can go fancy with ES6:
const { map, filter } = require('underscore');

If you are using Browserify, I recommend using exposify or a similar plugin, in order to replace such imports by a browser global.

Browser global (embedding):

<script
    src="https://your.cdn.com/underscore@1.11.0/underscore.js"
></script>
<script>
    // _ is a global variable
</script>

ExtendScript:

#include "path/to/node_modules/underscore/underscore.js"
// _ is a global variable

How can I extend the monolithic interface?

“Monolithic” is not meant to imply that the interface is set in stone! You can still add or override functions. This also applies if you are using a customized Underscore instead of the standard interface.

Adding a function is really easy with _.mixin. Chaining is automatically supported with any function you add, as long as it takes at least one argument and it returns its result. For example, this is how you can enable upper-casing strings in the middle of a chain:

import _, { chain, mixin } from 'underscore';

function toUpper(string) {
    return string.toUpperCase();
}

// You can add the same function under multiple aliases.
// This is almost cost-free.
mixin({
    toUpper: toUpper,
    upper: toUpper,
    capitalize: toUpper
});

// That's all, use it like any other Underscore function.
_.upper('big');
// 'BIG'
chain(['one', 'two', 'three']).join('! ').toUpper().value();
// 'ONE! TWO! THREE'

Overriding existing functions is exactly like adding new functions, except that you mix in a name that was already in the interface.

How do I import individual functions?

From this section onwards, we will be discussing the modular interface. As mentioned before, you should generally only do this if you are creating a custom Underscore. Take care to pick one interface and stick to it; don’t mix modular and monolithic imports within the same project.

ESM syntax:

import map from 'underscore/modules/map.js';

AMD syntax:

define(['underscore/amd/map'], function(map) { /*...*/ })

CommonJS syntax:

var map = require('underscore/cjs/map.js');

For functions with aliases, the first name that appears in the documentation is always used as the module name. For example, reduce/inject/foldl:

// Regardless of your preferred alias,
// the module name is reduce.
import reduce from 'underscore/modules/reduce.js';
import inject from 'underscore/modules/reduce.js';
import foldl from 'underscore/modules/reduce.js';

// You can make up your own alias as well.
import summarize from 'underscore/modules/reduce.js';

// The following will not work, because there is no inject.js module!
// import inject from 'underscore/modules/inject.js';

You can convert your ESM imports to the other syntaxes. Any build tool that you use with the ESM, AMD or CommonJS syntax will also allow you to alias the module path prefix so that, for example, you could write underscore/modules/map.js, regardless of the module convention.

How do I import bare _?

If you go the modular route, you may still occasionally want to import the _ object in order to override _.iteratee or _.templateSettings, or to use OO style or chaining with a restricted set of functions. You can import just the wrapper function, without any functions mixed in, from modules/underscore.js:

import _ from 'underscore/modules/underscore.js';

var x = ['a'];

// These lines work:
var wrapper = _(x);
wrapper.value(); // x
wrapper + 'b'; // 'ab'
JSON.stringify(wrapper); // '["a"]'

// These lines won't, because the methods haven't been
// mixed in:
wrapper.size();
wrapper.sort();

More on overriding, mixing and chaining below.

How do I override _.iteratee?

The module in which _.iteratee is defined also sets the property on _ as a side effect. You just need to import both _ and iteratee and then assign your override to _.iteratee:

// The order of these imports does not matter.
import _ from 'underscore/modules/underscore.js';
import builtinIteratee from 'underscore/modules/iteratee.js';
import map from 'underscore/modules/map.js';
import isRegExp from 'underscore/modules/isRegExp.js';

// Classic example: iteratee that supports regex matching.
function iterateeOverride(value, context) {
    if (isRegExp(value)) return function(obj) {
        return value.test(obj);
    };
    return builtinIteratee(value, context);
}

_.iteratee = iterateeOverride;

map(['apple', 'banana', 'cherry'], /e/);
// [true, false, true]

How do I override _.templateSettings?

The module in which _.templateSettings is defined also sets the property on _ as a side effect. You just need to import both _ and templateSettings, and then make your overrides on _.templateSettings:

// The order of these imports does not matter.
import _ from 'underscore/modules/underscore.js';
import 'underscore/modules/templateSettings.js';

// Override just a key.
_.templateSettings.interpolate = /\{\{\{(.+?)\}\}\}/g;

// Or the whole object.
_.templateSettings = {
    interpolate: /\{\{\{(.+?)\}\}\}/g,
    escape: /\{\{(.+?)\}\}/g,
    evaluate: /\{\{#(.+?)\}\}/g
};

How do I use _.template?

You can just import the _.template function from its module and call it, but there are some gotchas. Template evaluation is a two-step process:

  1. The template string is compiled into a template function.
  2. The template function is called with the data to produce the final string.

The source code of the template function can be read out using _.template(templateString).source and then stored in a JavaScript file. This means that steps 1 and 2 may be widely separated in time; they might execute on different machines, in different JavaScript runtimes, with different instances of the Underscore library.
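The separation between the two steps can be sketched with a toy compiler. This is not how _.template is implemented; compileToSource is a made-up function that only supports <%= ... %> interpolation and expects the data under the name data.

```javascript
// Step 1 (possibly at build time): turn the template string into
// JavaScript source text.
function compileToSource(templateString) {
    const pieces = templateString.split(/<%=([\s\S]+?)%>/);
    const expression = pieces.map(function(piece, index) {
        // Odd indices hold the captured expressions,
        // even indices hold literal text.
        return index % 2 ? '(' + piece.trim() + ')' : JSON.stringify(piece);
    }).join(' + ');
    return 'function(data) { return ' + expression + '; }';
}

const source = compileToSource('Hello, <%= data.name %>!');
// This string could be written to a JavaScript file and shipped.
console.log(source);

// Step 2 (possibly in another runtime): revive the stored source
// and call the resulting function with the data.
const render = new Function('return ' + source)();
console.log(render({ name: 'world' })); // Hello, world!
```

Anything the template expression refers to, such as data here, has to exist in the environment where step 2 runs. This is why the runtime of step 2 matters.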

It is your responsibility to ensure that whatever environment step 2 executes in is compatible with your template functions. If that runtime uses modular Underscore, or a customized monolithic Underscore, there are two main things you need to be aware of.

I recommend always using _ as a namespace handle inside template code: call _.each rather than just each. This ensures that the module in which the template function is saved only needs to import _ in order to work. This, however, requires that all functions used in templates are mixed into _. Your safest option, especially if you do not control the contents of the template strings, is to use the standard monolithic interface.

How do I enable chaining?

Just mix the functions you want to chain into _. Make sure to also include the _.chain function itself:

import mixin from 'underscore/modules/mixin.js';
import chain from 'underscore/modules/chain.js';
import map from 'underscore/modules/map.js';
import filter from 'underscore/modules/filter.js';

// mixin modifies the _ object as a side effect.
mixin({
    chain: chain,
    map: map,
    filter: filter,
    // Let's also add some aliases.
    project: map,
    select: filter
});

chain([1, 2, 3])
    .map(x => x * x * x)
    .filter(x => x > 5)
    .value();
    // [8, 27]

How do I add the Array prototype methods?

The underscore-array-methods module adds these to _ as a side effect. This module also re-exports _ for your convenience.

import _ from 'underscore/modules/underscore-array-methods.js';

_([1, 2, 3]).reverse(); // [3, 2, 1]

Note that in order to chain the Array methods, you need to mix in _.chain as well.

How do I compose my own custom Underscore?

If you want to create an interface that is similar to the standard Underscore, but with a different set of functions, you can mimic the internal structure of Underscore itself. This requires two or three modules, depending on your needs.

The first module simply collects and re-exports all the public functions that you want to include. This is also where you create aliases. Let’s call this module ./index.js.

// Any Underscore functions you want to reuse unmodified.
export { default as map } from 'underscore/modules/map.js';
export { default as filter,
         default as select } from 'underscore/modules/filter.js';
// This one is required if you want chaining.
export { default as chain } from 'underscore/modules/chain.js';

// Any functions you want to add.
export { default as foo } from './foo.js';
export { default as bar } from './bar.js';
export { default as baz } from './baz.js';

The second module takes all public functions from ./index.js and mixes them into _. It also exports _ as the default. Let’s call this ./index-default.js.

import * as allExports from './index.js';
import mixin from 'underscore/modules/mixin.js';
// Add the next one if you want the Array prototype methods.
import 'underscore/modules/underscore-array-methods.js';

// mixin returns _, so we don't need to import _ explicitly.
export default mixin(allExports);

Essentially, your own custom monolithic interface is now done. You can use it internally and you can create a nice UMD bundle from it with Rollup. If you also want to expose the entire interface from a single entry module in an ESM-friendly way, however, there is one more thing you need to do. Let’s call the next, final module ./index-all.js.

// _ is still the default export.
export { default } from './index-default.js';
// Also export each function individually by name.
export * from './index.js';

This final module is what we create the monolithic ESM bundle from in Underscore.
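For the UMD bundle mentioned earlier, a Rollup configuration could look roughly like the sketch below. The output file name is hypothetical, and I am assuming the @rollup/plugin-node-resolve plugin so that Rollup can find the Underscore modules inside node_modules; adjust both to your own setup.

```javascript
// rollup.config.js (sketch)
import { nodeResolve } from '@rollup/plugin-node-resolve';

export default {
    // The module that mixes everything into _ and exports it.
    input: 'index-default.js',
    // Lets Rollup resolve 'underscore/modules/...' from node_modules.
    plugins: [nodeResolve()],
    output: {
        // Hypothetical name for the resulting bundle.
        file: 'custom-underscore.js',
        format: 'umd',
        // Name of the browser global when no module loader is present.
        name: '_',
    },
};
```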

Compatibility

Will Underscore still be the tiny library I love?

Yes! Modularization added 110 bytes to the minified and gzipped UMD bundle. However, it is still an explicit goal of Underscore to have a small footprint. This will never change. While there is a general tendency to grow in size due to new functionality, like with any library, we try hard to keep this within limits. We are very critical about adding new functions, and we always look out for ways to make the library smaller again.

What about ECMAScript 3 and ExtendScript?

Don’t worry, these still work! While the native import and export statements were introduced in ECMAScript 6, we only use them in the source code. They disappear when we create the UMD, AMD and CommonJS variants. The rest of the code is still ES3 compatible. You can use the UMD, AMD or CommonJS variant directly, no transpilation required.

What about .mjs?

Node.js has experimental ESM support. To use it, you either have to set the type field of the package.json to "module", or use the .mjs extension for modules that use the ESM notation. In Underscore, we are currently not using these facilities.

Frankly, the experimental Node.js convention is a pain to adopt, especially if you want to support both CommonJS and ESM, and even more so if you are maintaining a sophisticated library like Underscore and you don’t want to break backwards compatibility. I’m not even sure it can be done at all. I have opted out of this convention for the time being, relying instead on the esm package, the package.json module field, and build tools like Rollup in order to provide ESM support.

When the Node.js convention moves past the experimental stage and gains widespread adoption, we will reconsider it. Hopefully, the Node.js team will find a way to make it more manageable in the meantime.

What about WebPack and Parcel?

At the time of writing, both WebPack and Parcel are suffering from bugs (WebPack, Parcel) that break non-ESM imports from the package entry point in some cases. If you do this:

var _ = require('underscore');

and you tell WebPack or Parcel to create a single bundle that includes both Underscore and your own application code, then you are likely to see errors like these:

_ is not a function

In short, this is caused by the fact that these tools indiscriminately prioritize the module field over main, even when resolving a CommonJS require, which the module field was never meant for.

To work around this issue, you have a number of options:

What about Underscore 2.0?

(Yes, there will be a version 2.0 of Underscore in the future!)

While there will obviously be some breaking changes, I expect that the overall architecture will remain the same in the next major version of Underscore. We will likely be providing the same build variants. You will still be able to compose your own custom Underscore in the same way. The present modular design was made with Underscore 2.0 already in mind.

At this time, I can think of two things that are likely to change:

Wrapping up

What other plans do you have for Underscore?

Glad you asked! This is my current wish list:

How can I learn more and stay informed?

How can I help?

You are already doing enough by using Underscore. Thank you! That said, if you want to do more, you can.

Firstly, you can help spread the word online. Post about modular Underscore on Twitter and other social media, hang out on Gitter, tell your friends and colleagues.

Secondly, if you feel up to it, you can help with the work. Answer questions on Stack Overflow, review new pull requests, or maybe even contribute your own code.

Finally, if you have a little money to spare, you can support me personally on Patreon. Donations help me to dedicate more time to open source contributions. In the near term, that means I’ll be able to dedicate more time to Underscore and associated libraries.

Acknowledgements

Cameron Beccario, Jeremy Ashkenas and Daniel Shamany gave excellent feedback on the drafts for this article. I would also like to thank Inge Hoogendam, Nadine Gonggrijp, Baloe Gonggrijp, Arie de Bruin and Diedel Kornet for their endless encouragement, and all of my patrons for their continuing support. All of you rock.