Archive for the ‘JavaScript’ Category

YAJSML

Thursday, April 21st, 2011

Ugh, I’m pretty tired of the endless parade of “Oh hai, iz wrtn a JS loaderz” projects. Given the number of existing implementations and the general solved-ness of the problem, the time devoted to it is disappointing. But here I am, doing just the same.

Principles

This work is related to that done in RequireJS and CommonJS, but hardly bound by them. Instead, the results are a product of the following principles:

  • Improving the loading characteristics of a JavaScript project should be approached as an incremental optimization problem.
  • Simplicity is best. Only implement the necessary features.
  • Caching should be exploited. Expensive one-time operations are acceptable provided their responses are reusable.
  • Modules are well defined, widely used, and well founded. The five different asynchronous loading specifications, not so much.

This leads to what I believe is a much simpler working implementation than the existing ones. The other nice thing is that the packaging tool takes an original approach to resolving dependencies.

Observations

Optimization

Naïve implementations are good. Those implementations may be slow, but they are also cheap and set the stage for proper optimization. Chances are that many possible optimizations are rendered unnecessary by the right tools.

Dependencies

Most current implementations use one of five ways to wrap a module’s code with a description of the dependencies that the code requires. A library fetches those dependencies asynchronously and evaluates the module’s code once all are loaded. The thought is that, once a module is received, all of its dependencies must be loaded first (asynchronously so they are non-blocking, natch).

IMHO, that obscures the more obvious and important observation, that having dependencies that aren’t loaded by the time the current module is loaded, asynchronous or not, is never good. If this is to be treated as an optimization problem, then the issue is one of packaging. If the packaging works well, the question of synchronous/asynchronous loading is moot.

Packaging

Existing packagers all perform some sort of parsing on source files, usually a regular expression, maybe a full preprocessor language. Both approaches have the downsides of requiring boilerplate code or being unreliable. There is also the additional complication of describing lazy dependencies so that they do not get confused with loading dependencies.

The good news is that, given the availability of non-browser interpreters, there is a third way: the code itself can be evaluated offline and its dependencies extracted at run time. This extracts exactly those dependencies needed at load time and requires no boilerplate. Because the same kernel is used in both environments, it also keeps the client’s and the packager’s results consistent.

Versioning

The current practice for deploying JavaScript is to set a query parameter like bust=v_n+1 on the URL of the script’s location to, in effect, invalidate the cache. This happens to work in the monolithic file case. However, lazily loaded code makes versioning a problem that cannot be ignored. While new clients will use v_n+1, clients using v_n code must continue to receive v_n code. For this reason, versioning should be reflected in the base URI.

http://assets1.example.com/js/src/n/
http://assets1.example.com/js/src/n+1/
http://assets1.example.com/js/src/n+2/
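To illustrate why the version belongs in the base URI, here is a small sketch in which each client is bound to the release it started with. The createModuleLocator helper and the version numbers 41 and 42 are made up for the example.

```javascript
// Hypothetical helper: the version is part of the base URI, so a running
// client keeps fetching modules from the release it started with.
function createModuleLocator(baseURI, version) {
  var root = baseURI.replace(/\/+$/, '') + '/' + version;
  return function (modulePath) {
    return root + '/' + modulePath.replace(/^\/+/, '');
  };
}

var oldClient = createModuleLocator('http://assets1.example.com/js/src/', '41');
var newClient = createModuleLocator('http://assets1.example.com/js/src/', '42');

console.log(oldClient('/models/user.js'));
// http://assets1.example.com/js/src/41/models/user.js
console.log(newClient('/models/user.js'));
// http://assets1.example.com/js/src/42/models/user.js
```

A client that lazily loads a module an hour after page load still asks for the 41 tree, so it never mixes code from two releases.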

Caching

It’s a common observation that different areas of a project change at different rates. This is certainly the case in web applications, where library code will change much more slowly than application code. It follows that updates to application code should have no effect on still-cacheable library code.

Convention already specifies this using a leading slash for '/application/code' and none for 'library/code'. This is simple to exploit by allowing different URIs for the two classes of code.

http://assets1.example.com/js/lib/0.1.2/
http://assets1.example.com/js/src/0.3.0/
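The leading-slash convention makes the split cheap to implement. The following is only a sketch; resolveURI is a hypothetical helper, not part of the kernel’s API.

```javascript
// Hypothetical resolver: paths with a leading slash are application code,
// everything else is library code served from its own (slower-moving) URI.
function resolveURI(path, rootURI, libraryURI) {
  if (path.charAt(0) === '/') {
    return rootURI.replace(/\/+$/, '') + path;
  }
  return libraryURI.replace(/\/+$/, '') + '/' + path;
}

var rootURI = 'http://assets1.example.com/js/src/0.3.0/';
var libraryURI = 'http://assets1.example.com/js/lib/0.1.2/';

console.log(resolveURI('/app.js', rootURI, libraryURI));
// http://assets1.example.com/js/src/0.3.0/app.js
console.log(resolveURI('util/index.js', rootURI, libraryURI));
// http://assets1.example.com/js/lib/0.1.2/util/index.js
```

Bumping the application to a new release changes only rootURI, so everything already cached under the library URI stays valid.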

Implementation

The tool that implements this loader provides two things: a kernel and a module compiler. For the moment it is on an experimental branch of the Modulizer project, though I’m beginning to like “Yajsml” more and more.

The kernel is a terse bit of JS that provides the module loading and fetching functionality. It has no references to the global environment and, by default, exports to the require symbol. It’s the part that enables a simple page like this:

<script type="text/javascript" src="/kernel.js"></script>
<script type="text/javascript">
  require.setRootURI('/js/src/');
  require.setLibraryURI('/js/lib/');
  require.setGlobalKeyPath('require');
  app = new (require('/app').Application)({
    "userId": 1234
  , "baseURI": "http://example.com/"
  });
</script>

The module compiler takes a number of paths and compiles them into a require.define() call. Using the command line tool:

../modulize --output code.js --root-path ./src --library-path ./lib --import-dependencies -- ./src/app.js

Produces the following package:
require.define({
  "/app": null
, "/app.js": function (require, exports, module) {
    var models = require('/models');
    var util = require('util');
  }
, "/models": null
, "/models.js": null
, "/models/group": null
, "/models/group.js": function (require, exports, module) {
    exports.Group = function () {
      /.../
    };
  }
, "/models/index": null
, "/models/index.js": function (require, exports, module) {
    exports.User = require('./user').User;
    exports.Group = require('./group').Group;
  }
, "/models/user": null
, "/models/user.js": function (require, exports, module) {
    exports.User = function () {
      /.../
    };
  }
, "util": null
, "util.js": null
, "util/index": null
, "util/index.js": function (require, exports, module) {
    exports.escapeHTML = function () {};
    exports.escapeHTMLAttribute = function () {};
    exports.importantURL = 'http://example.com/';
  }
});

Future

This is the first iteration in a longer project with several more big ideas to adopt, but, IMHO, aside from one or two missing features, this is a pretty comprehensive solution for the problem of distributing code from the client’s perspective. The remaining improvements revolve around improving the optimization of packaging and using the cache more effectively.

Regarding effective use of the cache, having module requests get redirected to designated/canonical packages has lots of potential to increase cache hits when loading order varies – such as across pages. As far as finding an optimal packaging goes, it’s the kind of problem that sounds like the perfect job for some sort of nondeterministic heuristic-ish algorithm.

And finally, while the two buckets, libraryURI and rootURI, are probably sufficient for most projects, the thought of allowing for multiple library paths is appealing. Searching would of course be made more expensive for some modules, but I suspect that ordering the search paths by increasing frequency of updates may allow caching to compensate for this.

Updates

It’s since become clear that parts of this discussion are reasonably independent of each other, so they’ve been broken into their own projects:

  • require-kernel: A minimalist implementation of require that supports asynchronous retrieval.
  • yajsml: An asset server that performs packaging and clever things like redirecting to a canonical resource.
  • modulizer: A tool that finds dependencies at runtime.

Finding JavaScript’s Global Object

Sunday, April 10th, 2011

With JavaScript code being written in ever more diverse environments these days, some assumptions are bound to get broken. One such assumption is that the object bound to the symbol window in the current scope is the global object. Every approach I’ve seen searches through a list of probable symbols and returns the first defined, instead of using the language itself.

var global = (typeof window != 'undefined' ? window : global)

Below is a snippet that will return the global object independent of scope and interpreter, provided the function is not in strict mode (where this would be undefined for a plain call).

var global = (function () {return this})();

Note: except in the rarest of cases, directly addressing the global object is illegitimate regardless of approach; using this more robust snippet is no excuse.
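One caveat worth a sketch: ECMAScript 5 strict mode leaves this undefined for a plain function call, so the snippet above only works from non-strict code. The variant below, using the Function constructor, is a commonly used workaround rather than anything from the original snippet; the variable names are only for illustration.

```javascript
// In a strict-mode function, a plain call leaves `this` undefined.
var strictThis = (function () { 'use strict'; return this; })();
console.log(strictThis); // undefined

// Code created by the Function constructor is non-strict unless it opts in,
// so this form still returns the global object when called from strict code.
var viaConstructor = Function('return this')();
console.log(typeof viaConstructor); // object
```

The Function constructor is itself blocked on pages whose Content Security Policy forbids evaluating strings as code, so this is a fallback rather than a universal answer.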