Javascript files don’t auto-update

On a panel of 4 Javascript library developers at Ajax Experience 2008, a question came up about how their libraries use browser detection. When John Resig suggested that libraries should strive for full feature detection (hardly used at all at the time) instead of browser/object detection, the other developers reacted like he was crazy. They mentioned the cases where this just isn’t possible, but none of the developers mentioned the huge, very good reason to do this whenever possible: There are pages using libraries deployed everyday that never get maintained. Yes, when a new browser changes behavior the libraries can quickly update their codebase, but many pages will never get those updates.

Note, there are autoloaders that may maintain the latest library version, but this doesn’t guarantee a stable library API over time. Feature detection (mostly) does.

The Quickening of Facebook

If you’ve used Facebook in Opera and Firefox, you might have noticed that Facebook is several magnitudes faster in FF, but this has nothing to do with FF’s speed. For FF and IE users, Facebook uses a client-side architecture called “Quickening” that basically makes a few popular pages into full AJAX applications that stay loaded in the browser for a long time. All transitions between “quickened” pages are done through AJAX calls and a cache system makes sure all pages displayed from cache are updated based on changes from the server (e.g. comments others made, ad rotation) or client (e.g. comments you made).

While other sites have certainly done this before, the complexity of Facebook’s apps and level of optimization performed is staggering. The system continuously self-monitors page performance and usage of resources and re-optimizes resources like JS/CSS/sprite images to send and receive as few bytes as possible.

Video presentation goodness: Velocity 09: David Wei and Changhao Jiang, “Frontend Performance Engineering in Facebook”

Where’s the code?

Google’s free open source project hosting has been awesome for Minify, so when I was looking around for Subversion hosting for my personal code, I figured why not host it there? So here’s a bunch of my PHP and Javascript code. Hopefully some of it will be useful to people. A few PHP highlights:

  • HashUtils implements password hashing with a random salt, preventing the use of rainbow table cracks. It can also sign and verify the signature of string content.
  • StringDebug makes debugging strings with whitespace/non-printable/UTF-8 characters much less painful.
  • CookieStorage saves/fetches tamper-proof and optionally encrypted strings in cookies.
  • TimeZone simplifies the handling of date/times between timezones using an API you already know: strtotime() and date().
  • Utf8String is an immutable UTF-8 string class that aims to simplify the API of the phputf8 library and make behind-the-scene optimizations like using native functions whenever possible. Mostly a proof-of-concept, but it works.

I’ll get the examples of these online at some point, but if you export /trunk/php, all the lowercase-named files are demos/tests. There’s also an “oldies” branch (not necessarily goodies).

Hopefully this makes the political jibba jabba more forgivable.

Javascript humor

hasthelargehadroncolliderdestroyedtheworldyet.com has a news feed. It also has this in the source:

if (!(typeof worldHasEnded == "undefined")) {
document.write("YUP.");
} else {
document.write("NOPE.");
}

Folks without Javascript get a more definite answer:

<noscript>NOPE.</noscript>

Also appreciated:

<!-- if the lhc actually destroys the earth & this page isn't
yet updated please email mike@frantic.org to receive a full
refund -->

Case of the NS_ERROR_DOM_SECURITY_ERR

Working on a bookmarklet, I ran across “security errors” in Firefox and Opera (may happen in others, I didn’t check). In Firefox the code threw “Security error (NS_ERROR_DOM_SECURITY_ERR)” and in Opera it was something similarly vague.

The culprit code was trying to access the cssRules property of a style sheet from a different domain (my CSS is on a subdomain). It appears browsers apply the same origin policy to the DOM CSSStyleSheet interface. Effectively, you can access the href property of a different domain styleSheet object, but not its cssRules. In Firebug, the cssRules property will be null, but attempting to even read it in user script causes the exception.

If your script is throwing NS_ERROR_DOM_SECURITY_ERR, check for code trying to access objects on a different domain.

Minifying Javascript and CSS on mrclay.org

Update: Please read the new version of this article. It covers Minify 2.1, which is much easier to use.

Minify v2 is coming along, but it’s time to start getting some real-world testing, so last night I started serving this site’s Javascript and CSS (at least the 6 files in my WordPress templates) via a recent Minify snapshot.

As you can see below, I was serving 67K over 6 requests and was using some packed Javascript, which has a client-side decompression overhead.

fiddler1.png

Using Minify, this is down to 2 requests, 28K (58% reduction), and I’m no longer using any packed Javascript:

fiddler2.png

Getting it working

  1. Exported Minify from svn (only the /lib tree is really needed).
  2. Placed the contents of /lib in my PHP include path.
  3. Determined where I wanted to store cache files (server-side caching is a must.)
  4. Gathered a list of the JS/CSS files I wanted to serve.
  5. Created “min.php” in the doc root:
    // load Minify
    require_once 'Minify.php';
    
    // setup caching
    Minify::useServerCache(realpath("{$_SERVER['DOCUMENT_ROOT']}/../tmp"));
    
    // controller options
    $options = array(
    	'groups' => array(
    		'js' => array(
    			'//wp-content/chili/jquery-1.2.3.min.js'
    			,'//wp-content/chili/chili-1.8b.js'
    			,'//wp-content/chili/recipes.js'
    			,'//js/email.js'
    		)
    		,'css' => array(
    			'//wp-content/chili/recipes.css'
    			,'//wp-content/themes/orangesky/style.css'
    		)
    	)
    );
    
    // serve it!
    Minify::serve('Groups', $options);

    (note: The double solidi at the beginning of the filenames are shortcuts for $_SERVER['DOCUMENT_ROOT'].)

  6. In HTML, replaced the 4 script elements with one:
    <script type="text/javascript" src="/min/js"></script>

    (note: Why not “min.php/js”? Since I use MultiViews, I can request min.php by just “min”.)

  7. and replaced the 2 stylesheet links with one:
    <link rel="stylesheet" href="/min/css" type="text/css" media="screen" />

At this point Minify was doing its job, but there was a big problem: My theme’s CSS uses relative URIs to reference images. Thankfully Minify’s CSS minifier can rewrite these, but I needed to specify that option just for style.css.

I did that by giving a Minify_Source object in place of the filename:

// load Minify_Source
require_once 'Minify/Source.php';

// new controller options
$options = array(
	'groups' => array(
		'js' => array(
			'//wp-content/chili/jquery-1.2.3.min.js'
			,'//wp-content/chili/chili-1.8b.js'
			,'//wp-content/chili/recipes.js'
			,'//js/email.js'
		)
		,'css' => array(
			'//wp-content/chili/recipes.css'

			// style.css has some relative URIs we'll need to fix since
			// it will be served from a different URL
			,new Minify_Source(array(
				'filepath' => '//wp-content/themes/orangesky/style.css'
				,'minifyOptions' => array(
					'prependRelativePath' => '../wp-content/themes/orangesky/'
				)
			))
		)
	)
);

Now, during the minification of style.css, Minify prepends all relative URIs with ../wp-content/themes/orangesky/, which fixes all the image links.

What’s next

This is fine for now, but there’s one more step we can do: send far off Expires headers with our JS/CSS. This is tricky because whenever a change is made to a source file, the URL used to call it must change in order to force the browser to download the new version. As of this morning, Minify has an elegant way to handle this, but I’ll tackle this in a later post.

Update: Please read the new version of this article. It covers Minify 2.1, which is much easier to use.

Thoughts on a Javascript “validator”

(X)HTML and CSS have their own validators, but we need one for Javascript as well. JSLint could be part of it, but what I’m imagining would flag the use of “native” objects/interfaces not defined by W3C or ECMA standards. E.g., document.layers or window.ActiveXObject.

The hard part in doing this is finding a full Javascript interpreter with W3C DOM interfaces, but without the proprietary features that browsers have to support to remain usable on the web. A couple ways come to mind:

1. Rhino

John Resig has just implemented “window” for the Javascript interpreter Rhino, which, supposedly, is already based strictly on the ECMAscript standards. The short term goal of his script was to give Rhino enough of a browser interface to pass jQuery’s test suite, but the code could be branched to implement all current W3C DOM recommendations (and none of the proprietary interfaces). A Java app would set up Rhino, run John’s script, then run your script and simply report any errors.

2. A stricter browser

Most modern browsers are pretty malleable as far as letting user code clobber native functions/objects; in fact it’s a real problem. But this is convenient if you want to, say, overwrite all the proprietary features with your own code! With a whitelist in hand, you could use for-in loops to sniff out non-standard functions/properties and either delete or overwrite them with functions of your choice.

The advantage of the browser solution is that you could still provide the non-standard features (so the script would continue to run) while logging them. Using closures you could, in theory, “wrap” each non-standard function so that each of its calls would also call your logging function before returning the expected value. A crude example:

// may fail spectacularly
var _parseInt = window.parseInt;
window.parseInt = function (str) {
    alert('parseInt was called.');
    return _parseInt(str);
};
parseInt('2');

In practice, you’ll be at the mercy of the browser to allow native functions to be stored in variables like this. Sometimes it works, sometimes not. As for properties, you may be able to similarly log their usage with Javascript’s relatively new getters and setters.

So any idea where to get such a whitelist?

Hacking a 3rd party script for bookmarklet fun

A few weeks ago I created a simple bookmarklet that loads del.icio.us’s PlayTagger script into the current page. This post covers how some problems with this script were worked through.

Too late

The first challenge was that PlayTagger was designed to initialize itself (let’s call this method “init“) on window.onload: If a user fired the bookmarklet after window.onload (99% of the time), playtagger.js would load but init would’ve missed its chance to be called. This means I had to call init manually, but since script elements load asynchronously, I had to wait until init actually existed in the global scope to call it. This was fairly easily accomplished by attaching my code to the new script element’s “load” event (and using some proprietary “readyState” junk for IE).

Too early

If the page takes a long time to load, it’s possible the user will fire the bookmarklet before window.onload. One of two things will occur:

If it’s fired before the DOM is even “ready”, the bookmarklet throws an error when it tries to append the script element. I could use one of the standard “DOMready” routines to run the bookmarklet code a little later, but this case is rare enough to be not worth the effort to support; by the time the user can see there are mp3s on the page, the DOM is usually ready.

Assuming the DOM is ready, playtagger.js gets loaded via a new script element, the bookmarklet fires init, but then, thanks to playtagger’s built-in event attachment, init is called a second time on window.onload, producing a second “play” button per mp3 link. Harmless, but not good enough.

Preventing the 2nd init call

It would be nice if you could sniff whether or not window.onload has fired, but this doesn’t seem to be possible. Maybe via IE junk. Any ideas for a standards based way to tell?

My only hope seemed to be to somehow disable init after manually calling it. The first try was to just redefine init to a null function after calling it:

init();
init = function () {};

I figured out that redefining init would not help here due to the way it’s attached to window.onload:

// simplified
var addLoadEvent = function(f) {
    var old = window.onload;
    window.onload = function() {
        if (old) { old(); }
        f();
    };
};
addLoadEvent(init);

What’s important to notice here is that init is passed to addLoadEvent as f and window.onload is redefined as a new function, capturing f in the closure. So now f holds init‘s original code (because functions are first-class in Javascript), and f, not the global init, is what is really executed at window.onload. As f is private (hidden by the closure), I can’t overwrite it.

Disabling init from the inside by “breaking” Javascript

The second thing I tried was to break init‘s code from the inside. The first thing init does is loop over the NodeList returned by document.getElementsByTagName('a'), so if I could get that function to return an empty array, that would kill init‘s functionality. Because Javascript is brilliantly flexible I can do just that:

// cache for safe keeping
document.gebtn_ = document.getElementsByTagName;
// "break" the native function
document.getElementsByTagName = function(tag) {
    if (tag != 'a') return document.gebtn_(a);

    // called with 'a' (probably from init)
    // "repair" this function for future use
    document.getElementsByTagName = document.gebtn_;
    // return init-busting empty array
    return [];
};

Simplest solution

While the code above works pretty well, I thought of a simpler, more elegant solution: just rewrite window.onload to what it was before playtagger.js was loaded.

And with that here is the final unpacked bookmarklet code:

javascript:(function () {
    if (window.Delicious && (Delicious.Mp3 || window.Mp3))
        return;
    var d = document
        ,s = d.createElement('script')
        ,wo = window.onload
        ,go = function () {
            Delicious.Mp3.go();
            window.onload = wo || null;
        }
    ;
    s.src = 'http://images.del.icio.us/static/js/playtagger.js';
    if (null === s.onreadystatechange) 
        s.onreadystatechange = function () {
            if (s.readyState == 'complete')
                go();
        };
    else 
        s.onload = go;
    d.body.appendChild(s);
})();

Kill these DOM0 shortcuts

A problem a decade in the making

You can refer to a form’s elements in your code by using the element’s name (from the NAME attribute)
— an ancient Javascript spec.

This means myForm.myElement is a shortcut for myForm.elements['myElement']. I’m sure this was seen as handy and harmless at the time, but the problem is that references to form elements, simply by using valid name attributes, can overwrite important DOM properties and methods on the form element. Make a text input with name=”submit” and you overwrite the form’s native submit method; no more form submission!

Workaround

Effectively, you have to avoid using names that already exist as properties or methods of the form DOM element, from modern specs back to the stone age. To get an idea of how many there are, I created an empty form element with no attributes, fired up the Javascript shell bookmarklet and used its handy props function:

Methods: addEventListener, addRepetitionBlock, addRepetitionBlockByIndex, appendChild, attachEvent, blur, checkValidity, cloneNode, contains, detachEvent, dispatchEvent, dispatchFormChange, dispatchFormInput, focus, getAttribute, getAttributeNS, getAttributeNode, getAttributeNodeNS, getElementsByTagName, getElementsByTagNameNS, getFeature, hasAttribute, hasAttributeNS, hasAttributes, hasChildNodes, insertAdjacentElement, insertAdjacentHTML, insertAdjacentText, insertBefore, isDefaultNamespace, isSupported, item, lookupNamespaceURI, lookupPrefix, moveRepetitionBlock, namedItem, normalize, removeAttribute, removeAttributeNS, removeAttributeNode, removeChild, removeEventListener, removeNode, removeRepetitionBlock, replaceChild, reset, resetFromData, scrollIntoView, selectNodes, selectSingleNode, setAttribute, setAttributeNS, setAttributeNode, setAttributeNodeNS, submit, toString

Fields: accept, acceptCharset, action, all, attributes, childNodes, children, className, clientHeight, clientLeft, clientTop, clientWidth, contentEditable, currentStyle, data, dir, document, elements, encoding, enctype, firstChild, id, innerHTML, innerText, isContentEditable, lang, lastChild, length, localName, method, name, namespaceURI, nextSibling, nodeName, nodeType, nodeValue, offsetHeight, offsetLeft, offsetParent, offsetTop, offsetWidth, onblur, onclick, ondblclick, onfocus, onkeydown, onkeypress, onkeyup, onload, onmousedown, onmousemove, onmouseout, onmouseover, onmouseup, onunload, outerHTML, outerText, ownerDocument, parentElement, parentNode, prefix, previousSibling, repeatMax, repeatMin, repeatStart, repetitionBlocks, repetitionIndex, repetitionTemplate, repetitionType, replace, scrollHeight, scrollLeft, scrollTop, scrollWidth, sourceIndex, style, tagName, target, templateElements, text, textContent, title, unselectable

These are just in Opera; to be really safe you need to add in any extras from all the other modern browsers.

But there’s more

This problem also exists on the document object: Give your form name="getElementById", then try using document.getElementById(). Even worse, IE/win registers all element ids onto the global (window) object.

The real solution

The W3C/WhatWG needs to step up and formally denounce these shortcuts in the HTML DOM specs, and, IMO, also generate warnings in the HTML validator where input names match those in W3C DOM specs, at the very least. Since there’s a sea of legacy code out there using these shortcuts, browser makers desperately need a defined way to handle these collisions.