Hacking a 3rd party script for bookmarklet fun

A few weeks ago I created a simple bookmarklet that loads del.icio.us’s PlayTagger script into the current page. This post covers how some problems with this script were worked through.

Too late

The first challenge was that PlayTagger was designed to initialize itself (let’s call this method “init“) on window.onload: If a user fired the bookmarklet after window.onload (99% of the time), playtagger.js would load but init would’ve missed its chance to be called. This means I had to call init manually, but since script elements load asynchronously, I had to wait until init actually existed in the global scope to call it. This was fairly easily accomplished by attaching my code to the new script element’s “load” event (and using some proprietary “readyState” junk for IE).

Too early

If the page takes a long time to load, it’s possible the user will fire the bookmarklet before window.onload. One of two things will occur:

If it’s fired before the DOM is even “ready”, the bookmarklet throws an error when it tries to append the script element. I could use one of the standard “DOMready” routines to run the bookmarklet code a little later, but this case is rare enough to be not worth the effort to support; by the time the user can see there are mp3s on the page, the DOM is usually ready.

Assuming the DOM is ready, playtagger.js gets loaded via a new script element, the bookmarklet fires init, but then, thanks to playtagger’s built-in event attachment, init is called a second time on window.onload, producing a second “play” button per mp3 link. Harmless, but not good enough.

Preventing the 2nd init call

It would be nice if you could sniff whether or not window.onload has fired, but this doesn’t seem to be possible. Maybe via IE junk. Any ideas for a standards based way to tell?

My only hope seemed to be to somehow disable init after manually calling it. The first try was to just redefine init to a null function after calling it:

init();
init = function () {};

I figured out that redefining init would not help here due to the way it’s attached to window.onload:

// simplified
var addLoadEvent = function(f) {
    var old = window.onload;
    window.onload = function() {
        if (old) { old(); }
        f();
    };
};
addLoadEvent(init);

What’s important to notice here is that init is passed to addLoadEvent as f and window.onload is redefined as a new function, capturing f in the closure. So now f holds init‘s original code (because functions are first-class in Javascript), and f, not the global init, is what is really executed at window.onload. As f is private (hidden by the closure), I can’t overwrite it.

Disabling init from the inside by “breaking” Javascript

The second thing I tried was to break init‘s code from the inside. The first thing init does is loop over the NodeList returned by document.getElementsByTagName('a'), so if I could get that function to return an empty array, that would kill init‘s functionality. Because Javascript is brilliantly flexible I can do just that:

// cache for safe keeping
document.gebtn_ = document.getElementsByTagName;
// "break" the native function
document.getElementsByTagName = function(tag) {
    if (tag != 'a') return document.gebtn_(a);

    // called with 'a' (probably from init)
    // "repair" this function for future use
    document.getElementsByTagName = document.gebtn_;
    // return init-busting empty array
    return [];
};

Simplest solution

While the code above works pretty well, I thought of a simpler, more elegant solution: just rewrite window.onload to what it was before playtagger.js was loaded.

And with that here is the final unpacked bookmarklet code:

javascript:(function () {
    if (window.Delicious && (Delicious.Mp3 || window.Mp3))
        return;
    var d = document
        ,s = d.createElement('script')
        ,wo = window.onload
        ,go = function () {
            Delicious.Mp3.go();
            window.onload = wo || null;
        }
    ;
    s.src = 'http://images.del.icio.us/static/js/playtagger.js';
    if (null === s.onreadystatechange) 
        s.onreadystatechange = function () {
            if (s.readyState == 'complete')
                go();
        };
    else 
        s.onload = go;
    d.body.appendChild(s);
})();

Kill these DOM0 shortcuts

A problem a decade in the making

You can refer to a form’s elements in your code by using the element’s name (from the NAME attribute)
— an ancient Javascript spec.

This means myForm.myElement is a shortcut for myForm.elements['myElement']. I’m sure this was seen as handy and harmless at the time, but the problem is that references to form elements, simply by using valid name attributes, can overwrite important DOM properties and methods on the form element. Make a text input with name=”submit” and you overwrite the form’s native submit method; no more form submission!

Workaround

Effectively, you have to avoid using names that already exist as properties or methods of the form DOM element, from modern specs back to the stone age. To get an idea of how many there are, I created an empty form element with no attributes, fired up the Javascript shell bookmarklet and used its handy props function:

Methods: addEventListener, addRepetitionBlock, addRepetitionBlockByIndex, appendChild, attachEvent, blur, checkValidity, cloneNode, contains, detachEvent, dispatchEvent, dispatchFormChange, dispatchFormInput, focus, getAttribute, getAttributeNS, getAttributeNode, getAttributeNodeNS, getElementsByTagName, getElementsByTagNameNS, getFeature, hasAttribute, hasAttributeNS, hasAttributes, hasChildNodes, insertAdjacentElement, insertAdjacentHTML, insertAdjacentText, insertBefore, isDefaultNamespace, isSupported, item, lookupNamespaceURI, lookupPrefix, moveRepetitionBlock, namedItem, normalize, removeAttribute, removeAttributeNS, removeAttributeNode, removeChild, removeEventListener, removeNode, removeRepetitionBlock, replaceChild, reset, resetFromData, scrollIntoView, selectNodes, selectSingleNode, setAttribute, setAttributeNS, setAttributeNode, setAttributeNodeNS, submit, toString

Fields: accept, acceptCharset, action, all, attributes, childNodes, children, className, clientHeight, clientLeft, clientTop, clientWidth, contentEditable, currentStyle, data, dir, document, elements, encoding, enctype, firstChild, id, innerHTML, innerText, isContentEditable, lang, lastChild, length, localName, method, name, namespaceURI, nextSibling, nodeName, nodeType, nodeValue, offsetHeight, offsetLeft, offsetParent, offsetTop, offsetWidth, onblur, onclick, ondblclick, onfocus, onkeydown, onkeypress, onkeyup, onload, onmousedown, onmousemove, onmouseout, onmouseover, onmouseup, onunload, outerHTML, outerText, ownerDocument, parentElement, parentNode, prefix, previousSibling, repeatMax, repeatMin, repeatStart, repetitionBlocks, repetitionIndex, repetitionTemplate, repetitionType, replace, scrollHeight, scrollLeft, scrollTop, scrollWidth, sourceIndex, style, tagName, target, templateElements, text, textContent, title, unselectable

These are just in Opera; to be really safe you need to add in any extras from all the other modern browsers.

But there’s more

This problem also exists on the document object: Give your form name="getElementById", then try using document.getElementById(). Even worse, IE/win registers all element ids onto the global (window) object.

The real solution

The W3C/WhatWG needs to step up and formally denounce these shortcuts in the HTML DOM specs, and, IMO, also generate warnings in the HTML validator where input names match those in W3C DOM specs, at the very least. Since there’s a sea of legacy code out there using these shortcuts, browser makers desperately need a defined way to handle these collisions.