Plugin-based Systems (and Events)

Modern application design is solved! OK, well we at least have a set of camps with their own principles, tools, and paradigms leading us toward the light. One might be “build some components, wire them up with references to each other, and let them talk to each other.” We love/hate static typing, but it allows tools to reason about really large programs.

All that is great, but I feel like plugin system design is in the wild west. To be clear, I mean dropping a new directory of code in place, performing some ritual, and the system behaving radically differently. Of course, we’re getting better at this as apps learn from each other, but I feel like our principles and tools aren’t particularly well matched to this goal of plugins being able to cooperate on a deep level to build up a system.

Allowing plugins to provide APIs introduces a big set of challenges. Just a few:

  • How do we conditionally use API’s if they’re available?
  • How can a plugin decorate/filter the API’s output?
  • How can a plugin replace the API with its own?
  • If two plugins want to replace the API, which “wins”?
  • How can a plugin remove part of the behavior/API of another plugin?
  • How much can plugins detect about each other?

Most robust plugin systems solve these with event systems and some mechanism of ordering plugins to solve disputes (all unique).

A big problem is that events are almost always run-time linking based on strings. Hence it’s very difficult for tools and humans to reason about which listeners will be called, in what order, and what data will flow to each listener and be returned to the dispatching party. Ideally IDEs could sense all this stuff, and help wire up new listeners and events. Instead, devs on a plugin-heavy system must do string searching on event names or at best use some reporting tool built into the event system.

Symfony’s typed event objects and Ruby’s symbols probably help here. Drupal has a strong convention both in code and docs that helps, and is popular enough for toolmakers to focus on it. Middleware systems as in Express (node) and Stack (PHP) offer a formal way to compose applications, which is pretty exciting, but I’m not sure they solve any of the above problems of plugins tightly collaborating on processes.

What’s the future look like here, and what’s the way toward it? Standardizing on a single event system seems unlikely, but what are the best and most powerful ones out there? What language features would make this easier? Can someone stop me from diving deep into Event-driven programming literature?

Events and Dependency Injection

The Laravel podcast folks mentioned a backlash about firing events in controllers, and I would guess those arguments were based on the notion that events can be used as a sneaky way of using global dependencies. Depending on the design of the event system, the party firing the event can access resources returned by event listeners, and this pattern can be abused like a service locator.

If you allow events to return resources and to be called within a component, you don’t really know a component’s real dependencies without reading the source code. Not at all does this imply you shouldn’t do it, but just that there are trade-offs anytime you reach out for undeclared dependencies at runtime.

What might give the best of both worlds is a way of injecting a dispatcher limited to a single event. E.g. make a class that triggers a particular event and depend on that. This would be best baked into the language similar to generics, so reflection could determine the event(s) needed without a bunch of boilerplate single-event objects.

Pass all the Breakpoints

When modifying a system without adequate tests, I found it helps to include debugger breakpoints in my manual testing checklist:

For each change I make, I add a breakpoint after the modified line, and won’t remove it until the interpreter has passed through it exercising the logic to my satisfaction. Before committing the changes I can pull up the list of breakpoints to make sure I’ve hit them all.

This is probably most valuable when I’ve made a lot of changes at once, or if temporary delusion is making me think I don’t need to test everything.

Caveat 1: I suppose, if there’s any parallel operations, you should also test everything with the breakpoints gone, lest the code relies on losing a race condition.

Caveat 2: In no way is this a substitute for automated tests!

Bug Fixes: the Hidden Value of JS Libraries

Paul Irish points out a huge value added from jQuery that few people (even the jQuery home page) seem to acknowledge, which is that it transparently works around dozens of browser bugs. If you consider older versions of jQuery this is probably more like hundreds of bugs, and many of them have been big showstoppers that even crashed browsers.

Fans of the “vanilla.js” movement—an understandable reaction to mindless library use—tend to understate what a minefield browser APIs tend to be when you consider the wide variety of browsers in use (take a sobering dip into your site’s browser stats). There’s nothing wrong with using and learning the real APIs, of course. There are big sections of DOM, Events, and friends that are very well supported, but if you’re not extremely careful, as soon as your site picks up traffic you can count on “vanilla” code breaking for users with browsers that don’t auto-update.

The moral: Even if you’re a JS pro and know the native APIs, if you’re doing anything substantial with them, then using jQuery—or some other battle-tested library—isn’t so mindless; you’re improving UX for some end users.

Why doesn’t someone create a library that just fixes the bugs? Dean Edwards was an early experimenter here with base2 and IE7.js, but as far as I remember these were mostly combed over by the JS nerds who were working on their own libs. Combining workarounds with other useful bits just gave you more bang for the buck, and, frankly, the standard APIs like DOM have never been particularly elegant to work with anyway (it took the W3C way too long to even acknowledge that giving devs the ability to parse markup (innerHTML) was something browsers might want to offer!).

Frameworks and Developer Happiness

Jake Archibald tweeted this comic expressing (I’m interpreting here):

  • There’s a difference between “using libraries” and “using frameworks”
  • Even if you don’t understand the components themselves, when using libraries:
    • there are fewer components in the system
    • the program flow through the components is clearer

I believe this is completely accurate, in that a lot of developers feel this way about frameworks, but I don’t think it’s due to a big difference between using libraries vs. a framework. I think it’s rather a matter what kind of environments make developers happy.

Devs prefer a higher familiarity with the codebase

If you write an application “using libraries” it will always feel more comfortable. It will be crafted around your biases (your preferred configuration format and form, file layout, DI and other libraries, etc.) and it will be simple enough to meet just the use cases you foresee at the moment. Over time added features will force you to make more decisions about new components and refactoring. But no doubt you will have written a framework. Did you make objectively better decisions than those working on Framework (a public project that calls itself a “framework”), who maybe were also building on top of libraries? Maybe, maybe not, but you’ll probably feel better about those decisions, and when you look at the more complex code, you’ll remember why that code was needed and forgive the complexity.

But a new developer on this project won’t have the same biases, she’ll be overwhelmed by those complexities (which look unnecessary), and to her it will feel just like a Framework.

Code hosting sites are littered with skeleton apps built from libraries that have little or no documentation and would be difficult for a developer without the same biases to jump into. And every use case that’s had to be added has made those frameworks more complex and more impenetrable. At a certain level everything starts to look like Symfony, but frequently without the documentation and support community. An author that builds something such that the choices made were obvious may be less motivated to document it.

Devs prefer less complex systems that do a few things really well.

Large organizations maintain a variety of enterprisey apps like PeopleSoft that do 1E9 things to support 1,000s of business processes, and I feel for the folks “toiling in the Java mines” on these systems; it looks like messy, unglamorous work, and where each new feature has an impact on dozens of others. I think the sheer size of some Frameworks remind developers of these kind of nightmare scenarios.

Smaller projects with fewer use cases always enable a simpler framework around the business logic, and so any Framework that you’re not already very familiar with is going to seem like overkill. And it will be right up to the point where the project becomes complex or the original authors leave the team.

What to make of this

I guess my point is that, all code quality being equal and over time, there’s not a big objective quality difference between the framework you rolled from libraries vs the one downloaded that others rolled from libraries. But I recognize that its subjectively enjoyable to build them and to work on systems where you’re productive. And that matters.

Sorta-related aside: There’s an interesting tug-of-war dynamic between developers and management tasked with keeping a piece of software maintained. A lot of the web is geared towards hastily building something sexy and throwing it away if the product doesn’t take off, and so you want devs to create and use whatever they’re most productive in. But if you’re maintaining an internal business app that will certainly be critical for the foreseeable future—and one that devs will not tolerate working on it for long!—you have to optimize your dev processes for developer turnover, while simultaneously trying to keep them happy. Introducing any technology with a potentially short lifespan introduces big risk.

Elgg’s Path Forward

Like many older PHP projects, Elgg has lots of problems with tight coupling, procedural patterns, and untestability; and has a very web 1.9 model: spit out full page, add a little Ajax. The good news is that Elgg has a ton of great functionality and ideas embedded in that mess, we have a core team which often can find agreement about dev principles and goals, and we have a new schedule-based release process that ensures that hard work going into the product makes it to release more quickly.

Lately I feel like the Elgg core team is excitedly gearing up for a long hike, during which we’ll make tons of hard decisions and churn lots of code remolding Elgg to look more like a modern JavaScript + PHP API framework.

I’m not sure I want to make that hike.

My suspicion is there’s a shorter route around the mountain; some modern framework may be out there whose team has already put in the hard effort of building something close to what we’re looking forward to. I think the time it would take us to get there would be long and filled with tons of wheel-rebuilding—work that won’t be going into improving UX and which provides no cross-project knowledge gain for Elgg devs.

I’m also wondering if we would be wise to ignore our itching about back end code quality for a bit and focus all attention on the front end and on UI/UX problems. As a plugin developer, I certainly see back end design choices that cause problems, but they’re rarely blockers. I spend a lot more time improving the UX and dealing with our incomplete Ajax implementation. The jewel of the 1.9 release isn’t going to be the dependency container and PSR-0 compatible autoloading; it’ll be the responsive Aalborg theme.

For me, back end refactoring work is fun because it’s relatively easy. You’re changing the way the pieces snap together, not necessarily making them work better or solving new problems. It also keeps me in the comfort zone of working mostly with code and people I’m already familiar with. This is OK for a little while but doesn’t push me to grow.

This isn’t to imply that the core team is infected with Not-invented-here. We definitely want to replace as much home-grown code as possible with well-tested alternatives maintained elsewhere. It’s just a hunch I have that this will be a long process involving tons of decisions that have already been made somewhere else.

I’m still having a lot of fun developing for and in Elgg, but I’m itching to pick up something new, and to work in a system that’s already making good use of and establishing newer practices. Hitching Elgg to another project’s wagon seems adventurous.

I also have to vent that the decision to maintain support for PHP 5.2—a branch that ended long-term support 3.5 years ago—during 1.9.x seems disastrously wrong. 1.9 had a long development process during which a significant amount of high-quality, highly-tested, and actively maintained community code was off-the-table because it wasn’t 5.2 compatible. We had to port some things to 5.2 and fix the resulting bugs, and some unit tests are a mess without Closures; just a huge waste of time. Nor could we benefit from the work being done on Drupal or WordPress because both are GPL, as are a lot of other older PHP projects with 5.2-compatible code. PHP 5.2 is still expressive enough to solve most problems without namespaces, Closures, et al., but in 2014 devs don’t want to code with hands tied behind their backs to produce less readable code that will soon have to be refactored. /rant

Get rid of variable variable syntax

Uniform Variable Syntax was voted in (almost unanimously) for PHP6 PHP7 and introduces a rare back compatibility break, changing the semantics of expressions like these:

                           // old meaning            // new meaning
1. $$foo['bar']['baz']     ${$foo['bar']['baz']}     ($$foo)['bar']['baz']
2. $foo->$bar['baz']       $foo->{$bar['baz']}       ($foo->$bar)['baz']
3. $foo->$bar['baz']()     $foo->{$bar['baz']}()     ($foo->$bar)['baz']()
4. Foo::$bar['baz']()      Foo::{$bar['baz']}()      (Foo::$bar)['baz']()

IMO the “variable variable” syntax $$name is a readability disaster that we should get rid of. ${$name} is much clearer about what this is (dynamic name lookup) and what it’s doing: $ is the symbol table, and {$name} tells us that we’re finding an entry in it under a key with the value of $name. PHP should deprecate $$name syntax and emit a notice in the next minor version.

It should also deprecate the syntax ->$name and ::$$name. Both are bad news.

Doing so would completely eliminate the first 3 ambiguous expressions above, and the new warning emitted would call out code that would otherwise silently change meaning in PHP6 (one big negative about this RFC).

As for the fourth, Consider these expressions:

A. $foo->prop
B. $foo->prop()
C. $foo->prop['key']()
D. Foo::$prop
E. Foo::$prop()
F. Foo::$prop['key']()

To the chagrin of JavaScript devs, PHP will not let you reference a dynamic property in an execution context, so expression B will try to call a method “prop”, (and fatal if it can’t). For consistency, PHP should fatal on expression E. Currently PHP just does something completely unexpected and insane, looking up a local variable $prop.

Expression C does what you’d expect, accessing the property named “prop”, so expression F should do the same, which is why I think the RFC is a clear step forward at least.

What, when, and how is your IDE telling you?

A programmers.stackexchange question basically asks if there’s an ideal frequency to compile your code. Surely not, as it completely depends on what kind of work you’re doing and how much focus you need at any given moment; breaking to ask for feedback is not necessarily a good idea if you’re plan is solid and you’re in the zone.

But a related topic is more interesting to me: What’s the ideal form of automated information flow from IDE to programmer?

IDEs can now potentially compile/lint/run static analysis on your code with every keystroke. I’m reminded of that when writing new Java code in NetBeans. “OMG the method you started writing two seconds ago doesn’t have the correct return type!!!” You don’t say. And I’ve used an IDE that dumped a huge distracting compiler message in the middle of the code due to a simple syntax error that I could’ve easily dealt with a bit later. I vaguely remember one where the feedback interfered with my typing!

So on one side of the spectrum the IDE could be needling you, dragging you out of the zone, but you do want to know about real problems at some point. Maybe the ideal notification model is progressive, starting out very subtle then becoming more obvious as time passes or it becomes clear you’re moving away from that line.

Anyone seen any unique work in this area?

Stepping back to the notion of when to stop and see if your program works, I think the trifecta of dependency injection, sufficiently strong typing, and solid IDE static analysis has really made a huge difference in this for me. Assuming my design makes sense, I find the bugs tend to be caught in the IDE so that programs are more solid by the time I get to the slower—and more disruptive to flow—cycle of manual testing. YMMV!

On Frameworks

Building any non-trivial app, one of the toughest decisions to make is which development framework to base your work on. And there’s no way around this decision once you realize there’s no such thing as “not using a framework.”

Even if you’re just binding together 3rd party components, a framework will be born out of any development work, and it will have its own tradeoffs to consider.

  • Will it be documented?
  • Will it be tested?
  • Will it have a community of people working on/within it?
  • Will it have a plugin architecture encouraging code reuse?
  • Will you have access to plugins written by people outside your team?
  • Will configuration be easy/reusable?
  • Will it address code/UX scenarios you hadn’t considered?
  • Will you get free bug fixes/features from outside developers?
  • Will it securely handle password and protection against XSS, CSRF, etc.?
  • Will other developers be able to jump into the project?

Developers, plugins, sub-frameworks, tools, and shops all form an ecosystem around a framework that tends to build value as it grows. A strong ecosystem can make up for product weaknesses; a trip through the Drupal and WordPress codebases will have some seasoned developers shuddering at some architectural choices made (long ago), yet that code runs great due to the work of tons of people over time. And, thanks to the wealth of plugins available, and to the continual transfer of knowledge from one developer to the next, devs who are experienced on those platforms can build impressive systems quickly.

On the flip side, there are properties of mature frameworks that are important to consider:

  • They can be slow to change, which may mean maintaining a separate git branch with your own fixes.
  • They rarely bend to support the use cases of individual sites.
  • They tend to have older code that may be harder to integrate with the latest practices.
  • They can be large; any framework that solves a lot of problems out-of-the-box will be. Microframeworks can be a great choice, but their value tends to be limited to plumbing.

Thankfully the ideal framework needn’t be chosen first. You may be better off prototyping something as quickly as possible—by any means—then rebuilding after the requirements are clearer. The danger is the temptation to ship prototypes as production systems, where they grow to become liabilities over time.

Events with Dynamic Handlers, Simplified

Elgg’s event systems are based on registering callables to handle certain event names. As the codebase is still heavily procedural, most event handlers in the core and 3rd party plugins are global functions. Since I’m firmly in the static-code-is-evil camp, I’d much prefer handlers be dynamic method calls, but this introduces some complexity:

  1. Nice plugins allow their event handlers to be unregistered, but dynamic callbacks make this a bit awkward; any plugin that wanted to unregister another plugin’s handler would have to first get a reference to the other plugin’s object. Kind of messy.
  2. This requires proactively creating lots of objects—and loading lots of code—you don’t need in order to create the callbacks.

One way to work around the second issue would be to register callbacks on a proxy objects that lazy-load the real objects on the first method call. This adds complexity, could lead to typehint failures if the proxy gets accessed directly, and doesn’t solve the first issue at all. So I think the event system needs to be extended in a simple way:

We could allow callback strings of the form "service->method" where service is the name of a pre-registered (and lazy-loading) service on Elgg’s DI container. Essentially the event trigger would call $dic->service->method().

This would allow trivially (un)registering these handlers, as well as efficiently lazy-loading the objects without the complexity of proxies.

I’d also suggest never actually binding to the service. This way you could replace a whole set of handlers by replacing the service on the DI container.