Introduction
We love talking about "fatigue" in the JavaScript community. You know what I've
been getting a bit fatigued by lately? This:
I've been thinking a lot about dependencies. This is something of a brain dump.
I'll start by giving my working definition of what a dependency is, then I'll
go through a list of common dependencies. Finally, I'll present a simple model
I've started using for classifying dependencies.
What is a dependency?
For the purposes of this article, I will define a dependency as
any component of an experience which is outside your direct control
as the developer of that experience.
Note that this definition doesn't say anything about software. Software
experiences (ie apps, websites, games, etc) are only one type. Driving your
car down the road, attending a concert, or playing football
are all experiences, and all have dependencies. Ever tried to play football
without a ball? Whoever invented the rules of football was free to
design the experience however they wanted, but ultimately the experience always
depends on some sort of a ball being available at "runtime".
Note that some dependencies are optional. It's ideal to wear special shoes
when playing football, but it can be done with normal shoes or even barefoot,
with the experience being diminished to varying degrees.
For the remainder of this article, I'm going to focus on dependencies of
software experiences.
The more time users spend with a piece of software, the more they get used to a
specific experience. In general, we want to avoid changing an experience unless
we have a very good reason to do so (ie adding a feature that we are confident
will significantly improve the experience).
The central thesis of this article is that
dependencies create openings for the experiences we develop to change without
our deliberately wanting them to, and so we should be thoughtful and careful
about the dependencies we take on.
Examples of Dependencies
Here's a list of common dependencies.
Hardware
The hardware your software runs on is a dependency. In general, you
as a developer have little to no control what hardware your users run your
software on. A user may be perfectly happy running your app on the latest
flagship Android phone, but then they lose their phone and have to downgrade
to a budget model for a few months, and suddenly your app is unusable for them.
Note that developers who make software for Apple products have a huge advantage
here. The number of devices they need to test on is vastly smaller than
developers for Windows, Linux, and especially Android.
Hardware is particularly a challenge with web development, where the same
software stack is used to develop for every imaginable type of hardware. Not
only that, but this stack is running a dynamic, interpreted language with many
layers of security and abstraction. One of the most exciting things about
WebAssembly is the potential to normalize web app performance across a wider
range of hardware[1].
In cases where you can control the hardware, it's amazing the levels of
consistency and reliability you can achieve. Last year I developed some forearm
pain in both arms from typing too much, so I made some Arduino-based foot
pedals so I didn't have to strain
so much to hit combo keys. I have a pair at home and work, that have both been
working 24/7 for a year. No failures, no glitches, no reboots necessary.
Operating systems
If your app is too tightly coupled to a specific version of an operating
system, when the user updates their computer your app might quit working. This
is a much bigger problem on systems like Linux. I often have issues with apps
not being able to find the right versions of dynamic libraries.
From what I've heard, Windows has an excellent backwards-compatibility history.
Mobile OSes seem to be somewhere in the middle.
One example I've seen is where an OS adopts a specific design paradigm (such
as Material Design for Android), and there becomes pressure to overhaul your
app so the UI is consistent with the rest of the OS.
Programming Language Compilers/Runtimes
The functionality, performance, and distribution of your app are deeply tied to
your choice of programming language. Fortunately, these are some of the most
stable dependencies around.
An obvious exception is new or quickly evolving languages. Rust syntax looks
very different post-1.0 than it did in the beginning, and even today the async
story is rapidly changing.
One situation I wouldn't want to be in is having written an app in the latest
compiles-to-JavaScript language, then have the language go extinct 2 years
later. This is less of a problem if you only need codebase to live for 2 years,
but I'm not sure how common that is (or at least should be).
Build Tools
Bundlers, transpilers, minifiers, uglifiers, etc.
They're called "dev dependencies" for a reason. Have you ever done a fresh,
'npm init
' followed by 'npm install webpack webpack-cli
' then taken a peek in
node_modules
?
Speaking of npm, I think it belongs here as well. Yes, npm is a dependency.
Especially if the experience you're providing is a library that is only
installable by using it. It's perfectly possible to write a node service or
browser application without having a package.json at all. As a matter of fact,
that's the case with this website, and my personal website, both of which are
single-page apps with few dependencies.
Note that Python/pip is a similar story to npm.
The Internet
The internet can be a huge dependency, and if interfaced with poorly, a huge
liability. The quality of different connections (and even a single connection
over time) varies wildly, and is affected by outages, congestion, solar flares,
Georgian women with
shovels,
and whatever the cloud decided to have for breakfast on a given morning.
If your app relies on an internet connection at runtime, almost by definition
you are shipping a software experience which is constantly changing. Just because
people have become accustomed to dealing with slow internet connections doesn't
mean it's ok for us to abuse the internet as a dependency. There are many
techniques for improving the user experience, the most basic of which is
communicating what exactly is going on.
Time-to-first-byte is often touted as one of the most important attributes of
web software. Maybe that's true. I'm inclined to question it, and I think it
depends on whether you're talking about a content website, or a web app. If
it's an app, and I have a choice between waiting 5 seconds for it to download
all the code and enough data for page changes to be instantaneous, vs an
instant first page followed by variable page loads later, content jumping
around as things stream in, etc, I'll take the 5 seconds every time.
Especially if it gives me a loading bar.
It's all about expectations. For year, gamers have been waiting hours on huge
downloads before they can even run the game. Because once the download is done,
the performance is great.
Note that fast first load and instant page navigations are not mutually
exclusive.
Web Browsers
These days, browsers are essentially in the same category as operating systems.
If Chrome, Firefox, or Safari decided one day to make a major change, your app
could instantly break for thousands of users. Fortunately, web browsers
generally have an exceptionally good backwards compatibility story. Our current
browsers will gladly run JS from 10 years ago, and I expect the JS I'm
writing today to still work 10 years from now. That's impressive.
Web Links
Links are a central part of the web. However, they also make any given
web experience incredibly brittle. Any web page you make is dependent on
every link on that page. If you link to an external page, and that page
disappears (which happens often), your experience is now broken.
You could always link out to the Internet Archive, but then you're centralizing
all your link dependencies. I think the long-term solution to this problem
could be something like IPFS, where websites pin versions of everything they
link to. But that has its own problems, like if you link out to an insecure
version of a web app. This would basically be the web's version of static
linking.
Frameworks/Libraries
These ones are obvious. They're what I usually think of first when people talk
about dependencies. If you're using a large framework that does a lot of heavy
lifting for you app, you're at the mercy of that framework (and its likely many
sub-dependencies) for your experience to remain consistent. The less you
understand what that framework is doing under the hood, the more vulnerable you
are.
That doesn't mean frameworks are bad. A new developer with an exciting idea
might be able to crank out a prototype using a framework where they would
otherwise get bogged down with platform details.
However, in general I advocate learning the platform over time, not necessarily
to avoid using frameworks, but to reduce the vulnerability that comes from
dependance on them. If your framework just can't do what you need it to
(or as performantly as you need it to), ideally you should be able to throw
it out and implement a bespoke replacement.
Just a few days ago, the inventor of NPM made a change in minipass, which broke
node-pre-gyp (I still don't know how exactly it depends on minipass, since it's
not a direct dependency...), which broke bcrypt, which broke our Docker build,
and a lot of other people's
stuff. Kudos to them for
working to fix it right away.
Note that although pinning versions of the libraries we're using would have
avoided this particular problem for us, and is a good idea in general, it is
not a silver bullet. package-lock.json
can save you from your app breaking
overnight, but in general there are going to be security updates and other
issues.
A few months ago we had a dependency that hadn't been maintained in years. It
had gotten so old that it required an ancient version of node to work properly.
Eventually it started preventing us from keeping our other dependencies up to
date, because they depended on modern JavaScript features. Languages and
runtimes are relatively stable, but they DO change.
Datasets
If your app/experience relies on a specific dataset, that's a dependency.
One example would be an interactive data visualization. If you collected and
control the data yourself, this likely isn't a problem. However, if the data
comes from a 3rd source that is constantly changing, you're dependant on that
source, or risk the data becoming stale. Even if the data doesn't change, you
may be dependant on a 3rd party not changing their usage policies.
APIs
Closely related to datasets is APIs, which are often used to access datasets
such as Twitter. Over time, companies have consistently locked their APIs
down more and more to prevent 3rd party developers from making alternative
interfaces. This makes sense from a business point of view; you can't show
ads on an app you don't control. If you're going to rely on an API to develop
your app, make sure you understand the business incentives of all parties
involved, and what that likely means for the long-term viability of your app.
A Mental Model for Categorizing Dependencies
There are many different metrics we can use to gauge the risk of adding a
certain dependency, or to compare different dependencies with each other. One
criteria is how much control do you have over the dependency itself? A small,
in-house library provided by the team down the hall is much lower risk than
an off-the-shelf closed source framework. In general, the less work a
dependency is doing for you, and the more generic it is (ie easy to replace
with an alternative), the safer it is to use it.
As I was thinking about this, I started breaking dependencies down into several
distinct categories. This is how I think of them:
- Platform Dependencies
- Data Dependencies
- Logic Dependencies
Platform Dependencies
Platform dependencies are the most fundamental type. These include hardware,
operating systems, web browsers, programming language compilers/runtimes, and
APIs. Pretty much every software project is going to have platform
dependencies. APIs are unique here because they are much more risky than the
others. Rarely do popular APIs provide their source code, and even if they did,
usually it's the data behind it that's actually valuable. You're giving a 3rd
party complete control over the functionality of your app, with a high
liklihood of it changing. However, sometimes there is no choice. If you want to
develop a Facebook app, you have to use their API.
Platform dependencies are the level where you're almost certainly wasting your
time trying to build it yourself (which doesn't mean that's never a good
choice).
Data Dependencies
Data dependencies include things like public datasets, well-known lookup
tables, and sometimes even algorithms. One nice thing about these is that they
are often strongly related to something in the physical world, which lends them
a certain gravity that helps prevent them changing over time. For example, the
CORDIC algorithm/lookup tables have
been around essentially unchanged for many years, and will be useful in this
form for many more, because they are closely tied to a) math and b) fundamental
hardware architecture that is almost univerally used in our current computing
systems.
When re-writing my personal website recently, I
tried to avoid dependencies as much as possible. The site does have 2 though: a
markdown-to-html converter and a syntax highlighter. The syntax highlighter is
a great example of a data dependency. The core logic is unlikely to change from
language to language, and might be something I would consider writing myself.
However, it's not worth my time duplicating the effort already put into
creating grammars for all the supported languages.
Given their ubiquity and stable nature, I don't worry too much about using
data dependencies.
Note that public datasets that are trapped behind an API aren't pure data
dependencies, unless you can download the entire dataset.
Logic Dependencies
Logic dependencies are the least desirable (and most avoidable) type. Logic
here refers to basic programming logic, ie if/then, loops, etc. Frameworks and
most libraries (except thin wrappers around datasets) are in this
category. These dependencies include basically any unit of functionality which
you could write yourself and avoid the dependency. However, there's a
tradeoff here. The more complicated the job being done by the dependency, the
more you should consider whether it's worth doing it yourself.
My rule of thumb is that if I'm not familiar with the inner workings of a
dependency, to spend a bit of time trying to implement it myself. Maybe
an hour or two. Maybe a day or two. Sometimes I give up and decide to use the
dependency. Sometimes I realize I only need a tiny piece of the functionality
and implementing it myself is the right answer. Either way, I learn a lot and
can make a faster decision the next time. Plus if I do take on a dependency, I
likely have a much better idea what it's doing for me after going through this
process.
Client-side routing is one thing I recently realized is simpler
than I thought (for the features I need, at least) and don't always need a
library for.
Something that I doubt I'd ever try to write from scratch is a WYSIWYG HTML
editor. It's a very complex task, and there are already high quality, pluggable
solutions out there.
Conclusion
Dependencies are a necessary part of developing useful software experiences.
However, there is always a cost associated with taking on a dependency.
Generally, I try to avoid dependencies, and when I do need them, I try to only
use them in places where they could be swapped out for a similar option. The
syntax highlighter mentioned earlier is a great example of this. Doing this is
easier if you make a wrapper that only exposes the features you need.
I hope I've given you one or two new ideas to consider the next time you're
faced with the choice of whether to take on a dependency.
[1] Let's be honest, web developers will probably find a way to waste it.