Dependency Hell
Mar. 24th, 2013 02:30 pm
I was thinking that I should complain about work. After all, what is a journal for besides complaining about the various problems that bedevil us during the week?
Except that this week I want to complain about dependency management. And dependency management is a highly technical subject that nobody will know about who doesn’t already know what I’m going to say. So it occurred to me that what I really need to do is explain dependency management, so that people who don't understand computers can see what I mean. Then I can complain about it as often as I want.
So, what is a dependency? A dependency is something that a program depends on. That’s not very useful. What’s a program?
A program is, at heart, a set of instructions. It’s telling the computer to do something. For instance, I program in python, which I’ll use here for a few examples just because I find it fairly readable. Python is what is known as an interpreted language, which for our purposes just means that it can do something one line at a time (this is opposed to other languages that require you to write the whole thing down before they can execute anything). So, here’s a basic command in python, using ipython:
In [1]: print "Hello, how are you?"
Hello, how are you?
Here’s another one:
In [2]: 1+1
Out[2]: 2
Those should both be pretty self-explanatory. The first example tells it to print a line, the second tells it to do math. In both cases the computer complies. This is because python comes with some basic functions and operators. One of them is print which prints whatever it gets passed to the screen. One of them is ‘+’ which, when given two integers to play with, performs mathematical addition.
But what about something else? What if you want to take the cosine of a number for some really strange, obscure reason (probably due to a child in middle school):
In [3]: cos(0.0)
---------------------------------------------------------------------------
NameError Traceback (most recent call last)in ()
----> 1 cos(0.0)
NameError: name 'cos' is not defined
Which is python telling you that it has no idea how to do a cosine. Well, that’s pretty useless. But a bit of googling will tell you that you can do cosine in python. In fact, there’s a module appropriately named math, which has a cosine function. But math isn’t loaded automatically; it has to be imported:
In [4]: import math
In [5]: math.cos(0.0)
Out[5]: 1.0
You can think of programs sort of like recipes. They’re both lists of instructions. Imagine this as the recipe for a beef roast, buried somewhere deep in the basic cookbook you buy your kitchen-illiterate teenager when they go off to college. The recipe may be complete in and of itself. Or it may not be. It may, for instance, say: “Top with pan-roasted gravy (pg. 482)”. After all, there might be several things in the book that you should be topping with pan-roasted gravy, so why bother to repeat the code all over the place? (This, by the way, is the “Don’t Repeat Yourself” principle that programmers like to quote at each other, but I’m not going to talk about that.) The import statement in python serves much the same purpose (as does the #include directive in C/C++, or other equivalents). The math module is included in python’s base installation, but it isn’t loaded automatically into every program because, after all, how often do you actually need to use a cosine? So to save space it’s kept on disk until someone decides they need it and manually loads it with import.
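To see the “kept on disk until someone asks for it” part in action, here is a minimal sketch (written for Python 3, unlike the session above). It uses colorsys, another standard library module, purely as an assumption that it is less likely than math to have been loaded already by whatever interpreter you happen to be running; which modules are preloaded varies from setup to setup.

```python
import sys

# sys.modules is python's list of the books currently open on the counter.
name = "colorsys"
loaded_before = name in sys.modules  # usually False, but this depends on your setup

import colorsys  # the "turn to page 482" step

loaded_after = name in sys.modules   # the module is now loaded

print("loaded before import:", loaded_before)
print("loaded after import: ", loaded_after)
```

The same check works for any module name; math is simply a poor demonstration if your interpreter has already pulled it in for its own reasons.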
Anyway, you have successfully solved a 7th grade math problem using a programming language, so you’re feeling sort of happy with yourself. What else can you do? Well, continuing on a mathematical bent there are a pair of libraries for python, NumPy and SciPy, that provide a great deal of numerical and mathematical power. So, let’s say you want to make a fast array (something that python doesn’t usually do). That’s possible using numpy.
import numpy
Traceback (most recent call last):
File "/base/data/home/apps/shell/1.335852500710379686/shell.py", line 267, in get
exec compiled in statement_module.__dict__
File "", line 1, in
ImportError: No module named numpy
(perceptive people with too much time on their hands will note I’ve changed interpreters momentarily)
So what happened there? What happened is that you blundered your way into Dependency Hell. You see, math is part of what’s called the Standard Library, the set of modules that is shipped out with every current version of python. But NumPy is not; it’s produced by an independent group of programmers and has to be installed separately. The code has done the computer equivalent of saying “Top with pan-roasted gravy from our fantastic book A Thousand Basic Recipes.”
Now in cookbooks this is kind of a bad thing to do, because it requires you to buy another book. In python the books are free, people want you to use their code. But that doesn’t mean it’s not a hassle. For one thing, you’ve got to find the book. The python version of Amazon is the Python Package Index, colloquially known as the Cheeseshop, which has everything. Well, almost everything. What if you’re importing something that one of your coworkers wrote? Then you’re hosed (unless they put it in the Cheeseshop). Now you’re out hunting through every used bookshop in the area in the hope that when they sold it they sold it somewhere local.

Oh come on, how many cookbooks can there possibly be? (The Ohio State University)
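Before you go shopping, you can at least ask python whether a given book is already on the shelf. This is a rough sketch for Python 3; the helper name is_installed and the module name a_thousand_basic_recipes are made up for illustration, and it only handles plain top-level module names.

```python
import importlib.util


def is_installed(module_name):
    """Return True if python can find a top-level module with this name."""
    return importlib.util.find_spec(module_name) is not None


# "a_thousand_basic_recipes" is a hypothetical module that should not exist.
for name in ("math", "numpy", "a_thousand_basic_recipes"):
    status = "on the shelf" if is_installed(name) else "missing, go shopping"
    print(f"{name}: {status}")
```

Whether numpy shows up as present depends entirely on the machine you run this on, which is rather the point.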
Next you have to start worrying about versions. If your 1988 copy of Oven Cooked Goodness tells you to pull a pan-roasted gravy recipe out of A Thousand Basic Recipes it’s possible you have the book, but what if it’s the 1996 version? You flip to page 482 and find … a recipe for jello. Of course, it was the right page in the 1984 edition, but the 1996 edition has some re-editing. Now, your computer generally does one of a few things:
- It goes to the table of contents for the book, finds a correct recipe for pan-roasted gravy, and executes it. This is the best solution.
- It goes to the table of contents, finds that there’s no such recipe as “Pan roasted gravy”, raises an error, and goes home to sulk. This is the second best solution.
- It goes to page 482, reads the recipe, and does what the recipe says. This is an absolutely terrible outcome, because you have no idea what you’re getting.
- It finds the wrong recipe, which happens to be named “Pan roasted gravy” and executes this. You are now totally fucked.

We don't know why they told us to put the cereal in the oven either (James and Everett)
To explain, the best thing that a computer can do is the right thing. The second best thing a computer can do is nothing. The absolute worst thing that a computer can do is act inconsistently, and nothing makes a computer act inconsistently like badly versioned code. Now for gravy this isn’t such a bad deal. Gravy hasn’t changed much in the past few decades. But code can change dramatically in three or four weeks, let alone three years, and if you use the wrong version chances are that there’s something in there with the right name but you have no idea if it’s doing the right thing. Anyone who has had to work with technology knows how bad this is, because then you end up with code that works right most of the time, so when you ask someone to fix it they try it and say “It works for me”, and you can’t reproduce the problem and eventually you just have to live with the fact that the code mostly works, unless you happen to be holding down the shift key while the moon is full, and then it blows up spectacularly and causes your machine to reboot. And you’ll never fix it, because everything looks alright and it has the right name. It’s just a little wrong.
Close enough (globalpackagegallery.com)
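If you can’t have the right thing, you can at least arrange for the second best thing: refuse to cook when the edition on the shelf isn’t the one the recipe was written against. Here is a small sketch of that idea in Python 3. The functions parse_version and require_version are hypothetical helpers, not part of any real library, and they only understand simple dotted numbers like 1984.2.

```python
def parse_version(text):
    """Turn a simple dotted version such as '1984.2' into (1984, 2)."""
    return tuple(int(part) for part in text.split("."))


def require_version(name, installed, at_least, below):
    """Raise an error instead of guessing when the installed edition is wrong."""
    found = parse_version(installed)
    if not (parse_version(at_least) <= found < parse_version(below)):
        raise RuntimeError(
            f"{name} {installed} found, but need >= {at_least} and < {below}"
        )
    return True


# The 1984 edition is what the recipe expects, so this passes quietly.
require_version("a_thousand_basic_recipes", "1984.2", "1984.0", "1985.0")

# The 1996 edition has been re-edited, so we stop before making jello gravy.
try:
    require_version("a_thousand_basic_recipes", "1996.0", "1984.0", "1985.0")
except RuntimeError as err:
    print("Refusing to cook:", err)
```

Real packaging tools express the same idea with requirement strings along the lines of numpy>=1.7,<2, so the check happens at install time rather than halfway through dinner.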
And of course, this is the simple case. What happens if your cookbooks come from some commercial cookbook empire (think Betty Crocker on steroids)? Then you could be moving happily along on your way through Scrumptious Dinner Recipes when you run across something demanding stuff from Oven Cooked Goodness. So you open Oven Cooked Goodness and find a line in there asking you to use Pan-Roasted Gravy from A Thousand Basic Recipes. Yes, even your dependencies have dependencies. And not just one, but possibly dozens. How many of them do you actually need? If you’re making an oven roast, do you really need the reference in Oven Cooked Goodness to The Elegant Pastry Chef? As a programmer, you have three options:
- Run through all possible use-cases of your code, and every time it breaks, install whatever code is missing. Go through this in the vague, forlorn, and hopeless belief that you can actually figure out everything that needs to be done. Get it right 99% of the time, which is just 1% short of functional.
- Systematically install every single package required by every single dependency recursively, installing the entirety of the internet into your computer if necessary. Pray that none of them conflict.
- Seppuku

Code dependency plot provided as an example for Tinytag Explorer. This is why programmers have no friends. (depgraph)
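The second option, following every reference recursively, looks something like this in miniature. It’s a sketch in Python 3 using the cookbooks from above; the REQUIRES table is invented for the example, and a real installer would also have to worry about versions and conflicts, which this cheerfully ignores.

```python
# Which book sends you to which other books (made-up data for illustration).
REQUIRES = {
    "Scrumptious Dinner Recipes": ["Oven Cooked Goodness"],
    "Oven Cooked Goodness": ["A Thousand Basic Recipes", "The Elegant Pastry Chef"],
    "A Thousand Basic Recipes": [],
    "The Elegant Pastry Chef": [],
}


def everything_needed(book, requires):
    """List every direct and indirect dependency of `book`, dependencies first."""
    needed = []
    seen = {book}  # remembering what we've visited also protects against cycles

    def visit(name):
        for dep in requires.get(name, []):
            if dep not in seen:
                seen.add(dep)
                visit(dep)          # fetch the dependency's dependencies first
                needed.append(dep)  # then the dependency itself

    visit(book)
    return needed


print(everything_needed("Scrumptious Dinner Recipes", REQUIRES))
```

Notice that The Elegant Pastry Chef ends up on the shopping list even though you only wanted a roast. The code has no way to know you’ll never open it.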
And this assumes that the core libraries are functional. Up until now I’ve been thinking about dependencies as python modules. Python modules are like recipes, but they depend on lower-level pieces of code that are installed in libraries on your computer. This normally works fine until someone screws that one up. Then it’s like getting a recipe that directs you to “Fill the green measuring cup with flour”. What green measuring cup? How big is the green measuring cup? What if you have a green measuring cup, is it the right size? And even if you can figure out what the green measuring cup is, what happens if you have two different pieces of code, each of which asks for a different green measuring cup? No matter what you choose to put in your cupboard, chances are that at least half the code is going to break.
Computers make this worse by using dynamic linking for lower-level libraries, which is the equivalent of telling a program where things are by telling it where they were on the machine that built it. “Oh, you want the green measuring cup in the cupboard above the stove”, under the rational assumption that all kitchens are arranged exactly the same. When you do run into a kitchen arranged differently, the program immediately begins to sulk. “Why don’t you re-arrange your kitchen to look like my kitchen?” it asks. “What does your kitchen look like?” you ask it, exasperated. “I don’t know, but you should change yours to look like it”. At which point you are allowed to beat your computer to death.

Time to get a new job (Amazon)
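You can watch the “where is the green measuring cup in this kitchen?” question being asked from python itself. This is a small sketch in Python 3 using the standard ctypes.util module; it assumes “m” as the short name of the C math library, which is the convention on Linux and macOS, and the answer it prints will differ from machine to machine (on some systems it finds nothing at all).

```python
import ctypes.util

# Ask the system's usual search rules where the C math library lives.
# "m" is an assumption: it's the conventional short name on Linux and macOS.
location = ctypes.util.find_library("m")

if location is None:
    print("No green measuring cup in this kitchen.")
else:
    print(f"Green measuring cup found: {location}")
```

Run it on two different machines and compare the answers, and you have the whole problem in two lines of output.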
And this is nothing compared to the absolutely, positively worst thing that a programmer can do, which is on-site hotpatching, changing the code on the machine that runs the code without putting it back in the master version. This is the cooking equivalent of modifying the recipe on the fly. A program is an implicit contract: if you follow these instructions, you will be able to produce this thing. If you have a recipe for gravy that uses far too much flour you can fix it for yourself by putting a little mark on your measuring cup so you know that you shouldn’t actually use more than that much flour. But if you then hand this recipe off to someone else without changing the recipe, they’re not going to know. Even if they use your kitchen how will they know what the mark in the cup means? They don’t. Then not only will your friends end up with a gravy that has the consistency of grits, they will have a burning desire to never eat at your house again. If you do this to your coworkers, they will come to your office and beat you to death with your own keyboard.

Modern software companies keep supplies of gasoline-soaked rags for just such an occasion.
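For the morbidly curious, here is roughly what the mark on the measuring cup looks like in code. It’s a toy sketch in Python 3; gravy_recipe, the two kitchens, and the flour quantities are all invented for illustration. The point is only that two machines end up running something with the same name that does different things.

```python
def gravy_recipe(flour_cups=2.0):
    """The recipe as written in the master copy (far too much flour)."""
    return {"flour_cups": flour_cups, "stock_cups": 2.0}


master_recipe = gravy_recipe


def patched_recipe():
    """The hot patch: a local fix that never makes it back to the master copy."""
    return master_recipe(flour_cups=0.25)


# Both kitchens look up "gravy_recipe", but only one has the mark on the cup.
my_kitchen = {"gravy_recipe": patched_recipe}
coworker_kitchen = {"gravy_recipe": master_recipe}

mine = my_kitchen["gravy_recipe"]()
theirs = coworker_kitchen["gravy_recipe"]()

print("my gravy uses", mine["flour_cups"], "cups of flour")
print("their gravy uses", theirs["flour_cups"], "cups of flour")
```

The fix that keeps your keyboard out of your skull is to change the default in gravy_recipe itself and check that in.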
In conclusion, don’t program, it sucks. And if you have to program, rubber-coat your keyboard so when they come to beat you with it, they won’t kill you.