Theory vs Practice
Ξ March 15th, 2010 | → 0 Comments | ∇ Algorithms, Thoughts |
I’ve recently come across an interesting article detailing an existing bug in the binary search implementation presented in Programming Pearls (Addison-Wesley, 1986; Second Edition, 2000). The article can be found here and it presents a bug that can easily be overlooked. Many of us may be very familiar with the problem and many a times we notice it, but easily disregard it because we say that it cannot occur in practice. The nice thing about this bug is that if the program would have been described in pseudocode, the bug would not be there. I really advise you to read the article… it presents how the bug was discovered and what was the impact.
My point however with respect to the presented problem is not that bugs exist or not, but the fact that there is a huge gap between theory and practice. Many things can have a very beautiful theoretical description, but the practical implementation is far from simple. I recall that when I was reading Introduction to Algorithms (MIT Press), for some algorithms it took me some time to get them run in the optimal time. The beauty of pseudocode is that you can make “magic happen”. For example, one of the most common pseudocode lines when it comes to graph algorithms is:
for every neighbor v of u in G do
This looks like a fairly simple statement but when it comes to implementation things are not all that clear… How is the graph represented? Adjacency matrix, list of neighbors, some other data structure? The answer is: it depends on the algorithm… because the data structure can strongly influence the running time of the algorithm and it can lead to implementations that are an order of magnitude less efficient then the theoretical complexity of the algorithm.
Another problem that is usually around the theoretical presentation of algorithms is related to several assumptions. Since algorithms are closer to mathematics then to the world of computers, several restrictions don’t get expressed in them and are either regarded as implicit or not necessary. Some of the most common omissions are ignoring data range checks, out of bounds (or for that matter almost any exceptional condition), underlying data structure, etc. I am not saying that all this should be presented within the theoretical algorithm, but I just want to emphasize that the theoretical description of a program is in many cases very far from the actual program implementing it and since this is the case, it is very easy to miss some of these details… This being said, I think that having a bug around for over 20 years, in a fairly simple algorithm implementation, that was missed by many, only proves once again that we are not machines and that the human race has a natural tendency to overlook certain unimportant details and consider them implicit. After all, this is what actually led to our progress and this is one of the major things that differentiates us from machines (the ability to extract the essential relevant information from a sea of data).
As a conclusion, I am not really that worried about having bugs around for more then 20 years. Since no-one complained about them only says that they didn’t really impact us until recently and now, that we are aware of them, we will remove them. After all, program implementation and design is an iterative process…