Book Review: Best Kept Secrets of Peer Code Review

Best Kept Secrets of Peer Code Review
Jason Cohen, et al

We cannot forever hide the truth about ourselves, from ourselves.
– John McCain

This book was a freebie from Smart Bear Software, which means that if it were presented in terms of utility per dollar, it would be nothing short of infinitely worthy. If you’re interested in Software Engineering, the disciplined and systematic development and maintenance of software, I can’t recommend highly enough that you go on and get your free copy. It’s an insightful and provocative book, and you won’t regret your investment of time.
The Best Kept Secrets of Peer Code Review remain secrets because it is manifestly evident to every coder that their code is flawless. This is pretty obvious; nobody ever sits down to add bugs to their code. A bug, logical or syntactical, is largely a matter of perception.

Black box testing for code flaws is a lot like smelling an old milk carton; a tester is here not hunting for actual flaws so much as the implication of flaws, their products, offspring, and output. Code review, as defined here, is the actual human fathoming of pre-production code. This is important, no one will deny, but is it absolutely necessary for proper development? That’s the argument made in this book, and it’s made with lucidity and care. The reason this book is worth reading is its careful, honest, and meticulous use of statistics to prove that enforced peer code review is a numerically better case for most software teams.

It is true, you know. The book’s sinews, the very guts of it, the part that will remain in the reader’s mind long after all thoughts of what constitutes a code flaw, what part of design is taste and what is doctrine, and what metrics of testing are more important than others, are its brilliant and decisive forays into the metrics of bugfinding. Simply put, the better and harder you look at code – any code – the more bugs you’ll find. The sooner you look, the easier they’ll be to fix, and the more lines reviewed, the more bugs will be found to exist. This is absolutely, universally, inarguably true of any code that exists, and as if those axioms weren’t philosophically fundamental, this book absolutely proves them and more.

The authors of the book reveal these truths so systematically that it’s impossible to flinch when, on putting the book down, you will realize that the more bugs that have been found per line of code in a piece of software, the less buggy the software can be said to be. An excellent read for anyone who likes to ruminate on the dynamics of group software development. It’s convinced me to try out their cross-platform tool, Code Collaborator; here’s looking forward to it.

Book Review: Beautiful Code

Beautiful Code
Compiled by Andy Oram, Greg Wilson

A diamond is a chunk of coal that is made good under pressure. – Henry Kissinger

Beautiful Code is another non-animal O’Reilly volume, with high aspirations. As the sleeve submits: “How do the experts solve difficult problems in software development?” If this book had been able to answer that question, reading it would be a head-spinning experience indeed.

The book’s chapters are each the domain of a different prominent software developer or writer, and several are elegant outlines of what is unarguably some of the best code out there – Apache Webserver, Quicksort, and the Python interpreter. Ostensibly, the authors are talented beyond measure; with a lack of a cohesive theme, a unified structure, or an overall purpose the book quickly becomes a showcase of beautiful code-essays submitted by thirty-couple completely dissonant geniuses. The fact that the book still contains not much other than what it claims is no invitation for criticism. If you want to see real kung fu code, this is your book.

The potential disparities hide in between the lines, where compilers do not tarry. Few of the essays touch on why the code is elegant, or how it got to be the way it was. Most of them wander around what problem is solved by the code, some delve deep into the minutiae of the problem, and a few contain no code at all. This last set could have been chosen to present a semblance of organization, could have pulled loose ends together and formed some conceptual continuity among a wide variety of articles, but in their current states and places even these well-intended ‘theories of code beauty’ ruminations are ineffectual.

If a computer scientist was so riveted by unpolished essays surrounding the world’s best algorithmic hacks that they failed to notice that no new information was gained, no statements beyond the cold, functional truth were made, no concessions given to that imaginative side of the brain that, when it is given the occasional chance to influence the gnarled digits of a perl hacker, results in that big win – the one sought after in wake and sleep for a week – and doubles the maintainability and efficiency of some project, then that reader will be satisfied to the fullest extent.

For the rest of us, this book was no great failure but no revelation, and is deserving of its place on the shelf. And the geeks among geeks, the hackers who would have found a book closer to Hackers and Painters, held it and shook it until C code fell out, will no doubt appreciate this book.

Book Review: The Art of SQL

The Art of SQL
By Stéphane Faroult, Peter Robson

As we know, there are known knowns. There are things we know we know. We also know there are known unknowns. That is to say we know there are some things we do not know. But there are also unknown unknowns, the ones we don’t know we don’t know. – Donald Rumsfeld (Sun Tzu’s second millennium CE incarnation.)

The Art of SQL is no Web Database Applications with PHP and MySQL. It’s far beyond that; it is well past even Practical PostgreSQL and the SQL Cookbook. Not only is it written for the experienced user, it is for the ambitious user, the one who wakes up with a smile trying to retain the n-dimensional join from their dream.

The excellent style of writing used is above O’Reilly par. Not only is the text concise and well edited, the book is organized very well. In a graceful and creative turn, the presentational style of the book is allusive to Sun Tzu’s Art of War; query diagrams, sample datasets, and business cases are rendered as plans of attack and battle formations in the Napoleonic era. The result is phenomenal, and structurally, this book is groundbreaking – no computer science book I’ve read prior has had so much attention paid to making its content engaging and enjoyable to consume – this is certainly not necessary, but it is a great indication of the overall quality of the book.

The book is SQL implementation agnostic and assumes the reader is interested in data integrity, extensibility, and scalability in the database. It assumes that you care, or want to care, whether you’re following third normal form. In fact, the implied understanding here is that an earnest investment in normalization will pay dividends in optimization. Only if you’re willing to perspire for it – it is an art, not a school of magic.

The SQL enthusiast will learn a lot from this book – perhaps a baffling amount. I absolutely cannot recommend it highly enough. It has been some time coming, the sort of thing that is an obvious boon when one considers that our ‘art’ has only been around for a few decades. We’ll get it right eventually, inspired by those like Faroult and Robson.

Mathematical Symbols

Mathematics, the study of totally constructed concepts like numbers, time, logic and all that, is the closest thing we’ve got to a universal language. Highly intuitive symbols are included, if not universal, and let me point out a few and critique them:

  • ‘Square foot’, exactly like it says on the box. Not that a foot is a good unit of measurement – a human foot is of course not a standard base for a system the way the size of our watery Earth is – but this symbol is excellent in its simplicity and expressiveness anyway.
  • ∠ An ‘angle’! Very nice. Just a drawing of the referenced thing. Conveniently, this one is extensible – you can mark it up to show that you mean the measure or that your angle bisects another one, or what three points describe it.
  • ƒ ‘Function of’, pretty daft as it’s just an initialization. At least it’s scripted, so you can tell it’s not just an F, I guess. This is precisely the kind of thing we should stay away from. A better symbol? ☡ would at least show the process of something undergoing a path, shows a thing passing a threshold.
  • ∞ ‘Infinity’. Well, I guess this one works. I’ve never been a huge fan of the ‘lazy eight’, as it in no way indicates to me an unlimited amount of anything. But as an enormously abstract concept, it’s not easy to symbolize without encoding, so this arbitrary, somewhat rationalizable symbol (it keeps going!) works well enough.
  • Δ ‘Delta’, commonly indicating a change in something over time. Pragmatically, an arrow describing the path of the sun, phases of the moon, tides, plants, something like that would be better. However, I do think it’s great that, in English, a delta is a change in a river over time. A triangle shaped change. Is this intuitive? No. But for being referential in form, it’s way better than ‘Function of’.
  • ¬ ‘Not’. One of the absolute simplest, and best. One has taken the straight tally representing ‘a thing’, and broken it in just such a way that negates it in form, function, and concept. One of the most intuitive, least encoded symbols we have.
  • ∧ ‘And’ shows the logical joining of two concepts into a higher, compound version. + is still useful as it shows two tallies in combination, but rather than depicting them together, a∧b is clearly the union between. While it’s not as intuitive as +, we at least have a demonstrative picture of a joining, more indicative of the idea that ‘both of these are required’.
  • ∨ ‘Or’ is another logical operator, used as a∨b. The deficit of this and ∧ is that, while they’re wonderful in their logical inversion of one another, they are not referential of anything, and require some prior knowledge of their meaning. Good, but not perfect.
  • ∅ ‘Null’, a very difficult concept to represent in mathematics: a group of nothing. I’m very pleased with this one, which is so intuitive that it’s more often reddened and superimposed on another symbol to show that it is not allowed. What a fantastic and practical use! Another, even more definite success where symbols are concerned.
  • ∴ Here’s another terrific one. While not immediately intuitive, ‘therefore’ is an abstract concept. But if abstract concepts were shaped like dots, this symbol would show two of them with a third built on top, perfect units of reasoning being built on one another. Genius.

– ∧∨

Pictographic Encoding

Icons, so called because they’re miniature depictions of what happens after you click them, are literal and potentially lossless signs. Direct pictographs! These are better for representing saints or images or objects, maybe even processes or maps, but not possible for abstract concepts. At right, a good icon for your address book application. But if you wanted to use it as a shortcut for ‘import contents’, it’s not really literal anymore, just suggestive.

I think we’re better off considering icons a different set of symbols; symbols are by definition abstract stand-ins rather than direct representations. The line between the two isn’t perfect; I’d like to share some groovy symbolic libraries that might straddle it.

  • Hieroglyphics – Yeah, there are pictogrammatic elements to it, but they’re actually phonetic sketches made to document a verbal language. In other words, owls and crocodiles are really just letters.
  • Asian characters – Asian languages have really interesting crossed and common inheritances, and the earlier parts, still largely preserved, are translations from pictographic languages. You can still tell this in certain symbols today, like in this Chinese character for middle, 中.
  • Naval Signals – totally encoded, but very widespread. I guess this is what words have to look like when they need to be seen from far away by strangers.
  • Trail signs – about as iconological as can be got with sticks and rocks. The nature of these as a sculptural language made only from materials found on the trail is interesting itself, but there’s something very cool about the actual symbols used, I think.
  • Hobo signs – as the trail signs above but more concerned with the specific occupation of tramping, and with a more urban vocabulary. Specifically, there are a lot of symbols describing the likely reaction of nearby people to different methods of begging, and suggested tactics for succeeding at the same.

Getting The Point Across

Language is pretty fantastic as a way of symbolically encoding information into lexemes and then written or verbal data. It’s always telling when you learn another way in which culture and language are built around each other, as the translation (lossy compression) process forces the data a certain way. To put it another way, thoughtspace is way more infinite than wordspace, and it’s hard to express ideas without distorting them a little bit in the telling.

That said, even relatively simple and long-established ideas still get lost in the telling. We’re certainly getting better at this; in the last few years usability has become a priority for corporate, academic, and governmental designers. Still, we haven’t found a simple lexicon for symbols.

So many of our signs have cultural or lexical meaning attached to them – really they’re encoded and not everyone has the keys to get to the data inside, unless they have prior exposure to the symbols used. Yes, I have some examples!

(At this time, please extinguish your cultural mind as far as possible, and use only conscious reasoning for the remainder of this post.)

  • Dig this faucet. You just got to a new country, and who knows whether cold is on the right or the left here? Good thing you read English, but if you didn’t, you’d be out of luck. That’s encoded data. (Also interesting: if instead of ‘Hot’ and ‘Cold’, it were hieroglyphics or kanji, would it be fixed? Maybe, if the characters you used weren’t too lossy.)
  • It’s obvious to us which of these is hot and which is cold, but that’s because we’ve all agreed to the standard. But this is still encoded – hot water isn’t actually red at all. One could make a case for this simple encoding, though – lakes are blue and coals are red – and it’s a pretty good one. There’s simply a little bit of intuition and guesswork going on, but it might be necessary. The red/blue temperature grammar is a pretty common one, at any rate.

One totally culture-encoding free way of conveying information is to use an actual representation – the way some bus stops have a picture of a bus. Not easy mistaking that one. However, look what happens here – we’d have to show the water molecules vibrating in place, faster for the hot water, to show what the difference is in the physical world. Well, that assumes a significant amount of prior knowledge of physics, and more people in the world likely speak English than know very much about molecules. We could do it a lot of different ways, but I can’t think of a perfect one, so comment if you were able to think of it.

Signs, Symbols, and Icons

The images below all express the same idea.

Do these all communicate the exact same information? Which of these three are you most accustomed to seeing? Which would you prefer to have if you were chilling with someone who wasn’t a native English speaker? If you were playing a game with a child? If you were inebriated? Seriously, these are all valid cases for usability.

OLPC, Followup

As long as the world is enduring the perils of globalization (potential disease outbreaks, religious conflict, more fast food on Earth) we may as well really make it worth our own while – the most effective means of production we’ve got is also the easiest to divide up and do as chunks in separate hemispheres. I’m aware that international outsourcing is already taking place; that there are call centers and Visual Basic shops in developing countries. I know the setup is not ideal; who would expect it to be at this point? Certainly not those paying for it. But it’s a good start, and the quality of output will rapidly catch up.

But isn’t it a shame to move all those specialized jobs from the United States to a developing country? Absolutely not. The free market demands it, for one, and it’s immensely useful for both American businesses and the global economy I glossed over just a moment ago. If you’re not angry when your iPhone is manufactured in China, why would you mind if it’s programmed there too? I’m immensely pleased with the process. What the developing world craves is more development, and for that it needs to be transacted with freely. It’s far easier to outsource informational tasks than manufacturing orders. I know some people have issues with this:

If you feel protective of American jobs, how would you feel if your state passed a law that no business chartered within its borders was allowed to conduct business in neighboring states? That would harm both affected states and hamper the regional economy. This is a similar situation, if you’re of the mind that people on different continents are as important as the ones who live across the street.

Why is it that it’s so important for this to occur? Wouldn’t it be more desirable to our existing information economy for the second world to continue through the developmental epochs of subsistence agriculture into manufacturing before attempting to broker in information as we do? Absolutely not – any ‘leap’ over large-scale agriculture or crude manufacturing is desirable. Not only for the environment and public health, but for your own interests.

Incidentally, this kind of information brokering has already occurred, where dollars are exchanged for foreign labor in fair markets.

Rent A Coder is a site where you can post software projects or offer to do them. This is different from Craigslist primarily in that coders attempt to underbid each other for the gig.

Gold Farmers will spend hours playing games online so that they can sell you virtual stuff for $USD. Absurd, but completely rational, and a way for children to earn income in a place with no job market, while playing video games like children ought to be doing.

Amazon’s Mechanical Turk is a beautiful reduction of the information economy into nearly atomic pieces. You can earn seven cents by coming up with a sports trivia question. One can take it as a suggestion that the third world is ready to enter the information age, and that we would all benefit from it.

OLPC FTW

We’ll start with a classic:

Two men are standing beside the road watching the new backhoe dig a hole. “Look at that. Think of how many men with shovels could be working if we didn’t have that thing,” says the first man. The second one says, “Hey, think of how many men with spoons could be working if we didn’t have the shovels!”

Take the following with a grain of salt, as I’m no Steven D. Levitt, nor a Malcolm Gladwell, and certainly not any Thomas L. Friedman.

I think that the One Laptop Per Child program, an initiative to get easy-to-use, open source personal computers to children around the world, has the potential to completely change the world’s economy for the better. As little economists, we’re all taught somehow or another that:

  • Tools have throughout human history multiplied our ability to accomplish tasks. Technological change is responsible for the majority of our economic advancement over our ancestors.
  • The most advanced tools to date (that don’t fly around in space or kill people) are computers, which create and distribute massive amounts of information worldwide, and allow people to create and organize that information in useful ways to increase productivity.
  • Specialization of Labor makes the members of a cooperative market more effective at creating wealth than the same number of people operating on their own.
  • A free labor market dictates the cost of labor in proportion to the need for that type of labor. Specialized skilled labor tends to be a non-commodity, and wages are higher for those working in specialized fields. This is why income and education are directly correlated.

See where I’m going with this? Become a software developer. Just kidding; you can already do that if you want to, or you may have already. (If so, congratulations.) The important thing is that we get more software developers, more graphic artists and writers and musicians and paper pushers and bean counters. Let’s let everyone have a computer to write instructions for and pay the ones who can do it in a useful capacity.

Eight Queens in SQL

I told my co-workers last week that SQL could help one figure out, among other puzzles, Eight Queens. I’m sure they believed me, but I couldn’t find it on the internets so I wrote it. Here it is, just run it on your database of choice:


--SQL Eight Queens

--Create the board:
CREATE TABLE rows (
id integer PRIMARY KEY);
INSERT INTO rows (id) VALUES (1),(2),(3),(4),(5),(6),(7),(8);

CREATE TABLE cols (
id integer PRIMARY KEY);
INSERT INTO cols (id) VALUES (1),(2),(3),(4),(5),(6),(7),(8);

--Get a set of queens:
SELECT
cols.id AS col1, rows.id AS row1,
col2, row2,
col3, row3,
col4, row4,
col5, row5,
col6, row6,
col7, row7,
col8, row8
FROM rows, cols, (SELECT
col3, row3,
col4, row4,
col5, row5,
col6, row6,
col7, row7,
col8, row8,
rows.id AS row2, cols.id AS col2
FROM rows, cols, (SELECT
col4, row4,
col5, row5,
col6, row6,
col7, row7,
col8, row8,
rows.id AS row3, cols.id AS col3
FROM rows, cols, (SELECT
col5, row5,
col6, row6,
col7, row7,
col8, row8,
rows.id AS row4, cols.id AS col4
FROM rows, cols, (SELECT
col6, row6,
col7, row7,
col8, row8,
rows.id AS row5, cols.id AS col5
FROM rows, cols, (SELECT
col7, row7,
col8, row8,
rows.id AS row6, cols.id AS col6
FROM rows, cols, (SELECT
col8, row8,
rows.id AS row7, cols.id AS col7
FROM rows, cols, (SELECT
rows.id AS row8, cols.id AS col8
FROM rows, cols)
AS b8
WHERE cols.id != col8 AND rows.id != row8 --This checks rook moves
AND (cols.id + rows.id != col8 + row8) --This checks bishop moves
AND (cols.id - rows.id != col8 - row8)
) AS b7
WHERE cols.id != col8 AND rows.id != row8
AND cols.id != col7 AND rows.id != row7
AND (cols.id + rows.id != col8 + row8) AND (cols.id - rows.id != col8 - row8)
AND (cols.id + rows.id != col7 + row7) AND (cols.id - rows.id != col7 - row7)
) AS b6
WHERE cols.id != col8 AND rows.id != row8
AND cols.id != col7 AND rows.id != row7
AND cols.id != col6 AND rows.id != row6
AND (cols.id + rows.id != col8 + row8) AND (cols.id - rows.id != col8 - row8)
AND (cols.id + rows.id != col7 + row7) AND (cols.id - rows.id != col7 - row7)
AND (cols.id + rows.id != col6 + row6) AND (cols.id - rows.id != col6 - row6)
) AS b5
WHERE cols.id != col8 AND rows.id != row8
AND cols.id != col7 AND rows.id != row7
AND cols.id != col6 AND rows.id != row6
AND cols.id != col5 AND rows.id != row5
AND (cols.id + rows.id != col8 + row8) AND (cols.id - rows.id != col8 - row8)
AND (cols.id + rows.id != col7 + row7) AND (cols.id - rows.id != col7 - row7)
AND (cols.id + rows.id != col6 + row6) AND (cols.id - rows.id != col6 - row6)
AND (cols.id + rows.id != col5 + row5) AND (cols.id - rows.id != col5 - row5)
) AS b4
WHERE cols.id != col8 AND rows.id != row8
AND cols.id != col7 AND rows.id != row7
AND cols.id != col6 AND rows.id != row6
AND cols.id != col5 AND rows.id != row5
AND cols.id != col4 AND rows.id != row4
AND (cols.id + rows.id != col8 + row8) AND (cols.id - rows.id != col8 - row8)
AND (cols.id + rows.id != col7 + row7) AND (cols.id - rows.id != col7 - row7)
AND (cols.id + rows.id != col6 + row6) AND (cols.id - rows.id != col6 - row6)
AND (cols.id + rows.id != col5 + row5) AND (cols.id - rows.id != col5 - row5)
AND (cols.id + rows.id != col4 + row4) AND (cols.id - rows.id != col4 - row4)
) AS b3
WHERE cols.id != col8 AND rows.id != row8
AND cols.id != col7 AND rows.id != row7
AND cols.id != col6 AND rows.id != row6
AND cols.id != col5 AND rows.id != row5
AND cols.id != col4 AND rows.id != row4
AND cols.id != col3 AND rows.id != row3
AND (cols.id + rows.id != col8 + row8) AND (cols.id - rows.id != col8 - row8)
AND (cols.id + rows.id != col7 + row7) AND (cols.id - rows.id != col7 - row7)
AND (cols.id + rows.id != col6 + row6) AND (cols.id - rows.id != col6 - row6)
AND (cols.id + rows.id != col5 + row5) AND (cols.id - rows.id != col5 - row5)
AND (cols.id + rows.id != col4 + row4) AND (cols.id - rows.id != col4 - row4)
AND (cols.id + rows.id != col3 + row3) AND (cols.id - rows.id != col3 - row3)
) AS b2
WHERE cols.id != col8 AND rows.id != row8
AND cols.id != col7 AND rows.id != row7
AND cols.id != col6 AND rows.id != row6
AND cols.id != col5 AND rows.id != row5
AND cols.id != col4 AND rows.id != row4
AND cols.id != col3 AND rows.id != row3
AND cols.id != col2 AND rows.id != row2
AND (cols.id + rows.id != col8 + row8) AND (cols.id - rows.id != col8 - row8)
AND (cols.id + rows.id != col7 + row7) AND (cols.id - rows.id != col7 - row7)
AND (cols.id + rows.id != col6 + row6) AND (cols.id - rows.id != col6 - row6)
AND (cols.id + rows.id != col5 + row5) AND (cols.id - rows.id != col5 - row5)
AND (cols.id + rows.id != col4 + row4) AND (cols.id - rows.id != col4 - row4)
AND (cols.id + rows.id != col3 + row3) AND (cols.id - rows.id != col3 - row3)
AND (cols.id + rows.id != col2 + row2) AND (cols.id - rows.id != col2 - row2)
LIMIT 1337; --Arbitrary, you can let yours go all day.
--Note that this won't return very many unique solutions: each board shows up once for every ordering of the eight queens (unless your queens have numbers written on them)

I realize that this could be more elegant by trimming out the hard-coded values, and that I could set it up for N queens, but I got excited when it ran for 8. I wrote a nonrecursive brute-force version that ended as expected, with me sighing and restarting Postgres. If I go and edit it, it’ll certainly be to put the results in a human-readable form. Because it’s really cool, but isn’t smart enough to choose good placements ahead of time, I give myself 7 Queens out of a possible 8.
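
For anyone who wants to poke at the N-queens and human-readable ideas, here is a rough sketch of one way it might go. It is not the query above: it swaps the nested subqueries for a flat self-join that pins one queen to each column, and it uses Python's built-in sqlite3 module with an in-memory database just so it runs without a server. The table name board_rows and the helpers queens_sql, solve, and render are made up for this example.

```python
import sqlite3


def queens_sql(n):
    """Build a SELECT that places one queen per column on an n x n board.

    Assumes a table board_rows(id) holding 1..n (a hypothetical name,
    not from the original query). Alias qC is the queen in column C,
    and qC.id is the row she sits on.
    """
    cols = range(1, n + 1)
    select = ", ".join(f"q{c}.id AS row{c}" for c in cols)
    tables = ", ".join(f"board_rows AS q{c}" for c in cols)
    conds = []
    for j in cols:
        for i in range(1, j):
            d = j - i  # column distance between queen i and queen j
            # Same row is the rook move; d rows up or down is the bishop move.
            conds.append(
                f"q{j}.id NOT IN (q{i}.id, q{i}.id + {d}, q{i}.id - {d})"
            )
    where = " WHERE " + " AND ".join(conds) if conds else ""
    return f"SELECT {select} FROM {tables}{where}"


def solve(n=8):
    """Run the generated query on a throwaway in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE board_rows (id INTEGER PRIMARY KEY)")
    conn.executemany(
        "INSERT INTO board_rows (id) VALUES (?)",
        [(r,) for r in range(1, n + 1)],
    )
    solutions = conn.execute(queens_sql(n)).fetchall()
    conn.close()
    return solutions


def render(solution):
    """Draw one solution (a tuple of row numbers, one per column) as text."""
    n = len(solution)
    return "\n".join(
        "".join("Q" if solution[c] == r else "." for c in range(n))
        for r in range(1, n + 1)
    )


if __name__ == "__main__":
    boards = solve(8)
    print(len(boards), "distinct boards")
    print(render(boards[0]))
```

Because every queen is tied to her own column, the queens are no longer interchangeable, so each board should come back exactly once: 92 rows for eight queens. The trade-off is that the SQL is generated rather than written by hand, which is arguably cheating.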