Making friends with binary representation

Posted by Cory Petosky Fri, 22 Aug 2008 02:46:00 GMT

This post is inspired by Nick’s aversion to bitwise operators all over the place. Despite this glaring personality flaw, he’s still worth hiring.

Many developers, and unfortunately almost every Flash developer, are far too comfortable with their abstract data types like double and int that they forget that, under the hood, everything’s stored as binary numbers. Having just a little knowledge about how things work under the hood makes you write better, faster, cleaner code.

For this article (and every article), the terms double and Number refer to the same thing – an IEEE-754 double-precision floating-point number. Put briefly, a double takes up 8 bytes (64 bits) in memory – 1 bit for the sign (positive or negative), 11 bits for the exponent, and 52 bits for the precise part of the number, or “mantissa”. Every time you need to use a double, it takes the mantissa and multiplies it by 2^exponent. There’s more to it than that, but the two main points are:

  1. They’re slow, because you and your clients are probably still using a 32-bit processor, so a double can’t fit through the processor in one go.
  2. They’re imprecise, as they’re storing a mathematical description to derive your value, not your actual value. This can lead to bugs.

Put simply, in performance-critical code, you don’t want to use doubles at all. Instead, you want to use your fixed-point alternative, commonly the int. In Flash and Java, an int is 4 bytes (32 bits), with an exceptionally easy pattern – 1 bit for the sign (posiitve or negative) and the other 31 bits represent the number in binary. There’s some trickery with negative numbers, but you can pretend it’s not there – it doesn’t affect anything you’re likely to do. Ints can’t store fractions natively, but that’s not usually an issue – store your price in cents instead of dollars, etc. Ints are fast and precise, and, because of their simple bit structure, you can do cool things with them.

The bitshift

With an int, you get a couple extra operators, called your shift operators. Using an int called foo, the expression foo << 2 shifts all bits in the binary representation of foo to the left two spaces. Similarly, foo >> 2 shifts all bits in the binary representation to the right two spaces. This might not seem helpful, so we’ll look at some regular decimal math to illustrate why you like these operators. You do these operations all the time in your day-to-day life!

The number 573 isn’t particularly friendly, and most of you probably can’t multiply or divide it in your head. However, you could multiply it by 10 if I asked you to – you’d immediately respond with 5,730. You could also divde it by 10 – you’d respond with 57. In base-10 math, shifting the digits left is the same as multiplying by 10, and shifting the digits right is the same as dividing by 10. It works similarly in binary base-2 math – shifting left is multiplying by 2, and shifting right divides by 2.

What’s less obvious is that, in a computer, a shift is a ridiculously easy operation to perform – significantly easier than a multiply and ridiculously easier than a divide. They’re something you should use whenever appropriate. Nick thinks they clutter your code with unreadable nonsense, to which I always reply, “well, you should learn to read code.” :)

The bitwise logical operators

You’ve probably used the logical operators AND and OR (&& and ||, in most languages these days). Both of these operators evaluate the truthiness of each of its arguments, and then return a truth value based on them. Truthiness, in most languages, is represented by a boolean, which can either be true or false. Under the hood, in binary, 0 is false and 1 is true.

An int is 32 binary values. Thus, you can choose to think of ints as 32 boolean values in a row. You could then operate on two ints, comparing the truthiness of each bit to generate a new int based on the first two. These are what the bitwise AND and bitwise OR accomplish (& and | in most languages these days).

For example, lets say you have an ARGB color. You want to extract the green value. In an ARGB color, each of the alpha, the red, the green, and the blue channels take up 8 bits. We can use a bitwise AND to zero-out everything but the green portion, and then use a bitshift to move it to the position where it’s readable by a human.

function getGreenPart(color:uint):uint { // For sake of example, let's say color is 0x3377AA
    const greenMask:uint = 0x00FF00; // Just get the green parts -- everything else in the int is binary 0s, so they'll AND to false, giving binary 0s in the result
    var greenByItself = color & greenMask; // Now greenByItself is 0x007700
    var greenWithoutPadding = greenByItself >> 8; // Now greenWithoutPadding is 0x000077.
    return greenWithoutPadding; // We can display this in an editable field and it will make sense.
}

You can also use bitwise operators to store a bunch of boolean values in a single int. For example, lets say a sandwich can have Mayo, Beef, Cheese, and Lettuce. None of these options are mutually exclusive. You could store them as four boolean values in your sandwich class, but that quickly gets bloated once you add all possible condiments, meats, cheeses, and accessories. A bitfield is much easier.

class Ingredients {
   // Bitfields are easy in hex -- just double the digit until it wraps around.
   public static const ROAST_BEEF:uint = 0x01;
   public static const SWISS_CHEESE:uint = 0x02;
   public static const LETTUCE:uint = 0x04;
   public static const MAYO:uint = 0x08;
   public static const SALT:uint = 0x10;
   public static const PEPPER:uint = 0x20;
   public static const JALAPENOS:uint = 0x40;

   // Some convenience entries
   public static const EVERYTHING:uint = 0xFFFF; // Has all ingredients, even future ones
   public static const NOTHING:uint = 0x0000; // Just the bread -- we make good bread, some people buy it plain
}
class Sandwich {
   public var ingredients:uint = Ingredients.NOTHING;

   public function Sandwich(ingredients:uint) {
      this.ingredients = ingredients;
   }
}

// Now we can start making sandwiches
var justBeefSandwich = new Sandwich(Ingredients.ROAST_BEEF);
var veggieSandwich = new Sandwich(Ingredients.LETTUCE | Ingredients.JALAPENOS);
var myFavoriteSandwich = new Sandwich(Ingredients.ROAST_BEEF | Ingredients.LETTUCE | Ingredients.SALT);
var theWorksSandwich = new Sandwich(Ingredients.EVERYTHING & !Ingredients.JALAPENOS); // It's for my mom, she can't take the spicy

// We can also detect which ingredients a sandwich has
function calculatePrice(sandwich:Sandwich):Number {
   var price:Number = 1.50; // Start with a little cost, for the bread.
   if (sandwich.ingredients & Ingredients.ROAST_BEEF)
      price += 3.00;
   if (sandwich.ingredients & Ingredients.SWISS_CHEESE)
      price += 2.00;
   if (sandwich.ingredients & Ingredients.JALAPENOS)
      price += 0.50;
   return price;
}

We can use a bitwise-OR to combine two flags into a single field, and we can chain them together to add as many ingredients as we want. We can use a bitwise-AND plus the NOT operator (! in many languages) to remove a flag that’s already there. I didn’t demo it, but we can also use a bitwise-XOR (the^ operator in most languages) to toggle a flag – it adds it if it’s not there, but removes it if it is. Finally, we can detect the presence of a (single) flag by using a bitwise-AND – the result will be 0 (false) if the flag doesn’t exist, and the value of the flag (which is non-zero, so true) if the flag is there.

Anyway, this is getting longish, so I’ll leave it at this. You should at least appreciate that bitwise operations exist now, and hopefully you can find a use for them in your next project.

This entry was posted on Fri, 22 Aug 2008 02:46:00 GMT . You can follow any any response to this entry through the Atom feed. You can leave a comment .

Tags , ,


Comments

  1. nate about 3 hours later:

    Yeah I should make more use of this in my projects as well. I will come back to this post for review for sure as I forget this stuff all the time. Hey, it saves you the time explaining it to me again! Good thinkin’! ;-)

  2. Nick about 22 hours later:

    Nice article :) Although, for the record, I don’t have an aversion to bitwise operations. Especially in situations where they make sense, like the examples you just posted. I was just saying that they make your code harder for noobs to read! Perhaps that’s actually a secret benefit, it keeps them from messin with your stuff.

  3. Cory about 23 hours later:

    Duly corrected.

    Man, I wish I could get n00bs to stop messing with my stuff here… though in fairness, it’s usually me messing with the noobs’ stuff, and them IMing me frantically later wondering why stuff looks funny now.

Leave a comment