Monday, 9 January 2017

Counting bits

Counting bits is a very important task in many problems, like data compression. In a computer all the numeric values are stored in binary format. For example, if we want to encode the number 212 in a 32 bits register it will look like this:

0000 0000 0000 0000 0000 0000 1101 0100

The operation that counts the number of bits set to 1 in a register is called bitcount() or popcount(). We can also employ the definition of the Hamming weight, which is the number of elements in a vector that are different to the 0 symbol. In our example:

bitcount(212) = 4

The implementation of this operation depends on the architecture of the CPU in which we run our program. Sometimes, the CPU will have a instruction implementing the bitcount() at hardware level. The instructions POPCNT and LZCNT are part of the ABM (Advanced Bit Manipulation) set. These instructions are included in the x86 architecture in the SSE 4.2 instruction set.

If we have access to a hardware instruction this will be our best option, because we will obtain the result in few execution cycles. If we are programming in a high level language we must check the necessary steps to use the hardware instruction (normally we must call an specific function and compile the program with an special flag).

Nevertheless, there are situations in which we do not have access to a hardware instruction. For such cases there is a good general solution that employs bitwise operations (before going forward you should have knowledge of hexadecimal numeric representations). The following code applies for 32 bits registers:

inline uint32_t popcount(uint32_t i) {
    i = i - ((i >> 1) & 0x55555555);
    i = (i & 0x33333333) + ((i >> 2) & 0x33333333);
    return (((i + (i >> 4)) & 0xF0F0F0F) * 0x1010101) >> 24;

This algorithm is known as HAKMEM, developed by Beeler, Gosper and Schroppel in 1972. As you can see the code is a bit complex, so you can add a comment stating "//Just works" or "//Magic follows". Another option is to keep reading a bit further, so you can blame me later if I do not explain it properly.

HAKMEM employs a divide and conquer approach to count the bits in parallel. Let's see how it works by following the example with the number 212.

1) Shift the initial register one bit to the right i >> 1:

0000 0000 0000 0000 0000 0000 1101 0100
0000 0000 0000 0000 0000 0000 0110 1010

2) Apply the AND function with the mask 0x55555555 ( i >> 1 ) & 0x55555555:

0000 0000 0000 0000 0000 0000 0110 1010
0101 0101 0101 0101 0101 0101 0101 0101
0000 0000 0000 0000 0000 0000 0100 0000

3) Subtract the result to the original value i - ( ( i >> 1 ) & 0x55555555 ):

0000 0000 0000 0000 0000 0000 1101 0100
0000 0000 0000 0000 0000 0000 0100 0000
0000 0000 0000 0000 0000 0000 1001 0100

With this three initial steps we have obtained the total number of bits set to one in every pair of bits of the original register. If we compare the last 8 bits of the original register with our current result:

11 01  01 00
10 01  01 00

We can see that the pair 11 has 2 bits set to one (10). It is followed by to pairs 01 with 1 bit set to one (01) and a pair 00 without any bit set (00).

The next operation adds this values in pairs, obtaining the number of bits set to one in each group of 4 bits, let's see how:

4) First we apply the bit mask 0x33333333 to our last value i & 0x33333333:

0000 0000 0000 0000 0000 0000 1001 0100
0011 0011 0011 0011 0011 0011 0011 0011
0000 0000 0000 0000 0000 0000 0001 0000

5) Then we shift the last value two bits to the right and apply the mask (i >> 2) & 0x33333333:

0000 0000 0000 0000 0000 0000 1001 0100
0000 0000 0000 0000 0000 0000 0010 0101
0011 0011 0011 0011 0011 0011 0011 0011
0000 0000 0000 0000 0000 0000 0010 0001

6) We add the two values we obtained:

0000 0000 0000 0000 0000 0000 0001 0000
0000 0000 0000 0000 0000 0000 0010 0001
0000 0000 0000 0000 0000 0000 0011 0001

Now we have the total number of ones in each group of four bits. If we compare the original value of the last 8 bits of 112 with our result:

1101 0100
0011 0001

We can see that the first group 1101 has 3 bits set to one (0011) and that the second group 0100 has only a bit set to one (0001).

The next steps obtain the total number of one in each group of 8 bits (a byte).

7) We add i + (i >> 4):

0000 0000 0000 0000 0000 0000 0011 0001
0000 0000 0000 0000 0000 0000 0000 0011
0000 0000 0000 0000 0000 0000 0011 0100

8) We apply the mask of bits 0xF0F0F0F (i + (i >> 4)) & 0xF0F0F0F:

0000 0000 0000 0000 0000 0000 0011 0100
0000 1111 0000 1111 0000 1111 0000 1111
0000 0000 0000 0000 0000 0000 0000 0100

If we compare the original value of the last byte of our original register:



We see that it is correct, 11010100 contains 4 bits (00000100).

Right now you may already understand how the algorithm works. For our example we already finished, because it only uses the 8 least significant bits. However, if you have managed to read until here there is a surprise. There is a mathematical property that can be used to calculate the bitcount of the 32 bit register in one step.

Multiplying by 0x1010101 has a special property. If the number that we multiply has four bytes (A, B, C, D), then the result of the multiplication will return a result whose 4 bytes will be (A+B+C+D, B+C+D, C+D, D). The first byte of the result contains the bit count of the 32 bit register. Now we only have to shift this result 24 bits to the right >> 24 to obtain the result.

We finished! I hope you enjoyed this as much as me preparing it. Comments, questions, knifes and shurikens ... are very welcome.

Some references:

Hackers delight, chapter 5

Sunday, 8 January 2017

Rust Tutorial 001 - Introduction and motivation

Some time ago I started a Rust tutorial, but only in spanish and I was not satisfied with the results. Now I have more spare time to spend in my own projects and a better microphone, so I decided to start the tutorial again, improving and making it more attractive. I hope to see your comments. I am doing the tutorial in two languages, spanish and english. English is not my mother tongue, but I listen to myself after the videos so I will improve as I remember the language. If you want to access to the Spanish version of the vídeo you can do it by clicking in the language tab of the blog.

Here is the playlist where I will add new videos to the tutorial. If you like it, subscribe to PerfectChipperman EN youtube channel and thumbs up.

Thursday, 5 January 2017

The Atom text editor and Rust

Atom is a modern text editor developed in Javascript by the people responsible of GitHub. In their blog you can find many tips and tricks to enhance it. I have tried this editor and works really nicely. It has a good integration with GitHub. They have done a really good promotional video with vintage aesthetics. You should have a look:

One of the best things about this editor is that it has many plug-ins that will help us to develop in Rust, the new systems language developed by Mozilla. And it works really well!!! If we set these plug-ins properly we will have code auto-completion and error highlighting. I have made a guide to help you setting up everything. This is the list of plug-ins I have installed so far and the commands needed:

language-rust - Rust language support in Atom.

apm install language-rust

linter-rust - Lint rust files, using rustc and cargo. Checks for errors in the code

apm install linter
apm install linter-rust

racer - Intelligent rust code completion. This one is bit more tricky, you need also the racer binary:

Check if language-rust plug-in is installed
Follow the steps on github to install racer or use the binary provided by mozilla
apm install racer
Download the rust source code and extract it
Configure racer from atom: Set the paths to the racer executable and the Rust source code

rust-api-docs-helper - Allows to check the Rust standard library automatically
apm install rust-api-doc-helper

You get the idea. It is very easy!!! Once installed the plug-ins can be updated automatically.
I hope you like this entry. Please post your comments with your experiences and share this post with your friends. Have you tried other alternatives to develop in Rust?

Hello everyone

Welcome to my blog, where with the aid of the android Chipperman we will get all the juice of informatics and computer science. Chipper means jolly, happy and lively, this is how I feel starting this project in which you, with your comments and contributions, are going to be the most important part. We will discuss topics of great interest, including not only posts about technical things but also about work planning and motivation.

What makes Chipperman a perfect android is not only his great attitude and motivation. He is also able to crush those problems that sometimes we do not understand and convert them into sawdust. It will be our task to take this sawdust and give a shape to it. Because what makes a computer engineer a good professional is not mastering many programming languages, but to understand the different algorithms and mathematical tools used to solve real life problems.

Perfect Chipperman is also the name of a song from an Atari ST scene musician, MC Laser / Lotek Style. Here are the links to the song and the author homepage. You will need an SNDH file player.

The project will evolve. I have created a Youtube channel where I will upload tutorials.