Byte is the smallest unit of measurement of computer’s memory. A byte consists of smaller units called bits (for binary digits). On most of the computer systems today, a byte is represented by eight bits. A bit can take one of the two values, 1 or 0. The C programming language gives programmers, the power to manipulate individual bits using several operators, categorized as ‘Bitwise’ operators. Many modern programming languages have copied this feature of C and they also provide these operators. We are going to discuss here these operators; but, before that it would be interesting and useful to understand the way integers are stored in computer’s memory.
Quick links
How numbers are stored in computer’s memory?
Let us first understand how non-negative integers are dealt with. When such a number is stored in computer’s internal memory, its binary equivalent is created there in the form of electronic charges or bits. We know that C language has several integral data types such as char, int, short int, long int, etc. And on most of the C compilers today (I use gcc on Linux), a char variable takes 1 byte (8 bits), a short int takes 2 bytes (16 bits), and an int takes 4 bytes (32 bits). Now suppose that a byte stored at some address in computer’s memory, is a string of eight bits as follows-
0 1 1 0 0 0 1 0
7 6 5 4 3 2 1 0
The rightmost bit of a byte (numbered 0), is known as “least significant bit” (LSB) or “low order bit”, while the leftmost bit (numbered 7) is known as the “most significant bit” (MSB) or “high order bit”. If the data (string of bits) stored in the byte mentioned above, is considered as an integer, then its value is 98. To understand clearly, why it is so, let us take analogy from our familiar decimal number system.
The decimal system derives its name from the fact that there are 10 symbols used to represent a number. These 10 symbols are our 10 digits, 0 to 9. Every digit in a number has a place value; and the value of the number is sum of the place values of its digits. The place value of a digit is equal to the intrinsic value of that digit multiplied by some power of 10. For example, consider the number 3568. It can be represented as-
3568 = 3000 + 500 + 60 + 8 = 3 x 103 + 5 x 102 + 6 x 101 + 8 x 100
The power of 10 is zero for the right-most digit (at the units place) and goes on increasing by 1 leftwards. Because we are using 10 symbols, that is why the multiplier base is 10.
Similarly, in the binary system, there are 2 symbols (0 and 1), and the value of a number is sum of the place values of its digits. Here, the place value of a digit is equal to the intrinsic value of that digit multiplied by some power of 2, instead of 10. So, the above-mentioned byte is storing the number with value-
0 x 27 + 1 x 26 + 1 x 25 + 0 x 24 + 0 x 23 + 0 x 22 + 1 x 21 + 0 x 20
= 64 + 32 + 2 = 98decimal
Another analogy to notice is the range of numbers that can be stored. We easily understand that, in decimal number system, the largest 3-digit positive number is 999; for 4 digits, it is 9999, and for 5 digits, it is 99999. In general, the largest n-digit positive decimal number is 10n -1. Likewise, the largest positive binary number that can be stored in a char type of variable (which occupies 1 byte or 8 bits) is 11111111, whose decimal equivalent is 28 -1 = 255. For short int type of variable (16 bits), it is 216 -1 = 65,535, and for int type of variable (32 bits), it is 232 -1 = 4,294,967,295. But for this to happen, you must declare these variables as unsigned char, unsigned short int, and unsigned int, respectively.
Why? To understand this, we have to know, how negative numbers are stored in computer’s memory. When you create signed variables, C language treats the data (in the form of bits) stored therein, somewhat differently. One change is, now the MSB (left most bit) of the variable is considered as sign bit; if it is 0 (zero), then the remaining number is considered as positive (non-negative), while if it is 1 (one), then the remaining number is considered as negative. One more change is, the place value of this sign bit is now taken as negative. For example, consider a byte containing the following string of bits-
1 1 1 0 0 0 1 0
7 6 5 4 3 2 1 0
Here, the C language would consider, the value of the number stored in this byte as –
-1 x 27 + 1 x 26 + 1 x 25 + 0 x 24 + 0 x 23 + 0 x 22 + 1 x 21 + 0 x 20
= -128 + 64 + 32 + 2 = -30decimal
One consequence of this, which you would easily guess, is that the value of largest positive number now reduces to half of the corresponding value in unsigned variable. This is because, now one bit (and that too, the most significant) less is contributing to the value of the number. So, the largest positive binary number that can be stored in a char type of variable now, is 01111111, whose decimal equivalent is 27 -1 = 127. For short int type of variable (16 bits), it is 0111111111111111 = 215 -1 = 32,767, and for int type of variable (32 bits), it is 231 -1 = 2,147,483,647.
What is the largest negative number that can be stored in char type of variable? It is 11111111, whose decimal value can be calculated as –
-1 x 27 + 1 x 26 + 1 x 25 + 1 x 24 + 1 x 23 + 1 x 22 + 1 x 21 + 1 x 20
= -128 + 64 + 32 + 16 + 8 + 4 + 2 + 1 = -1decimal
You can verify that even in case of short int, int and long int, the largest negative number would have value -1. The smallest negative numbers, however, would be different in these cases. For char type of variable, the smallest negative number is 10000000, whose decimal equivalent is -27 = -128. For short int type of variable (16 bits), it is 1000000000000000 = -215 = -32,768, and for int type of variable (32 bits), it is -231 = -2,147,483,648.
In technical jargon, this method of storage of signed integers is called as “twos complement notation”. Here, negating a number (whether positive or negative) is achieved by inverting all the bits (replace all 1s with 0s and all zeroes with ones) and then adding 1 to the result. So, remember the rule, “INVERT AND ADD 1”.
For example, 5decimal is stored in a char variable as 00000101. If we flip all the bits in it (this can be achieved by ones complement operator, described later in this article), we get 11111010; next, add 1 to it, which gives 11111011. We can easily verify that it value is -128 + 64 + 32 + 16 + 8 + 2 + 1 = -5decimal.
Again, if you want to negate this -5, to get the original +5, adopt the same procedure. Invert all the bits of 11111011 to get 00000100; then add 1 to it, giving 00000101.
Another View
To further consolidate understanding of storage of negative numbers, consider example of a byte with 3 bits only. For such a byte, the possible permutations of bits are-
000 001 010 011 100 101 110 111.
If we consider these as unsigned integers, then these 8 numbers can be used to represent decimal numbers 0 to 7. If we decide that the left most bit (MSB) should act as sign bit, then the first 4 numbers would represent non-negative integers, while the remaining 4 would be negative. In decimal number system, we get the next integer by adding 1 to the current integer and obtain the previous integer by subtracting 1 from the current integer. Same procedure is there in binary number system also, but now the addition or subtraction is binary. So, starting with zero, represented by 000, we get 1 by (binary) adding 1 to it: 000 + 001 = 001. Then 2 as 001 + 001 = 010 and then 3 as 010 + 001 = 011. We can not go beyond this, because adding 1 once again would give negative number by our agreed notation (011 + 001 = 100). So, 3 is the largest positive integer possible here. Now, to get negative range, we start from zero and go on subtracting 1. The first subtraction is shown below-
( 1 ) 0 0 0 /* (the 1 in parentheses is borrow) */
- 0 0 1
1 1 /* (carry) */
-------------
1 1 1 /* (minus one) */
Subtract 1 to get -2 (111 – 001 = 110). Again subtract 1 to get -3 (110 – 001 = 101). And finally subtract 1, once again, to get -4 (101 – 001 = 100). We can not reduce further, because next subtraction would give 011, which is a positive number in our agreed scheme. So, -4 is the smallest negative number here.
Bitwise Operators
Having understood the storage of numbers, we will now discuss the operators in C to manipulate at bit level. There are six bitwise operator in C language.
- & bitwise AND
- | bitwise inclusive OR
- ^ bitwise exclusive OR
- ~ ones complement
- << left shift operator
- >> right shift operator
Out of these, the one’s complement operator is unary, while the remaining five are binary. Only integral type operands are allowed for all the 6 operators.
Ones Complement Operator
This is simplest to understand. It flips the bits of its operand. On a single bit, it operates according to the following rule table-
For example,
37 00100101
----------
~37 11011010
Bitwise Logical Operators
The 3 of the bitwise operators, viz., &, |, ^ are called as bitwise logical operators. These are evaluated according to the rules in the following truth table.
b1 |
b2 |
b1 & b2 |
b1 | b2 |
b1 ^ b2 |
0 |
0 |
0 |
0 |
0 |
0 |
1 |
0 |
1 |
1 |
1 |
0 |
0 |
1 |
1 |
1 |
1 |
1 |
1 |
0 |
In the following examples, notice that an operator works on a pair of corresponding bits. That means, say, bit number 5 of first number and bit number 5 of second number will go as operands to the operator, to produce bit number 5 of the resulting number. Similar thing happens for all other bits of the operands.
37 00100101
52 00110100
----------
37 & 52 00100100
37 00100101
52 00110100
----------
37 | 52 00110101
37 00100101
52 00110100
----------
37 ^ 52 00010001
Bitwise Shift Operators
Two of the bitwise operators, viz., << and >> are called as shift operators. Each of these are binary operators. The left operand must be of integral type and it represents the bit pattern that is to be shifted. The right operand must be a non-negative integer. It represents the displacement of bits to be carried out in the bit string of the left operand. The right operand number can not be greater than the number of bits (i.e. word size) in the left operand.
Left Shift Operator
When a left shift operation is performed on an integer, the bits contained in that number are literally shifted to the left. Bits that are shifted out through the high order bit of the data item are lost, and zeroes are always shifted into the low order bit position of the value. The left shifting is equivalent to multiplying the number by 2.
Examples-
37 0000000000100101
37 << 0 0000000000100101 37
37 << 1 0000000001001010 74
37 << 2 0000000010010100 148
37 << 3 0000000100101000 296
Right Shift Operator
This operator shifts the bits of its first operand to the right, by the number of positions specified by its second operand. Bits that are shifted out of the low order bit of the data item are lost. Zeroes are shifted in on the left, that is, through the high order bits. The right shifting is equivalent to dividing the number by 2.
Examples-
37 0000000000100101
37 >> 0 0000000000100101 37
37 >> 1 0000000000010010 18
37 >> 2 0000000000001001 9
37 >> 3 0000000000000100 4
Printing Bit Pattern (Binary Number)
You may be wondering, how I have displayed the binary numbers in the examples above (shown in green) and your first guess may be that I have calculated those values and manually typed those. But No! They are output from a C program. We know that the standard C library function printf( ) has format specifiers to print an integer in decimal (%d), octal (%o), and hexadecimal (%x) format, but no facility to print it in binary format. However, it is easy now (with the above knowledge) to write a function to display an integral number in binary format.
The following program has two such functions, one to display 8-bits number and another for 16-bits number.
void show_char_bits(char n)
{
unsigned char i, j;
for(i=128; i>0; i = i>>1)
{
j = n & i;
if(j) printf("1");
else printf("0");
}
}
void show_int_bits(short int n)
{
unsigned short int i, j;
for(i=32768; i>0; i = i>>1)
{
j = n & i;
if(j) printf("1");
else printf("0");
}
}
main()
{
int i, j;
printf("\n 37 "); show_char_bits(37);
printf("\n 52 "); show_char_bits(52);
printf("\n ----------");
printf("\n 37 & 52 "); show_char_bits(37 & 52);
printf("\n\n");
printf("\n 37 "); show_char_bits(37);
printf("\n ----------");
printf("\n ~37 "); show_char_bits(~37);
printf("\n\n");
i = 37;
printf("\n 37 "); show_int_bits(i);
j = i << 0;
printf("\n 37 << 0 "); show_int_bits(j);
printf(" %4d", j);
j = i << 1;
printf("\n 37 << 1 "); show_int_bits(j);
printf(" %4d", j);
j = i << 2;
printf("\n 37 << 2 "); show_int_bits(j);
printf(" %4d", j);
j = i << 3;
printf("\n 37 << 3 "); show_int_bits(j);
printf(" %4d", j);
printf("\n\n");
}
The show_char_bits( ) function prints, at the current cursor position, the 8 bits of its argument ‘n’ which is of char type, and the show_int_bits( ) function prints the 16 bits of its argument ‘n’ which is of short int type. The show_char_bits( ) function has a ‘for’ loop that iterates 8 times and whose loop control variable (LCV) ‘i’ takes values 128, 64, 32, 16, 8, 4, 2, and 1 in the successive iterations respectively. You would notice that the binary representation of 128 has ‘1’ in the MSB (bit 7) and ‘0’ in all other bits. When this number is bitwise ‘anded’ with the argument ‘n’, it would result either in 128 (considered as true in C) or in zero (considered as false), depending upon whether the MSB of ‘n’ is 1 or 0. The result is stored in ‘j’ and the subsequent ‘if’ statement inspects its value; it then decides, depending upon that value whether MSB (bit 7) of ‘n’ is 1 or 0 and prints accordingly. In the second iteration LCV ‘i’ takes the value 64, which has got ‘1’ at bit 6 and ‘0’ in the remaining seven bits. So, it will print bit 6. This will go on till bit 0 (LSB) is printed after which the loop ends and the function returns. The show_int_bits( ) function behaves in similar fashion, but here the loop iterates 16 times.
In the subsequent article of this series, we will see practical uses of these operators in programming practice.