Note: Many people find this lesson challenging. If you get stuck, skip the lesson (and the next one) and come back later. This information is here for your knowledge, but is not required to progress with the tutorials.
Bit manipulation operators manipulate individual bits within a variable.
Why bother with bitwise operators?
In the past, memory was extremely expensive, and computers did not have much of it. Consequently, there were incentives to make use of every bit of memory available. Consider the bool data type -- even though it only has two possible values (true and false), which can be represented by a single bit, it takes up an entire byte of memory! This is because variables need unique addresses, and memory can only be addressed in bytes. The bool uses 1 bit and the other 7 go to waste.
Using bitwise operators, it is possible to write functions that allow us to compact 8 booleans into a single byte-sized variable, enabling significant memory savings at the expense of more complex code. In the past, this was a good trade-off. Today, at least for application programming, it is probably not.
Now memory is significantly cheaper, and programmers have found that it is often better to write code that is easy to understand and maintain than code that is maximally efficient. Consequently, bitwise operators have somewhat fallen out of favor, except in certain circumstances where maximum optimization is needed (e.g. scientific programs that use enormous data sets, games where bit manipulation tricks can be used for extra speed, or embedded programming, where memory is still limited). Nevertheless, it is good to at least know about their existence.
There are 6 bit manipulation operators:
|left shift||<<||x << y||all bits in x shifted left y bits|
|right shift||>>||x >> y||all bits in x shifted right y bits|
|bitwise NOT||~||~x||all bits in x flipped|
|bitwise AND||&||x & y||each bit in x AND each bit in y|
|bitwise OR|| | ||x | y||each bit in x OR each bit in y|
|bitwise XOR||^||x ^ y||each bit in x XOR each bit in y|
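Bit manipulation is one of the few cases where you should unambiguously use unsigned integer data types. This is because, before C++20, the language did not guarantee how signed integers are stored, nor how some bitwise operators (notably the shifts) apply to negative signed values. C++20 tightened these rules, but unsigned types remain the clearest choice.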
Rule: When dealing with bit operators, use unsigned integers.
Bitwise left shift (<<) and bitwise right shift (>>) operators
Note: In the following examples, we will generally be working with 4-bit binary values. This is for the sake of convenience and keeping the examples simple. In C++, the number of bits used will be based on the size of the data type (8 bits per byte).
The bitwise left shift (<<) operator shifts bits to the left. The left operand is the expression to shift, and the right operand is an integer number of bits to shift by. So when we say 3 << 1, we are saying "shift the bits in the literal 3 left by 1 place".
For example, consider the number 3, which is binary 0011:
3 = 0011
3 << 1 = 0110 = 6
3 << 2 = 1100 = 12
3 << 3 = 1000 = 8
Note that in the third case, we shifted a bit off the end of the number! Bits that are shifted off the end of the binary number are lost forever.
(Reminder: We're working with 4-bit values here. With an 8-bit value, 3 << 3 would be 0001 1000, which is decimal 24. In this case, the 4th bit wouldn't be shifted off the left end of the binary number).
The bitwise right shift (>>) operator shifts bits to the right.
12 = 1100
12 >> 1 = 0110 = 6
12 >> 2 = 0011 = 3
12 >> 3 = 0001 = 1
Note that in the third case we shifted a bit off the right end of the number, so it is lost.
Although our examples above involve shifting literals, you can shift variables as well:
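Note that the results of applying the bitwise shift operators to a signed integer can be compiler dependent (the rules were only tightened in C++20), which is another reason to prefer unsigned integers here.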
What!? Aren't operator<< and operator>> used for input and output?
They sure are.
Programs today typically do not make much use of the bitwise left and right shift operators to shift bits. Rather, you tend to see the bitwise left shift operator used with std::cout to output text. Consider the following program:
This program prints: 8
In the above program, how does operator<< know to shift bits in one case and output x in another case? The answer is that std::cout has provided its own version of the << operator that gives it a new meaning when used in conjunction with std::cout. This process is called operator overloading. When the compiler sees that the left operand of operator<< is std::cout, it knows that it should call the version of operator<< that std::cout overloaded to do output. If the left operand is an integer type, then the standard version of operator<< is called, which does its usual bit-shifting behavior.
We will talk more about operator overloading in a future section, including discussion of how to overload operators for your own purposes.
The bitwise NOT operator (~) is perhaps the easiest to understand of all the bitwise operators. It simply flips each bit from a 0 to a 1, or vice versa. Note that the result of a bitwise NOT is dependent on what size your data type is!
Assuming 4 bits:
4 = 0100
~4 = 1011 = 11 (decimal)
Assuming 8 bits:
4 = 0000 0100
~4 = 1111 1011 = 251 (decimal)
Bitwise AND, OR, and XOR
Bitwise AND (&) and bitwise OR (|) work similarly to their logical AND and logical OR counterparts. However, rather than evaluating a single boolean value, they are applied to each bit! For example, consider the expression 5 | 6. In binary, this is represented as 0101 | 0110. To do (any) bitwise operations, it is easiest to line the two operands up like this:
0 1 0 1 // 5
0 1 1 0 // 6
and then apply the operation to each column of bits. If you remember, logical OR evaluates to true (1) if either the left or the right or both operands are true (1). Bitwise OR evaluates to 1 if either bit (or both) is 1. Consequently, 5 | 6 evaluates like this:
0 1 0 1 // 5
0 1 1 0 // 6
-------
0 1 1 1 // 7
Our result is 0111 binary (7 decimal).
We can do the same thing to compound OR expressions, such as 1 | 4 | 6. If any of the bits in a column are 1, the result of that column is 1.
0 0 0 1 // 1
0 1 0 0 // 4
0 1 1 0 // 6
--------
0 1 1 1 // 7
1 | 4 | 6 evaluates to 7.
Bitwise AND works similarly. Logical AND evaluates to true if both the left and right operand evaluate to true. Bitwise AND evaluates to 1 only if both bits in the column are 1. Consider the expression 5 & 6. Lining each of the bits up and applying an AND operation to each column of bits:
0 1 0 1 // 5
0 1 1 0 // 6
--------
0 1 0 0 // 4
Similarly, we can do the same thing to compound AND expressions, such as 1 & 3 & 7. If all of the bits in a column are 1, the result of that column is 1.
0 0 0 1 // 1
0 0 1 1 // 3
0 1 1 1 // 7
--------
0 0 0 1 // 1
The last operator is the bitwise XOR (^), also known as exclusive or. When evaluating two operands, XOR evaluates to true (1) if one and only one of its operands is true (1). If neither or both are true, it evaluates to 0. Consider the expression 6 ^ 3:
0 1 1 0 // 6
0 0 1 1 // 3
-------
0 1 0 1 // 5
It is also possible to evaluate compound XOR expressions column style, such as 1 ^ 3 ^ 7. If there are an even number of 1 bits in a column, the result is 0. If there are an odd number of 1 bits in a column, the result is 1.
0 0 0 1 // 1
0 0 1 1 // 3
0 1 1 1 // 7
--------
0 1 0 1 // 5
Bitwise assignment operators
As with the arithmetic assignment operators, C++ provides bitwise assignment operators in order to facilitate easy modification of variables.
|Left shift assignment||<<=||x <<= y||Shift x left by y bits|
|Right shift assignment||>>=||x >>= y||Shift x right by y bits|
|Bitwise OR assignment|| |= ||x |= y||Assign x | y to x|
|Bitwise AND assignment||&=||x &= y||Assign x & y to x|
|Bitwise XOR assignment||^=||x ^= y||Assign x ^ y to x|
For example, instead of writing x = x << 1, you can write x <<= 1.
Summarizing how to evaluate bitwise operations utilizing the column method:
When evaluating bitwise OR, if any bit in a column is 1, the result for that column is 1.
When evaluating bitwise AND, if all bits in a column are 1, the result for that column is 1.
When evaluating bitwise XOR, if there are an odd number of 1 bits in a column, the result for that column is 1.
1) What does 0110 >> 2 evaluate to in binary?
2) What does 5 | 12 evaluate to in decimal?
3) What does 5 & 12 evaluate to in decimal?
4) What does 5 ^ 12 evaluate to in decimal?
1) Solution
0110 >> 2 evaluates to 0001
2) Solution
5 | 12 =
0 1 0 1
1 1 0 0
-------
1 1 0 1 = 13
3) Solution
5 & 12 =
0 1 0 1
1 1 0 0
-------
0 1 0 0 = 4
4) Solution
5 ^ 12 =
0 1 0 1
1 1 0 0
-------
1 0 0 1 = 9
(Code fragments from the shift examples earlier in this lesson:)
x = x << 1; // x will be 8
x = x << 1; // use operator<< for left shift
std::cout << x; // use operator<< for output
Operator overloading (less commonly known as ad-hoc polymorphism) is a specific case of polymorphism (part of the OO nature of the language) in which some or all operators, such as +, = or ==, are treated as polymorphic functions and as such behave differently depending on the types of their arguments. Operator overloading is usually only syntactic sugar: it can easily be emulated using function calls.
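Consider this operation, written as function calls (add and multiply stand for any suitable functions):
add(a, multiply(b, c))
Using operator overloading permits a more concise way of writing it, like this:
a + b * c
(Assuming the * operator has higher precedence than +.)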
Operator overloading can provide more than an aesthetic benefit, since the language allows operators to be invoked implicitly in some circumstances. Criticism of operator overloading arises because it lets programmers give operators completely arbitrary functionality, with nothing to enforce the consistency that users and readers expect. Usage of the << operator is an example of this problem.
The expression a << 1 returns twice the value of a if a is an integer variable, but if a is an output stream it writes "1" to the stream instead. Because operator overloading allows the programmer to change the usual semantics of an operator, it is usually considered good practice to use operator overloading with care.
To overload an operator is to provide it with a new meaning for user-defined types. This is done in the same fashion as defining a function. The basic syntax follows (where @ represents a valid operator):
Not all operators may be overloaded, new operators cannot be created, and the precedence, associativity or arity of operators cannot be changed (for example, ! cannot be overloaded as a binary operator). Most operators may be overloaded as either a member function or a non-member function; some, however, must be defined as member functions. Operators should only be overloaded where their use would be natural and unambiguous, and they should perform as expected. For example, overloading + to add two complex numbers is a good use, whereas overloading * to push an object onto a vector would not be considered good style.
- A simple Message Header
Operators as member functions
Aside from the operators which must be members, operators may be overloaded as member or non-member functions. The choice of whether or not to overload as a member is up to the programmer. Operators are generally overloaded as members when they:
- change the left-hand operand, or
- require direct access to the non-public parts of an object.
When an operator is defined as a member, the number of explicit parameters is reduced by one, as the calling object is implicitly supplied as an operand. Thus, binary operators take one explicit parameter and unary operators none. In the case of binary operators, the left hand operand is the calling object, and no type coercion will be done upon it. This is in contrast to non-member operators, where the left hand operand may be coerced.
- + (addition)
- - (subtraction)
- * (multiplication)
- / (division)
- % (modulus)
As binary operators, these involve two arguments which do not have to be the same type. These operators may be defined as member or non-member functions. An example illustrating overloading for the addition of a 2D mathematical vector type follows.
It is good style to only overload these operators to perform their customary arithmetic operation. Because operator+ has been overloaded as a member function, it can access private fields.
- ^ (XOR)
- | (OR)
- & (AND)
- ~ (complement)
- << (shift left, insertion to stream)
- >> (shift right, extraction from stream)
All of the bitwise operators are binary, except complement, which is unary. It should be noted that these operators have a lower precedence than the arithmetic operators, so if ^ were to be overloaded for exponentiation, x ^ y + z may not work as intended. Of special mention are the shift operators, << and >>. These have been overloaded in the standard library for interaction with streams. When overloading these operators to work with streams the rules below should be followed:
- overload << and >> as friends, so that they can access the private variables of the class
- pass the stream in by reference (input/output modifies the stream, and copying is not allowed)
- the operator should return a reference to the stream it receives (to allow chaining, cout << 3 << 4 << 5)
- An example using a 2D vector
The assignment operator, =, must be a member function, and is given default behavior for user-defined classes by the compiler, performing an assignment of every member using its assignment operator. This behavior is generally acceptable for simple classes which only contain variables. However, where a class contains references or pointers to outside resources, the assignment operator should be overloaded (as a general rule, whenever a destructor and copy constructor are needed, so is the assignment operator). Otherwise, for example, two strings would share the same buffer and changing one would change the other.
In this case, an assignment operator should perform two duties:
- clean up the old contents of the object
- copy the resources of the other object
For classes which contain raw pointers, before doing the assignment, the assignment operator should check for self-assignment, which generally will not work otherwise (once the old contents of the object are erased, they cannot be copied to refill the object). Self-assignment is generally a sign of a coding error, and thus for classes without raw pointers this check is often omitted: the action wastes CPU cycles but has no other effect on the code.
Another common use of overloading the assignment operator is to declare the overload in the private part of the class and not define it. Any code which attempts an assignment then fails on two counts: first by referencing a private member function, and second by failing to link for lack of a valid definition. This is done for classes where copying is to be prevented, generally together with a privately declared copy constructor. Since C++11, the same effect is usually achieved by declaring these functions as deleted (= delete).
- == (equality)
- != (inequality)
- > (greater-than)
- < (less-than)
- >= (greater-than-or-equal-to)
- <= (less-than-or-equal-to)
All relational operators are binary, and should return either true or false. Generally, all six operators can be based off a comparison function, or each other, although this is never done automatically (e.g. overloading > will not automatically overload < to give the opposite). There are, however, some templates defined in the header <utility>, in namespace std::rel_ops; once they are brought into scope with a using-directive, it suffices to overload operator== and operator<, and the other operators are provided by the standard library. Note that std::rel_ops is deprecated as of C++20.
- ! (NOT)
- && (AND)
- || (OR)
The logical AND operator (&&) is used when evaluating two expressions to obtain a single boolean result. It corresponds to the Boolean logical operation AND, which yields true if both operands are true, and false otherwise.
The ! operator is unary, && and || are binary. It should be noted that in normal use, && and || have "short-circuit" behavior, where the right operand may not be evaluated, depending on the left operand. When overloaded, these operators get function call precedence, and this short circuit behavior is lost. It is best to leave these operators alone.
With the built-in operator, in an expression such as Function1() && Function2(), if the result of Function1() is false, then Function2() is not called.
With an overloaded operator&&, in an expression such as Function3() && Function4(), both Function3() and Function4() will be called no matter what Function3() returns. This is a waste of CPU processing, and worse, it could have surprising unintended consequences compared to the expected "short-circuit" behavior of the default operators.
Compound assignment operators
- += (addition-assignment)
- -= (subtraction-assignment)
- *= (multiplication-assignment)
- /= (division-assignment)
- %= (modulus-assignment)
- &= (AND-assignment)
- |= (OR-assignment)
- ^= (XOR-assignment)
- <<= (shift-left-assignment)
- >>= (shift-right-assignment)
Compound assignment operators should be overloaded as member functions, as they change the left-hand operand. Like all other operators (except basic assignment), compound assignment operators must be explicitly defined; they will not be generated automatically (e.g. overloading = and + will not automatically overload +=). A compound assignment operator should work as expected: A @= B should be equivalent to A = A @ B. An example of += for a two-dimensional mathematical vector type follows.
Increment and decrement operators
- ++ (increment)
- -- (decrement)
Increment and decrement have two forms, prefix (++i) and postfix (i++). To differentiate, the postfix version takes a dummy integer. Increment and decrement operators are most often member functions, as they generally need access to the private member data in the class. The prefix version in general should return a reference to the changed object. The postfix version should just return a copy of the original value. In a perfect world, A += 1, A = A + 1, A++, ++A should all leave A with the same value.
Often one operator is defined in terms of the other for ease in maintenance, especially if the function call is complex.
The subscript operator, [ ], is a binary operator which must be a member function (hence it takes only one explicit parameter, the index). The subscript operator is not limited to taking an integral index. For instance, the index for the subscript operator for the std::map template is the same as the type of the key, so it may be a string etc. The subscript operator is generally overloaded twice; as a non-constant function (for when elements are altered), and as a constant function (for when elements are only accessed).
Function call operator
The function call operator, ( ), is generally overloaded to create objects which behave like functions, or for classes that have a primary operation. The function call operator must be a member function, but has no other restrictions - it may be overloaded with any number of parameters of any type, and may return any type. A class may also have several definitions for the function call operator.
Address of, Reference, and Pointer operators
These three operators, operator&(), operator*() and operator->() can be overloaded. In general these operators are only overloaded for smart pointers, or classes which attempt to mimic the behavior of a raw pointer. The pointer operator, operator->() has the additional requirement that the result of the call to that operator, must return a pointer, or a class with an overloaded operator->(). In general A == *&A should be true.
Note that overloading operator& can lead to undefined behavior:
- ISO/IEC 14882:2003, Section 5.3.1
- The address of an object of incomplete type can be taken, but if the complete type of that object is a class type that declares operator&() as a member function, then the behavior is undefined (and no diagnostic is required).
These are extremely simplified examples designed to show how the operators can be overloaded and not the full details of a SmartPointer or SmartReference class. In general you won't want to overload all three of these operators in the same class.
The comma operator, operator,(), can be overloaded. The built-in comma operator evaluates its operands strictly left to right, whereas an overloaded operator,() is an ordinary function call, so be aware that overloading the comma operator has many pitfalls.
For the non-overloaded comma operator, the order of execution will be Function1(), then Function2(). With an overloaded comma operator, the compiler could (before C++17) call either Function1() or Function2() first.
Member Reference operators
The two member access operators, operator->() and operator->*() can be overloaded. The most common use of overloading these operators is with defining expression template classes, which is not a common programming technique. Clearly by overloading these operators you can create some very unmaintainable code so overload these operators only with great care.
When the -> operator is applied to a pointer value of type (T *), the language dereferences the pointer and applies the . member access operator (so x->m is equivalent to (*x).m). However, when the -> operator is applied to a class instance, it is called as a unary postfix operator; it is expected to return a value to which the -> operator can again be applied. Typically, this will be a value of type (T *), as in the example under Address of, Reference, and Pointer operators above, but can also be a class instance with operator->() defined; the language will call operator->() as many times as necessary until it arrives at a value of type (T *).
Memory management operators
- new (allocate memory for object)
- new[ ] (allocate memory for array)
- delete (deallocate memory for object)
- delete[ ] (deallocate memory for array)
The memory management operators can be overloaded to customize allocation and deallocation (e.g. to insert pertinent memory headers). They should behave as expected: new should return a pointer to newly allocated memory on the heap, and delete should deallocate memory, ignoring a NULL argument. To overload new for a class, several rules must be followed:
- new must be a member function (it is implicitly static)
- the return type must be void*
- the first explicit parameter must be a size_t value
To overload delete for a class there are also conditions:
- delete must be a member function (it is implicitly static, and cannot be virtual)
- the return type must be void
- there are only two forms available for the parameter list, and only one of the forms may appear in a class:
Conversion operators enable objects of a class to be either implicitly (coercion) or explicitly (casting) converted to another type. Conversion operators must be member functions, and should not change the object which is being converted, so should be flagged as constant functions. The basic syntax of a conversion operator declaration, and declaration for an int-conversion operator follows.
Notice that the function is declared without a return-type, which can easily be inferred from the type of conversion. Including the return type in the function header for a conversion operator is a syntax error.
Operators which cannot be overloaded
- ?: (conditional)
- . (member selection)
- .* (member selection with pointer-to-member)
- :: (scope resolution)
- sizeof (object size information)
- typeid (object type information)
To understand the reasons why the language doesn't permit these operators to be overloaded, read "Why can't I overload dot, ::, sizeof, etc.?" in Bjarne Stroustrup's C++ Style and Technique FAQ ( http://www.stroustrup.com/bs_faq2.html#overload-dot ).