The runtime complexity of an algorithm was always a huge source of confusion for me back in college, and the math behind Big O notation was hard to connect to practice. Searching for a definition on Google, I would agree with this one: "Big O notation is used in Computer Science to describe the performance or complexity of an algorithm". Generally, Big O describes the worst-case scenario, and it can be used to describe either the execution time required or the space used (e.g. in memory or on disk) by an algorithm. I found that the best way to understand Big O thoroughly was to work through some examples in code, which is hardly an original thought on my part. With Big O, I generally compare the complexity of an algorithm relative to its input: Big O notation describes the limiting behaviour of an algorithm. It is a simplified expression of how the run time of an algorithm scales with the size of the input.
For example (in Java):
/** Takes an array of strings and concatenates them.
 * @param strings the array of strings to concatenate
 * @return a string that is the result of concatenating all the strings
 *         in the array
 */
public static String badConcat(String[] strings) {
    String totalString = "";
    for (String s : strings) {
        for (int i = 0; i < s.length(); i++) {
            totalString += s.charAt(i);  // creates a brand new String on every append
        }
    }
    return totalString;
}
Now think about what this is actually doing. It goes through every character of the input and adds each one onto the result. That seems straightforward, but the problem is that String is immutable: every time you add a letter onto the string you have to create a new String, copy the values from the old string into the new one, and then append the new character. That means the first letter ends up being copied roughly n times, where n is the number of characters in the input, the second letter n-1 times, and so on, so in total there are about n*(n-1)/2 copies. That is (n^2-n)/2, and since Big O notation usually keeps only the highest-order term and drops any constant factors multiplied with it, we end up with O(n^2).

Using something like a StringBuilder instead brings this down to roughly O(n log n), depending on how the buffer grows, and if you calculate the total number of characters up front and set the capacity of the StringBuilder you can get it down to O(n). So with 1000 characters of input, the first example performs roughly a million operations, the simple StringBuilder roughly 10,000, and the StringBuilder with a preset capacity about 1000 operations to do the same thing. These are rough estimates, but that is the point: Big O notation is about orders of magnitude, not exact run times.
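To make the comparison concrete, here is a rough sketch of that StringBuilder approach (the name betterConcat and the exact pre-sizing step are my own illustration, not part of the original example):

/** Concatenates an array of strings using a pre-sized StringBuilder. */
public static String betterConcat(String[] strings) {
    int totalLength = 0;
    for (String s : strings) {
        totalLength += s.length();           // count the characters up front
    }
    StringBuilder sb = new StringBuilder(totalLength); // pre-sized buffer, so no resizing copies
    for (String s : strings) {
        sb.append(s);                        // each character is copied once: O(n) overall
    }
    return sb.toString();
}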
O(1)
O(1) describes an algorithm that will always execute in the same time (or space) regardless of the size of the input data set.
bool checkFirstBlock(String[] strings)
{
    // Only ever looks at the first element, so the cost is constant
    // regardless of how large the array is.
    if(strings[0] == null)
    {
        return true;
    }
    return false;
}
O(N)
O(N) describes an algorithm whose performance will grow linearly and in direct proportion to the size of the input data set. The example below also demonstrates how Big O favors the worst-case performance scenario; a matching string could be found during any iteration of the for loop and the function would return early, but Big O notation will always assume the upper limit where the algorithm will perform the maximum number of iterations.
bool ContainsValue(String[] strings, String value)
{
    // In the worst case the loop visits every element exactly once.
    for(int i = 0; i < strings.Length; i++)
    {
        if(strings[i] == value)
        {
            return true;
        }
    }
    return false;
}
O(N^2)
O(N^2) represents an algorithm whose performance is directly proportional to the square of the size of the input data set. This is common with algorithms that involve nested iterations over the data set. Deeper nested iterations will result in O(N^3), O(N^4), etc.
bool ContainsDuplicates(String[] strings)
{
    // Compares every element against every other element: roughly N * N comparisons.
    for(int i = 0; i < strings.Length; i++)
    {
        for(int j = 0; j < strings.Length; j++)
        {
            if(i == j) // Don't compare with self
            {
                continue;
            }
            if(strings[i] == strings[j])
            {
                return true;
            }
        }
    }
    return false;
}
O(2^N)
O(2^N) denotes an algorithm whose running time doubles with each additional element in the input data set. The execution time of an O(2^N) function quickly becomes very large.
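A common illustration, added here as my own sketch, is a naive recursive Fibonacci function, where every call spawns two further calls:

// Naive recursive Fibonacci: each call makes two more calls, so the number
// of calls roughly doubles for every extra unit of n.
int fibonacci(int n) {
    if (n <= 1) {
        return n;
    }
    return fibonacci(n - 1) + fibonacci(n - 2);
}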
Logarithms
Binary search is a technique used to search sorted data sets. It works by selecting the middle element of the data set, essentially the median, and comparing it against a target value. If the values match it will return success. If the target value is higher than the value of the probe element it will take the upper half of the data set and perform the same operation against it. Likewise, if the target value is lower than the value of the probe element it will perform the operation against the lower half. It will continue to halve the data set with each iteration until the value has been found or until it can no longer split the data set.
This type of algorithm is described as O(log N). The iterative halving of data sets described in the binary search example produces a growth curve that peaks at the beginning and slowly flattens out as the size of the data set increases, e.g. an input data set containing 10 items takes one second to complete, a data set containing 100 items takes two seconds, and a data set containing 1000 items takes three seconds. Doubling the size of the input data set has little effect on its growth, since after a single iteration of the algorithm the data set will be halved and therefore on a par with an input data set half the size. This makes algorithms like binary search extremely efficient when dealing with large data sets.
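Here is a minimal sketch of that procedure in Java (my own illustration, not from the original text):

// Iterative binary search over a sorted array: the search range is
// halved on every iteration, giving O(log N) comparisons.
int binarySearch(int[] sorted, int target) {
    int low = 0;
    int high = sorted.length - 1;
    while (low <= high) {
        int mid = low + (high - low) / 2;   // middle element of the current range
        if (sorted[mid] == target) {
            return mid;                      // found the target
        } else if (sorted[mid] < target) {
            low = mid + 1;                   // search the upper half
        } else {
            high = mid - 1;                  // search the lower half
        }
    }
    return -1;                               // not found
}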
Oh, and do I use it?
Yes, all the time - when I'm figuring out how efficient my code is, it gives a great 'back-of-the-envelope' approximation of the cost. Obviously, you can find a hundred examples of Big O on the internet, covering polynomial, factorial, logarithmic and other bits of math. However, many experienced developers I talk to are confused about why they really need this information and whether it could ever save their day. So what is Big-O actually good for? The good news is that Big-O belongs to an entire family of notation, and it serves as a good indicator of which algorithm to use once you take your individual circumstances into consideration. I use Big O notation on occasion to convey algorithmic complexity to fellow programmers, and I use the underlying theory (e.g. Big O analysis techniques) all of the time when I think about which algorithms to use.
In a nutshell, the Big-O of a given algorithm combined with the specific problem knowledge is a great way to choose the best algorithm for your situation. I also agree that the Big-O lives in the land of theory and doesn't care very much about the real world.
So why is Big-O commonly associated with worst-case running times, and why is that imprecise?
It's because when considering the worst possible case, it is natural to give a limit on how bad that worst case can be, not how good it can be. That is, we want to give an upper bound on its degree of badness. Similarly, we often want to give a lower bound on how good the best case is (i.e., even on good inputs there is still a limit on how fast the algorithm can go; what is that limit?), so Big-Omega gets associated with the best case.
That's why Big-O gets associated with worst-case running times and Big-Omega with best-case. And it's true that if someone just says "the running time" is O(n^2), then n^2 is indeed "closer" to the worst-case running time than to the best-case running time, in the sense that n^2 is "bigger" than all possible running times, and the worst-case running time is "bigger" than the best-case running time. But O(n^2) doesn't mean that the worst-case running time actually is n^2, just that it is at most n^2.
Myths about Big O
You cannot use Big-O to compare the speed of two algorithms. Big-O only says approximately how much slower an algorithm will get if you double the number of items processed, or how much faster it will get if you cut the number in half.
However, if you have two entirely different algorithms and one (A) is O(n^2) and the other (B) is O(log n), that does not mean that A is slower than B. In fact, with 100 items, A might be ten times faster than B. It only tells you that, going from 100 items to 200, A's running time will grow according to n^2 (roughly quadrupling) while B's will grow according to log n (barely changing). So if you benchmark both and you know how much time A takes to process 100 items and how much time B needs for the same 100 items, and A is faster than B, you can calculate at how many items B will overtake A in speed (since B slows down much more slowly than A, it will overtake A sooner or later, that much is certain).
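As a back-of-the-envelope sketch of that calculation (the cost constants below are made up purely for illustration, not measured from anything):

// Hypothetical cost models fitted from a benchmark at n = 100:
// A (the O(n^2) algorithm) takes about 0.001 * n^2 milliseconds,
// B (the O(log n) algorithm) takes about 10 * log2(n) milliseconds.
// Find the smallest n at which B becomes faster than A.
static double costA(int n) { return 0.001 * n * n; }
static double costB(int n) { return 10 * (Math.log(n) / Math.log(2)); }

static int crossover() {
    for (int n = 2; n < 1_000_000; n++) {
        if (costB(n) < costA(n)) {
            return n;   // with these made-up constants, B overtakes A around n = 286
        }
    }
    return -1;          // crossover not reached within the range
}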
Math of Big O
'Big-O' notation is used to compare the growth rates of two functions of a variable (say n) as n gets very large. If function g grows no more quickly than function f, we say that g = O(f), meaning that for large enough n, f will always be at least as large as g up to a constant scaling factor.
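Stated formally (the standard textbook definition, added here for completeness): g = O(f) means there exist a constant c > 0 and a threshold n0 such that g(n) <= c * f(n) for all n >= n0. For example, 3n + 5 = O(n), because 3n + 5 <= 4n as soon as n >= 5.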
It turns out that this is a very useful idea in the analysis of algorithms, because we are often precisely concerned with the growth rates of functions which represent, for example, the time taken by two different algorithms. Very coarsely, we can determine that an algorithm with run-time t1(n) is more efficient than an algorithm with run-time t2(n) if t1 = O(t2) for large enough n which is typically the 'size' of the problem - like the length of the array or number of nodes in the graph or whatever.
This stipulation, that n gets large enough, allows us to pull a lot of useful tricks. Perhaps the most often used one is that you can simplify functions down to their fastest growing terms. For example n^2 + n = O(n^2) because as n gets large enough, the n^2 term gets so much larger than n that the n term is practically insignificant. So we can drop it from consideration.
However, it does mean that big-O notation is less useful for small n, because the slower growing terms that we've forgotten about are still significant enough to affect the run-time.
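To put numbers on that: at n = 10 the dropped term still matters, since n^2 + n = 110 versus n^2 = 100, a 10% difference; at n = 1000 it is 1,001,000 versus 1,000,000, a difference of only 0.1%.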
What we now have is a tool for comparing the costs of two different algorithms, and a shorthand for saying that one is quicker or slower than the other. Big-O notation can be abused, which is a shame as it is imprecise enough already! There are equivalent notations for saying that a function grows less quickly than another (little-o), and that two functions grow at the same rate (Big-Theta).