Introduction
An algorithm is a precisely defined procedure, implemented with a finite number of instructions, that executes a specific task. A few requirements:
- often has one or more input values (but not necessarily)
- at least one output value (the result)
- completes in a finite amount of time
An algorithm has deterministic behaviour if it behaves the same way under the same conditions: it is predictable and always produces the same results. There are “correct” algorithms, which produce the correct/expected results for their input values, and “incorrect” algorithms, which produce unexpected results or simply never terminate (pseudoalgorithms).
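As a concrete illustration (a minimal sketch of my own, not from the original text), Euclid’s gcd algorithm satisfies all three requirements: two input values, one output value, and termination in a finite number of steps. It is also deterministic, since the same inputs always produce the same result.

```python
def gcd(a: int, b: int) -> int:
    """Euclid's algorithm: two inputs, one output, a finite number of steps."""
    while b != 0:          # each iteration strictly decreases b, so the loop terminates
        a, b = b, a % b
    return a

print(gcd(48, 36))  # always prints 12 -- deterministic behaviour
```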
Algorithm specification
Algorithms are usually expressed via:
- natural language
- graphic forms (flow diagram)
- programming language
- pseudo language
I won’t go into too much detail, but for the pseudo language (used in the following articles) I’ll share some basic conventions:
- while, repeat, for, if, case, loop (similar to Pascal)
- Indentation marks the body of a structure, which is closed with end + structure name (end_while, end_if)
- Single and multiple value assignment (a=b and a=b=c)
- Value replacement (swap): a<->b
- Local & global variables
- ".." – array range
- Pointers. For an object (containing fields x and y) with a pointer p to it, we can access them with x(p) and y(p)
- Record. Reference: <Name>.<field>
- Procedure parameter transfer is by value & by reference
- Recursion is allowed
- Informal language constructions are allowed (for each, find, etc.)
- Calls to previously defined procedures/functions are written in upper case (ERROR, INPUT, etc.)
- Comments are excluded
Algorithm Analysis (Complexity)
The primary measure of an algorithm is usually its efficiency and performance. Efficiency is measured by the computer resources used during execution (time and space). Space is a less limiting factor nowadays, so execution time is the main/primary criterion.
Algorithm time depends on:
- Number of instructions and their execution time in cycles (architecture)
- Machine code generated by the compiler
- Input data
The first two don’t reflect the essence of the algorithm, so determining algorithm performance relies on calculating the time complexity of the algorithm itself, as a function of the size of the problem. For the size of the problem we take the one parameter that most affects the execution time: the amount of input data (n). Algorithm complexity will then be expressed as a function f(n).
We can determine algorithm complexity using mathematics, combinatorics, algebraic transformations, etc., but some of these lead to unsolvable mathematical models. The complexity of such models is determined via simulation or empirically (measurements after implementation).
Examples:
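To illustrate the empirical approach, here is a minimal sketch (my own, not from the original text) that times a simple linear scan for growing input sizes; the measured time growing roughly proportionally to n hints at linear complexity.

```python
import time

def linear_scan(data):
    """Touch every element once -- a deliberately O(n) workload."""
    total = 0
    for x in data:
        total += x
    return total

# Empirical measurement: time the same algorithm for growing problem sizes n.
for n in (100_000, 200_000, 400_000, 800_000):
    data = list(range(n))
    start = time.perf_counter()
    linear_scan(data)
    elapsed = time.perf_counter() - start
    print(f"n = {n:>7}: {elapsed:.4f} s")
```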
The analysis usually yields a mathematical expression as a result, containing one or more terms that depend on the size n, plus constants. To determine algorithm performance it’s enough to know the order of magnitude, so the complexity gets approximated by the fastest-growing term.
The usual approach for expressing algorithm complexity is O-notation. For f(n) we say it’s O(g(n)) if there are positive constants c and n0 such that f(n) <= c*g(n) for every n >= n0; in other words, the ratio f(n)/g(n) stays bounded by c as n goes to infinity. This is the approach for finding the upper limit of complexity, because f(n) is asymptotically bounded from above by g(n).
O-notation is not an ideal method for determining algorithm complexity. It easily establishes the upper limit, but because of the approximations it says nothing about the real performance of the algorithm. It also discards constant terms, which can be decisive in some cases.
Because of these problems, we sometimes need to determine the lower limit of complexity using Ω-notation. For f(n) we say it’s Ω(g(n)) if there are positive constants c and n0 such that f(n) >= c*g(n) for every n >= n0.
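As a quick sanity check of both definitions (a sketch of mine, not from the original), take f(n) = 3n^2 + 5n + 7 and g(n) = n^2. With c = 4 and n0 = 7 the inequality f(n) <= c*g(n) holds, so f(n) is O(n^2); with c = 3 and n0 = 1 we also have f(n) >= c*g(n), so it is Ω(n^2) as well.

```python
def f(n):
    return 3 * n**2 + 5 * n + 7

def g(n):
    return n**2

# O(n^2): f(n) <= 4*g(n) for all n >= 7 (verified here on a finite range)
assert all(f(n) <= 4 * g(n) for n in range(7, 10_000))

# Omega(n^2): f(n) >= 3*g(n) for all n >= 1
assert all(f(n) >= 3 * g(n) for n in range(1, 10_000))

print("f(n) = 3n^2 + 5n + 7 is both O(n^2) and Omega(n^2)")
```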
Aside from the amount of input data, its values also determine algorithm performance. With that in mind, we can have the best (minimum complexity), average and worst case scenarios. Average-case analysis captures the essence of the algorithm, but it’s also the most difficult one, since we don’t know what the “average case” is (not all inputs/values are equally probable). Usually we rely on an analysis with random inputs.
Worst case analysis provides us with guaranteed performance (an upper limit):
- O(1) – constant algorithms: complexity doesn’t depend on the input data. The most desirable (and rarest) algorithms. E.g. adding an element to the beginning of a linked list [Polynomial]
- O(log n) – logarithmic algorithms: quite popular due to their slow and steady growth. If not explicitly stated, the base is 2 (log2 n). E.g. binary search of a sorted set, as shown in the sketch after this list [Polynomial]
- O(n) – linear algorithms: loops executing n times, usually processing all input data. E.g. sequential search of an unsorted array [Polynomial]
- O(n log n) – linear-logarithmic algorithms: binary decisions splitting the problem while still processing all input data. E.g. the best sorting algorithms (based on key comparison) have this complexity [Polynomial]
- O(n^2) – quadratic algorithms: the usual form is two nested loops of size n. E.g. direct (simple) sorting methods [Polynomial]
- O(n^k) – algorithms of degree k: a form of k nested loops of size n, with significant growth of complexity for large n. Acceptable for computer solving, especially for moderate values of k [Polynomial]
- O(k^n), k > 1 – exponential algorithms: not suitable for computer solving (for large dimensions), due to the extreme growth of the exponential function. E.g. brute force
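The sketch below (mine, not part of the original text) contrasts the O(n) sequential search and the O(log n) binary search mentioned above; the binary version halves the remaining range on every comparison, which is where the logarithm comes from.

```python
def sequential_search(arr, target):
    """O(n): may have to inspect every element of an unsorted array."""
    for i, value in enumerate(arr):
        if value == target:
            return i
    return -1

def binary_search(sorted_arr, target):
    """O(log n): each comparison halves the remaining range (array must be sorted)."""
    lo, hi = 0, len(sorted_arr) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if sorted_arr[mid] == target:
            return mid
        elif sorted_arr[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1

data = list(range(0, 100, 2))          # already sorted: 0, 2, 4, ..., 98
print(sequential_search(data, 42))     # 21
print(binary_search(data, 42))         # 21
```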
Algorithm implementation
The implementation of an algorithm can also affect its efficiency; it depends on the programming language and on the computer system running it. If an algorithm runs numerous times (frequent processing or inside loops), optimization is very important. On the other hand, for algorithms that are executed rarely, the usual focus is on simplicity of the implementation (easy to understand, code and test).
When selecting an algorithm, we shouldn’t rely exclusively on its complexity, but also on the conditions in which the algorithm is going to execute. No algorithm is ideal under all conditions, so we should take that into consideration.
Conclusion
This was just a basic introduction to algorithm specification, analysis and implementation. Although nowadays we don’t lack memory, we should still balance memory usage against time efficiency; they’re tightly connected. As mentioned, most algorithm complexity assessment is an approximation, but knowing how to approach, analyse and read these estimates will definitely come in handy at some point.