Computational Complexity Analysis Task
A Computational Complexity Analysis Task is an algorithm evaluation task that focuses on counting (or bounding) the number of operations an Algorithm requires to perform a Task.
- AKA: Running Time Analysis, Big-[math]\displaystyle{ \mathcal{O} }[/math] Analysis.
- Context:
- Task Input: an Algorithm.
- Task Output: a Computational Complexity Measure (typically an asymptotic bound, such as [math]\displaystyle{ \mathcal{O}(n \log n) }[/math]).
- Task Requirement: a Cost Model that specifies which operations (or memory cells) are counted.
- It can be supported by Computational Complexity Theory.
- It can be used to determine Theoretical Bounds on an Algorithm's Performance Metric.
- It can range from being a Time Complexity Analysis Task to being a Space Complexity Analysis Task.
- …
- Example(s):
- a k-Means Computational Complexity Analysis, which can be characterized as having [math]\displaystyle{ \mathcal{O}(nKId) }[/math], for [math]\displaystyle{ n }[/math]: number of points, [math]\displaystyle{ K }[/math]: number of clusters, [math]\displaystyle{ I }[/math]: number of iterations, and [math]\displaystyle{ d }[/math]: number of attributes (see the sketch after this list).
- a DTree Learning Computational Complexity Analysis, which can be characterized as having [math]\displaystyle{ \mathcal{O}(nm^2) }[/math], for [math]\displaystyle{ n }[/math] rows, and [math]\displaystyle{ m }[/math] attributes.
- a Machine Learning Training Algorithm Computational Complexity Analysis Task (for an ML training algorithm).
- an Attention Mechanism Computational Complexity Analysis (for an attention mechanism).
- …
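The following is a minimal Python sketch (not drawn from the cited sources) that counts the distance-computation work of Lloyd's k-means under an illustrative cost model, so the count can be compared against the [math]\displaystyle{ \mathcal{O}(nKId) }[/math] characterization above; all function and variable names are hypothetical.

```python
import random

def kmeans_operation_count(n, K, I, d):
    """Count the distance-computation work of I iterations of Lloyd's k-means
    on n points with d attributes and K clusters (illustrative cost model:
    one 'operation' per attribute touched in a point-to-centroid distance)."""
    points = [[random.random() for _ in range(d)] for _ in range(n)]
    centroids = random.sample(points, K)
    ops = 0
    for _ in range(I):
        # Assignment step: every point is compared with every centroid.
        assignments = []
        for p in points:
            best, best_dist = 0, float("inf")
            for k, c in enumerate(centroids):
                dist = sum((pi - ci) ** 2 for pi, ci in zip(p, c))
                ops += d                      # d multiply-adds per distance
                if dist < best_dist:
                    best, best_dist = k, dist
            assignments.append(best)
        # Update step (O(n*d) work, dominated by the assignment step).
        for k in range(K):
            members = [p for p, a in zip(points, assignments) if a == k]
            if members:
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return ops

print(kmeans_operation_count(n=200, K=3, I=5, d=4))   # n*K*I*d = 12000
```

Doubling any one of [math]\displaystyle{ n, K, I, d }[/math] should roughly double the reported count, which is the behavior the [math]\displaystyle{ \mathcal{O}(nKId) }[/math] bound predicts.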
- Counter-Example(s):
 - an Empirical Performance Benchmarking Task, which measures observed running time on specific inputs rather than deriving theoretical bounds.
- See: Computational Complexity Performance Metric, Space Complexity Analysis, Computational Time Complexity.
References
2018
- (Wikipedia, 2018) ⇒ https://en.wikipedia.org/wiki/Computational_complexity_theory Retrieved:2018-4-1.
- Computational complexity theory is a branch of the theory of computation in theoretical computer science that focuses on classifying computational problems according to their inherent difficulty, and relating those classes to each other. A computational problem is understood to be a task that is in principle amenable to being solved by a computer, which is equivalent to stating that the problem may be solved by mechanical application of mathematical steps, such as an algorithm.
A problem is regarded as inherently difficult if its solution requires significant resources, whatever the algorithm used. The theory formalizes this intuition, by introducing mathematical models of computation to study these problems and quantifying the amount of resources needed to solve them, such as time and storage. Other complexity measures are also used, such as the amount of communication (used in communication complexity), the number of gates in a circuit (used in circuit complexity) and the number of processors (used in parallel computing). One of the roles of computational complexity theory is to determine the practical limits on what computers can and cannot do.
Closely related fields in theoretical computer science are analysis of algorithms and computability theory. A key distinction between analysis of algorithms and computational complexity theory is that the former is devoted to analyzing the amount of resources needed by a particular algorithm to solve a problem, whereas the latter asks a more general question about all possible algorithms that could be used to solve the same problem. More precisely, computational complexity theory tries to classify problems that can or cannot be solved with appropriately restricted resources. In turn, imposing restrictions on the available resources is what distinguishes computational complexity from computability theory: the latter theory asks what kind of problems can, in principle, be solved algorithmically.
2018
- (Wikipedia, 2018) ⇒ https://en.wikipedia.org/wiki/Analysis_of_algorithms#Run-time_analysis Retrieved:2018-4-1.
- Run-time analysis is a theoretical classification that estimates and anticipates the increase in running time (or run-time) of an algorithm as its input size (usually denoted as n) increases. Run-time efficiency is a topic of great interest in computer science: A program can take seconds, hours, or even years to finish executing, depending on which algorithm it implements. While software profiling techniques can be used to measure an algorithm's run-time in practice, they cannot provide timing data for all infinitely many possible inputs; the latter can only be achieved by the theoretical methods of run-time analysis.
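As a complement to the passage above, the sketch below (an illustrative example, not part of the cited article) times a deliberately quadratic algorithm at doubling input sizes and reports the ratio of consecutive timings, which is the simplest empirical way to estimate a growth rate before deriving it theoretically; the function name and chosen sizes are assumptions.

```python
import timeit

def bubble_sort(xs):
    """Quadratic-time reference algorithm for the growth-rate experiment."""
    xs = list(xs)
    for i in range(len(xs)):
        for j in range(len(xs) - 1 - i):
            if xs[j] > xs[j + 1]:
                xs[j], xs[j + 1] = xs[j + 1], xs[j]
    return xs

# Time the algorithm at doubling input sizes; for a Theta(n^2) algorithm the
# measured ratio between consecutive sizes should approach 4.
sizes = [250, 500, 1000, 2000]
times = [timeit.timeit(lambda n=n: bubble_sort(range(n, 0, -1)), number=3) for n in sizes]
for (n1, t1), (n2, t2) in zip(zip(sizes, times), zip(sizes[1:], times[1:])):
    print(f"n={n1}->{n2}: time ratio = {t2 / t1:.2f} (4.0 expected for Theta(n^2))")
```

As the quoted passage notes, such profiling only samples a few inputs; the ratio test suggests a growth rate, while run-time analysis proves it for all inputs.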
2008
- (Manning et al., 2008) ⇒ Christopher D. Manning, Prabhakar Raghavan, Hinrich Schütze. (2008). “Introduction to Information Retrieval." Cambridge University Press. ISBN:0521865719.
- QUOTE: Table 14.3 gives the time complexity of kNN. kNN has properties that are quite different from most other classification algorithms. Training a kNN classifier simply consists of determining $k$ and preprocessing documents. In fact, if we preselect a value for $k$ and do not preprocess, then kNN requires no training at all. In practice, we have to perform preprocessing steps like tokenization. It makes more sense to preprocess training documents once as part of the training phase rather than repeatedly every time we classify a new test document.
Test time is $\Theta(\vert\mathbb{D}\vert M_{ave} M_{a})$ for kNN. It is linear in the size of the training set as we need to compute the distance of each training document from the test document. Test time is independent of the number of classes $J$. kNN therefore has a potential advantage for problems with large $J$.
| kNN variant | training | testing |
| --- | --- | --- |
| with preprocessing of training set | $\Theta(\vert\mathbb{D}\vert L_{ave})$ | $\Theta(L_{a} + \vert\mathbb{D}\vert M_{ave} M_{a}) = \Theta(\vert\mathbb{D}\vert M_{ave} M_{a})$ |
| without preprocessing of training set | $\Theta(1)$ | $\Theta(L_{a} + \vert\mathbb{D}\vert L_{ave} M_{a}) = \Theta(\vert\mathbb{D}\vert L_{ave} M_{a})$ |
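A minimal sketch of why kNN test time is linear in $\vert\mathbb{D}\vert$: the brute-force scorer below touches every training document once per query. It uses dense toy vectors rather than the book's document representation, and all names are illustrative assumptions.

```python
import random

def knn_predict(train_docs, test_doc, k):
    """Brute-force kNN scoring: one distance per training document, so the
    work grows linearly with |D| (the Theta(|D| * M_ave * M_a) term above,
    with the per-document cost hidden inside each distance computation)."""
    distances = []
    for label, vec in train_docs:                      # |D| iterations
        dist = sum((v - t) ** 2 for v, t in zip(vec, test_doc))
        distances.append((dist, label))
    distances.sort()                                   # Theta(|D| log |D|), usually dominated
    top = [label for _, label in distances[:k]]
    return max(set(top), key=top.count)                # majority vote

# Illustrative usage with random 10-dimensional "documents".
D = [(random.choice("AB"), [random.random() for _ in range(10)]) for _ in range(1000)]
query = [random.random() for _ in range(10)]
print(knn_predict(D, query, k=3))
```

Note that the number of classes never appears in the loop, matching the quoted observation that test time is independent of $J$.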
2006
- (Chu et al., 2006) ⇒ Cheng-Tao Chu, Sang Kyun Kim, Yi-An Lin, YuanYuan Yu, Gary Bradski, Andrew Y. Ng, and Kunle Olukotun. (2006). “Map-Reduce for Machine Learning on Multicore.” In: Proceedings of the 19th International Conference on Neural Information Processing Systems.
- QUOTE: ... Support Vector Machine (SVM) Linear SVM’s (Vapnik, 1981, Platt, 1999) primary goal is to optimize the following primal problem ...
Table 1 shows the theoretical complexity analysis for the ten algorithms we implemented on top of our framework (...)
| | single | multi |
| --- | --- | --- |
| LWLR | [math]\displaystyle{ O(mn^2 + n^3) }[/math] | [math]\displaystyle{ O\left(\dfrac{mn^2}{P} + \dfrac{n^3}{P'} + n^2 \log(P)\right) }[/math] |
| LR | [math]\displaystyle{ O(mn^2 + n^3) }[/math] | [math]\displaystyle{ O\left(\dfrac{mn^2}{P} + \dfrac{n^3}{P'} + n^2 \log(P)\right) }[/math] |
| NB | [math]\displaystyle{ O(mn + nc) }[/math] | [math]\displaystyle{ O\left(\dfrac{mn}{P} + nc \log(P)\right) }[/math] |
| NN | [math]\displaystyle{ O(mn + nc) }[/math] | [math]\displaystyle{ O\left(\dfrac{mn}{P} + nc \log(P)\right) }[/math] |
| GDA | [math]\displaystyle{ O(mn^2 + n^3) }[/math] | [math]\displaystyle{ O\left(\dfrac{mn^2}{P} + \dfrac{n^3}{P'} + n^2 \log(P)\right) }[/math] |
| PCA | [math]\displaystyle{ O(mn^2 + n^3) }[/math] | [math]\displaystyle{ O\left(\dfrac{mn^2}{P} + \dfrac{n^3}{P'} + n^2 \log(P)\right) }[/math] |
| ICA | [math]\displaystyle{ O(mn^2 + n^3) }[/math] | [math]\displaystyle{ O\left(\dfrac{mn^2}{P} + \dfrac{n^3}{P'} + n^2 \log(P)\right) }[/math] |
| k-means | [math]\displaystyle{ O(mnc) }[/math] | [math]\displaystyle{ O\left(\dfrac{mnc}{P} + mn\log(P)\right) }[/math] |
| EM | [math]\displaystyle{ O(mn^2 + n^3) }[/math] | [math]\displaystyle{ O\left(\dfrac{mn^2}{P} + \dfrac{n^3}{P'} + n^2 \log(P)\right) }[/math] |
| SVM | [math]\displaystyle{ O(m^2n) }[/math] | [math]\displaystyle{ O\left(\dfrac{mn^2}{P} + n^2 \log(P)\right) }[/math] |
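The sketch below (an illustrative assumption, not code from Chu et al.) shows the map/reduce decomposition that produces the [math]\displaystyle{ \dfrac{mnc}{P} }[/math]-style terms in the multicore column: each of [math]\displaystyle{ P }[/math] data chunks computes partial k-means sufficient statistics (the map step), and a reduce step aggregates them, which is where the [math]\displaystyle{ \log(P) }[/math] communication term arises; all function names are hypothetical, and the chunks are processed sequentially here purely for illustration.

```python
import random

def map_partial_stats(chunk, centroids):
    """Map step: one core's pass over m/P examples; cost ~ (m/P)*n*c distance work."""
    c, n = len(centroids), len(chunk[0])
    sums = [[0.0] * n for _ in range(c)]
    counts = [0] * c
    for x in chunk:
        k = min(range(c), key=lambda j: sum((xi - ci) ** 2 for xi, ci in zip(x, centroids[j])))
        counts[k] += 1
        sums[k] = [s + xi for s, xi in zip(sums[k], x)]
    return sums, counts

def reduce_centroids(partials):
    """Reduce step: aggregate the P partial statistics (the log(P) communication term)."""
    c, n = len(partials[0][0]), len(partials[0][0][0])
    sums = [[0.0] * n for _ in range(c)]
    counts = [0] * c
    for psums, pcounts in partials:
        counts = [a + b for a, b in zip(counts, pcounts)]
        sums = [[a + b for a, b in zip(srow, prow)] for srow, prow in zip(sums, psums)]
    return [[s / max(cnt, 1) for s in row] for row, cnt in zip(sums, counts)]

# Illustrative run: m=400 examples, n=5 features, c=3 centroids, P=4 chunks.
data = [[random.random() for _ in range(5)] for _ in range(400)]
centroids = random.sample(data, 3)
chunks = [data[i::4] for i in range(4)]               # one chunk per (simulated) core
partials = [map_partial_stats(ch, centroids) for ch in chunks]
print(reduce_centroids(partials))
```

Each map call is independent, so running the four calls on four cores divides the dominant [math]\displaystyle{ O(mnc) }[/math] assignment cost by [math]\displaystyle{ P }[/math], leaving only the aggregation as sequential work.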