Test Time Complexity
A Test Time Complexity is a Computational Complexity Time that corresponds to the time it takes a machine learning algorithm to make predictions by applying a learned model.
- AKA: Test-Time Computational Cost, Test Time.
- Example(s):
- kNN Test Time Complexity: $\Theta(\vert\mathbb{D}\vert M_{ave} M_{a})$,
- …
- Counter-Example(s):
- Training Time Complexity.
- See: Machine Learning System, Machine Learning Task, Test Dataset, Training Dataset, Evaluation Dataset, Computational Complexity, Analysis of Algorithms.
References
2019
- (Choi et al., 2019) ⇒ Seok-Hwan Choi, Jin-Myeong Shin, and Yoon-Ho Choi (2019). "Dynamic Nonparametric Random Forest Using Covariance". In: Security and Communication Networks (2019). DOI:10.1155/2019/3984031.
- QUOTE: To evaluate the test time of the proposed C-DRF algorithm, we measured the time spending from loading of the forest into the memory to the output file creation. In Figure 6, we show the test time of the proposed C-DRF algorithm, the RF algorithm, and P. Latinne et al.’s algorithm under different datasets. For the UNSW-NB15 dataset, the test time of the proposed C-DRF algorithm was 4.45 seconds, which is faster than the other two algorithms by as much as 38.85% and 10.28%, respectively. For the real estate dataset, the proposed C-DRF algorithm showed the test time of 1.2 seconds, which is faster than 17.66 seconds of the RF algorithm by 56.98%.
2017
- (Sammut & Webb, 2017) ⇒ Claude Sammut, and Geoffrey I. Webb. (2017). "Test Time". In: (Sammut & Webb, 2017). DOI:10.1007/978-1-4899-7687-1_821.
- QUOTE: A learning algorithm is typically applied at two distinct times. Test time refers to the time when an algorithm is applying a learned model to make predictions. Training time refers to the time when an algorithm is learning a model from training data. Lazy learning usually blurs the distinction between these two times, deferring most learning until test time.
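The distinction between training time and test time can be made concrete with a small timing experiment. The following Python sketch is an illustration only (it is not from the cited entry; the dataset sizes and the use of scikit-learn's KNeighborsClassifier are assumptions). It shows that for a lazy learner such as kNN, fitting is cheap while most computation is deferred to prediction (test) time.

```python
# Minimal sketch (illustrative assumption, not from Sammut & Webb, 2017):
# compare training time vs. test time for a lazy learner (kNN).
import time
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X_train = rng.normal(size=(10_000, 50))
y_train = rng.integers(0, 2, 10_000)
X_test = rng.normal(size=(1_000, 50))

model = KNeighborsClassifier(n_neighbors=5, algorithm="brute")

t0 = time.perf_counter()
model.fit(X_train, y_train)            # training time: essentially storing the data
train_time = time.perf_counter() - t0

t0 = time.perf_counter()
model.predict(X_test)                  # test time: distances to every training example
test_time = time.perf_counter() - t0

print(f"training time: {train_time:.3f}s, test time: {test_time:.3f}s")
```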
2012
- (Xu et al., 2012) ⇒ Zhixiang Eddie Xu, Kilian Q. Weinberger, and Olivier Chapelle (2012). "The Greedy Miser: Learning Under Test-time Budgets". In: Proceedings of the 29th International Conference on Machine Learning (ICML 2012).
- QUOTE: Test-time computational cost. There are two factors that contribute to this cost: The function evaluation cost of all trees $h_t$ with $\beta_t > 0$ and the feature extraction cost for all features that are used in these trees. Let $e > 0$ be the cost to evaluate one tree $h_t$ if all features were previously extracted. With this notation, both costs can be expressed in a single function as
:: $c(\beta) = e\parallel \beta \parallel_0 + \displaystyle\sum_{\alpha=1}^d c_{\alpha} \parallel\displaystyle\sum_{t=1}^T F_{\alpha t} \beta_t \parallel_0, \quad\quad (3)$
:: where the $\ell_0$-norm for scalars is defined as $\parallel a\parallel_0 \to \{0, 1\}$ with $\parallel a \parallel_0 = 1$ if and only if $a \neq 0$. The first term captures the function-evaluation costs and the second term captures the feature costs of all used features.
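As a concrete reading of Equation (3), the sketch below is an illustration only, not code from Xu et al. (2012); the function name test_time_cost, the toy arrays, and the assumption of binary feature-usage indicators $F_{\alpha t}$ with nonnegative $\beta_t$ are mine. It evaluates the test-time cost of a small tree ensemble.

```python
# Illustrative sketch of Eq. (3): c(beta) = e*||beta||_0 + sum_a c_a * ||sum_t F[a,t]*beta[t]||_0
# Assumes binary F (F[a, t] = 1 if tree t uses feature a) and nonnegative beta, so the inner
# l0 term is 1 exactly when feature a is used by at least one active tree.
import numpy as np

def test_time_cost(beta, F, feature_costs, e):
    """beta: (T,) tree weights; F: (d, T) feature-usage indicators;
    feature_costs: (d,) extraction cost per feature; e: cost of evaluating one tree."""
    tree_cost = e * np.count_nonzero(beta)          # e * ||beta||_0
    used = (F @ (beta != 0)) > 0                    # features used by any active tree
    feature_cost = feature_costs[used].sum()        # sum of c_alpha over used features
    return tree_cost + feature_cost

# toy example: 3 trees, 4 features
beta = np.array([0.7, 0.0, 0.2])
F = np.array([[1, 0, 0],
              [1, 1, 0],
              [0, 0, 1],
              [0, 1, 0]])
print(test_time_cost(beta, F, feature_costs=np.array([2.0, 1.0, 5.0, 3.0]), e=0.1))  # 8.2
```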
2008
- (Manning et al., 2008) ⇒ Christopher D. Manning, Prabhakar Raghavan, and Hinrich Schütze. (2008). "Introduction to Information Retrieval". Cambridge University Press. ISBN:0521865719.
- QUOTE: Table 14.3 gives the time complexity of kNN. kNN has properties that are quite different from most other classification algorithms. Training a kNN classifier simply consists of determining $k$ and preprocessing documents. In fact, if we preselect a value for $k$ and do not preprocess, then kNN requires no training at all. In practice, we have to perform preprocessing steps like tokenization. It makes more sense to preprocess training documents once as part of the training phase rather than repeatedly every time we classify a new test document.
Test time is $\Theta(\vert \mathbb{D} \vert M_{ave} M_{a})$ for kNN. It is linear in the size of the training set as we need to compute the distance of each training document from the test document. Test time is independent of the number of classes $J$. kNN therefore has a potential advantage for problems with large $J$.
|  | Time Complexity |
| --- | --- |
| **kNN with preprocessing of training set** | |
| training | $\Theta(\vert\mathbb{D}\vert L_{ave})$ |
| testing | $\Theta( L_{a} + \vert \mathbb{D} \vert M_{ave} M_{a}) = \Theta(\vert\mathbb{D}\vert M_{ave} M_{a})$ |
| **kNN without preprocessing of training set** | |
| training | $\Theta(1)$ |
| testing | $\Theta( L_{a} + \vert \mathbb{D} \vert L_{ave} M_{a}) = \Theta(\vert\mathbb{D}\vert L_{ave} M_{a})$ |
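To illustrate why kNN test time is $\Theta(\vert\mathbb{D}\vert M_{ave} M_{a})$, linear in the size of the training set and independent of the number of classes $J$, the following brute-force prediction sketch may help (it is my own illustration, not from Manning et al., 2008; the function name knn_predict and the toy data are assumptions): classifying one test document requires one similarity computation against every training document.

```python
# Brute-force kNN prediction for a single test document (illustrative sketch):
# one pass over all |D| training vectors, so test time grows linearly with |D|.
import numpy as np

def knn_predict(test_vec, train_matrix, train_labels, k=5):
    """train_matrix: (|D|, V) document vectors; test_vec: (V,) test document vector."""
    sims = train_matrix @ test_vec                    # dot-product similarity to each training doc
    top_k = np.argsort(sims)[-k:]                     # indices of the k most similar documents
    labels, counts = np.unique(train_labels[top_k], return_counts=True)
    return labels[np.argmax(counts)]                  # majority vote among the k neighbors

# toy usage
rng = np.random.default_rng(0)
D, V = 1_000, 200
train = rng.random((D, V))
labels = rng.integers(0, 3, D)
print(knn_predict(rng.random(V), train, labels, k=3))
```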