2020 SensorDataQualityASystematicRev
Latest revision as of 07:25, 22 August 2024
- (Teh et al., 2020) ⇒ Hui Yie Teh, Andreas W Kempa-Liehr, and Kevin I-Kai Wang. (2020). “Sensor Data Quality: A Systematic Review.” In: Journal of Big Data, 7(1).
Subject Headings: Sensor Data, Sensor Data Error.
Notes
- Sensor data quality is crucial for the effectiveness of IoT applications, as poor data quality can render the systems useless.
- Sensor data errors such as missing data, outliers, bias, and drift are commonly addressed by researchers to improve data quality.
- The most common solutions for sensor data error detection are based on PCA and ANN, which together account for about 40% of the error detection papers found in the study.
- For fault correction in sensor data, PCA, ANN, and Bayesian networks are among the most widely used techniques.
- Missing values in sensor data are mostly imputed using association rule mining.
- Through this systematic review, it was found that methods proposed to solve sensor data errors cannot be directly compared, due to non-uniform evaluation processes and the high use of non-publicly available datasets.
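As an illustration of the PCA-based error detection mentioned in the notes above, the following is a minimal sketch and not a method taken from the reviewed paper. It assumes three redundant sensors observing the same quantity, fits PCA on fault-free readings, and flags a reading when its squared reconstruction error exceeds a quantile threshold learned from the training data. The function names (`fit_pca`, `residual_scores`, `flag_faults`), the simulated data, and the 0.99 quantile are all hypothetical choices made for this example.

```python
import numpy as np


def fit_pca(train, n_components):
    """Fit a PCA model (mean, scale, principal directions) on fault-free readings."""
    mean = train.mean(axis=0)
    std = train.std(axis=0)
    std = np.where(std == 0, 1.0, std)  # avoid division by zero for constant sensors
    # Rows of vt are the principal directions of the standardized data.
    _, _, vt = np.linalg.svd((train - mean) / std, full_matrices=False)
    return mean, std, vt[:n_components]


def residual_scores(model, readings):
    """Squared reconstruction error of each row after projecting onto the PCA subspace."""
    mean, std, components = model
    z = (readings - mean) / std
    reconstructed = z @ components.T @ components
    return np.sum((z - reconstructed) ** 2, axis=1)


def flag_faults(train, test, n_components=1, quantile=0.99):
    """Flag test rows whose residual exceeds the chosen quantile of training residuals."""
    model = fit_pca(train, n_components)
    threshold = np.quantile(residual_scores(model, train), quantile)
    return residual_scores(model, test) > threshold


# Hypothetical data: three sensors measuring the same temperature with small noise.
rng = np.random.default_rng(0)


def simulate(n):
    temperature = rng.normal(20.0, 2.0, size=n)
    return temperature[:, None] + rng.normal(0.0, 0.1, size=(n, 3))


train = simulate(500)
test = simulate(51)
test[-1, 2] += 5.0  # inject an offset fault into the third sensor of the last reading

flags = flag_faults(train, test)
print("flagged rows:", np.flatnonzero(flags))
```

The design relies on the sensors being strongly correlated: a fault in one sensor breaks that correlation and shows up as a large residual outside the principal subspace. A fault that shifts all sensors equally would not be caught by this sketch.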
Cited By
Quotes
Abstract
Sensor data quality plays a vital role in IoT applications as they are rendered useless if the data quality is bad. This systematic review aims to provide an introduction and guide for researchers who are interested in quality-related issues of physical sensor data. The process and results of the systematic review are presented which aim to answer the following research questions: what are the different types of physical sensor data errors, how to quantify or detect those errors, how to correct them and what domains are the solutions in. Out of 6970 literatures obtained from three databases (ACM Digital Library, IEEE Xplore and ScienceDirect) using the search string refined via topic modelling, 57 publications were selected and examined. Results show that the different types of sensor data errors addressed by those papers are mostly missing data and faults e.g. outliers, bias and drift. The most common solutions for error detection are based on PCA and ANN which accounts for about 40% of all error detection papers found in the study. Similarly, for fault correction, PCA and ANN are among the most common, along with Bayesian Networks. Missing values on the other hand, are mostly imputed using association rule mining. Other techniques include hybrid solutions that combine several data science methods to detect and correct the errors. Through this systematic review, it is found that the methods proposed to solve physical sensor data errors cannot be directly compared due to the non-uniform evaluation process and the high use of non-publicly available datasets. Bayesian data analysis done on the 57 selected publications also suggests that publications using publicly available datasets for method evaluation have higher citation rates.
References
| | Author | volume | Date Value | title | type | journal | titleUrl | doi | note | year |
|---|---|---|---|---|---|---|---|---|---|---|
| 2020 SensorDataQualityASystematicRev | Hui Yie Teh, Andreas W Kempa-Liehr, Kevin I-Kai Wang | | | Sensor Data Quality: A Systematic Review | | | | | | 2020 |