Synthetic Dataset Generation System
A Synthetic Dataset Generation System is an information generation system that implements a synthetic data generation algorithm to solve a synthetic data generation task.
- AKA: Artificial Data Set Generation System.
- Context:
- It can include features for customizing the generated data according to specific requirements, such as the number of samples, feature types, and distributions.
- It can integrate with other tools and platforms for data preprocessing, analysis, and visualization.
- ...
- Example(s):
- Counter-Example(s):
- See: Data Masking System, Pseudo-Random Number Generator.
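The customization described under Context (number of samples, feature types, and distributions) can be illustrated with a minimal sketch. The function name `generate_dataset`, the `feature_specs` format, and the three supported distribution kinds are assumptions made for this illustration only; they are not taken from datgen or any other specific system.

```python
import random


def generate_dataset(n_samples, feature_specs, seed=None):
    """Return n_samples rows, each a dict mapping feature name to a sampled value.

    feature_specs (hypothetical format) maps a feature name to a tuple whose
    first element names the distribution and whose remaining elements are
    that distribution's parameters.
    """
    rng = random.Random(seed)  # private generator, so a seed gives repeatable output
    samplers = {
        "normal": lambda mean, std: rng.gauss(mean, std),
        "uniform": lambda low, high: rng.uniform(low, high),
        "categorical": lambda choices: rng.choice(choices),
    }
    rows = []
    for _ in range(n_samples):
        row = {}
        for name, (kind, *params) in feature_specs.items():
            row[name] = samplers[kind](*params)
        rows.append(row)
    return rows


# Illustrative specification: two numeric features and one categorical feature.
specs = {
    "age": ("uniform", 18, 90),
    "income": ("normal", 50000.0, 12000.0),
    "segment": ("categorical", ["a", "b", "c"]),
}
data = generate_dataset(5, specs, seed=42)
```

Passing a fixed `seed` makes the generated rows reproducible, which is useful when the synthetic dataset feeds a test or a benchmark. The rows are plain dictionaries, so they can be handed to other preprocessing, analysis, or visualization tools without conversion.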
References
2023
- (Lu, Shen et al., 2023) ⇒ Yingzhou Lu, Minjie Shen, Huazheng Wang, Xiao Wang, Capucine van Rechem, and Wenqi Wei. (2023). “Machine Learning for Synthetic Data Generation: A Review.” In: arXiv preprint arXiv:2302.04062. doi:10.48550/arXiv.2302.04062
- NOTE:
- The paper showcases the effectiveness of synthetic data in improving machine learning models' performance by providing additional training data, thus mitigating issues related to data scarcity and enhancing the robustness of models.
1999
- (Melli, 1999) ⇒ Gabor Melli. (1999). “The datgen Dataset Generator.” Version 3.1. http://www.datasetgenerator.com