Synthetic Data Generation Task

AKA: Synthetic Random Data Generation.
Context:
- It can be solved by a Data Generation System (that implements a data generation algorithm).
- It can require the simulation of a Data Generation Process.
- It can range from being a Numerical Data Generation Task to being a Categorical Data Generation Task to being a Hybrid Data Generation Task.
- It can address issues such as data privacy and data scarcity by providing alternative datasets for analysis and model training.
- It can be used in various domains including computer vision, natural language processing, healthcare, and business.
- …
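
The range from numerical to categorical to hybrid generation described above can be sketched in Python. This is a minimal illustration using only the standard library, not a reference implementation: the function names, the column names (`measurement`, `level`), the category labels, and all distribution parameters are hypothetical choices made for the example.

```python
import random


def generate_numerical(rng, n, mean=0.0, stddev=1.0):
    """Numerical data generation: draw n values from a normal distribution."""
    return [rng.gauss(mean, stddev) for _ in range(n)]


def generate_categorical(rng, n, categories, weights=None):
    """Categorical data generation: draw n labels from a fixed category set."""
    return rng.choices(categories, weights=weights, k=n)


def generate_hybrid(rng, n):
    """Hybrid data generation: combine a numerical and a categorical column."""
    values = generate_numerical(rng, n, mean=50.0, stddev=10.0)  # assumed parameters
    labels = generate_categorical(
        rng, n, ["low", "medium", "high"], weights=[0.5, 0.3, 0.2]  # assumed labels
    )
    return [{"measurement": v, "level": lab} for v, lab in zip(values, labels)]


# A seeded generator makes the synthetic dataset reproducible.
rng = random.Random(42)
records = generate_hybrid(rng, 5)
```

Passing an explicit `random.Random` instance, rather than relying on the module-level generator, keeps each dataset reproducible from its seed.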
Example(s):
- a Random Number Generation Task.
- creating a 5×5 Identity Matrix Data Structure.
- a Patient Record Generation Task for a medical study.
- …
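
Two of the examples above, sketched in Python with the standard library only. The identity matrix is read here as a 5×5 matrix, which is a deterministic generation task; the patient records are random. The field names (`patient_id`, `age`, `sex`, `systolic_bp`), value ranges, and distribution parameters are hypothetical and carry no clinical meaning.

```python
import random


def identity_matrix(n):
    """Return the n-by-n identity matrix as a list of lists."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def generate_patient_records(n, seed=0):
    """Generate n synthetic patient records (illustrative fields only)."""
    rng = random.Random(seed)  # seeded for reproducibility
    records = []
    for patient_id in range(1, n + 1):
        records.append({
            "patient_id": patient_id,
            "age": rng.randint(18, 90),                       # assumed age range
            "sex": rng.choice(["F", "M"]),
            "systolic_bp": round(rng.gauss(120.0, 15.0), 1),  # assumed mean and spread
        })
    return records


matrix = identity_matrix(5)
patients = generate_patient_records(3, seed=7)
```

Because no real patient contributes a row, such records can stand in for scarce or privacy-sensitive data, although values drawn independently per field do not reproduce the correlations found in real records.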
Counter-Example(s):
- a Data Processing Task.
See: Data Masking, Simulation.

References

(Gentle, 2009) ⇒ James E. Gentle. (2009). “Computational Statistics.” Springer. ISBN:978-0-387-98143-7
- QUOTE: Many exercises require the student to generate artificial data. While such datasets may lack any apparent intrinsic interest, I believe that they are often the best for learning how a statistical method works. One of my firm beliefs is
  If I understand something, I can simulate it.

(Melli, 1999) ⇒ Gabor Melli. (1999). “The datgen Dataset Generator.” Version 3.1. http://www.datasetgenerator.com