Missing-not-at-Random (MNAR) Dataset

From GM-RKB

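A Missing-not-at-Random (MNAR) Dataset is a missing data dataset that is neither an MAR Dataset nor an MCAR Dataset, i.e. one in which the probability that a value is missing depends on the unobserved values themselves.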

- …
Counter-Example(s):
- Missing at Random Data (MAR).
- Missing Completely at Random Data (MCAR).
See: Censored Data.

References

2010a

http://missingdata.lshtm.ac.uk/index.php?option=com_content&view=article&id=77:missing-not-at-random-mnar&catid=40:missingness-mechanisms&Itemid=96
- QUOTE: When neither MCAR nor MAR hold, we say the data are Missing Not At Random, abbreviated MNAR. In the likelihood setting (see end of previous section) the missingness mechanism is termed non-ignorable. What this means is
  1. Even accounting for all the available observed information, the reason for observations being missing still depends on the unseen observations themselves.
  2. To obtain valid inference, a joint model of both Y and R is required (that is a joint model of the data and the missingness mechanism).
- Unfortunately
  1. We cannot tell from the data at hand whether the missing observations are MCAR, NMAR or MAR (although we can distinguish between MCAR and MAR).
  2. In the MNAR setting it is very rare to know the appropriate model for the missingness mechanism.
- Hence the central role of sensitivity analysis; we must explore how our inferences vary under assumptions of MAR, MNAR, and under various models. Unfortunately, this is often easier said than done, especially under the time and budgetary constraints of many applied projects.
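
The sensitivity analysis mentioned in the quote above can be illustrated with a small simulation. The following is a minimal sketch, not a method taken from the quoted source: it assumes a single normally distributed variable `y`, a logistic missingness model that depends on `y` itself (so the data are MNAR by construction), and a simple "delta adjustment" in which the missing values are assumed to average the observed mean plus a shift `delta`. The function names, the coefficient 2.0, and the grid of `delta` values are all hypothetical choices made for illustration.

```python
import numpy as np

def simulate_mnar(n=100_000, seed=0):
    """Draw y ~ N(0, 1) and an observation indicator r that depends on y itself."""
    rng = np.random.default_rng(seed)
    y = rng.normal(loc=0.0, scale=1.0, size=n)
    # MNAR by construction: larger values of y are more likely to be missing.
    p_missing = 1.0 / (1.0 + np.exp(-2.0 * y))
    r = rng.random(n) > p_missing  # r is True where y is observed
    return y, r

def delta_adjusted_mean(y, r, delta):
    """Estimate the mean of y assuming the missing values average (observed mean + delta)."""
    observed_mean = y[r].mean()
    fraction_missing = 1.0 - r.mean()
    # (1 - f) * m_obs + f * (m_obs + delta) simplifies to m_obs + f * delta.
    return observed_mean + fraction_missing * delta

y, r = simulate_mnar()
full_mean = y.mean()          # only known here because the data are simulated
observed_mean = y[r].mean()   # what a complete-case analysis would report

# Sensitivity analysis: how the estimate moves as the assumed shift varies.
sensitivity = {d: delta_adjusted_mean(y, r, d) for d in (0.0, 0.5, 1.0, 1.5)}

print(f"full-data mean:     {full_mean:+.3f}")
print(f"complete-case mean: {observed_mean:+.3f}")
for d, estimate in sensitivity.items():
    print(f"delta = {d:.1f} -> estimated mean {estimate:+.3f}")
```

Setting `delta = 0` corresponds to treating the missing values like the observed ones, which is the assumption an MCAR analysis makes. In real data the full-data mean is unknown, so the appropriate `delta` cannot be estimated from the observed values; the analysis only shows how much the conclusion depends on that untestable choice.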

2010b

(Steck, 2010) ⇒ Harald Steck. (2010). “Training and Testing of Recommender Systems on Data Missing Not at Random.” In: Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2010). doi:10.1145/1835804.1835895
- QUOTE: We show that the absence of ratings carries useful information for improving the top-k hit rate concerning all items, a natural accuracy measure for recommendations. As to test recommender systems, we present two performance measures that can be estimated, under mild assumptions, without bias from data even when ratings are missing not at random (MNAR). As to achieve optimal test results, we present appropriate surrogate objective functions for efficient training on MNAR data. ...

2007

(Howell, 2007) ⇒ David C. Howell. (2007). “Treatment of Missing Data.” webpage
- QUOTE: … If data are not missing at random or completely at random then they are classed as Missing Not at Random (MNAR). For example, if we are studying mental health and people who have been diagnosed as depressed are less likely than others to report their mental status, the data are not missing at random.

2002

(Schafer & Graham, 2002) ⇒ Joseph L. Schafer, and John W. Graham. (2002). “Missing Data: Our View of the State of the Art.” In: Psychological Methods, 7(2). doi:10.1037/1082-989X.7.2.147
- QUOTE: … Rubin (1976) defined missing data to be MAR if the distribution of missingness does not depend on [math]\displaystyle{ Y_{mis} }[/math], [math]\displaystyle{ P(R|Y_{com}) = P(R|Y_{obs}) }[/math] (1). … When Equation 1 is violated and the distribution depends on [math]\displaystyle{ Y_{mis} }[/math], the missing data are said to be missing not at random (MNAR). MAR is also called ignorable nonresponse, and MNAR is called nonignorable. ...
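
The distinction between ignorable and nonignorable missingness in the quote above can be made concrete with simulated data. The sketch below is illustrative only and is not taken from Schafer & Graham: it assumes a fully observed covariate `x`, an outcome `y` that may be missing, and three hypothetical logistic missingness models (constant for MCAR, depending on `x` for MAR, depending on `y` for MNAR). It then compares the complete-case mean of `y` with an estimate that fills in missing `y` from a linear regression on `x` fitted to the observed cases.

```python
import numpy as np

def make_data(n=200_000, seed=1):
    """Simulate a fully observed covariate x and an outcome y correlated with it."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    y = 0.8 * x + rng.normal(scale=0.6, size=n)
    return rng, x, y

def observed_mask(rng, x, y, mechanism):
    """Return True where y is observed, under an assumed missingness mechanism."""
    if mechanism == "MCAR":
        p_obs = np.full(x.shape, 0.5)            # depends on nothing
    elif mechanism == "MAR":
        p_obs = 1.0 / (1.0 + np.exp(2.0 * x))    # depends only on observed x
    elif mechanism == "MNAR":
        p_obs = 1.0 / (1.0 + np.exp(2.0 * y))    # depends on y, which may be missing
    else:
        raise ValueError(f"unknown mechanism: {mechanism}")
    return rng.random(x.shape) < p_obs

def regression_mean(x, y, r):
    """Mean of y after filling missing values from a line fitted to observed cases."""
    slope, intercept = np.polyfit(x[r], y[r], deg=1)
    y_filled = np.where(r, y, intercept + slope * x)
    return y_filled.mean()

rng, x, y = make_data()
results = {}
for mechanism in ("MCAR", "MAR", "MNAR"):
    r = observed_mask(rng, x, y, mechanism)
    results[mechanism] = {
        # Bias relative to the full-data mean, known only because this is a simulation.
        "complete_case": y[r].mean() - y.mean(),
        "regression": regression_mean(x, y, r) - y.mean(),
    }

for mechanism, bias in results.items():
    print(f"{mechanism}: complete-case bias {bias['complete_case']:+.3f}, "
          f"regression bias {bias['regression']:+.3f}")
```

Under these assumptions, conditioning on `x` removes the bias in the MAR case, because the missingness carries no information about `y` beyond what `x` provides. In the MNAR case the regression estimate stays biased, since no function of the observed data captures the dependence of missingness on the unobserved `y`.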

1976

(Rubin, 1976) ⇒ D. B. Rubin. (1976). “Inference and Missing Data.” In: Biometrika, 63.
