Twyman's Law
A Twyman's Law is a statistical adage which states that unusually interesting or extreme data are more likely to result from measurement or analysis errors than from genuine, extraordinary phenomena.
- Context:
- It can (typically) warn analysts to be cautious when interpreting data points that deviate significantly from the norm.
- It can (often) serve as a reminder in Data Analysis to verify and re-verify unusual data before drawing conclusions (a minimal check of this kind is sketched after this list).
- It can range from being applied in simple Data Visualization to being applied in complex Statistical Analysis.
- It can influence decisions in areas like User (Computing) behavior analysis and Test Score assessments.
- It can help identify potential Software Bugs when unusual data patterns are detected in Log Files.
- ...
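The verification step described above can be made concrete. Below is a minimal sketch, in Python, of a Twyman's-Law-style guard that treats an extreme new metric value as a "verify first" event rather than a finding; the function name `twyman_flag`, the metric values, and the threshold `k` are illustrative assumptions, not part of any standard implementation.

```python
from statistics import median

def twyman_flag(history, latest, k=5.0):
    """Flag a new metric value for manual verification if it deviates
    wildly from recent history (Twyman's Law: suspect an error first).

    history: list of recent daily values for the metric
    latest:  today's value
    k:       deviation multiplier (illustrative default)
    """
    med = median(history)
    # Median absolute deviation: robust to the very outliers we suspect.
    mad = median(abs(x - med) for x in history) or 1e-9
    score = abs(latest - med) / mad
    if score > k:
        return ("VERIFY FIRST: value %.0f deviates %.1fx MAD from the "
                "median %.0f; check logging, ETL, and units before "
                "treating it as a real change." % (latest, score, med))
    return "Within normal range; no Twyman flag."

# Example: daily active users stable around 10k, then apparently "double".
dau = [10_120, 9_980, 10_240, 10_050, 9_910, 10_180, 10_060]
print(twyman_flag(dau, 20_300))  # flags the suspicious doubling
```

The point of such a guard is procedural rather than statistical: the flag triggers an investigation of the measurement pipeline, not a conclusion about the underlying process.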
- Example(s):
- an Online Traffic Anomaly Detection case in which a sudden spike in website traffic is traced to a Software Bug rather than a genuine increase in user interest.
- a Market Research case where unusually high sales data were eventually traced back to a data entry error.
- ...
- Counter-Example(s):
- Black Swan Events, which are genuine, rare events that significantly impact data but are not errors.
- Predictive Anomalies in Machine Learning, where models correctly predict rare but genuine outcomes.
- ...
- See: Fraud, Tony Twyman, Data Analysis, Measurement, Quantity, User (Computing), Software Bug, Log File, Test Score.
References
2024
- (Wikipedia, 2024) ⇒ https://en.wikipedia.org/wiki/Twyman's_law Retrieved:2024-4-9.
- Twyman's law states that "Any figure that looks interesting or different is usually wrong", following the principle that "the more unusual or interesting the data, the more likely they are to have been the result of an error of one kind or another". It is named after the media and market researcher Tony Twyman and has been described as one of the most important laws of data analysis. The law is based on the fact that errors in data measurement and analysis can lead to observed quantities that are wildly different from typical values. These errors are usually more common than real changes of similar magnitude in the underlying process being measured. For example, if an analyst at a software company notices that the number of users has doubled overnight, the most likely explanation is a bug in logging, rather than a true increase in users.[1] The law can also be extended to situations where the underlying data is influenced by unexpected factors that differ from what was intended to be measured. For example, when schools show unusually large improvements in test scores, subsequent investigation often reveals that those scores were driven by fraud.
- ↑ Kohavi, Ron; Tang, Diane; Xu, Ya (2020). Trustworthy Online Controlled Experiments: A Practical Guide to A/B Testing. Cambridge University Press.