Thurstone's Comparative Judgment Model

A Thurstone's Comparative Judgment Model is a psychometric scaling model that quantifies subjective judgments by comparing pairs of stimuli.

Context:
- It can (typically) be used to measure perceived differences between stimuli by utilizing pairwise comparisons, rather than relying on absolute measurements.
- It can (often) be applied in psychological research to quantify attitudes, preferences, or perceptions, such as comparing statements about social issues or sensory experiences.
- It can range from simple comparisons of physical stimuli, like object weights, to complex comparisons of abstract entities, like attitudes or traits.
- It can be closely related to the theory underlying the Rasch Model and Item Response Theory, which are used in education and psychology to analyse data from questionnaires and tests.
- It can be viewed as a mathematical representation of a "discriminal process," in which entities are compared in pairs with respect to their magnitude on a specific attribute.
- ...
- Methodologically:
  - It can begin with the selection of a set of stimuli that need to be compared along a specific attribute (e.g., preference, intensity, quality).
  - It can involve presenting pairs of these stimuli to participants, asking them to choose which stimulus in each pair better represents the attribute under study.
  - It can accumulate data from these pairwise comparisons, resulting in a matrix of preference counts or proportions for each stimulus pair.
  - It can apply a statistical model (e.g., Thurstone's Law of Comparative Judgment) to these data, converting them into scale values representing the perceived magnitude of the attribute for each stimulus.
  - It can assume that the judgments are influenced by random perceptual errors, which are accounted for in the statistical modeling process.
  - It can be employed in various fields, including psychology, marketing, and sensory analysis, to create scales of preferences, attitudes, or perceived qualities.
  - It can require iterative refinement of stimuli and procedures to ensure the reliability and validity of the resulting scale.
- ...
- Mathematically:
  - It can assume that each stimulus \(i\) has a true score or perceived value \( \mu_i \) on a psychological continuum.
  - It can model the perceived values of stimuli as normally distributed random variables, where:
    [math]\displaystyle{ X_i \sim \mathcal{N}(\mu_i, \sigma_i^2) \quad \text{and} \quad X_j \sim \mathcal{N}(\mu_j, \sigma_j^2) }[/math].
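  - It can evaluate differences between stimuli by calculating \( D_{ij} = X_i - X_j \), where, if \( X_i \) and \( X_j \) are uncorrelated, \( D_{ij} \) follows a normal distribution (with a correlation \( \rho_{ij} \), the variance becomes \( \sigma_i^2 + \sigma_j^2 - 2\rho_{ij}\sigma_i\sigma_j \)):
    [math]\displaystyle{ D_{ij} \sim \mathcal{N}(\mu_i - \mu_j, \sigma_i^2 + \sigma_j^2) }[/math].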
  - It can determine the probability that a person prefers stimulus \(i\) over stimulus \(j\) using the cumulative distribution function \( \Phi \) of the standard normal distribution:
    [math]\displaystyle{ P(i \gt j) = \Phi\left(\frac{\mu_i - \mu_j}{\sqrt{\sigma_i^2 + \sigma_j^2}}\right) }[/math].
  - It can estimate the true scores \( \mu_i \) using maximum likelihood estimation (MLE) based on observed pairwise comparison data.
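  - It can simplify the model by assuming equal variances \( \sigma^2 \) and uncorrelated perceived values across stimuli (Thurstone's Case V), which, with the scale unit chosen so that \( \sqrt{2}\,\sigma = 1 \), results in the simplified equation:
    [math]\displaystyle{ P(i \gt j) = \Phi(\mu_i - \mu_j) }[/math].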
- ...
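The mathematical steps above can be sketched in Python. This is a minimal illustration, not a reference implementation: the function names `preference_probability` and `case_v_scale`, the clipping parameter `eps`, and the win counts are assumptions made for this example. The scale values are estimated with the classical row average of z-scores under the equal-variance simplification, which is a simpler alternative to the maximum likelihood estimation mentioned above.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def preference_probability(mu_i, mu_j, sigma_i=1.0, sigma_j=1.0):
    """P(i > j) when X_i and X_j are independent normal variables."""
    return STD_NORMAL.cdf((mu_i - mu_j) / sqrt(sigma_i**2 + sigma_j**2))


def case_v_scale(wins, eps=0.01):
    """Estimate equal-variance (Case V) scale values from win counts.

    wins[i][j] is the number of times stimulus i was preferred over j.
    Proportions are clipped to [eps, 1 - eps] (an assumption of this
    sketch) so that the inverse normal CDF stays finite.
    The result is centred at zero, in units of the standard deviation
    of a difference D_ij.
    """
    n = len(wins)
    scale = []
    for i in range(n):
        z_sum = 0.0
        for j in range(n):
            if i == j:
                continue  # z_ii is taken to be 0
            total = wins[i][j] + wins[j][i]
            if total == 0:
                raise ValueError(f"pair ({i}, {j}) was never compared")
            p = wins[i][j] / total
            p = min(max(p, eps), 1.0 - eps)
            z_sum += STD_NORMAL.inv_cdf(p)  # z_ij = Phi^{-1}(p_ij)
        scale.append(z_sum / n)  # row average of z-scores
    return scale


# Hypothetical win counts for three stimuli (illustrative data only).
wins = [
    [0, 30, 40],
    [20, 0, 35],
    [10, 15, 0],
]
scale = case_v_scale(wins)
print(scale)
```

Because only differences between scale values enter the model, the origin is arbitrary; this sketch fixes it by centring the estimates at zero.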
Example(s):
- In consumer research, comparing pairs of products (e.g., two types of smartphones) to determine which one is perceived as higher quality by participants, leading to a ranked order of preferences.
- In sensory evaluation, presenting pairs of food samples to judges to compare their sweetness, with outputs being a scale of sweetness intensity.
- In educational testing, comparing pairs of student essay responses to judge which one demonstrates higher quality, ultimately producing a scale of writing proficiency.
- In social psychology, comparing pairs of statements about social issues (e.g., climate change) to gauge the extremity of attitudes, resulting in a scale of attitude intensity.
- In human resource management, comparing pairs of job applicants' profiles to determine which is more suitable for a specific role, creating a ranked list of candidates.
- In game theory, comparing strategies in a formal game setup to evaluate which one is perceived as more effective, aiding in the optimization of decision-making processes.
- In computer science, applying the model to compare algorithm performance by evaluating pairs of algorithms against specific tasks, producing a ranking of effectiveness or efficiency.
- In machine learning, comparing the outputs of different models to determine which one provides better predictions for a given dataset, ultimately leading to model selection.
- In AI systems, using the model to compare user feedback on different AI-generated content, such as recommendations or text, to refine and improve AI behavior.
- In robotics, comparing the performance of different control algorithms in a simulation environment to determine which one is more effective in completing a given task.
- ...
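As a rough illustration of how data for an example such as the sweetness comparison might arise under the model's assumptions, the following Python sketch simulates judges whose perceptions are normally distributed around each sample's value. The function name `simulate_win_counts`, the "true" sweetness values, the number of judgments, and the noise level are all hypothetical choices made for this example.

```python
import random
from itertools import combinations


def simulate_win_counts(true_values, n_judgments=500, sigma=1.0, seed=0):
    """Simulate pairwise judgments with normally distributed perceptual noise.

    Returns a matrix wins where wins[i][j] counts how often stimulus i
    was judged to exceed stimulus j.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    n = len(true_values)
    wins = [[0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        for _ in range(n_judgments):
            # Each judgment draws a fresh noisy perception of both stimuli.
            x_i = rng.gauss(true_values[i], sigma)
            x_j = rng.gauss(true_values[j], sigma)
            if x_i > x_j:
                wins[i][j] += 1
            else:
                wins[j][i] += 1
    return wins


# Hypothetical "true" sweetness values for three samples (illustrative only).
sweetness = [0.0, 0.5, 1.5]
wins = simulate_win_counts(sweetness)
for row in wins:
    print(row)
```

The resulting matrix of counts has the same shape as the preference data the model takes as input, so it can be used to check whether a scaling procedure recovers the ordering of the assumed values.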
Counter-Example(s):
- A Likert Scale approach, which involves rating items on a scale rather than making pairwise comparisons.
- A Guttman Scale, which assumes a cumulative ordering of items.
- Direct Estimation Methods, where participants assign absolute values (e.g., rating the sweetness of a food sample on a 1-10 scale).
See: Rasch Model, L. L. Thurstone, Pairwise Comparison (Psychology), Stimulus (Physiology), Psychometrics, Psychophysics, Attitude (Psychology), Item Response Theory, Bradley-Terry Model.

References

2024

(Wikipedia, 2024) ⇒ https://en.wikipedia.org/wiki/Law_of_comparative_judgment Retrieved:2024-8-14.
- The law of comparative judgment was conceived by L. L. Thurstone. In modern-day terminology, it is more aptly described as a model that is used to obtain measurements from any process of pairwise comparison. Examples of such processes are the comparisons of perceived intensity of physical stimuli, such as the weights of objects, and comparisons of the extremity of an attitude expressed within statements, such as statements about capital punishment. The measurements represent how we perceive entities, rather than measurements of actual physical properties. This kind of measurement is the focus of psychometrics and psychophysics.
  In somewhat more technical terms, the law of comparative judgment is a mathematical representation of a discriminal process, which is any process in which a comparison is made between pairs of a collection of entities with respect to magnitudes of an attribute, trait, attitude, and so on. The theoretical basis for the model is closely related to item response theory and the theory underlying the Rasch model, which are used in psychology and education to analyse data from questionnaires and tests.

1927

(Thurstone, 1927) ⇒ L.L. Thurstone. (1927). "A Law of Comparative Judgment." In: Psychological Review, 34(4), 273–286. doi:10.1037/h0070288
- NOTE: It provides a foundational understanding of how comparative judgments can be mathematically modeled and is seminal in the development of modern psychometrics.