Meta-d' App

Data preview

Upload CSV

Column mapping

Participant ID

Stimulus

Response

Confidence

Stimulus values

Response values

Model parameters

s (SD ratio, default=1)

Restarts (random starts per participant)

Measures of Metacognition — Conceptual & Mathematical Overview

Signal Detection Theory (SDT) Measures

SDT measures assume internal evidence follows normal distributions. They disentangle metacognitive sensitivity from bias.

d’ — Type 1 sensitivity

The standard measure of perceptual discrimination ability. Quantifies how well the observer separates the two stimulus distributions.

$$d' = \Phi^{-1}(H) - \Phi^{-1}(F)$$

where H = hit rate, F = false-alarm rate, $\Phi^{-1}$ = probit function.

📚 Green, D. M., & Swets, J. A. (1966). Signal detection theory and psychophysics. Wiley.
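As a minimal sketch of the formula above, d’ can be computed from a hit rate and a false-alarm rate with the standard library alone. The function name `d_prime` is illustrative and not part of the app; rates of exactly 0 or 1 have no finite probit and would need a correction that is not shown here.

```python
from statistics import NormalDist


def d_prime(hit_rate: float, fa_rate: float) -> float:
    """Type 1 sensitivity: probit(H) - probit(F)."""
    probit = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return probit(hit_rate) - probit(fa_rate)


# Example: 80% hits and 20% false alarms
print(round(d_prime(0.8, 0.2), 3))  # 1.683
```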

meta-d’ — Type 2 sensitivity

The value of d’ that a hypothetical ideal observer would need to produce the observed pattern of Type 2 (confidence) responses. Fit via maximum likelihood to the Type 2 ROC curve.

$$\text{meta-}d' = \underset{m}{\arg\max}\; \ell(\text{Type 2 data} \mid m,\, d',\, c)$$

where $\ell$ is the log-likelihood under the SDT model with metacognitive sensitivity $m$, Type 1 sensitivity $d'$, and criteria $c$. The likelihood is maximised numerically with the L-BFGS-B optimiser.

📚 Maniscalco, B., & Lau, H. (2012). A signal detection theoretic approach for estimating metacognitive sensitivity from confidence ratings. Consciousness & Cognition, 21, 422–430.
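The fit described above might look roughly like the following sketch. It is not the app's code: it assumes equal variances (s = 1), holds the ratio c/d’ fixed when placing the meta-level Type 1 criterion, and parameterises the Type 2 criteria as positive gaps so that box bounds keep them ordered. The names `type2_probs` and `fit_meta_d`, the bounds, and the starting ranges are all hypothetical choices.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm


def type2_probs(m, c1_meta, gaps_s1, gaps_s2):
    """P(confidence level | response, accuracy) for levels 1..K (low to high)."""
    # Type 2 criteria lie above c1_meta for "S2" responses, below it for "S1".
    crit_s2 = c1_meta + np.concatenate(([0.0], np.cumsum(gaps_s2)))
    crit_s1 = c1_meta - np.concatenate(([0.0], np.cumsum(gaps_s1)))
    out = {}
    # Mean of the presented stimulus's distribution, given response and accuracy.
    for acc, mean_if_s2_resp, mean_if_s1_resp in (
        ("correct", m / 2, -m / 2),
        ("error", -m / 2, m / 2),
    ):
        upper = np.append(norm.sf(crit_s2 - mean_if_s2_resp), 0.0)
        out[("S2", acc)] = (upper[:-1] - upper[1:]) / upper[0]
        lower = np.append(norm.cdf(crit_s1 - mean_if_s1_resp), 0.0)
        out[("S1", acc)] = (lower[:-1] - lower[1:]) / lower[0]
    return out


def fit_meta_d(counts, d1, c1, n_restarts=5, seed=0):
    """Maximum-likelihood meta-d' from confidence counts per (response, accuracy)."""
    k = len(counts[("S2", "correct")])  # number of confidence levels
    ratio = c1 / d1                     # assumption: c/d' is held constant

    def neg_log_lik(theta):
        m = theta[0]
        probs = type2_probs(m, m * ratio, theta[1:k], theta[k:])
        return -sum(
            np.sum(counts[key] * np.log(np.clip(probs[key], 1e-10, None)))
            for key in counts
        )

    bounds = [(-5.0, 5.0)] + [(1e-3, 5.0)] * (2 * (k - 1))
    rng = np.random.default_rng(seed)
    best = None
    for _ in range(n_restarts):  # random restarts guard against local optima
        x0 = np.concatenate(
            ([rng.uniform(0.1, 2.0)], rng.uniform(0.2, 1.0, 2 * (k - 1)))
        )
        res = minimize(neg_log_lik, x0, method="L-BFGS-B", bounds=bounds)
        if best is None or res.fun < best.fun:
            best = res
    return best.x[0], best


# Round trip: expected counts generated with meta-d' = 1.2, then refitted.
true_probs = type2_probs(1.2, 1.2 * 0.2, np.array([0.5, 0.6]), np.array([0.4, 0.7]))
counts = {key: 1000 * p for key, p in true_probs.items()}
meta_d_hat, result = fit_meta_d(counts, d1=1.5, c1=0.3)
print(meta_d_hat)
```

Parameterising the criteria as gaps is one way to respect the ordering constraint within L-BFGS-B, which supports only box bounds; other implementations may handle the constraint differently.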

M-Ratio = meta-d’ / d’

Normalises metacognitive sensitivity by task performance. Values near 1 indicate ideal metacognition; values < 1 suggest information loss in the metacognitive system.

$$M_{\text{ratio}} = \frac{\text{meta-}d'}{d'}$$

Range: unbounded in principle; typically between 0 and 1, but it can exceed 1 or turn negative when the fitted meta-d’ is negative. Unstable when d’ ≈ 0.

📚 Maniscalco & Lau (2012) ibid.; Rahnev, D. (2025). Nature Communications, 16, 701.

M-Diff = meta-d’ − d’

Difference-based alternative to M-Ratio. Negative values indicate a metacognitive deficit; positive values indicate a surplus. Tends to over-correct at extreme d’ values.

$$M_{\text{diff}} = \text{meta-}d' - d'$$

📚 Maniscalco & Lau (2012) ibid.; Rahnev (2025) ibid.
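The two normalisations can be compared side by side in a short sketch. The function names and the `min_d` cutoff are hypothetical; the cutoff simply illustrates one way to flag the instability of the ratio when d’ is close to zero.

```python
import math


def m_ratio(meta_d: float, d1: float, min_d: float = 0.1) -> float:
    """meta-d'/d'; NaN when |d'| is below min_d (an arbitrary illustrative cutoff)."""
    return meta_d / d1 if abs(d1) >= min_d else math.nan


def m_diff(meta_d: float, d1: float) -> float:
    """meta-d' minus d'; negative values indicate a metacognitive deficit."""
    return meta_d - d1


print(m_ratio(1.2, 1.5))  # ratio close to 0.8
print(m_diff(1.2, 1.5))   # difference close to -0.3
```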

Trial-Level Association Measures

Measure how strongly confidence tracks accuracy on a trial-by-trial basis. No SDT model assumptions required; computable from raw trial data.

ΔConf — Delta Confidence

The simplest association measure: the difference in mean confidence between correct and incorrect trials. Positive values indicate higher confidence on correct trials.

$$\Delta\text{Conf} = \overline{c}_{\text{correct}} - \overline{c}_{\text{error}}$$

Units: raw confidence scale. Strongly depends on task d’.

📚 Rahnev, D. (2025). A comprehensive assessment of current methods for measuring metacognition. Nature Communications, 16, 701.
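A minimal sketch of ΔConf on a small made-up data set (eight trials, confidence on a 1 to 4 scale). The function name and the data are illustrative only; the function assumes at least one correct and one error trial.

```python
from statistics import fmean


def delta_conf(confidence, correct):
    """Mean confidence on correct trials minus mean confidence on error trials."""
    conf_correct = [c for c, a in zip(confidence, correct) if a]
    conf_error = [c for c, a in zip(confidence, correct) if not a]
    return fmean(conf_correct) - fmean(conf_error)


confidence = [4, 3, 4, 2, 1, 3, 2, 1]
correct = [1, 1, 1, 1, 0, 0, 1, 0]
print(delta_conf(confidence, correct))  # 3.0 - 5/3, about 1.33
```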

AUC₂ — Area Under the Type 2 ROC

The area under the curve that plots Type 2 hit rate against Type 2 false-alarm rate across all confidence thresholds. Proposed in the 1950s, it is the oldest metacognitive measure. It is computed here from empirical response frequencies.

$$\text{AUC}_2 = \int_0^1 \text{HR}_2\,d(\text{FAR}_2) \approx \sum_k \Delta\text{FAR}_{2,k}\cdot\overline{\text{HR}}_{2,k}$$

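Range: 0 to 1, with 0.5 indicating chance and 1.0 perfect metacognitive sensitivity; values below 0.5 mean confidence is higher on errors than on correct trials. Strongly depends on task d’.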

📚 Galvin, S. J. et al. (2003). Type 2 tasks in the theory of signal detectability. Perception & Psychophysics, 65, 354–370. | Rahnev (2025) ibid.
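The trapezoidal sum in the formula above can be sketched as follows, sweeping the confidence threshold from strictest to most lenient. This is an illustrative implementation on made-up data, not necessarily how the app computes the area; it assumes at least one correct and one error trial.

```python
def auc2(confidence, correct):
    """Area under the empirical Type 2 ROC, using the trapezoidal rule."""
    conf_correct = [c for c, a in zip(confidence, correct) if a]
    conf_error = [c for c, a in zip(confidence, correct) if not a]
    points = [(0.0, 0.0)]  # threshold above the highest rating
    for k in sorted(set(confidence), reverse=True):
        hr2 = sum(c >= k for c in conf_correct) / len(conf_correct)
        far2 = sum(c >= k for c in conf_error) / len(conf_error)
        points.append((far2, hr2))  # the lowest threshold yields (1, 1)
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


confidence = [4, 3, 4, 2, 1, 3, 2, 1]
correct = [1, 1, 1, 1, 0, 0, 1, 0]
print(auc2(confidence, correct))  # 12.5 / 15, about 0.83
```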

γ — Goodman–Kruskal Gamma

The rank correlation between trial-by-trial confidence and accuracy. The most common metacognitive measure in the memory literature.

$$\gamma = \frac{C - D}{C + D}$$

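where C = number of concordant trial pairs (the trial with higher confidence is the correct one) and D = number of discordant pairs (the trial with higher confidence is the incorrect one); pairs tied on confidence or accuracy are ignored. Range: $-1$ to $+1$.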

📚 Nelson, T. O. (1984). A comparison of current measures of the accuracy of feeling-of-knowing predictions. Psychological Bulletin, 95, 109–133. | Rahnev (2025) ibid.
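Counting concordant and discordant pairs directly gives a simple, if quadratic-time, sketch of γ. The function name and data are illustrative; the result is undefined (division by zero) when every pair is tied.

```python
from itertools import combinations


def goodman_kruskal_gamma(confidence, correct):
    """(C - D) / (C + D) over all trial pairs; tied pairs are skipped."""
    concordant = discordant = 0
    for (c_i, a_i), (c_j, a_j) in combinations(zip(confidence, correct), 2):
        product = (c_i - c_j) * (a_i - a_j)
        if product > 0:
            concordant += 1   # higher confidence goes with the correct trial
        elif product < 0:
            discordant += 1   # higher confidence goes with the error trial
    return (concordant - discordant) / (concordant + discordant)


confidence = [4, 3, 4, 2, 1, 3, 2, 1]
correct = [1, 1, 1, 1, 0, 0, 1, 0]
print(goodman_kruskal_gamma(confidence, correct))  # (12 - 2) / 14, about 0.71
```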

ϕ — Phi (Pearson correlation)

The Pearson product-moment correlation between trial-by-trial confidence rating and binary accuracy. Assumes a linear relationship between the two.

$$\phi = r(c_i,\, a_i) = \frac{\sum_i(c_i-\bar{c})(a_i-\bar{a})}{\sqrt{\sum_i(c_i-\bar{c})^2 \sum_i(a_i-\bar{a})^2}}$$

where $c_i$ = confidence on trial $i$, $a_i \in \{0,1\}$. Range: $-1$ to $+1$.

📚 Rahnev (2025) ibid.
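The correlation can be written out term by term to mirror the formula above. This is a sketch on made-up data; it is undefined when confidence or accuracy has zero variance.

```python
import math


def phi(confidence, correct):
    """Pearson correlation between confidence and binary accuracy."""
    n = len(confidence)
    c_bar = sum(confidence) / n
    a_bar = sum(correct) / n
    cov = sum((c - c_bar) * (a - a_bar) for c, a in zip(confidence, correct))
    ss_c = sum((c - c_bar) ** 2 for c in confidence)
    ss_a = sum((a - a_bar) ** 2 for a in correct)
    return cov / math.sqrt(ss_c * ss_a)


confidence = [4, 3, 4, 2, 1, 3, 2, 1]
correct = [1, 1, 1, 1, 0, 0, 1, 0]
print(phi(confidence, correct))  # 2.5 / sqrt(10 * 1.875), about 0.58
```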

Calibration Measures (Schraw 2009)

Item-level measures comparing normalised confidence to binary accuracy. Confidence is rescaled to [0, 1] before computation. Originally described for free-recall and knowledge-monitoring tasks.

Bias — over- / under-confidence

The signed mean discrepancy between normalised confidence and accuracy. Positive = overconfident; negative = underconfident.

$$\text{Bias} = \frac{1}{N}\sum_{i=1}^{N}(\hat{c}_i - p_i)$$

$\hat{c}_i$ = confidence normalised to [0,1]; $p_i \in \{0,1\}$. Range: $-1$ to $+1$.

📚 Schraw, G. (2009). A conceptual analysis of five measures of metacognitive monitoring. Metacognition and Learning, 4, 33–45.
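A minimal sketch of the bias computation. The min-max rescaling by the endpoints of the rating scale is an assumption about how confidence is normalised to [0, 1]; the function names and the 1 to 4 scale are illustrative.

```python
def normalise(confidence, c_min, c_max):
    """Rescale ratings to [0, 1] using the scale endpoints (assumed method)."""
    return [(c - c_min) / (c_max - c_min) for c in confidence]


def bias(confidence, correct, c_min, c_max):
    """Mean signed difference between normalised confidence and accuracy."""
    c_hat = normalise(confidence, c_min, c_max)
    return sum(c - p for c, p in zip(c_hat, correct)) / len(correct)


confidence = [4, 3, 4, 2, 1, 3, 2, 1]
correct = [1, 1, 1, 1, 0, 0, 1, 0]
print(bias(confidence, correct, c_min=1, c_max=4))  # 0.5 - 0.625 = -0.125
```

Here mean normalised confidence (0.5) is below the proportion correct (0.625), so the negative value indicates underconfidence.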

Calibration — Absolute Accuracy (Brier score)

The mean squared error between normalised confidence and accuracy. Lower values are better. Sometimes called the Brier score.

$$\text{Calibration} = \frac{1}{N}\sum_{i=1}^{N}(\hat{c}_i - p_i)^2$$

Range: 0 (perfect) – 1 (worst). Because errors are squared, the score reflects the magnitude of miscalibration but not its direction.

📚 Schraw (2009) ibid.
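The squared-error form can be sketched in the same way as bias. As before, min-max rescaling by the scale endpoints is an assumed normalisation, and the names and data are illustrative.

```python
def calibration(confidence, correct, c_min, c_max):
    """Mean squared difference between normalised confidence and accuracy."""
    c_hat = [(c - c_min) / (c_max - c_min) for c in confidence]  # assumed rescaling
    return sum((c - p) ** 2 for c, p in zip(c_hat, correct)) / len(correct)


confidence = [4, 3, 4, 2, 1, 3, 2, 1]
correct = [1, 1, 1, 1, 0, 0, 1, 0]
print(calibration(confidence, correct, c_min=1, c_max=4))  # 13/72, about 0.18
```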

Discrimination — weighted confidence difference

Measures how well the observer assigns higher confidence to correct than to incorrect items, with each mean weighted by the number of correct or error trials. Related to ΔConf but with count-weighting.

$$D = \frac{N_c\,\overline{c}_{\text{correct}} - N_e\,\overline{c}_{\text{error}}}{N}$$

$N_c, N_e$ = number of correct / error trials; $N = N_c + N_e$. Range: $-c_{\max}$ to $+c_{\max}$.

📚 Schraw (2009) ibid.
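A sketch of the count-weighted difference, applied to raw confidence ratings as the formula and its stated range suggest. The function name and data are illustrative.

```python
from statistics import fmean


def discrimination(confidence, correct):
    """(N_c * mean conf on correct - N_e * mean conf on errors) / N."""
    conf_correct = [c for c, a in zip(confidence, correct) if a]
    conf_error = [c for c, a in zip(confidence, correct) if not a]
    n_c, n_e = len(conf_correct), len(conf_error)
    return (n_c * fmean(conf_correct) - n_e * fmean(conf_error)) / (n_c + n_e)


confidence = [4, 3, 4, 2, 1, 3, 2, 1]
correct = [1, 1, 1, 1, 0, 0, 1, 0]
print(discrimination(confidence, correct))  # (15 - 5) / 8 = 1.25
```

Because each mean is multiplied by its trial count, the numerator reduces to the summed confidence on correct trials minus the summed confidence on error trials.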

Scatter — confidence variability difference

Assesses whether confidence judgments are more variable for correct trials than error trials. Positive scatter means confidence is more spread out on correct trials.

$$S = \frac{N_c\,\text{var}(c_{\text{correct}}) - N_e\,\text{var}(c_{\text{error}})}{N}$$

Range: $-\infty$ to $+\infty$. Near zero = equal variability.

📚 Schraw (2009) ibid.
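A sketch of the scatter computation. The formula does not say whether var denotes the population or the sample variance; this example assumes the population variance (`statistics.pvariance`), and the function name and data are illustrative.

```python
from statistics import pvariance


def scatter(confidence, correct):
    """(N_c * var(conf on correct) - N_e * var(conf on errors)) / N."""
    conf_correct = [c for c, a in zip(confidence, correct) if a]
    conf_error = [c for c, a in zip(confidence, correct) if not a]
    n_c, n_e = len(conf_correct), len(conf_error)
    # Population variance is an assumption; the sample variance would differ.
    return (n_c * pvariance(conf_correct) - n_e * pvariance(conf_error)) / (n_c + n_e)


confidence = [4, 3, 4, 2, 1, 3, 2, 1]
correct = [1, 1, 1, 1, 0, 0, 1, 0]
print(scatter(confidence, correct))  # (4 - 8/3) / 8, about 0.17
```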

Recommended reading:
Maniscalco, B., & Lau, H. (2014). Signal detection theory analysis of type 1 and type 2 data: meta-d’, response-specific meta-d’, and the unequal variance SDT model. In S. M. Fleming & C. D. Frith (Eds.), The Cognitive Neuroscience of Metacognition (pp. 25–66). Springer.

Rahnev, D. (2025). A comprehensive assessment of current methods for measuring metacognition. Nature Communications, 16, 701. https://doi.org/10.1038/s41467-025-56117-0

Schraw, G. (2009). A conceptual analysis of five measures of metacognitive monitoring. Metacognition and Learning, 4, 33–45. https://doi.org/10.1007/s11409-008-9031-3