Marginal Homogeneity Tests

This page is based solely on the coin package documentation, including the data samples, which are generated inline.

coin::mh_test() provides the McNemar test, the Cochran Q test, the Stuart(-Maxwell) test and the Madansky test of interchangeability. A general description of these methods is given by Agresti (2002).

The test assesses the null hypothesis of marginal homogeneity. When the formula interface is used, y is the response variable, x gives the measurement conditions, and block is a factor in which each level corresponds to exactly one subject with repeated measurements: coin::mh_test(y ~ x | block, data, subset = NULL, ...). An object of class "table" can also be passed directly.
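The two interfaces can be compared on a small paired example. The sketch below uses made-up counts; the table `paired` and the data frame `long` are illustrative names and not part of the coin documentation. It expands the table to one row per subject and condition and then calls coin::mh_test() both ways.

```r
# Hypothetical 2 x 2 table of paired binary responses (made-up counts)
paired <- as.table(matrix(
  c(10,  5,
    15, 20),
  nrow = 2, byrow = TRUE,
  dimnames = list(before = c("no", "yes"), after = c("no", "yes"))
))

# One row per subject, holding both responses
cells <- as.data.frame(paired)
wide <- cells[rep(seq_len(nrow(cells)), cells$Freq), c("before", "after")]
n <- nrow(wide)

# Long format: one row per subject and measurement condition
long <- data.frame(
  y = factor(c(as.character(wide$before), as.character(wide$after)),
             levels = c("no", "yes")),
  x = factor(rep(c("before", "after"), each = n),
             levels = c("before", "after")),
  block = factor(rep(seq_len(n), times = 2))
)

fit_formula <- coin::mh_test(y ~ x | block, data = long)
fit_table <- coin::mh_test(paired)

stat_formula <- as.numeric(coin::statistic(fit_formula))
stat_table <- as.numeric(coin::statistic(fit_table))
c(formula = stat_formula, table = stat_table)
```

With two conditions and a binary response, both statistics should equal the uncorrected McNemar statistic computed from the discordant pairs, here (5 - 15)^2 / (5 + 15) = 5.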

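coin::mh_test() computes different tests depending on x and y: the McNemar test when y is binary and x has two levels, the Cochran Q test when y is binary and x has more than two levels, the Stuart(-Maxwell) test when y has more than two categories and x has two levels, and the Madansky test of interchangeability when y has more than two categories and x has more than two levels. When scores are supplied for an ordered response, the corresponding Birch variants (Stuart-Birch, Madansky-Birch) are computed, as in the examples further down.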

The conditional null distribution of the test statistic is used to obtain p-values, and an asymptotic approximation of the exact distribution is used by default (distribution = "asymptotic"). Alternatively, the distribution can be approximated via Monte Carlo resampling (distribution = "approximate") or, for univariate two-sample problems such as the McNemar test, computed exactly (distribution = "exact").
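As a rough illustration of the three settings, the sketch below runs the same McNemar-type problem under each distribution. The table `ratings` and its counts are made up for this example, and the seed is an arbitrary choice to make the Monte Carlo result repeatable.

```r
# Hypothetical 2 x 2 table of paired binary ratings (made-up counts)
ratings <- as.table(matrix(
  c(12, 3,
     9, 6),
  nrow = 2, byrow = TRUE,
  dimnames = list(first = c("fail", "pass"), second = c("fail", "pass"))
))

set.seed(2025)  # Monte Carlo resampling is random; fix the seed

# Default: asymptotic approximation
fit_asymptotic <- coin::mh_test(ratings)

# Monte Carlo approximation of the conditional null distribution
fit_approximate <- coin::mh_test(
  ratings,
  distribution = coin::approximate(nresample = 10000)
)

# Exact conditional null distribution (two conditions, binary response)
fit_exact <- coin::mh_test(ratings, distribution = "exact")

p_values <- c(
  asymptotic  = as.numeric(coin::pvalue(fit_asymptotic)),
  approximate = as.numeric(coin::pvalue(fit_approximate)),
  exact       = as.numeric(coin::pvalue(fit_exact))
)
p_values
```

With only 12 discordant pairs, the asymptotic p-value can differ noticeably from the other two, which is the usual reason to prefer the exact or resampling-based distribution for small samples.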

McNemar test

For more information on the McNemar test, see the McNemar’s test page.

Cochran Q test

## Effectiveness of different media for the growth of diphtheria
## Cochran (1950, Tab. 2)
cases <- c(4, 2, 3, 1, 59)
n <- sum(cases)
cochran <- data.frame(
  diphtheria = factor(
    unlist(rep(list(c(1, 1, 1, 1),
                    c(1, 1, 0, 1),
                    c(0, 1, 1, 1),
                    c(0, 1, 0, 1),
                    c(0, 0, 0, 0)),
                cases)
    )
  ),
  media = factor(rep(LETTERS[1:4], n)),
  case =  factor(rep(seq_len(n), each = 4))
)

head(cochran)
  diphtheria media case
1          1     A    1
2          1     B    1
3          1     C    1
4          1     D    1
5          1     A    2
6          1     B    2
## Asymptotic Cochran Q test (Cochran, 1950, p. 260)
coin::mh_test(
  diphtheria ~ media | case, 
  data = cochran
) 

    Asymptotic Marginal Homogeneity Test

data:  diphtheria by media (A, B, C, D) 
     stratified by case
chi-squared = 8.0526, df = 3, p-value = 0.04494
## Approximative Cochran Q test
mt <- coin::mh_test(
  diphtheria ~ media | case, 
  data = cochran,
  distribution = coin::approximate(nresample = 10000)
)
coin::pvalue(mt)             # standard p-value
[1] 0.0513
99 percent confidence interval:
 0.04578231 0.05724962 
coin::midpvalue(mt)          # mid-p-value
[1] 0.0421
99 percent confidence interval:
 0.03714186 0.04749367 
coin::pvalue_interval(mt)    # p-value interval
   p_0    p_1 
0.0329 0.0513 
coin::size(mt, alpha = 0.05) # test size at alpha = 0.05 using the p-value
[1] 0.0329
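
The asymptotic statistic above can be cross-checked against the textbook Cochran Q formula, Q = (k - 1) (k ΣC_j² - N²) / (k N - ΣR_i²), where C_j are the column (medium) totals, R_i the row (case) totals and N the grand total. The following base R sketch rebuilds the 0/1 response matrix from the same patterns and frequencies; the object names are illustrative.

```r
# Response patterns (columns: media A-D) and how often each occurs
patterns <- rbind(
  c(1, 1, 1, 1),
  c(1, 1, 0, 1),
  c(0, 1, 1, 1),
  c(0, 1, 0, 1),
  c(0, 0, 0, 0)
)
cases <- c(4, 2, 3, 1, 59)

# One row per case, one column per medium
responses <- patterns[rep(seq_len(nrow(patterns)), cases), ]
colnames(responses) <- LETTERS[1:4]

k <- ncol(responses)               # number of measurement conditions
col_totals <- colSums(responses)   # positives per medium
row_totals <- rowSums(responses)   # positives per case
total <- sum(responses)            # grand total of positives

# Classical Cochran Q statistic and its chi-squared p-value (df = k - 1)
q_stat <- (k - 1) * (k * sum(col_totals^2) - total^2) /
  (k * total - sum(row_totals^2))
p_value <- pchisq(q_stat, df = k - 1, lower.tail = FALSE)

round(c(Q = q_stat, p = p_value), 5)
```

Here Q works out to 153 / 19, which rounds to the chi-squared value of 8.0526 reported by coin::mh_test().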

Stuart-Maxwell test

## Opinions on Pre- and Extramarital Sex
## Agresti (2002, p. 421)
opinions <- c("Always wrong", "Almost always wrong",
              "Wrong only sometimes", "Not wrong at all")
PreExSex <- matrix(
  c(144, 33, 84, 126,
    2,  4, 14,  29,
    0,  2,  6,  25,
    0,  0,  1,   5),
  nrow = 4,
  dimnames = list(
    "Premarital Sex" = opinions,
    "Extramarital Sex" = opinions
  )
)
PreExSex <- as.table(PreExSex)

PreExSex
                      Extramarital Sex
Premarital Sex         Always wrong Almost always wrong Wrong only sometimes
  Always wrong                  144                   2                    0
  Almost always wrong            33                   4                    2
  Wrong only sometimes           84                  14                    6
  Not wrong at all              126                  29                   25
                      Extramarital Sex
Premarital Sex         Not wrong at all
  Always wrong                        0
  Almost always wrong                 0
  Wrong only sometimes                1
  Not wrong at all                    5
## Asymptotic Stuart test
coin::mh_test(PreExSex)

    Asymptotic Marginal Homogeneity Test

data:  response by
     conditions (Premarital.Sex, Extramarital.Sex) 
     stratified by block
chi-squared = 271.92, df = 3, p-value < 2.2e-16
## Asymptotic Stuart-Birch test
## Note: response as ordinal
coin::mh_test(
  PreExSex, 
  scores = list(response = 1:length(opinions))
)

    Asymptotic Marginal Homogeneity Test for Ordered Data

data:  response (ordered) by
     conditions (Premarital.Sex, Extramarital.Sex) 
     stratified by block
Z = 16.454, p-value < 2.2e-16
alternative hypothesis: two.sided
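
Both statistics in this section can be reproduced from the table margins. The sketch below is a hand computation under the usual Stuart-Maxwell construction: d holds the differences between row and column margins and S their estimated covariance under the null. It is not how coin computes the result internally, and the object names are illustrative.

```r
opinions <- c("Always wrong", "Almost always wrong",
              "Wrong only sometimes", "Not wrong at all")
counts <- matrix(
  c(144, 33, 84, 126,
      2,  4, 14,  29,
      0,  2,  6,  25,
      0,  0,  1,   5),
  nrow = 4,
  dimnames = list("Premarital Sex" = opinions, "Extramarital Sex" = opinions)
)

# Differences between the two marginal distributions
d <- rowSums(counts) - colSums(counts)

# Covariance of d under marginal homogeneity:
# off-diagonal -(n_ij + n_ji), diagonal n_i+ + n_+i - 2 n_ii
S <- -(counts + t(counts))
diag(S) <- rowSums(counts) + colSums(counts) - 2 * diag(counts)

# d sums to zero, so S is singular; drop the last category
keep <- seq_len(nrow(counts) - 1)
chi_sq <- as.numeric(t(d[keep]) %*% solve(S[keep, keep], d[keep]))

# Ordinal version: linear-by-scores statistic with scores 1, ..., 4
scores <- seq_along(opinions)
z <- sum(scores * d) / sqrt(as.numeric(t(scores) %*% S %*% scores))

round(c(chi_squared = chi_sq, Z = z), 3)
```

The quadratic form has 3 degrees of freedom and agrees with the chi-squared value of 271.92 above; the score statistic agrees with Z = 16.454 from the ordered test.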

Madansky test of interchangeability

## Vote intention
## Madansky (1963, pp. 107-108)
vote <- array(
    c(120, 1,  8, 2,   2,  1, 2, 1,  7,
        6, 2,  1, 1, 103,  5, 1, 4,  8,
       20, 3, 31, 1,   6, 30, 2, 1, 81),
    dim = c(3, 3, 3),
    dimnames = list(
          "July" = c("Republican", "Democratic", "Uncertain"),
        "August" = c("Republican", "Democratic", "Uncertain"),
          "June" = c("Republican", "Democratic", "Uncertain")
    )
)
vote <- as.table(vote)

vote
, , June = Republican

            August
July         Republican Democratic Uncertain
  Republican        120          2         2
  Democratic          1          2         1
  Uncertain           8          1         7

, , June = Democratic

            August
July         Republican Democratic Uncertain
  Republican          6          1         1
  Democratic          2        103         4
  Uncertain           1          5         8

, , June = Uncertain

            August
July         Republican Democratic Uncertain
  Republican         20          1         2
  Democratic          3          6         1
  Uncertain          31         30        81
## Asymptotic Madansky test (Q = 70.77)
coin::mh_test(vote)

    Asymptotic Marginal Homogeneity Test

data:  response by
     conditions (July, August, June) 
     stratified by block
chi-squared = 70.763, df = 4, p-value = 1.565e-14
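
To see what the null hypothesis of marginal homogeneity compares here, it can help to tabulate the one-way margins of the three-way table. This base R sketch rebuilds the same array and sums over each dimension in turn; `margins` is an illustrative name.

```r
parties <- c("Republican", "Democratic", "Uncertain")
vote <- array(
  c(120, 1,  8, 2,   2,  1, 2, 1,  7,
      6, 2,  1, 1, 103,  5, 1, 4,  8,
     20, 3, 31, 1,   6, 30, 2, 1, 81),
  dim = c(3, 3, 3),
  dimnames = list("July" = parties, "August" = parties, "June" = parties)
)

# One column per measurement condition, one row per response category
margins <- sapply(seq_along(dim(vote)), function(i) apply(vote, i, sum))
colnames(margins) <- names(dimnames(vote))

margins                                    # counts
round(prop.table(margins, margin = 2), 3)  # proportions within each condition
```

Under marginal homogeneity the three columns of proportions would agree up to sampling variation; the test above quantifies how far they depart from that.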
## Cross-over study
## http://www.nesug.org/proceedings/nesug00/st/st9005.pdf (link is dead now)
dysmenorrhea <- array(
  c(6, 2, 1,  3, 1, 0,  1, 2, 1,
    4, 3, 0, 13, 3, 0,  8, 1, 1,
    5, 2, 2, 10, 1, 0, 14, 2, 0),
  dim = c(3, 3, 3),
  dimnames =  list(
    "Placebo" = c("None", "Moderate", "Complete"),
    "Low dose" = c("None", "Moderate", "Complete"),
    "High dose" = c("None", "Moderate", "Complete")
  )
)
dysmenorrhea <- as.table(dysmenorrhea)

dysmenorrhea
, , High dose = None

          Low dose
Placebo    None Moderate Complete
  None        6        3        1
  Moderate    2        1        2
  Complete    1        0        1

, , High dose = Moderate

          Low dose
Placebo    None Moderate Complete
  None        4       13        8
  Moderate    3        3        1
  Complete    0        0        1

, , High dose = Complete

          Low dose
Placebo    None Moderate Complete
  None        5       10       14
  Moderate    2        1        2
  Complete    2        0        0
## Asymptotic Madansky-Birch test (Q = 53.76)
## Note: response as ordinal
coin::mh_test(
  dysmenorrhea, 
  scores = list(response = 1:3)
)

    Asymptotic Marginal Homogeneity Test for Ordered Data

data:  response (ordered) by
     conditions (Placebo, Low.dose, High.dose) 
     stratified by block
chi-squared = 53.762, df = 2, p-value = 2.117e-12
## Asymptotic Madansky-Birch test (Q = 47.29)
## Note: response and measurement conditions as ordinal
coin::mh_test(
  dysmenorrhea, 
  scores = list(response = 1:3, conditions = 1:3)
)

    Asymptotic Marginal Homogeneity Test for Ordered Data

data:  response (ordered) by
     conditions (Placebo < Low.dose < High.dose) 
     stratified by block
Z = 6.8764, p-value = 6.138e-12
alternative hypothesis: two.sided

References

Hothorn, T., Hornik, K., van de Wiel, M. A. and Zeileis, A. (2006). A Lego system for conditional inference. The American Statistician 60(3), 257–263. doi:10.1198/000313006X118430

Agresti, A. (2002). Categorical Data Analysis, Second Edition. Hoboken, New Jersey: John Wiley & Sons.

Birch, M. W. (1965). The detection of partial association, II: The general case. Journal of the Royal Statistical Society B 27(1), 111–124. doi:10.1111/j.2517-6161.1965.tb00593.x

Cochran, W. G. (1950). The comparison of percentages in matched samples. Biometrika 37(3/4), 256–266. doi:10.1093/biomet/37.3-4.256

Madansky, A. (1963). Tests of homogeneity for correlated samples. Journal of the American Statistical Association 58(301), 97–119. doi:10.1080/01621459.1963.10500835

Maxwell, A. E. (1970). Comparing the classification of subjects by two independent judges. British Journal of Psychiatry 116(535), 651–655. doi:10.1192/bjp.116.535.651

McNemar, Q. (1947). Note on the sampling error of the difference between correlated proportions or percentages. Psychometrika 12(2), 153–157. doi:10.1007/BF02295996

Stuart, A. (1955). A test for homogeneity of the marginal distributions in a two-way classification. Biometrika 42(3/4), 412–416. doi:10.1093/biomet/42.3-4.412

White, A. A., Landis, J. R. and Cooper, M. M. (1982). A note on the equivalence of several marginal homogeneity test criteria for categorical data. International Statistical Review 50(1), 27–34. doi:10.2307/1402457

─ Session info ───────────────────────────────────────────────────────────────
 setting  value
 version  R version 4.4.3 (2025-02-28)
 os       Ubuntu 24.04.2 LTS
 system   x86_64, linux-gnu
 ui       X11
 language (EN)
 collate  C.UTF-8
 ctype    C.UTF-8
 tz       Europe/London
 date     2025-03-13
 pandoc   NA (via rmarkdown)

─ Packages ───────────────────────────────────────────────────────────────────
 ! package     * version date (UTC) lib source
   codetools     0.2-20  2024-03-31 [2] CRAN (R 4.4.3)
 P coin          1.4-3   2023-09-27 [?] RSPM (R 4.4.0)
   lattice       0.22-6  2024-03-20 [2] CRAN (R 4.4.3)
 P libcoin       1.0-10  2023-09-27 [?] RSPM (R 4.4.0)
 P MASS          7.3-61  2024-06-13 [?] RSPM (R 4.4.0)
 P Matrix        1.7-1   2024-10-18 [?] RSPM (R 4.4.0)
 P matrixStats   1.4.1   2024-09-08 [?] RSPM (R 4.4.0)
 P modeltools    0.2-23  2020-03-05 [?] RSPM (R 4.4.0)
 P multcomp      1.4-26  2024-07-18 [?] RSPM (R 4.4.0)
 P mvtnorm       1.3-1   2024-09-03 [?] RSPM (R 4.4.0)
 P sandwich      3.1-1   2024-09-15 [?] RSPM (R 4.4.0)
 P survival      3.7-0   2024-06-05 [?] RSPM (R 4.4.0)
 P TH.data       1.1-2   2023-04-17 [?] RSPM (R 4.4.0)
 P zoo           1.8-12  2023-04-13 [?] RSPM (R 4.4.0)

 [1] /home/michael/source/CAMIS/renv/library/linux-ubuntu-noble/R-4.4/x86_64-pc-linux-gnu
 [2] /opt/R/4.4.3/lib/R/library

 P ── Loaded and on-disk path mismatch.

──────────────────────────────────────────────────────────────────────────────