Confidence Intervals for Independent Proportions in R

Introduction

This page covers confidence intervals for comparisons of two independent proportions in R, including the contrast parameters for risk difference (RD) \(\theta_{RD} = p_1 - p_2\), relative risk (RR) \(\theta_{RR} = p_1 / p_2\), and odds ratio (OR) \(\theta_{OR} = p_1(1-p_2) / (p_2(1-p_1))\).

See the summary page for general introductory information on confidence intervals for proportions, including the principles underlying the most common methods.

Note that because the Asymptotic Score methods rely on an iterative root-finding subroutine to identify the confidence limits, the precision of results (and therefore agreement between packages) depends on the tolerance parameter used.
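To make the role of the tolerance concrete, the following base R sketch inverts the Miettinen-Nurminen score statistic for RD with stats::uniroot(), using the counts from the example data on this page (36/154 vs 12/77). The function names restricted_mle(), mn_score() and mn_rd_ci() are hypothetical helpers written for this illustration, not part of any package. The restricted maximum likelihood estimate uses the well-known closed-form cubic solution, and the root search is limited to within 0.5 of the point estimate, which is an assumption that suits these data only.

```r
# Restricted MLE of (p1, p2) subject to p1 - p2 = delta (closed-form cubic solution)
restricted_mle <- function(x1, n1, x2, n2, delta) {
  p1hat <- x1 / n1
  p2hat <- x2 / n2
  theta <- n2 / n1
  a <- 1 + theta
  b <- -(1 + theta + p1hat + theta * p2hat + delta * (theta + 2))
  cc <- delta^2 + delta * (2 * p1hat + theta + 1) + p1hat + theta * p2hat
  d <- -p1hat * delta * (1 + delta)
  v <- b^3 / (27 * a^3) - b * cc / (6 * a^2) + d / (2 * a)
  u <- sign(v) * sqrt(b^2 / (9 * a^2) - cc / (3 * a))
  # Clamp the acos() argument to guard against rounding just outside [-1, 1]
  w <- (pi + acos(min(1, max(-1, v / u^3)))) / 3
  p1 <- 2 * u * cos(w) - b / (3 * a)
  c(p1 = p1, p2 = p1 - delta)
}

# Score statistic for H0: p1 - p2 = delta, with the 'N-1' variance correction
mn_score <- function(delta, x1, n1, x2, n2) {
  p <- restricted_mle(x1, n1, x2, n2, delta)
  N <- n1 + n2
  v <- (p[["p1"]] * (1 - p[["p1"]]) / n1 +
          p[["p2"]] * (1 - p[["p2"]]) / n2) * N / (N - 1)
  (x1 / n1 - x2 / n2 - delta) / sqrt(v)
}

# Confidence limits are the values of delta where the score equals +/- z
mn_rd_ci <- function(x1, n1, x2, n2, level = 0.95, tol = 1e-10) {
  z <- qnorm(1 - (1 - level) / 2)
  est <- x1 / n1 - x2 / n2
  lower <- uniroot(
    function(d) mn_score(d, x1, n1, x2, n2) - z,
    interval = c(max(est - 0.5, -0.999), est), tol = tol
  )$root
  upper <- uniroot(
    function(d) mn_score(d, x1, n1, x2, n2) + z,
    interval = c(est, min(est + 0.5, 0.999)), tol = tol
  )$root
  c(lower = lower, upper = upper)
}

# Same data, coarse versus tight root-finding tolerance
mn_rd_ci(36, 154, 12, 77, tol = 1e-2)
mn_rd_ci(36, 154, 12, 77, tol = 1e-10)
```

With the tight tolerance the limits should agree with the Miettinen-Nurminen results reported by the packages later on this page; with the coarse tolerance they may differ in the later decimal places, which is the kind of discrepancy that can appear between packages.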

Data used

This example uses the adcibc data (stored here), from which a binary treatment variable trt (taking the values ACT or PBO) and a binary response variable resp (taking the values Yes or No) were created. For this example, a response is defined as a score greater than 4.

The summary below shows that for the active treatment there are 36 responders out of 154 subjects, \(p_1\) = 36/154 = 0.2338 (23.38% responders), while for the placebo treatment \(p_2\) = 12/77 = 0.1558 (15.58% responders), giving a risk difference of 0.0779, a relative risk of 1.50, and an odds ratio of 1.6525.

# A tibble: 4 × 3
# Groups:   trt [2]
  trt   resp      n
  <chr> <chr> <int>
1 ACT   No      118
2 ACT   Yes      36
3 PBO   No       65
4 PBO   Yes      12
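The point estimates quoted above can be reproduced directly from the counts with a few lines of base R. This is a minimal sketch using the contrast definitions from the introduction; the variable names are chosen for illustration only.

```r
# Responders and group sizes from the summary table (ACT = group 1, PBO = group 2)
x1 <- 36; n1 <- 154
x2 <- 12; n2 <- 77

p1 <- x1 / n1
p2 <- x2 / n2

rd <- p1 - p2                              # risk difference
rr <- p1 / p2                              # relative risk
or <- (p1 * (1 - p2)) / (p2 * (1 - p1))    # odds ratio

round(c(p1 = p1, p2 = p2, RD = rd, RR = rr, OR = or), 4)
```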

Packages

The table below indicates which methods can be produced using each package, for each contrast. Methods are grouped into those that aim to achieve the nominal confidence level on average, followed by the ‘exact’ and continuity-adjusted methods that aim to achieve the nominal confidence level as a minimum. {ExactCIdiff} appears to be the only package offering an ‘exact’ method for RD, but run times can be prohibitively long, and it is not clear which of the SAS ‘exact’ methods it matches, if any.

Method	ratesci	contingencytables	DescTools	cicalc	gsDesign	PropCIs	cardx
For proximate coverage:
Wald/log	RD,RR,OR	RD,RR,OR	RD,RR,OR	RD	-	-	RD
Agresti-Caffo	RD	RD	RD	-	-	-	-
MOVER-W (Newcombe)	RD,RR,OR	RD,RR,OR	RD	RD	-	-	-
MOVER-J	RD,RR,OR	-	-	-	-	-	-
Asymptotic Score methods:
Miettinen-Nurminen	RD,RR,OR	RD,RR,OR	RD	RD	RD,RR,OR	RD	-
Mee/ Koopman	RD,RR,OR	RD,RR,OR	RD,RR	RD	RD,RR,OR	RR	-
SCAS	RD,RR,OR	-	-	-	-	-	-
For conservative coverage:
Wald-cc	RD	RD	RD	RD	-	-	RD
MOVER-W-cc	RD,RR,OR	-	RD	RD	-	-	-
MOVER-J-cc	RD,RR,OR	-	-	-	-	-	-
MN-cc	RD,RR,OR	-	-	-	-	-	-
SCAS-cc	RD,RR,OR	-	-	-	-	-	-

The {ratesci} package has the most extensive coverage of the best-performing methods (Asymptotic Score and MOVER) for all contrasts. It offers several features not available elsewhere: the skewness correction for improved symmetrical one-sided coverage (SCAS), corresponding hypothesis tests, an option to apply the MOVER approach with Jeffreys intervals (MOVER-J), and optional ‘sliding scale’ continuity adjustments across all methods. It also produces a selection of other methods for reference. Note that the current development version at https://github.com/petelaud/ratesci has added more functionality (including the convenience functions rdci() etc.) compared to the CRAN release (v1.0.0). A CRAN update will be released in due course.

The {DescTools} package has a function BinomDiffCI() which produces CIs for RD using numerous methods, including those indicated above plus Brown-Li (Jeffreys), Hauck-Anderson, Haldane, and Jeffreys-Perks. See here for more detail. The BinomRatioCI() function offers several methods for RR CIs, including the approximate (log)normal and Asymptotic Score (Koopman) methods. The methods available for OR are more limited: OddsRatio() only has Wald, MLE and mid-P CI methods.

The {PropCIs} package has functions diffscoreci() for the Miettinen-Nurminen (MN) CI for RD, riskscoreci() for the Koopman interval for RR, and orscoreci() for the MN CI for OR. It also has functions providing Bayesian tail intervals (with user-specified prior).

The {contingencytables} package also provides a good selection of different methods.

The {cicalc} package in general replicates the methods available in SAS, but only for the RD contrast.

The {gsDesign} package gives Asymptotic Score intervals (with or without ‘N-1’ correction) for all contrasts with the ciBinomial() function. The package also has functions for the corresponding hypothesis tests, and sample size calculations.

The {cardx} package has very limited options for comparing proportions: the cardx::ard_stats_prop_test() function only provides for estimation of RD with the Wald approximate normal method (which is not recommended).

Proportion Difference

This paper describes many methods for the calculation of confidence intervals for two independent proportions. The 2-sided and 1-sided performance of many of the same methods has been compared graphically¹. For more technical information regarding the methods below, see the corresponding SAS page.

Example code for {DescTools}

library(dplyr) # provides select(), mutate(), if_else(), filter(), count(), pull(), as_tibble()

indat1 <- adcibc2 |>
  select(AVAL, TRTP) |>
  mutate(
    resp = if_else(AVAL > 4, "Yes", "No"),
    respn = if_else(AVAL > 4, 1, 0),
    trt = if_else(TRTP == "Placebo", "PBO", "ACT"),
    trtn = if_else(TRTP == "Placebo", 1, 0)
  ) |>
  select(trt, trtn, resp, respn)

# The cardx package requires a vector of 0s and 1s for a single proportion CI.
# To get the comparison the right way round, Placebo must be coded 1 and Active 0.

indat <- select(indat1, trtn, respn)

# BinomDiffCI requires
# x1 = successes in active,  n1 = total subjects in active,
# x2 = successes in placebo, n2 = total subjects in placebo

x <- indat |>
  filter(respn == 1) |>
  count(trtn, respn) |>
  pull(n)
x

[1] 36 12

n <- indat |>
  count(trtn) |>
  pull(n)
n

[1] 154  77

DescTools::BinomDiffCI(
  x1 = x[1],
  n1 = n[1],
  x2 = x[2],
  n2 = n[2],
  conf.level = 0.95,
  sides = c("two.sided"),
  method = c(
    "wald",
    "waldcc",
    "score",
    "scorecc",
    "ac",
    "mn",
    "mee",
    "blj",
    "ha",
    "hal",
    "jp"
  )
) |>
  as_tibble(rownames = "Method")

# A tibble: 11 × 4
   Method     est  lwr.ci upr.ci
   <chr>    <dbl>   <dbl>  <dbl>
 1 wald    0.0779 -0.0271  0.183
 2 waldcc  0.0779 -0.0368  0.193
 3 score   0.0779 -0.0361  0.175
 4 scorecc 0.0779 -0.0440  0.181
 5 ac      0.0779 -0.0329  0.178
 6 mn      0.0779 -0.0361  0.177
 7 mee     0.0779 -0.0358  0.177
 8 blj     0.0779 -0.0306  0.181
 9 ha      0.0779 -0.0342  0.190
10 hal     0.0779 -0.0314  0.177
11 jp      0.0779 -0.0321  0.178

Example code for {ratesci}

# Selected methods for proximate coverage
ratesci::rdci(
  x1 = x[1],
  n1 = n[1],
  x2 = x[2],
  n2 = n[2],
  level = 0.95,
  precis = 6
)$estimates

, , 36/154 vs 12/77

                       lower      est    upper
SCAS               -0.034239 0.077922 0.179081
Gart-Nam           -0.033963 0.077922 0.178881
Miettinen-Nurminen -0.036065 0.077922 0.177495
Mee                -0.035798 0.077922 0.177288
MOVER-W            -0.036142 0.077922 0.175125
MOVER-J            -0.033570 0.077922 0.176023
Wald               -0.027108 0.077922 0.182952
Agresti-Caffo      -0.032925 0.077922 0.178170

Example code for {contingencytables}

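# The function expects rows = groups and columns = (success, failure).
# table() orders the resp columns alphabetically (No, Yes), so swap them to (Yes, No).
tab2x2 <- table(indat1$trt, indat1$resp)[, c(2, 1)]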

contingencytables::the_2x2_table_CIs_difference(n = tab2x2)

Estimate of pi_1: 36 / 154 = 0.234
Estimate of pi_2: 12 / 77 = 0.156
Estimate of delta = pi_1 - pi_2: 0.078

Interval method                           95% CI         width
--------------------------------------------------------------
Wald                                -0.0271 to  0.1830   0.210
Wald with continuity correction     -0.0368 to  0.1927   0.230
Agresti-Caffo                       -0.0329 to  0.1782   0.211
Newcombe hybrid score               -0.0361 to  0.1751   0.211
Mee asymptotic score                -0.0358 to  0.1773   0.213
Miettinen-Nurminen asymptotic score -0.0361 to  0.1775   0.214
--------------------------------------------------------------

Relative Risk

The 1-sided performance of selected methods has been compared graphically², with the observation that optimum 2-sided coverage follows directly from optimum 1-sided coverage (while the reverse is not true). It has been noted previously that the ratio contrasts suffer a greater imbalance in 1-sided coverage than RD³. Therefore, skewness correction is particularly valuable here.

Another relatively recent method, not provided by SAS, is the MOVER approach, which uses a formula adapted from the MOVER (Newcombe Hybrid Score) method for application to ratio contrasts. As for the RD contrast, this may be based on Wilson (MOVER-W) or Jeffreys (MOVER-J) intervals for the proportions in each group.
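As an illustration of the MOVER idea for a ratio, the base R sketch below combines Wilson intervals for each group into an interval for RR. The helper names wilson_ci() and mover_rr() are hypothetical, and the combining formula is the commonly published MOVER ratio form, assumed here rather than taken from any package source. Zero counts and other edge cases are not handled.

```r
# Wilson score interval for a single proportion
wilson_ci <- function(x, n, level = 0.95) {
  z <- qnorm(1 - (1 - level) / 2)
  p <- x / n
  centre <- (p + z^2 / (2 * n)) / (1 + z^2 / n)
  half <- z * sqrt(p * (1 - p) / n + z^2 / (4 * n^2)) / (1 + z^2 / n)
  c(lower = centre - half, upper = centre + half)
}

# MOVER interval for p1 / p2 built from the two Wilson intervals
mover_rr <- function(x1, n1, x2, n2, level = 0.95) {
  p1 <- x1 / n1
  p2 <- x2 / n2
  ci1 <- wilson_ci(x1, n1, level)
  ci2 <- wilson_ci(x2, n2, level)
  l1 <- ci1[["lower"]]; u1 <- ci1[["upper"]]
  l2 <- ci2[["lower"]]; u2 <- ci2[["upper"]]
  lower <- (p1 * p2 - sqrt((p1 * p2)^2 - l1 * u2 * (2 * p1 - l1) * (2 * p2 - u2))) /
    (u2 * (2 * p2 - u2))
  upper <- (p1 * p2 + sqrt((p1 * p2)^2 - u1 * l2 * (2 * p1 - u1) * (2 * p2 - l2))) /
    (l2 * (2 * p2 - l2))
  c(lower = lower, est = p1 / p2, upper = upper)
}

# Example data from this page: 36/154 (ACT) vs 12/77 (PBO)
mover_rr(36, 154, 12, 77)
```

For these data the result should be close to the MOVER-W row of the {ratesci} output shown in this section.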

Example code for {DescTools}

DescTools::BinomRatioCI(
  x1 = x[1],
  n1 = n[1],
  x2 = x[2],
  n2 = n[2],
  conf.level = 0.95,
  sides = c("two.sided"),
  method = c(
    "katz.log", "adj.log", "bailey", "koopman", "noether", "sinh-1", "boot"
  )
) |>
  as_tibble(rownames = "Method")

# A tibble: 7 × 4
  Method     est lwr.ci upr.ci
  <chr>    <dbl>  <dbl>  <dbl>
1 katz.log   1.5  0.829   2.71
2 adj.log    1.5  0.819   2.62
3 bailey     1.5  0.851   2.82
4 koopman    1.5  0.849   2.73
5 noether    1.5  0.610   2.39
6 sinh-1     1.5  0.836   2.69
7 boot       1.5  0.893   3.17

Example code for {ratesci}

# Selected methods for proximate coverage
ratesci::rrci(
  x1 = x[1],
  n1 = n[1],
  x2 = x[2],
  n2 = n[2],
  level = 0.95,
  precis = 6
)$estimates

, , 36/154 vs 12/77

                      lower est   upper
SCAS               0.852966 1.5 2.83502
Gart-Nam           0.854006 1.5 2.83174
Miettinen-Nurminen 0.848245 1.5 2.73225
Koopman            0.849237 1.5 2.72874
MOVER-W            0.847205 1.5 2.71518
MOVER-J            0.854776 1.5 2.80156
Katz log           0.828758 1.5 2.71490
Adjusted log       0.818876 1.5 2.61996

Example code for {gsDesign}

# Miettinen-Nurminen (adj = 1 applies the 'N-1' variance correction)
gsDesign::ciBinomial(
  x1 = x[1],
  n1 = n[1],
  x2 = x[2],
  n2 = n[2],
  scale = 'rr', 
  adj = 1
)

     lower   upper
1 0.848245 2.73227

Example code for {contingencytables}

contingencytables::the_2x2_table_CIs_ratio(n = tab2x2)

Estimate of pi_1: 36 / 154 = 0.234
Estimate of pi_2: 12 / 77 = 0.156
Estimate of phi = pi_1 / pi_2: 1.500

Interval method                            95% CI      Log width
----------------------------------------------------------------
Katz log                              0.829 to  2.715    1.187
Adjusted log                          0.819 to  2.620    1.163
Price-Bonett approximate Bayes        0.831 to  2.695    1.177
Inverse sinh                          0.836 to  2.692    1.170
Adjusted inverse sinh                 0.835 to  2.694    1.171
MOVER-R Wilson                        0.847 to  2.715    1.165
Miettinen-Nurminen asymptotic score   0.848 to  2.732    1.170
Koopman asymptotic score              0.849 to  2.729    1.167
----------------------------------------------------------------

Odds Ratio

Example code for {ratesci}

# Selected methods for proximate coverage
ratesci::orci(
  x1 = x[1],
  n1 = n[1],
  x2 = x[2],
  n2 = n[2],
  level = 0.95,
  precis = 6
)$estimates

, , 36/154 vs 12/77

                                lower     est   upper
SCAS                         0.817594 1.65254 3.48002
Gart                         0.818872 1.65254 3.47530
Miettinen-Nurminen           0.810258 1.65254 3.36344
Uncorrected Asymptotic Score 0.811482 1.65254 3.35842
MOVER-W                      0.806857 1.65254 3.34520
MOVER-J                      0.817280 1.65254 3.44985
Woolf logit                  0.804332 1.65254 3.39523
Gart adjusted logit          0.793782 1.65254 3.28179

Example code for {gsDesign}

# Miettinen-Nurminen (adj = 1 applies the 'N-1' variance correction)
gsDesign::ciBinomial(
  x1 = x[1],
  n1 = n[1],
  x2 = x[2],
  n2 = n[2],
  scale = 'or', 
  adj = 1
)

     lower   upper
1 0.810255 3.36353

Example code for {contingencytables}

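Note there is an error in the {contingencytables} package v3.1.0 for the MOVER-W (“MOVER-R Wilson”) method: the limits in the output below (0.872 to 3.336) do not match the {ratesci} MOVER-W result (0.807 to 3.345).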

contingencytables::the_2x2_table_CIs_OR(n = tab2x2)

Estimate of pi_1: 36 / 154 = 0.234
Estimate of pi_2: 12 / 77 = 0.156
Estimate of theta = (pi_1 / (1-pi_1)) / (pi_2 / (1-pi_2)): 1.653

Interval method                            95% CI      Log width
----------------------------------------------------------------
Woolf logit                            0.804 to  3.395    1.440
Gart adjusted logit                    0.794 to  3.282    1.419
Independence-smoothed logit            0.804 to  3.367    1.433
Inverse sinh                           0.816 to  3.346    1.411
Adjusted inverse sinh (0.45, 0.25)     0.803 to  3.259    1.401
Adjusted inverse sinh (0.6, 0.4)       0.800 to  3.227    1.395
MOVER-R Wilson                         0.872 to  3.336    1.342
Miettinen-Nurminen asymptotic score    0.810 to  3.363    1.423
Uncorrected asymptotic score           0.811 to  3.358    1.420
Cornfield exact conditional            0.813 to  3.503    1.461
Baptista-Pike exact conditional        0.796 to  3.326    1.430
Cornfield mid-P                        0.813 to  3.503    1.461
Baptista-Pike mid-P                    0.796 to  3.326    1.430
----------------------------------------------------------------

Continuity Adjusted Methods

There are relatively few methods widely available for aligning the minimum coverage with the nominal confidence level. The most versatile option is to use functions from the {ratesci} package, which provides optional continuity adjustments, on a sliding scale from 0 to \(0.5/N\), for any of the Asymptotic Score or MOVER methods for any contrast.

Example code for {ratesci}

# Selected methods for conservative coverage
# Using the conventional 0.5
ratesci::rdci(
  x1 = x[1],
  n1 = n[1],
  x2 = x[2],
  n2 = n[2],
  level = 0.95,
  cc = TRUE,
  precis = 6
)$estimates

, , 36/154 vs 12/77

                          lower      est    upper
SCAS_cc               -0.041280 0.077922 0.185316
Gart-Nam_cc           -0.041003 0.077922 0.185116
Miettinen-Nurminen_cc -0.043077 0.077922 0.183742
Mee_cc                -0.042809 0.077922 0.183535
MOVER-W_cc            -0.043962 0.077922 0.180990
MOVER-J_cc            -0.041446 0.077922 0.181965
Wald_cc               -0.036848 0.077922 0.192692
Hauck-Anderson        -0.034150 0.077922 0.189994

# Using an intermediate adjustment of magnitude 0.25
ratesci::rdci(
  x1 = x[1],
  n1 = n[1],
  x2 = x[2],
  n2 = n[2],
  level = 0.95,
  cc = 0.25,
  precis = 6
)$estimates

, , 36/154 vs 12/77

                                lower      est    upper
SCAS_cc(0.25)               -0.037760 0.077922 0.182197
Gart-Nam_cc(0.25)           -0.037484 0.077922 0.181997
Miettinen-Nurminen_cc(0.25) -0.039572 0.077922 0.180617
Mee_cc(0.25)                -0.039305 0.077922 0.180410
MOVER-W_cc(0.25)            -0.040052 0.077922 0.178057
MOVER-J_cc(0.25)            -0.037510 0.077922 0.178998
Wald_cc(0.25)               -0.031978 0.077922 0.187822

Consistency with hypothesis tests

Test for association

The Asymptotic Score methods for all contrasts are inherently consistent with \(\chi^2\) tests. What may be less widely known is that there is more than one version of the \(\chi^2\) test. The Mee and Koopman methods (without the ‘N-1’ variance correction) are consistent with the Karl Pearson \(\chi^2\) test (as produced by stats::chisq.test() with correct = FALSE, since the default applies a continuity correction to 2×2 tables), while the Miettinen-Nurminen method agrees with the Egon Pearson ‘N-1’ test. (The stratified MN interval agrees with the standard CMH test, which also incorporates the ‘N-1’ adjustment.) Note that the SCAS (with or without ‘N-1’ adjustment) also agrees with the same tests for association, because the skewness correction term is zero when \(\theta_{RD}=0\) or when \(\theta_{RR}\) or \(\theta_{OR}=1\). The ‘N-1’ adjusted \(\chi^2\) test is available in the {ratesci} and {gsDesign} packages.
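The two versions of the \(\chi^2\) test can be compared in base R, as sketched below for the example data. The ‘N-1’ statistic is computed here by rescaling the Karl Pearson statistic by \((N-1)/N\); this is a hand calculation for illustration, not a call to the {ratesci} or {gsDesign} implementations. Note that correct = FALSE is needed because stats::chisq.test() applies a continuity correction to 2×2 tables by default.

```r
# 2x2 table for the example data: rows = treatment, columns = response
tab <- matrix(
  c(36, 118,
    12, 65),
  nrow = 2, byrow = TRUE,
  dimnames = list(trt = c("ACT", "PBO"), resp = c("Yes", "No"))
)
N <- sum(tab)

# Karl Pearson chi-squared test (no continuity correction)
kp <- chisq.test(tab, correct = FALSE)
x2_kp <- unname(kp$statistic)

# Egon Pearson 'N-1' chi-squared test: rescale the statistic by (N - 1) / N
x2_n1 <- x2_kp * (N - 1) / N
p_n1 <- pchisq(x2_n1, df = 1, lower.tail = FALSE)

c(KarlPearson = x2_kp, EgonPearson_N1 = x2_n1)
c(KarlPearson = kp$p.value, EgonPearson_N1 = p_n1)
```

Both p-values exceed 0.05 for these data, in line with the 95% Mee and Miettinen-Nurminen intervals for RD both containing zero.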

Non-inferiority test

One important use for CIs for independent proportions is in the analysis of clinical trials aiming to demonstrate non-inferiority for an outcome such as cure rate or relapse rate. The Asymptotic Score methods are naturally suited for this purpose, as they are derived by inverting a score test statistic. Probably the most well-known named test for such analysis is the Farrington-Manning (FM) test, but it is important to note that the FM formula omits the ‘N-1’ correction factor, so is consistent with the Mee CI, not the Miettinen-Nurminen CI. Non-inferiority tests including the ‘N-1’ adjustment can be obtained from the {ratesci} and {gsDesign} functions.
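The relationship between the FM statistic and its ‘N-1’ adjusted counterpart can be sketched in base R as follows. The margin of -0.10 is a hypothetical value chosen purely for illustration, and a higher response rate is assumed to be the favourable direction. The restricted maximum likelihood estimate is found numerically with stats::optimize() rather than by the closed-form solution, to keep the sketch short.

```r
x1 <- 36; n1 <- 154   # active treatment
x2 <- 12; n2 <- 77    # placebo
N <- n1 + n2
margin <- -0.10       # hypothetical margin; H0: p1 - p2 <= margin

# Restricted MLE under p1 - p2 = margin, maximising the binomial log-likelihood in p2
loglik <- function(p2) {
  dbinom(x1, n1, p2 + margin, log = TRUE) + dbinom(x2, n2, p2, log = TRUE)
}
p2_tilde <- optimize(
  loglik,
  interval = c(max(0, -margin), min(1, 1 - margin)),
  maximum = TRUE, tol = 1e-10
)$maximum
p1_tilde <- p2_tilde + margin

# Score statistic without the 'N-1' correction (Farrington-Manning, matches the Mee CI)
v_fm <- p1_tilde * (1 - p1_tilde) / n1 + p2_tilde * (1 - p2_tilde) / n2
z_fm <- (x1 / n1 - x2 / n2 - margin) / sqrt(v_fm)

# Score statistic with the 'N-1' correction (matches the Miettinen-Nurminen CI)
z_mn <- z_fm * sqrt((N - 1) / N)

# One-sided p-values for non-inferiority
c(FM = pnorm(z_fm, lower.tail = FALSE), MN = pnorm(z_mn, lower.tail = FALSE))
```

The ‘N-1’ version is always slightly less extreme than the FM statistic. Rejecting at the one-sided 2.5% level corresponds to the lower limit of the matching two-sided 95% interval lying above the margin.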

References

1. Laud, P. J. & Dane, A. Confidence intervals for the difference between independent binomial proportions: Comparison using a graphical approach and moving averages. Pharmaceutical Statistics 13, 294–308 (2014).

2. Laud, P. J. Equal-tailed confidence intervals for comparison of rates. Pharmaceutical Statistics 16, 334–348 (2017).

3. Gart, J. J. & Nam, J. Approximate interval estimation of the difference in binomial parameters: Correction for skewness and extension to multiple tables. Biometrics 46, 637 (1990).