Confidence Intervals for Proportions in R

Introduction

There are several ways to calculate a confidence interval (CI) for a proportion. You need to select the method based on if you have a 1 sample proportion (1 proportion calculated from 1 group of subjects), or if you have 2 samples and you want a CI for the difference in the 2 proportions. The difference in proportion can come from either 2 independent samples (different subjects in each of the 2 groups), or can be matched (the same subject with 1 result in 1 group and 1 result in the other group [paired data]).

The method selected is also dependent on whether your proportion is close to 0 or 1 (or near to the 0.5 midpoint), and your sample size.

For more information about these methods, including which performs better in different scenarios see Five Confidence Intervals for Proportions That You Should Know about1.

Data used

The adcibc data stored here was used in this example, creating a binary treatment variable trt taking the values of ACT or PBO and a binary response variable resp taking the values of Yes or No. For this example, a response is defined as a score greater than 4.

The below shows that for the Actual Treatment, there are 36 responders out of 154 subjects = 0.234 (23.4% responders).

# A tibble: 4 × 3
# Groups:   trt [2]
  trt   resp      n
  <chr> <chr> <int>
1 ACT   No      118
2 ACT   Yes      36
3 PBO   No       65
4 PBO   Yes      12

Packages

The {cardx} package is an extension of the {cards} package, providing additional functions to create Analysis Results Data Objects (ARDs)1. It was developed as part of {NEST} and pharmaverse. This package requires the binary endpoint to be a logical (TRUE/FALSE) vector or a numeric/integer coded as (0, 1) with 1 (TRUE) being the success you want to calculate the confidence interval for.

If calculating the CI for a difference in proportions, the package requires both the response and the treatment variable to be numeric/integer coded as (0, 1) (or logical vector).

Instead of the code presented below, you can use ard_categorical_ci(data, variables=resp, method ='wilson') for example. This invokes the code below but returns an analysis results dataset format as the output. Methods included are waldcc, wald, clopper-pearson, wilson, wilsoncc, strat_wilson, strat_wilsoncc, agresti-coull and jeffreys.

Code example: proportion_ci_clopper_pearson(<resp_var>,conf.level=0.95) %>% as_tibble()

Example data format needed for {cardx} for a single proportion CI

 [1] 0 0 0 0 0 0 0 0 1 1 0 0 1 1 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0

The {PropCIs} package produces CIs for methods such as Blaker’s exact method and Midp which aren’t available in {cardx} but are available in SAS. We found results agreed with SAS to the 5th decimal place. The package also calculates CIs for Clopper-Pearson, Wald, Wilson, Agresti-coull and these align to results obtained in cardx to at least the 7th decimal place. The {PropsCIs} package requires just the number of events (numerator number of successes) & total number of subjects (denominator) as an input dataset. Given Blaker and Midp are rarely used in practice, and {PropsCIs} isn’t a package commonly downloaded from CRAN, further detail is not provided here.

Code example for Clopper-pearson:
exactci(x=<count of successes> , n=<Total>, conf.level=0.95)

Code example for Mid P method:
midPci(x=<count of successes> , n=<Total>, conf.level=0.95)

Code example for Blaker’s exact method:
blakerci(x=<count of successes> , n=<Total>, conf.level=0.95, tolerance=1e-05)

Methods for Calculating Confidence Intervals for a single proportion using cardx

For more technical derivation and reasons for use of each of the methods listed below, see the corresponding SAS page.

Let’s start by calculating a Confidence interval for the proportion of successes observed in the Active Treatment group (a single sample).

Clopper-Pearson (Exact or binomial CI) Method

Clopper-Pearson Exact CI is one of the most popular methods, it is often good for small sample sizes when the proportion is not close to the tails (0,1), but it can be too conservative (too wide an interval compared to the interval containing the true population proportion 95% of the time).

The cardx package calculates the Clopper-Pearson score by calling stats::binom.test() function.

proportion_ci_clopper_pearson(act2,conf.level=0.95) %>% 
  as_tibble()
# A tibble: 1 × 10
      N conf.level estimate statistic  p.value parameter conf.low conf.high
  <int>      <dbl>    <dbl>     <dbl>    <dbl>     <dbl>    <dbl>     <dbl>
1   154       0.95    0.234        36 2.21e-11       154    0.169     0.309
# ℹ 2 more variables: method <chr>, alternative <chr>

Normal Approximation Method (Also known as the Wald or asymptotic CI Method)

In large random samples from independent trials, the sampling distribution of proportions approximately follows the normal distribution. The expectation of a sample proportion is the corresponding population proportion. Therefore, based on a sample of size \(n\), a \((1-\alpha)\%\) confidence interval for population proportion can be calculated using normal approximation as follows:

\(p\approx \hat p \pm z_\alpha \sqrt{\hat p(1-\hat p)}/{n}\), where \(\hat p\) is the sample proportion, \(z_\alpha\) is the \(1-\alpha/2\) quantile of a standard normal distribution corresponding to level \(\alpha\), and \(\sqrt{\hat p(1-\hat p)}/{n}\) is the standard error.

For more technical information see the corresponding SAS page.

Example code

The following code calculates a confidence interval for a binomial proportion using normal approximation equation manually. This is replicated exactly using the cardx::proportion_ci_wald function which also allows the continuity correction to be applied.

    # sample proportion by trt
summary <- adcibc %>% 
           filter(trt=="ACT") %>% 
           group_by(resp) %>% 
           tally()  %>% 
           ungroup() %>% 
           mutate(total=sum(n)) %>% 
           mutate(p=n/total)

    # Calculate standard error and 95% wald confidence intervals for population proportion
waldci <-summary %>% 
         filter(resp=="Yes") %>% 
         mutate(se=sqrt(p*(1-p)/total)) %>% 
         mutate(lower_ci=(p-qnorm(1-0.05/2)*se)) %>% 
         mutate(upper_ci=(p+qnorm(1-0.05/2)*se)) 
waldci  
# A tibble: 1 × 7
  resp      n total     p     se lower_ci upper_ci
  <chr> <int> <int> <dbl>  <dbl>    <dbl>    <dbl>
1 Yes      36   154 0.234 0.0341    0.167    0.301
#cardx package Wald method without continuity correction
proportion_ci_wald(act2,conf.level=0.95,correct=FALSE) %>% 
  as_tibble()
# A tibble: 1 × 6
      N estimate conf.low conf.high conf.level method                           
  <int>    <dbl>    <dbl>     <dbl>      <dbl> <glue>                           
1   154    0.234    0.167     0.301       0.95 Wald Confidence Interval without…
#cardx package Wald method with continuity correction
proportion_ci_wald(act2,conf.level=0.95,correct=TRUE) %>% 
  as_tibble()
# A tibble: 1 × 6
      N estimate conf.low conf.high conf.level method                           
  <int>    <dbl>    <dbl>     <dbl>      <dbl> <glue>                           
1   154    0.234    0.164     0.304       0.95 Wald Confidence Interval with co…

Wilson Method (Also known as the Score method or the Altman, Newcombe method3 )

The cardx package calculates the Wilson (score) method by calling stats::prop.test() function. This method is often used as a compromise between the Clopper-Pearson and the Wald given it was found to be accurate for most parameter values (even those close to 0 and 1), and it does not suffer from being over-conservative. For more technical information see the corresponding SAS page.

The package also contains a function for proportion_ci_strat_wilson() which calculates the stratified Wilson CIs for unequal proportions as described on page 47 here.

#cardx package Wilson method without continuity correction
proportion_ci_wilson(act2,conf.level=0.95,correct=FALSE) %>% 
  as_tibble()
# A tibble: 1 × 10
      N conf.level estimate statistic  p.value parameter conf.low conf.high
  <int>      <dbl>    <dbl>     <dbl>    <dbl>     <int>    <dbl>     <dbl>
1   154       0.95    0.234      43.7 3.90e-11         1    0.174     0.307
# ℹ 2 more variables: method <glue>, alternative <chr>
#cardx package Wilson method with continuity correction
proportion_ci_wilson(act2,conf.level=0.95,correct=TRUE) %>% 
  as_tibble()
# A tibble: 1 × 10
      N conf.level estimate statistic  p.value parameter conf.low conf.high
  <int>      <dbl>    <dbl>     <dbl>    <dbl>     <int>    <dbl>     <dbl>
1   154       0.95    0.234      42.6 6.70e-11         1    0.171     0.310
# ℹ 2 more variables: method <glue>, alternative <chr>

Agresti-Coull Method

The cardx package calculates the Agresti-Coull method using the equation from the published method by Alan Agresti & Brent Coull based on adding 2 successes and 2 failures before computing the wald CI. The CI is truncated, when it overshoots the boundary (<0 or >1).

#cardx package agresti_coull method 
proportion_ci_agresti_coull(act2,conf.level=0.95) %>% 
  as_tibble()
# A tibble: 1 × 6
      N estimate conf.low conf.high conf.level method                           
  <int>    <dbl>    <dbl>     <dbl>      <dbl> <chr>                            
1   154    0.234    0.174     0.307       0.95 Agresti-Coull Confidence Interval

Jeffreys Method

Jeffreys method is a particular type of Bayesian Highest Probability Density (HPD) Method. For proportions, the beta distribution is generally used for the prior, which consists of two parameters alpha and beta. Setting alpha=beta=0.5 is called Jeffrey’s prior. NOTE: if you want to use any other priors, you can use binom.bayes which estimates a credible interval for proportions.

#cardx package jeffreys method 
proportion_ci_jeffreys(act2,conf.level=0.95) %>% 
  as_tibble()
# A tibble: 1 × 6
      N estimate conf.low conf.high conf.level method           
  <int>    <dbl>    <dbl>     <dbl>      <dbl> <glue>           
1   154    0.234    0.172     0.305       0.95 Jeffreys Interval

Methods for Calculating Confidence Intervals for a matched pair proportion

When you have 2 measurements on the same subject, the 2 sets of measures are not independent and you have matched pair of responses.

This section is work in progress. For more information about methods for calculating confidence intervals for a matched pair proportion see here

Methods for Calculating Confidence Intervals for 2 independent samples proportion using {cardx}

This paper4 describes many methods for the calculation of confidence intervals for 2 independent proportions. The most commonly used are: Wald with continuity correction and the Wilson with continuity correction.

Normal Approximation Method (Also known as the Wald or asymptotic CI Method)

For more technical information see the corresponding SAS page.

Example code

cardx::ard_stats_prop_test function uses stats::prop.test which also allows a continuity correction to be applied. More research is needed into this method.

Although this website here and this one here both reference Newcombe for the CI that this function uses, replication of the results by hand and compared to SAS show that the results below match the Normal Approximation (Wald method) not the Newcome method? Further research is needed into this topic.

Both the Treatment variable (ACT,PBO) and the Response variable (Yes,No) have to be numeric (0,1) or Logit (TRUE,FALSE) variables.

adcibc2<-select(adcibc,trtn,respn)
cardx::ard_stats_prop_test(data=adcibc2, by=trtn, variables=respn, conf.level = 0.95, correct=FALSE) 
{cards} data frame: 13 x 9
   group1 variable   context   stat_name stat_label      stat
1    trtn    respn stats_pr…    estimate  Rate Dif…    -0.078
2    trtn    respn stats_pr…   estimate1  Group 1 …     0.156
3    trtn    respn stats_pr…   estimate2  Group 2 …     0.234
4    trtn    respn stats_pr…   statistic  X-square…     1.893
5    trtn    respn stats_pr…     p.value    p-value     0.169
6    trtn    respn stats_pr…   parameter  Degrees …         1
7    trtn    respn stats_pr…    conf.low  CI Lower…    -0.183
8    trtn    respn stats_pr…   conf.high  CI Upper…     0.027
9    trtn    respn stats_pr…      method     method 2-sample…
10   trtn    respn stats_pr… alternative  alternat… two.sided
11   trtn    respn stats_pr…           p          p          
12   trtn    respn stats_pr…  conf.level  CI Confi…      0.95
13   trtn    respn stats_pr…     correct  Yates' c…     FALSE
ℹ 3 more variables: fmt_fn, warning, error
cardx::ard_stats_prop_test(data=adcibc2, by=trtn, variables=respn, conf.level = 0.95, correct=TRUE) 
{cards} data frame: 13 x 9
   group1 variable   context   stat_name stat_label      stat
1    trtn    respn stats_pr…    estimate  Rate Dif…    -0.078
2    trtn    respn stats_pr…   estimate1  Group 1 …     0.156
3    trtn    respn stats_pr…   estimate2  Group 2 …     0.234
4    trtn    respn stats_pr…   statistic  X-square…      1.45
5    trtn    respn stats_pr…     p.value    p-value     0.229
6    trtn    respn stats_pr…   parameter  Degrees …         1
7    trtn    respn stats_pr…    conf.low  CI Lower…    -0.193
8    trtn    respn stats_pr…   conf.high  CI Upper…     0.037
9    trtn    respn stats_pr…      method     method 2-sample…
10   trtn    respn stats_pr… alternative  alternat… two.sided
11   trtn    respn stats_pr…           p          p          
12   trtn    respn stats_pr…  conf.level  CI Confi…      0.95
13   trtn    respn stats_pr…     correct  Yates' c…      TRUE
ℹ 3 more variables: fmt_fn, warning, error

Wilson Method (Also known as the Score method or the Altman, Newcombe method3 )

For more technical information see the corresponding SAS page.

Further research is needed into this topic.

References

  1. pharmaverse cardx package
  2. PropCIs package
  3. D. Altman, D. Machin, T. Bryant, M. Gardner (eds). Statistics with Confidence: Confidence Intervals and Statistical Guidelines, 2nd edition. John Wiley and Sons 2000.
  4. https://www.lexjansen.com/wuss/2016/127_Final_Paper_PDF.pdf