Confidence intervals for Proportions in SAS

Introduction

The methods to use for calculating a confidence interval (CI) for a proportion depend on the type of proportion you have.

1 sample proportion (1 proportion calculated from 1 group of subjects)
2 sample proportions and you want a CI for the difference in the 2 proportions.
- If the 2 samples come from 2 independent samples (different subjects in each of the 2 groups)
- If the 2 samples are matched (i.e. the same subject has 2 results, one on each group [paired data]).

The method selected is also dependent on whether your proportion is close to 0 or 1 (or near to the 0.5 midpoint), and your sample size.

For more information about these methods in R & SAS, including which performs better in different scenarios see Five Confidence Intervals for Proportions That You Should Know about¹ and Confidence Intervals for Binomial Proportion Using SAS²

Data used

The adcibc data stored here was used in this example, creating a binary treatment variable trt taking the values of Act or PBO and a binary response variable resp taking the values of Yes or No. For this example, a response is defined as a score greater than 4.

data adcibc2 (keep=trt resp) ;
    set adcibc;     
    if aval gt 4 then resp="Yes";
    else resp="No";     
    if trtp="Placebo" then trt="PBO";
    else trt="Act"; 
run;

The below shows that for the Actual Treatment, there are 36 responders out of 154 subjects = 0.2338 (23.38% responders).

proc freq data=adcibc;
  table trt*resp/ nopct nocol;
run;

Methods for Calculating Confidence Intervals for a single proportion

SAS PROC FREQ in Version 9.4 can compute 11 methods to calculate CIs for a single proportion, an explanation of these methods and the code is shown below. See BINOMIAL³ for more information on SAS parameterization. It is recommended to always sort your data prior to doing a PROC FREQ.

Here we are calculating a 95% confidence interval for the proportion of responders in the active treatment group.

Clopper-Pearson (Exact or binomial CI) Method

With Binary endpoint data (response/non-response), we make the assumption that the proportion of responders, has been derived from a series of Bernoulli trials. Trials (Subjects) are independent and we have a fixed number of repeated trials with an outcome of respond or not respond. This type of data follows the discrete binomial probability distribution, and the Clopper-Pearson⁴ (Exact) method uses this distribution to calculate the CIs. However, for large numbers of trials (subjects), the probability distribution becomes difficult to calculate and hence a variety of approximations were developed all with their pros and cons (depending on your data distribution). This method can also be too conservative, implying that the interval returned is too wide an interval compared to the interval containing the true population proportion 95% of the time.

Clopper-Pearson method is output by SAS as the default method, but you can also specify it using BINOMIAL(LEVEL="Yes" CL=CLOPPERPEARSON);

Normal Approximation Method (Also known as the Wald or asymptotic CI Method)

The most commonly used alternative to the Clopper-Pearson (Exact) method is the asymptotic Normal Approximation (Wald) CI. In large random samples from independent trials, the sampling distribution of proportions approximately follows the normal distribution. The expectation of a sample proportion is the corresponding population proportion. Therefore, based on a sample of size \(n\), a \((1-\alpha)\%\) confidence interval for population proportion can be calculated using normal approximation as follows:

\(p\approx \hat p \pm z_\alpha \sqrt{\hat p(1-\hat p)}/{n}\), where \(\hat p\) is the sample proportion, \(z_\alpha\) is the \(1-\alpha/2\) quantile of a standard normal distribution corresponding to level \(\alpha\), and \(\sqrt{\hat p(1-\hat p)}/{n}\) is the standard error.

One should note that the approximation can become unreliable as the proportion of responders gets close to 0 or 1 (e.g. 0 or 100% responding), and alternative methods may be more suitable. In this scenario, common issues consist of:

it does not respect the 0 and 1 proportion boundary (so you can get a lower CI of -0.1 or an upper CI of 1.1%!)
the derived 95% CI may not cover the true proportion 95% of the time

Wald method can be derived with or without a Yate’s continuity correction. Applying the continuity correction is recommended when you have a small sample size or the estimated proportion is close to the tail ends (0 or 1). Applying Yate’s correction is considered more conservative but it’s not as conservative as Clopper-Pearson approach.

Normal approximation method is output by SAS as the default method, but you can also specify it using BINOMIAL(LEVEL="Yes" CL=WALD);

SAS also produces a continuity correction version of the Wald method, you can specify it using BINOMIAL(LEVEL="Yes" CL=WALD(CORRECT));

Wilson Method (Also known as the Score method or the Newcombe method)⁷

The Wilson (Score) method is an extension to the normal approximation, but commonly used where the proportion is close to 0 or 1 (i.e. 0% or 100% responding). This is because the normal approximation can be unreliable at the tail ends of the distribution, and as such the Wilson method can provide more reliable CIs. The method can be derived with or without a Yate’s continuity correction. Applying the continuity correction is recommended when you have a small sample size or the estimated proportion is close to the tail ends (0 or 1). Applying Yate’s correction is considered more conservative (sometimes too conservative), but it’s not as conservative as Clopper-Pearson approach. Not that the Wilson method is not boundary-respecting so you can get confidence interval <0 or >1.

Let p=r/n, where r= number of responses, and n=number of subjects, q=1-p, and z= the appropriate value from standard normal distribution:
\[ z{_1-\alpha/2} \]For example, for 95% confidence intervals, alpha=0.05, using standard normal tables, z in the equations below will take the value =1.96. Calculate 3 quantities

\[ A= 2r+z^2\]

\[ B=z\sqrt(z^2 + 4rq) \] \[ C=2(n+z^2) \]The method calculates the confidence interval (Low to High) as: (A-B)/C to (A+B)/C

A = 2 * 36 + 1.96^2 = 75.8416

B = 1.96 * sqrt (1.96^2 + 4 x 36 x 0.7662) = 20.9435

C = 2* (154+1.96^2) = 315.6832

Lower interval = A-B/C = 75.8416 - 20.9435 / 315.6832 = 0.17390

Upper interval = A+B/C = 75.8416 + 20.9435 / 315.6832 = 0.30659

CI = 0.17390 to 0.30659

Wilson (score) method is output by SAS using BINOMIAL(LEVEL="Yes" CL=Wilson);

SAS also produces a continuity correction version of the Wilson method, you can specify it using BINOMIAL(LEVEL="Yes" CL=WILSON(CORRECT));

The only differences in the equations to calculate the Wilson score with continuity correction is that the equations for A and B are changed as follows:

\[ A= 2r+z^2 -1\]

\[ B=z\sqrt(z^2 - 2 -\frac{1}{n} + 4rq) \]

Agresti-Coull Method

Agresti-Coull Method is a ‘simple solution’ designed to improve coverage compared to the Wald method and still perform better than Clopper-Pearson particularly when the probability isn’t in the mid-range (0.5). It is less conservative whilst still having good coverage. The only difference compared to the Wald method is that it adds two successes and two failures to the original observations (increasing the sample by 4 observations). In practice it is not often used.

Agresti-Coull method is output by SAS using BINOMIAL(LEVEL="Yes" CL=AGRESTICOULL);

Binomial based MidP Method

The MidP method is similar to the Clopper-Pearson method, but aims to reduce the conservatism. It’s quite a complex method compared to the methods above and rarely used in practice.

MidP method is output by SAS using BINOMIAL(LEVEL="Yes" CL=MIDP);

Jeffreys Method

Jeffreys method is a particular type of Bayesian Highest Probability Density (HPD) Method. For proportions, the beta distribution is generally used for the prior, which consists of two parameters alpha and beta. Setting alpha=beta=0.5 is called Jeffrey’s prior. This is considered as non-informative for a binomial proportion.

\[ (Beta (^k/_2 + ^1/_{2}, ^{(n-k)}/_2+^1/_2)_{\alpha}, Beta (^k/_2 + ^1/_{2}, ^{(n-k)}/_2+^1/_2)_{1-\alpha} \] Jeffreys method is output by SAS using BINOMIAL(LEVEL="Yes" CL=Jeffreys);

Blaker Method⁶

The Blaker method is a less conservative alternative to the Clopper-pearson exact test. It is also an exact method, but derives the CI by inverting the p-value function of an exact test.

The Clopper-pearson CI’s are always wider and contain the Blaker CI limits. It’s adoption has been limited due to the numerical algorithm taking longer to compute compared to some of the other methods especially when the sample size is large. NOTE: Klaschka and Reiczigel⁵ is yet another adaptation of this method.

BLAKER method is output by SAS using BINOMIAL(LEVEL="Yes" CL=BLAKER);

Example Code using PROC FREQ

By adding the option BINOMIAL(LEVEL="Yes") to your ‘PROC FREQ’, SAS outputs the Normal Approximation (Wald) and Clopper-Pearson (Exact) confidence intervals as two default methods, derived for the Responders = Yes. If you do not specify the LEVEL you want to model, then SAS assumes you want to model the first level that appears in the output (alphabetically).

It is very important to ensure you are calculating the CI for the correct level! Check your output to confirm, you will see below it states resp=Yes !

The output consists of the proportion of resp=Yes, the Asymptotic SE, 95% CIs using normal-approximation method, 95% CI using the Clopper-Pearson method (Exact), and then a Binomial test statistic and p-value for the null hypothesis of H0: Proportion = 0.5.

proc sort data=adcibc;
by trt; 
run; 

proc freq data=adcibc ; 
table resp/ nopct nocol BINOMIAL(LEVEL="Yes");
by trt;
run;

By adding the option BINOMIAL(LEVEL="Yes" CL=<name of CI method>), the other CIs are output as shown below. You can list any number of the available methods within the BINOMIAL option CL=XXXX separated by a space. However, SAS will only calculate the WILSON and WALD or the WILSON(CORRECT) and WALD(CORRECT). SAS wont output them both from the same procedure.

BINOMIAL(LEVEL="Yes" CL=CLOPPERPEARSON WALD WILSON AGRESTICOULL JEFFREYS MIDP LIKELIHOODRATIO LOGIT BLAKER) will return Agresti-Coull, BLAKER, Clopper-pearson(Exact), WALD(without continuity correction) WILSON(without continuity correction), JEFFREYS, MIDP, LIKELIHOODRATIO, and LOGIT
BINOMIAL(LEVEL="Yes" CL=ALL); will return Agresti-Coull, Clopper-pearson (Exact), Jeffreys, Wald(without continuity correction), Wilson (without continuity correction)
BINOMIALc(LEVEL="Yes" CL=ALL);will return Agresti-Coull, Clopper-pearson (Exact), Jeffreys, Wald (with continuity correction), Wilson(with continuity correction)
BINOMIALc(LEVEL="Yes" CL=WILSON(CORRECT) WALD(CORRECT));will return Wilson(with continuity correction) and Wald (with continuity correction)

proc freq data=adcibc;
         table resp/ nopct nocol 
                     BINOMIAL(LEVEL="Yes" 
                              CL= CLOPPERPEARSON WALD WILSON 
                                  AGRESTICOULL JEFFREYS MIDP 
                                  LIKELIHOODRATIO LOGIT BLAKER);
  by trt; 
run;

proc freq data=adcibc;
         table resp/ nopct nocol 
                     BINOMIAL(LEVEL="Yes" 
                              CL= WILSON(CORRECT)  WALD(CORRECT));
  by trt; 
run;

SAS output often rounds to 3 or 4 decimal places in the output window, however the full values can be obtained using SAS ODS statements. ods output binomialcls=bcl; and then using the bcl dataset, in a data step to put the variable out to the number of decimal places we require.
10 decimal places shown here ! lowercl2=put(lowercl,12.10);

Methods for Calculating Confidence Intervals for a matched pair proportion

You may experience paired data in any of the following types of situation:

Tumour assesssments classified as Progressive Disease or Not Progressive Disease performed by an Investigator and separately by an independent panel.
A paired case-control study (each subject taking active treatment is matched to a patient taking control)
A cross-over trial where the same subjects take both medications

In all these cases, the calculated proportions for the 2 groups are not independent.

Using a cross over study as our example, a 2 x 2 table can be formed as follows:

	Placebo Response= Yes	Placebo Response = No	Total
Active Response = Yes	r	s	r+s
Active Response = No	t	u	t+u
Total	r+t	s+u	N = r+s+t+u

The proportions of subjects responding on each treatment are:

Active: \(\hat p_1 = (r+s)/n\) and Placebo: \(\hat p_2= (r+t)/n\)

Difference between the proportions for each treatment are: \(D=p1-p2=(s-t)/n\)

Suppose :

	Placebo Response= Yes	Placebo Response = No	Total
Active Response = Yes	r = 20	s = 15	r+s = 35
Active Response = No	t = 6	u = 5	t+u = 11
Total	r+t = 26	s+u = 20	N = r+s+t+u = 46

Normal Approximation Method (Also known as the Wald or asymptotic CI Method)

In large random samples from independent trials, the sampling distribution of the difference between two proportions approximately follows the normal distribution. Hence the SE for the difference and 95% confidence interval can be calculated using the following equations.

\(SE(D)=\frac{1}{n} * sqrt(s+t-\frac{(s-t)^2}{n})\)

\(D-z_\alpha * SE(D)\) to \(D+z_\alpha * SE(D)\)

where \(z_\alpha\) is the \(1-\alpha/2\) quantile of a standard normal distribution corresponding to level \(\alpha\),

D=(15-6) /46 = 0.196

SE(D) = 1/ 46 * sqrt (15+6- (((15+6)^2)/46) ) = 0.0956

Lower CI= 0.196 - 1.96 *0.0956 = 0.009108

Upper CI = 0.196 + 1.96 * 0.0956 = 0.382892

Wilson Method (Also known as the Score method or the Altman, Newcombe method⁷ )

Derive the confidence intervals using the Wilson Method equations above for each of the individual single samples 1 and 2.

Let l1 = Lower CI for sample 1, and u1 be the upper CI for sample 1.

Let l2 = Lower CI for sample 2, and u2 be the upper CI for sample 2.

We then define \(\phi\) which is used to correct for \(\hat p_1\) and \(\hat p_2\) not being independent. As the samples are related, \(\phi\) is usually positive and thus makes the confidence interval smaller (narrower).

If any of r+s, t+u, r+t, s+u are zero, then set \(\phi\) to be 0.

Otherwise we calculate A, B and C, and \(\phi=C / sqrt A\)

In the above: \(A=(r+s)(t+u)(r+t)(s+u)\) and \(B=(ru-st)\)

To calculate C follow the table below. n=sample size.

Condition of B	Set C equal to
If B is greater than n/2	B - n/2
If B is between 0 and n/2	0
If B is less than 0	B

Let D = p1-p2 (the difference between the observed proportions of responders)

The Confidence interval for the difference between two population proportions is: \(D - sqrt((p_1-l_1)^2-2\phi(p_1-l_1)(u_2-p_2)+(u_2-p_2)^2 )\) to

\(D + sqrt((p_2-l_2)^2-2\phi(p_2-l_2)(u_1-p_1)+(u_1-p_1)^2 )\)

First using the Wilson Method equations for each of the individual single samples 1 and 2.

	Active	Placebo
a	73.842	55.842
b	13.728	11.974
c	99.683	99.683
Lower CI	0.603 = L1	0.440 = L2
Upper CI	0.878 = U1	0.680 = U2

\(A=(r+s)(t+u)(r+t)(s+u)\) = 9450000

B=10

C= 0 (as B is between 0 and n/2)

\(\phi\) = 0.

Hence the middle part of the equation simplies to 0, and becomes simply:

Lower CI = \(D - sqrt((p_1-l_1)^2+(u_2-p_2)^2 )\) = 0.196 - sqrt [ (0.761-0.603)^2 + (0.680-0.565) ^2 ]

Upper CI = \(D + sqrt((p_2-l_2)^2+(u_1-p_1)^2 )\) = 0.196 + sqrt [ (0.565-0.440)^2 + (0.878-0.761) ^2 ]

CI= 0.00032 to 0.367389

Example Code using PROC FREQ

Unfortunately, SAS does not have a procedure which outputs the confidence intervals for matched proportions.

Instead it calculates the risk difference = (r / (r+s) - t / (t+u) )

Which in the example above is: 20/ (20+15) - 6 / (6+5) = 0.0296

This is more applicable when you have exposed / non-exposed groups looking at who has experienced the outcome. SAS Proc Freq has 3 methods for analysis of paired data using a Common risk difference.

The default method is Mantel-Haenszel confidence limits. SAS can also Score (Miettinen-Nurminen) CIs and Stratified Newcombe CIs (constructed from stratified Wilson Score CIs). See here for equations.

As you can see below, this is not a CI for difference in proportions of 0.196, it is a CI for the risk difference of 0.0260. So must be interpreted with much consideration.

proc freq data=adcibc; 
table act*pbo/commonriskdiff(cl=MH); 
run;

Methods for Calculating Confidence Intervals for 2 independent samples proportion

This paper⁸ describes many methods for the calculation of confidence intervals for 2 independent proportions. The most commonly used are: Wald with continuity correction and Wilson with continuity correction. The Wilson method may be more applicable when sample sizes are smaller and/or the proportion is closer to 0 or 1, since reliance on a normal distribution may be inappropriate.

SAS is able to calculate CIs using the following methods: Exact, Agresti/Caffo, Wald, Wald with Continuity correction, Wilson/Newcombe Score, Wilson/Newcombe score with continuity correction, Miettinen and Nurminen, and Hauck-Anderson. Example code is shown below, before provision of a much more detailed analysis using Wald and Wilson with and without continuity correction.

# Wald, Wilson, Agresti/Caffo, Hauck-Anderson, and Miettinen and Nurminen methods
proc freq data=adcibc order=data;
table trt*resp /riskdiff(CL=(wald wilson ac ha mn ));
run;

# exact method
proc freq data=adcibc order=data;
exact riskdiff (method=noscore);
table trt*resp/riskdiff(CL=(exact)); 
run;

Normal Approximation Method (Also known as the Wald or asymptotic CI Method)

In large random samples from independent trials, the sampling distribution of the difference between two proportions approximately follows the normal distribution.

The difference between two independent sample proportions is calculated as: \(D= \hat p_1-\hat p_2\)

A confidence interval for the difference between two independent proportions \(D\) can be calculated using:

\(D\approx \hat D \pm z_\alpha * SE(\hat p)\),

where \(z_\alpha\) is the \(1-\alpha/2\) quantile of a standard normal distribution corresponding to level \(\alpha\), and

\(SE (\hat p) = sqrt{( \frac{\hat p_1 (1-\hat p_1)}{n_1} + \frac{\hat p_2 (1-\hat p_2)}{n_2})}\)

With continuity correction, the equation becomes

\(D\approx \hat D \pm (CC + z_\alpha * SE(\hat p))\),

where \(z_\alpha\) is the \(1-\alpha/2\) quantile of a standard normal distribution corresponding to level \(\alpha\),
and \(SE (\hat p)\) = \(sqrt{( \frac{\hat p_1 (1-\hat p_1)}{n_1} + \frac{\hat p_2 (1-\hat p_2)}{n_2})}\)
and

\(CC = \frac{1}{2} (\frac{1}{n_1} + \frac{1}{n_2})\)

Wilson Method (Also known as the Score method or the Altman, Newcombe method⁷ )

Derive the confidence intervals using the Wilson Method equations above for each of the individual single samples 1 and 2.

Let l1 = Lower CI for sample 1, and u1 be the upper CI for sample 1.

Let l2 = Lower CI for sample 2, and u2 be the upper CI for sample 2.

Let D = p1-p2 (the difference between the observed proportions)

The Confidence interval for the difference between two population proportions is: \[ D - sqrt((p_1-l_1)^2+(u_2-p_2)^2) \quad to\quad D + sqrt((p_2-l_2)^2+(u_1-p_1)^2 ) \]

Example Code using PROC FREQ

It is important to check the output to ensure that you are modelling Active - Placebo, and response = Yes (not Response=No). By default SAS sorts alphabetically and calculates CI’s for the first column. You can change this by using the COLUMN= Option on riskdiff or by sorting the dataset (here by trt, then descending resp), and then using order=data in the proc freq. This tells SAS to use the order you have sorted the data by. SAS confirms this by saying “Difference is (Row 1 - Row 2)” and “Column 1 (resp=Yes)”. Note how in the SAS output, it calls the requested ‘wilson’ method ‘Newcombe’ in the output.

Options for riskdiff(CL=XXX) consist of AC:Agresti-Caffo, EXACT=exact, HA:Hauck-Anderson, MN or SCORE:Miettinen-Nurminen (another type of Score CI), WILSON or NEWCOMBE: Wilson method described above, and WALD: normal approximation wald method described above. Examples using Wald and Wilson are shown below with and without continuity correction.

proc sort data=adcibc;
 by  trt descending resp;
run;

#without continuity correction
proc freq data=adcibc order=data;
  table trt*resp/riskdiff(CL=(wald wilson)); 
run;

#with continuity correction

proc freq data=adcibc order=data;
  table trt*resp/riskdiff(CORRECT CL=(wald wilson)); 
run;

Reference

Five Confidence Intervals for Proportions That You Should Know about
Confidence intervals for Binomial Proportion Using SAS
SAS PROC FREQ here and here
Clopper,C.J.,and Pearson,E.S.(1934),“The Use of Confidence or Fiducial Limits Illustrated in the Case of the Binomial”, Biometrika 26, 404–413.
Klaschka, J. and Reiczigel, J. (2021). “On matching confidence intervals and tests for some discrete distributions: Methodological and computational aspects,” Computational Statistics, 36, 1775–1790.
Blaker, H. (2000). Confidence curves and improved exact confidence intervals for discrete distributions, Canadian Journal of Statistics 28 (4), 783–798
D. Altman, D. Machin, T. Bryant, M. Gardner (eds). Statistics with Confidence: Confidence Intervals and Statistical Guidelines, 2nd edition. John Wiley and Sons 2000.
https://www.lexjansen.com/wuss/2016/127_Final_Paper_PDF.pdf

Introduction

Data used

Methods for Calculating Confidence Intervals for a single proportion

Clopper-Pearson (Exact or binomial CI) Method

Normal Approximation Method (Also known as the Wald or asymptotic CI Method)

Wilson Method (Also known as the Score method or the Newcombe method)7

Agresti-Coull Method

Binomial based MidP Method

Jeffreys Method

Blaker Method6

Example Code using PROC FREQ

Methods for Calculating Confidence Intervals for a matched pair proportion

Normal Approximation Method (Also known as the Wald or asymptotic CI Method)

Wilson Method (Also known as the Score method or the Altman, Newcombe method7 )

Example Code using PROC FREQ

Methods for Calculating Confidence Intervals for 2 independent samples proportion

Normal Approximation Method (Also known as the Wald or asymptotic CI Method)

Wilson Method (Also known as the Score method or the Altman, Newcombe method7 )

Example Code using PROC FREQ

Reference

Wilson Method (Also known as the Score method or the Newcombe method)⁷

Blaker Method⁶

Wilson Method (Also known as the Score method or the Altman, Newcombe method⁷ )

Wilson Method (Also known as the Score method or the Altman, Newcombe method⁷ )