Binomial Test Comparison.

One Sample Proportion

The following table shows the types of Binomial test analysis, the capabilities of each language, and whether or not the results from each language match.

Analysis	Supported in SAS	Supported in R	Match	Notes
Exact Binomial Test	Yes	Yes	Yes	Performed in base R with `binom.test()`, which runs an exact test of a single proportion based on the binomial distribution. SAS uses `PROC FREQ` with the binomial option, with `level` defining the success category. The two-sided p-value depends on the null proportion: when the null distribution is not symmetric (\(p \neq 0.5\)), R and SAS define the two-sided p-value differently and the results can differ. R reports a two-sided p-value by default; SAS reports both one-sided and two-sided p-values.
Asymptotic Binomial Test (Wald test for proportion)	Yes	Yes	Yes	Base R (see `library(help = "stats")`) has no function for the one-sample Wald test of a binomial proportion. The test can be computed manually from the z-statistic formula; `BinomCI()` from the `DescTools` package gives the confidence interval but does not perform the formal hypothesis test. In SAS, `PROC FREQ` with the binomial option produces the test by default. To apply a continuity correction, use the `correct` option, and use `level` to define the success category.
Mid-P adjusted Exact Binomial Test	Yes	Yes	Yes	Not implemented in base R; use the `exactci` package for a one-sample proportion. SAS uses `PROC FREQ` with the `EXACT BINOMIAL / MIDP` statement. The mid-p binomial test is less conservative than the exact binomial test.
Wilson score test	Yes	Yes	Yes	Implemented in base R by the built-in function `prop.test()`, which can perform both one- and two-sample tests of proportions. In SAS, it is implemented using `PROC FREQ` with the binomial option and `CL=SCORE` for the confidence interval.
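
The manual z-statistic calculation mentioned for the asymptotic test can be sketched as follows. This is a minimal illustration in Python (not the R or SAS code used for the comparison); the function names and the counts (520 successes in 1000 trials) are hypothetical, and whether the standard error uses the null proportion or the sample proportion is exposed as a flag because that choice should be checked against each software's documentation.

```python
import math

def normal_cdf(z):
    """Standard normal CDF computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def asymptotic_binomial_test(x, n, p0, null_variance=True):
    """z-test for one proportion.

    null_variance=True puts p0 in the standard error;
    null_variance=False uses the sample proportion instead.
    """
    p_hat = x / n
    p_for_se = p0 if null_variance else p_hat
    se = math.sqrt(p_for_se * (1.0 - p_for_se) / n)
    z = (p_hat - p0) / se
    left = normal_cdf(z)           # P(Z <= z)
    right = 1.0 - normal_cdf(z)    # P(Z >= z)
    two_sided = 2.0 * min(left, right)
    return {"z": z, "left": left, "right": right, "two_sided": two_sided}

# Hypothetical counts for illustration only: 520 successes in 1000 trials.
result = asymptotic_binomial_test(520, 1000, 0.5)
print(result)
```

Because the test is computed by hand, the left-sided, right-sided and two-sided p-values are all available from the same z statistic.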

Comparison Results

Here is a table of comparison values between binom.test() and SAS PROC FREQ with binomial option:

Binomial Test on coin flips.

\(H_0 : p = 0.5\)

Test	Statistic	binom.test()	PROC FREQ with binomial option	Match	Notes
Exact Binomial Test (Clopper-Pearson)	Probability of success	0.52	0.52	Yes	Binomial distribution
	Confidence interval	Lower: 0.4885 Upper: 0.5513	Lower: 0.4885 Upper: 0.5514	Yes	Constructed by inverting the exact binomial test, with interval bounds obtained from beta distribution quantiles.
	p-value	0.2174	0.2174	Yes	This is a two-sided p-value. R reports a two-sided p-value by default; SAS reports both one-sided and two-sided p-values. The p-values match here because the null distribution is symmetric (p = 0.5). When the null proportion is not 0.5, the two-sided p-values from R and SAS can differ.
Asymptotic Binomial Test (Wald test)	Probability of success	0.52	0.52	Yes
	Confidence interval	Lower: 0.4890 Upper: 0.5510	Lower: 0.4890 Upper: 0.5510	Yes	Uses the normal approximation to the binomial distribution for confidence intervals and hypothesis tests.
	p-value	0.2059	0.2059	Yes	This is a two-sided p-value. Because the Wald test is computed manually in R from the z-statistic formula, the p-value can be calculated for either tail. SAS reports both two-sided and right-sided p-values.
Mid-P adjusted Exact Binomial Test	Probability of success	0.52	0.52	Yes	Binomial distribution
	Confidence interval	Lower: 0.4890 Upper: 0.5509	Lower: 0.4890 Upper: 0.5509	Yes
	p-value	Right sided: 0.1031 Two sided: 0.2061 Left sided: 0.8969	Exact one sided: 0.1031	Yes	SAS reports the right-sided p-value by default; R reports the two-sided p-value by default. Requesting a right-sided test in R reproduces the SAS value, because both use the same mid-p definition.
Wilson Score Test	Probability of success	0.52	0.52	Yes
	Confidence interval	Lower: 0.4890 Upper: 0.5508	Lower: 0.4890 Upper: 0.5508	Yes	Obtained by inverting the score (Pearson chi-square) test for a single binomial proportion: the interval is the set of null values that Pearson's chi-square score test does not reject. It remains usable for extreme data, for instance when the number of successes x = 0, where the Wald interval collapses to [0, 0].
	p-value	0.2059	0.2059	Yes	This is a two sided p-value. R generates two sided p-value by default. SAS generates both two sided and one sided p-value.

Binomial Test with Clinical Trial Data.

\(H_0 : p = 0.19\)

Test	Statistic	binom.test()	PROC FREQ with binomial option	Match	Notes
Exact Binomial Test (Clopper-Pearson)	Probability of success	0.2763	0.2763	Yes	Binomial distribution
	Confidence interval	Lower: 0.2193 Upper: 0.3392	Lower: 0.2193 Upper: 0.3392	Yes	Constructed by inverting the exact binomial test, with interval bounds obtained from beta distribution quantiles.
	p-value	0.0017	0.0019	No	R gives a p-value of 0.00168288 compared with 0.0019 from SAS. The two compute the two-sided p-value differently, and the results agree only when the null distribution is symmetric (null proportion 0.5). `PROC FREQ` computes the two-sided p-value as the sum of the one-sided p-value and the corresponding area in the opposite tail of the distribution of the statistic, equidistant from the expected value. In practice it obtains the lower and upper tail probabilities \(P(X \leq x)\) and \(P(X \geq x)\) and doubles the smaller of the two. In contrast, `binom.test()` sums the probabilities of all outcomes, in either tail, whose probability is \(\leq\) the probability of the observed outcome.
Asymptotic Binomial Test (Wald Test)	Probability of success	0.2763	0.2763	Yes
	Confidence interval	Lower: 0.2183 Upper: 0.3344	Lower: 0.2183 Upper: 0.3344	Yes	Use normal approximation to the binomial distribution for confidence intervals and hypothesis tests.
	p-value	0.0009	0.0009	Yes	Two-sided p-value. Because the Wald test is computed manually in R from the z-statistic formula, the p-value can be calculated for either tail. SAS reports both two-sided and right-sided p-values.
Mid-P adjusted Exact Binomial Test	Probability of success	0.2763	0.2763	Yes	Binomial distribution
	Confidence interval	Lower: 0.2212 Upper: 0.3371	Lower: 0.2212 Upper: 0.3371	Yes
	p-value	One-tailed upper: 0.0008 Two-tailed: 0.0015 One-tailed lower: 0.9992	Exact one-sided: 0.0008	Yes	SAS reports the right-sided p-value by default; R reports the two-sided p-value by default. Requesting a right-sided test in R reproduces the SAS value, because both use the same mid-p definition.
Wilson Score Test	Probability of success	0.2763	0.2763	Yes
	Confidence interval	Lower: 0.2223 Upper: 0.3377	Lower: 0.2223 Upper: 0.3377	Yes	Obtained by inverting the score (Pearson chi-square) test for a single binomial proportion: the interval is the set of null values that Pearson's chi-square score test does not reject. It remains usable for extreme data, for instance when the number of successes x = 0, where the Wald interval collapses to [0, 0].
	p-value	0.0009	0.0009	Yes	Two-sided p-value. R runs a two-sided test by default. SAS reports both two-sided and right-sided p-values.
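
The two definitions of the two-sided exact p-value described in the notes (summing all outcomes no more likely than the observed one, versus doubling the smaller tail) can be compared with a short sketch. This is an illustration in Python under stated assumptions, not the R or SAS implementation: the function names and the counts (7 successes in 20 trials) are hypothetical, and the small relative tolerance for ties is an assumption made to guard against floating-point error.

```python
import math

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return math.comb(n, k) * p**k * (1.0 - p)**(n - k)

def p_two_sided_min_likelihood(x, n, p0):
    """Sum the probabilities of all outcomes no more likely than x."""
    threshold = binom_pmf(x, n, p0) * (1.0 + 1e-7)  # tolerance for ties
    total = 0.0
    for k in range(n + 1):
        pk = binom_pmf(k, n, p0)
        if pk <= threshold:
            total += pk
    return min(1.0, total)

def p_two_sided_doubled_tail(x, n, p0):
    """Double the smaller of P(X <= x) and P(X >= x)."""
    lower = sum(binom_pmf(k, n, p0) for k in range(0, x + 1))
    upper = sum(binom_pmf(k, n, p0) for k in range(x, n + 1))
    return min(1.0, 2.0 * min(lower, upper))

# Hypothetical data: 7 successes in 20 trials.
sym_a = p_two_sided_min_likelihood(7, 20, 0.5)
sym_b = p_two_sided_doubled_tail(7, 20, 0.5)
asym_a = p_two_sided_min_likelihood(7, 20, 0.19)
asym_b = p_two_sided_doubled_tail(7, 20, 0.19)
print(f"p0 = 0.50: {sym_a:.6f} vs {sym_b:.6f}")
print(f"p0 = 0.19: {asym_a:.6f} vs {asym_b:.6f}")
```

With a null proportion of 0.5 the distribution is symmetric and the two definitions coincide; with a null proportion of 0.19 they select different sets of outcomes in the opposite tail and give different p-values.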

Summary and Recommendation

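Test results for the two example datasets agree between R and SAS in every case except the two-sided exact binomial p-value for the clinical trial data, where the null proportion is not 0.5 and the two software packages define the two-sided p-value differently. For the exact binomial test, both `binom.test()` and `PROC FREQ` with the binomial option provide Clopper-Pearson confidence intervals. The default alternative is two-sided, and the exact test is based on the binomial distribution.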
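
To make the idea of inverting the exact test concrete, the sketch below finds Clopper-Pearson limits by solving the two tail equations numerically with bisection. It is an illustrative Python version, not how `binom.test()` or `PROC FREQ` compute the limits (the tables above note that they use beta distribution quantiles); the function names and the example counts are hypothetical.

```python
import math

def binom_cdf(x, n, p):
    """P(X <= x) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, k) * p**k * (1.0 - p)**(n - k)
               for k in range(x + 1))

def bisect_root(f, lo, hi, iterations=200):
    """Root of an increasing function f on [lo, hi], with f(lo) < 0 < f(hi)."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def clopper_pearson(x, n, conf_level=0.95):
    """Exact central interval obtained by inverting two one-sided tests."""
    alpha = 1.0 - conf_level
    if x == 0:
        lower = 0.0
    else:
        # Lower limit: the p at which P(X >= x | p) equals alpha / 2.
        lower = bisect_root(
            lambda p: (1.0 - binom_cdf(x - 1, n, p)) - alpha / 2, 0.0, 1.0)
    if x == n:
        upper = 1.0
    else:
        # Upper limit: the p at which P(X <= x | p) equals alpha / 2.
        upper = bisect_root(
            lambda p: alpha / 2 - binom_cdf(x, n, p), 0.0, 1.0)
    return lower, upper

# Hypothetical data for illustration.
lower, upper = clopper_pearson(7, 20)
zero_lower, zero_upper = clopper_pearson(0, 10)
print(f"7/20: [{lower:.4f}, {upper:.4f}]")
print(f"0/10: [{zero_lower:.4f}, {zero_upper:.4f}]")
```

For zero successes the upper limit has the closed form \(1 - (\alpha/2)^{1/n}\), which gives a convenient check on the numerical solution.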

The asymptotic binomial test uses the normal approximation to the binomial distribution for confidence intervals and hypothesis tests, which is suitable for large samples. Default tests are typically two-sided in both R and SAS. SAS uses the normal approximation for the binomial proportion test in `PROC FREQ`. Because the asymptotic method assumes a large sample, it is not reliable for small samples or for proportions close to 0 or 1, and the confidence interval can extend beyond [0, 1].
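
The small-sample weaknesses of the Wald interval are easy to reproduce. The sketch below is a minimal Python illustration with hypothetical counts and a hypothetical function name; it shows a lower limit falling below 0 and the interval collapsing to a single point when no successes are observed.

```python
import math

Z_975 = 1.959963984540054  # 97.5th percentile of the standard normal

def wald_interval(x, n, z=Z_975):
    """Wald interval: p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = x / n
    half_width = z * math.sqrt(p_hat * (1.0 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width

# Hypothetical small samples for illustration.
low_lower, low_upper = wald_interval(1, 20)  # lower limit is negative
zero_interval = wald_interval(0, 20)         # zero width at x = 0
print(f"1/20: [{low_lower:.4f}, {low_upper:.4f}]")
print(f"0/20: {zero_interval}")
```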

For the one-sample case in R, the mid-p adjusted exact binomial test in `exactci` uses `tsmethod = "central"` by default. Without the mid-p adjustment this gives the exact central (Clopper-Pearson) interval; with it, the confidence interval is obtained by inverting the mid-p-value function. In SAS, `PROC FREQ` provides exact mid-p-values when the `MIDP` option is specified in the `EXACT` statement, and `CL=MIDP` requests the mid-p confidence interval.
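
The mid-p adjustment itself is a small change to the exact tail probability: only half of the probability of the observed outcome is counted. The sketch below illustrates this in Python with hypothetical counts and function names; the two-sided value is taken as twice the smaller one-sided mid-p-value, which is an assumed central convention rather than a statement about either package's internals.

```python
import math

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return math.comb(n, k) * p**k * (1.0 - p)**(n - k)

def mid_p_values(x, n, p0):
    """One-sided and two-sided mid-p-values for H0: p = p0."""
    point = binom_pmf(x, n, p0)
    above = sum(binom_pmf(k, n, p0) for k in range(x + 1, n + 1))
    below = sum(binom_pmf(k, n, p0) for k in range(0, x))
    right = above + 0.5 * point   # P(X > x) + 0.5 * P(X = x)
    left = below + 0.5 * point    # P(X < x) + 0.5 * P(X = x)
    return {
        "right": right,
        "left": left,
        "two_sided": min(1.0, 2.0 * min(left, right)),  # assumed convention
        "exact_right": above + point,  # unadjusted exact right-sided p-value
    }

# Hypothetical data: 7 successes in 20 trials, H0: p = 0.19.
mid_p = mid_p_values(7, 20, 0.19)
print(mid_p)
```

The right-sided and left-sided mid-p-values sum to 1, and the right-sided mid-p-value is smaller than the unadjusted exact one, which is why the mid-p test is less conservative.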

By default, `prop.test()` applies Yates' continuity correction; it was turned off for this comparison with `correct = FALSE`. The Wilson interval corresponds to Pearson's chi-square test. If Yates' continuity correction is applied to the chi-square test, the resulting confidence interval is the continuity-corrected Wilson interval. The Wilson interval works well for small numbers of trials (n) and small probabilities of success (p), and it offers better coverage than the Wald interval.
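
The effect of the continuity correction on the Wilson interval can be sketched as follows. This Python illustration uses hypothetical counts and function names, and it assumes one common form of the correction, in which the Wilson formula is applied to the sample proportion shifted by 1/(2n) in each direction; it is not presented as the exact code path of `prop.test()` or `PROC FREQ`.

```python
import math

Z_975 = 1.959963984540054  # 97.5th percentile of the standard normal

def wilson_limits(p, n, z):
    """Wilson score limits evaluated at estimate p."""
    denom = 1.0 + z * z / n
    centre = p + z * z / (2.0 * n)
    half = z * math.sqrt(p * (1.0 - p) / n + z * z / (4.0 * n * n))
    return (centre - half) / denom, (centre + half) / denom

def wilson_interval(x, n, z=Z_975, continuity_correction=False):
    """Wilson interval, optionally with a continuity correction."""
    p_hat = x / n
    shift = 0.5 / n if continuity_correction else 0.0
    # Boundary counts are pinned to 0 or 1 so the shift never leaves [0, 1].
    lower = 0.0 if x == 0 else wilson_limits(p_hat - shift, n, z)[0]
    upper = 1.0 if x == n else wilson_limits(p_hat + shift, n, z)[1]
    return max(0.0, lower), min(1.0, upper)

# Hypothetical data for illustration.
plain = wilson_interval(7, 20)
corrected = wilson_interval(7, 20, continuity_correction=True)
zero = wilson_interval(0, 20)
print(f"Wilson:           [{plain[0]:.4f}, {plain[1]:.4f}]")
print(f"Wilson corrected: [{corrected[0]:.4f}, {corrected[1]:.4f}]")
print(f"Wilson at x = 0:  [{zero[0]:.4f}, {zero[1]:.4f}]")
```

The corrected interval is wider than the uncorrected one, and at x = 0 the Wilson interval still has positive width, unlike the Wald interval.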

More detailed information around CIs for proportions can be found here

References

binom.test() documentation: https://www.rdocumentation.org/packages/stats/versions/3.5.2/topics/binom.test

Package 'exactci' documentation: https://cran.r-project.org/web/packages/exactci/exactci.pdf

PROC FREQ with binomial option documentation: https://documentation.sas.com/doc/en/pgmsascdc/v_072/procstat/procstat_freq_examples04.htm

PROC FREQ with EXACT statement documentation: https://documentation.sas.com/doc/en/statug/latest/statug_freq_syntax03.htm

Binomial proportion confidence interval documentation: https://en.wikipedia.org/wiki/Binomial_proportion_confidence_interval#Wilson_score_interval