R vs SAS Wilcoxon Rank-Sum Test

Introduction

This page compares the Wilcoxon rank-sum test, and the associated Hodges-Lehmann confidence interval (CI), as implemented in R and SAS.

Wilcoxon Rank-Sum

The following table provides an overview of the support and results comparability between R and SAS for the Wilcoxon rank-sum test and the Hodges-Lehmann CI.

| Analysis | Supported in {stats} | Supported in {coin} | Supported in SAS | Notes |
|---|---|---|---|---|
| Wilcoxon Rank-Sum: Normal Approximation with continuity correction | ✅ | ❌ | ✅ | If there are ties in the dataset there will be a warning in {stats}, but the warning only refers to doing the exact method |
| Wilcoxon Rank-Sum: Normal Approximation without continuity correction | ✅ | ✅ | ✅ | |
| Wilcoxon Rank-Sum: Exact | ✔️ | ✅ | ✅ | {stats} can only do the exact method when no ties are present |
| Hodges-Lehmann CI: Asymptotic | ✅ | ❌ | ✅ | |
| Hodges-Lehmann CI: Exact | ✔️ | ✅ | ✅ | {stats} can only do the exact method when no ties are present |

Prerequisites: R Packages

This test can be run using the {stats} package, but in order to handle ties the {coin} package is recommended.

library(coin) 
Loading required package: survival

Example Data

For this example we are using a dataset of birth weights for smoking and non-smoking mothers.

Data source: Table 30.4, Kirkwood BR. and Sterne JAC. Essentials of medical statistics. Second Edition. ISBN 978-0-86542-871-3

bw_ns <- c(3.99, 3.89, 3.6, 3.73, 3.31, 
            3.7, 4.08, 3.61, 3.83, 3.41, 
            4.13, 3.36, 3.54, 3.51, 2.71)
bw_s <- c(3.18, 2.74, 2.9, 3.27, 3.65, 
           3.42, 3.23, 2.86, 3.6, 3.65, 
           3.69, 3.53, 2.38, 2.34)

smk_data <- data.frame(
  value = c(bw_ns, bw_s), 
  smoke = as.factor(rep(c("non", "smoke"), c(length(bw_ns), length(bw_s))))
) 
# Relevel the factors to make it smoker - non-smokers 
smk_data$smoke <- forcats::fct_relevel(smk_data$smoke, "smoke")
head(smk_data)
  value smoke
1  3.99   non
2  3.89   non
3  3.60   non
4  3.73   non
5  3.31   non
6  3.70   non

This dataset is both small (so an exact test is recommended) and has ties in it.
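A quick way to confirm the ties is to look for values that occur more than once in the pooled sample. The sketch below uses base R only; the object names (`values`, `tied_values`) are illustrative and not part of the original page.

```r
# Birth weights from the example data above
bw_ns <- c(3.99, 3.89, 3.6, 3.73, 3.31,
           3.7, 4.08, 3.61, 3.83, 3.41,
           4.13, 3.36, 3.54, 3.51, 2.71)
bw_s <- c(3.18, 2.74, 2.9, 3.27, 3.65,
          3.42, 3.23, 2.86, 3.6, 3.65,
          3.69, 3.53, 2.38, 2.34)

# Pool both groups, since ranks are assigned across the combined sample
values <- c(bw_ns, bw_s)

# Values that appear more than once in the pooled sample
tied_values <- sort(unique(values[duplicated(values)]))
tied_values

# Group sizes, to judge whether the sample is small
group_sizes <- c(non = length(bw_ns), smoke = length(bw_s))
group_sizes
```

Here 3.60 appears once in each group and 3.65 appears twice among the smokers, so the pooled sample of 29 observations contains two tied values.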

To view the code used to produce these results, see the SAS and R pages for this test.

Wilcoxon Rank Sum tests

The table below gives the p-values from the Wilcoxon rank-sum test under different options.

| Statistic | SAS Result | {stats} Result | {coin} Result | Match | Notes |
|---|---|---|---|---|---|
| Normal Approx with continuity correction | 0.01 | 0.01001 | | Yes | {coin} doesn’t have an option for continuity corrections |
| Normal Approx without continuity correction | 0.0094 | 0.009392 | 0.009392 | Yes | |
| Exact (no correction) | 0.0082 | | 0.008181 | Yes | {stats} can’t do an exact method when there are ties |
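For orientation, the R side of this table could be reproduced along the following lines. This is a minimal sketch rather than the code from the R page: the factor is built with base `factor()` instead of `forcats::fct_relevel()`, and the object names (`p_cc`, `p_nocc`, `p_coin_asymp`, `p_coin_exact`) are assumptions made for illustration.

```r
library(coin)

bw_ns <- c(3.99, 3.89, 3.6, 3.73, 3.31, 3.7, 4.08, 3.61,
           3.83, 3.41, 4.13, 3.36, 3.54, 3.51, 2.71)
bw_s <- c(3.18, 2.74, 2.9, 3.27, 3.65, 3.42, 3.23,
          2.86, 3.6, 3.65, 3.69, 3.53, 2.38, 2.34)

# "smoke" is the first level, so differences are smokers minus non-smokers
smk_data <- data.frame(
  value = c(bw_ns, bw_s),
  smoke = factor(rep(c("non", "smoke"), c(length(bw_ns), length(bw_s))),
                 levels = c("smoke", "non"))
)

# {stats}: normal approximation with and without continuity correction
p_cc <- stats::wilcox.test(value ~ smoke, data = smk_data,
                           exact = FALSE, correct = TRUE)$p.value
p_nocc <- stats::wilcox.test(value ~ smoke, data = smk_data,
                             exact = FALSE, correct = FALSE)$p.value

# {coin}: asymptotic (default) and exact null distributions
p_coin_asymp <- as.numeric(coin::pvalue(
  coin::wilcox_test(value ~ smoke, data = smk_data)
))
p_coin_exact <- as.numeric(coin::pvalue(
  coin::wilcox_test(value ~ smoke, data = smk_data, distribution = "exact")
))

signif(c(stats_cc = p_cc, stats_nocc = p_nocc,
         coin_asymp = p_coin_asymp, coin_exact = p_coin_exact), 4)
```

The printed values can then be compared against the {stats} and {coin} columns of the table.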

Hodges-Lehmann CI

| Statistic | SAS Result | {stats} Result | {coin} Result | Match | Notes |
|---|---|---|---|---|---|
| Asymptotic (Moses) | (-0.77, -0.09) | (-0.77, -0.090) | | Yes - only in {stats} | The exact argument in {coin} only applies to the rank-sum test |
| Exact | (-0.76, -0.1) | | (-0.76, -0.10) | Yes | {stats} can’t do an exact method when there are ties |
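As a rough illustration of how confidence intervals like these can be requested in R, the sketch below asks each package for a CI on the location shift. It is an assumption-laden example, not the code from the R page: the factor is built with base `factor()`, the names `hl_stats` and `hl_coin` are illustrative, and the {coin} result is simply printed rather than unpacked.

```r
library(coin)

bw_ns <- c(3.99, 3.89, 3.6, 3.73, 3.31, 3.7, 4.08, 3.61,
           3.83, 3.41, 4.13, 3.36, 3.54, 3.51, 2.71)
bw_s <- c(3.18, 2.74, 2.9, 3.27, 3.65, 3.42, 3.23,
          2.86, 3.6, 3.65, 3.69, 3.53, 2.38, 2.34)

# "smoke" is the first level, so the shift is smokers minus non-smokers
smk_data <- data.frame(
  value = c(bw_ns, bw_s),
  smoke = factor(rep(c("non", "smoke"), c(length(bw_ns), length(bw_s))),
                 levels = c("smoke", "non"))
)

# {stats}: asymptotic interval (exact = FALSE because the data contain ties)
hl_stats <- stats::wilcox.test(value ~ smoke, data = smk_data,
                               exact = FALSE, conf.int = TRUE)
hl_stats$estimate   # estimated difference in location
hl_stats$conf.int   # 95% interval by default

# {coin}: request an interval alongside the exact null distribution
hl_coin <- coin::wilcox_test(value ~ smoke, data = smk_data,
                             distribution = "exact", conf.int = TRUE)
print(hl_coin)      # the printed output includes the interval
```

The confidence level is left at its default of 0.95 in both calls; `conf.level` can be set explicitly if a different level is needed.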

Special Considerations

If a study calls for the exact Wilcoxon test and ties are possible, {coin} is recommended, because {stats} can only do the exact method when no ties are present.
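To see how {stats} behaves when an exact test is requested on tied data, the warning can be captured rather than left to scroll past. The sketch below is illustrative only: `warnings_seen` is a hypothetical name, and the exact wording (or presence) of the warning may depend on the R version, so nothing about the message text is assumed here.

```r
bw_ns <- c(3.99, 3.89, 3.6, 3.73, 3.31, 3.7, 4.08, 3.61,
           3.83, 3.41, 4.13, 3.36, 3.54, 3.51, 2.71)
bw_s <- c(3.18, 2.74, 2.9, 3.27, 3.65, 3.42, 3.23,
          2.86, 3.6, 3.65, 3.69, 3.53, 2.38, 2.34)

smk_data <- data.frame(
  value = c(bw_ns, bw_s),
  smoke = factor(rep(c("non", "smoke"), c(length(bw_ns), length(bw_s))),
                 levels = c("smoke", "non"))
)

# Collect any warnings raised while asking {stats} for an exact test
warnings_seen <- character()
res <- withCallingHandlers(
  stats::wilcox.test(value ~ smoke, data = smk_data, exact = TRUE),
  warning = function(w) {
    warnings_seen <<- c(warnings_seen, conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)

warnings_seen  # messages raised because of the ties, if any
res$method     # shows which method was actually used
res$p.value
```

Checking `res$method` is a simple guard against reporting a normal-approximation p-value as if it were exact.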

Summary and Recommendation

Wilcoxon Rank Sum and the associated Hodges-Lehmann CI can be produced consistently in both SAS and R. However, the Wilcoxon Rank Sum test statistic does not translate directly into the Hodges-Lehmann CI, so the exact option in SAS and R can be slightly confusing. In SAS, the exact wilcoxon hl; statement is needed to get both the exact p-value and CI. In {stats}, exact values are only possible when there are no ties and the exact argument is set to TRUE (exact = TRUE); this gives the exact p-value and CI. In {coin}, the exact option only affects the p-values; there is no option for an exact CI.
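One way to see why the test statistic and the CI are related but not interchangeable is to work from the pairwise differences between the two groups. The sketch below uses the textbook definitions (the Hodges-Lehmann estimate as the median of all pairwise differences, and the Mann-Whitney statistic as a count of positive differences); it is not how either package is implemented internally, and the names `diffs`, `hl_estimate` and `u_stat` are illustrative.

```r
bw_ns <- c(3.99, 3.89, 3.6, 3.73, 3.31, 3.7, 4.08, 3.61,
           3.83, 3.41, 4.13, 3.36, 3.54, 3.51, 2.71)
bw_s <- c(3.18, 2.74, 2.9, 3.27, 3.65, 3.42, 3.23,
          2.86, 3.6, 3.65, 3.69, 3.53, 2.38, 2.34)

# All smoker minus non-smoker differences (14 x 15 = 210 pairs)
diffs <- as.vector(outer(bw_s, bw_ns, FUN = "-"))
length(diffs)

# Hodges-Lehmann point estimate: median of the pairwise differences
hl_estimate <- median(diffs)
hl_estimate

# Mann-Whitney statistic: pairs where the smoker value is larger,
# counting each cross-group tie as one half
u_stat <- sum(diffs > 0) + 0.5 * sum(diffs == 0)
u_stat
```

The test only uses the signs of the differences, whereas the estimate and its interval use their ordered values, which is why a separate option is needed for each.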

Ties

In addition to the options discussed here, there are also different methods for handling ties. SAS has only one option for handling ties, the average score method. In {coin}, ties are handled by default with the “mid-ranks” method.
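Average scores and mid-ranks both assign each tied observation the mean of the ranks it would otherwise occupy. A minimal base R sketch of that idea on the example data, using `rank()` with `ties.method = "average"` (the name `mid_ranks` is illustrative):

```r
bw_ns <- c(3.99, 3.89, 3.6, 3.73, 3.31, 3.7, 4.08, 3.61,
           3.83, 3.41, 4.13, 3.36, 3.54, 3.51, 2.71)
bw_s <- c(3.18, 2.74, 2.9, 3.27, 3.65, 3.42, 3.23,
          2.86, 3.6, 3.65, 3.69, 3.53, 2.38, 2.34)

values <- c(bw_ns, bw_s)
group <- rep(c("non", "smoke"), c(length(bw_ns), length(bw_s)))

# Tied observations share the average of the ranks they span
mid_ranks <- rank(values, ties.method = "average")

# Show only the tied observations and their shared ranks
tied <- values %in% values[duplicated(values)]
data.frame(group = group, value = values, rank = mid_ranks)[tied, ]

# Rank sum for the smokers, as used by the rank-sum test
sum(mid_ranks[group == "smoke"])
```

The two observations at 3.60 would occupy ranks 17 and 18 and so each receive 17.5; the two at 3.65 would occupy ranks 20 and 21 and each receive 20.5.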

─ Session info ───────────────────────────────────────────────────────────────
 setting  value
 version  R version 4.4.2 (2024-10-31)
 os       Ubuntu 24.04.2 LTS
 system   x86_64, linux-gnu
 ui       X11
 language (EN)
 collate  C.UTF-8
 ctype    C.UTF-8
 tz       UTC
 date     2025-06-06
 pandoc   3.6.3 @ /opt/quarto/bin/tools/ (via rmarkdown)

─ Packages ───────────────────────────────────────────────────────────────────
 ! package * version date (UTC) lib source
 P coin    * 1.4-3   2023-09-27 [?] RSPM (R 4.4.0)

 [1] /home/runner/work/CAMIS/CAMIS/renv/library/linux-ubuntu-noble/R-4.4/x86_64-pc-linux-gnu
 [2] /opt/R/4.4.2/lib/R/library

 P ── Loaded and on-disk path mismatch.

─ External software ──────────────────────────────────────────────────────────
 setting value
 SAS     9.04.01M7P080520

──────────────────────────────────────────────────────────────────────────────