# x, y are two unpaired vectors. Do not necessary need to be of the same length.
::wilcox.test(x, y, paired = FALSE) stats
Wilcoxon Rank Sum (Mann Whitney-U) in R
Overview
Wilcoxon rank sum test, or equivalently, Mann-Whitney U-test is a rank based non-parametric method. The aim is to compare two independent groups of observations. Under certain scenarios, it can be thought of as a test for median differences, however this is only valid when: 1) both samples are independent and identically distributed (same dispersion, same shape, not necessarily normal) and 2) are symmetric around their medians.
Generally, with two samples of observations (A and B), the test uses the mean of each possible pair of observations in each group (including the pair of each value with itself) to test if the probability that (A>B) > probability (B>A).
The Wilcoxon rank sum test is often presented alongside a Hodges-Lehmann estimate of the pseudo-median (the median of the Walsh averages), and an associated confidence interval for the pseudo-median.
A tie in the data exists when an observation in group A, has the same result as an observation in group B.
Available R packages
There are two main implementations of the Wilcoxon rank sum test in R.
The stats
package implements various classic statistical tests, including Wilcoxon rank sum test. Although this is arguably the most commonly applied package, this one does not account for any ties in the data. To account for ties in the data, the coin
package should be used.
Example: Birth Weight
Data source: Table 30.4, Kirkwood BR. and Sterne JAC. Essentials of medical statistics. Second Edition. ISBN 978-0-86542-871-3
Comparison of birth weights (kg) of children born to 15 non-smokers with those of children born to 14 heavy smokers.
# bw_ns: non smokers
# bw_s: smokers
<- c(3.99, 3.89, 3.6, 3.73, 3.31,
bw_ns 3.7, 4.08, 3.61, 3.83, 3.41,
4.13, 3.36, 3.54, 3.51, 2.71)
<- c(3.18, 2.74, 2.9, 3.27, 3.65,
bw_s 3.42, 3.23, 2.86, 3.6, 3.65,
3.69, 3.53, 2.38, 2.34)
Can visualize the data on two histograms. Red lines indicate the location of medians.
par(mfrow =c(1,2))
hist(bw_ns, main = 'Birthweight: non-smokers')
abline(v = median(bw_ns), col = 'red', lwd = 2)
hist(bw_s, main = 'Birthweight: smokers')
abline(v = median(bw_s), col = 'red', lwd = 2)
It is possible to see that for non-smokers, the median birthweight is higher than those of smokers. Now we can formally test it with wilcoxon rank sum test.
The default test is two-sided with confidence level of 0.95, and does continuity correction.
# default is two sided
::wilcox.test(bw_s, bw_ns, paired = FALSE, conf.int = TRUE) stats
Warning in wilcox.test.default(bw_s, bw_ns, paired = FALSE, conf.int = TRUE):
cannot compute exact p-value with ties
Warning in wilcox.test.default(bw_s, bw_ns, paired = FALSE, conf.int = TRUE):
cannot compute exact confidence intervals with ties
Wilcoxon rank sum test with continuity correction
data: bw_s and bw_ns
W = 45.5, p-value = 0.01001
alternative hypothesis: true location shift is not equal to 0
95 percent confidence interval:
-0.76995896 -0.09000999
sample estimates:
difference in location
-0.4261377
We can also carry out a one-sided test, by specifying alternative = greater
(if the first item is greater than the second).
# default is two sided
::wilcox.test(bw_ns, bw_s, paired = FALSE, alternative = 'greater') stats
Warning in wilcox.test.default(bw_ns, bw_s, paired = FALSE, alternative =
"greater"): cannot compute exact p-value with ties
Wilcoxon rank sum test with continuity correction
data: bw_ns and bw_s
W = 164.5, p-value = 0.005003
alternative hypothesis: true location shift is greater than 0
In {stats} the exact p-value is computed when there are less than 50 values and no ties otherwise the normal approximation is used. In this cause, because there are ties the normal approximation is used.
# force exact
::wilcox.test(bw_s, bw_ns, paired = FALSE, conf.int = TRUE, exact = TRUE) stats
Warning in wilcox.test.default(bw_s, bw_ns, paired = FALSE, conf.int = TRUE, :
cannot compute exact p-value with ties
Warning in wilcox.test.default(bw_s, bw_ns, paired = FALSE, conf.int = TRUE, :
cannot compute exact confidence intervals with ties
Wilcoxon rank sum test with continuity correction
data: bw_s and bw_ns
W = 45.5, p-value = 0.01001
alternative hypothesis: true location shift is not equal to 0
95 percent confidence interval:
-0.76995896 -0.09000999
sample estimates:
difference in location
-0.4261377
In order to account for the ties, wilcox_test
from the {coin} package should be used. For this function, the data needs to be inputted via a formula where the right hand side is a factor, so we need to create a dataset. In order to get smokers - non-smokers we need to relevel the factors.
<- data.frame(
smk_data value = c(bw_ns, bw_s),
smoke = as.factor(rep(c("non", "smoke"), c(length(bw_ns), length(bw_s))))
)
$smoke smk_data
[1] non non non non non non non non non non non non
[13] non non non smoke smoke smoke smoke smoke smoke smoke smoke smoke
[25] smoke smoke smoke smoke smoke
Levels: non smoke
$smoke <- forcats::fct_relevel(smk_data$smoke, "smoke")
smk_data
$smoke smk_data
[1] non non non non non non non non non non non non
[13] non non non smoke smoke smoke smoke smoke smoke smoke smoke smoke
[25] smoke smoke smoke smoke smoke
Levels: smoke non
Now the data is in the right shape we can run wilcox_test
. In order to get the Hodges-Lehmann confidence intervals out, we need to include the conf.int = TRUE
argument.(NOTE: the conf.level
argument controls the confidence level, but must be used with conf.int = TRUE
otherwise you won’t get a confidence interval)
::wilcox_test(value ~ smoke, data = smk_data, conf.int = TRUE) coin
Asymptotic Wilcoxon-Mann-Whitney Test
data: value by smoke (smoke, non)
Z = -2.5974, p-value = 0.009392
alternative hypothesis: true mu is not equal to 0
95 percent confidence interval:
-0.76000001 -0.09999999
sample estimates:
difference in location
-0.4261403
In {coin} there is no option for a continuity correction and it is not done by default. {coin} can calculate exact and Monte Carlo conditional p-values using the distribtuion
argument. The exact p-value should be used in small sample sizes as the normal approximation does not hold
::wilcox_test(value ~ smoke, data = smk_data, conf.int = TRUE,
coindistribution = "exact")
Exact Wilcoxon-Mann-Whitney Test
data: value by smoke (smoke, non)
Z = -2.5974, p-value = 0.008181
alternative hypothesis: true mu is not equal to 0
95 percent confidence interval:
-0.76 -0.10
sample estimates:
difference in location
-0.425
Note: the distribution argument only effects the p-value, {coin} consistently calculated the exact Hodges-Lehmann confidence intervals.