Non-parametric point estimation

Introduction

The Hodges-Lehmann estimator (Hodges and Lehmann 1962) provides a point estimate of the location shift that is associated with the Wilcoxon rank sum statistic. It is typically used for two-sample comparisons with small sample sizes. Note: the Hodges-Lehmann estimator estimates the median of the differences, not the difference of the medians. The corresponding distribution-free confidence interval is also based on the Wilcoxon rank sum statistic (Moses).
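To make the distinction in the note concrete, here is a minimal sketch using two small toy vectors. The vectors a and b and all object names are invented for illustration and are not part of the case study. It computes the median of all pairwise differences (the two-sample Hodges-Lehmann estimate) next to the difference of the two sample medians.

```r
# Toy data, invented purely for illustration (not the case-study data)
a <- c(1, 2, 10)
b <- c(0, 3, 4)

# All pairwise differences a_i - b_j (3 x 3 = 9 values)
pairwise_diffs <- as.vector(outer(a, b, "-"))

# Two-sample Hodges-Lehmann estimate: median of the pairwise differences
hl_estimate <- median(pairwise_diffs)

# Difference of the sample medians, which is a different quantity in general
median_diff <- median(a) - median(b)

hl_estimate  # 1
median_diff  # -1
```

In this toy case the two quantities even differ in sign, which is why the estimate should not be read as a difference of medians.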

There are several packages covering this functionality. Here we focus on the wilcox.test function, which is part of the {stats} package shipped with base R. The {coin} package provides further resources to derive various types of confidence intervals for the pairwise comparison case. This package is very flexible and uses the functions of related packages.

Hodges, J. L. and Lehmann, E. L. (1962) Rank methods for combination of independent experiments in analysis of variance. Annals of Mathematical Statistics, 33, 482-4.

Case study

# Hollander-Wolfe-Chicken Example
x <- c(1.83, 0.50, 1.62, 2.48, 1.68, 1.88, 1.55, 3.06, 1.30)
y <- c(0.878, 0.647, 0.598, 2.050, 1.060, 1.290, 1.060, 3.140, 1.290)

# Reshaping data
value <- c(x, y)
treat <- c(rep("A", length(x)), rep("B", length(y)))
all <- data.frame(value)
all$treat <- treat

Hodges-Lehmann estimate (and confidence interval)

{base}

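The wilcox.test function provides the Hodges-Lehmann estimate and the Moses confidence interval. If the data contain ties, the function issues warnings and falls back to a normal approximation, so neither the p-value nor the confidence interval is exact, even with exact = TRUE. The reported point estimate is then also based on the approximation and can differ slightly from the sample median of the pairwise differences.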

wt <- wilcox.test(x, y, exact = TRUE, conf.int = TRUE)

Warning in wilcox.test.default(x, y, exact = TRUE, conf.int = TRUE): cannot
compute exact p-value with ties

Warning in wilcox.test.default(x, y, exact = TRUE, conf.int = TRUE): cannot
compute exact confidence intervals with ties

# Hodges-Lehmann estimator
wt$estimate

difference in location 
             0.5600562

# Moses confidence interval
wt$conf.int

[1] -0.3699774  1.1829708
attr(,"conf.level")
[1] 0.95
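As a cross-check, the estimate and interval can be sketched by hand from the sorted pairwise differences. This is a minimal sketch, not the exact code path of wilcox.test: the object names (diffs, hl_by_hand, ci_by_hand) are made up here, and the order-statistic interval uses qwilcox under the assumption that the exact null distribution applies, which strictly requires data without ties.

```r
# Case-study data, repeated so the block runs on its own
x <- c(1.83, 0.50, 1.62, 2.48, 1.68, 1.88, 1.55, 3.06, 1.30)
y <- c(0.878, 0.647, 0.598, 2.050, 1.060, 1.290, 1.060, 3.140, 1.290)

# Sorted pairwise differences x_i - y_j (9 x 9 = 81 values)
diffs <- sort(as.vector(outer(x, y, "-")))

# Hodges-Lehmann estimate: median of the pairwise differences
hl_by_hand <- median(diffs)

# Order-statistic confidence interval (sketch, assumes the exact
# Wilcoxon null distribution, i.e. ignores the ties in these data)
alpha <- 0.05
qu <- qwilcox(alpha / 2, length(x), length(y))  # lower quantile of the rank sum statistic
if (qu == 0) qu <- 1                            # guard for very small samples
ql <- length(x) * length(y) - qu
ci_by_hand <- c(diffs[qu], diffs[ql + 1])

hl_by_hand
ci_by_hand
```

On these data the hand-computed estimate should be 0.56, marginally different from the 0.5600562 returned by wilcox.test because of the normal approximation used in the presence of ties. The hand-computed interval should likewise be close to, but not identical with, the one shown above.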

Note: wilcox.test also accepts data in long format through its formula interface:

wilcox.test(all$value ~ all$treat, exact = TRUE, conf.int = TRUE)

Warning in wilcox.test.default(x = DATA[[1L]], y = DATA[[2L]], ...): cannot
compute exact p-value with ties

Warning in wilcox.test.default(x = DATA[[1L]], y = DATA[[2L]], ...): cannot
compute exact confidence intervals with ties


    Wilcoxon rank sum test with continuity correction

data:  all$value by all$treat
W = 58, p-value = 0.1329
alternative hypothesis: true location shift is not equal to 0
95 percent confidence interval:
 -0.3699774  1.1829708
sample estimates:
difference in location 
             0.5600562
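The formula call can also be written with the data argument instead of the all$ prefixes, which keeps the variable names in the printed output short. The following is a minimal sketch under that assumption. The names wt_long and wt_wide are made up for illustration, and suppressWarnings is used only to hide the tie warnings already shown above.

```r
# Case-study data in long format, rebuilt so the block runs on its own
x <- c(1.83, 0.50, 1.62, 2.48, 1.68, 1.88, 1.55, 3.06, 1.30)
y <- c(0.878, 0.647, 0.598, 2.050, 1.060, 1.290, 1.060, 3.140, 1.290)
all <- data.frame(
  value = c(x, y),
  treat = factor(rep(c("A", "B"), times = c(length(x), length(y))))
)

# Formula interface with a data argument; level "A" is compared to level "B"
wt_long <- suppressWarnings(
  wilcox.test(value ~ treat, data = all, conf.int = TRUE)
)

# Two-vector interface for comparison
wt_wide <- suppressWarnings(
  wilcox.test(x, y, conf.int = TRUE)
)

# Both calls describe the same comparison, so the estimates should agree
all.equal(wt_long$estimate, wt_wide$estimate)
```

The direction of the estimated shift follows the order of the factor levels, so it is worth checking the levels of treat before interpreting the sign.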

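{coin}

The wilcox_test function from the {coin} package also reports the Hodges-Lehmann estimate and a confidence interval when conf.int = TRUE.

# wilcox_test() is provided by the coin package
library(coin)

wilcox_test(value ~ as.factor(treat), data = all,
            conf.int = TRUE)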


    Asymptotic Wilcoxon-Mann-Whitney Test

data:  value by as.factor(treat) (A, B)
Z = 1.5469, p-value = 0.1219
alternative hypothesis: true mu is not equal to 0
95 percent confidence interval:
 -0.220  1.082
sample estimates:
difference in location 
                  0.56