Non-parametric point estimation

Introduction

The Hodges-Lehmann estimator (Hodges and Lehmann 1962) is a point estimate of location shift associated with the Wilcoxon rank sum statistic. It is typically used for two-sample comparisons with small sample sizes. Note: The Hodges-Lehmann estimator estimates the median of the differences between the two samples, not the difference of the medians. The corresponding distribution-free confidence interval is also based on the Wilcoxon rank sum statistic (Moses).
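To make the distinction in the note concrete, here is a minimal sketch that computes the estimate directly as the median of all pairwise differences, assuming the standard two-sample definition. The vectors x_toy and y_toy are made-up values for illustration only and are not part of the case study.

```r
# Toy data, invented purely for illustration
x_toy <- c(1, 2, 10)
y_toy <- c(0, 4, 5)

# All pairwise differences x_i - y_j (a 3 x 3 matrix)
pairwise_diffs <- outer(x_toy, y_toy, "-")

# Hodges-Lehmann estimate: median of the pairwise differences
hl_toy <- median(pairwise_diffs)

# Difference of the sample medians, shown for contrast
diff_of_medians <- median(x_toy) - median(y_toy)

hl_toy           # 1
diff_of_medians  # -2
```

For these values the two quantities differ even in sign, which is why they should not be used interchangeably.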

Several packages cover this functionality. Here we focus on the wilcox.test function, which ships with base R (in the {stats} package). The {pairwiseCI} package provides further options for deriving various types of confidence intervals for pairwise comparisons; it is flexible and builds on functions from related packages.

Hodges, J. L. and Lehmann, E. L. (1962) Rank methods for combination of independent experiments in analysis of variance. Annals of Mathematical Statistics, 33, 482-4.

Case study

# Hollander-Wolfe-Chicken Example
x <- c(1.83, 0.50, 1.62, 2.48, 1.68, 1.88, 1.55, 3.06, 1.30)
y <- c(0.878, 0.647, 0.598, 2.050, 1.060, 1.290, 1.060, 3.140, 1.290)

# Reshaping data
value <- c(x, y)
treat <- c(rep("A", length(x)), rep("B", length(y)))
all <- data.frame(value)
all$treat <- treat

Hodges-Lehmann estimate (and confidence interval)

{base}

The base function provides the Hodges-Lehmann estimate and the Moses confidence interval. If the data contain ties, the function issues warnings and cannot compute the exact confidence interval; the reported estimate and interval are then based on a normal approximation.

wt <- wilcox.test(x, y, exact = TRUE, conf.int = TRUE)
Warning in wilcox.test.default(x, y, exact = TRUE, conf.int = TRUE): cannot
compute exact p-value with ties
Warning in wilcox.test.default(x, y, exact = TRUE, conf.int = TRUE): cannot
compute exact confidence intervals with ties
# Hodges-Lehmann estimator
wt$estimate
difference in location 
             0.5600562 
# Moses confidence interval
wt$conf.int
[1] -0.3699774  1.1829708
attr(,"conf.level")
[1] 0.95
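As a cross-check, the estimate and interval can be approximated by hand from the sorted pairwise differences. This is a minimal sketch of the order-statistic construction that applies when no ties are present; because the case-study data contain ties, it only illustrates the idea and need not match the wilcox.test output digit for digit. The names diffs, hl_manual, k and ci_manual are introduced here for illustration.

```r
x <- c(1.83, 0.50, 1.62, 2.48, 1.68, 1.88, 1.55, 3.06, 1.30)
y <- c(0.878, 0.647, 0.598, 2.050, 1.060, 1.290, 1.060, 3.140, 1.290)

# Sorted pairwise differences x_i - y_j (9 * 9 = 81 values)
diffs <- sort(as.vector(outer(x, y, "-")))

# Point estimate: median of the pairwise differences
hl_manual <- median(diffs)

# Interval bounds: k-th smallest and k-th largest difference, with k taken
# from the null distribution of the rank sum statistic (assumes no ties)
conf_level <- 0.95
k <- qwilcox((1 - conf_level) / 2, length(x), length(y))
k <- max(k, 1)  # guard against index 0 in very small samples
ci_manual <- c(diffs[k], diffs[length(diffs) - k + 1])

hl_manual
ci_manual
```

Small deviations from the wilcox.test values are expected here, since with ties the function falls back to a normal approximation instead of reading the values off the sorted differences.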

Note: wilcox.test also accepts data in long format via the formula interface:

wilcox.test(all$value ~ all$treat, exact = TRUE, conf.int = TRUE)
Warning in wilcox.test.default(x = DATA[[1L]], y = DATA[[2L]], ...): cannot
compute exact p-value with ties
Warning in wilcox.test.default(x = DATA[[1L]], y = DATA[[2L]], ...): cannot
compute exact confidence intervals with ties

    Wilcoxon rank sum test with continuity correction

data:  all$value by all$treat
W = 58, p-value = 0.1329
alternative hypothesis: true location shift is not equal to 0
95 percent confidence interval:
 -0.3699774  1.1829708
sample estimates:
difference in location 
             0.5600562 

{pairwiseCI}

The {pairwiseCI} package requires the data in long format and uses the formula interface. The control argument names the group against which the other is compared, which sets the direction of the difference (here A - B). Setting method to "HL.diff" provides exact confidence intervals together with the Hodges-Lehmann point estimate.

# Load the package; pairwiseCI() uses the formula interface
library(pairwiseCI)
pairwiseCI(value ~ treat, data = all,
           method = "HL.diff", control = "B",
           conf.level = .95)
  
95 %-confidence intervals 
 Method:  Difference in location (Hodges-Lehmann estimator) 
  
  
      estimate lower upper
A - B     0.56 -0.22 1.082