# Create sample data
d1 <- tibble::tribble(
  ~trt_grp, ~WtGain,
"placebo", 94, "placebo", 12, "placebo", 26, "placebo", 89,
"placebo", 88, "placebo", 96, "placebo", 85, "placebo", 130,
"placebo", 75, "placebo", 54, "placebo", 112, "placebo", 69,
"placebo", 104, "placebo", 95, "placebo", 53, "placebo", 21,
"treatment", 45, "treatment", 62, "treatment", 96, "treatment", 128,
"treatment", 120, "treatment", 99, "treatment", 28, "treatment", 50,
"treatment", 109, "treatment", 115, "treatment", 39, "treatment", 96,
"treatment", 87, "treatment", 100, "treatment", 76, "treatment", 80
)
Two Sample t-test in R
The Two Sample t-test compares two independent samples by testing whether the mean of the first sample differs from the mean of the second. In R, a Two Sample t-test can be performed using the Base R t.test() function from the stats package or the proc_ttest() function from the procs package.
Data Used
This example uses the d1 data created at the top of this page: 16 placebo and 16 treatment observations of weight gain (WtGain).
Base R
If the data are normally distributed and the two groups have equal variances, we can use the classic Student’s t-test. For a two sample test where the variances are not equal, we should use Welch’s t-test. Both options are available with the Base R t.test() function.
Student’s T-Test
Code
The following code was used to test the comparison in Base R. By default, the R two sample t-test function assumes the variances in the data are unequal and uses Welch’s t-test. Therefore, to use a classic Student’s t-test with normally distributed, equal-variance data, we must specify var.equal = TRUE. Also note that this call passes each group as its own vector, so we first split the single WtGain variable by treatment group, and we set paired = FALSE because the samples are independent.
# Split the data into one data frame per treatment group
d1p <- dplyr::filter(d1, trt_grp == 'placebo')
d1t <- dplyr::filter(d1, trt_grp == 'treatment')
# Perform t-test
t.test(d1p$WtGain, d1t$WtGain,
var.equal = TRUE, paired = FALSE)
Two Sample t-test
data: d1p$WtGain and d1t$WtGain
t = -0.6969, df = 30, p-value = 0.4912
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-31.19842 15.32342
sample estimates:
mean of x mean of y
75.1875 83.1250
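As an alternative to splitting the data by hand, t.test() also has a formula interface that takes the grouping variable directly. The sketch below is only illustrative: it rebuilds the sample data as a base R data frame under the assumed name wt, and it relies on the grouping column having exactly two levels. It should give the same statistics as the two-vector call above.

```r
# Illustrative copy of the sample data as a base R data frame
# (the name `wt` is an assumption, not part of the original example)
wt <- data.frame(
  trt_grp = rep(c("placebo", "treatment"), each = 16),
  WtGain = c(
    94, 12, 26, 89, 88, 96, 85, 130, 75, 54, 112, 69, 104, 95, 53, 21,
    45, 62, 96, 128, 120, 99, 28, 50, 109, 115, 39, 96, 87, 100, 76, 80
  )
)

# Formula interface: response ~ two-level grouping variable
res_formula <- t.test(WtGain ~ trt_grp, data = wt, var.equal = TRUE)

# Two-vector interface on the same data, for comparison
res_vectors <- t.test(wt$WtGain[wt$trt_grp == "placebo"],
                      wt$WtGain[wt$trt_grp == "treatment"],
                      var.equal = TRUE)

print(res_formula)
```

The groups are compared in alphabetical order of trt_grp (placebo first), so the sign of the t statistic matches the two-vector call.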
Welch’s T-Test
Code
The following code was used to test the comparison in Base R using Welch’s t-test. Observe that in this case, the var.equal parameter is set to FALSE.
# Split the data into one data frame per treatment group
d1p <- dplyr::filter(d1, trt_grp == 'placebo')
d1t <- dplyr::filter(d1, trt_grp == 'treatment')
# Perform t-test
t.test(d1p$WtGain, d1t$WtGain,
var.equal = FALSE, paired = FALSE)
Welch Two Sample t-test
data: d1p$WtGain and d1t$WtGain
t = -0.6969, df = 29.694, p-value = 0.4913
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-31.20849 15.33349
sample estimates:
mean of x mean of y
75.1875 83.1250
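The Student and Welch results share the same t statistic here because both groups have 16 observations, which makes the pooled and unpooled standard errors coincide; only the degrees of freedom differ. The following sketch recomputes these quantities by hand with the standard Welch-Satterthwaite formula. The variable names are illustrative, and the vectors simply repeat the sample data.

```r
# Sample data repeated as plain vectors (names are illustrative)
placebo   <- c(94, 12, 26, 89, 88, 96, 85, 130, 75, 54, 112, 69, 104, 95, 53, 21)
treatment <- c(45, 62, 96, 128, 120, 99, 28, 50, 109, 115, 39, 96, 87, 100, 76, 80)

n1 <- length(placebo)
n2 <- length(treatment)

# Squared standard error of each group mean
v1 <- var(placebo) / n1
v2 <- var(treatment) / n2

# Welch: unpooled standard error and Satterthwaite degrees of freedom
se_welch <- sqrt(v1 + v2)
df_welch <- (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))

# Student: pooled variance and the resulting standard error
sp2 <- ((n1 - 1) * var(placebo) + (n2 - 1) * var(treatment)) / (n1 + n2 - 2)
se_pooled <- sqrt(sp2 * (1 / n1 + 1 / n2))

# t statistic using the Welch standard error
t_welch <- (mean(placebo) - mean(treatment)) / se_welch

print(c(se_welch = se_welch, se_pooled = se_pooled, t = t_welch, df = df_welch))
```

With unequal group sizes the two standard errors would generally differ, and so would the two t statistics.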
Procs Package
Student’s T-Test and Welch’s T-Test
Code
The following code from the procs package was used to perform a two sample t-test. Note that the proc_ttest() function performs both the Student’s t-test and Welch’s (Satterthwaite) t-test in the same call. The results are displayed on separate rows. This output is similar to SAS.
library(procs)
# Perform t-test
proc_ttest(d1, var = WtGain,
class = trt_grp)
$Statistics
VAR CLASS METHOD N MEAN STD STDERR MIN MAX
1 WtGain placebo <NA> 16 75.1875 33.81167 8.452918 12 130
2 WtGain treatment <NA> 16 83.1250 30.53495 7.633738 28 128
3 WtGain Diff (1-2) Pooled NA -7.9375 NA 11.389723 NA NA
4 WtGain Diff (1-2) Satterthwaite NA -7.9375 NA 11.389723 NA NA
$ConfLimits
VAR CLASS METHOD MEAN LCLM UCLM STD LCLMSTD
1 WtGain placebo <NA> 75.1875 57.17053 93.20447 33.81167 24.97685
2 WtGain treatment <NA> 83.1250 66.85407 99.39593 30.53495 22.55632
3 WtGain Diff (1-2) Pooled -7.9375 -31.19842 15.32342 NA NA
4 WtGain Diff (1-2) Satterthwaite -7.9375 -31.20849 15.33349 NA NA
UCLMSTD
1 52.33003
2 47.25868
3 NA
4 NA
$TTests
VAR METHOD VARIANCES DF T PROBT
1 WtGain Pooled Equal 30.00000 -0.6969002 0.4912306
2 WtGain Satterthwaite Unequal 29.69359 -0.6969002 0.4912856
$Equality
VAR METHOD NDF DDF FVAL PROBF
1 WtGain Folded F 15 15 1.226136 0.6980614
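The Folded F row in $Equality can be cross-checked in Base R with var.test() from the stats package. This is a sketch under one assumption: var.test() divides the variance of its first argument by that of its second, which equals the folded F (larger variance over smaller) here only because the placebo group has the larger variance. The vector names are illustrative.

```r
# Sample data repeated as plain vectors (names are illustrative)
placebo   <- c(94, 12, 26, 89, 88, 96, 85, 130, 75, 54, 112, 69, 104, 95, 53, 21)
treatment <- c(45, 62, 96, 128, 120, 99, 28, 50, 109, 115, 39, 96, 87, 100, 76, 80)

# Two-sided F test for equality of variances; placebo has the larger
# variance, so it goes first to match the folded F ratio
ft <- var.test(placebo, treatment)

print(ft$statistic)  # F ratio, compare with FVAL
print(ft$parameter)  # numerator and denominator df, compare with NDF and DDF
print(ft$p.value)    # two-sided p-value, compare with PROBF
```

A large p-value from this test gives no evidence against equal variances, which is consistent with the Pooled and Satterthwaite rows being nearly identical.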