import pandas as pd
import numpy as np
from scipy import stats
# Create sample data
data = {
    # 16 placebo observations followed by 16 treatment observations
    'trt_grp': ['placebo'] * 16 + ['treatment'] * 16,
    'WtGain': [94, 12, 26, 89, 88, 96, 85, 130, 75, 54, 112, 69, 104, 95, 53, 21,
               45, 62, 96, 128, 120, 99, 28, 50, 109, 115, 39, 96, 87, 100, 76, 80]
}
df = pd.DataFrame(data)
Two Sample t-test in Python
The two-sample t-test compares the means of two independent samples to assess whether the underlying population means differ. In Python, a two-sample t-test can be performed with the ttest_ind function from the scipy.stats module.
Data Used
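This example uses the weight-gain data defined in the code at the top of this page: 16 placebo and 16 treatment observations, stored in the columns trt_grp and WtGain.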
If the data in each group are approximately normally distributed and the two groups have equal variances, we can use the classic Student’s t-test. For a two-sample test where the variances are not equal, we should use Welch’s t-test. Both options are available in scipy.stats through the equal_var argument of ttest_ind.
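One way to inform the choice between the two tests is to compare the group variances first. The sketch below is not part of the original example. It assumes Levene's test (scipy.stats.levene) as the check, which is only one of several common options, and the array names placebo and treatment are illustrative.

```python
import numpy as np
from scipy import stats

# Same weight-gain values as in the example data, split by group
placebo = np.array([94, 12, 26, 89, 88, 96, 85, 130, 75, 54, 112, 69, 104, 95, 53, 21])
treatment = np.array([45, 62, 96, 128, 120, 99, 28, 50, 109, 115, 39, 96, 87, 100, 76, 80])

# Sample variances (ddof=1 gives the unbiased estimator)
var_placebo = placebo.var(ddof=1)
var_treatment = treatment.var(ddof=1)
print(f"Sample variance (placebo): {var_placebo:.2f}")
print(f"Sample variance (treatment): {var_treatment:.2f}")

# Levene's test: the null hypothesis is that the groups have equal variances
levene_stat, levene_p = stats.levene(placebo, treatment)
print(f"Levene statistic: {levene_stat:.4f}")
print(f"Levene p-value: {levene_p:.4f}")
```

A small p-value here would suggest unequal variances and point toward Welch’s t-test. A large p-value does not prove the variances are equal, so Welch’s test remains a reasonable default when in doubt.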
Student’s T-Test
Code
The following code runs the test in Python. Note that stats.ttest_ind expects one array per group, so we first split the single WtGain column into two series according to trt_grp.
# Separate data into two groups
group1 = df[df['trt_grp'] == 'placebo']['WtGain']
group2 = df[df['trt_grp'] == 'treatment']['WtGain']
# Perform Student's t-test assuming equal variances
t_stat, p_value_equal_var = stats.ttest_ind(group1, group2, equal_var=True)
print("Student's T-Test assuming equal variances:")
print(f"T-statistic: {t_stat}")
print(f"P-value: {p_value_equal_var}")
Student's T-Test assuming equal variances:
T-statistic: -0.6969002027708538
P-value: 0.4912306166204561
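To show what the equal-variance test computes, here is a minimal sketch that rebuilds the statistic by hand from the pooled variance and compares it with scipy's result. The variable names (placebo, treatment, pooled_var, t_manual) are illustrative and do not appear in the original code.

```python
import numpy as np
from scipy import stats

# Same weight-gain values as in the example data, split by group
placebo = np.array([94, 12, 26, 89, 88, 96, 85, 130, 75, 54, 112, 69, 104, 95, 53, 21])
treatment = np.array([45, 62, 96, 128, 120, 99, 28, 50, 109, 115, 39, 96, 87, 100, 76, 80])

n1, n2 = len(placebo), len(treatment)

# Pooled variance: weighted average of the two sample variances
pooled_var = ((n1 - 1) * placebo.var(ddof=1)
              + (n2 - 1) * treatment.var(ddof=1)) / (n1 + n2 - 2)

# Standard error of the difference in means under equal variances
std_err = np.sqrt(pooled_var * (1 / n1 + 1 / n2))

# t-statistic, degrees of freedom, and two-sided p-value
t_manual = (placebo.mean() - treatment.mean()) / std_err
dof = n1 + n2 - 2
p_manual = 2 * stats.t.sf(abs(t_manual), dof)

# Cross-check against scipy
t_scipy, p_scipy = stats.ttest_ind(placebo, treatment, equal_var=True)
print(f"Manual: t = {t_manual:.6f}, p = {p_manual:.6f}, df = {dof}")
print(f"SciPy:  t = {t_scipy:.6f}, p = {p_scipy:.6f}")
```

The sign of the statistic depends only on the argument order: placebo is passed first, so a negative value means the placebo mean is lower than the treatment mean.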
Welch’s T-Test
Code
The following code runs the same comparison using Welch’s t-test, which does not assume equal variances. It reuses group1 and group2 from the previous section.
# Perform Welch's t-test assuming unequal variances
t_stat_welch, p_value_unequal_var = stats.ttest_ind(group1, group2, equal_var=False)
print("\nWelch's T-Test assuming unequal variances:")
print(f"T-statistic: {t_stat_welch}")
print(f"P-value: {p_value_unequal_var}")
Welch's T-Test assuming unequal variances:
T-statistic: -0.6969002027708538
P-value: 0.4912856152047901
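The two tests report the same T-statistic here because both groups have 16 observations: with equal group sizes the pooled and unpooled standard errors coincide, and only the degrees of freedom (and therefore the p-value) differ. The sketch below illustrates this by computing Welch’s statistic and the Welch-Satterthwaite degrees of freedom by hand. The variable names are illustrative and not part of the original code.

```python
import numpy as np
from scipy import stats

# Same weight-gain values as in the example data, split by group
placebo = np.array([94, 12, 26, 89, 88, 96, 85, 130, 75, 54, 112, 69, 104, 95, 53, 21])
treatment = np.array([45, 62, 96, 128, 120, 99, 28, 50, 109, 115, 39, 96, 87, 100, 76, 80])

n1, n2 = len(placebo), len(treatment)

# Per-group variance of the sample mean (no pooling)
v1 = placebo.var(ddof=1) / n1
v2 = treatment.var(ddof=1) / n2

# Welch's t-statistic
t_welch = (placebo.mean() - treatment.mean()) / np.sqrt(v1 + v2)

# Welch-Satterthwaite approximation to the degrees of freedom
dof_welch = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))

# Two-sided p-value from the t distribution
p_welch = 2 * stats.t.sf(abs(t_welch), dof_welch)

# Cross-check against scipy
t_scipy, p_scipy = stats.ttest_ind(placebo, treatment, equal_var=False)
print(f"Manual: t = {t_welch:.6f}, p = {p_welch:.6f}, df = {dof_welch:.2f}")
print(f"SciPy:  t = {t_scipy:.6f}, p = {p_scipy:.6f}")
```

The Welch degrees of freedom come out slightly below the n1 + n2 - 2 = 30 used by Student’s test, which is why the Welch p-value is marginally larger.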