Two Sample t-test in Python

The Two Sample t-test is used to compare two independent samples against each other. In the Two Sample t-test, the mean of the first sample is compared against the mean of the second sample. In Python, a Two Sample t-test can be performed using the stats package from scipy.

Data Used

The following data was used in this example.

import pandas as pd
import numpy as np
from scipy import stats

# Create sample data
data = {
    'trt_grp': ['placebo', 'placebo', 'placebo', 'placebo', 'placebo', 'placebo', 'placebo', 'placebo', 'placebo', 'placebo', 'placebo', 'placebo', 'placebo', 'placebo', 'placebo', 'placebo', 'treatment', 'treatment', 'treatment', 'treatment', 'treatment', 'treatment', 'treatment', 'treatment', 'treatment', 'treatment', 'treatment', 'treatment', 'treatment', 'treatment', 'treatment', 'treatment'],
    'WtGain': [94, 12, 26, 89, 88, 96, 85, 130, 75, 54, 112, 69, 104, 95, 53, 21, 45, 62, 96, 128, 120, 99, 28, 50, 109, 115, 39, 96, 87, 100, 76, 80]
}

df = pd.DataFrame(data)

If we have normalized data, we can use the classic Student’s t-test. For a Two sample test where the variances are not equal, we should use the Welch’s t-test. Both of those options are available in the scipy stats package.

Student’s T-Test

Code

The following code was used to test the comparison in Python. Note that we must separate the single variable into two variables to satisfy the scipy stats package syntax.

# Separate data into two groups
group1 = df[df['trt_grp'] == 'placebo']['WtGain']
group2 = df[df['trt_grp'] == 'treatment']['WtGain']

# Perform Student's t-test assuming equal variances
t_stat, p_value_equal_var = stats.ttest_ind(group1, group2, equal_var=True)

print("Student's T-Test assuming equal variances:")
print(f"T-statistic: {t_stat}")
print(f"P-value: {p_value_equal_var}")

Student's T-Test assuming equal variances:
T-statistic: -0.6969002027708538
P-value: 0.4912306166204561

Welch’s T-Test

Code

The following code was used to test the comparison in Python using Welch’s t-test.

# Perform Welch's t-test assuming unequal variances
t_stat_welch, p_value_unequal_var = stats.ttest_ind(group1, group2, equal_var=False)

print("\nWelch's T-Test assuming unequal variances:")
print(f"T-statistic: {t_stat_welch}")
print(f"P-value: {p_value_unequal_var}")


Welch's T-Test assuming unequal variances:
T-statistic: -0.6969002027708538
P-value: 0.4912856152047901