import pandas as pd
# Create sample data
= {
data 'score': [40, 47, 52, 26, 19, 25, 35, 39, 26, 48, 14, 22, 42, 34, 33, 18, 15, 29, 41, 44, 51, 43, 27, 46, 28, 49, 31, 28, 54, 45],
'count': [2, 2, 2, 1, 2, 2, 4, 1, 1, 1, 2, 1, 1, 2, 2, 1, 1, 1, 2, 1, 1, 1, 2, 2, 1, 1, 1, 1, 1, 1]
}
= pd.DataFrame(data) df
One Sample t-test in Python
The One Sample t-test is used to compare a single sample against an expected hypothesis value. In the One Sample t-test, the mean of the sample is compared against the hypothesis value. In Python, a One Sample t-test can be performed using the scipy.stats.ttest_1samp(…) function from the scipy package, which accepts the following parameters:
1.a: Sample observations.
2.popmean: Expected value in null hypothesis. If array_like, then its length along axis must equal 1, and it must otherwise be broadcastable with a.
3.nan_policy: Defines how to handle input NaNs.
4.alternative (optional): Defines the alternative hypothesis.
5.keepdims: If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the input array.
Data Used
The following code was used to test the comparison in Python. Note that the baseline null hypothesis goes in the “popmean” parameter.
import pandas as pd
from scipy import stats
# Perform one-sample t-test
= df['score'].mean()
sample_mean = 30 # Hypothetical null hypothesis mean for comparison
null_mean = 0.05 # Significance level
alpha
= stats.ttest_1samp(df['score'], null_mean)
t_statistic, p_value
print(f"t: {t_statistic}")
print(f"p-value: {p_value}")
print(f"mean of x: {sample_mean}")
if p_value < alpha:
print("Reject null hypothesis: There is a significant difference.")
else:
print("Fail to reject null hypothesis: There is no significant difference.")
t: 2.364306444879101
p-value: 0.02497410401836272
mean of x: 35.03333333333333
Reject null hypothesis: There is a significant difference.