
Two-sample t-test


The Hypothesis for the Test

The two-sample t-test is used to determine whether the averages of two populations differ, based on a sample taken from each. For example, it can help identify whether the average weight of apples from two separate harvests is the same, or whether one of them has a higher average weight.

To accomplish this, you take one random sample from each group, find the average of each sample, and then calculate the difference between the two sample averages. Because the samples are chosen randomly, an unusually large or small value can skew a sample's mean even when the two population averages are actually the same. The two-sample t-test determines whether the observed difference between the sample averages is likely the result of a real difference between the populations or simply a random occurrence.

Thus, the goal is to figure out the probability of observing such a difference between the sample averages, assuming that the population averages are the same. If this probability is small, we reject the assumption that they are equal.
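
To build intuition for why sample averages can differ by chance alone, here is a minimal simulation sketch (NumPy assumed; the population parameters are invented for illustration). Both samples come from the same population, yet their averages almost never match exactly:

```python
import numpy as np

rng = np.random.default_rng(seed=42)

# Two samples drawn from the SAME population (mean 150, standard deviation 20).
population_mean, population_sd = 150, 20
sample_1 = rng.normal(population_mean, population_sd, size=30)
sample_2 = rng.normal(population_mean, population_sd, size=30)

# The sample averages differ slightly purely because of random sampling,
# even though the population averages are identical.
print(sample_1.mean(), sample_2.mean())
```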

Here is how you can define your hypothesis:

Null Hypothesis

Our initial assumption is that there is no difference between the population averages, so that will be the null hypothesis ($H_0: \mu_1 = \mu_2$).

In the placebo example, where a drug group is compared with a placebo group, we start with the assumption that the drug doesn't have any effect and that the average effectiveness is the same in both groups.

$$H_0: \mu_\text{drug} = \mu_\text{placebo}$$

Alternative Hypothesis

The alternative hypothesis depends on what we want to detect: whether the two averages simply differ, or whether one is specifically greater or smaller than the other. Its possible forms are:

$$H_1: \mu_1 \neq \mu_2 \quad \text{(two-tailed hypothesis)}$$
$$H_1: \mu_1 > \mu_2 \quad \text{(one-tailed hypothesis)}$$
$$H_1: \mu_1 < \mu_2 \quad \text{(one-tailed hypothesis)}$$

In the placebo effect example, if we assume that the drug indeed has an impact on pain reduction, our hypothesis would be: $H_1: \mu_\text{drug} < \mu_\text{placebo}$.

The difference between a one-sample and a two-sample t-test lies in their purpose. A one-sample t-test checks whether a sample mean deviates significantly from a known or hypothesized population mean, while a two-sample t-test compares the means of two distinct samples to see whether they differ significantly from each other.
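
As a quick, hypothetical illustration of this distinction (SciPy assumed; the data arrays below are made up), the sketch runs both variants side by side:

```python
import numpy as np
from scipy import stats

# Made-up measurements, for illustration only.
group_a = np.array([18.2, 21.5, 19.8, 22.1, 20.4, 17.9, 23.0, 19.5])
group_b = np.array([21.7, 24.3, 22.9, 25.1, 23.6, 22.0, 24.8, 23.2])

# One-sample t-test: does the mean of group_a differ from a hypothesized value of 22?
print(stats.ttest_1samp(group_a, popmean=22))

# Two-sample t-test (Welch's version): do the means of the two groups differ from each other?
print(stats.ttest_ind(group_a, group_b, equal_var=False))
```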

Taking Samples

In this stage, you take a sample from each population. Each sample should be large enough to provide statistically meaningful results. Then, you calculate two things for each sample: the sample average ($\bar{x}$) and the sample standard deviation ($s$).

Remember, each sample must be randomly selected to ensure unbiased and representative insights from a larger data group.
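
As a small sketch of this step (NumPy assumed; the measurements are made up), both quantities take one call each; note that ddof=1 gives the sample standard deviation, which divides by n - 1:

```python
import numpy as np

sample = np.array([21.4, 19.8, 23.1, 20.5, 22.7, 18.9])  # made-up measurements

x_bar = sample.mean()     # sample average
s = sample.std(ddof=1)    # sample standard deviation (divides by n - 1)
n = sample.size           # sample size

print(x_bar, s, n)
```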

P-value

The p-value in hypothesis testing is the probability of obtaining the observed results, or more extreme ones, under the assumption that the null hypothesis is true.

  1. Calculating the t-value:
    This is done by plugging the sample data into the following formula:
    $$t = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{\Delta}}}$$
    where
    $$s_{\bar{\Delta}} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
    $\bar{x}_1$ and $\bar{x}_2$ are the averages of the two samples,
    $s_1$ and $s_2$ are the standard deviations of the two samples,
    $n_1$ and $n_2$ are the sample sizes of the two samples.

    In the case of the placebo effect example, we have the following data:

    Sample 1: $n_1 = 30$, $\bar{x}_1 = 21.6$, $s_1 = 4.12$

    Sample 2: $n_2 = 25$, $\bar{x}_2 = 19.4$, $s_2 = 4.12$

    Our hypothesis is:
    $$H_0: \mu_1 = \mu_2$$
    $$H_1: \mu_1 > \mu_2$$

  2. Thus, the t-value is given by:
    $$t = \frac{21.6 - 19.4}{\sqrt{\frac{4.12^2}{30} + \frac{4.12^2}{25}}} \approx 1.972$$

  3. Determining the Degrees of Freedom:
    To obtain this, we plug the sample data into the Welch-Satterthwaite formula:
    $$\text{df} \approx \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{\left(s_1^2/n_1\right)^2}{n_1 - 1} + \frac{\left(s_2^2/n_2\right)^2}{n_2 - 1}}$$
    The formula is cumbersome to evaluate by hand, so in practice we use software. In our example, the degrees of freedom come out to approximately 51.2.
    To use an online calculator, visit the t-test calculator and change the input type to "Mean and standard deviation".

  4. Deriving the Corresponding P-value:
    For each t-value, there is a corresponding p-value in the t-table. The rows of the table correspond to degrees of freedom, the body of the table contains t-values, and the column headers give the p-values for one-tailed and two-tailed hypotheses. We need the entry for $t = 1.972$ with degrees of freedom equal to 51.2.
    T-tables usually do not list every degree of freedom this high (the values change very little for large samples), so we take the closest available row, 60. The closest tabulated value to $t = 1.972$ is 2, which corresponds to a one-tailed p-value of about 0.025.

  5. Obtaining P-value Using Software:
    The t-table is an old-school method. To simplify the calculation, we can use a two-sample t-test calculator, just as we did for the degrees of freedom; a short code sketch after this list shows the same calculation in Python.
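
Here is a minimal sketch of that calculation in Python (SciPy assumed; equal_var=False selects Welch's version of the test, and the alternative argument requires a reasonably recent SciPy). It reproduces the t-value of about 1.972, the degrees of freedom of about 51.2, and the one-tailed p-value of about 0.027 from the summary statistics above:

```python
import numpy as np
from scipy import stats

# Summary statistics from the example above.
n1, x_bar1, s1 = 30, 21.6, 4.12   # sample 1
n2, x_bar2, s2 = 25, 19.4, 4.12   # sample 2

# t-value and Welch's degrees of freedom computed by hand.
se = np.sqrt(s1**2 / n1 + s2**2 / n2)
t = (x_bar1 - x_bar2) / se
df = (s1**2 / n1 + s2**2 / n2) ** 2 / (
    (s1**2 / n1) ** 2 / (n1 - 1) + (s2**2 / n2) ** 2 / (n2 - 1)
)
p_one_tailed = stats.t.sf(t, df)  # P(T > t) for the one-tailed H1: mu1 > mu2

print(t, df, p_one_tailed)  # roughly 1.972, 51.2, 0.027

# The same test computed straight from the summary statistics.
print(stats.ttest_ind_from_stats(x_bar1, s1, n1, x_bar2, s2, n2,
                                 equal_var=False, alternative='greater'))
```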

Now, with everything ready, we can make our decision.

Making a Decision

Before conducting our test, we set a limit for the p-value. This threshold is called the significance level, and it is denoted by $\alpha$.

  • If the p-value is less than the significance level ($\text{p-value} < \alpha$), you reject the null hypothesis and accept the alternative hypothesis. This indicates that the averages of the two groups are significantly different.

  • If the p-value is greater than the significance level ($\text{p-value} > \alpha$), you fail to reject the null hypothesis. This indicates there isn't enough evidence to confirm a significant difference between the averages.

Typically, the significance level is set at 0.05 (5%). Our p-value in this case is 0.027, which is less than the significance level, so we reject the null hypothesis in favor of the alternative. That means there is a genuine difference between the averages of the two populations, and we can conclude that the drug does effectively reduce pain.
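
Expressed as a tiny sketch (reusing the numbers from the example above), the decision rule comes down to a single comparison:

```python
alpha = 0.05       # chosen significance level
p_value = 0.027    # one-tailed p-value obtained above

if p_value < alpha:
    print("Reject H0: the difference between the averages is statistically significant.")
else:
    print("Fail to reject H0: not enough evidence of a difference between the averages.")
```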

Assumptions for the Test

Before conducting a two-sample t-test, it's crucial to check that its assumptions hold. For Welch's t-test, the version used above, the assumptions are the following (a short check sketch follows the list):

  • Independence: Data points within each group must be independent.

  • Normality: Each group's data must approximately follow a normal distribution.

  • Variances: Unlike the standard (pooled) t-test, Welch's test does not require the two groups to have equal variances.
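
As a rough sketch of how one might sanity-check the normality assumption in practice (SciPy assumed; the Shapiro-Wilk test is just one common screening option, not something required by the t-test itself):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=0)

# Made-up data standing in for the two groups.
group_1 = rng.normal(21.6, 4.12, size=30)
group_2 = rng.normal(19.4, 4.12, size=25)

# Shapiro-Wilk test of normality for each group:
# a large p-value gives no evidence against the normality assumption.
print(stats.shapiro(group_1))
print(stats.shapiro(group_2))
```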

Conclusion

The two-sample t-test is a method of hypothesis testing that indicates whether the averages of two populations differ. It is a valuable tool in fields such as medicine, scientific research, and data analysis.

The testing procedure typically follows these steps:

  • Statement of hypotheses

  • Sampling from the two populations

  • Determination of p-value

  • Comparison of p-value with the significance level

  • Drawing a conclusion from the test
