Independent Two-Sample t-Test System

An Independent Two-Sample t-Test System is a statistical hypothesis testing system that implements an independent two-sample t-test algorithm to solve an independent two-sample t-test task.

Context
- It can be based on the implementation of the following algorithms:
  - An Independent Two-Sample t-Test Algorithm to calculate the respective independent two-sample t-test statistic and the respective p-value.
  - A t-distribution calculator, or an alternative algorithm to calculate the Probability Density Function and Cumulative Distribution Function of the t-distribution in order to evaluate the null hypothesis and alternative hypothesis.
- …
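The two components listed above can be sketched as follows. This is a minimal illustration, not the implementation of any particular system: the function name pooled_t_test and the sample values are hypothetical, and the equal-variance (pooled) form of the test is assumed. The t-statistic is computed by hand, the two-sided p-value is taken from the t-distribution via scipy.stats.t.sf, and the result is compared with scipy.stats.ttest_ind.

```python
import numpy as np
from scipy import stats


def pooled_t_test(a, b):
    """Two-sided Student's t-test for two independent samples (equal variances assumed)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n1, n2 = a.size, b.size
    df = n1 + n2 - 2  # degrees of freedom of the pooled test
    # Pooled estimate of the common variance (ddof=1 gives unbiased sample variances).
    pooled_var = ((n1 - 1) * a.var(ddof=1) + (n2 - 1) * b.var(ddof=1)) / df
    t_stat = (a.mean() - b.mean()) / np.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    # Two-sided p-value from the survival function (1 - CDF) of the t-distribution.
    p_value = 2.0 * stats.t.sf(abs(t_stat), df)
    return t_stat, p_value


# Hypothetical data, made up for illustration only.
sample_a = [5.1, 4.9, 6.2, 5.8, 6.0, 5.5]
sample_b = [6.4, 6.1, 7.0, 6.8, 5.9, 6.6, 7.2]

t_manual, p_manual = pooled_t_test(sample_a, sample_b)
t_scipy, p_scipy = stats.ttest_ind(sample_a, sample_b)
print(t_manual, p_manual)
print(t_scipy, p_scipy)
```

Both print statements should show the same pair of values up to floating-point error, since ttest_ind uses the pooled-variance form by default.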
Example(s):
- Example based on http://www.scipy-lectures.org/packages/statistics/index.html#student-s-t-test-the-simplest-statistical-test using dataset http://www.scipy-lectures.org/_downloads/brain_size.csv. This system tests the null hypothesis "Female and Male Verbal IQ Means are equal" using the following IPython code lines:

#importing python libraries

In[1]: import pandas

In[2]: from scipy import stats

# reading sample data

In[3]: data = pandas.read_csv('http://www.scipy-lectures.org/_downloads/brain_size.csv', sep=';', na_values=".")

# VIQ (test variable) data categorization by Gender (grouping variable, nominal variable with two categorical values)

# sample 1: Female VIQ

In[4]: female_viq = data[data['Gender'] == 'Female']['VIQ']

# sample 2: Male VIQ

In[5]: male_viq = data[data['Gender'] == 'Male']['VIQ']

# (Optional 1) Test the null hypothesis that the population variances are equal using Bartlett's Test

In[6]: stats.bartlett(female_viq, male_viq)

#Output : Bartlett's test statistic value and p-value

Out[6]: (0.52142853432619718, 0.47023294528713788)

# (Optional 2) Test the null hypothesis that the population variances are equal using Levene's Test

In[7]: stats.levene(female_viq, male_viq)

#Output : Levene's test statistic value and p-value

Out[7]: (0.78528266993527363, 0.38110422921600584)

#Main Task: Testing whether female and male VIQ means are equal using an independent two-sample t-test

In[8]: stats.ttest_ind(female_viq, male_viq)

#Task Output : t-statistic value and p-value

Out[8]: (-0.77261617232, 0.4445287677858)

At the significance level [math]\displaystyle{ \alpha=0.05 }[/math], Bartlett's Test, Levene's Test and the independent two-sample t-test all fail to reject their null hypotheses, because each p-value is greater than this value.
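The decision rule applied in this conclusion can be written out explicitly. The sketch below is illustrative only: the helper name decide is hypothetical, and the p-values are copied from Out[6], Out[7] and Out[8] above rather than recomputed.

```python
def decide(p_value, alpha=0.05):
    """Return the test decision for a given p-value and significance level."""
    return "reject H0" if p_value < alpha else "fail to reject H0"


# p-values copied from the session outputs above (not recomputed here).
p_values = {
    "Bartlett": 0.47023294528713788,
    "Levene": 0.38110422921600584,
    "t-test": 0.4445287677858,
}

decisions = {name: decide(p) for name, p in p_values.items()}
for name, decision in decisions.items():
    print(name, "->", decision)
```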

A system that tests the null hypothesis that the mean heights of females and males are equal, using the same dataset as above and the following IPython code lines:

#importing python libraries

In[1]: import pandas

In[2]: from scipy import stats

# reading sample data

In[3]: data = pandas.read_csv('http://www.scipy-lectures.org/_downloads/brain_size.csv', sep=';', na_values=".")

# Fill in the missing values (NAN values) for Height

In[4]: data['Height'] = data['Height'].ffill()

# Height (test variable) data categorization by Gender (grouping variable, nominal variable with two categorical values)

# sample 1: Female Height

In[5]: female_h = data[data['Gender'] == 'Female']['Height']

# sample 2: Male Height

In[6]: male_h = data[data['Gender'] == 'Male']['Height']

# (Optional) Test the null hypothesis that the population variances are equal using Bartlett's Test

In[7]: stats.bartlett(female_h, male_h)

#Output : Bartlett's test statistic value and p-value

Out[7]: (2.0876034164547845, 0.14849886684013355)

#Main Task: Testing whether the mean heights of females and males are equal using an independent two-sample t-test

In[8]: stats.ttest_ind(female_h, male_h)

#Task Output : t-statistic value and p-value

Out[8]: (-6.3452292802666515, 1.915212359094238e-07)

Conclusion: at the significance level [math]\displaystyle{ \alpha=0.05 }[/math], Bartlett's Test fails to reject the null hypothesis of equal variances. However, the independent two-sample t-test rejects its null hypothesis: the p-value is far below this level, so we conclude that the mean heights of females and males are not equal.

An online two-samples t-test calculator such as:

Counter-Example(s):
- One-Sample t-Test System.
- Paired Samples t-Test System.
See: Parametric Statistical Test, Computing System, Parameter Optimization System.

References

2017a

(Scipy docs, 2017) ⇒ https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.ttest_ind.html
- scipy.stats.ttest_ind(a, b, axis=0, equal_var=True, nan_policy='propagate')[source]

Calculates the T-test for the means of two independent samples of scores. This is a two-sided test for the null hypothesis that 2 independent samples have identical average (expected) values. This test assumes that the populations have identical variances by default.
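The equal_var and nan_policy parameters in the signature above change how the test behaves. The following sketch uses hypothetical, made-up samples to contrast the default call with Welch's t-test (equal_var=False) that also skips missing values (nan_policy='omit'); it is meant as an illustration, not a recommendation for any particular dataset.

```python
import numpy as np
from scipy import stats

# Hypothetical samples; the second has a larger spread and one missing value.
group_a = np.array([20.1, 19.8, 20.5, 20.0, 19.9, 20.3])
group_b = np.array([22.0, 18.5, 25.1, np.nan, 21.7, 24.2, 19.0])

# Default call: pooled variance, and the NaN propagates into the result.
default_result = stats.ttest_ind(group_a, group_b)

# Welch's t-test (no equal-variance assumption), ignoring missing values.
welch_result = stats.ttest_ind(group_a, group_b, equal_var=False, nan_policy="omit")

print(default_result.statistic, default_result.pvalue)
print(welch_result.statistic, welch_result.pvalue)
```

With the default nan_policy='propagate', a single missing value makes both the statistic and the p-value NaN, which is why missing values need to be filled or omitted before testing.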

2017b

(Varoquaux, 2017) ⇒ Retrieved on 2017-02-16 from "Statistics in Python" http://www.scipy-lectures.org/packages/statistics/index.html#student-s-t-test-the-simplest-statistical-test
- QUOTE: 3.1.2.1.2. 2-sample t-test: testing for difference across populations - We have seen above that the mean VIQ in the male and female populations were different. To test if this is significant, we do a 2-sample t-test with scipy.stats.ttest_ind() …

2017c

(Lowry, 2017) ⇒ Retrieved from http://vassarstats.net/tu.html Copyright: Richard Lowry 2001-2017
- t-Test for Independent or Correlated Samples

The logic and computational details of two-sample t-tests are described in Chapters 9-12 of the online text Concepts & Applications of Inferential Statistics. For the independent-samples t-test, this unit will perform both the "usual" t-test, which assumes that the two samples have equal variances, and the alternative t-test, which assumes that the two samples have unequal variances. (A good formulaic summary of the unequal-variances t-test can be found on the StatsDirect web site. A more thorough account appears in the online journal Behavioral Ecology.)

2017d

A t test compares the means of two groups. For example, compare whether systolic blood pressure differs between a control and treated group, between men and women, or any other two groups.

Don't confuse t tests with correlation and regression. The t test compares one variable (perhaps blood pressure) between two groups. Use correlation and regression to see how two variables (perhaps blood pressure and heart rate) vary together. Also don't confuse t tests with ANOVA. The t tests (and related nonparametric tests) compare exactly two groups. ANOVA (and related nonparametric tests) compare three or more groups. Finally, don't confuse a t test with analyses of a contingency table (Fishers or chi-square test). Use a t test to compare a continuous variable (e.g., blood pressure, weight or enzyme activity). Use a contingency table to compare a categorical variable (e.g., pass vs. fail, viable vs. not viable).

2017e

(STHDA, 2017) ⇒ Retrieved from http://www.sthda.com/english/rsthda/unpaired-t-test.php
- Statistical tools for high-throughput data analysis: Student t-test for unpaired samples

2015a

(Hamelg, 2015) ⇒ Retrieved on 2017-02-26 from "Python for Data Analysis Part 24: Hypothesis Testing and the T-Test", http://hamelg.blogspot.ca/2015/11/python-for-data-analysis-part-24.html
- A two-sample t-test investigates whether the means of two independent data samples differ from one another. In a two-sample test, the null hypothesis is that the means of both groups are the same. Unlike the one sample-test where we test against a known population parameter, the two sample test only involves sample means. You can conduct a two-sample t-test by passing with the stats.ttest_ind() function.

2015b

(Mangiafico, 2015) ⇒ Mangiafico, S.S. 2015. An R Companion for the Handbook of Biological Statistics, version 1.3.0. Content retrieved from http://rcompanion.org/rcompanion/d_02.html
- (...) Welch’s t-test is shown above in the “Example” section (“Two sample unpaired t-test”). It is invoked with the var.equal=FALSE option in the t.test function.
