April 7th Module

 
 

Welcome back!

Wellness roll call… (yes, again!)

What to Expect

Welcome back! We’ve covered a lot of ground since we went online, and lots of folks have requested additional examples, so we are going to do a lot of review this week. This is for three reasons. First, I want you to have the opportunity to revisit the last two weeks of content with a fresh perspective. Second, I want you to have time to work on your Interpreting Statistics and Final Portfolio assignments. And third, your new technique for this week, ANOVA testing, requires a strong familiarity with t-testing.

 
 
 
 

Let’s practice!

I’d like us to get very comfortable with mean-testing and T-Tests. And I’ve taken variable requests this week! So let’s look at a few examples. I’m going to walk through a new example, but sprinkle in a couple of yours.

 
 

Again, I’ll be using the variables:

childs: A measure of the respondent's number of children
relig: A measure of the respondent's own religious preference
sprel: A measure of the respondent's spouse's religious preference
chldidel: A measure of the respondent's ideal number of children
age: A measure of the respondent’s age

So we could imagine a lot of different hypotheses that a person might have about these variables. Let’s consider three that are a bit different.

  1. I hypothesize that Americans tend to want more than two children

  2. I hypothesize that Christian Americans tend to want more children than non-Christians

  3. I hypothesize that male Americans report a lower “ideal number of children” than female Americans

These are all “mean testing” hypotheses. And yours should be too.

Let’s focus on the first hypothesis: that Americans on average want more than two children. This is an opportunity to test our sample mean, to obtain a confidence interval, and to gain some clarity about the population mean! Let’s start with our hypothesis.

“Americans on average want more than two children”

First, we need to know if this is a reasonable hypothesis based on the sample. So we will start by looking at the sample stats using the “sum” command:

week 10 sum childs.png
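
For anyone following along in Stata, the command behind that screenshot can be sketched as below. This assumes the course's GSS extract is already open and that the variable is named childs, as in the list above. “sum” is just Stata's abbreviation for summarize.

```stata
* Assumes the course GSS extract is already open in Stata
* "sum childs" is the same command, abbreviated
summarize childs

* Optional: add percentiles, skewness, and kurtosis to the table
summarize childs, detail
```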

We can see that the mean of the sample is about 1.9, so our hypothesis that the average American wants more than 2 is a bit bold, but not totally out of the ballpark. So we’ll proceed. Savvy?

To test our hypothesis we will use a t-test. Again we use this technique when we are trying to infer something about the population’s mean based on our sample. We might also notice that our hypothesis is directional.

Here, we’re testing whether the population mean is less than, not equal to, or greater than 2. Let’s check the output.

week10 ttest childs 2.png
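
The command that produces a table like this can be sketched as follows, again assuming the GSS extract is open. The display lines are optional: they simply re-print the results that ttest leaves behind in r(), so you can match each one to its spot in the output.

```stata
* Assumes the course GSS extract is already open in Stata
* One-sample t-test of the null hypothesis that the population mean of childs is 2
ttest childs == 2

* ttest stores its results in r(); these lines just re-print them
display "t statistic:            " r(t)
display "p-value, Ha: mean < 2:  " r(p_l)
display "p-value, Ha: mean != 2: " r(p)
display "p-value, Ha: mean > 2:  " r(p_u)
```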

Take a second to see if you can interpret this yourself.

A recap in writing:

Across the top we have information about the sample: 1971 respondents to the question, reporting a mean of 1.9 (~2 children), a standard error of .04, a standard deviation of 1.67. It reports also that we can be 95% confident that the population mean falls between 1.81 and 1.97 (we would generally report this as between 1 and 2 children, as you can’t have 1.81 children). Our t score is -2.86.
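
As a reminder of where those numbers come from, the standard error, the t score, and the confidence interval in a one-sample test are tied together by the usual textbook formulas. The symbols are my own shorthand, not Stata's labels: \bar{x} is the sample mean, s the sample standard deviation, n the number of respondents, \mu_0 the value being tested (2 here), and t^{*} the critical t value for 95% confidence.

```latex
\[
SE = \frac{s}{\sqrt{n}}, \qquad
t = \frac{\bar{x} - \mu_0}{SE}, \qquad
\text{95\% CI} = \bar{x} \pm t^{*} \cdot SE
\]
```

With the rounded values above, SE = 1.67 / \sqrt{1971}, which is roughly 0.038 and rounds to the .04 reported above.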

Beneath that, it tells you what’s being tested: that the population mean is equal to 2. And at the very bottom are three significance tests, one for each alternative: that the population mean is less than, not equal to, or greater than 2.

We have three “p-values”: 0.0021, 0.0042, and 0.9979. Each one tells us how likely a sample result at least this extreme would be if the population mean really were 2. Again we look for a p-value of less than .05 (think, 5%) to be able to say that the test is significant to the tune of 95% confidence. In this case, the first and second are significant, as these p-values are less than .05! This means we have something to report! Unfortunately it is the opposite of what our hypothesis was!

Here we would say that we can be 95% confident that the population mean is less than 2.

Ok onto our second hypothesis:

I hypothesize that Christian Americans tend to want more children than non-Christians.

You should notice right away that this hypothesis is comparing two groups, in this case Christians to non-Christians. This means two things. First, we will be using a bivariate (two-group) t-test. And second, we need to generate a new variable with only two categories: Christians and non-Christians.

Let’s start there. Do you remember how to create a new variable?

week10 christian recode.png
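
If you'd like a starting point, one way to build that binary variable is sketched below. The numeric codes are an assumption based on the standard GSS coding of relig (1 Protestant, 2 Catholic, 10 Orthodox Christian, 11 Christian), so check them against your own extract before trusting the result. The name christian is only my placeholder for the new variable.

```stata
* Assumes the course GSS extract is already open in Stata
* Check the numeric codes behind the labels before recoding
tabulate relig
tabulate relig, nolabel

* ASSUMPTION: codes 1, 2, 10, and 11 are the Christian categories in this extract
* Result: 1 = Christian, 2 = non-Christian; missing values stay missing
recode relig (1 2 10 11 = 1 "Christian") (nonmissing = 2 "Non-Christian"), generate(christian)

* Confirm the recode did what we intended
tabulate relig christian, missing
```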

Now that I have a binary variable where Christians are “1” and non-Christians are represented by “2,” we can perform a grouped t-test to address our hypothesis.

week 10 ttest childs christnonchrist.png
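
The command behind that output can be sketched as follows, assuming the binary variable is named christian (my placeholder; use whatever name you gave yours). One detail worth knowing when you read the bottom row: Stata computes the difference as the mean of the first group minus the mean of the second, so with Christians coded 1, “Christians have more children” corresponds to the alternative diff > 0.

```stata
* Assumes the GSS extract is open and christian is coded 1 = Christian, 2 = non-Christian
* Two-group t-test: compare mean number of children across the two groups
ttest childs, by(christian)

* diff = mean(group 1) - mean(group 2), so r(p_u) is the p-value for
* the alternative "Christians have a higher mean than non-Christians"
display "p-value, Ha: diff > 0: " r(p_u)
```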

Let’s look at our third. See if you can interpret it yourself.

Male Americans reported a lower “ideal number of children” than women, but only if you look at the respondents who do not already have children.

This is a tricky one... It’s a logic puzzle to help you train your brain into understanding the t-test!
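
If you want to try this one in Stata, a minimal sketch is below. It assumes the GSS variable sex is in your extract and coded 1 = male, 2 = female (it is not in the variable list above, so check first). The “if” qualifier is what restricts the test to respondents without children.

```stata
* Assumes the GSS extract is open; sex (1 = male, 2 = female) is an assumed variable
* Keep only respondents with no children, then compare ideal number of children by sex
ttest chldidel if childs == 0, by(sex)

* diff = mean(male) - mean(female), so "men report a lower ideal number"
* corresponds to the alternative diff < 0
display "p-value, Ha: diff < 0: " r(p_l)
```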

Great work! Now let’s look at an example from one of your classmates. Her t-tests are listed below:

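* One-sample t-test: is the population mean age different from 50?
ttest age == 50
* Two-group t-test: compare mean age across the two groups defined by fepol
ttest age, by(fepol)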

Great work! I hope these examples were helpful and clarifying! I invite you to send me your examples before Q&A so I can workshop those as well.

Ready for something new? It’s a short description, so hang with me. You’re almost there.


Analysis of Variance

Analysis of Variance, or “ANOVA” testing, is a new technique, but not a totally unfamiliar one! ANOVA testing relies on a logic very similar to that of mean testing. The main difference is that it allows us to consider more than two groups of the variable being tested.

Today we will be focusing on simple ANOVA, or one-way ANOVA testing, but know that there are much more complex versions of the ANOVA test available to you. You can read about them in your book, but we will stick to the very basics here.


When to Use ANOVA Testing

T-tests allowed us to compare population means between two groups. Simple ANOVA testing allows us to consider differences among more than two groups. It does this by essentially comparing a measure of variance within groups against a measure of variance between groups. Another way of putting this is, ANOVA testing attempts to quantify how much of the variation in a population can be “explained” by within-group variance, and how much must be explained by between-group variance.

Let’s consider an example. Let’s look back to our variables here:

childs: A measure of the respondent's number of children
relig: A measure of the respondent's own religious preference
sprel: A measure of the respondent's spouse's religious preference
chldidel: A measure of the respondent's ideal number of children
age: A measure of the respondent’s age


What ANOVA testing allows us to do here, which t-testing does not, is to work with more than 2 groups. For example, it would allow us not to condense the religion variable to just two groups! Wouldn’t that be nice. See, the better you get at statistics, the less reductive you can be about the human experience. Big day.

Ok so consider the religion variable…

Let’s imagine that we still wanted to streamline the variable so that the categories are large enough to work with, but we want more variation than simply “Christian” and “Non-Christian.” We could do that! We could also look at more nuance within the category of Christian! We’ll do a bit of both.

image.jpg
condensed religion 2.png

So in my first re-code (above and to the right), this new variable “relig_condensed2” looks like this:

1 - Non-Catholic Christianity
2 - Catholicism
3 - Judaism
4 - No Religion
5 - “Other”
6 - Buddhism, Hinduism, or “Other Eastern”
7 - Islam

This time, instead of comparing religious groups based on their actual number of children, we will compare their “ideal” number of children. So we will use “chldidel” and “relig_condensed2.”

Let’s talk about the logic

Start with the “end game.” Our “end game” here is the variable “chldidel,” or ideal number of children. The following must be true:

  • All Americans, if asked, could share an “ideal number of children”

  • The entire American population therefore has an average (mean) ideal number of children, which we do not know.

  • Each religious group has its own distribution of ideal number of children, which we also do not know, and which may be the same as or different from the population distribution.

  • We have a sample of Americans from different religious groups, and with different ideal numbers of children.

This is where we begin

We can see above that each religious group has its own sample mean. For example, Muslims report the highest mean at 4, while “Buddhism, Hinduism, or Other Eastern” report the lowest mean, at about 2.4. There is a small difference in sample means between Catholics and non-Catholic Christians (3.17 and 3.25).

sums chldidel relig condensed.png

What we do not know is whether these differences in the sample means suggest statistically significant differences between groups in the entire population.

Now T-tests allow us to do this for only two groups. But we have several groups here. Unfortunately we have fewer than 20 observations for several of these groups, so we will have to condense further to get a more meaningful output.

recode relig_condensed2 (1=1) (2=2) (3=3) (4=4) (5=3) (6/7=3), gen(relig_condensed3)
1 - Non-Catholic Christianity
2 - Catholicism
3 - Non-Christian, but Religious
4 - No Religion

You’ll notice how difficult it is to not eliminate (or over-report about) minority populations. One must work very hard (and often spend a lot of money) to preserve large sample sizes for under-represented groups in order to report meaningful information about them. It’s rare to find data that achieves this, but there are some very cool data nerds out there trying to make those changes. I hope that some of you will become these people. Unfortunately the GSS does not prioritize this, so we’ll do our best with what we’ve got.

We now have 4 groups, each with their own sample mean. And we can see some small differences between groups in the sample.

sums new relig chldidel.png
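
A table like this one can be produced with tabstat (a sketch, assuming relig_condensed3 was created by the recode above). It lists the number of respondents, the mean, and the standard deviation of chldidel for each of the four groups, which is a useful check on group sizes before running any test.

```stata
* Assumes the GSS extract is open and relig_condensed3 already exists
* Group sizes, means, and standard deviations of ideal number of children
tabstat chldidel, by(relig_condensed3) statistics(n mean sd)
```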

ANOVA testing posits the following:

  • For each religious group, there is a distribution of desired number of children

  • To some degree, this in-group distribution will reflect the variation in the population (people simply vary on this topic in general), and to some degree it will reflect variation that is specific to that group (people in this group may vary on this topic in a particular way).

This is the crux of ANOVA testing. Some of the in-group distribution is as it is, simply because variation exists in the population, and not because that group is special. But some may be as they are because this group is unlike other groups.

ANOVA testing attempts to summarize this dynamic. It measures “within-group” variance and “between-group” variance, and calculates a ratio of these values. This ratio helps us understand the amount of variance that can be “explained” by a difference between groups, and the amount of variance that can be explained simply by variation within groups.

To contextualize this with our example: there is a distribution of “ideal number of children” among Catholics. Some Catholics want 8 children, and some want 0. Whatever it may be, there is a distribution of some kind. It is shown in its most basic form below (left). We can also observe the distribution for all respondents (right). You can see that it is both similar to, and different from, the distribution for only Catholics.

distribution catholics.png
distribution all.png
 

But we can’t “see” very much just looking at these distributions. And that’s why we have ANOVA testing.

But how does an ANOVA test calculate these “within group” and “between group” variances?

You’re not going to like it… but… it looks a little something like this:

chart catholic 1.png

In the above chart, I’ve condensed the sample down to only 9 respondents in order to show you the logic, but Stata does all this for you with all 2,000 cases!

(Stata appreciation exhale… Phewwwwww)

What you’re looking at here is a lot of information, but it might be familiar, because it’s very similar to the chart you built out when calculating correlation coefficients by hand (remember listing the Xs and the Ys, and the squares of those values, and the sums of the squares and the squares of the sums and so on?). This is a similar idea. We need to chart a lot of these means and sums within and between groups. Take a second to look at the chart.

The green area in the bottom right is the most important section, because it is from these values that we can calculate the “Between Sum of Squares” and the “Within Sum of Squares.”

 
sums of squares.png
 

(Remember how the ANOVA test gives us a ratio of the between-group variance to the within-group variance? Yup! That’s what these are for!)

While they may not look familiar, these are measures of variance. And what we want are means of these variance measures, so that we can create a ratio of average between-group variance to average within-group variance. To do this, we divide these values by their “degrees of freedom.”

(We haven’t talked much about degrees of freedom. Just know, for the time being, it’s a slightly more complicated measure of sample size than simply counting the number of entries in a sum. In this case, the degrees of freedom for the first calculation (between sum of squares) is one less than the number of groups involved. In the small sample above, we have 3 groups listed, so the df would be 2. The df for the second value (within groups) is going to be the number of values in the set, minus the number of groups. In this case, that is 30-3 or 27.)

Mean of Between Sum of Squares = Between Sum of Squares / df (between groups)

Mean of Within Sum of Squares = Within Sum of Squares / df (within groups)
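
Written out in symbols, the pieces so far look like this. The notation is my own shorthand rather than anything Stata prints: k is the number of groups, N the total number of respondents, n_j the number of respondents in group j, \bar{x}_j that group's mean, and \bar{x} the mean of everyone together.

```latex
\[
SS_{\text{between}} = \sum_{j=1}^{k} n_j \,(\bar{x}_j - \bar{x})^2,
\qquad
SS_{\text{within}} = \sum_{j=1}^{k} \sum_{i=1}^{n_j} (x_{ij} - \bar{x}_j)^2
\]
\[
MS_{\text{between}} = \frac{SS_{\text{between}}}{k - 1},
\qquad
MS_{\text{within}} = \frac{SS_{\text{within}}}{N - k}
\]
```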


We’re so close to the finish line.

Because Stata is very, very cool, it will run an ANOVA test to crunch all these numbers and produce an F-statistic from an F-test. This test uses the above “scores” to produce a metric we can interpret. The F score is simply the ratio of the two values above (the mean of the between sum of squares divided by the mean of the within sum of squares), and helps us answer the following question:

How much of the variance in Group A seems to belong uniquely to Group A, and how much seems to belong to all groups in the population?
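
To see the whole calculation in one place, here is a sketch that works through it by hand in Stata and then checks the answer against the built-in oneway command. The nine data points are made up purely for illustration (they are not GSS values and not the numbers in the chart above), and all variable and scalar names are my own.

```stata
* CAUTION: "clear" drops whatever data is in memory; save your work first
clear

* Toy data: 9 made-up respondents in 3 groups (hypothetical values)
input group ideal
1 2
1 3
1 4
2 3
2 4
2 5
3 1
3 2
3 3
end

* Mean of everyone, and mean of each group
egen grand_mean = mean(ideal)
bysort group: egen group_mean = mean(ideal)

* Squared deviations: group mean from grand mean, and each value from its group mean
generate between_sq = (group_mean - grand_mean)^2
generate within_sq  = (ideal - group_mean)^2

* Between and within sums of squares
quietly summarize between_sq
scalar ssb = r(sum)
quietly summarize within_sq
scalar ssw = r(sum)

* Degrees of freedom: groups - 1, and observations - groups
scalar df_b = 3 - 1
scalar df_w = _N - 3

* Mean squares, the F score, and its p-value
scalar msb = scalar(ssb) / scalar(df_b)
scalar msw = scalar(ssw) / scalar(df_w)
scalar f_stat = scalar(msb) / scalar(msw)
display "Between SS = " scalar(ssb) "   Within SS = " scalar(ssw)
display "F = " scalar(f_stat)
display "Prob > F = " Ftail(scalar(df_b), scalar(df_w), scalar(f_stat))

* Check against Stata's built-in one-way ANOVA
oneway ideal group
```

With these made-up numbers, both sums of squares come out to 6, so the F score is (6/2) / (6/6) = 3, and the F in the oneway table should match.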

Getting ready for Stata

When we approach Stata with this question, our Null Hypothesis is that all the group means are equal to one another. In this case, that Catholics, Non-Catholic Christians, Religious Non-Christians, and Non-religious people all have the same mean ideal number of children.

So we run our One-Way ANOVA test.

one way.png

Look at all the work it did for you! And some familiar friends are here! You have the F-statistic, your within and between group variances, some degrees of freedom! And our old pal, the p-value, which is all the way to the right there under “Prob>F.” If this value is less than .05 we can reject our null hypothesis. And it is! So we can!

We can reject the null hypothesis that all the groups have the same mean.
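
For reference, the command behind that table can be sketched as below, assuming the four-group variable is named relig_condensed3 as in the recode earlier. The tabulate option adds each group's mean, standard deviation, and size above the ANOVA table.

```stata
* Assumes the GSS extract is open and relig_condensed3 already exists
* One-way ANOVA: does mean ideal number of children differ across the four groups?
oneway chldidel relig_condensed3, tabulate

* oneway stores its results in r(); rebuild the "Prob > F" value from them
display "F statistic: " r(F)
display "Prob > F:    " Ftail(r(df_m), r(df_r), r(F))
```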

You’re probably wondering though… Are all the group means different? Is just one mean different from all the other means? How did that giant chart of information give us one tiny value? How do we know which groups are the null hypothesis breakers?

These are great questions! … For next week :)

We’ll break here for now.

Because to understand the rest, we really need Stata’s help.

As I said, I’ll post a video this week, but you don’t have to review it until next week, as you technically have the rest of this week off. For now…

Final Roll Call