Statistics Preparations: Week 4

A Justice Site

Statistics Preparations
Week 4: Week of September 15, 2003

California State University, Dominguez Hills
University of Wisconsin, Parkside
Soka University Japan - Transcend Art and Peace
Created: September 15, 2003
Latest Update: September 15, 2003
jeannecurran@habermas.org
takata@uwp.edu

Soc. 220-01: Statistics: Week 4
Preparations for Class and Internet Discussions

* * * * *

Week 4: Week of September 15, 2003
Topic: Reviewing Arithmetic - What if someone explained it so I could understand it?

• Topic:

We need two kinds of practice this week: practice with SPSS, which I'll add to the lectures in a few moments, and practice with creating the project questionnaire schedule. I'll try to be back on Thursday. Meanwhile . . . .

• Lectures:

• Concepts:

• frequency distribution: "A frequency distribution is a summary of the responses to the categories of a variable." Kendrick at p. 93.
Example: a frequency distribution of the political parties in California would show Democrats, Republicans, Independents, Peace and Freedom, Greens, Libertarians, and Others. The frequency distribution would show us that most of the population registers in either the Democratic or the Republican Party. That means that we could probably lump all the other parties together as Other, and recode this as Democratic, Republican, and Other. That would make it a lot easier to grasp at a glance.
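
As a rough sketch of that lumping, here is the same recode done in Python. The registration list is invented purely for illustration (it is not real California data), and the names `registrations` and `lump_small_parties` are hypothetical.

```python
from collections import Counter

# Made-up registrations for illustration only; NOT real California figures.
registrations = (
    ["Democratic"] * 9
    + ["Republican"] * 7
    + ["Independent"] * 2
    + ["Green"]
    + ["Libertarian"]
)

def lump_small_parties(party):
    """Recode every party except the two largest into 'Other'."""
    return party if party in ("Democratic", "Republican") else "Other"

# Frequency distribution before the recode: one count per original category.
original = Counter(registrations)

# Frequency distribution after the recode: only three categories remain.
recoded = Counter(lump_small_parties(p) for p in registrations)

for party, count in recoded.most_common():
    print(f"{party:<12}{count:>3}")
```

The recoded distribution has three rows instead of five, which is the whole point: fewer categories, easier to read at a glance.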

• independent and dependent variables: "The variable that is producing or creating the effect is the independent variable, whereas the variable being affected is the dependent variable." Kendrick, at p. 15.
Not only can we describe things with numbers, like how many Californians are Republicans, how many Democrats, etc., we can also see how that party affiliation affects the way they vote. At an exit poll (meaning that we ask people as they exit from the polling place, and no, they don't have to tell the truth; they don't even have to answer us) we might ask both their affiliation and how they voted on the recall election. Then we could determine whether there was some kind of relationship between their political affiliation and the way they "said" they voted.

What's the independent variable here? Political affiliation. We want to know if Democrats will vote one way and Republicans another. That makes how they voted on the recall the dependent variable.

• causality: Touchy word in social statistics. Our measurements are so vague that it's really hard to determine causality. Very little in our lived experience is uncomplicated enough to have just a few variables operating on it. Most of life is complex. So be very careful with using the words cause or causality. When you use them in statistics you mean that you have some reliable evidence to show that the effects of one variable actually "caused" the effects on another. Not likely. For example, look at all the material on the mind and criminality. We're no closer to understanding what creates a murderer or a Saddam Hussein than we were many years ago. We've got a lot of theories, but . . . .

Or think of Ritalin and hyperactivity in children. Or think of cancer and how many times we've tried to find a cure that will "cause" it to go away.

• contingency table: "Contingency tables, or crosstabs (short for crosstabulations) in SPSS lingo, display data in such a way that we can look at whether or not one variable (referred to as the independent variable) seems to be having an effect or influence on a second variable (the dependent variable.)" Kendrick, at pp. 248-9.
Recall the lecture on Thursday when we looked at General Happiness of the Respondent in a crosstab with highest level of school completed. Please link to my lecture notes on that.
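
To make the idea concrete, here is a minimal sketch of a contingency table for the exit-poll example, with political affiliation (the independent variable) in the rows and the reported recall vote (the dependent variable) in the columns. The responses are invented for illustration, not real poll results, and the names `responses`, `cells`, and `row_percent` are hypothetical. Percentaging within each category of the independent variable is a common convention, used here as an assumption.

```python
from collections import Counter

# Invented exit-poll answers as (affiliation, vote on the recall) pairs.
# Illustration data only, not real poll results.
responses = (
    [("Democratic", "No")] * 6
    + [("Democratic", "Yes")] * 2
    + [("Republican", "Yes")] * 5
    + [("Republican", "No")] * 1
    + [("Other", "Yes")] * 2
    + [("Other", "No")] * 2
)

# Count each (row, column) cell of the contingency table.
cells = Counter(responses)
rows = ["Democratic", "Republican", "Other"]  # independent variable
cols = ["Yes", "No"]                          # dependent variable

def row_percent(row, col):
    """Percent of one affiliation (row) that reported a given vote (col)."""
    row_total = sum(cells[(row, c)] for c in cols)
    return 100 * cells[(row, col)] / row_total

# Print the table with row percentages so the rows can be compared.
print(f"{'':<12}" + "".join(f"{c:>8}" for c in cols))
for r in rows:
    print(f"{r:<12}" + "".join(f"{row_percent(r, c):>7.0f}%" for c in cols))
```

If the percentages differ a lot from row to row, affiliation seems to be related to how people "said" they voted. That is a relationship, not proof of causality.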

• Discussion Questions:

1. On Thursday, I avoided Agnes' and Bernice's crosstabs, and chose instead to work with Jose's. The reason for that display of discrimination was that Agnes and Bernice had used two variables that had large numbers of categories. As I recall, they might even have been interval, meaning that the respondents recorded the numbers of children, or numbers of times they had done something, or the numbers of brothers and sisters they had. Yikes. To do a dummy table I would have had to draw dozens of categories.

Jose had graciously chosen General Happiness as perceived by the Respondent. That was only 3 or 4 categories. Recall that I put General Happiness across the top of the table, which meant it was represented in the COLUMNS, and then I faked all those categories of years of school completed by skipping lots of them and making my table messy enough that you didn't fuss about that. Number of years of school completed was listed down the side and recorded in the rows.

Then I explained that what I was really doing was recoding the variable of numbers of years of school completed. I think we eventually made it 0-6, 7-12, 13-14, and 15+. That made our table lots more usable.
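
A dummy table is just the empty frame of a crosstab: labels, no numbers yet. The sketch below prints a frame like the one described above, with happiness in the columns and the recoded schooling ranges in the rows. The three happiness labels are assumed for illustration (the notes say only that there were 3 or 4 categories), and `happiness`, `schooling`, and `dummy_table` are hypothetical names.

```python
# Column labels: assumed for illustration; the notes say only "3 or 4 categories".
happiness = ["Very happy", "Pretty happy", "Not too happy"]

# Row labels: the recoded ranges of years of school completed.
schooling = ["0-6", "7-12", "13-14", "15+"]

def dummy_table(row_labels, col_labels, width=15):
    """Return the lines of an empty crosstab: labels only, blank cells."""
    header = " " * 8 + "".join(f"{c:>{width}}" for c in col_labels)
    body = [
        f"{r:<8}" + "".join(f"{'____':>{width}}" for _ in col_labels)
        for r in row_labels
    ]
    return [header] + body

table = dummy_table(schooling, happiness)
print("\n".join(table))
print(f"{len(schooling) * len(happiness)} cells to fill in")
```

With four rows there are only a dozen cells to fill in. Leave years of school unrecoded and every distinct number of years gets its own row.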

Let's play with that a little:

2. Go to SPSS, 1991 GSS, and run a crosstab for General Happiness and Highest Number of Years of School Completed by Spouse. Be sure to read the whole variable label or you might get the wrong one: there's father, mother, and spouse.
3. Link to Analyze on the horizontal menu, then link to Crosstabs. Notice that as you move each variable into the dialog box area, you are choosing rows and columns. Choose first one, then the other, and notice the difference.
4. Then discard your output; don't save it; we don't need it. And if you don't discard it, they'll complain about the number of pages, remember?
5. Now let's recode the highest number of years of school completed by spouse:

1. From the data editor of SPSS, link to TRANSFORM in the horizontal menu, and then link to RECODE. Link to Recode to different variable. Dialog box will come up with name of variable you have chosen to recode. On right side of that type in a variable name like levedsp for level of education of spouse. Then type out the full name of the variable in the dialog box beneath that.
2. Highlight Highest year of education spouse, then move that variable to the dialog box with the arrow to the right of the variable list.
3. Skip if... that's more complicated. We're doing a simple recode.
4. Link to OLD AND NEW VALUES, and up pops a new dialog box.
5. Left side is old values, right side is new values.
6. I'm going to use the values we made up last Thursday for our dummy table.
7. We used 0-6 for our first category: that's not a value, that's a range, from 0 through 6. Click the radio button next to Range on the left side and some new choices light up. Enter 0 in the first box. Notice that through is now actively black. Enter 6 in the second box.
8. Now just move your mouse over to the new side on the right, and put 1 in the value box. That means we want to change 0 to 6 to a new category we'll call 1, like first level, for example. Notice that then more of the dialog box on the right side becomes actively black. Under OLD -> NEW, ADD is now actively black. Click ADD. And our first recode appears in the dialog box. Notice you can't write these values in. You have to move them through the appropriate channels by clicking and linking.
9. Next, we'll recode 7-12. This, too, is a range, so we go back to the left-hand side and enter 7 in the first dialog box and 12 in the dialog box to the right of through. And we'll call this category 2, for second level.
10. In the value box on the right side enter 2. When ADD becomes actively black, click it. And there's our second recode. Aren't you clever?
11. Now we'll recode 13-14 as 3 for third level. Go back to the range on the left side and put 13 in the first box and 14 in the box to the right of through.
12. Then enter the new value 3 in the value box on the right side. And when ADD lights up as actively black, click ADD, and there's our third recode.
13. Now 15+ is a little different. Look at the left side of our dialog box. Although they are not lit up as actively black right now, you can see, grayed out, that the last Range choice has through highest after it. Click the radio button of that RANGE and through highest will become actively black. Enter 15 in its box, then enter 4 in the value box on the right side, for fourth level, and click ADD.
14. Then go back and click on the value radio button on the left side. Enter 97 for NAP, then enter 7 in the right side value box. Then ADD.
15. Repeat that for 98 -> 8 and 99 -> 9, the other missing values. Now you've added your missing values.
16. Then click on Continue and on OK when you get back to the dialog box that tells you it's changed the variable.
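
For anyone who wants to see the logic of that recode written out, here is a minimal Python sketch of the same old-to-new mapping. It is not SPSS syntax. Using 4 for the 15+ level is an assumption that follows the 1, 2, 3 pattern, and the function name `recode_spouse_education` and the sample values in `raw` are hypothetical.

```python
def recode_spouse_education(years):
    """Recode years of school completed into the levels used in the steps above."""
    # Missing-value codes are checked first so that "15 through highest"
    # cannot swallow 97, 98, and 99 in this sketch.
    missing = {97: 7, 98: 8, 99: 9}
    if years in missing:
        return missing[years]
    if 0 <= years <= 6:
        return 1  # first level
    if 7 <= years <= 12:
        return 2  # second level
    if 13 <= years <= 14:
        return 3  # third level
    if years >= 15:
        return 4  # fourth level (assumed code for 15+)
    raise ValueError(f"unexpected value: {years}")

# Hypothetical raw values, invented for illustration (not taken from the GSS file).
raw = [0, 6, 7, 12, 13, 14, 15, 20, 97, 98, 99]
levedsp = [recode_spouse_education(y) for y in raw]
print(levedsp)
```

Checking the missing codes before the open-ended range is a design choice in this sketch. Whether SPSS needs the rules entered in a particular order is worth checking in its documentation before you trust the counts in the top category.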

Now go play. Print out a crosstabs with General Happiness and Level of Educ of Spouse.

Texts:

• Timothy J. Lawson, Everyday Statistical Reasoning. Wadsworth and Thomson Learning. 2002. ISBN: 0-534-59094-2 (pbk). This one's much less expensive, but addresses more reasonably the part of statistics you really need: understanding.