## SPSS Exercises

Site last updated on April 27, 1998.
This is a Jeanne Site
Webmaster: Jeanne Curran

### Exercise 1 - Beginning SPSS

1. Opening the SPSS Program

1. Access SPSS by clicking on the icon.
2. File should open at Newdata.
In any case, click on File in the menu across the top.
Click on Open, then New.
3. Insert student disk.
Click on File, Open, GSS.SAV
Data from the GSS.SAV file should appear on screen.
You have now accessed the data in the GSS.SAV file that we will use in this first exercise.

2. Play with the screen, as indicated in your SPSS text.
Use the scroll bars to move around the page.
Identify the variable names at the top of the page.
Refer to page 19 in your SPSS text for variable explanations.

3. Value "5" is coded into the box for Respondent #5 under "thnkself."
Discover on the screen what the "5" means.

1. Click on Utilities in the main menu at top of page.
2. Click on Variables....
3. Click on "thnkself" from the index list of variables in the left frame.

4. E-mail Jeanne by clicking on her name.

For the subject of your e-mail, type: "Interpretation of the '5'"
For the message, type in:

"The 5 coded as the value of "thnkself" for Respondent #5 means that that respondent believes that thinking for oneself is the ______(fill in) important value in the list of choices to prepare a child for life."

5. Cite the page in the SPSS text from which the answer paragraph was composed. In other words, where in your text would you go to find the information you would need to write that interpretation all by yourself?

E-mail Jeanne at: jcurran@csudh.edu

### SPSS Exercise 2 - Frequencies

Turn to p. 35 in the SPSS text. Load the GSS.SAV file from your student disk.

Select:

• Statistics
• Summarize
• Frequencies

Answer the following questions based on the results on this screen, which is shown in your text on p. 37.

1. What is the sample size of the GSS data base we are using?

1. 1367
2. 1372
3. 1377
4. Cannot be determined from this screen.
5. This screen does not represent the data base we are using.

2. What does "valid percent" mean?

1. The percent is an accurate number.
2. The results of a percentage reported in a table like this can be considered valid.
3. The percent uses the sample size of 1367 for its calculations because that is the number of people who actually answered the question.
4. Everyone included in the sample was counted in this percentage.
5. None of the above.

3. What does "cum" mean?

1. cumulative
2. cumbersome
3. cumescent

4. Why would you ever want a cumulative percent?

1. Because it is important to calculate every possible statistic.
2. Because it permits you to say that 96.9% of the sample surveyed expressed some religious preference.
3. Because it permits you to say that 89.0% of the sample surveyed expressed some religious preference.
4. Because it represents the percentage of each category added up to that point.

5. Write 25 words or less explaining what the religious preference table tells us. (Page 37 of the SPSS text contains the table.)
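If you want to see where the columns of a Frequencies table come from, here is a minimal sketch in Python. It is not SPSS output, and the category names and counts are made up for illustration (they are not the GSS figures on p. 37). It assumes that "percent" uses everyone in the sample as the base, that "valid percent" uses only those who gave a codable answer, and that "cum percent" is a running total of the valid percents.

```python
# Hypothetical counts for illustration only; these are NOT the GSS figures.
counts = {"Category A": 50, "Category B": 30, "Category C": 15}
missing = 5  # respondents with no codable answer (hypothetical)


def frequency_table(counts, missing):
    """Return rows of (label, frequency, percent, valid percent, cum percent)."""
    valid_n = sum(counts.values())   # people who gave a codable answer
    total_n = valid_n + missing      # everyone in the sample
    rows = []
    cumulative = 0.0
    for label, freq in counts.items():
        percent = 100 * freq / total_n        # base: whole sample
        valid_percent = 100 * freq / valid_n  # base: valid answers only
        cumulative += valid_percent           # running total of valid percents
        rows.append((label, freq, percent, valid_percent, cumulative))
    return rows


table = frequency_table(counts, missing)
for label, freq, pct, valid_pct, cum_pct in table:
    print(f"{label:<12}{freq:>6}{pct:>9.1f}{valid_pct:>9.1f}{cum_pct:>9.1f}")
```

Notice that the last cumulative percent always comes out to 100, because the cumulative column is built only from the valid answers.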

E-mail Jeanne at: jcurran@csudh.edu

### SPSS Exercise 3 - Terms of Art in Statistics

This exercise relates to technical terms used by Babbie and Halley. The concepts are not difficult. Read the text. Use the Help Index. If you do not get these terms straight now, they will confuse you later and make the statistics look more difficult to you than it really is. This is an essential exercise. If you do not get 5/5, correct and resubmit.

1. A codebook is:

1. A source book for data, in which researchers can use other researchers' data.
2. A source book for researchers that tells what was measured, how it was measured, and which values were coded for the data analysis.
3. A book that provides the index for our SPSS text.
4. A book that contains all the codes for running SPSS.
5. A book for spy control.

2. Religious preference is:

1. A variable.
2. A coded variable.
3. A nominal variable.
4. An ordinal variable.
5. A table title.

3. Collapsing categories is a technical term for:

1. Throwing out information
2. Making categories more precise
3. Turning nominal data into ordinal data
4. Taking categories with few respondents out of the data
5. Mathematically transforming ordinal into nominal data

4. "Not applicable" or NAP in SPSS means:

1. The answer given doesn't apply
3. The respondent was not asked that question
4. The answer couldn't be coded
5. All of the above.

5. Why would there be subsets of questions instead of asking all respondents all the questions?

1. Because if the interview schedule gets too long people don't answer, don't pay as much attention, or get tired of the survey and quit answering.
2. Because you have to have a control group who didn't answer the question so you can compare their responses.
3. Because the interviewers can't keep their enthusiasm at a reasonable level if the questionnaire gets too long.
4. Because social scientists like to create subsets so that their data will be more impressive.
5. Because subsets permit individual analyses which makes the data set more useful on a national scale.

### SPSS Exercise 4 - Univariate Description

Now, there's a fancy title for you. Univariate means there is only one variable involved, so you can't describe its relationship to any other variable. It's all by itself: "uni." What you can do is describe its distribution in the sample.

For example, you could describe the variable "Religious Preference" by telling us how many of your respondents identify a religious preference, which preference is the "modal," or most common preference, and perhaps whether your sample is best described as having religious preferences that can be broken into more global categories, like Christian, Jewish, Muslim, etc. This is called univariate analysis. The questions in this exercise will reflect Chapter 6 of the Babbie and Halley SPSS book.

• What percentage of the GSS sample believes that a woman should have an abortion whenever she wants it? Where could you start to find the answer? Try page 19. What is the technical term for what you're using page 19 for? Help, I've forgotten.

• Question number 1 asks a frequency question: how many in the sample? Ask SPSS to run the frequency for the variable that will tell you how many people in the sample believe that a woman should have an abortion whenever she wants it. Help, I've forgotten how to run a frequency distribution. Run the frequency and answer the following questions:

1. What valid percentage of the GSS sample believe in abortion for any reason?

2. What does valid percentage mean in this case? Help, I don't remember what a valid percent is.

1. That all 1372 cases in the GSS survey were included.
2. That 500 people weren't asked this question.
3. That the percentage would be calculated using: 395 / 872 x 100, in which 395 is the number of people who answered the question on abortion any time with a "yes," 872 is the total number of people who were asked and who answered the question on abortion any time, and 100 is what you multiply by to turn the fraction into a percentage.
4. A valid percentage is: (no. of Rs who gave Answer A / no. who were asked and who gave an answer we could code) x 100
5. Both 3 and 4 above.

3. How many people in the survey of 1372 respondents were asked the question on their acceptance of abortion for any reason?

4. Why were 500 respondents not asked the question on whether they approved abortion for any reason?

1. To keep the questionnaire to a reasonable length.
2. Because they were men, and this survey concerned only women.
3. To get the survey done faster.
4. Because they answered an earlier question in which they said that they were conservative, and this was a liberal survey.
5. All of the above.

5. How many people in the GSS survey believe that the most important thing to teach children is to be obedient?

6. What percentage of people in the GSS survey believe that the least important thing to teach children is to be obedient?

7. What percentage of people in the GSS survey believe in the death penalty?

8. Do pp. 19-21 give you the complete codebook for the GSS survey? Help, I forgot what a codebook was.
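The arithmetic behind question 2 can be checked by hand or with a few lines of Python. This is only a sketch: it takes the figures quoted in question 2 (395 "yes" answers, 872 people who were asked and answered, 1372 cases in the file) at face value, and the function name `percentage` is my own, not anything from SPSS. Compare the results with what SPSS shows on your screen rather than trusting the sketch.

```python
# Figures quoted in question 2 of this exercise.
total_cases = 1372   # everyone in the GSS file
valid_answers = 872  # asked the question and gave a codable answer
said_yes = 395       # answered "yes"


def percentage(part, whole):
    """Turn the fraction part/whole into a percentage."""
    return 100 * part / whole


percent = percentage(said_yes, total_cases)          # base: all cases
valid_percent = percentage(said_yes, valid_answers)  # base: valid answers only

print(f"Percent of all cases: {percent:.1f}")
print(f"Valid percent:        {valid_percent:.1f}")
```

The only difference between the two numbers is the base you divide by, which is the whole point of the "valid" label.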

### SPSS Exercise 5 - Recoding

This exercise will appear shortly in more detailed form. But for those of you who are able to follow Babbie and Halley without the lab class, here are the questions, minus the helps I will put in later. Jeanne

1. Look at Variable whypoor on page 20 of the abbreviated codebook in Babbie and Halley. In 25 words or less, why would it make sense to recode this variable? (Think of theories that emphasize a social structural approach, and theories that emphasize an individualist approach.)

2. What recoding would you suggest? Do it.

3. What percentage of respondents in the GSS survey believe that nothing society does can prevent there being poor people, simply because some people have characteristics or habits that keep them poor?

4. What percentage of respondents in the GSS survey believe that poor schools cause poverty? (Nota bene: Do you want to use a recoded value for this question?)

### Added February 16, 1998. SPSS Exercise 6 - More Recoding

1. Recode the EDUC variable to a new variable EDUCAT with the following categories, listed on p. 53 of the Babbie and Halley text:

• Less than high school graduate
• Some college

Help, I'm Lost.

2. What percentage of the GSS sample are college graduates?

Help, I forgot how to run a frequency distribution.

3. What percentage of the GSS sample did not finish high school?
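If the idea of recoding in question 1 is still fuzzy, here is a rough sketch of what a recode does, written in Python rather than SPSS. It assumes EDUC holds years of schooling. The cut points (12 and 15) and the two labels marked "assumed" are my guesses for illustration only; use the categories actually listed on p. 53 when you do the exercise.

```python
def recode_educ(years):
    """Collapse years of schooling (EDUC) into a category label (EDUCAT).

    The cut points and the labels marked 'assumed' are illustrative;
    check p. 53 of the text for the actual categories.
    """
    if years is None:
        return None                      # keep missing values missing
    if years < 12:
        return "Less than high school graduate"
    if years == 12:
        return "High school graduate"    # assumed label
    if years <= 15:
        return "Some college"
    return "College graduate or more"    # assumed label


educ = [8, 12, 14, 16, None, 11, 20]     # hypothetical respondents
educat = [recode_educ(y) for y in educ]
print(educat)
```

The original EDUC values are left alone and the categories go into a new list, which mirrors recoding into a new variable instead of overwriting the old one.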

### Added March 18, 1998. SPSS Exercise 7 - Bivariate Analysis

It's time for you to try some analyses with more than one variable. Instead of asking just how many of the people in the GSS survey believed that a woman should be able to have an abortion for any reason (abany), let's ask whether people who considered themselves Republicans (partyid) believed that a woman should be able to have an abortion for any reason. In this question we are looking for evidence that one variable affects the other: that your party identification may affect how you feel about the availability of abortion.

Access the data from your book disk. HELP, I FORGOT HOW TO ACCESS THE DISK.
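To picture what a two-variable table does before you run one in SPSS, here is a small Python sketch. The responses are invented and partyid is boiled down to two categories, so none of this is GSS data. It assumes you want the percentage of each abany answer within each partyid group, which is the comparison described above.

```python
from collections import Counter

# Hypothetical (partyid, abany) pairs; NOT real GSS responses.
responses = [
    ("Democrat", "Yes"), ("Democrat", "Yes"), ("Democrat", "No"),
    ("Republican", "Yes"), ("Republican", "No"), ("Republican", "No"),
    ("Republican", "No"), ("Democrat", "Yes"),
]


def crosstab_percent(pairs):
    """Percent of each answer within each category of the first variable."""
    cell_counts = Counter(pairs)                          # count per (group, answer)
    group_totals = Counter(group for group, _ in pairs)   # count per group
    return {
        (group, answer): 100 * n / group_totals[group]
        for (group, answer), n in cell_counts.items()
    }


table = crosstab_percent(responses)
for (group, answer), pct in sorted(table.items()):
    print(f"{group:<12}{answer:<5}{pct:6.1f}%")
```

Percentaging within each party group is what lets you compare the groups even when they are different sizes.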

### Added April 22, 1998. SPSS Exercise 8 - Bivariate Analysis

Let's go back to p. 57 of Babbie and Halley for a continuation of our discussion on bivariate analysis. Answer the following questions in 25 words or less, and sometimes one word will do.

Please remember that when you give me hard copy of this or e-mail in answers, I will expect that from then on I can ask you these questions and you will be able to give me the answers, in lab, in class, in my office, over the phone, wherever works for us. But I want to know that you really know this stuff. Jeanne

1. How was the variable PARTYID measured?

2. Who measured it?

3. Run a graph of PARTYID. Don't e-mail the graph. Just e-mail me that you've done it. Then share your copy with me in class, in the lab, in my office, wherever. No, I don't have one of those TV phones. Sorry.

4. What would you do to get a simpler graph for presentation to your local city council?

5. How would you decide whether to use the simpler graph or a more complex one? (Theory for this answer comes from one-sided versus two-sided arguments. If you've forgotten, ask me, until I get it up on the site.)

### Added April 22, 1998. SPSS Exercise 9 - Interpreting SPSS Data

Recall that there are readings in the back of Babbie and Halley's text so that you have an opportunity to see how the statistics we look at occur in the articles you are likely to read in sociological journals. Time to take a quick look at those readings.

1. What does the article mean by saying that there is a need to "control" a variable?

2. Is controlling one way to begin to establish "causality", and, if so, how? Answer in 25 words or less and focus on cutting down what can vary. (Answer in lecture until I can get something up on the site.)

3. The authors identify questions about why some characteristics should matter in religiosity. How did they derive these questions?

4. What are the independent variables involved in determining religiosity, as the authors perceive them?

5. What is the dependent variable? And are there multiple dependent variables?

### Added April 27, 1998. SPSS Exercise 10 - Measures of Association

Measures of Association is Chapter 14 of the Babbie and Halley text. It is well written and carefully explained. You are expected to walk through the explanations of the simple arithmetic behind these measures so that you will understand their meaning. Do this even if you are math "phobic." I will walk through this in class lecture, explaining the ideas as we go. If you miss the lecture and have problems with the math, try to set up a time to see me. We'll set some times in class. This stuff is really easy and it is essential to your being able to read social science articles intelligently. I'll keep it simple. Your half of that promise is to be sure you learn the simple stuff so well you'll know it in any class from now on. Jeanne. The chapter starts on p. 130 in the text.

1. How do Babbie and Halley describe the difference between the work we have done with frequencies and crosstabs and these measures?

Weren't the frequencies and crosstabs called summaries? What's the difference with these summaries? What is summarized?

2. PRE means "proportionate reduction of error." What error are we talking about? And how is it reduced?

3. How are PRE measures related to an "educated guess?"

4. Babbie and Halley show on p. 132 that Lambda for religion as a predictor of whether respondent will approve abortion for any reason is 0.10263. What does that mean in plain English? (25 words or less!)

5. When would you use Gamma instead of Lambda?
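For those who like to see the arithmetic run, the PRE logic behind questions 2 through 4 can be sketched in Python. This follows the usual textbook definition of Lambda as I understand it: count the errors you make guessing the most common answer for everyone, count the errors you make guessing the most common answer within each category of the predictor, and see what share of the first set of errors went away. The counts are hypothetical (they are not the table on p. 132), and the function name `lambda_pre` is mine.

```python
def lambda_pre(table):
    """Lambda for predicting the dependent variable from the independent one.

    `table` maps each category of the independent variable to a dict of
    counts for each category of the dependent variable.
    """
    # Errors when guessing the overall most common answer for everyone.
    totals = {}
    for counts in table.values():
        for outcome, n in counts.items():
            totals[outcome] = totals.get(outcome, 0) + n
    n_cases = sum(totals.values())
    errors_without = n_cases - max(totals.values())

    # Errors when guessing the most common answer within each category.
    errors_with = sum(sum(c.values()) - max(c.values()) for c in table.values())

    # Share of the original errors removed by knowing the category.
    # (Undefined if errors_without is 0, i.e. everyone gave the same answer.)
    return (errors_without - errors_with) / errors_without


# Hypothetical counts, not the table on p. 132.
example = {
    "Group A": {"Yes": 60, "No": 40},
    "Group B": {"Yes": 30, "No": 70},
}
print(round(lambda_pre(example), 3))
```

In this made-up table, guessing "No" for all 200 people gives 90 errors, while guessing by group gives 70, so knowing the group removes 20 of the 90 errors.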