A Jeanne Site
California State University, Dominguez Hills
University of Wisconsin, Parkside
Latest update: September 4, 1999
Curran or
Takata.
In the following questions and lecture material we are going to give you a whirlwind tour through basic statistics. This does not qualify as a course in statistics, but it will give you literacy. Just as a whirlwind tour of the Web doesn't qualify you as a guru, this won't qualify you to hire out as a stat guru. But you can wander through the Web now, and you will find that wandering through statistical results is not much harder, if you'll just keep your common sense about you.
We followed the lead of Dowdall, Logio, Babbie and Halley in the concepts we covered. The DLBH page numbers at the end of the questions refer to pages in Dowdall, Logio, Babbie and Halley. The BH numbers refer to an earlier edition of Babbie and Halley's Adventures in Social Research.
Please use common sense in transferring what you learn here to more extensive courses in statistics. They will go into far greater detail. We are trying for literacy. Given a solid base of literacy you will find that such courses make more sense to you, but you will have to allow for the greater complexity. What is an adequate answer for us, like a hypothesis being a bet on the outcome of the data, may not quite cut it in a formal theory class. Once again, common sense will do.
Explain how gender is both a concept and a variable. (DLBH, p. 15; BH, p. 7)
Free comment: Of course, the study has probably found a spurious relationship because if the women are old enough to take a machine shop course they have already been socialized not to take up such activities. In that case the correlation with gender would be spurious.
End of editing - August 29, 1999.
(1) Because computers empower us to analyze data quickly and efficiently.
(2) Data come from the measurement of variables we define.
(3) And our approach to society and its situational context shapes the perspective from which we define variables.
(4) An apologist may never consider the variables that might tell us what we could learn structurally from churches that might improve the workplace (using Babbie and Halley's example).
(5) A critical theorist will at least search for such variables.
(6) To read and understand sociology you must be able to discern how different these approaches are and how differently they see the world.
(7) Then when you run SPSS data you will be able to think deeply about what it means.
6. What do the terms "critical" and "apologetic" mean as regards the validity of theory?
(1) In critiquing the validity of the theory, we mean to determine the extent to which the theory accurately predicts and explains.
(2) Critical theorists argue that the theory's accuracy applies only to some, usually privileged groups, and does not adequately take into account the social context which differs from that of privilege.
(3) Apologetic theorists defend the theory's framework as essentially neutral and unbiased, describing the world that actually exists, not the idealized, romanticized world critical theorists would like to posit.
(4) Critical is usually associated with the "left," apologetic with the "right."
I ALWAYS LOOK UP THE WORD "HYPOTHESIS"
7. What is an hypothesis? (p. 8)
(1) A bet you make on the outcome.
(2) When you measure two variables to see how they are related.
(3) Example: Hypothesis 1: That women will attend church more frequently than men.
(4) Theoretical explanation for prediction in Hypothesis 1: Women are more frequently deprived of achievement opportunities in the workplace because of gender discrimination (a theoretical, conceptual use of the term gender). The church offers an alternative locale for achievement opportunities.
(5) Our theoretical explanation gives us another hypothesis to test: Hypothesis 2: Women are more deprived than men of workplace success.
(6) When you collect data (or when you do a secondary analysis of someone else's data) you can discover whether or not you were right in your bet.
(7) In the first section of this text you learned to use the World Wide Web. Go to the World Wide Web site to find the U.S. Government's Glass Ceiling Commission report on whether Hypothesis 2 has been supported by the data collected and reported in Good for Business: Making Full Use of the Nation's Human Capital. The World Wide Web site address at Cornell University is: http://www.ilr.cornell.edu
PROBABILISTIC VERSUS UNIVERSALISTIC THEORIES
8. What do we mean when we say that data are probabilistic?
(1) That we are basing our conclusions on statistics that were collected from a sample of people, and that there will be sampling error in generalizing those data to the entire population represented by the sample.
(2) In the social sciences, we are also assuming that what people do or say in our study is what they would do or say under non-study situations. That assumption is not always met. That is one reason that people often deride studies done on "college sophomores."
(3) Also, people, unlike gravity, do not always behave in the same way. This means that most of the time we can predict that a student who wants an A will be willing to work harder to get that A than a student who doesn't much care about his/her grade. But if something else more interesting or more crucial turns up, the student who wants the A may not work harder. Our prediction turns out to be wrong in that case. Most social science predictions are like this: probabilistic. Most of the time if Person P tells Person O that she doesn't like him anymore, Person O will respond by not liking Person P anymore. Unless, Person O is in love with Person P, ... and so on.
(4) Hard science data are universalistic. Drop an apple and it will fall with constant acceleration g, covering a distance of 1/2 gt^2 in time t, reliably, every time, nothing probabilistic about it, no "the apple will probably fall with acceleration ..." By comparison soft science, social science, looks "mushy." It isn't. It just has to deal with more complex variables (fuzzy) in which the behaviors being measured are controlled by the actors to a much greater extent than the apple can control its fall. People can invent wings that alter their rate of fall. Apples can't.
(5) Many of the statistics we regularly report tell us what most people would probably do in a given situation, not what an individual would do in the precise situation.
(6) Behavior is situated and complex, sometimes meaningful, sometimes random, sometimes born of pure feeling.
(7) Not all people respond to the same situatedness in the same way; people do not always respond in the same way to the same situation. (This is one plausible explanation for why serial monogamy might work; change the people in the situation and marriage might have a chance.) Social science is probabilistic. If we say "I do," we might, if we're lucky and we work hard at it. Then again, maybe we might not, but if we try again with another, we might, if we're lucky and we work hard at it.
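Sampling error is easier to believe once you have watched it happen. The sketch below is not from the text: it assumes an imaginary population in which exactly 60% of people would answer "yes" to some question, then draws several random samples of 100 and reports the "yes" share in each. The function name, the 60% figure, and the sample size are all made up for illustration.

```python
import random


def sample_proportions(true_share, sample_size, n_samples, seed=0):
    """Draw repeated random samples and return the share of 'yes' answers in each.

    true_share is the (assumed) share of 'yes' answers in the whole population.
    """
    rng = random.Random(seed)  # fixed seed so the run can be repeated
    shares = []
    for _ in range(n_samples):
        yes = sum(1 for _ in range(sample_size) if rng.random() < true_share)
        shares.append(yes / sample_size)
    return shares


# Five samples of 100 people each from a population that is 60% 'yes'.
shares = sample_proportions(true_share=0.60, sample_size=100, n_samples=5)
for share in shares:
    print(f"sample estimate: {share:.0%}")
```

The estimates should scatter around 60% without all landing on it. That scatter is the sampling error, and it is one reason we say "probably."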
DEDUCTIVE AND INDUCTIVE REASONING
9. What is deductive reasoning? P. 9
(1) Reasoning from the whole to the parts.
(2) From theory to variables to measurement to data to testing hypotheses.
10. What is inductive reasoning? P. 9
(1) Reasoning from the parts to the whole.
(2) Grounded theory.
(3) Moving from data collection to breaking categories and, consequently, variables out of the data, and then constructing the theory.
VALIDITY AND RELIABILITY IN MEASUREMENT
11. What is validity? P. 11
(1) A measure of the verifiable, "objective" truth of a concept or measure, if you believe that measures can be validly objective. (The authors of this text do not.)
(2) Or the extent to which we can establish that the measure used provides some cognitive or affective indication of the variable we want to examine. We could agree that male/female is an adequate categorical measure of which set of experiences some of us have had in this world. Or we could agree that male/female is inadequate to describe some of our experiences with sexuality and sexual experience.
(3) Key question: Does the indicator really measure what it says it measures?
(4) Key example: IQ. Does a paper and pencil test measure IQ? What is intelligence? What is smart? What is wise? Etc.
(5) Other examples that provide measurement and validity problems: race, sexuality, liberalism, conservatism, prejudice, refusal to listen, beauty, truth, etc.
(6) Are we measuring the variable, or someone's perception of the variable? Example: When we record the number of women who describe men as refusing to listen (Hite's study) are we describing men's refusal to listen or women's perception of men's refusal to listen, and why does that matter? It matters if we are trying to resolve the problem. If men believe they are listening and women believe they are not, we must look for underlying structural variables (structural because the phenomenon seems too widespread to depend on individual behavior) that might explain the differences in perception and help us to resolve the resultant problems.
12. What is reliability?
(1) The extent to which you can count on the measure or indicator of a variable to elicit the same value or response each time you use it. For example, male/female may be a reliable measure of gender in that a person who responds "male" on one occasion can be expected to respond "male" on the next occasion. If in fact that rough measure is adequate as a variable, the measure is reliable.
(2) If, on the other hand, you want to measure the extent to which one identifies with either the male or the female role, and all the difficulties of validity that phrase entails, male/female may not be a valid categorization and, depending on how the measure is presented to the subjects, may not be reliable. If, for example, you ask people to mark the category in which they might best "fit," some males might mark "female" some of the time, depending on the circumstances and complexity of the setting and the presentation of the experiment. In this situation, the measure is far less reliable than when it represents straightforward gender identification.
(3) Example: "Are you prejudiced?" might elicit any number of answers, depending on the context and the topic. Not a reliable measure by itself.
(4) Example: "What percentage of time do you spend on homework each week?" might elicit different responses depending on whether it is asked around mid-term or finals week, during Spring break, or when you are sick, or when you have a heavy load at work. Not a reliable measure of time spent on homework without other data to supplement it. Besides, what do we mean by homework? Is study homework? Does thinking deeply about the topic qualify, maybe while doing something unrelated to the course, like doing the wash? Does it have to be written to qualify?
(5) Notice that the complexity of measurement is largely amenable to common sense. You don't have to be a statistical expert to ask what the researcher means by homework, or by intelligence, or to ask about the situational pattern in which he/she collected the data. Common sense will do.
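One rough, common-sense way to look at reliability is to ask the same people the same question twice and count how often they give the same answer. The sketch below assumes that simple "percent agreement" approach; it is only one of several ways reliability gets checked, and the five respondents and their answers are invented.

```python
def percent_agreement(first, second):
    """Share of respondents who gave the same answer on both occasions."""
    if len(first) != len(second) or not first:
        raise ValueError("need two equally long, non-empty answer lists")
    same = sum(1 for a, b in zip(first, second) if a == b)
    return same / len(first)


# Invented answers from five respondents, each asked on two occasions.
gender_t1 = ["male", "female", "female", "male", "female"]
gender_t2 = ["male", "female", "female", "male", "female"]
hours_t1 = ["0-5", "6-10", "6-10", "11+", "0-5"]   # asked in a quiet week
hours_t2 = ["6-10", "11+", "6-10", "11+", "11+"]   # asked during finals week

gender_agreement = percent_agreement(gender_t1, gender_t2)
homework_agreement = percent_agreement(hours_t1, hours_t2)
print(f"gender item:   {gender_agreement:.0%} agreement")
print(f"homework item: {homework_agreement:.0%} agreement")
```

In this made-up case the gender item holds steady and the homework item does not, which is the pattern described in examples (1) and (4) above.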
TRIANGULATION
13. Are multiple indicators the same as triangulation?
(1) Yes. They are different perspectives for measuring the variable you want to study.
(2) In trying to measure extent of involvement in a course through homework, you might ask:
a. Do you get to do the same amount of homework most of the time?
b. Are there lots of disruptions, like:
family problems
illness
work problems
other
c. Does the time you spend on homework vary over courses?
d. Would you like less disruption in the time you have for homework?
e. Does it matter to you whether your homework gets done, especially if no special credit is given for it?
f. Do you have anyone you can call on for help with homework if you need it?
And so on... Depending on how important this issue is, you can continue to expand the ways you measure the subject's perspective on homework. The more extensively you measure the situatedness of homework for the student, the more reliable your generalizations will be.
(3) Triangulation would also dictate that you try to develop other measurement techniques besides questioning. You might determine the number of college courses at the subject's school that require homework. You might ask for their definition of "homework." You might look at GPA and make an assumption (unreliable though it may be) that students with higher GPAs must do more homework. And so on...
NOMINAL DATA AND CROSS TABS
14. What are nominal data? P. 14
(1) Data that place people or things in named categories.
(2) Example: You might categorize people according to their planet of origin.
(3) Nominal data are generally used to describe a population by the percentage representing each category. These are often presented as frequency tables, often as bar graphs, which show, for example, that 23% of the sample was from Mars, 10% from Venus, 31% from Earth, 20% from Jupiter, and 9% from Uranus, with 7% giving no specific planetary origin.
The first thing you'd want to do with that kind of data is arrange it in order of frequency:
31% Earth
23% Mars
20% Jupiter
10% Venus
9% Uranus
7% Missing Data
(4) Then you might want to see how the planet of origin correlates with liberal political tendencies (liberal/not liberal), with comfort in breathing Earth's atmosphere (comfortable breathing, not comfortable breathing), appreciation of Earth's government (likes/dislikes) etc. We might discover with such cross tabulation that 82% of Martians are liberal, that 50% of Venusians are uncomfortable breathing Earth's air, and that no one likes Earth's government.
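SPSS builds these tables for you, but it may help to see how little is going on underneath. The sketch below assumes a sample of 100 respondents whose planet totals match the percentages above; the liberal/not liberal split within each planet is invented and does not try to reproduce the 82% figure. The function names are ours, not SPSS's.

```python
from collections import Counter

# Invented sample of 100 respondents, stored as counts per (planet, politics) cell.
# Planet totals follow the percentages in the text; the splits are made up.
cell_counts = {
    ("Earth", "liberal"): 15, ("Earth", "not liberal"): 16,
    ("Mars", "liberal"): 19, ("Mars", "not liberal"): 4,
    ("Jupiter", "liberal"): 8, ("Jupiter", "not liberal"): 12,
    ("Venus", "liberal"): 5, ("Venus", "not liberal"): 5,
    ("Uranus", "liberal"): 3, ("Uranus", "not liberal"): 6,
    ("Missing Data", "liberal"): 3, ("Missing Data", "not liberal"): 4,
}
# Expand the counts into one (planet, politics) pair per respondent.
respondents = [pair for pair, n in cell_counts.items() for _ in range(n)]


def frequency_table(values):
    """Return (category, count, percent) rows, most frequent category first."""
    total = len(values)
    return [(cat, n, 100 * n / total) for cat, n in Counter(values).most_common()]


def crosstab_row_percents(pairs):
    """Return {row: {column: percent of that row}} for (row, column) pairs."""
    row_totals = Counter(row for row, _ in pairs)
    table = {}
    for (row, col), n in Counter(pairs).items():
        table.setdefault(row, {})[col] = 100 * n / row_totals[row]
    return table


freq = frequency_table([planet for planet, _ in respondents])
for planet, n, pct in freq:
    print(f"{pct:3.0f}%  {planet}")

xtab = crosstab_row_percents(respondents)
for planet, n, _ in freq:
    print(f"{planet}: {xtab[planet]['liberal']:.0f}% liberal (of {n})")
```

The frequency table describes one variable at a time. The cross tab asks how one variable varies with another, here by taking percentages within each planet of origin.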
CAUSATION
(5) When you look at the data like this, you are studying the effects of one variable on another. We call such analysis in SPSS "cross tabs" for cross tabulation. You will notice that the results are much more interesting when we look at the variation of one variable with another. But there is one great danger. We have a tendency to make unwarranted assumptions about which one causes the other. Causation is complex. The fact that one is from Mars might cause one to be liberal if Martian society inculcates liberal values, or being liberal might cause one to move to Mars, or that might be a spurious relationship, with some other variable explaining the relationship. For example, maybe 88% of the Martian sample is female, and females are liberal.
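One common-sense check for a spurious relationship is to look at the same cross tab separately for each value of the suspected third variable. The sketch below assumes an invented sample of 100 Martians (88% female, as in the example) and 100 Earthlings (40% female), with counts chosen so that gender, not planet, carries the difference. None of these numbers come from real data.

```python
# Invented counts per (planet, gender, politics) cell.
counts = {
    ("Mars", "female", "liberal"): 66, ("Mars", "female", "not liberal"): 22,
    ("Mars", "male", "liberal"): 3, ("Mars", "male", "not liberal"): 9,
    ("Earth", "female", "liberal"): 30, ("Earth", "female", "not liberal"): 10,
    ("Earth", "male", "liberal"): 15, ("Earth", "male", "not liberal"): 45,
}


def percent_liberal(counts, planet=None, gender=None):
    """Percent liberal among respondents matching the given planet and/or gender."""
    liberal = total = 0
    for (p, g, politics), n in counts.items():
        if planet is not None and p != planet:
            continue
        if gender is not None and g != gender:
            continue
        total += n
        if politics == "liberal":
            liberal += n
    return 100 * liberal / total


# Planet alone: Martians look more liberal than Earthlings.
for planet in ("Mars", "Earth"):
    print(f"{planet}: {percent_liberal(counts, planet=planet):.0f}% liberal")

# Same comparison within each gender: the planet difference disappears.
for gender in ("female", "male"):
    for planet in ("Mars", "Earth"):
        pct = percent_liberal(counts, planet=planet, gender=gender)
        print(f"{gender}, {planet}: {pct:.0f}% liberal")
```

In these made-up counts, one plausible reading is that the Martian sample looks liberal mostly because it is mostly female. Real data are rarely this tidy, which is another reason to hedge your wording.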
THE RIGHT WAY TO SAY IT
(6) The appropriate words to use when describing variable relationships are shown above: "50% of Venusians are uncomfortable ..." not "Earth's air causes discomfort for 50% of Venusians." Watch such wording. Be alert to the fact that causation is complex.
(7) Appropriate wording also dictates that you use care in stating any conclusions. You are always safer if you say, "One possible conclusion is ..." "One plausible explanation is ..." than if you say "Earth's air causes Venusians difficulty in breathing." If you had said one possible explanation, you would have had a face saving niche into which to withdraw if you were shown to be wrong. Hedge your bets. Time enough to be certain when you are cloaked in advanced degrees and have an ivory tower wrapped around you.
(8) And while we're discussing the right way to say it, let's talk about the right way to put a graphics table in your term papers. The table, by virtue of its titles, should be self explanatory, so that anyone looking at the table or graph could understand from just the table or graph that 82% of Venusians had difficulty breathing Earth's air. Good reporting would include the actual number of respondents that made up the 82% in parentheses. That's important because if there were only 5 Venusians in your sample we might have a little less faith in your data than if there were 573 Venusians in your sample.
RULE: MAKE EACH TABLE OR GRAPH SELF EXPLANATORY -- NO NEED TO READ TEXT TO UNDERSTAND TABLE OR GRAPH
(9) But some people are not visual. Tables and graphs are not their cup of tea. They skip them. Therefore, your text should explain clearly the results presented in every table or graph, so that it is possible to read the text without reference to the tables and graphs, and still process all the information. Some appropriate words for this might be: Table 10 shows the proportion of Venusians who have difficulty breathing Earth's air. 82% (820) of Venusians in the sample reported difficulty, while 12% (120) reported no difficulty in breathing Earth's air. 6% (60) of Venusians did not respond to this question. One plausible explanation for the large number of Venusians reporting difficulty is that so few Venusians on Earth have had lung transplants. Many of the 12% reporting no difficulty may have had lung transplants, which were very popular at the end of the 20th Century. Unfortunately, since artificial breathing tanks have been available for so long on Earth, this matter was not addressed in the questionnaire. Further study should investigate this possible explanation.
RULE: MAKE THE TEXT SELF EXPLANATORY -- NO NEED TO READ THE TABLES OR GRAPHS TO UNDERSTAND THE DATA
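Reporting the percentage together with the actual number of respondents is mostly bookkeeping. As a minimal sketch, using the Venusian counts from the sample wording above (the function name is ours):

```python
def percent_with_n(counts):
    """Return {answer: 'PCT% (N)'} strings, with percentages of the whole sample."""
    total = sum(counts.values())
    return {answer: f"{round(100 * n / total)}% ({n})" for answer, n in counts.items()}


# Counts taken from the sample wording for Table 10.
venusians = {"difficulty": 820, "no difficulty": 120, "no response": 60}
cells = percent_with_n(venusians)

print(f"{cells['difficulty']} of Venusians in the sample reported difficulty,")
print(f"while {cells['no difficulty']} reported no difficulty.")
print(f"{cells['no response']} of Venusians did not respond to this question.")
```

Keeping the count next to the percentage lets the reader judge for themselves how much faith to put in it: 82% of 1,000 Venusians is one thing, 82% of a handful is another.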
There is a great deal more to running the SPSS program, but it might be a good idea for you to develop a basic understanding of statistical measures. Statistics will help you understand how to interpret the results generated by SPSS. For example, a nonparametric statistic such as lambda will tell you the proportion by which you could reduce your errors in predicting a person's political party affiliation if you knew the person's religion. When you get your printed results from the SPSS program and see that lambda is, say, .42, that means that you would make 42% fewer errors in predicting party affiliation if you based that prediction on the person's religious affiliation. You will find examples of this calculation and an appropriate interpretation on pages 132-3 in Babbie and Halley's book.
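For the curious, lambda can be worked out by hand from a cross tab. The sketch below assumes the usual Goodman and Kruskal definition: count the errors you would make guessing everyone's party as the most common party, count the errors you would make guessing the most common party within each religion, and take the proportional reduction. The religions, parties, and counts are invented, chosen only so the answer comes out to .42. This is not Babbie and Halley's worked example, and SPSS output includes other versions of lambda as well.

```python
def goodman_kruskal_lambda(table):
    """Lambda for predicting the column variable from the row variable.

    table maps each row category to a dict of {column category: count}.
    """
    column_totals = {}
    for row in table.values():
        for col, n in row.items():
            column_totals[col] = column_totals.get(col, 0) + n
    total = sum(column_totals.values())

    # Errors when always guessing the overall most common column category.
    errors_without = total - max(column_totals.values())
    # Errors when guessing the most common column category within each row.
    errors_with = total - sum(max(row.values()) for row in table.values())

    if errors_without == 0:
        raise ValueError("lambda is undefined when everyone is in one category")
    return (errors_without - errors_with) / errors_without


# Invented cross tab: religion (rows) by party affiliation (columns).
party_by_religion = {
    "Religion A": {"Party X": 36, "Party Y": 15},
    "Religion B": {"Party X": 14, "Party Y": 35},
}

lam = goodman_kruskal_lambda(party_by_religion)
print(f"lambda = {lam:.2f}")
```

Here guessing without religion gives 50 errors out of 100, and guessing with religion gives 29, so the errors drop by 21 out of 50. Note that lambda depends on which variable you are predicting from which; swapping the roles can give a different value.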
In this course, which focuses primarily on literacy in such analyses by computer, we ask that you recognize that there is specific wording that explains accurately what the statistical results mean. Be careful not to stray far from that wording until you are very certain of the statistical meaning and assumptions. Babbie and Halley's book is a good guide. Use it in conjunction with your statistics course, and check your wording of statistical results with your teacher until you have a chance to develop confidence in your own statistical work.
For this class, and for literacy purposes, we will complete a hands-on workshop in which you will produce a frequency table and a cross tab. You will be expected to write a brief report of those results in which you use appropriate wording for the tables produced by SPSS. This should help you gain familiarity both with how tables and graphs are presented and with how to report tabular and graph results in the body of a report. Hopefully, it will also alert you to the niceties of language of which you will need to be aware.
For the hands-on demonstration we will use Judy Emerson's handout for our lab. For those of you who choose to go more deeply into SPSS we will use Babbie and Halley's book.