Part I – Multiple Choice

1. Suppose that the scatterplot of (log x, log y) shows a strong positive correlation. Which of the following must be true?

I. The variables x and y also have a correlation close to 1.

II. A scatterplot of (x, y) shows a strong nonlinear pattern.

III. The residual plot of the variables x and y shows a random pattern.

(a) I only

(b) II only

(c) III only

(d) I and II

(e) I, II, and III

2. What is the purpose of residual plots?

(a) To determine causation.

(b) To assess the type of relationship that exists between x and y.

(c) To check the appropriateness and fit of the regression equation for the data.

(d) To measure the variability in the residuals.

(e) To provide predictions for the response variable.

3. Fourth grade children were asked what emotion they associated with the color red. The responses for emotion and gender of the children are summarized in the following two-way table.

          Anger   Pain   Happiness   Love
Male        35     27       12        38
Female      27     17       19        39

What proportion of the males associate the color red with love?

(a) 0.5234

(b) 0.3598

(c) 0.3393

(d) 0.1822

(e) 0.1775
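The proportion asked for in question 3 is a conditional one: it looks only at the male row and divides the "Love" count by the male row total. A minimal Python sketch of that arithmetic, using the counts from the table above (the dictionary layout and the function name are illustrative choices, not part of the original quiz):

```python
# Counts from the two-way table (emotion associated with red, by gender).
table = {
    "Male":   {"Anger": 35, "Pain": 27, "Happiness": 12, "Love": 38},
    "Female": {"Anger": 27, "Pain": 17, "Happiness": 19, "Love": 39},
}

def conditional_proportion(counts, row, column):
    """Proportion of the given row that falls in the given column."""
    row_total = sum(counts[row].values())
    return counts[row][column] / row_total

male_love = conditional_proportion(table, "Male", "Love")
print(round(male_love, 4))  # 38 / 112
```

Dividing by the grand total of all children instead of the male row total would answer a different question (the joint proportion of children who are male and chose love).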

4. A strong negative linear relationship between Average State SAT scores and Percentage of students taking the SAT in each of those states reflects which underlying relationship?

(a) causation

(b) correlation

(c) common response

(d) extrapolation

(e) confounding

5. Two variables are confounded when:

(a) The effect of one variable on the response variable is dependent upon the effect of the other variable.

(b) The effect of one variable on the response variable cannot be separated from the other variable.

(c) The effect of one variable on the response variable changes the impact of the other variable on the response variable.

(d) Both variables are classified as lurking or extraneous variables.

(e) They interact in their effects on the response variable.

6. Which of the following are true statements?

I. High correlation does not necessarily imply causation.

II. A lurking variable is a name given to variables that cannot be identified or explained.

III. Successful prediction requires a cause and effect relationship.

(a) I only

(b) II only

(c) III only

(d) I and III only

(e) I and II only

7. If the model for the relationship between the score on an AP Statistics test (y) and the number of hours spent preparing for the test (x) was, determine the residual if a student studied 9 hours and earned an 85.

(a) 6.53

(b) 3.14

(c) 15.23

(d) 0

(e) –4.86
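A residual is the observed response minus the response predicted by the regression equation. The fitted equation does not appear in this copy of question 7, so the intercept and slope in the sketch below are hypothetical placeholders chosen only to illustrate the calculation; they are not the quiz's model and do not reproduce its answer.

```python
def residual(observed_y, x, intercept, slope):
    """Residual = observed y minus the predicted y-hat from a linear model."""
    predicted_y = intercept + slope * x
    return observed_y - predicted_y

# Hypothetical coefficients (NOT from the quiz, whose equation is missing).
example = residual(observed_y=85, x=9, intercept=50.0, slope=4.0)
print(example)  # 85 - (50 + 4*9) = -1.0
```

A positive residual means the student scored above the model's prediction; a negative one means the student scored below it.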

8. Which of the statements is true?

I. Two variables are confounded if their effects on a response variable cannot be distinguished from each other.

II. A lurking variable has an effect on the relationship among variables in the study but is not included among the variables studied.

III. Observational studies of the effect of one variable on another variable can fail if the explanatory variable is confounded with a lurking variable.

(a) I only

(b) II only

(c) III only

(d) I and II only

(e) I, II, and III

9. A study was conducted to determine the effectiveness of varying amounts of vitamin C in reducing the number of common colds. A survey of 450 people provided the following information:

                    Daily amount of Vitamin C taken
                    None    500 mg    1000 mg
No colds             57       26        17
At least one cold   223       84        43

What conclusion can be made?

(a) The data proves that vitamin C reduces the number of common colds.

(b) The data proves that vitamin C has no effect on the number of common colds.

(c) There appears to be a strong association between consumption of vitamin C and the occurrence of common colds.

(d) There appears to be little association between consumption of vitamin C and the occurrence of common colds.

(e) Since common colds are caused by viruses, there is no reason to conclude that vitamin C could have any effect.
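One way to judge the strength of an association in a two-way table like the one in question 9 is to compare the conditional percentages across the columns. The sketch below computes, for each daily amount of vitamin C, the percent of people who reported no colds. The counts come from the table; the variable and function names are illustrative.

```python
# Counts from the survey table: keys are daily vitamin C amounts.
no_colds = {"None": 57, "500 mg": 26, "1000 mg": 17}
at_least_one = {"None": 223, "500 mg": 84, "1000 mg": 43}

def percent_no_colds(dose):
    """Percent of people at a given dose who reported no colds."""
    total = no_colds[dose] + at_least_one[dose]
    return 100 * no_colds[dose] / total

for dose in no_colds:
    print(dose, round(percent_no_colds(dose), 1))
```

How far apart these percentages are is what the answer choices about "strong" or "little" association are asking the reader to weigh. Because the data come from a survey rather than an experiment, no choice claiming proof of cause and effect can be supported.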

10. For the past two hundred years, population per square mile in a northwest suburb can be modeled using the exponential equation . The scatter plot of the data is shown below.

Which of the following statements is true?

(a) If an attempt is made at fitting a straight line to the original data, the corresponding residual plot would be approximately linear.

(b) If an attempt is made at fitting a straight line to the original data, the corresponding residual plot would be scattered and show no pattern.

(c) If an attempt is made at fitting a straight line to the original data, the corresponding residual plot would be a straight line.

(d) Plotting the logarithm of population per square mile against year should be approximately linear.

(e) Plotting the logarithm of population per square mile against the logarithm of year should be approximately linear.
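The idea behind question 10 is that taking the logarithm of an exponential response turns it into a linear function of x, since log(a·b^x) = log a + x·log b. The sketch below checks this numerically. The quiz's actual equation is not shown in this copy, so the values of a and b here are hypothetical, as are the sample years.

```python
import math

# Hypothetical exponential model y = a * b**x (the quiz's equation is not shown).
a, b = 2.0, 1.05

years = [0, 50, 100, 150, 200]
population = [a * b**x for x in years]
log_population = [math.log10(y) for y in population]

# For equally spaced years, the differences in log(y) are constant,
# which is what a straight line on the (year, log y) plot means.
steps = [log_population[i + 1] - log_population[i] for i in range(len(years) - 1)]
print([round(s, 6) for s in steps])
```

The differences in the untransformed population values grow with each step, which is why a straight line fitted to the original data leaves a curved pattern in the residuals.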

Answers: 1-B, 2-C, 3-C, 4-B, 5-B, 6-A, 7-B, 8-E, 9-C, 10-D

Part II – Free Response

The table below describes the data comparing the relationship between age groups and localities of residence.

                          Localities of Residence
Age Groups     Urban    Suburban    Rural    Totals
Under 25        110       150         65
25-50           240       220         75
Over 50          53       112         58
Totals

a.) Compute the marginal frequencies. Place the answers in the table.

b.) Compute the marginal percentages. Place answers below or on the side of the table.

c.) What percent of the urban dwellers are over 50?

d.) What percent of the over-50 residents live in rural areas?

e.) Compute the conditional percentage for the Urban Dwellers:

f.) Based on your analyses, do you believe that these data indicate there is a relationship between locality of residence and the ages of the residents? Explain your answer.
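The calculations in parts a.) through e.) can be checked with a short script. This is a minimal sketch using the counts from the table; the data layout and helper names are illustrative choices, and percentages are rounded to one decimal place as an assumption about the desired precision.

```python
# Joint counts from the table: rows are age groups, columns are localities.
localities = ["Urban", "Suburban", "Rural"]
counts = {
    "Under 25": [110, 150, 65],
    "25-50":    [240, 220, 75],
    "Over 50":  [53, 112, 58],
}

# Marginal frequencies: row totals, column totals, and the grand total.
row_totals = {age: sum(row) for age, row in counts.items()}
col_totals = {
    loc: sum(row[i] for row in counts.values())
    for i, loc in enumerate(localities)
}
grand_total = sum(row_totals.values())

def percent(part, whole):
    """Percentage of `whole` represented by `part`, to one decimal place."""
    return round(100 * part / whole, 1)

# Conditional distribution of age group among urban dwellers (part e).
urban_by_age = {
    age: percent(row[0], col_totals["Urban"]) for age, row in counts.items()
}
print(row_totals, col_totals, grand_total, urban_by_age)
```

For part f.), repeating the conditional distribution for the Suburban and Rural columns and comparing the three gives a basis for the explanation: similar distributions suggest little relationship, while clearly different ones suggest that age and locality are associated.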