Date | May 2021 | Marks available | 3 | Reference code | 21M.3.AHL.TZ2.1 |
Level | Additional Higher Level | Paper | Paper 3 | Time zone | Time zone 2 |
Command term | State | Question number | 1 | Adapted from | N/A |
Question
Juliet is a sociologist who wants to investigate whether income affects happiness amongst doctors. This question asks you to review Juliet’s methods and conclusions.
Juliet obtained a list of email addresses of doctors who work in her city. She contacted them and asked them to fill in an anonymous questionnaire. Participants were asked to state their annual income and to respond to a set of questions. The responses were used to determine a happiness score out of . Of the doctors on the list, replied.
Juliet’s results are summarized in the following table.
For the remaining ten responses in the table, Juliet calculates the mean happiness score to be .
Juliet decides to carry out a hypothesis test on the correlation coefficient to investigate whether increased annual income is associated with greater happiness.
Juliet wants to create a model to predict how changing annual income might affect happiness scores. To do this, she assumes that annual income in dollars, , is the independent variable and the happiness score, , is the dependent variable.
She first considers a linear model of the form
.
Juliet then considers a quadratic model of the form
.
After presenting the results of her investigation, a colleague questions whether Juliet’s sample is representative of all doctors in the city.
A report states that the mean annual income of doctors in the city is . Juliet decides to carry out a test to determine whether her sample could realistically be taken from a population with a mean of .
Describe one way in which Juliet could improve the reliability of her investigation.
Describe one criticism that can be made about the validity of Juliet’s investigation.
Juliet classifies response as an outlier and removes it from the data. Suggest one possible justification for her decision to remove it.
Calculate the mean annual income for these remaining responses.
Determine the value of , Pearson’s product-moment correlation coefficient, for these remaining responses.
State why the hypothesis test should be one-tailed.
State the null and alternative hypotheses for this test.
The critical value for this test, at the significance level, is . Juliet assumes that the population is bivariate normal.
Determine whether there is significant evidence of a positive correlation between annual income and happiness. Justify your answer.
Use Juliet’s data to find the value of and of .
Interpret, referring to income and happiness, what the value of represents.
Find the value of , of and of .
Find the coefficient of determination for each of the two models she considers.
Hence compare the two models.
Juliet decides to use the coefficient of determination to choose between these two models.
Comment on the validity of her decision.
State the name of the test which Juliet should use.
State the null and alternative hypotheses for this test.
Perform the test, using a significance level, and state your conclusion in context.
Markscheme
Any one from: R1
increase sample size / increase response rate / repeat process
check whether sample is representative
test-retest participants or do a parallel test
use a stratified sample
use a random sample
Note: Do not condone:
Ask different types of doctor
Ask for proof of income
Ask for proof of being a doctor
Remove anonymity
Remove response .
[1 mark]
Any one from: R1
non-random sampling means a subset of population might be responding
self-reported happiness is not the same as happiness
happiness is not a constant / cannot be quantified / is difficult to measure
income might include external sources
Juliet is only sampling doctors in her city
correlation does not imply causation
sample might be biased
Note: Do not condone the following common but vague responses unless they make a clear link to validity:
Sample size is too small
Result is not generalizable
There may be other variables Juliet is ignoring
Sample might not be representative
[1 mark]
because the income is very different / implausible / clearly contrived R1
Note: Answers must explicitly reference "income" to get credit.
[1 mark]
(M1)A1
[2 marks]
A2
[2 marks]
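The mean and Pearson's product-moment correlation coefficient would normally be read off a calculator, but the calculation can be sketched by hand as below. The income and happiness values are hypothetical stand-ins, not Juliet's data, and the function names are illustrative only.

```python
from math import sqrt


def mean(values):
    """Arithmetic mean of a non-empty sequence of numbers."""
    return sum(values) / len(values)


def pearson_r(xs, ys):
    """Pearson's product-moment correlation coefficient for paired data."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# HYPOTHETICAL data for illustration only (not the table from the question).
incomes = [60000, 75000, 90000, 110000, 130000]
happiness = [52, 58, 61, 70, 68]

print("mean income:", mean(incomes))
print("r:", round(pearson_r(incomes, happiness), 3))
```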
EITHER
only looking for change in one direction R1
OR
only looking for greater happiness with greater income R1
OR
only looking for evidence of positive correlation R1
[1 mark]
A1A1
Note: Award A1 for seen (do not accept ), A1 for both correct hypotheses, using their or . Accept an equivalent statement in words, however reference to “correlation for the population” or “association for the population” must be explicit for the first A1 to be awarded.
Watch out for a null hypothesis in words similar to “Annual income is not associated with greater happiness”. This is effectively saying and should not be condoned.
[2 marks]
METHOD 1 – using critical value of
R1
(therefore significant evidence of) a positive correlation A1
Note: Do not award R0A1.
METHOD 2 – using -value
A1
Note: Follow through from their -value from part (c)(ii).
(therefore significant evidence of) a positive correlation A1
Note: Do not award A0A1.
[2 marks]
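The critical-value comparison in Method 1 can be sketched as follows. Under the null hypothesis of zero population correlation and bivariate normality, t = r√(n − 2)/√(1 − r²) follows a t-distribution with n − 2 degrees of freedom, so a critical value for r can be recovered from a critical t-value. The sketch assumes a one-tailed 5% level purely for illustration, uses n = 10 (the ten remaining responses), and takes 1.860 from standard t-tables for 8 degrees of freedom. The sample r passed in at the end is hypothetical.

```python
from math import sqrt


def critical_r_from_t(t_crit, n):
    """Critical value of r implied by a critical t-value with n - 2 df.

    Obtained by solving t = r * sqrt(n - 2) / sqrt(1 - r**2) for r.
    """
    return t_crit / sqrt(t_crit ** 2 + (n - 2))


def significant_positive_correlation(r, r_crit):
    """One-tailed decision rule: reject H0 only if r exceeds the critical value."""
    return r > r_crit


# ASSUMPTION: one-tailed 5% level, 8 degrees of freedom (standard table value).
T_CRIT = 1.860
r_crit = critical_r_from_t(T_CRIT, n=10)
print("critical r:", round(r_crit, 3))

# HYPOTHETICAL sample correlation, for illustration only.
print(significant_positive_correlation(0.70, r_crit))
```

Because the test is one-tailed, a strongly negative r would not lead to rejection of the null hypothesis under this rule.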
A1
[1 mark]
EITHER
the amount the happiness score increases for every increase in (annual) income A1
OR
rate of change of happiness with respect to (annual) income A1
Note: Accept equivalent responses e.g. an increase of in happiness for every increase in salary.
[1 mark]
,
,
A1
[1 mark]
for quadratic model: A1
for linear model: A1
Note: Follow through from their value from part (c)(ii).
[2 marks]
EITHER
quadratic model is a better fit to the data / more accurate A1
OR
quadratic model explains a higher proportion of the variance A1
[1 mark]
EITHER
not valid, not a useful measure to compare models with different numbers of parameters A1
OR
not valid, a quadratic model will always fit at least as well as a linear model, because the linear model is a special case of the quadratic A1
Note: Accept any other sensible critique of the validity of the method. Do not accept any answers which focus on the conclusion rather than the method of model selection.
[1 mark]
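The point about comparing models with different numbers of parameters can be checked numerically. The sketch below fits a least-squares line and a least-squares quadratic to hypothetical data (not Juliet's) and computes the coefficient of determination for each. Income is expressed in units of $100 000 only to keep the arithmetic well conditioned, and the helper names are illustrative.

```python
def solve(a, b):
    """Solve the linear system a.x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def polyfit(xs, ys, degree):
    """Least-squares coefficients [c0, c1, ...] of c0 + c1*x + c2*x**2 + ..."""
    k = degree + 1
    normal = [[sum(x ** (i + j) for x in xs) for j in range(k)] for i in range(k)]
    rhs = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(k)]
    return solve(normal, rhs)


def r_squared(xs, ys, coeffs):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    my = sum(ys) / len(ys)
    preds = [sum(c * x ** i for i, c in enumerate(coeffs)) for x in xs]
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# HYPOTHETICAL data: income in units of $100 000, happiness score.
income = [0.60, 0.75, 0.90, 1.10, 1.30, 1.50]
happiness = [52, 58, 61, 70, 68, 66]

r2_linear = r_squared(income, happiness, polyfit(income, happiness, 1))
r2_quadratic = r_squared(income, happiness, polyfit(income, happiness, 2))
print("linear R^2:", round(r2_linear, 3))
print("quadratic R^2:", round(r2_quadratic, 3))
```

Whatever data are supplied, the quadratic value cannot be smaller than the linear one (up to rounding), since setting the squared term's coefficient to zero reproduces the line. This is why a raw comparison of the two values says little about which model is more appropriate.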
(single sample) -test A1
[1 mark]
EITHER
A1
OR
(sample is drawn from a population where) the population mean is
the population mean is not A1
Note: Do not allow FT from an incorrect test in part (f)(i) other than a -test.
[1 mark]
A1
Note: For a -test follow through from part (f)(i), either (from biased estimate of variance) or (from unbiased estimate of variance).
R1
EITHER
no (significant) evidence that mean differs from A1
OR
the sample could plausibly have been drawn from the quoted population A1
Note: Allow R1FTA1FT from an incorrect -value, but the final A1 must still be in the context of the original research question.
[3 marks]
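A single-sample t-test of this kind can be sketched as below. The ten incomes and the hypothesised population mean are hypothetical, not the values from the question, and a 5% two-tailed significance level is assumed purely for illustration. The critical value 2.262 is the standard table value for 9 degrees of freedom. A calculator would normally report a p-value instead of using a table.

```python
from math import sqrt


def one_sample_t(sample, mu0):
    """t-statistic for H0: population mean = mu0, using the unbiased variance estimate."""
    n = len(sample)
    if n < 2:
        raise ValueError("need at least 2 observations")
    m = sum(sample) / n
    s2 = sum((x - m) ** 2 for x in sample) / (n - 1)  # unbiased (n - 1) estimate
    return (m - mu0) / sqrt(s2 / n)


# HYPOTHETICAL sample of ten annual incomes and hypothesised mean (illustration only).
incomes = [62000, 71000, 68000, 90000, 85000, 77000, 66000, 95000, 80000, 73000]
mu0 = 80000

# ASSUMPTION: 5% two-tailed level, 9 degrees of freedom (standard table value).
T_CRIT = 2.262

t_stat = one_sample_t(incomes, mu0)
reject_h0 = abs(t_stat) > T_CRIT
print("t:", round(t_stat, 3))
print("reject H0:", reject_h0)
```

The test is two-tailed because the question is whether the sample mean differs from the reported population mean in either direction.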