Date: May 2016 | Marks available: 2 | Reference code: 16M.3sp.hl.TZ0.2
Level: HL only | Paper: Paper 3 Statistics and probability | Time zone: TZ0
Command term: State | Question number: 2 | Adapted from: N/A

Question

The random variables \(X\), \(Y\) follow a bivariate normal distribution with product moment correlation coefficient \(\rho \).

A random sample of 10 observations on \(X\), \(Y\) was obtained and the value of \(r\), the sample product moment correlation coefficient, was calculated to be 0.486.

State suitable hypotheses to investigate whether or not \(X\), \(Y\) are independent.

[2]
a.

(i)     Determine the \(p\)-value.

(ii)     State your conclusion at the 5% significance level.

[7]
b.

Explain why the equation of the regression line of \(y\) on \(x\) should not be used to predict the value of \(y\) corresponding to \(x = {x_0}\), where \({x_0}\) lies within the range of values of \(x\) in the sample.

[1]
c.

Markscheme

\({H_0}:\rho = 0\); \({H_1}:\rho \ne 0\)    A1A1

[2 marks]

a.

(i)     \(t = 0.486 \times \sqrt {\frac{{10 - 2}}{{1 - {{0.486}^2}}}} \)     (M1)

\( = 1.5728 \ldots \)    (A1)

degrees of freedom \( = 8\)     (A1)

\({\text{P}}(T > 1.5728 \ldots )\)    (M1)

\( = 0.0772\)    (A1)

\(p\)-value \( = 0.154\)    A1

Note:     Do not follow through for the final A1 if their \({H_1}\) is one-sided.

(ii)     accept \({H_0}\) or equivalent statement involving \({H_0}\) or \({H_1}\) (at the 5% significance level)     R1

Note:     Follow through the candidate’s \(p\)-value.

[7 marks]

b.
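
The arithmetic in part (b) can be checked numerically. The following is a minimal sketch, not part of the markscheme: it uses only the Python standard library, and the helper names `t_pdf` and `upper_tail` are hypothetical, introduced here for illustration. The tail probability is approximated by Simpson's rule on the Student's \(t\) density, which is an assumed numerical method rather than the calculator routine a candidate would use.

```python
import math

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)

def upper_tail(t, df, steps=2000):
    """Approximate P(T > t) for t >= 0.

    Integrates the density over [0, t] with Simpson's rule (steps must be
    even) and uses symmetry: P(T > t) = 0.5 - P(0 < T < t).
    """
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

r, n = 0.486, 10           # sample correlation and sample size from the question
df = n - 2                 # degrees of freedom for the test of rho = 0
t_stat = r * math.sqrt(df / (1 - r ** 2))
one_tail = upper_tail(t_stat, df)
p_value = 2 * one_tail     # two-tailed, because H1 is rho != 0

print(round(t_stat, 4), round(one_tail, 4), round(p_value, 3))
```

Under these assumptions the printed values should agree with the markscheme figures to the accuracy quoted there, and the two-tailed \(p\)-value exceeds 0.05, which is consistent with the conclusion in (ii).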

EITHER

because the above analysis suggests that \(X\), \(Y\) are independent     R1

OR

the value of \(r\) suggests that \(X\) and \(Y\) are weakly correlated     R1

[1 mark]

c.

Examiners report

Part (a) was well answered, with only a few candidates using inappropriate symbols, for example \(r\) or \(\mu \). Very few candidates failed to realise that the wording of the question indicated that a two-tailed test was required.

a.

The test in (b) was generally well carried out and the \(p\)-value found correctly. The most common errors were using incorrect degrees of freedom and evaluating a one-tailed \(p\)-value instead of a two-tailed \(p\)-value.

b.

In (c), many realised that the earlier work meant that the regression line should not be used because the variables had been found to be independent. Incorrect reasons were not uncommon, however, for example that the regression line of \(x\) on \(y\) should be used instead, or that there were insufficient data.

c.

Syllabus sections

Topic 5 - Core: Statistics and probability