Date | May Specimen paper | Marks available | 6 | Reference code | SPM.3.AHL.TZ0.1 |
Level | Additional Higher Level | Paper | Paper 3 | Time zone | Time zone 0 |
Command term | Determine | Question number | 1 | Adapted from | N/A |
Question
Two IB schools, A and B, follow the IB Diploma Programme but have different teaching methods. A research group tested whether the different teaching methods lead to a similar final result.
For the test, a group of eight students were randomly selected from each school. Both samples were given a standardized test at the start of the course and a prediction for total IB points was made based on that test; this was then compared to their points total at the end of the course.
Previous results indicate that both the predictions from the standardized tests and the final IB points can be modelled by a normal distribution.
It can be assumed that:
- the standardized test is a valid method for predicting the final IB points
- variations from the prediction can be explained through the circumstances of the student or school.
The data for school A is shown in the following table.
For each student, the change from the predicted points to the final points was calculated.
The data for school B is shown in the following table.
School A also gives each student a score for effort in each subject. This effort score is based on a scale of 1 to 5 where 5 is regarded as outstanding effort.
It is claimed that the effort put in by a student is an important factor in improving upon their predicted IB points.
A mathematics teacher in school A claims that the comparison between the two schools is not valid because the sample for school B contained mainly girls and that for school A, mainly boys. She believes that girls are likely to show a greater improvement from their predicted points to their final points.
She collects more data from other schools, asking them to class their results into four categories as shown in the following table.
Identify a test that might have been used to verify the null hypothesis that the predictions from the standardized test can be modelled by a normal distribution.
State why comparing only the final IB points of the students from the two schools would not be a valid test for the effectiveness of the two different teaching methods.
Find the mean change.
Find the standard deviation of the changes.
Use a paired t-test to determine whether there is significant evidence that the students in school A have improved their IB points since the start of the course.
Use an appropriate test to determine whether there is evidence, at the 5 % significance level, that the students in school B have improved more than those in school A.
State why it was important to test that both sets of points were normally distributed.
Perform a test on the data from school A to show it is reasonable to assume a linear relationship between effort scores and improvements in IB points. You may assume effort scores follow a normal distribution.
Hence, find the expected improvement between predicted and final points for an increase of one unit in effort grades, giving your answer to one decimal place.
Use an appropriate test to determine whether showing an improvement is independent of gender.
If you were to repeat the test performed in part (e) intending to compare the quality of the teaching between the two schools, suggest two ways in which you might choose your sample to improve the validity of the test.
Markscheme
χ² (goodness of fit) A1
[1 mark]
EITHER
because aim is to measure improvement
OR
because the students may be of different ability in the two schools R1
[1 mark]
0.1875 (accept 0.188, 0.19) A1
[1 mark]
2.46 (M1)A1
Note: Award (M1)A0 for 2.63.
[2 marks]
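The two standard deviations in the note above differ only in the divisor used (n or n − 1). The sketch below shows the relationship on a hypothetical set of eight changes; these numbers are an assumption for illustration, not the school A data, whose table is not reproduced here.

```python
import math
import statistics

# HYPOTHETICAL changes (final minus predicted) for eight students.
# These are not the school A values.
changes = [2.0, -1.5, 0.5, 3.0, -2.0, 1.0, -0.5, 1.5]

n = len(changes)
mean_change = statistics.fmean(changes)
sd_population = statistics.pstdev(changes)  # divides by n
sd_sample = statistics.stdev(changes)       # divides by n - 1

# The two values are linked by sd_sample = sd_population * sqrt(n / (n - 1)).
ratio = sd_sample / sd_population

# Applying the same factor to the markscheme value 2.46 with n = 8.
converted = 2.46 * math.sqrt(8 / 7)

print(mean_change, round(sd_population, 3), round(sd_sample, 3))
print(round(ratio, 4), round(converted, 2))
```

Scaling 2.46 by √(8/7) gives approximately 2.63, which suggests that the accepted answer uses the divide-by-n form and that the note refers to the divide-by-(n − 1) form.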
H₀: there has been no improvement
H₁: there has been an improvement A1
attempt at a one-tailed paired t-test (M1)
p-value = 0.423 A1
there is no significant evidence that the students have improved R1
Note: If the hypotheses are not stated award a maximum of A0M1A1R0.
[4 marks]
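As a rough check, the p-value above can be approximated from the summary statistics alone. The sketch below assumes that 2.46 is the divide-by-n standard deviation (an inference, not something the markscheme states) and uses a hand-written Simpson's-rule integration of the t density in place of a calculator's built-in test; the function names are illustrative.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coeff = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """Approximate P(T > t) by Simpson's rule on [0, |t|] (steps must be even)."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 - area if t >= 0 else 0.5 + area

n = 8
mean_change = 0.1875                       # from the markscheme
sd_sample = 2.46 * math.sqrt(n / (n - 1))  # ASSUMES 2.46 is the divide-by-n value

# Paired t-test: a one-sample test on the differences against a mean of zero.
t_stat = mean_change / (sd_sample / math.sqrt(n))
p_value = t_upper_tail(t_stat, n - 1)      # one-tailed, H1: mean change > 0

print(round(t_stat, 3), round(p_value, 3))
```

Under these assumptions the test statistic is small (about 0.2 on 7 degrees of freedom), which is why the evidence for an improvement is so weak.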
H₀: there is no difference between the schools
H₁: school B did better than school A A1
one-tailed 2 sample t-test (M1)
p-value = 0.0984 A1
0.0984 > 0.05 (not significant at the 5 % level) so do not reject the null hypothesis R1A1
Note: The final A1 cannot be awarded following an incorrect reason. The final R1A1 can follow through from their incorrect p-value. Award a maximum of A1(M1)A0R1A1 for p-value = 0.0993.
[5 marks]
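A two-sample t-test can be run with or without pooling the variances, and the two options generally give slightly different p-values. The sketch below illustrates this on hypothetical data for both schools (neither list is the exam data). Whether the alternative p-value in the note arises from the unpooled option is an assumption, not something the source states.

```python
import math
import statistics

def t_upper_tail(t, df, steps=2000):
    """Approximate P(T > t) for Student's t using Simpson's rule on [0, |t|]."""
    coeff = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    pdf = lambda x: coeff * (1 + x * x / df) ** (-(df + 1) / 2)
    a = abs(t)
    h = a / steps
    total = pdf(0.0) + pdf(a)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area = total * h / 3
    return 0.5 - area if t >= 0 else 0.5 + area

def two_sample_t(a, b, pooled=True):
    """Return (t, df) for H1: mean of b > mean of a."""
    na, nb = len(a), len(b)
    va, vb = statistics.variance(a), statistics.variance(b)  # n - 1 divisor
    diff = statistics.fmean(b) - statistics.fmean(a)
    if pooled:
        df = na + nb - 2
        sp2 = ((na - 1) * va + (nb - 1) * vb) / df
        se = math.sqrt(sp2 * (1 / na + 1 / nb))
    else:
        # Welch's version with the Welch-Satterthwaite degrees of freedom.
        se = math.sqrt(va / na + vb / nb)
        df = (va / na + vb / nb) ** 2 / (
            (va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1)
        )
    return diff / se, df

# HYPOTHETICAL changes for each school, for illustration only.
changes_a = [2.0, -1.5, 0.5, 3.0, -2.0, 1.0, -0.5, 1.5]
changes_b = [3.0, 1.0, 2.5, 4.0, -1.0, 2.0, 0.5, 3.5]

t_pooled, df_pooled = two_sample_t(changes_a, changes_b, pooled=True)
t_welch, df_welch = two_sample_t(changes_a, changes_b, pooled=False)
p_pooled = t_upper_tail(t_pooled, df_pooled)
p_welch = t_upper_tail(t_welch, df_welch)

print(round(t_pooled, 3), df_pooled, round(p_pooled, 4))
print(round(t_welch, 3), round(df_welch, 2), round(p_welch, 4))
```

With equal sample sizes the two test statistics coincide and only the degrees of freedom differ, so the p-values differ only slightly.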
sample too small for the central limit theorem to apply (and t-tests assume normal distribution) R1
[1 mark]
H₀: ρ = 0
H₁: ρ > 0 A1
Note: Allow hypotheses to be expressed in words.
p-value = 0.00157 A1
(0.00157 < 0.01) there is significant evidence of a (linear) correlation between effort and improvement (so it is reasonable to assume a linear relationship) R1
[3 marks]
(gradient of line of regression =) 6.6 A1
[1 mark]
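The correlation test and the regression gradient both come from the same sums of squares. The sketch below works through them on hypothetical effort scores and changes (not the school A data, which is not reproduced here). It compares the test statistic with the tabulated one-tailed 1 % critical value for 6 degrees of freedom rather than computing a p-value.

```python
import math

# HYPOTHETICAL effort scores and changes in IB points for eight students.
effort = [3.0, 2.5, 4.0, 3.5, 2.0, 4.5, 3.0, 4.0]
change = [0.5, -1.5, 2.0, 1.0, -2.0, 3.0, -0.5, 1.5]

n = len(effort)
mean_x = sum(effort) / n
mean_y = sum(change) / n

sxx = sum((x - mean_x) ** 2 for x in effort)
syy = sum((y - mean_y) ** 2 for y in change)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(effort, change))

# Pearson's product-moment correlation coefficient.
r = sxy / math.sqrt(sxx * syy)

# Under H0: rho = 0, this statistic follows a t distribution with n - 2 df.
t_stat = r * math.sqrt((n - 2) / (1 - r * r))

# One-tailed 1 % critical value of t with 6 degrees of freedom (from tables).
T_CRIT = 3.143
significant = t_stat > T_CRIT

# Gradient of the regression line of change on effort: the expected change
# in improvement for a one-unit increase in effort score.
gradient = sxy / sxx

print(round(r, 3), round(t_stat, 2), significant, round(gradient, 1))
```

The gradient is read as points of improvement per unit of effort, which is the quantity the "Hence" part asks for.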
H₀: improvement and gender are independent
H₁: improvement and gender are not independent A1
choice of χ² test for independence (M1)
groups first two columns as expected values in first column less than 5 M1
new observed table
(A1)
p-value = 0.581 A1
no significant evidence that gender and improvement are dependent R1
[6 marks]
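The grouping step and the test itself can be sketched as follows. The observed counts are hypothetical (the source table is not reproduced here), and the column order is assumed. After merging, the table is 2 × 3, giving 2 degrees of freedom, for which the upper-tail probability of χ² has the closed form exp(−χ²/2).

```python
import math

def expected_counts(table):
    """Expected frequencies under independence: row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    return [[rt * ct / total for ct in col_totals] for rt in row_totals]

def merge_first_two_columns(table):
    """Combine the first two categories into one."""
    return [[row[0] + row[1]] + row[2:] for row in table]

# HYPOTHETICAL observed counts: rows are boys and girls,
# columns are four improvement categories.
observed = [
    [3, 12, 20, 15],
    [4, 10, 25, 11],
]

# Group the first two columns if any expected value in the first is below 5.
if any(row[0] < 5 for row in expected_counts(observed)):
    observed = merge_first_two_columns(observed)

expected = expected_counts(observed)
chi2 = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(observed) - 1) * (len(observed[0]) - 1)

# Closed form valid ONLY for df = 2: P(X > x) = exp(-x / 2).
p_value = math.exp(-chi2 / 2) if df == 2 else None

print(observed, round(chi2, 3), df, round(p_value, 3))
```

A p-value well above the significance level, as in the markscheme, means the null hypothesis of independence is not rejected.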
For example:
larger samples / include data from whole school
take equal numbers of boys and girls in each sample
have a similar range of abilities in each sample
(if possible) have similar ranges of effort R1R1
Note: Award R1 for each reasonable suggestion to improve the validity of the test.
[2 marks]