Date | November 2021 | Marks available | 6 | Reference code | 21N.1.SL.TZ0.3 |
Level | SL | Paper | 1 | Time zone | no time zone |
Command term | Explain | Question number | 3 | Adapted from | N/A |
Question
Human genome research
MediResearch, a US-based DNA testing company, stores human genome information in a relational database. An individual’s genome data represents private information about their past, their present and, potentially, their future.
The senior managers at MediResearch are considering using data analytics but are concerned this may compromise the anonymity of the individuals who have provided their DNA.
Identify two features of a relational database.
Identify two reasons why a relational database, rather than a flat-file database, is used to store the data for MediResearch.
Identify two features of data analytics.
MediResearch is looking to expand access to the genome data it holds by sharing it with other companies.
Explain three strategies that MediResearch could use to ensure the security of the genome data.
The chief executive officer of MediResearch is considering using cloud-based storage to store the genome data.
Discuss whether MediResearch should move to cloud-based storage.
Markscheme
Answers may include:
- Multiple tables
- Joins between tables
Award [1] for identifying each feature of a relational database up to [2].
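To make the two features above concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The table names, column names and rows are hypothetical and chosen only for illustration; they are not taken from MediResearch or from the markscheme.

```python
import sqlite3

# Hypothetical schema for illustration only; not MediResearch's real design.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE individual (
        individual_id INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE genome_sample (
        sample_id INTEGER PRIMARY KEY,
        individual_id INTEGER NOT NULL REFERENCES individual(individual_id),
        marker TEXT NOT NULL
    );
    INSERT INTO individual VALUES (1, 'Person A'), (2, 'Person B');
    INSERT INTO genome_sample VALUES
        (10, 1, 'marker-x'), (11, 1, 'marker-y'), (12, 2, 'marker-x');
""")

# Feature 1: the data lives in multiple tables.
# Feature 2: a join links the tables through the shared individual_id key.
rows = conn.execute("""
    SELECT i.name, g.marker
    FROM individual AS i
    JOIN genome_sample AS g ON g.individual_id = i.individual_id
    ORDER BY g.sample_id
""").fetchall()

for name, marker in rows:
    print(name, marker)
```

Each person's name is held once in `individual`, and the join rebuilds the combined view only when it is needed.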
Answers may include:
- Removes update anomalies / updates are easier, as data is only stored once.
- Removes the risk of data redundancy.
- Reduces storage requirements.
Award [1] for identifying each reason why a relational database is used to store the data for MediResearch up to [2].
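The update-anomaly point can be illustrated with a short sketch in plain Python. The records, field names and email addresses below are invented for the example; the flat file is modelled as a list of rows and the relational layout as two linked collections.

```python
# Hypothetical records for illustration only.
# Flat file: the contact email is repeated on every row for the same person.
flat_file = [
    {"sample_id": 10, "name": "Person A", "email": "a@old.example"},
    {"sample_id": 11, "name": "Person A", "email": "a@old.example"},
]

# Relational layout: the email is stored once and samples refer to it by key.
individuals = {1: {"name": "Person A", "email": "a@old.example"}}
samples = [
    {"sample_id": 10, "individual_id": 1},
    {"sample_id": 11, "individual_id": 1},
]

# Updating only one flat-file row leaves two conflicting emails (an update anomaly).
flat_file[0]["email"] = "a@new.example"
flat_emails = {row["email"] for row in flat_file}

# Updating the single relational record changes what every sample resolves to.
individuals[1]["email"] = "a@new.example"
relational_emails = {individuals[s["individual_id"]]["email"] for s in samples}

print(len(flat_emails), len(relational_emails))
```

After the update the flat file holds two different emails for the same person, while the relational layout holds exactly one.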
Answers may include:
- The practice of interrogating (large) databases / data sets.
- Uncovers patterns in the data that would not normally be apparent.
Award [1] for each feature of data analytics identified up to [2].
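As a rough sketch of what "uncovering patterns" can mean in practice, the example below interrogates a tiny, made-up data set and compares how often a reported condition appears alongside each genetic marker. The marker and condition labels are hypothetical, and real analytics would run over far larger data sets with proper statistical checks.

```python
from collections import Counter

# Hypothetical, made-up records: (genetic marker, reported condition).
records = [
    ("marker-x", "condition-1"),
    ("marker-x", "condition-1"),
    ("marker-x", "none"),
    ("marker-y", "none"),
    ("marker-y", "none"),
    ("marker-y", "condition-1"),
]

# Count how many individuals carry each marker, and how many of those
# also report the condition.
totals = Counter(marker for marker, _ in records)
with_condition = Counter(m for m, c in records if c == "condition-1")

# Share of individuals with each marker who report the condition.
rates = {m: with_condition[m] / totals[m] for m in totals}

for marker, rate in sorted(rates.items()):
    print(f"{marker}: {rate:.2f}")
```

A difference in rates between the two markers is not obvious from the raw rows but becomes visible once the data is aggregated.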
Answers may include:
- Using additional authentication processes, such as a PIN or a text/SMS…
- which will mean a second device is required for authentication to take place.
- Setting different levels of access so only specified employees have access to the most sensitive data…
- meaning less likelihood of this sensitive data being accessed.
- Designing the database so that the most sensitive data is placed in a table that only specified employees have access to…
- reducing the likelihood of this sensitive data being accessed.
- Encryption on the stored data…
- will ensure only users or devices who are authorized can access the stored data.
- VPN/encryption when transferring the data to other companies…
- will ensure that if data is intercepted it is not compromised.
- Creating policies/agreements between parties…
- which will dictate the way the data can be shared and used.
Award [1] for identifying each strategy and [1] for a development of the strategy identified up to [2].
Mark as [2] + [2] + [2].
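The "different levels of access" strategy could be sketched as a simple check that compares an employee's role with the sensitivity of the table being requested. The role names, table names and two-level scheme below are assumptions made for illustration; a real system would rely on the database's own permission controls.

```python
from enum import IntEnum


class Access(IntEnum):
    GENERAL = 1
    SENSITIVE = 2


# Hypothetical roles and tables, invented for this example.
ROLE_LEVEL = {"analyst": Access.GENERAL, "genome_custodian": Access.SENSITIVE}
TABLE_LEVEL = {"appointments": Access.GENERAL, "genome_data": Access.SENSITIVE}


def can_read(role: str, table: str) -> bool:
    """Return True only if the role's level covers the table's level."""
    role_level = ROLE_LEVEL.get(role)
    table_level = TABLE_LEVEL.get(table)
    if role_level is None or table_level is None:
        return False  # deny unknown roles or tables by default
    return role_level >= table_level


print(can_read("analyst", "genome_data"))
print(can_read("genome_custodian", "genome_data"))
```

Denying by default for unknown roles or tables means a configuration gap restricts access rather than exposing the most sensitive data.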
Answers may include:
Advantages of cloud-based storage:
- No need for them to invest in storage infrastructure on site (systems).
- Data security will be provided by the company responsible for the cloud-based storage.
- Data backup facilities will be provided by the company responsible for the cloud-based storage (systems).
- Collaborating and sharing genome data with other data companies or external scientists will be more convenient (systems).
- Accessibility – having the data on the cloud will enable the MediResearch scientists to work remotely using an internet connection (systems).
- Scalability – MediResearch can expand or reduce storage capacity subscription on a needs basis (systems).
Disadvantages of cloud-based storage:
- The data may be managed by a third party, which could cause issues linked to its use / security.
- MediResearch may feel that data security may be more easily compromised.
- The company will have to ensure that they have enough internet bandwidth for data accessibility.
- Storing data in the cloud may sometimes be difficult – MediResearch’s existing data management may not integrate well with the cloud vendor’s system (systems).
- Considering the nature and sensitivity of the genome data held by MediResearch, the government policies of the country where the cloud storage unit is placed may influence the way data is stored/used or shared (values).
In part (c) of this question it is expected there will be a balance between the terminology related to digital systems and the terminology related to social and ethical impacts.
Keywords: health, policies, laws, regulations, data, security, privacy, anonymity, cloud, bandwidth, change, power, systems, values, ethics
Refer to SL/HL paper 1, part c markbands when awarding marks.