
Date: November 2019 | Marks available: 2 | Reference code: 19N.2.HL.TZ0.8
Level: HL | Paper: 2 | Time zone: no time zone
Command term: Identify | Question number: 8 | Adapted from: N/A

Question

Optical character recognition (OCR) is a method where printed text or handwritten text is converted to machine-encoded text.

Recognizing handwritten characters presents more of a problem because people have different handwriting styles.

For example, the digitized handwritten letter X in Figure 3 does not exactly match the digitized letter X in Figure 4.

Artificial neural networks (ANNs) can be used to assist with recognizing handwritten characters that do not exactly match the expected pattern.

Predictive text, where the computer predicts the next word in the sentence, can be programmed to utilize a neural network.

The sentence “The child is feeling” is entered into an application that uses predictive text and three options are suggested: better, like, a. Upon entering the two characters “hu” the word hungry is suggested.

Outline one problem that may lead to printed text characters not being detected correctly.

[2]
a.

Outline why an ANN can be used to overcome the challenges outlined in this scenario.

[2]
b.

Explain how ANN pattern recognition techniques are applied to ensure that the handwritten letter X in Figure 3 is recognized as a letter X.

[4]
c.

Identify two features that would be required by the ANN to predict the next word in the sentence.

[2]
d.

Explain how the application uses a neural network to suggest suitable words.

[6]
e.

Outline two potential problems with training the ANN to suggest appropriate words.

[4]
f.

Markscheme

Award [2 max].
Printed text quality may be poor or faint / the text may use an unusual typeface, e.g. BrushScript / in certain typefaces some characters look similar, e.g. sans serif;
For example, 5 and S may be confused / numbers may be confused with letters, e.g. 1 and I;

a.

Award [2 max].
ANN can be trained to recognize handwriting styles / supervised learning;
ANN can learn to apply existing knowledge to new handwriting styles / unsupervised learning;
ANN can recognize parts of the image to determine if it is a match / doesn’t require the entire image to be a match;

b.

Award [4 max].
The ANN accesses a database of correctly stored characters;
ANN breaks the image into smaller parts / applies a filter (2 or 3 pixels) to a section of the image;
Calculate whether the pixels match / multiply each image pixel by the corresponding feature pixel, add the products up, divide by the number of pixels / a perfect match will be 1;
Apply convolution / repeat the application of the filter over and over;
Apply a ReLU (Rectified Linear Units) layer (remove negative values) to reduce the mathematical calculations;
Apply pooling to shrink the image stack;
Use the stack of filter images / convolution layer to see if the image is a match;
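
The filter, ReLU and pooling steps listed above can be sketched in Python. This is an illustrative sketch only, not the markscheme's method: it assumes pixels are encoded as +1 (ink) and -1 (blank), uses a 3x3 diagonal feature on a 5x5 image, and the names `match_filter`, `relu` and `max_pool` are hypothetical helpers written for this example.

```python
import numpy as np

def match_filter(image, feature):
    """Slide the feature over the image. Each output value is the sum of
    pixel-by-pixel products divided by the number of feature pixels,
    so a perfect match scores 1."""
    fh, fw = feature.shape
    ih, iw = image.shape
    out = np.zeros((ih - fh + 1, iw - fw + 1))
    for r in range(out.shape[0]):
        for c in range(out.shape[1]):
            patch = image[r:r + fh, c:c + fw]
            out[r, c] = (patch * feature).sum() / feature.size
    return out

def relu(x):
    """Replace negative values with zero."""
    return np.maximum(x, 0.0)

def max_pool(x, size=2):
    """Keep the largest value in each size-by-size window.
    Rows and columns that do not fill a whole window are dropped."""
    h2, w2 = x.shape[0] // size, x.shape[1] // size
    trimmed = x[:h2 * size, :w2 * size]
    return trimmed.reshape(h2, size, w2, size).max(axis=(1, 3))

# Assumed encoding: +1 = ink, -1 = blank. A 5x5 letter X.
image = np.array([
    [ 1, -1, -1, -1,  1],
    [-1,  1, -1,  1, -1],
    [-1, -1,  1, -1, -1],
    [-1,  1, -1,  1, -1],
    [ 1, -1, -1, -1,  1],
], dtype=float)

# Hypothetical feature: a short diagonal stroke.
diagonal = np.array([
    [ 1, -1, -1],
    [-1,  1, -1],
    [-1, -1,  1],
], dtype=float)

scores = match_filter(image, diagonal)   # 3x3 map of match scores
activated = relu(scores)                 # negative scores removed
pooled = max_pool(activated, size=2)     # smaller map, strongest match kept
```

In this example the top-left and bottom-right positions of `scores` are exact matches for the diagonal stroke, so they score 1. A real network would learn many such features rather than have them written by hand.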

c.

Award [2 max].
Dictionary of words;
Predictive text algorithms;
The ability to recall the previous words in the sentence / memory;
An understanding of language structure e.g. nouns, verbs, adverbs, etc.;
Previous words must be added back into the neural network;

d.

Award [6 max].
If a diagram is used, it should include:
Award [1] Input for previous words;
Award [1] Input for new characters the user enters;
Award [1] Hidden layers;
Award [1] Weights inputted into the network;
Award [1] Combining the two inputs;
Award [1] Non-linear regression / sigmoid function;
Award [1] Merge layers to produce output;
Award [1] Back propagation / Output that re-enters the ANN;
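
One way to picture how the two inputs might be combined is the Python sketch below. It is a simplified illustration under stated assumptions, not the application's actual network: the vocabulary and the `context_scores` values are invented, and they stand in for what trained hidden layers would output from the previous words "The child is feeling". The typed characters act as a second input that masks out words with the wrong prefix.

```python
import numpy as np

# Hypothetical five-word vocabulary for this example only.
VOCAB = ["better", "like", "a", "hungry", "human"]

def sigmoid(x):
    """Non-linear activation that squashes a score into the range 0 to 1."""
    return 1.0 / (1.0 + np.exp(-x))

def suggest(context_scores, typed_prefix, top_n=3):
    """Combine the score from the previous words with the typed characters
    and return up to top_n suggestions, best first."""
    activations = sigmoid(np.asarray(context_scores, dtype=float))
    # Second input: 1 where the word starts with the typed characters, else 0.
    mask = np.array([w.startswith(typed_prefix) for w in VOCAB], dtype=float)
    combined = activations * mask
    order = np.argsort(-combined, kind="stable")
    return [VOCAB[i] for i in order[:top_n] if combined[i] > 0]

# Invented scores standing in for the hidden layers' output.
context_scores = [2.0, 1.5, 1.0, 0.5, -1.0]

no_prefix = suggest(context_scores, "")      # nothing typed yet
with_prefix = suggest(context_scores, "hu")  # user has typed "hu"
```

With these made-up scores, `no_prefix` lists better, like and a, and `with_prefix` puts hungry first, which mirrors the scenario in the question. Training by back propagation, which would set the real weights, is not shown.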

e.

Award [4 max].
Unsupervised learning may have a poor set of words / text-speak is used;
As a result, the ANN database does not contain text-speak words, so the ANN does not learn them;

Vanishing gradient problem / Exploding gradient problem;
The gradient signal is multiplied many times by the weight matrix / the gradient signal can become smaller at every training step (vanishing) / can become excessively large at every training step (exploding) / this can make learning very slow or stop it completely;
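
The effect of repeated multiplication can be seen in a small Python sketch. It assumes a single scalar weight in place of the weight matrix, and the values 0.5, 1.5 and 20 steps are chosen only for illustration; `gradient_after_steps` is a hypothetical helper.

```python
def gradient_after_steps(weight, steps, initial=1.0):
    """Multiply a gradient signal by the same weight once per step."""
    gradient = initial
    for _ in range(steps):
        gradient *= weight
    return gradient

# A weight below 1 shrinks the signal towards zero (vanishing).
vanishing = gradient_after_steps(0.5, 20)

# A weight above 1 makes the signal grow rapidly (exploding).
exploding = gradient_after_steps(1.5, 20)
```

After 20 steps the first signal is below one millionth of its starting value and the second is more than three thousand times larger, so weight updates become either negligible or unstable.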

f.

Examiners report

[N/A]
a.
[N/A]
b.
[N/A]
c.
[N/A]
d.
[N/A]
e.
[N/A]
f.

Syllabus sections

Option B: Modelling and simulation » B.4 Communication modelling and simulation