Date | November 2018 | Marks available | 3 | Reference code | 18N.2.HL.TZ0.12 |
Level | HL | Paper | 2 | Time zone | no time zone |
Command term | Explain | Question number | 12 | Adapted from | N/A |
Question
Tapetum lucidum refers to the layer of tissue in the eyes of many animals, such as cats and owls, which helps to improve their night vision by reflecting light. The eyes of these animals glow in the dark.
A student wishes to identify a particular animal which belongs to the above group.
The web and social media are increasingly populated by pictures of pets and other animals. Most of these pictures are annotated with tags, such as those used on social networks.
Search engines use indexing for the storing of keywords that might be used in a search engine query.
With the use of an example, explain how a multimedia search in the semantic web would be preferable to a text based search for this student.
Identify three differences between an ontology and a folksonomy.
Discuss whether it is possible to integrate the expressivity of folksonomies with the authoritativeness of ontologies.
Explain the role that web crawlers perform in developing this index.
Open source software, such as Linux, is often developed by using the collective intelligence of the developer community.
Explain how collective intelligence can lead to the successful development of open source software.
Markscheme
Award up to [4 max].
Award [2 max] for a general knowledge and understanding of multimedia web.
Award [2 max] for describing a situation where the multimedia web may be preferable to a text-based search.
Knowledge of multimedia web
The multimedia web makes expanded use of the semantic web (formally and semantically interlinked data of any kind), making it possible to provide input in a form other than text, and possibly return output in various forms;
Therefore, the search is made at the level of “concepts” and not just textual terms;
In such a way that the computer can retrieve semantic links among different concepts/concepts presented in different formats;
That includes the ability for crawlers to go through scientific classifications of text/captions/description of images;
That can all be indexed and clustered into a common “concept”;
Example 1:
To detect/discover/verify/monitor the presence of species based on sound
For example:
We might aim to recognise, and search the web for, some nocturnal animal that we don’t see/don’t know, starting from its sound, where its presence is revealed by its eyes;
Giving the sound as input would allow a search based on a recording/audio to run on multimedia sources connected by semantic links;
Example 2:
To detect/discover/verify/monitor the presence of species based on images
For example:
We might start a multimedia search based on images/photos of nocturnal animals we may have taken;
To see whether those sightings were associated with dangerous species or not;
To signal the presence of some unusual species in the region;
To signal the presence of some animal in difficulty to the protection agency;
Award up to [3 max].
Award [1] for each difference up to [3 max].
Designed or extracted by knowledge engineers VS;
Created collaboratively by users;
Laborious and expert VS quick/easy non-expert;
Controlled vocabulary/dictionary VS no control of vocabulary/informal;
Formal specification of knowledge domain (taxonomies) VS;
Informal metadata on documents;
Not just for the web VS mostly for the web;
Essential for the semantic web VS important for social web;
Explicit meaning and high expressive power VS;
Ambiguity and low expressive power;
Experts’ view of the domain VS social aspect of meaning;
Award up to [6 max].
Award [2 max] for a superficial response that uses generic terminology.
Award [4 max] for a response that shows some analytical comments, unsubstantiated conclusions and some use of appropriate terminology.
Award [6 max] for a coherent analysis leading to substantiated conclusions with the appropriate use of subject specific terminology.
General, possibly described through examples
The ontology of the semantic web is extracted by expert means from documents;
(For example, by performing feature analysis / aggregating concepts / disambiguating terms;)
Whereas, the folksonomy is the result of adding different vocabulary (done by the users) via tags in posts/web pages/social web (like Tumblr);
(For example, tagging the photo of a cat under an informal “cat’s eye” would add a different perspective to the ontology);
The problem:
If not monitored, tagging may be a source of “garbage” and hinder searches that rely on the ontology;
The ambiguity problem of tagging needs to be addressed so as to assign a weight to ambiguous uses of tags;
Possible countermeasures:
Expand the query within social tagging platforms;
In order to assign a weight to ambiguous uses of tags and use this weight to cut out some irrelevant retrieved information;
OR
Use the folksonomy tags (at least some) to update (possibly as metadata) the ontology vocabulary;
However, depending on the context, this may not always be feasible/applicable and depends on the domain of discourse;
OR
Re-engineer the ontology to include a collaborative form of construction from tagging;
However, this method is not applicable in expert fields where quality of content must be preserved;
OR
Keep the ontology and folksonomy well separated, but let them work in tandem;
So as to obtain feedback from folksonomies, but merged in a controlled way;
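The first countermeasure (weighting ambiguous tags and using the weight to cut out irrelevant results) can be sketched as follows. The markscheme does not prescribe a weighting scheme; this sketch assumes a simple one, where the weight of a tag for a concept is the share of the tag's uses that co-occur with an ontology term for that concept. The names `ONTOLOGY`, `TAGGED`, `concept_weights` and `search`, and all the data, are hypothetical.

```python
from collections import Counter

# Hypothetical controlled vocabulary from an ontology: term -> concept.
ONTOLOGY = {"cat": "Felis catus", "owl": "Strigiformes", "marble": "Toy"}

# Hypothetical folksonomy: resource -> free-form tags added by users.
TAGGED = {
    "photo1": {"cat's eye", "cat", "night"},
    "photo2": {"cat's eye", "cat", "glow"},
    "photo3": {"cat's eye", "marble"},
    "photo4": {"owl", "night"},
}


def concept_weights(tag):
    """Share of the tag's uses that co-occur with each ontology concept."""
    counts = Counter()
    for tags in TAGGED.values():
        if tag in tags:
            for other in tags:
                if other in ONTOLOGY:
                    counts[ONTOLOGY[other]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {concept: n / total for concept, n in counts.items()}


def search(tag, concept, min_weight=0.5):
    """Resources carrying the tag, returned only if the tag's weight for
    the requested concept reaches the (assumed) threshold."""
    if concept_weights(tag).get(concept, 0.0) < min_weight:
        return []
    return sorted(
        resource
        for resource, tags in TAGGED.items()
        if tag in tags and any(ONTOLOGY.get(t) == concept for t in tags)
    )


weights = concept_weights("cat's eye")
animal_hits = search("cat's eye", "Felis catus")
toy_hits = search("cat's eye", "Toy")
```

Here the informal tag “cat's eye” is mostly used alongside the ontology term for the animal, so a query for the animal keeps its matches, while the minority reading (the marble) falls below the threshold and is cut out.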
Award up to [4 max].
Award [1] for each comment that indicates the role that web crawlers perform in developing an index that is used by a search engine up to [4 max].
A web crawler is given a “seed” page which is where it starts;
It searches for hyperlinks;
Which it recursively follows;
Making a copy of each web page visited;
Which can be indexed by the search engine;
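The steps above can be sketched in Python. This is a minimal illustration, not how any particular search engine works: the pages live in a hypothetical in-memory dictionary (`FAKE_WEB`) instead of being fetched over HTTP, the links are followed with a queue rather than by literal recursion, and the index is a plain keyword-to-URL mapping. All names and URLs are assumptions made for the example.

```python
from collections import deque
from html.parser import HTMLParser


class PageParser(HTMLParser):
    """Collects hyperlink targets and visible words from one page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.words = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

    def handle_data(self, data):
        self.words.extend(data.lower().split())


# Hypothetical in-memory "web": URL -> HTML. A real crawler would fetch over HTTP.
FAKE_WEB = {
    "seed.example/": '<a href="seed.example/cats">cats</a> <a href="seed.example/owls">owls</a>',
    "seed.example/cats": 'cat eyes glow <a href="seed.example/">home</a>',
    "seed.example/owls": "owl eyes glow",
}


def crawl(seed, fetch):
    """Start at the seed, follow hyperlinks, keep a copy of each page visited."""
    copies = {}
    queue = deque([seed])
    while queue:
        url = queue.popleft()
        if url in copies:
            continue  # already visited
        html = fetch(url)
        if html is None:
            continue  # page could not be retrieved
        copies[url] = html
        parser = PageParser()
        parser.feed(html)
        queue.extend(parser.links)
    return copies


def build_index(copies):
    """Inverted index: keyword -> set of URLs whose text contains it."""
    index = {}
    for url, html in copies.items():
        parser = PageParser()
        parser.feed(html)
        for word in parser.words:
            index.setdefault(word, set()).add(url)
    return index


pages = crawl("seed.example/", FAKE_WEB.get)
index = build_index(pages)
```

A query for the keyword `glow` would then look up `index["glow"]` and find both the cat and the owl page, without the search engine having to revisit the web at query time.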
Award up to [3 max].
Award [1] for some example to set the scenario of open source;
Award [1] for what is meant by collective intelligence;
Award [1] for some explanation/expansion to link the concepts;
The focus:
“Software for Public Interest”;
Open source and free software development is the result of initiatives taken by sparse communities of programmers that collaborate effectively to maintain complex programs/packages of code;
(For example, Debian Foundation, Linux Foundation, Mozilla Foundation, Apache Foundation, Free Software Foundation, Wikipedia…;)
The structure:
A distributed system of development AND tightly governed;
Each developer codes individually, but the whole project relies on the fact that portions of code may be integrated as components of other parts;
Organisations may have different modes of governance, and have precise ways (processes) to contribute to a project, both technical and social;
The objective is to guarantee coordination, quality and relevance for the project development (technical) based on contributions of the individuals;
While modulating possible clashes/conflicts between different personalities;
Examples:
OS kernels, Wikipedia, mail programs, GNU, …;
Build a deposit of old software (free/open source), for preservation and sharing (with everybody);
Old open source software may in fact be at risk of no longer being publicly available for a variety of reasons, such as sites becoming proprietary and limiting public access, system crashes, or business decisions;
Role of collective intelligence
Individuals are expert developers, but act collectively, for public interest, and this becomes a movement that may make a cultural/political difference/impact (the key is high quality, not populism!);
e.g. the French Public Administration adopts Open Source software and has rejected proprietary products (Microsoft, specifically)