Ask the experts: the future for media archives


TV-Bay Magazine
by Stefano Cavaglieri
Issue 89 - May 2014

Semantic linking is a term coined by Tim Berners-Lee and used to describe a framework of syntax that allows computers to understand complex statements of the kind humans are able to deal with easily. If all the information online were to be accessible through semantic linking, computers would be able to make use of it in much more subtle ways, and this would greatly increase the power of data search and retrieval. While constructing a framework and system for internet-wide semantic linking is a massive and complex undertaking, one area in which it could more readily be implemented is in media archiving. Here it would allow much more versatile and efficient retrieval of media assets, using a far wider range of search criteria. The commercial possibilities opened up by this development would create far greater revenue for holders of media archives.

What's different about semantic linking?

Sentences like "The Beatles were a popular band from Liverpool", "John Lennon was a member of the Beatles" and "Let It Be was recorded by the Beatles" are easily understood by people. But how can they be understood by computers? Statements are built with syntax rules: the syntax of a language defines the rules for building statements in that language. But how can syntax become understandable to computers? This is what the Semantic Web is all about: describing things in a way that computer applications can understand.

The Semantic Web is not about links between web pages; instead, it describes the relationships between things (for example, "A is a part of B" and "Y is a member of Z") and the properties of things (such as the format, dimensions, replay speed, equalization, etc.).
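
As a rough illustration, the statements about the Beatles can be written as subject-predicate-object triples, which is the basic shape of RDF data. The sketch below uses plain Python tuples rather than a real RDF library, and the predicate names (`memberOf`, `recordedBy`, `replaySpeed` and so on) and the property values are invented for this example.

```python
# Each statement is a (subject, predicate, object) triple.
# Predicate names are hypothetical, chosen only for this illustration.
TRIPLES = [
    ("The Beatles", "type", "Band"),
    ("The Beatles", "origin", "Liverpool"),
    ("John Lennon", "memberOf", "The Beatles"),
    ("Let It Be", "recordedBy", "The Beatles"),
    # Properties of a thing are expressed the same way (illustrative values).
    ("Let It Be", "format", "LP"),
    ("Let It Be", "replaySpeed", "33 1/3 rpm"),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that agree with every field that is not None."""
    return [
        (s, p, o)
        for (s, p, o) in triples
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]


# "Who is a member of the Beatles?" (a relationship between things)
members = [s for (s, _, _) in match(TRIPLES, predicate="memberOf", obj="The Beatles")]

# "What do we know about Let It Be?" (relationships and properties together)
properties = {p: o for (_, p, o) in match(TRIPLES, subject="Let It Be")}
```

Once statements have this uniform shape, a program can answer both kinds of question with the same simple pattern match, without needing to parse the original sentences.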

Berners-Lee puts it like this: "If HTML and the Web made all the online documents look like one huge book, RDF (Resource Description Framework), schema, and inference languages will make all the data in the world look like one huge database."

If information about music, events, preservation, and so on were stored in RDF files, intelligent web applications could collect information from any source, combine it, and present it to users in a more meaningful way. This could also bring something closer to a relational database's guarantee of correct query results.
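
A minimal sketch of that combining step is shown here, assuming two hypothetical sources that each publish simple (subject, predicate, object) statements. The source names, the predicates and the identifier `tape-0042` are all invented for the example, and plain Python sets stand in for real RDF files.

```python
# Two hypothetical sources, each publishing (subject, predicate, object) triples.
MUSIC_SOURCE = {
    ("The Beatles", "origin", "Liverpool"),
    ("Let It Be", "recordedBy", "The Beatles"),
}
ARCHIVE_SOURCE = {
    ("tape-0042", "carries", "Let It Be"),
    ("tape-0042", "condition", "needs transfer"),
}


def merge(*sources):
    """Union the triples from several sources; exact duplicates collapse."""
    merged = set()
    for source in sources:
        merged |= set(source)
    return merged


def subjects(triples, predicate, obj):
    """Return every subject that has the given predicate and object."""
    return {s for (s, p, o) in triples if p == predicate and o == obj}


def carriers_for_origin(triples, city):
    """Find archive carriers holding recordings by artists from `city`."""
    found = set()
    for artist in subjects(triples, "origin", city):
        for recording in subjects(triples, "recordedBy", artist):
            found |= subjects(triples, "carries", recording)
    return found


combined = merge(MUSIC_SOURCE, ARCHIVE_SOURCE)
result = carriers_for_origin(combined, "Liverpool")
```

Neither source can answer the question on its own: the music source knows nothing about tapes, and the archive source knows nothing about Liverpool. Only the merged data links the two.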

Is a Semantic Web just around the corner?

The Semantic Web is not a very fast-growing technology. One of the reasons for this is the very steep learning curve. RDF was developed by people with academic backgrounds in logic and artificial intelligence, making it difficult for the rest of us to understand. Another is the current lack of standards. RDF is data about data, or metadata. Often RDF files describe other RDF files. Will it ever be possible to link all these RDF files together and build a Semantic Web?

The promise of the Semantic Web has raised a number of different expectations. These expectations can be traced to three different perspectives on the Semantic Web. It is portrayed as a universal library, to be readily accessed and used by humans in a variety of information use contexts; as the backdrop for the work of computational agents completing sophisticated activities on behalf of their human counterparts; and as a method for federating particular knowledge bases and databases to perform anticipated tasks for humans and their agents.

Some of the challenges for the Semantic Web include vastness, vagueness, uncertainty, inconsistency, and deceit. Automated reasoning systems will have to deal with all of these issues in order to deliver on the promise of the Semantic Web.

It is not very likely that owners of media archives will be able to catalog their multimedia documents just by putting an RDF file on the Internet. Various applications will have to be developed, including a search engine database for all the items, and someone will have to develop a standard for it.

It might be eBay, it might be Microsoft, it might be Google. But eventually we will see marketplaces based on RDF. Publishing information about things on the Internet will be much easier than before. One day we will be able to collect information about almost everything on the web in a standardized RDF format.

What other snags are there?

The advantages that the Semantic Web brings in terms of reuse, dynamism, flexibility, and openness also bring possible drawbacks, such as complexity and performance degradation. Then there's the human factor: people may include spurious metadata (so-called "metacrap") in web pages in an attempt to mislead Semantic Web engines that naively assume the metadata's veracity.

Enthusiasm about the Semantic Web could be tempered by concerns regarding censorship and privacy. At present, text-analyzing techniques can easily be bypassed by using other words, such as metaphors, or by using images in place of words. An advanced implementation of the Semantic Web would make it much easier for governments to control the viewing and creation of online information, as this information would be much easier for an automated content-blocking machine to understand.

Another criticism of the Semantic Web is that it would be much more time-consuming to create and publish content because there would need to be two formats for one piece of data: one for human viewing and one for machines. However, many web applications in development are addressing this issue by creating a machine-readable format upon the publishing of data or the request of a machine for such data.
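
The approach of generating the machine-readable format on request might look something like the sketch below. It assumes a single stored record and uses JSON as a stand-in for an RDF serialization; the field names and the `publish` function are hypothetical and do not represent any particular product's API.

```python
import json

# A single stored record; the field names are hypothetical.
RECORD = {
    "title": "Let It Be",
    "recordedBy": "The Beatles",
    "format": "LP",
}


def render_for_humans(record):
    """Produce a readable sentence for a web page."""
    return f'"{record["title"]}" was recorded by {record["recordedBy"]} ({record["format"]}).'


def render_for_machines(record):
    """Produce a machine-readable serialization of the same record."""
    return json.dumps(record, sort_keys=True)


def publish(record, wants_machine_format=False):
    """Choose the representation at request time instead of storing both."""
    if wants_machine_format:
        return render_for_machines(record)
    return render_for_humans(record)
```

Because both views are derived from one record, the author maintains the data once and the second format costs no extra publishing effort.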

Is it all too difficult then?

Where Semantic Web technologies have found a greater degree of practical adoption, it has tended to be among core specialized communities and organizations for intra-company projects. The practical constraints on adoption appear less challenging where domain and scope are more limited than those of the general public and the World Wide Web.

Media archiving could be an ideal application. However, the IASA (International Association of Sound and Audiovisual Archives) is not yet committed to the Semantic Web. Documentation practices in libraries and archives are well established. They are supported by international regulations and vast know-how. Common resources provide trusted information for aggregating data in the traditional way. A number of consortia at various levels are successfully covering different topics, and the dissemination of information is well developed. At this stage, most of the organizations represented within the IASA do not really feel they need the Semantic Web.

So at the current stage of development, the Semantic Web is not a priority for the IASA. But this does not mean it will be ignored. A very gentle, though practical, introduction to some of the beauties of the Semantic Web might be offered by companies such as NOA-Audio, which have systems that are particularly adaptable to a Semantic Web extension, and which have representatives on IASA technical committees.

The commercial potential of media archives that are searchable through Semantic Web technology could be the argument that spurs development in this direction. An IASA implementation of Semantic technologies would transform the way archive holders could exploit their assets, and for the world at large, it would open up access to archived material in exciting new ways.


Article Copyright tv-bay limited. All trademarks recognised.
Reproduction of the content strictly prohibited without written consent.
