RDFa-deployed Multimedia Metadata (ramm.x) Use Cases


2007-08-27

This version:
http://sw.joanneum.at/rammx/usecases
Latest version:
http://sw.joanneum.at/rammx/usecases
Revision:
$Id: index.html 142 2007-08-27 16:31:47Z baw $
Editor:
Werner Bailer - JOANNEUM RESEARCH
Authors:
TBD

This work is licensed under the Creative Commons Attribution 3.0 License.


ramm.x relies heavily on W3C's RDFa technology, an open Web standard that can be freely used by anyone.


To obtain the metadata of this page, follow the RDFa extractor link, a service provided by Dave Beckett.

The visual layout and structure of this specification were adapted from the FOAF Vocabulary Specification by Dan Brickley and Libby Miller, and from the SIOC Core Ontology Specification by Uldis Bojars and John G. Breslin.


Abstract

This document describes application scenarios that benefit from the use of ramm.x. From these use cases, a set of requirements for ramm.x is derived.


Status of this document

NOTE: This section describes the status of this document at the time of its publication. Other documents may supersede this document.

This is the first draft of the ramm.x use cases.

Suggestions are welcome; please send comments to rammx-spec@googlegroups.com.

Table of contents

1. Introduction
2. Use Cases
  2.1 Annotate and share photos
  2.2 Buy music
  2.3 Description of video structure
  2.4 Publishing professional content with metadata
  2.5 Rights information for media asset
  2.6 Detailed description of large media assets
  2.7 Cultural heritage applications
3. Requirements
4. Acknowledgements
5. References

1. Introduction

This document describes application scenarios that benefit from the use of ramm.x. Generally speaking, the following conditions must be fulfilled for an application scenario to be a potential ramm.x use case:

2. Use Cases

2.1 Annotate and share photos

The Image Annotation on the Semantic Web report [Image Annotation] presents five use cases for image annotation using semantics-based technologies. In the solutions presented for these use cases, the metadata descriptions are RDF documents. Deploying them on the Web would often require embedding the complete description into an HTML document using RDFa. In some of the use cases, the source metadata come from Dublin Core, Exif or TV-Anytime descriptions. In these cases, ramm.x could be used as an alternative deployment strategy that avoids embedding the complete RDF document into the HTML page.
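As a rough sketch of this alternative deployment strategy, a page might embed only a few simple Dublin Core statements in RDFa and point to the source metadata document instead of embedding its full RDF conversion. All URIs below are hypothetical placeholders, and only standard Dublin Core terms are used:

```html
<!-- Illustrative sketch only: all URIs are hypothetical placeholders. -->
<div xmlns:dc="http://purl.org/dc/elements/1.1/"
     about="http://example.org/photos/42.jpg">
  <span property="dc:title">Sunset at the beach</span>,
  taken by <span property="dc:creator">Ann</span>.
  <!-- Instead of embedding the complete RDF description,
       link to the source metadata document: -->
  See the <a rel="dc:relation"
             href="http://example.org/photos/42-exif.xml">full Exif
  description</a> of this photo.
</div>
```

A ramm.x-aware agent could follow such a link and invoke a conversion service to obtain an RDF version of the Exif description, rather than parsing a fully embedded copy.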

Photo sharing is without doubt one of the most popular types of Web 2.0 application. A number of photo sharing services exist, and all of them allow users to add some kind of metadata to the uploaded images. However, there are two shortcomings that are addressed by ramm.x:

2.2 Buy music

Music is increasingly sold online, both by ordering traditional media like CDs and by downloading files. Music stores provide only very basic metadata with their content (cf. Figure 2). As there are a number of common formats for this kind of metadata, using ramm.x would allow linking the metadata of music stores with other Semantic Web resources. This enables applications that automatically link items on a store's site with artist information, reviews, information from fan sites, etc.


Figure 2: Metadata in the iTunes music store.
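Such linking could, as a sketch, identify a store item with an external Semantic Web resource so that agents can gather related information. The URIs below are hypothetical placeholders (the MusicBrainz identifier is invented for illustration):

```html
<!-- Illustrative sketch only: all URIs are hypothetical placeholders. -->
<div xmlns:dc="http://purl.org/dc/elements/1.1/"
     xmlns:owl="http://www.w3.org/2002/07/owl#"
     about="http://store.example.com/track/1234">
  <span property="dc:title">Some Song</span> by
  <span property="dc:creator">Some Artist</span>.
  <!-- Identify the store item with an external resource so that
       agents can gather reviews, artist information, etc.: -->
  <span rel="owl:sameAs"
        resource="http://musicbrainz.example.org/track/some-track-id"></span>
</div>
```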

2.3 Description of video structure

While global metadata can be described in a number of simple formats, metadata related to a certain temporal or spatial range of the content requires more advanced metadata formats. A typical example is the description of the structure of a video, i.e. its scenes, sequences, etc. This structure not only serves as a container for metadata valid only for a certain segment, but can also be used for navigation and abstraction of the content.
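A description of video structure might, as a rough sketch, enumerate the segments of a video and attach metadata to each of them. The URIs and the fragment identifiers below are hypothetical placeholders; a real deployment would reference the segment definitions in the source format (e.g. an MPEG-7 document):

```html
<!-- Illustrative sketch only: URIs and fragment identifiers
     are hypothetical placeholders. -->
<div xmlns:dc="http://purl.org/dc/elements/1.1/"
     about="http://example.org/videos/vacation">
  <span property="dc:title">My vacation 2007</span>
  <ul>
    <!-- Each list item describes one temporal segment of the video. -->
    <li about="http://example.org/videos/vacation#seg1">
      <span property="dc:title">Arrival in Rome</span></li>
    <li about="http://example.org/videos/vacation#seg2">
      <span property="dc:title">Day trip to Florence</span></li>
  </ul>
</div>
```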

Assume that Ann creates a video of her last vacation, along with a description of how the video is segmented into parts that correspond to the different places she visited. She publishes the video on YouTube together with a ramm.x description of the video structure. Other Web applications can, for example, use this description as follows:

2.4 Publishing professional content with metadata

More and more professional content providers offer high-quality content on the Web (an example is [BBC Motion Gallery], cf. Figure 3). In contrast to user-generated content, detailed and accurate metadata are available for this kind of content. Currently the metadata are published only in part, and only as text on the Web site. Applying ramm.x not only allows publishing the metadata in a Semantic Web-compliant way, but also linking directly to a description in a format that the content provider uses for business-to-business exchange (such as [EBU P_Meta] in the broadcast domain), provided that a service for conversion to RDF is available.


Figure 3: Asset offered at BBC Motion Gallery and metadata.

2.5 Rights information for media asset

When media assets are published, it is also important to make the related rights information accessible. This information is, for example, relevant for multimedia agencies that want to automatically retrieve images from the Web which they can re-use in advertisements, catalogues, etc. If the rights information consists only of a reference to a certain license (e.g. a specific Creative Commons license), this is trivial. If, however, there are more complex rights metadata (cf. Figure 4), e.g. expressed using MPEG-21 REL, then ramm.x can be used to deploy these metadata.


Figure 4: Textual rights information for an image offered at Getty Images.
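Both cases can be sketched in RDFa as follows. The URIs other than the Creative Commons ones are hypothetical placeholders; the simple case uses the Creative Commons vocabulary, while the complex case merely points at an external MPEG-21 REL document:

```html
<!-- Illustrative sketch only: the example.org URIs are
     hypothetical placeholders. -->
<div xmlns:cc="http://creativecommons.org/ns#"
     xmlns:dc="http://purl.org/dc/elements/1.1/"
     about="http://example.org/images/99.jpg">
  <!-- Simple case: a reference to a well-known license suffices. -->
  <a rel="cc:license"
     href="http://creativecommons.org/licenses/by/3.0/">CC BY 3.0</a>
  <!-- Complex case: link to a full rights expression, to be
       converted to RDF by a dedicated service. -->
  <a rel="dc:rights"
     href="http://example.org/images/99-rel.xml">MPEG-21 REL
  rights expression</a>
</div>
```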

2.6 Detailed description of large media assets

Imagine a Web application that allows users to create highlight and summary videos of NBA basketball games. Besides its presence on TV, footage of NBA games is available on the Web [NBA Content]; even entire games are broadcast via broadband. Basketball content is both spectacular and multifaceted, and therefore well suited for interactive consumption.

Almost every aspect of an NBA basketball game is covered by exhaustive statistics. This includes statistics about teams, players (averages, career bests), and games (all game events by exact time, involved players, and action, e.g., free throws or turnovers) including extensive game logs. Comprehensive statistics — both official NBA statistics and further analysis — are publicly available on different web sites. In addition, metadata could be automatically extracted by content analysis approaches, e.g. segmenting and tracking players and describing their trajectories. This means that a huge amount of metadata in different formats is available for one video.

When a user watches parts of the videos of a game, e.g. highlights selected based on personal preferences, a Semantic Web application could use the ramm.x-deployed metadata of the basketball game to gather related information and present it to the user. However, the complete description of the game is large, and processing all of it is time-consuming if only the description of a small segment is needed.


Figure 5: NBA content on YouTube and metadata published with it.

2.7 Cultural heritage applications

Imagine the case of an archive collecting historical newspapers (cf. Figure 6), which are scanned page by page. Optical character recognition (OCR) can be applied to extract the text of the articles and make it searchable via full-text search.


Figure 6: Scanned page of a historical newspaper.

Other elements of the pages, such as illustrations, photographs, advertisements, etc., can be located and extracted during the digitisation process, but are not self-descriptive. In addition, various metadata about the asset exist, for example descriptive and administrative metadata. These metadata are commonly represented using the METS standard [METS]; in our example the proposed historical newspaper profile [METS Newspaper] would be appropriate. Another type of metadata is information about the digitisation process (e.g. device, resolution, date/time), which is usually stored as Exif data [Exif] embedded in the digital image.

The archive in our example decides to make its collection available on the Web. It publishes the original scanned images, the text transcripts and the extracted non-text elements. The most relevant of the available metadata elements are included in the asset description on the HTML page. This is very useful for a human viewer of the page.

Let us now assume that a TV journalist wants to edit a documentary on the Hungarian Revolution of 1956. He uses a Semantic Web agent to gather video and image material on this event. Clearly the image on the front page of the newspaper depicted in Figure 6 would be relevant in this context, but how could it be linked to other information on the Semantic Web? Some simple descriptive metadata could be represented using the Dublin Core vocabulary. As the Semantic Web agent understands RDF, we could use the RDF representation of Dublin Core [DC RDF], either in a separate document or, preferably, embedded in the HTML page using RDFa. But what about the information contained in the Exif and METS descriptions?
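A deployment along the lines of this use case might, as a rough sketch, embed the simple Dublin Core metadata directly in RDFa and merely link to the Exif and METS documents, leaving their conversion to RDF to a dedicated service. All URIs below are hypothetical placeholders:

```html
<!-- Illustrative sketch only: all URIs are hypothetical placeholders. -->
<div xmlns:dc="http://purl.org/dc/elements/1.1/"
     about="http://archive.example.org/neue-zeit/1956/page1.jpg">
  <span property="dc:title">Neue Zeit, front page</span>,
  <span property="dc:date">1956</span>.
  <!-- The Exif and METS descriptions remain in their original
       formats and are linked rather than embedded: -->
  <a rel="dc:relation"
     href="http://archive.example.org/neue-zeit/1956/mets.xml">METS
  description</a>
</div>
```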

3. Requirements

This section gathers the requirements from the use cases above and tries to generalize them.


4. Acknowledgements

TODO: Add contributors ...

5. References

werner.bailer@joanneum.at