Digital TV
With the phenomenal growth of the Internet and the World Wide Web, interest in advanced interactivity with content provided by digital television is increasing. Additional text, pictures, audio, or graphics that can be controlled by the user can add to the entertainment value of certain programs, or provide valuable information that is unrelated to the current program but of interest to the viewer. TV station logos, customized advertising, and multi-window screen formats allowing display of sports statistics or stock quotes using data-casting are some examples of this increased functionality. Providing the capability to link and synchronize certain events with video would further improve the experience. Coding and representation of not only frames of video, but also individual objects in the scene (video objects), can open the door to completely new ways of television programming.
Mobile multimedia
The enormous popularity of cellular phones and palm computers indicates the interest in mobile communications and computing. Using multimedia in these areas would enhance the user's experience and improve the usability of these devices. Narrow bandwidth, limited computational capacity, and the limited reliability of the transmission media currently hamper widespread use of multimedia here. Providing improved error resilience, improved coding efficiency, and flexibility in assigning computational resources would bring mobile multimedia applications closer to reality.
TV production
Content creation is increasingly turning to virtual production techniques as extensions of the well-known chroma keying. The scene and the actors are recorded separately, and can be mixed with additional computer-generated special effects. By coding video objects instead of rectangular video frames, and allowing access to the video objects, the scene can be rendered with higher quality and with more flexibility. Television programs consisting of composited video objects, together with additional graphics and audio, can then be transmitted directly to the viewer, with the additional advantage of allowing the user to control the programming in a more sophisticated way. In addition, depending on the targeted viewers, local TV stations could inject regional advertisement video objects, which are better suited to local audiences when international programs are broadcast.
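The chroma keying mentioned above can be sketched in a few lines. This is a minimal per-pixel compositing sketch in pure Python; the green-dominance rule and the threshold value are illustrative assumptions, not part of any broadcast or MPEG specification, and real virtual-production pipelines work on full video frames with far more sophisticated keying.

```python
# Per-pixel chroma-key compositing sketch (pure Python, no real video I/O).
# Pixels are (R, G, B) tuples; the keying rule and threshold are illustrative.

def chroma_key(foreground, background, threshold=60):
    """Replace green-screen pixels of the foreground with the background."""
    out = []
    for fg, bg in zip(foreground, background):
        r, g, b = fg
        # Treat a pixel as green-screen when green clearly dominates.
        if g - max(r, b) > threshold:
            out.append(bg)
        else:
            out.append(fg)
    return out

# One scanline: an actor pixel followed by two green-screen pixels,
# composited over a solid blue background.
fg_line = [(200, 180, 170), (10, 250, 20), (15, 240, 10)]
bg_line = [(0, 0, 255), (0, 0, 255), (0, 0, 255)]
print(chroma_key(fg_line, bg_line))
```

Coding video objects rather than frames removes the need for such pixel tests at the receiver: the object's shape is carried in the bitstream itself.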
Games
The popularity of games on stand-alone game machines and on PCs clearly indicates the interest in user interaction. Most games currently use three-dimensional graphics, both for the environment and for the objects that are controlled by the players. The addition of video objects to these games would make them even more realistic, and using overlay techniques, the objects could be made more lifelike. Access to individual video objects is essential, and using standards-based technology would make it possible to personalize games by linking personal video databases into the games in real time.
Streaming video
Streaming video over the Internet is becoming very popular, using viewing tools provided as software plug-ins for a Web browser. News updates and live music shows are just two examples of many possible video streaming applications. Here, bandwidth is limited due to the use of modems, and transmission reliability is an issue, as packet loss may occur. Increased error resilience and improved coding efficiency will improve the experience of streaming video. In addition, scalability of the bitstream, in terms of temporal and spatial resolution but also in terms of video objects, under the control of the viewer, will further enhance the experience and the use of streaming video.
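The temporal scalability mentioned above can be illustrated with a toy model: a receiver on a slow modem keeps only the base-layer frames and skips enhancement frames, halving the frame rate without re-encoding. The layer assignment here (even frame indices form the base layer) is an illustrative assumption, not an actual MPEG bitstream format.

```python
# Sketch of temporal scalability: keep only base-layer frames.
# Assumption for illustration: even-indexed frames are the base layer.

def base_layer(frames):
    """Keep every other frame, halving the temporal resolution."""
    return [f for i, f in enumerate(frames) if i % 2 == 0]

frames = [f"frame{i}" for i in range(6)]
print(base_layer(frames))  # half the frame rate, same spatial resolution
```

Object-based scalability works analogously, except that the units dropped or kept are whole video objects rather than frames.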
RECENT MPEG STANDARDS
MPEG-7 Standard
The MPEG-7 standard, formally named “Multimedia Content Description Interface”, provides a rich set of standardized tools to describe multimedia content. Both human users and automatic systems that process audiovisual information are within the scope of MPEG-7.
MPEG-7 offers a comprehensive set of audiovisual Description Tools (the metadata elements and their structure and relationships, defined by the standard in the form of Descriptors and Description Schemes) to create descriptions (i.e., sets of instantiated Description Schemes and their corresponding Descriptors, assembled at the user's will). These descriptions form the basis for applications enabling effective and efficient access (search, filtering, and browsing) to multimedia content. This is a challenging task given the broad spectrum of requirements and targeted multimedia applications, and the large number of audiovisual features of importance in this context.
Context of MPEG-7
More and more audiovisual information is available from many sources around the world. The information may be represented in various forms of media, such as still pictures, graphics, 3D models, audio, speech, and video. Audiovisual information plays an important role in our society, be it recorded in such media as film or magnetic tape or originating, in real time, from audio or visual sensors, and be it analogue or, increasingly, digital. Typical scenarios are information retrieval (quickly and efficiently searching for various types of multimedia documents of interest to the user) and filtering in a stream of audiovisual content descriptions (to receive only those multimedia data items which satisfy the user's preferences). For example, a code in a television program triggers a suitably programmed PVR (Personal Video Recorder) to record that program, or an image sensor triggers an alarm when a certain visual event happens. Automatic transcoding may be performed from a string of characters to audible information, or a search may be performed in a stream of audio or video data. In all these examples, the audiovisual information has been suitably "encoded" to enable a device or a computer code to take some action.
Audiovisual sources will play an increasingly pervasive role in our lives, and there will be a growing need to have these sources processed further. A PVR could receive descriptions of the audiovisual information associated with a program that would enable it to record, for example, only news, with the exclusion of sport. Products from a company could be described in such a way that a machine could respond to unstructured queries from customers making inquiries.
MPEG-7 is a standard for describing the multimedia content data that will support these operational requirements. The requirements apply, in principle, to both real-time and non real-time as well as push and pull applications. MPEG-7 does not standardize or evaluate applications. In the development of the MPEG-7 standard applications have been used for understanding the requirements and evaluation of technology. It must be made clear that the requirements are derived from analyzing a wide range of potential applications that could use MPEG-7 descriptions. MPEG-7 is not aimed at any one application in particular; rather, the elements that MPEG-7 standardizes support as broad a range of applications as possible.
MPEG-7 Objectives
Audiovisual data content that has MPEG-7 data associated with it, may include: still pictures, graphics, 3D models, audio, speech, video, and composition information about how these elements are combined in a multimedia presentation (scenarios). A special case of these general data types is facial characteristics.
MPEG-7, like the other members of the MPEG family, is a standard representation of audio-visual information satisfying particular requirements. The MPEG-7 standard builds on other (standard) representations such as analogue, PCM, and MPEG-1, -2, and -4. One functionality of the MPEG-7 standard is to provide references to suitable portions of them.
MPEG-7 allows different granularity in its descriptions, offering the possibility to have different levels of discrimination. Because the descriptive features must be meaningful in the context of the application, they will be different for different user domains and different applications. Intermediate levels of abstraction may also exist.
The level of abstraction is related to the way the features can be extracted: many low-level features can be extracted in fully automatic ways, whereas high level features need (much) more human interaction.
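The claim that low-level features can be extracted fully automatically is easy to demonstrate: a coarse color histogram needs no human input at all. The sketch below mirrors the spirit of MPEG-7's color Descriptors but uses an illustrative two-bins-per-channel quantization, not the standard's normative binning.

```python
# A low-level feature extractable fully automatically: a coarse color
# histogram. The 2-bins-per-channel quantization is an illustrative choice.

def color_histogram(pixels, bins_per_channel=2):
    """Count pixels per quantized (R, G, B) cell; returns bins**3 counts."""
    hist = [0] * bins_per_channel ** 3
    step = 256 // bins_per_channel
    for r, g, b in pixels:
        idx = ((r // step) * bins_per_channel ** 2
               + (g // step) * bins_per_channel
               + (b // step))
        hist[idx] += 1
    return hist

# Two reddish pixels and one blue pixel land in two distinct histogram cells.
pixels = [(255, 0, 0), (250, 10, 5), (0, 0, 255)]
print(color_histogram(pixels))
```

A high-level feature such as "this segment shows a goal being scored", by contrast, cannot be computed by a formula like the one above and needs human annotation or substantial machine inference.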
Next to having a description of what is depicted in the content, it is also required to include other types of information about the multimedia data:
- The form;
- Conditions for accessing the material;
- Classification;
- Links to other relevant material;
- The context.
In many cases, it is desirable to use textual information for the descriptions. Care was taken, however, to keep the usefulness of the descriptions as independent of the language area as possible. A very clear example where text comes in handy is in giving the names of authors, titles, places, etc. Another category of description covers the interaction of the user with the content (user preferences, usage history). All these descriptions are of course coded in an efficient way for searching, filtering, etc.

To accommodate this variety of complementary content descriptions, MPEG-7 approaches the description of content from several viewpoints. The sets of Description Tools developed from those viewpoints are presented here as separate entities. However, they are interrelated and can be combined in many ways. Depending on the application, some will be present and others can be absent or only partly present.
MPEG-7 also uses XML as the language of choice for the textual representation of content descriptions: XML Schema is the basis for the DDL (Description Definition Language) that is used for the syntactic definition of MPEG-7 Description Tools and for allowing extensibility of Description Tools (either new MPEG-7 tools or application-specific ones). Considering the popularity of XML, its use will facilitate interoperability with other metadata standards in the future.
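Because MPEG-7 descriptions are textually represented in XML, they can be produced and queried with ordinary XML tooling. The sketch below builds a small MPEG-7-style description with Python's standard library; the element names (`VideoSegment`, `MediaTime`, `DominantColor`) are simplified illustrations of Descriptors and Description Schemes, not the normative MPEG-7 schema, and namespaces are omitted.

```python
# Building a small MPEG-7-style textual description as XML.
# Element names are illustrative, not the normative MPEG-7 schema.
import xml.etree.ElementTree as ET

desc = ET.Element("Description")
video = ET.SubElement(desc, "VideoSegment", id="news-item-1")
ET.SubElement(video, "Title").text = "Evening news: weather report"
ET.SubElement(video, "MediaTime", start="00:05:10", duration="PT2M30S")
# A low-level feature (e.g., a dominant color) sits next to textual metadata.
ET.SubElement(video, "DominantColor").text = "30 60 120"

xml_text = ET.tostring(desc, encoding="unicode")
print(xml_text)
```

A filtering agent could then match such descriptions against user preferences (e.g., select only segments whose `Title` mentions "news") without ever decoding the video itself.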
The main elements of the MPEG-7 standard are:
- Description Tools: Descriptors (D), which define the syntax and the semantics of each feature (metadata element), and Description Schemes (DS), which specify the structure and semantics of the relationships between their components, which may be both Descriptors and Description Schemes;
- A Description Definition Language (DDL) to define the syntax of the MPEG-7 Description Tools and to allow the creation of new Description Schemes and, possibly, Descriptors, as well as the extension and modification of existing Description Schemes.
MPEG-21 Standard
Today, many elements exist to build an infrastructure for the delivery and consumption of multimedia content. There is, however, no 'big picture' to describe how these elements, either in existence or under development, relate to each other. The aim for MPEG-21 is to describe how these various elements fit together. Where gaps exist, MPEG-21 will recommend which new standards are required. ISO/IEC JTC 1/SC 29/WG 11 (MPEG) will then develop new standards as appropriate while other relevant standards may be developed by other bodies. These specifications will be integrated into the multimedia framework through collaboration between MPEG and these bodies.
The vision for MPEG-21 is to define a multimedia framework to enable transparent and augmented use of multimedia resources across a wide range of networks and devices used by different communities.
Multimedia Framework
Currently, multimedia technology provides the different players in the multimedia value and delivery chain (from content creators to end-users) with an abundance of information and services. Access to information and services from almost anywhere at anytime can be provided with ubiquitous terminals and networks. However, no complete solutions exist that allow different communities, each with their own models, rules, procedures, interests and content formats, to interact efficiently using this complex infrastructure. Examples of these communities are the content, financial, communication, computer and consumer electronics sectors and their customers. Developing a common multimedia framework will facilitate co-operation between these sectors and support a more efficient implementation and integration of the different models, rules, procedures, interests and content formats. This will enable an enhanced user experience.
The MPEG-21 multimedia framework will identify and define the key elements needed to support the multimedia delivery chain as described above, the relationships between and the operations supported by them. Within the parts of MPEG-21, MPEG will elaborate the elements by defining the syntax and semantics of their characteristics, such as interfaces to the elements. MPEG-21 will also address the necessary framework functionality, such as the protocols associated with the interfaces, and mechanisms to provide a repository, composition, conformance, etc.
The seven key elements defined in MPEG-21 are:
1. Digital Item Declaration (a uniform and flexible abstraction and interoperable schema for declaring Digital Items);
2. Digital Item Identification and Description (a framework for identification and description of any entity regardless of its nature, type or granularity);
3. Content Handling and Usage (interfaces and protocols that enable creation, manipulation, search, access, storage, delivery, and (re)use of content across the content distribution and consumption value chain);
4. Intellectual Property Management and Protection (the means to enable content to be persistently and reliably managed and protected across a wide range of networks and devices);
5. Terminals and Networks (the ability to provide interoperable and transparent access to content across networks and terminals);
6. Content Representation (how the media resources are represented);
7. Event Reporting (the metrics and interfaces that enable Users to understand precisely the performance of all reportable events within the framework).
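The first element, the Digital Item Declaration, can be made concrete with a small sketch. MPEG-21's declaration language (DIDL) groups resources and their descriptors into Items; the fragment below follows that general shape but omits namespaces and schema details, so the element names and attributes should be treated as illustrative rather than normative.

```python
# Sketch of a Digital Item Declaration in the DIDL style: an Item grouping
# a textual descriptor and a media resource. Names are illustrative.
import xml.etree.ElementTree as ET

didl = ET.Element("DIDL")
item = ET.SubElement(didl, "Item")

# A descriptor carrying a human-readable statement about the Item.
descriptor = ET.SubElement(item, "Descriptor")
ET.SubElement(descriptor, "Statement",
              mimeType="text/plain").text = "Music video, 3 minutes"

# A component pointing at the actual media resource.
component = ET.SubElement(item, "Component")
ET.SubElement(component, "Resource", mimeType="video/mpeg", ref="clip.mpg")

print(ET.tostring(didl, encoding="unicode"))
```

The other six elements (identification, rights management, terminal adaptation, and so on) would then attach to or operate on such a declared Digital Item.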