MPEG-4 visual overview


Motivation
 Digital video is replacing analog video in many existing applications. A prime example is digital television, which is starting to see wide deployment. Another is the progressive replacement of analog video cassettes by DVD as the preferred medium for watching movies. MPEG-2 has been one of the key technologies that enabled the acceptance of these new media. In these existing applications, digital video initially provides much the same functionality as analog video: the content is represented in digital rather than analog form, with direct benefits such as improved quality and reliability, but it remains the same to the user. Once the content is in the digital domain, however, new functionalities can easily be added that allow the user to view, access, and manipulate it in completely new ways. The MPEG-4 standard provides key technologies that enable such functionalities.

Features and functionalities
 The MPEG-4 visual standard consists of a set of tools that enable applications by supporting several classes of functionalities. The most important features covered by the MPEG-4 standard can be clustered into three categories (see Fig. 1), summarized as follows:
 1) Compression efficiency: Compression efficiency was the leading principle for MPEG-1 and MPEG-2, and in itself enabled applications such as digital TV and DVD. Improved coding efficiency and coding of multiple concurrent data streams will increase acceptance of applications based on the MPEG-4 standard.
 2) Content-based interactivity: Coding and representing video objects rather than video frames enables content-based applications. It is one of the most important novelties offered by MPEG-4. Building on an efficient representation of objects, object manipulation, bitstream editing, and object-based scalability allow new levels of content interactivity.
 3) Universal access: Robustness in error-prone environments allows MPEG-4 encoded content to be accessible over a wide range of media, such as mobile networks as well as wired connections. In addition, object-based temporal and spatial scalability allow the user to decide where to spend scarce resources, which can be the available bandwidth, but also the computing capacity or power consumption.

Figure 1: Functionalities offered by the MPEG-4 visual standard
 To support some of these functionalities, MPEG-4 provides the capability to represent arbitrarily shaped video objects. Each object can be encoded with different parameters and at a different quality. The shape of a video object is represented by a binary or a gray-level (alpha) plane, and the texture is coded separately from the shape. For low-bitrate applications, frame-based coding of texture can be used, as in MPEG-1 and MPEG-2. To increase robustness to errors, special provisions are made at the bitstream level to allow fast resynchronization and efficient error recovery.
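 To make the role of the shape plane concrete, here is a minimal sketch of how a decoded object could be placed onto a background using its alpha plane. It is an illustration only, not the composition procedure defined by the standard: the function name `composite`, the 0..255 alpha range, the single-channel pixels, and the rounding are all assumptions made for this example.

```python
def composite(background, texture, alpha, top, left):
    """Blend an object's texture onto a background using its alpha plane.

    Assumed convention (not taken from the standard): alpha is 0..255,
    where 0 means "outside the object" and 255 means "fully opaque".
    A binary shape is the special case where alpha is only 0 or 255.
    """
    out = [row[:] for row in background]  # leave the input untouched
    for y, alpha_row in enumerate(alpha):
        for x, a in enumerate(alpha_row):
            by, bx = top + y, left + x
            # Skip object pixels that fall outside the background.
            if 0 <= by < len(out) and 0 <= bx < len(out[by]):
                blended = a * texture[y][x] + (255 - a) * out[by][bx]
                out[by][bx] = (blended + 127) // 255  # rounded division
    return out


# A 3x4 gray background and a 2x2 object whose top-right pixel is
# outside its (binary) shape.
background = [[10] * 4 for _ in range(3)]
texture = [[200, 200], [200, 200]]
binary_alpha = [[255, 0], [255, 255]]
binary_result = composite(background, texture, binary_alpha, top=1, left=1)

# The same object with a gray-level plane: a half-transparent pixel.
gray_alpha = [[128, 0], [255, 255]]
gray_result = composite(background, texture, gray_alpha, top=1, left=1)
```

 Because texture and shape are separate inputs here, the same texture can be reused with a different plane, which mirrors the separation of shape and texture coding described above.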

Structure and syntax
 The central concept defined by the MPEG-4 standard is the audio-visual object, which forms the foundation of the object-based representation. Such a representation is well suited for interactive applications and gives direct access to the scene contents.
 An MPEG-4 visual scene may consist of one or more video objects. Each video object is characterized by temporal and spatial information in the form of shape, motion, and texture. For certain applications video objects may not be desirable, because of either the associated overhead or the difficulty of generating them. For those applications, MPEG-4 video allows coding of rectangular frames, which represent a degenerate case of an arbitrarily shaped object.

 An MPEG-4 visual bitstream provides a hierarchical description of a visual scene as shown in Fig. 2. Each level of the hierarchy can be accessed in the bitstream by special code values called start codes. The hierarchical levels that describe the scene most directly are:
1. Visual Object Sequence (VS): The complete MPEG-4 scene, which may contain any 2-D or 3-D natural or synthetic objects and their enhancement layers.
2. Video Object (VO): A video object corresponds to a particular (2-D) object in the scene. In the simplest case this can be a rectangular frame, or it can be an arbitrarily shaped object corresponding to an object or the background of the scene.
3. Video Object Layer (VOL): Each video object can be encoded in scalable (multi-layer) or non-scalable (single-layer) form, depending on the application; each layer is represented by a video object layer (VOL). A video object can be encoded using spatial or temporal scalability, going from coarse to fine resolution. Depending on parameters such as available bandwidth, computational power, and user preferences, the desired resolution can be made available to the decoder.
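 As a rough illustration of how start codes give access to these levels, the sketch below scans a byte buffer for the three-byte prefix 0x000001 and classifies the byte that follows it. The code values in the table are quoted from memory of the MPEG-4 Part 2 start code table and should be checked against the standard before being relied on; the names `classify` and `find_start_codes` and the sample buffer are made up for this example, and the filler bytes between start codes are not valid header data.

```python
START_CODE_PREFIX = b"\x00\x00\x01"

# Assumed start code values (verify against the standard).
NAMED_CODES = {
    0xB0: "visual_object_sequence",
    0xB1: "visual_object_sequence_end",
    0xB3: "group_of_vop",
    0xB5: "visual_object",
    0xB6: "vop",
}


def classify(code):
    """Map the byte after the prefix to a hierarchy level name."""
    if 0x00 <= code <= 0x1F:
        return "video_object"
    if 0x20 <= code <= 0x2F:
        return "video_object_layer"
    return NAMED_CODES.get(code, "other")


def find_start_codes(data):
    """Return (offset, level) for every start code found in data."""
    found = []
    pos = data.find(START_CODE_PREFIX)
    while pos != -1 and pos + 3 < len(data):
        found.append((pos, classify(data[pos + 3])))
        pos = data.find(START_CODE_PREFIX, pos + 3)
    return found


# Hand-built buffer: VS, visual object, VO, VOL, then one VOP.
sample = bytes([
    0, 0, 1, 0xB0, 0x01,
    0, 0, 1, 0xB5, 0x09,
    0, 0, 1, 0x00,
    0, 0, 1, 0x20, 0xAA,
    0, 0, 1, 0xB6, 0x55,
])
levels = find_start_codes(sample)
for offset, level in levels:
    print(offset, level)
```

 A real parser would go on to read the header fields that follow each start code; this sketch only locates the entry points into the hierarchy.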
 There are two types of video object layers, the video object layer that provides full MPEG-4 functionality, and a reduced functionality video object layer, the video object layer with short headers. The latter provides bitstream compatibility with base-line

Figure 3: Example of VOP based decoding in MPEG-4

Application of MPEG-4