MPEG-4 NATURAL VIDEO CODING

Introduction
The MPEG-4 Visual standard allows the hybrid coding of natural (pixel based) images and video together with synthetic (computer generated) scenes. This enables, for example, the virtual presence of videoconferencing participants. To this end, the Visual standard comprises tools and algorithms supporting the coding of natural (pixel based) still images and video sequences as well as tools to support the compression of synthetic 2-D and 3-D graphic geometry parameters (i.e. compression of wire grid parameters, synthetic text).
The subsections below give an itemized overview of the functionalities provided by the tools and algorithms of the MPEG-4 Visual standard.

Formats Supported
The following formats and bitrates are supported by MPEG-4 Visual:
Bitrates: typically between 5 kbit/s and more than 1 Gbit/s
Formats: progressive as well as interlaced video
Resolutions: typically from sub-QCIF up to 'Studio' resolutions (4k x 4k pixels)


Compression Efficiency
The coding algorithms are efficient across the full range of bit rates addressed. This includes compact coding of textures, with quality adjustable from "acceptable" at very high compression ratios up to "near lossless".
Efficient compression of textures for texture mapping on 2-D and 3-D meshes.
Random access of video to allow functionalities such as pause, fast forward and fast reverse of stored video.
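Random access of this kind is usually built on intra-coded frames: the decoder seeks to the nearest preceding intra frame (which needs no references) and decodes forward to the requested picture. The minimal sketch below illustrates the idea only; the frame-type list and the seek helper are hypothetical and ignore decode/display reordering.

    # Toy illustration of random access: jump to the nearest preceding
    # intra-coded frame and decode forward to the target frame.
    # The frame-type pattern below is hypothetical example data.
    FRAME_TYPES = ["I", "B", "B", "P", "B", "B", "P", "I", "B", "P"]

    def seek(target_index, frame_types):
        """Return the frame indices that must be decoded to display target_index."""
        # The last intra frame at or before the target is a safe entry point.
        entry = max(i for i in range(target_index + 1) if frame_types[i] == "I")
        # Everything from the entry point to the target is decoded in order
        # (a real decoder would also handle decode-order vs display-order).
        return list(range(entry, target_index + 1))

    print(seek(5, FRAME_TYPES))   # -> [0, 1, 2, 3, 4, 5]
    print(seek(8, FRAME_TYPES))   # -> [7, 8]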

Content-Based Functionalities
Content-based coding of images and video allows separate decoding and reconstruction of arbitrarily shaped video objects.
Random access of content in video sequences allows functionalities such as pause, fast forward and fast reverse of stored video objects.
Extended manipulation of content in video sequences allows functionalities such as warping of synthetic or natural text, textures, image and video overlays on reconstructed video content. An example is the mapping of text in front of a moving video object where the text moves coherently with the object.
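To make the text-overlay example concrete, the sketch below pastes a text bitmap at an offset that follows a video object's per-frame position, so the label moves coherently with the object. The frames, object positions and bitmaps are hypothetical test data; in MPEG-4 itself such composition is specified at the systems level rather than by code like this.

    import numpy as np

    # Hypothetical 8-bit luminance frames and a small text bitmap (assumed data).
    H, W = 144, 176                              # QCIF-sized background
    text = np.full((16, 48), 255, np.uint8)      # stand-in for a rendered text label

    def overlay_on_object(frame, obj_pos, text_bitmap, offset=(-20, 0)):
        """Paste text_bitmap so that it tracks the object's (y, x) position."""
        y = obj_pos[0] + offset[0]
        x = obj_pos[1] + offset[1]
        h, w = text_bitmap.shape
        out = frame.copy()
        out[y:y + h, x:x + w] = text_bitmap      # opaque overlay for simplicity
        return out

    # The object moves 4 pixels to the right per frame (hypothetical motion data).
    frames = [np.zeros((H, W), np.uint8) for _ in range(5)]
    positions = [(60, 40 + 4 * t) for t in range(5)]
    composited = [overlay_on_object(f, p, text) for f, p in zip(frames, positions)]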



Scalability of Textures, Images and Video
Complexity scalability in the encoder allows encoders of different complexity to generate valid and meaningful bitstreams for a given texture, image or video.
Complexity scalability in the decoder allows a given texture, image or video bitstream to be decoded by decoders of different levels of complexity. The reconstructed quality, in general, is related to the complexity of the decoder used. This may entail that less powerful decoders decode only a part of the bitstream. 
Spatial scalability allows decoders to decode a subset of the total bitstream generated by the encoder to reconstruct and display textures, images and video objects at reduced spatial resolution. A maximum of 11 levels of spatial scalability are supported in so-called 'fine-granularity scalability', for video as well as textures and still images.
Temporal scalability allows decoders to decode a subset of the total bitstream generated by the encoder to reconstruct and display video at reduced temporal resolution. A maximum of three levels are supported.
Quality scalability allows a bitstream to be parsed into a number of bitstream layers of different bitrate such that the combination of a subset of the layers can still be decoded into a meaningful signal. The bitstream parsing can occur either during transmission or in the decoder. The reconstructed quality, in general, is related to the number of layers used for decoding and reconstruction.
Fine granularity scalability combines the above forms of scalability in fine-grained steps, with up to 11 steps supported.
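The common idea behind all of these scalability modes is a layered bitstream: each coded unit belongs to a layer, and a decoder (or a network element parsing the stream) keeps only the layers it can afford. The sketch below is a minimal illustration of that idea; the (layer, payload) tuples and layer numbering are hypothetical stand-ins, not MPEG-4 syntax.

    # Illustrative layered-bitstream filtering: keep the base layer plus as many
    # enhancement layers as the decoder or channel can handle.
    stream = [
        (0, "base layer: 10 fps, QCIF"),
        (1, "temporal enhancement: +10 fps"),
        (2, "spatial enhancement: CIF"),
        (3, "quality (FGS) enhancement: extra bitplanes"),
    ]

    def extract_layers(stream, max_layer):
        """Drop every coded unit belonging to a layer above max_layer."""
        return [payload for layer, payload in stream if layer <= max_layer]

    print(extract_layers(stream, 0))   # base layer only: low frame rate, low resolution
    print(extract_layers(stream, 2))   # base + temporal + spatial enhancement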

Shape and Alpha Channel Coding
Shape coding assists the description and composition of conventional images and video as well as arbitrarily shaped video objects. Applications that benefit from binary shape maps with images include content-based image representation for image databases, interactive games, surveillance, and animation. MPEG-4 provides an efficient technique to code binary shapes: a binary alpha map defines, for each pixel, whether or not it belongs to the object ('on' or 'off').
‘Gray scale’ or ‘alpha’ shape coding goes further: an alpha plane defines the ‘transparency’ of an object, which is not necessarily uniform and can vary over the object so that, for example, edges are more transparent (a technique called feathering). Multilevel alpha maps are frequently used to blend different layers of image sequences.
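To make the difference between binary and gray-scale alpha concrete, the sketch below composites an object onto a background, first with an on/off mask and then with a feathered (multi-level) alpha plane. The arrays are hypothetical test data and the blend is the usual alpha-composition formula, not syntax from the standard.

    import numpy as np

    # Hypothetical 8-bit object, background and alpha planes of the same size.
    obj = np.full((64, 64), 200, np.uint8)   # object pixels
    bg  = np.full((64, 64), 50, np.uint8)    # background pixels

    # Binary alpha: each pixel is either fully inside (255) or outside (0) the object.
    binary_alpha = np.zeros((64, 64), np.uint8)
    binary_alpha[16:48, 16:48] = 255

    # Gray-scale alpha with 'feathering': transparency ramps off near the edges.
    gray_alpha = binary_alpha.astype(np.float32)
    for _ in range(4):                       # crude feather: repeated box filtering
        gray_alpha = (np.roll(gray_alpha, 1, 0) + np.roll(gray_alpha, -1, 0) +
                      np.roll(gray_alpha, 1, 1) + np.roll(gray_alpha, -1, 1) +
                      gray_alpha) / 5.0

    def composite(obj, bg, alpha):
        """Alpha blend: out = alpha*obj + (1 - alpha)*bg, with alpha scaled to [0, 1]."""
        a = alpha.astype(np.float32) / 255.0
        return (a * obj + (1.0 - a) * bg).astype(np.uint8)

    hard_edges = composite(obj, bg, binary_alpha)   # sharp object boundary
    soft_edges = composite(obj, bg, gray_alpha)     # feathered, smoother boundary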

Robustness in Error Prone Environments
Error resilience allows images and video to be accessed over a wide range of storage and transmission media. This includes the useful operation of image and video compression algorithms in error-prone environments at low bit rates (i.e., less than 64 kbit/s). Tools are provided that address both the band-limited nature and the error resilience aspects of access over wireless networks.
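One of these tools is resynchronization: markers inserted into the bitstream delimit independently decodable packets, so a decoder can discard corrupted data and restart cleanly at the next marker. The sketch below illustrates only this general mechanism; the byte-aligned marker value and the toy packet decoder are hypothetical, not the real variable-length MPEG-4 resync marker or decoding process.

    # Illustrative resynchronization: split the stream at (hypothetical) resync
    # markers and drop any packet that fails to decode, instead of losing
    # everything after the first error.
    MARKER = b"\x00\x00\x01"                 # hypothetical packet delimiter

    def split_into_packets(bitstream):
        """Split the stream at resync markers; each chunk decodes independently."""
        return [p for p in bitstream.split(MARKER) if p]

    def decode_with_resync(bitstream, decode_packet):
        """Decode packet by packet, skipping packets whose decode fails."""
        decoded = []
        for packet in split_into_packets(bitstream):
            try:
                decoded.append(decode_packet(packet))
            except ValueError:               # stand-in for any bitstream error
                continue                     # error concealment would happen here
        return decoded

    # Usage with a toy 'decoder' that rejects packets containing a corrupted byte.
    toy = MARKER + b"good-1" + MARKER + b"bad\xff" + MARKER + b"good-2"
    print(decode_with_resync(toy, lambda p: p.decode("ascii")))   # ['good-1', 'good-2']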


Face and Body Animation
The ‘Face and Body Animation’ tools in the standard allow sending parameters that can define, calibrate and animate synthetic faces and bodies. The models themselves are not standardized by MPEG-4; only the parameters are, although there is a way to send, for example, a fully defined face to a decoder.
The tools include the following (a small illustrative sketch follows the list):
Definition and coding of face and body animation parameters (model independent):
Feature point positions and orientations to animate the face and body definition meshes
Visemes, or visual lip configurations equivalent to speech phonemes
Definition and coding of face and body definition parameters (for model calibration):
3-D feature point positions
3-D head calibration meshes for animation
Personal characteristics
Facial texture coding
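The key point of model independence is that the animation parameters describe displacements of standardized feature points in units derived from the face itself, so the same parameter stream animates differently proportioned models. The sketch below only loosely mirrors this idea; the feature names, the unit and the displacement values are hypothetical and do not reproduce the actual FAP/FAPU tables.

    # Toy model-independent face animation: the decoder owns a face model
    # (feature point positions); received parameters give displacements for
    # named feature points, expressed relative to a model-specific unit.
    # All names and values below are hypothetical illustration data.
    face_model = {
        "left_mouth_corner":  [-20.0, -40.0, 5.0],
        "right_mouth_corner": [ 20.0, -40.0, 5.0],
        "chin_bottom":        [  0.0, -70.0, 0.0],
    }
    mouth_width = 40.0                      # model-specific unit (akin to a FAPU)

    def animate(model, unit, params):
        """Apply per-feature displacements given as fractions of the model unit."""
        out = {}
        for name, pos in model.items():
            dx, dy, dz = params.get(name, (0.0, 0.0, 0.0))
            out[name] = [pos[0] + dx * unit, pos[1] + dy * unit, pos[2] + dz * unit]
        return out

    # A 'smile'-like frame: mouth corners move outwards and up (hypothetical values).
    frame = {"left_mouth_corner": (-0.1, 0.05, 0.0),
             "right_mouth_corner": (0.1, 0.05, 0.0)}
    print(animate(face_model, mouth_width, frame))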

Coding of 2-D Meshes with Implicit Structure
2-D mesh coding includes the following tools (a small texture-warping sketch follows the list):
Mesh-based prediction and animated texture transfiguration
2-D Delaunay or regular mesh formalism with motion tracking of animated objects
Motion prediction and suspended texture transmission with dynamic meshes.
Geometry compression for motion vectors:
2-D mesh compression with implicit structure & decoder reconstruction
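The mesh-based animation idea above boils down to tracking the mesh nodes from a reference frame to the current frame and warping the reference texture of each triangle by the affine transform implied by its three vertex correspondences. The sketch below computes those per-triangle transforms; the node coordinates and triangle list are hypothetical test data, not bitstream syntax.

    import numpy as np

    # Hypothetical 2-D mesh: node positions in the reference and current frames,
    # plus an implicit regular triangulation.
    ref_nodes = np.array([[0.0, 0.0], [16.0, 0.0], [0.0, 16.0], [16.0, 16.0]])
    cur_nodes = np.array([[1.0, 0.5], [17.0, 1.0], [0.5, 17.0], [18.0, 18.0]])
    triangles = [(0, 1, 2), (1, 3, 2)]

    def triangle_affine(ref_tri, cur_tri):
        """Affine (A, b) with cur = A @ ref + b for the three vertex correspondences."""
        M = np.hstack([ref_tri, np.ones((3, 1))])        # rows: [x, y, 1]
        coeff_u = np.linalg.solve(M, cur_tri[:, 0])      # a11, a12, b1
        coeff_v = np.linalg.solve(M, cur_tri[:, 1])      # a21, a22, b2
        A = np.array([[coeff_u[0], coeff_u[1]], [coeff_v[0], coeff_v[1]]])
        b = np.array([coeff_u[2], coeff_v[2]])
        return A, b

    for tri in triangles:
        A, b = triangle_affine(ref_nodes[list(tri)], cur_nodes[list(tri)])
        # A texture mapper would warp this triangle by sampling the reference
        # texture at the inverse-mapped location of each current-frame pixel.
        print(tri, A, b)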

Coding of 3-D Polygonal Meshes
MPEG-4 provides a suite of tools for coding 3-D polygonal meshes. Polygonal meshes are widely used as a generic representation of 3-D objects. The underlying technologies compress the connectivity, geometry, and properties such as shading normals, colors and texture coordinates of 3-D polygonal meshes.
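As a rough illustration of geometry compression, the sketch below quantizes vertex coordinates to a fixed bit depth and codes each vertex as a small delta against the previous one; the residuals are then cheap to entropy-code. This is only the general idea under simplified assumptions, not the actual MPEG-4 3-D mesh coding algorithm, which uses connectivity-driven prediction and more elaborate entropy coding.

    # Illustrative geometry compression for a vertex list: quantize coordinates,
    # then delta-code each vertex against its predecessor.
    def quantize(vertices, bits, lo, hi):
        """Map float coordinates in [lo, hi] to integers of the given bit depth."""
        levels = (1 << bits) - 1
        scale = levels / (hi - lo)
        return [[round((c - lo) * scale) for c in v] for v in vertices]

    def delta_encode(quantized):
        """Replace each vertex by its difference from the previous vertex."""
        out, prev = [], [0, 0, 0]
        for v in quantized:
            out.append([c - p for c, p in zip(v, prev)])
            prev = v
        return out

    verts = [[0.10, 0.20, 0.05], [0.11, 0.21, 0.05], [0.12, 0.19, 0.06]]  # toy data
    deltas = delta_encode(quantize(verts, bits=10, lo=0.0, hi=1.0))
    print(deltas)   # small residuals, cheap to entropy-code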

The last post related to this one covered MPEG-4 Audio.