Any video is a sequence of static frames replacing one another at a given rate. The frame rates established in the video industry include 23.976, 24, 25, and 29.97 frames per second.
For example, films are traditionally shot at 24 frames per second, while 25 fps and 29.97 fps are the broadcast standards in Europe and the USA, respectively. In what follows, we will call this sequence of frames the video stream. In addition to the frame rate, any video file has the following characteristics:
- Pixel format - any image, whether static or moving, consists of pixels, each of which carries information about one specific color. The perception of color, in the digital world as in the real one, is a very subjective thing, so film industry engineers put great effort into making photos and videos shot on digital or film cameras look as close to reality as possible. The RGB24 scheme (Red, Green, Blue) was adopted because varying the values of each of these channels allows the color tinting of an image or video to be controlled finely and in detail, while still capturing almost the entire range of visible shades. The figure is not hard to calculate: each of the three colors is encoded with eight bits, so the camera can capture and transmit 256 gradations per channel, or about 16.7 million possible shades in total (see the sketch after this list).
- Frame format is the next characteristic of any video file. It is usually written as the number of lines (frame height) followed by the first letter of the scan type. There are two scan types: progressive (“p”) and interlaced (“i”). Decoding a frame format such as 1080p is therefore no mystery: 1080 is the frame height, and “p” stands for progressive scanning.
- Frame size - these numbers will probably look familiar (640x480, 720x480, 720x576, 1280x720, 1920x1080). Why? Because each of them appears in a different video standard. For example, 720x576 is used on DVDs, now largely forgotten, while 1280x720 and 1920x1080 often feature as digital web formats. The first number is the width of the frame and the second is its height; a video with a lower resolution (frame size) carries correspondingly less information and data.
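As promised above, here is a quick sanity check of those figures: a minimal Python sketch (the variable names are my own, chosen for illustration) that computes the RGB24 shade count and the raw size of one uncompressed frame at each of the familiar frame sizes.

```python
BITS_PER_CHANNEL = 8
CHANNELS = 3  # R, G, B

# 256 gradations per channel, 256^3 shades in total.
shades = (2 ** BITS_PER_CHANNEL) ** CHANNELS
print(f"RGB24 shades: {shades:,}")  # -> 16,777,216, i.e. ~16.7 million

# Raw (uncompressed) size of a single RGB24 frame at common frame sizes.
for width, height in [(640, 480), (720, 480), (720, 576), (1280, 720), (1920, 1080)]:
    frame_bytes = width * height * CHANNELS  # 24 bits = 3 bytes per pixel
    print(f"{width}x{height}: {frame_bytes / 2**20:.1f} MiB per frame")
```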
All of the above concerns the fixed (permanent) properties of a video file, but there are also variable properties, such as duration, data volume, dependencies between streams, and the presence of other additional information.
A file may, for instance, contain one video stream with the picture for the left eye and another for the right eye; together, these two streams carry stereoscopic video, or simply 3D.
Still, why compress video at all? The mundane answer: to save storage space. Consider how much space a 90-minute movie would take at a frame size of 1920x1080 and 24 frames per second. The figure may not surprise everyone, but such an uncompressed file would weigh in at about 750 GB. If you run an online movie theater, you are unlikely to have the money to host a huge library of thousands of bestsellers in uncompressed form, and the access speed of such files would clearly not please the audience.
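The 750 GB figure is easy to verify; here is the arithmetic as a minimal Python sketch (variable names are mine):

```python
# Raw size of a 90-minute movie: 1920x1080, 24 fps, RGB24 (3 bytes per pixel).
width, height = 1920, 1080
bytes_per_pixel = 3            # 24 bits per pixel
fps = 24
duration_s = 90 * 60           # 90 minutes in seconds

frame_bytes = width * height * bytes_per_pixel     # ~6.2 MB per frame
total_bytes = frame_bytes * fps * duration_s

print(f"{total_bytes / 2**30:.0f} GiB")            # -> 751 GiB, about 750 GB
```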
Meanwhile, a Blu-ray of comparable quality occupies a mere 20 GB or so. Even stronger compression, cutting the size a hundredfold or more, is unlikely to dampen the viewer's enthusiasm for what they see. Enough of the lyricism, though; let's get to the point. What exactly gets saved in the process of video compression?
First of all, the codec deals with color coding. Television, films, and video content in general used to be black and white, but today everything has become more cheerful and in color. How, then, can an old-fashioned black-and-white television set display modern TV channels?
It turns out that any picture in a digital or analog signal is encoded in the YUV format, which splits the signal into a brightness map and color information. The Y component carries luminance, while U and V are the color components; in the BT.601 standard, all three are calculated from the RGB channels with these formulas:

Y = 0.299 * R + 0.587 * G + 0.114 * B

U = -0.14713 * R - 0.28886 * G + 0.436 * B

V = 0.615 * R - 0.51499 * G - 0.10001 * B
Thus the image is divided into three notional layers: one luminance plane (Y) and two color planes (U and V). The encoder then halves the color planes in both width and height while visually preserving most of the information, because human eyesight turns out to be far less sensitive to color than it is to brightness.
As a result, the average size of each pixel is halved, from the usual 24 bits down to 12. Since, as a rule, the color of an object changes little between neighboring pixels, this chroma thinning poses little danger of serious artifacts. JPEG works the same way.
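Here is a minimal sketch of this 4:2:0-style chroma subsampling in Python with NumPy (the conversion matrix follows the BT.601 formulas above; the function names are my own). It shows how halving the U and V planes brings the average cost per pixel from 24 bits down to 12:

```python
import numpy as np

def rgb_to_yuv(rgb: np.ndarray) -> np.ndarray:
    """Convert an (H, W, 3) float RGB image to YUV using the BT.601 formulas."""
    m = np.array([
        [ 0.299,    0.587,    0.114  ],   # Y
        [-0.14713, -0.28886,  0.436  ],   # U
        [ 0.615,   -0.51499, -0.10001],   # V
    ])
    return rgb @ m.T

def subsample_420(yuv: np.ndarray):
    """Keep Y at full resolution; average U and V over each 2x2 pixel block."""
    h, w = yuv.shape[0], yuv.shape[1]
    y = yuv[:, :, 0]
    u = yuv[:, :, 1].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
    v = yuv[:, :, 2].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
    return y, u, v

rgb = np.random.rand(1080, 1920, 3)           # stand-in for a real frame
y, u, v = subsample_420(rgb_to_yuv(rgb))

full = rgb.size                               # 3 samples per pixel -> 24 bits
packed = y.size + u.size + v.size             # 1 + 1/4 + 1/4      -> 12 bits
print(f"{24 * packed / full:.0f} bits per pixel")  # -> 12
```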
Next comes the inter-frame difference. Many have probably noticed, while stepping through a video frame by frame, that frames change smoothly and some objects in the background stay static. Of course, this is not always the case: when the scene changes, everything changes abruptly.
Here the computer faces a new task: evaluating an enormous number of possible image transformations. The motion compensation algorithm comes to its aid. The Wikipedia article on motion compensation covers the process in more detail, but I will explain some of its basic concepts here.
First, each frame is divided into equal blocks, and for every block a search is run for a similar block in another frame; from the matches, the computer builds a field of motion vectors. During compensation itself, those motion vectors are applied to generate an image approximating the original (reference) frame.
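A minimal sketch of such block matching, assuming grayscale frames stored as NumPy arrays (the block size, search radius, and SAD cost are my illustrative choices, not the exact search strategy of any particular codec):

```python
import numpy as np

def motion_vectors(ref: np.ndarray, cur: np.ndarray, block: int = 16, radius: int = 8):
    """For each block of `cur`, find the best-matching block in `ref`.

    Exhaustive search within +/- `radius` pixels, scored by the sum of
    absolute differences (SAD). Returns a (dy, dx) vector per block.
    """
    h, w = cur.shape
    vectors = {}
    for by in range(0, h - block + 1, block):
        for bx in range(0, w - block + 1, block):
            target = cur[by:by + block, bx:bx + block].astype(int)
            best, best_cost = (0, 0), float("inf")
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    y, x = by + dy, bx + dx
                    if 0 <= y <= h - block and 0 <= x <= w - block:
                        cand = ref[y:y + block, x:x + block].astype(int)
                        cost = np.abs(target - cand).sum()
                        if cost < best_cost:
                            best, best_cost = (dy, dx), cost
            vectors[(by, bx)] = best
    return vectors
```

A real encoder avoids this exhaustive scan with fast search patterns and refines matches to sub-pixel precision, but the principle is the same.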
In the pictures, the inter-frame difference is clearly visible, while the differences between the original frame and the compensated one are much less noticeable. In terms of data volume, storing the motion vectors costs practically nothing compared with the image itself, and they can be saved after the algorithm has already compressed the image using the inter-frame difference method. As you can see, though, a pronounced blocking effect appears in the second picture; this happens when the algorithm compresses the image too aggressively. As a real-world example, the codecs used for TV broadcasts often fail to cope with the sharp brightness changes caused by camera flashes at live events.
And so, after all these compression steps, the resulting coefficients still contain statistical redundancy. To squeeze it out, the H.264 codec, for example, offers two entropy coders: CABAC, which performs lossless context-adaptive binary arithmetic coding, and CAVLC, which uses context-adaptive variable-length (Huffman-style) codes.
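These are not the actual CAVLC tables, but the underlying idea of entropy coding is easy to show: frequent symbols get short codes. A toy Huffman sketch in Python (the sample coefficients are made up for illustration):

```python
import heapq
from collections import Counter

def huffman_codes(data: bytes) -> dict:
    """Build a Huffman table: frequent symbols receive shorter bit strings."""
    heap = [(freq, i, {sym: ""}) for i, (sym, freq) in enumerate(Counter(data).items())]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)
        f2, i, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (f1 + f2, i, merged))
    return heap[0][2]

# Quantized coefficients are dominated by zeros, which is exactly what
# makes entropy coding so effective on them.
coeffs = bytes([0] * 80 + [1] * 12 + [2] * 5 + [7] * 3)
codes = huffman_codes(coeffs)
compressed_bits = sum(len(codes[b]) for b in coeffs)
print(f"{compressed_bits} bits vs {8 * len(coeffs)} bits raw")  # 128 vs 800
```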
Of course, the processes behind video compression are far more complicated than this, and you can find deeper technical detail on the topic online. What I have shared here is a basic description of the processes involved in video encoding.
Thank you for your attention!