资源说明:Using quad trees for video and image compression.
QTC (c) 50m30n3 2011, 2012 http://d00m.org/~someone/qtc/ This software is licensed under the GNU GPL V3 see LICENSE for details. OVERVIEW The QTC codec is a lossless video and still image codec. It uses quad trees to quickly find and encode constant and unchanged areas in image sequences or single images. QTC is designed for "pixel graphics". Its main goal is to lossless encode screen captures at high frame rates. It is not suited for encoding photos, film, or captures of computer games, although it is certainly capable of doing so. QTC uses full 24bit RGB color internally to avoid conversion artifacts. Encoding and decoding should be bit exact under all circumstances. There are currently 3 file formats associated with this codec: QTI - Still image container QTV - Video container QTW - Video container for Web usage QTW files are split into multiple blocks to make streaming using JavaScript and HTTP easier. Both video file containers support indexing and seeking. Adapting the codec for other container formats should be trivial and is left as an exercise for the reader. The supplied programs implement a full featured QTC decoder and encoder for video ans still images in plain C code. The only exception is the quad tree compressor itself which uses local functions. This is not supported by all compilers. This reference implementation is not tweaked for speed or minimal file size. All parameters are set on a per file basis, but the format is theoretically capable of varying parameters dynamically per frame. A lot of algorithmic and micro optimizations are omitted for the sake of clarity. Programs: qtienc - Still image encoder qtidec - Still image decoder qtvenc - Video encoder qtvdec - Video decoder qtvcap - X11 screen capture program qtvplay - Video player Aside from the screen capture program, which is build for the X Window System, and the video player, which uses SDL, the programs rely on no external dependencies. ALGORITHM The QTC algorithm can be split into three parts. The first part is image preprocessing, the second part is the quad tree compression itself, and the last part is a range coder based entropy encoder. In the fist step the image is preprocessed using lossless image and color transformations. These are tweakable on a per frame basis. The image can be processed using either a full Paeth transform, as used in the PNG file format, or a simplified Paeth transform that only uses a single predictor for increased speed. The color information can be transformed into a simplified YUV color space called fakeyuv. This is done to avoid rounding errors that appear when using the standard YUV color space. The fakeyuv color space encodes the green channel as Y component, the difference between red and green as U and the difference between red and blue as V. In memory the colors are stored as UYV to increase computational efficiency. This step does not reduce the image size and only serves to make the image more compressible in the next steps. The quad tree compressor itself uses recursive subdivision of the input image to find areas of constant color or areas that did not change in respect to the reference image. When a reference image is available the encoder first checks the current block against the same block in the reference image. If there are no changes subdivision stops and the next block gets encoded. When there is no reference image, or a change was detected in the last step, the encoder checks if the current block is of a single color. If the block is of a single color, subdivision stops and the color of the block is written to the output stream, otherwise the block is subdivided and the process is repeated. Should a block at any time get smaller than a certain threshold it gets saved as "literal" block, where the image contents of the block are written to the output stream as they are. At this point an optional caching mechanism can be used to reduce the number of literal blocks written to the file. This cache allows to recognize recently used blocks and reference them using an id. The structure of the subdivision tree built during compression is saved as a separate command data bit stream so the structure can be replicated during decompression. Normally the compressor works on a per pixel basis, but a special mode for the aforementioned fakeyuv color model exists that encodes the Y component separately. This is especially useful for gray scale images. The last step of the encoder is a range coder based entropy encoder. In the reference encoder this step is integrated into the container format. The color and index data is encoded 8 bit at a time using a second order Markov chain model. The command data is encoded one bit at a time using an eight order Markov chain model. USAGE All programs read and write binary ppm files for uncompressed files. The video en/decoders read/write image sequences in the form of img001.ppm, img002.ppm, img003.ppm.... Simply pass the name of the first file. Concatenated image sequences can also be read/written using stdio. qtienc: -h - Print help -t [0..2] - Use image transforms (0) -e - Compress output data -y [0..2] - Use fakeyuv transform (0) -v - Be verbose -s [1..] - Minimal block size (2) -d [0..] - Maximum recursion depth (16) -c [0..] - Cache size in kilo tiles (0) -l [0..] - Laziness -i filename - Input file (-) -o filename - Output file (-) qtidec: -h - Print help -v - Be verbose -a [0..2] - Analysis mode -i filename - Input file (-) -o filename - Output file (-) qtvenc: -h - Print help -t [0..2] - Use image transforms (0) -e - Compress output data -w - Create QTW file -y [0..2] - Use fakeyuv transform (0) -v - Be verbose -x - Create index (Needs key frames) -s [1..] - Minimal block size (2) -n [1..] - Limit number of frames to encode -r [1..] - Frame rate (25) -k [1..] - Place key frames every X seconds -b [1..] - Maximum size of one QTW block in KiB (1024) -d [0..] - Maximum recursion depth (16) -c [0..] - Cache size in kilo tiles (0) -l [0..] - Laziness -i filename - Input file (-) -o filename - Output file (-) qtvdec: -h - Print help -v - Be verbose -w - Read QTW file -a [0..2] - Analysis mode -f [0..] - Begin decoding at specific frame -n [1..] - Limit number of frames to decode -i filename - Input file (-) -o filename - Output file (-) qtvcap: -h - Print help -t [0..2] - Use image transforms (0) -e - Compress output data -y [0..2] - Use fakeyuv transform (0) -v - Be verbose -x - Create index (Needs key frames) -m - Capture Mouse -g geometry - Specify capture region -s [1..] - Minimal block size (2) -n [1..] - Limit number of frames to encode -r [1..] - Frame rate (25) -k [1..] - Place key frames every X seconds -d [0..] - Maximum recursion depth (16) -c [0..] - Cache size in kilo tiles (0) -l [0..] - Laziness -i filename - Input screen ($DISPLAY) -o filename - Output file (-) qtvplay: -h - Print help -v - Be verbose -r [1..] - Override frame rate -w - Read QTW file -i filename - Input file (-) [space] - Play/Pause [left] - Seek backwards 10sec [right] - Seek forwards 10sec [down] - Seek backwards 1min [up] - Seek forwards 1min [a] - Toggle analysis mode [o] - Toggle overlay mode [t] - Toggle Peath transform [y] - Toggle fakeyuv [s] - Print stats OPTIONS -t: Choose which image transform to use. 0 - Don't use image transforms (faster, big) 1 - Use simplified Peath transform (fast, small) 2 - Use full Peath transform (slow, smaller) -e: Compress output data using entropy coding (slower, smaller) -y: Choose which color transform to use. Mode 2 is mostly useful for pure gray scale images like text. 0 - Don't transform color data (fast, big) 1 - Transform color data using fakeyuv transform (fast, small) 2 - Also separate color data during compression (fast, sometimes smaller) -v: Print some stats like compression ratio and FPS -s: Minimal block size. Small values reduce the amount of data after quad tree compression. Makes entropy coding slightly less efficient. When using entropy coding 4-8 is optimal, otherwise 1-2. -d: Maximum recursion depth during quad tree compression. Setting this to 0 disables the quad tree compression. -c: Size of the cache to use during compression in kilo (1024) tiles. Caching uses an additional cachesize*minsize*minsize*4 bytes of ram during en/decoding. Can drastically reduce the size of raw data without much of a speed impact. Cache size 64 is recommended. Larger cache sizes need more bits to save the cache indices, increasing the files size, but allow for more cache hits in large videos. -l: Subdivide quad tree n times before beginning real compression. Saves a bit of time but introduces a tiny overhead. Values around 3 make sense for FullHD material. -i: Input file name. File to read input from. When not set or "-" read from stdin. For image sequences specify the first file. Numbers need leading zeros. For QTW files pass the file without number postfix. For qtvcap this contains the X screen to capture from. -o: Output file name. File to write output to. When not set or "-" write to stdout. For image sequences specify the first file. Numbers need leading zeros. For QTW files additional block files with numbered postfixes are created. -a: Create a color coded analysis image showing types of blocks and subdivisions. Green blocks are simplified blocks, red blocks are literal blocks, blue blocks are reference blocks, white blocks are cached. When using full fakeyuv encoding a value of 1 will show the luma channel and a value of 2 will show the chroma channel. For videos/images without color separation this has no effect. -w: Create a QTW file instead of a QTV file. QTW files are designed for web usage and JavaScript streaming. The file itself only contains the header and index. The video data is written to sequentially numbered block files for easier streaming and seeking using JavaScript and HTTP. QTW files always have an index. -x: Append an index to the file containing a list of key frames, their offset, and in the case of QTW files, their block number. This allows for seeking inside the video stream. -f: Start decoding at a specific frame. In case the selected frame is not a keyframe or the video has no index the decoder will skip the required number of frames. This may take a while. -n: Only encode a certain number of frames. -r: Frame rate. -k: Key frame rate. Place a key frame every X seconds. Higher values increase the file size since key frames are bigger, but allow for more accurate seeking. Every 5-10 seconds should be ok. -b: Maximum size of one QTW block in KiB. Smaller values create more files and therefore more server requests but allow for smaller buffer and seek times. -m: Include mouse cursor in screen capture. -g: Specify the capture region. The region is in the format WxH+X,Y. W and H are the size of the region, X and Y the offset from the top left screen corner. You can use the "getgeom" script to query the geometry of a window by clicking ok it. EXAMPLES Encode a still image (simple): $ qtienc -i image.ppm -o image.qti Encode a still image (high compression): $ qtienc -y1 -t2 -s4 -c64 -e -i image.ppm -o image.qti Encode an image sequence (simple): $ qtvenc -i frame0000.ppm -o video.qtv Encode an image sequence (fast): $ qtvenc -l3 -i frame0000.ppm -o video.qtv Encode an image sequence (high compression): $ qtvenc -y1 -t2 -s4 -c64 -e -i frame0000.ppm -o video.qtv Re-encode a video: $ qtvdec -i video_old.qtv | qtvenc -y1 -t2 -s4 -c64 -e -o video_new.qtv Re-encode a video with key frames and index: $ qtvdec -i video_old.qtv | qtvenc -x -k10 -y1 -t2 -s4 -c64 -e -o video_new.qtv Re-encode a video for web into a separate directory: $ mkdir video.qtw $ qtvdec -i video.qtv | qtvenc -x -k10 -y1 -t2 -s4 -c64 -e -o video.qtw/video Decode a video into an image sequence: $ qtvdec -i video.qtv -o frame0000.ppm Capture a full screen video: $ qtvcap -l3 -c64 -m -o capture.qtv Capture a video of a single window: $ qtvcap -l2 -c64 -m -g $(getgeom) -o capture.qtv Play back video: $ qtvplay -i video.qtv Re-encode a video using avconv: $ qtvdec -i video.qtv | avconv -f image2pipe -r 25 -vcodec ppm -i - video.mp4
本源码包内暂不包含可直接显示的源代码文件,请下载源码包。