ZlibDecompress
==============

This class was written from original C++ source code by Mark Adler (http://zlib.net/)

For some PDF files it is not possible to extract text stored in compressed zlib format, because the PHP function `gzuncompress()` fails to decompress it.

I found that in some cases the zlib stream header (see RFC 1950) of the compressed PDF stream is missing or invalid, and this is why `gzuncompress()` cannot decompress the data.
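To make the header problem concrete, here is a minimal sketch of the validity check that RFC 1950 defines for the first two bytes of a zlib stream. The function name `hasValidZlibHeader()` is hypothetical and is not part of this package; it only illustrates what a valid header looks like.

```php
<?php
/**
 * Report whether $data begins with a valid RFC 1950 zlib header.
 * Hypothetical helper for illustration; not part of ZlibDecompress.
 */
function hasValidZlibHeader(string $data): bool
{
    if (strlen($data) < 2) {
        return false;
    }
    $cmf = ord($data[0]); // compression method (low nibble) and window size (high nibble)
    $flg = ord($data[1]); // flags, including the 5-bit FCHECK value

    // CM must be 8, which means "deflate".
    if (($cmf & 0x0F) !== 8) {
        return false;
    }
    // CINFO values above 7 are not allowed by RFC 1950.
    if (($cmf >> 4) > 7) {
        return false;
    }
    // CMF and FLG, read as a 16-bit big-endian number, must be a multiple of 31.
    return (($cmf << 8) | $flg) % 31 === 0;
}
```

A stream that fails this check is the kind of input on which `gzuncompress()` gives up, even when the deflate data after the header is intact.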

Here's my workaround: I wrote a new `ZlibDecompress` class in PHP, based on original C++ code by Mark Adler taken from the official zlib site. It uses the standard algorithm and decompresses the string correctly while ignoring the zlib stream header, which holds only the compression method and flag bits (including a check value) and no compressed data. This PHP class is much slower than the `gzuncompress()` function because it is not compiled, so I use it only when `gzuncompress()` fails. For PDF files this happens only in a few cases, so this is a good compromise for me.

Ref. http://bugs.php.net/bug.php?id=39616

# Usage

The `ZlibDecompress.php` file contains the class `ZlibDecompress`, which requires the `HuffmanTable.php` file/class to work. 
To use it, simply call the `inflate()` method:

    $zlib = new ZlibDecompress;
    $inflated = $zlib->inflate(substr($compressed_stream,2));

For PDF files, it can be used together with the `gzuncompress()` function, calling the class only when `gzuncompress()` fails to decompress the stream:

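    ini_set("memory_limit", "-1"); // removes PHP's memory limit
    $inflated = @gzuncompress($stream); // @ suppresses the warning raised on failure
    if ($inflated === false) {
        // Fall back to the slower pure-PHP implementation.
        require_once("ZlibDecompress.php");
        $zlib = new ZlibDecompress;
        $inflated = $zlib->inflate(substr($stream, 2)); // skip the two-byte zlib header
    }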

Note that when using the `inflate()` method you must cut the first two bytes (i.e. the zlib stream header) from the compressed PDF stream.
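The fallback pattern can also be wrapped in a small function. The sketch below is one possible arrangement, not part of this package: `inflatePdfStream()` is a hypothetical name, and the fallback is passed in as a callable so that the function does not depend on how the class is loaded. It assumes the fallback returns the inflated data as a string.

```php
<?php
/**
 * Inflate a PDF stream, trying gzuncompress() first and a fallback second.
 * Hypothetical helper: $fallback receives the stream without its first two
 * bytes and is assumed to return the inflated string.
 */
function inflatePdfStream(string $stream, callable $fallback): string
{
    // Suppress the warning that gzuncompress() raises on a bad header.
    $inflated = @gzuncompress($stream);

    // Compare against false so that an empty or "0" result is not mistaken for failure.
    if ($inflated !== false) {
        return $inflated;
    }

    // Drop the two-byte zlib stream header before handing the data over.
    return $fallback(substr($stream, 2));
}
```

With the class from this package, the call would look like `inflatePdfStream($stream, [new ZlibDecompress, 'inflate'])` after `ZlibDecompress.php` has been loaded.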
