imgfind
 imgfind - Searches for similar image files.

 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  Copyright (C) 2011 Joseph Woodruff

  imgfind is free software; you can redistribute it and/or modify
  it under the terms of the GNU General Public License as published by
  the Free Software Foundation; either version 3 of the License, or
  (at your option) any later version.

  This program is distributed in the hope that it will be useful,
  but WITHOUT ANY WARRANTY; without even the implied warranty of
  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
  GNU General Public License for more details.

  You should have received a copy of the GNU General Public License
  along with this program; if not, write to the Free Software
  Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA

 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

INSTALL:
  Build and installation instructions included in INSTALL file.

USAGE:

  Usage: imgfind [path...] search_criteria

         path is a path to a directory or image file.
            Default is the current working directory.

         search_criteria is the image file to match against, in GIF, JPEG,
            PNG or BMP format.


HOW DOES IT WORK? (DESIGN):

    imgfind uses the pHash perceptual hashing library to generate
    non-avalanching, discrete cosine transform-based hash values of images.
    The Hamming distance between two hashes is then computed to determine
    how similar the images are.
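
    As a rough illustration of the comparison step (not imgfind's actual
    code), the sketch below computes the Hamming distance between two hash
    values. It assumes the hashes are 64 bits wide; the names
    hamming_distance and is_similar, and the max_distance parameter, are
    made up for this example.

```cpp
#include <bitset>
#include <cstdint>

// Number of bit positions in which two 64-bit hashes differ.
// Assumption: hashes are 64 bits wide.
inline int hamming_distance(std::uint64_t a, std::uint64_t b) {
    // XOR leaves a 1 in every position where the hashes disagree;
    // counting those 1 bits gives the Hamming distance.
    return static_cast<int>(std::bitset<64>(a ^ b).count());
}

// Hypothetical helper: two images count as "similar" when their hashes
// differ in at most max_distance bits.
inline bool is_similar(std::uint64_t a, std::uint64_t b, int max_distance) {
    return hamming_distance(a, b) <= max_distance;
}
```

    Because the hash is non-avalanching, a small change to an image flips
    only a few bits, so a small distance suggests a near-duplicate.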

    The hashing algorithms in the pHash library are designed to be robust 
    against rotation, skew and contrast adjustments and should also be 
    resistant to scaling to a limited extent. 

    The default settings for the hamming distance similarity criteria will tend
    to produce some false positives against a non-trivial image collection; 
    this is by design, to reduce the possibility of false negatives. Thus, I 
    wouldn't incorporate imgfind into an automated file deduplication script, 
    unless it enforces manual/interactive verification of the results ;)
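
    To make the trade-off concrete, here is a minimal sketch (again, not
    imgfind's real implementation) of filtering a set of pre-computed
    hashes against a distance threshold. The Candidate and Match types,
    the find_matches function and the 64-bit hash width are all
    assumptions made for the example. Results are sorted closest-first,
    so a person reviewing them sees the likeliest duplicates at the top.

```cpp
#include <algorithm>
#include <bitset>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical record types for this sketch.
struct Candidate { std::string path; std::uint64_t hash; };
struct Match     { std::string path; int distance; };

// Return every candidate whose hash is within max_distance bits of query,
// ordered from closest to farthest. A larger max_distance admits more
// false positives but misses fewer true matches.
std::vector<Match> find_matches(std::uint64_t query,
                                const std::vector<Candidate>& candidates,
                                int max_distance) {
    std::vector<Match> matches;
    for (const Candidate& c : candidates) {
        const int d =
            static_cast<int>(std::bitset<64>(query ^ c.hash).count());
        if (d <= max_distance) {
            matches.push_back(Match{c.path, d});
        }
    }
    // stable_sort keeps the input order among equally distant matches.
    std::stable_sort(matches.begin(), matches.end(),
                     [](const Match& a, const Match& b) {
                         return a.distance < b.distance;
                     });
    return matches;
}
```

    Anything returned this way is only a candidate; the list is meant to
    be checked by eye before any file is deleted.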

    imgfind is written in C++, and makes considerable use of the STL, boost, and 
    C++ standard classes such as string. Unfortunately, pHash is written 
    predominantly in standard C, without any of the helpful features provided 
    by C++, like classes or templates or automated memory management.
 
    This has, unsurprisingly, presented some challenges with regard to 
    integrating the two. It is my hope that I will be able to write a 
    class-based wrapper around the pHash API to make it easier to incorporate 
    into object-oriented C++ applications. For now though, I'll make do with 
    what I have.

WHY THIS?:

    To be perfectly honest, I've been using imageboards since high school, 
    and now have what constitutes an unmanageably large collection of 
    image macros, funny cat pictures and other assorted mental detritus. As
    a result, there are duplicate files taking up disk space and cluttering 
    up my image directories.

    Unfortunately, most image files such as these don't contain any relevant 
    searchable metadata describing their contents, which leaves manual 
    inspection of the directory or directories as the only option for trying to 
    track down duplicates or lower-resolution copies. 

    This is where imgfind comes in. I've written it as a nice little search 
    utility modeled after the classic UNIX find, that can be used to match 
    image files by their actual contents. It can detect duplicates and even 
    images which are merely 'similar' (using pHash's suggested values for 
    Hamming distance, it can reliably pick out image macros that use the same 
    image with different text, for instance).

    I imagine this could also eventually be useful for digital forensics and 
    copyright enforcement applications, if one were so inclined. In any event,
    happy image hunting.


Comments are welcome; patches / pull requests are even more welcome.

	- Joseph Woodruff  (aka sectoidman)
