Learning OpenCV, 2nd Edition (Early Release): a practical, accessible OpenCV learning resource with C++ implementations
What Is OpenCV?

OpenCV [OpenCV] is an open source (see http://opensource.org) computer vision library available from http://opencv.org. The library is written in C and C++ [1] and runs under Linux, Windows, Mac OS X, iOS, and Android. Interfaces are available for Python, Java, Ruby, MATLAB, and other languages.

OpenCV was designed for computational efficiency with a strong focus on real-time applications: optimizations were made at all levels, from algorithms down to multicore and CPU instructions. For example, OpenCV supports optimizations for SSE, MMX, AVX, NEON, OpenMP, and TBB. If you desire further optimization on Intel architectures [Intel] for basic image processing, you can buy Intel's Integrated Performance Primitives (IPP) libraries [IPP], which consist of low-level optimized routines in many different algorithmic areas. OpenCV automatically uses the appropriate routines from IPP at runtime. The GPU module also provides CUDA-accelerated versions of many routines (for NVIDIA GPUs) and OpenCL-optimized ones (for generic GPUs).

One of OpenCV's goals is to provide a simple-to-use computer vision infrastructure that helps people build fairly sophisticated vision applications quickly. The OpenCV library contains over 500 functions that span many areas, including factory product inspection, medical imaging, security, user interfaces, camera calibration, stereo vision, and robotics. Because computer vision and machine learning often go hand in hand, OpenCV also contains a full, general-purpose Machine Learning Library (MLL). This sub-library is focused on statistical pattern recognition and clustering. The MLL is highly useful for the vision tasks that are at the core of OpenCV's mission, but it is general enough to be used for any machine learning problem.

[1] The legacy C interface is still supported, and will remain so for the foreseeable future.

Who Uses OpenCV?
Most computer scientists and practical programmers are aware of some facet of the role that computer vision plays. But few people are aware of all the ways in which computer vision is used. For example, most people are somewhat aware of its use in surveillance, and many also know that it is increasingly being used for images and video on the Web. A few have seen some use of computer vision in game interfaces. Yet few people realize that most aerial and street-map images (such as in Google's Street View) make heavy use of camera calibration and image stitching techniques. Some are aware of niche applications in safety monitoring, unmanned aerial vehicles, or biomedical analysis. But few are aware how pervasive machine vision has become in manufacturing: virtually everything that is mass-produced has been automatically inspected at some point using computer vision.

The BSD [BSD] open source license for OpenCV has been structured such that you can build a commercial product using all or part of OpenCV. You are under no obligation to open-source your product or to return improvements to the public domain, though we hope you will. In part because of these liberal licensing terms, there is a large user community that includes people from major companies (Google, IBM, Intel, Microsoft, Nvidia, SONY, and Siemens, to name only a few) and research centers (such as Stanford, MIT, CMU, Cambridge, Georgia Tech, and INRIA).

OpenCV is also present on the web for users at http://opencv.org, a website that hosts documentation, developer information, and other community resources, including links to compiled binaries for various platforms. For vision developers, code, development notes, and links to GitHub are at http://code.opencv.org. User questions are answered at http://answers.opencv.org/questions/, but there is still the original Yahoo Groups user forum at http://groups.yahoo.com/group/OpenCV; it has almost 50,000 members.
OpenCV is popular around the world, with large user communities in China, Japan, Russia, Europe, and Israel. OpenCV has a Facebook page at https://www.facebook.com/opencvlibrary.

Since its alpha release in January 1999, OpenCV has been used in many applications, products, and research efforts. These applications include stitching images together in satellite and web maps, image scan alignment, medical image noise reduction, object analysis, security and intrusion detection systems, automatic monitoring and safety systems, manufacturing inspection systems, camera calibration, military applications, and unmanned aerial, ground, and underwater vehicles. It has even been used in sound and music recognition, where vision recognition techniques are applied to sound spectrogram images. OpenCV was a key part of the vision system in the robot from Stanford, "Stanley", which won the $2M DARPA Grand Challenge desert robot race [Thrun06], and it continues to play an important part in many other robotics challenges.

What Is Computer Vision?

Computer vision [2] is the transformation of data from 2D/3D stills or videos into either a decision or a new representation. All such transformations are done to achieve some particular goal. The input data may include some contextual information such as "the camera is mounted in a car" or "laser range finder indicates an object is 1 meter away". The decision might be "there is a person in this scene" or "there are 14 tumor cells on this slide". A new representation might mean turning a color image into a grayscale image or removing camera motion from an image sequence.

Because we are such visual creatures, it is easy to be fooled into thinking that computer vision tasks are easy. How hard can it be to find, say, a car when you are staring at it in an image? Your initial intuitions can be quite misleading. The human brain divides the vision signal into many channels that stream different pieces of information into your brain.
Your brain has an attention system that identifies, in a task-dependent way, important parts of an image to examine while suppressing examination of other areas. There is massive feedback in the visual stream that is, as yet, little understood. There are widespread associative inputs from muscle control sensors and all of the other senses that allow the brain to draw on cross-associations made from years of living in the world. The feedback loops in the brain go back to all stages of processing, including the hardware sensors themselves (the eyes), which mechanically control lighting via the iris and tune the reception on the surface of the retina.

In a machine vision system, however, a computer receives a grid of numbers from the camera or from disk, and, in most cases, that's it. For the most part, there's no built-in pattern recognition, no automatic control of focus and aperture, no cross-associations with years of experience. For the most part, vision systems are still fairly naïve.

Figure 1-1 shows a picture of an automobile. In that picture we see a side mirror on the driver's side of the car. What the computer "sees" is just a grid of numbers. Any given number within that grid has a rather large noise component and so by itself gives us little information, but this grid of numbers is all the computer "sees". Our task then becomes to turn this noisy grid of numbers into the perception "side mirror". Figure 1-2 gives some more insight into why computer vision is so hard.

Figure 1-1: To a computer, the car's side mirror is just a grid of numbers

In fact, the problem, as we have posed it thus far, is worse than hard; it is formally impossible to solve.

[2] Computer vision is a vast field. This book will give you a basic grounding in the field, but we also recommend texts by Szeliski [Szeliski2011] for a good overview of practical computer vision algorithms, and Hartley [Hartley06] for how 3D vision really works.
Given a two-dimensional (2D) view of a 3D world, there is no unique way to reconstruct the 3D signal. Formally, such an ill-posed problem has no unique or definitive solution. The same 2D image could represent any of an infinite combination of 3D scenes, even if the data were perfect. However, as already mentioned, the data is corrupted by noise and distortions. Such corruption stems from variations in the world (weather, lighting, reflections, movements), imperfections in the lens and mechanical setup, finite integration time on the sensor (motion blur), electrical noise, and compression artifacts after image capture. Given these daunting challenges, how can we make any progress?

Figure 1-2: The ill-posed nature of vision: the 2D appearance of objects can change radically with viewpoint

In the design of a practical system, additional contextual knowledge can often be used to work around the limitations imposed on us by visual sensors. Consider the example of a mobile robot that must find and pick up staplers in a building. The robot might use the facts that a desk is an object found inside offices and that staplers are mostly found on desks. This gives an implicit size reference; staplers must be able to fit on desks. It also helps to eliminate falsely "recognizing" staplers in impossible places (e.g., on the ceiling or a window). The robot can safely ignore a 200-foot advertising blimp shaped like a stapler because the blimp lacks the prerequisite wood-grained background of a desk.

In contrast, with tasks such as image retrieval, all stapler images in a database may be of real staplers, and so large sizes and other unusual configurations may have been implicitly precluded by the assumptions of those who took the photographs. That is, the photographer perhaps took pictures only of real, normal-sized staplers. Also, when taking pictures, people tend to center objects and put them in characteristic orientations.
Thus, there is often quite a bit of unintentional implicit information within photos taken by people. Contextual information can also be modeled explicitly with machine learning techniques. Hidden variables such as size, orientation to gravity, and so on can then be correlated with their values in a labeled training set. Alternatively, one may attempt to measure hidden bias variables by using additional sensors. Using a laser range finder to measure depth, for example, allows us to accurately infer the size of an object.

The next problem facing computer vision is noise. We typically deal with noise by using statistical methods. For example, it may be impossible to detect an edge in an image merely by comparing a point to its immediate neighbors. But if we look at the statistics over a local region, edge detection becomes much easier. A real edge should appear as a string of such immediate neighbor responses over a local region, each of whose orientations is consistent with its neighbors. It is also possible to compensate for noise by taking statistics over time. Still other techniques account for noise or distortions by building explicit models learned directly from the available data. For example, because lens distortions are well understood, one need only learn the parameters for a simple polynomial model in order to describe, and thus correct almost completely, such distortions.

The actions or decisions that computer vision attempts to make based on camera data are performed in the context of a specific purpose or task. We may want to remove noise or damage from an image so that our security system will issue an alert if someone tries to climb a fence, or we may need a monitoring system that counts how many people cross through an area in an amusement park.
Vision software for robots that wander through office buildings will employ different strategies than vision software for stationary security cameras because the two systems have significantly different contexts and objectives. As a general rule, the more constrained a computer vision context is, the more we can rely on those constraints to simplify the problem and the more reliable our final solution will be.

OpenCV is aimed at providing the basic tools needed to solve computer vision problems. In some cases, high-level functionalities in the library will be sufficient to solve the more complex problems in computer vision. Even when this is not the case, the basic components in the library are complete enough to enable you to create a complete solution of your own to almost any computer vision problem. In the latter case, there are some tried-and-true methods of using the library; all of them start with solving the problem using as many available library components as possible. Typically, after you've developed this first-draft solution, you can see where the solution has weaknesses and then fix those weaknesses using your own code and cleverness (better known as "solve the problem you actually have, not the one you imagine"). You can then use your draft solution as a benchmark to assess the improvements you have made. From that point, whatever weaknesses remain can be tackled by exploiting the context of the larger system in which your problem solution is embedded, or by setting out to improve some component of the system with your own novel contributions.

The Origin of OpenCV

OpenCV grew out of an Intel Research initiative to advance CPU-intensive applications. Toward this end, Intel launched many projects, including real-time ray tracing and 3D display walls.
One of the authors (Gary), working for Intel at that time, was visiting universities and noticed that some top university groups, such as the MIT Media Lab, had well-developed and internally open computer vision infrastructures: code that was passed from student to student and that gave each new student a valuable head start in developing his or her own vision application. Instead of reinventing the basic functions from scratch, a new student could begin by building on top of what came before.

Thus, OpenCV was conceived as a way to make computer vision infrastructure universally available. With the aid of Intel's Performance Library Team [3], OpenCV started with a core of implemented code and algorithmic specifications being sent to members of Intel's Russian library team. This is the "where" of OpenCV: it started in Intel's research lab with collaboration from the Software Performance Libraries group, together with implementation and optimization expertise in Russia. Chief among the Russian team members was Vadim Pisarevsky, who managed, coded, and optimized much of OpenCV and who is still at the center of much of the OpenCV effort. Along with him, Victor Eruhimov helped develop the early infrastructure, and Valery Kuriakin managed the Russian lab and greatly supported the effort.

There were several goals for OpenCV at the outset:

• Advance vision research by providing not only open but also optimized code for basic vision infrastructure. No more reinventing the wheel.
• Disseminate vision knowledge by providing a common infrastructure that developers could build on, so that code would be more readily readable and transferable.
• Advance vision-based commercial applications by making portable, performance-optimized code available for free, with a license that did not require commercial applications to be open or free themselves.

Those goals constitute the "why" of OpenCV. Enabling computer vision applications would increase the need for fast processors.
Driving upgrades to faster processors would generate more income for Intel than selling some extra software. Perhaps that is why this open and free code arose from a hardware vendor rather than a software company. Sometimes there is more room to be innovative in software within a hardware company.

In any open source effort, it is important to reach a critical mass at which the project becomes self-sustaining. There have now been around seven million downloads of OpenCV, and this number is growing by hundreds of thousands every month [4]. The user group now approaches 50,000 members. OpenCV receives many user contributions, and central development has long since moved outside of Intel [5]. OpenCV's past timeline is shown in Figure 1-3. Along the way, OpenCV was affected by the dot-com boom and bust and also by numerous changes of management and direction. During these fluctuations, there were times when OpenCV had no one at Intel working on it at all. However, with the advent of multicore processors and the many new applications of computer vision, OpenCV's value began to rise. Similarly, rapid growth in the field of robotics has driven much use and development of the library. After becoming an open source library, OpenCV spent several years under active development at Willow Garage and Itseez, and it is now supported by the OpenCV Foundation at http://opencv.org.

Today, OpenCV is actively being developed by the OpenCV.org foundation, Google supports on the order of 15 interns a year through the Google Summer of Code program [6], and Intel is back actively supporting development. For more information on the future of OpenCV, see Chapter 14.

Figure 1-3: OpenCV timeline

[3] Shinn Lee was of key help, as was Stewart Taylor.
[4] It is noteworthy that, at the time of the publication of Learning OpenCV in 2006, this rate was 26,000 per month. Seven years later, the download rate has grown to over 160,000 downloads per month.
[5] As of this writing, Itseez (http://itseez.com/) is the primary maintainer of OpenCV.
[6] Google Summer of Code: https://developers.google.com/open-source/soc/

Who Owns OpenCV?

Although Intel started OpenCV, the library is and always was intended to promote commercial and research use. It is therefore open and free, and the code itself may be used or embedded (in whole or in part) in other applications, whether commercial or research. It does not force your application code to be open or free. It does not require that you return improvements back to the library, but we hope that you will.

Downloading and Installing OpenCV

The main OpenCV site is at http://opencv.org, from which you can download the complete source code for the latest release, as well as many recent releases. The downloads themselves are found at the downloads page: http://opencv.org/downloads.html. However, if you want the very most up-to-date version, it is always found on GitHub at https://github.com/Itseez/opencv, where the active development branch is stored. The computer vision developer's site (with links to the above) is at http://code.opencv.org/.

Installation

In modern times, OpenCV uses Git as its development version control system, and CMake to build [7]. In many cases, you will not need to worry about building, as compiled libraries exist for supported environments. However, as you become a more advanced user, you will inevitably want to be able to recompile the libraries with specific options tailored to your application and environment. On the tutorial pages at http://docs.opencv.org/doc/tutorials/tutorials.html, under "Introduction to OpenCV", there are descriptions of how to set up OpenCV to work with a number of combinations of operating systems and development tools.

Windows

At the page http://opencv.org/downloads.html, you will see a link to download the latest version of OpenCV for Windows.
This link will download an executable file which you can run, and which will install OpenCV, register DirectShow filters, and perform various post-installation procedures. You are now almost ready to start using OpenCV. [8]

The one additional detail is that you will want to add an OPENCV_DIR environment variable to make it easier to tell your compiler where to find the OpenCV binaries. You can set this by going to a command prompt and typing [9]:

setx -m OPENCV_DIR D:\OpenCV\Build\x86\vc10

If you built the library to link statically, this is all you will need. If you built the library to link dynamically, then you will also need to tell your system where to find the library binary. To do this, simply add %OPENCV_DIR%\bin to your library path. (For example, in Windows 7, right-click on your Computer icon, select Properties, and then click on Advanced System Settings. Finally, select Environment Variables and add the OpenCV binary path to the Path variable.)

To add the commercial IPP performance optimizations to Windows, obtain and install IPP from the Intel site (http://www.intel.com/software/products/ipp/index.htm); use version 5.1 or later. Make sure the appropriate binary folder (e.g., c:/program files/intel/ipp/5.1/ia64/bin) is in the system path. IPP should now be automatically detected by OpenCV and loaded at runtime (more on this in Chapter 3).

[7] In olden times, OpenCV developers used Subversion for version control and automake to build. Those days, however, are long gone.
[8] It is important to know that, although the Windows distribution contains binary libraries for release builds, it does not contain the debug builds of these libraries. It is therefore likely that, before developing with OpenCV, you will want to open the solution file and build these libraries for yourself.
[9] Of course, the exact path will vary depending on your installation; for example, if you are installing on an ia64 machine, then the path will not include "x86", but rather "ia64".
Linux

Prebuilt binaries for Linux are not included with the Linux version of OpenCV owing to the large variety of versions of GCC and GLIBC in different distributions (SuSE, Debian, Ubuntu, etc.). In many cases, however, your distribution will include OpenCV. If your distribution doesn't offer OpenCV, you will have to build it from sources. As with the Windows installation, you can start at the http://opencv.org/downloads.html page, but in this case the link will send you to SourceForge [10], where you can select the tarball for the current OpenCV source code bundle.

To build the libraries and demos, you'll need GTK+ 2.x or higher, including headers. You'll also need pkgconfig, libpng, libjpeg, libtiff, and libjasper with development files (i.e., the versions with -dev at the end of their package names). You'll need Python 2.6 or later with headers installed (the developer package). You will also need libavcodec and the other libav* libraries (including headers) from ffmpeg 1.0 or later. Download ffmpeg from http://ffmpeg.mplayerhq.hu/download.html [11]. The ffmpeg library is under the Lesser General Public License (LGPL). To use it with non-GPL software (such as OpenCV), build and use a shared ffmpeg library:
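A shared-library build of ffmpeg typically looks like the following sketch. The tarball name and install prefix are illustrative; consult the INSTALL notes shipped with your ffmpeg release for the options that apply to it.

```shell
# Unpack the ffmpeg sources (the filename will vary by release).
tar xjf ffmpeg-1.0.tar.bz2
cd ffmpeg-1.0

# --enable-shared builds ffmpeg as shared libraries, which is what
# allows LGPL ffmpeg to be linked from non-GPL code such as OpenCV.
./configure --enable-shared --prefix=/usr/local
make
sudo make install
```

After installation, OpenCV's own build should detect the libav* libraries and headers automatically.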