Resource description: CUDA-accelerated fractal generation
--------------------------------------------------------------------------------
Author:       Alan Reiner (USA)
Orig Date:    05 Jan, 2011
Last Update:  03 Feb, 2011

Description:  Interactive fractal viewer. Uses CUDA for super-fast fractal
              generation, and OpenGL for display and interaction.

                      Windows   Linux   Mac
   CUDA Compute 1.0      ?       No     No
                1.1      ?       No     No
                1.2     Yes      No     No
                1.3     Yes      No     No    (All GTX 2XX)
                2.0     Yes      Yes    No    (All GTX 4XX)
                2.1     Yes      Yes    No    (GTX 460)

I don't have a Mac on which to try this.
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
*** One code base for Windows and Linux
--------------------------------------------------------------------------------
Update 04 Feb, 2011

Checked out the Win64 branch in Linux, updated the makefile, and compiled. It
works! So I merged it into the master branch, and all new development will
continue from there.

--------------------------------------------------------------------------------
*** Windows Support!
--------------------------------------------------------------------------------
Update 03 Feb, 2011

After a tremendous amount of pain and frustration, it appears that NVIDIA does
not want any Windows users to get into CUDA. They don't make their SDK compile
in MSVS 2008 or 2010, and normal constructs I can use in Linux fail miserably
in Windows. I had to do a complete code reorganization to get this to work...

With the exception of the mouse scroll wheel, the exact same CUDA/OpenGL code
that worked in Linux also works in Windows. That's nice...

At the moment there are two branches. I need to check out the win64 branch in
Linux and see if I can get it working via preprocessor branches. Then I'll be
able to merge the win64 branch into the master and everyone should be happy!
Then I can start implementing new stuff! Dynamic colormaps, new IFS functions,
randomization...

--------------------------------------------------------------------------------
*** MAJOR UPDATE -- Integrated OpenGL with CUDA for interactive fractal viewing!
--------------------------------------------------------------------------------
Update 30 Jan, 2011

The original CUDA fractal code was combined with CUDA-OpenGL interop code so
that CUDA output is passed to OpenGL in real time for display and interaction
(via video memory -- no need to pass the image through host RAM). I don't know
how NVIDIA could've made this task any more complicated... But the torture is
over, and it works (LINUX ONLY -- no Windows support yet).

Also, the code compiles and runs for CUDA Compute Capability less than 2.0, and
it loads without crashing, but it doesn't actually display a fractal. So, for
the time being, consider this code Fermi-only.

With OpenGL installed, this produces a real-time Julia set. Left-click-drag
will pan the view, and the scroll wheel will zoom. Right-clicking and dragging
will adjust the C-parameter, which causes the fractal to morph in real time.
At this time, the colormap can only be changed by adding a cmap*.txt file and
updating the code in main_gl.cu (and recompiling).

---------------------
This code is a very basic fractal generator using CUDA (NVIDIA gfx card).
The code is kind of sloppy and simplistic (though Mandelbrots and Julia sets
aren't difficult), but it works if you have CUDA installed and a newer NVIDIA
graphics card. Eventually, as I learn more about fractals, I will expand this
library to generate more types of fractals.

To learn more about CUDA, or to download more of the files you need to compile
this code, you can check out my other CUDA project on github,
CUDA-Image-Processing.

If you are familiar with CUDA, you will realize that fractals are an absolutely
*PERFECT* application for CUDA.
You don't have to pass in an image, only a few parameters that define the
fractal, and the calculations are relatively simple and completely
parallelizable. You should see the full advantage of your GPU with this
program.

For reference, I put this same code together, though much more simply, in
MATLAB using MATLAB complex numbers, and it took 632 s to generate a 2048x2048
Mandelbrot with a max escape time of 256. By comparison, an 8192x8192
Mandelbrot with a max escape time of 1024 took less than 1 second using this
code. Epic!

Typically you would expect a 20x-200x speedup from CUDA for most
parallelizable applications (like image processing). However, this program
demonstrates that sometimes you can achieve even more than that (10,000x?) if
your problem is just right.

---------------------
Update 13 Jan, 2011

Added a Colormap class for defining what colors you want in the output png.
Will eventually implement the capability to supply N colors, from which a
256x3 colormap will be constructed by spreading those N colors across the
spectrum of gray values and interpolating the missing ones. An example cmap is
provided in cmap_blue_green.txt.

Also implemented more useful tiling in the rendering process. Rather than
writing each CUDA tile to a separate file, it now allocates one large host
image and writes each CUDA tile into host RAM. This assumes that you are
rendering an image that is too big for your GFX card, but will fit in host
RAM. I changed this because it looks like there's nothing useful to come out
of writing separate files (no print shop will be able to do anything with a
sequence of files...).

---------------------
Update 12 Jan, 2011

** Added Julia Sets:
Julia sets turned out to be very closely related to the Mandelbrot, and it
took only a couple of extra lines of code to modify the kernel to do Julia
instead. This also gives me the possibility of doing 3D fractals by varying
the c parameter.
However, this may not be desirable, since it seems most Julia sets are rich
enough in two dimensions.

** Added writePngFile():
I finally succeeded in harnessing the libpng[12] library to write png files
directly from the main() function, without ever going through MATLAB.
Bypassing MATLAB saves a lot of time and RAM.

** createcolormap.m:
I created a MATLAB method for exploring colormaps. It creates a 256x3 matrix
of colors, one for each gray level, to be applied to an image loaded in
MATLAB. Given the completion of writePngFile(), this won't be so useful as a
MATLAB script, but it will soon be adapted to C++ so that it can be used in
writePngFile() to add color via the C++ code.

** cudaQuaternion class:
Extends complex numbers to four dimensions. I originally planned to use
quaternions for a 3D Mandelbrot, but found out that there is no such thing
(or, rather, the Mandelbrot is pretty boring in higher dimensions). However,
these may be useful for other kinds of fractals...

---------------------
Update 09 Jan, 2011

** Added the cudaComplex class:
Can be run by devices of CUDA compute 2.0 or higher. Using this, I can use
complex numbers as native, algebraic objects, and will now be able to
implement any kind of IFS fractal.

Using the new cudaComplex class increases computation time by about 5-10%
(probably due to extra memory alloc/dealloc caused by operators returning
copies of the answers, instead of writing directly to a variable).

   Manual complex multiplication in kernel (4096x4096 tile, 1024 max esc):
      Single-precision w/ mem copy:  0.195 s
      Double-precision w/ mem copy:  0.562 s

   Using cudaComplex (4096x4096 tile, 1024 max esc):
      Single-precision w/ mem copy:  0.202 s
      Double-precision w/ mem copy:  0.624 s

** Complex class:
This was a regular C++ implementation of the cudaComplex class, written
before I realized that it needed to be converted to __device__ code. I left
this class in the project even though it's completely redundant w.r.t.
std::complex.

** Quaternion class:
I implemented this in the most tedious way possible, only to realize later
that it's probably completely unnecessary. It hasn't been compiled or tested
in any way; I just wanted to get the non-commutative algebra in there. I
should be able to use it for higher-dimensional fractals later.
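
---------------------
Illustrative sketch (not taken from the project sources)

The 12 Jan and 09 Jan notes above describe two ideas: Julia sets need only a
couple of lines changed relative to the Mandelbrot, and a complex type with
overloaded operators lets the iteration be written algebraically. The sketch
below shows how that might look. Everything named here (Cplx, HD, norm2,
escapeTime, mandelbrotAt, juliaAt) is a hypothetical stand-in chosen for this
example; the real cudaComplex class and kernel may be organized quite
differently. It assumes the usual escape-time rule: iterate z <- z*z + c until
|z| > 2 or the iteration cap is reached.

```cpp
// Under nvcc, mark functions as callable from host and device code;
// under a plain C++ compiler the macro expands to nothing.
#ifdef __CUDACC__
#define HD __host__ __device__
#else
#define HD
#endif

// Hypothetical minimal complex type (stand-in for the cudaComplex class).
struct Cplx {
    float re;
    float im;
};

// Operators return copies, as the README notes its own class does.
HD inline Cplx operator+(Cplx a, Cplx b) {
    return Cplx{a.re + b.re, a.im + b.im};
}

HD inline Cplx operator*(Cplx a, Cplx b) {
    return Cplx{a.re * b.re - a.im * b.im,
                a.re * b.im + a.im * b.re};
}

// Squared magnitude; comparing against 4 avoids a square root.
HD inline float norm2(Cplx z) {
    return z.re * z.re + z.im * z.im;
}

// Shared escape-time loop: counts iterations of z <- z*z + c performed
// before |z| exceeds 2, capped at maxIter.
HD inline int escapeTime(Cplx z, Cplx c, int maxIter) {
    int n = 0;
    while (n < maxIter && norm2(z) <= 4.0f) {
        z = z * z + c;
        ++n;
    }
    return n;
}

// Mandelbrot: z starts at 0 and the pixel coordinate plays the role of c.
HD inline int mandelbrotAt(Cplx pixel, int maxIter) {
    return escapeTime(Cplx{0.0f, 0.0f}, pixel, maxIter);
}

// Julia: the pixel coordinate is the starting z and c is a fixed parameter.
HD inline int juliaAt(Cplx pixel, Cplx c, int maxIter) {
    return escapeTime(pixel, c, maxIter);
}
```

In a CUDA kernel, each thread would presumably map its pixel index to a point
in the complex plane and call one of the two wrappers. Under that assumption,
the right-click-drag morphing described earlier amounts to changing the c
argument passed to juliaAt between frames.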