Org.: NVIDIA Developer Technology
What's new in CUDA 2.1
• Debugger support using gdb for CUDA
• Support for using a GPU that is not driving a display on Vista (this was already supported on Windows XP, OS X and Linux)
• DX10 interop support, accelerates communication with DX10 applications
• Visual Studio 2008 support for Windows XP and Windows Vista
• Just-in-time (JIT) compilation, for applications that dynamically generate CUDA kernels
• C++ templates are now supported in CUDA kernels
• Support for recent releases of Linux including Fedora 9, OpenSUSE 11 and Ubuntu 8.04
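To illustrate the template support listed above, here is a minimal sketch of a templated kernel. The kernel name `scale`, the array size and the launch configuration are invented for this example; they do not come from the release or the SDK.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: multiplies every element of data by factor.
// T is a template parameter, so the same source serves float, int, etc.
template <typename T>
__global__ void scale(T *data, T factor, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        data[i] *= factor;
}

int main()
{
    const int n = 256;
    float host[n];
    for (int i = 0; i < n; ++i)
        host[i] = static_cast<float>(i);

    float *dev = 0;
    if (cudaMalloc((void **)&dev, n * sizeof(float)) != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed\n");
        return 1;
    }
    cudaMemcpy(dev, host, n * sizeof(float), cudaMemcpyHostToDevice);

    // Explicitly instantiate the kernel for float at the launch site.
    scale<float><<<(n + 127) / 128, 128>>>(dev, 2.0f, n);

    // This copy waits for the kernel to finish before reading the results.
    cudaMemcpy(host, dev, n * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(dev);

    // Check the result on the host.
    for (int i = 0; i < n; ++i) {
        if (host[i] != 2.0f * static_cast<float>(i)) {
            fprintf(stderr, "mismatch at %d\n", i);
            return 1;
        }
    }
    printf("all %d elements scaled correctly\n", n);
    return 0;
}
```

The same kernel could be launched as `scale<int>` on an `int` buffer without writing a second kernel.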
New CUDA SDK samples for CUDA 2.1
• smokeParticles (volumetric particle shadows)
• DX10 interop samples: simpleD3D10 and simpleD3D10Texture
Known Issues
In this release, #pragma unroll sometimes fails to unroll a loop because the compiler limits the size of the loop bodies it will unroll. Affected kernels may run slower than they did under CUDA 2.0. A user can override this limit on the command line with the following nvcc compiler flag:
nvcc -Xopencc -OPT:unroll_size=200000
In most cases, this should override the built-in loop unrolling limits. Unless a kernel uses #pragma unroll and shows a significant performance drop from CUDA 2.0, this flag should not be used.
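For readers unsure which kernels this applies to, the sketch below shows the kind of loop in question: a fixed trip-count loop marked with #pragma unroll. The kernel name `windowSum`, the constant `TAPS` and the sizes are hypothetical, and a loop this small would probably not hit the limit; it only shows where the pragma goes.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

#define TAPS 8  // hypothetical compile-time trip count

// Each thread sums TAPS consecutive input values starting at its own index.
// The input buffer must hold at least n + TAPS - 1 elements.
__global__ void windowSum(const float *in, float *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n)
        return;

    float acc = 0.0f;
    // Ask the compiler to fully unroll this loop.
    #pragma unroll
    for (int k = 0; k < TAPS; ++k)
        acc += in[i + k];
    out[i] = acc;
}

int main()
{
    const int n = 128;
    const int nIn = n + TAPS - 1;
    float hostIn[nIn], hostOut[n];
    for (int i = 0; i < nIn; ++i)
        hostIn[i] = 1.0f;

    float *devIn = 0, *devOut = 0;
    if (cudaMalloc((void **)&devIn, nIn * sizeof(float)) != cudaSuccess ||
        cudaMalloc((void **)&devOut, n * sizeof(float)) != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed\n");
        return 1;
    }
    cudaMemcpy(devIn, hostIn, nIn * sizeof(float), cudaMemcpyHostToDevice);

    windowSum<<<(n + 63) / 64, 64>>>(devIn, devOut, n);

    cudaMemcpy(hostOut, devOut, n * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(devIn);
    cudaFree(devOut);

    // With an all-ones input, every window sum equals TAPS.
    for (int i = 0; i < n; ++i) {
        if (hostOut[i] != static_cast<float>(TAPS)) {
            fprintf(stderr, "mismatch at %d\n", i);
            return 1;
        }
    }
    printf("all %d window sums correct\n", n);
    return 0;
}
```

If a kernel like this were measurably slower than under CUDA 2.0, the flag above would be added to its compile line, for example `nvcc -Xopencc -OPT:unroll_size=200000 window_sum.cu` (the file name is made up for this example).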