This action might not be possible to undo. Are you sure you want to continue?
CUDA Programming within Mathematica
such as image processing and financial engineering. the CUDA programming paradigm. the device is the GPU. CUDA promises significant performance increases. Enter Mathematica. becomes more important. and correct formulation of the problem and memory usage can give you a significant performance boost. which runs on the Central Processing Unit (CPU). GPUs currently have thousands of cores. we describe the benefits of CUDA integration in Mathematica and provide some applications for which it is suitable. and finance— that makes CUDA programming a breeze. Unlike programming in C or developing CUDA wrapper code. as well as application performance. The first is the promise of significant speed. Mathematica offers an intuitive environment—even featuring built-in ready-to-use examples for common application areas. such as image processing. the easiest way to program for CUDA and unlock GPU performance potential. With CPU manufacturers indicating that hundred. NVIDIA developed the language to program graphics cards and let you run C-like code on the Graphical Processing Unit (GPU).and thousand-core systems will be included in user desktops. now you don’t have to be a programming wizard to use CUDA. Considering CUDA’s advanced technology. while the other is the promise that a CUDA-designed program will scale to multicore systems. In CUDA terminology.Introduction CUDA is short for Common Unified Device Architecture. even if you’ve never used Mathematica before. enhancing speed by factors easily exceeding 100. you may expect its programming to be enormously complicated. medical imaging. 2 CUDA Programming within Mathematica . There are two motivations for using CUDA. In this document. while the host is the CPU. and CUDA-designed programs will scale to multicore systems. Motivations for CUDA CUDA is different from regular C code. statistics. And if you have used Mathematica. which scales well to thousands of CPU cores. you’ll be amazed by the massive boost in computational power.
Through its full automation and preprocessing mechanisms.A Brief Introduction to Mathematica Mathematica has been one of the most powerful languages for technical computing for more than 20 years. Mathematica is a sophisticated development environment that combines a flexible programming language with a wide range of symbolic and numeric computational capabilities. instant interface construction. This unified representation makes Mathematica’s language and functions extremely flexible. and consistent. documents—can be represented as symbolic entities. For CUDA developers. Some highlights of Mathematica’s features include: Full-featured. such as statistics. and image processing. and deployment. streamlined. Programmers can choose their own style for writing code with minimal effort. which brings a whole new meaning to high-performance computing. Along with comprehensive documentation and resources. production of high-quality visualizations. as well as several different programming paradigms. Multiparadigm programming language Mathematica provides its own highly declarative functional language. Mathematica provides a streamlined workflow. All functions are carefully designed and tightly integrated with the system. control systems. data visualization. formulas. which enables users to break the barriers between specialized areas and explore new possibilities. With direct integration of dynamic libraries. and automatic C code generation and linking. development. programs. graphics. and a range of immediate deployment options. such as procedural and rule-based programming. Unique symbolic-numeric hybrid system The fundamental principle of Mathematica is full integration of symbolic and numeric computing capabilities. CUDA Programming within Mathematica 3 . the new integration means unlimited access to Mathematica’s vast computing abilities. built-in application area packages. Unified data representation At the core of Mathematica is the foundational idea that everything—data. Extensive scientific and technical area coverage Mathematica provides thousands of built-in functions and packages that cover a broad range of scientific and technical computing areas. users can enjoy the full power of a hybrid computing system without the knowledge of specific methodologies and algorithms. Wolfram Research introduces fully integrated GPU programming capabilities in Mathematica. Mathematica’s flexibility greatly reduces the cost of entry for new users. unified development environment Through its unique interface and integrated features for computation. called expressions. Mathematica provides the most sophisticated build-todeploy environment in the market.
multicore processors without extra packages. 4 CUDA Programming within Mathematica . Easy data access and connectivity Mathematica natively supports hundreds of formats for importing and exporting. Benefits of CUDA Integration in Mathematica For users who want to tap into the power of GPU computing. programming with CUDA may require programmers to manage project setup. The full integration and automation of Mathematica’s CUDA capability means a more productive and efficient development cycle. Wolfram Research’s computational knowledge engine’. and browser plug-ins. It also provides APIs for accessing common database and programming languages. Mathematica provides a wide range of options for deployment. Many functions automatically utilize the power of multicore processors. Platform-dependent deployment options Through its interactive notebooks. and device configuration. and built-in generic parallel functions make high-performance programming a simple process. Users do not have to set up any project files or worry about platform dependency. such as SQL. Built-in code generation functionality can be used to create standalone programs for independent distribution. CUDA integration in Mathematica provides benefits in both development and performance.NET. as well as unlimited access to data from Wolfram|Alpha®. Mathematica Player’.Built-in high-performance computing Mathematica fully supports multi-threaded. platform dependencies. it brings unprecedented levels of performance improvements without extra development time and cost. C/C++. and . CUDA integration in Mathematica makes the process completely transparent and fully automated. In addition. Simplified Development Cycle Automation of development project management Like many other development frameworks.
For advanced applications. It goes one step further by compiling and establishing the connection with Mathematica in a completely transparent process. to avoid data transfer (which is known to be very costly) unless absolutely needed. built-in application area CUDA Programming within the power 5 support. but also spend more time on developing and optimizing core CUDA kernel algorithms. users can not only combine Mathematica of Mathematica and GPU computing.Automated GPU memory and thread management In a typical CUDA program. which completely abolishes the need for extra code writing: The memory between the host and the device is synchronized only as needed. thereby greatly simplifying the development process for the end user. With Mathematica’s comprehensive symbolic and numerical functions. in addition to a CUDA kernel function: With Mathematica. programmers should write memory and thread management code manually. CUDA integration provides full access to Mathematica’s native language and built-in functions. . linked. full control of how and when the memory needs to be copied between the host and GPU devices is provided. It also provides free exchange of data between Mathematica and users’ CUDA programs. and is ready to use. memory and thread management for the GPU is completely automatic. Automatic compilation and binding of dynamic libraries Mathematica’s CUDA support streamlines the whole programming process. With a single command. and graphical interface building functions. your CUDA kernel code is compiled. and bound to Mathematica.
Using CUDALink from within Mathematica gives you access to Mathematica’s features. and graphical interface building functions. and a retail price of less than $500. CUDALink provides you with carefully tuned linear algebra. image processing. and image processing algorithms. 6 CUDA Programming within Mathematica . and 3D fractals can be computed in real time. with a high frame rate. Performance Improvements GPUs have been using many cores for quite some time now. Ready-to-use applications CUDA integration in Mathematica provides several ready-to-use CUDA functions that cover a broad range of topics such as mathematics. and programming capabilities. financial engineering. and more. support for double precision.With Mathematica’s comprehensive symbolic and numerical functions. This provides considerable speed up in many areas such as: † On a NVIDIA GT 240 with an Intel i7 950. FinancialDerivative improvements of up to 35x for binomial and 80x for Monte Carlo methods. Mathematica provides support for multicore systems via built-in functions such as Parallelize. users can not only combine the power of Mathematica and GPU computing. etc. import/export. Mathematicaʼs CUDALink: Integrated GPU Programming CUDALink is a built-in Mathematica package that provides a simple and powerful interface for using CUDA within Mathematica’s streamlined workflow. discrete Fourier transform. A current example is the NVIDIA GeForce GTX 480. which is a consumer card with 480 CUDA cores. including visualization.. † Several random number generation algorithms have been implemented and speed ups of 50x have been observed. Examples will be given in Section 6. † Volumetric rendering. implicit algebraic functions. ParallelMap. there is nothing for users to configure or set up to utilize the GPU computing power. and makes CUDA devices available to the users. You can also write your own CUDALink modules with minimal effort. configures. built-in application area support. but also spend more time on developing and optimizing core CUDA kernel algorithms. that free you from the cumbersome details of launching multiple kernels and processors and let you concentrate on the core implementation of your algorithms. Mathematica automatically finds. Zero device configuration As long as NVIDIA’s CUDA Toolkit is installed on the development system.
and Mac OS X. visualization features. Linux. int index = channels * HxIndex + yIndex*widthL. and more † Ready-to-use functionality with zero configuration needed if current video drivers and compilers are already installed † Support for 32-bit and 64-bit Windows. int *dim.y * blockDim. import/export capabilities.<<". we will create a simple example that negates colors of a 3-channel image. write a CUDA kernel function as a string.1 or higher † A recent NVIDIA driver † On Windows. height = dim@1D.x + blockIdx. Mac OS X users need at least Mac OS X 10. int yIndex = threadIdx.The CUDALink package included with Mathematica at no additional cost offers: † Automatic compilation of CUDA programs † Automatic memory handling between host and GPU devices † The ability to write your own CUDALink modules with little effort † Support for Microsoft Visual Studio and GCC compilers † Multiple-GPU support † Support for single and double arithmetic precision † Access to Mathematica’s flexible programming language.nvidia. and Linux systems System Requirements To utilize Mathematica’s CUDALink. and fullfeatured development environment † Access to Mathematica’s computable data. int channelsL 8 int width = dim@0D. † NVIDIA CUDA enabled products. automatic interface builders. It begins with loading the CUDALink package into Mathematica: Needs@"CUDALink`"D The following function verifies that the system has CUDA support: CUDAQ@D True Now.x. First.img@index + cD.and 64-bit architecture.0 or higher † CUDA Toolkit 3. both 32. † Mathematica 8. c < channels. the following operating systems and software are required: † Operating System: Windows.html for further information.6. and assign it to a variable: kernel = " __global__ void cudaColorNegateHint *img.x * blockDim. if HxIndex < width && yIndex < heightL 8 for Hint c = 0.y. CUDA Programming within Mathematica 7 . Mac OS X. Microsoft Visual Studio (Windows platforms) 2005 or higher Getting Started with CUDALink Programming the GPU in Mathematica is straightforward. See www.3.y + blockIdx.com/object/cuda_gpus. int xIndex = threadIdx. c++L img@index + cD = 255 .
Several things are happening at this stage. "cudaColorNegate". "Input"<. CUDAFunctionUnload can be called to unbind the function from Mathematica. After compilation. 8_Integer. The last argument denotes the dimension of threads per block to be launched.Pass that string to a built-in function CUDAFunctionLoad. CUDAQ tells whether the current hardware and system configuration support CUDALink: Needs@"CUDALink`"D CUDAQ@D True 8 CUDA Programming within Mathematica . ImageDimensions@iD. ImageChannels@iDD : > Finally. 816. Now you can apply this new CUDA function to any image format that Mathematica can handle. For instance. CUDAFunctionUnload@colorNegateD Accessing System Information CUDALink supplies several functions that make it easy to acquire detailed system information for GPU programming. There is no need for users to add system interface or memory management code. _. "InputOutput"<. _. the function is automatically bound to Mathematica and is ready to be called. Mathematica automatically compiles the kernel function as a dynamic library. colorNegate@i. i= . 16<D. _Integer<. colorNegate = CUDAFunctionLoad@kernel. along with the kernel function name and the argument specification. 88_Integer.
operating systems. Several other functions are also provided that return in-depth information about Mathematica. Retargetable Code Generation CUDALink provides you with the ability to perform on-the-fly compilation and execution. Example of a report generated by CUDAInformation. Code can be compiled into an executable or an external library for reuse. or to compile executables or libraries to be used later. CUDALink’s portable simple compiling mechanism for CUDA code uses the NVCC compiler.CUDAInformation generates a detailed report on supported CUDA devices. CUDA Programming within Mathematica 9 . which means that it can be used to optimize CUDA kernel code programmatically. The returned data from CUDAInformation is a valid Mathematica input form. hardware. and C/C++ compilers that are currently used by CUDALink.
This makes it well suited to creating. String CUDACodeStringGenerate generates CUDA interface code in string form. compiles it to a dynamic library. you can focus on innovating your algorithms in CUDA kernels. and optimizing C code. The SymbolicC output can then be used to render CUDA code that calls the correct functions to bind the CUDA code to Mathematica. In conjunction with this capability. which can be exported to other development platforms. Combining Mathematica’s full-featured development environment and CUDA integration. which can be called with Mathematica’s automatic dynamic library binding capability. rather than spending time on repetitive tasks. and programming capabilities. 10 CUDA Programming within Mathematica . which provides a hierarchical view of C code as Mathematica’s own language. depending on the target: SymbolicC CUDACodeGenerate takes a CUDA kernel function or program and generates SymbolicC output. and better code optimization. manipulating. CUDASymbolicCGenerate produces SymbolicC output for CUDA kernel as well as C++ wrapper code. CUDALibraryFunctionGenerate takes a CUDA kernel function and generates the code required to use it. Dynamic library CUDALibraryGenerate generates CUDA interface code and compiles it into a library that can be loaded into Mathematica.Mathematica also provides SymbolicC. and returns Mathematica’s library function. Integration with Mathematica Functions When you use CUDALink. Furthermore. Manipulate: Mathematica’s automatic interface generator Mathematica provides extensive built-in interface functions including both standard and advanced controls. Mathematica provides a fully automated interface generating function called Manipulate. users can generate CUDA kernel code for several different targets for greater portability. such as interface building. Then it takes the sources. Users can also customize controls using Mathematica’s highly declarative interface language. less platform dependency. import/export. you can use all of Mathematica’s features including visualization. Several built-in functions perform code generation.
data formats (Excel. including CUDALink functions. etc. Manipulate automatically chooses appropriate controls and creates a user interface around it. MAT. CSV. Support for import and export Mathematica natively supports hundreds of file formats and their subformats for importing and exporting. 8operation. PNG. video formats (AVI. Any supported data formats will be automatically converted to Mathematica’s unified data representation. 0. Example of a user interface built with Manipulate. and various raw formats for further processing. 8CUDAErosion. medical imaging formats (DICOM). 9<. TIFF. etc.). or an expression. xF. AIFF.ManipulateBoperationB . AU. which can be used in all Mathematica functions. MOV. H264.). etc.).). audio formats (WAV. BMP. 8x. CUDADilation<<F By specifying possible ranges for variables. CUDA Programming within Mathematica 11 . FLAC. etc. Supported formats include: common image formats (JPEG.
dilation.png $ImportFormats and $ExportFormats give full lists of supported import and export formats. outputD masked. and closing. and random number generation. All of these operations work on either images or arrays of real and integer numbers. which provides the same benefits and functionality of GPU programming as CUDALink over OpenCL architecture.Not only does Mathematica provide access to local resources. Image Processing CUDALink offers many image processing functions that have been carefully tuned for the GPU.wolfram. financial derivatives. opening.png". and can be easily exported to one of the supported formats using the Export function. 12 CUDA Programming within Mathematica . morphological operators such as such as erosion. Mathematicaʼs CUDALink Applications In addition to support for user-defined CUDA functions and automatic compilation. The following code imports an image from a given URL: image = Import@ "http:êêgallery. but any URL can be used to access data online. This can be directly used by CUDALink functions. such as CUDAImageAdd: output = CUDAImageAddBimage. CUDALink includes several ready-to-use functions that support NVIDIA’s CUFFT (Fourier analysis) and CUBLAS (linear algebra) libraries and cover application areas such as image processing. including the ones from CUDALink functions. OpenCL Compatibility In addition to CUDALink. are also expressions. Mathematica supports OpenCL with the built-in package OpenCLLink. the following code exports CUDA output into PNG format: Export@"masked.pop.comê2dêpopupê00_contourMosaic. For example. F All outputs from Mathematica functions.jpg"D. The function Import automatically recognizes the file format and converts it into Mathematica expression. These include pixel operations such as image arithmetic and composition. and image convolution and filtering.
and closing. CUDAOpening .Image convolution CUDALink’s convolution is similar to Mathematica’s ListConvolve and ImageConvolve functions. Dilation. More sophisticated morphological operations can be built using these fundamental operations. -1 0 1 -2 0 2 F -1 0 1 CUDAImageConvolveB . and CUDAClosing are equivalent to Mathematica’s built-in Erosion. CUDALink supports simple pixel operations on one or two images. lists. Opening. CUDADilation. or CUDA memory references. and it can use Mathematica’s built-in filters as the kernel. F Multiplication of two images. Morphological operations CUDALink supports fundamental operations such as erosion. CUDAImageMultiplyB . Convolving a microscopic image with a Sobel mask to detect edges. dilation. CUDA Programming within Mathematica 13 . It will operate on images. opening. such as adding or multiplying pixel values from two images. CUDAErosion . and Closing functions.
14 CUDA Programming within Mathematica . products. and other operations. and closing. CUDAOpening . The CUDAFourier function can operate on a list of 1D. Many common formats such as H. CUDAErosion . finding minimum or maximum elements. CUDADilation. With GPU computing power. and Closing functions. or 3D real or complex numbers. or on data held in a chunk of device memory that is registered with CUDALink’s Fourier memory manager.264. and DivX are supported.CUDALink supports fundamental operations such as erosion. QuickTime. More sophisticated morphological operations can be built using these fundamental operations. Fourier Analysis The Fourier analysis capabilities of the CUDALink package include forward and inverse Fourier transforms. dilation. Video Processing CUDALink’s built-in image processing functions can also be applied to videos to perform realtime filtering. CUDALink’s video processing function can easily handle full high-resolution video (1080p) filtering in 30 frames per second. or transposing rows and columns of an image. Dilation. Processing multiple video streams with different filters in real time. Opening. Examples include vector addition. Linear Algebra You can perform various linear algebra functions with the GPU. and CUDAClosing are equivalent to Mathematica’s built-in Erosion. 2D. opening.
CUDA Programming within Mathematica 15 . with interactive interfaces for transfer functions and other volume-rendering parameter controls. depending on the type of option selected. Computing options on the GPU can be dozens of times faster than using the CPU and parallel processing. They simulate a large number of particles in real time using the GPU’s computing power.Fluid Dynamics Computational fluid dynamics examples are included with CUDALink. Volumetric rendering of a medical image. CUDALink’s options pricing function uses the binomial or Monte Carlo method. Fluid simulation with multi-particles. Volumetric Rendering CUDALink includes functions to read and display volumetric data in 3D.
and Canada: 1-800-WOLFRAM (965-3726) info@wolfram. directorate.Pricing and Licensing Information Wolfram Research offers many flexible licensing options for both organizations and individuals.com Europe: +44-(0)1993-883400 info@wolfram. Recommended Next Steps Request your free Mathematica trial version www. including network licensing for groups. cost-effective plan for your workgroup. and Canada (except Europe and Asia): +1-217-398-0700 email@example.com © 2010 Wolfram Research.wolfram.co. and built-in functions for many applications—plus full integration with all of Mathematica’s other computation. Wolfram|Alpha is a registered trademark and computational knowledge engine is a trademark of Wolfram Alpha LLC—A Wolfram Research Company. automatic memory management. and maintain multiple tools and add-on packages.wolfram. All other trademarks are the property of their respective owners.S. development. You can choose a convenient. Inc.com/mathematica-trial Request a technical demo: www.com Outside U. Visit us online for more information: www. and deployment capabilities—Mathematica’s built-in CUDALink package provides a powerful interface for GPU computing. university. or just yourself. all functionality is included without the need to buy. Inc. or Mathtech.wolfram. multicore computing. With its simplified development cycle. MKT3014 2060699 0910. Inc. learn.co. Mathematica is not associated with Mathematica Policy Research.uk Asia: +81-(0)3-3518-2880 firstname.lastname@example.org 16 CUDA Programming within Mathematica . Mathematica is a registered trademark and Mathematica Player is a trademark of Wolfram Research.com/mathematica-purchase Summary Thanks to Mathematica’s integrated platform design.com/mathematica-demo Learn more about CUDA programming in Mathematica: U. department.S. use. Inc.
This action might not be possible to undo. Are you sure you want to continue?
We've moved you to where you read on your other device.
Get the full title to continue listening from where you left off, or restart the preview.