cuda-experience: 2011

Monday, April 11, 2011

Debugging Software Crashes in C and C++ - II

Sunday, April 3, 2011

Setup eclipse for CUDA development

After two weeks of playing with Visual Studio, I found it pretty frustrating: IntelliSense does not work all the time (maybe I need to install Visual Assist). And if you are a Linux die-hard, you will probably find Eclipse more suitable anyway.

BTW, here is how to set up Eclipse for CUDA development on Windows, although there are still several errors (see the end of this post):
1) Download & install Visual Studio 2005/2008 Express Edition
2) Download & install DirectX SDK
3) Download & install Microsoft Platform SDK
4) Download & install CUDA SDK & Toolkit :-)
5) Download & install eclipse CDT (for sure :p)
6) Download the project template at http://public.procoders.net/cuda_template/minimal_cuda.zip

Now you are ready to go. Create a project using the above template, then edit the MAKEFILE as follows.
Change VC_HOME to point to the place where you installed the VC Express Edition

i.e. VC_HOME= C:\Program Files\Microsoft Visual Studio 9.0\VC

Add SDK_HOME and point to the place that you installed the Platform SDK

i.e. SDK_HOME= C:\Program Files\Microsoft Platform SDK

Change CUDA_COMMON_HOME to point to the location of the CUDA SDK

i.e. CUDA_COMMON_HOME= C:\ProgramData\NVIDIA Corporation\NVIDIA GPU Computing SDK 3.2\C\common

In the MAKEFILE, find the place where nvcc_par is set and add -I "$(CUDA_INC_PATH)"

i.e. SET nvcc_par= -I "$(CUDA_INC_PATH)" -I "$(CUDA_COMMON_HOME)\inc" -ccbin "$(CL_EXE)" \
-Xcompiler $(XCOMPILER_FLAGS) -c $*.cu

CUDA_INC_PATH is set during the installation of the CUDA Toolkit.
Find the line that starts with "$(LINK_EXE)" and add /LIBPATH:"$(SDK_HOME)\lib"

i.e. "$(LINK_EXE)" /OUT:$(MAIN) /LIBPATH:"$(VC_HOME)\lib" /LIBPATH:"$(SDK_HOME)\lib" $(OBJECTS) \
"$(CUDA_LIB_PATH)\cudart.lib" "$(CUDA_COMMON_HOME)\lib\cutil32.lib"

You will also need to add $(VC_HOME)\BIN to the PATH environment variable before starting Eclipse.

Finally, change OBJECTS and MAIN to match your own project, i.e.

OBJECTS= helloWorld.obj
MAIN=helloWorld.exe

Ctrl+B and it will be built :-)
Have fun.

I think I don't need Nsight, as I'll use cuda-gdb for debugging instead, which is cool. You might also need to add the CUDA SDK include folders to Eclipse's include paths so that code completion works.

BTW, there is one annoying thing here: sal.h gives an error when compiling inside Eclipse, but running nmake from the console works fine. What's happening??!!! X.X

Enable CUDA syntax highlighting & IntelliSense for Visual Studio

For syntax highlighting of keywords like __global__...
Go to "C:\ProgramData\NVIDIA Corporation\NVIDIA GPU Computing SDK 3.2\C\doc\syntax_highlighting"
Read the readme.txt there. Basically, the steps are simple, as follows:

Want pretty syntax highlighting when editing your .cu files in Visual Studio?
Here's how:

---
Visual Studio .Net 2005 / Visual Studio 8:

1. If you don't have a usertype.dat file in your "Microsoft Visual Studio 8\Common7\IDE" folder, then copy the included usertype.dat file there. If you do, append the contents of the included usertype.dat onto the end of the "Microsoft Visual Studio 8\Common7\IDE\usertype.dat"

2. Start Visual Studio 8. Select the menu "Tools->Options...". Open "Text Editor" in the tree view on the left, and click on "File Extension". Type cu in the "Extension" box, set the editor to "Microsoft Visual C++" and click "Add". Click "OK" on the dialog box.

3. Restart Visual Studio and your CUDA code should now have syntax highlighting.

For IntelliSense support
Go to C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v3.2\extras\visual_studio_integration, then read NvCudaRules.README.txt or run the .reg file there. It adds the .cu and .cuh extensions to the registry entries as follows:

Windows Registry Editor Version 5.00

[HKEY_CURRENT_USER\Software\Microsoft\VisualStudio\8.0\Languages\Language Services\C/C++]
"NCB Default C/C++ Extensions"=".cpp;.cxx;.c;.cc;.h;.hh;.hxx;.hpp;.inl;.tlh;.tli;.cu;.cuh;.cl"

[HKEY_CURRENT_USER\Software\Microsoft\VisualStudio\9.0\Languages\Language Services\C/C++]
"NCB Default C/C++ Extensions"=".cpp;.cxx;.c;.cc;.h;.hh;.hxx;.hpp;.inl;.tlh;.tli;.cu;.cuh;.cl"

Last but not least, you'll need to include the following header files in your CUDA project so that IntelliSense reads those headers.

#include "cuda.h"
#include "cuda_runtime.h"
#include "device_launch_parameters.h"

I really do think VS is frustrating and Eclipse should be better. But Nsight is only available for VS :-(

Saturday, April 2, 2011

CUDA app wizard for Visual Studio 2008

http://sourceforge.net/projects/cudavswizard/files/

There are a few errors. Set the following environment variables to get it working:
NVSDKCUDA_ROOT=C:\ProgramData\NVIDIA Corporation\NVIDIA GPU Computing SDK 3.2\C
PATH=%PATH%;C:\ProgramData\NVIDIA Corporation\NVIDIA GPU Computing SDK 3.2\C\common\lib
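
A quick way to double-check that the PATH change really took effect is to look for the directory among the ';'-separated entries. Below is a minimal sketch in plain C++; the helper names pathListContains and normalizeDir are made up for this post and are not part of the CUDA SDK or the wizard. It assumes a Windows-style PATH and compares entries case-insensitively, ignoring one trailing backslash.

```cpp
#include <algorithm>
#include <cctype>
#include <sstream>
#include <string>

// Hypothetical helper: lower-cases a copy of `dir` and strips one trailing
// backslash, so that "C:\Foo\" and "c:\foo" compare equal.
static std::string normalizeDir(std::string dir) {
    std::transform(dir.begin(), dir.end(), dir.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    if (!dir.empty() && dir.back() == '\\') dir.pop_back();
    return dir;
}

// Hypothetical helper: returns true when `dir` is one of the ';'-separated
// entries of `pathValue` (a Windows-style PATH string).
bool pathListContains(const std::string& pathValue, const std::string& dir) {
    const std::string wanted = normalizeDir(dir);
    std::istringstream entries(pathValue);
    std::string entry;
    while (std::getline(entries, entry, ';')) {
        if (normalizeDir(entry) == wanted) return true;
    }
    return false;
}
```

To use it on the real environment, pass the result of std::getenv("PATH") from <cstdlib> as pathValue (after checking that it is not a null pointer) together with the SDK's common\lib folder.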

Monday, March 21, 2011

'fxc.exe' is not recognized as an internal or external command,

If you get an error like "'fxc.exe' is not recognized as an internal or external command," when compiling a CUDA sample, it is because the DirectX utilities are not found in your PATH. Add them to PATH and rebuild the project.

Remember to install the DirectX SDK first.

You might also need to point the include and library folders to the DirectX SDK folder.

Monday, March 14, 2011

CUDA scientific computing

scientific computing

http://vergil.chemistry.gatech.edu/resources/programming/programming.pdf
http://apophenia.sourceforge.net/doc/
http://www.gnu.org/software/gsl/

http://www.usenix.org/event/wiov08/tech/full_papers/dowty/dowty_html/

Thursday, March 3, 2011

Install CUDA display driver on U45J series

For my GeForce 310M graphics card, when I tried to install the official CUDA driver, the installer told me that the device is not supported. Here is how to get it installed anyway.

1) Download the CUDA driver file from the Nvidia webpage http://www.nvidia.com
2) From the Start Menu in Windows, go to Run and type dxdiag. Once the window appears, on the bottom right-hand side there is an option to "Save All Information". Save this information to your Desktop. It will save a .txt file there.
3) Within this text file, search for the section "NVIDIA GeForce 310M". You only need a certain piece of the Device and Subsystem ID, which looks like " PCI\VEN_10DE&DEV_0A70&SUBSYS_12D21043&REV_A2\4&179FD7D4&0&0008" on my base U45J laptop. This will be important in step 5.
4) Now you will need a tool like WinRar or WinZip to extract the files from the Nvidia driver executable. Once you have WinRar or similar installed, you can right-click on the Nvidia driver executable and choose extract.
5) Within the folder extracted from the Nvidia driver executable, open the file nvam.inf, choose Find and enter "0A70.02". There are two instances to find. Once you have found the first, towards the end of the line you should see the text "PCI\VEN_10DE&". Highlight that text and everything following it, and paste the Device and Subsystem ID found earlier in step 3 over it. You should end up with something like "%NVIDIA_DEV.0A70.02% = Section077, PCI\VEN_10DE&DEV_0A70&SUBSYS_12D21043".
Make sure to do this for both instances.
6) Now you must uninstall the Nvidia display driver from the Control Panel. Make sure to uninstall only the display driver, as that is all you are updating.
7) Now you can reboot. Once your computer has restarted, go to the folder extracted from the Nvidia driver executable and simply double-click setup.exe, and your new driver will install.

Tip: Download the original NVIDIA driver from your laptop manufacturer and find the corresponding .inf file, so that you know which file to modify. In my case the driver is from ASUS and they modify nvam.inf. Inside that file you'll also find the correct entries for your device; just copy & paste them into the official NVIDIA .inf file.
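
The trimming in steps 3 and 5 (keep only the VEN/DEV/SUBSYS part, drop the revision and everything after it) can be sketched in a few lines of plain C++. This is just an illustration under my own assumptions: trimHardwareId is a made-up name, not part of any NVIDIA or Windows tool, and it assumes the string from dxdiag contains "&REV_" right after the SUBSYS field, as it does on my laptop.

```cpp
#include <string>

// Hypothetical helper: trims a device instance path as saved by dxdiag down to
// the "PCI\VEN_...&DEV_...&SUBSYS_..." part that goes into the .inf file.
std::string trimHardwareId(const std::string& instancePath) {
    // Skip the leading spaces that dxdiag puts in front of the value.
    const std::string::size_type start = instancePath.find_first_not_of(' ');
    if (start == std::string::npos) return "";

    // Everything from "&REV_" onwards (revision and instance suffix) is dropped.
    const std::string::size_type rev = instancePath.find("&REV_", start);
    if (rev == std::string::npos) return instancePath.substr(start);
    return instancePath.substr(start, rev - start);
}
```

Called with the string from step 3 (backslashes doubled inside a C++ string literal), it returns PCI\VEN_10DE&DEV_0A70&SUBSYS_12D21043, which is the part pasted into nvam.inf in step 5.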

My profile :)
c:\ProgramData\NVIDIA Corporation\NVIDIA GPU Computing SDK 3.2\C\bin\win32\Debu
\deviceQuery.exe Starting...

CUDA Device Query (Runtime API) version (CUDART static linking)

There is 1 device supporting CUDA

Device 0: "GeForce 310M"
CUDA Driver Version:                           3.20
CUDA Runtime Version:                          3.20
CUDA Capability Major/Minor version number:    1.2
Total amount of global memory:                 1034551296 bytes
Multiprocessors x Cores/MP = Cores:            2 (MP) x 8 (Cores/MP) = 16 (Cores)
Total amount of constant memory:               65536 bytes
Total amount of shared memory per block:       16384 bytes
Total number of registers available per block: 16384
Warp size:                                     32
Maximum number of threads per block:           512
Maximum sizes of each dimension of a block:    512 x 512 x 64
Maximum sizes of each dimension of a grid:     65535 x 65535 x 1
Maximum memory pitch:                          2147483647 bytes
Texture alignment:                             256 bytes
Clock rate:                                    1.47 GHz
Concurrent copy and execution:                 Yes
Run time limit on kernels:                     Yes
Integrated:                                    No
Support host page-locked memory mapping:       Yes
Compute mode:                                  Default (multiple host threads can use this device simultaneously)
Concurrent kernel execution:                   No
Device has ECC support enabled:                No
Device is using TCC driver mode:               No

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 3.20, CUDA Runtime Version = 3.20, NumDevs = 1, Device = GeForce 310M

PASSED

Press <Enter> to Quit...
-----------------------------------------------------------

Saturday, February 26, 2011

My current CUDA profile

c:\ProgramData\NVIDIA Corporation\NVIDIA GPU Computing SDK 3.2\C\bin\win32\Debug
\deviceQuery.exe Starting...

CUDA Device Query (Runtime API) version (CUDART static linking)

There is 1 device supporting CUDA

Device 0: "GeForce G210M"
CUDA Driver Version:                           3.20
CUDA Runtime Version:                          3.20
CUDA Capability Major/Minor version number:    1.2
Total amount of global memory:                 497549312 bytes
Multiprocessors x Cores/MP = Cores:            2 (MP) x 8 (Cores/MP) = 16 (Cores)
Total amount of constant memory:               65536 bytes
Total amount of shared memory per block:       16384 bytes
Total number of registers available per block: 16384
Warp size:                                     32
Maximum number of threads per block:           512
Maximum sizes of each dimension of a block:    512 x 512 x 64
Maximum sizes of each dimension of a grid:     65535 x 65535 x 1
Maximum memory pitch:                          2147483647 bytes
Texture alignment:                             256 bytes
Clock rate:                                    1.47 GHz
Concurrent copy and execution:                 Yes
Run time limit on kernels:                     Yes
Integrated:                                    No
Support host page-locked memory mapping:       Yes
Compute mode:                                  Default (multiple host threads can use this device simultaneously)
Concurrent kernel execution:                   No
Device has ECC support enabled:                No
Device is using TCC driver mode:               No

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 3.20, CUDA Runtime Version = 3.20, NumDevs = 1, Device = GeForce G210M

PASSED

Press <Enter> to Quit...
-----------------------------------------------------------

So, if I read this right, the G210M supports CUDA compute capability 1.2.
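
The numbers in the deviceQuery output can be cross-checked by hand. Here is a small sketch in plain C++ (no CUDA toolkit needed). The function names are my own, and the cores-per-multiprocessor table only covers the compute capabilities that existed around CUDA 3.2 (8 for 1.x, 32 for 2.0, 48 for 2.1); anything else returns 0 rather than a guess.

```cpp
#include <cstdint>

// Hypothetical helper: CUDA cores per multiprocessor for compute capability
// major.minor. Only 1.x and 2.x are covered; unknown values return 0.
int coresPerMultiprocessor(int major, int minor) {
    if (major == 1) return 8;                     // 1.0 through 1.3
    if (major == 2) return minor == 0 ? 32 : 48;  // 2.0 and 2.1
    return 0;
}

// Total cores = number of multiprocessors x cores per multiprocessor.
int totalCores(int multiprocessors, int major, int minor) {
    return multiprocessors * coresPerMultiprocessor(major, minor);
}

// Converts the "Total amount of global memory" figure from bytes to MiB.
double bytesToMiB(std::uint64_t bytes) {
    return static_cast<double>(bytes) / (1024.0 * 1024.0);
}
```

For the G210M above, totalCores(2, 1, 2) gives 16, matching the "2 (MP) x 8 (Cores/MP) = 16 (Cores)" line, and bytesToMiB(497549312) gives 474.5 MiB of global memory.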

First build CUDA with Visual Studio

error.>Linking...
3>oclUtils32D.lib(shrUtils.obj) : error LNK2019: unresolved external symbol "public: static void __cdecl std::_Locinfo::_Locinfo_ctor(class std::_Locinfo *,class std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> > const &)"

To resolve this problem, one needs to rebuild shrUtils32/64(D).lib and oclUtils32/64(D).lib with VS2005.
For some reason it helps.

Most likely this is due to a newer Visual Studio version used to build the pre-compiled versions of these static libs that come along with OpenCL SDK distribution.
I believe a general rule should sound like this:
"Make sure all OpenCL libs are built with the Visual Studio version you are going to use for building your OpenCL application".