__global__ void cuda_hello(){
    printf("Hello World from GPU!\n");
}

int main() {
    cuda_hello<<<1,1>>>();
    return 0;
}

I set up a Makefile like this:

#NOPTS = -allow-unsupported-compiler -Wno-deprecated-gpu-targets
NOPTS = -Wno-deprecated-gpu-targets

NVCC = /usr/local/cuda-12.9/bin/nvcc $(NOPTS)

all: hello

hello: hello.cu
	$(NVCC) hello.cu -o hello

I set up a bash script that sets a bunch of environment variables, then invokes "make" (I call it "cuda_make") and it looks like the following. (I do away with this, see below).

#!/bin/bash

# You could put all this into some bash script ...
export CUDAHOSTCXX=/usr/bin/g++-14
export CC=/usr/bin/gcc-14
export CXX=/usr/bin/g++-14
export NVCC_CCBIN=/usr/bin/g++-14

# And perhaps this also.
export LD_LIBRARY_PATH=/usr/local/cuda-12.9/targets/x86_64-linux/lib:$LD_LIBRARY_PATH
export CPATH=/usr/local/cuda-12.9/targets/x86_64-linux/include:$CPATH
export PATH=/usr/local/cuda-12.9/bin:$PATH

make

So now I type "cuda_make" and I get:

/usr/local/cuda-12.9/bin/nvcc -Wno-deprecated-gpu-targets hello.cu -o hello
/usr/include/bits/mathcalls.h(79): error: exception specification is incompatible with that of previous function "cospi" (declared at line 2601 of /usr/local/cuda-12.9/bin/../targets/x86_64-linux/include/crt/math_functions.h)
  extern double cospi (double __x) noexcept (true); extern double __cospi (double __x) noexcept (true);
hello.cu(2): error: identifier "printf" is undefined
  printf("Hello World from GPU!\n");
  ^
5 errors detected in the compilation of "hello.cu".
make: *** [Makefile:24: hello] Error 2

There are several errors involving "math_functions.h". These are mentioned in the discussion of setting things up for Fedora 42. I just show one of them.
The declarations in math_functions.h apparently duplicate things in /usr/include/bits/mathcalls.h (which is a regular part of Linux) but in an "old" way, without the exception specification. The fix is something like this. Replace the first line with the second:

extern __DEVICE_FUNCTIONS_DECL__ __device_builtin__ double sinpi(double x);
extern __DEVICE_FUNCTIONS_DECL__ __device_builtin__ double sinpi(double x) noexcept (true);

The trick is appending the "noexcept (true)" business. The author of that discussion calls this an "ugly hack" because the game is to edit the following file and add this to sinpi, sinpif, cospi, and cospif.

cd /usr/local/cuda-12.9/targets/x86_64-linux/include/crt
vi math_functions.h

After this, we are left with the complaint about printf. The thing to do is the usual:

#include <stdio.h>

Add this to the start of the file and we get no more compile errors.
We get the file "hello" and running it does nothing. No output. Just like the tutorial said. So what is the point? It did help us to grind our way through compiling something with nvcc. Nonetheless, a printf() that doesn't produce output is pretty disappointing.
As an experiment, I changed my file to this, compiled it, and ran it. No output. And it returns to the command line in about a second. It should be in an infinite loop. There are clearly strange things going on.

#include <stdio.h>

__global__ void cuda_hello(){
    for ( ;; )
        printf("Hello World from GPU!\n");
}

int main() {
    cuda_hello<<<1,1>>>();
    return 0;
}

We try something else (nothing like experimenting).

#include <stdio.h>

__global__ void cuda_hello(){
    for ( ;; )
        printf("Hello World from GPU!\n");
}

int main() {
    cuda_hello<<<1,1>>>();
    printf ( "Goodbye\n" );
    return 0;
}

Here I see the "Goodbye" message after about 1 second. I do a bit of reading and learn that the __global__ marker indicates code that will run in the CUDA world. The main() function in the above is just good old C code running on the x86.

#include <stdio.h>

__global__ void cuda_hello(){
    int i;

    for ( i=0 ; i<4; i++ )
        printf("Hello World %d from GPU!\n", i+1 );
}

int main() {
    cuda_hello<<<1,1>>>();
    cudaDeviceSynchronize();
    printf ( "Goodbye\n" );
    return 0;
}

And now I get the following, as expected:

Hello World 1 from GPU!
Hello World 2 from GPU!
Hello World 3 from GPU!
Hello World 4 from GPU!
Goodbye

The necessary ingredient was the call to cudaDeviceSynchronize(); -- what was happening without it was that the code running in main() simply exited before the CUDA code could run, or something of that sort.
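
That guess fits how the CUDA runtime is documented to behave: a kernel launch is asynchronous, so main() carries on (and can exit) without waiting for the GPU, and output from a device-side printf is only flushed at synchronization points such as cudaDeviceSynchronize(). Here is a minimal sketch of the same program with error checking added, so that a failed launch would say something rather than silently print nothing. The check() helper is hypothetical (a made-up name, not part of CUDA); cudaGetLastError(), cudaDeviceSynchronize() and cudaGetErrorString() are standard runtime calls.

```cuda
#include <stdio.h>
#include <stdlib.h>
#include <cuda_runtime.h>

// Hypothetical helper (not part of CUDA): print a message and quit
// if a runtime call did not return cudaSuccess.
static void check(cudaError_t err, const char *what)
{
    if (err != cudaSuccess) {
        fprintf(stderr, "%s failed: %s\n", what, cudaGetErrorString(err));
        exit(1);
    }
}

// Runs on the GPU.
__global__ void cuda_hello()
{
    for (int i = 0; i < 4; i++)
        printf("Hello World %d from GPU!\n", i + 1);
}

// Runs on the host (the x86).
int main()
{
    // The launch only queues the kernel and returns right away.
    cuda_hello<<<1,1>>>();

    // Errors in the launch itself are reported here.
    check(cudaGetLastError(), "kernel launch");

    // Wait for the kernel to finish; this is also where the
    // device-side printf buffer gets flushed to stdout.
    check(cudaDeviceSynchronize(), "cudaDeviceSynchronize");

    printf("Goodbye\n");
    return 0;
}
```

This should build with the same nvcc command line as hello.cu. If the launch or the kernel fails for some reason, this version should print a message from check() instead of quietly exiting.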

# Build a cuda hello world
# Tom Trebisky  6-19-2025
# Hot! these days. over 110 yesterday.

# We could do this:
## .EXPORT_ALL_VARIABLES:

export CUDAHOSTCXX=/usr/bin/g++-14
export CC=/usr/bin/gcc-14
export CXX=/usr/bin/g++-14
export NVCC_CCBIN=/usr/bin/g++-14

# These won't work (and don't seem to be needed).
# They fail because of how the previous values are referenced.
# In lieu of CPATH we could use a -I line on gcc (but not nvcc)
#export LD_LIBRARY_PATH=/usr/local/cuda-12.9/targets/x86_64-linux/lib:$LD_LIBRARY_PATH
##export CPATH=/usr/local/cuda-12.9/targets/x86_64-linux/include:$CPATH
#export PATH=/usr/local/cuda-12.9/bin:$PATH

#NOPTS = -allow-unsupported-compiler -Wno-deprecated-gpu-targets
NOPTS = -Wno-deprecated-gpu-targets

# This nice options doesn't seem to work for us
#NVCC = /usr/local/cuda-12.9/bin/nvcc --std=c++14 $(NOPTS)
NVCC = /usr/local/cuda-12.9/bin/nvcc $(NOPTS)

all: hello

hello: hello.cu
	$(NVCC) hello.cu -o hello

clean:
	rm -f hello

# THE END

Tom's Computer Info / tom@mmto.org