How is CUDA memory managed?

The device memory available to your code at runtime is, roughly:

Free memory = total memory
              – display driver reservations
              – CUDA driver reservations
              – CUDA context static allocations (local memory, constant memory, device code)
              – CUDA context runtime heap (in-kernel allocations, recursive call stack, printf buffer; only on Fermi and newer … Read more
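As a concrete illustration (not part of the quoted answer), the remaining pool can be inspected at runtime with cudaMemGetInfo, which reports free and total device memory after the reservations above have been taken out; a minimal sketch:

```cpp
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    size_t free_bytes = 0, total_bytes = 0;
    // Reports device memory after driver/context reservations are subtracted.
    cudaError_t err = cudaMemGetInfo(&free_bytes, &total_bytes);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaMemGetInfo failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    printf("Free: %zu MiB of %zu MiB total\n",
           free_bytes >> 20, total_bytes >> 20);
    return 0;
}
```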

GPU Emulator for CUDA programming without the hardware [closed]

For those seeking an answer in 2016 (and even 2017) … Disclaimer: I failed to emulate a GPU in the end. It might be possible to use gpuocelot if you can satisfy its list of dependencies. I tried to get an emulator working on BunsenLabs (Linux 3.16.0-4-686-pae #1 SMP Debian 3.16.7-ckt20-1+deb8u4 (2016-02-29) i686 GNU/Linux). I’ll tell you … Read more

Thrust inside user written kernels

As it was originally written, Thrust is purely a host-side abstraction; it cannot be used inside kernels. You can, however, pass the device memory encapsulated inside a thrust::device_vector to your own kernel like this:

thrust::device_vector<Foo> fooVector;
// Do something thrust-y with fooVector
Foo* fooArray = thrust::raw_pointer_cast(fooVector.data());
// Pass raw array and … Read more
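A complete, compilable sketch along the lines the excerpt begins (the Foo element type is replaced by float, and the scale kernel is hypothetical):

```cpp
#include <thrust/device_vector.h>
#include <thrust/sequence.h>

// Hypothetical kernel operating on the raw pointer unwrapped from Thrust.
__global__ void scale(float* data, int n, float factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

int main() {
    const int n = 1024;
    thrust::device_vector<float> fooVector(n);
    thrust::sequence(fooVector.begin(), fooVector.end()); // host-side Thrust call
    float* fooArray = thrust::raw_pointer_cast(fooVector.data());
    scale<<<(n + 255) / 256, 256>>>(fooArray, n, 2.0f);   // pass raw array to kernel
    cudaDeviceSynchronize();
    return 0;
}
```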

CUDA compute capability requirements

CUDA VERSION      Min CC   Deprecated CC   Default CC   Max CC
5.5 (and prior)   1.0      N/A             1.0
6.0               1.0      1.0             1.0
6.5               1.1      1.x             2.0
7.x               2.0      N/A             2.0
8.0               2.0      2.x             2.0          6.2
9.x               3.0      N/A             3.0          7.0
10.x              3.0      N/A             3.0          7.5 (3.0 deprecated in 10.2)
11.x              3.5      3.x, 5.0        5.2          8.6 (11.0: 8.0, 11.1: 8.6)

(CUDA … Read more
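To find which row of the table applies to a given GPU, the device's compute capability can be queried through the runtime API; a minimal sketch (not part of the quoted answer):

```cpp
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    cudaDeviceProp prop;
    // Query device 0; prop.major and prop.minor form the compute capability.
    if (cudaGetDeviceProperties(&prop, 0) != cudaSuccess) return 1;
    printf("Compute capability: %d.%d\n", prop.major, prop.minor);
    return 0;
}
```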

Different CUDA versions shown by nvcc and NVIDIA-smi

CUDA has two primary APIs: the runtime API and the driver API. Each has its own version (e.g. 8.0, 9.0, etc.). The necessary support for the driver API (e.g. libcuda.so on Linux) is installed by the GPU driver installer. The necessary support for the runtime API (e.g. libcudart.so on Linux, and also nvcc) is installed by … Read more
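Both versions can be read programmatically, which makes the distinction concrete; a minimal sketch (not part of the quoted answer) using cudaDriverGetVersion and cudaRuntimeGetVersion:

```cpp
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int driver_ver = 0, runtime_ver = 0;
    cudaDriverGetVersion(&driver_ver);    // highest CUDA version the installed driver supports
    cudaRuntimeGetVersion(&runtime_ver);  // version of the linked libcudart
    // Versions are encoded as 1000*major + 10*minor (e.g. 9000 = CUDA 9.0).
    printf("Driver API:  %d.%d\n", driver_ver / 1000, (driver_ver % 100) / 10);
    printf("Runtime API: %d.%d\n", runtime_ver / 1000, (runtime_ver % 100) / 10);
    return 0;
}
```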

Unspecified launch failure on Memcpy

When I compile and run your code, I get “an illegal memory access was encountered-3” printed out. You may indeed be getting “unspecified launch failure” instead; the exact error reporting depends on CUDA version, GPU, and platform. But we can proceed regardless: either message indicates that the kernel launched but encountered an error, … Read more
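To see why the message varies, note that a kernel fault is reported asynchronously: the launch itself can succeed while the error only surfaces at the next synchronizing call. A minimal sketch (the faulty kernel is hypothetical, written only to provoke the error):

```cpp
#include <cstdio>
#include <cuda_runtime.h>

// Deliberately dereferences a null device pointer to trigger a fault.
__global__ void faulty(int* p) { *p = 42; }

int main() {
    faulty<<<1, 1>>>(nullptr);
    cudaError_t launch_err = cudaGetLastError();      // launch-configuration errors
    cudaError_t exec_err   = cudaDeviceSynchronize(); // asynchronous execution errors
    printf("launch: %s\n", cudaGetErrorString(launch_err));
    printf("exec:   %s\n", cudaGetErrorString(exec_err)); // e.g. illegal memory access
    return 0;
}
```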

What is the canonical way to check for errors using the CUDA runtime API?

Probably the best way to check for errors in runtime API code is to define an assert-style handler function and wrapper macro, like this:

#define gpuErrchk(ans) { gpuAssert((ans), __FILE__, __LINE__); }
inline void gpuAssert(cudaError_t code, const char *file, int line, bool abort=true)
{
    if (code != cudaSuccess)
    {
        fprintf(stderr, "GPUassert: %s %s %d\n", cudaGetErrorString(code), file, … Read more
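The excerpt cuts off mid-definition; a complete, compilable version along the same lines:

```cpp
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

#define gpuErrchk(ans) { gpuAssert((ans), __FILE__, __LINE__); }
inline void gpuAssert(cudaError_t code, const char* file, int line, bool abort = true) {
    if (code != cudaSuccess) {
        // Print the error string plus the file and line of the failing call.
        fprintf(stderr, "GPUassert: %s %s %d\n", cudaGetErrorString(code), file, line);
        if (abort) exit(code);
    }
}

// Usage: wrap every runtime API call, e.g.
//   gpuErrchk(cudaMalloc(&d_ptr, nbytes));
//   gpuErrchk(cudaMemcpy(d_ptr, h_ptr, nbytes, cudaMemcpyHostToDevice));
```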