CUDA is a parallel computing platform and programming model for Nvidia GPUs (Graphics Processing Units). CUDA provides an interface to Nvidia GPUs through a variety of programming languages, libraries, and APIs.


1 vote, 0 answers, 11 views

Does the ray tracing algorithm involve rasterization of the image?

I am working on a ray tracing algorithm. I know that the first step is to develop the camera and view plane specifications. Now, is the next step to perform a rasterization algorithm on the image before a BVH ...
0 votes, 1 answer, 13 views

Udacity CS344 final assignment: how to convert a color image to greyscale

I'm trying to solve the problem at the end of lesson 1 of the Udacity course, but I'm not sure if I've just made a stupid typo or if the actual code is wrong. void your_rgba_to_greyscale(const uchar4 * ...
0 votes, 0 answers, 5 views

Bivariate/trivariate cumulative normal density: GPU implementation

Is there any effective GPGPU (CUDA) implementation, commercial or otherwise, of bivariate and trivariate cumulative normal density? I guess that it should not be something too complicated ...
0 votes, 0 answers, 19 views

How does Thrust decide kernel sizes?

When using Thrust transformations to perform calculations on the device (the saxpy example), how does Thrust figure out kernel parameters such as blockSize and gridSize?
1 vote, 1 answer, 22 views

skipping incompatible libcudart.so when searching for -lcudart

When I compile a .cu file with nvcc 5.0, the compiler gives me the following message: /usr/bin/ld: skipping incompatible /usr/local/cuda-5.0/lib/libcudart.so when searching for -lcudart It seems ...
0 votes, 0 answers, 22 views

Multi-GPU memory allocation behaves differently depending on the order of allocation

I have tested this on a GTX 690 GPU with 4 GB RAM on Windows 7 x64, Visual C++ 10: I want to allocate 1.2 GB RAM on each of the two devices. If I get the RAM from the first device and then the second ...
0 votes, 0 answers, 22 views

CUDA: sum-reduction, data lost in call to device function

I am writing CUDA sum-reduction code that takes the sum of the absolute values of an array from element begin_index through end_index (I am using one block with a variable number of threads). ...
0 votes, 0 answers, 15 views

Multi-GPU performance degrades when allocated memory increases

I've tested the following on a GTX 690 GPU with 4 GB RAM on Windows 7 x64, Visual C++ 10: I've written a function that receives 2 vectors and adds them into a 3rd vector. The task is broken over 2 GPU ...
0 votes, 0 answers, 14 views

How to fix PRJ0019: a tool returned an error code from compiling with CUDA build rule

Compilation fails with error PRJ0019: a tool returned an error code from compiling with CUDA build rule... Configuration: Windows8 k, Visual Studio 2005 (C++), CUDA SDK and Toolkit version: 4.1 ...
1 vote, 1 answer, 23 views

Setting up a CUDA 2D “unsigned char” texture for linear interpolation

I have a linear array of unsigned chars representing a 2D array. I would like to place it into a CUDA 2D texture and perform (floating point) linear interpolation on it, i.e., have the texture call ...
0 votes, 0 answers, 33 views

Memory mapping error with cuBLAS

I need your help; thanks in advance. I am trying to replace the LAPACK routine dsyev with the MAGMA routine dsyevd_gpu, but I get this error from cublasAlloc: CUBLAS error: memory mapping error (11) in ...
0 votes, 0 answers, 22 views

Unable to bind 2D texture within a function, CUDA

I'm having a problem with a simple texture binding. This code works fine: texture<float, 2> textureD; __global__ void kernel(float *output, int width, int height) { int row = blockIdx.y * ...
1 vote, 1 answer, 43 views

cuModuleLoadDataEx ignores all options

This question is similar to "cuModuleLoadDataEx options", but I would like to bring the topic up again and provide more information. When loading a PTX string with the NV driver via ...
1 vote, 1 answer, 39 views

Why does a statically declared array of 4 unsigned chars produce ld.global.u8 when fetching memory?

I'm using CUDA 5.5 and I find one compiler behavior a bit weird: if I try to address a struct whose only data is 4 unsigned chars, it triggers four u8 loads. If I instead use a union and load a ...
0 votes, 2 answers, 31 views

Mirror reordering in Thrust

I'm using a Thrust vector. I'm looking for an elegant method for reordering a Thrust device vector using a "mirror" ordering (example given; I couldn't find any function for that in Thrust). For ...
