Tagged Questions


See the tag entry for "gpu".

votes

0answers

2 views

How can I run and test the NVENC API on Linux CentOS?

We have a server with a Kepler graphics card and the Nvidia driver already installed. How can I run NVENC (hardware video encoding) and use its SDK on Linux CentOS 6.4? How can I test whether it is ...

asked 14 mins ago

farzad
246

votes

0answers

7 views

Is there any good tutorial or reference for writing code with Magma?

I am currently trying to use Magma to do matrix operations on the GPU, but I have found little documentation for it. The only things I can refer to are its testing programs and the online generated documentation (here), ...

gpu gpu-programming

asked 2 days ago

itsuper7
11819

votes

1answer

90 views

How to solve “expected an identifier” error in CUDA

I'm having a problem with a kernel in CUDA C programming when compiling line 5. I got an "expected an identifier" error. Why is this happening? My kernel function is the following: __global__ void ...

c cuda gpu-programming

asked Jul 3 at 14:10

Leon
184

votes

1answer

24 views

How does multithreading in GPUs work?

How does a GPU handle multithreading? In CPUs, for example, there are independent copies of the register file for each thread. But with register files as large as those in GPUs, that would be impossible. So ...

multithreading gpu gpgpu gpu-programming

asked Jun 29 at 14:00

Mohammad Ewais
534

vote

1answer

24 views

How to set the right alignment for an OpenCL array of structs?

I have the following structure: C++: struct ss{ cl_float3 pos; cl_float value; cl_bool moved; cl_bool nextMoved; cl_int movePriority; cl_int nextMovePriority; cl_float ...

opencl gpu-programming memory-alignment

asked Jun 28 at 9:17

Alex
16019

votes

1answer

38 views

Find the closest weight vector to each instance in the data matrix

Suppose I have an n×m weight matrix W, where m is the number of variables and n is the number of instances. I also have a data matrix X of the same size. I am trying to find the closest weight vector to ...

matlab machine-learning bigdata gpu-programming

asked Jun 26 at 14:22

Erogol
974923
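
For the closest-weight-vector question above, a small CPU reference can be useful for checking a MATLAB or GPU implementation. The sketch below is plain Python and assumes squared Euclidean distance, which the visible excerpt does not state. The names `squared_distance` and `closest_rows` are hypothetical, and W and X are passed as lists of rows.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors (assumed metric)."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def closest_rows(W, X):
    """For each row x of X, return the index of the row of W nearest to x."""
    return [
        min(range(len(W)), key=lambda j: squared_distance(W[j], x))
        for x in X
    ]


W = [[0, 0], [10, 10]]   # two weight vectors
X = [[1, 1], [9, 8]]     # two data instances
nearest = closest_rows(W, X)
```

Here `nearest[i]` is the index of the weight vector closest to instance `i`. This brute-force version costs one distance evaluation per (instance, weight vector) pair, so it is only a correctness baseline, not a performant method.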

votes

0answers

13 views

Parallax Occlusion Mapping with Silhouettes

Does anyone know how to implement the correct silhouette effect in this YouTube video? I understand (and have successfully implemented) the parallax occlusion mapping algorithm, but I have no ...

3d shader gpu-programming

asked Jun 25 at 9:44

Ming
113

votes

1answer

35 views

Using shaders for long computations without causing lag

I am trying to use a compute shader with DirectX 11 to do some simple but expensive calculations (think Mandelbrot set). The results of the calculation are placed on a texture and are ...

wpf directx gpu-programming sharpdx compute-shader

asked Jun 25 at 3:04

Advecticity
31

votes

1answer

53 views

CUDA: When can memory coalescing be achieved?

I have trouble understanding this concept. I've researched a lot online and the only thing I understood is that threads need to access consecutive data. So if we have an array of 10000 integers, if ...

memory cuda gpu-programming coalescing

asked Jun 24 at 21:30

ksm001
656129
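
The "consecutive data" idea in the coalescing question above can be illustrated with a simplified host-side model. This Python sketch is not CUDA code and not a description of any specific GPU: it assumes a 32-thread warp, 4-byte integers, and 128-byte memory segments, and it counts how many distinct segments one warp touches when thread t reads element `base + t * stride`. The function name `segments_touched` is hypothetical.

```python
WARP_SIZE = 32        # threads per warp (assumption of this model)
SEGMENT_BYTES = 128   # assumed size of one memory transaction segment
ELEM_BYTES = 4        # size of one int


def segments_touched(stride, base=0):
    """Count distinct segments hit when thread t reads element base + t * stride."""
    addresses = [(base + t * stride) * ELEM_BYTES for t in range(WARP_SIZE)]
    return len({addr // SEGMENT_BYTES for addr in addresses})


contiguous = segments_touched(1)         # thread t reads data[t]
strided = segments_touched(32)           # thread t reads data[32 * t]
misaligned = segments_touched(1, base=1) # contiguous, but shifted by one element
```

Under these assumptions, contiguous aligned access falls into a single segment, a stride of 32 elements puts every thread in its own segment, and a one-element shift makes the warp straddle two segments. Fewer segments per warp is the property that coalescing refers to.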

votes

0answers

5 views

Running AMD GPU Assembly

I am trying to run AMD GPU Assembly on my PC. I am using Ubuntu 12.04 64-bit and Windows 7 Ultimate. I am using a 6XX GPU. Please tell me how to run it. Links to good resources would also be helpful. If you can ...

gpu amd gpu-programming isa

asked Jun 24 at 9:56

Fr34K
142111

votes

1answer

41 views

CUDA: How does Thrust manage memory when using a Comparator in a sorting function?

I have a char array of 10 characters that I would like to pass as an argument to a comparator which will be used by Thrust's sorting function. In order to allocate memory for this array I use ...

sorting cuda gpu thrust gpu-programming

asked Jun 23 at 11:16

ksm001
656129

-2

votes

0answers

33 views

CUDA: tridiagonalization algorithm giving wrong results after a few iterations; see anything wrong? [closed]

I am trying to parallelize the tridiagonalization of a matrix from Numerical Recipes in C and comparing the answers (and eventually the computation speed) of different matrix sizes. I have run into a ...

cuda parallel-processing gpu-programming

asked Jun 13 at 22:37

user2407853
22

votes

0answers

47 views

CUDA: sum-reduction — data lost in call to device function

I am writing CUDA sum-reduction code that takes the sum of the absolute values of an array from element begin_index through end_index (I am using one block with a variable number of threads). ...

cuda parallel-processing gpu-programming

asked Jun 13 at 4:14

user2407853
22
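
For the sum-reduction question above, a CPU model of the usual tree reduction can serve as a reference when debugging a kernel. The Python sketch below is not the asker's code: it assumes the range is inclusive on both ends, pads the working buffer with zeros to a power of two, and uses the hypothetical name `abs_sum_tree`. The inner loop plays the role of the threads in one block, and the list `partial` stands in for shared memory.

```python
def abs_sum_tree(values, begin_index, end_index):
    """Sum |values[i]| for begin_index <= i <= end_index using a tree reduction."""
    # Each "thread" loads the absolute value of one element.
    partial = [abs(v) for v in values[begin_index:end_index + 1]]

    # Pad with zeros so the buffer length is a power of two.
    n = 1
    while n < len(partial):
        n *= 2
    partial += [0] * (n - len(partial))

    # Halve the number of active "threads" at every step.
    stride = n // 2
    while stride > 0:
        for tid in range(stride):
            partial[tid] += partial[tid + stride]
        stride //= 2
    return partial[0]


data = [3, -1, 4, -1, 5, -9, 2, -6]
total = abs_sum_tree(data, 2, 6)   # |4| + |-1| + |5| + |-9| + |2|
```

In a real kernel every pass of the outer loop would need a barrier (such as `__syncthreads()`) between steps; the sequential loop here makes that ordering implicit.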

votes

1answer

52 views

How to choose a non-busy CUDA device?

I'm working on a cluster with a lot of nodes, and each node has two GPUs. In the cluster, I can't launch "nvidia-smi" to check which device is busy. My code selects the best device (with ...

cuda cluster-computing gpgpu gpu-programming hpc

asked Jun 9 at 23:10

Pablo
52213

vote

1answer

52 views

CUDA: Allocate memory for auxiliary data to the shared memory of each block efficiently

Suppose that we have an array int * data, and each thread will access one element of this array. Since this array will be shared among all threads, it will be stored in global memory. Let's create ...

cuda shared-memory gpu-programming

asked May 25 at 23:36

ksm001
656129


284 questions tagged gpu-programming

Related Tags

cuda × 148
gpu × 108
gpgpu × 82
opencl × 65
c++ × 27
nvidia × 22
parallel-processing × 18
c × 16
matlab × 12
opengl × 11
nsight × 7
c# × 6
image-processing × 5
opencv × 5
debugging × 4
thrust × 4
textures × 4
osx × 4
algorithm × 4
multithreading × 4
c++-amp × 4
texture × 3
f# × 3
.net × 3
performance × 3