
The Situation

I have a two-GPU server (Ubuntu 12.04) in which I replaced a Tesla C1060 with a GTX 670. Then I installed CUDA 5.0 over 4.2. Afterwards I compiled all the examples except simpleMPI without errors. But when I run ./deviceQuery I get the following error message:

foo@bar-serv2:~/NVIDIA_CUDA-5.0_Samples/bin/linux/release$ ./deviceQuery
./deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

cudaGetDeviceCount returned 38
-> no CUDA-capable device is detected

What I have tried

To solve this I tried all of the things recommended by CUDA-capable device, but to no avail:

  • /dev/nvidia* is there and the permissions are 666 (crw-rw-rw-) and owner root:root

     foo@bar-serv2:/dev$ ls -l nvidia*
     crw-rw-rw- 1 root root 195,   0 Oct 24 18:51 nvidia0
     crw-rw-rw- 1 root root 195,   1 Oct 24 18:51 nvidia1
     crw-rw-rw- 1 root root 195, 255 Oct 24 18:50 nvidiactl
    
  • I tried executing the code with sudo

  • CUDA 5.0 installs driver and libraries at the same time

PS here is lspci | grep -i nvidia:

foo@bar-serv2:/dev$ lspci | grep -i nvidia
03:00.0 VGA compatible controller: NVIDIA Corporation GK104 [GeForce GTX 670] (rev a1)
03:00.1 Audio device: NVIDIA Corporation GK104 HDMI Audio Controller (rev a1)
04:00.0 VGA compatible controller: NVIDIA Corporation G94 [Quadro FX 1800] (rev a1)

[update]

foo@bar-serv2:~/NVIDIA_CUDA-5.0_Samples/bin/linux/release$  nvidia-smi -a
NVIDIA: API mismatch: the NVIDIA kernel module has version 295.59,
but this NVIDIA driver component has version 304.54.  Please make
sure that the kernel module and all NVIDIA driver components
have the same version.
Failed to initialize NVML: Unknown Error

How can that be, if I used the CUDA 5.0 installer to install the driver and libraries at the same time? Could the old 4.2 version that is still lying around be messing things up?
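To make the mismatch easier to compare against other checks, the two version numbers can be pulled out of the message. This is only a minimal sketch: the function name `mismatch_versions` is hypothetical, and it assumes the message wording is exactly as shown above.

```shell
#!/bin/sh
# Hypothetical helper: extract both versions from an nvidia-smi
# "API mismatch" message read on stdin.
# Assumes the wording matches the message quoted in the question.
mismatch_versions() {
    # The message wraps across lines, so join it into one line first.
    tr '\n' ' ' | sed -n 's/.*kernel module has version \([0-9.]*[0-9]\).*driver component has version \([0-9.]*[0-9]\).*/kernel_module=\1 driver_component=\2/p'
}

# Usage (on the affected machine):
#   nvidia-smi 2>&1 | mismatch_versions
# The version of the module that is actually loaded can be cross-checked with:
#   cat /proc/driver/nvidia/version
```

If the two numbers differ, the kernel module that is loaded was not built or installed by the same driver package as the user-space libraries.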

 
what happens if you run nvidia-smi -a ? –  Robert Crovella Oct 24 '12 at 17:17
 
@RobertCrovella, thanks for the input. I tried it, and it gives an error. BTW: where did you find out about this util? –  Framester Oct 24 '12 at 17:23
 
Please explain your reasons to flag to close. Thanks! –  Framester Oct 24 '12 at 17:24
 
There is no CUDA 5.0 package which lists Ubuntu 12.04 as supported. You can check the release notes for a list of supported OSes. Something about your system config got in the way of a successful driver upgrade. You may want to review the NVIDIA driver README, especially section 8, or search online for articles about failed driver installs on Ubuntu. –  Robert Crovella Oct 24 '12 at 17:39
Your kernel's driver module is out of date. This usually happens if the kernel sources present during driver installation do not match the running kernel, if the kernel is updated after installation of the NVIDIA driver, or if no driver is installed at all. Check that you have installed up-to-date kernel sources that match the running kernel, and run the driver installation again. –  tera Oct 25 '12 at 8:34

4 Answers

See this Stack Overflow question: Installing cuda 5 samples in Ubuntu 12.10.

  1. Ubuntu 12 is not a supported Linux distro (yet). For reference see CUDA 5.0 Toolkit Release Notes And Errata

    ** Distributions Currently Supported

    Distribution       32 64  Kernel                 GCC         GLIBC        
    -----------------  -- --  ---------------------  ----------  -------------
    Fedora 16          X  X   3.1.0-7.fc16           4.6.2       2.14.90      
    ICC Compiler 12.1     X                                                   
    OpenSUSE 12.1         X   3.1.0-1.2-desktop      4.6.2       2.14.1       
    Red Hat RHEL 6.x      X   2.6.32-131.0.15.el6    4.4.5       2.12         
    Red Hat RHEL 5.5+     X   2.6.18-238.el5         4.1.2       2.5          
    SUSE SLES 11 SP2      X   3.0.13-0.27-pae        4.3.4       2.11.3       
    SUSE SLES 11.1     X  X   2.6.32.12-0.7-pae      4.3.4       2.11.1       
    Ubuntu 11.10       X  X   3.0.0-19-generic-pae   4.6.1       2.13         
    Ubuntu 10.04       X  X   2.6.35-23-generic      4.4.5       2.12.1    
    
  2. If you want to run it on Ubuntu 12 anyway, see rpardo's answer. It looks like this distro installs 64-bit libraries to /usr/lib/x86_64-linux-gnu/ instead of /usr/lib64.

I'd suggest searching for all instances of libcuda.so and libnvidia-ml.so on the system. Since the driver doesn't support this distro, it might have installed the libraries to a path that is not listed in LD_LIBRARY_PATH. Then move the libraries and/or change LD_LIBRARY_PATH to point to their location (it should be the first path on the left). Then retry nvidia-smi or deviceQuery.
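The search could look roughly like this. It is a sketch, not a definitive procedure: `find_driver_libs` is a hypothetical helper name, and scanning from `/` can take a while on a large filesystem.

```shell
#!/bin/sh
# Hypothetical helper: list every copy of the driver libraries below a
# root directory (defaults to /). Permission errors are silenced.
find_driver_libs() {
    root="${1:-/}"
    find "$root" \( -name 'libcuda.so*' -o -name 'libnvidia-ml.so*' \) 2>/dev/null | sort
}

# Usage:
#   find_driver_libs /usr
# To see which copies the dynamic linker currently knows about:
#   ldconfig -p | grep -E 'libcuda|libnvidia-ml'
# To put one directory first for a single run (the path is an example):
#   LD_LIBRARY_PATH=/usr/lib/x86_64-linux-gnu:$LD_LIBRARY_PATH ./deviceQuery
```

If several versions of the same library show up, the stale ones are the likely source of the version mismatch that nvidia-smi reports.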

Good luck

 
Thanks for your answer. It is true that I did not pay attention to the fact that CUDA 5.0 is not supported on 12.04. The reason is that CUDA 4.2 worked just fine on 12.04. Furthermore, I did what rpardo recommended when installing the driver the first time. –  Framester Oct 25 '12 at 8:56

I came across this issue, and running

nvidia-smi

informed me of an API mismatch. The problem was that my Linux distro had installed updates that required a system restart, so restarting resolved the issue.
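One way to check whether a pending restart is the cause is to compare the version of the module that is loaded with the one installed on disk. The sketch below assumes that /proc/driver/nvidia/version contains a line of the form "NVRM version: ... Kernel Module  <version> ..." and that the module is named `nvidia` (some distro packages use a different module name); `loaded_nvidia_version` is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: print the version of the NVIDIA kernel module that
# is currently loaded, parsed from the given file
# (default: /proc/driver/nvidia/version).
# Assumes a line like "NVRM version: ... Kernel Module  295.59  ...".
loaded_nvidia_version() {
    file="${1:-/proc/driver/nvidia/version}"
    sed -n 's/^NVRM version:.*Kernel Module  *\([0-9][0-9.]*\).*/\1/p' "$file"
}

# Usage (the module name "nvidia" is an assumption):
#   loaded=$(loaded_nvidia_version)
#   ondisk=$(modinfo -F version nvidia 2>/dev/null)
#   [ "$loaded" = "$ondisk" ] || echo "loaded $loaded, installed $ondisk: reboot or reload the module"
```

When the two differ, the new module has been installed but the old one is still running, which a restart fixes.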


Try running the sample using sudo (or run 'sudo su', set LD_LIBRARY_PATH to the path of the CUDA libraries, and run the sample as root). Apparently, since you probably installed CUDA 5.0 using sudo, the samples don't run as a normal user. However, once you run a sample as root, you'll be able to run samples as a regular user too! I have not yet restarted the system to see whether samples still work as a normal user after a reboot, or whether you have to run at least one CUDA application as root each time.

The problem might disappear completely if you install the CUDA Toolkit without using sudo.

 
I recommend going back and reading the complete question again. This has nothing to do with the problem. –  talonmies Feb 2 at 11:21

In my experience, it is a problem of read/execute permissions. Try:

sudo chmod 755 /usr/local/cuda* -R
 
Please flesh out your answer. –  Joce Mar 26 at 21:42
