How to set up an nVidia GPU system
James Zou
10/18/2010
1. Install Linux_64_redhat_5.5.
2. Uncheck “Virtualization \ Xen” during the RHEL5 installation.
I accidentally checked “Virtualization \ Xen” during the RHEL5 installation, and got this message:
The kernel you are installing for is a Xen kernel!
The NVIDIA driver does not currently work on Xen kernels. If
you are using a stock distribution kernel, please install
a variant of this kernel without Xen support; if this is a
custom kernel, please install a standard Linux kernel. Then
try installing the NVIDIA kernel module again.
I had to remove the Xen kernel and install the standard kernel:
(1) cat /etc/yum.repos.d/rhel-base.repo
I just want to make sure that yum points at my PXE server.
(2) yum -y install kernel
Install the standard kernel.
(3) vi /boot/grub/grub.conf
Make sure that the system default boots from standard kernel.
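The check the installer performs can be scripted before you even run it; a minimal sketch, assuming only that RHEL5 tags its Xen kernels with an "xen" suffix in the release string (e.g. 2.6.18-194.el5xen):

```shell
#!/bin/sh
# Report whether a kernel release string names a Xen kernel.
# RHEL5 appends "xen" to the release of its Xen kernels.
is_xen_kernel() {
    case "$1" in
        *xen*) return 0 ;;   # Xen kernel: the NVIDIA driver will not build
        *)     return 1 ;;
    esac
}

# Demonstrated on sample release strings; on a live box use "$(uname -r)".
for rel in 2.6.18-194.el5xen 2.6.18-194.el5; do
    if is_xen_kernel "$rel"; then
        echo "$rel: Xen kernel - install the standard kernel first"
    else
        echo "$rel: standard kernel - OK for the NVIDIA driver"
    fi
done
```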
3. After updating to the standard kernel, I installed the nVidia GPU driver. The installer still complained that X11 was running, so I used the following steps to stop X11 from starting at boot.
(1) cd /etc
(2) vi inittab
(3) Change the default runlevel from 5 to 3 (the id:5:initdefault: line becomes id:3:initdefault:).
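The same runlevel edit can be done non-interactively with sed; a sketch, demonstrated on the line itself rather than the real file:

```shell
#!/bin/sh
# Change the default runlevel from 5 (graphical) to 3 (text) so X11 does
# not start at boot. On the live system this would be:
#   sed -i 's/^id:5:initdefault:/id:3:initdefault:/' /etc/inittab
line="id:5:initdefault:"
new_line=$(echo "$line" | sed 's/^id:5:initdefault:/id:3:initdefault:/')
echo "$new_line"    # id:3:initdefault:
```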
4. I tried to install the nVidia GPU driver again. The installer complained that the kernel source was not installed. To install the kernel headers, I typed:
yum -y install kernel-devel
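The gotcha here is that the headers must match the *running* kernel, or the driver's kernel module still will not build. A sketch of the check (the rpm query format is an assumption; demonstrated on sample version strings rather than a live box):

```shell
#!/bin/sh
# The NVIDIA installer needs kernel-devel headers matching the running
# kernel. On a live RHEL5 box the two sides would come from:
#   running=$(uname -r)
#   devel=$(rpm -q --qf '%{VERSION}-%{RELEASE}\n' kernel-devel)
headers_match() {
    [ "$1" = "$2" ]
}

if headers_match "2.6.18-194.el5" "2.6.18-194.el5"; then
    echo "kernel-devel matches the running kernel"
fi
```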
5. Install the nVidia GPU driver.
sh /root/CUDA/turnoff_srv.sh
sh devdriver_3.2_linux_64_260.24.run
sync;reboot
6. Download the Intel compilers (for building openmpi).
7. Install Intel compiler.
tar xf Intel_SW-06012010.tar
cd Intel_SW
ls
./inst.sh
8. Download openmpi.
9. Install openmpi.
tar jxf openmpi-1.4.2.tar.bz2
cd openmpi-1.4.2
ls
. /root/CUDA/set_env
./configure --prefix=/opt/openmpi-icc --enable-static --disable-shared CC=icc CXX=icpc FC=ifort F77=ifort && make && make install clean
10. Install the nVidia toolkit.
tar xf CUDA-07122010.tar
cd CUDA
ls
cd CUDA-3.1
ls
sh cudatoolkit_3.1_linux_64_rhel5.4.run
ls
sh gpucomputingsdk_3.1_linux.run
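The set_env script sourced throughout these notes is not reproduced here. For a CUDA 3.x toolkit installed under /usr/local/cuda it presumably looks something like this; a hypothetical reconstruction, not the actual file:

```shell
#!/bin/sh
# Hypothetical reconstruction of /root/CUDA/set_env: put the CUDA
# toolchain (nvcc etc.) on PATH and its 64-bit libraries on the
# runtime loader path.
export PATH=/usr/local/cuda/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
```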
11. Run nVidia bandwidth test.
cd NVIDIA_GPU_Computing_SDK/C/
mv src projects
ls
mkdir src
mv projects/bandwidthTest/ src
mv projects/deviceQuery src
make
screen -r
ls
cd bin
ls
cd linux
ls
cd release
ls
. ~/CUDA/set_env
./deviceQuery
./bandwidthTest --device=0 --memory=pinned
numactl --physcpubind=0-3 ./bandwidthTest --device=0 --memory=pinned
numactl --physcpubind=4-7 ./bandwidthTest --device=1 --memory=pinned
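The numactl binding pins each test to the CPU cores local to the GPU under test, so host-device transfers do not cross the inter-socket link. The command lines above can be generated per device; a sketch, assuming this system's layout of GPU 0 near cores 0-3 and GPU 1 near cores 4-7 (check lspci and numactl --hardware on another box):

```shell
#!/bin/sh
# Build the numactl-pinned bandwidthTest command line for a given GPU
# device number and the CPU core range local to it.
bw_cmd() {
    dev=$1
    cores=$2
    echo "numactl --physcpubind=$cores ./bandwidthTest --device=$dev --memory=pinned"
}

bw_cmd 0 0-3
bw_cmd 1 4-7
```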
12. Run nVidia linpack test.
tar zxf hpl-2.0_FERMI_v07.tgz -C /opt
cd /opt/hpl-2.0_FERMI_v07/
ls
cd src
ls
cd cuda
ls
make
ls
free
cd ../../bin/CUDA_pinned/
ls
vi run_linpack (change the directory inside it to /opt/…)
ldd xhpl
mpirun -np 1 numactl --cpunodebind=0 ./run_linpack : -np 1 numactl --cpunodebind=1 ./run_linpack
pwd
ls
mpirun -np 1 numactl --cpunodebind=0 ./run_linpack : -np 1 numactl --cpunodebind=1 ./run_linpack
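The colon in that mpirun line separates application contexts (an MPMD launch), which is what lets each rank carry its own numactl binding: rank 0 runs on NUMA node 0, rank 1 on NUMA node 1, one GPU per node. A sketch that builds the launch line for any number of NUMA nodes (generates the command string only, so it can be inspected before running):

```shell
#!/bin/sh
# Build an mpirun MPMD launch line with one run_linpack rank per NUMA
# node, each pinned to its node with numactl --cpunodebind.
hpl_cmd() {
    nodes=$1
    cmd="mpirun"
    n=0
    while [ "$n" -lt "$nodes" ]; do
        [ "$n" -gt 0 ] && cmd="$cmd :"
        cmd="$cmd -np 1 numactl --cpunodebind=$n ./run_linpack"
        n=$((n + 1))
    done
    echo "$cmd"
}

hpl_cmd 2
```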
13. Batch file for running bandwidth test.
#!/bin/bash
cd NVIDIA_GPU_Computing_SDK/C/bin/linux/release/
LOG=/root/$(basename "$0").log
while true
do
date >> "$LOG"
numactl --cpunodebind=0 ./bandwidthTest --memory=pinned --device=0 --noprompt 2>&1 | tee -a "$LOG"
numactl --cpunodebind=1 ./bandwidthTest --memory=pinned --device=1 --noprompt 2>&1 | tee -a "$LOG"
done
14. Batch file for running linpack test.
#!/bin/bash
cd /opt/hpl-2.0_FERMI_v07/bin/CUDA_pinned
LOG=/root/$(basename "$0").log
while true
do
date >> "$LOG"
mpirun -np 1 numactl --cpunodebind=0 ./run_linpack : -np 1 numactl --cpunodebind=1 ./run_linpack 2>&1 | tee -a "$LOG"
sleep 10s
done
15. History of successful installation (verbatim shell history).
1 ls
2 pwd
3 cd ..
4 ls
5 cd home
6 ls
7 cd ..
8 cd usr
9 ls
10 ifconfig
11 ifconfig eth0
12 sftp xxx.xxx.xxx.xxx
13 ls
14 pwd
15 cp CUDA-07122010.tar /root
16 cp Intel_SW-06012010.tar /root
17 cd /root
18 ls
19 sftp xxx.xxx.xxx.xxx
20 ls
21 mkdir zou
22 cd zou
23 ls
24 tftp xxx.xxx.xxx.xxx
25 stfp xxx.xxx.xxx.xxx
26 sftp xxx.xxx.xxx.xxx
27 cd ..
28 ls
29 tar xf CUDA-07122010.tar
30 ls
31 cd CUDA
32 ls
33 cd HPL-Fermi
34 ls
35 cd cuda
36 ls
37 cd ..
38 ls
39 cd ..
40 ls
41 cd ..
42 ls
43 tar xf Intel_SW-06012010.tar
44 ls
45 cd Intel_SW
46 ls
47 cd ..
48 ls
49 tar zxf hpl-2.0_FERMI_v07.tgz -C /opt
50 ls
51 cd /opt
52 ls
53 cd hpl-2.0_FERMI_v07/
54 ls
55 cd ..
56 ls
57 cd ..
58 ls
59 cd /root
60 ls
61 sync
62 man sync
63 ls
64 tar zxf openmpi-1.4.2.tar.bz2
65 ls
66 cd CUDA
67 ls
68 sh turnoff_srv.sh
69 reboot
70 ls
71 cd CUDA
72 ls
73 vi turnoff_srv.sh
74 ls
75 cd ..
76 ls
77 mount /dev/sdb1
78 mount /dev/sdb1 /mnt
79 cd /mnt
80 ls
81 cp devdriver_3.2_linux_64_260.24.run /root
82 cd /root
83 ls
84 sh devdriver_3.2_linux_64_260.24.run
85 init 3
86 sh devdriver_3.2_linux_64_260.24.run
87 cd /mnt
88 ls
89 cp devdriver_3.2_linux_64_260.24.run /root
90 cd /root
91 ls
92 sh devdriver_3.2_linux_64_260.24.run
93 chmod 777 devdriver_3.2_linux_64_260.24.run
94 sh devdriver_3.2_linux_64_260.24.run
95 sftp xxx.xxx.xxx.xxx
96 ls
97 ls -l
98 sh devdriver_3.2_linux_64_260.24.run
99 ls
100 sh devdriver_3.2_linux_64_260.24.run
101 lsmod
102 lsmod >> xxx
103 vi xxx
104 man yum
105 yum list
106 yum list | less
107 yum update kernel
108 /var
109 ls
110 cd /var
111 ls
112 cd log
113 ls
114 vi nvidia-installer.log
115 ls
116 yum
117 ssh
118 ls
119 pwd
120 cd ..
121 ls
122 ls
123 pwd
124 cd ..
125 ls
126 ls
127 cd root
128 ls
129 cd /mnt
130 ls
131 cd /root
132 ls
133 cp cudatoolkit_3.2.9_linux_64_rhel5.5.run /mnt
134 cp devdriver_3.2_linux_64_260.24.run /mnt
135 uname -a
136 cat /etc/yum.repos.d/rhel-base.repo
137 yum -y install kernel
138 ifconfig
139 ls
140 ls
141 vi /boot/grub/grub.conf
142 reboot
143 ls
144 uname -a
145 cat /etc/yum.repos.d/rhel-base.repo
146 yum list
147 yum list | less
148 ls
149 sh devdriver_3.2_linux_64_260.24.run
150 ls
151 vi /boot/grub/grub.conf
152 ls
153 cd /boot
154 ls
155 cd grub
156 ls
157 vi grub.conf
158 ls
159 cd ..
160 ls
161 pwd
162 cd ..
163 dir
164 ls
165 cd bin
166 ls
167 ls -l
168 ls -l | less
169 ls
170 cd /etc
171 ls
172 vi inittab
173 reboot
174 ls
175 sh devdriver_3.2_linux_64_260.24.run
176 man yum
177 yum -y install kernel-devel
178 reboot
179 ls
180 sh devdriver_3.2_linux_64_260.24.run
181 ls
182 cd Intel_SW
183 ls
184 ./inst.sh
185 ls
186 cd Intel_SW
187 ls
188 vi inst.sh
189 ls
190 cd /bin
191 ls
192 ls *.lic
193 ls
194 cd ..
195 cd root
196 ls
197 ps
198 man ps
199 ps -aux
200 ps -aux | less
201 kill 60392
202 kill 17616
203 reboot
204 ls
205 sftp xxx.xxx.xxx.xxx
206 ls
207 cd Intel_SW
208 ls
209 vi inst.sh
210 vi inst.sh
211 root
212 123456
213 ls
214 cd /dev
215 ls
216 ls | less
217 cd null
218 vi null
219 ls
220 ls -l null
221 ls
222 cd ..
223 ls
224 cd /root
225 ls
226 vi install.log
227 ls
228 cd Intel_SW
229 ls
230 vi install.log
231 ls
232 ls
233 cd /opt
234 ls
235 cd intel
236 ls
237 cd licenses
238 ls
239 cd /root
240 ls
241 cd Intel_SW
242 ls
243 ./inst.sh
244 ls
245 cd ..
246 ls
247 tar xf Intel_SW-06012010.tar
248 reboot
249 ls
250 tar xf Intel_SW-06012010.tar
251 rm -r Intel_SW
252 man rm
253 rm -r -f Intel_SW
254 ls
255 cd ..
256 ls
257 cd opt
258 ls
259 rm -r -f intel
260 ls
261 tar xf Intel_SW-06012010.tar
262 ls
263 cd Intel_SW
264 ls
265 ./inst.sh
266 ls
267 cd /opt
268 ls
269 cd ..
270 ls
271 cd root
272 ls
273 cd CUDA
274 ls
275 . set_env
276 cd ..
277 ls
278 ls
279 tar jxf CUDA/openmpi-1.4.2.tar.bz2
280 cd openmpi-1.4.2
281 ls
282 cd
283 pwd
284 screen
285 sync;reboot
286 ls
287 sh cudatoolkit_3.2.9_linux_64_rhel5.5.run
288 ls
289 cd zou
290 ls
291 sh cudatoolkit_3.2.9_linux_64_rhel5.5.run
292 cd ..
293 sh cudatoolkit_3.2.9_linux_64_rhel5.5.run
294 ls
295 mkdir NVIDIA_GPU_Computing_SDK
296 sh gpucomputingsdk_3.2_linux.run
297 ls
298 sh cudatoolkit_3.2.9_linux_64_rhel5.5.run
299 ls
300 cd /usr
301 ls
302 cd local
303 ls
304 cd cuda
305 ls
306 cd /root
307 ls
308 ls
309 cd NVIDIA_GPU_Computing_SDK/
310 ls
311 cd C
312 ls
313 cd bin
314 ls
315 ls
316 cd bin/linux/release/
317 ls
318 cd ..
319 ls
320 cd ..
321 ls
322 cd ..
323 ls
324 cd CUDA
325 ls
326 . set_env
327 env
328 ls
329 ./deviceQuery
330 cd ..
331 ls
332 cd NVIDIA_GPU_Computing_SDK/
333 ls
334 cd C
335 ls
336 cd bin
337 ls
338 cd ..
339 cd ..
340 cd ..
341 ls
342 ./bandwidthTest
343 reboot
344 ls
345 cd /opt
346 ls
347 cd openmpi-icc/
348 ls
349 cd bin
350 ls
351 pwd
352 pwd
353 cd ..
354 ls
355 cd ..
356 ls
357 cd hpl-2.0_FERMI_v07/
358 ls
359 cd bin
360 ls
361 cd CUDA_pinned/
362 ls
363 vi run_linpack
364 ls
365 vi HPL.dat
366 vi HPL.dat
367 poweroff
368 ls
369 cd NVIDIA_GPU_Computing_SDK/
370 ls
371 cd C
372 ls
373 cd bin
374 ls
375 cd ..
376 cd /usr
377 ls
378 cd local
379 ls
380 cd bin
381 ls
382 cd src
383 ls
384 ls
385 pwd
386 ls
387 cd ..
388 ls
389 cd src
390 ls
391 cd ..
392 ls
393 cd cuda
394 ls
395 cd bin
396 ls
397 cd ..
398 ls
399 cd doc
400 ls
401 cd .
402 cd ..
403 ls
404 ls
405 cd ..
406 ls
407 cd ..
408 ls
409 pwd
410 cd /root
411 ls
412 cd openmpi-1.4.2
413 ls
414 icc
415 make
416 ./configue
417 ls
418 cd configue
419 /configure
420 ./configure
421 cd /opt
422 ls
423 cd intel
424 ls
425 cd /root
426 ls
427 cd openmpi-1.4.2
428 ls
429 vi README
430 ls
431 vi config.status
432 ls
433 cd examples
434 ls
435 cd ..
436 ls
437 ./configure --prefix=opt/openmpi-icc --enable-static --disable-share CC=icc CXX=icpc FC=ifort F77=ifort && make && make install clean
438 ./configure --prefix=/opt/openmpi-icc --enable-static --disable-share CC=icc CXX=icpc FC=ifort F77=ifort && make && make install clean
439 ls
440 vi config.log
441 ls
442 pwd
443 cd ..
444 ls
445 cd /opt
446 ls
447 cd hpl-2.0_FERMI_v07/
448 ls
449 make
450 env
451 ls
452 mpicc
453 . ~/CUDA/set_env
454 icc
455 mpicc
456 make
457 ls
458 cd
459 ls
460 cd CUDA
461 ls
462 cd CUDA-3.1
463 ls
464 sh cudatoolkit_3.1_linux_64_rhel5.4.run
465 sh cudatoolkit_3.1_linux_64_rhel5.4.run
466 delete /usr/local/cuda
467 rm -r -f /usr/local/cuda
468 sh cudatoolkit_3.1_linux_64_rhel5.4.run
469 ls
470 sh gpucomputingsdk_3.1_linux.run
471 ls
472 cd .
473 cd ..
474 ls
475 cd ..
476 ls
477 cd NVIDIA_GPU_Computing_SDK/
478 ls
479 cd C
480 ls
481 cd src
482 ls
483 cd ..
484 ls
485 mv src projects
486 ls
487 mkdir src
488 mv projects/bandwidthTest/ src
489 mv projects/deviceQuery src
490 make
491 ls
492 ifconfig
493 screen -r
494 ls
495 cd bin
496 ls
497 cd linux
498 ls
499 cd release
500 ls
501 . ~/CUDA/set_env
502 ./deviceQuery
503 ./bandwidthTest --device=0 --memory=pinned
504 lspci
505 numactl --physcpubind=6-11 ./bandwidthTest --device=0 --memory=pinned
506 numactl --physcpubind=0-5 ./bandwidthTest --device=0 --memory=pinned
507 numactl --physcpubind=6-11 ./bandwidthTest --device=0 --memory=pinned
508 numactl --physcpubind=0-3 ./bandwidthTest --device=0 --memory=pinned
509 numactl --physcpubind=4-7 ./bandwidthTest --device=0 --memory=pinned
510 numactl --physcpubind=4-7 ./bandwidthTest --device=1 --memory=pinned
511 numactl --physcpubind=0-3 ./bandwidthTest --device=1 --memory=pinned
512 ./bandwidthTest --device=1 --memory=pinned
513 ./bandwidthTest --device=0 --memory=pinned
514 numactl --physcpubind=0-3 ./bandwidthTest --device=0 --memory=pinned
515 numactl --physcpubind=4-7 ./bandwidthTest --device=0 --memory=pinned
516 numactl --physcpubind=4-7 ./bandwidthTest --device=1 --memory=pinned
517 numactl --physcpubind=0-3 ./bandwidthTest --device=0 --memory=pinned
518 cd /opt
519 ls
520 cd hpl-2.0_FERMI_v07/
521 ls
522 cd src
523 ls
524 cd cuba
525 cd cuda
526 ls
527 make
528 ls
529 make
530 mpicc
531 env
532 cd /optopenmpi-icc
533 cd /opt
534 ls
535 cd /root
536 ls
537 cd openmpi-1.4.2
538 ls
539 make
540 ls
541 ./configure --prefix=/opt/openmpi-icc --enable-static --disable-share CC=icc CXX=icpc FC=ifort F77=ifort && make && make install clean
542 ls
543 cd /opt
544 ls
545 cd ipenmpi-icc
546 cd openmpi-icc
547 ls
548 cd bin
549 ls
550 cd /opt
551 ls
552 cd hpl-2.0_FERMI_v07/
553 ls
554 cd src
555 ls
556 cd cuda
557 ls
558 make
559 ls
560 free
561 cd ../../bin/CUDA_pinned/
562 ls
563 vi run_linpack
564 ldd xhpl
565 man ldd
566 man ldd
567 ldd xhpl
568 mpirun -np 1 numactl --cpunodebind=0 ./run_linpack : -np 1 numactl --cpunodebind=1 ./run_linpack
569 pwd
570 ls
571 mpirun -np 1 numactl --cpunodebind=0 ./run_linpack : -np 1 numactl --cpunodebind=1 ./run_linpack
572 ls
573 cd /opt
574 ls
575 cd hpl-2.0_FERMI_v07/
576 ls
577 cd bin
578 ls
579 cd c
580 cd CUDA_pinned/
581 ls
582 vi xx.dat
583 ls
584 cp xx.dat /mnt
585 umount /mnt
586 cd /root
587 ls
588 cd /etc
589 ls
590 ls | less
591 vi bashrc
592 ls
593 vi bashrc
594 cd /root
595 ls
596 cd CUDA
597 ls
598 set_env
599 . set_env
600 env
601 ls
602 vi set_env
603 ls
604 poweroff
605 ls
606 cd /mnt
607 ls
608 run_gpu_hpl.sh
609 sh run_gpu_hpl.sh
610 vi run_gpu_hpl.sh
611 ls
612 pwd
613 cd /roo
614 cd /root
615 ls
616 ls
617 pwd
618 vi .bashrc
619 env
620 vi .bashrc
621 ls
622 dmesg
623 dmesg | less
624 mount /dev/sdb1 /mnt
625 ls
626 cd /mnt
627 ls
628 cp run*.sh /root
629 cd /root
630 ls
631 vi run_gpu_hpl.sh
632 cd /opt
633 ls
634 cd hpl-2.0_FERMI_v07/
635 ls
636 cd bin
637 ls
638 cd CUDA_pinned/
639 ls
640 cp HPL.dat xx.dat
641 cp /mnt/HPL.dat yyy.dat
642 ls
643 cp yyy.dat HPL.dat
644 ls
645 vi HPL.dat
646 ls
647 ls
648 cp /root/run_gpu_hpl.sh dick.sh
649 vi dick.sh
650 ls
651 sh dick.sh
652 chmod 777 dick.sh
653 sh dick.sh
654 mpirun
655 env
656 mount /dev/sdb1 /mnt
657 lcd /mnt
658 cd /mnt
659 ls
660 vi dick
661 cd /root
662 ls
663 sh run_gpu_hpl.sh
664 cd /opt
665 ls
666 cd openmpi-icc
667 ls
668 cd bin
669 ls
670 pwd
671 ls
672 history
673 history >> zou
674 mount /dev/sdb1 /mnt
675 ls
676 cp zou /mnt
677 vi zou
678 ls
679 cd zou
680 ls
681 pwd
682 cd ..
683 ls
684 history zounew
685 history >> zounew
Where can I locate hpl-2.0_FERMI_v07.tgz? Can you email it to me?
Hi Robert:
I am sorry that I did not reply to you earlier; I have been very busy with a new motherboard based on the Intel Sandy Bridge CPU. Coming back to your question: hpl-2.0_FERMI_v07.tgz is under nVidia NDA, so I cannot send it to you. Sorry about that. However, you can contact the nVidia folks and get it quite easily, since they are trying to promote their hardware and software. Good luck!
James