4 x GTX-295: CUDA only sees 5 x GPU (NOT the usual issues!)
evanevery
post Oct 20 2009, 07:33 PM
Post #81




Group: Members
Posts: 41
Joined: 14-July 09
Member No.: 163,683
Org.: Digital Intelligence



For folks who have been following along...

NVIDIA received our test system last week as scheduled. Unfortunately it did not arrive in good condition. Water had leaked out of the cooling system and onto the video cards. We should have shipped it freight (as we have done successfully before) instead of FEDEX. NVIDIA is in the process of getting the system dried out and the fluid topped off. Hopefully they will have it running today or tomorrow...


Off Topic: You know those commercials where FEDEX or UPS are trying to sell us all those add-on services, like logistics, or inventory management, or parts control? (i.e. "What can Brown Do for You?") Well, you know what "Brown can do for me" (and FEDEX as well) - They can ship my !@#$% packages without breaking them! That's all I ask! Just a core competency. Just ship my crap on time without breaking it! If someone started a shipping business that could accomplish that one simple task they could own the industry! A UPS Manager for my territory came in to my business a couple of years back and asked me what they could do for us. Needless to say, he hasn't been back...
SPWorley
post Oct 21 2009, 03:04 PM
Post #82




Group: Members
Posts: 793
Joined: 13-June 08
From: California USA
Member No.: 107,688



Out of curiosity, how does Windows 7 deal with 3 or 4 ATI 4870X2 cards in OpenCL? Are there similar enumeration issues?

(Not that anyone likes using ATI cards for GPGPU apps..)
evanevery
post Oct 21 2009, 03:10 PM
Post #83




Group: Members
Posts: 41
Joined: 14-July 09
Member No.: 163,683
Org.: Digital Intelligence



QUOTE (SPWorley @ Oct 21 2009, 10:04 AM) *
Out of curiosity, how does Windows 7 deal with 3 or 4 ATI 4870X2 cards in OpenCL? Are there similar enumeration issues?

(Not that anyone likes using ATI cards for GPGPU apps..)



BTW - It's not a "Windows 7" issue - it's a driver/CUDA issue. Windows 7 sees all 8 GPUs just fine in the Device Manager. CUDA (or the driver) simply won't enumerate all of them for number crunching...
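
For anyone who wants to check this on their own box, here is a minimal Python sketch. It assumes the NVIDIA driver library is installed under its usual name (`nvcuda.dll` on Windows, `libcuda.so` on Linux) and asks the CUDA driver API how many devices it enumerates, using `cuInit` and `cuDeviceGetCount`. The helper names `cuda_device_count` and `missing_gpus` are made up for illustration, and the two-GPUs-per-GTX-295 figure is taken from the 4-card / 8-GPU numbers in this thread.

```python
import ctypes
import sys


def cuda_device_count():
    """Return how many GPUs the CUDA driver enumerates, or None if CUDA is unavailable."""
    if sys.platform == "win32":
        # The driver API uses the stdcall convention on Windows.
        loader, names = ctypes.WinDLL, ["nvcuda.dll"]
    else:
        loader, names = ctypes.CDLL, ["libcuda.so.1", "libcuda.so"]

    lib = None
    for name in names:
        try:
            lib = loader(name)
            break
        except OSError:
            continue
    if lib is None:
        return None  # no NVIDIA driver library found

    try:
        lib.cuInit.argtypes = [ctypes.c_uint]
        lib.cuInit.restype = ctypes.c_int
        lib.cuDeviceGetCount.argtypes = [ctypes.POINTER(ctypes.c_int)]
        lib.cuDeviceGetCount.restype = ctypes.c_int
    except AttributeError:
        return None  # library loaded but lacks the expected symbols

    if lib.cuInit(0) != 0:  # 0 is CUDA_SUCCESS
        return None
    count = ctypes.c_int(0)
    if lib.cuDeviceGetCount(ctypes.byref(count)) != 0:
        return None
    return count.value


def missing_gpus(cards, detected, gpus_per_card=2):
    """GPUs physically installed (assumed: dual-GPU cards) minus GPUs CUDA reports."""
    return cards * gpus_per_card - detected


if __name__ == "__main__":
    detected = cuda_device_count()
    if detected is None:
        print("CUDA driver not available on this machine")
    else:
        # 4 is the card count from this thread; change it for your system.
        print(f"CUDA sees {detected} GPU(s); {missing_gpus(4, detected)} missing")
```

If the number printed here is lower than what Device Manager shows, you are probably looking at the same enumeration problem rather than a hardware or Windows problem.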

totsuka
post Nov 4 2009, 01:16 AM
Post #84




Group: Members
Posts: 10
Joined: 3-October 09
Member No.: 195,402



Any update on their resolution to your issue, evanevery?

My system is working great, I switched back to XP (thumbs down)
evanevery
post Nov 4 2009, 02:45 PM
Post #85




Group: Members
Posts: 41
Joined: 14-July 09
Member No.: 163,683
Org.: Digital Intelligence



QUOTE (evanevery @ Oct 21 2009, 10:10 AM) *
BTW - It's not a "Windows 7" issue - it's a driver/CUDA issue. Windows 7 sees all 8 GPUs just fine in the Device Manager. CUDA (or the driver) simply won't enumerate all of them for number crunching...



It's not good....

After pushing this issue with NVIDIA for 5 Months...

...and sending them our prototype system because they couldn't allocate a couple of GTX-295 cards and spend the time and resources in-house to research such a well documented and researched issue...

...and having it trashed during shipment by our friends at FEDEX....

...and waiting over two weeks while NVIDIA attempts to get it running again...

...and then finding out it simply will not boot anymore due to the damage in shipping...

...and then having it shipped all the way back...

...we now have a broken unit back in our hands and are beginning to tear it down to repair it!

We are no further along on this issue than we were on July 16th - except of course we now have a broken system that needs to be repaired! So far we have determined that the motherboard is toast and two of the four water-cooled 295's will not POST. So that's at least $1800.00 in parts so far...

However, now the guys at NVIDIA PROMISE to see if they can allocate at least two 295's so that their engineers can try to replicate the problem. Seems this should have been done during normal driver testing in the first place, and certainly pursued once we reported the issue!

I will continue to follow this through - we're not done yet!
totsuka
post Nov 5 2009, 03:21 PM
Post #86




Group: Members
Posts: 10
Joined: 3-October 09
Member No.: 195,402



QUOTE (evanevery @ Nov 4 2009, 09:45 AM) *
However, now the guys at NVIDIA PROMISE to see if they can allocate at least two 295's so that their engineers can try to replicate the problem. Seems this should have been done during normal driver testing in the first place, and certainly pursued once we reported the issue!


Sorry to hear the machine is dead, that sucks. Just to prove to Nvidia that their own system has a problem, you have to pay the price. A big one at that.

Seems quite amazing that they can't come up with two of their own cards.

I wonder what happens if some Microsoft engineers need to test clustering on some Windows software? They can't get the keys to activate it? Maybe they can have the customer who is reporting the problem send in two copies of Windows...

Seems so stupid doesn't it?
evanevery
post Nov 5 2009, 03:56 PM
Post #87




Group: Members
Posts: 41
Joined: 14-July 09
Member No.: 163,683
Org.: Digital Intelligence



QUOTE (totsuka @ Nov 5 2009, 10:21 AM) *
Sorry to hear the machine is dead, that sucks. Just to prove to Nvidia that their own system has a problem, you have to pay the price. A big one at that.

Seems quite amazing that they can't come up with two of their own cards.

I wonder what happens if some Microsoft engineers need to test clustering on some Windows software? They can't get the keys to activate it? Maybe they can have the customer who is reporting the problem send in two copies of Windows...

Seems so stupid doesn't it?


Yeah - It was kind of like extortion... The only way NVIDIA would address this issue was if we sent them our system (not that ANY similarly configured system wouldn't do). ...and our prototype system is not much use anyway if we can't sell the product (because of the driver problems in currently shipping operating systems). So we had to send them our system if there was any hope of resolving this. And now we're out at least $1800.00 in parts (and still counting), plus labor, plus all these months of wasted time where we could be selling this product. Oh, and we STILL have not got a solution!

1. Multiple GPU testing should have been done DURING driver development. One can only ASSUME that the engineers have access to the necessary resources to do this. I find it really hard to believe they don't have multiple 295's (or any other card for that matter) to use in testing during initial driver development.

2. Even if they DID NOT test the Vista/Win7 driver suite PROPERLY, they should still have fixed the problem when it was originally identified in detail.

I don't know how long other folks have known about this, but we identified it, and documented it in detail, both with a formal support incident, and in this thread, on/before July 16. One would think the response would have been more like: "Thanks for doing all that hard work and detailed testing to quantify this problem for us. Sorry we didn't test the drivers thoroughly in the first place. We are going to test to confirm and resolve this problem right away". Anyone who took the time to read this thread (or our formal support incident) would see that we did quite a lot of work to corner the specific issue. All that work was sort of a gift for NVIDIA. "Here's what we found, here's what we did to quantify the issue", etc.

The only thing we couldn't do (and still can't do) is solve the problem...
poirot
post Nov 6 2009, 05:21 AM
Post #88




Group: Members
Posts: 21
Joined: 15-June 09
Member No.: 159,278
Org.: HKU



I have faced the same problem too. I have 3 GTX 295s. The most GPUs CUDA can see is 5, and that is only if I enable quad SLI and connect dummy plugs to all the DVI ports. I am wondering whether I should switch the computer back to XP.
MKasper
post Nov 10 2009, 07:31 AM
Post #89




Group: Members
Posts: 4
Joined: 1-October 09
Member No.: 195,158
Org.: Ruhr-Universität Bochum, Embedded Security



Just for the sake of completeness:
We are also facing the same problems, as documented in another posting.
The only step forward I can see so far is that the bug is now documented as a "known issue" in the latest drivers.
But this is not really of any use to me.
svenkr01
post Nov 10 2009, 03:29 PM
Post #90




Group: Members
Posts: 1
Joined: 9-May 09
From: Europe
Member No.: 154,041



For what it's worth:
I had the same problems on Vista Ultimate 64 (2 x GTX295, only 3 GPUs enumerated).

However, in a new installation of Win7 Ultimate 64-bit with the newest beta NVIDIA drivers (195.39 for OpenCL...) I get the following picture with 2 x GTX295 (old version with double PCB):

1.) PhysX disabled / SLI disabled: 1 GPU detected by CUDA (and OpenCL)
2.) PhysX enabled / SLI disabled: 2 GPUs detected
3.) Quad SLI enabled / PhysX enabled or disabled: ALL 4 GPUs detected by CUDA / OpenCL.

No registry keys were changed by me.
Only one LCD was connected via DVI to the first GTX295 card; no other dummies or LCDs were connected to the other ports or card.
MoBo used: P6T7 (Bios V0406)
Video card brand: Zotac

Maybe Nvidia changed something in the new driver.

I don't know about 3 x GTX295; that might give you CUDA enumeration/detection problems again.
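
The three results above can be written down as a small table and checked against the count you would expect from two dual-GPU cards. This is only a sketch in Python: the dictionary restates the numbers reported in this post, and the names (`detected_by_config`, `shortfall`) are illustrative, not anything from the driver.

```python
GPUS_PER_GTX295 = 2   # each GTX 295 carries two GPUs
CARDS_INSTALLED = 2   # the 2 x GTX295 setup described in this post

# GPUs detected by CUDA / OpenCL in each driver configuration, as reported above.
detected_by_config = {
    "PhysX disabled / SLI disabled": 1,
    "PhysX enabled / SLI disabled": 2,
    "Quad SLI enabled (PhysX on or off)": 4,
}


def shortfall(detected, cards=CARDS_INSTALLED, gpus_per_card=GPUS_PER_GTX295):
    """Number of installed GPUs that were not enumerated."""
    return cards * gpus_per_card - detected


for config, detected in detected_by_config.items():
    print(f"{config}: {detected} detected, {shortfall(detected)} missing")
```

Only the Quad SLI configuration comes out with nothing missing, which matches the observation that all 4 GPUs were detected in that mode.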

This post has been edited by svenkr01: Nov 10 2009, 04:43 PM
evanevery
post Nov 12 2009, 02:39 PM
Post #91




Group: Members
Posts: 41
Joined: 14-July 09
Member No.: 163,683
Org.: Digital Intelligence



Perhaps a hint at some hope for this issue:

The guys that I have been working with at NVIDIA finally came up with a couple of GTX-295's and "verified that engineering is able to repro on any motherboard that can support 3 or more GTX295's".

(Although everything that we've seen seems to indicate problems with as few as 2 dual-GPU cards under Vista/Win7...)

At least they seem to have finally reproduced the problem. (After only 4 Months and about $1800 of destroyed hardware...)

Anyway, I'll keep everyone posted as the saga continues...
bw168
post Nov 13 2009, 07:05 AM
Post #92




Group: Members
Posts: 1
Joined: 13-November 09
Member No.: 245,380
Club SLI Member: No



QUOTE (evanevery @ Nov 12 2009, 10:39 PM) *
Perhaps a hint at some hope for this issue:

The guys that I have been working with at NVIDIA finally came up with a couple of GTX-295's and "verified that engineering is able to repro on any motherboard that can support 3 or more GTX295's".

(Although everything that we've seen seems to indicate problems with as few as 2 dual-GPU cards under Vista/Win7...)

At least they seem to have finally reproduced the problem. (After only 4 Months and about $1800 of destroyed hardware...)

Anyway, I'll keep everyone posted as the saga continues...


Hi Evan,

Have you tried disabling devices in your BIOS, such as Audio, Network Controller, and eSATA controller?
It seems like 8 GPUs should work just fine, but you may need to turn those devices off to obtain more I/O resources.
SazanEyes
post Nov 13 2009, 02:36 PM
Post #93




Group: Members
Posts: 12
Joined: 14-July 09
Member No.: 163,736



svenkr01, what you are seeing is not new. All of the 19x.xx drivers have worked that way starting with 190.38 beta (see page 1 of this thread). Quad-SLI will only enumerate four GPUs, and the rest of the cards will still only be detected as one GPU.

bw168, I don't want to speak for evanevery but I have a similar configuration with three GTX 295s (again, see page 1). There's no problem getting the cards working in Windows. I believe both of us have all cards working and recognized, desktops extended, etc. The problem is specifically with GPU detection by CUDA, and I agree with evanevery that it is a driver issue. If NVIDIA can get two cards to be detected correctly (quad-SLI), there's no reason three or four cards shouldn't also work.

evanevery, I'm glad to see you're still pressing NVIDIA on this issue. A lot of people are waiting for this to be fixed.


 



Copyright 2008 NVIDIA Corporation.  Terms of Use | Legal Info | Privacy Policy Time is now: 24th November 2009 - 01:54 AM