Issue with second GPU

Message boards : Number crunching : Issue with second GPU

OffDutyTaoist
Joined: 30 Jun 24
Posts: 3
Credit: 21,188,350
RAC: 145,878
Message 10144 - Posted: 1 Oct 2024, 6:28:50 UTC
Last modified: 1 Oct 2024, 6:30:35 UTC

I recently started running a second GPU (primary GTX 1060, secondary GTX 960). SRBase runs fine alongside a task from a second project (e.g. Einstein, Amicable, etc.), but when another SRBase task starts, the task on the 960 fails with a computation error. Do I need to change something in the XML files? Here is one of the errors:

<core_client_version>8.0.2</core_client_version>
<![CDATA[
<message>
The operating system cannot run (null).
(0xc3) - exit code 195 (0xc3)</message>
<stderr_txt>
2024-10-01 01:13:05 (15324): wrapper (7.24.26018): starting
2024-10-01 01:13:05 (15324): wrapper: running mfaktc-win-64.exe (-d 1)
2024-10-01 01:13:05 (15324): wrapper: created child process 10536
mfaktc v0.21 (64bit built)

Compiletime options
THREADS_PER_BLOCK 256
SIEVE_SIZE_LIMIT 32kiB
SIEVE_SIZE 193154bits
SIEVE_SPLIT 250
MORE_CLASSES enabled

Runtime options
SievePrimes 25000
SievePrimesAdjust 1
SievePrimesMin 5000
SievePrimesMax 100000
NumStreams 3
CPUStreams 3
GridSize 3
GPU Sieving enabled
GPUSievePrimes 82486
GPUSieveSize 2047Mi bits
GPUSieveProcessSize 32Ki bits
Checkpoints enabled
CheckpointDelay 60s
WorkFileAddDelay disabled
Stages enabled
StopAfterFactor class
PrintMode full
V5UserID (none)
ComputerID (none)
AllowSleep no
TimeStampInResults no

CUDA version info
binary compiled for CUDA 12.0
CUDA runtime version 12.0
CUDA driver version 12.60

CUDA device info
name NVIDIA GeForce GTX 960
compute capability 5.2
max threads per block 1024
max shared memory per MP 98304 byte
number of multiprocessors 8
CUDA cores per MP 128
CUDA cores - total 1024
clock rate (CUDA cores) 1367MHz
memory clock rate: 3505MHz
memory bus width: 128 bit

Automatic parameters
threads per grid 1048576
GPUSievePrimes (adjusted) 82486
GPUsieve minimum exponent 1055144

running a simple selftest...
ERROR: cudaGetLastError() returned 209: no kernel image is available for execution on the device
2024-10-01 01:13:06 (15324): mfaktc-win-64.exe exited; CPU time 0.000000
2024-10-01 01:13:06 (15324): app exit status: 0x1
2024-10-01 01:13:06 (15324): called boinc_finish(195)

</stderr_txt>
]]>
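
For anyone triaging a similar failure, the fields worth pulling out of a log like the one above are the device name, its compute capability, the CUDA version the binary was built for, and the ERROR line. A minimal sketch in Python, assuming the stderr text has been saved as a string; the `summarize` helper and its key names are hypothetical and not part of mfaktc or BOINC:

```python
import re

# Excerpt of the stderr text quoted above (the GTX 960 task).
STDERR = """\
CUDA version info
binary compiled for CUDA 12.0
CUDA runtime version 12.0
CUDA driver version 12.60

CUDA device info
name NVIDIA GeForce GTX 960
compute capability 5.2

running a simple selftest...
ERROR: cudaGetLastError() returned 209: no kernel image is available for execution on the device
"""


def summarize(stderr: str) -> dict:
    """Extract the fields relevant to a 'no kernel image' failure.

    Hypothetical helper: the key names are illustrative only.
    """
    patterns = {
        "device": r"^name\s+(.+)$",
        "compute_capability": r"^compute capability\s+(\S+)\s*$",
        "compiled_for_cuda": r"^binary compiled for CUDA\s+(\S+)\s*$",
        "error": r"^ERROR:\s+(.+)$",
    }
    summary = {}
    for key, pattern in patterns.items():
        match = re.search(pattern, stderr, flags=re.MULTILINE)
        # Missing fields are reported as None rather than raising.
        summary[key] = match.group(1).strip() if match else None
    return summary


summary = summarize(STDERR)
for key, value in summary.items():
    print(f"{key}: {value}")
```

Putting the compute capability next to the CUDA version the binary was built for makes it easier to compare a failing task with a passing one on similar hardware.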

rebirther
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 2 Jan 13
Posts: 7479
Credit: 43,620,241
RAC: 37,909
Message 10146 - Posted: 1 Oct 2024, 7:53:29 UTC - in response to Message 10144.

ERROR: cudaGetLastError() returned 209: no kernel image is available for execution on the device


Looks like the GTX 960 is too old: this mfaktc build was not compiled for older compute capabilities (CC).

zlodeck
Joined: 3 Mar 15
Posts: 4
Credit: 138,485,223
RAC: 116,447
Message 10148 - Posted: 1 Oct 2024, 10:05:55 UTC - in response to Message 10146.
Last modified: 1 Oct 2024, 10:06:17 UTC

ERROR: cudaGetLastError() returned 209: no kernel image is available for execution on the device


Looks like the GTX 960 is too old: this mfaktc build was not compiled for older compute capabilities (CC).


No, it looks like CC 5.2 is enough to run mfaktc with the "barrett76_mul32_gs" kernel. The large GPUSieveSize (2047Mi bits, versus 64Mi bits in my log below) may be the issue for a GTX 960 with 2 GB of onboard memory.


2024-10-01 12:16:10 (9532): wrapper (7.24.26018): starting
2024-10-01 12:16:10 (9532): wrapper: running mfaktc-win-64.exe (-d 0)
2024-10-01 12:16:10 (9532): wrapper: created child process 8352
mfaktc v0.21 (64bit built)

Compiletime options
THREADS_PER_BLOCK 256
SIEVE_SIZE_LIMIT 32kiB
SIEVE_SIZE 193154bits
SIEVE_SPLIT 250
MORE_CLASSES enabled

Runtime options
SievePrimes 25000
SievePrimesAdjust 1
SievePrimesMin 5000
SievePrimesMax 100000
NumStreams 3
CPUStreams 3
GridSize 3
GPU Sieving enabled
GPUSievePrimes 82486
GPUSieveSize 64Mi bits
GPUSieveProcessSize 16Ki bits
Checkpoints enabled
CheckpointDelay 60s
WorkFileAddDelay disabled
Stages enabled
StopAfterFactor class
PrintMode full
V5UserID (none)
ComputerID (none)
AllowSleep no
TimeStampInResults no

CUDA version info
binary compiled for CUDA 10.0
CUDA runtime version 10.0
CUDA driver version 11.30

CUDA device info
name NVIDIA GeForce GTX 980
compute capability 5.2
max threads per block 1024
max shared memory per MP 98304 byte
number of multiprocessors 16
CUDA cores per MP 128
CUDA cores - total 2048
clock rate (CUDA cores) 1329MHz
memory clock rate: 3505MHz
memory bus width: 256 bit

Automatic parameters
threads per grid 1048576
GPUSievePrimes (adjusted) 82486
GPUsieve minimum exponent 1055144

running a simple selftest...
Selftest statistics
number of tests 107
successfull tests 107

selftest PASSED!

got assignment: exp=178691207 bit_min=75 bit_max=76 (42.82 GHz-days)
Starting trial factoring M178691207 from 2^75 to 2^76 (42.82 GHz-days)
k_min = 105710103187920
k_max = 211420206384062
Using GPU kernel "barrett76_mul32_gs"
Date Time | class Pct | time ETA | GHz-d/day Sieve Wait
Oct 01 12:16 | 0 0.1% | 6.301 1h40m | 611.66 82485 n.a.%
Oct 01 12:16 | 4 0.2% | 6.298 1h40m | 611.95 82485 n.a.%
Oct 01 12:16 | 9 0.3% | 6.304 1h40m | 611.37 82485 n.a.%
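
For scale, the two GPUSieveSize values quoted in the logs can be converted to bytes. A rough sketch in Python, assuming "Mi bits" means 2^20 bits; it only sizes the sieve setting itself and says nothing about what else mfaktc allocates on the card (`sieve_bytes` is a hypothetical helper name):

```python
BITS_PER_BYTE = 8
MI = 2**20  # assumption: "Mi" in the log is the binary prefix, 2^20


def sieve_bytes(mi_bits: int) -> int:
    """Convert a GPUSieveSize given in Mi bits to bytes."""
    return mi_bits * MI // BITS_PER_BYTE


for label, mi_bits in [("GTX 960 log", 2047), ("GTX 980 log", 64)]:
    size = sieve_bytes(mi_bits)
    print(f"{label}: {mi_bits} Mi bits = {size / MI:.3f} MiB")
```

Under that assumption, 2047Mi bits works out to just under 256 MiB and 64Mi bits to 8 MiB.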

rebirther
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 2 Jan 13
Posts: 7479
Credit: 43,620,241
RAC: 37,909
Message 10150 - Posted: 1 Oct 2024, 13:33:55 UTC - in response to Message 10149.

The only option is to exclude the GTX 960 from SRBase and use it for another project.

OffDutyTaoist
Joined: 30 Jun 24
Posts: 3
Credit: 21,188,350
RAC: 145,878
Message 10151 - Posted: 1 Oct 2024, 17:30:16 UTC - in response to Message 10150.

So my CC file should look something like this:

<cc_config>
<options>
<use_all_gpus>1</use_all_gpus>
<exclude_gpu>
<url>https://srbase.my-firewall.org/sr5/</url>
[<device_num>1</device_num>]
[<type>NVIDIA_gpu</type>]
[<app>SRBase</app>]
</exclude_gpu>
</options>
</cc_config>

Correct? My 960 being the second card.

rebirther
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Avatar
Send message
Joined: 2 Jan 13
Posts: 7479
Credit: 43,620,241
RAC: 37,909
Message 10152 - Posted: 1 Oct 2024, 17:40:58 UTC - in response to Message 10151.
Last modified: 1 Oct 2024, 17:45:09 UTC



Correct? My 960 being the second card.


<cc_config>
<options>
<use_all_gpus>1</use_all_gpus>
<exclude_gpu>
<url>https://srbase.my-firewall.org/sr5/</url>
<device_num>1</device_num>
<type>NVIDIA</type>
<app>TF</app>
</exclude_gpu>
</options>
</cc_config>

OffDutyTaoist
Joined: 30 Jun 24
Posts: 3
Credit: 21,188,350
RAC: 145,878
Message 10159 - Posted: 3 Oct 2024, 3:05:01 UTC

If anyone else has this issue in the future, and needs a solution, I ended up grouping my projects in the cc_config like this:

<cc_config>
<options>
<use_all_gpus>1</use_all_gpus>
<exclude_gpu>
<url>https://srbase.my-firewall.org/sr5/</url>
<device_num>1</device_num>
<type>NVIDIA</type>
</exclude_gpu>
<exclude_gpu>
<url>https://asteroidsathome.net/boinc/</url>
<device_num>1</device_num>
<type>NVIDIA</type>
</exclude_gpu>
<exclude_gpu>
<url>https://www.gpugrid.net/</url>
<device_num>1</device_num>
<type>NVIDIA</type>
</exclude_gpu>
<exclude_gpu>
<url>https://einstein.phys.uwm.edu/</url>
<device_num>0</device_num>
<type>NVIDIA</type>
</exclude_gpu>
<exclude_gpu>
<url>https://sech.me/boinc/Amicable/</url>
<device_num>0</device_num>
<type>NVIDIA</type>
</exclude_gpu>
<exclude_gpu>
<url>https://numberfields.asu.edu/NumberFields/</url>
<device_num>0</device_num>
<type>NVIDIA</type>
</exclude_gpu>
</options>
</cc_config>

I assigned the projects that get me the most mag with Gridcoin to my 1060 (by excluding them from the 960, device 1) and the secondary projects to my 960 (by excluding them from the 1060, device 0). It seems to be working out pretty well so far.
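
A quick way to sanity-check a grouping like this is to parse the file and list which project URLs are excluded from each device. A minimal sketch in Python using a shortened copy of the config above; the device labels come from this post, `excluded_by_device` is a hypothetical helper, and treating a missing `<device_num>` as "all devices" is my reading of the BOINC client configuration documentation rather than something tested here:

```python
import xml.etree.ElementTree as ET

# Shortened copy of the cc_config above (four of the six exclusions).
CC_CONFIG = """\
<cc_config>
<options>
<use_all_gpus>1</use_all_gpus>
<exclude_gpu>
<url>https://srbase.my-firewall.org/sr5/</url>
<device_num>1</device_num>
<type>NVIDIA</type>
</exclude_gpu>
<exclude_gpu>
<url>https://www.gpugrid.net/</url>
<device_num>1</device_num>
<type>NVIDIA</type>
</exclude_gpu>
<exclude_gpu>
<url>https://einstein.phys.uwm.edu/</url>
<device_num>0</device_num>
<type>NVIDIA</type>
</exclude_gpu>
<exclude_gpu>
<url>https://sech.me/boinc/Amicable/</url>
<device_num>0</device_num>
<type>NVIDIA</type>
</exclude_gpu>
</options>
</cc_config>
"""

# Device numbering as described in the post: 0 = GTX 1060, 1 = GTX 960.
DEVICES = {0: "GTX 1060", 1: "GTX 960"}


def excluded_by_device(xml_text: str) -> dict:
    """Map each device number to the project URLs excluded from it."""
    root = ET.fromstring(xml_text)
    result = {num: [] for num in DEVICES}
    for exclusion in root.iter("exclude_gpu"):
        url = exclusion.findtext("url")
        num_text = exclusion.findtext("device_num")
        # Assumption: no <device_num> means every device is excluded.
        nums = [int(num_text)] if num_text is not None else list(DEVICES)
        for num in nums:
            result.setdefault(num, []).append(url)
    return result


exclusions = excluded_by_device(CC_CONFIG)
for num, urls in sorted(exclusions.items()):
    print(f"device {num} ({DEVICES.get(num, 'unknown')}) excluded from:")
    for url in urls:
        print(f"  {url}")
```

Parsing with `ET.fromstring` also catches malformed files early: a draft that still contains the square brackets from the documentation template may parse, but its bracketed elements sit next to stray text that is easy to spot when printed.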




Copyright © 2014-2024 BOINC Confederation / rebirther