Message boards : Number crunching : TF v0.47 (opencl_ati_101) failing on Debian with AMD RX 7800 XT
I picked up some advice from https://srbase.my-firewall.org/sr5/forum_thread.php?id=1961 but all GPU tasks are still failing on my system:
<core_client_version>8.0.4</core_client_version>
<![CDATA[
<message>
process exited with code 195 (0xc3, -61)</message>
<stderr_txt>
2025-11-12 01:31:37 (40068): wrapper (7.24.26018): starting
2025-11-12 01:31:37 (40068): wrapper (7.24.26018): starting
2025-11-12 01:31:37 (40068): wrapper: running ./mfakto-x64 (-d 0)
2025-11-12 01:31:37 (40068): wrapper: created child process 40070
2025-11-12 01:31:39 (40068): ./mfakto-x64 exited; CPU time 1.640716
2025-11-12 01:31:39 (40068): app exit status: 0x4
2025-11-12 01:31:39 (40068): called boinc_finish(195)
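For anyone puzzled by the three numbers in "code 195 (0xc3, -61)": they appear to be the same exit byte shown as unsigned decimal, hex, and signed decimal. A minimal sketch of that arithmetic follows; `describe_exit_code` is a hypothetical helper name, not anything from BOINC itself.

```python
def describe_exit_code(code: int) -> str:
    """Show one exit byte as unsigned decimal, hex, and signed 8-bit decimal."""
    if not 0 <= code <= 255:
        raise ValueError("exit status must fit in one byte")
    # Reinterpret the byte as a signed 8-bit value (two's complement).
    signed = code - 256 if code >= 128 else code
    return f"{code} ({code:#x}, {signed})"


print(describe_exit_code(195))  # 195 (0xc3, -61)
```

So the client message and the `boinc_finish(195)` line in the wrapper output refer to the same status; the `0x4` is the separate exit status of mfakto itself.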
cc_config.xml says:
<ignore_ati_dev>1</ignore_ati_dev>
<ignore_ati_dev>2</ignore_ati_dev>
<ignore_ati_dev>3</ignore_ati_dev>
And the log events say:
Mi 12 Nov 2025 01:43:44 EET | | Starting BOINC client version 8.0.4 for x86_64-pc-linux-gnu
Mi 12 Nov 2025 01:43:44 EET | | log flags: file_xfer, sched_ops, task
Mi 12 Nov 2025 01:43:44 EET | | Libraries: libcurl/8.14.1 OpenSSL/3.5.1 zlib/1.3.1 brotli/1.1.0 zstd/1.5.7 libidn2/2.3.8 libpsl/0.21.2 libssh2/1.11.1 nghttp2/1.64.0 nghttp3/1.8.0 librtmp/2.3 OpenLDAP/2.6.10
Mi 12 Nov 2025 01:43:44 EET | | Data directory: /var/lib/boinc-client
Mi 12 Nov 2025 01:43:45 EET | | OpenCL: AMD/ATI GPU 0: AMD Radeon RX 7800 XT (driver version 3649.0 (HSA1.1,LC), device version OpenCL 2.0, 16368MB, 16368MB available, 16312 GFLOPS peak)
Mi 12 Nov 2025 01:43:45 EET | | OpenCL: AMD/ATI GPU 1 (ignored by config): AMD Radeon Graphics (driver version 3649.0 (HSA1.1,LC), device version OpenCL 2.0, 15619MB, 15619MB available, 563 GFLOPS peak)
Mi 12 Nov 2025 01:43:45 EET | | OpenCL: AMD/ATI GPU 2 (ignored by config): AMD Radeon RX 7800 XT (driver version 3649.0 (HSA1.1,LC), device version OpenCL 2.0, 16368MB, 16368MB available, 16312 GFLOPS peak)
Mi 12 Nov 2025 01:43:45 EET | | OpenCL: AMD/ATI GPU 3 (ignored by config): AMD Radeon Graphics (driver version 3649.0 (HSA1.1,LC), device version OpenCL 2.0, 15619MB, 15619MB available, 563 GFLOPS peak)
Mi 12 Nov 2025 01:43:45 EET | | libc: version 2.41
Mi 12 Nov 2025 01:43:45 EET | | Processor: 32 AuthenticAMD AMD Ryzen 9 7950X 16-Core Processor [Family 25 Model 97 Stepping 2]
Mi 12 Nov 2025 01:43:45 EET | | Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good amd_lbr_v2 nopl xtopology nonstop_tsc cpuid extd_apicid aperfmperf rapl pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb cat_l3 cdp_l3 hw_pstate ssbd mba perfmon_v2 ibrs ibpb stibp ibrs_enhanced vmmcall fsgsbase bmi1 avx2 smep bmi2 erms invpcid cqm rdt_a avx512f avx512dq rdseed adx smap avx512ifma clflushopt clwb avx512cd sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local user_shstk avx512_bf16 clzero irperf xsaveerptr rdpru wbnoinvd cppc arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic vgif x2avic v_spec_ctrl
Mi 12 Nov 2025 01:43:45 EET | | OS: Linux Debian: Debian GNU/Linux 13 (trixie) [6.12.48+deb13-amd64|libc 2.41]
Mi 12 Nov 2025 01:43:45 EET | | Memory: 30.51 GB physical, 0 bytes virtual
Mi 12 Nov 2025 01:43:45 EET | | Disk: 97.87 GB total, 74.27 GB free
Mi 12 Nov 2025 01:43:45 EET | | Local time is UTC +2 hours
Mi 12 Nov 2025 01:43:45 EET | | Config: GUI RPCs allowed from:
Mi 12 Nov 2025 01:43:45 EET | | Config: ignoring AMD/ATI GPU 1
Mi 12 Nov 2025 01:43:45 EET | | Config: ignoring AMD/ATI GPU 2
Mi 12 Nov 2025 01:43:45 EET | | Config: ignoring AMD/ATI GPU 3
I have no idea why the BOINC client is picking up the GPUs twice. PrimeGrid is happy with the setup. This is with ROCm 6.4.4. | |
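To make the double detection easier to spot, here is a rough sketch that groups the `OpenCL: AMD/ATI GPU` lines of the event log by device name and reports names that occur more than once. The regular expression is only fitted to the lines quoted above, the sample lines are trimmed after the driver version, and `parse_gpus` and `duplicates` are hypothetical helper names. A repeated name could also mean two identical physical cards, so treat the result as a hint only.

```python
import re
from collections import defaultdict

# Pattern fitted to the event log lines in this thread (an assumption, not a BOINC spec).
GPU_LINE = re.compile(
    r"OpenCL: AMD/ATI GPU (?P<num>\d+)(?P<ignored> \(ignored by config\))?: "
    r"(?P<name>.+?) \(driver version"
)


def parse_gpus(log_text):
    """Return (device number, name, ignored flag) for every OpenCL GPU line."""
    gpus = []
    for line in log_text.splitlines():
        m = GPU_LINE.search(line)
        if m:
            gpus.append((int(m.group("num")), m.group("name"), m.group("ignored") is not None))
    return gpus


def duplicates(gpus):
    """Map each device name that appears more than once to its device numbers."""
    by_name = defaultdict(list)
    for num, name, _ignored in gpus:
        by_name[name].append(num)
    return {name: nums for name, nums in by_name.items() if len(nums) > 1}


# Sample lines from the log above, trimmed after the driver version.
SAMPLE = """\
Mi 12 Nov 2025 01:43:45 EET | | OpenCL: AMD/ATI GPU 0: AMD Radeon RX 7800 XT (driver version 3649.0 (HSA1.1,LC))
Mi 12 Nov 2025 01:43:45 EET | | OpenCL: AMD/ATI GPU 1 (ignored by config): AMD Radeon Graphics (driver version 3649.0 (HSA1.1,LC))
Mi 12 Nov 2025 01:43:45 EET | | OpenCL: AMD/ATI GPU 2 (ignored by config): AMD Radeon RX 7800 XT (driver version 3649.0 (HSA1.1,LC))
Mi 12 Nov 2025 01:43:45 EET | | OpenCL: AMD/ATI GPU 3 (ignored by config): AMD Radeon Graphics (driver version 3649.0 (HSA1.1,LC))
"""

for name, nums in duplicates(parse_gpus(SAMPLE)).items():
    print(f"{name}: listed as devices {nums}")
```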
| ID: 11186 · Rating: 0 · rate:
| |
|
Can you post the selftest lines? Remove this line in cc_config:
<ignore_ati_dev>0</ignore_ati_dev>
and add:
<cc_config>
<options>
<exclude_gpu>
<type>ATI</type>
<device_num>0</device_num>
<app>TF</app>
</exclude_gpu>
</options>
</cc_config>
BOINC is counting the iGPU as device 0. Try it, or disable the iGPU in the BIOS if you don't use it. | |
| ID: 11188 · Rating: 0 · rate:
| |
"remove this line in cc_config"
There is no such line, because then I would have ignored all GPUs.
"and add"
OK, I had to add the mandatory <url> element to make that work:
<cc_config>
<options>
<exclude_gpu>
<url>https://srbase.my-firewall.org/sr5/</url>
<type>ATI</type>
<device_num>0</device_num>
<app>TF</app>
</exclude_gpu>
</options>
</cc_config>
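In case someone wants to generate this block instead of typing it, here is a small sketch using Python's standard `xml.etree.ElementTree`. The element names are the ones from the config above; `build_exclude_gpu` is a hypothetical helper name, and `ET.indent` needs Python 3.9 or newer.

```python
import xml.etree.ElementTree as ET


def build_exclude_gpu(url, gpu_type, device_num, app):
    """Build a cc_config document with a single <exclude_gpu> entry."""
    root = ET.Element("cc_config")
    options = ET.SubElement(root, "options")
    exclude = ET.SubElement(options, "exclude_gpu")
    # <url> comes first; it was the element that had to be added above.
    ET.SubElement(exclude, "url").text = url
    ET.SubElement(exclude, "type").text = gpu_type
    ET.SubElement(exclude, "device_num").text = str(device_num)
    ET.SubElement(exclude, "app").text = app
    ET.indent(root)  # pretty-print; available since Python 3.9
    return ET.tostring(root, encoding="unicode")


print(build_exclude_gpu("https://srbase.my-firewall.org/sr5/", "ATI", 0, "TF"))
```

This only writes the XML; whether device 0 is the right one to exclude is a separate question.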
I am not following. I already had GPUs 1-3 excluded. Now you asked me to exclude GPU 0 as well and I am not able to execute any GPU tasks with that config anymore. | |
| ID: 11192 · Rating: 0 · rate:
| |
|
Yes. You want to run your 7800 XT, but BOINC is counting the iGPU as device 0, not this card as the log suggests. Exclude device 0 instead of device 1 and see if it works. | |
| ID: 11193 · Rating: 0 · rate:
| |
|
OK, so I changed from ignoring GPU 1, 2 and 3 to ignoring 0, 2 and 3:
<ignore_ati_dev>0</ignore_ati_dev>
<ignore_ati_dev>2</ignore_ati_dev>
<ignore_ati_dev>3</ignore_ati_dev>
Fr 14 Nov 2025 02:20:24 EET | | OpenCL: AMD/ATI GPU 0 (ignored by config): AMD Radeon RX 7800 XT (driver version 3649.0 (HSA1.1,LC), device version OpenCL 2.0, 16368MB, 16368MB available, 16312 GFLOPS peak)
Fr 14 Nov 2025 02:20:24 EET | | OpenCL: AMD/ATI GPU 1: AMD Radeon Graphics (driver version 3649.0 (HSA1.1,LC), device version OpenCL 2.0, 15619MB, 15619MB available, 563 GFLOPS peak)
Fr 14 Nov 2025 02:20:24 EET | | OpenCL: AMD/ATI GPU 2 (ignored by config): AMD Radeon RX 7800 XT (driver version 3649.0 (HSA1.1,LC), device version OpenCL 2.0, 16368MB, 16368MB available, 16312 GFLOPS peak)
Fr 14 Nov 2025 02:20:24 EET | | OpenCL: AMD/ATI GPU 3 (ignored by config): AMD Radeon Graphics (driver version 3649.0 (HSA1.1,LC), device version OpenCL 2.0, 15619MB, 15619MB available, 563 GFLOPS peak)
...
Fr 14 Nov 2025 02:20:24 EET | | Config: GUI RPCs allowed from:
Fr 14 Nov 2025 02:20:24 EET | | Config: ignoring AMD/ATI GPU 0
Fr 14 Nov 2025 02:20:24 EET | | Config: ignoring AMD/ATI GPU 2
Fr 14 Nov 2025 02:20:24 EET | | Config: ignoring AMD/ATI GPU 3
Fr 14 Nov 2025 02:20:24 EET | SRBase | Computing prefs: from SRBase (last modified 07-Nov-2025 23:56:53)
Fr 14 Nov 2025 02:20:24 EET | SRBase | Computing prefs: computer location unspecified; using default
...
Fr 14 Nov 2025 02:20:24 EET | | Config: GUI RPCs allowed from:
Fr 14 Nov 2025 02:20:24 EET | | Config: ignoring AMD/ATI GPU 0
Fr 14 Nov 2025 02:20:24 EET | | Config: ignoring AMD/ATI GPU 2
Fr 14 Nov 2025 02:20:24 EET | | Config: ignoring AMD/ATI GPU 3
...
Fr 14 Nov 2025 02:34:22 EET | SRBase | Sending scheduler request: To fetch work.
Fr 14 Nov 2025 02:34:22 EET | SRBase | Requesting new tasks for AMD/ATI GPU
Fr 14 Nov 2025 02:34:24 EET | SRBase | Scheduler request completed: got 1 new tasks
Fr 14 Nov 2025 02:34:24 EET | SRBase | Project requested delay of 7 seconds
Fr 14 Nov 2025 02:34:26 EET | SRBase | Started download of worktodo13a424_0177974.txt
Fr 14 Nov 2025 02:34:27 EET | SRBase | Finished download of worktodo13a424_0177974.txt (57 bytes)
Fr 14 Nov 2025 02:34:27 EET | SRBase | Starting task TF_75-76_621-630M_wu_177974_1
stderr.txt now says:
2025-11-14 02:34:27 (32028): wrapper (7.24.26018): starting
2025-11-14 02:34:27 (32028): wrapper (7.24.26018): starting
2025-11-14 02:34:27 (32028): wrapper: running ./mfakto-x64 (-d 1)
2025-11-14 02:34:27 (32028): wrapper: created child process 32030
And I can hear that the 7800 XT fan is spinning up, i.e. it's definitely processing the task now. So the device numbers in the BOINC log are wrong? | |
| ID: 11194 · Rating: 0 · rate:
| |
|
Took 12 minutes to run through, nice:
2025-11-14 02:34:27 (32028): wrapper (7.24.26018): starting
2025-11-14 02:34:27 (32028): wrapper (7.24.26018): starting
2025-11-14 02:34:27 (32028): wrapper: running ./mfakto-x64 (-d 1)
2025-11-14 02:34:27 (32028): wrapper: created child process 32030
2025-11-14 02:46:31 (32028): ./mfakto-x64 exited; CPU time 4.261798
2025-11-14 02:46:31 (32028): called boinc_finish(0)
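The 12 minutes can be read straight off the wrapper timestamps. A minimal sketch, assuming every wrapper line starts with a 19-character `YYYY-MM-DD HH:MM:SS` stamp as in the output above (`wrapper_timestamp` is a hypothetical helper name):

```python
from datetime import datetime


def wrapper_timestamp(line):
    """Parse the leading 'YYYY-MM-DD HH:MM:SS' stamp of a wrapper log line."""
    return datetime.strptime(line[:19], "%Y-%m-%d %H:%M:%S")


start = wrapper_timestamp("2025-11-14 02:34:27 (32028): wrapper: created child process 32030")
end = wrapper_timestamp("2025-11-14 02:46:31 (32028): ./mfakto-x64 exited; CPU time 4.261798")

elapsed = end - start
print(elapsed)  # 0:12:04
```

Compared with the reported CPU time of about 4.26 seconds, nearly all of that wall time was spent outside the CPU, which fits the task running on the GPU.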
| |
| ID: 11195 · Rating: 0 · rate:
| |
"So the device numbers in the BOINC log are wrong?"
Yes, with an iGPU, BOINC always counts it as device 0; mfakto's numbering is different. | |
| ID: 11196 · Rating: 0 · rate:
| |