Question/Problem with Intel GPU work units

Dean Loros
Joined: 22 Sep 23
Posts: 13
Credit: 566,281
RAC: 0
Message 9108 - Posted: 3 Oct 2023, 16:32:40 UTC - in response to Message 9107.

Sounds good. Again, numberfields@home is producing workunits with this card, so it would be interesting to see what their approach is.

AndyPC
Joined: 22 Jan 24
Posts: 2
Credit: 23,394,500
RAC: 0
Message 9658 - Posted: 5 Feb 2024, 14:36:30 UTC
Last modified: 5 Feb 2024, 14:38:59 UTC

Trying to work on debugging this a bit. I recompiled mfakto from the current GitHub sources on Pop!_OS 22.04 LTS (based on Ubuntu) with an Intel Arc A770 16 GB GPU. The CPU is an AMD Ryzen 7 7700 and the OS kernel is 6.6.10.

Turned on as much debug output as I could find including #define DETAILED_INFO.

Self-test is failing using both OpenCL 2.2.0 from ROCm and OpenCL 3.0 from Intel Compute Runtime.

Here's the output from the latter: https://gist.github.com/APCBoston/580385ff898cc013b34e2e7570b8b468

I'm not entirely satisfied with the explanation that Intel's drivers are bad, as Phoronix is successfully running vector compute benchmarks on this GPU. Either there's a very specific and obscure bug in the driver that only shows up with this application or, which I suspect is more plausible, there's a bug in mfakto.

Will try working through this in gdb when I have a bit more time.

DeleteNull
Volunteer developer
Volunteer tester
Joined: 29 Nov 14
Posts: 83
Credit: 381,908,322
RAC: 423,119
Message 9662 - Posted: 6 Feb 2024, 14:43:26 UTC - in response to Message 9658.

The main problem is this (mfaktc.c):
GPU_type gpu_types[]={
  {GPU_AUTO,    0,  "AUTO"},
  {GPU_VLIW4,   64, "VLIW4"},
  {GPU_VLIW5,   80, "VLIW5"},
  {GPU_GCN,     64, "GCN"},
  {GPU_GCN2,    64, "GCN2"},
  {GPU_GCN3,    64, "GCN3"},
  {GPU_GCN4,    64, "GCN4"},
  {GPU_GCN5,    64, "GCN5"},
  {GPU_GCNF,    64, "GCNF"},
  {GPU_RDNA,    64, "RDNA"},
  {GPU_APU,     80, "APU"},
  {GPU_CPU,     1,  "CPU"},
  {GPU_NVIDIA,  8,  "NVIDIA"},
  {GPU_INTEL,   1,  "INTEL"},
  {GPU_UNKNOWN, 0,  "UNKNOWN"}
};

Depending on the value of the integer in {GPU_INTEL, 1, "INTEL"}, you will get different numbers of successful and unsuccessful self-tests. But it seems you will never get the number of unsuccessful tests down to 0.
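
For anyone following along, here is a minimal, self-contained sketch of how a table like this can be declared and looked up by name in C. It is not mfakto's actual code. Only the table values and the field name CE_per_multiprocessor come from this thread; the enum name GPU_kind, the other two field names and the helper find_gpu_type are made up for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical enum: only the constant names mirror the table in this thread. */
enum GPU_kind {
    GPU_AUTO, GPU_VLIW4, GPU_VLIW5, GPU_GCN, GPU_GCN2, GPU_GCN3, GPU_GCN4,
    GPU_GCN5, GPU_GCNF, GPU_RDNA, GPU_APU, GPU_CPU, GPU_NVIDIA, GPU_INTEL,
    GPU_UNKNOWN
};

/* Assumed layout: gpu_type and gpu_name are illustrative field names. */
typedef struct {
    enum GPU_kind gpu_type;
    unsigned int  CE_per_multiprocessor; /* the integer being discussed */
    const char   *gpu_name;
} GPU_type;

static const GPU_type gpu_types[] = {
    {GPU_AUTO,    0,  "AUTO"},
    {GPU_VLIW4,   64, "VLIW4"},
    {GPU_VLIW5,   80, "VLIW5"},
    {GPU_GCN,     64, "GCN"},
    {GPU_GCN2,    64, "GCN2"},
    {GPU_GCN3,    64, "GCN3"},
    {GPU_GCN4,    64, "GCN4"},
    {GPU_GCN5,    64, "GCN5"},
    {GPU_GCNF,    64, "GCNF"},
    {GPU_RDNA,    64, "RDNA"},
    {GPU_APU,     80, "APU"},
    {GPU_CPU,     1,  "CPU"},
    {GPU_NVIDIA,  8,  "NVIDIA"},
    {GPU_INTEL,   1,  "INTEL"},
    {GPU_UNKNOWN, 0,  "UNKNOWN"}
};

/* Return the entry whose name matches exactly, or the trailing
   UNKNOWN entry when no name matches. */
const GPU_type *find_gpu_type(const char *name)
{
    size_t n = sizeof gpu_types / sizeof gpu_types[0];
    for (size_t i = 0; i < n; i++) {
        if (strcmp(gpu_types[i].gpu_name, name) == 0)
            return &gpu_types[i];
    }
    return &gpu_types[n - 1];
}
```

With this sketch, find_gpu_type("INTEL")->CE_per_multiprocessor is the value you would change when experimenting with the INTEL entry. Whether and where the real program reads that value is a separate question.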

AndyPC
Joined: 22 Jan 24
Posts: 2
Credit: 23,394,500
RAC: 0
Message 9663 - Posted: 6 Feb 2024, 15:36:27 UTC - in response to Message 9662.

It looks like that integer is GPU_type.CE_per_multiprocessor, but where is it consumed? I only see that name referenced twice in the whole repo, both in mfakto.cpp. One is setting it and the other is printing it.



Copyright © 2014-2025 BOINC Confederation / rebirther