Constant calculation errors since the server crash and restoration

Carlos Solís
Joined: 5 Oct 24
Posts: 6
Credit: 8,500
RAC: 2
Message 10209 - Posted: 3 Nov 2024, 14:00:19 UTC

Since the server blue-screened and had to be rolled back to an old backup, I've been getting constant errors when attempting SRBase workloads. The program "TF 0.29 (opencl_ati_101)" always runs for 9 to 10 seconds, then errors out with a "calculation error". I've already tried resetting the project in the hope that it was just a file corrupted by the crash, but to no avail. All the other projects I'm contributing to are working correctly, even the GPU-based ones. Any clue as to what may be happening?

rebirther
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 2 Jan 13
Posts: 7479
Credit: 43,691,181
RAC: 42,242
Message 10210 - Posted: 3 Nov 2024, 14:22:12 UTC - in response to Message 10209.
Last modified: 3 Nov 2024, 14:25:46 UTC

There has been no change to the apps since the crash, so the problem must be on your side. Mine is running on an RX 5500 XT. Try copying the app to a folder outside BOINC and running mfakto.exe -st. Your self-tests failed.

Carlos Solís
Message 10213 - Posted: 4 Nov 2024, 15:22:17 UTC - in response to Message 10210.

Looks like my self-tests are indeed failing. The problem is that they did not fail before:

Loading binary kernel file mfakto_Kernels.elf
Compiling kernels.
GPUSievePrimes (adjusted) 82486
GPUsieve minimum exponent 1055144
Started a simple self-test ...
ERROR: self-test failed for M51332417 (cl_barrett15_69_gs) no factor found
ERROR: self-test failed for M50896831 (cl_barrett15_71_gs) no factor found
ERROR: self-test failed for M50979079 (cl_barrett15_73_gs) no factor found
ERROR: self-test failed for M51232133 (cl_barrett15_73_gs) no factor found
ERROR: self-test failed for M50830523 (cl_barrett15_73_gs) no factor found
ERROR: self-test failed for M50752613 (cl_barrett15_73_gs) no factor found
ERROR: self-test failed for M51507913 (cl_barrett15_73_gs) no factor found
ERROR: self-test failed for M51916901 (cl_barrett15_74_gs) no factor found
ERROR: self-test failed for M50805581 (cl_barrett15_82_gs) no factor found
ERROR: self-test failed for M51157429 (cl_barrett15_82_gs) no factor found
ERROR: self-test failed for M51406151 (cl_barrett15_82_gs) no factor found
ERROR: self-test failed for M51478381 (cl_barrett15_82_gs) no factor found
ERROR: self-test failed for M51350527 (cl_barrett15_82_gs) no factor found
ERROR: self-test failed for M53061139 (cl_barrett15_82_gs) no factor found
ERROR: self-test failed for M48629519 (cl_barrett15_83_gs) no factor found
ERROR: self-test failed for M55069117 (cl_barrett15_69_gs) no factor found
ERROR: self-test failed for M45448679 (cl_barrett15_83_gs) no factor found
Self-test statistics
  number of tests 30
  successful tests 13
  no factor found 17
self-test FAILED!
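For anyone who wants to see which kernels account for the failures in a log like the one above, here is a minimal Python sketch that tallies the "self-test failed" lines per kernel name. It assumes the failure lines look exactly like the ones posted here; the function name `tally_failures` and the shortened sample are illustrative only and are not part of mfakto.

```python
import re
from collections import Counter

# Matches one failure line in the format shown in the posted log.
# Group 1 is the exponent, group 2 is the kernel name.
FAILURE_RE = re.compile(
    r"ERROR: self-test failed for M(\d+) \((\w+)\) no factor found"
)


def tally_failures(log_text: str) -> Counter:
    """Count failed self-tests per kernel name (illustrative helper)."""
    return Counter(kernel for _exponent, kernel in FAILURE_RE.findall(log_text))


# Three lines copied from the log above, used as a shortened sample.
sample = (
    "ERROR: self-test failed for M51332417 (cl_barrett15_69_gs) no factor found\n"
    "ERROR: self-test failed for M50896831 (cl_barrett15_71_gs) no factor found\n"
    "ERROR: self-test failed for M55069117 (cl_barrett15_69_gs) no factor found\n"
)

for kernel, count in sorted(tally_failures(sample).items()):
    print(kernel, count)
```

Because the pattern does not depend on line breaks, it should also work on output that was pasted as one long line.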

rebirther
Message 10214 - Posted: 4 Nov 2024, 15:26:18 UTC - in response to Message 10213.

Driver problem? What driver is installed?

Carlos Solís
Message 10215 - Posted: 4 Nov 2024, 19:01:56 UTC - in response to Message 10214.

Standard AMD drivers, version 24.10.1

rebirther
Message 10216 - Posted: 4 Nov 2024, 19:38:19 UTC - in response to Message 10215.

If other projects run fine on the card, then we need an mfakto update soon. The devs are working on support for the 6000 series and newer. Our coder tested it on a smaller 6xxx card and it worked.

Dirk Broer
Joined: 2 Jan 15
Posts: 74
Credit: 146,667,551
RAC: 111,781
Message 10228 - Posted: 12 Nov 2024, 15:16:11 UTC - in response to Message 10213.
Last modified: 12 Nov 2024, 15:21:08 UTC

Microsoft has the infuriating habit of overwriting AMD drivers (perhaps even Nvidia drivers and Intel drivers too) with their own castrated (no OpenCL) versions.

My RX 6400 card only produces errors on TF, as does the Radeon 780M IGP of my Ryzen 7 8700G.

Where is that mfakto.ini that I need to update? Why don't the developers show a more pro-active attitude? If AMD launches a new range of APUs and/or cards, they will be used.

Dirk Broer
Message 10229 - Posted: 12 Nov 2024, 15:22:49 UTC - in response to Message 10216.

The 7000-series is out and the 8000-series is waiting round the corner.
Time for an app that is based upon capabilities, not an entry in an *.ini file.
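One small step in the direction described here would be to derive the GPU type from what the OpenCL driver reports instead of from a hand-edited entry. The Python sketch below is only an illustration of that idea and is not how mfakto selects its kernels. It assumes the device name starts with a "gfx" identifier, as in the "gfx1032" output posted in this thread, and it assumes that four-digit gfx10xx and gfx11xx identifiers belong to the RDNA family; the function name `guess_gpu_family` is made up for this example.

```python
import re
from typing import Optional

# Four-digit gfx identifier, e.g. "gfx1032" as reported in this thread.
GFX_RE = re.compile(r"gfx(\d{4})\b")


def guess_gpu_family(device_name: str) -> Optional[str]:
    """Guess a family label from an OpenCL device name (illustrative only).

    Assumption: gfx10xx and gfx11xx identifiers are RDNA-family parts.
    Anything else returns None so the caller can fall back to a default.
    """
    match = GFX_RE.search(device_name)
    if match is None:
        return None
    if match.group(1)[:2] in ("10", "11"):
        return "RDNA"
    return None


print(guess_gpu_family("gfx1032 (Advanced Micro Devices, Inc.)"))
```

A name-based lookup like this still needs updating for each new generation, so a real capability check would have to query the device properties themselves.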

rebirther
Message 10231 - Posted: 12 Nov 2024, 15:37:49 UTC - in response to Message 10229.

It's still in testing and needs another kernel file to run.

rebirther
Message 10234 - Posted: 13 Nov 2024, 17:57:43 UTC

If you want to test beta1, run

mfakto.exe -st

in a separate folder.

https://download.mersenne.ca/mfakto/windows

I hope most of the errors will be gone in beta2 and later, but it is still in development.
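The "copy it to a separate folder and run mfakto.exe -st" step can also be scripted. The following is a rough Python sketch under a few assumptions: the downloaded files sit in one folder, the executable is named mfakto.exe as in this thread, and both folder paths in the usage comment are placeholders. The helper names are invented for this example; only the `-st` switch comes from the posts above.

```python
import shutil
import subprocess
from pathlib import Path
from typing import List


def build_selftest_command(work_dir: Path, exe_name: str = "mfakto.exe") -> List[str]:
    """Build the command line for the self-test ("-st" as given in this thread)."""
    return [str(work_dir / exe_name), "-st"]


def run_selftest(app_dir: Path, work_dir: Path) -> subprocess.CompletedProcess:
    """Copy the app to a separate folder and run the self-test there."""
    # Copy the whole folder so the ini file and kernel files travel with the exe.
    shutil.copytree(app_dir, work_dir, dirs_exist_ok=True)
    return subprocess.run(
        build_selftest_command(work_dir),
        cwd=work_dir,          # run inside the copy, not inside the BOINC folder
        capture_output=True,   # keep the output so it can be posted or parsed
        text=True,
        check=False,           # a failing self-test should not raise here
    )


# Usage with placeholder paths (adjust both to your own system):
#   result = run_selftest(Path(r"C:\Downloads\mfakto-beta"), Path(r"C:\mfakto-test"))
#   print(result.stdout)
```

Running from a copy keeps the test away from the files BOINC manages, which is the point of using a separate folder.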

Carlos Solís
Send message
Joined: 5 Oct 24
Posts: 6
Credit: 8,500
RAC: 2
Message 10236 - Posted: 14 Nov 2024, 19:19:21 UTC - in response to Message 10234.

Tested Beta 1 on my RX 6600 - the vast majority of the tests passed, but not all:

Self-test statistics
  number of tests 34026
  successful tests 32619
  no factor found 1407
self-test FAILED!
ERROR: self-test failed, exiting.
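To put the numbers above in proportion, the summary can be parsed and turned into a failure rate. This is a small Python sketch that assumes the three counters appear in the order shown in the post, with nothing else between them; `parse_stats` and `failure_rate` are names made up for the example.

```python
import re
from typing import Tuple

# The three counters in the order they appear in the posted summary.
STATS_RE = re.compile(
    r"number of tests\s+(\d+)\s+successful tests\s+(\d+)\s+no factor found\s+(\d+)"
)


def parse_stats(text: str) -> Tuple[int, int, int]:
    """Return (total, successful, no_factor_found) from a self-test summary."""
    match = STATS_RE.search(text)
    if match is None:
        raise ValueError("no self-test statistics found")
    total, successful, missed = (int(value) for value in match.groups())
    return total, successful, missed


def failure_rate(total: int, missed: int) -> float:
    """Fraction of tests in which the expected factor was not found."""
    return missed / total


summary = (
    "Self-test statistics number of tests 34026 successful tests 32619 "
    "no factor found 1407 self-test FAILED!"
)
total, successful, missed = parse_stats(summary)
print(f"{missed} of {total} tests failed ({failure_rate(total, missed):.2%})")
```

For the beta 1 figures this works out to roughly 4% of the tests, compared with 17 of 30 in the earlier short self-test.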

rebirther
Message 10237 - Posted: 14 Nov 2024, 20:44:40 UTC - in response to Message 10236.

thx, reported.

rebirther
Message 10241 - Posted: 15 Nov 2024, 18:05:38 UTC - in response to Message 10236.
Last modified: 15 Nov 2024, 19:02:10 UTC

Can you test with GPUType=RDNA instead of auto in the ini file? Please also post the GPU type (gfx<xxxx>).
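The ini change requested here can be made in any text editor. For repeated testing, a scripted version might look like the Python sketch below. It assumes mfakto.ini uses plain `key=value` lines, and it spells the key `GPUType` the way mfakto prints it in the runtime options shown later in this thread. The helper `set_ini_value` and the sample text are illustrative and not taken from the real file.

```python
import re


def set_ini_value(ini_text: str, key: str, value: str) -> str:
    """Set `key=value` in ini-style text, appending the line if the key is absent.

    Commented-out lines (for example "# GPUType=...") are left untouched.
    """
    pattern = re.compile(rf"^([ \t]*{re.escape(key)}[ \t]*=).*$", re.MULTILINE)
    # A function is used as the replacement so the value is inserted literally.
    new_text, replaced = pattern.subn(lambda m: f"{m.group(1)}{value}", ini_text)
    if replaced == 0:
        new_text = ini_text.rstrip("\n") + f"\n{key}={value}\n"
    return new_text


# Made-up excerpt for illustration; the real mfakto.ini has many more settings.
sample_ini = "# sample excerpt\nVectorSize=2\nGPUType=auto\n"
print(set_ini_value(sample_ini, "GPUType", "RDNA"))
```

To apply it to a real file, read the text with `Path("mfakto.ini").read_text()`, pass it through the helper and write the result back, ideally after making a backup copy.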

Carlos Solís
Message 10252 - Posted: 22 Nov 2024, 18:50:41 UTC - in response to Message 10241.

Sorry for the delay, I didn't get any notifications. I reran the test with GPUType set to RDNA. First, the GPU type, which is "gfx1032":

Runtime options
  INI file mfakto.ini
  Verbosity 1
  SieveOnGPU yes
  MoreClasses yes
  GPUSievePrimes 81157
  GPUSieveProcessSize 24 Kib
  GPUSieveSize 96 Mib
  FlushInterval 0
  WorkFile worktodo.txt
  ResultsFile results.txt
  JSONResultsFile results.json.txt
  LogFile mfakto.log
  Checkpoints enabled
  CheckpointDelay 300 s
  Stages enabled
  StopAfterFactor bitlevel
  PrintMode compact
  Logging disabled
  V5UserID none
  ComputerID none
  TimeStampInResults yes
  VectorSize 2
  GPUType RDNA
  SmallExp no
  UseBinfile mfakto_Kernels.elf
Compile-time options
Select device - Get device info:
OpenCL device info
  name gfx1032 (Advanced Micro Devices, Inc.)
  device (driver) version OpenCL 2.0 AMD-APP (3628.0) (3628.0 (PAL,LC))
  maximum threads per block 1024
  maximum threads per grid 1073741824
  number of multiprocessors 14 (896 compute elements)
  clock rate 2044 MHz
Automatic parameters
  threads per grid 0
  optimizing kernels for RDNA


And second, the results - all the same tests that failed last time failed this time as well:

Self-test statistics
  number of tests 34026
  successful tests 32619
  no factor found 1407
self-test FAILED!
ERROR: self-test failed, exiting.
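Checking that "the same tests failed" between two runs is easier to do with a script than by eye. The Python sketch below compares the sets of failed (exponent, kernel) pairs from two saved logs. It assumes both runs print failure lines in the format seen in the first log of this thread; the helper names and the shortened sample logs are made up for illustration.

```python
import re
from typing import Dict, Set, Tuple

# Exponent and kernel name from one "self-test failed" line.
FAILURE_RE = re.compile(r"self-test failed for M(\d+) \((\w+)\)")

FailedTest = Tuple[int, str]


def failed_tests(log_text: str) -> Set[FailedTest]:
    """Collect (exponent, kernel) pairs for every failed self-test in a log."""
    return {(int(exp), kernel) for exp, kernel in FAILURE_RE.findall(log_text)}


def compare_runs(before: str, after: str) -> Dict[str, Set[FailedTest]]:
    """Split failures into fixed, newly failing and still failing."""
    old, new = failed_tests(before), failed_tests(after)
    return {"fixed": old - new, "new": new - old, "still_failing": old & new}


# Shortened sample logs built from lines in the first log of this thread.
run_a = (
    "ERROR: self-test failed for M51332417 (cl_barrett15_69_gs) no factor found\n"
    "ERROR: self-test failed for M50896831 (cl_barrett15_71_gs) no factor found\n"
)
run_b = run_a  # identical failures, as reported for the GPUType=RDNA rerun

diff = compare_runs(run_a, run_b)
print({name: len(tests) for name, tests in diff.items()})
```

If "fixed" and "new" are both empty, the two runs failed on exactly the same tests, which would point away from random hardware errors.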

rebirther
Message 10253 - Posted: 22 Nov 2024, 19:09:27 UTC - in response to Message 10252.

beta2 might work, but it has not been released yet, so we need to wait. Thanks for the info; I have forwarded it for further development.

Dirk Broer
Message 10259 - Posted: 26 Nov 2024, 21:51:42 UTC - in response to Message 10228.

Any answers yet?

rebirther
Message 10260 - Posted: 27 Nov 2024, 7:43:03 UTC - in response to Message 10259.

no news yet.

rebirther
Message 10264 - Posted: 29 Nov 2024, 21:52:36 UTC - in response to Message 10260.

beta2 is now available for download; please retest with this version:

https://github.com/primesearch/mfakto/releases/tag/v0.16-beta.2

Carlos Solís
Message 10266 - Posted: 30 Nov 2024, 15:29:50 UTC - in response to Message 10264.

I just tested beta 2; the exact same 1407 tests failed.


Copyright © 2014-2024 BOINC Confederation / rebirther