Trial Factoring tests
Profile PDW
Joined: 15 Oct 15
Posts: 41
Credit: 1,078,223,594
RAC: 43,916
Message 6228 - Posted: 15 Apr 2020, 19:13:15 UTC - in response to Message 6225.

I have only been looking at mfaktc on Linux for NVidia; I don't have any working AMD cards. I only have the one dual-GPU box, which I put together this afternoon to try this.

The output in stderr.txt says:
wrapper: running ./mfaktc.exe ( --device 1)
Are you actually passing "-d 1" to the mfaktc program?
This is the second task; the first task, going to the first GPU, says ( --device 0).

In the paused BOINC slot for the second task (which started on the first GPU) I can type "sudo ./mfaktc.exe -d 1" and it will run on the second GPU. It needs sudo to create the checkpoint file. Any attempt other than "-d 1" makes it run on the first GPU again.

If the wrapper has managed to work out the correct device number to pass to mfaktc (which it looks like it has on my dual GPU system) then I don't understand why it wouldn't run it on the second GPU ?


Theoretically, both applications can support more than one (different) GPU.
But BOINC enumerates the GPUs as 0, 1, 2, ....
In OpenCL you have platforms, e.g. Intel=0, AMD=1, NVidia=2, and each platform has 1..n GPU devices.

The mapping from 0, 1, 2 to 00, 10, 11 is different for each computer with more than one graphics device.

So currently only one mapping is possible: --device 0 to -d 00.
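
To make the mapping problem concrete, here is a minimal sketch of turning a flat BOINC-style index into a platform/device pair. It assumes the two digits mean "platform, then device" and that BOINC's numbering follows platform order, which is exactly what cannot be taken for granted on a real machine. The function names and the `devices_per_platform` list are hypothetical and are not part of the wrapper or of mfakto.

```python
from typing import List, Tuple


def flat_to_platform_device(flat_index: int, devices_per_platform: List[int]) -> Tuple[int, int]:
    """Map a flat BOINC-style GPU index to an (OpenCL platform, device) pair.

    devices_per_platform[i] is the number of GPU devices on platform i.
    This list is hypothetical input; a real wrapper would have to query it.
    """
    if flat_index < 0:
        raise ValueError("flat_index must be non-negative")
    remaining = flat_index
    for platform, count in enumerate(devices_per_platform):
        if remaining < count:
            return platform, remaining
        remaining -= count
    raise ValueError(f"no GPU with flat index {flat_index}")


def as_d_argument(platform: int, device: int) -> str:
    """Format the pair in the two-digit style used in this thread, e.g. "10"."""
    return f"{platform}{device}"


# Example layout: platform 0 has one GPU, platform 1 has two.
layout = [1, 2]
print([as_d_argument(*flat_to_platform_device(i, layout)) for i in range(3)])
# prints ['00', '10', '11']
```

With a different layout (say `[2, 1]`) the same flat indices give 00, 01, 10, which is why a single hard-coded mapping only works for the first device.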

DeleteNull
Volunteer developer
Volunteer tester
Joined: 29 Nov 14
Posts: 79
Credit: 352,388,022
RAC: 592,729
Message 6231 - Posted: 15 Apr 2020, 20:00:33 UTC - in response to Message 6222.
Last modified: 15 Apr 2020, 20:00:56 UTC

Do you have the complete wrapper.cpp file?


The new wrapper works as expected.
Task starts with 0%, switches to 100% after one minute.
This is because there is no Mxxxxxx.ckp file in the first 5 minutes.

After 5 minutes it switches to the percentage of the last Mxxxxxx.ckp.
The remaining time increases (because the percentage stays at the old level for 5 minutes)
The next update comes after 10 minutes, and then every 5 minutes.

Is this o.k., or shall I implement a smoother calculation of the percentage (a few more steps in the 5 minute interval)?
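
For illustration only, a "smoother calculation" could extrapolate between checkpoints from the rate seen over the previous interval. This is a sketch under assumptions, not what wrapper.cpp does: the function name, its parameters, and the 300-second default interval are hypothetical, and the fractions are assumed to have been read from the two most recent Mxxxxxx.ckp files.

```python
def smoothed_fraction(prev_ckp: float, last_ckp: float,
                      seconds_since_last: float,
                      ckp_interval: float = 300.0) -> float:
    """Extrapolate the fraction done between two checkpoint updates.

    prev_ckp and last_ckp are fractions (0.0 to 1.0) from the two most recent
    checkpoints; ckp_interval is the assumed time between checkpoints.
    """
    # Progress per second observed over the last full interval.
    rate = max(last_ckp - prev_ckp, 0.0) / ckp_interval
    # Do not run ahead of where the next checkpoint is expected to land.
    elapsed = min(max(seconds_since_last, 0.0), ckp_interval)
    return min(last_ckp + rate * elapsed, 1.0)
```

Called once a minute, this would move the percentage in small steps instead of holding it at the old level for 5 minutes.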

Profile rebirther
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 2 Jan 13
Posts: 7228
Credit: 42,729,227
RAC: 31
Message 6232 - Posted: 15 Apr 2020, 20:10:40 UTC - in response to Message 6231.

No, that's good.

DeleteNull
Volunteer developer
Volunteer tester
Joined: 29 Nov 14
Posts: 79
Credit: 352,388,022
RAC: 592,729
Message 6233 - Posted: 15 Apr 2020, 20:19:21 UTC - in response to Message 6232.

o.k., you can find the updated wrapper.cpp here:
https://p-numbers.net/wrapper.cpp

Works for mfakto (mfaktc), but will not work for LLR.

Profile PDW
Joined: 15 Oct 15
Posts: 41
Credit: 1,078,223,594
RAC: 43,916
Message 6234 - Posted: 15 Apr 2020, 21:31:55 UTC - in response to Message 6225.

So there is currently only one mapping --device 0 to d 00. (possible)

I don't think you are mapping anything!

On mfaktc everything goes to the first device because the program does not recognise the command-line option telling it which GPU device to use, so it defaults to the first one. The wrapper seems to know the right device number, but it is ignored because it is not passed in the format mfaktc expects.
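
As a rough sketch of the fix being asked for, the argument list could be rewritten before the program is launched. It relies only on what is reported in this thread: the wrapper passes "--device N" and mfaktc responds to "-d N". The function name is hypothetical and this is not code from wrapper.cpp.

```python
from typing import List


def translate_device_args(args: List[str]) -> List[str]:
    """Rewrite "--device N" as "-d N" and leave every other argument alone."""
    out: List[str] = []
    i = 0
    while i < len(args):
        if args[i] == "--device" and i + 1 < len(args):
            # Keep the device number, swap only the option name.
            out += ["-d", args[i + 1]]
            i += 2
        else:
            out.append(args[i])
            i += 1
    return out


print(translate_device_args(["--device", "1"]))
# prints ['-d', '1']
```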

DeleteNull
Volunteer developer
Volunteer tester
Joined: 29 Nov 14
Posts: 79
Credit: 352,388,022
RAC: 592,729
Message 6235 - Posted: 15 Apr 2020, 21:46:17 UTC - in response to Message 6234.

I've patched mfakto, not mfaktc.
So the mapping --device 0 to -d 00 is only implemented for AMD, not NVidia
(and only on Linux).

Profile PDW
Joined: 15 Oct 15
Posts: 41
Credit: 1,078,223,594
RAC: 43,916
Message 6253 - Posted: 17 Apr 2020, 18:26:18 UTC

I started running this task yesterday evening on the second device, but it went to device 0 by default even though the log said device 1.
I carried on running it overnight outside of BOINC until GPU 0 was available, then completed it on device 0 and reported it through BOINC.

<core_client_version>7.9.3</core_client_version>
<![CDATA[
<stderr_txt>
18:53:44 (21480): wrapper (7.2.26012): starting
18:53:44 (21480): wrapper: running ./mfaktc.exe ( --device 1)
19:01:46 (21530): wrapper (7.2.26012): starting
19:01:46 (21530): wrapper: running ./mfaktc.exe ( --device 1)
19:04:28 (21551): wrapper (7.2.26012): starting
19:04:28 (21551): wrapper: running ./mfaktc.exe ( --device 1)
19:23:31 (21671): wrapper (7.2.26012): starting
19:23:31 (21671): wrapper: running ./mfaktc.exe ( --device 1)
20:56:42 (22038): wrapper (7.2.26012): starting
20:56:42 (22038): wrapper: running ./mfaktc.exe ( --device 0)
10:31:40 (126157): wrapper (7.2.26012): starting
10:31:40 (126157): wrapper: running ./mfaktc.exe ( --device 0)
18:56:42 (126157): ./mfaktc.exe exited; CPU time 24.265708
18:56:42 (126157): called boinc_finish
</stderr_txt>
]]>

Do you have any plans to get '--device' changed to '-d' for mfaktc so it can work as it should?
It is easy enough for me just to run a second client and put a GPU in each, but most won't want, or know how, to do that.

bluestang
Joined: 6 Jun 19
Posts: 59
Credit: 1,268,237,570
RAC: 0
Message 6306 - Posted: 19 Apr 2020, 19:38:48 UTC - in response to Message 6253.

What is your cc_config and app_config setup? I was able to run 2 instances to get both GPUs to work on 1 WU each, but now with the changes (and who knows what was changed, as apparently no one knows) it will only run on my 1st GPU no matter what I've tried. I'm on Windows 10 with 2x 1660 Ti.

bluestang
Joined: 6 Jun 19
Posts: 59
Credit: 1,268,237,570
RAC: 0
Message 6307 - Posted: 19 Apr 2020, 19:39:57 UTC - in response to Message 6235.

I've patched mfakto, not mfaktc.
So the mapping --device 0 to -d 00 is only implemented for AMD, not NVidia.
(and only linux)


This is ridiculous. You're screwing us on Windows...please fix/implement so it works properly there too.

Profile mikey
Joined: 29 Apr 16
Posts: 43
Credit: 1,385,289,607
RAC: 2,995,169
Message 6713 - Posted: 29 Aug 2020, 23:16:22 UTC

I can run the TF tasks on my Windows PCs with no problems, but on every single one of my Linux PCs every single task errors out. Any ideas on how to fix it?

I just checked one and it says
"Stderr output
<core_client_version>7.16.6</core_client_version>
<![CDATA[
<message>
process exited with code 195 (0xc3, -61)</message>
<stderr_txt>
14:35:21 (2977): wrapper (7.2.26012): starting
14:35:21 (2977): wrapper: running ./mfaktc.exe ( --device 0)
./mfaktc.exe: error while loading shared libraries: libcudart.so.10.1: cannot open shared object file: No such file or directory
14:35:22 (2977): ./mfaktc.exe exited; CPU time 0.000571
14:35:22 (2977): app exit status: 0x7f00
14:35:22 (2977): called boinc_finish"

The pc above is: CPU type AuthenticAMD
AMD Ryzen Threadripper 1920X 12-Core Processor [Family 23 Model 1 Stepping 1]
Number of processors 24
Coprocessors NVIDIA GeForce GTX 1080 Ti (4095MB) driver: 435.21 OpenCL: 1.2
Operating System Linux LinuxMint
Linux Mint 19.3 Tricia [5.0.0-32-generic|libc 2.27 (Ubuntu GLIBC 2.27-3ubuntu1)]
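
A note on reading the "app exit status: 0x7f00" line: assuming the wrapper logs the raw POSIX wait status (an assumption, not something confirmed in this thread), it can be split with the conventional Linux encoding as sketched here. The function name is hypothetical.

```python
def decode_wait_status(status: int) -> dict:
    """Split a POSIX wait status into exit code and terminating signal.

    Conventional Linux encoding: low 7 bits = signal number,
    next 8 bits = exit code of the process.
    """
    signal = status & 0x7F
    exit_code = (status >> 8) & 0xFF
    return {"exited_normally": signal == 0, "exit_code": exit_code, "signal": signal}


print(decode_wait_status(0x7F00))
# prints {'exited_normally': True, 'exit_code': 127, 'signal': 0}
```

Exit code 127 is what the dynamic loader returns when a program cannot be started because a shared library is missing, which is consistent with the libcudart.so.10.1 error in the log.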

Profile rebirther
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 2 Jan 13
Posts: 7228
Credit: 42,729,227
RAC: 31
Message 6714 - Posted: 30 Aug 2020, 6:09:03 UTC - in response to Message 6713.

You must install the CUDA 10 libraries; see the FAQ.
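
One way to check whether the install worked, before letting BOINC fetch more tasks, is to ask the dynamic loader for the exact library named in the error. This is a small sketch using Python's standard ctypes module; the helper name `can_load` is made up for this example.

```python
import ctypes


def can_load(library_name: str) -> bool:
    """Return True if the dynamic loader can find and open the shared library."""
    try:
        ctypes.CDLL(library_name)
        return True
    except OSError:
        # Raised when the library is missing or cannot be opened.
        return False


if __name__ == "__main__":
    name = "libcudart.so.10.1"  # the library named in the mfaktc error
    print(f"{name}: {'found' if can_load(name) else 'NOT found'}")
```

If this reports "NOT found" after installing, the library is probably in a directory the loader does not search.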

Profile mikey
Joined: 29 Apr 16
Posts: 43
Credit: 1,385,289,607
RAC: 2,995,169
Message 6717 - Posted: 31 Aug 2020, 1:59:04 UTC - in response to Message 6714.

I did the CUDA thing, but 1804 is gone and 2004 is in its place, so I tried that and of course it failed as I'm using Linux Mint 19.3. I will work on just the Lib 10 files.

Profile rebirther
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 2 Jan 13
Posts: 7228
Credit: 42,729,227
RAC: 31
Message 6718 - Posted: 31 Aug 2020, 17:34:06 UTC - in response to Message 6717.


I did the CUDA thing, but 1804 is gone and 2004 is in its place, so I tried that and of course it failed as I'm using Linux Mint 19.3. I will work on just the Lib 10 files.


https://mrprajesh.blogspot.com/2018/11/install-cuda-10-on-linux-mint-19-or.html

Profile mikey
Joined: 29 Apr 16
Posts: 43
Credit: 1,385,289,607
RAC: 2,995,169
Message 6721 - Posted: 31 Aug 2020, 20:30:41 UTC - in response to Message 6718.
Last modified: 31 Aug 2020, 20:54:01 UTC


https://mrprajesh.blogspot.com/2018/11/install-cuda-10-on-linux-mint-19-or.html


Thank you, but it fails at the part where it says wget...cuda-repo-ubuntu...and says no such directory.

Profile mikey
Joined: 29 Apr 16
Posts: 43
Credit: 1,385,289,607
RAC: 2,995,169
Message 6722 - Posted: 2 Sep 2020, 16:43:24 UTC - in response to Message 6721.


I also tried the "runfile(local)" option at this site https://developer.nvidia.com/cuda-downloads?target_os=Linux&target_arch=x86_64&target_distro=Ubuntu&target_version=2004&target_type=runfilelocal and it got to the 98% range and stopped downloading.

I'm guessing either I have the wrong version or don't have enough knowledge, so I will just run it on my Windows PCs and be happy.

Profile PDW
Joined: 15 Oct 15
Posts: 41
Credit: 1,078,223,594
RAC: 43,916
Message 6723 - Posted: 2 Sep 2020, 17:17:18 UTC - in response to Message 6721.


https://mrprajesh.blogspot.com/2018/11/install-cuda-10-on-linux-mint-19-or.html


Thank you but it fails at the part where it says wget...cuda-repo-ubuntu...and says no such directory

There is no wget command on that web page !

Profile mikey
Avatar
Send message
Joined: 29 Apr 16
Posts: 43
Credit: 1,385,289,607
RAC: 2,995,169
Message 6724 - Posted: 2 Sep 2020, 20:11:42 UTC - in response to Message 6723.


There is no wget command on that web page !


It's on this page https://developer.nvidia.com/cuda-downloads?target_os=Linux&target_arch=x86_64&target_distro=Ubuntu&target_version=2004&target_type=runfilelocal

Copyright © 2014-2024 BOINC Confederation / rebirther