Do you mean to exclude the other devices for SRBase to run only on device 0?
For one, you're asking the wrong question.
I don't want to exclude any devices. But you keep telling us to exclude all but one device to get your TF projects to play somewhat nicely.
Forgot to mention: if someone ran the TF projects on an 8-GPU server, I believe the project would either run 8 TF projects on 1 GPU at the same time and leave the rest not running anything, or run only one TF GPU project and fill the rest with other projects. Again, 1 TF out of 8 GPUs. Note: there are some computers that can run up to 8 GPUs (not just servers). Again, only 1 GPU runs TF for anyone with more than 1 GPU in their computer.
I can speak for all of us when I say, "we want to run the same number of TF projects as we have GPUs". Example: with 2 GPUs, we should run 2 TF projects, one for each GPU. Or one TF project on ANY GPU and whatever project BOINC wants to run on the other GPU. Oh, wait, BOINC does that on every project out there, like PrimeGrid, GPUGrid, Amicable Numbers, Einstein, etc. Even all of SRBase. Well, for some reason NOT for TF projects.
TF is the only project I can see that runs two projects at the same time on the same GPU and leaves the other GPU not doing anything.
BOINC shows one TF on device 0 and a second TF on device 1, but the device 1 it shows is false.
People are trying to tell you that there is a problem with your TF project assigning GPUs to their respective devices. It just baffles all of us that you are the one who should be listening to people complaining about a problem with your TF projects, and you're not getting that you need to fix it.
Well, I'm just not going to run any TF projects until you fix it. I'm not even going into the config files just to have it run only 1 TF at a time. |
|
|
rebirther (Volunteer moderator, Project administrator, Project developer, Project tester, Project scientist)
Joined: 2 Jan 13 Posts: 7998 Credit: 44,538,964 RAC: 1
|
I can understand your situation. The issue is that BOINC runs all of these GPU tasks only on device 0. Also, mfaktc/mfakto have no multi-GPU support within a single task; here is the relevant part of the readme:
Q Does mfaktc support multiple GPUs?
A Yes, with the exception that a single instance of mfaktc can only use one
GPU. For each GPU you want to run mfaktc on you need (at least) one
instance of mfaktc. For each instance of mfaktc you can use the
commandline option "-d <GPU number>" to specify which GPU to use for each
specific mfaktc instance.
An app_config.xml could help, but I haven't found a solution yet. That's why we exclude the other devices, so that other projects can use them and they don't sit idle. |
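For readers unfamiliar with the device exclusion mentioned above, here is a minimal sketch of what it can look like in cc_config.xml, using the BOINC client's <exclude_gpu> option. The project URL is an assumption inferred from the project directory name that appears in this thread (srbase.my-firewall.org_sr5); it has to match the URL shown by your own client exactly. The device number 1 is only an example.

```xml
<cc_config>
  <options>
    <exclude_gpu>
      <!-- Assumed URL, inferred from the project directory name;
           replace with the URL your client lists for SRBase. -->
      <url>http://srbase.my-firewall.org/sr5/</url>
      <!-- Example only: keep SRBase off GPU 1 so other projects can use it. -->
      <device_num>1</device_num>
    </exclude_gpu>
  </options>
</cc_config>
```

On a host with more than two GPUs, one <exclude_gpu> block per excluded device would be needed. Note that this only keeps the extra GPUs free for other projects; it does not make TF run on them.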
|
|
|
I think if you changed the word "project" to the word "tasks", no quotes, it would be easier to get what you are trying to say.
What I think you are trying to say is that you want EVERY GPU in the PC to run at least one TF task here at SRBase, just as happens at every other BOINC project (e.g. MilkyWay, Einstein) when you use the <use_all_gpus> line in the cc_config file. |
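For reference, a minimal cc_config.xml carrying that line would look roughly like the following. The file lives in the BOINC data directory; any other options you already have would sit alongside it inside <options>.

```xml
<cc_config>
  <options>
    <!-- 1 = use every detected GPU, not only the most capable one -->
    <use_all_gpus>1</use_all_gpus>
  </options>
</cc_config>
```

This tells the client to schedule work on all GPUs. It does not change which device number an application actually opens, which is the behaviour being reported for TF in this thread.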
|
|
|
Yes, that's what I understand them to be saying also.
If you have three NVIDIA GTX cards in a computer, then a single TF task should get assigned to each GPU device automatically, as long as <use_all_gpus>1</use_all_gpus> is in cc_config.
From Rebirther:
Q Does mfaktc support multiple GPUs?
A Yes, with the exception that a single instance of mfaktc can only use one
GPU. For each GPU you want to run mfaktc on you need (at least) one
instance of mfaktc. For each instance of mfaktc you can use the
commandline option "-d <GPU number>" to specify which GPU to use for each
specific mfaktc instance.
So how does the "-d <GPU number>" switch work?
Do we send that switch to the TF instance and how do we do that?
Or is it a switch applied in the batch file starting a BOINC client with <use_all_gpus>0</use_all_gpus> set in cc_config?
This is strange. If you start Milkyway@Home or Einstein@Home, then a single BOINC client with <use_all_gpus>1</use_all_gpus> will send an equal number of WUs to each GPU.
SRBase TF should do the same thing, automatically, without the user sending "-d <GPU number>" to any process.
I'll restate my opinion: BOINC management of GPUs is archaic and clunky. We should easily be able to use the BOINC Management GUI to assign GPU WUs without touching config files....
____________
My primes found at SRBase:
40*1017^215605+1 (Top 5000)
18922*111^383954+1 (Top 5000)
19116*24^791057-1 (Top 5000)
4281*880^27069+1 |
|
|
rebirther (Volunteer moderator, Project administrator, Project developer, Project tester, Project scientist)
Joined: 2 Jan 13 Posts: 7998 Credit: 44,538,964 RAC: 1
|
BOINC always sends the work to device 0; if you have more GPUs on one host, the tasks will still only run on that device. I have already asked the BOINC devs to change that. We need it to work the same way it does for CPUs: 1 WU per GPU. |
|
|
|
There must be some kind of translation problem here.
I'm going to try to say this clearly:
SRBase is the only project I know of with this problem.
The other projects have solved the problem.
It shouldn't be the user's problem to make SRBase work properly.
This project might benefit from the experience of people at PrimeGrid.
PrimeGrid does not have problems with multi-GPU, nor is any special user configuration required. |
|
|
rebirther (Volunteer moderator, Project administrator, Project developer, Project tester, Project scientist)
Joined: 2 Jan 13 Posts: 7998 Credit: 44,538,964 RAC: 1
|
I have already posted this everywhere. The problem is the app itself. If one of the devs can change it from single-instance to multi-instance, then we are good. I am still open to any hints or solutions.
Unfortunately gpuowl, which has multi-GPU support, doesn't support TF. |
|
|
DeleteNull (Volunteer developer, Volunteer tester)
Joined: 29 Nov 14 Posts: 89 Credit: 583,618,162 RAC: 361
|
Hello, this is the content of a job file:
../../projects/srbase.my-firewall.org_sr5/job_TF_l64c_00020.xml
<job_desc>
  <task>
    <application>./mfaktc.exe</application>
    <append_cmdline_args/>
  </task>
  <unzip_input>
    <zipfilename>mfaktc-linux64-v6.zip</zipfilename>
  </unzip_input>
</job_desc>
If you want a specific device number, you have to add a -d <number> parameter (the default is 0):
Usage: ./mfaktc.exe [options]
-h display this help and exit
-d <device number> specify the device number used by this program
-tf <exp> <min> <max> trial factor M<exp> from 2^<min> to 2^<max> and exit
instead of parsing the worktodo file
-st run builtin selftest and exit
-st2 same as -st but extended range for k_min/m_max
-v <number> set verbosity (min = 0, default = 1, more = 2, max/debug = 3) |
|
|
|
OK, I was seeing "BOINC app" mentioned and thinking that meant that the boinc-client app was being blamed for the problem. |
|
|
DeleteNull (Volunteer developer, Volunteer tester)
Joined: 29 Nov 14 Posts: 89 Credit: 583,618,162 RAC: 361
|
As far as I know, BOINC passes e.g. "-device 1" to the application, so maybe we have to update the code so that it understands -device instead of -d. |
|
|
|
Can we permanently edit the job file or is it created freshly upon every WU?
Can we edit a job file template that supersedes the default?
|
|
rebirther (Volunteer moderator, Project administrator, Project developer, Project tester, Project scientist)
Joined: 2 Jan 13 Posts: 7998 Credit: 44,538,964 RAC: 1
|
No, you need an app_config.xml; the job file is signed by the server. |
|
|
|
So could you give us an example app_config to force a WU onto dev x with a switch, please?
|
|
|
So could you give us an example app_config to force a WU onto dev x with a switch, please? |
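As a rough, untested sketch of what such an app_config.xml could look like: the client's app_config format has an <app_version> section with a <cmdline> element, and the job file quoted earlier contains <append_cmdline_args/>, so extra arguments may be forwarded to mfaktc. The app name and plan class below are hypothetical placeholders, not values confirmed anywhere in this thread; the real ones would have to be read from client_state.xml. The file would go in the project directory (projects/srbase.my-firewall.org_sr5/).

```xml
<app_config>
  <app_version>
    <!-- Hypothetical placeholder: use the short app name from client_state.xml -->
    <app_name>TF</app_name>
    <!-- Hypothetical placeholder: must match the app version's plan class -->
    <plan_class>cuda</plan_class>
    <!-- mfaktc's own switch from the usage text above; device 1 as an example -->
    <cmdline>-d 1</cmdline>
  </app_version>
</app_config>
```

Even if the switch is passed through, this pins every TF task of that app version to the same device, so by itself it would not spread tasks across all GPUs. It would only move TF from device 0 to the chosen one.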
|
|