Author |
Message |
rebirther (Volunteer moderator, Project administrator, Project developer, Project tester, Project scientist)
Joined: 2 Jan 13 Posts: 7509 Credit: 44,029,375 RAC: 20,028 |
The checkpoint is generated from mfakto with the name: M97983883.ckp
Yes, but the wrapper cannot read it or save the elapsed time, unless I am wrong; I never checked the new wrapper code. If you restart BOINC, the stderr.txt must keep the last savepoint. |
|
|
DeleteNull (Volunteer developer, Volunteer tester)
Joined: 29 Nov 14 Posts: 83 Credit: 381,908,322 RAC: 284,346 |
In this case the file M97983883.ckp keeps the checkpoint:
97983883 77 78 4620 mfakto 0.15pre6: 17 0 B624ABF0
97983883 77 78 4620 mfakto 0.15pre6: 45 0 150D244C
97983883 77 78 4620 mfakto 0.15pre6: 72 0 7071D1B7
97983883 77 78 4620 mfakto 0.15pre6: 93 0 48AC81E6
97983883 77 78 4620 mfakto 0.15pre6: 116 0 9EC681F1
So the estimated runtime is 9:50 hours.
But the name M97983883.ckp is different for each test.
...we need a few additional lines of code in the wrapper, so that it can handle the .ckp file. |
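As a rough illustration of what handling the .ckp file could involve, here is a minimal C++ sketch that turns one checkpoint line like those above into a fraction of work done. It assumes the fourth whitespace-separated field (4620 in the sample lines) is the total and the seventh (17, 45, 72, ...) is the amount done; the thread does not spell out the field meanings. The names fraction_from_ckp_line and projected_total_seconds are hypothetical and are not part of mfakto or the BOINC wrapper.

```cpp
#include <exception>
#include <sstream>
#include <string>
#include <vector>

// Fraction of work done (0..1) taken from one checkpoint line, or 0.0 if the
// line does not have the expected shape.
// ASSUMPTION: whitespace-separated fields; field 4 is the total and
// field 7 is the amount already done.
double fraction_from_ckp_line(const std::string& line) {
    std::istringstream in(line);
    std::vector<std::string> fields;
    std::string tok;
    while (in >> tok) fields.push_back(tok);
    if (fields.size() < 7) return 0.0;

    double total = 0.0, done = 0.0;
    try {
        total = std::stod(fields[3]);
        done = std::stod(fields[6]);
    } catch (const std::exception&) {
        return 0.0;  // a field was not numeric
    }
    if (total <= 0.0) return 0.0;

    double frac = done / total;
    if (frac < 0.0) return 0.0;
    if (frac > 1.0) return 1.0;
    return frac;
}

// Simple linear projection of the total runtime in seconds.
// Returns a negative value while no estimate is possible (frac <= 0).
double projected_total_seconds(double elapsed_seconds, double frac) {
    if (frac <= 0.0) return -1.0;
    return elapsed_seconds / frac;
}
```

For the first sample line (fields 4620 and 17) this gives 17/4620, roughly 0.37 % done. The projection is only a linear extrapolation and says nothing about how the runtime estimate quoted in the post was obtained.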
|
|
rebirther |
The old wrapper code is here |
|
|
DeleteNull |
I added a second PC to this test, and it runs too (had to reset the project).
The good news:
Two tests are running, one with Ubuntu 18.04, the other with openSUSE 15.1.
One with driver 20.10 (Radeon 5500XT), the other with driver 19.50 (R9 380).
One will need about 10 hours, the other will need 27 hours.
The bad news:
You have to wait at 100% until it's finished, or the wrapper has to be extended. |
|
|
DeleteNull |
Tomorrow (after work) I will try to implement a method in the wrapper so that it can deal with the .ckp file.
The ini file should change from CheckpointDelay=300 to CheckpointDelay=60
(for both NVIDIA and AMD). |
|
|
rebirther |
Every 60 s is bad. The old wrapper code can be used to give all apps checkpointing. Some time ago I tried to implement it but failed. As far as I know there are only two sections of code. We only need to add the old snippet to the latest code. |
|
|
DeleteNull |
Will need another evening. C and characters (strings) are colliding worlds. |
|
|
DeleteNull |
A new (test) version of the wrapper (mfakto/mfaktc) is here:
https://p-numbers.net/wrapper_26016_7.16_mfakt
The next test can begin..... |
|
|
rebirther |
Can you test it standalone first? Start the wrapper, wait for the checkpoint, close the window, restart the wrapper and check the stderr.txt. Do you have the source code if it's working? I could try to build it on Windows. |
|
|
DeleteNull |
Does the wrapper work without BOINC?
So I create a test dir and copy the files into it?
And yes...you will get the source. |
|
|
rebirther |
Yes, it works outside BOINC; that's why I test before I set up the apps ;)
Have you also included saving the runtime, if you are a C++ specialist? |
|
|
DeleteNull |
20:18:30 (19664): BOINC client no longer exists - exiting
20:18:30 (19664): timer handler: client dead, exiting |
|
|
rebirther |
OK, I will try to set it up; let's see how it works in BOINC. |
|
|
rebirther |
The v11 Linux app is up with the new wrapper. |
|
|
DeleteNull |
What ending is exe?
The file is named wrapper_26016_7.16_mfakt
It is a Linux executable (-rwxr-xr-x)
The new method is:
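#include <stdio.h>   // fopen, fgets, fclose, snprintf
#include <stdlib.h>  // atof
#include <string.h>  // strtok

// Returns the fraction of work done (0..1), read from the mfakto/mfaktc
// checkpoint file M<number>.ckp; returns 0 if nothing usable can be read.
double getMfraction() {
    char buf[256];

    // Read the first line of worktodo.txt.
    FILE* f = fopen("worktodo.txt", "r");
    if (!f) return 0;
    char* p = fgets(buf, sizeof(buf), f);
    fclose(f);
    if (p == NULL) return 0;

    // The number is the token between '=' and the first ','.
    char* qch = strtok(p, "=");
    if (qch != NULL) qch = strtok(NULL, ",\r\n");
    if (qch == NULL) return 0;  // malformed line: avoid strlen(NULL)

    // Build the checkpoint file name "M<number>.ckp".
    char mfile[64];
    int n = snprintf(mfile, sizeof(mfile), "M%s.ckp", qch);
    if (n < 0 || n >= (int)sizeof(mfile)) return 0;

    // Read the first line of the checkpoint file.
    FILE* ff = fopen(mfile, "r");
    if (!ff) return 0;
    char* pp = fgets(buf, sizeof(buf), ff);
    fclose(ff);
    if (pp == NULL) return 0;

    // Field 4 is the total, field 7 the amount already done.
    int count = 0;
    double all = 0, done = 0, frac = 0;
    char* pch = strtok(pp, " ");
    while (pch != NULL) {
        count++;
        if (count == 4) all = atof(pch);
        if (count == 7) done = atof(pch);
        pch = strtok(NULL, " ");
    }
    if (all > 0) frac = done / all;
    if (frac < 0) return 0;
    if (frac > 1) return 1;
    return frac;
}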
|
|
|
rebirther |
Sorry, that was the test folder. I have removed the text ^^ I forgot the execute bit again; any issues? |
|
|
rebirther |
Do you have the complete wrapper.cpp file? |
|
|
PDW
Joined: 15 Oct 15 Posts: 41 Credit: 1,325,663,594 RAC: 1,772,867 |
Sorry to interrupt your development process.
Why are you using "--device x" for mfakto and mfaktc?
Both their guides say to use "-d x"
Q: Does mfakto support multiple GPUs?
A: No, but you can use the -d option to tell an instance to run on a specific
device. Please also read the next question.
Q Does mfaktc support multiple GPUs?
A Yes, with the exception that a single instance of mfaktc can only use one
GPU. For each GPU you want to run mfaktc on you need (at least) one
instance of mfaktc. For each instance of mfaktc you can use the
commandline option "-d <GPU number>" to specify which GPU to use for each
specific mfaktc instance. Please read the next question, too.
I can run a second task on a second GPU, but only if I specify "-d x"; as soon as I pass "--device x" (or "-device x") on the command line, it defaults to the first GPU.
Can you use "-d x" for the Linux wrapper at least, please? |
|
|
DeleteNull |
No issues.
In about an hour an "old" WU is finished, the new WU should use the new wrapper and I can report.
I am not a C/C++ developer. The new method just reads the number from worktodo.txt, constructs the file name of the mfaktc/mfakto checkpoint file from it, and calculates the fraction of "work done". This can be used by the method fraction_done() of the (original) wrapper.
If it succeeds I will send you the wrapper.cpp so it can be used for Windows too. |
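The file-name step described here can be sketched in C++ with std::string instead of manual character copying. This is only an illustration: the function name checkpoint_name_from_worktodo is hypothetical, and the sketch assumes the number sits between the first '=' and the next ',' in the worktodo.txt line, which is what the strtok calls in the posted method imply. Unlike that method, it returns an empty string when no comma follows the number.

```cpp
#include <string>

// Builds the checkpoint file name "M<number>.ckp" from one worktodo.txt line.
// ASSUMPTION: the number is the text between the first '=' and the first ','
// after it. Returns an empty string if the line does not match that shape.
std::string checkpoint_name_from_worktodo(const std::string& line) {
    std::string::size_type eq = line.find('=');
    if (eq == std::string::npos) return "";

    std::string::size_type comma = line.find(',', eq + 1);
    if (comma == std::string::npos || comma == eq + 1) return "";

    return "M" + line.substr(eq + 1, comma - eq - 1) + ".ckp";
}
```

With a hypothetical input line such as "Factor=97983883,77,78" the result is "M97983883.ckp", the name seen earlier in the thread; the "Factor" keyword is an assumed example, since only what follows the '=' matters to the sketch.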
|
|
DeleteNull |
Theoretically, both applications can support more than one (different) GPU.
But: BOINC enumerates the GPUs as 0, 1, 2, ....
In OpenCL you have platforms, e.g. Intel=0, AMD=1, NVidia=2, and for each platform 1..n GPU devices.
A mapping from 0, 1, 2 to 00, 10, 11 is different for each computer with more than one graphics device.
So currently only one mapping is possible: --device 0 to -d 00. |
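To make the mapping problem concrete, here is a small C++ sketch that converts a flat device index into a two-character platform/device string for one given machine layout. It is purely illustrative: the function name flat_to_platform_device is hypothetical, and the sketch assumes flat indices are handed out platform by platform, which may not match how BOINC actually numbers devices. It also assumes single-digit platform and device numbers.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Maps a flat device index to a "<platform><device>" string such as "10".
// devices_per_platform[i] is the number of devices on platform i.
// ASSUMPTION (hypothetical): flat indices are assigned platform by platform,
// and platform and device numbers are single digits.
// Returns an empty string if the index cannot be mapped.
std::string flat_to_platform_device(int flat_index,
                                    const std::vector<int>& devices_per_platform) {
    if (flat_index < 0) return "";
    int remaining = flat_index;
    for (std::size_t p = 0; p < devices_per_platform.size(); ++p) {
        int n = devices_per_platform[p];
        if (remaining < n) {
            if (p > 9 || remaining > 9) return "";  // would need two digits
            return std::to_string(p) + std::to_string(remaining);
        }
        if (n > 0) remaining -= n;  // skip past this platform's devices
    }
    return "";
}
```

For a machine with one device on platform 0 and two on platform 1, the indices 0, 1, 2 map to 00, 10, 11. A machine with a different layout gives a different table, which is why a single fixed mapping in the wrapper only covers the first device.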
|
|