Author |
Message |
|
Since yesterday I have been getting quite a few computation errors.
They all report "finish file present too long", which is really bothersome since some of them took more than 6 hours of computation time. What they all have in common is that at the end (100%) they go to "Verdrängt" (preempted), and when they are started again they get the computation error.
You should look into this issue.
I have the issue with tasks for
S1006
R920
and even two of the small ones, R3_340_360 |
|
|
rebirther (Volunteer moderator, Project administrator, Project developer, Project tester, Project scientist)
Joined: 2 Jan 13 Posts: 7479 Credit: 43,687,281 RAC: 41,976 |
This issue is very rare. In some cases the wrapper does not stop after the WU is finished and restarts the llr app. I see this on my computer at a rate of about 1 in 20,000. I don't know how to fix it. |
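For readers wondering what the error message means, here is a minimal sketch of the kind of check that could produce it. It assumes the client notices a finish marker written by the application and then expects the process to exit within a grace period. The names (Task, check_finish_file) and the 10-second value are hypothetical and only for illustration; this is not the actual BOINC client code.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical grace period; the real client's value may differ by version.
FINISH_FILE_TIMEOUT_S = 10.0


@dataclass
class Task:
    name: str
    process_running: bool = True
    # Time at which the finish marker was first noticed (None = not yet seen).
    finish_file_seen_at: Optional[float] = None


def check_finish_file(task: Task, finish_file_exists: bool, now: float,
                      timeout_s: float = FINISH_FILE_TIMEOUT_S) -> str:
    """Classify a task on one polling pass of a hypothetical client loop."""
    if not task.process_running:
        # The process exited; a finish marker means it ended normally.
        return "done" if finish_file_exists else "exited without finish file"
    if not finish_file_exists:
        return "running"
    if task.finish_file_seen_at is None:
        # First time the marker is seen: start the grace period.
        task.finish_file_seen_at = now
    if now - task.finish_file_seen_at > timeout_s:
        # Marker is there, but the process is still alive after the grace period.
        return "error: finish file present too long"
    return "grace"
```

In this toy model, a wrapper that keeps running (or restarts the app) after the marker has been written ends in the error state once the grace period has passed, which would match the symptom described above.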
|
|
|
I have seen this error 80 times myself since Feb 7th. I am not sure why this happens, but at the moment it occurs only on Windows hosts. |
|
|
|
I have taken a look into it. Only a small number of my boxes are affected. I have now restored the web-based settings, and now it is running fine. It looks like a problem with the BOINC / BOINC Manager configuration. |
|
|
|
I have to come back to this issue once more. Yes, it may be an issue in BOINC, but it is probably also related to the provided data.
It moves around a bit between my systems, but most of the time it happens on the AVX systems after 6 hours. (So most of my systems no longer do this work.)
In addition, so far it only affects tasks that are "Sierpinski / Riesel Base - average v0.01" or "Sierpinski / Riesel Base - long v0.01".
The most bothersome things about this are:
(a) It takes very long for those tasks to get the error status. You can see that the first 95% is done fast, and then it takes forever to either complete successfully or error out.
(b) Under Linux you can see a similar thing. The "Sierpinski / Riesel Base - average v0.01" tasks reach 100% after about 1 hour, but then they need another 11 hours of work to be really finished and reported. So far, under Linux those tasks are not flagged as an error.
I have the feeling that there is a bug in the application that causes unneeded work cycles. Or, if it is working correctly, the progress bar is totally off. |
|
|
rebirther (Volunteer moderator, Project administrator, Project developer, Project tester, Project scientist)
Joined: 2 Jan 13 Posts: 7479 Credit: 43,687,281 RAC: 41,976 |
The progress bar is not working correctly (a WU contains a mix of different FFT lengths). You can check the FAQ for how to calculate the runtime of a WU, but that is also not right for every run. I need to know more about the web settings so I can collect some data and track down this issue.
Edit:
http://setiathome.berkeley.edu/forum_thread.php?id=73191
It seems that if the machine is under heavy load, there is not enough time to write a file. |
|
|
|
The broken WUs with the message "finish file present too long" appear on many of my computers that run "Sierpinski Base" and the project CAS@Home on the same machine. Once the CAS@Home WUs have finished or are completely inactive, the SR WUs run fine and the CPU usage of SR increases. |
|
|
|
I have seen this error once, too.
|
|
|
|
http://srbase.myfirewall.org/sr5/workunit.php?wuid=41526019
With the above WU, it got to 98% in 2:38 and showed 10 seconds left. It finally got to 100% at the 30:20 mark and has sat there crunching ever since.
This is a Sierpinski / Riesel Base - short 0.04.
It seems that as it got closer to 100%, it took longer and longer to get the last seconds checked off, like a graph going off to infinity.
I have another that was at 99.9xx% at 7:10 and 99.999% at 11:34 and now seems to be going off to infinity. It is the same type of WU. It hit 100% at 14:07.
I have to stop watching these and get something done.......
____________
I got in, cool, more work. |
|
|
rebirther (Volunteer moderator, Project administrator, Project developer, Project tester, Project scientist)
Joined: 2 Jan 13 Posts: 7479 Credit: 43,687,281 RAC: 41,976 |
Don't worry about the 100%. Because the loaded bases have different runtimes, BOINC may need more or less time to finish. There is no endless loop. |
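To illustrate why a progress bar can race to 98% and then appear to stall without any endless loop, here is a toy model. It assumes progress is reported as the share of tests completed, while the individual tests take very different amounts of time. The durations and function names are made up for illustration and are not taken from the application.

```python
def progress_by_count(durations, elapsed):
    """Share of tests fully completed after `elapsed` seconds."""
    done = 0
    t = 0.0
    for d in durations:
        if t + d > elapsed:
            break  # this test is still running
        t += d
        done += 1
    return done / len(durations)


def progress_by_time(durations, elapsed):
    """Share of the total runtime that has actually passed."""
    return min(elapsed / sum(durations), 1.0)


# Hypothetical WU: 98 quick tests of 1 s each, then 2 slow tests of 850 s each.
durations = [1.0] * 98 + [850.0] * 2

for elapsed in (98.0, 900.0, 1798.0):
    print(elapsed,
          round(progress_by_count(durations, elapsed), 3),
          round(progress_by_time(durations, elapsed), 3))
```

In this made-up case the count-based figure shows 98% after 98 seconds, although only about 5% of the total runtime has passed. The remaining two tests then account for almost all of the waiting.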
|
|