Sierpinski / Riesel Base - long3 are ghosts

Profile Dario666
Joined: 3 Jan 15
Posts: 8
Credit: 13,564,657
RAC: 0
Message 3773 - Posted: 22 Aug 2017, 5:50:20 UTC

Hello

WUProp shows 515 hours of crunching "Sierpinski / Riesel Base - long3" for me, yet my account shows no sign that these tasks were ever crunched (neither valid nor invalid) unless I cancel them manually. Right after crunching, the tasks disappear from my SRBase account.

Please check.

Other tasks work well and I get points for them.

WUs that are about to finish soon:
R27_2-2.5M_wu_6825
R27_2-2.5M_wu_6805
R27_2-2.5M_wu_6852

Profile rebirther
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 2 Jan 13
Posts: 7229
Credit: 42,729,227
RAC: 31
Message 3774 - Posted: 22 Aug 2017, 16:48:17 UTC - in response to Message 3773.

Which host is affected? It could be from the old run before R27 started. For hostid 1293 I can see 14 records.

Profile Dario666
Joined: 3 Jan 15
Posts: 8
Credit: 13,564,657
RAC: 0
Message 3775 - Posted: 23 Aug 2017, 0:44:59 UTC - in response to Message 3774.

Sorry, these tasks were hidden, because I did not expect anyone to run tasks that crunch for 30 hours without checkpoints.

Long2 tasks crunch for 62 hours on average, without checkpoints... this is sick. No other project in all of BOINC does this. Checkpoints are not the Himalayas.

Profile rebirther
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 2 Jan 13
Posts: 7229
Credit: 42,729,227
RAC: 31
Message 3776 - Posted: 23 Aug 2017, 1:42:12 UTC - in response to Message 3775.

There are checkpoints every 10 minutes.

Profile Dario666
Joined: 3 Jan 15
Posts: 8
Credit: 13,564,657
RAC: 0
Message 3777 - Posted: 23 Aug 2017, 5:28:52 UTC - in response to Message 3776.

The BoincTasks application tells me that there are no checkpoints.
After I restarted the computer, the tasks were crunched again from the start, but what you say may be the case :D

Profile JohnMD
Joined: 3 Jun 15
Posts: 23
Credit: 34,353,261
RAC: 13,996
Message 3778 - Posted: 23 Aug 2017, 11:47:46 UTC

"long3" tasks on my little Celeron take up to 100 hours.
On restart, the tasks begin again.

Profile rebirther
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 2 Jan 13
Posts: 7229
Credit: 42,729,227
RAC: 31
Message 3779 - Posted: 23 Aug 2017, 12:14:40 UTC - in response to Message 3778.

It seems so, but it's not. Only the elapsed time is not saved, so it restarts from 0. You can check stderr.txt on Windows: you will find a line saying the test resumed from bit xxx.
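For anyone who wants to check this without reading the whole file by eye, here is a minimal Python sketch that pulls the resume points out of a stderr.txt. The line format is taken from the logs posted further down in this thread. The function name and the regular expression are my own assumptions, not part of LLR or BOINC.

```python
import re

# Assumed format of the resume line, copied from the logs in this thread:
#   Resuming N+1 prime test of 706*27^2161846-1 at bit 102145 [0.99%]
RESUME_RE = re.compile(r"Resuming .*? at bit (\d+) \[([\d.]+)%\]")


def find_resume_points(stderr_text):
    """Return a list of (bit, percent) for every resume line in the text."""
    return [(int(bit), float(pct)) for bit, pct in RESUME_RE.findall(stderr_text)]


sample = """\
Starting N+1 prime test of 706*27^2161846-1
Resuming N+1 prime test of 706*27^2161846-1 at bit 102145 [0.99%]
"""

print(find_resume_points(sample))
```

If the list is empty after a restart, the task really did start over. If it contains entries, the checkpoint was picked up even though the elapsed time shown by the BOINC client went back to 0.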

Profile JohnMD
Joined: 3 Jun 15
Posts: 23
Credit: 34,353,261
RAC: 13,996
Message 3780 - Posted: 24 Aug 2017, 11:09:15 UTC

Thanks, rebirther, you are right!
http://srbase.my-firewall.org/sr5/result.php?resultid=253233097 is just as you say,
but the timings are only about half of what they should be.

Profile Dario666
Joined: 3 Jan 15
Posts: 8
Credit: 13,564,657
RAC: 0
Message 3781 - Posted: 25 Aug 2017, 16:20:52 UTC - in response to Message 3779.

Test:
1. Run 2 tasks.
2. After 39 hours: reboot.
2a. All tasks, even those waiting in the queue, were marked as errored (error time: 25 Aug 2017, 12:13:15 UTC, which is 14:13 my local time, UTC+2).
3. Run 2 tasks (14:16): R27_2-2.5M_wu_2655_1 and R27_2-2.5M_wu_2786_1.
4. After 40 minutes: reboot.
5. Run 2 tasks again (14:57).
6. After 6 minutes: reboot.
7. Run 2 tasks again (15:03).
8. After 90 minutes, stderr.txt looks like this:

slot1:

14:16:07 (3796): wrapper (7.5.26012): starting
14:16:07 (3796): wrapper: running llr.exe ( -d -oPgenInputFile=input.prp -oPgenOutputFile=primes.txt -oDiskWriteTime=10 -oOutputIterations=50000 -oResultsFileIterations=99999999)
Base prime factor(s) taken : 3
Starting N+1 prime test of 706*27^2161846-1
Using AMD K10 FFT length 896K, Pass1=896, Pass2=1K, a = 3

706*27^2161846-1, bit: 50000 / 10279344 [0.48%]. Time per bit: 20.462 ms.
706*27^2161846-1, bit: 100000 / 10279344 [0.97%]. Time per bit: 18.144 ms.
14:57:45 (2060): wrapper (7.5.26012): starting
14:57:45 (2060): wrapper: running llr.exe ( -d -oPgenInputFile=input.prp -oPgenOutputFile=primes.txt -oDiskWriteTime=10 -oOutputIterations=50000 -oResultsFileIterations=99999999)
Base prime factor(s) taken : 3
Resuming N+1 prime test of 706*27^2161846-1 at bit 102145 [0.99%]
Using AMD K10 FFT length 896K, Pass1=896, Pass2=1K, a = 3

15:03:59 (2492): wrapper (7.5.26012): starting
15:03:59 (2492): wrapper: running llr.exe ( -d -oPgenInputFile=input.prp -oPgenOutputFile=primes.txt -oDiskWriteTime=10 -oOutputIterations=50000 -oResultsFileIterations=99999999)
Base prime factor(s) taken : 3
Resuming N+1 prime test of 706*27^2161846-1 at bit 102145 [0.99%]
Using AMD K10 FFT length 896K, Pass1=896, Pass2=1K, a = 3

706*27^2161846-1, bit: 150000 / 10279344 [1.45%]. Time per bit: 16.900 ms.
706*27^2161846-1, bit: 200000 / 10279344 [1.94%]. Time per bit: 16.556 ms.
706*27^2161846-1, bit: 250000 / 10279344 [2.43%]. Time per bit: 15.736 ms.
706*27^2161846-1, bit: 300000 / 10279344 [2.91%]. Time per bit: 16.644 ms.
706*27^2161846-1, bit: 350000 / 10279344 [3.40%]. Time per bit: 16.588 ms.
706*27^2161846-1, bit: 400000 / 10279344 [3.89%]. Time per bit: 16.624 ms.
706*27^2161846-1, bit: 450000 / 10279344 [4.37%]. Time per bit: 16.556 ms.
706*27^2161846-1, bit: 500000 / 10279344 [4.86%]. Time per bit: 16.620 ms.


slot2:

14:16:07 (3764): wrapper (7.5.26012): starting
14:16:07 (3764): wrapper: running llr.exe ( -d -oPgenInputFile=input.prp -oPgenOutputFile=primes.txt -oDiskWriteTime=10 -oOutputIterations=50000 -oResultsFileIterations=99999999)
Base prime factor(s) taken : 3
Starting N+1 prime test of 706*27^2169102-1
Using AMD K10 FFT length 896K, Pass1=896, Pass2=1K, a = 3

706*27^2169102-1, bit: 50000 / 10313846 [0.48%]. Time per bit: 21.396 ms.
706*27^2169102-1, bit: 100000 / 10313846 [0.96%]. Time per bit: 18.216 ms.
14:57:45 (5892): wrapper (7.5.26012): starting
14:57:45 (5892): wrapper: running llr.exe ( -d -oPgenInputFile=input.prp -oPgenOutputFile=primes.txt -oDiskWriteTime=10 -oOutputIterations=50000 -oResultsFileIterations=99999999)
Base prime factor(s) taken : 3
Resuming N+1 prime test of 706*27^2169102-1 at bit 98177 [0.95%]
Using AMD K10 FFT length 896K, Pass1=896, Pass2=1K, a = 3

15:03:59 (4196): wrapper (7.5.26012): starting
15:03:59 (4196): wrapper: running llr.exe ( -d -oPgenInputFile=input.prp -oPgenOutputFile=primes.txt -oDiskWriteTime=10 -oOutputIterations=50000 -oResultsFileIterations=99999999)
Base prime factor(s) taken : 3
Resuming N+1 prime test of 706*27^2169102-1 at bit 98177 [0.95%]
Using AMD K10 FFT length 896K, Pass1=896, Pass2=1K, a = 3

706*27^2169102-1, bit: 100000 / 10313846 [0.96%]. Time per bit: 498.099 ms.
706*27^2169102-1, bit: 150000 / 10313846 [1.45%]. Time per bit: 16.386 ms.
706*27^2169102-1, bit: 200000 / 10313846 [1.93%]. Time per bit: 16.617 ms.
706*27^2169102-1, bit: 250000 / 10313846 [2.42%]. Time per bit: 16.563 ms.
706*27^2169102-1, bit: 300000 / 10313846 [2.90%]. Time per bit: 15.728 ms.
706*27^2169102-1, bit: 350000 / 10313846 [3.39%]. Time per bit: 15.711 ms.
706*27^2169102-1, bit: 400000 / 10313846 [3.87%]. Time per bit: 15.756 ms.
706*27^2169102-1, bit: 450000 / 10313846 [4.36%]. Time per bit: 15.696 ms.
706*27^2169102-1, bit: 500000 / 10313846 [4.84%]. Time per bit: 15.894 ms.


Questions:
1. Where are these 39 hours?

Profile rebirther
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 2 Jan 13
Posts: 7229
Credit: 42,729,227
RAC: 31
Message 3782 - Posted: 25 Aug 2017, 16:23:02 UTC - in response to Message 3781.

WUProp also counts the time until it errored out.

Profile Dario666
Joined: 3 Jan 15
Posts: 8
Credit: 13,564,657
RAC: 0
Message 3783 - Posted: 25 Aug 2017, 16:27:24 UTC - in response to Message 3779.

One more thing:
Time per bit: 498.099 ms???

Profile rebirther
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 2 Jan 13
Posts: 7229
Credit: 42,729,227
RAC: 31
Message 3784 - Posted: 25 Aug 2017, 16:33:39 UTC - in response to Message 3783.

A brief overheat of all cores or of one core. When you start a new WU, the second timing line is the more important one for calculating the complete runtime.
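To make that calculation concrete, here is a rough Python sketch that estimates the full runtime from the "Time per bit" lines while skipping the first sample. The sample lines are copied from the slot2 log earlier in this thread. The function name and the `skip` parameter are hypothetical, and this is only an estimate under the assumption that the time per bit stays roughly constant for the rest of the test.

```python
import re

# Assumed progress line format, as seen in the stderr.txt posted in this thread.
PROGRESS_RE = re.compile(
    r"bit: (\d+) / (\d+) \[[\d.]+%\]\. Time per bit: ([\d.]+) ms\."
)


def estimate_total_hours(progress_lines, skip=1):
    """Estimate the full runtime in hours, ignoring the first `skip` samples."""
    total_bits = None
    samples = []
    for line in progress_lines:
        match = PROGRESS_RE.search(line)
        if match:
            total_bits = int(match.group(2))
            samples.append(float(match.group(3)))
    usable = samples[skip:]
    if total_bits is None or not usable:
        return None  # not enough data to estimate anything
    avg_ms = sum(usable) / len(usable)
    return total_bits * avg_ms / 1000 / 3600


lines = [
    "706*27^2169102-1, bit: 100000 / 10313846 [0.96%]. Time per bit: 498.099 ms.",
    "706*27^2169102-1, bit: 150000 / 10313846 [1.45%]. Time per bit: 16.386 ms.",
    "706*27^2169102-1, bit: 200000 / 10313846 [1.93%]. Time per bit: 16.617 ms.",
]

print(f"{estimate_total_hours(lines):.1f} hours")
```

With the 498 ms outlier skipped, these three lines give an estimate of about 47 hours for the whole test. Including the outlier would inflate the estimate several times over, which is why the first line after a start or resume is best ignored.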

Profile Dario666
Joined: 3 Jan 15
Posts: 8
Credit: 13,564,657
RAC: 0
Message 3785 - Posted: 25 Aug 2017, 16:54:21 UTC - in response to Message 3782.
Last modified: 25 Aug 2017, 16:56:19 UTC

"WUProp also counts the time until it errored out."

I know :)

These are the tasks:
http://srbase.my-firewall.org/sr5/result.php?resultid=254255826
http://srbase.my-firewall.org/sr5/result.php?resultid=254255867

Does this tell you anything?

14:06:33 (3928): llr.exe exited; CPU time 100236.352137
14:06:33 (3928): app exit status: 0xc000013a
14:06:33 (3928): called boinc_finish(195)
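
As a side note on reading that exit status: Windows exit statuses of this form are NTSTATUS values, which pack a severity, a facility and a code into 32 bits. The Python sketch below splits the value from the log into those fields. The function name and the one-entry lookup table are my own. The only named constant is the one I am sure of from Microsoft's NTSTATUS list, 0xC000013A = STATUS_CONTROL_C_EXIT, and it does not by itself explain why the process was terminated.

```python
def decode_ntstatus(status):
    """Split a 32-bit NTSTATUS value into its bit fields."""
    severity_names = ["success", "informational", "warning", "error"]
    return {
        "severity": severity_names[(status >> 30) & 0x3],  # bits 31-30
        "customer": bool((status >> 29) & 0x1),            # bit 29
        "facility": (status >> 16) & 0xFFF,                # bits 27-16
        "code": status & 0xFFFF,                           # bits 15-0
    }


# Deliberately tiny lookup table; extend it only with values you have verified.
KNOWN_STATUS = {0xC000013A: "STATUS_CONTROL_C_EXIT"}

status = 0xC000013A  # value from the "app exit status" line above
print(decode_ntstatus(status), KNOWN_STATUS.get(status, "unknown"))
```

STATUS_CONTROL_C_EXIT means the application was terminated as if by CTRL+C, which would at least be consistent with llr.exe being stopped from outside during the reboot in the test.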


Profile rebirther
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 2 Jan 13
Posts: 7229
Credit: 42,729,227
RAC: 31
Message 3786 - Posted: 25 Aug 2017, 17:21:55 UTC - in response to Message 3785.

I found this thread about the error. I have never seen it before.

http://setiathome.berkeley.edu/forum_thread.php?id=79695

Profile Dario666
Joined: 3 Jan 15
Posts: 8
Credit: 13,564,657
RAC: 0
Message 3787 - Posted: 25 Aug 2017, 19:38:38 UTC - in response to Message 3786.
Last modified: 25 Aug 2017, 19:43:59 UTC

Thanks

This is the error:
"{DLL Initialization Failed} The application failed to initialize because the window station is shutting down."

Users wrote:
"I've been able to identify occurrences back to last December, but only with BOINC 7.6.9 and 7.6.22. Also, I've only seen it on Windows 7 and Windows 8.1"
"I guess that means that BOINC and Windows are fighting over the task: BOINC is trying to start it up, and Windows is trying to shut it down.
Solution - well, workround: shut down BOINC before shutting down Windows."

I have Windows 7 and BOINC 7.6.22. Damn :)

Copyright © 2014-2024 BOINC Confederation / rebirther