Checkpoint?

Message boards : Number crunching : Checkpoint?

dannyridel
Joined: 21 Jul 19
Posts: 63
Credit: 8,229,219
RAC: 17,104
Message 6181 - Posted: 14 Apr 2020, 6:16:19 UTC

I don't see any checkpointing with the v2 mfakto app on Windows. Is there a problem, on my side or on the app side? Should I be seeing a checkpoint?

Gigacruncher [TSBTs Pirate]
Joined: 28 Mar 20
Posts: 48
Credit: 8,419,360
RAC: 0
Message 6182 - Posted: 14 Apr 2020, 6:25:41 UTC

http://srbase.my-firewall.org/sr5/forum_thread.php?id=1344&postid=6176#6176

dannyridel
Joined: 21 Jul 19
Posts: 63
Credit: 8,229,219
RAC: 17,104
Message 6183 - Posted: 14 Apr 2020, 6:30:05 UTC

I'm on Windows with v2, not Linux.

dannyridel
Joined: 21 Jul 19
Posts: 63
Credit: 8,229,219
RAC: 17,104
Message 6184 - Posted: 14 Apr 2020, 13:10:34 UTC

It seems that a .ckpt file and a .ckpt.bu file only appear when the task is at about 60%. I know the .bu is a backup, but why do they appear so late, and why does the .bu file have a later checkpoint than the original one? See below:

.ckpt:

104864329 73 74 4620 mfakto 0.14-Win: 4595 0 026B80C3

.ckpt.bu:
104864329 73 74 4620 mfakto 0.14-Win: 4515 0 FB8118C9
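
For readers comparing the two files, here is a minimal Python sketch that splits the two checkpoint lines above into fields. The field names are assumptions inferred from the values shown, since the layout is not documented in this thread. In particular, treating the first number after the colon as a progress position is a guess.

```python
from typing import NamedTuple


class Checkpoint(NamedTuple):
    # All field names are assumptions; the thread does not document the layout.
    exponent: int
    bit_min: int
    bit_max: int
    num_classes: int
    version: str
    position: int       # assumed: how far the task has progressed
    factors_found: int  # assumed: number of factors found so far
    checksum: str       # assumed: checksum of the line, kept as text


def parse_checkpoint(line: str) -> Checkpoint:
    """Split one checkpoint line into named fields."""
    head, tail = line.strip().split(": ", 1)
    # The version string itself contains a space, so split at most 4 times.
    exponent, bit_min, bit_max, num_classes, version = head.split(" ", 4)
    position, factors_found, checksum = tail.split()
    return Checkpoint(
        int(exponent), int(bit_min), int(bit_max), int(num_classes),
        version, int(position), int(factors_found), checksum,
    )


current = parse_checkpoint("104864329 73 74 4620 mfakto 0.14-Win: 4595 0 026B80C3")
backup = parse_checkpoint("104864329 73 74 4620 mfakto 0.14-Win: 4515 0 FB8118C9")

# Positive result: the .ckpt file is ahead of the .ckpt.bu file.
print(current.position - backup.position)
```

If that reading of the fields is right, the .ckpt file (4595) is ahead of the .ckpt.bu file (4515) in the listing above, which is what you would expect if the backup holds the previous checkpoint.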

rebirther
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 2 Jan 13
Posts: 7255
Credit: 42,729,227
RAC: 4
Message 6192 - Posted: 14 Apr 2020, 15:03:40 UTC - in response to Message 6184.

.ckpt.bu should only be a backup.

rebirther
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 2 Jan 13
Posts: 7255
Credit: 42,729,227
RAC: 4
Message 6193 - Posted: 14 Apr 2020, 15:04:57 UTC - in response to Message 6181.

It depends on your GPU. On a 5500 XT it checkpoints every 5 min.

dannyridel
Joined: 21 Jul 19
Posts: 63
Credit: 8,229,219
RAC: 17,104
Message 6201 - Posted: 15 Apr 2020, 0:29:08 UTC - in response to Message 6193.

Okay, on a Vega 8 it happens about that often, but only AFTER 5 hours have passed. I don't know why :(

pls
Joined: 18 Jul 20
Posts: 6
Credit: 27,085,480
RAC: 15,817
Message 6637 - Posted: 18 Jul 2020, 21:24:21 UTC - in response to Message 6201.

Has there been a real answer to this?

I'm seeing that whenever BOINC suspends an SRBASE work unit to run another project, the SRBASE unit is resumed from the beginning. This is wasted work unless I'm only running this one project.

Are there any plans to fix checkpointing so that it works correctly?

Thanks.

rebirther
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 2 Jan 13
Posts: 7255
Credit: 42,729,227
RAC: 4
Message 6638 - Posted: 18 Jul 2020, 22:00:44 UTC - in response to Message 6637.
Last modified: 18 Jul 2020, 22:02:37 UTC

Checkpoints are written about every 10 min (depending on the CPU speed); the runtime is not saved by the wrapper. On Windows you can check the stderr.txt file. You can also check the FAQ.
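
One way to see how often checkpoints are actually written on a given machine is to look at how recently the checkpoint files were modified. This is a minimal Python sketch, assuming the files end in .ckpt and .ckpt.bu as reported earlier in this thread. The directory you pass in (for example, the task's working directory) is an assumption and depends on your installation. The demo at the bottom uses a temporary directory with a placeholder file so it can run anywhere.

```python
import os
import tempfile
import time
from pathlib import Path


def checkpoint_ages(directory, now=None):
    """Return {file name: seconds since last modification} for checkpoint files.

    Assumes checkpoint files end in ".ckpt" or ".ckpt.bu".
    """
    now = time.time() if now is None else now
    ages = {}
    for path in Path(directory).iterdir():
        if path.name.endswith((".ckpt", ".ckpt.bu")):
            ages[path.name] = now - path.stat().st_mtime
    return ages


# Demo with a throwaway directory; the file name is hypothetical.
with tempfile.TemporaryDirectory() as demo_dir:
    ckpt = Path(demo_dir) / "task.ckpt"
    ckpt.write_text("placeholder")
    (Path(demo_dir) / "stderr.txt").write_text("ignored by the scan")
    os.utime(ckpt, (1000.0, 1000.0))  # pretend it was written at t = 1000 s
    demo_ages = checkpoint_ages(demo_dir, now=1600.0)

print(demo_ages)
```

Running the function a few times against the real directory while a task is active should show the age dropping back towards zero each time a new checkpoint is written.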

dannyridel
Joined: 21 Jul 19
Posts: 63
Credit: 8,229,219
RAC: 17,104
Message 6641 - Posted: 19 Jul 2020, 14:07:57 UTC - in response to Message 6637.

If these tasks are LLR (CPU tasks), don't fret. BOINC doesn't show progress correctly and LLR checkpoints itself. For TF, I'm not sure.


Copyright © 2014-2024 BOINC Confederation / rebirther