rebirther
Volunteer moderator, Project administrator, Project developer, Project tester, Project scientist
Joined: 2 Jan 13 Posts: 7445 Credit: 42,730,867 RAC: 0
SRBase has been chosen as the next Sprint project. More work will be available soon.
Hello Rebirther,
Thank you. I think most of us know about the Sprint.
I got some long2 4-4.5M WUs. Estimated run time: 22 hours.
I know that estimate is anything but accurate.
The only accurate figure is the "Time per bit" in the slot directory.
After 8 hours of running: 5.15% done.
Time per bit: about 45 ms !!!
WU name: R22_4-4.5M_wu_4949_4
That is with eight WUs running at once on an i7-2600K (four cores, eight threads).
At that speed, these WUs will never finish before the end of the Sprint.
5% = 8 hr, so 1% = 1.6 hr and 100% = 160 hours = about 6.7 days !!!
The Sprint lasts 3 days.
Reducing from 8 simultaneous WUs to 4 decreases the time per bit by about 30%.
I now understand why everyone cancels these WUs, especially considering that about 25% end in "error while computing".
Reducing to two WUs at a time brings it down to about 15 ms. But considering the amount of work
to crunch, they will still be too late for the Sprint, and that is hoping for no "error while computing".
It is the same for long. On the very same host: time per bit 37 ms.
Those will also run out of time.
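The figures quoted above can be checked with a rough sketch in Python. It assumes the time per bit stays constant for the whole run, takes the bit count (19,393,037) from the llr log in this post, and derives 31.5 ms for four tasks as 45 ms reduced by the reported 30%. The function and constant names are illustrative only.

```python
# Rough per-task runtime and combined throughput for different numbers of
# simultaneous tasks. Inputs are the approximate figures quoted in the post;
# a constant time per bit over the whole run is an assumption.

TOTAL_BITS = 19_393_037  # from the llr log of R22_4-4.5M_wu_4949_4
SPRINT_DAYS = 3.0

# simultaneous tasks -> reported time per bit in ms
# (31.5 ms for 4 tasks is 45 ms reduced by the reported ~30%)
TIME_PER_BIT_MS = {8: 45.0, 4: 31.5, 2: 15.0}


def days_per_task(time_per_bit_ms: float, total_bits: int = TOTAL_BITS) -> float:
    """Wall-clock days one task needs at a constant time per bit."""
    seconds = total_bits * time_per_bit_ms / 1000.0
    return seconds / 86_400.0


def bits_per_second(tasks: int, time_per_bit_ms: float) -> float:
    """Combined progress of all simultaneous tasks, in bits per second."""
    return tasks * 1000.0 / time_per_bit_ms


if __name__ == "__main__":
    for tasks, tpb in TIME_PER_BIT_MS.items():
        days = days_per_task(tpb)
        verdict = "fits" if days <= SPRINT_DAYS else "too late"
        print(f"{tasks} tasks: {days:.1f} days per task, "
              f"{bits_per_second(tasks, tpb):.0f} bits/s total ({verdict})")
```

With these inputs even two simultaneous tasks need a little over three days each, so they would still miss a 3-day Sprint. At a constant 45 ms the estimate for eight tasks is about 10 days, longer than the 160 hours extrapolated from the progress percentage, because the logged time per bit is not constant.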
And all these WUs are announced by the project with an estimated computation size of 80 GFLOPS (eighty).
For comparison, Asteroid: 1,380,023 GFLOPS (about seventeen thousand times more), with an average run time of 2h30.
So why release only "monster" WUs for a Sprint?
I understand you have a lot of old WUs waiting for results.
I understand you want to empty your queue as much as possible.
But then why produce such monster WUs?
This is only my opinion. In normal times (outside a Sprint), OK.
Not during a Sprint.
I will let four simultaneous long2 WUs finish on one of my hosts and wait for the results.
I know the credit is very big. But considering the error rate and the delay, why crunch such WUs?
Waiting for normal WUs to be produced,
Best regards
08:41:01 (3628): wrapper (7.5.26012): starting
08:41:01 (3628): wrapper: running llr.exe ( -d -oPgenInputFile=input.prp -oPgenOutputFile=primes.txt -oDiskWriteTime=10 -oOutputIterations=50000 -oResultsFileIterations=99999999)
Base factorized as : 2*11
Base prime factor(s) taken : 11
Starting N+1 prime test of 3656*22^4348766-1
Using AVX FFT length 1920K, Pass1=768, Pass2=2560, a = 3
3656*22^4348766-1, bit: 50000 / 19393037 [0.25%]. Time per bit: 39.327 ms.
3656*22^4348766-1, bit: 100000 / 19393037 [0.51%]. Time per bit: 33.596 ms.
3656*22^4348766-1, bit: 150000 / 19393037 [0.77%]. Time per bit: 34.878 ms.
3656*22^4348766-1, bit: 200000 / 19393037 [1.03%]. Time per bit: 35.968 ms.
3656*22^4348766-1, bit: 250000 / 19393037 [1.28%]. Time per bit: 36.125 ms.
3656*22^4348766-1, bit: 300000 / 19393037 [1.54%]. Time per bit: 38.167 ms.
3656*22^4348766-1, bit: 350000 / 19393037 [1.80%]. Time per bit: 42.293 ms.
3656*22^4348766-1, bit: 400000 / 19393037 [2.06%]. Time per bit: 42.970 ms.
3656*22^4348766-1, bit: 450000 / 19393037 [2.32%]. Time per bit: 43.005 ms.
3656*22^4348766-1, bit: 500000 / 19393037 [2.57%]. Time per bit: 42.257 ms.
3656*22^4348766-1, bit: 550000 / 19393037 [2.83%]. Time per bit: 26.694 ms.
3656*22^4348766-1, bit: 600000 / 19393037 [3.09%]. Time per bit: 24.258 ms.
3656*22^4348766-1, bit: 650000 / 19393037 [3.35%]. Time per bit: 24.116 ms.
3656*22^4348766-1, bit: 700000 / 19393037 [3.60%]. Time per bit: 25.888 ms.
3656*22^4348766-1, bit: 750000 / 19393037 [3.86%]. Time per bit: 42.601 ms.
3656*22^4348766-1, bit: 800000 / 19393037 [4.12%]. Time per bit: 42.885 ms.
3656*22^4348766-1, bit: 850000 / 19393037 [4.38%]. Time per bit: 41.926 ms.
3656*22^4348766-1, bit: 900000 / 19393037 [4.64%]. Time per bit: 44.576 ms.
3656*22^4348766-1, bit: 950000 / 19393037 [4.89%]. Time per bit: 46.705 ms.
3656*22^4348766-1, bit: 1000000 / 19393037 [5.15%]. Time per bit: 47.208 ms.
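As a rough illustration, progress lines like the ones above can be turned into a remaining-time estimate with a short Python sketch. The regular expression is written against the line format shown in this log only; the function names and the choice of averaging the last five readings are assumptions made for the example, not part of llr or BOINC.

```python
import re

# Matches llr progress lines such as:
# "3656*22^4348766-1, bit: 50000 / 19393037 [0.25%]. Time per bit: 39.327 ms."
PROGRESS_RE = re.compile(
    r"bit:\s*(\d+)\s*/\s*(\d+)\s*\[[\d.]+%\]\.\s*Time per bit:\s*([\d.]+)\s*ms"
)


def parse_progress(log_text: str) -> list[tuple[int, int, float]]:
    """Return (bits_done, total_bits, ms_per_bit) for every progress line."""
    return [
        (int(done), int(total), float(ms))
        for done, total, ms in PROGRESS_RE.findall(log_text)
    ]


def remaining_hours(log_text: str, window: int = 5) -> float:
    """Estimate remaining hours from the mean of the last `window` readings."""
    rows = parse_progress(log_text)
    if not rows:
        raise ValueError("no progress lines found")
    recent = rows[-window:]
    mean_ms = sum(ms for _, _, ms in recent) / len(recent)
    done, total, _ = rows[-1]
    return (total - done) * mean_ms / 1000.0 / 3600.0


if __name__ == "__main__":
    sample = """\
3656*22^4348766-1, bit: 950000 / 19393037 [4.89%]. Time per bit: 46.705 ms.
3656*22^4348766-1, bit: 1000000 / 19393037 [5.15%]. Time per bit: 47.208 ms.
"""
    print(f"about {remaining_hours(sample):.0f} hours remaining")
```

Because the time per bit in this log swings between roughly 24 and 47 ms, any such estimate is only approximate and should be recomputed as new lines appear in stderr.txt.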
____________
LLR is highly optimized and can be set up to run multithreaded (MT). An example app_config is in the FAQ:
http://srbase.my-firewall.org/sr5/forum_thread.php?id=6&postid=3795#3795
Hello Mmonnin,
Thank you.
In fact, I read the post you suggest a long time ago.
Reducing the number of cores.
It is a real benefit.
But now the project has released yet another series of WUs.
You, like me, are paying attention to the Sprint.
I crunch very long WUs here. But during the Sprint it is completely impossible to get an estimate. On the same host, with the same app (Sierpinski Base short 0.22, S163-100-150K),
some WUs run in a few minutes, some in hours !!! I never consider the BOINC time. I only look at stderr.txt in the slot and its "Time per bit".
It is the only trustworthy reference.
What do I see?
The short WUs have a longer deadline than the very long ones!
As a result, sorry to everyone, all my "long" and "long2" WUs: cancelled.
Average2: deadline 4 days, but "time per bit" 20 ms.
The new ones: deadline 5 days and time per bit about 2 ms (ten times less).
I repeat, all on the same host.
Where is the project's logic?
In any case, Mmonnin, thank you for your help here and on several other projects.
Best and friendly regards
Philippe
____________
rebirther
Volunteer moderator, Project administrator, Project developer, Project tester, Project scientist
Joined: 2 Jan 13 Posts: 7445 Credit: 42,730,867 RAC: 0
A big thanks to all who participated. We made good progress, but it was below my expectations.
Hello Rebirther,
I think the low participation is due to the WUs.
Don't forget that at the start there were only "long" and "long2" WUs, which made it almost impossible
to crunch more than one series.
Users who do not look in the slot directory, but only at BOINC Manager, then think the WUs are "frozen".
There is also a serious bottleneck when running on multiple cores, which makes the "time per bit" explode.
Those two factors alone can explain why there were so few returning participants.