LLR Version 3.8.20 released

rebirther
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 2 Jan 13
Posts: 7227
Credit: 42,729,227
RAC: 34
Message 3275 - Posted: 5 Mar 2017, 15:34:26 UTC
Last modified: 5 Mar 2017, 15:36:38 UTC

The main new feature in this version is that MULTITHREADING is now available by setting -oThreadsPerTest= or -t in the command line.
Thanks to Serge Batalov who showed me how simple it was to implement this!

This LLR version is linked with the Version 28.13 of George Woltman's gwnum library.
In this gwnum version, George has fixed the bug that sometimes affected prime or PRP tests done using multithreading and FMA3.

When doing PRP tests, the Fermat test is no longer strong by default, because the strong test is time-consuming.

Now, if the input file name (PgenInputFile parameter) has changed while working with the same .ini file, the PgenLine parameter is forced to one.
I made this update to fix the "CLLR bug" found by LaurV.


The only really interesting thing is the multithreading: you can put all your cores on one WU (a big advantage for the long runners).

The app is still being tested to make sure that all residues match those of older versions.

For all Ryzen owners: the lib is not yet prepared to use AVX/FMA.

rebirther
Message 3300 - Posted: 18 Mar 2017, 15:23:53 UTC
Last modified: 18 Mar 2017, 21:34:13 UTC

The test on PrimeGrid is over and was successful. I will update all apps before the stress test starts.

The new -tx option (x stands for the number of cores) enables multithreading and can only be set in the app_config file. Anonymous platforms are not allowed here.

I will post the content of an app_config file soon so you can test whether it is working.
In my tests the CPU load is around 95% across all available cores.

rebirther
Message 3312 - Posted: 19 Mar 2017, 17:12:46 UTC
Last modified: 19 Mar 2017, 18:39:01 UTC

Here is the app_config file you need to activate multithreading. If something is wrong or can be tweaked, post it in this thread. The -t option does not work with older apps.

The new app should be faster on a single core too.

<app_config>
  <app>
    <name>srbase</name>
    <max_concurrent>1</max_concurrent>
  </app>
  <app>
    <name>srbase2</name>
    <max_concurrent>1</max_concurrent>
  </app>
  <app>
    <name>srbase3</name>
    <max_concurrent>1</max_concurrent>
  </app>
  <app>
    <name>srbase4</name>
    <max_concurrent>1</max_concurrent>
  </app>
  <app>
    <name>srbase5</name>
    <max_concurrent>1</max_concurrent>
  </app>
  <app>
    <name>srbase6</name>
    <max_concurrent>1</max_concurrent>
  </app>
  <app>
    <name>srbase7</name>
    <max_concurrent>1</max_concurrent>
  </app>
  <app>
    <name>srbase8</name>
    <max_concurrent>1</max_concurrent>
  </app>
  <app>
    <name>srbase9</name>
    <max_concurrent>1</max_concurrent>
  </app>
  <app>
    <name>srbase10</name>
    <max_concurrent>1</max_concurrent>
  </app>
  <app>
    <name>srbase11</name>
    <max_concurrent>1</max_concurrent>
  </app>
  <app>
    <name>srbase12</name>
    <max_concurrent>1</max_concurrent>
  </app>
  <app_version>
    <app_name>srbase</app_name>
    <cmdline>-t4</cmdline>
  </app_version>
  <app_version>
    <app_name>srbase2</app_name>
    <cmdline>-t4</cmdline>
  </app_version>
  <app_version>
    <app_name>srbase3</app_name>
    <cmdline>-t4</cmdline>
  </app_version>
  <app_version>
    <app_name>srbase4</app_name>
    <cmdline>-t4</cmdline>
  </app_version>
  <app_version>
    <app_name>srbase5</app_name>
    <cmdline>-t4</cmdline>
  </app_version>
  <app_version>
    <app_name>srbase6</app_name>
    <cmdline>-t4</cmdline>
  </app_version>
  <app_version>
    <app_name>srbase7</app_name>
    <cmdline>-t4</cmdline>
  </app_version>
  <app_version>
    <app_name>srbase8</app_name>
    <cmdline>-t4</cmdline>
  </app_version>
  <app_version>
    <app_name>srbase9</app_name>
    <cmdline>-t4</cmdline>
  </app_version>
  <app_version>
    <app_name>srbase10</app_name>
    <cmdline>-t4</cmdline>
  </app_version>
  <app_version>
    <app_name>srbase11</app_name>
    <cmdline>-t4</cmdline>
  </app_version>
  <app_version>
    <app_name>srbase12</app_name>
    <cmdline>-t4</cmdline>
  </app_version>
</app_config>


srbase = Sierpinski / Riesel Base
srbase2 = Riesel Base
srbase3 = Sierpinski Base
srbase4 = Sierpinski / Riesel Base - short
srbase5 = Sierpinski / Riesel Base - long
srbase6 = Sierpinski / Riesel Base - average
srbase7 = Riesel Base - short
srbase8 = Sierpinski Base - short
srbase9 = Sierpinski / Riesel Base - average2
srbase10 = Sierpinski / Riesel Base - average3
srbase11 = Sierpinski / Riesel Base - long2
srbase12 = Sierpinski / Riesel Base - long3

Remove only the lines you don't need to run short apps.

Change the -tx command line to match the number of cores you have or want to use.
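
As a minimal sketch of those two steps, assuming you only want multithreading for one of the long apps and have 8 cores to give it (the choice of srbase5 and the value 8 are illustrative, not settings recommended anywhere in this thread), a trimmed app_config could look like this:

```xml
<app_config>
  <!-- Keep only the apps you want to run multithreaded; srbase5 is just an example -->
  <app>
    <name>srbase5</name>
    <max_concurrent>1</max_concurrent>
  </app>
  <app_version>
    <app_name>srbase5</app_name>
    <!-- -t8 means 8 threads; replace 8 with the number of cores you want to use -->
    <cmdline>-t8</cmdline>
  </app_version>
</app_config>
```

Each app you keep needs both its `<app>` block and its matching `<app_version>` block, as in the full file above.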

Thalus
Joined: 7 Mar 17
Posts: 34
Credit: 2,584,831
RAC: 0
Message 3317 - Posted: 19 Mar 2017, 18:35:10 UTC

From srbase3 to srbase12 the tag <app> is missing in your config.

rebirther
Message 3318 - Posted: 19 Mar 2017, 18:39:17 UTC - in response to Message 3317.

From srbase3 to srbase12 the tag <app> is missing in your config.


yeah, copy/paste, fixed, thx!

Thalus
Message 3320 - Posted: 20 Mar 2017, 16:22:31 UTC
Last modified: 20 Mar 2017, 16:42:20 UTC

How do I reload the "stock" settings after editing/adding an app_config.xml? I deleted the file and reloaded the settings in BOINC, but it's still using the preferences I added.

Edit:
Never mind... deleting app_config.xml and resetting the project did the trick.

Wailing Angus Beef
Joined: 4 Dec 14
Posts: 7
Credit: 324,496,280
RAC: 1,703,156
Message 3324 - Posted: 22 Mar 2017, 0:41:09 UTC

Will WUProp record all cpu time on the multi-thread WUs? Most do, some don't.

Odicin
Joined: 29 Nov 14
Posts: 6
Credit: 220,119,016
RAC: 142,269
Message 3325 - Posted: 22 Mar 2017, 6:38:00 UTC - in response to Message 3324.

Yep, they also track mt apps from other projects like Amicable Numbers.

Regards Odi

rebirther
Message 3329 - Posted: 23 Mar 2017, 18:40:16 UTC - in response to Message 3324.

Will WUProp record all cpu time on the multi-thread WUs? Most do, some don't.


If you are using multiple cores on 1 WU, then only this WU will be counted.

Wailing Angus Beef
Message 3336 - Posted: 24 Mar 2017, 12:21:53 UTC

If a WU for a particular app typically takes 12 hours using 1 core, then you earn 12 hours on WUProp.

If you now use 12 cores to crunch that 1 WU and, let's say, it takes 1 hour using 12 cores, will you then earn 12 hours of work on WUProp or 1 hour?

forretrio
Joined: 28 Nov 16
Posts: 7
Credit: 25,040,159
RAC: 0
Message 3359 - Posted: 29 Mar 2017, 17:15:40 UTC

It's interesting that the client does not show that you are using multiple cores, although it actually is using them. Running fine here.

forretrio
Message 3360 - Posted: 29 Mar 2017, 20:39:45 UTC - in response to Message 3359.

Or the other way round: is this not supposed to happen?!? I have the following app_config:

<app_config>

<project_max_concurrent>3</project_max_concurrent>

<app_version>
<app_name>srbase</app_name>
<cmdline>-t6</cmdline>
</app_version>

......

<app_version>
<app_name>srbase12</app_name>
<cmdline>-t6</cmdline>
</app_version>
</app_config>


When I run the apps, I notice that each llr.exe uses more CPU than it should for 1 thread, but it is never faster than using a single thread, nor does it show that multithreading is being used.

The only difference I noticed is the change in stderr: there is an extra '4 threads' string, like

Using all-complex FMA3 FFT length 288K, Pass1=384, Pass2=768, 4 threads, a = 3


so I wonder what's going wrong up there...

rebirther
Message 3361 - Posted: 29 Mar 2017, 20:45:05 UTC - in response to Message 3360.
Last modified: 29 Mar 2017, 20:45:30 UTC



When I run the apps, I notice that each llr.exe uses more CPU than it should for 1 thread, but it is never faster than using a single thread, nor does it show that multithreading is being used.

The only difference I noticed is the change in stderr: there is an extra '4 threads' string, like

Using all-complex FMA3 FFT length 288K, Pass1=384, Pass2=768, 4 threads, a = 3


so I wonder what's going wrong up there...


The BOINC Manager is not designed to show multithreading, but if the app has this feature, it is working. Running standalone on my 6 cores, it uses around 95% CPU load. The multithreading feature is good for slower multicore CPUs, which can then finish a long runner in time.

Partygott
Joined: 14 Dec 16
Posts: 2
Credit: 12,340,132
RAC: 0
Message 3396 - Posted: 6 Apr 2017, 11:06:58 UTC

Hi.

The mt feature seems to have a problem with scaling and/or hyperthreading on Windows.

4c/4t i5-3470 uses all free resources up to 99.x percent.
2c/4t i5-2520m uses max. 93-97%.

8c/16t ryzen 1700 behaves like this:
1 wu @ 16 vcores: 67%
2 wu @ 8 vcores: 41% per wu
4 wu @ 4 vcores: 22% per wu

Even when restricted to 4 WUs at the same time, my Ryzen system takes a 5th WU with 4 vcores on a different app, like 4x Riesel Base + 1x Sierpinski / Riesel Base, which results in 5x 19.x percent usage.


btw...
I noticed that on AMD Ryzen the crunching of a WU slows down the more of it has been completed. On Intel it has the same speed from 0 to 100%.

rebirther
Message 3397 - Posted: 6 Apr 2017, 11:44:17 UTC - in response to Message 3396.

Hi.

The mt feature seems to have a problem with scaling and/or hyperthreading on Windows.

4c/4t i5-3470 uses all free resources up to 99.x percent.
2c/4t i5-2520m uses max. 93-97%.

8c/16t ryzen 1700 behaves like this:
1 wu @ 16 vcores: 67%
2 wu @ 8 vcores: 41% per wu
4 wu @ 4 vcores: 22% per wu

Even when restricted to 4 WUs at the same time, my Ryzen system takes a 5th WU with 4 vcores on a different app, like 4x Riesel Base + 1x Sierpinski / Riesel Base, which results in 5x 19.x percent usage.


btw...
I noticed that on AMD Ryzen the crunching of a WU slows down the more of it has been completed. On Intel it has the same speed from 0 to 100%.


The app is not ready yet for Ryzen. It must be optimized in later versions.

Thalus
Message 3399 - Posted: 6 Apr 2017, 19:08:22 UTC - in response to Message 3396.


Even when restricted to 4 WUs at the same time, my Ryzen system takes a 5th WU with 4 vcores on a different app, like 4x Riesel Base + 1x Sierpinski / Riesel Base, which results in 5x 19.x percent usage.


I have the same problem with Win10 x64 and my Skylake... so I stopped using the "-tx" option.
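
One thing that might be worth trying for the extra-task problem, offered only as an untested sketch: the BOINC app_config format also accepts an `<avg_ncpus>` element inside `<app_version>`, which tells the client how many CPUs a task of that app version is expected to use, so it can budget cores when deciding how many tasks to start. Whether this fixes the behaviour described above for this project is an assumption; the app name srbase2 and the value 4 are only examples.

```xml
<app_config>
  <!-- Untested sketch: cap the whole project at 4 running tasks -->
  <project_max_concurrent>4</project_max_concurrent>
  <app_version>
    <app_name>srbase2</app_name>
    <!-- Tell the BOINC client each task occupies 4 CPUs (assumed to match -t4) -->
    <avg_ncpus>4</avg_ncpus>
    <cmdline>-t4</cmdline>
  </app_version>
</app_config>
```

The idea is to keep `<avg_ncpus>` equal to the thread count in `-tx`, so that what the client schedules matches what the app actually uses.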



Copyright © 2014-2024 BOINC Confederation / rebirther