61)
Message boards :
Number crunching :
Too long calculation for the Sierpinski/Riesel Bases - long
(Message 801)
Posted 29 Jan 2015 by Dirk Broer That's not completely true. It is true for those CPUs that are based upon the Bulldozer architecture: the FX series CPUs, the Opterons based upon the same architecture, and the Trinity, Richland and Kaveri APUs. It is not true for Llano (still K10 based) and not true for the AM1 SOCs (Jaguar based). Most amazing in the floating point vs. integer discussion are the BOINC benchmarks for the APUs:
A8-3870K (Llano, 1st generation APU), running Ubuntu 13.10, benchmarks via BOINC Manager:
2461 floating point MIPS (Whetstone) per CPU
15793 integer MIPS (Dhrystone) per CPU
This Llano is made out of four K10 cores, each having both an FPU (Floating Point Unit) and an ALU (Arithmetic Logic Unit).
A10-5700 (Trinity, 2nd generation APU), running Lubuntu 13.10, benchmarks via BOINC Manager:
2450 floating point MIPS (Whetstone) per CPU
9513 integer MIPS (Dhrystone) per CPU
This Trinity is made out of two Piledriver modules, each having two integer cores and a shared floating point unit.
For some reason the integer performance of Bulldozer, Piledriver (and now Steamroller too) leaves much to be desired compared to the older K10 integer units, and quite a lot of BOINC projects make heavy use of the integer performance of your CPU core(s). It almost looks like a Bulldozer module isn't made out of two integer cores and a shared floating point unit, but the other way around: two floating point units and a shared integer unit! |
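The gap between the two APUs can be put as a ratio. A minimal Python sketch, using only the BOINC benchmark figures quoted in the post above; the dictionary layout and the function name are illustrative:

```python
# BOINC benchmark figures quoted in the post (MIPS per CPU).
benchmarks = {
    "Llano A8-3870K": {"whetstone": 2461, "dhrystone": 15793},
    "Trinity A10-5700": {"whetstone": 2450, "dhrystone": 9513},
}

def trinity_vs_llano(metric):
    """Return the Trinity score as a fraction of the Llano score."""
    return benchmarks["Trinity A10-5700"][metric] / benchmarks["Llano A8-3870K"][metric]

fp_ratio = trinity_vs_llano("whetstone")
int_ratio = trinity_vs_llano("dhrystone")

print(f"Floating point: Trinity reaches {fp_ratio:.1%} of Llano")
print(f"Integer:        Trinity reaches {int_ratio:.1%} of Llano")
```

On these numbers the floating point scores are nearly identical, while the Trinity integer score is roughly 60% of the Llano one.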
62)
Message boards :
Number crunching :
Too long calculation for the Sierpinski/Riesel Bases - long
(Message 791)
Posted 28 Jan 2015 by Dirk Broer Perhaps interesting to observe, AMD performance-wise: I've run the Sierpinski/Riesel Base - long so far on five AMD systems, in order of architectural age:
[1.] Three using FM1 APUs (A8-3820, 3850 and 3870K), K10 based, four discrete cores (no shared resources), not using AVX.
[2.] One using an FM2 A10-5700 APU, Piledriver based, two Bulldozer modules that feature two integer units and one floating point unit each, using AVX.
[3.] One using an AM1 Athlon 5350 SOC, Jaguar based, four discrete cores (no shared resources), using AVX (amongst others).
Run times:
[1.] 58,000-68,000 sec.
[2.] 85,000 sec.
[3.] 150,000 sec.
|
63)
Message boards :
Number crunching :
Too long calculation for the Sierpinski/Riesel Bases - long
(Message 790)
Posted 28 Jan 2015 by Dirk Broer You point to AMD's way of implementing AVX and cite the architecture of the FX and present Opteron CPUs ("they have a single AVX ALU for each pair of CPU cores"), while my problem is with the performance of my AM1 Athlon, which uses the Jaguar architecture and does *NOT* share its ALUs with other cores. I just want to know whether the present application really uses the AM1 Athlon's architecture to the fullest, and therefore asked whether you have run an AM1 Athlon through the debugger to see if AVX is used or not. |
64)
Message boards :
Number crunching :
Too long calculation for the Sierpinski/Riesel Bases - long
(Message 784)
Posted 28 Jan 2015 by Dirk Broer Ever watched in the debugger whether an AMD CPU actually uses the AVX part of the code? According to Agner Fog, the assembly produced by the Intel compiler takes a different, inferior path when running on a non-Intel CPU. I wouldn't be surprised if there is no difference compared to SSE4, because it is the exact same code that has been running. Who actually wrote the Microsoft compiler? |
65)
Message boards :
Number crunching :
Too long calculation for the Sierpinski/Riesel Bases - long
(Message 781)
Posted 28 Jan 2015 by Dirk Broer NeoAtP wrote: "The massive difference is due to the AVX instruction set in the intel cpu's. You will see the same result if you crunch on Primegrid."
The massive difference could very well be due to the use of code that excludes CPUs of any make other than Intel from using the aforementioned AVX instruction set. AVX is included in all current AMD CPUs, with the exception of the FM1 APUs (Llano). All AM3+, FM2, FM2+, AM1 and FT3 CPUs, APUs and SOCs can use AVX. |
66)
Message boards :
Number crunching :
Too long calculation for the Sierpinski/Riesel Bases - long
(Message 778)
Posted 27 Jan 2015 by Dirk Broer Nothing wrong with the instruction set of the Athlon 5350: it includes AVX. But when the application is compiled using an Intel compiler, it may not be able to use it, because the Intel compiler checks the vendor string instead of the capabilities. |
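The difference between checking the vendor string and checking capabilities can be sketched as follows. This is a minimal Python illustration of the two dispatch styles described in the post, not the actual logic of any compiler runtime or of this application; the function names and the feature set are hypothetical, while "GenuineIntel" and "AuthenticAMD" are the vendor strings that x86 CPUs report via CPUID.

```python
# Illustrative only: function names and the feature set are hypothetical.

def pick_path_by_vendor(vendor, features):
    """Vendor-string dispatch: only Intel CPUs are offered the AVX path."""
    if vendor == "GenuineIntel" and "avx" in features:
        return "avx"
    return "baseline"

def pick_path_by_capability(vendor, features):
    """Capability dispatch: any CPU reporting AVX gets the AVX path."""
    if "avx" in features:
        return "avx"
    return "baseline"

# An Athlon 5350 as described in the post: AMD vendor string, AVX present.
athlon_5350 = ("AuthenticAMD", {"sse2", "avx"})

print(pick_path_by_vendor(*athlon_5350))      # baseline
print(pick_path_by_capability(*athlon_5350))  # avx
```

Under the first style the AVX unit goes unused on the AMD chip even though it is present, which is the behaviour the post suspects.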
67)
Message boards :
Number crunching :
Too long calculation for the Sierpinski/Riesel Bases - long
(Message 770)
Posted 27 Jan 2015 by Dirk Broer Want to see a long calculation time? My Athlon 5350 (AM1 socket):
3979288 1186 22 Jan 2015, 16:41:56 UTC 26 Jan 2015, 13:24:00 UTC Completed and validated 167,366.05 159,487.70 1,100.00 Sierpinski / Riesel Base - long v0.01
Most of the time it is not finished before the deadline; it takes more than 40 hours to calculate the long WUs... |
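To put the task row in hours: a minimal sketch, assuming the two figures 167,366.05 and 159,487.70 are run time and CPU time in seconds, in the order the standard BOINC task table lists them (the variable names are illustrative).

```python
# Figures from the task row in the post; interpreting them as
# run time and CPU time in seconds is an assumption.
run_time_s = 167_366.05
cpu_time_s = 159_487.70

def to_hours(seconds):
    """Convert seconds to hours."""
    return seconds / 3600

run_hours = to_hours(run_time_s)
cpu_hours = to_hours(cpu_time_s)

print(f"Run time: {run_hours:.1f} h")
print(f"CPU time: {cpu_hours:.1f} h")
```

Both values come out above 40 hours, consistent with the figure given in the post.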
68)
Message boards :
Number crunching :
Raspberry Pi
(Message 621)
Posted 12 Jan 2015 by Dirk Broer The RaspberryPi has this: http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0301h/index.html (section 1.5.9). And the RaspberryPi just has to make do with an ARMv6 CPU. The BananaPi and others based upon the Allwinner A20 SOC do it with an ARMv7, which is more powerful. |
69)
Message boards :
Number crunching :
Raspberry Pi
(Message 582)
Posted 9 Jan 2015 by Dirk Broer Fair enough: http://www.arm.com/products/processors/technologies/vector-floating-point.php |
70)
Message boards :
Number crunching :
Raspberry Pi
(Message 578)
Posted 9 Jan 2015 by Dirk Broer Did I say "current ARM CPU" anywhere in my message? |
71)
Message boards :
Number crunching :
Raspberry Pi
(Message 573)
Posted 9 Jan 2015 by Dirk Broer Quote: The LLR application uses "gwnum code". This code is written for INTEL processors. Therefore you cannot compile the LLR application for ARM based OS's ..... until there is "gwnum code" available for ARM processors.
"gwnum code" is *NOT* written for Intel CPUs but for CPUs using the x86 and x86-64 architecture, so AMD CPUs and VIA CPUs can be used as well. And as Arthur C. Clarke once said: "Any sufficiently advanced technology is indistinguishable from magic". The mere fact that there is NOW no ARM design that is IEEE 754 compatible says nothing about the future. We are just waiting for a programmer to enable it, one way or another. |