Showing posts with label nbench. Show all posts

Saturday, March 19, 2016

Raspberry PI 3 is here!

The Raspberry PI 3 arrived a few days ago; I had ordered one as soon as I heard of its launch. It is certainly faster than the PI 2 thanks to its ARM Cortex-A53 cores. The advertised +50% performance gain is roughly accurate, depending on the application of course. There are some other additions as well, such as WiFi and Bluetooth.

The Raspberry PI 3

A closer look at the PI 3

As usual, I am providing some nbench results. They are consistent with the +50% performance claim. For those interested, I published nbench results for the PI 2 in the past.

BYTEmark* Native Mode Benchmark ver. 2 (10/95)
Index-split by Andrew D. Balsa (11/97)
Linux/Unix* port by Uwe F. Mayer (12/96,11/97)

TEST                : Iterations/sec.  : Old Index   : New Index
                    :                  : Pentium 90* : AMD K6/233*
--------------------:------------------:-------------:------------
NUMERIC SORT        :          654.04  :      16.77  :       5.51
STRING SORT         :          72.459  :      32.38  :       5.01
BITFIELD            :      1.9972e+08  :      34.26  :       7.16
FP EMULATION        :          134.28  :      64.44  :      14.87
FOURIER             :          6677.3  :       7.59  :       4.27
ASSIGNMENT          :          10.381  :      39.50  :      10.25
IDEA                :          2740.7  :      41.92  :      12.45
HUFFMAN             :          1008.9  :      27.98  :       8.93
NEURAL NET          :          9.8057  :      15.75  :       6.63
LU DECOMPOSITION    :          365.38  :      18.93  :      13.67
==========================ORIGINAL BYTEMARK RESULTS==========================
INTEGER INDEX       : 34.272
FLOATING-POINT INDEX: 13.131
Baseline (MSDOS*)   : Pentium* 90, 256 KB L2-cache, Watcom* compiler 10.0
==============================LINUX DATA BELOW===============================
CPU                 : 4 CPU ARMv7 Processor rev 4 (v7l)
L2 Cache            :
OS                  : Linux 4.1.18-v7+
C compiler          : gcc-4.9
libc                : libc-2.19.so
MEMORY INDEX        : 7.162
INTEGER INDEX       : 9.769
FLOATING-POINT INDEX: 7.283
Baseline (LINUX)    : AMD K6/233*, 512 KB L2-cache, gcc 2.7.2.3, libc-5.4.38
* Trademarks are property of their respective holder.
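
To put a number on the +50% claim, the three Linux composite indices above can be divided by the PI 2 values from the February 2015 post. This is only a quick sketch: the dictionary and variable names are my own, and the figures are copied from the two nbench reports on this page.

```python
# Linux-baseline composite indices copied from the nbench reports on this page.
pi3 = {"MEMORY INDEX": 7.162, "INTEGER INDEX": 9.769, "FLOATING-POINT INDEX": 7.283}
pi2 = {"MEMORY INDEX": 4.125, "INTEGER INDEX": 5.970, "FLOATING-POINT INDEX": 4.678}

# Speedup of the PI 3 over the PI 2 for each composite index.
speedup = {name: pi3[name] / pi2[name] for name in pi3}

for name, ratio in speedup.items():
    print(f"{name:<21}: x{ratio:.2f} ({(ratio - 1) * 100:+.0f}%)")
```

Keep in mind that the two runs used different kernels and compiler versions (gcc-4.9 versus gcc-4.7), so part of the difference may come from the toolchain rather than the hardware.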

Having come across some reports of temperature issues with the PI 3, I wanted to run some experiments on its power consumption. I plugged the power supply unit feeding the PI into a power meter, ran a few experiments, and got the following power consumption readings:


PI running state          : Power consumption
--------------------------:------------------
Idle                      : 1.4 W
Single threaded benchmark : 2.2 W
Multithreaded benchmark   : 4.0 W
After running "poweroff"  : 0.5 W

So, in my case it doesn't seem to consume too much power. However, a comparison with the PI 2 would be needed to get a better picture.
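
For a rough feel of what these readings mean, the short sketch below works out the extra draw over idle and the energy used per day in each state. The names are my own. Since the meter sat in front of the power supply unit, the numbers include the supply's own losses, not just the board.

```python
# Wall-socket readings in watts, copied from the table above.
readings_w = {
    "Idle": 1.4,
    "Single threaded benchmark": 2.2,
    "Multithreaded benchmark": 4.0,
    'After running "poweroff"': 0.5,
}

idle_w = readings_w["Idle"]

# Extra power drawn on top of idle for each state (negative means below idle).
extra_w = {state: round(watts - idle_w, 1) for state, watts in readings_w.items()}

# Energy used over 24 hours in each state, in watt-hours.
daily_wh = {state: round(watts * 24, 1) for state, watts in readings_w.items()}

for state in readings_w:
    print(f"{state:<26}: {extra_w[state]:+.1f} W vs idle, {daily_wh[state]:.1f} Wh/day")
```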

Friday, February 13, 2015

Raspberry Pi 2 is here!


Well, it's here! The Raspberry PI 2 looks very similar to its predecessor, the Raspberry PI B+, except for two things. The rather old ARM11 core has been replaced by not one but four Cortex-A7 cores (900MHz). The Cortex-A7 is an upgrade in itself, as benchmarks have shown that it is 1.5-3 times faster than the old CPU core. Four CPU cores make for a decent upgrade within the same power envelope and at the same price ($35). And that is not all: the new PI has double the amount of RAM, which now reaches 1GB.
To summarize, it is a great upgrade over the old PI. I would say that it is the most affordable 4-core computer for trying out parallel programming paradigms, e.g. OpenMP.
One can compare this nbench output to the original Raspberry PI nbench results. Keep in mind that nbench is a single-threaded benchmark.



BYTEmark* Native Mode Benchmark ver. 2 (10/95)
Index-split by Andrew D. Balsa (11/97)
Linux/Unix* port by Uwe F. Mayer (12/96,11/97)

TEST                : Iterations/sec.  : Old Index   : New Index
                    :                  : Pentium 90* : AMD K6/233*
--------------------:------------------:-------------:------------
NUMERIC SORT        :           453.9  :      11.64  :       3.82
STRING SORT         :          36.298  :      16.22  :       2.51
BITFIELD            :      1.1028e+08  :      18.92  :       3.95
FP EMULATION        :          82.381  :      39.53  :       9.12
FOURIER             :          4877.8  :       5.55  :       3.12
ASSIGNMENT          :          7.1713  :      27.29  :       7.08
IDEA                :          1364.7  :      20.87  :       6.20
HUFFMAN             :           663.8  :      18.41  :       5.88
NEURAL NET          :          5.7769  :       9.28  :       3.90
LU DECOMPOSITION    :          224.96  :      11.65  :       8.42
==========================ORIGINAL BYTEMARK RESULTS==========================
INTEGER INDEX       : 20.419
FLOATING-POINT INDEX: 8.434
Baseline (MSDOS*)   : Pentium* 90, 256 KB L2-cache, Watcom* compiler 10.0
==============================LINUX DATA BELOW===============================
CPU                 : 4 CPU ARMv7 Processor rev 5 (v7l)
L2 Cache            : 
OS                  : Linux 3.18.5-v7+
C compiler          : gcc-4.7
libc                : /lib/arm-linux-gnueabihf/libgcc_s.so.1
MEMORY INDEX        : 4.125
INTEGER INDEX       : 5.970
FLOATING-POINT INDEX: 4.678
Baseline (LINUX)    : AMD K6/233*, 512 KB L2-cache, gcc 2.7.2.3, libc-5.4.38
* Trademarks are property of their respective holder.
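
As a rough check of the 1.5-3 times figure, the sketch below divides the PI 2 iterations/sec above by the original Raspberry PI figures from the January 2013 post. The variable names are my own, and the numbers are copied from the two reports. Both runs report gcc-4.7, although the kernels differ.

```python
# Iterations/sec copied from the nbench reports on this page.
pi2_ips = {
    "NUMERIC SORT": 453.9, "STRING SORT": 36.298, "BITFIELD": 1.1028e+08,
    "FP EMULATION": 82.381, "FOURIER": 4877.8, "ASSIGNMENT": 7.1713,
    "IDEA": 1364.7, "HUFFMAN": 663.8, "NEURAL NET": 5.7769,
    "LU DECOMPOSITION": 224.96,
}
pi1_ips = {
    "NUMERIC SORT": 221.64, "STRING SORT": 31.709, "BITFIELD": 8.4099e+07,
    "FP EMULATION": 46.363, "FOURIER": 2372.8, "ASSIGNMENT": 2.4781,
    "IDEA": 696.1, "HUFFMAN": 424.38, "NEURAL NET": 3.0098,
    "LU DECOMPOSITION": 78.72,
}

# Single-threaded speedup of the PI 2 over the original PI, per test.
ratio = {test: pi2_ips[test] / pi1_ips[test] for test in pi2_ips}

# Tests with the smallest and largest gain.
slowest = min(ratio, key=ratio.get)
fastest = max(ratio, key=ratio.get)

for test, r in sorted(ratio.items(), key=lambda kv: kv[1]):
    print(f"{test:<17}: x{r:.2f}")
```

In these particular runs most tests fall inside the 1.5-3 range, with the string sort and bitfield tests gaining less than that.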

Monday, January 14, 2013

nbench on small linux devices

One of the benchmark programs that I find most convenient to use is nbench. The reason is that it runs on almost any device that can execute plain C code. This means that it can run on a desktop computer as well as on a smartphone (nbench is freely available on Google Play) or a router flashed with custom firmware (e.g. DD-Wrt with optware).

Here are three devices that I have tried it on:
Raspberry PI
Asus RT-N16
Linksys NSLU2
The Raspberry PI and the NSLU2 are ARM based, whereas the RT-N16 is MIPS based.

Here are the results of running it on a Raspberry PI (Raspbian OS):

BYTEmark* Native Mode Benchmark ver. 2 (10/95)
Index-split by Andrew D. Balsa (11/97)
Linux/Unix* port by Uwe F. Mayer (12/96,11/97)

TEST                : Iterations/sec.  : Old Index   : New Index
                    :                  : Pentium 90* : AMD K6/233*
--------------------:------------------:-------------:------------
NUMERIC SORT        :          221.64  :       5.68  :       1.87
STRING SORT         :          31.709  :      14.17  :       2.19
BITFIELD            :      8.4099e+07  :      14.43  :       3.01
FP EMULATION        :          46.363  :      22.25  :       5.13
FOURIER             :          2372.8  :       2.70  :       1.52
ASSIGNMENT          :          2.4781  :       9.43  :       2.45
IDEA                :           696.1  :      10.65  :       3.16
HUFFMAN             :          424.38  :      11.77  :       3.76
NEURAL NET          :          3.0098  :       4.83  :       2.03
LU DECOMPOSITION    :           78.72  :       4.08  :       2.94
==========================ORIGINAL BYTEMARK RESULTS==========================
INTEGER INDEX       : 11.729
FLOATING-POINT INDEX: 3.761
Baseline (MSDOS*)   : Pentium* 90, 256 KB L2-cache, Watcom* compiler 10.0
==============================LINUX DATA BELOW===============================
CPU                 :
L2 Cache            :
OS                  : Linux 3.2.27+
C compiler          : gcc-4.7
libc                : /lib/arm-linux-gnueabihf/libgcc_s.so.1
MEMORY INDEX        : 2.528
INTEGER INDEX       : 3.266
FLOATING-POINT INDEX: 2.086
Baseline (LINUX)    : AMD K6/233*, 512 KB L2-cache, gcc 2.7.2.3, libc-5.4.38
* Trademarks are property of their respective holder.


Here are the results of running it on a Linksys NSLU2 file server (flashed with SlugOS):

BYTEmark* Native Mode Benchmark ver. 2 (10/95)
Index-split by Andrew D. Balsa (11/97)
Linux/Unix* port by Uwe F. Mayer (12/96,11/97)

TEST                : Iterations/sec.  : Old Index   : New Index
                    :                  : Pentium 90* : AMD K6/233*
--------------------:------------------:-------------:------------
NUMERIC SORT        :          74.271  :       1.90  :       0.63
STRING SORT         :          6.9679  :       3.11  :       0.48
BITFIELD            :      1.8159e+07  :       3.11  :       0.65
FP EMULATION        :          17.645  :       8.47  :       1.95
FOURIER             :          75.723  :       0.09  :       0.05
ASSIGNMENT          :         0.96228  :       3.66  :       0.95
IDEA                :          176.19  :       2.69  :       0.80
HUFFMAN             :          104.82  :       2.91  :       0.93
NEURAL NET          :         0.10509  :       0.17  :       0.07
LU DECOMPOSITION    :          3.3757  :       0.17  :       0.13
==========================ORIGINAL BYTEMARK RESULTS==========================
INTEGER INDEX       : 3.324
FLOATING-POINT INDEX: 0.136
Baseline (MSDOS*)   : Pentium* 90, 256 KB L2-cache, Watcom* compiler 10.0
==============================LINUX DATA BELOW===============================
CPU                 :
L2 Cache            :
OS                  : Linux 2.6.27.8
C compiler          : gcc version 4.2.4
libc                :
MEMORY INDEX        : 0.668
INTEGER INDEX       : 0.976
FLOATING-POINT INDEX: 0.076
Baseline (LINUX)    : AMD K6/233*, 512 KB L2-cache, gcc 2.7.2.3, libc-5.4.38
* Trademarks are property of their respective holder.


And finally, here are the results of running it on an Asus RT-N16 router (flashed with DD-Wrt with optware):

BYTEmark* Native Mode Benchmark ver. 2 (10/95)
Index-split by Andrew D. Balsa (11/97)
Linux/Unix* port by Uwe F. Mayer (12/96,11/97)

TEST                : Iterations/sec.  : Old Index   : New Index
                    :                  : Pentium 90* : AMD K6/233*
--------------------:------------------:-------------:------------
NUMERIC SORT        :           160.6  :       4.12  :       1.35
STRING SORT         :          3.7864  :       1.69  :       0.26
BITFIELD            :      6.3597e+07  :      10.91  :       2.28
FP EMULATION        :            28.6  :      13.72  :       3.17
FOURIER             :          19.904  :       0.02  :       0.01
ASSIGNMENT          :           1.753  :       6.67  :       1.73
IDEA                :          670.35  :      10.25  :       3.04
HUFFMAN             :          40.453  :       1.12  :       0.36
NEURAL NET          :        0.015345  :       0.02  :       0.01
LU DECOMPOSITION    :         0.43656  :       0.02  :       0.02
==========================ORIGINAL BYTEMARK RESULTS==========================
INTEGER INDEX       : 5.017
FLOATING-POINT INDEX: 0.023
Baseline (MSDOS*)   : Pentium* 90, 256 KB L2-cache, Watcom* compiler 10.0
==============================LINUX DATA BELOW===============================
CPU                 :
L2 Cache            :
OS                  : Linux 2.6.24.111
C compiler          : gcc version 4.1.1
libc                : ld-uClibc-0.9.28.so
MEMORY INDEX        : 1.011
INTEGER INDEX       : 1.470
FLOATING-POINT INDEX: 0.013
Baseline (LINUX)    : AMD K6/233*, 512 KB L2-cache, gcc 2.7.2.3, libc-5.4.38
* Trademarks are property of their respective holder.

It should be noted that the latter two devices do not feature a floating point unit, and thus their performance on the floating point intensive tests is extremely low.
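
The gap is easy to see by dividing each device's floating-point index by its integer index. The sketch below does that with the Linux-baseline indices from the three reports above; the names are my own.

```python
# (INTEGER INDEX, FLOATING-POINT INDEX) against the Linux baseline,
# copied from the three nbench reports above.
indices = {
    "Raspberry PI": (3.266, 2.086),
    "Linksys NSLU2": (0.976, 0.076),
    "Asus RT-N16": (1.470, 0.013),
}

# Floating-point index relative to the integer index of the same device.
fp_to_int = {device: fp / integer for device, (integer, fp) in indices.items()}

for device, r in fp_to_int.items():
    print(f"{device:<14}: FP index is {r * 100:.1f}% of the integer index")
```

On the Raspberry PI the floating-point index is well over half of the integer index, while on the two devices without a floating point unit it drops to a few percent or less.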

One of the drawbacks of nbench is that it is written as a single-threaded application, so it cannot exploit the extra cores of a multicore CPU. One of my future hobby projects could be porting nbench to OpenMP or even OpenCL in order to exploit the full capabilities of a contemporary CPU or even a GPU. It would be fun to compare a Raspberry PI with a GTX580 on nbench!