David Brownell authored
The previous stuff was needed because the ARM11 code wasn't using the standard ARM base type and register access ... but now those mechanisms work, so we can switch out that special-purpose glue. This should resolve all the "FIXME -- handle Thumb single stepping" comments too, and properly handle the processor's mode. (Modulo the issue that this code doesn't yet handle two-byte breakpoints.) Clarify the comments about the hardware single stepping. When we eventually share breakpoint code with Cortex-A8, we can just make that be the default on cores which support it. We may still want an override command, not just to facilitate testing but to cope with "instruction address mismatch" not quite being true single-step. Signed-off-by:
David Brownell <dbrownell@users.sourceforge.net>
pi calculation benchmark
http://fab.cba.mit.edu/classes/MAS.864/text/benchmark.pdf
estimated GFlops | code | description | system | date |
---|---|---|---|---|
17,340,800 | cudampipi.cu | C++, CUDA+MPI, 2048 nodes, 12228 ranks, GPUs; `nvcc -arch=sm_70 -std=c++11` | Summit Oak Ridge OLCF IBM AC922 | December, 2020 |
88,333 | mpimppi.c | C, MPI+OpenMP, 1024 nodes, 64 cores/node, 4 threads/core; `cc mpimppi.c -o mpimppi -O3 -ffast-math -fopenmp` | Theta Argonne ALCF Cray XC40 | October, 2019 |
16,239 | cudapit.cu | C++, CUDA, 8 GPUs, 6912 cores/GPU | NVIDIA A100 | December, 2020 |
12,589 | cudapit.cu | C++, CUDA, 8 GPUs, 5120 cores/GPU | NVIDIA V100 | March, 2020 |
11,083 | mpithreadpi.cpp | C++, MPI+threads, 128 nodes, 64 cores/node, 4 threads/core; `CC mpithreadpi.cpp -o mpithreadpi -O3 -ffast-math -std=c++11` | Theta Argonne ALCF Cray XC40 | March, 2020 |
2,117 | mpipi2.c | C, MPI, 10 nodes, 96 cores/node; `mpicc mpipi2.c -o mpipi2 -O3 -ffast-math` | Intel 2x Xeon Platinum 8175M | October, 2019 |
2,102 | mpipi2.py | Python, Numba, MPI, 10 nodes, 96 cores/node | Intel 2x Xeon Platinum 8175M | February, 2020 |
2,052 | cudapi.cu | C++, CUDA, 6912 cores | NVIDIA A100 | December, 2020 |
1,635 | cudapi.cu | C++, CUDA, 5120 cores | NVIDIA V100 | March, 2020 |
1,595 | prior | C, MPI, 4096 processes | IBM Blue Gene/P | prior |
1,090 | numbapig.py | Python, Numba, CUDA, 5120 cores | NVIDIA V100 | March, 2020 |
811 | prior | C, MPI, 2048 processes | Cray XT4 | prior |
315 | numbapip.py | Python, Numba, parallel, fastmath, 96 cores | Intel 2x Xeon Platinum 8175M | February, 2020 |
272 | threadpi.c | C, 96 threads; `gcc threadpi.c -o threadpi -O3 -ffast-math -pthread` | Intel 2x Xeon Platinum 8175M | June, 2019 |
267 | threadpi.cpp | C++, 96 threads; `g++ threadpi.cpp -o threadpi -O3 -ffast-math -pthread` | Intel 2x Xeon Platinum 8175M | March, 2020 |
211 | mpipi2.c | C, MPI, 1 node, 96 cores; `mpicc mpipi2.c -o mpipi2 -O3 -ffast-math` | Intel 2x Xeon Platinum 8175M | October, 2019 |
180 | mpipi2.py | Python, Numba, MPI; `mpirun -np 96 python mpipi2.py` | Intel 2x Xeon Platinum 8175M | February, 2020 |
173 | mppi.c | C, OpenMP, 96 threads; `gcc mppi.c -o mppi -O3 -ffast-math -fopenmp` | Intel 2x Xeon Platinum 8175M | July, 2019 |
152 | pi.html | JavaScript, 96 workers | Intel 2x Xeon Platinum 8175M | June, 2019 |
93.2 | threadpi.c | C, 56 threads; `gcc threadpi.c -o threadpi -O3 -ffast-math -pthread` | Intel 2x E5-2680 | December, 2018 |
71.4 | pi.html | JavaScript, 56 workers | Intel 2x E5-2680 | November, 2018 |
46.9 | mpipi.c | C, MPI; `mpicc mpipi.c -o mpipi -O3 -ffast-math`; `mpirun -np 6 mpipi` | Intel i7-8700T | November, 2018 |
44.6 | threadpi.c | C, 6 threads; `gcc threadpi.c -o threadpi -O3 -ffast-math -pthread` | Intel i7-8700T | December, 2018 |
23.3 | mpipi2.py | Python, Numba, MPI; `mpirun -np 6 python mpipi2.py` | Intel i7-8700T | February, 2020 |
16.1 | pi.html | JavaScript, 6 workers | Intel i7-8700T | November, 2018 |
15.7 | clusterpi.js | Node, 6 workers | Intel i7-8700T | December, 2018 |
9.37 | pi.c | C; `gcc pi.c -o pi -lm -O3 -ffast-math` | Intel i7-8700T | November, 2018 |
4.87 | numbapi.py | Python, Numba | Intel i7-8700T | February, 2020 |
3.73 | pi.html | JavaScript, 1 worker | Intel i7-8700T | November, 2018 |
3.47 | pi.html | JavaScript, 1 worker | Intel 2x E5-2680 | November, 2018 |
3.29 | pi.js | Node | Intel i7-8700T | December, 2018 |
3.12 | clusterpi.js | Node, 1 worker | Intel i7-8700T | December, 2018 |
1.78 | threadpi.c | C, 4 threads; `gcc threadpi.c -o threadpi -O3 -ffast-math -pthread` | Raspberry Pi 4 | December, 2020 |
0.851 | prior | C, 32k processors | Connection Machine CM-2 | prior |
0.57 | pi.c | C; `gcc pi.c -o pi -lm` | Intel i7-8700T | November, 2018 |
0.47 | numpi.py | Python, NumPy | Intel i7-8700T | November, 2018 |
0.148 | prior | C | IBM ES/9000 | prior |
0.134 | prior | C | Pentium III | prior |
0.118 | prior | C, vector | Cray Y-MP4/464 | prior |
0.074 | pi.c | C; `gcc pi.c -o pi -lm -O3 -ffast-math` | Raspberry Pi Zero | December, 2020 |
0.029 | pi.py | Python | Intel i7-8700T | November, 2018 |
0.0168 | pi.c | C, floats, -O3, gcc-arm-none-eabi, 160 MHz | SAMD51J20A ARM Cortex M4F | October, 2019 |
0.013 | prior | C | Intel Pentium Pro | prior |
0.0128 | pi.c | C, floats, -O3, gcc-arm-none-eabi, 84 MHz | STM32F412 ARM Cortex M4F | October, 2019 |
0.010 | prior | C, scalar | Cray Y-MP4/464 | prior |
0.006 | pi.ino | Arduino, floats | ESP32-WROOM | December, 2020 |
0.001 | prior | C | Sun SPARCStation 1 | prior |
0.001 | prior | C | DEC VAX 8650 | prior |
0.0007 | prior | C | Intel 486 | prior |
0.0002 | pi.ino | Arduino, floats | ATSAMD21E | December, 2020 |
0.0001 | pi.ino | Arduino, floats | ATtiny1614 | December, 2020 |
0.00003 | prior | C | Sun 3/60 | prior |
0.00003 | prior | C | Intel 286 | prior |
0.000001 | prior | C | Intel 8088 | prior |
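
The table reports estimated GFlops but does not show how such an estimate is formed. As a rough sketch only: assuming the benchmark times a fixed number of terms of a series that converges to pi and charges a fixed number of floating-point operations per term, the figure could be computed along these lines. The particular series, the count of 5 operations per term, and the names `pi_series`, `estimate_gflops`, and `run_benchmark` are assumptions made for illustration, not taken from the source files listed above.

```python
import time

# Assumption: each term costs 5 floating-point operations
# (2 subtractions, 1 multiplication, 1 division, 1 addition).
FLOPS_PER_TERM = 5


def pi_series(npts):
    """Sum npts terms of 0.5 / ((i - 0.75) * (i - 0.25)).

    Each term equals 4/(4i - 3) - 4/(4i - 1), so the sum is the
    Leibniz series for pi taken two terms at a time.
    """
    total = 0.0
    for i in range(1, npts + 1):
        total += 0.5 / ((i - 0.75) * (i - 0.25))
    return total


def estimate_gflops(npts, seconds):
    """Convert a term count and an elapsed time into estimated GFlops."""
    return npts * FLOPS_PER_TERM / seconds / 1e9


def run_benchmark(npts=1_000_000):
    """Time the series and return (pi estimate, estimated GFlops)."""
    start = time.perf_counter()
    value = pi_series(npts)
    elapsed = time.perf_counter() - start
    return value, estimate_gflops(npts, elapsed)


if __name__ == "__main__":
    value, gflops = run_benchmark()
    print(f"pi ~ {value:.6f}, estimated GFlops ~ {gflops:.4f}")
```
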
estimated GFlops | estimated GFlops/W | code | description | system | date |
---|---|---|---|---|---|
0.0168 | 0.233 (calculated) | pi.c | C, floats, -O3, gcc-arm-none-eabi, 160 MHz | SAMD51J20A ARM Cortex M4F | October, 2019 |
0.0128 | 0.171 (calculated) | pi.c | C, floats, -O3, gcc-arm-none-eabi, 84 MHz | STM32F412 ARM Cortex M4F | October, 2019 |
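
The efficiency column is a ratio of throughput to power, so the power draw behind each "calculated" entry can be recovered by dividing one column by the other. The sketch below does that arithmetic for the two rows above. The helper names `gflops_per_watt` and `implied_watts` are hypothetical, and the table does not state how the power figures were obtained, so the results are only what the two columns imply.

```python
def gflops_per_watt(gflops, watts):
    """Efficiency as throughput divided by power draw."""
    return gflops / watts


def implied_watts(gflops, efficiency):
    """Power draw implied by a throughput and a GFlops/W figure."""
    return gflops / efficiency


# (system, estimated GFlops, estimated GFlops/W) copied from the table
rows = [
    ("SAMD51J20A", 0.0168, 0.233),
    ("STM32F412", 0.0128, 0.171),
]

for name, gflops, efficiency in rows:
    milliwatts = implied_watts(gflops, efficiency) * 1000
    print(f"{name}: roughly {milliwatts:.0f} mW implied")
```

Both rows work out to a power draw in the range of 70 to 75 mW.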