Benchmarks¶
The benchmark results were produced by the scripts in examples/benchmarks
, e.g.:
examples/benchmarks/generate_configuration.sh lennard_jones
examples/benchmarks/run_benchmark.sh lennard_jones
The Tesla GPUs had ECC enabled, no overclocking or other tweaking was done.
Simple Lennard-Jones fluid in 3 dimensions¶
Parameters:
- 64,000 particles, number density
- force: lennard_jones ()
- integrator: verlet (NVE, )
Hardware | time per MD step and particle | steps per second | FP precision | compilation details |
---|---|---|---|---|
Intel Xeon E5-2640 | 1.44 µs | 10.8 | double | GCC 4.7.2, -O3 |
NVIDIA Tesla S1070 | 57.0 ns | 274 | double-single | CUDA 5.5, -arch compute_12 |
55.1 ns | 284 | single | CUDA 5.5, -arch compute_12 | |
NVIDIA Tesla C2050 | 39.4 ns | 397 | double-single | CUDA 5.5, -arch compute_12 |
34.3 ns | 456 | single | CUDA 5.5, -arch compute_12 | |
NVIDIA Tesla K20m | 22.2 ns | 702 | double-single | CUDA 5.5, -arch compute_12 |
20.7 ns | 756 | single | CUDA 5.5, -arch compute_12 | |
NVIDIA Tesla K20Xm | 19.7 ns | 792 | double-single | CUDA 5.5, -arch compute_12 |
18.4 ns | 851 | single | CUDA 5.5, -arch compute_12 |
Results were obtained from 1 independent measurement based on pre-release version 1.0-alpha1. Each run consisted of NVT equilibration at over (10⁴ steps), followed by benchmarking 10⁴ NVE 5 times steps in a row.
Supercooled binary mixture (Kob-Andersen)¶
Parameters:
256,000 particles, number density
force: lennard_jones with 2 particle species (80% , 20% )
(, , , , , , , )
integrator: verlet (NVE, )
Hardware | time per MD step and particle | steps per second | FP precision | compilation details |
---|---|---|---|---|
Intel Xeon E5-2640 | 2.03 µs | 1.93 | double | GCC 4.7.2, -O3 |
NVIDIA Tesla S1070 | 65.7 ns | 59.4 | double-single | CUDA 5.5, -arch compute_12 |
66.3 ns | 58.9 | single | CUDA 5.5, -arch compute_12 | |
NVIDIA Tesla C2050 | 39.2 ns | 99.7 | double-single | CUDA 5.5, -arch compute_12 |
32.4 ns | 120 | single | CUDA 5.5, -arch compute_12 | |
NVIDIA Tesla K20m | 18.5 ns | 211 | double-single | CUDA 5.5, -arch compute_12 |
17.0 ns | 230 | single | CUDA 5.5, -arch compute_12 | |
NVIDIA Tesla K20Xm | 16.2 ns | 242 | double-single | CUDA 5.5, -arch compute_12 |
15.0 ns | 261 | single | CUDA 5.5, -arch compute_12 |
Results were obtained from 1 independent measurement and are based on pre-release version 1.0-alpha1. Each run consisted of NVT equilibration at over (2×10⁴ steps), followed by benchmarking 10⁴ NVE steps 5 times in a row.