The benchmark results were produced by the scripts in examples/benchmarks, e.g.:
examples/benchmarks/generate_configuration.sh lennard_jones
examples/benchmarks/run_benchmark.sh lennard_jones
Parameters:
- 64,000 particles, number density
- force: lennard_jones ()
- integrator: verlet (NVE, )
Hardware | time per MD step and particle | steps per second | FP precision | compilation details |
---|---|---|---|---|
Intel Xeon E5620 | 1.40 µs | 11.2 | double | GCC 4.4.1, -O3 |
NVIDIA Tesla S1070 | 58.6 ns | 267 | double-single | CUDA 4.2, -arch compute_12 |
NVIDIA Tesla S1070 | 54.3 ns | 288 | single | CUDA 4.2, -arch compute_12 |
NVIDIA Tesla C2050 | 40.5 ns | 386 | double-single | CUDA 4.2, -arch compute_12 |
NVIDIA Tesla C2050 | 34.6 ns | 452 | single | CUDA 4.2, -arch compute_12 |
NVIDIA Tesla S2050 | 46.4 ns | 337 | double-single | CUDA 4.2, -arch compute_20 |
NVIDIA Tesla S2050 | 39.6 ns | 395 | single | CUDA 4.2, -arch compute_12 |
NVIDIA Tesla M2090 | 37.8 ns | 414 | double-single | CUDA 4.2, -arch compute_20 |
NVIDIA Tesla M2090 | 33.2 ns | 470 | single | CUDA 4.2, -arch compute_12 |
Results were obtained from 1 independent measurement and are based on release version 0.2.0. Each run consisted of NVT equilibration at over (10⁴ steps), followed by benchmarking 5 times 10⁴ NVE steps in a row.
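The two timing columns of the table are tied together by the particle number: the time per MD step and particle, multiplied by the number of particles, gives the wall time of one step, and its reciprocal is the number of steps per second. The following Python sketch checks this relation against a few rows of the table for the 64,000-particle benchmark. The helper name `steps_per_second` is hypothetical and not part of the benchmark scripts, and small deviations from the reported values are expected because the table entries are rounded.

```python
def steps_per_second(time_per_step_and_particle, n_particles):
    """Convert wall time per MD step and particle (in seconds) to MD steps per second."""
    return 1.0 / (time_per_step_and_particle * n_particles)


N = 64_000  # particle number of this benchmark

# (hardware and precision, time per step and particle in seconds, reported steps per second)
rows = [
    ("Intel Xeon E5620, double", 1.40e-6, 11.2),
    ("NVIDIA Tesla S1070, double-single", 58.6e-9, 267),
    ("NVIDIA Tesla M2090, single", 33.2e-9, 470),
]

for name, t, reported in rows:
    computed = steps_per_second(t, N)
    print(f"{name}: computed {computed:.3g} steps/s, reported {reported} steps/s")
```

The same function applies to any other row by substituting its time per step and particle, and to a benchmark of different size by changing `N`.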
Parameters:
- 256,000 particles, number density
- force: lennard_jones with 2 particle species (80% , 20% )
  (, , , , , , , )
- integrator: verlet (NVE, )
Hardware | time per MD step and particle | steps per second | FP precision | compilation details |
---|---|---|---|---|
Intel Xeon E5620 | 1.96 µs | 2.00 | double | GCC 4.4.1, -O3 |
NVIDIA Tesla S1070 | 68.7 ns | 56.9 | double-single | CUDA 4.2, -arch compute_12 |
NVIDIA Tesla S1070 | 68.4 ns | 57.3 | single | CUDA 4.2, -arch compute_12 |
NVIDIA Tesla C2050 | 41.9 ns | 93.3 | double-single | CUDA 4.2, -arch compute_12 |
NVIDIA Tesla C2050 | 35.0 ns | 112 | single | CUDA 4.2, -arch compute_12 |
NVIDIA Tesla S2050 | 44.4 ns | 88.0 | double-single | CUDA 4.2, -arch compute_20 |
NVIDIA Tesla S2050 | 38.0 ns | 103 | single | CUDA 4.2, -arch compute_12 |
NVIDIA Tesla M2090 | 35.8 ns | 109 | double-single | CUDA 4.2, -arch compute_20 |
NVIDIA Tesla M2090 | 29.6 ns | 132 | single | CUDA 4.2, -arch compute_12 |
Results were obtained from 1 independent measurement and are based on release version 0.2.0. Each run consisted of NVT equilibration at over (2×10⁴ steps), followed by benchmarking 5 times 10⁴ NVE steps in a row.
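One way to read the table for the 256,000-particle benchmark is as a speedup of each GPU relative to the CPU row, taken as the ratio of the times per MD step and particle. The Python sketch below does this for the double-single GPU rows. The function name `speedup` and the choice of the Intel Xeon E5620 row as reference are illustrative assumptions, and the comparison is only indicative because the CPU run uses double precision while the GPU runs use double-single precision.

```python
def speedup(reference_time, time):
    """Ratio of a reference time per step and particle to another such time."""
    return reference_time / time


# Reference: Intel Xeon E5620, double precision (seconds per MD step and particle)
cpu_time = 1.96e-6

# Double-single GPU rows of the table (seconds per MD step and particle)
gpu_times = {
    "NVIDIA Tesla S1070": 68.7e-9,
    "NVIDIA Tesla C2050": 41.9e-9,
    "NVIDIA Tesla S2050": 44.4e-9,
    "NVIDIA Tesla M2090": 35.8e-9,
}

for name, t in gpu_times.items():
    print(f"{name}: {speedup(cpu_time, t):.1f}x faster than the CPU reference")
```

Because the table entries are rounded to three significant digits, the resulting ratios are accurate only to about the same precision.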