Symbolically generated GPU-based LBM
…seminar report on using symbolic computation to generate optimized GPU implementations of the Lattice Boltzmann Method.
The main focus is the performance impact of common subexpression elimination.
In addition to the report itself (PDF, 20 pages), the underlying code is also available.
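To make the idea of common subexpression elimination on symbolic LBM expressions concrete, the following is a minimal sketch using SymPy's `cse` function. It is not taken from the report or its code: the symbol names, the `equilibrium` helper, and the choice of a second-order equilibrium for the four diagonal D2Q9 directions are assumptions made purely for illustration.

```python
import sympy as sp

# Illustrative symbols (not taken from the report): density, velocity, lattice weight.
rho, ux, uy, w = sp.symbols("rho u_x u_y w")


def equilibrium(cx, cy):
    """Second-order equilibrium for lattice velocity (cx, cy), assuming c_s^2 = 1/3."""
    cu = cx * ux + cy * uy
    uu = ux**2 + uy**2
    return w * rho * (1 + 3 * cu + sp.Rational(9, 2) * cu**2 - sp.Rational(3, 2) * uu)


# The four diagonal D2Q9 directions share one weight and many subterms.
diagonals = [(1, 1), (-1, 1), (-1, -1), (1, -1)]
exprs = [equilibrium(cx, cy) for cx, cy in diagonals]

# cse returns (list of (symbol, subexpression) pairs, rewritten expressions).
replacements, reduced = sp.cse(exprs)


def total_ops(expressions):
    """Sum of SymPy's operation counts over a list of expressions."""
    return sum(sp.count_ops(e) for e in expressions)


ops_before = total_ops(exprs)
ops_after = total_ops([rhs for _, rhs in replacements]) + total_ops(reduced)


def rebuild(expr):
    """Undo the CSE by substituting temporaries back, last one first."""
    for sym, sub in reversed(replacements):
        expr = expr.subs(sym, sub)
    return expr


if __name__ == "__main__":
    for sym, sub in replacements:
        print(f"{sym} = {sub}")
    print("operations before CSE:", ops_before)
    print("operations after CSE: ", ops_after)
```

In a generated GPU kernel, each extracted pair would presumably become a temporary variable computed once and reused by every expression that needs it. The operation counts printed here are SymPy's own estimate and say nothing about measured performance.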
CSE impact on P100
CSE | D2Q9 single | D2Q9 double | D3Q19 single | D3Q19 double | D3Q27 single | D3Q27 double
---|---|---|---|---|---|---
No | 6957.4 | 2814.4 | 2581.8 | 998.8 | 1576.4 | 647.4
Yes | 6922.4 | 3585.0 | 3420.2 | 1763.8 | 2374.6 | 1259.6

CSE | D2Q9 single | D2Q9 double | D3Q19 single | D3Q19 double | D3Q27 single | D3Q27 double
---|---|---|---|---|---|---
No | 96.1% | 75.7% | 73.2% | 55.9% | 63.0% | 51.3%
Yes | 95.6% | 96.4% | 96.9% | 98.7% | 94.9% | 99.8%
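
The absolute figures in the first table can be turned into per-configuration speedup factors by dividing each "Yes" value by the corresponding "No" value. The short sketch below does only that arithmetic. The numbers are copied from the table, while the variable names are chosen here for illustration.

```python
# Throughput values copied from the first table (without and with CSE, on the P100).
LATTICES = ("D2Q9", "D3Q19", "D3Q27")
PRECISIONS = ("single", "double")

without_cse = [6957.4, 2814.4, 2581.8, 998.8, 1576.4, 647.4]
with_cse = [6922.4, 3585.0, 3420.2, 1763.8, 2374.6, 1259.6]

# Column order matches the table: each lattice in single, then double precision.
columns = [(lat, prec) for lat in LATTICES for prec in PRECISIONS]

# Speedup factor of CSE over no CSE for every (lattice, precision) pair.
speedup = {col: yes / no for col, no, yes in zip(columns, without_cse, with_cse)}

if __name__ == "__main__":
    for (lat, prec), factor in speedup.items():
        print(f"{lat:6s} {prec:6s} {factor:.2f}x")
```

By this measure CSE makes essentially no difference for D2Q9 in single precision (ratio just below 1) and roughly doubles the figure for D3Q27 in double precision.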