» Symbolically generated GPU-based LBM

…seminar report on using symbolic computation to generate optimized GPU implementations of the Lattice Boltzmann Method.

The main focus is the performance impact of common subexpression elimination.
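As a rough illustration of what common subexpression elimination does to generated code, the following minimal sketch hoists repeated subtrees of an expression tree into temporaries. It is not the report's implementation: the tuple-based expression format, the helper names (`count_subtrees`, `cse`, `count_ops`) and the temporary names `x0`, `x1`, … are assumptions made for this example only.

```python
from collections import Counter

# Hypothetical expression format: a leaf is a symbol name or a number,
# a compound node is a tuple (operator, operand, operand, ...).

def count_subtrees(expr, counts):
    """Count occurrences of each compound subtree, descending only on first sight
    so that the children of an already shared subtree are not counted twice."""
    if isinstance(expr, tuple):
        counts[expr] += 1
        if counts[expr] == 1:
            for arg in expr[1:]:
                count_subtrees(arg, counts)

def cse(exprs):
    """Return (assignments, reduced_exprs) with repeated subtrees replaced by temporaries."""
    counts = Counter()
    for e in exprs:
        count_subtrees(e, counts)

    names = {}        # subtree -> name of its temporary
    assignments = []  # (name, definition), dependencies come first

    def rewrite(expr):
        if not isinstance(expr, tuple):
            return expr
        if expr in names:
            return names[expr]
        rewritten = (expr[0],) + tuple(rewrite(a) for a in expr[1:])
        if counts[expr] > 1:
            # Assumes the input does not already use symbols named x0, x1, ...
            name = f"x{len(assignments)}"
            names[expr] = name
            assignments.append((name, rewritten))
            return name
        return rewritten

    return assignments, [rewrite(e) for e in exprs]

def count_ops(expr):
    """Number of operator nodes in an expression tree."""
    if not isinstance(expr, tuple):
        return 0
    return 1 + sum(count_ops(a) for a in expr[1:])

# Two expressions that both contain u*u + v*v, a pattern typical of
# equilibrium terms that share the squared velocity.
usq = ('+', ('*', 'u', 'u'), ('*', 'v', 'v'))
exprs = [('-', 'a', usq), ('-', 'b', usq)]

assignments, reduced = cse(exprs)
ops_before = sum(count_ops(e) for e in exprs)
ops_after = sum(count_ops(d) for _, d in assignments) + sum(count_ops(e) for e in reduced)

print(assignments)  # the shared term is computed once as x0
print(reduced)      # both expressions now refer to x0
print(ops_before, "->", ops_after)
```

In this toy case the operation count drops from 8 to 5 because the shared term is evaluated once. Whether such a reduction translates into higher throughput on a GPU depends on the kernel and the compiler, which is what the measurements below address.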

In addition to the report itself (PDF, 20 pages), the underlying code is available.

CSE impact on P100

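| CSE | D2Q9 single | D2Q9 double | D3Q19 single | D3Q19 double | D3Q27 single | D3Q27 double |
|-----|-------------|-------------|--------------|--------------|--------------|--------------|
| No  | 6957.4      | 2814.4      | 2581.8       | 998.8        | 1576.4       | 647.4        |
| Yes | 6922.4      | 3585.0      | 3420.2       | 1763.8       | 2374.6       | 1259.6       |

| CSE | D2Q9 single | D2Q9 double | D3Q19 single | D3Q19 double | D3Q27 single | D3Q27 double |
|-----|-------------|-------------|--------------|--------------|--------------|--------------|
| No  | 96.1%       | 75.7%       | 73.2%        | 55.9%        | 63.0%        | 51.3%        |
| Yes | 95.6%       | 96.4%       | 96.9%        | 98.7%        | 94.9%        | 99.8%        |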
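The absolute figures are easier to compare as ratios. The short sketch below computes the speedup of the CSE variant over the non-CSE variant for each lattice and precision, using only the values from the first table. The unit is not stated in this excerpt, but the ratio does not depend on it; the dictionary names are chosen for this example.

```python
# Values copied from the first table, keyed by (lattice, precision).
no_cse = {
    ("D2Q9", "single"): 6957.4, ("D2Q9", "double"): 2814.4,
    ("D3Q19", "single"): 2581.8, ("D3Q19", "double"): 998.8,
    ("D3Q27", "single"): 1576.4, ("D3Q27", "double"): 647.4,
}
with_cse = {
    ("D2Q9", "single"): 6922.4, ("D2Q9", "double"): 3585.0,
    ("D3Q19", "single"): 3420.2, ("D3Q19", "double"): 1763.8,
    ("D3Q27", "single"): 2374.6, ("D3Q27", "double"): 1259.6,
}

# Speedup factor of the CSE variant relative to the variant without CSE.
speedup = {key: with_cse[key] / no_cse[key] for key in no_cse}

for (lattice, precision), factor in speedup.items():
    print(f"{lattice:6s} {precision:6s} {factor:.2f}x")
```

Read this way, CSE makes essentially no difference for single-precision D2Q9 (a factor of about 0.99), while the gain grows with lattice size and precision, up to roughly 1.95 for double-precision D3Q27.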