Abstract: Vector processors (e.g., SIMD or GPUs) are ubiquitous in high performance systems. All the supercomputers in the world exploit data-level parallelism (DLP), for example by using single ...