In the appendix of Ulrich Drepper’s 2007 paper on memory and computer architecture, he demonstrates a way to verify how bad our branch-prediction intuition truly is. The code still works like a charm, mostly untouched. Let’s see what we can do with it.
[Read More]
Possible _mm_pause assembly regression on GCC
While homebrewing some spinlocks, I discovered an interesting possible regression in how GCC compiles the _mm_pause intrinsic. On supported architectures this intrinsic should translate to a PAUSE instruction, which tells the CPU that it is inside a spin-wait loop and helps avoid the costly pipeline flush a typical spinlock incurs when the lock is released or acquired.
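For context, here is a minimal sketch of where _mm_pause usually sits in a homebrewed spinlock. This is not the code from the post: the names demo_spinlock, demo_spin_lock and cpu_relax are hypothetical, and the lock is a plain test-and-test-and-set design built on C11 atomics. On non-x86 targets the sketch assumes there is no equivalent hint and falls back to a no-op.

```c
#include <stdatomic.h>
#include <stdbool.h>

#if defined(__x86_64__) || defined(__i386__)
#include <immintrin.h>            /* declares _mm_pause() */
#define cpu_relax() _mm_pause()   /* expected to compile to a PAUSE instruction */
#else
#define cpu_relax() ((void)0)     /* assumption: no spin-wait hint on other targets */
#endif

/* Hypothetical test-and-test-and-set spinlock, for illustration only. */
typedef struct {
    atomic_bool locked;
} demo_spinlock;

static inline void demo_spin_init(demo_spinlock *l)
{
    atomic_init(&l->locked, false);
}

/* Returns true if the lock was free and is now held by the caller. */
static inline bool demo_spin_trylock(demo_spinlock *l)
{
    return !atomic_exchange_explicit(&l->locked, true, memory_order_acquire);
}

static inline void demo_spin_lock(demo_spinlock *l)
{
    while (!demo_spin_trylock(l)) {
        /* Spin on a plain load and hint the CPU while the lock is held. */
        while (atomic_load_explicit(&l->locked, memory_order_relaxed))
            cpu_relax();
    }
}

static inline void demo_spin_unlock(demo_spinlock *l)
{
    atomic_store_explicit(&l->locked, false, memory_order_release);
}
```

Inspecting the compiler's assembly output for demo_spin_lock (for example with gcc -O2 -S) is one way to check whether the inner loop actually contains a pause instruction.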
[Read More]
DSP Activity Through Linux debugfs(8)
Disclaimer: This post refers to the Qualcomm Snapdragon 8xx chipsets but likely applies to other vendors as well.
[Read More]
FastCV: Hardware Acceleration of Computer Vision
Qualcomm’s FastCV SDK promises hardware acceleration of computer vision applications, providing a slew of APIs for common signal processing tasks (e.g., filters) without ties to particular hardware.
[Read More]
Linux adsprpc Driver Perf Infrastructure
CPU statistics are readily available on most platforms. However, workloads on mobile phones run across dozens of other hardware components. To reason about the behaviour of IP blocks on mobile phones, something along the lines of performance counters would go a long way. Below we outline the performance counter infrastructure...
[Read More]