CPU statistics are readily available on most platforms. However, workloads on mobile phones run across dozens of other hardware components. To reason about the behaviour of these IP blocks, something along the lines of performance counters goes a long way. Below we outline the performance counter infrastructure built into the Linux MSM kernel DSP driver. Since such instrumentation is implemented on an ad hoc basis in each driver, we will cover other hardware components in future blog posts.
The Linux adsprpc driver handles communication between the CPU and the DSP. Most of the driver source sits in adsprpc.c. The kernel implements the ADSP subsystem as a character device driver, and user-space applications invoke the DSP through its ioctl interface.
Infrastructure
- Available counters: these are enumerated in enum fastrpc_perfkeys
  - PERF_COUNT: bookkeeping
  - PERF_FLUSH: cache flushes
  - PERF_MAP: fastrpc mmap calls and ion mappings
  - PERF_COPY: page copies
  - PERF_LINK: fastrpc_invoke_send calls (i.e., smd_write and glink_tx calls)
  - PERF_GETARGS: get_args calls
  - PERF_PUTARGS: put_args calls
  - PERF_INVARGS: inv_args or inv_args_pre calls
  - PERF_INVOKE: complete fastrpc_internal_invoke call
  - PERF_KEY_MAX: bookkeeping (denotes last perf key entry)
- Each of the above enum entries is actually an offset used to look up and store performance counters in a struct fastrpc_perf
- The struct fastrpc_file file descriptor stores a list of struct fastrpc_perf entries, one per process ID
- Each performance counter in struct fastrpc_perf is an int64_t
- The #define PERF macro encloses a block of code and times it using getnstimeofday
  - The result is stored into the appropriate key in struct fastrpc_perf
- The #define GET_COUNTER(perf_ptr, offset) macro uses pointer arithmetic to retrieve a perf key from a struct fastrpc_perf pointer
  - perf_ptr is a struct fastrpc_perf*
  - offset is one of the enum values above
  - Since the struct consists of consecutive int64_t fields, incrementing the pointer gives consecutive access to the individual counters
- getperfcounter is used to retrieve an element of the appropriate struct fastrpc_perf from the file descriptor
  - It iterates through the struct fastrpc_perf list to find the structure that was created for the current process ID
  - It then returns the offset into the found structure using the enum key
- A user-space application can use the FASTRPC_IOCTL_GETPERF ioctl command to retrieve performance counters
  - It does essentially the same as getperfcounter above but copies out some additional information about available performance counters to the user
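
The lookup and timing scheme described in this list can be sketched in user-space C. This is only an illustration under stated assumptions, not the driver's code: struct perf_sketch, struct perf_node, get_counter, get_perf_counter, now_ns and PERF_SKETCH are hypothetical names, a plain singly linked list stands in for the kernel's list type, and C11 timespec_get stands in for getnstimeofday. Whether the real macro accumulates or overwrites a counter is also an assumption here.

```c
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Keys from the list above; each one doubles as an index into the struct. */
enum perf_key {
    PERF_COUNT = 0, PERF_FLUSH, PERF_MAP, PERF_COPY, PERF_LINK,
    PERF_GETARGS, PERF_PUTARGS, PERF_INVARGS, PERF_INVOKE, PERF_KEY_MAX
};

/* Hypothetical stand-in for struct fastrpc_perf: consecutive int64_t counters. */
struct perf_sketch {
    int64_t count, flush, map, copy, link;
    int64_t getargs, putargs, invargs, invoke;
    int64_t tid; /* owner of this counter set */
};

/* The index trick only works if the counters are packed with no padding. */
_Static_assert(offsetof(struct perf_sketch, tid) == PERF_KEY_MAX * sizeof(int64_t),
               "counters must be laid out consecutively");

/* Hypothetical list node; the kernel uses its own list type instead. */
struct perf_node {
    struct perf_sketch perf;
    struct perf_node *next;
};

/* Pointer arithmetic on an int64_t view of the struct, with a bounds check. */
static int64_t *get_counter(struct perf_sketch *perf, enum perf_key key)
{
    if (perf == NULL || (unsigned)key >= (unsigned)PERF_KEY_MAX)
        return NULL;
    return (int64_t *)perf + key;
}

/* Find the counter set owned by `tid`, then index into it with `key`. */
static int64_t *get_perf_counter(struct perf_node *head, int64_t tid,
                                 enum perf_key key)
{
    for (struct perf_node *n = head; n != NULL; n = n->next) {
        if (n->perf.tid == tid)
            return get_counter(&n->perf, key);
    }
    return NULL;
}

/* Wall-clock time in nanoseconds via C11 timespec_get. */
static int64_t now_ns(void)
{
    struct timespec ts;
    timespec_get(&ts, TIME_UTC);
    return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Run the enclosed statements and add the elapsed time to the chosen counter.
 * Accumulating (rather than overwriting) is an assumption of this sketch. */
#define PERF_SKETCH(perf, key, ...)                    \
    do {                                               \
        int64_t *ctr_ = get_counter((perf), (key));    \
        int64_t start_ = now_ns();                     \
        __VA_ARGS__;                                   \
        if (ctr_ != NULL)                              \
            *ctr_ += now_ns() - start_;                \
    } while (0)
```

A caller would wrap the code to be measured, for example PERF_SKETCH(&node.perf, PERF_MAP, do_mapping()), where do_mapping is whatever work should be timed, and later read the value back with get_perf_counter(head, tid, PERF_MAP).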
Using the fastrpc_perf infrastructure in Android (on supported devices), one can read out these performance counters via adb logcat by enabling the appropriate properties. On Android P, read out these counters by running:
adb shell setprop vendor.fastrpc.perf.kernel 1
adb shell setprop vendor.fastrpc.perf.adsp 1
adb logcat | grep perf
Example
An example program to retrieve information about DSP performance counters is as follows:
#include <fcntl.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

struct fastrpc_ioctl_perf
{
    /* kernel performance data */
    uintptr_t data;   /* user buffer that receives the counters */
    uint32_t numkeys; /* number of counters reported back */
    uintptr_t keys;   /* user buffer that receives the key names */
};

struct hlist_node {
    struct hlist_node *next, **pprev;
};

struct fastrpc_perf {
    int64_t count;
    int64_t flush;
    int64_t map;
    int64_t copy;
    int64_t link;
    int64_t getargs;
    int64_t putargs;
    int64_t invargs;
    int64_t invoke;
    int64_t tid;
    struct hlist_node hn;
};

#define FASTRPC_IOCTL_GETPERF _IOWR( 'R', 9, struct fastrpc_ioctl_perf )
#define KEYS_BUF_SIZE 1024

int main( void )
{
    int fd;
    struct fastrpc_ioctl_perf data;
    struct fastrpc_perf frpc_d;
    char* keys;

    fd = open( "/dev/adsprpc-smd", O_RDWR );
    if( fd == -1 )
    {
        perror( "open" );
        return EXIT_FAILURE;
    }

    memset( &data, 0, sizeof( struct fastrpc_ioctl_perf ) );
    memset( &frpc_d, 0, sizeof( struct fastrpc_perf ) );
    data.data = (uintptr_t)&frpc_d;

    /* zero-initialised so the buffer is a valid string even if nothing is written */
    keys = calloc( KEYS_BUF_SIZE, sizeof( char ) );
    if( keys == NULL )
    {
        perror( "calloc" );
        close( fd );
        return EXIT_FAILURE;
    }
    /* pass the buffer itself, not the address of the pointer variable */
    data.keys = (uintptr_t)keys;

    if( ioctl( fd, FASTRPC_IOCTL_GETPERF, &data ) == -1 )
    {
        perror( "ioctl" );
        free( keys );
        close( fd );
        return EXIT_FAILURE;
    }
    close( fd );

    /* uintptr_t and int64_t need the <inttypes.h> format macros to print portably */
    printf( "keys: %" PRIuPTR "\n"
            "numkeys: %" PRIu32 "\n"
            "data: %" PRIuPTR "\n",
            data.keys, data.numkeys, data.data );
    printf( "Data content: %" PRId64 " %" PRId64 " %" PRId64 "\n",
            frpc_d.count, frpc_d.flush, frpc_d.tid );
    printf( "Keys: %s\n", keys );

    free( keys );
    return EXIT_SUCCESS;
}
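
The program above prints the key names and the counters separately. A small helper can pair them up, as in the sketch below. It assumes the keys buffer holds a colon-separated list of names in the same order as the counters; that format is not stated in this post, so check it against your kernel source before relying on it. format_perf_pairs is a hypothetical helper, not part of the driver or of any library.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes one "name=value" line per counter into `out`.
 * ASSUMPTION: `keys` is a ':'-separated list of names ordered like `values`.
 * Returns the number of pairs written; stops early if `out` is too small. */
static size_t format_perf_pairs(const char *keys, const int64_t *values,
                                size_t numvalues, char *out, size_t outlen)
{
    size_t used = 0;
    size_t pairs = 0;

    if (outlen == 0)
        return 0;
    out[0] = '\0';

    while (*keys != '\0' && pairs < numvalues) {
        /* Length of the current name, up to the next ':' or end of string. */
        size_t namelen = strcspn(keys, ":");
        int n = snprintf(out + used, outlen - used, "%.*s=%" PRId64 "\n",
                         (int)namelen, keys, values[pairs]);

        if (n < 0 || (size_t)n >= outlen - used) {
            out[used] = '\0'; /* drop the truncated pair */
            break;
        }
        used += (size_t)n;
        pairs++;

        keys += namelen;
        if (*keys == ':')
            keys++; /* skip the separator */
    }
    return pairs;
}
```

In the example program this could be called after the ioctl succeeds as format_perf_pairs(keys, (const int64_t *)&frpc_d, data.numkeys, buf, sizeof buf), treating the leading int64_t fields of the struct as an array in the same way the driver's counter lookup does. Pass a smaller count if numkeys exceeds the number of int64_t fields in your local struct definition.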