Running Renderscript efficiently with PowerVR GPUs on Android

Renderscript is a high-performance compute API in Android, Google’s popular operating system for mobile and embedded devices. This article covers it as implemented in Android 4.2 (JELLY_BEAN_MR1). It allows developers to run a subset of C99 code across all available processor cores, such as CPUs, GPUs or DSPs. Renderscript is particularly useful for applications built around parallel operations, such as image processing, mathematical modeling, or any other workload that requires a great deal of mathematical computation. The Android SDK Manager provides access to a variety of Renderscript sample code.

We’ve recently run a variety of sample Renderscript and Filterscript code from Google, and found that the vast majority of the examples run perfectly well on PowerVR GPUs with no changes required. The very few that don’t either immediately fall back on the CPU or require a minor script update (optimizing the data type for a small number of variables).

What’s accelerated on PowerVR Series5XT GPUs

Camera applications such as smile and blink detection, high dynamic range (HDR), panoramic post processing and dynamic contrast enhancement are ideal candidates for acceleration using Renderscript. Voice enhancement algorithms are another key area of interest: noise cancellation and beamforming can both take advantage of Renderscript implementations.

A more detailed look at how Renderscript works and the software architecture it implements can be found on Android’s dedicated website. It includes useful diagrams and a short tutorial on creating a Renderscript application and calling scripts.

Helping developers access Renderscript efficiently on PowerVR GPUs

Developers can rely on Renderscript to automatically provide access to all available computational resources without having to worry about support for a certain processing architecture. In essence, the API abstracts programs from the underlying instruction set of the CPU or the GPU. Because Renderscript code is compiled on the device at runtime, non-parallel code will likely execute on the CPU, while parallel code defined by the ‘ForEach’ elements of each script will be considered for GPU acceleration.
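As a rough illustration of why ‘ForEach’ work is a natural fit for the GPU, the plain C sketch below mimics the pattern: a kernel function that depends only on its own input element, and a launcher that applies it to every element. This is not Renderscript code and does not use the Renderscript API; `pixel_t`, `invert_kernel` and `for_each_invert` are hypothetical names, and the serial loop merely stands in for whatever scheduling a real driver performs.

```c
#include <stddef.h>
#include <stdint.h>

/* One RGBA pixel with four 8-bit channels (hypothetical type for this sketch). */
typedef struct { uint8_t r, g, b, a; } pixel_t;

/* Per-element kernel: the result depends only on the input pixel,
 * so every invocation is independent of all the others. */
pixel_t invert_kernel(pixel_t in) {
    pixel_t out = {
        (uint8_t)(255 - in.r),
        (uint8_t)(255 - in.g),
        (uint8_t)(255 - in.b),
        in.a /* alpha is left unchanged */
    };
    return out;
}

/* Hypothetical stand-in for a ForEach launch. It runs serially here;
 * because iterations share no state, a runtime would be free to split
 * the index range across CPU cores or GPU threads instead. */
void for_each_invert(const pixel_t *in, pixel_t *out, size_t count) {
    for (size_t i = 0; i < count; ++i) {
        out[i] = invert_kernel(in[i]);
    }
}
```

The key property is that no iteration reads what another iteration writes, which is what leaves the choice of processor open to the driver.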

Renderscript system overview*

This is where things get more vendor-specific. The driver decides which computational resource to use on a per-script basis. With our advanced driver implementations, both PowerVR Series5/5XT and Series6 cores attempt to employ GPU acceleration wherever possible while always keeping CPU fall-back as an option, so most scripts that contain parallel code will target the PowerVR engine.

All PowerVR cores are managed by firmware (the microkernel) that controls all higher-level events at the GPU level. This approach offloads virtually all interrupt handling from the host CPU, and because GPU execution is controlled by software-based firmware, it retains maximum flexibility.

On PowerVR Series5/5XT cores, the microkernel firmware is executed on the USSE pipelines to ensure optimal silicon area utilization. PowerVR Series6 ‘Rogue’ cores move the microkernel execution to a dedicated C-programmable multi-threaded microcontroller which enables full debugging functionality of the GPU core.

[Figure: PowerVR Series5, Series5XT and Series6 GPU roadmap]

This software-based management of the GPU core ensures that all PowerVR GPUs have the flexibility to adapt to present and future market requirements. All of Imagination’s GPUs are thus able to support not only current compute APIs like OpenCL, Renderscript and Filterscript, but also future ones put forward by heterogeneous processing-focused groups like the HSA Foundation.

Mobile GPU compute is about high performance and low power, not just standards

Renderscript lets developers define the floating point precision their compute algorithms require. This matters on mobile platforms, where battery life is the main consideration when developing applications. As few mobile CPUs or GPUs currently provide full compliance with the IEEE 754-2008 standard, this flexibility lets developers optimize their code for power efficiency and target as many platforms as possible without worrying about code compatibility issues.

Imagination’s PowerVR GPUs are designed with these low power and high efficiency considerations in mind. As a result, mapping the majority of Renderscript intrinsics to our hardware architecture is much more straightforward, and they can be expected to benefit from GPU acceleration.

In the graphs below, we see how scaling the number of GPU cores (going from a single core to a three-core PowerVR SGX544 GPU) leads to a corresponding increase in performance.

Most Android Renderscript examples running on PowerVR SGX544SC get a 3x boost when moved to a PowerVR SGX544MP3-based platform.

[Graph: Renderscript and Filterscript image processing scripts on PowerVR SGX544]

[Graph: Renderscript and Filterscript image intrinsics scripts on PowerVR SGX544]

Furthermore, blends, blurs, color matrices and 3×3/5×5 convolve operations typically do not require FP64 precision. Highly parallel scripts can therefore be developed and deployed across a range of Android devices, from mass market smartphones to high-end tablets, with reasonable performance improvements for all use cases and a high degree of portability.
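To make the precision point concrete, here is a minimal sketch of a 3×3 convolve written in plain C with 32-bit floats only. It is not the Renderscript intrinsic and makes no claim about how PowerVR drivers implement it; the function names, the single-channel image layout and the clamp-to-edge border handling are all assumptions chosen for illustration.

```c
#include <stddef.h>

/* Clamp an index into [0, n - 1]. Clamp-to-edge border handling is an
 * assumption of this sketch, not a statement about the intrinsic. */
int clamp_index(int i, int n) {
    if (i < 0) return 0;
    if (i >= n) return n - 1;
    return i;
}

/* Hypothetical single-channel 3x3 convolve using FP32 arithmetic throughout.
 * coeff holds the nine weights in row-major order. */
void convolve3x3_f32(const float *src, float *dst, int width, int height,
                     const float coeff[9]) {
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            float sum = 0.0f;
            for (int ky = -1; ky <= 1; ++ky) {
                for (int kx = -1; kx <= 1; ++kx) {
                    int sx = clamp_index(x + kx, width);
                    int sy = clamp_index(y + ky, height);
                    sum += coeff[(ky + 1) * 3 + (kx + 1)]
                         * src[(size_t)sy * width + sx];
                }
            }
            /* Each output pixel is written once and never read back. */
            dst[(size_t)y * width + x] = sum;
        }
    }
}
```

A box blur would pass nine coefficients of 1/9 each. Because every output pixel reads only from the source image, each (x, y) iteration is independent, which is what makes this kind of filter a candidate for a parallel launch.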

PowerVR GPUs have optimized 32-bit pipelines that are designed to support realistic mobile use-cases, with certain family members of the ‘Rogue’ architecture being able to scale to HPC tasks where FP64 precision is needed.

We await your feedback on Renderscript and Filterscript in the comments box below. To keep up to date with the latest developments for GPU Compute on PowerVR cores, follow us on Twitter (@GPUCompute, @PowerVRInsider and @ImaginationPR) and subscribe to our RSS feed.

* Image courtesy of the Android Developer page, all rights reserved.

  • Schini

    Hello Alexandru. I am currently investigating the benefits of Renderscript for my Master’s thesis. For this I used a set of scripts and also the Android intrinsic scripts provided by Google. Because of a lack of documentation, I wanted to ask how you determined which script was executed on the CPU or the GPU.

  • Alexandru Voica

    Hi,

    If you are running the script on a PowerVR-based platform that has our latest driver, you should see a debug message indicating where it’s executing.

    For 3rd party platforms, a quick and easy solution is to look at the CPU usage history and see if there have been spikes when you’ve run your script.

    Regards,
    Alex.

  • Schini

    Thanks for your quick reply. I will take a look at it again.
    There is just one question left ;-) Can your SDK help me investigate the exact run time of certain scripts when the application is executed on a Samsung Galaxy Nexus i9250, which has a PowerVR SGX540 integrated?

  • Alexandru Voica

    Because compute workloads are handled as generic 3D processing tasks in PowerVR SGX, you should be able to see some of the information you’re looking for in PVRTune – if the script is running on the GPU.

    If you need any support, please feel free to use our forum:

    http://forum.imgtec.com/categories/powervr-graphics

    Regards,
    Alex.

  • Jimmy Ren

    You mentioned that it’s the PowerVR driver that decides whether the program runs on the GPU. Is there a way to know if such a driver exists on a device? I’m running some of the Filterscript programs you mentioned (e.g. convolve 3×3, both intrinsic and non-intrinsic) on a Lenovo K900 phone (with a PowerVR SGX544 GPU), and the program is not running on the GPU. Is there a way to tell why? Thanks!

  • Alexandru Voica

    Hi,

    Intel have a comprehensive blog article on Renderscript support for their Atom platforms running Android.

    http://software.intel.com/en-us/articles/how-to-use-renderscript-on-intel-based-devices

    To my knowledge, most of the Renderscript code at the moment runs on the CPU. If Lenovo/Intel have extended this capability to the GPU, then you should see workloads appearing in PVRTune as generic 3D tasks; another way to check is to look at CPU usage history.

    Regards,
    Alex.