Understanding PowerVR Series5XT: PowerVR’s market-leading fillrate efficiency

It’s easy to talk about architectural benefits and features, but ultimately the proof is in measurable efficiency benefits. It’s for this reason that our design teams use hundreds of focus tests to ensure that the hardware can sustain peak throughput in a wide range of usage scenarios.

[Image: The Chase (Unity), PowerVR Series5XT]

Similar tests are also used to ensure optimal power efficiency. From an architectural point of view, our aim is always to improve: to execute the same workload (e.g. render a certain frame) within the same power budget (worst case) or, better, within a lower power budget (design target) when compared to our previous GPU generation.

[Image: The Chase (Unity), PowerVR Series5XT]

To illustrate this, the graph below shows fillrate efficiency calculated from independently measured fillrate data from Kishonti’s GFXBench suite.

This measured fillrate is then compared with the theoretical fillrate claimed in marketing material. For example, if a product is measured at 500 MPixels/second in the benchmark, yet the marketing material indicates a 250MHz core with 4 pixels per clock, then the design is 50% efficient: 500 MPixels/second (measured) versus 1000 MPixels/second (expected).
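
The arithmetic behind that example can be written down as a short Python sketch. The function names here are purely illustrative and are not part of GFXBench or any PowerVR tool; the sketch simply assumes that the theoretical fillrate is the core clock multiplied by the claimed pixels per clock.

```python
def theoretical_fillrate_mpix(clock_mhz: float, pixels_per_clock: float) -> float:
    """Claimed peak fillrate in MPixels/second: core clock (MHz) x pixels per clock."""
    return clock_mhz * pixels_per_clock


def fillrate_efficiency(measured_mpix: float, clock_mhz: float,
                        pixels_per_clock: float) -> float:
    """Measured fillrate as a fraction of the claimed peak (1.0 means 100%)."""
    peak = theoretical_fillrate_mpix(clock_mhz, pixels_per_clock)
    if peak <= 0:
        raise ValueError("clock and pixels per clock must be positive")
    return measured_mpix / peak


# The example from the text: 500 MPixels/s measured, 250MHz core, 4 pixels per clock.
print(f"{fillrate_efficiency(500, 250, 4):.0%}")  # prints 50%
```

Feeding in a measured benchmark figure alongside the advertised clock and pixels-per-clock values gives the kind of efficiency percentage discussed here.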

[Graph: Fillrate efficiency of PowerVR Series5XT and Series6 vs competing GPUs]

In this graph, the purple bars represent PowerVR products, and the other colours represent a range of competing products. As can be seen, the majority of PowerVR GPUs (ranging from SGX Series5 and Series5XT to Series6 products) deliver real-world sustained rates above 80%; only one product, with a very high clock frequency, sits just below 70% efficiency. The bulk of competitor products, by contrast, sit at efficiency rates below 60%, and typically even below 50%.

What this means is that PowerVR products put down logic to deliver a certain fillrate, and typically 80% of that fillrate can be sustained and delivered: a good return on the silicon area and power invested. For competing products, silicon area and power are being invested but the return is only 50%. It’s a bit like paying for 200g of chocolate but only getting 100g in your package… not something you’d be very happy about.

The above graph nicely sums up our architectural focus on efficiency, and how this results in high performance and high power efficiency, and thus the best overall results in practical real-world applications.

In my next blog post, I will discuss how we write and optimize software for PowerVR GPUs, including drivers and software stacks.

If you have any questions or feedback about Imagination’s graphics IP, please use the comments box below. To keep up to date with the latest developments on PowerVR, follow us on Twitter (@GPUCompute, @PowerVRInsider and @ImaginationPR) and subscribe to our blog feed.

‘Understanding PowerVR’ is an on-going, multi-part series of blog posts from Kristof Beets, Imagination’s Senior Business Development Manager for PowerVR. These articles not only focus on the features that make PowerVR GPUs great, but also provide a detailed look at graphics hardware architectures and software ecosystems in mobile markets.


  • Sandy

    Might that graph just show that ImgTec uses more realistic fillrate values in its marketing materials than its competitors? Not necessarily more efficient, just closer to the claimed throughput?

  • http://withimagination.imgtec.com/index.php/author/alexvoica Alexandru Voica

    Hi,

    As Kristof has explained, the chart shows the difference between theoretical fillrate (companies stating a GPU core can output N pixels / clock) and measured performance in real world scenarios and applications.

    This difference can either be caused by inefficient designs (the GPU cannot actually process that many pixels due to hardware limitations, driver issues, etc.) or unrealistic claims.

    Best regards,
    Alex.

  • Luca De Marco

    Imagination is certainly a leader in mobile GPUs. But with all respect, it is a bit weird to see graphs of the Series 6 all the time when there is no mobile device to date that has the Series 6.

    Please don't get me wrong, the Series 5XT is great, and I do believe the 554MP in the iPad 4 is still one of the best you can get at the moment. But all the new devices coming out (Samsung, etc.) have the Series 5XT. No trace of Series 6 at all.

    So the question is: when will we have the pleasure of commenting on these fantastic graphs with a smartphone powered by Series 6?
    Regards
    Luca

  • http://withimagination.imgtec.com/index.php/author/alexvoica Alexandru Voica

    Hi Luca,

    We expect to see PowerVR Series6 GPUs shipping in mobile devices (tablets, smartphones, etc.) towards the end of this year.

    Several of our partners are currently gearing up to release their platforms, so you will see at least two major announcements in the near future.

    Best regards,
    Alex.

  • Luca De Marco

    Thanks Alex for the info. I look forward to seeing the new devices then.

    Regards

    Luca

  • Injik Lee

    In the case of G64x0, there are four USCs. Is this the reason it produces 4 pixels per clock?
    Can you explain how it produces 4 pixels per clock?

  • http://rys.pixeltards.com/ Rys Sommefeldt

    Hi Injik,

    In Series 6 Rogues like G64xx, the pixel back-end block is capable of outputting 4 pixels per clock. That’s independent of the rest of the architecture, which lets the USCs scale independently.

    In terms of textured pixels per clock, a G64xx is capable of four texels per clock per USC cluster pair, so 8 texels per clock total for those products.

    Hope that helps you understand the architecture a bit better.

  • Sean Lumly

    A sure-fire way to improve fill rate is to require less of it. One way to do this is high-quality spatial upscaling. As mobile resolutions get higher and higher, the burden on computation and memory bandwidth increases as well. With fixed-function hardware, upscaling 1080p to 4K could be a relatively low-power operation for pixels on the way to the display, with far superior quality compared to simple bilinear interpolation.