How does a Mobile GPU Work?
Comic: Ikaridon Yu

Although the mobile and PC GPUs do the same job, directly porting from the PC is not a good idea!

[...] directly now. [...] powerful [...] that the direct porting is not a problem.

Dr. Arm, the mobile GPU expert. I help developers by answering all sorts of questions about Mali GPUs.

Although the mobile GPU uses the same API as the PC GPU, the architectures of the two GPUs are quite different.

Hmm... but I still can't imagine the difference, even though you said so.

Let me take you inside a GPU.

Wow! That is a huge power station.

Here is the inside of a PC GPU. [...] the PC GPU doesn't need to worry about power consumption.

What is that big warehouse over there?

[...] video memory. All data that needs to be rendered is stored inside it.

There are many conveyors beside the warehouse. Where are they going?

They are connected to the GPU core, which is the factory over there.

There are so many porters carrying data over the conveyors!

Porters carry data to the GPU core whenever the GPU needs to render something. The PC GPU can run so many conveyors because it has far more energy available. That's why a PC GPU can transfer huge amounts of data at high speeds.

Next, let's go to see the inside of a GPU core.

There are two masters inside a GPU core: the Master Vertex shader and the Master Fragment shader. You can see there is a conveyor between them.

This one is the Master Vertex shader. All triangles that need to be rendered must be processed by him first.

This one is the Master Fragment shader. The Master Vertex shader sends transformed triangles to her. Then she will process the triangles into many fragments.

Looks like an artist...
Triangles are processed one by one, and a lot of fragments need to be moved back to the video memory after processing one triangle. That's why the PC GPU needs so many conveyors.

So labor intensive...

After that, the porters carry those fragments back to the video memory so the final image can be displayed on the screen.

Now, let's go to the inside of a mobile GPU for a comparison.

Did you see any difference from the PC GPU?

[...] station is way too small!

That's right. The power station of a PC can provide 200 ~ 300 Watts, but the power station of a mobile device can only provide 2 ~ 3 Watts.

[Labels: DESKTOP; MOBILE, about half a low-energy LED bulb]

And there are only two conveyors outside of the [...]

That's because conveyors consume more energy, so a mobile GPU can't afford as many conveyors as a PC.

So if you port your PC game directly to mobile, the conveyors will stall almost straight away and the battery will [...]

So how can a mobile GPU fix this?

Let's take a close look at the mobile GPU core. It has the vertex shader to process the triangles, but...

The transformed triangles will not pass directly to the fragment shader. Instead, the triangles will be transferred to the video memory.

What?! So what should the fragment shader do?

The Fragment shader will get the transformed triangles from the video memory then.

But would it increase the size of the data transfer?

Exactly! Did you see a small warehouse beside the fragment shader?

Yes, I didn't see such a thing at the PC GPU core.

On a mobile GPU, the screen is split into many tiles. Each tile contains 16x16 ~ 64x64 fragments.

The fragment shader processes one tile at a time and processes all triangles in the tile at once. So the frame buffer data just needs to be transferred to the tile memory at the beginning of the tile processing.

Then the fragment shader can directly access the data in the tile memory and process all the triangles [...]

I got it now. Although the data transfer for vertex is increased, the tile design saves more of the data transfer for fragment.

Correct. And there is another advantage to using the tile for rendering...

If the GPU finds any triangle that is occluded by other triangles, those occluded triangles will be discarded without rendering.

You are invisible, don't waste my time.

The whole tile will be transferred back to the video memory when all triangles in this tile are processed.

Just need to transfer one tile, easy job!

My size is fixed.

[...] fixed, the size of the data transfer is also fixed. That's why the mobile GPU doesn't need so much [...]

Cool, so using the tile can save both memory bandwidth and power consumption. What a smart design.

But on a PC GPU, the data transfer size is not fixed. It varies depending on the number of triangles that are rendered. The occluded object also needs to be rendered, what a waste!

[...] consuming huge amounts [...]

Why is the fps so low when there are not many objects on screen?

So what should I do to take advantage of this architecture when porting my PC game?

But we have learned enough [...] we've learned so far, and I will show you how to do that in the next episode.

[Diagram labels: GPU core, Vertex shader, Fragment shader, Tile memory, Video memory]

Go to the Arm developer site for more detailed information:
https://developer.arm.com/solutions/graphics-and-gaming/developer-guides/learn-the-basics/tile-based-rendering
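The tile idea in the comic can be made concrete with a toy model. The Python sketch below is only an illustration under stated assumptions, not how a Mali GPU is actually implemented: triangles are approximated by axis-aligned rectangles, the colour buffer is assumed to be 4 bytes per pixel, the "PC style" path is assumed to write every covered fragment to video memory, and the "tiled" path is assumed to write each tile back exactly once. All function names (`bin_rects`, `immediate_write_bytes`, `tiled_write_bytes`) are hypothetical.

```python
from collections import defaultdict

TILE = 16            # tile edge in pixels; the comic gives 16x16 to 64x64
BYTES_PER_PIXEL = 4  # assumption: RGBA8 colour buffer


def bin_rects(rects, tile=TILE):
    """Map each tile (tx, ty) to the indices of the rectangles overlapping it.

    Rectangles stand in for triangles and are (x0, y0, x1, y1),
    with x1 and y1 exclusive.
    """
    bins = defaultdict(list)
    for i, (x0, y0, x1, y1) in enumerate(rects):
        for ty in range(y0 // tile, (y1 - 1) // tile + 1):
            for tx in range(x0 // tile, (x1 - 1) // tile + 1):
                bins[(tx, ty)].append(i)
    return bins


def immediate_write_bytes(rects):
    """Toy 'PC style' cost: every covered fragment is written to video memory,
    so the traffic grows with the number and size of the primitives."""
    covered = sum((x1 - x0) * (y1 - y0) for x0, y0, x1, y1 in rects)
    return covered * BYTES_PER_PIXEL


def tiled_write_bytes(width, height, tile=TILE):
    """Toy 'tiled' cost: each tile is written back once after all of its
    primitives are processed, so the traffic depends only on the screen size."""
    tiles_x = -(-width // tile)   # ceiling division
    tiles_y = -(-height // tile)
    return tiles_x * tiles_y * tile * tile * BYTES_PER_PIXEL


if __name__ == "__main__":
    # Three full-screen layers drawn on top of each other on a 64x64 screen.
    layers = [(0, 0, 64, 64)] * 3
    print("tiles touched:", len(bin_rects(layers)))
    print("immediate bytes:", immediate_write_bytes(layers))
    print("tiled bytes:", tiled_write_bytes(64, 64))
```

In this model, adding more overlapping layers raises the immediate figure while the tiled figure stays the same, which mirrors the comic's point that a fixed tile size gives a fixed data transfer. The model deliberately ignores the extra vertex-side traffic the comic mentions, as well as caches and depth testing.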
