From: "Martin Keates"
Newsgroups: scee.yaroze.freetalk.english
Subject: How the texture cache works
Date: Tue, 27 Mar 2001 22:42:57 +0100
Organization: PlayStation Net Yaroze (SCEE)
Message-ID: <99r1f2$dkq1@www.netyaroze-europe.com>

Hi all,

I've been doing some experimenting to try and figure out the inner workings of the texture cache (probably other people have done so before me - is there any documentation on the website about it at all?). I'm not saying that this is right, just what I've inferred from my timings and the behaviour I've encountered.

Anyway: the t-cache size is 64x64 pixels for 4-bit, 64x32 for 8-bit and 32x32 for 16-bit. There ends the documentation that I could find...

So how does it work? As far as I can tell, each row and column of the cache keeps track of the texture page offset last used, but everything is addressed MOD the cache size: a pixel at (x,y) on the texture page lands at (x MOD cache width, y MOD cache height) in the cache.

So, suppose we are using 8-bit textures and have a cache size of 64x32 pixels. A 16x16 texture at (0,0) on a texture page would map to (0,0) on the cache, but so would a texture at (0,32), (0,64), (64,32), (64,64) etc., so changing between textures at those points will cause a lot of cache misses. On the other hand, a texture at say (16,108) would map to (16,12) in the cache, which doesn't conflict with (0,0) to (15,15), and so we could alternate the two textures with optimal performance. So, for 16x16 textures we can have 8 textures anywhere in the texture page as long as they MOD to distinct cache addresses.

And it's cleverer than that: textures can wrap around to the other side of the cache no problem (e.g.
with a 64x32 texture at (10,10) the texture would look pretty mangled mapped to (0,0) but it doesn't matter), and only the offending pixels are changed during conflicts (e.g. with the 8-bit cache, two 16x16 textures at (0,0) and (0,18) would only have 2 lines conflicting, because rows 32 and 33 of the second texture wrap to cache rows 0 and 1, and would have much less of a performance hit than two completely conflicting textures).

All jolly nice, but when do you get cache misses? Well, apart from overlapping textures as described above, changing the texture page invalidates the entire texture cache, and changing between texture depths causes misses too.

Why worry about this then? Because it can triple your rendering time if you get it wrong! Actual rendering times are very sprite specific (probably dependent on the number of pixels rendered), but for drawing 1000 sprites using two textures I get (times in hsyncs):

  size x depth    no misses    all misses
  32x32x16           438          1455
  32x32x8            366           906
  16x16x16           150           433
  16x16x8            114           263

4-bit renders at the same speed as 16-bit with no cache misses, but doesn't suffer as badly when there are misses (it's much easier to keep all your textures in the cache using 4-bits anyway).

I did some tests using zoom/rotate as well:

  size x depth    mag     no misses    all misses
  32x32x16        *0.5       183           739
  32x32x8         *0.5       177           470
  16x16x16        *2         542           840
  16x16x8         *2         525           650

The set up time for these tests was 585, so generally you're going to be waiting for the CPU rather than the GPU if you use a lot of zoomed sprites.

Note that the performance hit drops off pretty fast - if you can string 5 or 10 cache hits in a row together you'll only be about 20-30% off optimum rather than 200%.

So... is this right? Was this all obvious? Did everyone just know it anyway? Is there any documentation about this on the web already? Or a post in one of the newsgroups?

cheers,
Martin.
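
The MOD mapping described in the post can be checked with a small sketch. This is only the model inferred above, not documented hardware behaviour: the cache sizes are the ones quoted in the post, the names CACHE_SIZE, cache_cells and conflicting_rows are made up for illustration, and the sketch covers overlap only, not the misses caused by changing texture page or depth.

```python
# Cache sizes in pixels (width, height) as quoted in the post, keyed by bit depth.
# Assumption: these figures and the MOD behaviour are the post's inference only.
CACHE_SIZE = {4: (64, 64), 8: (64, 32), 16: (32, 32)}


def cache_cells(x, y, w, h, depth):
    """Set of cache positions touched by a w*h texture at page offset (x, y)."""
    cw, ch = CACHE_SIZE[depth]
    # Every texture pixel wraps around MOD the cache width and height.
    return {((x + dx) % cw, (y + dy) % ch) for dx in range(w) for dy in range(h)}


def conflicting_rows(tex_a, tex_b, depth):
    """Sorted cache rows on which two textures, given as (x, y, w, h), overlap."""
    shared = cache_cells(*tex_a, depth) & cache_cells(*tex_b, depth)
    return sorted({row for _, row in shared})


# The three 8-bit cases from the post, all against a 16x16 texture at (0,0):
print(conflicting_rows((0, 0, 16, 16), (0, 32, 16, 16), 8))    # every row conflicts
print(conflicting_rows((0, 0, 16, 16), (16, 108, 16, 16), 8))  # no conflict
print(conflicting_rows((0, 0, 16, 16), (0, 18, 16, 16), 8))    # two rows conflict
```

Under this model, a set of textures can be alternated without overlap misses whenever conflicting_rows comes back empty for every pair, which is one way to sanity-check a texture page layout before timing it on the hardware.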