Path: chuka.playstation.co.uk!news
From: Chris Chadwick
Newsgroups: scee.yaroze.programming.libraries
Subject: Re: D-Cache
Date: Tue, 21 Jul 1998 00:27:23 -0700
Organization: PlayStation Net Yaroze (SCEE)
Lines: 69
Message-ID: <35B442DB.3AF0@dial.pipex.com>
References: <359E82E9.1134@dial.pipex.com> <01bdafbf$5c177f20$f2e832a2@gbain.wav.scee.sony.co.uk> <35AF051F.708C@dial.pipex.com> <35AF1307.C1D4198B@scee.sony.co.uk> <35B33151.6DE1D612@easynet.co.uk> <35B34475.D898F644@scee.sony.co.uk>
NNTP-Posting-Host: userm954.uk.uudial.com
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-Mailer: Mozilla 2.02 (Win95; I; 16bit)

James Russell wrote:
>
> Well, for starters, as a general rule I wouldn't call any Yaroze library functions while your stack
> is on the D-Cache, which probably rules out a few functions that you want to speed up. Many of the
> library functions don't change the stack to the D-Cache, but some do to get extra speed. If your
> program is running with a D-Cache stack and you call a library function which resets the stack to
> the D-Cache too, your program will crash and burn because the new stack will overwrite the old one.

Ah! Well, that explains it. I simply wasn't aware that some library
functions actually set up and use the Dcache as a stack. I was completely
under the impression (mainly from the manuals) that the 1k scratchpad was
available to the programmer at ALL times.

> GsSortObject4 doesn't reset the stack (to my knowledge), but takes as a parameter a 'scratch' area
> to use for its intermediate workspace. If you've followed the sample code, you'll see that they use
> getScratchAddr(0) for this scratch area, which is a macro that points to the start of the D-Cache.
>
> To be honest, I can't think of any obvious examples where using the D-Cache _as_a_stack_ would bring
> you a huge speed increase.
> But here's 3 reasons:
>
> 1) If you're writing a function that uses a lot of local variables (more than the number of
> registers available), then those variables will be allocated on the stack (and hence on the
> D-Cache), and therefore they'll go a bit faster.
> 2) If you are doing some major processing on a local array which is less than 1K, then having the
> stack on D-Cache will (generally) increase the speed of that function.
> 3) If you are doing a tree traversal (depth/breadth first, that sort of thing) which involves a lot
> of recursive function calls, then having the stack on D-Cache will be faster. The only proviso is to
> make sure that there aren't too many local variables and/or the tree is not too deep, or you'll
> overflow the D-Cache!
>
> The D-Cache isn't a true cache in the usual sense of the word. A normal cache will _transparently_
> store the most recently used lines of RAM to increase speed. The D-Cache is more like a really fast
> area of memory, but it's only 1K long. Thus it's up to the programmer to explicitly load and store
> parts of this memory, which is why most people set up their stack on it, because it gives an instant
> speed increase to local variable access.
>
> If you want to process a global/static array, it's going to be stored on the heap and so you'll have
> to transfer it to D-Cache before you start, and transfer it back after you finish. This transfer
> overhead is only worth it if you're going to be accessing each element of the array more than twice.
> This is certainly the case if you're doing some image processing (like the flame/water effects).
>
> The first heuristic of optimisation is to optimise the biggest timewaster. Back in the days when I
> was writing Unix database code, I managed to speed up a debugging function that was used twice in
> every function by a factor of 8.
> But since 90% of the time was spent preparing and parsing the SQL,
> the speed increase from the new function hardly made a dent in the performance. The lesson there is
> that you should concentrate on optimising the component which takes the longest time to complete.
>
> If you want to time various parts of your code, use the VSync(-1) call or the Root counters. Run
> various important pieces of code in a loop a million times and see how many VSyncs each part takes.
> That will give you some idea of the proportion of time that code is taking.
>
> Cheers,
>
> James
>
> --
> == James_Russell@scee.sony.co.uk +44 (171) 447-1626
> == Developer Support Engineer - Sony Computer Entertainment Europe
>
> "Weaseling out of things is what separates us from the animals!...
> Except the weasel." -- Homer

Thanks very much for all the info, James! ;)

Well, I think I'll have to ditch the idea about having a general purpose
DCache stack and try and find a better use for it...

Cheers,

-Chris
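
For anyone following the thread, here is a rough sketch of the "copy in, process, copy back" use of the scratchpad that James describes for global/static arrays. It is portable C rather than real Yaroze code: the `scratch` array is only a stand-in for the 1K D-Cache (on the hardware you would presumably point at getScratchAddr(0) instead), and the names `SCRATCH_BYTES`, `blur_pass` and `blur_via_scratch` are made up for illustration. The blur is just a placeholder for the sort of image processing mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the 1K D-Cache. This is ordinary RAM so the
   sketch compiles anywhere; on a Yaroze the buffer would instead be the
   area that getScratchAddr(0) points to. */
#define SCRATCH_BYTES 1024
static unsigned char scratch[SCRATCH_BYTES];

/* One 3-tap smoothing pass, done in place. Each interior byte becomes the
   average of its ORIGINAL left neighbour, itself and its right neighbour.
   The two end bytes are left unchanged. */
static void blur_pass(unsigned char *buf, size_t n)
{
    unsigned char prev;
    size_t i;

    if (n < 3)
        return;

    prev = buf[0];
    for (i = 1; i + 1 < n; i++) {
        unsigned char cur = buf[i];   /* remember the unblurred value */
        buf[i] = (unsigned char)((prev + cur + buf[i + 1]) / 3);
        prev = cur;
    }
}

/* Copy a small array into the scratch buffer, run several passes there,
   then copy the result back to where it came from. */
void blur_via_scratch(unsigned char *data, size_t n, int passes)
{
    int p;

    assert(n <= SCRATCH_BYTES);       /* must fit in the 1K area */

    memcpy(scratch, data, n);         /* transfer in  */
    for (p = 0; p < passes; p++)
        blur_pass(scratch, n);        /* all the heavy access happens here */
    memcpy(data, scratch, n);         /* transfer out */
}
```

Calling blur_via_scratch(row, 5, 1) on the bytes {0, 0, 90, 0, 0} leaves {0, 30, 30, 30, 0}. Going by the rule of thumb quoted above, the two copies only pay for themselves when each element is touched more than twice, so on real hardware this would want several passes per transfer. It would also need to run while no library call is using the D-Cache for its own stack or workspace.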