Path: chuka.playstation.co.uk!scea!greg_labrec@interactive.sony.com
From: Elliott Lee
Newsgroups: scee.yaroze.freetalk.english
Subject: Re: Optimisation
Date: Fri, 27 Feb 1998 16:15:41 -0800
Organization: .
Lines: 93
Message-ID: <34F7572D.EF10D927@netmagic.net>
References: <34F1D481.5FFF@mdx.ac.uk> <34F74BF0.54FDD9A@ndirect.co.uk>
Reply-To: tenchi@netmagic.net
NNTP-Posting-Host: dhcp-e-39-245.cisco.com
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-Mailer: Mozilla 4.03 [en] (Win95; U)

Alex Herbert wrote:
>
> Hi all.
>
> There are some very important points regarding optimisation which no one
> seems to mention.  (Maybe it's just too obvious.)
>
> Remember that the PSX processing is parallel.  Your code and drawing are
> happening at the same time, and both have to complete before frame
> switching.  (Yes, yes, I know you already know this but it's relevant.)  It
> is not usual that both the code and drawing take the same time to complete,
> hence the need for DrawSync().  So why waste time optimising the part which
> finishes first?  It gains you nothing.

Yup.  The R3000 CPU is a pokey 33MHz (blech!), but the graphics and
sound processing are offloaded to dedicated processing units (each
with their own RAM).  The reason I want my code as fast as possible is
so I can do other things while waiting for drawing/sound processing to
complete, e.g. image decompression, sound event handling,
restructuring trees.

The big bottlenecks I'm dealing with aren't in the GPU.  You may just
be tracking a few hundred events on-screen like smoke, gunfire,
floating caption text, and enemy positions.  You'd also want to get as
much speed as possible out of your code if you have complicated
collision detection schemes or need to save a few cycles for complex
math.

Case in point: I need all the speed I can get to do some cheezy 2D
light-sourcing.  On the "death" screen in my current project, I have a
30x30 grid of tiles for the playfield.
The tiles farther from the epicenter of the bomb explosion fade to
black.  If I didn't do some optimisation of the code itself, it would
take forever to calculate the distance of every tile to the epicenter.
(I originally tried it with a floating-point hypotenuse and I got
something like 8 frames per second.)  Using some integer math I got it
back up to 60 fps.  I cheated by using the X/Y delta instead.  If I
wanted a near-perfect circle, I would have put the delta distances
into a lookup table.  ^_^

Any speed improvement of the Basic Case makes a huge difference when
you're slapping down 900+ sprites in the General Case.

> I call VSync(1) before DrawSync to determine the code execution time, and
> then VSync as normal before switching buffers.  I then keep peak values for
> these times (which are reset every second or so) as it is peaks that lead to
> slowdown and juddering.

That's a good idea.

> If my code always completes before the drawing, then I know my code is as
> optimised as it needs to be.  I could make it quicker, but I'd still have to
> wait for drawing to complete, so what's the point?  In general, I'd suggest
> that this is usually the case, especially for 2D work.
>
> If the code is overrunning the drawing, then it's time to optimise the code.
> (Unfortunately the above timing method will not tell you by how much the
> code is overrunning - DrawSync will return immediately and the two timings
> will be close if not identical.)
>
> If the drawing is taking too long, then that's what needs to be optimised.
> Reduce: use of semi-transparency, use of rotated/scaled sprites,
> bit-resolution/size of textures, etc.

*nod*  I agree with that too.  Also, instead of timing your frame
flipping to 1/60 or 1/50 of a second, you could always drop down to
30/25 fps.  You could set a callback on the vsync to set a flag
telling you when every other vertical retrace has occurred.  I think
that the graphics will still be fluid enough.

> Ah well.
> I've said my bit, and I hope this is of use to someone.
>
> Alex.

- e!
  tenchi@netmagic.net
  http://www.netmagic.net/~tenchi/yaroze/
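
For anyone who wants to try the tile-fade trick mentioned above, here is a rough sketch in plain C. It is not the code from the project. It assumes "X/Y delta" means adding the absolute X and Y differences (only one possible reading), and it shows the lookup-table alternative as a table of integer square roots built once at startup. All names (dist_delta, dist_lut, fade_level) and the MAX_LEVEL value of 128 are made up for illustration, and nothing here touches the PlayStation libraries.

```c
#include <assert.h>
#include <stdlib.h>   /* abs */

#define GRID      30    /* 30x30 playfield, as in the post */
#define MAX_LEVEL 128   /* hypothetical "full brightness" value */

/* Integer square root by linear search (rounds down).
   Slow, but it only runs while the table is being built. */
static int isqrt_small(int n)
{
    int r = 0;
    while ((r + 1) * (r + 1) <= n)
        r++;
    return r;
}

/* Cheap metric (assumed reading of "X/Y delta"): |dx| + |dy|.
   No multiply, no square root, no floating point. */
static int dist_delta(int tx, int ty, int cx, int cy)
{
    return abs(tx - cx) + abs(ty - cy);
}

/* Lookup table of rounded-down Euclidean distances, indexed by |dy|, |dx|.
   The largest entry is isqrt(29*29 + 29*29) = 41, so a byte is enough. */
static unsigned char dist_table[GRID][GRID];

static void build_dist_table(void)
{
    int dx, dy;
    for (dy = 0; dy < GRID; dy++)
        for (dx = 0; dx < GRID; dx++)
            dist_table[dy][dx] =
                (unsigned char)isqrt_small(dx * dx + dy * dy);
}

/* Near-circular metric: one table read per tile.
   Both points must lie inside the GRID x GRID playfield. */
static int dist_lut(int tx, int ty, int cx, int cy)
{
    return dist_table[abs(ty - cy)][abs(tx - cx)];
}

/* Map a distance to a brightness: MAX_LEVEL at the epicenter,
   fading linearly to 0 (black) at 'radius' tiles and beyond. */
static int fade_level(int dist, int radius)
{
    assert(radius > 0);
    if (dist >= radius)
        return 0;
    return MAX_LEVEL - (dist * MAX_LEVEL) / radius;
}
```

To use it, call build_dist_table() once, then for each of the 900 tiles feed dist_delta() or dist_lut() into fade_level() and use the result as the tile's brightness. The delta version gives a diamond-shaped falloff; the table version gives something much closer to a circle for the cost of a 900-byte table.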