Technical Note    : SLE0011
Author            : Scotte
Created/Modified  : 19/03/2000
Description       : Code profiler

This technical note describes how a simple profiler can be used to help visualise how well (or badly) your code is doing. Code is provided for the profiling tool, and an example program that uses the profiler is also available with source.

The theory behind the profiler is really simple. Root counter 1 is set by default to count scanlines, which gives us an easy way to time our code. All we need to do is record the value of the root counter before the start of the code we want to profile and again at the end of it. The end time minus the start time gives approximately how long the code took to execute, in scanlines. The profiler then scales this value and uses it to draw a bar on the screen. You can then see how long a particular piece of code is taking, which is very useful when trying to decide which functions need to be optimised.

The time taken by the GPU to draw the primitives is a little harder to determine. Normally you can do this very easily: you get the GPU to generate an interrupt when it has finished drawing, and the interrupt handler calls a function that records the value of root counter 1. The Yaroze libraries are very limiting, though, and this cannot be done because the function DrawSyncCallback(), which is used to set up that interrupt, is not available. At the moment I am not sure how to get around this, but I am working on it. For the moment the drawing time is set to the maximum value.

So enough theory, let's get down to business. The profiler is contained in its own source file, profile.c. To add the profiler to your projects you will need to add profile.c to your build and include profile.h in any files that reference the profile functions. You will also need to add the files gtypes.h (type definitions) and fp.h (fixed point macros) to your project.
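
The timing and scaling steps described above can be sketched in a few lines of C. This is only an illustration and not the code from profile.c: the function names elapsed_scanlines() and bar_length() are made up for this note, the counter is assumed to be 16 bits wide and to wrap at most once between two samples, and plain integer arithmetic is used where the real profiler uses the fixed point macros from fp.h. It also assumes that a bar covering the maximum number of frames should fill the whole screen width.

```c
#include <stdint.h>

/* Hypothetical helper (not part of profile.c): scanlines elapsed between
   two samples of the scanline counter. Assumes a 16-bit counter that has
   wrapped at most once between the two samples. */
static uint16_t elapsed_scanlines(uint16_t start, uint16_t end)
{
    /* Truncating the difference to 16 bits handles counter wrap-around. */
    return (uint16_t)(end - start);
}

/* Hypothetical helper (not part of profile.c): convert an elapsed time in
   scanlines to a bar length in pixels. A time of max_frames whole frames
   maps to the full screen width; anything longer is clamped. */
static int bar_length(uint16_t elapsed, int screen_width,
                      int lines_per_frame, int max_frames)
{
    long full_scale = (long)lines_per_frame * max_frames;
    long length = ((long)elapsed * screen_width) / full_scale;

    if (length > screen_width)
        length = screen_width;
    return (int)length;
}
```

For example, with a start sample of 65500 and an end sample of 120 the counter has wrapped, and elapsed_scanlines() returns 156. Assuming 312 scanlines per frame, a 320 pixel wide display and a maximum of 3 frames, bar_length() turns that into a bar 53 pixels long.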

Before the profiler can be used it needs to be initialised once at the start of the program. Calling PROFILE_Initialise() will initialise the profiler. It takes three parameters: the screen width and height, which are used to scale the profile bars so they fit the screen, and the maximum number of frames to display.

Example

To initialise the profiler for a 320x256 display and set the maximum profile time to 3 frames:

    PROFILE_Initialise(320,256,3);

The initialise stage sets the default position of the profile bars and calculates the number of scanlines in one frame. It also sets a scale value used to scale the profile bars so they fit the current screen width.

Once the profiler has been initialised you must call PROFILE_Start() at the beginning of each frame. This is usually just after the call to VSync().

Example

    while(1)
    {
        VSync(0);
        PROFILE_Start();    /* mark the start of the new frame */
    }

This marks the start of a frame and sets an internal variable, start_count, to the current value of root counter 1. We do this so that all our times are based on this reference count and we do not need to reset the root counter every frame.

The function PROFILE_Read() is used to record the time taken for a piece of code to execute. It takes three parameters, which are the red, green and blue components of the colour of the profile bar. Depending on the number of readings taken, the CPU profile bar is split into sections, each with its own colour. This means you can time lots of different functions and assign a different colour to each.

Example

    while(1)
    {
        PROFILE_Read(0x0,0x0,0x0);      /* black: code since PROFILE_Start() */
        TestFunction1();
        PROFILE_Read(0x80,0x0,0x0);     /* red: TestFunction1() */
        TestFunction2();
        PROFILE_Read(0x0,0x80,0x0);     /* green: TestFunction2() */
        VSync(0);
        PROFILE_Start();
    }

So in the above example the red section of the CPU profile bar is the time taken to execute TestFunction1() and the green section is the time taken to execute TestFunction2(). The black section represents the time taken by the code between PROFILE_Start() and the first PROFILE_Read().

There is a limit to the maximum number of readings that can be taken in one frame.
It is set to 128 by default but can be changed by setting the macro MAX_READINGS in profile.h.

The last thing to do is draw the profile bars. The bars are drawn using the GsBOXF primitive. The function PROFILE_Draw() adds the profile bars to the ordering table, which is passed in as a parameter.

Example

    while(1)
    {
        PROFILE_Read(0x0,0x0,0x0);
        TestFunction1();
        PROFILE_Read(0x80,0x0,0x0);
        TestFunction2();
        PROFILE_Read(0x0,0x80,0x0);
        while(DrawSync(1));     /* wait for the GPU to finish drawing */
        PROFILE_Draw(ot);       /* add the profile bars to the ordering table */
        VSync(0);
        PROFILE_Start();
    }

Right then, that is all there is to it. The complete source code to the profiler and the example program that demonstrates the profiler in action are provided with this note.

If you have any problems let me know. Likewise, if you have any suggestions or improvements, or know of a better way to do this, then let me know.

sevans@acclaimstudios.co.uk