Using the D cache

Technical Note	: SLE0005
Author	: Scott Evans
Created/Modified	: 11/11/97
Description	: Using the D cache

The D cache, also known as the data cache or scratch pad, is 1K of CPU memory located at 0x1f800000-0x1f8003ff. It is 5-6 times faster than main RAM. By faster I mean it takes fewer CPU cycles to access data in D cache than in main RAM.
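The location and size given above can be written down as constants. The following is only a minimal sketch: the names SCRATCH_BASE, SCRATCH_SIZE, SCRATCH_END and the helper in_scratch() are made up for illustration and are not part of any library.

```c
/* Scratch pad (D cache) location as given in this note. */
#define SCRATCH_BASE 0x1f800000UL
#define SCRATCH_SIZE 0x400UL                        /* 1K = 1024 bytes */
#define SCRATCH_END  (SCRATCH_BASE + SCRATCH_SIZE)  /* one past the last byte */

/* Hypothetical helper: returns 1 if the byte range [addr, addr+len)
   lies entirely inside the scratch pad, 0 otherwise. */
int in_scratch(unsigned long addr, unsigned long len)
{
    if (addr < SCRATCH_BASE || addr > SCRATCH_END)
        return 0;
    return len <= SCRATCH_END - addr;
}
```

A check like this can be handy in debug builds to confirm that a buffer you intend to keep in D cache really does fit inside the 1K region.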

The D cache can be used to improve the performance of a program in two ways. The first is to set the stack to D cache. If all your functions' local variables and parameters fit into 1K, you can set the stack to point to the top of D cache. The top of D cache is used because the stack grows downwards in memory.

When a function is called with 4 parameters or fewer, they are placed in registers. If more than 4 parameters are used, the first 4 are allocated to registers and the remaining parameters are put on the stack.
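The rule above can be expressed as a small calculation. This is a sketch only: params_on_stack() and REGISTER_PARAMS are hypothetical names, and it assumes every parameter fits in a single register, as the long, short and char parameters used in this note do.

```c
/* Number of parameters passed in registers, as described above. */
#define REGISTER_PARAMS 4

/* Hypothetical helper: how many of n_params parameters end up on the
   stack, assuming each one fits in a single register. */
int params_on_stack(int n_params)
{
    return (n_params > REGISTER_PARAMS) ? (n_params - REGISTER_PARAMS) : 0;
}
```

For a function with 8 parameters this gives 4, so half of its parameters live on the stack.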

A function's local variables (unless static) are also created on the stack when the function is executed. Since the stack is normally in main RAM, a function with a lot of parameters and local variables has to access main RAM every time one of them is referenced. If the stack is in D cache, the parameters and local variables can be accessed 5-6 times quicker.

Another advantage of using the D cache is that during DMA transfers, such as LoadImage(), the CPU has to compete with the DMA transfer for the bus when it tries to access main RAM. The only memory the CPU can fully access during DMA transfers is its internal registers and caches. So if your function's parameters and local variables are located in D cache, the CPU can still access them during DMA transfers.

Another way to use D cache when calling a function is to create a structure that contains your parameters and local variables. You can map this structure to D cache and then pass a pointer to this "work structure" to the function. You could also declare a local variable in the function to point to the work structure. Of course, you should use the 'register' keyword when declaring this variable.

The following function is a trivial example in which the above technique could be used. There is not much point in doing this for a function unless it is called many times, for example in the main loop of a program.

The only restriction on this method is that the structure must not exceed 1K in size.

int ExampleFunction(long a, long b, long c, long d, short e, short f, char g, char h)
{
	int sum,sum1,total;
	static int no_times_called=0;

	sum=a+b+c+d;
	sum1=(e+f)/(g-h);
	total=sum+sum1;

	no_times_called++;

	return(total);
}

As you can see, this function has 8 parameters and uses 3 local variables. Static local variables are not created on the stack, so they can still be declared in the function. You can put either all of the parameters and local variables in a structure or just some of them. For the first example we will put the local variables and 5 of the parameters in the work structure. This leaves 3 parameters, plus one free register for the pointer to the structure, so all 4 arguments are passed in registers.

typedef struct
{
	int sum,sum1,total;
	long d;
	short e,f;
	char g,h;
}WORK;

int ExampleFunction(long a, long b, long c, WORK *work)
{
	static int no_times_called=0;

	work->sum=a+b+c+work->d;
	work->sum1=(work->e+work->f)/(work->g-work->h);
	work->total=work->sum+work->sum1;

	no_times_called++;

	return(work->total);
}

You would then call the function as follows.

// Map the structure onto D cache

WORK *work=(WORK *)getScratchAddr(0);

// Initialise the parameters

work->d=5;
work->e=100;
work->f=25;
work->g=10;
work->h=5;

// Call the function

printf("Total=%d\n",ExampleFunction(10,20,90,work));

Doing it like this means the local variables and parameters can be accessed 5-6 times quicker than if they were in main RAM (on the stack).
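The work structure technique, together with the 1K size restriction mentioned earlier, can be tried out on an ordinary host compiler. The sketch below is an illustration under stated assumptions, not the real thing: fake_scratch is a made-up stand-in for the D cache (on the real hardware you would use getScratchAddr(0) instead), and SCRATCH_SIZE, work_fits_in_scratch and run_example() are hypothetical names.

```c
#define SCRATCH_SIZE 1024   /* size of the D cache in bytes */

typedef struct
{
    int sum, sum1, total;   /* former local variables */
    long d;                 /* former parameters */
    short e, f;
    char g, h;
} WORK;

/* Compile-time check: the array size becomes -1, and compilation
   fails, if WORK is too big for the 1K D cache. */
typedef char work_fits_in_scratch[(sizeof(WORK) <= SCRATCH_SIZE) ? 1 : -1];

/* ASSUMPTION: host-side stand-in for the D cache. The union keeps the
   buffer correctly aligned for a WORK structure. */
static union
{
    WORK work;
    char bytes[SCRATCH_SIZE];
} fake_scratch;

int ExampleFunction(long a, long b, long c, WORK *work)
{
    work->sum = a + b + c + work->d;
    work->sum1 = (work->e + work->f) / (work->g - work->h);
    work->total = work->sum + work->sum1;

    return work->total;
}

/* Hypothetical driver: fills in the work structure and calls the function. */
int run_example(void)
{
    WORK *work = &fake_scratch.work;

    work->d = 5;
    work->e = 100;
    work->f = 25;
    work->g = 10;
    work->h = 5;

    return ExampleFunction(10, 20, 90, work);
}
```

With the values used in this note the result is 150 (125 from a+b+c+d plus 25 from (e+f)/(g-h)). Note that g and h must differ, otherwise the division is by zero.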

The other way to do this is to declare a register variable in the function that points to the work structure.

typedef struct
{
	int i,j;
}WORK;

void ExampleFunction(int no_objs, int no_points,OBJECT *object)
{
	register WORK *work=(WORK *)getScratchAddr(0);

	for(work->i=0;work->i<no_objs;work->i++)
	{
		for(work->j=0;work->j<no_points;work->j++)
		{
			printf("Object %d Point %d : (%d,%d)\n",
				work->i,work->j,object->x,object->y );
		}
	}
}

NOTE : The getScratchAddr() macro is defined in the LIBPS.H header file. It takes an offset as a parameter.
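If you keep more than one work structure in D cache at the same time, each needs its own offset so they do not overlap. The sketch below shows one way to lay them out. It is heavily hedged: fake_scratch and fakeScratchAddr() are made-up host-side stand-ins, and the sketch ASSUMES the offset counts 32-bit words rather than bytes. Check the getScratchAddr() definition in your own header before relying on this. LOOP_WORK, POINT_WORK and WORDS_FOR are hypothetical names as well.

```c
#define SCRATCH_WORDS 256   /* 1K of D cache as 32-bit words */

/* ASSUMPTION: host-side stand-in for the D cache. */
static unsigned int fake_scratch[SCRATCH_WORDS];

/* Hypothetical macro mimicking getScratchAddr().
   ASSUMES the offset is counted in 32-bit words. */
#define fakeScratchAddr(offset) (&fake_scratch[(offset)])

typedef struct { int i, j; } LOOP_WORK;
typedef struct { int x, y, z; } POINT_WORK;

/* Number of whole words needed to hold a type. */
#define WORDS_FOR(type) \
    ((sizeof(type) + sizeof(unsigned int) - 1) / sizeof(unsigned int))

/* Place the second structure directly after the first. */
#define LOOP_OFFSET  0
#define POINT_OFFSET (LOOP_OFFSET + WORDS_FOR(LOOP_WORK))

int demo(void)
{
    LOOP_WORK  *loop  = (LOOP_WORK *)fakeScratchAddr(LOOP_OFFSET);
    POINT_WORK *point = (POINT_WORK *)fakeScratchAddr(POINT_OFFSET);

    loop->i = 1;
    loop->j = 2;
    point->x = 10;
    point->y = 20;
    point->z = 30;

    /* If the two regions overlapped, writing point would have
       clobbered loop and the sum would come out wrong. */
    return loop->i + loop->j + point->x + point->y + point->z;
}
```

Deriving each offset from the size of the previous structure means the layout stays correct if a structure grows later.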
