
General info

Overview
The N64 is a little different from modern graphics systems. It's very customizable, which you should notice immediately when you see there's no dedicated graphics CPU
or any dedicated graphics RAM. Its unified architecture means you can use the 4/8MB of RAM for pretty much whatever you want. You can run in nearly any video mode, and it's up
to you where you want to put your buffers and how you want to use them.

Typical process
In almost all games, here's how graphics are drawn:
- At the start of execution, the CPU transfers all the needed graphics data from cartridge ROM, into RAM, using DMA.
- The CPU configures the VI (video interface) with parameters like the screen resolution, gamma correction modes, dithering, etc.
- A displaylist is started, which contains all the information needed by the RCP to render the frame.
- At the start of the list are the setup instructions, where the current framebuffer is cleared along with the Z-buffer.
- The drawing mode is set (opaque, transparent, Z-buffered, non-Z-buffered, etc.).
- A matrix describing the 3D viewspace is loaded into the RSP.
- For each object, the object's transform/rotate matrix is multiplied with the viewing matrix, and the object's displaylist is added to the master displaylist.
- When all objects are added to the master displaylist, the CPU sets up a 'task', which loads the RSP with the microcode to draw with, along with other parameters.
- The CPU waits for the RSP to finish drawing.
- The CPU waits for a vertical retrace (to stay synced with the TV) and then starts the next frame.
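To make the matrix step in the list above concrete, here is a minimal sketch of combining an object's matrix with the viewing matrix. It is only an illustration: `Mat4`, `mat4_mul` and the other names are made up for this example, it uses floats (the real microcode works with a fixed-point matrix format), and it assumes a row-vector convention, so the object matrix goes on the left and is applied first.

```c
#include <string.h>

/* Hypothetical 4x4 float matrix, for illustration only. */
typedef struct { float m[4][4]; } Mat4;

static Mat4 mat4_identity(void) {
    Mat4 r;
    memset(&r, 0, sizeof r);
    for (int i = 0; i < 4; i++) r.m[i][i] = 1.0f;
    return r;
}

/* Translation matrix; with row vectors the offsets live in the last row. */
static Mat4 mat4_translate(float x, float y, float z) {
    Mat4 r = mat4_identity();
    r.m[3][0] = x;
    r.m[3][1] = y;
    r.m[3][2] = z;
    return r;
}

/* Returns a * b. With row vectors, a is applied first, then b. */
static Mat4 mat4_mul(const Mat4 *a, const Mat4 *b) {
    Mat4 r;
    for (int i = 0; i < 4; i++) {
        for (int j = 0; j < 4; j++) {
            float sum = 0.0f;
            for (int k = 0; k < 4; k++) sum += a->m[i][k] * b->m[k][j];
            r.m[i][j] = sum;
        }
    }
    return r;
}

/* Transforms the point (in[0], in[1], in[2], 1) by m. */
static void mat4_apply(const Mat4 *m, const float in[3], float out[3]) {
    for (int j = 0; j < 3; j++) {
        out[j] = in[0] * m->m[0][j] + in[1] * m->m[1][j]
               + in[2] * m->m[2][j] + m->m[3][j];
    }
}
```

For each object you would build its matrix, call `mat4_mul(&object, &view)` once, and use the result for every vertex of that object, instead of applying two matrices per vertex.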
It's also quite possible to execute several tasks within the same frame. For example, you could use the F3DEX microcode for your 3D scene, and then draw the screen
text with the S2DEX microcode.
I should also note that it's entirely possible to forgo 3D entirely and simply write to the raw 2D framebuffer in your code. The downside is that since the CPU must access
RDRAM through the RCP, it's not very fast. If you've ever played Namco Museum, that game does exactly this. It's also possible to modify the resulting image
once it's been drawn by the RDP. These sorts of framebuffer effects are used for the spycam/nightvision in Perfect Dark and the pause screen in Donkey Kong 64, among others.
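Writing straight to the framebuffer from the CPU can be sketched like this. It assumes a 320x240 buffer with 16-bit pixels in the 5/5/5/1 RGBA layout; the function names and the plain array standing in for the framebuffer are made up for this example, not taken from any SDK.

```c
#include <stdint.h>
#include <stddef.h>

enum { FB_WIDTH = 320, FB_HEIGHT = 240 };  /* assumed resolution */

/* Packs 5-bit red, green, blue and a 1-bit alpha into one 16-bit pixel. */
static uint16_t pack_rgba5551(unsigned r, unsigned g, unsigned b, unsigned a) {
    return (uint16_t)(((r & 31u) << 11) | ((g & 31u) << 6) |
                      ((b & 31u) << 1) | (a & 1u));
}

/* Writes one pixel; coordinates outside the buffer are ignored. */
static void fb_put_pixel(uint16_t *fb, int x, int y, uint16_t color) {
    if (x < 0 || x >= FB_WIDTH || y < 0 || y >= FB_HEIGHT) return;
    fb[(size_t)y * FB_WIDTH + (size_t)x] = color;
}

/* Fills a rectangle by writing every pixel from the CPU (slow, as noted). */
static void fb_fill_rect(uint16_t *fb, int x0, int y0, int w, int h,
                         uint16_t color) {
    for (int y = y0; y < y0 + h; y++) {
        for (int x = x0; x < x0 + w; x++) {
            fb_put_pixel(fb, x, y, color);
        }
    }
}
```

A framebuffer effect works the same way in reverse: read pixels the RDP already drew, change them, and write them back.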

Typical memory usage
Here is a diagram showing how my code uses RDRAM.

Starting at the left, first are the framebuffers (two, if you're double-buffering) and then the Z-buffer.
However, there's something important to notice: there is a chunk of wasted memory here. The reason is that the memory is divided into 4 banks (each bank is 1 megabyte),
and the Z-buffer is placed in a different bank from the framebuffers to improve drawing performance when Z-buffering.
Continuing right, next is the FIFO (for storing RDP commands generated by the RSP during a graphics task).
Finally, the code section. When the N64 boots, the bootcode transfers the first megabyte of ROM (0x1000 to 0x101000) to RDRAM at this location. This is where execution
starts. What you put after this section is your own decision.
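The bank arithmetic behind that wasted chunk can be sketched as follows. The offsets are hypothetical: they are relative to wherever the framebuffers start and assume two 320x240 16-bit buffers, which is not necessarily my actual layout. Only the 1 MB bank size comes from the description above.

```c
#include <stdint.h>

enum { BANK_SIZE = 1 << 20 };  /* 1 MB per bank */

/* Which bank a byte offset falls in. */
static uint32_t bank_of(uint32_t offset) {
    return offset / BANK_SIZE;
}

/* Size in bytes of a width x height buffer. */
static uint32_t buffer_bytes(uint32_t width, uint32_t height,
                             uint32_t bytes_per_pixel) {
    return width * height * bytes_per_pixel;
}

/* Rounds an offset up to the next bank boundary (unchanged if already on one). */
static uint32_t align_to_bank(uint32_t offset) {
    return (offset + BANK_SIZE - 1) / BANK_SIZE * BANK_SIZE;
}

/* Bytes left unused when the Z-buffer is pushed to the next bank. */
static uint32_t wasted_before_zbuffer(uint32_t framebuffers_end) {
    return align_to_bank(framebuffers_end) - framebuffers_end;
}
```

With two such framebuffers ending at offset 307200, the Z-buffer would start at 0x100000 in bank 1 and leave 741376 bytes unused in bank 0, unless you fill that gap with something else.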

RSP and RDP
It takes both the RSP and RDP to render 3D scenes.
As I described above, the CPU generates a long displaylist, and passes the address to the RSP and the microcode running on it.
- The microcode (usually Fast3D or F3DEX) is started on the RSP, which is running in parallel to the main CPU.
- It looks in RDRAM for the start of the displaylist, and parses the instructions (for example, draw triangle, load vertex, etc)
- The RSP, which supports parallel vector operations, transforms the vertices into 2D space.
- The RSP creates 2D drawing instructions for the RDP in a small buffer (FIFO) that is also set up in RAM. I believe it's about 128 KB in most of my stuff.
- Once a sufficient number of commands are generated, the RSP instructs the RDP to begin rasterizing (drawing) the scene. It's all 2D at this point.
- While this is going on, the RSP continues generating 2D drawing commands from the master displaylist, and feeds them to the RDP a little at a time.
Remember when I said the process is flexible? Well, you can change the way the RSP and RDP interact. In my opinion, the above method is most efficient for the majority of
games. This is the 'FIFO' method (remember, the 2D data is stored in a small circular buffer).
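If the circular buffer idea is unfamiliar, here is a minimal sketch of one in plain C. It is only a model of the concept: the `CmdFifo` type and its functions are invented for this example, the buffer is tiny on purpose, and the real RSP/RDP handshake is done by the microcode and hardware registers, not by code like this. The entries are 64-bit words because that is the size of an RDP command word.

```c
#include <stdint.h>
#include <stddef.h>

enum { FIFO_WORDS = 8 };  /* tiny on purpose; a real FIFO is far larger */

typedef struct {
    uint64_t cmd[FIFO_WORDS];
    size_t head;   /* next slot the producer (the RSP's role) writes */
    size_t tail;   /* next slot the consumer (the RDP's role) reads */
    size_t count;  /* commands currently waiting */
} CmdFifo;

/* Producer side: returns 0 when the buffer is full and the producer must wait. */
static int fifo_push(CmdFifo *f, uint64_t command) {
    if (f->count == FIFO_WORDS) return 0;
    f->cmd[f->head] = command;
    f->head = (f->head + 1) % FIFO_WORDS;  /* wrap around at the end */
    f->count++;
    return 1;
}

/* Consumer side: returns 0 when there is nothing to draw yet. */
static int fifo_pop(CmdFifo *f, uint64_t *command) {
    if (f->count == 0) return 0;
    *command = f->cmd[f->tail];
    f->tail = (f->tail + 1) % FIFO_WORDS;
    f->count--;
    return 1;
}
```

The point of the wrap-around is that the producer can keep writing into slots the consumer has already read, so a small buffer can carry a displaylist of any length while both sides stay busy.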
Some of the microcodes also support XBUS and DRAM. I won't give much detail on these, but basically with XBUS, there is no buffer in RAM, and the RSP simply gives the RDP
instructions as they're generated. The disadvantage is that this makes both parts wait on each other, when you could be running them at the same time. And finally, DRAM:
the entire displaylist's output is stored in a rather large area in DRAM. The RSP fills this, and then it sits on its hands while the RDP processes the whole list.
It hogs memory and would only be useful when you want to use the RSP for something else very intensive while the RDP is rasterizing (say, decoding MPEG).