Frame Rates and Screen Buffers

hertymakeup, June 18, 2022

Introduction

In the ever-improving world of home computer games, we went through any number of learning phases as we began our gaming careers. Up to that point I had mostly written games that presented a screen to the player and then waited for input. As I got more ambitious we did have one multi-player game where everybody moved at their own rates and the game was monitored every time anybody moved. I did finish with a "real-time" Space Invaders where the game didn't have to wait for input from the player before it moved, but that wasn't running at a particularly fast rate, maybe 2 or 3 frames a second. The IBM mainframe was not designed for real-time games. It's the most expensive bit of kit I've ever worked on, 7-figures worth.

Arcade Games

I didn't really think too much about frame rates in the early days. Arcade games were running fast enough. The vector games, like Asteroids, didn't have a standard idea of a frame rate as the beam is not sweeping across the whole screen in a raster pattern. They would be trying to keep the game going at a fixed rate though, to ensure that things behave smoothly, likely 60 frames per second. Quickly they would have the chips available to achieve 60 frames per second for whatever they wanted to design, as they could add more chips onto the board to make sure they could achieve the magic frame rate.

Frame rates are what make a game look smooth or not. TV screens typically refresh at 50 or 60 frames per second. You therefore need to get the next screen ready to display in that 50th or 60th of a second in order to be as smooth as possible. I didn't achieve that until Uridium on the C64. From now I'll write 50 frames per second, but feel free to interpret it as 60 frames per second.

Dragon 32

When I started on the Dragon 32, Steve had been designing his first games on the ZX Spectrum, and I was commissioned to convert them to the Dragon 32. Steve had written routines to plot and unplot graphic images on the bitmap screen. The first game was 3D Space Wars. We had to plot up to 24 space ships on the screen, plus any bullets fired by them, plus lasers fired by the player. The sequence of events was:

move the player view according to player input,

move all the objects according to their speeds,

calculate all the actual plot screen positions,

then go through the objects from furthest to nearest, unplotting each image from its old position and plotting it in its new position. Objects weren't going to change depth instantly, so the sequence wouldn't change much from frame to frame.
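As a rough sketch of that last step (the originals were written in assembly language, so this is only an illustration), the furthest-to-nearest pass might look like the following. The `Obj` class, the tiny character grid standing in for the bitmap, and the `redraw` function are all hypothetical names made up for the example.

```python
from dataclasses import dataclass

WIDTH, HEIGHT = 16, 8  # tiny stand-in for the bitmap screen


@dataclass
class Obj:
    glyph: str      # one-character stand-in for a ship image
    old: tuple      # (x, y) where it was plotted last frame, or None
    new: tuple      # (x, y) where it should appear this frame
    depth: int      # larger means further from the viewer


def redraw(screen, objects):
    """Unplot and replot every object, furthest first."""
    for obj in sorted(objects, key=lambda o: o.depth, reverse=True):
        if obj.old is not None:
            x, y = obj.old
            screen[y][x] = " "        # unplot from the old position
        x, y = obj.new
        screen[y][x] = obj.glyph      # plot at the new position
        obj.old = obj.new


screen = [[" "] * WIDTH for _ in range(HEIGHT)]
ships = [Obj("n", None, (3, 2), depth=1), Obj("f", None, (3, 2), depth=9)]
redraw(screen, ships)   # both land on the same cell; the near ship wins
```

Because each object is unplotted and replotted directly on the visible screen, it is only missing for a moment, which is where the flicker with many objects comes from.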

We could see that plotting 24 objects took a lot longer than 1, but didn't worry about it too much. It worked in the game's favour that the last ship is a more speedy dog-fight than picking off any one of a squadron. We were aware of it though, and in the next two games Steve got more background graphics on the screen all the time, which balanced out the frame rate a bit. The Dragon 32 had an analogue joystick control, which had an interesting feature that it used time to measure the joystick positions, about a quarter of a frame for a joystick positioned full right and down, hardly anything for left and up. Sound was generated by firing values at the sound chip in tight loops for enough time for the sound to be heard. So with all that going on we were not going to break any speed records. I can't remember whether anything was going to stop the games going faster than 50 frames per second...

3D Seiddab Attack had some graphics for buildings at night always on the screen, which meant that any screen would take a while to refresh. The player was driving a tank, so it didn't have to zoom about. I improved the graphic data format slightly to speed up plotting a tad. I had twice as much memory as the original, Steve was still writing for the 16K Spectrum. I didn't think of using the memory for pre-rotated graphics that would have sped things up a lot more, I just put some more graphics in instead.

3D Lunattack also did a fair amount of background plotting for the horizon, craters and rocks, that evened out the plot rates. Since we had decided to unplot an object from its old position and replot it in the new in consecutive operations, the objects were only off the screen for a small period, but with a number of them then you do see some flickering. It wasn't until Steve got to writing Astroclone that he addressed the flickering issue. It had a nice candle-lit kind of effect for Avalon and Dragontorc, but he wanted more solidity. That's when he decided to build up the next screen in a separate area of RAM, plotting all the objects into that, and then when they're all done he copies the finished screen on to the display screen. He might even have been able to simplify the plotting to avoid the screen-thirds addressing that the Spectrum had. Only the copy process needs to sort that out. Of course that is a time burden in itself, but it cleaned up all the drawing. So we had a double-buffered system on the Spectrum, something I never did on the C64 or Dragon 32.
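To illustrate the idea of leaving the screen-thirds addressing to the copy step, here is a minimal sketch. It assumes a plain linear work buffer, which is my guess at the simplification rather than anything confirmed about Steve's code. The row offset calculation is the standard Spectrum display layout; the function and variable names are made up.

```python
ROWS, BYTES_PER_ROW = 192, 32   # Spectrum bitmap: 256x192 pixels, 1 bit each


def display_offset(y):
    """Offset of pixel row y within the Spectrum's display file."""
    third = (y & 0xC0) << 5       # which third of the screen (0, 2048, 4096)
    char_row = (y & 0x38) << 2    # character row within that third
    pixel_row = (y & 0x07) << 8   # pixel line within the character row
    return third | char_row | pixel_row


def copy_to_display(work, display):
    """Copy a linear work buffer into the interleaved display layout."""
    for y in range(ROWS):
        src = y * BYTES_PER_ROW
        dst = display_offset(y)
        display[dst:dst + BYTES_PER_ROW] = work[src:src + BYTES_PER_ROW]


work = bytearray(ROWS * BYTES_PER_ROW)
display = bytearray(ROWS * BYTES_PER_ROW)
work[1 * BYTES_PER_ROW] = 0xFF      # leftmost byte of pixel row 1
copy_to_display(work, display)
```

With this split, the plot routines only ever see `y * 32 + x` addressing, and the awkward layout is paid for once per frame in the copy.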

Recreation

During my lunch-hours at work I had been playing some other games. We were working in Steve's living room, and he went off for his lunch. I had gotten used to missing lunch from my time at GEC when we used to play bridge at lunch-time, missing out food. So I was playing Revenge of the Mutant Camels, Sheep in Space, Manic Miner and Boulderdash. It was clear that Jeff Minter was achieving the magic 50 frames per second on the C64. My next game was going to be on the C64. I got to write 3D Lunattack on the C64, using very little of the C64's strengths, but getting me familiar with some of the hardware and the different assembly language. Using the hardware sprites made some of the plotting much, much quicker than it had been on the Dragon.

At that point I was thinking about writing an original game on the C64. I had caught up with Steve, there were no new games to convert, he was busy writing Avalon. He was moving into 48K games, his ambitions to write more complex games were being realised, so development was taking a bit longer. I knew I wanted to utilise the C64's strengths: hardware sprites, character modes, smooth scrolling, multi-colour graphics, screen interrupts. This is when I found out how long it takes to copy not even a full screen's worth of characters from a map buffer. I was running 16 meanies and 8 Gribblets plus Gribbly, Seon and up to 2 bubbles. There's a sentence you don't see often. Thing is that even if you take one instruction over a fiftieth of a second then you're down to 25 frames per second, albeit with plenty of time to spare. We were using border colour changes around the various big routines to tell how long they were taking, which also are more difficult to see when they are flickering at 25 frames per second.

On the C64 you can set the screen RAM to be on any 1K boundary in your 16K designated video bank. So you can have a second screen and go for what we call a double-buffered system. If you're scrolling quite slowly, preferably always in one direction at a fixed speed, then you could be loading up a second screen with the new background data over 2 or more frames before you need to switch to the new moved background screen. That gives you plenty of time to concentrate on moving all your objects, and even use a sprite multiplexor to get more objects on screen. With fully free movement you can't predict the next scroll position that you'll need so you can't do work in advance. The trick to getting performance is always to work out as much as you can in advance. I'd liken it to doing an exam when you know what the questions are going to be. Starting to sound like cheating, isn't it? The trick is not to get caught!
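For the curious, the screen position is chosen by the top four bits of the VIC-II register at $D018, which count in 1K steps from the start of the current video bank. The sketch below just does that arithmetic; the helper names and the choice of slots 1 and 2 for the two screens are hypothetical, and on the real machine the write would be timed for the vertical blank.

```python
BANK_SIZE = 0x4000      # the VIC-II sees one 16K bank at a time
SCREEN_STEP = 0x0400    # screen RAM can start on any 1K boundary


def with_screen_slot(d018, slot):
    """Return a $D018 value with the screen slot (0-15) replaced."""
    if not 0 <= slot <= 15:
        raise ValueError("screen slot must be 0-15")
    return (d018 & 0x0F) | (slot << 4)   # low bits (character set) untouched


def screen_address(bank_base, d018):
    """Address of screen RAM for a given bank base and $D018 value."""
    return bank_base + (d018 >> 4) * SCREEN_STEP


d018 = 0x15                             # power-on default: screen at $0400
shown, hidden = 1, 2                    # hypothetical choice of two slots
d018 = with_screen_slot(d018, hidden)   # flip to the freshly built screen
shown, hidden = hidden, shown           # the old screen is now free to rebuild
```

The flip itself is a single register write, which is what makes a second screen so cheap to switch to; the cost is all in filling it.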

Anyway, I was just using the one screen, and timing my update of the characters on the screen so that you never see it happening. If I hadn't done that, you'd get a tearing effect that we used to see on PCs before the video cards supported vertical blank screen switching, which happened remarkably late. Now we're getting variable refresh rates to suit the game so the monitor just says "I'll update when the game has got the screen built, however long it takes." This was not possible with cathode ray tubes because the dots fade, so they need to be driven at 50 or 60 frames per second. New monitors can hold the picture until they're given new information.

When I was writing Paradroid, the graphics weren't looking good in multi-colour mode, so I switched to hi-res, and that meant that I needed to use different colours for the graphics or we'd have been in two colour mode. I might have tried that too, but that would have been a step backwards. So suddenly as well as doing the screen update, I have to do a colour map update. Each character had a designated colour and they had to be updated. It's also faster to just plonk your colour information out than check the colour that is already there and skip it if it's already right. I ended up running at 17 frames per second. I synchronised the various parts of my game update over three consecutive frames so that various updates got done during the intervening vertical blank periods. The hardware sprites get updated during that time, again so you don't see any sprites tearing or otherwise malformed. The animated characters also get done in a vertical blank for the same reason. None of the third-of-the-job sub-frames must over-run or suddenly you'd drop to 12.5 frames a second. Doing it that way meant there was plenty of time. Later I was messing about in the code and decided to see what would happen if I took one of the two middle vertical blank waits out and with a bit of rearrangement of the calls, lo-and-behold, the game ran happily at a constant 25 frames per second. I didn't re-tune the game for the faster rate, it just felt nice 50% faster, so we called it the Competition Edition and it came out on a double-pack with Uridium+.
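The frame rates quoted so far all come from the same arithmetic: when the game is synchronised to the display, one game update always costs a whole number of screen refreshes. A small sketch, assuming a 50Hz display (the function names are made up for the example):

```python
import math

REFRESH_HZ = 50  # PAL; use 60 for NTSC


def refreshes_needed(work_ms, refresh_hz=REFRESH_HZ):
    """Whole display refreshes consumed by one game update."""
    frame_ms = 1000 / refresh_hz
    return max(1, math.ceil(work_ms / frame_ms))


def game_fps(refreshes, refresh_hz=REFRESH_HZ):
    """Game frame rate when each update spans a whole number of refreshes."""
    return refresh_hz / refreshes


for n in (1, 2, 3, 4):
    print(n, "refreshes per update ->", round(game_fps(n), 1), "fps")
```

An update that takes 20.1 milliseconds needs two refreshes, so the game drops straight from 50 to 25 frames per second. Three refreshes gives 16.7, which I round to 17, and four gives 12.5.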

So along comes Uridium, and I desperately wanted 50 frames per second. I also wanted to maximise the screen play area. The switch from a vertical scrolling area to a static score area, or vice versa, always required a full character row to get the VIC-II graphics chip correctly synchronised. I tried it both ways, Gribbly's has the panel at the bottom, Paradroid at the top. Vertical scrolling was therefore stopped. There are no animated characters on screen either. Anything leaving the visible screen area plus a few characters is disposed of pretty quickly. Everything is concentrated on the area of the map where the player is. I'm still only working on a single screen and the panel at the top actually buys me a little extension of the vertical blank time as far as the game screen goes. I start updating the scrolling screen as soon as I can, i.e. the end of the previous screen display, and race the raster back down the screen. I've got the screen all updated before any of it is read by the VIC-II for display.

With scrolling at up to 8 pixels, or a full character, per 50th of a second, the worst case is that I would have to refresh the background every frame, so I just get on with it whether I need to or not, and save all the memory a second screen buffer would take. The beneficial side-effect of that is that when you destroy features on the background I can swap them to the destroyed graphic on the background map and they instantly get displayed on screen, I don't have to worry about whether they need updating on two screen buffers later. That helped with the melting ship background at the end of the levels too. It also helped with the Manta bullets that were all done with modified characters that were moving along the map. I should clarify that we would always have a character layout of the full map of the level's play area somewhere in memory. From there we would copy the appropriate screen area to the display screen. We would use the map for collision detection so we could run objects off screen, they wouldn't be using screen co-ordinates to look at the screen characters under them, they would be using world co-ordinates to access the full map. They would look at the map below them for bullet characters, and blow up if they found one.
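A toy version of that arrangement, with a one-row map and made-up names, shows why a single screen refreshed from the map every frame is so convenient. The characters used here ('#' for ship structure, '*' for a bullet, 'x' for a destroyed feature) are only stand-ins.

```python
SCREEN_COLS = 8    # tiny screen width, in characters, for illustration

# One row of a level map; '#' is ship structure, '*' a bullet character.
level_map = [list("..##....*...##......")]


def copy_window(level, scroll_col, cols=SCREEN_COLS):
    """Copy the visible slice of the map to a fresh display screen."""
    return [row[scroll_col:scroll_col + cols] for row in level]


def char_under(level, world_col, world_row):
    """Look up the map character at a world position, on screen or not."""
    return level[world_row][world_col]


def hit_by_bullet(level, world_col, world_row):
    """True if the map holds a bullet character at this world position."""
    return char_under(level, world_col, world_row) == "*"


screen = copy_window(level_map, scroll_col=10)
level_map[0][12] = "x"                  # destroy a feature on the map...
screen = copy_window(level_map, 10)     # ...and the next refresh shows it
```

The bullet at world column 8 is off the left of the screen in this example, yet `hit_by_bullet` still finds it, because objects consult the map rather than the display.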

There might be an extra bit of border round the edges of the map. Gribbly's world is surrounded by solid rock, Paradroid is enclosed by the walls of the ship, and Uridium has a bit of spare space at both ends that you can never get to. Alleykat had a wraparound track map, solving the edge issue in another way. Morpheus doesn't have a background map at all, it's all done with sprites and the player ship is made out of characters, so that one is turned completely inside out.

16-Bit

I'm trying to get to screen buffering, and I'm there if we now move to the 16-bit machines. Although the CPUs were 8 times faster than the 8-bit machines, and moved 16 bits at a time instead of 8, the screens were all bit-maps, so the screen was also 16 times bigger. What we did have was more RAM, so we had more program space, and if we followed the character screen model, even though we had to plot those characters ourselves into the bitmap, the screen maps took up relatively less space. We immediately adopted the double-buffered model. Dominic Robinson had become interested in 3D graphics and had done some demos and started on Simulcra. 3D games tend to involve a complete screen rebuild every frame. It was running pretty fast in the end.

For my games I didn't build the whole screen from scratch every frame. If we go back to Rainbow Islands, it scrolled vertically only. We had a barrel effect for a background buffer of the character graphics that were on the screen. This "rolled" up and down the character map. There wasn't space for a whole unpacked wallpaper of the whole level, it would have been too big. So we had this barrel, that would most of the time have a split position somewhere where you'd have to skip to the top of the buffer. What we did then was have two graphics screens. One would be on display and not to be touched, the other would effectively be the previous screen, and is going to be the next. What we would have to do therefore is remove any plotted objects from the previous image. Our plot routine would work out where on the screen it was going to plot, record that position and the width and depth of the graphic, and it would then plot the object. 2 frames later we would look up the list of areas written to and clean them all up from the barrel buffer with a copy routine, or a blit. The two screen buffers were also barrels. Display would begin at an appropriate position down the screen buffer and there would be an interrupt occurring at a point down the screen to reset the display address to the top. Some graphics might also need to be plotted across the split, making two plots and two restore blocks. That was fun getting it all working with pretty minimal debugging tools, I can tell you.
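The split is easier to see in a few lines of code. This sketch only works out which rows of a barrel buffer a graphic occupies and records them for the later clean-up; the names are hypothetical, the real routines also tracked horizontal position and width, and it assumes the graphic is no taller than the buffer.

```python
def barrel_spans(top_row, height, buffer_rows):
    """Split a block of rows into the spans it occupies in a barrel buffer.

    top_row is measured in unwrapped rows; the result is a list of
    (start_row, row_count) pairs inside the buffer, one or two of them.
    """
    start = top_row % buffer_rows
    first = min(height, buffer_rows - start)
    spans = [(start, first)]
    if first < height:
        spans.append((0, height - first))   # remainder wraps to the top
    return spans


restore_list = []   # areas to clean up two frames later


def plot(top_row, height, buffer_rows):
    """Record where a graphic lands; the drawing itself is left out."""
    for span in barrel_spans(top_row, height, buffer_rows):
        # ... draw the rows of the graphic that fall in this span ...
        restore_list.append(span)


plot(top_row=10, height=16, buffer_rows=256)    # fits in one piece
plot(top_row=250, height=16, buffer_rows=256)   # crosses the split
```

The second plot produces two entries, 6 rows at the bottom of the buffer and 10 at the top, which is the "two plots and two restore blocks" case.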

That does raise another point: when you're debugging a plot routine you can point it at the real display screen while you're debugging it. That way you can see the graphics appearing, or not, as you trace through the code. That's provided your display screen and debugging screen are not one and the same.

Once you start messing about with plotting onto unseen screens you've no visibility, by definition. You only get to see the results when it's all done and you flick the screens over. A lot of issues have to be solved by just looking at the code and spotting the mistake. You know it's there, or more likely they're there, so it's just a case of finding them. Plenty of patience is needed, you're solving problems of your own making!

Single buffering is just going to look a mess and isn't practical without a display chip to do the character and sprite rendering. Certainly it's no good for 3D games. Regardless of how many buffers you use (three is also often mentioned and I'll come to that shortly), you can squeeze a bit of spare time out of the process. The mechanism required is that you need an indication of when the next screen can be switched over, at some point during the vertical blank period. On the Amiga and Atari ST we had total control of the machine so we could implement whatever we wanted. Later on when we had to be OS compliant we still were able to get an indication that it was time to change over the display.

Just a bit of terminology then, sometimes people refer to the two screens in a double-buffered system as the front buffer (being viewed) and the back buffer (being rendered). I actually called them the seen and unseen buffers.

Our usual sequence of events then per game cycle was:

get your input from the player's device, usually a joystick or keyboard,

move the player object according to the instructions from the player,

set the game screen position according to the new player position, i.e. the scroll position,

move all the other objects in the game, ending each one by calculating the plot position relative to the screen position.

Each object also has a depth position on the display, maybe determined from a 3D position, or on a 2D game they might be given a front to back priority.

Now that all the objects are ready to be plotted and sorted into depth sequence, we can begin rendering on the unseen back buffer, provided one is ready for us. We may have to wait if the front buffer is still being displayed and the back buffer is already built and waiting to be displayed at the next vertical-blank period.

As soon as the screen buffers are swapped, the back buffer becomes the front and is ready to be seen, and the old front buffer becomes free and ready to be updated as the back buffer.

We first have to clean up the old buffer, restoring the background from the master pristine copy. This involves going through the background restore list and copying the data over as fast as possible. We had the blitter do that later,

we then might have had to add new leading edges of background data if the scroll position was different from 2 frames ago,

then we can plot all the objects in their new locations on the screen in depth sequence. We supported 16 depths. Any 2 objects at the same depth would get plotted in the same sequence every time because the object list will cause them to be processed in the same sequence every time,

having finished rendering we mark the buffer as ready for display so that the vertical-blank interrupt can do the swap,

now here's the clever bit... we can then get on with moving all the objects ready for the next frame, we just have to wait before we can render.
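The rendering half of that sequence can be sketched in miniature as follows. This is a simplified model with made-up names: the screen is a single row of characters, each buffer keeps its own restore list, and the vertical-blank interrupt is just a function call. It leaves out the waiting and the moving ahead, and shows only why each buffer has to clean up what was drawn into it two frames earlier.

```python
WIDTH = 12
background = list("." * WIDTH)             # master pristine copy

# Each buffer carries its own restore list of positions drawn into it.
buffers = [{"pixels": background[:], "dirty": []} for _ in range(2)]
state = {"front": 0, "back": 1}            # seen and unseen buffer indices


def render(buf, objects):
    """Clean up the old image, then plot objects furthest first."""
    for x in buf["dirty"]:
        buf["pixels"][x] = background[x]    # restore from pristine copy
    buf["dirty"].clear()
    for glyph, x, depth in sorted(objects, key=lambda o: -o[2]):
        buf["pixels"][x] = glyph
        buf["dirty"].append(x)              # remember what to restore


def vertical_blank():
    """Stand-in for the interrupt that swaps seen and unseen buffers."""
    state["front"], state["back"] = state["back"], state["front"]


frames = []
for step in range(4):
    objects = [("A", step, 0), ("b", step + 1, 5)]   # (glyph, x, depth)
    render(buffers[state["back"]], objects)
    vertical_blank()
    frames.append("".join(buffers[state["front"]]["pixels"]))
```

Each entry in `frames` is what the player would see after a swap. No trails are left behind even though each buffer is only touched on alternate frames, because its own restore list remembers what it held.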

Maybe all the movements of all the objects takes 20% of a frame. We're 20% of a fiftieth of a second ahead of the game. If there's a brief spike in processing, like a big explosion needs displaying, we can absorb that over the next few frames. Processing uncoils a bit and we might not have all the objects moved by the time the next frame switch occurs, but the screen was rendered and ready in time. If we over-run by 1% every frame for 20 frames we're OK, after that we won't have the next screen rendered in time so the interrupt won't have a completed screen to display and it'll have to wait. That's when we get a glitch. It does mean though that we get nearly a whole frame of catch-up time, so the next glitch might not be for a second or so. The trick is not to over-run at all, of course.

So, finally, we get to triple buffering. As well as a seen front screen and an unseen back screen, we can have a third saved screen. The same processes above still occur, but when we have moved all the sprites for the third time we can immediately start rendering to the saved screen and get more ahead of time. Only when we have moved for the fourth time will we have to check whether we have a buffer ready to render to. This means that in the normal course of events, instead of being up to 20% ahead, we can be a whole frame plus 20% ahead. We could over-run every frame by 1% for over 2 seconds before we would be forced to admit we haven't built the next frame in time. The system will uncoil like a watch spring a lot further. We could over-run by 10% for 12 consecutive frames and we'd still be OK. Any under-run is also gratefully received and we can get ahead of the game again. You could add a fourth buffer, even a fifth, but the more screens you have, the further behind the player gets. The player is looking at and reacting to a screen that is not quite the latest. You're receiving joypad input for a frame that's not going to be displayed for 3 or 4 fiftieths of a second, the user is going to start experiencing time lag. Thus we usually stop at 3 buffers.
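The numbers above can be checked with some simple arithmetic. This is only a back-of-envelope model of the reasoning in the text, not a simulation of the real game loop: it assumes movement takes 20% of a frame and that every buffer beyond the second banks one more whole frame of slack. The function names are made up for the example.

```python
REFRESH_HZ = 50


def head_start_percent(buffers, move_percent=20):
    """Lead over the display, as a percentage of one frame.

    Follows the article's reasoning: movement time is banked as slack,
    and each buffer beyond two banks one further whole frame.
    """
    return move_percent + 100 * (buffers - 2)


def frames_before_glitch(buffers, overrun_percent, move_percent=20):
    """Consecutive over-running frames that can be absorbed."""
    return head_start_percent(buffers, move_percent) // overrun_percent


def seconds(frames, refresh_hz=REFRESH_HZ):
    """Convert a count of display frames into seconds."""
    return frames / refresh_hz


double = frames_before_glitch(2, overrun_percent=1)    # 20 frames
triple = frames_before_glitch(3, overrun_percent=1)    # 120 frames
burst = frames_before_glitch(3, overrun_percent=10)    # 12 frames
```

With triple buffering the 1% over-run lasts 120 frames, which is 2.4 seconds at 50 frames per second, and a 10% over-run lasts the 12 frames mentioned above. The price is in the other direction: every extra buffer adds another fiftieth of a second, 20 milliseconds, between reading the joystick and showing the result.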

Now we did try to explain this to the Factor 5 lads, who are undoubtedly very clever and talented, that Fire & Ice had that trick up its sleeve. They came over from Germany to see how we were getting on as they had explained how the Amiga could smooth scroll horizontally and vertically and get away with less updating. It was a 2-dimensional implementation of our 1-dimensional barrel we had used in Rainbow Islands. I have written a separate blog article about that so will not be going into that here. Fire & Ice had a bit too much sliding and bouncing physics going on to be able to achieve 50 frames per second. They checked over our background and plot routines but agreed we had too much processing. We told them about triple buffering and they retreated to the back of the room for a secret discussion. About 5 minutes passed before they returned from their huddle. They just said: "No."

I have seen PC documentation where they do talk about triple buffering and various systems do support it. They wouldn't if they didn't think it helped. It just allows you to over-run a little bit more than double buffering before bad things happen. You might be able to do some more brief but impressive effects before you get caught out. Your eye needs the input at 50 frames per second, but human reactions being what they are, your trigger finger is going to be 10 frames behind, so 2 or 3 buffers isn't going to make a lot of difference to the player. To the programmer, it bought us a bit of extra time. For those people with a 512KB Amiga, it would likely run in double-buffered mode, but if there was extra RAM of either video or fast RAM, we would have enough video RAM for triple buffering. We would move the code into fast RAM if there was any, which meant the program would run faster anyway as it wouldn't be interrupted by data fetches from various other sources such as sprites, extra bit-planes, copper processing or blitter operations. Of course the Amiga A1200 was insanely fast anyway. It would make sense to run in double-buffered mode all the time for the most accurate game-play experience. I can say that now...