v0.1 by this year!

I’ll keep it short and simple: My goal is to release CorgiDS by the end of this year.

Now, setting deadlines is a dangerous thing, especially when there’s still a lot of work to be done. However, I think I’m in a good enough position to start wrapping up the rest of the emulator’s issues: mainly better 3D rendering, better speed, and assorted smaller fixes.

Along the way I’m going to perform some major renovations to the blog as well as get a new domain. Keep an eye out for both of those!

Fixing touchscreen issues

Emulators are no different from regular programs when it comes to debugging. The only added difficulty is that one must be able to debug the emulator itself by figuring out what’s going wrong in the system being emulated. This can be quite challenging indeed when one does not have access to the source code of a game and must look through a disassembly of the compiled ROM in order to figure out what’s going on. Nevertheless, debugging is very much possible with a great deal of perseverance.

A bug I fixed today as of writing this article seemed quite simple: certain commercial games were failing to read touchscreen output. The problem is that the emulated touchscreen does work; for example, the firmware and the Digimon games read it perfectly fine. For some reason though, games like Harvest Moon DS and Super Mario 64 DS weren’t even attempting to read from the touchscreen. In fact, code that was being called according to NO$GBA was being completely skipped over by CorgiDS! Over the course of several weeks, I halfheartedly attempted to fix this problem to no avail. It wasn’t a large priority until I tested Pokemon Diamond, which hung in an infinite loop just before reaching the title screen. Thinking that the touchscreen issue was related to this, I finally decided to put in some effort. (Spoiler: It wasn’t! Pokemon Diamond still remains to be fixed.)

I first suspected an issue with the IPCFIFO. The FIFO is a pair of queues that the ARM9 and ARM7 use to communicate with each other. For example, when the ARM7 reads touchscreen input, it can then send this information to the ARM9’s FIFO, which will trigger an IRQ (interrupt request) for the ARM9 to handle. In the case of something like Harvest Moon DS, no FIFO communication was happening at all once the intro screens were pulled up, which is when the game is supposed to start reading touchscreen input. Inspecting the code, however, didn’t reveal any issues, and I was back to square one.
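To make the FIFO idea concrete, here is a minimal sketch of one direction of such a queue. The class name, the 16-word capacity, and the rule that the IRQ fires when the queue goes from empty to non-empty are assumptions for illustration, not CorgiDS's actual code.

```python
from collections import deque


class IpcFifo:
    """One direction of the inter-processor FIFO (hypothetical model)."""

    CAPACITY = 16  # assumption: 16 words per direction

    def __init__(self, on_receive_irq):
        self.words = deque()
        # Callback standing in for "request an IRQ on the receiving CPU".
        self.on_receive_irq = on_receive_irq

    def send(self, word):
        """Queue a 32-bit word; returns False if the FIFO is full."""
        if len(self.words) >= self.CAPACITY:
            return False
        was_empty = not self.words
        self.words.append(word & 0xFFFFFFFF)
        if was_empty:
            # Assumption: the receiver is notified when data first arrives.
            self.on_receive_irq()
        return True

    def receive(self):
        """Pop the oldest word, or None if nothing is waiting."""
        return self.words.popleft() if self.words else None
```

A full model would hold two of these, one for ARM7-to-ARM9 traffic and one for the opposite direction.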

After a while, my next idea was to figure out where exactly the touchscreen code was in the ARM7 binary and work my way backwards to see which functions called it. Using the stack trace in NO$GBA and strategically-placed breakpoints, I backtracked all the way to the main loop in the ARM7 binary. I saw that the loop consists of calls to many different functions throughout the binary, so I placed breakpoints on each of them to determine which one would lead to the touchscreen code.

I found something odd: it looked like every single function would call the touchscreen code in NO$GBA! From a DS programming perspective, this doesn’t make much sense; user input only needs to be read once every frame. It didn’t take long for me to figure out what was happening: the touchscreen code was being called within an IRQ handler. This makes more sense because an IRQ can fire at any point during code execution, assuming it’s enabled of course.

The DS has a plethora of IRQ options available. Some of them are vital, such as V-Blank, which signals that the game can start accessing VRAM without interfering with the graphics engine. Others are almost never used by games, such as real-time clock IRQs. Regardless, every game will make use of several kinds of IRQs, and not processing them properly can lead to severe issues. I decided to place a breakpoint in the BIOS code responsible for jumping to whatever IRQ handler the game has so that I could check which IRQs were being called. I saw that one IRQ in particular, V-Counter Match, was indeed calling the touchscreen code. V-Counter Match is simple: if the current scanline that the graphics engine is drawing matches a variable called V-Counter, an IRQ is requested. On the ARM9 side, this can be used for special mid-frame graphical effects that require extra timing precision. I was surprised to see that the ARM7 touchscreen code relied on it, however.

I looked in my GPU code and saw that V-Counter was being set correctly to a value of 0 (so the IRQ should be triggered at the very beginning of each frame). Then I looked at the code responsible for requesting the V-Counter Match IRQ, and I facepalmed hard. CorgiDS was incrementing V-Count (the current scanline position) before checking if the IRQ could be requested. Because V-Count is never less than 0 before the increment, it was always at least 1 by the time the check ran, so the IRQ would never be requested if V-Counter was set to 0, which is what certain games set it to. Making V-Count increment AFTER the IRQ check fixed all the touchscreen issues I was having.
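The ordering problem can be reproduced in a few lines. This is a simplified sketch, not the emulator's real scanline loop: the function names are hypothetical, and the wrap back to line 0 is assumed to happen after the comparison in the buggy version.

```python
SCANLINES_PER_FRAME = 263  # 192 visible lines plus the V-Blank lines


def run_frame_buggy(vcounter_match):
    """Increment first, then compare: line 0 is never seen by the check."""
    hits = []
    vcount = 0
    for _ in range(SCANLINES_PER_FRAME):
        vcount += 1
        if vcount == vcounter_match:
            hits.append(vcount)  # stands in for requesting the IRQ
        if vcount == SCANLINES_PER_FRAME:
            vcount = 0  # assumption: the wrap happens after the check
    return hits


def run_frame_fixed(vcounter_match):
    """Compare first, then increment: every line, including 0, is checked."""
    hits = []
    vcount = 0
    for _ in range(SCANLINES_PER_FRAME):
        if vcount == vcounter_match:
            hits.append(vcount)
        vcount = (vcount + 1) % SCANLINES_PER_FRAME
    return hits
```

With a match value of 0, the buggy loop records no hits while the fixed loop records one per frame; any other in-range match value behaves the same in both, which is why only some games were affected.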

What a ride that was. Hope you enjoyed reading about my struggles!

Work in progress

Terribly sorry about the lack of updates. There isn’t anything revolutionary to show, but there has been incremental progress.

[Screenshots: super_mario_title, super_mario_bleh, mario_and_luigi_title]

While I’ve mostly fixed the issues with 3D title screens, there’s still not much in the way of graphical progress as far as going in-game is concerned. SM64DS is the only game in my library that actually displays stuff coherently; FFIV and others fail to render the models at all! Because the polygons themselves are clipped correctly, the problem seems to be with texture lookup. It’s super weird, as the code for texture rendering seems to be otherwise correct… Ah well, I’ll have plenty of time to fix these glitches.

I don’t have anything particularly interesting to write about this time. This is just a status update letting you guys know this project is still alive.

Getting somewhere!

With color interpolation out of the way, it wasn’t long before I was able to implement texture mapping. This is nearly enough to make the 3D games in my library playable:

[Screenshots: super_mario2, blark, tetris, tetris2]

While the inclusion of textures is obvious, I have also implemented z-buffering as well as z- and w-interpolation. The framework I have right now isn’t perfect, but it certainly gets the job (almost) done.

A lot of DS games like to use “3D-as-2D” graphics; e.g., the Tetris DS title screen, where the bottom screen is entirely 3D despite being a menu. This is mainly because it’s far easier to perform special effects using the GPU instead of the comparatively primitive 2D engines. The GPU offers lighting effects, limitless rotation and scaling, stretched textures, per-pixel alpha blending, and more, all with a budget of 2048 polygons per frame. In comparison, each 2D engine (one per screen) only has 128 sprites total, and only 32 of those can be rotscaled, with less control than the GPU offers.

If you’ve read my previous article on color interpolation, textures largely work in the same manner. However, there is some extra complexity associated with them, as you might guess. Each vertex in a polygon is capable of storing two texture coordinates, s and t, which correspond to x and y respectively. These must be interpolated in the same manner as vertex colors, and the interpolated s and t together select one “texel”. The meaning of the texel varies depending on the texture format; the GPU allows you to choose between direct color, palette-based with or without an alpha coefficient, and compressed. Once the texel’s color has been retrieved, it is combined with the interpolated vertex color and displayed on screen. To add to the fun, textures can be repeated on a polygon and flipped, and the vertex texture coordinates can be transformed by a special matrix if a game wishes to do so.
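As a rough illustration of the repeat and flip options, the sketch below maps an interpolated integer coordinate back into the texture. The helper names are hypothetical, the texture is a plain list of rows, and clamping to the edge when repeat is off is an assumption.

```python
def wrap_coord(coord, size, repeat, flip):
    """Map an integer texel coordinate into the range [0, size)."""
    if not repeat:
        # Assumption: without repeat, coordinates clamp to the texture edge.
        return min(max(coord, 0), size - 1)
    tile, offset = divmod(coord, size)
    if flip and tile % 2 == 1:
        # Every other repetition is mirrored.
        return size - 1 - offset
    return offset


def sample(texture, s, t, repeat=True, flip=False):
    """Fetch the texel at (s, t) from a texture stored as a list of rows."""
    height = len(texture)
    width = len(texture[0])
    x = wrap_coord(s, width, repeat, flip)
    y = wrap_coord(t, height, repeat, flip)
    return texture[y][x]
```

For a 4-texel-wide texture, coordinate 5 lands on texel 1 when repeating, texel 2 when repeating with flip, and texel 3 when clamped.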

At this point, roughly 60-70% of the GPU is emulated. There are, of course, plenty more things to address before 3D games can be enjoyed:

  • Polygon strips. These are used to connect polygons together in order to save space on vertex lists as well as CPU time. Every detailed 3D model will use these, such as the Mario head in SM64DS.
  • Proper clipping of vertices outside the view volume, or in layman’s terms, vertices that the camera can’t see. When a vertex crosses one of the six clipping planes (left, right, top, bottom, near, and far), the GPU will replace it with two new vertices placed where the polygon’s edges intersect the clipping plane. This allows polygons to move off- and onscreen seamlessly.
  • Interpolation precision. FFIV’s title screen looks ugly because of too much precision being lost. Some of you may be able to notice other quirks in the images shown.
  • Some other special effects, such as lighting and translucent polygons.
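The clipping step from the list above can be sketched for a single plane. This uses the standard Sutherland-Hodgman approach against the right plane (x <= w) and floating point for readability, whereas the hardware works in fixed point; the function names are hypothetical and this is not necessarily how the DS or CorgiDS does it internally.

```python
def lerp_vertex(a, b, t):
    """Blend two (x, y, z, w) vertices; t = 0 gives a, t = 1 gives b."""
    return tuple(ca + (cb - ca) * t for ca, cb in zip(a, b))


def clip_right_plane(polygon):
    """Clip a convex polygon of (x, y, z, w) vertices against x <= w."""
    result = []
    for i, current in enumerate(polygon):
        previous = polygon[i - 1]  # index -1 wraps to the last vertex
        cur_dist = current[3] - current[0]    # >= 0 means inside
        prev_dist = previous[3] - previous[0]
        if (cur_dist >= 0) != (prev_dist >= 0):
            # The edge crosses the plane: add the intersection point.
            t = prev_dist / (prev_dist - cur_dist)
            result.append(lerp_vertex(previous, current, t))
        if cur_dist >= 0:
            result.append(current)
    return result
```

Clipping the triangle (0, 0, 0, 1), (2, 0, 0, 1), (0, 1, 0, 1) drops the middle vertex and replaces it with two vertices on the plane, turning the triangle into a quad. A full clipper would repeat this for all six planes.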

I’m hoping to get polygon strips and vertex clipping working within this week. Extra stuff like lighting will have to wait; some of the issues you see above are actually due to a combination of my imperfect 2D engine and even worse system timing, and those things definitely need fixing as well. Nevertheless, lots of progress over the last several days!

Interpolation

Disclaimer: this article is meant to describe the precision that the DS uses for interpolation as well as how interpolation works. It is not meant to deride DeSmuME (which uses lower precision on default settings), nor proclaim that CorgiDS is better (it isn’t). This is wholly a technical article, not a contest. I apologize if the tone of the article indicates otherwise. 

CorgiDS now supports color interpolation! Observe these shiny screenshots:

[Screenshots: interpolation!, interpolation_quad]

Astute observers may notice that the output here differs from something like DeSmuME. This is because, along with melonDS, CorgiDS uses the same color precision that the DS uses! StapleButter, the developer of melonDS, has helped immensely in getting interpolation working, as well as giving me information about the quirks of the DS GPU.

For comparison, here’s the output from DeSmuME using default settings on my Mac:

[Screenshot: desmumue_interpolation]

How does interpolation work? If you’re unfamiliar with the term, interpolation simply means finding any number of values within the boundaries of two known values. Consider the following: say you have a function f(x). You are given the values for, say, f(0) and f(10). Interpolation would be finding the values between f(0) and f(10), such as f(1), f(2), and so on.
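In its simplest (linear) form, that idea looks like the sketch below. The function name and the sample values are made up for illustration.

```python
def interpolate(f_start, f_end, x_start, x_end, x):
    """Linearly estimate f(x) from the known values at x_start and x_end."""
    fraction = (x - x_start) / (x_end - x_start)
    return f_start + (f_end - f_start) * fraction
```

If f(0) = 0 and f(10) = 20, then `interpolate(0, 20, 0, 10, 1)` estimates f(1) as 2.0.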

The DS uses interpolation for both vertex colors (shown in the screenshots) and textures, the latter of which I’ve yet to implement. Both methods use the following formula:

((p - a) * u1 * w2 + a * u2 * w1) / ((p - a) * w2 + a * w1)

This formula outputs the interpolated attribute of pixel a. There’s a lot of stuff here, but the formula is simpler than it seems:

  • a is the pixel number of the line on which interpolation is to take place, ranging from 0 to p.
  • p is the total number of pixels on the line.
  • u1 and u2 are the attributes of the boundaries of the line. With color interpolation, these are the colors, and with texture interpolation, these are the texture coordinates.
  • w1 and w2 are the w-values of the boundaries. If you don’t recall, the DS uses four-dimensional matrices to clip polygons from a 3D representation to a 2D image. Vertices keep this fourth dimension, known as the w-axis. The w-values are included in the formula to perform perspective correction.

Using this formula, color interpolation is deceptively simple. The left and right edges of a polygon are interpolated, and then the DS uses the result of the interpolation to interpolate the interior of the polygon. That’s all it takes!
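Reading the formula with (p - a) as the distance remaining to the end of the line, it can be written out directly. This sketch uses floating-point division for clarity; the function name is hypothetical.

```python
def interpolate_attribute(a, p, u1, u2, w1, w2):
    """Perspective-correct attribute at pixel a of a line p pixels long."""
    numerator = (p - a) * u1 * w2 + a * u2 * w1
    denominator = (p - a) * w2 + a * w1
    return numerator / denominator
```

At a = 0 the result is u1 and at a = p it is u2. When w1 equals w2 the formula collapses to plain linear interpolation (15.5 at the midpoint between 0 and 31), while unequal w-values pull the midpoint toward one end (7.75 for w1 = 1 and w2 = 3).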

…Well, no. In reality, as tends to be the case with the DS GPU, things are more complicated. I mentioned earlier how CorgiDS and melonDS have more color precision than other emulators. Vertex colors are 15-bit, meaning that each RGB component ranges from 0 to 0x1F (31), a measly amount considering that modern displays use at least 24-bit color. StapleButter discovered that the DS gets around this limitation by extending color precision to 27 bits (9 per component) during interpolation so as to allow for a wider range of values and then reducing it to 18 bits (6 per component) for the display.

Furthermore, the above formula isn’t entirely correct. While it is what a modern GPU would use, the DS GPU is a lazy bastard and takes shortcuts. The GPU sets u1 and u2 to 0 and 1 respectively, giving the actual formula:

a * w1 / ((p - a) * w2 + a * w1)

This formula gives a “perspective correction factor” that the DS uses to linearly interpolate colors and textures, which as you might guess, loses precision unnecessarily.
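Substituting u1 = 0 and u2 = 1 into the general formula leaves a * w1 in the numerator, and the resulting factor then drives an ordinary linear blend. The sketch below does this in integer arithmetic to show where precision goes; the 8 fractional bits and the function names are assumptions, not the hardware's actual widths.

```python
FACTOR_BITS = 8  # assumption: fractional bits kept in the factor


def correction_factor(a, p, w1, w2):
    """Fixed-point factor in [0, 1 << FACTOR_BITS] for pixel a of p."""
    return ((a * w1) << FACTOR_BITS) // ((p - a) * w2 + a * w1)


def apply_factor(u1, u2, factor):
    """Linearly blend u1 toward u2 using the precomputed factor."""
    return u1 + (((u2 - u1) * factor) >> FACTOR_BITS)
```

For p = 10, a = 5, w1 = 1, and w2 = 3 the factor is 64 (0.25 in this fixed-point format), and blending 0 toward 31 yields 7 where the exact formula would give 7.75.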

Another weird quirk: w-values are normalized to 16-bit precision using shift increments of 4. If a w-value is 12 bits long, for instance, the DS will extend it to 16 bits. However, if the w-value is 20 bits long, it is reduced to 16 bits, greatly reducing precision. Why it couldn’t have done things like normal GPUs is beyond us emudevs… Nevertheless, documenting these quirks (and the many others) is necessary for pixel-perfect accuracy. Admittedly, I’m not aiming for 100% accuracy, but I’d still like to have some standards for accuracy myself.
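One way to picture the normalization is the sketch below, which treats each w-value on its own. That per-value treatment and the function name are assumptions; real hardware may well choose a single shift for a whole polygon.

```python
def normalize_w(w):
    """Bring a positive w-value to 16-bit precision in 4-bit steps."""
    size = 0
    while (w >> size) != 0 and size < 32:
        size += 4  # smallest multiple of 4 bits that can hold w
    if size < 16:
        return w << (16 - size)  # extend short values
    return w >> (size - 16)      # truncate long values, dropping low bits
```

A 12-bit value such as 0xFFF becomes 0xFFF0, while a 20-bit value such as 0x12345 becomes 0x1234 and loses its lowest four bits.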

If you haven’t guessed by now, next up is textures! Color interpolation is nice and all, but commercial games don’t run on vertex colors. With my newfound knowledge of interpolation, textures shouldn’t be a hard task at all.

Many thanks to StapleButter, who helped me understand many of the technical aspects of 3D graphics.

Saving

As the title indicates, I have added basic saving support to CorgiDS. The games that were stuck on save error screens are stuck no more, and now boot to the title screen! Examples include Super Princess Peach, Tetris DS, Final Fantasy IV DS, and more. It’s worth noting that the save support is *really* basic: it just allocates a 64K block of EEPROM and prays that the game doesn’t use flash memory and doesn’t care about the size. Aside from that, it works perfectly, and I haven’t encountered any issues with my library so far.

Not much else to say… I’ve been hunting the bug in Harvest Moon DS that prevents the game from reading from the touchscreen (making it impossible to get past the title screen). Since I haven’t had success with this, I think I’m going to stop worrying about it for now. Getting save support to work has also uncovered some graphical glitches in the games that have started booting that I’ll need to fix eventually. In particular, a lot of games use 3D textures for 2D images such as backgrounds or title logos. It’s something I’ll need to take step by step, because things are becoming too complex to spread my efforts all over the place. Regardless, I’m looking forward to the challenge!

A three-dimensional perspective

After a week of hard work, CorgiDS now renders wireframe polygons, both triangles and quads:

[Screenshot: wireframe]

A whole lot is going on under the hood here. Mainly, I replaced my hackish “run commands instantly” design with a proper GXFIFO implementation. This allows for far more sophisticated programs that can draw a whole lot more than a single primitive. If you’re curious, the GXFIFO is a 256-entry command queue whose sole purpose is to provide a buffer for when the program overwhelms the tiny pipeline the GPU has. The GXFIFO has a lot of interesting properties: for instance, it can request an IRQ when half-empty or empty, and it allows for automatic DMA transfers when the queue becomes half-empty. While programs can directly feed commands and parameters into the GXFIFO using I/O transfers, the aforementioned DMA transfers are the most commonly used method for filling it. Because DMA transfers usually supply commands faster than the GPU can execute them, care must be taken to make sure that the GXFIFO doesn’t overflow. Otherwise, the CPUs and DMA are frozen for up to several seconds until the GPU has enough space for new commands! Of course… certain games don’t take that precaution and blindly fill up the FIFO as they please. CorgiDS currently just executes commands instantly if the GXFIFO overflows, as emulating that “feature” is too costly for the time being.
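A stripped-down model of such a queue might look like the following. The class and method names are hypothetical, "half-empty" is taken to mean fewer than 128 entries, and the overflow shortcut (run the oldest command on the spot to make room) is one possible reading of "executes commands instantly", chosen here because it preserves command order.

```python
from collections import deque


class GxFifo:
    """Simplified geometry command queue (hypothetical model)."""

    CAPACITY = 256

    def __init__(self, execute):
        self.entries = deque()
        self.execute = execute  # callback that runs one GPU command

    def push(self, command):
        if len(self.entries) >= self.CAPACITY:
            # Overflow shortcut: instead of stalling the CPU, run the
            # oldest command immediately to free a slot (assumption).
            self.execute(self.entries.popleft())
        self.entries.append(command)

    def run_one(self):
        """Let the GPU consume one queued command, if any."""
        if self.entries:
            self.execute(self.entries.popleft())

    def is_half_empty(self):
        """Condition that could raise an IRQ or kick off a DMA refill."""
        return len(self.entries) < self.CAPACITY // 2

    def is_empty(self):
        return not self.entries
```

Pushing 257 commands into a fresh queue executes only the first one early and leaves the other 256 waiting in order.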

Obviously, the next step for CorgiDS is to be able to fill in polygons. After that, however, an equally important step is being able to draw polygon strips. Currently CorgiDS can only handle separate polygons, which is great for test homebrew, but games will use strips in order to save time and space on vertex commands. Color interpolation also needs to be added, as well as fixing a lot of the quirks in the renderer. This needs to be handled one step at a time, so it will still be a while before I’m able to run Pokemon and the like. All the fancy stuff in 3D rendering like textures and lighting will come later, I promise. (The technical article I’m also writing has been delayed due to a combination of 3D work and real life work, but that’s also coming eventually.)
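To show why strips save vertex commands, here is a sketch that expands a triangle strip into separate triangles. The alternating swap that keeps winding order consistent follows the usual strip convention, which is assumed here to match the DS; the function name is hypothetical.

```python
def triangles_from_strip(vertices):
    """Expand a triangle strip into a list of independent triangles."""
    triangles = []
    for i in range(len(vertices) - 2):
        a, b, c = vertices[i], vertices[i + 1], vertices[i + 2]
        if i % 2 == 1:
            a, b = b, a  # swap so every triangle winds the same way
        triangles.append((a, b, c))
    return triangles
```

Five strip vertices produce three triangles, which would take nine vertices if each triangle were sent separately.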

To wrap this up, let’s see how CorgiDS does with rendering the Utah Teapot:

[Screenshot: teapot_fail]

Better luck next time…

Road to 3D

After getting rotscale sprites to finally work (which was a pain in the ass), I grew bored of working on the 2D engine and save support. Thus, I decided to venture into the world of 3D!

No fancy pictures yet, unfortunately. While the geometry engine, which handles all of the matrix math and 3D representations, is mostly good enough to start drawing shapes, there’s still a bug somewhere that ultimately causes vertices to not have the right dimensions. In the interest of getting everything correct before rushing out into the great unknown, I have yet to begin work on the rendering engine.

The GPU is actually simpler than I expected it to be. The requirements for developers to get stuff on the screen are as follows. First, one must set up the projection and position matrices, the latter representing the camera. (It’s worth noting that on the GPU, matrices are 4×4, storing the three spatial dimensions as well as an extra W-dimension. Having the matrices be four-dimensional is useful for translation, as the relevant matrix need only be multiplied by a translation matrix; 3×3 matrices would also require an addition operation.) After optionally configuring some other properties, one must then start sending vertex lists. The DS provides four options: separate triangles/quads and triangular/quadrilateral strips. An arbitrary number of vertices can be defined in a list under the condition that they don’t overflow vertex/polygon RAM and don’t incompletely define a polygon, but all polygons defined by said list will share the same properties: alpha-blending, fog, texture, etc. The tiniest change in a polygon’s properties will require a new vertex list to be sent. Finally, once all vertex lists have been sent, one must swap the geometry and rendering engine buffers. This will allow the GPU to start drawing all defined polygons as well as clearing the geometry engine’s buffer to be refilled as needed. The final image is seen as background 0 by the 2D display engine and can mostly be treated as such.
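The point about translation is easy to see in code. The sketch below uses a row-vector convention (vertex times matrix, with the translation in the bottom row) and plain integers instead of fixed-point values; both choices, and the function names, are assumptions for illustration.

```python
def translation(tx, ty, tz):
    """4x4 matrix that moves a point by (tx, ty, tz)."""
    return [
        [1, 0, 0, 0],
        [0, 1, 0, 0],
        [0, 0, 1, 0],
        [tx, ty, tz, 1],
    ]


def mat_mul(a, b):
    """Multiply two 4x4 matrices."""
    return [
        [sum(a[r][k] * b[k][c] for k in range(4)) for c in range(4)]
        for r in range(4)
    ]


def transform(vertex, matrix):
    """Multiply an (x, y, z, w) row vector by a 4x4 matrix."""
    return [sum(vertex[k] * matrix[k][c] for k in range(4)) for c in range(4)]
```

Because the vertex carries w = 1, the bottom row of the matrix is added to its position, so translation becomes just another matrix multiplication: `transform([1, 2, 3, 1], translation(10, 20, 30))` gives `[11, 22, 33, 1]`, and two translations combine through `mat_mul` like any other pair of matrices.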

The progress I have made, despite my lack of visible results, is quite encouraging for me. CorgiDS is getting close to reaching a significant milestone! Still no guarantees can be made for the release date, but there’s not a whole lot of actual features left to implement afterwards. It’s been a rough ride getting this far, but compared to when I first started this blog, I feel as though I’m able to see the light at the end of the tunnel, as far away as it may be. 🙂

Change of plans

The goals in my previous post will have to be delayed for a little bit. While I’m not interested in making this a personal blog, I feel that it is my responsibility to tell my readers about events that will affect the progress of CorgiDS.

Simply put, this week I started a full-time job. I underestimated how much time this eats out of my day, so I’ve been working on adjusting my schedule and doing work-related things. For this reason, I have not had much free time at all, let alone any time to work on the emulator.

Please note that I have no intention of giving up on the project! CorgiDS is still very dear and important to me. I simply need to make some time to organize my life. This shouldn’t take long; at the latest, I’ll resume work on the emulator (and article) next week. I will of course keep you guys updated if anything unexpected comes up. Thank you for understanding.

Conquering the tile engines

As of this post, CorgiDS is close to having a complete 2D engine:

[Screenshots: firmware_success, how_vulgar, harvestmoon]

First things first. The firmware works! The WiFi features are all missing, of course, but you’re able to boot DS games and mess around with all the system settings. The changes you make, however, can’t be saved: the firmware tries to shut off the DS when you exit the pictured menu to save the changes, but I have not implemented the power management features needed. Astute readers will notice that the clock has no hands. The hands are sprites that require rotation/scaling to be implemented, which is something I’m still in the process of doing.

Digimon World Dusk (and presumably Dawn as well) seems to be fully playable now. It goes in-game without any issues, and so far you can control your character and successfully fight in battles. There aren’t any glitches that I’ve spotted, other than battle animations not showing (but those require rotscale sprites).

Harvest Moon DS goes to the title screen… but it doesn’t allow you to select a new game. For some reason, it’s not reading anything from the touchscreen at all. There are also some pretty bad graphical glitches in the opening intro:

[Screenshot: harvestmoon_glitched]

The shadow of the player character looks clownish, and that’s not even getting into the garbage on the bottom of the screen. If you’ve played the game before, you can also notice some issues in the title screen: the logo and copyright text handle transparency incorrectly, causing the former to have a little too much green and the latter to be barely legible. I think I know why this is happening at the very least, so a quick fix won’t be too hard.

No other games that I’ve tested work, as they all require saving support. This makes my upcoming goals pretty clear: after I handle rotscale sprites, I will need to work on a rudimentary form of save support. Nothing too fancy, just a way of making games at least go to the title screen. After that, there’s only a few more issues I need to address for the 2D graphics before I consider it good enough for a first release. It’s mainly simple things like alpha blending and mosaic. I estimate that I’ll have all of that worked out by the end of this week so that I can finally start work on the GPU. Exciting, isn’t it?

Spoiler alert: expect a technical article about the Nintendo DS 2D graphics relatively soonish. 😉