Extended break

Though I had started a PS2 emulator, I had promised you guys that I would still be working on CorgiDS.

That hasn’t happened.

My big mistake was trying to sneak in GBA emulation in the v0.2 release as a surprise. Because the DS and GBA share much of the same hardware, I thought it would be a simple task. It wasn’t, and the more I worked on GBA compatibility, the more stressed I became. I didn’t want to release a half-baked product; it had to be as good as possible, I thought to myself. Combine that with problems at work adding to my stress even more, and I stopped working on CorgiDS. Part of the reason I started DobieStation in the first place was that I needed something else to occupy my time.

Now you might be thinking, “Why not just remove GBA functionality for now and work on it later?” I very well could do that. But, now that DobieStation is growing into something relatively serious, I don’t have the same motivation I once did.

So, CorgiDS is now officially on hiatus. Maybe I’ll revisit it again, but more likely I won’t. If I ever get the desire to work on a DS emulator again, my plan is to create a new one that doesn’t have the messy codebase and amateur mistakes CorgiDS currently has (I’ll still name it CorgiDS, for lulz). Don’t count on it happening any time soon, if at all.

This wasn’t all for nothing, however; I’ve learned a lot about my favorite childhood console and how ugly it is beneath the shiny exterior. I’ve honed my skills not only as an emudev, but also as a programmer in general. In the end, I never really did anything important for the DS emulation scene, but I’ve learned something. That’s what counts for me.

I have high hopes for melonDS. That emulator is going to be the one that changes the DS emulation scene for the better. So if you happen to be concerned about the status of CorgiDS, follow melonDS instead; it’s going places no DS emulator has ever gone before. 🙂

Thank you all for your support!

DobieStation, my PS2 emulator

As you might have guessed from the title, I have started another dog-themed project. This means that, unfortunately, the CorgiDS release will not come this month.

Note that CorgiDS is not dead! I will continue developing it. For now, however, I’m taking a break and “relaxing” by creating a PS2 emulator.

If this interests you, check out the GitHub. Note that the project is extremely pre-alpha at the moment, and while it may compile, it probably won’t run. Nevertheless, I got some graphical output a few days ago (demo is 3 Stars):

Screen Shot 2018-01-28 at 6.16.20 PM

Maybe one day, the dobie will run Shadow of the Colossus. But that won’t be for a while. 🙂

What’s left for v0.2, and the plan for v0.3

v0.2 has been an accuracy-focused update. So far, I’ve added audio, fixed tons of bugs, and overall improved the graphics.

Here’s an example of what I’m talking about:

present1 mariokart_improved

On the left is the v0.1 version, and on the right the v0.2 version. The differences may not seem like a whole lot, but getting to this point has required a lot of work on the 2D engine. First, sprites weren’t showing up at all because of a bug with VRAM accesses. Next, I had to implement window functionality. There are two fixed-sized windows as well as an “object window” that uses sprites, and backgrounds and sprites can be enabled or disabled as you please within windows. Mario Kart disables sprites in the regions outside of the windows, and the two fixed-sized ones are used to hold the “1st” and lap number on the top screen. The top-left corner holds an object window, where sprites are visible only within that black square. This allows the game to perform a slot machine effect without the items showing up outside of the window. I also fixed a nasty display capture bug that caused games that store code in VRAM to freeze.
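The window priority described above can be sketched roughly as follows. This is a minimal illustration, not the CorgiDS code: the `WindowState` struct, the `layers_visible_at` function, and the mask layout (bits 0-3 for backgrounds, bit 4 for sprites) are names and assumptions made up for this example, and wraparound of window coordinates is ignored.

```cpp
#include <cstdint>

// Hypothetical layer mask layout: bits 0-3 = BG0-BG3, bit 4 = sprites (OBJ).
constexpr uint8_t LAYER_OBJ = 1u << 4;
constexpr uint8_t ALL_LAYERS = 0x1F;

struct Rect {
    int x1, y1, x2, y2;  // covers x1 <= x < x2 and y1 <= y < y2
    bool contains(int x, int y) const {
        return x >= x1 && x < x2 && y >= y1 && y < y2;
    }
};

struct WindowState {
    bool win0_enabled, win1_enabled, objwin_enabled;
    Rect win0, win1;
    uint8_t win0_mask, win1_mask, objwin_mask, outside_mask;
};

// Returns which layers may be drawn at (x, y).
// obj_window_pixel is true when an "object window" sprite covers this pixel.
uint8_t layers_visible_at(const WindowState& w, int x, int y, bool obj_window_pixel) {
    // With no window enabled, nothing is masked.
    if (!w.win0_enabled && !w.win1_enabled && !w.objwin_enabled)
        return ALL_LAYERS;
    // Priority order: window 0, then window 1, then the object window.
    if (w.win0_enabled && w.win0.contains(x, y)) return w.win0_mask;
    if (w.win1_enabled && w.win1.contains(x, y)) return w.win1_mask;
    if (w.objwin_enabled && obj_window_pixel) return w.objwin_mask;
    return w.outside_mask;
}
```

In the Mario Kart case, the outside mask would have the sprite bit cleared, so items in the slot machine only show up where the object window covers them.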

The 3D engine has also received some love. Fog has been implemented, and I’m in the process of adding dynamic shadows and edge-marking. The Spiky Polygon Syndrome afflicting many games such as Final Fantasy IV, New Super Mario Bros, and more, has mostly been fixed, aside from a few edge cases. The problem was not vertex-sharing as I initially believed; it was actually a bug with clip matrix reads. The issue lingers in games like Sims Castaway, but I need to do more debugging for that.

Finally, I fixed a bug with DMA transfers that caused many games to not boot. The ARM7 DMAs have a maximum 16-bit length, and a write of zero is interpreted as the maximum length. The ARM9 DMAs also treat zero as the maximum length, but their length field is 21 bits wide. I wasn’t accounting for the upper 5 bits on the ARM9, yet I was still using the ARM9 max length, so games would accidentally overwrite critical memory.
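A small sketch of the length rule, using the bit widths given above. The function name `dma_word_count` and the constants are invented for this example and are not taken from the CorgiDS source.

```cpp
#include <cassert>
#include <cstdint>

// Length field widths as described in this post (assumed, not verified per channel).
constexpr uint32_t ARM7_LEN_BITS = 16;
constexpr uint32_t ARM9_LEN_BITS = 21;

// Converts the raw value written to a DMA length register into a transfer count.
// A masked value of zero means "maximum length", i.e. 2^bits units.
uint32_t dma_word_count(uint32_t raw, uint32_t bits) {
    assert(bits > 0 && bits < 32);
    const uint32_t mask = (1u << bits) - 1;
    const uint32_t len = raw & mask;
    return len == 0 ? (1u << bits) : len;
}
```

The bug described above amounts to masking with 16 bits on the ARM9 while still returning the 21-bit maximum for zero, so short transfers could turn into enormous ones.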

After I get done with shadows and edge-marking on the GPU, my remaining plan is just to improve compatibility with broken or glitchy games. Some ideas I have are fixing save problems with the Pokemon games as well as defeating the infamous “no EXP” anti-piracy. I thought about implementing cache emulation, but I feel that it’s not worth it at this stage as the emulator is still immature. I want CorgiDS v0.2 to be out by the end of this month.

That brings us to my plans for v0.3.

Because v0.2 primarily focused on accuracy and overall compatibility, I want v0.3 to focus on optimization and quality-of-life.

My first goal is to completely high-level emulate (HLE) the NDS BIOS and firmware. This should offer some minor speedups on games that rely on the BIOS, but more importantly, it removes the need to have dumped the BIOS and firmware in order to play games. HLEing the BIOS means implementing all software interrupts that games use, and HLEing the firmware only means storing pre-determined values into memory (like DeSmuME does). The option to provide your own images will still be available for improved accuracy and the ability to boot from the firmware.
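To give an idea of what HLEing a software interrupt looks like, here is a rough sketch of one call. It assumes the NDS BIOS division routine is SWI 0x09, taking the numerator in r0 and the denominator in r1 and returning the quotient in r0, the remainder in r1, and the absolute quotient in r3, as public hardware documentation describes. `ArmRegs` and `hle_swi` are hypothetical names, not the CorgiDS API.

```cpp
#include <cstdint>

struct ArmRegs {
    uint32_t r[16] = {};
};

// Tries to handle a BIOS call in C++ instead of running BIOS code.
// Returns false when the call is not handled, so the caller can fall back
// to a real BIOS image (or report an error).
bool hle_swi(ArmRegs& regs, uint8_t swi_number) {
    switch (swi_number) {
    case 0x09: {  // Div (assumed numbering, see the note above)
        const int64_t num = static_cast<int32_t>(regs.r[0]);
        const int64_t den = static_cast<int32_t>(regs.r[1]);
        if (den == 0)
            return false;  // division by zero is left unhandled in this sketch
        // 64-bit math avoids overflow for INT32_MIN / -1.
        const int64_t quot = num / den;
        const int64_t rem = num % den;
        regs.r[0] = static_cast<uint32_t>(quot);
        regs.r[1] = static_cast<uint32_t>(rem);
        regs.r[3] = static_cast<uint32_t>(quot < 0 ? -quot : quot);
        return true;
    }
    default:
        return false;
    }
}
```

A full HLE BIOS would need one case like this for every software interrupt that games actually call.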

My second goal, far loftier, is adding a dynarec.

A dynarec (short for “dynamic recompiler”) translates machine code from one CPU architecture to another at runtime. In this case, the CorgiDS dynarec would recompile ARM machine code into x86 machine code. The benefits are twofold: one, the overhead incurred from the interpreter having to decode every opcode each time it runs would be eliminated. Two, a dynarec offers opportunities for re-optimization that wouldn’t be possible with an interpreter model. An intelligent dynarec can take advantage of the architecture of the target machine and produce code tailored towards it, allowing for major gains in speed. Even in the most 3D-intensive games, the CPU code is still a large bottleneck. The dynarec, if designed correctly, would alleviate this, allowing CorgiDS to run on less powerful computers.
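The "translate once, run many times" idea can be shown without emitting any x86 at all. The sketch below is a cached interpreter, a simpler relative of a dynarec: each block of guest code is translated into a list of host callables on first visit and reused afterwards. The instruction set is a toy invented for this example (not ARM), and `BlockCache`, `GuestInstr`, and `Cpu` are made-up names.

```cpp
#include <cstdint>
#include <functional>
#include <unordered_map>
#include <vector>

struct Cpu {
    uint32_t regs[4] = {};
    uint32_t pc = 0;  // index into the toy program, not a byte address
};

// Toy guest instructions: "add immediate to register" and "end of block".
enum class Op : uint8_t { AddImm, End };
struct GuestInstr {
    Op op;
    uint8_t reg;
    uint32_t imm;
};

using HostOp = std::function<void(Cpu&)>;
using Block = std::vector<HostOp>;

class BlockCache {
public:
    explicit BlockCache(const std::vector<GuestInstr>& program) : program_(program) {}

    // Runs the block starting at cpu.pc, translating it only on the first visit.
    void run_block(Cpu& cpu) {
        auto it = cache_.find(cpu.pc);
        if (it == cache_.end()) {
            it = cache_.emplace(cpu.pc, translate(cpu.pc)).first;
            ++translations_;
        }
        for (const HostOp& op : it->second)
            op(cpu);
    }

    int translations() const { return translations_; }

private:
    // Decodes guest instructions once and turns them into host callables.
    Block translate(uint32_t start) const {
        Block block;
        uint32_t count = 0;
        for (uint32_t i = start; i < program_.size(); ++i) {
            const GuestInstr instr = program_[i];
            ++count;
            if (instr.op == Op::End)
                break;
            block.push_back([instr](Cpu& cpu) { cpu.regs[instr.reg & 3] += instr.imm; });
        }
        // The last host op moves the program counter past the block.
        block.push_back([start, count](Cpu& cpu) { cpu.pc = start + count; });
        return block;
    }

    std::vector<GuestInstr> program_;
    std::unordered_map<uint32_t, Block> cache_;
    int translations_ = 0;
};
```

A real dynarec would replace the `std::function` list with generated native code and would also have to invalidate cached blocks when the game overwrites its own code.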

So what’s the catch? A dynarec requires a thorough understanding of the assembly language of the target processor. A naive implementation of a dynarec is already more complex than an interpreter, and an optimizing dynarec is far more complex. While slow, the interpreter can at least be debugged more easily. I’ve also never written a dynarec before, so only time will tell how long it takes for v0.3 to come out. Nevertheless, I’m not one for backing down from a difficult challenge.

My final goal, if time permits, is to add GBA functionality to CorgiDS. Because the NDS uses most of the GBA hardware, this would be done by re-using code already in place. I would have to create new scheduler, sound, and graphics code, but everything else won’t be so bad. By re-using the NDS hardware, this offers the possibility of booting from the firmware and loading a GBA game like you would on a real DS. Of course, if the stuff above takes a while, this may have to wait for a future update.

As always, thanks for your support!

Teaching CorgiDS how to bark

Sound can make or break a video game’s success. If you don’t have a memorable melody in the soundtrack, you lose a hook to convince people to buy a sequel. If you don’t have good sound design, you’ll find it more difficult to immerse yourself in the world that’s been created.

If you don’t have sound at all… you’re likely dealing with the initial release of an emulator.

After v0.1, I fixed some small bugs here and there, but I soon decided to attempt (what seemed like) a more complex task: finally adding sound to the emulator. And it works, somewhat. If you want to test it yourself, you can clone the v0.2 branch on GitHub and compile it using qmake.

But how does the DS sound processing unit (SPU) work?

The SPU has sixteen channels and two sound capture units. The channels can all play sound in PCM8, PCM16, and IMA-ADPCM; the first two are raw sound data, and the last one is compressed PCM16, similar to what you’d find in a .wav file. Channels 8-13 can also play PSG (square waves, like on the Game Boy), and channels 14-15 can play white noise. The SPU runs on its own clock at a frequency of ~16 MHz, which is just half the ARM7’s clock rate. To load music into a channel, you supply a memory address and a frequency… and you’re done. The channels are more configurable than that, of course, but it really is that simple. The capture units just take samples from channels 0-3 and put them in RAM, where the CPU can perform fancy sound effects before re-outputting them.
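As a rough illustration of how the frequency relates to the SPU clock: according to public hardware documentation, each channel has a 16-bit timer that counts up to 0x10000 at the SPU clock, and one sample is output per overflow. The sketch below assumes that model and an ARM7 clock of 33,513,982 Hz; the function name is made up for this example.

```cpp
#include <cstdint>

// SPU clock: half of the ARM7 clock (assumed to be 33,513,982 Hz).
constexpr double SPU_CLOCK_HZ = 33513982.0 / 2.0;

// Returns the sample rate in Hz produced by a channel timer reload value.
// The timer counts from timer_reload up to 0x10000, then outputs a sample.
double channel_sample_rate(uint16_t timer_reload) {
    const uint32_t ticks_per_sample = 0x10000u - timer_reload;
    return SPU_CLOCK_HZ / ticks_per_sample;
}
```

For example, a reload value of 0xFE00 leaves 512 ticks per sample, which works out to roughly 32.7 kHz.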

Implementing all of the above wasn’t that bad. I had some difficulties getting ADPCM to work, but all of the issues were resolved in a matter of hours. The hardest part was making things compatible for the Qt frontend. On Windows and Linux, Qt sound automatically shuts itself off when it detects no sound data. Because the SPU would not output sound upon starting a game, well… there wouldn’t be any sound. I resolved this by creating an intermediate buffer that stores old sound data, which is outputted if the SPU doesn’t have anything. It doesn’t sound great, but it gets the job done for now.
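The intermediate buffer idea looks something like the sketch below. It is a simplified stand-in with no Qt involved, and `FallbackAudioBuffer` is a hypothetical name: the frontend always gets the number of samples it asks for, and stale samples are repeated when the SPU has produced nothing new.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>
#include <vector>

class FallbackAudioBuffer {
public:
    // Called by the emulated SPU whenever it produces samples.
    void push(const std::vector<int16_t>& samples) {
        fresh_.insert(fresh_.end(), samples.begin(), samples.end());
    }

    // Called by the audio frontend. Always returns exactly `count` samples.
    std::vector<int16_t> pull(std::size_t count) {
        std::vector<int16_t> out;
        out.reserve(count);
        while (out.size() < count && !fresh_.empty()) {
            out.push_back(fresh_.front());
            fresh_.pop_front();
        }
        // Remember the newest real samples so they can be replayed later.
        if (!out.empty())
            old_.assign(out.begin(), out.end());
        // Underrun: pad with old data (or silence if nothing was ever played).
        for (std::size_t i = 0; out.size() < count; ++i)
            out.push_back(old_.empty() ? int16_t{0} : old_[i % old_.size()]);
        return out;
    }

private:
    std::deque<int16_t> fresh_;
    std::vector<int16_t> old_;
};
```

Repeating old samples is audible as a stutter, which matches the "doesn't sound great" caveat above, but it keeps the output device fed.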

If you’re familiar with the GBA, you’ll see the DS SPU is a vast improvement over it. If not, let’s do a comparison. The GBA has six channels, four of which are just copies of the Game Boy’s sound registers. The other two take PCM8 data, but not automatically. If you want to use both channels, two of the four DMA (Direct Memory Access) units must be reserved for them. Furthermore, you must also reserve two of the four timer units, as the channels do not have their own clock. The GBA CPU is paused during the DMA transfers, so that means you have to make a tradeoff between CPU time and sound quality. The DS suffers from none of these flaws; all you need to do is supply the data, and the SPU works on its own.

The SPU implementation in CorgiDS is incomplete, for the moment. Neither the capture units nor the sound FIFOs are implemented. Sound is synchronous, meaning that if CorgiDS isn’t running full speed, it sounds terrible. Even when running full speed, things sound a bit off. Even so, the sound was good enough that I got a bit sidetracked and played Mario and Luigi: Partners in Time for a couple of hours. 🙂

Linux out for v0.1

EDIT: macOS build available as well!

https://github.com/PSI-Rockin/CorgiDS/releases

For Linux, execute these commands: “sudo chmod u+x CorgiDS” and “./CorgiDS”

Read everything here to set things up.

Tested on Debian. Please report any issues you have on the GitHub issue tracker.

Thank you for your patience!

CorgiDS v0.1 is officially out!

https://github.com/PSI-Rockin/CorgiDS/releases/tag/v0.1

32-bit binaries are available for Windows. I’m working on getting OS X binaries deployed as we speak, but for now, you’ll have to compile things yourself.

Click here to see how to set up CorgiDS. The link gives you information on the BIOS/firmware you need, controls, and save files.

Please report any major bugs on the GitHub issue tracker, such as games not booting or graphical glitches that make a game unplayable. The nature of a v0.1 release means there will be numerous minor bugs that don’t affect playability; please use discretion in reporting those, as too many can distract from the more serious issues. But do test as many games as possible!

Have fun with the corgi, and happy holidays! 🙂

CorgiDS v0.1, looking for help with Windows/Linux builds

The time has come! CorgiDS is now ready to venture into public scrutiny. This means I have released the source code as well (but no builds yet; read below for more details).

present1 present2 present3 present4 present5 present6

Here’s some of what CorgiDS v0.1 has to offer:

A mostly-complete 2D engine
Software 3D rendering. Missing many features but works okay for many games
Ability to toggle frameskip and framelimiter on/off
Booting games directly or from the firmware
Reading from the AKAIO save database

An OS X build is ready to go. Unfortunately, I don’t have access to either a Windows or Linux machine. Therefore, I am seeking assistance with providing builds for both (or at the very least, Windows). If you have Qt installed and are willing to assist the project, please contact me through GitHub (linked below). I will help with any compilation errors, although it shouldn’t be very hard to get things working.

I will withhold the OS X build for now until I have something for Windows, so the actual release may not happen for another couple of days. In the meantime, here’s the project GitHub!

Feature chill

Not a whole lot remains until the release for CorgiDS. In fact, I plan to release it relatively soon for the curious.

I’ve stopped adding on to the emulator itself, and instead, I’ve been focusing on optimization. I’ve separated the GUI and emulation into separate threads, which has produced a massive speedup. Every 2D game now goes well above 60 FPS, and 3D games now reach a playable state. I don’t have a good way to actually measure FPS yet as the emulation thread has no frame limiter, but that’s on the list of tasks I’m performing. Some code cleanup has also helped with speed, but nothing drastic.

Unfortunately, the code cleanup has given birth to another stupid bug. Tetris DS is overflowing the position matrix stack on the GPU when it wasn’t doing that before. I can’t just ignore it because it produces a hideous flicker effect. I think the problem lies in the CPU, but I’m still hunting for it. That needs to get fixed as I don’t know if any other games are affected.

What else? I’ve added a framework for HLE BIOS emulation: in the future, this means you won’t have to dump the BIOS and firmware from a DS in order to play games on CorgiDS. While it has a few functions already implemented, I’m keeping it turned off for v0.1 so that it can’t be misused. I’m also trying to figure out a way to handle fatal errors in the course of emulation without having to crash the program… it’s difficult to figure out something that keeps the emulator core and frontend decoupled. I still need to add a window for configuring the save size of a game as well. Last but not least, I need to upload the source code to GitHub and find someone to help create builds outside of OS X.

It’s not quite a feature freeze, but things are cooling down for the holiday season. 🙂

Holiday plans

Happy December to all of you reading this!

At the moment, I’m taking a break from CorgiDS… by working on another emulator project called DaneBoy (named after the Great Dane). Don’t fear though; I’m not doing this merely for diversion, but rather, for practice.

The UI for CorgiDS is written using Qt. While Qt makes C++ UI development easy, it does so by making itself extremely bloated. One effect of this is that Qt does a lot of work on the main thread, which is not good for emulators and any other projects that require lots of processing time. Since I’d rather not switch UI libraries, the solution is, of course, moving emulation over to a separate thread. Assuming negligible overhead from thread synchronization once a frame, this would boost FPS by 20-100%, depending on the game.
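A once-per-frame handoff between the two threads could look roughly like this. It is a minimal sketch using only the C++ standard library rather than Qt’s threading classes, and `FrameMailbox` is a name invented for this example: the emulation thread publishes a finished frame, and the GUI thread picks it up whenever it is ready to repaint.

```cpp
#include <cstdint>
#include <mutex>
#include <utility>
#include <vector>

using Frame = std::vector<uint32_t>;  // one pixel per entry

class FrameMailbox {
public:
    // Emulation thread: hand over a completed frame (overwrites an unread one).
    void publish(Frame frame) {
        std::lock_guard<std::mutex> lock(mutex_);
        latest_ = std::move(frame);
        has_new_ = true;
    }

    // GUI thread: fetch the newest frame if one arrived since the last call.
    bool try_take(Frame& out) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (!has_new_)
            return false;
        out.swap(latest_);
        has_new_ = false;
        return true;
    }

private:
    std::mutex mutex_;
    Frame latest_;
    bool has_new_ = false;
};
```

The lock is held only long enough to move or swap a buffer, so the two threads touch shared state once per frame and otherwise run independently.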

The catch? I’m not well-versed in multithreading, so I’m afraid to touch the CorgiDS codebase and risk introducing nasty bugs due to my lack of knowledge. Instead, DaneBoy will bear the brunt of my learning experiences. I’ll incorporate what I’ve learned into CorgiDS once I believe I’m not going to deadlock everything.

Before I started DaneBoy, I also created a little program that automatically generates a jump table for ARMv5 instructions. Unlike the Z80-like CPU in the Game Boy, ARM CPUs aren’t easily decoded using a giant switch block. The two options are either using an ugly mess of if-statements or creating a jump table with 4096 separate elements. CorgiDS used the former for a while, but I finally got around to rewriting that behemoth of the codebase a couple of days ago. I don’t know of any similar decoding programs, but if there aren’t any, I’ll release mine along with CorgiDS. This is so that anyone else developing an emulator that uses an ARM CPU can benefit from this.
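The 4096 figure comes from the twelve opcode bits that distinguish ARM instructions: bits 27-20 and bits 7-4. The sketch below shows the index calculation and a tiny table builder. It is not the generator mentioned above; it only classifies two instruction groups, and the names `decode_index`, `InstrClass`, and `build_decode_table` are invented for this example.

```cpp
#include <array>
#include <cstdint>

enum class InstrClass : uint8_t { Branch, SoftwareInterrupt, Other };

// Packs opcode bits 27-20 and 7-4 into a 12-bit table index.
constexpr uint32_t decode_index(uint32_t opcode) {
    return ((opcode >> 16) & 0xFF0u) | ((opcode >> 4) & 0xFu);
}

// Fills all 4096 entries once at startup. A real emulator would store
// handler function pointers here instead of a class tag.
std::array<InstrClass, 4096> build_decode_table() {
    std::array<InstrClass, 4096> table{};
    for (uint32_t i = 0; i < 4096; ++i) {
        const uint32_t bits27_20 = i >> 4;
        if ((bits27_20 >> 5) == 0x5)        // bits 27-25 == 101: B / BL
            table[i] = InstrClass::Branch;
        else if ((bits27_20 >> 4) == 0xF)   // bits 27-24 == 1111: SWI
            table[i] = InstrClass::SoftwareInterrupt;
        else
            table[i] = InstrClass::Other;
    }
    return table;
}
```

Decoding an instruction then becomes a single array lookup, `table[decode_index(opcode)]`, instead of a chain of if-statements.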

I don’t plan on doing anything noteworthy with DaneBoy, nor do I even want to release it. This is just a practice project; CorgiDS is what I’m really concerned about. Either way, it’s actually pretty fun to work on a much simpler system than the DS. 🙂

Ignorance, or Why You Should Research Whatever You’re Emulating Before You Start Coding

Apologies for the radio silence over the past couple of weeks. To summarize, I’ve run into a nasty problem that will require me to rewrite a significant portion of the 3D code in CorgiDS, and I’m trying to figure out how I want to do that.

When I started writing code for the DS GPU, I had no clue at all what I was doing. I had never worked with a 3D graphics library like OpenGL before, yet here I was attempting to create a software renderer in an emulator. After proverbially bashing my head against the keyboard over the course of several weeks, I finally started getting tangible results: getting polygons on the screen at all was cause for celebration. I gradually added more features such as textures, alpha blending, and lighting over time. Finally, after many, many weeks of work, CorgiDS almost meets my standards for a software renderer.

Almost.

Take a look at this image from Final Fantasy IV in CorgiDS:

final_fantasy_clipping

There are quite a few issues in this image, but let’s focus on one in particular: half of Cecil’s body on the left is entirely missing. If you look carefully, some polygons on the soldiers’ lower bodies are also missing.

What’s happening? I can’t conclusively say what the problem is, but I’ve managed to narrow it down to something going wrong with the clipping code. Clipping is how GPUs deal with polygons that extend beyond at least one of six planes of the viewing frustum: left, right, top, bottom, near, and far. The GPU “clips” the vertex outside of the frustum and creates two new vertices that lie directly on the plane.
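Here is a minimal sketch of clipping one polygon against the far plane, using the textbook Sutherland-Hodgman approach. It is not the CorgiDS or melonDS code: it uses floats for readability where the DS uses fixed-point, it assumes a vertex is inside the far plane when z <= w, and all names are invented for this example.

```cpp
#include <cstddef>
#include <vector>

struct Vertex {
    float x, y, z, w;  // clip-space coordinates
};

// Signed distance to the far plane: non-negative means inside (z <= w).
static float far_distance(const Vertex& v) {
    return v.w - v.z;
}

// Linear interpolation between two vertices, t in [0, 1].
static Vertex interpolate(const Vertex& a, const Vertex& b, float t) {
    return {a.x + t * (b.x - a.x), a.y + t * (b.y - a.y),
            a.z + t * (b.z - a.z), a.w + t * (b.w - a.w)};
}

// Clips a convex polygon against the far plane.
// Each edge that crosses the plane contributes one new vertex on the plane.
std::vector<Vertex> clip_against_far_plane(const std::vector<Vertex>& poly) {
    std::vector<Vertex> out;
    const std::size_t n = poly.size();
    for (std::size_t i = 0; i < n; ++i) {
        const Vertex& cur = poly[i];
        const Vertex& next = poly[(i + 1) % n];
        const float dc = far_distance(cur);
        const float dn = far_distance(next);
        if (dc >= 0.0f)
            out.push_back(cur);  // keep vertices that are inside
        if ((dc >= 0.0f) != (dn >= 0.0f))
            out.push_back(interpolate(cur, next, dc / (dc - dn)));
    }
    return out;
}
```

A triangle with one vertex past the far plane comes out as a four-sided polygon: the outside vertex is dropped and replaced by two vertices on the plane. A full clipper repeats this for each of the six planes.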

The polygons are missing because one of their vertices is getting clipped on the far plane (the direction away from the camera), and the game is set to not render any polygons intersecting the far plane. I don’t exactly know why this is happening, but I have two thoughts:

A precision error causes vertices that should not be clipped to become clipped.
Vertices are being clipped multiple times.

I ruled out the first one, as there doesn’t seem to be anything wrong with my matrix multiplication code. Furthermore, the DS uses fixed-point arithmetic rather than floating-point, so a precision error is far less likely. That brings us to the second thought.
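For readers unfamiliar with fixed-point math, here is a small sketch of how a multiply works. It assumes the format with 12 fractional bits that public documentation gives for DS matrix values; the helper names are made up for this example.

```cpp
#include <cstdint>

constexpr int FRAC_BITS = 12;  // 1.0 is stored as 4096

// Converts a real number to fixed-point (truncating toward zero).
constexpr int32_t to_fixed(double value) {
    return static_cast<int32_t>(value * (1 << FRAC_BITS));
}

// Multiplies two fixed-point numbers. The product carries 24 fractional
// bits, so it is widened to 64 bits and shifted back down to 12.
// Note: right-shifting a negative value is only guaranteed to be an
// arithmetic shift from C++20 on, although common compilers already do so.
constexpr int32_t fixed_mul(int32_t a, int32_t b) {
    return static_cast<int32_t>((static_cast<int64_t>(a) * b) >> FRAC_BITS);
}
```

Every operation is plain integer math, so the results are exactly reproducible from one machine to the next.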

Because of my unfamiliarity with 3D graphics, I have used melonDS’s software renderer as a reference for creating my own. Out of a desire to learn things on my own and not outright copy someone else’s work, I have added parts bit-by-bit to my code. This organic process has led to CorgiDS’s software renderer being the messiest part of the codebase, founded upon faulty assumptions and unclear ideas. While it all remains solvable, there is one fundamental issue: CorgiDS does not re-use vertices in polygon strips.

The code for polygons so far looks like this:

Screen Shot 2017-11-26 at 1.13.16 PM

Note the “vert_index” variable, which points to the first vertex used. This code makes the assumption that vertices remain contiguous within RAM. While true for 90% of cases, this completely falls apart when polygon strips are involved. The DS GPU can allow two polygons to re-use the same vertices under special conditions, meaning that vertex lists are no longer contiguous. melonDS indicates that re-used vertices don’t get clipped again, but there are other rules that I don’t quite understand…
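One way around the contiguity assumption is for each polygon to store the indices of its vertices rather than a single starting index. The sketch below shows this for a generic triangle strip; it follows the usual strip convention (every new vertex forms a triangle with the previous two, with alternating winding), which may not match the DS’s exact rules, and the names are invented for this example.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Triangle {
    std::array<uint16_t, 3> vert_indices;  // indices into a shared vertex list
};

// Builds the triangles of a strip over vertices 0..vertex_count-1.
// Triangle i re-uses the last two vertices of triangle i-1.
std::vector<Triangle> build_triangle_strip(std::size_t vertex_count) {
    std::vector<Triangle> tris;
    for (std::size_t i = 0; i + 2 < vertex_count; ++i) {
        const uint16_t a = static_cast<uint16_t>(i);
        const uint16_t b = static_cast<uint16_t>(i + 1);
        const uint16_t c = static_cast<uint16_t>(i + 2);
        Triangle t{};
        if (i & 1)
            t.vert_indices = {b, a, c};  // swap to keep the winding consistent
        else
            t.vert_indices = {a, b, c};
        tris.push_back(t);
    }
    return tris;
}
```

With indices, a shared vertex exists only once in the vertex list, so it can be transformed and clipped once and then referenced by every polygon that uses it.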

Anyway, if this truly is the problem (and I don’t know what else could be), then a large portion of the renderer will need to be rewritten, a task that has little appeal to me. I might just ignore this problem entirely for the first release and focus my efforts elsewhere… not a whole lot of games are affected by this. Decisions, decisions…