What’s left for v0.2, and the plan for v0.3

v0.2 has been an accuracy-focused update. So far, I’ve added audio, fixed tons of bugs, and improved the graphics across the board.

Here’s an example of what I’m talking about:

[Image: Mario Kart comparison screenshot, v0.1 on the left and v0.2 on the right]

On the left is the v0.1 version, and on the right the v0.2 version. The differences may not seem like a whole lot, but getting to this point has required a lot of work on the 2D engine. First, sprites weren’t showing up at all because of a bug with VRAM accesses. Next, I had to implement window functionality. There are two fixed-size windows as well as an “object window” whose shape is defined by sprites, and each window region can enable or disable backgrounds and sprites independently. Mario Kart disables sprites in the regions outside of the windows, and uses the two fixed-size windows to hold the “1st” and the lap number on the top screen. The top-left corner holds an object window, so sprites are visible only within that black square. This allows the game to perform a slot machine effect without the items showing up outside of the window. I also fixed a nasty display capture bug that caused games that store code in VRAM to freeze.
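To make the window logic a little more concrete, here is a minimal sketch of a per-pixel window check. It is not CorgiDS’s actual code: the struct layout, the mask bits, and the function names are all hypothetical, and it assumes the usual priority order (window 0, then window 1, then the object window, then the outside region).

```cpp
#include <cstdint>

// Hypothetical layer flags: which layers a window region lets through.
constexpr uint8_t LAYER_BGS = 0x0F;    // all four backgrounds
constexpr uint8_t LAYER_OBJ = 1 << 4;  // sprites

// A fixed-size window. x2/y2 are exclusive (assumption for this sketch).
struct Rect { int x1, y1, x2, y2; bool enabled; };

struct WindowState {
    Rect win0, win1;
    bool obj_window_enabled;
    uint8_t win0_mask, win1_mask, obj_mask, outside_mask;
};

inline bool inside(const Rect& r, int x, int y)
{
    return r.enabled && x >= r.x1 && x < r.x2 && y >= r.y1 && y < r.y2;
}

// Returns the set of layers allowed at pixel (x, y).
// obj_window_pixel is true when an object-window sprite covers this pixel.
inline uint8_t layer_mask_at(const WindowState& w, int x, int y,
                             bool obj_window_pixel)
{
    if (inside(w.win0, x, y)) return w.win0_mask;
    if (inside(w.win1, x, y)) return w.win1_mask;
    if (w.obj_window_enabled && obj_window_pixel) return w.obj_mask;
    return w.outside_mask;
}
```

With a setup like Mario Kart’s, the outside mask would leave out LAYER_OBJ, so a sprite pixel is only drawn where one of the windows (or the object window) allows it.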

The 3D engine has also received some love. Fog has been implemented, and I’m in the process of adding dynamic shadows and edge-marking. The Spiky Polygon Syndrome afflicting many games, such as Final Fantasy IV and New Super Mario Bros., has mostly been fixed, aside from a few edge cases. The problem was not vertex-sharing as I initially believed; it was actually a bug with clip matrix reads. The issue lingers in games like Sims Castaway, but I need to do more debugging for that.

Finally, I fixed a bug with DMA transfers that caused many games to not boot. The ARM7 DMAs have a 16-bit length field, and a write of zero is interpreted as the maximum length. The ARM9 DMAs follow the same zero-means-maximum rule but have a 21-bit length field. I wasn’t accounting for the upper 5 bits of the length on the ARM9, yet I was still using the ARM9 maximum length, so games would accidentally overwrite critical memory.
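The length rule can be sketched as follows. This is only an illustration under stated assumptions, not the actual CorgiDS fix: the helper name is hypothetical, and it assumes the length sits in the low bits of a 32-bit control value, using the 16-bit and 21-bit field widths described above.

```cpp
#include <cstdint>

constexpr uint32_t ARM7_LEN_BITS = 16;
constexpr uint32_t ARM9_LEN_BITS = 21;

// Hypothetical helper: decode the transfer length from a DMA control value.
// A raw length of zero means "maximum length" for that CPU.
constexpr uint32_t dma_length(uint32_t control, bool is_arm9)
{
    const uint32_t bits = is_arm9 ? ARM9_LEN_BITS : ARM7_LEN_BITS;
    const uint32_t mask = (1u << bits) - 1;  // keep the full field, including
                                             // the upper 5 bits on the ARM9
    const uint32_t raw = control & mask;
    return raw == 0 ? (1u << bits) : raw;
}
```

For example, dma_length(0x10000, true) stays 0x10000 because the full 21-bit field is non-zero. Masking with only 16 bits would read that value as zero and turn it into a maximum-length transfer.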

Once I finish shadows and edge-marking on the GPU, my remaining plan is just to improve compatibility with broken or glitchy games. Some ideas I have are fixing save problems with the Pokemon games as well as defeating the infamous “no EXP” anti-piracy check. I thought about implementing cache emulation, but I feel that it’s not worth it at this stage as the emulator is still immature. I want CorgiDS v0.2 to be out by the end of this month.

That brings us to my plans for v0.3.

Because v0.2 primarily focused on accuracy and overall compatibility, I want v0.3 to focus on optimization and quality-of-life.

My first goal is to completely high-level emulate (HLE) the NDS BIOS and firmware. This should offer some minor speedups on games that rely on the BIOS, but more importantly, it removes the need to dump the BIOS and firmware in order to play games. HLEing the BIOS means implementing all of the software interrupts that games use, while HLEing the firmware only requires storing predetermined values into memory (like DeSmuME does). The option to provide your own images will still be available for improved accuracy and the ability to boot from the firmware.
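As a rough idea of what HLEing a software interrupt involves, here is a minimal sketch of a dispatcher that handles just the BIOS Div call. The struct and function names are hypothetical. The SWI number (0x09) and the register convention (quotient in r0, remainder in r1, absolute quotient in r3) follow my reading of the public NDS BIOS documentation, and should be treated as assumptions here.

```cpp
#include <cstdint>

// Hypothetical register file for this sketch.
struct CpuRegs { uint32_t r[16]; };

// Returns true if the SWI was handled at a high level, false if the caller
// should fall back to running real BIOS code (or report it as unimplemented).
inline bool hle_swi(CpuRegs& cpu, uint8_t swi_number)
{
    switch (swi_number) {
    case 0x09: {  // Div: r0 = numerator, r1 = denominator (both signed)
        // Widen to 64 bits so INT32_MIN / -1 cannot overflow.
        const int64_t num = static_cast<int32_t>(cpu.r[0]);
        const int64_t den = static_cast<int32_t>(cpu.r[1]);
        if (den == 0)
            return false;  // don't guess at the BIOS behavior in this sketch
        const int64_t quot = num / den;
        const int64_t rem = num % den;
        cpu.r[0] = static_cast<uint32_t>(quot);
        cpu.r[1] = static_cast<uint32_t>(rem);
        cpu.r[3] = static_cast<uint32_t>(quot < 0 ? -quot : quot);
        return true;
    }
    default:
        return false;
    }
}
```

Returning false for anything unhandled keeps the option of falling back to a user-provided BIOS image.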

My second goal, far loftier, is adding a dynarec.

A dynarec (short for “dynamic recompiler”) translates machine code from one CPU architecture to another at runtime. In this case, the CorgiDS dynarec would recompile ARM machine code into x86 machine code. The benefits are twofold: one, the overhead incurred from the interpreter having to re-decode every opcode each time it executes would be eliminated. Two, a dynarec offers opportunities for optimization that wouldn’t be possible with an interpreter model. An intelligent dynarec can take advantage of the architecture of the target machine and produce code tailored towards it, allowing for major gains in speed. Even in the most 3D-intensive games, the CPU code is still a large bottleneck. The dynarec, if designed correctly, would alleviate this, allowing CorgiDS to run on less powerful computers.
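The first benefit can be illustrated without generating any x86 code. The sketch below is not a dynarec: it is a block cache of pre-decoded operations (sometimes called a cached interpreter), with closures standing in for emitted host code. The toy instruction format is not ARM, and every name is hypothetical. The point is only that a block is translated once, keyed by its guest address, and then reused.

```cpp
#include <cstdint>
#include <functional>
#include <unordered_map>
#include <utility>
#include <vector>

struct Cpu { uint32_t r[16] = {}; };

// Toy guest instruction (NOT ARM), just enough to show the idea.
struct ToyInstr {
    enum Kind { MovImm, AddImm } kind;
    uint8_t rd, rn;
    uint32_t imm;
};

using Op = std::function<void(Cpu&)>;  // stand-in for generated host code
using Block = std::vector<Op>;

struct BlockCache {
    std::unordered_map<uint32_t, Block> blocks;  // keyed by guest PC
    int translations = 0;

    // Translate the block at pc on first use; afterwards return the cached copy.
    const Block& lookup(uint32_t pc, const std::vector<ToyInstr>& code)
    {
        auto it = blocks.find(pc);
        if (it != blocks.end())
            return it->second;
        ++translations;
        Block block;
        for (const ToyInstr& in : code) {
            switch (in.kind) {  // the decode work happens once, here
            case ToyInstr::MovImm:
                block.push_back([in](Cpu& c) { c.r[in.rd] = in.imm; });
                break;
            case ToyInstr::AddImm:
                block.push_back([in](Cpu& c) { c.r[in.rd] = c.r[in.rn] + in.imm; });
                break;
            }
        }
        return blocks.emplace(pc, std::move(block)).first->second;
    }
};

inline void run_block(BlockCache& cache, Cpu& cpu, uint32_t pc,
                      const std::vector<ToyInstr>& code)
{
    for (const Op& op : cache.lookup(pc, code))
        op(cpu);
}
```

A real dynarec would replace the closures with native instructions written into executable memory, and would also have to invalidate cached blocks when the guest overwrites its own code.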

So what’s the catch? A dynarec requires a thorough understanding of the assembly language of the target processor. A naive implementation of a dynarec is already more complex than an interpreter, and an optimizing dynarec is far more complex. While slow, the interpreter can at least be debugged more easily. I’ve also never written a dynarec before, so only time will tell how long it takes for v0.3 to come out. Nevertheless, I’m not one for backing down from a difficult challenge.

My final goal, if time permits, is to add GBA functionality to CorgiDS. Because the NDS uses most of the GBA hardware, this would be done by re-using code already in place. I would have to create new scheduler, sound, and graphics code, but the rest shouldn’t be too bad. Re-using the NDS hardware emulation also opens up the possibility of booting from the firmware and loading a GBA game like you would on a real DS. Of course, if the stuff above takes a while, this may have to wait for a future update.

As always, thanks for your support!
