Blog

The Force Engine development news and updates.

TFE Update and 2021 Plans

As the first blog post of 2021, this will cover several topics. First, an update regarding the progress of The Force Engine. That will be followed by TFE plans for 2021 - the year of the first fully playable release (hopefully), and finally the first “experimental” build and what that is all about.

Progress

The main focus has been reverse-engineering and integrating the original INF-system to replace the code currently available in TFE, which - despite being mostly functional for the original levels - was basically placeholder until now.

The reverse-engineering work is still ongoing and will most likely continue for a few more weeks. My goal is to finish the INF reverse-engineering and integration by the end of February, at which point proper testing of the Classic Renderer and INF can begin across the original levels and Mods. This will form the foundation for the rest of the year as we speed towards Dark Forces completion (though TFE will still have a ways to go after the first full release, with Outlaws, mod tools, and various improvements and enhancements).

Inf Debugger

In order to visualize and test the INF system at work I have also started the implementation of the INF debugger.

InfDebugger

At any point while running a level you will be able to hit Alt+F10 to bring up the INF debugger. The right window contains a list of all the INF items in the level. You will be able to select whether to show all items, active items and other options. In addition you will be able to click on the Name, Class, or Sub-Class buttons to sort by those categories. Clicking on any of the items will select it, and it will be visible in the Inspector window.

The Inspector window, on the bottom left, allows you to see the current state - such as the current stop, speed, key, flags and more. The values of most of these items are editable in real-time, in order to facilitate testing. Not all values will be shown (there are many values) - so you will be able to add and remove variables from the view, though some will likely always be visible. In the breakpoints tab, breakpoints can be set on various events such as reaching a stop, a variable changing, the item being activated, etc.

This is not shown yet but you will be able to pause the game, needed when breakpoints are hit, and single-step through INF commands. Finally, as the INF system operates, messages are sent to the Output window, on the bottom right, so you can see what is happening. There will be some sort of filtering available, though this has not been implemented yet. If you click on a message in the Output window, it will select the INF item so you can inspect it.

This is still a work in progress and will get fleshed out as I work on the INF system. This is great for me to debug the system but should also be useful to level authors and modders trying to figure out why their INF is not working as intended, or just to check correctness.

2021 Plans

The goal for The Force Engine project is to release version 1.0 of TFE this year, which means that TFE will be a full replacement for DosBox when playing Dark Forces. At that point the tools will still be mostly in a preview state, though the level editor should be usable, if not feature-complete.

INF & Classic Renderer

The next main test release will be the finished INF & Classic Renderer release - where levels should look correct compared to DOS, including Mods, and the level functionality - except where it relies on Logics or AI - will be fully accurate. The goal is to have that release done around the end of February.

Cross Platform Release

The next main test release will be focused on setting up a CMake build system and making sure TFE builds and runs on Linux and OS X platforms. The GitHub projects list marks “Metal” support as required, though I may put this off until after the first release.

Player Control & Physics

The next test release will see Player Control, Physics, Collision detection reverse-engineered and integrated. Some of this work has already been completed. The goal of this release is to make sure Dark Forces plays and feels authentic.

Logics & AI

Finally, this release will get some real gameplay finished. Logics, such as pickups, and AI should be fully functional after this release, including bosses. The Logics will be scriptable and most likely written as scripts (as the placeholders are now).

Weapons

In this build, weapons will be finished and fully working. This means using ammo, updating the HUD, proper effects in the levels, proper damage and damage falloff. This test release will also cover auto-aim and System UI options to control how much auto-aim you want while playing (from “Dark Forces” to “None”).

Dark Forces UI

In this test release the Dark Forces UI will be finished, including the PDA and cutscenes.

Beta 1.0 Release

Finally the TFE 1.0 Beta release.

Post-Release

Once the initial release is complete and the important bugs worked out, the focus will shift to tools - including the initial support for Outlaws levels in the level editor. During this phase features that were added to the Jedi Engine for Outlaws will be ported over - such as slopes and dual adjoins - and made available for Dark Forces mods. Then the focus will shift again to Outlaws. This is about as far ahead as I want to plan, so there will be more updates on Outlaws and tools as we progress through the year and beyond.

Experimental Builds

The last topic will be experimental builds. I have a variety of possible features and enhancements in mind for The Force Engine, many of which are rendering related. The idea behind experimental builds (and the associated experimental branch) is to be able to take a break from the current work, such as INF reverse-engineering, to implement one of these experimental features and then release an Experimental build for people to play with.

Experimental Features may not make it into master, depending on the results, performance and reactions - though I suspect most will. Experimental Features won’t be “production ready” - meaning they may have bugs, low performance or missing features. It is meant as proof of concept, fun breaks from the main work. The features that make the cut will eventually be moved into master and finished, when the engine is ready for the feature.

The first experimental feature is . . .

Voxels

I couldn’t resist implementing voxel support as the first experimental feature. As mentioned above, it isn’t ready for production - and won’t be for some time, definitely not until after the Classic Renderer / INF test release is out, if not longer - but is a nice proof of concept that you can try out yourself. For this Experimental Build, I embedded Dzierzan’s voxel pack (https://github.com/Dzierzan/Dark-Forces-Voxel-Pack) and a patch to Secbase to show some of the voxels.

How to Get the Build

You can get the current experimental build from Downloads on The Force Engine site. Scroll to the bottom, under experimental builds.

How to Run

The setup is the same as normal: unzip the build into a new directory and then run TheForceEngine.exe. You will need to set up your Dark Forces data if it is not auto-detected. If you have run TFE before, it should just work. Next, start up the first level, Secbase. Make sure to either run in widescreen or at higher than 320x200 - otherwise the voxels won’t render.

Once in Secbase, open the console (~ on US keyboards) and type EnableExperiment. Hit the same key again to raise the console.

What is implemented

  • Conversion between VOX and the internal format. Proper conversion and saving of the TFE format will happen once the feature is moved to master.
  • Rendering of opaque voxel objects.
  • Basic console commands to place voxels and to save and load placements.
  • Some debugging console commands.

What is missing (things that need to be finished once I get back to the feature to make it “real”)?

As you can see from the list, this is definitely still in the experimental/not production ready phase.

  • Proper conversion into the TFE format.
  • Animation support.
  • LOD support.
  • Translucency support (obviously this needs this feature to be added to the engine first).
  • Editor to import frames, set scale values, etc.
  • Support for arbitrary pitch and roll, needed to support software voxels in the perspective renderer.
  • Better handling of palettes (MagicaVoxel mangles the order - meaning the colors have to be remapped).
  • Performance improvements (the technique is not fully optimized and may be slow on some machines).
  • Quality improvements, the texturing/edge quality leaves something to be desired at lower resolutions due to incorrect sub-pixel precision.
  • A few minor bugs, such as sorting breaking if standing inside or over the top of the object.

There are several console commands available to test things out:

raddVoxel

raddVoxel filename base offset upVector Add a new voxel object. This pulls the object from Mods/voxels.zip. If you want to add or change voxels, edit the zip file. There are several options:

  • filename: the path to the voxel, starting from the zip file. For example, decorations/table.vox
  • base: ceil or floor, default floor. This determines if the voxel is floor or ceiling aligned. Use ceiling alignment for things like hanging lights and chains.
  • offset: offset in units from base, negative values go up.
  • upVector: y or z, default y. This is the up vector of the original asset. For some reason a few of the voxel models use Z up instead of Y, such as table.vox. In those cases specify Z to get the correct look.

Examples

  • raddVoxel decorations/table.vox z - the voxel is decorations/table.vox, the up vector is Z.
  • raddVoxel enemies/intdroid.vox -4 - the Interrogation Droid, moved 4 units from the floor.
  • raddVoxel decorations/redlit_a.vox ceil - the redlight voxel aligned to the ceiling.

rsaveVoxels

rsaveVoxels filename Saves a level patch with the voxels you have setup. It will be saved into Mods/filename.

rloadVoxels

rloadVoxels filename Loads a previously saved level patch, assuming you already typed EnableExperiment.

rclearVoxels

This command takes no arguments; it simply clears all voxels from the level.

rvoxel_showDrawOrder

rvoxel_showDrawOrder enable - animates the draw order of the voxels if enabled.

  • rvoxel_showDrawOrder 1 - enables the debug animation.
  • rvoxel_showDrawOrder 0 - disables the debug animation.

rvoxel_step

rvoxel_step - if rvoxel_showDrawOrder is enabled, this stops the animation and steps forward 1 column. You can specify the number of columns to step.

rvoxel_continue

rvoxel_continue - this resumes the draw order animation.

Screenshots

Voxels1 Voxels2 Voxels3 Voxels4 Voxels5

Implementation Details

I implemented the voxel rendering code from scratch in order to fulfill some requirements, which I think all new features and improvements in TFE should share:

  • Decent performance in the software renderer - while this feature can certainly be improved in this regard, in general it performs well, consuming 1-3 ms per frame at 1080p. This is in line with other features, such as 3D objects.
  • Follows engine patterns - the feature should follow engine patterns, such as favoring column rendering and should mesh well with the engine.
  • Proper sorting - continuing with the integration, the feature should sort properly with sector geometry and other objects. This voxel renderer uses the 1d zbuffer to sort with walls, fits into the same sorting system as other objects, culls and clips against the view window (needed for adjoins), etc.
  • Proper “look” - meaning the results should be consistent with the rest of the game and feel like it belongs.

I will write up more details about the algorithm in the future, but here is the quick overview:

  • Data is organized and rendered as voxel columns, compressed using the same RLE algorithm as WAXes, modified to fit the data.
  • RLE data basically splits voxel columns into transparent runs which are skipped and opaque runs that form “sub-columns.”
  • Voxel-column rendering order is determined algorithmically based on the view vector, meaning no sorting is required for back to front rendering.
  • Sub-columns may have vertical caps rendered depending on the camera position, but at most one cap per sub-column will be rendered.
  • Each voxel has a 4-bit adjacency mask, which is used to avoid rendering faces shared by multiple voxels.
  • Sub-columns are further sub-divided based on the adjacency mask and the final contiguous voxel columns are rendered as textured lines using a column renderer very similar to sector walls. No per-pixel transparency testing is required.
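As a rough illustration of the sub-column idea in the list above, the following Python sketch splits a single voxel column into its opaque runs. The function name, the list-based column layout and the use of 0 as the "empty" value are assumptions made for this example; it does not reproduce the WAX-derived RLE format that TFE actually stores.

```python
def split_into_sub_columns(column, empty=0):
    """Return (start, colors) pairs for each opaque run in a voxel column.

    `column` is a list of palette indices from bottom to top; `empty`
    marks a transparent voxel (an assumption made for this sketch).
    """
    runs = []
    start = None
    for i, value in enumerate(column):
        if value != empty and start is None:
            start = i                              # an opaque run begins here
        elif value == empty and start is not None:
            runs.append((start, column[start:i]))  # the run ended at i - 1
            start = None
    if start is not None:                          # run reaches the top
        runs.append((start, column[start:]))
    return runs
```

Transparent runs never appear in the result, which mirrors the "skipped" runs described above; each returned pair corresponds to one sub-column.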

Interior voxels are removed, interior faces are skipped using the column sub-division method mentioned above and voxels are rendered as voxel-columns instead of as individual voxels.
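One possible way to picture the adjacency mask and the interior-voxel removal is sketched below in Python. The bit assignment, the set-of-tuples voxel storage and the choice of Y as the column axis are all assumptions for illustration only, not the engine's data layout.

```python
# Bit assigned to each horizontal neighbor (this ordering is an assumption).
NEIGHBORS = (((-1, 0), 1), ((1, 0), 2), ((0, -1), 4), ((0, 1), 8))

def adjacency_mask(solid, x, y, z):
    """4-bit mask of occupied horizontal neighbors of voxel (x, y, z).

    `solid` is a set of (x, y, z) tuples; y is treated as the column
    (up) axis, so only x and z neighbors contribute to the mask.
    """
    mask = 0
    for (dx, dz), bit in NEIGHBORS:
        if (x + dx, y, z + dz) in solid:
            mask |= bit
    return mask

def is_interior(solid, x, y, z):
    """A voxel is interior when all six face neighbors are occupied."""
    return (adjacency_mask(solid, x, y, z) == 0b1111
            and (x, y - 1, z) in solid
            and (x, y + 1, z) in solid)
```

A set bit means the face on that side is shared with another voxel and does not need to be drawn; a voxel with all four bits set and neighbors above and below can be dropped entirely.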

Next Experiment?

I have ideas for future experiments but they will have to wait; the focus has returned to the INF reverse-engineering, integration, and the debugger.

TFE Classic Renderer Complete

The 3D Object rendering system, derived from the original DOS code, is now working in both 320x200 (fixed-point sub-renderer) and at higher resolutions and in widescreen (floating-point sub-renderer). This completes the Classic Renderer reverse-engineering and integration project, allowing me to move on to reverse-engineering the INF system.

3D Object Renderer

The 3D object renderer in Jedi is basically an entire, mostly self-contained rendering system that integrates fairly well with the sector system. Models can either be drawn as points, though TFE turns these into quads at higher resolutions in order to maintain the same screen coverage, or as polygons. These polygons can either be triangles or quads, though there is little difference between the two forms in practice.

Each polygon can be shaded with one of five (5) variants: Color vertex shaded (“GOURAUD”), Color flat shaded (“FLAT”), Textured and vertex shaded (“GOURTEX”), Textured and flat shaded (“TEXTURE”), and finally textured as a floor or ceiling (“PLANE”). In the original executable, there are four (4) different clipping function variants and five (5) different polygon draw variants resulting in thousands of lines of C-code. In TFE, macros are abused in a method most likely similar to the original code to reduce these down to a master clipping routine, which is then instantiated four times using defines and two polygon rendering routines - one to handle the “PLANE” case and one to handle every other case.

For TFE integration, I have split the original rendering code into several components to make following and maintaining the code easier. I will go through each component below.

Transform And Lighting

3D objects in Jedi have a 3x3 rotation matrix and a 3D position. The main renderer generates a 3x3 camera matrix for use by the 3D object renderer. The object position is transformed into viewspace and the object/camera matrices are concatenated to build a single local to viewspace rotation matrix. The final matrix (“transform matrix”) and viewspace offset (“offset”) are used from here on in all transformations.
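The concatenation step can be sketched in a few lines of Python. This assumes row-major 3x3 matrices applied to column vectors and a camera position that is subtracted before rotating; the function names are hypothetical and the conventions may differ from the engine's.

```python
def mat_mul(a, b):
    """Multiply two 3x3 matrices stored as row-major nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def mat_vec(m, v):
    """Multiply a 3x3 matrix by a 3-component column vector."""
    return [sum(m[i][k] * v[k] for k in range(3)) for i in range(3)]

def build_transform(camera_mtx, camera_pos, object_mtx, object_pos):
    """Return (transform_matrix, offset) taking model space to view space."""
    # Object position relative to the camera, rotated into view space.
    rel = [object_pos[i] - camera_pos[i] for i in range(3)]
    offset = mat_vec(camera_mtx, rel)
    # One concatenated local-to-view rotation.
    transform = mat_mul(camera_mtx, object_mtx)
    return transform, offset

def to_view_space(transform, offset, vertex):
    """Transform a model-space vertex into view space."""
    rotated = mat_vec(transform, vertex)
    return [rotated[i] + offset[i] for i in range(3)]
```

Building the matrix and offset once per object means each vertex costs a single matrix multiply and an add, rather than two rotations.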

Next, all of the modelspace positions are transformed into viewspace. It is at this point the transform and lighting stage ends if the “MFLAG_DRAW_VERTICES” flag is set, which causes the object to be rendered as points. Otherwise, the model has a list of polygon normals that were precalculated on load and these are now transformed into viewspace. These normals will be used later for backface culling.

Finally, if vertex shading is required - that is any polygon uses “GOURAUD” or “GOURTEX” for its shading - the vertex normals, also computed on load, are transformed into viewspace. Once that is done, per-vertex lighting is applied.

Lighting

The Jedi Engine supports multiple directional lights, each with its own direction and brightness values. For Dark Forces this is set up as 3 directional lights, each in a cardinal direction (X, Y, Z). The lighting contribution for each light is summed up for the vertex and then the sum is multiplied by the “sector ambient fraction” - which is the fraction of sector ambient compared to the maximum. The maximum ambient light level for any sector is 31 (0x1f), so a sector with an ambient of 22 would have a value of approximately 0.71 for its “sector ambient fraction.”
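A small Python sketch of the directional part of the lighting follows. The sum-then-scale structure and the maximum ambient of 31 come from the description above; the clamped dot product used for each light's contribution is an assumption of this example, since the exact per-light formula is not given here.

```python
MAX_AMBIENT = 31  # maximum sector ambient (0x1f), from the text above

def sector_ambient_fraction(sector_ambient):
    """Fraction of the sector ambient compared to the maximum."""
    return sector_ambient / MAX_AMBIENT

def vertex_directional_light(normal, lights, sector_ambient):
    """Sum directional contributions, then scale by the ambient fraction.

    `lights` is a list of (direction, brightness) pairs. The clamped dot
    product used for each contribution is an assumption of this sketch.
    """
    total = 0.0
    for direction, brightness in lights:
        n_dot_l = sum(n * d for n, d in zip(normal, direction))
        total += max(n_dot_l, 0.0) * brightness
    return total * sector_ambient_fraction(sector_ambient)
```

For the example in the text, `sector_ambient_fraction(22)` is 22 / 31, or roughly 0.71.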

The next step is to apply lighting from the camera light source, such as the headlamp (by default this is turned off). This uses a special light source ramp that is embedded in the level’s color map. This is a 128 entry table, indexed by the current depth: depthIndex = min( floor(z * 4), 127 ); the table itself is inverted. The final value of the camera light for the current depth is: MAX_LIGHT_LEVEL - (s_lightSourceRamp[depthIndex] + s_worldAmbient). Normally this is done per-column or scanline, but in the case of vertex-lighting, it is done per-vertex.
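The camera light lookup can be written out directly from the two formulas above. In this Python sketch the ramp contents are supplied by the caller, `MAX_LIGHT_LEVEL` is assumed to be 31 to match the maximum ambient level, and `z` is assumed to be non-negative.

```python
import math

MAX_LIGHT_LEVEL = 31  # assumed to match the maximum ambient of 31

def camera_light(z, light_source_ramp, world_ambient):
    """Camera light level at view-space depth z, following the formula above.

    `light_source_ramp` must have 128 entries; `z` is assumed to be >= 0.
    """
    depth_index = min(math.floor(z * 4), 127)
    return MAX_LIGHT_LEVEL - (light_source_ramp[depth_index] + world_ambient)
```

Because the index is `floor(z * 4)`, the ramp covers depths from 0 up to just under 32 units in quarter-unit steps, and anything farther away uses the last entry.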

The final step is to apply a falloff value based on distance, similar to the distance-based darkening in Doom or Duke3D. The current depth (Z) value is scaled and then subtracted from the intensity calculated so far. However, to avoid the brightness changing too much, the overall lighting value is clamped to a range of [0.875*Sector Ambient, Sector Ambient].
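The falloff and clamp can be sketched as follows in Python. The depth scale used here is a placeholder, not the engine's constant; only the subtract-then-clamp structure and the 0.875 factor come from the description above.

```python
def apply_depth_falloff(intensity, z, sector_ambient, depth_scale=0.09):
    """Darken `intensity` with distance, then clamp relative to sector ambient.

    `depth_scale` is a placeholder value, not the constant used by the engine.
    """
    darkened = intensity - z * depth_scale
    lower = 0.875 * sector_ambient
    return max(lower, min(darkened, sector_ambient))
```

With a sector ambient of 16, for example, the result always stays between 14 and 16 no matter how far away the vertex is.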

Backface Culling

If the “MFLAG_DRAW_VERTICES” flag is set, then the model will draw the vertices at this point and then the object drawing is complete. Otherwise the renderer moves on to the next step - backface culling. This step has two tasks, 1) determine which polygons are facing towards the camera, which are added to a list of potentially visible polygons and 2) determine the average depth (Z) value for each polygon for later sorting and lighting if flat shading is used (“FLAT” or “TEXTURE”).

Once the potentially visible set of polygons has been determined, they are sorted from back to front based on their average depth value.
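The two tasks of this step, culling and depth sorting, might look like this in Python. The data layout (index tuples, one view-space normal per polygon) and the facing test against the first vertex are assumptions for illustration; the camera is taken to sit at the origin looking down +Z.

```python
def visible_polygons_back_to_front(polygons, view_vertices, view_normals):
    """Cull back faces, then sort survivors from far to near.

    `polygons` is a list of vertex-index tuples, `view_vertices` holds
    view-space (x, y, z) positions and `view_normals` one view-space
    normal per polygon. The camera sits at the origin looking down +Z.
    """
    visible = []
    for index, poly in enumerate(polygons):
        first = view_vertices[poly[0]]
        normal = view_normals[index]
        # Facing the camera when the normal points back toward the origin.
        facing = sum(n * p for n, p in zip(normal, first)) < 0
        if not facing:
            continue
        avg_z = sum(view_vertices[i][2] for i in poly) / len(poly)
        visible.append((avg_z, index))
    visible.sort(key=lambda item: item[0], reverse=True)  # farthest first
    return [index for _, index in visible]
```

The average Z computed here is the same value that would later be reused for flat-shaded lighting.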

Polygon Drawing

At this stage, the code loops through the visible polygons, skipping past any with too-few vertices. Drawing each polygon requires several steps:

Setup

The required vertex attributes to draw the polygon are copied into flat arrays which will be used directly for several reasons: to avoid indexing into the larger list, and the arrays are mutable, allowing the clipping routines to change and add values without modifying the larger data set. If the shading mode requires texture coordinates, they are copied from the polygon into the array. If vertex shading is required, the intensity values calculated during the Transform and Lighting step are copied.

Clipping

Next, polygons are clipped against five (5) frustum planes: the near plane (at Z = 1.0), left, right, top, and bottom planes. In TFE's floating-point sub-renderer, the left and right planes are adjusted based on the (potentially) widescreen aspect ratio. The top and bottom planes are computed, taking into account the sheared perspective. Given a convex polygon as input, the result will be a convex polygon or be discarded - which happens when a polygon is fully behind a plane.
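Clipping against a single plane can be illustrated with a generic Sutherland-Hodgman step. The Python sketch below handles only the near plane and only positions; it is a stand-in for illustration, not the engine's clipping routine, which also carries texture coordinates and intensities along.

```python
NEAR_Z = 1.0  # near plane distance, from the text above

def clip_to_near_plane(polygon):
    """Clip a convex polygon (list of (x, y, z) tuples) against z >= NEAR_Z.

    Returns the clipped vertex list, or an empty list when the polygon lies
    fully behind the plane. This is a generic Sutherland-Hodgman step,
    used here as a stand-in for the engine's own clipping routines.
    """
    result = []
    count = len(polygon)
    for i in range(count):
        current = polygon[i]
        nxt = polygon[(i + 1) % count]
        current_in = current[2] >= NEAR_Z
        next_in = nxt[2] >= NEAR_Z
        if current_in:
            result.append(current)
        if current_in != next_in:
            # The edge crosses the plane: add the intersection point.
            t = (NEAR_Z - current[2]) / (nxt[2] - current[2])
            result.append(tuple(c + t * (n - c) for c, n in zip(current, nxt)))
    return result if len(result) >= 3 else []
```

Clipping a triangle with one vertex behind the plane yields a quad, which is why the per-polygon arrays need to be mutable and able to grow.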

Drawing

If the polygon survives clipping, its vertices are projected into screenspace and then drawing can begin. If flat shading is used, a single color or intensity value is generated for the polygon using the procedure described in the Lighting section above. In this case the polygon normal and average Z value are used.

The first four variants generate columns while the “PLANE” shading mode generates scanlines. For the column drawers, the screenspace bounding box is computed for the polygon and then edges are scanned starting from the minimum X value. Matching edges are scanned backwards and forwards and as the code steps along the edges, columns are computed.

Column Rendering

You can see in the image the “top edge” (forward) and “bottom edge” (backward). Given two edges, the code steps forward in X and vertex values are computed at each point by linearly interpolating along these edges. Once we move past an edge, the next one is found and this continues until we run out of edges.

At each X value, we set up a column. The Z (depth) value is the minimum of the Z along each edge. This Z value is compared to the 1D depth buffer previously generated by the sector walls in order to sort the columns with the walls. Columns are also clipped to the current vertical window and columns are discarded if they are outside of the horizontal window. Finally the vertex values are interpolated along the column using one of the specialized column drawing routines. If vertex shading is used for lighting, a screenspace dither value is used to break up the edges between light levels.
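To make the column setup and the 1D depth test concrete, here is a simplified Python sketch. It intersects every polygon edge with each screen column rather than stepping incrementally along a forward and a backward edge, purely to keep the example short, and it ignores vertical window clipping and attribute interpolation. The function name and data layout are assumptions.

```python
def polygon_columns(vertices, depth_1d):
    """Yield (x, y_top, y_bottom, z) for each visible column of a convex polygon.

    `vertices` are screen-space (x, y, z) tuples and `depth_1d[x]` is the wall
    depth for screen column x. Every edge is intersected per column instead of
    stepping along an edge pair incrementally, purely to keep the sketch short.
    """
    xs = [v[0] for v in vertices]
    x_min = max(int(min(xs)), 0)
    x_max = min(int(max(xs)), len(depth_1d) - 1)
    count = len(vertices)
    for x in range(x_min, x_max + 1):
        hits = []
        for i in range(count):
            (x0, y0, z0), (x1, y1, z1) = vertices[i], vertices[(i + 1) % count]
            if x0 == x1 or not (min(x0, x1) <= x <= max(x0, x1)):
                continue  # vertical edge, or edge does not span this column
            t = (x - x0) / (x1 - x0)
            hits.append((y0 + t * (y1 - y0), z0 + t * (z1 - z0)))
        if len(hits) < 2:
            continue
        top, bottom = min(hits), max(hits)
        z = min(top[1], bottom[1])   # nearest depth of the two edges
        if z < depth_1d[x]:          # in front of the wall in this column
            yield x, top[0], bottom[0], z
```

Columns whose depth is behind the wall depth stored for that X are simply never emitted, which is how the object ends up sorted against the sector walls.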

The scanline case is very similar but rotated on its side (minimum Y instead of X, stepping along Y, etc.). The difference with the “PLANE” mode is that it doesn’t use existing texture coordinates. Rather, once the scanlines are generated, they are clipped and rendered in the same way as the flats (floors and ceilings).

Conclusion

This is obviously a very brief description of the rendering process, but as you can see it was pretty advanced for the time. Unfortunately the lighting rig was underutilized: the developers simply stuck with 3 axis-aligned directional lights at full brightness. It might be interesting to make this functionality accessible to modders in order to enhance the mood of their levels. However, this lighting rig only affects 3D objects.

Screenshots

320x200 TalayBridge GromasBridge DetentionCenterBridges

Issues and Bugs

The Classic Renderer will have bugs; the 3D object rendering code itself is about 3k lines of code and is less than half the size of the sector and sprite rendering code. However, some issues are due to the way objects are assigned and offsets are calculated as part of the INF setup. So, in other words, the Classic Renderer will not look 100% correct until the INF system is reverse-engineered.

Next Steps

The existing INF system in the test release was written based on my understanding of the INF system in the original game which obviously still has gaps. The next step is to replace that system with directly reverse-engineered code, which will be tested heavily with mods as well as the original levels. Once this is complete, almost all mods should be playable with TFE - with the caveat, of course, that the AI will still be placeholder (i.e. just stand around waiting to be killed) - but the correct events will fire.

TFE One Year Anniversary

The Force Engine started out life as DarkXL 2 on December 21, 2019. Two months later, the project was renamed to The Force Engine, in order to encapsulate the intent of the engine, and put up on GitHub. After another month of work, about 3 months from the beginning, The Force Engine was announced on DoomWorld.

After reflecting on what went well and what could have gone better with DarkXL, the project was started from scratch. In order to build a framework and get things working, basic tools were written, including a level viewer (which is becoming the full level editor) and viewers for most asset types. I wrote a software sector renderer based on my understanding of the Jedi Engine, a sound system and midi engine, and an INF system, and finally released a test build.

While it has been nice to have a working build and to be able to “play” through the levels, these initial systems are not accurate enough to meet the project goals. Thus, once the test build was released and after getting some initial feedback, fixing and improving a few things along the way, the real meat of the project began.

Most of the remaining year has been spent reverse-engineering the original DOS renderer, integrating it into the TFE framework, and extending it to support higher resolutions and widescreen. The DOS fixed-point renderer is still there and that is what you will see when running at the original resolution of 320x200. Because TFE is currently the only source of reverse-engineered Dark Forces/Jedi Engine code, the DOS-derived renderer will never go away. The renderer has the same features, techniques and bugs as the DOS renderer. At higher resolutions, the floating-point variant of the renderer is used in order to handle much higher resolutions and in some cases fix bugs (though most of these fixes will be optional). The “Classic Renderer” is finally days away from completion.

Once the Classic Renderer is finished, the next task is to properly reverse-engineer the original INF system which, once integrated, will complete the visuals since things like texture offsets and other features will be handled correctly. Once complete, the long overdue “Classic Renderer” release will finally be done.

Death Star (Classic Renderer) Death Star

Color Correction Color Correction

Foundation

The Classic Renderer and fully accurate INF system will lay a strong foundation upon which to rebuild the rest of the game and engine. And, of course, working through the rendering and INF code has revealed, and will continue to reveal, more about the inner workings of the original systems, the layout of the code and overall program flow. As a result, I expect the difficulty and complexity of the remaining reverse-engineering work to decrease, meaning we are nearly at the top of the hill.

The Future

Once the Classic Renderer release finally comes out there will be a period of testing to verify that The Force Engine renders everything correctly in the base levels and in mods and that the INF functionality is fully accurate (which handles things like level progress, doors, switches, elevators, 3D model animations, changing light levels, scrolling textures, and more).

The following test release will focus on realizing the cross platform support promised since the beginning. This means setting up a CMake-based build system and adding support for Linux and OS X (including new ARM based Macs).

After that the focus will shift to gameplay, starting with reverse-engineering the original collision, movement and physics code and then moving on to Logics and weapons. This will finally lead to a fully playable version of Dark Forces through TFE. Once gameplay is complete, the next steps will be cutscenes, mission briefings, finishing the in-game UI, and finishing the iMuse system.

And finally, this will leave the project in a good place to start diving into the Outlaws code to figure out the engine differences and finally getting that game up and running using The Force Engine.

Adjustable HUD

Last post I talked about the remaining work for the Classic Renderer and the next steps. In the meantime, based on feedback regarding widescreen, it came to light that just moving the status HUD elements to the edges of the screen in widescreen may not be ideal - especially for ultrawide resolutions. So I decided to implement some basic HUD options, including the ability to move the HUD elements away from the edges. This, of course, appears unnatural since the graphics were designed to sit at the edges of the screen. “Dzierzan” - a member of the discord server - quickly made some art to fix these cutoff edges so I spent a little bit of time to implement an adjustable HUD.

Note that the original assets from Dark Forces are still used; “add-ons” cut from Dzierzan’s art are rendered as needed. This is done to avoid palette issues, make it easier to integrate and avoid directly modifying the existing art.

Adjustable HUD

I added a new tab to the System UI Settings dialog in order to adjust the HUD. Like the Graphics tab, as you adjust the HUD settings, you will see the in-game HUD respond immediately, allowing you to easily tweak to taste.

HUD UI HUD UI 2

Hud Scale Type

The HUD scale type determines how the HUD scales with resolution.

Proportional

The HUD is scaled to stay the same size as the original game in screenspace regardless of the resolution. This is the default option and will look the most like the original game.

Scaled

With the scaled option, the HUD gets smaller with higher resolutions. Using this option, you can then use the Scale slider to adjust the size of the HUD. Note that changing resolution will change the apparent size on screen.
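One way to read the difference between the two scale types is sketched below in Python. The interpretation of "Scaled" as a fixed pixel size multiplied by the slider is an assumption based on the descriptions above, as are the function names; the actual option may be computed differently.

```python
BASE_HEIGHT = 200  # original vertical resolution

def hud_pixel_scale(mode, screen_height, slider=1.0):
    """Return the factor applied to HUD art, in output pixels per art pixel."""
    if mode == "proportional":
        return screen_height / BASE_HEIGHT   # same fraction of the screen
    if mode == "scaled":
        return slider                        # fixed pixel size (assumption)
    raise ValueError(f"unknown HUD scale type: {mode}")

def hud_screen_fraction(art_height, mode, screen_height, slider=1.0):
    """Fraction of the screen height covered by a HUD element."""
    scale = hud_pixel_scale(mode, screen_height, slider)
    return art_height * scale / screen_height
```

Under this reading, a proportional HUD element covers the same fraction of the screen at 200p and 1080p, while a scaled one covers less of the screen as the resolution goes up unless the slider is raised.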

Hud Position Type

This determines how the HUD is positioned by default. In either case further adjustments are possible by modifying the OffsetX and OffsetY sliders.

Edge

The Status elements of the HUD are always aligned to the edges of the screen. This is the way the original art was designed.

4:3

In this mode, the HUD will stay in its original 4:3 positions even if in widescreen. With this mode it is easier to see the full HUD at once.
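The difference between the two position types comes down to where the left and right status elements are anchored. A minimal Python sketch, assuming square pixels and ignoring the OffsetX and OffsetY sliders (the function name is hypothetical):

```python
def hud_left_margin(mode, screen_width, screen_height):
    """Horizontal inset, in pixels, of the left status element."""
    if mode == "edge":
        return 0
    if mode == "4:3":
        width_4_3 = screen_height * 4 / 3      # width of a centered 4:3 area
        return max((screen_width - width_4_3) / 2, 0)
    raise ValueError(f"unknown HUD position type: {mode}")
```

At 1920x1080 the centered 4:3 area is 1440 pixels wide, so each status element moves 240 pixels in from its screen edge; at a 4:3 resolution both modes give the same result.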

HUD UI 3 HUD UI 3 HUD UI 3

The Final Results

In this screenshot you can see the 4:3 HUD in action in widescreen.

HUD Results

Next Steps

The next steps haven’t changed from the last post, though some progress has been made integrating the 3D object rendering code into The Force Engine.

TFE Project Update

It is time for another, long overdue, project update. The project went through a “slow period” of about a month after a flurry of work on the Classic Renderer but progress resumed about a week ago in earnest. This post will go over some of the work done so far, the current state of the project, and the next steps.

Progress

The majority of the work since the last build has been on the “Classic Renderer” - consisting of three main phases, though of course work bounces between them.

Reverse-Engineering

The first stage is reverse-engineering the original DOS code in order to figure out how the original renderer works. This stage consists of decompiling the code, stepping through the code in a debugger to test values and see the results of calculations, and then figuring out what the code is trying to accomplish. This requires figuring out structures, global variables, decompiling functions and going through the program flow. The raw code goes into an internal project called “Dark Forces DOS” which, for a variety of reasons, is kept private.

As blocks of code are figured out, they get turned into nice “C” code, which leads to phase 2.

Porting into TFE

The code gets “ported” into the main project (The Force Engine). By this I mean building systems to contain the code in a way that can interact with the other low level systems, such as the file system, OS layer and graphics backend. During this process I attempt to clean up the code and test it in the working build to verify its accuracy to the original. Sometimes there are issues requiring me to bounce between phases 1 and 2.

Extending the Code

The Force Engine supports higher resolutions, up to 4k at the moment, in addition to improved controls and mouse look. The original Dark Forces code does not work well at higher resolutions due to the limits of its internal 16.16 fixed point math. In addition the quality of the texture mapping does not hold up very well beyond 320x200, due to its low sub-texel precision. After evaluating and attempting a few different solutions, I landed on this renderer design. So far the Fixed Point and Floating Point Classic sub-renderers have been implemented. The Floating Point sub-renderer handles very high resolutions gracefully, with plenty of sub-texel precision.
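For readers unfamiliar with 16.16 fixed point, the Python sketch below shows the basic representation and one way its limits show up: the integer part only spans about ±32768, so products of moderately large values no longer fit in 32 bits. The helper names are made up for this example and the specific failure cases in the original renderer are not covered here.

```python
import math

FRAC_BITS = 16
ONE = 1 << FRAC_BITS  # 1.0 in 16.16 fixed point

def to_fixed(value):
    """Convert a float to 16.16 fixed point, rounding toward negative infinity."""
    return int(math.floor(value * ONE))

def fixed_mul(a, b):
    """Multiply two 16.16 values, keeping 16 fractional bits in the result."""
    return (a * b) >> FRAC_BITS

def fits_in_32_bits(value):
    """True when `value` can be stored in a signed 32-bit integer."""
    return -(1 << 31) <= value < (1 << 31)
```

Python integers never overflow, so `fits_in_32_bits` stands in for the wraparound that would occur in 32-bit C arithmetic. The smallest representable step is 1/65536, which also bounds how finely texture coordinates can be stepped per pixel.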

In addition to supporting higher resolutions, the renderer now supports widescreen. This is made trickier by the requirement to support widescreen for 200p and 400p with rectangular pixels as well as normal resolutions. The biggest changes and caveats:

  • 200p widescreen automatically switches to the floating point sub-renderer, the original fixed point renderer has not been altered to support widescreen.
  • The algorithm to compute the wall/frustum intersection when clipping is different when using widescreen and is slightly more complex. This is unavoidable, however. With widescreen disabled the original algorithm is used.
  • There is no direct FOV control, though that may be added in the future.

[Screenshots: Scene 1 (200p Normal, 200p Widescreen, 1080p Normal, 1080p Widescreen); Scene 2 (200p Normal, 200p Widescreen, 1080p Normal, 1080p Widescreen); Scene 3 (Normal, 1080p Widescreen); Scene 4 (Normal, 1080p Widescreen); Scene 5; Scene 6 (Wide 200p, Wide 1080p)]

And don’t forget about ultrawide resolutions! I took this by resizing the window, so I don’t remember the exact aspect ratio but I think it was about 35:9. Note that this is still the original software renderer with minimal modifications to support widescreen.

[Screenshot: Ultrawide]

System UI

The “System UI” has been undergoing improvements. This includes:

  • New title screen.
  • OS (System) File Dialogs instead of the imGUI version I was using. This improves the UI when needing to select paths and files and makes this part of the UI more consistent with other programs used on your OS.
  • New Graphics UI.
  • Added the ability to bring up the System UI while playing a game using Ctrl + F1.
  • Added the ability to adjust resolution and graphical settings and see the results immediately in-game.
  • Added optional color correction.

TitleScreen

Titlescreen

Adjusting Graphics Settings During Gameplay

Graphics Settings

Optimizations

Running the software renderer at high resolutions did not always perform well - especially beyond 1080p. The main issue was not the rendering itself, however, but the cost of transferring the framebuffer from CPU to GPU memory and, to a much lesser degree, the cost of converting the framebuffer from 8-bit to 32-bit color on the CPU. Several methods are employed to fix this situation.

GPU Palette Conversion

Rather than converting from 8-bit to 32-bit on the CPU - the new default is to perform this conversion on the GPU during the blit of the framebuffer to the window/screen. This has multiple effects:

  • Reduces CPU time spent converting from 8-bit to 32-bit.
  • Reduces the amount of memory that needs to be transferred from the CPU to GPU by a factor of 4. This is a huge time-saver.
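The factor of 4 follows directly from the pixel sizes: an 8-bit framebuffer stores one byte per pixel, while a 32-bit one stores four. The sketch below shows the CPU-side conversion that the GPU path avoids, along with the upload size arithmetic. The function names are hypothetical and not taken from the TFE source, and 1440p is assumed to mean 2560x1440.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Bytes that must be uploaded per frame for a given resolution and pixel size.
inline std::size_t framebufferBytes(std::size_t width, std::size_t height, std::size_t bytesPerPixel)
{
    return width * height * bytesPerPixel;
}

// The CPU-side conversion (hypothetical helper): expand each 8-bit palette
// index into a 32-bit color by looking it up in a 256-entry palette.
inline std::vector<std::uint32_t> convertPalettedToTrueColor(
    const std::vector<std::uint8_t>& indices, const std::uint32_t (&palette)[256])
{
    std::vector<std::uint32_t> out(indices.size());
    for (std::size_t i = 0; i < indices.size(); i++)
    {
        out[i] = palette[indices[i]];
    }
    return out;
}
```

At 2560x1440 the 8-bit framebuffer is 3,686,400 bytes per frame, versus 14,745,600 bytes once it has been expanded to 32-bit on the CPU.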

Asynchronous Framebuffer Transfer

The next big issue was the stall waiting for the framebuffer to be transferred to the GPU so that the screen could be updated. To fix this, the framebuffer was changed from a Texture to a DynamicTexture. The new DynamicTexture uses multiple Pixel Buffer Objects to allow for an asynchronous copy of the framebuffer data to the GPU and then copy from GPU to GPU resources.

Results

Combining these two reduced the cost of transferring the framebuffer and converting from 8-bit to 32-bit at 1440p by more than an order of magnitude - allowing the original levels to run at over 100fps at 1440p with the conversion and upload taking around 0.2ms instead of 14ms.

Level Editor

During this period, I also worked on the Level Editor on occasion. It still has a long way to go, but here are some things I worked on:

  • Improved the grid rendering in the 3D view.
  • Added the ability to actually edit geometry.
  • Improved and refined the UI.
  • Worked toward being able to properly draw and shape sectors.

3D Grid Rendering Improvements

Extended View Distance Extended Grid

Rendering Sub-Grids (similar to the 2D Grid) Extended Grid

Geometry Editing

Geometry Editing

Object Editing

Object Editing

State of the Project

Now that you have read and seen some of the progress over the last few months, I want to quickly go over the current state of the project.

Roadmap Changes

The original November release was estimated based on the apparent project scope and development velocity when the Roadmap was put together. As a result, it was optimistic despite my original thoughts on the matter. The project is still in development, as I hope this post illustrates, but changes have been required to the Roadmap and time estimates. Please see the new roadmap for details.

Classic Renderer State

The Classic Renderer is now about 95% complete. 3D object rendering code still needs to be cleaned up and ported from the reverse-engineered code and the sorting algorithm needs to be finished - it doesn’t yet handle all of the special cases between sprites and 3D objects, mainly because that code could not be tested until 3D objects are rendering properly.

Next Steps

The next steps for the project are to finish the Classic Renderer, which should be done in the next few weeks. Unfortunately I am going to have to put off a release a bit longer because of a few INF issues:

In the original code, a lot of the initial texture offsets are handled when the INF system starts up. What this means is that some of the textures do not have the correct texture offsets when using the Classic Renderer (though most do). The second INF issue is that some adjoins are not correct until after setup is complete. And the final issue is that many graphical issues in mods are actually caused by incorrect INF execution.

As a result of these issues, I have decided to put off the next test release until the reverse-engineering of the INF system is complete. Once this is finished most MODs will work correctly and it should be possible to properly test the Classic Renderer to find the real bugs. Basically I want to avoid having to worry about bug reports and issues people find (including myself) that are really due to INF issues.

Once the INF system is reverse-engineered and fully accurate, it will be time to make another test release. After that the focus will be on character control, physics and collision - the game needs to feel right. And then reverse-engineer and properly implement all of the Logics and weapons.

TFE Renderer Design

It has been awhile since the last update but some of that has been cleaning up the design of the overall renderer structure. There are two forces at play, pulling in separate directions - a desire to preserve the original DOS renderer with all of its quirks and a desire to clean up artifacts, especially at higher resolutions, and provide a robust, forward thinking renderer. In either case the renderer needs to be faithful to the original and existing Mods and levels need to work as is (which means rendering correctly as well).

Fortunately the needed reverse-engineering for the renderer is complete. The work now is supporting all of the required features and preserving the original renderer, while making sure the result is clean and maintainable. This means this release is taking longer than planned but, as the bedrock of the entire project, it needs to be done well.

Design

The TFE Dark Forces renderer consists of the following matrix of combinations:

| | Classic | Perspective |
|---|---|---|
| Software | Fixed point DOS 320x200 software renderer. 8-bit only. | Floating point software renderer. 8-bit or true color. |
| Software | Floating point software renderer. 8-bit or true color. | |
| Hardware | Floating point hardware renderer - OpenGL 3.3 with future support for Vulkan/Metal. 8-bit or true color. | Floating point hardware renderer - OpenGL 3.3 with future support for Vulkan/Metal. 8-bit or true color. |
| Hardware | Floating point hardware renderer - Compute. 8-bit or true color. | Floating point hardware renderer - Compute. 8-bit or true color. |

The “Classic Renderer Release” focuses on the “Classic” column. The “Fixed-point DOS” renderer is the reverse-engineered DOS renderer. Rather than trying to have a hybrid renderer using higher precision fixed point – an approach taken until recently which is not compatible with GPU rendering - the higher precision software renderer instead uses floating point except for inner column/scanline loops. This means that much of the code can be shared between the CPU and GPU renderers with the inner loops being swapped out. To put it another way, the original DOS renderer is preserved without limiting the renderer used for higher resolutions.

The Compute based renderer, which moves the majority of the rendering system to the GPU including visibility and all transformations, promises to greatly improve the performance of highly detailed custom levels and greatly reduce the data being transferred from the CPU to GPU. It also opens up new possibilities such as raytracing effects without requiring RTX hardware. However, this is a non-trivial endeavor and WILL NOT happen until after the gameplay is complete. So, it is included for completeness and future planning only.

The perspective renderers will also be work for a future, post-gameplay release. They are listed here for future planning.

Note that the 8-bit / true color option does not add an extra combination; each sub-renderer has support for either.

TFE_RenderBackend

The Render Backend is responsible for abstracting the low level rendering API and providing the renderers with an API to manipulate GPU buffers, shaders, GPU textures, render targets and virtual displays, render state, and low level GPU draw and compute commands.

TFE_PostProcess

The Post Processing system handles blitting virtual displays and render targets to the window, color correction, and post process shaders such as Bloom. Note that the post processing stack is independent of the renderer, this means that color correction and post processing shaders can be applied just as well to CPU based renderers if the minimum GPU requirements are met. This also means that the post processing system can be used to convert from 8-bit to true color using the GPU, which can save CPU time and CPU to GPU bandwidth on supported hardware.

Note that if no usable hardware (GPU) support is available, such as on some low-end Intel iGPUs, then GDI (or equivalent technologies) will be used instead to blit the virtual frame buffer to the window. This will allow The Force Engine to be usable even when lacking GPU support – but Editor support will be disabled, and the menu system may be more limited. This feature may not make it in for the “Classic Renderer Release” but is planned before the “1.0” release.

TFE_JediRenderer

This is the TFE “Jedi” derived renderer, which consists of several sub-renderers based on feature sets and hardware being used. The goal of the entire family of “Classic” renderers is to faithfully reproduce the original renderer and algorithms with minimal changes that might disrupt Mods or prevent correct rendering of existing levels. Any changes of this nature, even if they are a good idea, should be optional in order to provide the most faithful experience possible by default.

TFE_JediRenderer

High level functionality used by various sub-renderers. This includes fixed point functionality and other higher-level constructs. Shared functionality between RClassic_Float and RClassic_GPU will also find itself here.

RClassic_DOS

The reverse-engineered DOS renderer only works properly at 320x200. This renderer is always 8-bit and provides minimal changes to preserve the original. However, if a similar look is desired, even in 8-bit, but without the artifacts the Classic Float renderer can be used instead. Originally, I partially implemented a fixed-point renderer with improved precision to avoid code duplication but realized I needed the floating-point renderer anyway for GPU support – I was overcomplicating the issue.

RClassic_Float

The Classic_Float renderer uses the original algorithms but uses floating point instead of fixed point and fixes artifacts that can be fixed in a non-destructive way (a way that does not interfere with existing mods or tricky effects). This means that a lot of the code can be shared between the floating point and GPU renderers – this code will be pulled up into TFE_JediRenderer. Classic_Float can use 8-bit or true color rendering by using color map interpolation (more details in the future).

RClassic_GPU

The Classic_GPU renderer is similar to the Classic_Float renderer but moves low level rendering to the GPU (such as wall and floor rasterization). It uses the same algorithms as the Classic_Float renderer, including sorting and 1D depth buffer for object versus wall visibility. The goal is to share as much as possible between the software and GPU renderers – but only when that sharing does not greatly increase complexity or make maintenance more difficult. Classic_GPU can render in 8-bit color or true color by using the same color map interpolation technique as the software true color renderer. Note OpenGL 3.3, or equivalent in the case of Metal or Vulkan, will be required.

RClassic_Compute (Post-Gameplay Addition)

The goal of Classic_Compute is to implement most or all of the Classic_Float renderer entirely using the GPU with very little communication to the CPU – mainly viewpoint data and changes to sectors and objects. The goal is to improve scalability of the portal-based renderer and better utilize modern GPUs. Note that Compute Shader support will be required, probably requiring OpenGL 4.5 or 4.6 or equivalent in the case of Metal or Vulkan.

Fixed Point and Higher Resolutions

The Problem

So for a while I was working with two different “modes” for fixed point in the Classic Renderer - the original (DOS) 16.16 and a higher precision, but also larger, 44.20 format. The larger format fixes all of the overflow issues that happen in the renderer when rendering above 320x200, and the smaller format matches the DOS original. However, reconciling these modes, which would have to exist at the same time in order to switch dynamically without duplicating code, was proving to be problematic. So I took a closer look at what Outlaws and the Mac version were doing and realized that they use the same precision for most operations, the same size type and still use fixed-point. This is obvious in hindsight. So what is the difference? Mainly that certain operations were handled more carefully in those versions of the engine. This is what I suspected, but what I didn’t immediately realize was that only a few places in the code need to change; most of the time the 16.16 format is enough even at higher resolutions. The only time it is truly insufficient is when rendering textured scanlines (when rendering flats or textured models).

Note the code shown here isn’t complete but shows the relevant parts. And yes I realize that mul16() should be offsetting the 64-bit result by HALF (32768 for 16.16) before the shift for proper rounding, but this follows the original code in that regard.

So what do I mean about being more careful? To answer that, I will show some problematic code found in the DOS renderer and explain the problem.

x0proj = div16(mul16(x0, focalLength), z0) + halfWidth;

div16() is fixed point division. It upcasts the fixed point values to 64-bits, scales the numerator by ONE (65536 in 16.16 fixed point) and then divides by the denominator.

fixed16 div16(fixed16 num, fixed16 den)
{
  s64 num64 = s64(num);
  s64 den64 = s64(den);
  return fixed16((num64 << 16) / den64);
}

mul16() is fixed point multiplication. It upcasts the fixed point values to 64-bits, multiplies the values and then scales by 1 / ONE (65536 in 16.16 fixed point).

fixed16 mul16(fixed16 x, fixed16 y)
{
  s64 x64 = s64(x);
  s64 y64 = s64(y);
  return fixed16((x64 * y64) >> 16);
}

So what happens when we chain these operations - div16(mul16(a,b),c)? We will look at two cases, one at 320x200 and another at 640x400 and use the same X and Z values.

  • We will assign x = intToFixed(200); - since ONE is 65536, this means the fixed point equivalent is 13,107,200.
  • Let’s assign z = intToFixed(200); as well, this vertex will be on the edge of the screen after projection.
  • If we are rendering at 320x200, focalLength=intToFixed(160) (which is 320/2 for a 90 degree field of view) and at 640x400 focalLength=intToFixed(320), which are 10,485,760 and 20,971,520 respectively.

Here is the situation when rendering at 320x200:

div16(mul16(13107200, 10485760), 13107200):
Multiply: s32(13107200 * 10485760 / 65536) = 2,097,152,000
Divide: s32(2097152000 * 65536 / 13107200) = 10,485,760

This is fixed point, where ONE = 65536, so this represents: 160.0 - exactly what we expect for being on the right side of the screen. No problems here.

Now remember that an s32 (32-bit integer) can hold a maximum value of 2,147,483,647; you will probably notice how close we got to overflowing this value in the multiply above. Much bigger and things would start behaving weird. So let’s see what happens when we bump up the resolution to 640x400:

div16(mul16(13107200, 20971520), 13107200):
Multiply: s32(13107200 * 20971520 / 65536) = 4,194,304,000 -- OVERFLOW
...

Here the value overflows during the multiplication, so by the time we get to the divide it’s already very wrong. But wait, the final answer should be 20,971,520 - so there should be plenty of precision! And indeed that is true.

The Solution

The problem with the original code is that the multiplication and division operations should be “fused” - in other words we should do them as one operation in 64-bits before dropping back from 64-bits to 32-bits.

So we replace mul16() and div16() with a single function fusedMulDiv(a, b, c), which computes (a * b) / c as one operation. Let’s see what that looks like:

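fixed16 fusedMulDiv(fixed16 a, fixed16 b, fixed16 c)
{
  s64 a64 = s64(a);
  s64 b64 = s64(b);
  s64 c64 = s64(c);

  // a * b carries two factors of ONE and dividing by c removes one, so no extra shift is needed.
  return fixed16((a64 * b64) / c64);
}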

In other words (a * b) / c is computed in 64-bits - essentially 48.16 fixed point - and only converted to 32 bits once it is done. So let’s look at the results:

fusedMulDiv(13107200, 20971520, 13107200):
s32(13107200 * 20971520 / 13107200) = 20,971,520 -- No more overflow!

And since ONE = 65536, this represents 320.0 - exactly what we expect at 640x400 at the right edge of the screen.
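The two cases worked through above can be checked with a small self-contained sketch. It assumes `fixed16` is a 32-bit signed integer and `s64` a 64-bit one, and it leaves out the `+ halfWidth` term. The `projectChained()` and `projectFused()` wrappers are hypothetical names used only for this comparison.

```cpp
#include <cstdint>

typedef std::int32_t fixed16;   // 16.16 fixed point (assumed 32-bit signed)
typedef std::int64_t s64;

// Multiply in 64 bits, then truncate the result back to 32 bits.
inline fixed16 mul16(fixed16 x, fixed16 y)
{
    const s64 product = (s64(x) * s64(y)) >> 16;
    // Going through uint32_t makes the wrap-around on overflow explicit.
    return fixed16(std::uint32_t(product));
}

// Scale the numerator by ONE in 64 bits, divide, then truncate to 32 bits.
inline fixed16 div16(fixed16 num, fixed16 den)
{
    return fixed16(std::uint32_t((s64(num) * 65536) / s64(den)));
}

// (a * b) / c with no intermediate drop to 32 bits.
inline fixed16 fusedMulDiv(fixed16 a, fixed16 b, fixed16 c)
{
    return fixed16(std::uint32_t((s64(a) * s64(b)) / s64(c)));
}

// Chained form: the intermediate product passes through 32 bits.
inline fixed16 projectChained(fixed16 x, fixed16 z, fixed16 focalLength)
{
    return div16(mul16(x, focalLength), z);
}

// Fused form: only the final quotient is narrowed to 32 bits.
inline fixed16 projectFused(fixed16 x, fixed16 z, fixed16 focalLength)
{
    return fusedMulDiv(x, focalLength, z);
}
```

With x = z = 200.0, both forms return 160.0 when the focal length is 160.0. When the focal length is 320.0, only `projectFused()` returns 320.0; the chained form returns a wrapped value.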

Reality

Of course this, while being the most prominent issue, is not the only one. The way scanlines are handled had to be fixed up as well. And there we really do need more precision: the original DOS game used 10-bits of fractional precision during scanline drawing for the texture coordinates, which is barely enough at 320x200. At higher resolutions we really need twice that (20-bits) to avoid artifacts, though 16-bits is tolerable (and most likely the solution used on the Mac). But this is a limited area of the code and so using specialized fixed point types for scanlines only affects a small amount of code, instead of the whole renderer.
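To see why the number of fractional bits matters more as scanlines get longer, here is an illustrative sketch. It is not the engine's scanline code. It steps a texture coordinate across a scanline using a step truncated to a given number of fractional bits, then reports how far the final coordinate has drifted from the exact value. The function name and the choice of a 1/3 texel step are assumptions made for the example.

```cpp
#include <cmath>
#include <cstdint>

// Accumulate a per-pixel texture step that has been truncated to 'fracBits'
// fractional bits, and return the drift (in texels) after 'pixelCount' pixels.
inline double accumulatedDrift(double stepPerPixel, int fracBits, int pixelCount)
{
    const std::int64_t one = std::int64_t(1) << fracBits;
    const std::int64_t fixedStep = std::int64_t(std::floor(stepPerPixel * double(one)));

    std::int64_t u = 0;
    for (int i = 0; i < pixelCount; i++)
    {
        u += fixedStep;   // the same truncated step is added at every pixel
    }

    const double exact = stepPerPixel * double(pixelCount);
    const double approx = double(u) / double(one);
    return exact - approx;
}
```

With a step of 1/3 texel per pixel over 1920 pixels, 10 fractional bits drift by roughly 0.6 texels by the end of the scanline, while 20 fractional bits drift by less than a thousandth of a texel.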

Classic Renderer Release

So what does this mean for the release? To put it simply the renderer will detect if you are running at 320x200 or a higher resolution. If you are running at 320x200 the original code will be used as-is, flaws and all, in order to maintain the original limits. If you are running at any higher resolution or widescreen, then the code will instead use the improved, proper calculations as described here - just like the Mac port of Dark Forces and Outlaws - resulting in cleaner visuals, higher limits and a renderer that works well at higher resolutions.
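The selection rule described above could be sketched as follows. The enum and function names are hypothetical, and the actual TFE check may be organized differently.

```cpp
// Which set of calculations the Classic Renderer would use (illustrative names).
enum class ClassicMath
{
    DosOriginal,   // original code as-is, with the original limits
    Improved       // fused and higher precision calculations
};

// Original code only at exactly 320x200 without widescreen; improved otherwise.
inline ClassicMath selectClassicMath(int width, int height, bool widescreen)
{
    const bool isDosResolution = (width == 320 && height == 200);
    return (isDosResolution && !widescreen) ? ClassicMath::DosOriginal
                                            : ClassicMath::Improved;
}
```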

Future Level Format

When taking breaks from the reverse-engineering work, sometimes I work a bit on the level editor. One area of work is deciding on the editor format and how to expose advanced features to Dark Forces and Outlaws (such as slopes in new Dark Forces levels). So I have been putting together a format “spec” that I am planning on implementing for the level editor at some point before it starts being used for real work.

TFEM Spec

Excerpt from the draft

The Force Engine will support the Dark Forces LEV format, Outlaws LVT and LVB formats and a super-set format called TFEM (“TFE Map”). Unfortunately the Jedi Engine reads text based data files in a fairly rigid way, meaning that there is no provision for skipping default parameters or adding new unknown parameters or structure. For this reason, the original formats are not very useful for adding new features or for the level editor. This format has some similarities to the UDMF format and is my attempt to “get ahead” of the format mess and allow the same format to be used while adding new features in the future and for the format to be the master format used by tools.

As a super-set format, TFEM will support the entire feature set of the Dark Forces version of Jedi and the Outlaws version as well as any new features added for The Force Engine. It will remain a text based format, though a binary “compiled” version may also be supported later. Finally it will support default values - in which case the parameter does not have to be specified and map readers must skip unknown values or blocks without giving an error - making the format more robust and making versioning easier. The plan is to use the TFEM format as the level editor format, which can then be exported to other formats as requested. If using Outlaws features in Dark Forces or using new map features not present in vanilla, the TFEM format must be used - in other words there will be no Outlaws in Dark Forces style formats. However when using the TFEM format all of the level features of both Dark Forces and Outlaws will be accessible when using the “TFE” featureset (and any additional TFE specific features).
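The two reader rules mentioned in the excerpt (defaults for omitted parameters, and skipping unknown entries without an error) can be illustrated with a minimal sketch. The "key value" line syntax, the field names and the default values here are all hypothetical. They are not TFEM syntax, which this excerpt does not show.

```cpp
#include <sstream>
#include <string>

// Hypothetical record: every field has a default, so it may be omitted.
struct SectorParams
{
    int ambient = 0;
    double floorHeight = 0.0;
    double ceilingHeight = 0.0;
};

// Reads "key value" lines. Known keys override the defaults; unknown keys are
// counted and skipped rather than treated as errors.
inline SectorParams parseSectorParams(const std::string& text, int& skippedCount)
{
    SectorParams params;
    skippedCount = 0;

    std::istringstream input(text);
    std::string line;
    while (std::getline(input, line))
    {
        std::istringstream fields(line);
        std::string key;
        if (!(fields >> key)) { continue; }   // blank line

        if (key == "ambient")            { fields >> params.ambient; }
        else if (key == "floorHeight")   { fields >> params.floorHeight; }
        else if (key == "ceilingHeight") { fields >> params.ceilingHeight; }
        else                             { skippedCount++; }
    }
    return params;
}
```

A reader written this way keeps working when a newer file adds parameters it does not know about, which is the versioning benefit the draft describes.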

DF Classic Renderer Progress 2

It’s about time to post another update regarding the progress towards the Classic Renderer Release. RE = fully reverse-engineered and ported to the engine.

Task List

The following task list contains the items required for the release. The biggest remaining items are RE Object/sector assignment and RE Object3D rendering. Once those are complete the remaining tasks will be completed quickly. The release is still on track for this month. Also note that the reverse engineering efforts touch many systems in the DOS exe, so this work is also laying the foundation for the rest of the reverse-engineering effort, accelerating this project towards release.

  • RE Lighting.
  • RE Wall clipping/sorting.
  • RE Wall rendering.
  • RE Sign rendering.
  • RE Sky rendering.
  • RE Flat (floor/ceiling) rendering.
  • RE Sector rendering.
  • RE Adjoin/portal traversal.
  • RE Mask Wall rendering.
  • RE Sprite Rendering.
  • RE Sprite/3D Object/Wall sorting.
  • RE Level Geometry loading.
  • RE Object Asset loading.
  • RE Sector updates.
  • RE Object/sector assignment.
  • RE Object3D rendering.
  • Classic Renderer Verification (i.e. go through levels and verify visual correctness).
  • Updated graphics settings UI.
  • Graphics settings can be changed on the fly instantly updating the 3D view.
  • Hardware Rendering (partial).
  • Fully replace current Test Release renderer with Classic Renderer / Refactor.
  • Color correction.
  • Post FX (Bloom).
  • Basic controller support.

Next Steps

Once this release is finished, the next non-bugfix release will be input/control binding - which will include being able to bind controls to the keyboard, mouse and controller but also things like mouse and thumbstick sensitivity.

After that future releases will start fleshing out the gameplay - player movement and collision, weapons, logics/AI and remaining INF issues until the core gameplay is complete.

DF Classic Renderer Progress

Work on the reverse-engineering effort has been progressing well. The sector rendering portion is almost complete with some more work in the next few weeks on sprites and 3D objects. The goal of this effort is to finish the Classic Renderer (for Dark Forces) and have all levels and mods render correctly - except for issues caused by missing gameplay elements (caused by INF bugs or lack of AI).

The renderer will continue to support higher resolutions, such as 1080p. However limits will only be accurate at 320x200. To put it simply, resolutions beyond 320x200 have graphical issues with the base DOS limits. In other words 320x200 will produce near identical results to DosBox (same limits, etc) but resolutions beyond that will relax those limits as needed.

The Classic Renderer will also have hardware support, which means faster rendering at higher resolutions and optional texture filtering. I will post more about hardware rendering features once the software renderer is complete. There will also be additional options in order to improve performance and optionally even rendering quality.

The renderer in the current build has issues with low light environments, where the lighting itself doesn’t quite match the original. The Classic Renderer fixes this (among other issues) - so I put together a screenshot that shows the Classic Renderer and DosBox side by side. The Force Engine version was resized in Photoshop and cropped so there is some slight distortion and the viewpoints aren’t exactly the same but I think you will be able to see that the character of the rendering, lighting and colors match.

Comparison

Upcoming DF Classic Renderer Release

This post concerns the next major test release, the DF Classic Renderer Release. There may be smaller releases in the meantime, such as bug fixes or quality of life improvements.

Classic Renderer

The “Classic Renderer” is the Jedi Engine renderer derived from the DOS version of Dark Forces. This covers the rendering in the 3D view - purely 2D rendering such as UI is basically done except for a few bugs. This renderer involves several key components: walls with overlays (such as switches or signs), flats (floors and ceilings), mask walls (walls with transparent elements such as fences or force fields), sprites and 3D models - with proper light attenuation in different situations.

For a more thorough description of the DOS renderer itself see the work in progress Dark Forces (DOS) Rendering series starting with Dark Forces (DOS) Rendering, Part I. I will probably post Part 2 within a week or so.

DF Classic Renderer Release

The current renderer in the test release is not complete and not fully accurate - there are several issues such as sorting issues with sprites versus walls in a few cases and incorrect sorting with 3D objects versus sprites and walls in general. There are also some issues with wall rendering in some mods - such as “Prelude to Harkov’s Defection” (see the tram/subway). Finally the light falloff is not quite correct (though it is close in many levels), which is probably most obvious in Gromas Mines - the fog effect does not match the original very well in some areas.

As the reverse-engineering of the original DOS renderer nears completion, The Force Engine code is being updated and the code refactored. The ultimate goal of this release is to complete the Classic (DF) Renderer and ensure that it is functionally identical to the DOS version in 320x200. As I have been working, when I identify look-up tables I spend the time to make sure I understand how to accurately re-build the tables - with this knowledge I can remove the tables entirely when appropriate in order to support higher resolutions without sacrificing accuracy.

While refactoring the renderer code I plan on implementing a few non-DOS based features:

  • Widescreen support.
  • “Window-Resolution” option that changes the virtual resolution to match window size and aspect ratio.
  • Improved software renderer performance.
  • [Maybe] Classic Hardware Renderer.

Note that after the gameplay is more complete there will be another big renderer release before the project beta release in November that will feature the “Perspective Renderer”, more hardware support and Outlaws Jedi Engine enhancements (such as slopes, double adjoins, vertical adjoins, etc.). The goal of this release is renderer accuracy so that all levels and mods are rendered correctly and visually match the DOS version at 320x200.

Next Steps

Once the DF Classic Renderer Release is finalized, the next steps are to focus on gameplay elements - fully accurate collision detection and player movement, AI and weapons. These will be broken down into major test releases, in a similar manner as the DF Classic Renderer.

Dark Forces (DOS) Rendering, Part I

I have been reverse engineering the Dark Forces executable in order to figure out how it works, exactly, so that The Force Engine can truly be “source port” accurate. Figuring out how the AI works, player movement, collision detection, etc. is very important but this series of posts will start with rendering. These posts will talk about how Dark Forces rendering works in DOS and how the “Classic” Software Renderer works in the Force Engine (or will work as more work needs to be done there). Note that I will be showing some code snippets, these are directly from the reverse engineered work and represent what is actually happening in Dark Forces. However I obviously won’t be showing all of the code here and instead just snippets as needed.

Fixed Point

Back when Dark Forces was developed, floating point processors were generally not available. So, like most 3D games of the time such as Doom, they used “fixed point” instead. The concept is pretty simple but I will not go into too much detail here. Basically a fixed number of bits are assigned to store the fractional part of the number and the remainder are used for the sign (1 bit) and integer part. Dark Forces used 16.16 for most fixed point numbers (or 1.15.16 if you want to call out the sign bit). In this scheme 1.0 = 65536 (0x10000) and 0.5 = 32768 (0x8000), which are the ONE_16 and HALF_16 constants used in the code.

Adding and subtracting works as you would expect (ignoring possible overflow) but multiplication and division can be problematic. In Dark Forces, they essentially upcast the values to 64 bit, do the operation and then use the lower 32 bits of the result. You will see functions such as mul16(), div16(), round16(), floor16(), etc. that handle these details in the explanations.
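As a rough illustration of the 16.16 scheme, here is a minimal sketch of the constants and a few conversion helpers. The bodies are assumptions written for this example, not the reverse-engineered code. In particular, `floor16()` and `round16()` are assumed to return plain integers, which may not match the original functions.

```cpp
#include <cstdint>

typedef std::int32_t fixed16;       // 16.16 fixed point

const fixed16 ONE_16  = 0x10000;    // 1.0
const fixed16 HALF_16 = 0x8000;     // 0.5

// Integer to fixed point: scale by ONE.
inline fixed16 intToFixed16(std::int32_t x) { return x * ONE_16; }

// Fixed point to double, handy for inspecting values.
inline double fixed16ToDouble(fixed16 x) { return double(x) / double(ONE_16); }

// Largest integer <= x. Written with division so negative values round down
// without relying on the behavior of right-shifting a negative number.
inline std::int32_t floor16(fixed16 x)
{
    std::int32_t whole = x / ONE_16;
    if ((x % ONE_16) < 0) { whole--; }
    return whole;
}

// Round to nearest by adding one half before taking the floor.
inline std::int32_t round16(fixed16 x) { return floor16(x + HALF_16); }
```

For example, `intToFixed16(200)` gives 13,107,200, and the value 1.5 (`ONE_16 + HALF_16`) has a floor of 1 and rounds to 2.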

Camera

The Player has a 3D position in the world - an (x, y, z) coordinate where (x,z) represent their position on the floor, as if seen from above (or a map view) and (y) is the height. In Dark Forces negative (y) is going up and positive (y) is going down. To convert to camera coordinates the (x,z) value is left as-is and the (y) value is offset by the eye-height, which is about -5.8 DFU when standing (Dark Forces Units, ~1 foot, though some estimates put it closer to 25cm). For the curious, the exact value is -380108 / 65536.

vec3 camPos = { player.x, player.y - eyeHeight, player.z };

The Player also has pitch, yaw and roll angles in order to orient the view. For Dark Forces only yaw and pitch are used for the camera, though 3D models (3DOs) can be fully oriented using all 3 angles.

From this several values are computed, the equivalent of a matrix:
cosYaw, sinYaw, negSinYaw
xTranslation, zTranslation

Cosine and Sine are computed using a 4096 entry table and with transformations to handle each quadrant and phase shifting one table handles both sine and cosine. Note that the precision is equivalent to a 16k entry table. Translations are computed thusly:
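One way a single table can serve both functions is sketched below. It assumes the table covers one quadrant and that a full circle is divided into 16384 angle units, which is how a 4096 entry table would give the precision of a 16k entry one. The sketch stores one extra entry (4097 in total) so that 90 degrees can be looked up directly. That detail, the names and the use of `std::sin` to fill the table are assumptions, not the original layout.

```cpp
#include <array>
#include <cmath>
#include <cstdint>

typedef std::int32_t fixed16;   // 16.16 fixed point

const std::uint32_t QUARTER_STEPS = 4096;               // angle units per quadrant
const std::uint32_t FULL_CIRCLE   = 4 * QUARTER_STEPS;  // 16384 units per turn

// Quarter-wave sine table, built once on first use.
inline const std::array<fixed16, QUARTER_STEPS + 1>& quarterSineTable()
{
    static const std::array<fixed16, QUARTER_STEPS + 1> table = [] {
        std::array<fixed16, QUARTER_STEPS + 1> t{};
        const double halfPi = 1.57079632679489661923;
        for (std::uint32_t i = 0; i <= QUARTER_STEPS; i++)
        {
            t[i] = fixed16(std::lround(std::sin(halfPi * i / QUARTER_STEPS) * 65536.0));
        }
        return t;
    }();
    return table;
}

// Sine for any angle: mirror and negate the quarter table per quadrant.
inline fixed16 sinFixed(std::int32_t angle)
{
    const std::uint32_t a = std::uint32_t(angle) & (FULL_CIRCLE - 1);
    const std::uint32_t quadrant = a / QUARTER_STEPS;
    const std::uint32_t idx = a % QUARTER_STEPS;
    const std::array<fixed16, QUARTER_STEPS + 1>& table = quarterSineTable();

    switch (quadrant)
    {
        case 0:  return table[idx];
        case 1:  return table[QUARTER_STEPS - idx];
        case 2:  return -table[idx];
        default: return -table[QUARTER_STEPS - idx];
    }
}

// Cosine is sine phase shifted by a quarter turn.
inline fixed16 cosFixed(std::int32_t angle)
{
    return sinFixed(angle + std::int32_t(QUARTER_STEPS));
}
```

In these units, `sinFixed(4096)` and `cosFixed(0)` both return 65536 (1.0 in 16.16).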

zTranslation = mul16(-camPos.z, cosYaw) + mul16(-camPos.x, negSinYaw);
xTranslation = mul16(-camPos.x, cosYaw) + mul16(-camPos.z, sinYaw);

Sectors & Walls

Drawing starts in the current player sector, processing each sector as they are seen through portals (“adjoins”). Each time a portal is traversed, “window coordinates” are tracked that clip the view to the current portal. To start with, those window coordinates exactly cover the full view. We start by processing these walls - which includes culling, projection and preparing for rendering, which will be the focus of the rest of this post. Details such as merging and sorting, drawing columns, drawing floors and ceilings, traversing portals, drawing sprites and models will be handled in future posts.

Each sector keeps track of the last frame it has been used. By using this, the engine only transforms its vertices and processes its walls - converting them to renderable segments once per frame. The first time a sector is encountered, the first step is to transform all of its vertices from world space to viewspace:

vec2* vtxWS = curSector->verticesWS;
vec2* vtxVS = curSector->verticesVS;
for (s32 w = curSector->vertexCount - 1; w >= 0; w--)
{
    vtxVS->x = mul16(vtxWS->x, cosYaw) + mul16(vtxWS->z, sinYaw) + xTranslation;
    vtxVS->z = mul16(vtxWS->x, negSinYaw) + mul16(vtxWS->z, cosYaw) + zTranslation;
    vtxVS++;
    vtxWS++;
}
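
To sanity-check the translations together with the per-vertex transform, here is a small self-contained sketch. mul16 and intToFixed16 are not defined in this post; the definitions below are the usual 16.16 fixed-point ones and are an assumption, and ViewTransform, buildViewTransform and worldToView are hypothetical names used only for this example.

```cpp
#include <cstdint>

typedef int32_t s32;
typedef int64_t s64;

// Assumed 16.16 multiply: widen to 64 bits, multiply, drop 16 fractional bits.
inline s32 mul16(s32 a, s32 b) { return (s32)(((s64)a * (s64)b) >> 16); }
inline s32 intToFixed16(s32 x) { return x * 65536; }

// Hypothetical container for the "equivalent of a matrix" values.
struct ViewTransform
{
    s32 cosYaw, sinYaw, negSinYaw;
    s32 xTranslation, zTranslation;
};

ViewTransform buildViewTransform(s32 camX, s32 camZ, s32 cosYaw, s32 sinYaw)
{
    ViewTransform t;
    t.cosYaw    = cosYaw;
    t.sinYaw    = sinYaw;
    t.negSinYaw = -sinYaw;
    // Same expressions as in the post: rotate the negated camera position.
    t.zTranslation = mul16(-camZ, t.cosYaw) + mul16(-camX, t.negSinYaw);
    t.xTranslation = mul16(-camX, t.cosYaw) + mul16(-camZ, t.sinYaw);
    return t;
}

// Transforms one world-space vertex into view space.
void worldToView(const ViewTransform& t, s32 xWS, s32 zWS, s32* xVS, s32* zVS)
{
    *xVS = mul16(xWS, t.cosYaw)    + mul16(zWS, t.sinYaw) + t.xTranslation;
    *zVS = mul16(xWS, t.negSinYaw) + mul16(zWS, t.cosYaw) + t.zTranslation;
}
```

The useful property to check is that the camera's own (x, z) position lands on the view-space origin, which is exactly what baking the rotated, negated position into the translations achieves.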

Note that the vertices are “2D” - the (y) coordinates, which map to the heights on screen, are computed later by projecting the floor and ceiling heights. It is still fair to say that the Jedi Engine, like the Doom Engine, is 3D since the third dimension is accounted for in the varying floor and ceiling heights.
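
The once-per-frame bookkeeping mentioned earlier (each sector remembering the last frame it was used) might look roughly like the sketch below. The Sector fields, s_curFrame and processSectorOnce are hypothetical names for illustration, not the engine's actual ones.

```cpp
#include <cstdint>

typedef int32_t s32;

// Hypothetical sector, reduced to what this sketch needs.
struct Sector
{
    s32 lastFrameProcessed = -1;
    s32 timesProcessed = 0;      // only here to make the effect observable
};

static s32 s_curFrame = 0;       // assumed to be incremented once per rendered frame

// Returns true if this call did the per-frame work for the sector,
// false if the sector was already handled earlier in the same frame.
bool processSectorOnce(Sector* sector)
{
    if (sector->lastFrameProcessed == s_curFrame)
    {
        return false;
    }
    sector->lastFrameProcessed = s_curFrame;
    // The vertex transform and the per-wall processing would happen here.
    sector->timesProcessed++;
    return true;
}
```

A sector that is visible through several portals in the same frame is then only transformed the first time it is reached.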

Culling

Next we must loop through every wall in the sector and process them. In Dark Forces this is what the Wall_Process() function is for. Dark Forces uses a 90 degree horizontal field of view. As a result, we can think of the top down, 2D frustum as consisting of two 45 degree right triangles. What this means is that for a given ‘depth’ (z value), the left clipping plane passes through x = -z and the right clipping plane passes through x = +z. Or to put it a different way, given a point (x,z) in viewspace - we know that if x < -z it is outside the left side of the screen and if x > +z it is outside the right side of the screen.

At this point the wall being processed consists of two view space coordinates (x0,z0) and (x1,z1). First Wall_Process() culls the wall if it is completely behind the camera and then culls it if both vertices are outside the left plane or both vertices are outside the right plane.

if (z0 < 0 && z1 < 0)
{
    wall->visible = 0;
    return;
}
if ((x0 < -z0 && x1 < -z1) || (x0 > z0 && x1 > z1))
{
    wall->visible = 0;
    return;
}

Next, Wall_Process() determines if the wall is front facing, that is, whether the camera is looking at the front or the back of the wall. We can determine which side of a wall we are on by taking the cross-product. However, the original programmer didn’t quite finish simplifying the equation, so the code does something equivalent but slightly more complicated. Expanding z0*dx - x0*dz gives z0*x1 - x0*z1, which is the 2D cross-product of the two vertices, so it is mathematically equivalent and gets the job done.

s32 dz = z1 - z0;
s32 dx = x1 - x0;
s32 side = mul16(z0, dx) - mul16(x0, dz);
if (side < 0)
{
    wall->visible = 0;
    return;
}
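
As a quick worked example of the facing test, the sketch below wraps the same expression in a function and checks that reversing the vertex order flips the result. mul16 and intToFixed16 are assumed to be the usual 16.16 fixed-point helpers (they are not defined in this post), and isFrontFacing is a hypothetical name.

```cpp
#include <cstdint>

typedef int32_t s32;
typedef int64_t s64;

// Assumed 16.16 fixed-point helpers.
inline s32 mul16(s32 a, s32 b) { return (s32)(((s64)a * (s64)b) >> 16); }
inline s32 intToFixed16(s32 x) { return x * 65536; }

// Same expression as in the post. With the camera at the view-space origin,
// z0*dx - x0*dz expands to z0*x1 - x0*z1, a 2D cross-product of the vertices.
bool isFrontFacing(s32 x0, s32 z0, s32 x1, s32 z1)
{
    const s32 dz = z1 - z0;
    const s32 dx = x1 - x0;
    const s32 side = mul16(z0, dx) - mul16(x0, dz);
    return side >= 0;
}
```

For a wall straight ahead running from (-1, 2) to (1, 2), side is 2*2 - (-1)*0 = 4, so it is front facing; swapping the two vertices gives -4 and the wall is culled.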

Clipping

Now things get a little more complicated. I won’t post the code since there is a lot of it, but the next phase of the function is clipping the wall to the view frustum and to the near plane. There are still a few interesting things to point out, though. First, whenever a wall changes in Dark Forces (i.e. the vertices are moved, such as in the case of rotating doors or moving sectors) - its length in texels is computed and stored. Dark Forces uses a consistent texel density - 8 texels per DFU. To put it simply:

wall->texelLength = mul16( length(x0, z0, x1, z1), intToFixed16(8) );
length in texels = length of wall * 8.0

Anyway, when clipping, the wall's length in texels must be tracked and updated, in addition to the starting texel offset after clipping (which is 0 if no clipping is required).
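
To give a feel for what that tracking involves, here is a simplified sketch of clipping the start of a wall against the left frustum plane (x = -z). It is not the original code: it uses floats instead of fixed point for readability, handles only this one case, and ClippedWall and clipStartToLeftPlane are hypothetical names.

```cpp
// Hypothetical result type for this sketch.
struct ClippedWall
{
    float x0, z0;        // new start vertex, lying on the left plane
    float texelOffset;   // starting texel offset after clipping
    float texelLength;   // remaining length in texels
};

// Clips the start of a wall against the left frustum plane x = -z.
// Assumes (x0, z0) is outside the plane (x0 < -z0) and (x1, z1) is inside,
// which also guarantees that (dx + dz) is positive.
ClippedWall clipStartToLeftPlane(float x0, float z0, float x1, float z1, float texelLength)
{
    const float dx = x1 - x0;
    const float dz = z1 - z0;
    // Solve x0 + t*dx = -(z0 + t*dz) for the parametric position t in [0, 1].
    const float t = -(x0 + z0) / (dx + dz);

    ClippedWall out;
    out.x0 = x0 + t * dx;
    out.z0 = z0 + t * dz;
    // The part of the wall that was cut away shifts the first visible texel.
    out.texelOffset = t * texelLength;
    out.texelLength = texelLength - out.texelOffset;
    return out;
}
```

For example, an 8 DFU wall (64 texels) from (-6, 2) to (2, 2) crosses the left plane halfway along, so it is clipped to start at (-2, 2) with a texel offset of 32 and 32 texels remaining.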

Projection

Once the wall is clipped to the view frustum, it is finally time to project it into screenspace. Note that we still don’t care about the wall height or camera pitch. For now we are still focusing on the x coordinates. First I will show the code and then explain it, since it is pretty simple:

x0_screen = div16( mul16(x0, focalLength), z0 ) + halfWidth;
x1_screen = div16( mul16(x1, focalLength), z1 ) + halfWidth;
x0_pixel = round16(x0_screen);
x1_pixel = round16(x1_screen);

halfWidth is half of the width of the screen in pixels.
focalLength controls the horizontal field of view - it is basically halfWidth / tan(FOV/2). Since FOV = 90 degrees and tan(45 degrees) = 1, focalLength is actually the same value as halfWidth. But Dark Forces stores them as two separate variables.

Finally, round16() rounds the fixed point number and then converts it to an integer. Basically if the fractional part is less than 0.5 it rounds down, otherwise it rounds up. The code is actually really simple:

s32 round16(s32 x) { return (x + HALF_16) >> 16; }
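
Putting the projection pieces together, here is a small runnable sketch. The fixed-point helpers (mul16, div16, intToFixed16, HALF_16) are not defined in this post, so the definitions below are assumptions based on the usual 16.16 conventions, and projectToPixelX is a hypothetical wrapper. The example values assume a 320 pixel wide screen, so halfWidth and focalLength are both 160 in fixed point.

```cpp
#include <cstdint>

typedef int32_t s32;
typedef int64_t s64;

// Assumed 16.16 fixed-point helpers.
static const s32 HALF_16 = 0x8000;
inline s32 intToFixed16(s32 x) { return x * 65536; }
inline s32 mul16(s32 a, s32 b) { return (s32)(((s64)a * (s64)b) >> 16); }
inline s32 div16(s32 a, s32 b) { return (s32)(((s64)a * 65536) / (s64)b); }
inline s32 round16(s32 x) { return (x + HALF_16) >> 16; }

// Projects a view-space (x, z) point to an integer pixel column.
// zView is assumed to be positive, which clipping guarantees at this stage.
s32 projectToPixelX(s32 xView, s32 zView, s32 focalLength, s32 halfWidth)
{
    const s32 xScreen = div16(mul16(xView, focalLength), zView) + halfWidth;
    return round16(xScreen);
}
```

With these values a point straight ahead (x = 0) lands on column 160, a point on the right clipping plane (x = z) lands on column 320, and a point on the left clipping plane (x = -z) lands on column 0.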

More Culling

Once the wall is projected, Dark Forces tries culling it again just to be sure.

First, backface culling. Since we know the orientation of the walls, we know that if the projected x0 > x1 then we are looking at the back side of the wall:
if (x0_pixel > x1_pixel)
{
    wall->visible = 0;
    return;
}
Next we check if the wall is outside of the screen. Since we know the order of the x values, we can just see if the minimum value is past the right edge or the maximum value is past the left edge:
if (x0_pixel > s_maxScreenX || x1_pixel < s_minScreenX)
{
    wall->visible = 0;
    return;
}

Finally, Dark Forces has a limit of 384 potentially visible walls. Fortunately each sector is only processed once and culled walls don’t count, so this limit isn’t as severe as it seems. Also note that only potentially visible sectors are processed in this way, so sectors that are never reached don’t count towards this limit either.
if (s_nextWall == MAX_SEG)
{
    errorMessage(5, 20, "Wall_Process : Maximum processed walls exceeded!");
    return;
}

Wall Segments

The walls themselves aren’t directly used for rendering. Instead processed walls are converted into “WallSegments.” The remainder of Wall_Process() handles creating a new segment which will be merged, clipped and sorted later during rendering.

Most elements are fairly straightforward, so I will just show the code:
wallSeg->srcWall = wall;
wallSeg->wallX0_raw = x0_pixel;
wallSeg->wallX1_raw = x1_pixel;
wallSeg->z0 = z0;
wallSeg->z1 = z1;
wallSeg->uCoord0 = curU;
wallSeg->wallX0 = x0_pixel;
wallSeg->wallX1 = x1_pixel;
wallSeg->x0View = x0;

Like I said, this is mostly straightforward. As WallSegments are processed, they are merged and clipped against existing segments and against the current window (or portal). This means that the projected (x) values are clamped during this process. To compute proper per-column depth (z) and texture coordinates, we have to keep around the ‘raw’ version of the coordinates, that is, the version without any of this additional clipping.

Next, remember in the clipping section when I mentioned we have to keep track of the starting ‘u’ texture coordinate after clipping? That is stored in uCoord0. If the renderer didn’t track this then the texture would swim as the wall went off screen. Finally we also save the view space x coordinate; this will be useful later for calculating per-column depth and u coordinate.

In the next step we have to compute two more values used to compute per-column depth (z) and texture coordinates - the slope of the wall and the texture scaling factor. We only have 16 bits of fractional precision available and we have to deal with purely horizontal or vertical walls (when viewed from top down) - so this complicates this part.

dx = x1 - x0;
dz = z1 - z0;
s32 slope, den;
if (abs(dx) > abs(dz))
{
    slope = div16(dz, dx);
    den = dx;
}
else
{
    slope = div16(dx, dz);
    den = dz;
}
wallSeg->slope = slope;
wallSeg->uScale = div16(texelLengthRem, den);

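abs(x) = Absolute Value of x
Notice that we compute abs(dx) and abs(dz) (where dx = x1 - x0 and dz = z1 - z0) and use the larger value as the denominator in the slope. How does that help us? The magnitude of the slope never exceeds 1, so it fits comfortably in the available precision, and the denominator is never zero for a wall with non-zero length, which takes care of purely horizontal or vertical walls. Let’s assume that abs(dx) > abs(dz):

Assume we want to compute the depth (z) at any point (x) along the WallSegment. We know that z(x0) = z0 and z(x1) = z1. So:
z(x) = z0 + (x - x0)*dz/dx => z(x) = z0 + (x - x0)*slope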
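
Here is a small sketch of the slope selection and the depth formula for the abs(dx) > abs(dz) case. As before, mul16, div16 and intToFixed16 are assumed 16.16 fixed-point helpers that this post does not define, and WallSlope, computeSlope and depthAtX are hypothetical names. The sketch assumes the wall has non-zero length.

```cpp
#include <cstdint>
#include <cstdlib>

typedef int32_t s32;
typedef int64_t s64;

// Assumed 16.16 fixed-point helpers.
inline s32 intToFixed16(s32 x) { return x * 65536; }
inline s32 mul16(s32 a, s32 b) { return (s32)(((s64)a * (s64)b) >> 16); }
inline s32 div16(s32 a, s32 b) { return (s32)(((s64)a * 65536) / (s64)b); }

// Hypothetical result type: the slope, its denominator, and which axis was larger.
struct WallSlope
{
    s32 slope;
    s32 den;
    bool xMajor;   // true when abs(dx) > abs(dz)
};

// Divides by whichever delta has the larger magnitude, so the slope
// stays in [-1, 1] and the denominator is non-zero for a real wall.
WallSlope computeSlope(s32 x0, s32 z0, s32 x1, s32 z1)
{
    const s32 dx = x1 - x0;
    const s32 dz = z1 - z0;
    WallSlope out;
    out.xMajor = std::abs(dx) > std::abs(dz);
    if (out.xMajor) { out.slope = div16(dz, dx); out.den = dx; }
    else            { out.slope = div16(dx, dz); out.den = dz; }
    return out;
}

// Depth at view-space x along an x-major wall: z(x) = z0 + (x - x0) * slope.
s32 depthAtX(s32 x, s32 x0, s32 z0, s32 slope)
{
    return z0 + mul16(x - x0, slope);
}
```

For a wall from (0, 4) to (8, 8) the slope is 0.5, so the depth halfway along (x = 4) is 6 and the depth at x = 8 is 8, matching z1. A wall with dx = 0 takes the other branch and gets a slope of 0 instead of a division by zero.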

Adding the WallSegment

So we have created a WallSegment, ready to be rendered. What next? First we have a list of “source segments” - again only computed once per frame for each potentially visible sector. So this segment is simply added to that list:

WallSegment* wallSeg = &s_wallSegListSrc[s_nextWall];
s_nextWall++;

Conclusion

Now we leave Wall_Process() and continue on to drawing the wall. In Part 2 I will go over the merge/sort/clip process so we can start drawing the visible parts of the WallSegments!


First Post

This is my first “real” post on this blog. Several years ago there was a project called the “XL Engine”, which evolved from DarkXL and had lofty ambitions. I personally hit some difficult times but never properly canceled the project, even though I couldn’t get back to it for a long time and didn’t really want to for a long time after that. Fast forward to today: things are much better now and I have more free time, but time has moved on and the XL Engine isn’t really necessary anymore - both Daggerfall and Blood have great projects that fill the niche the XL Engine wanted to fill (or close enough).

But the Jedi Engine never had a proper source release or reverse engineering effort. While many considered DarkXL to be a promising effort, it was incomplete and inaccurate in many ways. Ultimately the methods used to reproduce the game could never be 100% faithful in the ways that matter. And so The Force Engine was conceived as a complete re-write, rebuilt from the ground up with the following tenets:

  • Support all Jedi based games - Dark Forces and Outlaws.
  • Source-level accuracy by reverse engineering the original executables.
  • Focus on making sure the core games are completely supported before adding effects on top like dynamic lighting. This means starting with the software renderer, just like the originals.
  • Open sourcing the project the moment it goes public.
  • Cross platform support (though admittedly this last area still needs a lot of work before the first release).

While the GitHub repository has been made public, The Force Engine has not been officially announced and no release has been made. It will be some time before that happens (see the Roadmap).
