Common File Formats for Build Engine Games

The Group File

The group file is the main storage file format used in Duke Nukem 3D and Shadow Warrior. Group files can be thought of as "containers". These "containers" hold multiple files, making it easy to deal with many files at once (moving one file, in essence, moves many other files as well).

The group file has the following format:

I - The File Signature

The first twelve (12) bytes of a group file are known as the file "signature". These 12 bytes contain the name of the original designer of the group file format. That person is Ken Silverman, and the first 12 bytes appear like this (without the quotation marks): "KenSilverman"

II - The File Count

The next four (4) bytes of a group file are the number of files that are contained within the group file.

III - File Header Table

The file header table is made up of one 16-byte structure for each file contained in the group file (so if our group file has 25 embedded files in it, the header table has 25 structures that are each 16 bytes long). The first 12 bytes of the structure are the filename, and the last 4 bytes are the file's size. Note that the 12 bytes for the filename include the period and the file extension, so the part of the name before the period can be no longer than 8 characters when a three-character extension is used. For example, Tiles000.art is as long a filename as you can have.

IV - The Raw File Data

Everything after the file header table is simply the raw file data, packed one after the other, in the exact same order as the files appear in the file header table.
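The four parts above are enough to locate any embedded file. What follows is a minimal Python sketch, not code taken from any Build tool. The helper name parse_grp is hypothetical, and the sketch assumes two things the text above does not state: that the file count and file sizes are little-endian 32-bit integers, and that filenames shorter than 12 bytes are padded with NUL bytes.

```python
import struct

SIGNATURE = b"KenSilverman"  # the 12-byte signature described in part I


def parse_grp(data: bytes):
    """Return a list of (name, offset, size), one per file in a group file image."""
    if data[:12] != SIGNATURE:
        raise ValueError("not a group file: bad signature")
    # Assumption: the 4-byte file count is a little-endian unsigned integer.
    (count,) = struct.unpack_from("<I", data, 12)
    entries = []
    # Raw data starts right after the signature, the count and the header table.
    offset = 16 + 16 * count
    for i in range(count):
        raw_name, size = struct.unpack_from("<12sI", data, 16 + 16 * i)
        # Assumption: names shorter than 12 bytes are NUL-padded.
        name = raw_name.split(b"\0", 1)[0].decode("ascii")
        entries.append((name, offset, size))
        offset += size  # files are packed back to back, in table order
    return entries
```

Because the header table stores only sizes, each file's offset has to be accumulated by walking the table in order, which is why the function returns the computed offset alongside each name.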

ART Files

The majority of the following text comes from the "artform.txt" file, but I have made some additions. My additions and comments appear as green bold text.

Documentation on Ken's .ART file format by Ken Silverman

I am documenting my ART format to allow you to program your own custom art utilities if you so desire. I am still planning on writing the script system.

All art files must be named in the form xxxxx###.ART. When loading art files you should keep trying to open new xxxxx###'s, incrementing the number, until an art file is not found.
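The loading loop described above might look like the following Python sketch. The function name find_art_files and the default prefix "TILES" (borrowed from the TILES000.ART example names used in this document) are illustrative assumptions. On a case-sensitive file system the file names on disk would have to match the generated names exactly.

```python
import os


def find_art_files(directory, prefix="TILES"):
    """Collect PREFIX000.ART, PREFIX001.ART, ... stopping at the first gap."""
    found = []
    number = 0
    while number <= 999:  # three digits allow at most 1000 files
        path = os.path.join(directory, f"{prefix}{number:03d}.ART")
        if not os.path.isfile(path):
            break  # the first missing number ends the search
        found.append(path)
        number += 1
    return found
```

Note that a gap in the numbering hides every later file: with files 000, 001 and 003 present, only the first two are returned.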

1. long artversion;

The first 4 bytes in the art format are the version number. The current art version is now 1. If artversion is not 1 then either it's the wrong art version or something is wrong. This can be useful, but I don't recall checking it in Group File Studio (so it might not be needed in any 3rd party code).

2. long numtiles;

Numtiles is not really used anymore. I wouldn't trust it. Actually when I was originally planning art version 1 many months ago, I thought I would need this variable, but it turned out it was unnecessary. To get the number of tiles, you should search all art files, and check the localtilestart and localtileend values for each file. This is a useless bit of information - it is not used in Group File Studio to calculate the number of tiles in a file. Just use the localtilestart and localtileend values to get the number of tiles in the current art file.

3. long localtilestart;

Localtilestart is the tile number of the first tile in this art file.

4. long localtileend;

Localtileend is the tile number of the last tile in this art file. Note: Localtileend CAN be higher than the last used slot in an art file. Example: If you chose 256 tiles per art file:

  • TILES000.ART -> localtilestart = 0, localtileend = 255
  • TILES001.ART -> localtilestart = 256, localtileend = 511
  • TILES002.ART -> localtilestart = 512, localtileend = 767
  • TILES003.ART -> localtilestart = 768, localtileend = 1023
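The four fields described so far take up the first 16 bytes of an art file, and the tile count follows from localtilestart and localtileend. Here is a small Python sketch of that calculation. The name read_art_header is hypothetical, and the sketch assumes each "long" is a 4-byte little-endian signed integer.

```python
import struct


def read_art_header(data: bytes):
    """Decode the four leading fields; return (start, end, tile_count)."""
    # Assumption: each "long" is a 4-byte little-endian signed integer.
    artversion, numtiles, start, end = struct.unpack_from("<4i", data, 0)
    if artversion != 1:
        raise ValueError(f"unexpected art version {artversion}")
    # numtiles is ignored on purpose; only the tile range is trusted.
    return start, end, end - start + 1
```

For TILES001.ART in the example above, this would give a range of 256 to 511 and a count of 256 tiles.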

5. short tilesizx[localtileend-localtilestart+1];

This is an array of shorts of all the x dimensions of the tiles in this art file. If you chose 256 tiles per art file then [localtileend-localtilestart+1] should equal 256.

6. short tilesizy[localtileend-localtilestart+1];

This is an array of shorts of all the y dimensions.

7. long picanm[localtileend-localtilestart+1];

This array of longs stores a few attributes for each tile that you can set inside EDITART. You probably won't be touching this array, but I'll document it anyway.

  • Bits 0-5 = Animation Number
    • The animation number is essentially the number of tiles included in the animation. This number does not include the current tile. So if the number here is a 4, there are 4 tiles *following* the current tile that are included in the animation sequence.
  • Bits 6-7 = Animation Type
    • These two bits determine the type of animation:
      00 - No Animation
      01 - Oscillating Animation
      10 - Forward Animation
      11 - Backward Animation
  • Bits 8-15 = (signed char) x-center offset
  • Bits 16-23 = (signed char) y-center offset
  • Bits 24-27 = Animation speed
    • The animation speed is based on the following data (which comes directly from Ken Silverman). There are 16 possible speeds (0 to 15). In Editart, the fastest rate is also the default (value of 0). Here are the possible animation speeds (note the transition from frames per second to seconds per frame between speed #6 and speed #7):
      • 0: 120 frames/sec
      • 1: 60 frames/sec
      • 2: 30 frames/sec
      • 3: 15 frames/sec
      • 4: 7.5 frames/sec
      • 5: 3.75 frames/sec
      • 6: 1.875 frames/sec
      • 7: 1.0666... sec/frame
      • 8: 2.1333... sec/frame
      • 9: 4.2666... sec/frame
      • 10: 8.5333... sec/frame
      • 11: 17.0666... sec/frame
      • 12: 34.1333... sec/frame
      • 13: 68.2666... sec/frame
      • 14: 136.5333... sec/frame
      • 15: 273.0666... sec/frame
    • Animation rates after the value 7 seem kind of useless. The equation for this value system is:
      • frame_number = (totalclock >> animspeed)
    • Totalclock is a timer, incremented by one, 120 times per second
    • frame_number then gets MODed and possibly reversed, etc. (depending on the animation type).
  • Bits 28-31 = Nothing
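The bit layout above can be unpacked with a few shifts and masks. The Python sketch below is illustrative only: decode_picanm, ANIM_TYPES and seconds_per_frame are hypothetical names. The timing helper simply restates the equation frame_number = (totalclock >> animspeed) with the 120-ticks-per-second clock, and it does not attempt the MOD or reversal step.

```python
ANIM_TYPES = {0: "none", 1: "oscillating", 2: "forward", 3: "backward"}


def decode_picanm(value: int):
    """Split one 32-bit picanm entry into its documented bit fields."""
    def signed8(b):
        # Reinterpret an unsigned byte as a signed char.
        return b - 256 if b >= 128 else b

    return {
        "anim_count": value & 0x3F,                  # bits 0-5
        "anim_type": (value >> 6) & 0x3,             # bits 6-7
        "x_offset": signed8((value >> 8) & 0xFF),    # bits 8-15
        "y_offset": signed8((value >> 16) & 0xFF),   # bits 16-23
        "anim_speed": (value >> 24) & 0xF,           # bits 24-27
    }


def seconds_per_frame(anim_speed: int) -> float:
    """A 120 Hz clock shifted right by anim_speed steps once every 2**anim_speed ticks."""
    return (1 << anim_speed) / 120.0
```

With this helper, speed 6 works out to 64/120 seconds per frame (1.875 frames/sec) and speed 7 to 128/120 seconds per frame (1.0666...), matching the list above.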

8. The Raw Data

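After the picanm's, the rest of the file is straightforward rectangular art data. The file stores no offsets for individual tiles, so you must go through the tilesizx and tilesizy arrays, adding up the sizes of the preceding tiles, to find where each piece of artwork is actually stored in this file.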
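Walking the size arrays to find each tile could be sketched in Python as follows. The name tile_offsets is hypothetical. The sketch assumes little-endian fields with 4-byte longs and 2-byte shorts, and one byte per pixel, none of which is spelled out above.

```python
import struct


def tile_offsets(data: bytes):
    """Map each tile number to (offset, xsize, ysize) within an ART file image."""
    # Assumption: little-endian fields; "long" = 4 bytes, "short" = 2 bytes.
    _version, _numtiles, start, end = struct.unpack_from("<4i", data, 0)
    count = end - start + 1
    xs = struct.unpack_from(f"<{count}h", data, 16)
    ys = struct.unpack_from(f"<{count}h", data, 16 + 2 * count)
    # Skip tilesizx, tilesizy and picanm (4 bytes per tile) to reach the art data.
    offset = 16 + 2 * count + 2 * count + 4 * count
    table = {}
    for i in range(count):
        table[start + i] = (offset, xs[i], ys[i])
        # Assumption: one byte per pixel, so a 0x0 slot takes up no space.
        offset += xs[i] * ys[i]
    return table
```

Unused slots with zero dimensions share their offset with the next tile, which is consistent with localtileend being higher than the last used slot.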

Note: The tiles are stored in the opposite coordinate system from the one used for screen memory: the data runs down each column in turn rather than across each row. (I now know why Ken did this; the following quote is the explanation.)

A while ago I wrote a simple routine to draw textured vertical walls DOOM-style. Because the Z coordinate stays constant along the Y direction, my routine drew by columns. I was distantly aware that this made my memory accesses less linear, and therefore slower, but I figured that it wasn't that important since my K6s 32k L1 data cache was big enough to fit half my texture in. When the frame-rate that resulted were lower than my original 33mhrz 486 got under DOOM I was rather startled and disappointed, but looking through the code I couldn't see anything that should improve speed by more than a factor of 3 or 4. Wondering if my compiler had gone berserk, or if I was just a bad programmer, I started optimizing. The second optimization I performed was to switch the X and Y coordinates when reading from my texture, which used the standard 2-dimensional array format. Suddenly the performance went up 10-fold! Analyzing the access patterns I realized that previously *every* read from my texture had been a cache-miss!

The above quote was taken from http://www.azillionmonkeys.com/qed/optimize.html

Example on a 4*4 file:

Offsets:
+---+---+---+---+
| 0 | 4 | 8 |12 |
+---+---+---+---+
| 1 | 5 | 9 |13 |
+---+---+---+---+
| 2 | 6 |10 |14 |
+---+---+---+---+
| 3 | 7 |11 |15 |
+---+---+---+---+
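Reading the diagram above, the pixel at column x and row y sits at offset x * ysize + y within the tile's data. A short Python sketch of converting a tile into screen-order rows follows. The name tile_to_rows is hypothetical and the sketch assumes one byte per pixel.

```python
def tile_to_rows(pixels: bytes, xsize: int, ysize: int):
    """Convert column-major tile bytes into a list of screen-order rows."""
    if len(pixels) != xsize * ysize:
        raise ValueError("pixel data does not match tile dimensions")
    # The pixel at column x, row y lives at offset x * ysize + y.
    return [bytes(pixels[x * ysize + y] for x in range(xsize))
            for y in range(ysize)]
```

For the 4*4 example, the first row returned holds the bytes found at offsets 0, 4, 8 and 12, and the last row holds those at offsets 3, 7, 11 and 15.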
