B800 Text Format

Most shareware (and registered!) games of the era displayed a full-screen text page immediately before returning to the DOS prompt, and Cosmo was no exception:

Sample of a full screen of B800 text.

These screens showed colorful line-drawn boxes, customarily in front of a background that resembled the closed curtains on a theater stage. Most contained teasers and ordering information in the case of shareware episodes, or messages thanking the player for purchasing the registered episodes.

Much less commonly, this type of text display was used to provide a more substantial-looking error message in cases where the game was unable to run on a particular system:

Another sample of B800 text, this time for a low memory error.

Red Marking Pen

These were some of the, shall we say, “less proofread” parts of the game.

There were several B800 text entries in the game’s group files:

Entry Name      Description
COSMO1.MNI      End screen for episode 1. Contains teasers for the registered game and ordering information.
COSMO2.MNI      End screen for episode 2. Contains advertisements for other Apogee games.
COSMO3.MNI      End screen for episode 3. Identical to COSMO2.MNI except for the episode number in the top line of text.
NOMEMORY.MNI    Error message explaining that the system does not have enough memory.

The B800 Mechanism

PC-compatible graphics adapters that supported color usually booted into mode number 03h by default. This was a text mode providing 80x25 characters with 16 colors. The foreground and background color of each character on the screen could be set independently.

B800 files are named after the segment address where the mode 03h screen buffer is located: B800:0000. The 4,000 bytes at this memory-mapped address contain the full screen buffer, and writing data within this address range immediately changes the text/colors displayed on the screen. That’s all the file format is – 4,000 bytes that are loaded directly into the video memory in order to put characters on the screen.

The printable text content uses 2,000 bytes (80 × 25) of memory, the foreground colors use 1,000 bytes (80 × 25 × log₂ 16 = 8,000 bits), and the background colors use another 1,000 bytes. The colors are encoded using the standard RGBI palette with one key difference: on the background only, the intensity bit flashes the foreground text instead of brightening the background color. B800 text files utilized this liberally.
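The size arithmetic above can be checked with a few lines of Python. This is only an illustrative sketch; the constant and variable names are invented for the example and do not come from the game's source.

```python
# Screen geometry for text mode 03h, as described above.
COLUMNS = 80
ROWS = 25
COLORS = 16

cells = COLUMNS * ROWS                    # 2,000 character cells
char_bytes = cells                        # one byte per displayed character
color_bits = (COLORS - 1).bit_length()    # log2(16) = 4 bits per color
fg_bytes = cells * color_bits // 8        # foreground colors
bg_bytes = cells * color_bits // 8        # background colors (incl. flash bit)
total_bytes = char_bytes + fg_bytes + bg_bytes

print(cells, fg_bytes, bg_bytes, total_bytes)  # 2000 1000 1000 4000
```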

Encoding and Decoding

Within a B800 text file, and the memory area it mirrors, every even-addressed byte controls the character displayed on the screen, and every odd-addressed byte controls the attributes (foreground color, background color, flashing) of the character that immediately precedes it.

Characters and attributes are stored in row-major order, starting at the top-left corner of the screen.
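As a minimal sketch of that layout, the byte offsets for any screen cell can be computed from its column and row. The function name `cell_offsets` is hypothetical, and coordinates are assumed to be zero-based with x as the column and y as the row.

```python
COLUMNS = 80
ROWS = 25

def cell_offsets(x, y):
    """Return (character_offset, attribute_offset) for column x, row y."""
    if not (0 <= x < COLUMNS and 0 <= y < ROWS):
        raise ValueError("coordinates are outside the 80x25 screen")
    # Row-major order, two bytes per cell: character first, then attribute.
    char_offset = (y * COLUMNS + x) * 2
    return char_offset, char_offset + 1

print(cell_offsets(0, 0))    # top-left cell
print(cell_offsets(79, 24))  # bottom-right cell
```

The character always lands on an even offset and its attribute on the odd offset that follows.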

Characters

Unicode 1.0 was less than a year old when the game was released, and DOS PCs were rooted firmly in the world of code pages. The basic idea was this: Since there were only eight bits available for each character on the screen, there could only be 256 (2⁸) different characters (or code points) to choose from. This isn’t too much of a problem if all you’re doing is writing using the Latin alphabet, but toss in Cyrillic, Greek, Arabic, Hebrew, Urdu… You run out of encoding space real fast. Using code pages, the system can “switch” to a different display font, where the underlying text data doesn’t change but the fonts on the display do:

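Byte  CP437   CP850   CP860   CP737   CP775   CP856   CP861   CP863   CP864   CP866
...   ...
80h   Ç       Ç       Ç       Α       Ć       א       Ç       Ç       °       А
81h   ü       ü       ü       Β       ü       ב       ü       ü       ·       Б
82h   é       é       é       Γ       é       ג       é       é       ∙       В
83h   â       â       â       Δ       ā       ד       â       â       √       Г
84h   ä       ä       ã       Ε       ä       ה       ä       Â       ▒       Д
85h   à       à       à       Ζ       ģ       ו       à       à       ─       Е
86h   å       å       Á       Η       å       ז       å       ¶       │       Ж
87h   ç       ç       ç       Θ       ć       ח       ç       ç       ┼       З
88h   ê       ê       ê       Ι       ł       ט       ê       ê       ┤       И
89h   ë       ë       Ê       Κ       ē       י       ë       ë       ┬       Й
8Ah   è       è       è       Λ       Ŗ       ך       è       è       ├       К
8Bh   ï       ï       Í       Μ       ŗ       כ       Ð       ï       ┴       Л
8Ch   î       î       Ô       Ν       ī       ל       ð       î       ┐       М
8Dh   ì       ì       ì       Ξ       Ź       ם       Þ       ‗       ┌       Н
8Eh   Ä       Ä       Ã       Ο       Ä       מ       Ä       À       └       О
8Fh   Å       Å       Â       Π       Å       ן       Å       §       ┘       П
90h   É       É       É       Ρ       É       נ       É       É       β       Р
91h   æ       æ       À       Σ       æ       ס       æ       È       ∞       С
92h   Æ       Æ       È       Τ       Æ       ע       Æ       Ê       φ       Т
93h   ô       ô       ô       Υ       ō       ף       ô       ô       ±       У
94h   ö       ö       õ       Φ       ö       פ       ö       Ë       ½       Ф
95h   ò       ò       ò       Χ       Ģ       ץ       þ       Ï       ¼       Х
96h   û       û       Ú       Ψ       ¢       צ       û       û       ≈       Ц
97h   ù       ù       ù       Ω       Ś       ק       Ý       ù       «       Ч
98h   ÿ       ÿ       Ì       α       ś       ר       ý       ¤       »       Ш
99h   Ö       Ö       Õ       β       Ö       ש       Ö       Ô       ﻷ       Щ
9Ah   Ü       Ü       Ü       γ       Ü       ת       Ü       Ü       ﻸ       Ъ
9Bh   ¢       ø       ¢       δ       ø       (none)  ø       ¢       (none)  Ы
9Ch   £       £       £       ε       £       £       £       £       (none)  Ь
9Dh   ¥       Ø       Ù       ζ       Ø       (none)  Ø       Ù       ﻻ       Э
9Eh   ₧       ×       ₧       η       ×       ×       ₧       Û       ﻼ       Ю
9Fh   ƒ       ƒ       Ó       θ       ¤       (none)  ƒ       ƒ       (none)  Я
...   ...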

Note that in the preceding table, the raw bytes never change. All that changes from one code page to another is what the visual representation of a code point is.
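The same effect can be reproduced on a modern system. The following sketch uses the code page codecs that ship with Python's standard library to decode one fixed byte sequence several ways; the helper name `decode_everywhere` is made up for this example.

```python
RAW = bytes([0x80, 0x81, 0x82])  # the same three bytes every time

CODE_PAGES = ("cp437", "cp850", "cp737", "cp775", "cp866")

def decode_everywhere(data):
    """Decode identical bytes under each code page in CODE_PAGES."""
    return {name: data.decode(name) for name in CODE_PAGES}

for name, text in decode_everywhere(RAW).items():
    print(name, text)
```

Only the interpretation changes from one line of output to the next; `RAW` itself is never modified.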

The lower half of most code pages was the same, mirroring the 7-bit character assignments originally defined by the American Standard Code for Information Interchange (ASCII) in the 1960s. This allowed for compatible rendering of the letters A-Z, digits, and common printable punctuation characters across all configurations. The upper half (80h and above), however, was a free-for-all.

The prevailing theory of the day was that files would tend to only be opened on computers that were geographically near one another, and all of those computers would be configured to use the same code page for display. If a file was written with the expectation that it would be displayed in one code page, and was instead opened on a computer with a different code page loaded, it could display as abject gibberish. The IBM PC (and most compatibles sold in the Western world) booted into code page 437 (CP437) by default.

CP437 defined a few international characters – mostly currency symbols, accented vowels to properly write out certain names and loanwords from other languages, and enough Greek letters and mathematical symbols to write physics equations. Most of the rest of the characters were for block and box drawing, allowing solid filled areas and continuous single- or double-lines to be drawn in all four directions with corners and intersections. These box drawing characters were integral to some of the first text-based UIs in DOS software. They were also used extensively in B800 text screens.

The full CP437 character set is as follows:

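00h (blank)   20h (space)   40h @   60h `   80h Ç   A0h á   C0h └   E0h α
01h ☺         21h !         41h A   61h a   81h ü   A1h í   C1h ┴   E1h ß
02h ☻         22h "         42h B   62h b   82h é   A2h ó   C2h ┬   E2h Γ
03h ♥         23h #         43h C   63h c   83h â   A3h ú   C3h ├   E3h π
04h ♦         24h $         44h D   64h d   84h ä   A4h ñ   C4h ─   E4h Σ
05h ♣         25h %         45h E   65h e   85h à   A5h Ñ   C5h ┼   E5h σ
06h ♠         26h &         46h F   66h f   86h å   A6h ª   C6h ╞   E6h µ
07h •         27h '         47h G   67h g   87h ç   A7h º   C7h ╟   E7h τ
08h ◘         28h (         48h H   68h h   88h ê   A8h ¿   C8h ╚   E8h Φ
09h ○         29h )         49h I   69h i   89h ë   A9h ⌐   C9h ╔   E9h Θ
0Ah ◙         2Ah *         4Ah J   6Ah j   8Ah è   AAh ¬   CAh ╩   EAh Ω
0Bh ♂         2Bh +         4Bh K   6Bh k   8Bh ï   ABh ½   CBh ╦   EBh δ
0Ch ♀         2Ch ,         4Ch L   6Ch l   8Ch î   ACh ¼   CCh ╠   ECh ∞
0Dh ♪         2Dh -         4Dh M   6Dh m   8Dh ì   ADh ¡   CDh ═   EDh φ
0Eh ♫         2Eh .         4Eh N   6Eh n   8Eh Ä   AEh «   CEh ╬   EEh ε
0Fh ☼         2Fh /         4Fh O   6Fh o   8Fh Å   AFh »   CFh ╧   EFh ∩
10h ►         30h 0         50h P   70h p   90h É   B0h ░   D0h ╨   F0h ≡
11h ◄         31h 1         51h Q   71h q   91h æ   B1h ▒   D1h ╤   F1h ±
12h ↕         32h 2         52h R   72h r   92h Æ   B2h ▓   D2h ╥   F2h ≥
13h ‼         33h 3         53h S   73h s   93h ô   B3h │   D3h ╙   F3h ≤
14h ¶         34h 4         54h T   74h t   94h ö   B4h ┤   D4h ╘   F4h ⌠
15h §         35h 5         55h U   75h u   95h ò   B5h ╡   D5h ╒   F5h ⌡
16h ▬         36h 6         56h V   76h v   96h û   B6h ╢   D6h ╓   F6h ÷
17h ↨         37h 7         57h W   77h w   97h ù   B7h ╖   D7h ╫   F7h ≈
18h ↑         38h 8         58h X   78h x   98h ÿ   B8h ╕   D8h ╪   F8h °
19h ↓         39h 9         59h Y   79h y   99h Ö   B9h ╣   D9h ┘   F9h ∙
1Ah →         3Ah :         5Ah Z   7Ah z   9Ah Ü   BAh ║   DAh ┌   FAh ·
1Bh ←         3Bh ;         5Bh [   7Bh {   9Bh ¢   BBh ╗   DBh █   FBh √
1Ch ∟         3Ch <         5Ch \   7Ch |   9Ch £   BCh ╝   DCh ▄   FCh ⁿ
1Dh ↔         3Dh =         5Dh ]   7Dh }   9Dh ¥   BDh ╜   DDh ▌   FDh ²
1Eh ▲         3Eh >         5Eh ^   7Eh ~   9Eh ₧   BEh ╛   DEh ▐   FEh ■
1Fh ▼         3Fh ?         5Fh _   7Fh ⌂   9Fh ƒ   BFh ┐   DFh ▀   FFh (blank🐰🥚)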

Everything old is new again.

There are some precursors to emoji in the CP437 table – especially the faces, arrows, and playing card suits. Some computers may even display some of the above characters as emoji.

In files and console output, the control characters (00h-1Fh) represent newlines, tabs, audible beeps, and other commands to move the cursor around. In the video memory, however, these characters have absolutely no special behavior and display a single character just like any other.
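This distinction matters when converting B800 data on a modern system. Python's built-in `cp437` codec, for instance, maps bytes 00h-1Fh and 7Fh to control characters rather than to the glyphs the video hardware would draw. The sketch below works around that with a small lookup table; `LOW_GLYPHS` and `screen_glyph` are hypothetical names, and a plain space stands in for the blank glyph at 00h.

```python
# Glyphs drawn by the text-mode font for bytes 00h-1Fh (00h is blank).
LOW_GLYPHS = " ☺☻♥♦♣♠•◘○◙♂♀♪♫☼►◄↕‼¶§▬↨↑↓→←∟↔▲▼"

def screen_glyph(byte):
    """Return the character the video hardware would draw for one byte."""
    if byte < 0x20:
        return LOW_GLYPHS[byte]   # no control-character behavior on screen
    if byte == 0x7F:
        return "⌂"                # DEL is also a visible glyph in video memory
    return bytes([byte]).decode("cp437")

# Byte 0Ah is a line feed in a file, but an ordinary glyph on the screen.
print(screen_glyph(0x0A), screen_glyph(0x41), screen_glyph(0x80))
```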

Attributes

Each attribute byte has four bits that control foreground color, three bits that control background color, and one bit that controls flashing of the foreground text:

Bit Position                  Description
0 (least significant bit)     Foreground blue palette bit.
1                             Foreground green palette bit.
2                             Foreground red palette bit.
3                             Foreground intensity palette bit.
4                             Background blue palette bit.
5                             Background green palette bit.
6                             Background red palette bit.
7 (most significant bit)      0 = foreground text displays normally, 1 = foreground text flashes.

The default system color was 07h, or low-intensity white on a black background.
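A minimal sketch of unpacking an attribute byte according to the bit layout above. The function name and the dictionary keys are chosen for illustration only.

```python
def decode_attribute(attr):
    """Split one attribute byte into its three fields."""
    return {
        "foreground": attr & 0x0F,         # bits 0-3: blue, green, red, intensity
        "background": (attr >> 4) & 0x07,  # bits 4-6: blue, green, red
        "flashing": bool(attr & 0x80),     # bit 7: flash the foreground text
    }

# The default attribute 07h: foreground 7, background 0, not flashing.
print(decode_attribute(0x07))
```

For example, 9Fh decodes to foreground 15 (high-intensity white), background 1 (blue), with flashing enabled.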

The attribute byte affects the character byte that immediately precedes it:

Offset (Bytes)    Description
0                 Character at (0, 0).
1                 Attributes for character at (0, 0).
2                 Character at (1, 0).
3                 Attributes for character at (1, 0).
...               ...
3,996             Character at (78, 24).
3,997             Attributes for character at (78, 24).
3,998             Character at (79, 24).
3,999             Attributes for character at (79, 24).
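Putting the interleaving together, a whole 4,000-byte buffer can be split into rows of (character, attribute) pairs. This is a sketch under the layout described in this section, not code from the game; `split_cells` is a hypothetical helper, and the blank test screen is synthesized rather than read from a real file.

```python
COLUMNS, ROWS = 80, 25

def split_cells(buffer):
    """Return 25 rows, each a list of 80 (character_byte, attribute_byte) pairs."""
    if len(buffer) != COLUMNS * ROWS * 2:
        raise ValueError("expected exactly 4,000 bytes")
    chars = buffer[0::2]   # even offsets: character bytes
    attrs = buffer[1::2]   # odd offsets: attribute bytes
    rows = []
    for y in range(ROWS):
        start = y * COLUMNS
        end = start + COLUMNS
        rows.append(list(zip(chars[start:end], attrs[start:end])))
    return rows

# Synthetic example: a screen full of spaces in the default color 07h.
blank_screen = bytes([0x20, 0x07]) * (COLUMNS * ROWS)
rows = split_cells(blank_screen)
print(len(rows), len(rows[0]), rows[24][79])
```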

Programming Considerations

DOS is unaware of characters drawn directly to video memory, and the cursor position is not updated in any way to reflect the new contents of the screen. When the program exits, DOS assumes the screen is blank and draws the prompt at the current cursor position, which is near the top of the screen. With B800 text already occupying space on the screen, this naive assumption results in the DOS prompt appearing on top of the text, with new and old characters and attributes jumbling together in an unreadable mess.

Typically, any program that displayed B800 text would also emit – usually via printf() – a series of newline characters to move the cursor down far enough to clear the bottom of the displayed text.