From 0 To 1 MB in DOS - by Julio Merino - Blog System:5

From 0 to 1 MB in DOS - by Julio Merino - Blog System/5 25/02/24, 12:36 PM
From 0 to 1 MB in DOS
A tour on how DOS was able to use most of the 1 MB address space of the 8086
JULIO MERINO
17-JAN-2024
14 18 Share
Since the last article on the text-based IDEs of old, I’ve been meaning to write about the
GCC port to DOS, namely DJGPP. As I worked on the draG for that topic, I realized that
there is a ton of ground to cover to set the stage so I took most of the content on memory
management out and wrote this separate post.
This article is a deep dive on how DOS had to pull out tricks to maximize the use of the
very limited 1 MB address space of the 8086. Those tricks could exist because of the
features later introduced by the 80286 and the 80386, but these were just clutches to
paper over the fact that DOS could not leverage the real improvements provided by
protected mode.
https://blogsystem5.substack.com/p/from-0-to-1-mb-in-dos Page 1 of 22
86
C80 64
S28 2C L 78
1 TE
02IN
Intel 8086 chip with the MS-DOS logo overlaid on top. Images from the
Wikipedia.
This detour is long but I hope you’ll enjoy it as much as I enjoyed researching the topic.
I’ll walk you through the changes in the x86 architecture over time, starting with the
8086 and ending in the 80386, and how DOS kept up along the way. I’ll conclude with a
peek into DOS’ own MEM and MemMaker utilities. I must omit details to keep the text
manageable in size though, so please excuse the lack of detail in some areas; just follow
the links to external documentation to learn more.
Articles like this take many hours to write.

Please subscribe to Blog System/5 to show
your support. It’s free. Donations appreciated!
Type your email... Subscribe
Before we begin, I’d like to thank neozeed from Virtually Fun for his in-depth review of this
article. Head to his blog for a lot of interesting and related content!
The 8086 and real mode

Back in the 1980s, the Intel 8086 was the processor for PCs (along with its crippled
sibling 8088 which I’ll ignore throughout this post). The 8086 was a simple chip with
signiXcant limitations, but it was incredibly successful and is the foundation of the x86
architecture that we still use today. Because the 8086 powered PCs, and DOS was the
operating system that IBM chose for the PC, DOS was designed to work for the 8086.
For the purposes of our article, let’s focus on two details: the 8086 CPU had a 20-bit
address bus, which means it could only address 1 MB of memory, and it was a 16-bit
CPU, which means that its internal registers were all 16-bit long. Which begs the
question: if registers could only represent 2^16 = 64K di]erent values, how could code
reference a 20-bit address space?
The answer is segmentation. In the 8086, every instruction that references memory does
so by specifying an address as two separate 16-bit quantities: a segment and an oCset
within the segment. Because segments are 16-bit long, there can be up to 64KB segments
—and if we take the 1 MB total address space and divide it by 64KB, we deduce that
segments are o]set from each other by 16 bytes.
Representation of the way segmentation slices the 8086 1 MB address space

into 64KB overlapping segments offset by 16 bytes from each other.
There are two things of note in the diagram. First, segments overlap, meaning that a
single memory location can be referenced by many di]erent segment/o]set
combinations. And, second, all segments aGer 0xF000 reference memory positions
above the 1 MB address space that don’t exist, which the 8086 chose to wrap around by
ignoring the 21st bit of an address. AGer all, there is no 21st line in the address bus.
When accessing a piece of memory with an address of the form segment:offset, say
81DA:72C3, the 8086 multiplies 0x81DA by 16 (a simple 4-bit shiG to the leG) and adds
the 0x72C3 o]set to the result to obtain address 0x89063. From there on, the 8086
reads from or writes to that memory location and calls it a day. There is no Memory
Management Unit (MMU): every memory access can reference any address, and all
accesses are legal (including those that wrap around!) which means null or dangling
pointer dereferences don’t cause any sort of crash.
This type of MMU-less segmented addressing is known as real mode, but this name
didn’t appear until the 80286 had to give it a name to distinguish it from protected mode.
But before introducing the latter, let’s talk about memory maps for a second.
Conventional and upper memory

Just because the 8086 processor can reference 1 MB of memory does not mean that 8086
machines came equipped with 1 full MB of RAM. Machines typically included less than
that for cost reasons, so it’s critical to understand that the address space—that is, the set
of addresses that can be referenced—is di]erent from the amount of memory installed.
No surprise there: the same is true today of modern 64-bit CPUs.
This fact was advantageous because certain system devices prefer to expose themselves
as part of the address space. For example: the BIOS ROM is accessible via a range of
addresses in upper memory. Or another example: video cards tend to expose their
framebu]er as a memory-mapped device so that applications can directly write to well-
known addresses to manipulate video memory—bypassing the separate I/O bus and
instruction set.
The question is: where do such devices live within the address space? The choice is
pretty much arbitrary, so what IBM did was split the address space of the original PC in
two parts. The Xrst 640 KB, known as conventional memory, were mapped to available
RAM and were readily usable by applications. The upper 384 KB, known as upper memory
or the Upper Memory Area (UMA), were reserved for memory-mapped devices. This
assignment of meaning to portions of the address space for speciXc purposes is what’s
called a memory map, and this was the memory map of the original PC.
Representation of the 8086 PC memory map. Solid blocks correspond to

mapped memory and dashed parts correspond to unmapped memory.
Note how conventional memory may not be fully usable: as shocking as it may sound
today, PCs could have even less than 640 KB of installed RAM. More importantly for our
purposes, the upper memory is sparse: certain address ranges are assigned to devices, but
there may be gaps between devices. What happens when reading from or writing to
unmapped addresses is hard to tell as this depends on the chipset—not the CPU—but
remember that all accesses had to be valid in some form.
And by the way, there is one teeny tiny exception to the “all Xrst 640 KB are free to use
to applications”: the very Xrst 1 KB of RAM is reserved for the Interrupt Vector Table,
which is what the 8086 uses to look for interrupt handlers when it receives an interrupt,
and a few more bytes aGer that are used by the BIOS to map volatile data. We can ignore
these.
Expanded Memory Specification (EMS)

As you can imagine, 640 KB of RAM were soon not enough for many programs.
Applications needed access to more memory, but the limitations of the 8086 20-bit
address bus made it diicult. Yet there was a need for an immediate solution. The
Expanded Memory SpeciXcation (EMS) was one of those Xrst solutions and was
originally designed by Lotus in order to support their star application 1-2-3.
The idea is simple. Remember how upper memory is sparse? What if we could map a
window in that address space to a di]erent portion of memory? And what if we could
control, via soGware, which portion of memory was exposed through such window at
any given time?
Representation of the 8086 PC memory map and an EMS window in upper

memory mapped to a portion of memory provided by an external ISA card.
This is precisely what EMS did. EMS brought the idea of having a soGware API that
allowed applications to select views into “other” memory by mapping a 64 KB window in
upper memory to another memory chip. Originally, this other memory was supplied by
expansion ISA cards like the one below, which means that EMS required hardware
support and hardware-speciXc drivers to perform the window remappings. This also
means, by the way, that this memory could be much slower than standard RAM.
An Emulex Persyst ISA card providing 4 MB of additional memory. It's

notable how huge this was. Photo taken from the Wikipedia; see original.
As you can imagine, this trick o]ered the theoretical ability to access any amount of
extra RAM on a machine… at the expense of having to structure your application so that
it explicitly (re)conXgured the window. Implementing such applications is diicult and
is also insuicient when applications need a larger address space to function
comfortably. EMS was a useful kludge, but still a kludge.
The 80286 and protected mode

As the needs of programs and operating systems grew, processors themselves needed to
grow more advanced features as well. In particular, they needed to gain the ability to
access much more than just 1 MB of memory, and they needed to o]er memory
protection features to isolate programs from each other.
The 80286, still a 16-bit CPU, grew the address space to 24 bits to support up to 16 MB of
RAM. It introduced a MMU to o]er memory isolation. But there was a problem here:
the 8086 memory addressing mode of 16-bit segment:offset pairs did not permit
addressing memory beyond the Xrst 1 MB. Thus the 80286 had to Xnd a di]erent way to
reference the RAM beyond 1 MB, also known as extended memory, and it did so by
introducing a completely di]erent operation mode known as protected mode.
Once in protected mode, all memory access rules change. For one, the MMU takes
charge of all memory accesses, ensuring that they are valid according to predeXned
protections. For another, the segment:offset addresses change their semantics. The
segment portion of an address stops being a scaled down memory address, and instead
becomes an index into a table (the GDT or LDT) of segment descriptors. Each segment
descriptor describes the base address of a segment as a 32-bit quantity, the size of the
segment as a 20-bit quantity, the access protections of the segment, and a bunch of nags.
Diagram representing the operations that the 80286 does, in hardware, to

resolve a segment:address memory reference.
Under this model, instead of having an address like 81DA:72C3, we might see
13:72C3. To access this address, the MMU looks up entry 13 in the GDT, Xnds a
descriptor that says that the segment starts at address 0x81DA and that it is 0x7200
bytes long, and concludes that the 13:72C3 reference is out of bounds, raising a
General Protection Fault hardware exception (your dreaded “Segmentation fault” error
these days).
But this access model is really di]erent from what DOS applications were used to in real
mode, and DOS itself was only designed to run in real mode. So DOS and all of its
ecosystem were stuck in the real mode world, limited to addressing just 1 MB of memory
without the ability to access extended memory… or were they?
The High Memory Area (HMA)

Avid readers might have noticed in a previous picture that, in real mode, there are a few
segments that could, in theory, allow access to addresses above the 1 MB mark. Take
FFFF:0010, which is 0xFFFF * 0x10 + 0x10 = 0x100000 = 2^20 or 1 followed
by 20 zeroes in binary form. This is the Xrst address into extended memory. The 8086
was able to produce such addresses, but they would wrap around because the 8086 had
no way to issue the 21st bit to the memory bus.
For compatibility reasons, the 80286 had to truncate these addresses in real mode, and it
did so by wiring the 21st line of the address bus (the A20) to zero. But what if the
addresses didn’t wrap around? What if we told the 80286 to not truncate them? This can,
in fact, be done by enabling the A20 gate—a task achieved via the keyboard controller,
mind you. Enabling the A20 is a prerequisite for entering protected mode, but if done in
real mode, it becomes possible to access an extra 64 KB of memory. This extra memory is
known as the High Memory Area (HMA).
Representation of the way segmentation slices the 80286 real mode 1 MB

address space into 64KB overlapping segments offset by 16 bytes from each
other, and how the last few segments can be made to spill into extended
memory.
The HMA isn’t big by any means, but conventional memory was so limited that any
tricks to gain extra memory were welcome and DOS did precisely that. By loading the
HIMEM.SYS driver and by specifying DOS=HIGH in the CONFIG.SYS Xle, DOS would
relocate parts of itself into the HMA to free up to 64 KB of conventional memory.
Unfortunately, 64 KB of extra memory really aren’t that much. This is a nice trick, but
applications really needed access to more memory.
eXtended Memory Specification (XMS)

Another technique to access extended memory (the memory visible beyond the Xrst 1
MB) came in the form of the eXtended Memory SpeciXcation (XMS).
The idea of XMS is to provide an API that real mode applications can use to allocate
chunks of extended memory and to copy data from/to those chunks via explicit calls.
This is di]erent from EMS because we aren’t talking about a Xxed window of the
address space referencing some external memory: we are talking about an API that is
able to allocate arbitrary chunks of memory and return references to them, much like
malloc and free do. Applications then use extra APIs to copy memory from/to those
extended memory chunks into conventional memory.
Representation of the 80286 memory map with a block of extended memory

copied into conventional memory by means of XMS.
To implement the XMS API, a driver such as HIMEM.SYS uses protected mode to
manipulate the extended memory blocks, but the real mode applications never have to.
Because XMS requires protected mode, this feature is only available in the 80286 and
above processors, whereas EMS could be used by the 8086 with the right hardware
assistance.
Much like EMS, XMS provided the theoretical ability to access an unlimited amount of
extended memory from real mode at the expense of having to manually transfer small
pieces of it into the limited 1 MB address space. This is not a trivial diiculty and is why
even modern architectures like the Cell processor and its SPEs with a separate and
limited address space didn’t succeed.
So far so good, but… we can still squeeze some extra memory out of the real mode
segmented address space. In particular, we still have some gaps in upper memory that
we haven’t put to use.
The 80386 and VM86 mode

To make further improvements possible, we need the additional processor features that
came with the 80386. Compared to the 80286, the 80386 was a 32-bit processor capable of
addressing up to 4 GB of RAM. More interesting for our discussion, however, were the
addition of pagination—which is the foundation of any modern operating system—and a
new execution mode known as Virtual 8086 or VM86 for short.
VM86 was originally envisioned to Xx a major deXciency in the 80286: that is, the ability
to run multiple concurrent real mode applications from within a protected mode
operating system because that’s what Windows and OS/2 wanted to provide. In today’s
terms, VM86 acts as virtual machine hypervisor, providing the processor with a
mechanism to enter a mode that behaves like real mode but that is backed by the
protected mode MMU. All memory accesses are subject to translation via pagination
and all privileged operations trap into protected mode, meaning that a hypervisor
process can place VM86 tasks anywhere in the 4 GB memory address space and can
context-switch between them.
But what if our hypervisor only spawned a single VM86 task to run DOS in? What if,
then, the hypervisor leveraged the MMU to relocate portions of the real mode address
space into portions of extended memory? That way, we could emulate EMS without
dedicated hardware, providing applications a trivial way to peek into extended memory
via 64 KB windows, or we could play other tricks… like UMBs.
Upper Memory Blocks (UMBs)
Remember the sparse upper memory area? Those 384 KB of memory wired to hardware
devices but with gaps in them? What if we could take those gaps in upper memory and
map them to some extended memory so that they were directly addressable from real
mode? This is precisely what Upper Memory Blocks (UMBs) are: portions of the upper
memory area remapped to extended memory by means of VM86, transparently o]ering
access to more memory from real mode.
As an example, think about the video memory mappings in the original PC speciXcation.
The memory map reserved two chunks of the address space for video: one for
monochrome displays and one for color displays. But only one of them can be in use at
any given time. So whichever video mode is selected leaves the address space of the
other mode unused, and thus such address space can be leveraged by the system as an
UMB into which to load drivers or place user data.
Representation of the 80386 VM86 memory map with two UMBs in upper
memory mapped to extended memory by means of the MMU and its
pagination features.
In DOS, the EMM386.EXE driver is in charge of providing UMBs. In principle, the

chipset could also implement them in older processors, but the 80386 made this all
possible in soGware by entering VM86 and then leveraging pagination to map pages of
extended memory in the upper memory area.
When UMBs are available, DOS o]ered ways to move load drivers and TSR programs
into UMBs by specifying DOS=UMB in the CONFIG.SYS Xle and then using the
DEVICEHIGH and LOADHIGH commands.
Putting it all together

In summary, we have seen four di]erent techniques that evolved over time to squeeze
the most memory out of the incredibly limited 1 MB address space of the 8086. These
limitations haunted the PC ecosystem until Windows took over and it can be argued that
Windows “won” because it made it much easier to take advantage of the larger address
space of the 80386. Until that happened, however, users had to live with the EMS, XMS,
HMA, and UMB mess.
And a mess it was, and it was so pervasive that it leaked to users. Let’s take a look at the
output of the MEM command, which was a very common utility to run when trying to
maximize free conventional memory. In one of its simplest forms, MEM /C, the
command shows a summary of the memory usage per program and whether they are
loaded in conventional or upper memory:
First page of the output of the MEM /C /P command.
MEM /D, which I did not know about back in the day, is more interesting. I have to show
you its output as three separate images due to console size limitations:
First page of the output of the MEM /D /P command.
Second page of the output of the MEM /D /P command.
Third page of the output of the MEM /D /P command.
MEM /D shows us the memory map of the machine along with details about the location
of each program. We can also see mentions of the conventional vs. upper address space,
whether the HMA is in use, and consumption of XMS memory.
But in all of these outputs, we see that upper memory is not in use at all. Why is that?
Well, because MS-DOS out of the box doesn’t attempt to load drivers into it. For that, we
have to manually modify CONFIG.SYS and AUTOEXEC.BAT with special commands and
reorder entries to maximize the usage of the upper memory and the HMA. A black art.
Which brings me to MemMaker, a tool that debuted with MS-DOS 6. This tool
attempted to automatically reconXgure all drivers and TSRs loaded during boot so that
the largest/majority of them Xt in upper memory, leaving as much conventional memory
free for application usage.
Welcome page to the MemMaker utility.
As automated as it was, however, I’d say that MemMaker was an experts-only tool. Take
a look at its advanced conXguration options page, which I had to Xddle with to make the
tool work in a KVM virtual machine running MS-DOS 6.22. You really need to
understand the intricate details of real mode memory management to know what each of
these options is about.
Advanced options in the MemMaker utility.
But if you made MemMaker run successfully, it would then greet you with a pretty cool
page aGer a reboot. This page showed a breakdown of the changes made by MemMaker
and how it was able to reclaim conventional memory.
Final report of the changes made by the MemMaker utility.
AGer that, we can look at the changes that MemMaker applied to the CONFIG.SYS and
AUTOEXEC.BAT Xles. I never truly understood what all of these magic numbers meant,
but now that I went through the hassle of researching and writing this article, I Xnally
do. Just 30 years aGer I regularly used MemMaker to be able to play games in my
multimedia 80386.
Content of the CONFIG.SYS and AUTOEXEC.BAT files after MemMaker is

done making changes.
You can much read more in MicrosoG’s own now-archived KB-Q9555 article or the DOS
Beyond 640KB book.
And that’s it for today folks. I intentionally did not touch on DPMI—the technology that
truly allowed DOS applications to break free from the 1 MB memory limitation—
because I’m saving that for the next article. So, make sure to come back for more!
Don’t miss the next post on DJGPP, which will

talk about DPMI and various other tricks to get
Unix programs to integrate with DOS.
Subscribe to Blog System/5; it’s free!
Type your email... Subscribe
14 Likes · 2 Restacks
18 Comments
Write a comment...
Ben Myers Jan 22 Liked by Julio Merino
To redirect the output generated by MEM /D into a single contiguous text file, simply run:
MEM /D >MEMD.TXT
LIKE (1) REPLY SHARE
2 replies by Julio Merino and others
Steve Jan 18
Wow.. Lots of stuff I haven't thought about in years, and a reminder of why I stopped
programming in Assembly when protected mode became common!! I love that you
reference using the video memory space for programs. It reminds me of a trick I used
once long ago when using an ADA compiler that was only capable of making .COM files
(not .EXE) which were limited to 64k.. We ran out of room in our 64k memory space, and
in trying anything and everything to optimize, we realized that while our monochrome
video card had 2k of RAM onboard, it was only 80x25 which was 2000 bytes, leaving 48
bytes of precious RAM available and unused. We decided to start using those bytes for
variable storage. It worked great until one day our boss decided to upgrade our demo
system to a new VGA card and monitor which he just bought.. Removing the old
monochrome adapter the night before an important customer demo and installing the
VGA, our program began to fail in interesting ways we had never seen before. It took us
some debugging time but finally realized that the 48 bytes we had long since forgotten we
were using for variables was now gone. Reinstalling the old monochrome card just for it's
2k of memory had us working again though, and were still able to use the new VGA card
for primary display.. Gotta love the old days before page faults!!
LIKE (2) REPLY SHARE
16 more comments...
© 2024 Julio Merino ∙ Privacy ∙ Terms ∙ Collection notice

Substack is the home for great writing

From 0 To 1 MB in DOS - by Julio Merino - Blog System:5

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

From 0 To 1 MB in DOS - by Julio Merino - Blog System:5

Uploaded by

Copyright:

Available Formats

From 0 to 1 MB in DOS - by Julio Merino - Blog System/5 25/02/24, 12:36 PM

Articles like this take many hours to write.

Type your email... Subscribe

The 8086 and real mode

Representation of the way segmentation slices the 8086 1 MB address space

Conventional and upper memory

Representation of the 8086 PC memory map. Solid blocks correspond to

Expanded Memory Specification (EMS)

Representation of the 8086 PC memory map and an EMS window in upper

An Emulex Persyst ISA card providing 4 MB of additional memory. It's

The 80286 and protected mode

Diagram representing the operations that the 80286 does, in hardware, to

The High Memory Area (HMA)

Representation of the way segmentation slices the 80286 real mode 1 MB

applications really needed access to more memory.

eXtended Memory Specification (XMS)

Representation of the 80286 memory map with a block of extended memory

The 80386 and VM86 mode

Upper Memory Blocks (UMBs)

In DOS, the EMM386.EXE driver is in charge of providing UMBs. In principle, the

Putting it all together

First page of the output of the MEM /C /P command.

First page of the output of the MEM /D /P command.

Second page of the output of the MEM /D /P command.

Third page of the output of the MEM /D /P command.

Welcome page to the MemMaker utility.

Advanced options in the MemMaker utility.

Final report of the changes made by the MemMaker utility.

Content of the CONFIG.SYS and AUTOEXEC.BAT files after MemMaker is

Don’t miss the next post on DJGPP, which will

Type your email... Subscribe

Ben Myers Jan 22 Liked by Julio Merino

2 replies by Julio Merino and others

© 2024 Julio Merino ∙ Privacy ∙ Terms ∙ Collection notice

You might also like