This article series is intended as only an introduction to the use of
advanced operating system features. For complete coverage of the
topics, please consult appropriate reference material. All code
examples are written in HLA and use its high-level syntax, which I
find more readable for instructional material.

Every application manipulates data. Most of the time this data is
stored in (or retrieved from) a disk file in one way or another. One
typically writes code so that it doesn't matter exactly where this data
is stored. It can be on a hard drive, a CD, a DVD, a USB device, a
network drive, or a system device like STDIN and STDOUT (which can
represent a keyboard or screen, or be redirected to or from a disk
file). Your application's code is only interested in processing the
data; it lets the operating system deal with the low-level details of
how it is stored.

The operating system manages its own internal buffers during routine
file I/O (and hard drives have internal cache memory as well), but
applications often need additional buffers of their own to manipulate
the data. These additional buffers, plus the other overhead of
traditional file I/O functions, can tax both an application's resources
(memory requirements) and its performance. One way to improve
performance is to read and write data in large chunks: for instance,
write entire strings to disk instead of sending them character by
character. It also helps to reduce the number of system calls, because
every call incurs a performance hit from the transition from user mode
to kernel mode, the processing of OS code, and the transition from
kernel mode back to user mode before returning to your code. Another
method is to use memory-mapped files.

A memory-mapped file allows you to treat the file as if it were just an
ordinary section of memory. This memory region is reserved from the
process's own address space. The difference between a memory-mapped
file and a region allocated from the virtual memory pool is that the
memory-mapped region is backed by physical storage in a named file
instead of storage from the system's paging file. The benefits are
many:
- conserves paging file space
- avoids traditional file I/O operations
- avoids the need for buffering
- allows the sharing of data between processes

The first step to using memory-mapped files is usually to create or
open a file kernel object. This establishes the file that will be the
physical storage for the file-mapping object.

    w.CreateFile( FileName, DesiredAccess, ShareMode, SecurityAttributes, CreationDisposition, FlagsAndAttributes, TemplateFile );

The DesiredAccess parameter must specify either read-only or read-write
access.

Next, you create a file-mapping object and tell it how much physical
storage is needed.

    w.CreateFileMapping( File, SecurityAttributes, Protect, MaximumSizeHigh, MaximumSizeLow, MapName );

MaximumSizeHigh and MaximumSizeLow are two 32-bit parameters that
together specify a 64-bit value. MaximumSizeHigh is always zero for
files less than 4 GB in size. If you are opening an existing file and
wish to use its entire current size, pass 0 for both of these
parameters.

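To make the high/low split concrete, here is a minimal sketch that takes a 64-bit size and separates it into the two dwords that w.CreateFileMapping expects. The variable name MapSize and the 5 GB value are hypothetical, chosen only so that the upper dword is nonzero. No mapping is actually created.

```hla
program splitsize;
#include( "stdlib.hhf" )

static
    // Hypothetical mapping size of 5 GB ($1_4000_0000 bytes),
    // chosen only so that the upper dword is not zero.
    MapSize :qword := $1_4000_0000;

begin splitsize;

    // x86 is little-endian, so the lower dword is stored first.
    mov( (type dword MapSize), eax );       // would be MaximumSizeLow
    mov( (type dword MapSize[4]), edx );    // would be MaximumSizeHigh

    // 32-bit registers are printed in hexadecimal by stdout.put.
    stdout.put( "MaximumSizeHigh = ", edx, nl );
    stdout.put( "MaximumSizeLow  = ", eax, nl );

end splitsize;
```

For any size below 4 GB the upper dword comes out as zero, which matches the rule stated above.
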
Now you will want to reserve a region of the process's address space
that will mirror the data stored in the file. This is called a
"MapView" and you can have as many as you wish. A MapView can reflect
any section of a file, but the file offset at which it starts MUST be a
whole multiple of the system's allocation granularity (typically 64 KB,
but you can call w.GetSystemInfo to be sure).

    w.MapViewOfFile( FileMappingObject, DesiredAccess, FileOffsetHigh, FileOffsetLow, NumberOfBytesToMap );

Again, FileOffsetHigh and FileOffsetLow are two 32-bit parameters that
together specify a 64-bit value. They specify which byte of the data
file is mapped to the first byte of the region allocated by this call.
The function returns the starting address of that region. If you
specify 0 for the last parameter, NumberOfBytesToMap, the remainder of
the file, starting from the given offset, is mapped to the view.

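Since the offset has to be a multiple of the allocation granularity, a position in the middle of a file is usually reached by rounding it down and then indexing into the view. The sketch below shows one way to do that arithmetic. The position 200,000 (Wanted) is a hypothetical value, and the field name dwAllocationGranularity is assumed to match the w.SYSTEM_INFO record declared in w.hhf, so check it against your copy of the header.

```hla
program alignoffset;
#include( "stdlib.hhf" )
#include( "w.hhf" )

static
    SysInfo :w.SYSTEM_INFO;

    // Hypothetical byte position in the file that we want to reach.
    Wanted  :dword := 200_000;

begin alignoffset;

    // Ask the system for its allocation granularity.
    // (Field name assumed from the Win32 SYSTEM_INFO structure.)
    w.GetSystemInfo( SysInfo );
    mov( SysInfo.dwAllocationGranularity, ecx );

    // edx:eax / ecx  ->  eax = quotient, edx = remainder.
    mov( Wanted, eax );
    xor( edx, edx );
    div( ecx );

    // Subtracting the remainder rounds Wanted down to a multiple
    // of the granularity. This is the value for FileOffsetLow.
    mov( Wanted, eax );
    sub( edx, eax );

    stdout.put( "FileOffsetLow     = ", (type uns32 eax), nl );
    stdout.put( "Index inside view = ", (type uns32 edx), nl );

end alignoffset;
```

With a granularity of 64 KB (65,536 bytes), position 200,000 rounds down to offset 196,608, and the wanted byte sits at index 3,392 within the view. NumberOfBytesToMap would then have to cover at least that index plus the number of bytes you intend to touch.
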
Because it helps to see things in action, compile the following program
and then step through the code using a debugger while you have a file
manager window open (optionally, you can intermittently execute "dir
eraseme.xxx" at the command prompt).

program mmap;
#include( "w.hhf" )

static
    hFile        :dword;
    hFileMapping :dword;
    BaseAddr     :dword;

begin mmap;

    // At this point, "EraseMe.xxx" does not exist.

    w.CreateFile( "EraseMe.xxx",
                  w.GENERIC_READ | w.GENERIC_WRITE,
                  w.FILE_SHARE_READ | w.FILE_SHARE_WRITE,
                  null,
                  w.CREATE_ALWAYS,
                  w.FILE_ATTRIBUTE_NORMAL,
                  null );
    mov( eax, hFile );

    // At this point, "EraseMe.xxx" exists with
    // a file size of 0 bytes.

    w.CreateFileMapping( hFile,
                         null,
                         w.PAGE_READWRITE,
                         0,
                         100,
                         null );
    mov( eax, hFileMapping );

    // At this point, "EraseMe.xxx" has a size
    // of 100 bytes.

    w.MapViewOfFile( hFileMapping,
                     w.FILE_MAP_WRITE,
                     0,
                     0,
                     100 );
    mov( eax, BaseAddr );

    // At this point, a region of the process's
    // address space is reserved and the file is
    // committed as the physical storage mapped to
    // this region. BaseAddr is the start of this region.

    // Write the byte values 0..99 into the view.
    // eax still holds BaseAddr from the call above.
    for( mov( 0, ecx ); ecx < 100; inc( ecx ) ) do
        mov( cl, (type byte [eax+ecx]) );
    endfor;

    // At this point, data is written to the range from
    // BaseAddr to BaseAddr + 99.

    w.UnmapViewOfFile( BaseAddr );

    // At this point, the view is released and the modified
    // pages are written back to the file. The system may do
    // this lazily; call w.FlushViewOfFile before unmapping
    // if you need to force the write.

    w.CloseHandle( hFileMapping );
    w.CloseHandle( hFile );

end mmap;
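
The program above does not check any return values, which keeps it short but hides failures. As a companion, here is a hedged sketch of a read-only pass over the same file that checks each step and uses the "pass 0" conventions described earlier (0/0 for the mapping size, 0 for NumberOfBytesToMap). It assumes "EraseMe.xxx" already exists and holds at least ten bytes. The program name mmapread and the choice of printing ten bytes are purely illustrative.

```hla
program mmapread;
#include( "stdlib.hhf" )
#include( "w.hhf" )

static
    hFile        :dword;
    hFileMapping :dword;
    BaseAddr     :dword;

begin mmapread;

    // Open the existing file for reading only.
    w.CreateFile( "EraseMe.xxx",
                  w.GENERIC_READ,
                  w.FILE_SHARE_READ,
                  null,
                  w.OPEN_EXISTING,
                  w.FILE_ATTRIBUTE_NORMAL,
                  null );
    mov( eax, hFile );

    // CreateFile signals failure with INVALID_HANDLE_VALUE.
    if( eax <> w.INVALID_HANDLE_VALUE ) then

        // 0, 0 = use the file's entire current size.
        w.CreateFileMapping( hFile, null, w.PAGE_READONLY, 0, 0, null );
        mov( eax, hFileMapping );

        // CreateFileMapping signals failure with 0 (NULL).
        if( eax <> 0 ) then

            // Offset 0, length 0 = map the whole file.
            w.MapViewOfFile( hFileMapping, w.FILE_MAP_READ, 0, 0, 0 );
            mov( eax, BaseAddr );

            // MapViewOfFile also signals failure with 0 (NULL).
            if( eax <> 0 ) then

                // Print the first ten bytes as unsigned values.
                for( mov( 0, ecx ); ecx < 10; inc( ecx ) ) do
                    movzx( (type byte [eax+ecx]), edx );
                    stdout.put( (type uns32 edx), ' ' );
                endfor;
                stdout.newln();

                w.UnmapViewOfFile( BaseAddr );

            else
                stdout.put( "MapViewOfFile failed", nl );
            endif;

            w.CloseHandle( hFileMapping );

        else
            stdout.put( "CreateFileMapping failed", nl );
        endif;

        w.CloseHandle( hFile );

    else
        stdout.put( "CreateFile failed", nl );
    endif;

end mmapread;
```

If the file was produced by the program above, the bytes printed should be the values 0 through 9. The nested if blocks are one way to make sure that every handle that was successfully opened is also closed. Note that no read call or intermediate buffer appears anywhere; the data is accessed directly through the returned address.
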