I had the idea for this one day after stumbling upon a "gem" that
somebody had written to play life. It was small and fast and reminded
me of years ago when I had written many versions of this for the
BBC Master 128 (my love lost). Since I had never written a version
for the PC I thought that I would, and ended up spending some hours
trimming off the bytes until it is now :- 156 bytes long. I must admit
if it was not for the program that I found, this program would have been
MUCH slower than it is. After I had written the code I tested it against
the program that I had found and to my perplexity it was a great deal
slower. After some hours of frustration I found the reason:- my program
was accessing the video memory to do the bulk of its work. This must have
brought about a factor of 12 decrease in speed!!
Life is a classic game of cellular automata by John Conway. It is
played on an nxn grid of squares. Each square may be occuppied by a
cell or empty. Each 'go' of the game the player calculates the next
generation of a colony of cells by applying three simple rules:-
(i) a cell with less than 2 or more than 3 neighbours dies
(ii) a cell with 2 or 3 neighbours survives
(iii) a cell is born in a square with exactly 3 neighbours
A neighbouring square is one diagonally adjacent as well as the
normal horizontal/vertical so each square has 8 neighbouring squares.
Overview of the code
First, note that if we define
S:=state of square in this generation (0=empty, 1=occupied)
N:=number of neighbours
S':=state of square in the next generation
then according to the rules
S'={0, if N<2 or N>3
{1, if (N=2 or N=3) and S=1
{1, if N=3
so S'=1 iff (N=2 and S=1) or N=3
this can be simplified using bitwise-OR to the dramatically simple:
S'= ( N|S=3 )
note: iff means "if and only if"
"A iff B" means that A => B and B => A
The code uses one big array with one byte for each square that
starts just after the program end. To save space it just assumes that it
can use this memory since this is generally okay. However this is
very bad practice really and it should use AH=04Ah/int 021h to adjust
the memory size and abort if not successful.
The big array actually serves the purpose of 2 arrays; bit0 of
a byte indicates the state of the square in the current generation. bit4
of each byte indicates the state of the square in the next generation.
After initialisation, generation 0 is calculated by filling about
1/4 of the array with 1's.
Now we do a loop to get the next generation. The screen is 0140h
bytes across and 0C8h bytes down. Therefore:-
-0141h -0140h -013Fh
-0001h . +0001h
+013Fh +0140h +0141h
If DI is the offset of the array which we are calculating for,
note that the neighbours can be summed as follows:-
MOV AX,[DI-0141h]
ADD AL,[DI-013Fh]
ADD AX,[DI+013Fh]
ADD AL,[DI+0141h]
ADD AL,[DI-1]
ADD AL,[DI+1]
ADD AL,AH
Note that if bit4 of any of the neighbours was set then we would
still have the correct total in the least significant 4 bits of AL.
So from here the new cell state can be calculated simply:-
OR AL,[DI]
AND AL,0Fh
CMP AL,3
And if ZF=1 now we have a set cell.
JNZ ko
OR BYTE PTR [DI],010h
ko:
When the next generation has been calculated we have done most of
the work. The only thing is that if we want to iterate we need all
of those bit4 's moved to bit0, also we want to display the next
generation, this can be done easily at the same time.
Note that due to the structure of the code generation#0 is never
displayed. Also we always have blue cells. Despite this it is quite
an entertaining little program to watch....
The source here is in MASM format but should be trivial to convert
to run on any assembler. It is assembled into a .COM file which means
you should use the /T option on the linker (T=tiny).
===========START OF CODE===================================================
OPTION SEGMENT:USE16
.386
cseg SEGMENT BYTE
ASSUME NOTHING
ORG 0100h
kode PROC NEAR
;
;mode 013h=320x200x256 (0140hx0C8h) and be kind with the stack
;
MOV SP,0100h
MOV AX,013h
INT 010h
;
;use current time as random number seed
;in BP,DX which is used later
;
MOV AH,02Ch
INT 021h
MOV BP,CX
;
;get seg address of 1st seg after code for array store start
;for now ES points there and DS=screen
;
MOV AX,DS
ADD AX,01Ah ;(OFFSET endofprog+0Fh>>4)=(1A)
MOV ES,AX
MOV AX,0A000h
MOV DS,AX
;
;CREATE GENERATION#0
; this is done by filling approx 1/4 of the cells in the array
; 'randomly', while taking care not to fill any edge cells
;
;
;blank the array
; this is done to ensure the edge cells are clear
;
XOR DI,DI
MOV CX,0FA00h
REP STOSB
;
;fill the array
; two nested loops, CL counts the rows, SI counts the columns
; this is so that after each row DI can be bumped past the edge
;
MOV CL,0C6h
MOV DI,0141h ;array offset we are addressing
;
;BX is 0141h from now until exit, it is used as a constant later
;
MOV BX,DI
lopr0: MOV SI,-013Eh
;
;iterate random number seed in BP,DX
;
lopr: LEA AX,[BP+DI]
ROR BP,3
XOR BP,DX
SUB DX,AX
;
;set cell with probability 1/4
;
CMP AL,0C0h
SBB AL,AL
INC AX
STOSB
;
;
INC SI
JNZ lopr
SCASW ;DI+=2, skipping edge
LOOP lopr0
;
;now we set DS=array, ES=screen. this doesn't change until exit
;
PUSH ES
PUSH DS
POP ES
POP DS ;DS=vseg,ES=0A000h throughout
;
;'mlop' is the main loop, outputting generations until the user terminates
;
mlop:
;
;CREATE NEXT GENERATION
;
MOV DI,BX ;DI=0141h
;
;'lopy' is the loop for rows, a count is not needed because we can get
;the stop point from testing the array offset DI
;
lopy: MOV SI,013Eh
;
;'lopx' is the loop for columns, SI holds the count
;
;
;get the total number of neighbours into the least significant 4 bits of AL
;
lopx: MOV AX,[DI-0141h]
ADD AL,[DI-013Fh]
ADD AX,[DI+BX-2]
ADD AL,[DI+BX]
ADD AL,[DI-1]
ADD AL,[DI+1]
ADD AL,AH
;
;calculate new cell state
;
OR AL,[DI]
AND AL,0Fh
CMP AL,3
JNZ SHORT ko
OR BYTE PTR [DI],010h
ko: INC DI
DEC SI
JNZ lopx
;
;(each row we miss 2 edge cells)
;
SCASW
CMP DI,0FA00h-013Fh
JC lopy
;
;FIXUP ARRAY AND DISPLAY
; bit4 is copied to bit0 in each byte. all other bits then cleared so
; cells appear as blue pixels, also the iteration loop above assumes
; that bit4 is clear on entry (it only sets it)
;
MOV CX,03E80h
XOR DI,DI
lopc: LODSD
SHR EAX,4
AND EAX,01010101h
MOV [SI-4],EAX
STOSD
LOOP lopc
;
;USER KEYPRESS?
;
MOV AH,0Bh
INT 021h
ADD AL,3
;
;no, back for next generation
;
JP mlop
;
;yes, AL=2 now so make AX=2 to go into text mode
;
CBW
INT 010h
;
;back to DOS
;
MOV AH,04Ch
INT 021h
kode ENDP
endof EQU $
cseg ENDS
END FAR PTR kode
===========END OF CODE=====================================================
While the code is optimised for size and for speed you may find that
it runs too quickly. This can be easily remidied by the addition of a wait
for vertical synchronisation loop (or vert sync as we techies call it).
Just add the following after the generation calculating code (that
is after the instruction 'JC lopy'):-
MOV DX,03DAh
lopv0: IN AL,DX
AND AL,8
JNZ lopv0
lopv1: IN AL,DX
AND AL,8
JZ lopv1
Also if you add this the program size has changed. 'endofprog' is now
01ABh, so the number of segments to add to DS to get the start of free space
is now 01Bh. You must change the instruction at the beginning of the code:-
MOV AX,DS
** ADD AX,01Bh ;(OFFSET endofprog+0Fh>>4)=(1B) **
MOV ES,AX
One final note: I use SCASW in this code to increment DI by two.
This is a well known space saving trick. However you must be wary since
it does not do just that; it reads the memory at ES:[DI]. Generally this
is fine but if DI=0FFFFh we will get a general protection fault.
|