Crash course to Amiga assembly programming
The 30th anniversary of Amiga inspired me to dig into Amiga programming. Back in Amiga’s golden era (late 80’s and early 90’s) I never had the chance to try this out since despite my relentless whining my parents wouldn’t get me one. Luckily later when I was studying at the uni, I managed to bargain one fine Amiga 500 specimen from the flea market at an affordable price of 20 euros.
Although Amiga as such is not that useful a platform to know these days, learning how to write programs for it can be very educational. Amiga as an environment is much simpler than (for instance) modern PCs. This makes learning low-level programming on it faster than on more complex environments. Although the hardware architecture is quite simple, it has some computer system design features that are still in use in modern environments as well such as DMA and interrupts. On top of being plain fun, writing assembly on Amiga teaches programming concepts that are usually hidden by higher-level languages and modern operating systems.
I’ve written this blog post together with Harri Salokorpi. We’ll walk you through an example that creates graphics on the display with a simple animation. We both hope this blog post provides a quick start to those who want to try out programming on this legendary device. However, we’re mostly going to use an emulator as a development environment, so the real device is not mandatory.
Let’s get started!
We are going to use the FS-UAE Amiga emulator to run our compiled Amiga software. On top of installing the emulator you're also going to need at least one Kickstart ROM image and a Workbench floppy disk image. FS-UAE can emulate multiple Amiga models, but for this blog post we are using an Amiga 500 with the Kickstart 1.3 ROM image and Workbench 1.3. You can get plenty of ROM images and Workbench disk images from the Amiga Forever bundle. There's for instance the Amiga Forever Plus Edition, which provides the required ROM images, Workbench disk images and some additional goodies like games and demos.
In order to get the emulator to boot up so that we can effectively run and debug our software, we need to configure it properly. FS-UAE takes a configuration file (an ini file) as a command line parameter like this: fs-uae configuration.ini. Here is an example configuration file:
    [config]
    floppy_drive_0 = $HOME/amigaforever/adf/amiga-os-134-workbench.adf
    hard_drive_0 = $HOME/amigahd
    kickstart_file = $HOME/amigaforever/rom/amiga-os-130.rom
    console_debugger = 1
floppy_drive_0 will instruct FS-UAE to use the specified ADF-file (Amiga Disk File image) as the disk inserted into the first floppy disk drive.
hard_drive_0 enables you to map any directory on the host system as a hard drive in the booted up Amiga system. This is useful during development since it allows us to edit and compile on the host device and access the compiled binary directly from the emulator. I have made symbolic links in $HOME/amigahd to all directories in my host environment where I have Amiga binaries, so that I can easily access them from the emulated system.
kickstart_file tells which ROM image file to use when booting up the Amiga emulator.
FS-UAE comes with a handy debugger that you can use on your host system while your software is running in the emulator. To enable the debugger you need to specify console_debugger = 1 in the configuration file.
Now if you run fs-uae configuration.ini with appropriate ROM image files and Workbench disk images, the emulator should boot up with the specified ROM image and start loading the Workbench from the floppy disk image (with the familiar floppy disk drive sounds and everything). Eventually you should be greeted with a UI that looks something like this:
Next, we are going to set up our host-side cross-assembler, which will eat our assembly code and spit out Amiga m68k binaries. Vasm runs on a number of platforms and is capable of producing not only m68k binaries but also object files for multiple other architectures. You can compile vasm to target the m68k architecture and the Motorola ("mot") syntax by running make CPU=m68k SYNTAX=mot. The resulting binary, vasmm68k_mot, can now be used to assemble code that can be run directly on an Amiga.
Let's introduce some source code. Our example can be cloned from GitHub. Run vasmm68k_mot -kick1hunks -Fhunkexe -o example -nosym source.asm in the root of the cloned project to compile the example. This will produce the example binary in the same folder. Running file example will give you:
    myname@mycomputer ~/w/amiga-quickstart> file example
    example: AmigaOS loadseg()ble executable/binary
We have our first Amiga executable! A few words on the arguments given to vasm:
-kick1hunks produces a binary that is compatible with Kickstart 1.x systems.
-Fhunkexe selects the Amiga hunk executable output format, so the result can be run directly.
-nosym strips local symbols from the binary. More information about vasm and its features can be found in the documentation.
The -kick1hunks option is only needed for Amigas equipped with Kickstart 1.3. This means most Amiga 500, Amiga 1000 and Amiga 2000 computers, unless they have had their Kickstart chips upgraded. The Amiga 1200, Amiga 600, Amiga 500+ and Amiga 3000 have the more modern Kickstart 2.0 (or 3.0/3.1) built in.
We are ready to run the software! Before doing that, let's introduce a handy tool, the FS-UAE debugger. It runs in the console of the host system while the code is running in the emulator. It is enabled by the console_debugger = 1 declaration in the FS-UAE configuration file. In order for this to work, the emulator needs to be started from a console.
Once you have the emulator up and running, double-click the Workbench icon on the desktop. A new window opens containing an icon for the shell. Double-click that to start the shell. You are now in AmigaDOS. Activate the FS-UAE console debugger by pressing F12-d. This will freeze the emulator and open the debugger in the console in which you started FS-UAE. Once started, the debugger shows useful information such as the current contents of the CPU registers and the next program counter. Something like this:
    -- stub -- activate_console
    D0 00000000 D1 00000000 D2 40000000 D3 46C4D2DE
    D4 00084CD1 D5 00000000 D6 80000000 D7 C0000000
    A0 00C0040C A1 00C0297A A2 00FDFF50 A3 00C00410
    A4 00FC0FE2 A5 00C03980 A6 00C00276 A7 00C80000
    USP 00C039C2 ISP 00C80000
    T=00 S=1 M=0 X=0 N=0 Z=0 V=0 C=0 IMASK=0 STP=1
    Prefetch 2000 (MOVE) 4e72 (STOP) Chip latch FFFFFF80
    00FC0F94 60e6 BT .B #$ffffffe6 == $00fc0f7c (T)
    Next PC: 00fc0f96
You can type ?<Enter> to list debugger commands. Run fp "example"<Enter> to tell the debugger to break execution when a process called "example" is being run. This command also lets the emulator run again, so you can now navigate in AmigaDOS to the location where the compiled example is stored and run it. In my case, the directory called amiga-quickstart contains the Amiga binary and there's a symbolic link to that directory in a directory called amigahd, which is shared with the emulator, so the process is like this:
    1.SYS:> cd amigahd:
    1.amigahd:> cd amiga-example
    1.amigahd:amiga-example> example
Once you execute the binary, the breakpoint will trigger and control will be returned to the debugger, where you should be greeted with something like this:
    -- stub -- activate_console
    D0 00000001 D1 00C1F668 D2 00000FA0 D3 00000FA8
    D4 00000001 D5 0000003E D6 00307D41 D7 00307D4E
    A0 00C1F668 A1 00C1F778 A2 00C05B28 A3 00C192AC
    A4 00C20D80 A5 00FF4134 A6 00FF4128 A7 00C20D7C
    USP 00C20D7C ISP 00C80000
    T=00 S=0 M=0 X=0 N=0 Z=0 V=0 C=0 IMASK=0 STP=0
    Prefetch 00df (ILLEGAL) 3039 (MOVE) Chip latch 000000DF
    00C192B0 3039 00df f002 MOVE.W $00dff002,D0
    Next PC: 00c192b6
The line just above Next PC: shows the instruction that will be executed next. If you compare that instruction with the first instruction in the example source code source.asm, which says move.w DMACONR,d0, you should notice a remarkable similarity. DMACONR is just an alias for address $dff002, which is defined on the first line of the source code file: DMACONR EQU $dff002. We are now executing the first line of our application in a debugger. If you type d<Enter> in the debugger you will notice even more lines that look the same in the debugger output and in source.asm. d stands for disassembly, and without parameters it will disassemble 10 consecutive lines starting from the next program counter address. Type g<Enter> for "go" so that the emulator kicks back alive and you can witness the marvellous animated graphics running in the emulator!
Let's explore the source code. The first block of source.asm looks like this:
    move.w  DMACONR,d0
    or.w    #$8000,d0
    move.w  d0,olddmareq
    move.w  INTENAR,d0
    or.w    #$8000,d0
    move.w  d0,oldintena
    move.w  INTREQR,d0
    or.w    #$8000,d0
    move.w  d0,oldintreq
    move.w  ADKCONR,d0
    or.w    #$8000,d0
    move.w  d0,oldadkcon
This block of code is proudly copied from the copperbars example by AmigaVikke. It stores the values of certain hardware registers in memory so that they can be restored once execution of the program completes. As you could probably guess, the move instruction moves data between memory and CPU registers. The or instruction unsurprisingly does a bitwise OR on the register value. Here it sets bit 15 of each saved value; that is the SET/CLR bit of the corresponding write registers, so writing the saved value back later switches the saved bits on rather than off. It's important to notice that the .w suffix on the instruction denotes the size of the data the operation works on. There are three variants: .b for byte-sized operations (8 bits), .w for word-sized operations (16 bits) and .l for long-word operations (32 bits). Good coverage of m68k instructions and their operands can be found in the Programmer's Reference Manual.
As mentioned before, INTENAR and the others are just aliases for the memory addresses of certain hardware registers. These custom chip registers are mapped into the CPU's address space, so reading from and writing to them lets us control the Amiga's custom chipset from the CPU. These addresses all start with $dff and have 12 bits to denote the actual register. You can see the addresses the aliases point to at the beginning of the source file. For instance, DMACONR aliases address $dff002. You can find documentation of all Amiga hardware registers in the Amiga hardware guide.
olddmareq and the other memory locations where the data from hardware registers is stored are defined at the end of the source code file in data block definitions. For instance, olddmareq: dc.w 0 declares (dc stands for define constant) a word-sized block of memory that is initialized to zero and assigns the label olddmareq to it.
    move.l  $4,a6
    move.l  #gfxname,a1
    moveq   #0,d0
    jsr     -552(a6)
    move.l  d0,gfxbase
    move.l  d0,a6
    move.l  34(a6),oldview
    move.l  38(a6),oldcopper
Amiga has support for dynamically loadable libraries. Every Amiga system also comes with a set of predefined utility libraries. Libraries are opened using the OpenLibrary function, which itself is stored in a library called exec.library. OpenLibrary provides the caller with the base address of the opened library. All functions within that library can be accessed by jumping to addresses relative to the base address of the library. As said, libraries are opened using an OpenLibrary call to exec.library, which is a library itself. The jump table of every other library can be loaded to any memory address (that's why they need to be accessed using OpenLibrary), but the exec.library base address is always stored in memory address $4.
In the code block above we call OpenLibrary in exec.library by loading the library base address from memory address $4 to register a6 and jumping, with the jsr instruction, to the function entry located at offset -552 from the base address. We provide the name of the library in register a1 (#gfxname points to a memory address that contains the string graphics.library) and the version in register d0. Zero means that we are fine with whatever version of the library the system can provide. OpenLibrary returns the base address of graphics.library in register d0. Effectively, we have now opened the graphics library and we can use the functions it provides until we close the library.
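To make the offset arithmetic concrete, here is a minimal host-side Python sketch (not Amiga code). It assumes the usual Amiga library layout in which each jump table entry takes 6 bytes and sits at a negative offset from the library base. The helper names and the example base address are made up for illustration.

```python
# Assumption: every jump table entry is 6 bytes long and lives below the base.
ENTRY_SIZE = 6

def slot_index(offset):
    """Return which jump table slot a negative library offset refers to."""
    if offset >= 0 or offset % ENTRY_SIZE != 0:
        raise ValueError("expected a negative multiple of 6")
    return -offset // ENTRY_SIZE

def entry_address(base, offset):
    """Address that `jsr offset(a6)` jumps to when a6 holds `base`."""
    return base + offset

print(slot_index(-552))                      # slot used for OpenLibrary
print(hex(entry_address(0x00C00276, -552)))  # hypothetical base address
```

The same arithmetic applies to the other offsets used in this post, such as -222, -270, -132 and -138.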
The minimum library version passed to OpenLibrary is mostly relevant with Kickstart 2.0, 2.1, 3.0 and 3.1 based systems and with AGA-chipset Amigas.
The last two instructions above copy the old system view and copper list pointers to memory. The offsets used refer to the internal structure of the Amiga graphics library (see struct View here). This is not a very future-proof way to do this, but Amiga has almost always allowed direct access to internal system structures (due to the missing MMU and memory protection), and sometimes there's no system-friendly way to read such data.
    move.l  #0,a1
    jsr     -222(a6)    ; LoadView
    jsr     -270(a6)    ; WaitTOF
    jsr     -270(a6)    ; WaitTOF
    move.l  $4,a6
    jsr     -132(a6)    ; Forbid
In this code block several library calls are made in the way outlined in the discussion of the previous block. First, the current view is removed by calling LoadView with parameter 0 (the value of register a1). Next, we wait for the vertical blank (for the next video frame to commence). Finally, we call Forbid() in exec.library, which is a rather powerful call that disables multitasking by preventing other tasks from being scheduled.
WaitTOF is called twice in case the system display uses interlaced mode. In interlaced mode the system has two copper lists (odd and even frame), so it might take up to two screen draws until our copper list is properly loaded.
    move.w  #$3200,BPLCON0              ; three bitplanes
    move.w  #$0000,BPLCON1              ; horizontal scroll 0
    move.w  #$0050,BPL1MOD              ; odd modulo
    move.w  #$0050,BPL2MOD              ; even modulo
    move.w  #$2c81,DIWSTRT              ; DIWSTRT - topleft corner (2c81)
    move.w  #$c8d1,DIWSTOP              ; DIWSTOP - bottomright corner (c8d1)
    move.w  #$0038,DDFSTRT              ; DDFSTRT
    move.w  #$00d0,DDFSTOP              ; DDFSTOP
    move.w  #%1000000110000000,DMACON   ; DMA set ON
    move.w  #%0000000001111111,DMACON   ; DMA set OFF
    move.w  #%1100000000000000,INTENA   ; IRQ set ON
    move.w  #%0011111111111111,INTENA   ; IRQ set OFF
This block should seem familiar after reading about the first setup block. In the first setup block we read data from hardware registers (the ones starting with $dff) and stored the values to memory. Here we write values to hardware registers in order to set up the system the way we want.
This block sets Amiga up to show 320 x 200 graphics with 8 colors.
Amiga uses bitplanes to draw graphics. A bitplane contains a single bit per pixel on the screen. By using three bitplanes we can display 8 different colors simultaneously. Amiga supports use of up to five bitplanes in its regular low-resolution mode. The first instruction in the block writes $3200 to address BPLCON0 to set up the system with three bitplanes.
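As a rough illustration of where $3200 comes from, the following Python sketch builds a BPLCON0 value from a bitplane count. It assumes the OCS bit layout in which bits 14-12 hold the number of bitplanes, bit 9 enables color output and bit 15 selects high resolution. The function name is made up for this example.

```python
def bplcon0(bitplanes, color=True, hires=False):
    """Build a BPLCON0 value (assumed layout: BPU in bits 14-12, COLOR bit 9)."""
    if not 0 <= bitplanes <= 6:
        raise ValueError("bitplane count out of range")
    value = bitplanes << 12
    if color:
        value |= 1 << 9
    if hires:
        value |= 1 << 15
    return value

print(hex(bplcon0(3)))   # value written by the example
print(2 ** 3)            # number of colors selectable with three bitplanes
```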
Amiga also allows usage of two simultaneous playfields. Each playfield can leverage its own bitplanes (up to three bitplanes each). These playfields are displayed on top of each other and can be moved independently of each other. On the second line we set horizontal scroll of both playfields to zero. Later in the software we shall create the animation by modifying this horizontal scroll value. There’s more about playfields here.
On the third and fourth lines we specify bitplane modulos for both odd and even bitplanes using hardware registers BPL1MOD and BPL2MOD. A modulo tells the Amiga to skip a number of bytes in memory once it has read a line of bitplane data, before it reads the next line. In our case the bitplane data is interleaved, so that for each raster line all three bitplanes are stored one after another. This means that we need to skip 80 bytes ($0050 in hexadecimal) when we reach the end of a line in order to reach the next line of the same bitplane: each 320-pixel line of one bitplane takes 320/8 = 40 bytes, and since there are three bitplanes in total, there are always two bitplanes' worth of data after a raster line of a given bitplane.
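The modulo arithmetic above can be checked with a few lines of Python. This is only a host-side sketch of the calculation, and the function name is hypothetical.

```python
def interleaved_modulo(width_pixels, bitplanes):
    """Bytes to skip after one line of one bitplane in an interleaved layout."""
    bytes_per_line = width_pixels // 8          # one bit per pixel
    return bytes_per_line * (bitplanes - 1)     # skip the other planes' lines

modulo = interleaved_modulo(320, 3)
print(modulo, hex(modulo))
```

With a single bitplane the result is zero, which matches the intuition that nothing needs to be skipped when the lines of a plane follow each other directly.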
The DIWSTRT and DIWSTOP registers control the size of the display window. DIWSTRT controls the top-left corner of the window and DIWSTOP controls the bottom-right corner of the window.
DDFSTRT and DDFSTOP control the horizontal timing when bitplane data fetch is started and stopped. As you can see from the documentation we use the “normal” setting for both. These timings are wait times (in cycles) from the start of the row, and from the end of the row.
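As a sanity check of the "normal" values, here is a small Python sketch. It assumes the low-resolution relationship in which one 16-pixel word is fetched every 8 units between DDFSTRT and DDFSTOP, inclusive. Treat the formula as an assumption to verify against the hardware documentation, not as a definitive rule.

```python
def lores_words_per_line(ddfstrt, ddfstop):
    """Words fetched per line, assuming one word per 8 units in low resolution."""
    return (ddfstop - ddfstrt) // 8 + 1

words = lores_words_per_line(0x38, 0xD0)
print(words, words * 16)   # words per line and the resulting pixel width
```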
These register values define the size and position of the display onscreen. This is related to overscan which is relevant with old analog display technologies where the display area and resolution are not clearly defined. Modifying these values allows us to stretch the viewport to cover all the screen area. With Amiga hardware, this also increases the number of pixels displayed. The maximum overscan for OCS chipset is in lores-mode roughly 352×286, and it may require some trickery to achieve.
The lines commented DMA set ON and DMA set OFF use DMACON to set up direct memory access for the software. We only need copper and bitplane DMA, so everything else is explicitly disabled.
The last two lines use INTENA to set up interrupts.
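The two DMACON writes rely on the SET/CLR convention of the register: when bit 15 of the written value is set, the other 1 bits are switched on, otherwise they are switched off. The Python sketch below models that behaviour on the host. The starting value is a hypothetical example, not something read from a real machine.

```python
SETCLR = 0x8000

def write_setclr(current, value):
    """Model a write to a SET/CLR style register such as DMACON."""
    bits = value & 0x7FFF
    if value & SETCLR:
        return current | bits            # bit 15 set: switch bits on
    return current & ~bits & 0x7FFF      # bit 15 clear: switch bits off

dmacon = 0x03FF                                    # hypothetical starting state
dmacon = write_setclr(dmacon, 0b1000000110000000)  # DMA set ON
dmacon = write_setclr(dmacon, 0b0000000001111111)  # DMA set OFF
print(f"{dmacon:#06x}")
```

In this model bits 7 and 8 (copper and bitplane DMA) stay on, bits 0 to 6 end up off, and bit 9 is left as it was.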
The Amiga chipset consists of three custom chips: Agnus, Denise and Paula. Agnus contains a general purpose coprocessor (or "Copper" for short) which handles a big part of the graphics on an Amiga system. The Copper has its own instruction set which can be used to write programs running on it. The instruction set contains only three different instructions: MOVE, WAIT and SKIP. The main loop of the example generates a program for the Copper (called a copper list) on each frame and assigns it to execution.
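Each copper instruction is two 16-bit words, and the lowest bit of each word tells the instruction types apart. The Python sketch below shows that decoding rule on the host. The function name is made up, and the sketch ignores everything else the words encode.

```python
def copper_instruction_kind(first_word, second_word):
    """Classify a copper instruction by the lowest bit of each word."""
    if first_word & 1 == 0:
        return "MOVE"                    # first word is a register offset
    return "SKIP" if second_word & 1 else "WAIT"

print(copper_instruction_kind(0x0180, 0x0FD3))
print(copper_instruction_kind(0x0007, 0xFFFE))
```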
    move.l  frame,d1
    move.l  #copper,a6
    addq.l  #1,d1
    move.l  d1,frame
This first block increases the frame counter and fetches the pointer to which we will write the copper list. The frame counter is needed in order for us to create an animation. The counter is loaded into register d1, incremented and written back to frame. #copper points to a memory block that we have reserved in chip mem for storing the copper list. All data accessed by the copper needs to be allocated in a memory segment that is available to the custom chipset. If you take a look at the end of the code listing you will see that the copper data block is declared in a segment that is marked ChipRAM. The address pointer to the copper list is stored in register a6, which is incremented as we build up the copper list by adding instructions to it.
The copper MOVE instruction moves data to hardware registers. The instruction is encoded so that the first word of the instruction denotes the hardware register address to which data is moved and the second word of the instruction contains the data to be moved. Since all hardware registers share the same 12 highest bits ($dff), the first word of the instruction takes only the last 12 bits of the destination register as a parameter. So, for instance, when we add a MOVE instruction with $00e2 to the copper list, we want the copper to move the data to register $dff0e2.
    move.l  #bitplanes,d0
    move.w  #$00e2,(a6)+
    move.w  d0,(a6)+
    swap    d0
    move.w  #$00e0,(a6)+
    move.w  d0,(a6)+
This block (and the two blocks following it) assigns our bitplane data to the bitplane hardware registers so that the hardware will draw the graphics from bitplanes contained in memory. The bitplane data address is communicated to the hardware through two registers: a high and a low pointer. The low pointer contains the lowest 15 bits of the bitplane data address and the high pointer contains the highest 3 bits. The swap instruction swaps the 16-bit halves of the 32-bit register so that the low bits become high bits and vice versa. This allows us to store the low bits in the low pointer register and the high bits in the high pointer register. Notice that we are always moving just 16 bits of data to the copper list (by using the .w version of the move instruction). The move.w instruction with the (a6)+ addressing mode moves data to the address pointed to by a6 and then increments the address in a6 by the word size (2 bytes).
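To see what ends up in the copper list, here is a host-side Python sketch that produces the same four words for a given bitplane address. The register offsets $00e0 and $00e2 are taken from the code above. The function name and the example address are hypothetical.

```python
BPL1PTH = 0x00E0   # high pointer register offset, as used in the code above
BPL1PTL = 0x00E2   # low pointer register offset, as used in the code above

def bitplane_pointer_moves(address):
    """Return the four copper list words that load one bitplane pointer."""
    low = address & 0xFFFF             # what move.w d0 stores before the swap
    high = (address >> 16) & 0xFFFF    # what move.w d0 stores after the swap
    return [BPL1PTL, low, BPL1PTH, high]

print([hex(word) for word in bitplane_pointer_moves(0x00021000)])
```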
    ; colors
    move.l  #$01800fd3,(a6)+    ; color 0
    move.l  #$01820832,(a6)+    ; color 1
    move.l  #$0184036b,(a6)+    ; color 2
    move.l  #$01860667,(a6)+    ; color 3
    move.l  #$01880f53,(a6)+    ; color 4
    move.l  #$018a07ad,(a6)+    ; color 5
    move.l  #$018c0000,(a6)+    ; color 6
    move.l  #$018e0cef,(a6)+    ; color 7
This block adds a bunch of MOVE instructions to the copper list. These MOVE instructions define the color palette used while drawing the graphics. Since we are using three bitplanes we have 3 bits in total to select the color, which adds up to 8 distinct colors. Colors are defined in hardware registers COLOR00 to COLOR31. Each color is defined using 12 bits (4 red, 4 green and 4 blue).
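Each long word in the block above packs a register offset and a 12-bit color. The Python sketch below rebuilds two of them, assuming the color registers start at offset $180 and are two bytes apart, with red in the highest nibble. The function name is made up for this example.

```python
def color_move(index, red, green, blue):
    """Build the long word for a copper MOVE that sets one palette entry."""
    register = 0x180 + 2 * index                 # color registers are 2 bytes apart
    value = (red << 8) | (green << 4) | blue     # 4 bits per component
    return (register << 16) | value

print(hex(color_move(0, 0xF, 0xD, 0x3)))   # color 0 from the block above
print(hex(color_move(7, 0xC, 0xE, 0xF)))   # color 7 from the block above
```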
    move.l  #32,d0          ; Number of iterations
    move.l  #$07,d1         ; Current row wait
    move.l  #sin32_15,a0    ; Sine base
    move.l  frame,d2        ; Current sine
scrollrows:
    ; Wait for correct offset row
    move.w  d1,(a6)+
    move.w  #$fffe,(a6)+
    ; Fetch sine from table
    move.l  d2,d3
    and.l   #$1f,d3
    move.b  (a0,d3),d4
    ; Transform sine to horizontal offset value
    move.l  d4,d5
    lsl.l   #4,d4
    add.l   d4,d5
    ; Add horizontal offset to copperlist
    move.w  #$0102,(a6)+
    move.w  d5,(a6)+
    ; Proceed to next row that we want to offset
    add.l   #$500,d1
    ; Move to next sine position for next offset row
    addq.w  #1,d2
    subq.w  #1,d0
    bne     scrollrows
This block animates the graphics on the screen. We introduce different horizontal displacement per scan line and animate the displacement value by frame counter in order to create an effect where the graphics continuously flutter.
We achieve this effect by inserting values into the BPLCON1 register. The copper is instructed to perform the horizontal displacement assignment using a MOVE instruction with the calculated displacement stored in the d5 register. This is done by these two rows above:
    move.w  #$0102,(a6)+
    move.w  d5,(a6)+
During each frame the graphics area is divided into 32 bands. Each of these bands has a different horizontal displacement. We instruct the copper to wait until the scan line of the next band is reached, at which time we assign a new horizontal displacement value into the BPLCON1 register. The copper WAIT instruction is formed using the value in the d1 register, where the lower 8 bits are always $07 (this denotes that the horizontal beam position should be at the beginning of the scan line) and the higher 8 bits are the beam vertical position (i.e. the scan line). The WAIT instruction is inserted into the copper list by these two instructions:
    move.w  d1,(a6)+
    move.w  #$fffe,(a6)+
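The two words of the WAIT can be modelled on the host like this. The Python sketch only mirrors the encoding described above (vertical position in the high byte, $07 in the low byte, $fffe as the second word), and the function name is hypothetical.

```python
def wait_words(scan_line, horizontal=0x07):
    """Return the two copper list words for a WAIT on the given scan line."""
    first = ((scan_line & 0xFF) << 8) | horizontal
    return [first, 0xFFFE]

print([hex(word) for word in wait_words(0)])
print([hex(word) for word in wait_words(5)])
```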
Once the displacement of a given band is assigned, we wait until the next band. The vertical beam position of the next band is calculated by adding 5 to the vertical position of the previous band: 160 / 32 = 5, where 160 is the height of the graphics and 32 the number of bands. Since the vertical position lives in the high byte of the value, the next wait position is calculated by this instruction: add.l #$500,d1
sin32_15 contains a pre-calculated table of sine values (32 samples with values from 0 to 15). The horizontal displacement value is calculated from these sine values. We fetch a sine value for each band by adding the band number to the frame counter, taking the result modulo 32 and using the resulting value as an index to the sine table:
    move.l  d2,d3
    and.l   #$1f,d3
    move.b  (a0,d3),d4
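The index calculation is a power-of-two modulo done with a bit mask. A minimal Python sketch of the same idea follows. The function name is made up, and the band number stands for how many times d2 has been incremented within the frame.

```python
def sine_index(frame, band):
    """Index into a 32-entry table; & 0x1f is the same as modulo 32."""
    return (frame + band) & 0x1F

print(sine_index(0, 0), sine_index(30, 5), sine_index(64, 31))
```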
Amiga supports hardware horizontal scroll with 4-bit resolution (thus the 0 to 15 sine value). To correctly scroll the graphics, we need to use the same displacement value on both playfields one and two. This is achieved by having the same displacement value in both the lowest four bits (bits 0-3) and the next four bits (bits 4-7) of the register value:
    move.l  d4,d5
    lsl.l   #4,d4
    add.l   d4,d5
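In Python the same nibble duplication looks like this. The sketch uses a bitwise OR where the assembly uses an addition; the two give the same result here because the shifted and unshifted values never share a bit. The function name is hypothetical.

```python
def bplcon1_value(shift):
    """Copy a 4-bit scroll value into both the low and the next nibble."""
    if not 0 <= shift <= 15:
        raise ValueError("scroll value must fit in 4 bits")
    return shift | (shift << 4)

print(hex(bplcon1_value(5)), hex(bplcon1_value(15)))
```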
The copper list for the frame is now complete. To conclude the copper list, we need to mark the end of the list. The value written by this move instruction is interpreted by the copper as a wait for a beam position that is never reached, which ends the copper list:
    ; end of copperlist
    move.l  #$fffffffe,(a6)+
The Amiga system contains two complex interface adapters (CIA). These adapters are used to communicate with the outside world using input/output devices such as a joystick. Data from those devices can be read from a set of chip registers. The CIAAPRA register can be used to monitor if a joystick button is pressed. On every iteration of the main loop, we check if a joystick (or mouse) button is pressed and, if so, exit the program. The following code checks both bit 6 and bit 7 of the CIAAPRA register and branches to exit if either one is clear. These bits are active low: bit 6 reads as zero while the joystick (or mouse) in port 1 has its button pressed, and bit 7 reads as zero while the joystick (or mouse) in port 2 has its button pressed.
    ; if mousebutton/joystick 1 or 2 pressed then exit
    btst.b  #6,CIAAPRA
    beq     exit
    btst.b  #7,CIAAPRA
    beq     exit
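The btst/beq pairs exit when a tested bit is zero. A host-side Python model of that decision, assuming the button bits read as 0 while a button is held down, could look like this. The function name and the sample register values are made up.

```python
def should_exit(ciaapra):
    """True when either fire/mouse button bit (6 or 7) reads as zero."""
    port1_pressed = (ciaapra & (1 << 6)) == 0
    port2_pressed = (ciaapra & (1 << 7)) == 0
    return port1_pressed or port2_pressed

print(should_exit(0xFF))   # both bits high: no button held
print(should_exit(0xBF))   # bit 6 low: first button held
```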
We need to wait for the vertical blanking period before we can assign our copper list to execution. The vertical blanking time starts at scan line 300. To detect whether vertical blanking is on, we peek at the value in the VPOSR register. We are interested only in the vertical position, so we mask everything else out (using mask #$1ff00, which leaves only the 9 bits we care about). After this we check if scan line 300 has been reached:
    ; Wait for vertical blanking before taking the copper list into use
waitVB:
    move.l  VPOSR,d0
    and.l   #$1ff00,d0
    cmp.l   #300<<8,d0
    bne     waitVB
We could have bit shifted the value down in register d0 and then compared it against #300. Instead we use an assembly constant trick where we shift the constant 300 with the <<8 operator. What this says is "give me 300 bit shifted left by 8 bits". This saves us an instruction but might look a bit confusing at first.
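The masked comparison can be tried out on the host with a small Python sketch. It assumes, as the mask suggests, that the 9-bit vertical position sits in bits 8 to 16 of the long word read from VPOSR. The function name and the sample values are hypothetical.

```python
def reached_line(vposr_long, line=300):
    """Compare the 9-bit vertical position in bits 8-16 against `line`."""
    return (vposr_long & 0x1FF00) == (line << 8)

print(hex(300 << 8))                         # the constant the assembler computes
print(reached_line((300 << 8) | 0x42))       # horizontal bits are masked away
print(reached_line(299 << 8))
```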
Note: Amiga has also support for vertical blank interrupts. However, interrupts are a bit trickier to program, so to keep it simple, we stick with the busy loop implementation.
Once the vertical blank period is reached we take our copper list into use by assigning the address of the copper list to registers COP1LCH and COP1LCL. Notice that when writing a long word to location COP1LCH the lower word spills over into COP1LCL, and thus both the high and low bits of the copper list address are set. Once that is done we execute the next iteration of the main loop:
    ; Take copper list into use
    move.l  #copper,a6
    move.l  a6,COP1LCH
    bra     mainloop
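Why a single long write fills both registers can be sketched in Python: the 68000 is big-endian, so the high word of a long lands at the given address and the low word two bytes later. The sketch assumes COP1LCH and COP1LCL sit at $dff080 and $dff082. The function name and the example address are made up.

```python
import struct

COP1LCH = 0xDFF080   # assumed register addresses, two bytes apart
COP1LCL = 0xDFF082

def long_write_as_words(address, value):
    """Split a big-endian 32-bit write into the two 16-bit words it touches."""
    high, low = struct.unpack(">HH", struct.pack(">I", value))
    return {address: high, address + 2: low}

writes = long_write_as_words(COP1LCH, 0x00021234)
print({hex(addr): hex(word) for addr, word in writes.items()})
```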
On application exit, we must restore some registers back to defaults (interrupt control, most importantly), load back the old view and copperlist and enable multitasking again.
    ; exit gracefully - reverse everything done in init
    move.w  #$7fff,DMACON
    move.w  olddmareq,DMACON
    move.w  #$7fff,INTENA
    move.w  oldintena,INTENA
    move.w  #$7fff,INTREQ
    move.w  oldintreq,INTREQ
    move.w  #$7fff,ADKCON
    move.w  oldadkcon,ADKCON
    move.l  oldcopper,COP1LCH
    move.l  gfxbase,a6
    move.l  oldview,a1
    jsr     -222(a6)    ; LoadView
    jsr     -270(a6)    ; WaitTOF
    jsr     -270(a6)    ; WaitTOF
    move.l  $4,a6
    jsr     -138(a6)    ; Permit
    ; end program
    rts
After that, we can just exit the subroutine with rts and we are dropped back to Amiga OS.
Running on real hardware
Since we are compiling the application as a real Amiga executable, testing the code on a real Amiga should be easy – just copy the file over?
Well, not exactly. Since 3.5″ disk drives are no longer common, moving data to Amiga has become quite difficult. To add insult to injury, Amiga can read 3.5″ DD DOS-formatted disks, but PC cannot read Amiga MFM-coded disks. This requires a tool on Amiga side such as CrossDOS (which is shipped as part of Amiga OS 2.1) or MessyDOS from Aminet.
There’s still a chicken-and-egg problem trying to transfer the necessary software to your Amiga 500 with 1.3 Kickstart.
A more tractable way is to use serial port transfer. You can then transfer MessyDOS to your Amiga and start moving data using DOS-formatted disks.
If you have Amiga 1200 or Amiga 600, you can also use PCMCIA CompactFlash adapter.
Correction: the exec.library base address is stored in memory address $4. The library base address itself is not $4. Thanks to Petri Koistinen! Without his comment this would have slipped our attention.