Crash course to Amiga assembly programming

The 30th anniversary of Amiga inspired me to dig into Amiga programming. Back in Amiga’s golden era (late 80’s and early 90’s) I never had the chance to try this out since despite my relentless whining my parents wouldn’t get me one. Luckily later when I was studying at the uni, I managed to bargain one fine Amiga 500 specimen from the flea market at an affordable price of 20 euros.

Although Amiga as such is not that useful a platform to know these days, learning how to write programs for it can be very educational. Amiga as an environment is much simpler than (for instance) modern PCs. This makes learning low-level programming on it faster than on more complex environments. Although the hardware architecture is quite simple, it has some computer system design features that are still in use in modern environments as well such as DMA and interrupts. On top of being plain fun, writing assembly on Amiga teaches programming concepts that are usually hidden by higher-level languages and modern operating systems.

I’ve written this blog post together with Harri Salokorpi. We’ll walk you through an example that creates graphics on the display with a simple animation. We both hope this blog post provides a quick start to those who want to try out programming on this legendary device. However, we’re mostly going to use an emulator as a development environment, so the real device is not mandatory.

Let’s get started!


Development environment

As said, we are going to use the FS-UAE Amiga Emulator to run our compiled Amiga software. On top of installing the emulator you’re also going to need at least one Kickstart ROM image and a Workbench floppy disk image. FS-UAE can emulate multiple Amiga models, but for this blog post we are using an Amiga 500 with a Kickstart 1.3 ROM image and Workbench 1.3. You can get plenty of ROM images and Workbench disk images from the Amiga Forever bundle. There’s for instance the Forever Plus Edition, which provides the required ROM images, Workbench disk images and some additional goodies like games and demos.

In order to get the emulator to boot up so that we can effectively run and debug our software, we need to configure it properly. FS-UAE takes a configuration file (an ini-file) as a command line parameter like this: fs-uae configuration.ini. Here is an example configuration file:

floppy_drive_0 = $HOME/amigaforever/adf/amiga-os-134-workbench.adf
hard_drive_0 = $HOME/amigahd
kickstart_file = $HOME/amigaforever/rom/amiga-os-130.rom
console_debugger = 1

floppy_drive_0 will instruct FS-UAE to use the specified ADF-file (Amiga Disk File image) as the disk inserted into the first floppy disk drive. hard_drive_0 enables you to map any directory on the host system as a hard drive in the booted up Amiga system. This is useful during development since it allows us to edit and compile on the host device and access the compiled binary directly from the emulator. I have made symbolic links in $HOME/amigahd to all directories in my host environment where I have Amiga binaries, so that I can easily access them from the emulated system. kickstart_file tells which ROM image file to use when booting up the Amiga emulator.

FS-UAE comes with a handy debugger that you can use on your host system while your software is running in the emulator. To enable the debugger you need to specify console_debugger = 1 in the configuration file.

Now if you run fs-uae configuration.ini with appropriate ROM image files and Workbench disk images, the emulator should boot up the specified ROM image and start loading Workbench from the floppy disk image (with the familiar floppy disk drive sounds and everything). Eventually you should be greeted with a UI that looks something like this:



Next, we are going to set up our host-side cross-assembler which will eat our assembly code and spit out Amiga m68k binaries. Vasm runs on a number of platforms and is capable of producing not only m68k binaries but object files for multiple other platforms as well. You can compile vasm to target the m68k architecture and mot syntax by running make CPU=m68k SYNTAX=mot. The resulting binary vasmm68k_mot can now be used to compile assembly code that can be run directly on an Amiga machine.

Let’s introduce some source code. Our example can be cloned from GitHub. Run vasmm68k_mot -kick1hunks -Fhunkexe -o example -nosym source.asm in the root of the cloned project to compile the example. This will produce example binary in the same folder. Running file example will give you:

myname@mycomputer ~/w/amiga-quickstart> file example
example: AmigaOS loadseg()ble executable/binary

We have our first Amiga executable! A few words on the arguments given to vasm: -kick1hunks produces a binary that is compatible with Kickstart 1.x systems. -Fhunkexe selects the output format and generates an executable. -nosym strips local symbols from the binary. More information about vasm and its features can be found in the documentation.

Note: The -kick1hunks option is only needed for Kickstart 1.3 equipped Amigas. This means most of the Amiga 500, Amiga 1000 and Amiga 2000 computers, unless they have had their kickstart chips upgraded. Amiga 1200, Amiga 600, Amiga 500+ and Amiga 3000 have more modern Kickstart 2.0 (or 3.0/3.1) built in.


We are ready to run the software! Let’s do just that. However, let’s first introduce a handy tool, the FS-UAE debugger. It can be run in the console of the host system while the code is running in the emulator. This can be enabled by the console_debugger = 1 declaration in the FS-UAE configuration file. In order for this to work the emulator needs to be run from console.

Once you have the emulator up and running, double click on the Workbench icon on the desktop. A new window is opened containing icon for shell. Double click that to start the shell. You are now in AmigaDOS. Activate FS-UAE console debugger by pressing F12-d. This will freeze the emulator and open debugger in the console in which you started FS-UAE. Once started the debugger will show useful information such as current contents of the CPU registers and next program counter. Something like this:

 -- stub -- activate_console
  D0 00000000   D1 00000000   D2 40000000   D3 46C4D2DE
  D4 00084CD1   D5 00000000   D6 80000000   D7 C0000000
  A0 00C0040C   A1 00C0297A   A2 00FDFF50   A3 00C00410
  A4 00FC0FE2   A5 00C03980   A6 00C00276   A7 00C80000
USP  00C039C2 ISP  00C80000
T=00 S=1 M=0 X=0 N=0 Z=0 V=0 C=0 IMASK=0 STP=1
Prefetch 2000 (MOVE) 4e72 (STOP) Chip latch FFFFFF80
00FC0F94 60e6                     BT .B #$ffffffe6 == $00fc0f7c (T)
Next PC: 00fc0f96

You can type ?<Enter> to list debugger commands. Run fp "example"<Enter> to tell the debugger to break execution when process called “example” is being run. This command will let the emulator run again, so you can now navigate with the AmigaDOS to the location where compiled example is stored and run it. In my case, where the directory called amiga-quickstart contains the amiga binary and there’s a symbolic link to the directory in a directory called amigahd which is shared with the emulator, the process is like this:

1.SYS:> cd amigahd:
1.amigahd:> cd amiga-quickstart
1.amigahd:amiga-quickstart> example

Once you execute the binary, the breakpoint will trigger and control will be returned to the debugger, where you should be greeted with something like this:

 -- stub -- activate_console
  D0 00000001   D1 00C1F668   D2 00000FA0   D3 00000FA8
  D4 00000001   D5 0000003E   D6 00307D41   D7 00307D4E
  A0 00C1F668   A1 00C1F778   A2 00C05B28   A3 00C192AC
  A4 00C20D80   A5 00FF4134   A6 00FF4128   A7 00C20D7C
USP  00C20D7C ISP  00C80000
T=00 S=0 M=0 X=0 N=0 Z=0 V=0 C=0 IMASK=0 STP=0
Prefetch 00df (ILLEGAL) 3039 (MOVE) Chip latch 000000DF
00C192B0 3039 00df f002           MOVE.W $00dff002,D0
Next PC: 00c192b6

Row above Next PC: shows the instruction that would be executed next. If you compare that instruction with the first code instruction in the example source code source.asm which says move.w DMACONR,d0 you should notice a remarkable similarity. DMACONR is just alias for address $dff002 which is defined on the first line of the source code file: DMACONR EQU $dff002. We are now executing the first line of our application in a debugger. If you type d<Enter> in the debugger you will notice even more lines that look the same in debugger output and in source.asm. d stands for disassembly, and without parameters it will disassemble 10 consecutive lines starting from the next program counter address.

Type g<Enter> for “go” so that emulator will kick back alive and you can witness the marvellous animated graphics running in the emulator!


Application startup

Let’s explore the source code. First block of source.asm looks like this:

move.w DMACONR,d0
or.w #$8000,d0
move.w d0,olddmareq
move.w INTENAR,d0
or.w #$8000,d0
move.w d0,oldintena
move.w INTREQR,d0
or.w #$8000,d0
move.w d0,oldintreq
move.w ADKCONR,d0
or.w #$8000,d0
move.w d0,oldadkcon

This block of code is proudly copied from copperbars example of AmigaVikke. It stores the values of certain hardware registers to memory so that they can be restored to the same hardware registers once execution of the program completes. As you could probably guess, the move instruction moves data between memory and CPU registers. The or instruction unsurprisingly does a bitwise or on the register value. Here it sets bit 15 ($8000) of each saved value: in the corresponding write registers bit 15 selects whether the written bits are set or cleared, so the saved values can later be written back as they are to switch the same bits on again. It’s important to notice that the .w suffix on the instruction denotes the size of the data the operation works on. There are three variants: .b for byte-sized operations (8 bits), .w for word-sized operations (16 bits) and .l for long-word operations (32 bits). Good coverage of m68k instructions and their operands can be found in the Programmer’s Reference Manual.
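
To illustrate why bit 15 matters when the saved values are written back, here is a minimal Python model of a set/clear style register. The helper names write_setclr and restore are hypothetical and exist only for this sketch; the starting state $1234 and the saved value $03f0 are arbitrary.

```python
SET_CLR = 0x8000  # bit 15 in writes to DMACON, INTENA, INTREQ and ADKCON


def write_setclr(current: int, value: int) -> int:
    """Model one write to a set/clear style register (hypothetical helper)."""
    bits = value & 0x7FFF
    if value & SET_CLR:
        return current | bits           # bit 15 set: switch the selected bits on
    return current & ~bits & 0x7FFF     # bit 15 clear: switch the selected bits off


def restore(saved_with_set: int) -> int:
    """Mimic the exit sequence: clear every bit, then write the saved value."""
    state = write_setclr(0x1234, 0x7FFF)    # arbitrary state, all bits cleared
    return write_setclr(state, saved_with_set)


saved = 0x03F0 | SET_CLR        # what `or.w #$8000,d0` makes of a read of $03f0
print(hex(restore(saved)))      # 0x3f0
```

Without the or, the write-back would have bit 15 clear and would switch the saved bits off instead of on.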

As mentioned before, DMACONR, INTENAR and the others are just aliases for the memory addresses of certain hardware registers. The registers of Amiga’s custom chipset are memory-mapped, so reading from and writing to these addresses lets the CPU control the custom chips. (They are not part of chip mem, the chip memory, which is the RAM that the custom chips can access.) These addresses all start with $dff, and the lowest 12 bits denote the actual register. You can see the addresses the aliases point to at the beginning of the source file. For instance DMACONR aliases address $dff002.

You can find documentation of all Amiga hardware registers in Amiga hardware guide.

olddmareq and the other memory locations where the data from the hardware registers is stored are defined at the end of the source code file in data block definitions. For instance olddmareq: dc.w 0 defines (dc stands for define constant) a word-sized block of memory that is initialized to zero and assigns the label olddmareq to it.

move.l	$4,a6
move.l	#gfxname,a1
moveq	#0,d0
jsr	-552(a6)
move.l	d0,gfxbase
move.l 	d0,a6
move.l 	34(a6),oldview
move.l 	38(a6),oldcopper

Amiga has support for dynamically loadable libraries, and every Amiga system comes with a set of predefined utility libraries. Libraries are opened using the OpenLibrary function, which itself lives in a library called exec.library. Calling OpenLibrary provides the caller with the base address of the opened library. All functions within that library can be accessed by jumping to addresses relative to the base address. The jump table of every other library can be loaded to any memory address (that’s why they need to be accessed using OpenLibrary), but the base address of exec.library is always stored in memory address $4.

In the code block above we call OpenLibrary of exec.library by loading the library base address from memory address $4 to register a6 and using the jsr instruction to jump to the jump table entry that resides at offset -552 from the base address. We provide the name of the library in register a1 (#gfxname points to a memory address that contains the string graphics.library) and the version in register d0. Zero means that we are fine with whatever version of the library the system can provide. OpenLibrary returns the base address of graphics.library in register d0. Effectively, we have now opened the graphics library and we can use the functions it provides until we close the library.
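
The negative offsets used in this post follow a regular pattern: each jump table slot of an Amiga library is six bytes (a jmp instruction with a 32-bit address). The Python sketch below only models that arithmetic. The helper names are made up for the example and the base address is an arbitrary illustrative value, not something the program relies on.

```python
LVO_SIZE = 6  # each jump table slot holds a 6-byte `jmp <address>` instruction


def vector_index(offset: int) -> int:
    """Return which jump table slot a negative library offset refers to."""
    if offset >= 0 or offset % LVO_SIZE != 0:
        raise ValueError("library offsets are negative multiples of 6")
    return -offset // LVO_SIZE


def vector_address(base: int, offset: int) -> int:
    """Address that `jsr offset(a6)` jumps to when a6 holds `base`."""
    vector_index(offset)  # validate the offset first
    return base + offset


EXAMPLE_BASE = 0x00C00276  # hypothetical library base, for illustration only

for name, offset in [("OpenLibrary", -552), ("LoadView", -222),
                     ("WaitTOF", -270), ("Forbid", -132), ("Permit", -138)]:
    print(name, vector_index(offset), hex(vector_address(EXAMPLE_BASE, offset)))
```

The function names are the ones given in the comments of the listings in this post.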

Note: The minimum library version passed to OpenLibrary is mostly relevant with Kickstart 2.0, 2.1, 3.0 and 3.1 based systems and with AGA-chipset Amigas.

The last two instructions above copy the old system view and copper list pointers to memory. The offsets used (34 and 38) refer to the internal structure of the Amiga graphics library base; for the view itself, see struct View here. This is not a very future-proof way to do this, but Amiga has almost always allowed direct access to internal system structures (due to missing MMU and memory protection), and sometimes there’s no system friendly way to read such data.

You can find information about libraries and the functions they provide in Libraries Manual Guide. Library function base address offsets are included in Amiga Developer Docs’ Includes and Autodocs.

move.l #0,a1
jsr -222(a6)	; LoadView
jsr -270(a6)	; WaitTOF
jsr -270(a6)	; WaitTOF
move.l	$4,a6
jsr -132(a6)	; Forbid

In this code block several library calls are made as outlined in the discussion of the previous block. First, the current view is removed by calling LoadView with parameter 0 (the value of register a1). Next, we wait for the vertical blank (for the next video frame to commence). Finally, we call Forbid() in exec.library, a rather powerful call that disables multitasking by preventing other tasks from being scheduled.

WaitTOF is called twice in case the system display uses interlaced mode. In interlaced mode the system has two copper lists (odd and even frame) so it might take up to 2 screen draws until our copper list is properly loaded.

move.w  #$3200,BPLCON0                  ; three bitplanes
move.w  #$0000,BPLCON1                  ; horizontal scroll 0
move.w  #$0050,BPL1MOD                  ; odd modulo
move.w  #$0050,BPL2MOD                  ; even modulo
move.w  #$2c81,DIWSTRT                  ; DIWSTRT - topleft corner (2c81)
move.w  #$c8d1,DIWSTOP                  ; DIWSTOP - bottomright corner (c8d1)
move.w  #$0038,DDFSTRT                  ; DDFSTRT
move.w  #$00d0,DDFSTOP                  ; DDFSTOP
move.w  #%1000000110000000,DMACON       ; DMA set ON
move.w  #%0000000001111111,DMACON       ; DMA set OFF
move.w  #%1100000000000000,INTENA       ; IRQ set ON
move.w  #%0011111111111111,INTENA       ; IRQ set OFF

This block should seem familiar after reading about the first setup block. In the first setup block we read data from hardware registers (the ones starting with $dff) and stored the values to memory. Here we write values to hardware registers in order to set up the system the way we want.

This block sets Amiga up to show 320 x 200 graphics with 8 colors.

Amiga uses bitplanes to draw graphics. A bitplane contains a single bit per pixel on the screen. By using three bitplanes we can display 8 different colors simultaneously. In a regular low resolution display Amiga supports up to five bitplanes (32 colors); a sixth bitplane can be used in certain display modes. The first instruction in the block writes $3200 to address BPLCON0 to set up the system with three bitplanes.
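
As a small illustration of how bitplanes turn into colors, the Python sketch below combines one bit from each plane into a palette index and reads the bitplane count back from a BPLCON0 value. The function names are invented for this example.

```python
def color_index(plane_bits):
    """Combine one bit per bitplane (plane 1 first) into a palette index."""
    index = 0
    for plane, bit in enumerate(plane_bits):
        index |= (bit & 1) << plane   # plane 1 supplies bit 0, plane 2 bit 1, ...
    return index


def bitplane_count(bplcon0: int) -> int:
    """Read the number of bitplanes from bits 12-14 of a BPLCON0 value."""
    return (bplcon0 >> 12) & 0b111


planes = bitplane_count(0x3200)
print(planes)                    # 3
print(2 ** planes)               # 8
print(color_index([1, 0, 1]))    # 5
```

A pixel whose bit is set in planes 1 and 3 but not in plane 2 is therefore drawn with color 5.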

Amiga also allows usage of two simultaneous playfields. Each playfield can leverage its own bitplanes (up to three bitplanes each). These playfields are displayed on top of each other and can be moved independently of each other. On the second line we set horizontal scroll of both playfields to zero. Later in the software we shall create the animation by modifying this horizontal scroll value. There’s more about playfields here.

On the third and fourth line we specify bitplane modulos for both odd and even bitplanes using hardware registers BPL1MOD and BPL2MOD. The modulo instructs Amiga to skip a number of bytes in memory once it has read a line of bitplane data, so that it arrives at the next line of the same bitplane. In our case the bitplane data is tightly packed, so that for each raster line all three bitplanes are stored one after another. This means that we need to skip 80 bytes ($0050 in hexadecimal) when we reach the end of a line in order to reach the next line of the same bitplane: each 320 pixel line of one bitplane takes 320/8 = 40 bytes, and since there are three bitplanes in total, there are always two bitplanes’ worth of data after a raster line of a given bitplane.
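
The modulo arithmetic can be checked with a few lines of Python. This is only a sketch of the calculation described above; the function names are invented, and the memory layout is the interleaved one the text describes (all planes of a raster line stored back to back).

```python
def interleaved_modulo(width_pixels: int, planes: int) -> int:
    """Bytes to skip after one fetched row to reach the same plane's next row."""
    bytes_per_row = width_pixels // 8
    return bytes_per_row * (planes - 1)


def row_address(base: int, plane: int, row: int,
                width_pixels: int, planes: int) -> int:
    """Start address of `row` of `plane` (0-based) in interleaved data."""
    bytes_per_row = width_pixels // 8
    return base + (row * planes + plane) * bytes_per_row


modulo = interleaved_modulo(320, 3)
print(modulo, hex(modulo))   # 80 0x50

# Distance between two consecutive rows of the same plane:
# 40 bytes fetched plus the 80-byte modulo.
print(row_address(0, 0, 1, 320, 3) - row_address(0, 0, 0, 320, 3))   # 120
```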

DIWSTRT and DIWSTOP registers control the size and position of the display window. DIWSTRT controls the top-left corner of the window and DIWSTOP controls the bottom-right corner of the window.

DDFSTRT and DDFSTOP control the horizontal timing when bitplane data fetch is started and stopped. As you can see from the documentation we use the “normal” setting for both. These timings are wait times (in cycles) from the start of the row, and from the end of the row.

These register values define the size and position of the display onscreen. This is related to overscan which is relevant with old analog display technologies where the display area and resolution are not clearly defined. Modifying these values allows us to stretch the viewport to cover all the screen area. With Amiga hardware, this also increases the number of pixels displayed. The maximum overscan for OCS chipset is in lores-mode roughly 352×286, and it may require some trickery to achieve.

The lines commented DMA set ON and DMA set OFF use DMACON to set up direct memory access for the software. We only need copper and bitplane DMA, so those are switched on and everything else (blitter, sprite, disk and audio DMA) is explicitly disabled.
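
To see which DMA channels the two DMACON writes touch, the bit masks can be decoded with a short Python sketch. The bit names follow the hardware documentation; the function describe_dmacon_write is a made-up helper for this example.

```python
# DMACON enable bits (bit number -> name)
DMA_BITS = {
    9: "DMAEN", 8: "BPLEN", 7: "COPEN", 6: "BLTEN", 5: "SPREN",
    4: "DSKEN", 3: "AUD3EN", 2: "AUD2EN", 1: "AUD1EN", 0: "AUD0EN",
}


def describe_dmacon_write(value: int):
    """Return ("set" or "clear", [names of the bits selected by the write])."""
    action = "set" if value & 0x8000 else "clear"
    names = [name for bit, name in sorted(DMA_BITS.items(), reverse=True)
             if value & (1 << bit)]
    return action, names


print(describe_dmacon_write(0b1000000110000000))
# ('set', ['BPLEN', 'COPEN'])
print(describe_dmacon_write(0b0000000001111111))
# ('clear', ['BLTEN', 'SPREN', 'DSKEN', 'AUD3EN', 'AUD2EN', 'AUD1EN', 'AUD0EN'])
```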

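The last two lines use INTENA to set up interrupts: the first write sets the master interrupt enable bit and the second clears every individual interrupt source.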



Amiga chipset consists of three chips: Agnus, Denise and Paula. Agnus contains a general purpose coprocessor (or “Copper” for short) which handles a big part of the graphics on an Amiga system. Copper has its own instruction set which can be used to write programs running on the copper. The copper instruction set contains only three different instructions: WAIT, MOVE and SKIP. The main loop of the example generates a program for copper (called a copper list) on each frame and assigns it to execution.

move.l frame,d1
move.l #copper,a6
addq.l #1,d1
move.l d1,frame

This first block increases frame counter and fetches pointer into which we will write the copper list. Frame counter is needed in order for us to create an animation. The ongoing frame is stored in register d1 during the mainloop. #copper points to memory block that we have reserved in chip mem for storing the copper list in. All data accessed by copper needs to be allocated in memory segment that is available to the custom chip set. If you take a look at the end of code listing you will see that copper data block is declared in a segment that is marked ChipRAM. Address pointer to copper list is stored in register a6 which is incremented when we build up the copper list by adding instructions to it.

Copper move instruction moves data to hardware registers. The instruction is encoded so that first word of the instruction denotes the hardware register address to which data is moved and the second word of the instruction contains the data to be moved. Since all hardware registers have the same 12 highest bits (dff) the first word of the instruction takes only last 12 bits of the destination register as a parameter. So for instance when we add move instruction $00e2 to copper list we want copper to move the data to register dff0e2.
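
The encoding of a copper move can be sketched in Python as follows. The helper copper_move is hypothetical; it simply drops the $dff prefix from the register address and pairs the remaining offset with the data word, as described above.

```python
CUSTOM_BASE = 0xDFF000   # all custom chip register addresses start with $dff


def copper_move(register_address: int, value: int):
    """Return the two 16-bit words of a copper MOVE to the given register."""
    offset = register_address - CUSTOM_BASE
    if not 0 <= offset <= 0x1FE or offset & 1:
        raise ValueError("not an even custom chip register address")
    return (offset, value & 0xFFFF)


first, second = copper_move(0xDFF0E2, 0x1234)
print(hex(first), hex(second))   # 0xe2 0x1234
```

The first word is what the listings in this post write as $00e2, $0102 and so on.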

move.l #bitplanes,d0
move.w #$00e2,(a6)+
move.w d0,(a6)+
swap d0
move.w #$00e0,(a6)+
move.w d0,(a6)+

This block (and the two blocks following it) assigns our bitplane data to the bitplane hardware registers so that the hardware will draw the graphics from the bitplanes contained in memory. The address of the bitplane data is communicated to the hardware through two registers: a high and a low pointer. The low pointer holds the low 16 bits of the address (the lowest bit is ignored because bitplane data is word-aligned, which leaves 15 significant bits) and the high pointer holds the remaining high bits (three on the original chipset). The swap instruction swaps the 16-bit halves of the 32-bit register so that the low bits become high bits and vice versa. This allows us to store the low bits to the low pointer register and the high bits to the high pointer register. Notice that we are always moving just 16 bits of data to the copper list (by using the .w version of the move instruction). The move.w instruction with the (a6)+ addressing scheme moves data to the address pointed to by a6 and increments the address in a6 by the size of a word.
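
The same splitting of a 32-bit address into two copper moves can be modelled in Python. This is a sketch with an invented helper name and an arbitrary example address; it emits the words in the same order as the assembly above (low pointer first, then high pointer).

```python
def bitplane_pointer_moves(address: int, plane: int = 0):
    """Return the four copper list words that point bitplane `plane` at `address`.

    plane is 0-based: plane 0 uses registers $0e0 (high) and $0e2 (low),
    plane 1 uses $0e4 and $0e6, and so on.
    """
    high_reg = 0x0E0 + plane * 4
    low_reg = high_reg + 2
    low_word = address & 0xFFFF            # what d0 holds before `swap`
    high_word = (address >> 16) & 0xFFFF   # what the low half of d0 holds after `swap`
    return [low_reg, low_word, high_reg, high_word]


words = bitplane_pointer_moves(0x00012340)   # arbitrary example address
print([hex(w) for w in words])   # ['0xe2', '0x2340', '0xe0', '0x1']
```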

; colors
move.l #$01800fd3,(a6)+	; color 0
move.l #$01820832,(a6)+	; color 1
move.l #$0184036b,(a6)+	; color 2
move.l #$01860667,(a6)+	; color 3
move.l #$01880f53,(a6)+	; color 4
move.l #$018a07ad,(a6)+	; color 5
move.l #$018c0000,(a6)+	; color 6
move.l #$018e0cef,(a6)+	; color 7

This block adds a bunch of move instructions into the copper list. These move instructions define the color palette used while drawing the graphics. Since we are using three bitplanes we have in total 3 bits to select the color, which adds up to 8 distinct colors. Colors are defined in hardware registers COLOR00 through COLOR31. Each color is defined using 12 bits: 4 bits each for red, green and blue, laid out as $0RGB.
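
The long words in the listing pack both the color register and the color value. A Python sketch of the packing, using invented helper names, could look like this:

```python
def rgb4(red: int, green: int, blue: int) -> int:
    """Pack three 4-bit components into a 12-bit $0RGB color value."""
    for component in (red, green, blue):
        if not 0 <= component <= 15:
            raise ValueError("each component must fit in 4 bits")
    return (red << 8) | (green << 4) | blue


def color_move(index: int, red: int, green: int, blue: int) -> int:
    """Long word for a copper MOVE that sets palette entry `index`."""
    register = 0x180 + 2 * index       # COLOR00 is at $180, one word per color
    return (register << 16) | rgb4(red, green, blue)


print(hex(color_move(0, 0xF, 0xD, 0x3)))   # 0x1800fd3
print(hex(color_move(7, 0xC, 0xE, 0xF)))   # 0x18e0cef
```

These two values correspond to the first and the last line of the color block above.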

move.l #32,d0 ; Number of iterations
move.l #$07,d1 ; Current row wait
move.l #sin32_15,a0 ; Sine base
move.l frame,d2 ; Current sine
scrollrows:
  ; Wait for correct offset row
  move.w d1,(a6)+
  move.w #$fffe,(a6)+
  ; Fetch sine from table
  move.l d2,d3
  and.l #$1f,d3
  move.b (a0,d3),d4
  ; Transform sine to horizontal offset value
  move.l d4,d5
  lsl.l #4,d4
  add.l d4,d5
  ; Add horizontal offset to copperlist
  move.w #$0102,(a6)+
  move.w d5,(a6)+
  ; Proceed to next row that we want to offset
  add.l #$500,d1
  ; Move to next sine position for next offset row
  addq.w #1,d2
  subq.w #1,d0
  bne scrollrows

This block animates the graphics on the screen. We introduce different horizontal displacement per scan line and animate the displacement value by frame counter in order to create an effect where the graphics continuously flutter.

We achieve this effect by writing values into the BPLCON1 register. Copper is instructed to perform the horizontal displacement assignment using a move instruction with the calculated displacement stored in register d5. This is done by these two rows above:

  move.w #$0102,(a6)+
  move.w d5,(a6)+

During each frame the graphics area is divided into 32 bands. Each of these bands has a different horizontal displacement. We instruct the copper to wait until the scan line of the next band is reached, at which time we assign a new horizontal displacement value into the BPLCON1 register. The copper WAIT instruction is formed using the value in register d1, where the lower 8 bits are always $07 (this denotes that the horizontal beam position should be at the beginning of the scan line) and the higher 8 bits are the vertical beam position (i.e. the scan line). The wait instruction is inserted into the copper list by these two instructions:

  move.w d1,(a6)+
  move.w #$fffe,(a6)+
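
As an illustration, the wait words for all 32 bands can be generated with a short Python sketch. The helper names are hypothetical, and the sketch only covers scan lines below 256, which is all this effect needs.

```python
def copper_wait(vpos: int, hpos: int = 0x07):
    """Return the two words of a copper WAIT for scan line `vpos`.

    Bit 0 of the first word is always 1 in a WAIT; with the $07 used in
    the example the low byte stays $07.
    """
    first = ((vpos & 0xFF) << 8) | (hpos & 0xFE) | 1
    return (first, 0xFFFE)


def band_waits(bands: int = 32, band_height: int = 5, first_line: int = 0):
    """WAIT words for each band, mirroring `add.l #$500,d1` in the loop."""
    return [copper_wait(first_line + i * band_height) for i in range(bands)]


waits = band_waits()
print(hex(waits[0][0]), hex(waits[1][0]), hex(waits[-1][0]))
# 0x7 0x507 0x9b07
```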

Once displacement of a given band is assigned, we wait until the next band. The vertical beam position of the next band is calculated by adding 5 to the vertical position of the previous band: 160 / 32 = 5 where 160 is the height of the graphics and 32 the number of bands. Thus the next wait position is calculated by this instruction:

  add.l #$500,d1

sin32_15 contains a pre-calculated table of sine values (32 samples with values from 0 to 15). The horizontal displacement value is calculated from these sine values. Register d2 starts from the frame counter and is incremented for every band, so we fetch a sine value for each band by taking its value modulo 32 and using the resulting value as an index to the sine table:

  move.l d2,d3
  and.l #$1f,d3
  move.b (a0,d3),d4
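
The contents of sin32_15 are not shown in this post, so the Python sketch below is only one plausible way to build such a table (32 samples scaled to the range 0 to 15); the exact values in the example project may differ. It also shows the index calculation performed by and.l #$1f,d3.

```python
import math


def make_sine_table(samples: int = 32, max_value: int = 15):
    """Build a sine table scaled to 0..max_value (assumed formula, see above)."""
    half = (max_value + 1) / 2
    table = []
    for i in range(samples):
        value = int(half + half * math.sin(2 * math.pi * i / samples))
        table.append(max(0, min(max_value, value)))   # clamp the peak to max_value
    return table


def sine_index(frame: int, band: int) -> int:
    """Table index for a band: (frame + band) modulo 32, done with a bit mask."""
    return (frame + band) & 0x1F


table = make_sine_table()
print(len(table), min(table), max(table))   # 32 0 15
print(sine_index(30, 3))                    # 1
```

Masking with $1f works as a modulo only because the table length, 32, is a power of two.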

Amiga supports hardware horizontal scroll on 4-bit resolution (thus the 0 to 15 sine value). To correctly scroll the graphics, we need to use the same displacement value on both playfields one and two. This is achieved by having the same displacement value in the low four bits (bits 0-3) and in the next four bits (bits 4-7):

  move.l d4,d5
  lsl.l #4,d4
  add.l d4,d5
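
The three instructions above boil down to copying a 4-bit value into both nibbles of a byte. A Python sketch of the same calculation, with an invented function name:

```python
def bplcon1_scroll(value: int) -> int:
    """Put the same 4-bit scroll value into bits 0-3 and bits 4-7."""
    if not 0 <= value <= 15:
        raise ValueError("scroll value must fit in 4 bits")
    return value | (value << 4)


print(hex(bplcon1_scroll(0xA)))   # 0xaa

# The assembly adds instead of or-ing; for 4-bit values the result is the same.
print(all(bplcon1_scroll(v) == v + (v << 4) for v in range(16)))   # True
```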

The copper list for the frame is now complete. To conclude the copper list, we need to mark the end of the list. The long word $fffffffe is a copper wait instruction for a beam position that is never reached, so copper treats it as the end of the built copper list:

  ; end of copperlist
  move.l #$fffffffe,(a6)+

Amiga system contains two complex interface adapters (CIA). These adapters are used to communicate with the outside world using input/output devices such as a joystick. Data from those devices can be read from a set of chip registers. The CIAAPRA register can be used to monitor whether a joystick button is pressed. On every iteration of the main loop, we check if a joystick (or mouse) button is pressed and, in this case, exit the program. The following code checks both bit 6 and bit 7 of the CIAAPRA register and branches to exit if either one is zero. The button bits are active low: bit 6 is cleared while the joystick (or mouse) in port 1 has its button pressed, and bit 7 is cleared while the joystick (or mouse) in port 2 has its button pressed.

  ; if mousebutton/joystick 1 or 2 pressed then exit
  btst.b #6,CIAAPRA
  beq exit
  btst.b #7,CIAAPRA
  beq exit
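
The button bits are active low, which is why the code uses beq (branch when the tested bit is zero). A minimal Python model of the same check, with hypothetical function names and made-up register values:

```python
def fire_pressed(ciaapra: int, port_bit: int) -> bool:
    """A button is pressed when its bit in the CIAAPRA value reads 0."""
    return (ciaapra >> port_bit) & 1 == 0


def should_exit(ciaapra: int) -> bool:
    """Mirror the two btst/beq pairs: exit if bit 6 or bit 7 is zero."""
    return fire_pressed(ciaapra, 6) or fire_pressed(ciaapra, 7)


print(should_exit(0xFF))   # False, no button pressed
print(should_exit(0xBF))   # True, bit 6 is zero
print(should_exit(0x7F))   # True, bit 7 is zero
```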

We need to wait for vertical blanking period before we can assign our copper list to execution. The vertical blanking time starts at scan line 300. To detect whether vertical blanking is on, we peek values in VPOSR – register. We are interested only in vertical position so we mask everything else out (using mask #$1ff00 which leaves only the 9 bits we care for).

After this we check if scan line 300 is reached:

; Wait for vertical blanking before taking the copper list into use
waitVB:
  move.l VPOSR,d0
  and.l #$1ff00,d0
  cmp.l #300<<8,d0
  bne waitVB

We could have bit shifted the value down in register d0 and then compared it against #300. Instead we use an assembly-time constant expression where we shift the #300 with the <<8 operator. What this says is “give me #300 bit shifted left 8 bits”, and the assembler calculates the value while assembling. This saves us an instruction but might look a bit confusing at first.
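
The masking and the shifted constant can be tried out in Python. This is a sketch with invented function names, and the raw register values are made-up examples of what a long word read from VPOSR could contain.

```python
LINE_MASK = 0x1FF00   # the 9 vertical position bits in the long word read from VPOSR


def beam_line(vposr_long: int) -> int:
    """Extract the scan line by shifting the masked value down."""
    return (vposr_long & LINE_MASK) >> 8


def at_line_300(vposr_long: int) -> bool:
    """Same test as the assembly: compare against 300 shifted left 8 bits."""
    return (vposr_long & LINE_MASK) == (300 << 8)


print(hex(300 << 8))              # 0x12c00
print(beam_line(0x00012C45))      # 300
print(at_line_300(0x00012C45))    # True
print(at_line_300(0x00002C45))    # False, this is line 44
```

Both variants give the same answer; the shifted constant just moves the work from run time to assembly time.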

Note: Amiga has also support for vertical blank interrupts. However, interrupts are a bit trickier to program, so to keep it simple, we stick with the busy loop implementation.

Once vertical blank period is reached we assign our copper list into execution by assigning the address of the copper list to registers COP1LCH and COP1LCL. Notice that by writing long word to location COP1LCH the lower word will leak into COP1LCL and thus set both high and low bits of the copper list. Once that is done we execute the next iteration of the mainloop:

  ; Take copper list into use
  move.l #copper,a6
  move.l a6,COP1LCH
  bra mainloop

Application exit

On application exit, we must restore some registers back to defaults (interrupt control, most importantly), load back the old view and copperlist and enable multitasking again.

; exit gracefully - reverse everything done in init
  move.w #$7fff,DMACON
  move.w  olddmareq,DMACON
  move.w #$7fff,INTENA
  move.w  oldintena,INTENA
  move.w #$7fff,INTREQ
  move.w  oldintreq,INTREQ
  move.w #$7fff,ADKCON
  move.w  oldadkcon,ADKCON

  move.l  oldcopper,COP1LCH
  move.l  gfxbase,a6
  move.l  oldview,a1
  jsr -222(a6)    ; LoadView
  jsr -270(a6)    ; WaitTOF
  jsr -270(a6)    ; WaitTOF
  move.l  $4,a6
  jsr -138(a6)    ; Permit

  ; end program

After that, we can just exit the subroutine with rts and we are dropped back to Amiga OS.

Running on real hardware

Since we are compiling the application as a real Amiga executable, testing the code on a real Amiga should be easy – just copy the file over?

Well, not exactly. Since 3.5″ disk drives are no longer common, moving data to Amiga has become quite difficult. To add insult to injury, Amiga can read 3.5″ DD DOS-formatted disks, but PC cannot read Amiga MFM-coded disks. This requires a tool on Amiga side such as CrossDOS (which is shipped as part of Amiga OS 2.1) or MessyDOS from Aminet.

There’s still a chicken-and-egg problem trying to transfer the necessary software to your Amiga 500 with 1.3 Kickstart.

A more tractable way is to use serial port transfer. You can then transfer the MessyDOS to your Amiga and start moving data using the DOS-formatted disks.

If you have Amiga 1200 or Amiga 600, you can also use PCMCIA CompactFlash adapter.

Update 9.10.2015: exec.library base address is stored in memory address $4. The library base address itself is not $4. Thanks to Petri Koistinen! Without his comment this would have slipped our attention.
