Beyond Brown

When brown just isn't enough

Blaster sprites

Blaster Sprites, Advanced

Update on emulator issue, Oct 31st, 2018

Atari legend Nicolas Pomarède points out that it doesn’t really make sense to combine the NFSR and FXSR bits the way I do. And he’s right!

My mistake was relying on emulators while writing this, and using the NFSR and FXSR bits this way was the only way I could make it work. So…

Wanted!

Someone to figure out what is going wrong here. Please email me your findings at per@brainfish.net - thanks in advance!


v 1.04, Oct 23rd, 2018 - There was a bug in X clipping when showing fewer than the 16 rightmost pixels. It displayed fine in STEem, which is why I missed it.

However, at the time of writing, while the corrected code below works fine on real hardware, it doesn’t in either Hatari or STEem. Lesson: Always test on real hardware.

Introduction

So, you’ve played around a bit with the blaster (we don’t call it a blitter, see my previous article “Basic Blaster Usage”), you’ve used it to clear screen buffers, maybe even written a basic sprite routine, but now you’re ready to move on to the hairy stuff: clipping. Specifically clipping of sprites.

What is clipping?

Clipping is the act of not drawing the whole sprite. For example, if a sprite is halfway off the screen on the right and we just draw it normally, parts of it will wrap around and end up on the left edge of the screen.

Or imagine a raindrop sprite entering the screen from the top on its way down the screen; we would need to draw only the bottom part of the raindrop.

Do you need clipping?

Well, it may seem an odd question in a document like this, but it’s a valid one. Clipping comes at a cost, and depending on what solution we choose, that cost can come in memory usage, CPU usage, or possibly both.

There are definitely cases where you don’t even have to bother with clipping. Imagine a Pac-Man type game where you don’t implement the tunnel that lets sprites travel from the right edge of the screen to the left one (and the other way around). Here, clipping won’t be necessary at all.

In demo development, we often resort to sacrificing memory to gain speed. And we can certainly solve clipping that way too:

Non-clipping clipping

As we know, the visible screen on a 16/32-bit Atari is just a bit of memory, and on most of these machines we can easily make the screen larger than the area shown on our TV/monitor (in fact, vertically we can do this even on the measly ST).

Let’s start with Y. Assume a 64x64-pixel sprite, drawn at position (0, 150) (let’s also assume a standard “ST Low” 320x200 screen). Only 50 rows of pixels (150 to 199) will be visible, so 14 rows (64-50) will be drawn… well, into whatever RAM is after the screen buffer. The solution would be to reserve 63 lines of screen space before the actual screen, and 63 lines after it. This means what was a 32000-byte screen buffer will now have to be 52160 bytes ((200+63+63)*160). Another disadvantage is that we will be spending CPU time drawing pixels that will never be used. But our code will definitely be a lot simpler to write and manage.
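If you want to check the buffer arithmetic for other sprite heights, here is a minimal Python sketch. It is not part of the sprite routine, and the function name and parameters are made up for illustration:

```python
def padded_buffer_bytes(screen_lines=200, bytes_per_line=160, max_sprite_height=64):
    """Size of a screen buffer with guard lines above and below.

    A sprite that still shows at least one row can stick out by at most
    (max_sprite_height - 1) rows, so that is the padding on each side.
    """
    pad = max_sprite_height - 1
    return (screen_lines + 2 * pad) * bytes_per_line


# ST Low, 64-pixel-high sprites: (200 + 63 + 63) * 160
print(padded_buffer_bytes())
```

Taller sprites make the guard area grow linearly, which is where the memory cost of this approach comes from.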

How about X then? Well, on a standard ST this is problematic, but on STe/TT/Falcon we can increase the line width of the screen to give us space on both the left and the right of the 320x200 screen, which lets us simply draw sprites at the edge without worrying about part of them appearing on the opposite edge. For a 64-pixel-wide sprite, the screen buffer grows from 32000 bytes to 38432 (8x200 bytes for a 16-pixel column, 24 such columns, plus an extra 4 single-pixel-height columns at 8 bytes each = (24x8x200)+(4x8)).
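One way to read that formula, as a small Python sketch: the helper name is hypothetical, and the assumption that the right margin of one line doubles as the left margin of the next is an interpretation of the 24-column figure, not something spelled out in the text.

```python
def widened_buffer_bytes(visible_width=320, lines=200, bitplanes=4, max_sprite_width=64):
    """Buffer size when each screen line is widened to leave room for clipping."""
    bytes_per_column = 2 * bitplanes  # one 16-pixel block across all bitplanes
    # Padding needed on one side, rounded up to whole 16-pixel columns.
    pad_columns = (max_sprite_width - 1 + 15) // 16
    # Assumption: the right margin of a line is also the left margin of the
    # next line, so each line only grows by pad_columns columns.
    columns_per_line = visible_width // 16 + pad_columns
    # The very last line still needs a right margin of its own.
    tail = pad_columns * bytes_per_column
    return columns_per_line * bytes_per_column * lines + tail


print(widened_buffer_bytes())  # (24 * 8 * 200) + (4 * 8)
```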

These solutions might be just what you need, but they do carry some downsides, and besides, they’re easy. We don’t want easy. We leave easy to people without ambition and discipline.

Real clipping

This is where we get into actual code. Finally! Please note that all code here is for rmac, but should be fairly easy to translate to other assemblers.

The code below is not super-optimized, because:

  • It’s super generic. It works with any resolution, any number of bitplanes (1, 2, 4 or 8), and any size sprite (I give no guarantee for what happens if a single bitplane in the sprite is larger than 32766 bytes, however)

  • I want the code to be as easy to read as possible

If you want to make it faster without losing flexibility, there are two multiplications that could be replaced by LUTs (lookup tables), and if you’re ready to go for a fixed resolution or a fixed number of bitplanes, there’s practically no end to how fast you can make this.

Also, I do write a lot of data from registers to memory and back, so that’s another place where you can shave off cycles.

However, this is blaster code, so for any sprite larger than, say, 32x32 (even if it’s just mask+one bitplane), the blaster calls are going to be the majority of the time spent in the subroutine.

Before we get into the details

If you’re a demoscene legend and can read complex code while solving a Rubik’s Cube and inventing spaceships at the same time, skip this part.

Now then, for the mere mortals among us: The code below starts out with a bunch of EQUs and then some preprocessor code to weed out user errors.

Then follows the definition of what I call a sprite struct. It’s pretty straightforward: The first three words give the number of bitplanes, the width and the height (in pixels), and then come 2-5 single-bitplane images where the first is the mask, and the following ones are the sprite data. All these must be the same dimensions, of course.
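To make the layout concrete, here is a minimal Python sketch that reads the three header words and works out where the mask and each bitplane start. The function name is made up, and it assumes the width is a multiple of 16, as the routine below does when it converts pixels to words:

```python
import struct


def sprite_layout(blob):
    """Return (bitplanes, width, height, mask_offset, plane_offsets)."""
    # Three big-endian words: number of bitplanes, width, height (in pixels).
    bitplanes, width, height = struct.unpack_from(">HHH", blob, 0)
    bytes_per_plane = (width // 16) * 2 * height  # one word per 16 pixels
    mask_offset = 6                               # mask follows the header
    plane_offsets = [mask_offset + bytes_per_plane * (i + 1)
                     for i in range(bitplanes)]
    return bitplanes, width, height, mask_offset, plane_offsets


# Header only, matching a 160x164 sprite with one bitplane.
header = struct.pack(">HHH", 1, 160, 164)
print(sprite_layout(header))
```

The per-plane size computed here is the same value the routine stores as its bitplane offset.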

Code: Standard sprite routine

;--------------------------------------------------------------
;-- Blaster EQUs

BLASTER_HOP_ONES                    equ %00000000
BLASTER_HOP_HALFTONE                equ %00000001
BLASTER_HOP_SOURCE                  equ %00000010
BLASTER_HOP_SOURCE_AND_HALFTONE     equ %00000011

BLASTER_OP_SOURCE                   equ %00000011
BLASTER_OP_SOURCE_AND_TARGET        equ %00000001
BLASTER_OP_SOURCE_AND_NOT_TARGET    equ %00000010
BLASTER_OP_SOURCE_OR_TARGET         equ %00000111
BLASTER_OP_SOURCE_XOR_TARGET        equ %00000110
BLASTER_OP_SOURCE_NOT_TARGET        equ %00000100
BLASTER_OP_ZEROES                   equ %00000000
BLASTER_OP_ONES                     equ %00001111

BLASTER_COMMAND_START_HOG_MODE      equ %11000000
BLASTER_COMMAND_START_SHARED_MODE   equ %10000000

;-- Blaster EQUs
;--------------------------------------------------------------
;-- Screen mode EQUs

SCREEN_WIDTH_PIXELS equ 320
SCREEN_WIDTH_BYTES equ 160
SCREEN_BITPLANES equ 4

;-- Screen mode EQUs
;--------------------------------------------------------------
;-- User setting EQUs

SCREEN_CLIP_Y_MIN equ 32
SCREEN_CLIP_Y_MAX equ 199-32
SCREEN_CLIP_X_MIN equ 32  ; must be on a 16-pixel-boundary!
SCREEN_CLIP_X_MAX equ 319-32  ; SCREEN_CLIP_X_MAX+1 must be on a 16-pixel-boundary!

;-- User setting EQUs
;--------------------------------------------------------------
;-- Verify user settings

macro fatal_error errorstring
  print "---[ FATAL ERROR ]-----------------------------------------------------------"
  print \{errorstring}
  print ""
  print ""
  .error
endm

; Check clip values
if SCREEN_CLIP_Y_MIN<0
  fatal_error "SCREEN_CLIP_Y_MIN must be a positive value!"
endif
if SCREEN_CLIP_X_MIN<0
  fatal_error "SCREEN_CLIP_X_MIN must be a positive value!"
endif

if (SCREEN_CLIP_X_MIN & $f)!=0
  fatal_error "SCREEN_CLIP_X_MIN must be divisible by 16!"
endif
if ((SCREEN_CLIP_X_MAX+1) & $f)!=0
  fatal_error "SCREEN_CLIP_X_MAX+1 must be divisible by 16!"
endif

; Check number of bitplanes, and create bitplane shift value to save a couple of multiplications
BPL_ERROR set 1
BPL_SHIFTER set 0
if SCREEN_BITPLANES=1
  BPL_ERROR set 0
  BPL_SHIFTER set 1 ; 2
endif
if SCREEN_BITPLANES=2
  BPL_ERROR set 0
  BPL_SHIFTER set 2 ; 4
endif
if SCREEN_BITPLANES=4
  BPL_ERROR set 0
  BPL_SHIFTER set 3 ; 8
endif
if SCREEN_BITPLANES=8
  BPL_ERROR set 0
  BPL_SHIFTER set 4 ; 16
endif
if BPL_ERROR=1
  fatal_error "SCREEN_BITPLANES must be 1, 2, 4 or 8!"
endif


;-- Verify user settings
;--------------------------------------------------------------

; Sprite struct:
;     Number_of_bitplanes.w
;     Width.w (in pixels)
;     Height.w (in pixels)
;     mask data, 1 bitplane
;     sprite data, 1-4 bitplanes, not interleaved



blsp_draw_noclip:
; In:  d0.w - X
;      d1.w - Y
;      a0.l - pointer to sprite struct
;      a1.l - pointer to screen

  ; No clip flags to clear here: this routine draws either the whole sprite or nothing

  ;------------------------------------------------------------
  ;-- Setup blaster registers

  ; These are all default values
  move.w #2,blsp_bl_source_x_inc
  move.w #2,blsp_bl_source_y_inc
  move.w #-1,blsp_bl_endmask_1
  move.w #SCREEN_BITPLANES*2,blsp_bl_dest_x_inc

  ;-- Setup blaster registers
  ;------------------------------------------------------------
  ;-- Get width and height of sprite

  ; Get number of bitplanes in sprite
  move.w (a0)+,blsp_sprite_bitplanes
  ; Get sprite width
  move.w (a0)+,d7  ; let's not modify d7 for a while, it's going to see use later
  ; Get sprite height
  move.w (a0)+,d6  ; let's not touch a0 or d6, we're going to use them in a bit
  move.w d6,blsp_height_pixels

  ;-- Get width and height of sprite
  ;------------------------------------------------------------
  ;-- "Early out" clipping

  ; X min
  cmp.w #SCREEN_CLIP_X_MIN,d0
  blt .early_out
  ; Y min
  cmp.w #SCREEN_CLIP_Y_MIN,d1
  blt .early_out
  ;X max
  move.w #SCREEN_CLIP_X_MAX,d2
  sub.w d7,d2
  add.w #1,d2
  cmp.w d2,d0
  bgt .early_out
  ; Y max
  move.w #SCREEN_CLIP_Y_MAX,d2
  sub.w d6,d2
  add.w #1,d2
  cmp.w d2,d1
  bgt .early_out

  ;-- "Early out" clipping
  ;------------------------------------------------------------
  ;-- Get width-in-words

  move.w d7,blsp_width_pixels
  lsr.w #4,d7
  move.w d7,blsp_width_words

  ;-- Get width-in-words
  ;------------------------------------------------------------
  ;-- X position

  move.w d0,d5  ; Back up X so we can still access the untouched value later
  and.b #$f,d5  ; Get lowest 4 bits from X for skewing
  move.b d5,blsp_skew_value
  ; Mask out so we get a clean multiple of 16

  move.w d0,d5
  sub.w #SCREEN_CLIP_X_MIN,d5
  tst.w d5
  bmi .x_position_done
    move.w d0,d4
    ; Adjust screen pointer to correct "16-pixel block"
    lsr.w #4,d4
    lsl.w #BPL_SHIFTER,d4  ; instead of mulu #2*SCREEN_BITPLANES,d4
    add.w d4,a1
  .x_position_done:

  ;-- X position
  ;------------------------------------------------------------
  ;-- Y position

  ; Adjust screen pointer - this multiplication should be LUT:ed, obviously
  mulu #SCREEN_WIDTH_BYTES,d1
  add.l d1,a1  ; mulu leaves a 32-bit result, so add it as a long

  ;-- Y position
  ;------------------------------------------------------------
  ;-- Destination Y increment

  move.w blsp_width_words,d7
  sub.w #1,d7  ; because we need one less for dest y inc
  move.w #SCREEN_WIDTH_BYTES,d6
  ; multiply width-in-words to account for words and bitplanes
  lsl.w #BPL_SHIFTER,d7  ; instead of mulu #2*SCREEN_BITPLANES,d7
  sub.w d7,d6  ; subtract total sprite width from total screen width...
  move.w d6,blsp_bl_dest_y_inc  ; ...and that's how much the blaster needs to add each line

  ;-- Destination Y increment
  ;------------------------------------------------------------
  ;-- Pointers to mask and sprite data

  ; a0 contains the address to the mask, let's save it
  move.l a0,blsp_mask_pointer
  move.w blsp_width_words,d7  ; d7 now contains the width of the mask, in words
  move.w blsp_height_pixels,d6  ; d6 now contains the height in pixels
  mulu.w d7,d6  ; should be LUT:ed, or why not part of the sprite struct
  add.w d6,d6  ; we double d6 to convert a word-offset to bytes
  add.w d6,a0  ; a0 now points to sprite data
  move.l a0,blsp_sprite_pointer
  ext.l d6
  move.l d6,blsp_bitplane_offset  ; save offset for sprites with multiple bitplanes

  ;-- Pointers to mask and sprite data
  ;------------------------------------------------------------
  ;-- Skewing and endmasks

  tst.b blsp_skew_value  ; if skew value is 0, that's one codepath...
  bne .skewing  ; if not, that's another.
    ; no skewing
    move.w #-1,blsp_bl_endmask_0  ; left endmask
    move.w #-1,blsp_bl_endmask_2   ; right endmask
    bra .skewing_done
  .skewing:
    ; Fetch endmasks from LUTs
    clr.l d7
    move.b blsp_skew_value,d7
    add.l d7,d7
    lea blsp_leftmasks,a6
    move.w (a6,d7),blsp_bl_endmask_0
    lea blsp_rightmasks,a6
    move.w (a6,d7),blsp_bl_endmask_2

    ; ... and because we've skewed (effectively adding 16 pixels to the right), we need to adapt these values:
    add.w #1,blsp_width_words  ; one more xcount
    sub.w #SCREEN_BITPLANES*2,blsp_bl_dest_y_inc  ; adjust dest y inc accordingly
    sub.w #2,blsp_bl_source_y_inc   ; source y inc
  .skewing_done:

  ;-- Skewing and endmasks
  ;------------------------------------------------------------
  ;-- Blast mask

  ; Clear blaster halftone RAM
    move.l #$ffff8a00,a6
    rept 16/2
      clr.l (a6)+
    endr

  move.l a1,-(sp)  ; save a1 for next bitplane

  move.w blsp_sprite_bitplanes,d7
  subq #1,d7
  ext.l d7
  .blast_mask_loop:
    move.w blsp_bl_source_x_inc,$ffff8a20   ;source x inc
    move.w blsp_bl_source_y_inc,$ffff8a22   ;source y inc
    move.l blsp_mask_pointer,$ffff8a24   ;source address
    move.w blsp_bl_endmask_0,$ffff8a28   ;endmask 0
    move.w blsp_bl_endmask_1,$ffff8a2a  ;endmask 1
    move.w blsp_bl_endmask_2,$ffff8a2c   ;endmask 2
    move.w blsp_bl_dest_x_inc,$ffff8a2e   ;dest x inc
    move.w blsp_bl_dest_y_inc,$ffff8a30   ;dest y inc
    move.l a1,$ffff8a32   ;destination address
    move.w blsp_width_words,$ffff8a36   ;x count (n words per line to copy)
    move.w blsp_height_pixels,$ffff8a38   ;y count (n lines to copy)
    move.b blsp_skew_value,$ffff8a3d  ; set skew
    move.b #BLASTER_HOP_SOURCE,$ffff8a3a              ; halftone operation
    move.b #BLASTER_OP_SOURCE_AND_TARGET,$ffff8a3b    ; operation
      ;move.b #BLASTER_OP_SOURCE,$ffff8a3b    ; operation
    move.b #BLASTER_COMMAND_START_HOG_MODE,$ffff8a3c  ; start blaster

    addq #2,a1  ; offset to next bitplane
  dbra d7,.blast_mask_loop

  move.l (sp)+,a1

  ;-- Blast mask
  ;------------------------------------------------------------
  ;-- Blast sprite

  move.l blsp_bitplane_offset,d6
  move.w blsp_sprite_bitplanes,d7
  subq #1,d7
  ext.l d7
  .blast_sprite_loop:
    move.w blsp_bl_source_x_inc,$ffff8a20   ;source x inc
    move.w blsp_bl_source_y_inc,$ffff8a22   ;source y inc
    move.l blsp_sprite_pointer,$ffff8a24   ;source address
    move.w blsp_bl_endmask_0,$ffff8a28   ;endmask 0
    move.w blsp_bl_endmask_1,$ffff8a2a  ;endmask 1
    move.w blsp_bl_endmask_2,$ffff8a2c   ;endmask 2
    move.w blsp_bl_dest_x_inc,$ffff8a2e   ;dest x inc
    move.w blsp_bl_dest_y_inc,$ffff8a30   ;dest y inc
    move.l a1,$ffff8a32   ;destination address
    move.w blsp_width_words,$ffff8a36   ;x count (n words per line to copy)
    move.w blsp_height_pixels,$ffff8a38   ;y count (n lines to copy)
    move.b blsp_skew_value,$ffff8a3d  ; set skew
    move.b #BLASTER_HOP_SOURCE,$ffff8a3a              ; halftone operation
    move.b #BLASTER_OP_SOURCE_OR_TARGET,$ffff8a3b    ; operation
      ;move.b #BLASTER_OP_SOURCE_XOR_TARGET,$ffff8a3b    ; operation
      ;move.b #BLASTER_OP_SOURCE,$ffff8a3b    ; operation
    move.b #BLASTER_COMMAND_START_HOG_MODE,$ffff8a3c  ; start blaster

    add.l d6,blsp_sprite_pointer  ; to get to the next sprite bitplane
    addq #2,a1  ; offset to next bitplane
  dbra d7,.blast_sprite_loop

  ;-- Blast sprite
  ;------------------------------------------------------------

.early_out:
  rts


;------------------------------------------------------------------------------


  .bss
blsp_sprite_bitplanes:            ds.w 1

; Address to mask
blsp_mask_pointer:                ds.l 1
; Address to first bitplane in sprite
blsp_sprite_pointer:              ds.l 1
; Offset to next bitplane in sprite
blsp_bitplane_offset:             ds.l 1

; Width and height variables
blsp_width_pixels:      ds.w 1
blsp_height_pixels:     ds.w 1
blsp_width_words:       ds.w 1

blsp_skew_value:                  ds.b 1
  even

; Blaster shadow variables
blsp_bl_source_x_inc:   ds.w 1
blsp_bl_source_y_inc:   ds.w 1
blsp_bl_endmask_0:      ds.w 1
blsp_bl_endmask_1:      ds.w 1
blsp_bl_endmask_2:      ds.w 1
blsp_bl_dest_x_inc:     ds.w 1
blsp_bl_dest_y_inc:     ds.w 1


  .68000


;------------------------------------------------------------------------------


  .data

bfly_palette:
  include "bfly_160x164_4bpl.pal"


bfly_160:
  dc.w 1  ; bitplanes
  dc.w 160, 164  ; width, height
  incbin "bflymask_160x164.1bp"
  incbin "bfly_160x164.1bp"


blsp_leftmasks:
    dc.w %1111111111111111
    dc.w %0111111111111111
    dc.w %0011111111111111
    dc.w %0001111111111111
    dc.w %0000111111111111
    dc.w %0000011111111111
    dc.w %0000001111111111
    dc.w %0000000111111111
    dc.w %0000000011111111
    dc.w %0000000001111111
    dc.w %0000000000111111
    dc.w %0000000000011111
    dc.w %0000000000001111
    dc.w %0000000000000111
    dc.w %0000000000000011
    dc.w %0000000000000001


blsp_rightmasks:
    dc.w %0000000000000000
    dc.w %1000000000000000
    dc.w %1100000000000000
    dc.w %1110000000000000
    dc.w %1111000000000000
    dc.w %1111100000000000
    dc.w %1111110000000000
    dc.w %1111111000000000
    dc.w %1111111100000000
    dc.w %1111111110000000
    dc.w %1111111111000000
    dc.w %1111111111100000
    dc.w %1111111111110000
    dc.w %1111111111111000
    dc.w %1111111111111100
    dc.w %1111111111111110

  .68000

The only clipping this code actually performs is an “early out” clipping that skips all drawing if any part of the sprite is outside the clipping frustum.
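The four comparisons in that early-out test boil down to one condition. As a minimal Python model (the function name is made up; the clip values mirror the EQUs in the listing):

```python
CLIP_X_MIN, CLIP_X_MAX = 32, 319 - 32
CLIP_Y_MIN, CLIP_Y_MAX = 32, 199 - 32


def fully_inside(x, y, width, height):
    """True if every pixel of the sprite lies inside the clip rectangle."""
    return (CLIP_X_MIN <= x <= CLIP_X_MAX - width + 1 and
            CLIP_Y_MIN <= y <= CLIP_Y_MAX - height + 1)


# A 64x64 sprite is drawn at (224, 104) but skipped one pixel further right.
print(fully_inside(224, 104, 64, 64), fully_inside(225, 104, 64, 64))
```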

Code: Clipping sprite routine

Here we get into the real nitty-gritty.

“Early-out clipping”

The first difference you will notice is that the “early-out” clipping has been modified. Instead of exiting if a single pixel is outside the clipping frustum, it now only exits if all of the sprite is outside the clipping frustum.
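As a minimal Python model of the modified test (function name made up, clip values taken from the EQUs), the sprite is skipped only when no pixel at all can land inside the clip rectangle:

```python
CLIP_X_MIN, CLIP_X_MAX = 32, 319 - 32
CLIP_Y_MIN, CLIP_Y_MAX = 32, 199 - 32


def fully_outside(x, y, width, height):
    """True if no pixel of the sprite lies inside the clip rectangle."""
    return (x > CLIP_X_MAX or y > CLIP_Y_MAX or
            x <= CLIP_X_MIN - width or y <= CLIP_Y_MIN - height)


# At x = -31 a 64-pixel-wide sprite still covers column 32, so it is drawn.
print(fully_outside(-32, 50, 64, 64), fully_outside(-31, 50, 64, 64))
```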

“Partial clipping”

Rather than draw all or nothing of the sprite (like we do in the code above), we want to draw only parts of the sprite, which doesn’t sound too bad.

Vertically: Y clipping

And vertically, it’s really not too bad! All we do is adjust the start address of the sprite data and the number of lines to draw.
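A minimal Python model of that adjustment, assuming the sprite is already known to be at least partly visible. The function name and return shape are made up for illustration:

```python
def clip_y(y, height, width_words, clip_y_min=32, clip_y_max=199 - 32):
    """Return (new_y, lines_to_draw, source_byte_offset)."""
    skipped_top = max(0, clip_y_min - y)
    skipped_bottom = max(0, (y + height - 1) - clip_y_max)
    lines = height - skipped_top - skipped_bottom
    # Each skipped line is width_words words of 2 bytes in a single bitplane.
    source_offset = skipped_top * width_words * 2
    return max(y, clip_y_min), lines, source_offset


# 64x64 sprite (4 words wide) poking out over the top clip edge.
print(clip_y(0, 64, 4))
```

Clipping at the bottom only shortens the line count, while clipping at the top also moves the source pointer and the Y position.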

Horizontally: X clipping

From high above, horizontal clipping (or X clipping) doesn’t seem too bad. We calculate how many pixels to draw, and we adjust the blaster’s “X count” register for that.

Oh, except the “X count” deals with words. And since the “X count” is the width of the sprite, we need to change the “dest y increment” register too. And the “source y increment” register.
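To see how these three registers hang together, here is a minimal Python model of the unclipped case from the first routine. The function name is made up, and the defaults assume ST Low (160 bytes per line, 4 bitplanes):

```python
def blit_increments(width_words, skew, screen_width_bytes=160, bitplanes=4):
    """Return (x_count, source_y_inc, dest_y_inc) for an unclipped sprite."""
    x_count = width_words
    source_y_inc = 2          # source lines are stored back to back
    if skew:
        x_count += 1          # skewing spills into one more destination word
        source_y_inc -= 2
    # After the last word of a line, jump to the first word of the next line.
    dest_y_inc = screen_width_bytes - (x_count - 1) * 2 * bitplanes
    return x_count, source_y_inc, dest_y_inc


print(blit_increments(4, 0))  # word-aligned 64-pixel sprite
print(blit_increments(4, 5))  # same sprite shifted 5 pixels right
```

Any change to the X count has to be mirrored in both increments, which is why clipping touches all three.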

Oh, and if we’re cutting the sprite on the right, we have to set the right endmask to all 1’s (we don’t want it to mask off anything).

And if the width drawn is 16 pixels or less, it’s not the right endmask that needs modifying, it’s the left one (because endmask_0 is always used, the others only come into play at widths greater than 16).
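For reference, the two endmask lookup tables used for skewed sprites follow a simple pattern. A minimal Python sketch (function name made up) that reproduces the table entries for a given skew value:

```python
def endmasks(skew):
    """Return (left_mask, right_mask) for a skew of 0-15 pixels."""
    left = 0xFFFF >> skew                      # hide the leftmost `skew` pixels
    right = (0xFFFF << (16 - skew)) & 0xFFFF   # keep only the leftmost `skew` pixels
    return left, right


for skew in (0, 1, 15):
    left, right = endmasks(skew)
    print(skew, format(left, "016b"), format(right, "016b"))
```

Note that the routine only consults the tables when the skew is non-zero. With no skew it sets both endmasks to all 1’s instead.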

Aaand as if that wasn’t enough, there’s a specific case where we have to set the NFSR (No Final Source Read) and FXSR (Force eXtra Source Read) bits of the skew register.
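Both bits live in the same byte as the skew value itself. A minimal Python sketch of how that byte is put together (the helper name is made up; the bit positions match the $c0 the routine adds to the skew value):

```python
FXSR = 0x80  # Force eXtra Source Read, bit 7
NFSR = 0x40  # No Final Source Read, bit 6


def skew_register(skew, fxsr=False, nfsr=False):
    """Build the byte written to the skew register."""
    value = skew & 0x0F  # skew amount lives in the low 4 bits
    if fxsr:
        value |= FXSR
    if nfsr:
        value |= NFSR
    return value


print(hex(skew_register(5, fxsr=True, nfsr=True)))
```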

So… Yes, there are a lot of cases we have to be aware of, but I have sorted it all out for you, so go nuts with the code:

;--------------------------------------------------------------
;-- Blaster EQUs

BLASTER_HOP_ONES                    equ %00000000
BLASTER_HOP_HALFTONE                equ %00000001
BLASTER_HOP_SOURCE                  equ %00000010
BLASTER_HOP_SOURCE_AND_HALFTONE     equ %00000011

BLASTER_OP_SOURCE                   equ %00000011
BLASTER_OP_SOURCE_AND_TARGET        equ %00000001
BLASTER_OP_SOURCE_AND_NOT_TARGET    equ %00000010
BLASTER_OP_SOURCE_OR_TARGET         equ %00000111
BLASTER_OP_SOURCE_XOR_TARGET        equ %00000110
BLASTER_OP_SOURCE_NOT_TARGET        equ %00000100
BLASTER_OP_ZEROES                   equ %00000000
BLASTER_OP_ONES                     equ %00001111

BLASTER_COMMAND_START_HOG_MODE      equ %11000000
BLASTER_COMMAND_START_SHARED_MODE   equ %10000000

;-- Blaster EQUs
;--------------------------------------------------------------
;-- Screen mode EQUs

SCREEN_WIDTH_PIXELS equ 320
SCREEN_WIDTH_BYTES equ 160
SCREEN_BITPLANES equ 4

;-- Screen mode EQUs
;--------------------------------------------------------------
;-- User setting EQUs

SCREEN_CLIP_Y_MIN equ 32
SCREEN_CLIP_Y_MAX equ 199-32
SCREEN_CLIP_X_MIN equ 32  ; must be on a 16-pixel-boundary!
SCREEN_CLIP_X_MAX equ 319-32  ; SCREEN_CLIP_X_MAX+1 must be on a 16-pixel-boundary!

;-- User setting EQUs
;--------------------------------------------------------------
;-- Verify user settings

macro fatal_error errorstring
  print "---[ FATAL ERROR ]-----------------------------------------------------------"
  print \{errorstring}
  print ""
  print ""
  .error
endm

; Check clip values
if SCREEN_CLIP_Y_MIN<0
  fatal_error "SCREEN_CLIP_Y_MIN must be a positive value!"
endif
if SCREEN_CLIP_X_MIN<0
  fatal_error "SCREEN_CLIP_X_MIN must be a positive value!"
endif

if (SCREEN_CLIP_X_MIN & $f)!=0
  fatal_error "SCREEN_CLIP_X_MIN must be divisible by 16!"
endif
if ((SCREEN_CLIP_X_MAX+1) & $f)!=0
  fatal_error "SCREEN_CLIP_X_MAX+1 must be divisible by 16!"
endif

; Check number of bitplanes, and create bitplane shift value to save a couple of multiplications
BPL_ERROR set 1
BPL_SHIFTER set 0
if SCREEN_BITPLANES=1
  BPL_ERROR set 0
  BPL_SHIFTER set 1 ; 2
endif
if SCREEN_BITPLANES=2
  BPL_ERROR set 0
  BPL_SHIFTER set 2 ; 4
endif
if SCREEN_BITPLANES=4
  BPL_ERROR set 0
  BPL_SHIFTER set 3 ; 8
endif
if SCREEN_BITPLANES=8
  BPL_ERROR set 0
  BPL_SHIFTER set 4 ; 16
endif
if BPL_ERROR=1
  fatal_error "SCREEN_BITPLANES must be 1, 2, 4 or 8!"
endif


;-- Verify user settings
;--------------------------------------------------------------

; Sprite struct:
;     Number_of_bitplanes.w
;     Width.w (in pixels)
;     Height.w (in pixels)
;     mask data, 1 bitplane
;     sprite data, 1-4 bitplanes, not interleaved


blsp_draw_fullclip:
; In:  d0.w - X
;      d1.w - Y
;      a0.l - pointer to sprite struct
;      a1.l - pointer to screen

  clr.b bslp_clip_x_min_flag
  clr.b bslp_clip_x_max_flag

  clr.b blsp_bl_extra_source_read_flag

  ;------------------------------------------------------------
  ;-- Setup blaster registers

  ; These are all default values
  move.w #2,blsp_bl_source_x_inc
  move.w #2,blsp_bl_source_y_inc
  move.w #-1,blsp_bl_endmask_1
  move.w #SCREEN_BITPLANES*2,blsp_bl_dest_x_inc

  ;-- Setup blaster registers
  ;------------------------------------------------------------
  ;-- Get width and height of sprite

  ; Get number of bitplanes in sprite
  move.w (a0)+,blsp_sprite_bitplanes
  ; Get sprite width
  move.w (a0)+,d7  ; let's not modify d7 for a while, it's going to see use later
  ; Get sprite height
  move.w (a0)+,d6  ; let's not touch a0 or d6, we're going to use them in a bit

  ;-- Get width and height of sprite
  ;------------------------------------------------------------
  ;-- "Early out" clipping

  ; Y max
  cmp.w #SCREEN_CLIP_Y_MAX,d1
  bgt .early_out
  ; X max
  cmp.w #SCREEN_CLIP_X_MAX,d0
  bgt .early_out
  ; Y min
  move.w #SCREEN_CLIP_Y_MIN,d2
  sub.w d6,d2
  cmp.w d2,d1
  ble .early_out
  ; X min
  move.w #SCREEN_CLIP_X_MIN,d2
  sub.w d7,d2
  cmp.w d2,d0
  ble .early_out

  ;-- "Early out" clipping
  ;------------------------------------------------------------
  ;-- Get width-in-words

  move.w d7,blsp_width_pixels_before_clip
  lsr.w #4,d7
  move.w d7,blsp_width_words_before_clip

  ;-- Get width-in-words
  ;------------------------------------------------------------
  ;-- X position

  move.w d0,d5  ; Back up X so we can still access the untouched value later
  and.b #$f,d5  ; Get lowest 4 bits from X for skewing
  move.b d5,blsp_skew_value
  ; Mask out so we get a clean multiple of 16

  move.w d0,d5
  sub.w #SCREEN_CLIP_X_MIN,d5
  tst.w d5
  bmi .x_position_done
    move.w d0,d4
    ; Adjust screen pointer to correct "16-pixel block"
    lsr.w #4,d4
    lsl.w #BPL_SHIFTER,d4  ; instead of mulu #2*SCREEN_BITPLANES,d4
    add.w d4,a1
  .x_position_done:

  ;-- X position
  ;------------------------------------------------------------
  ;-- Partial clipping

  ; Write "before clipping" width and height values
  move.w d6,blsp_height_pixels_before_clip
  move.w d6,blsp_height_pixels_after_clip  ; default value, in case no clipping happens
  move.w d7,blsp_width_words_after_clip  ; default value


  ; Y max clipping
  move.w #SCREEN_CLIP_Y_MAX,d2
  move.w d1,d3  ; d3=Y
  add.w d6,d3   ; add sprite height to d3...
  sub.w d3,d2  ; subtract that from clip value. If result<0, it's clipping time
  bpl.s .no_y_max_clip
    neg.w d2  ; overshoot in d2
    sub.w #1,d2
    sub.w d2,d6
    move.w d6,blsp_height_pixels_after_clip
  .no_y_max_clip:


  ; Y min clipping
  move.w #SCREEN_CLIP_Y_MIN,d2
  sub.w d1,d2
  bmi.s .no_y_min_clip
    sub.w d2,d6
    move.w d6,blsp_height_pixels_after_clip
    ; add offset to mask/sprite data
    move.w d7,d3  ; offset = number of words in width...
    add.w d3,d3  ; ...times 2...
    mulu d3,d2  ; ...times the number of lines
    add.w d2,a0  ; add offset
    ; Adjust Y to clipping
    move.w #SCREEN_CLIP_Y_MIN,d1
.no_y_min_clip:


  ; X max clipping
  move.w #SCREEN_CLIP_X_MAX,d2
  sub.w blsp_width_pixels_before_clip,d2
  add.w #1,d2
  cmp.w d2,d0
  ble.s .no_x_max_clip
    move.b #1,bslp_clip_x_max_flag
    ; Calculate overshoot
    move.w d0,d3
    add.w blsp_width_pixels_before_clip,d3
    sub.w #SCREEN_CLIP_X_MAX,d3
    sub.w #1,d3  ; d3 is now overshoot

    move.w blsp_width_pixels_before_clip,d7
    sub.w d3,d7
    lsr.w #4,d7
    move.w d7,blsp_width_words_after_clip

    lsr.w #3,d3
    add.w d3,blsp_bl_source_y_inc

    ; Adjust for skew value changing width
    tst.b blsp_skew_value
    beq .no_fix_src_y_inc
      add.w #2,blsp_bl_source_y_inc
    .no_fix_src_y_inc:

  .no_x_max_clip:


  ; X min clipping
  move.w #SCREEN_CLIP_X_MIN,d2
  cmp.w d2,d0
  bge.s .no_x_min_clip
    move.b #1,bslp_clip_x_min_flag

    ; Calculate undershoot
    move.w d0,d7
    sub.w #SCREEN_CLIP_X_MIN,d7

    neg.w d7  ; make d7 positive
    lsr.w #4,d7  ; divide by 16, because word = 16 pixels
    sub.w d7,blsp_width_words_after_clip  ; since we'll be blasting fewer words in X
    sub.w #1,blsp_width_words_after_clip  ; adjust

    ; For each 16-pixel-block we skip, we need to add 2 bytes (one word) to the source pointer...
    lsl.w #1,d7  ; ...so since we divided d7 by 16, if we now double d7... you get it.
    add.w d7,a0
    add.w d7,blsp_bl_source_y_inc  ; we also have to adjust source_y_inc for the new width

    lsl.w #2,d7
    add.w d7,blsp_bl_dest_y_inc  ; ...and we adjust dest_y_inc

    ; Adjust screen offset to clipping
    move.w #SCREEN_CLIP_X_MIN,d2
    lsr.w #4,d2
    lsl.w #BPL_SHIFTER,d2  ; instead of mulu #2*SCREEN_BITPLANES,d2
    add.w d2,a1

  .no_x_min_clip:

  ;-- Partial clipping
  ;------------------------------------------------------------
  ;-- Y position

  ; Adjust screen pointer - this multiplication should be LUT:ed, obviously
  mulu #SCREEN_WIDTH_BYTES,d1
  add.l d1,a1  ; mulu leaves a 32-bit result, so add it as a long

  ;-- Y position
  ;------------------------------------------------------------
  ;-- Destination Y increment

  move.w blsp_width_words_after_clip,d7
  sub.w #1,d7  ; because we need one less for dest y inc
  move.w #SCREEN_WIDTH_BYTES,d6
  ; multiply width-in-words to account for words and bitplanes
  lsl.w #BPL_SHIFTER,d7  ; instead of mulu #2*SCREEN_BITPLANES,d7
  sub.w d7,d6  ; subtract total sprite width from total screen width...
  move.w d6,blsp_bl_dest_y_inc  ; ...and that's how much the blaster needs to add each line

  ;-- Destination Y increment
  ;------------------------------------------------------------
  ;-- Skewing and endmasks

  tst.b blsp_skew_value  ; if skew value is 0, that's one codepath...
  bne .skewing  ; if not, that's another.
    ; no skewing
    move.w #-1,blsp_bl_endmask_0  ; left endmask
    move.w #-1,blsp_bl_endmask_2   ; right endmask
    bra .skewing_done
  .skewing:
    ; Fetch endmasks from LUTs
    clr.l d7
    move.b blsp_skew_value,d7
    add.l d7,d7
    lea blsp_leftmasks,a6
    move.w (a6,d7),blsp_bl_endmask_0
    lea blsp_rightmasks,a6
    move.w (a6,d7),blsp_bl_endmask_2

    ; ... and because we've skewed (effectively adding 16 pixels to the right), we need to adapt these values:
    add.w #1,blsp_width_words_after_clip   ; one more xcount
    sub.w #SCREEN_BITPLANES*2,blsp_bl_dest_y_inc  ; adjust dest y inc accordingly
    sub.w #2,blsp_bl_source_y_inc   ; source y inc
  .skewing_done:

  ;-- Skewing and endmasks
  ;------------------------------------------------------------
  ;-- X clipping adjustments

  tst.b bslp_clip_x_max_flag
  beq.s .no_x_max_clip_adjustments
    move.w #-1,blsp_bl_endmask_2  ; since we're clipping on the right, we want the corresponding endmask to show everything
  .no_x_max_clip_adjustments:

  tst.b bslp_clip_x_min_flag
  beq.s .no_x_min_clip_adjustments
    move.w #-1,blsp_bl_endmask_0  ; clipping on the left = leftmost endmask all 1's
    tst.b blsp_skew_value
    bne.s .no_skew_fix
      add.w #1,blsp_width_words_after_clip
      sub.w #SCREEN_BITPLANES*2,blsp_bl_dest_y_inc
      bra.s .skew_fixing_done
    .no_skew_fix:
      move.b #1,blsp_bl_extra_source_read_flag
      add.w #2,blsp_bl_source_y_inc
    .skew_fixing_done:

    cmp.w #1,blsp_width_words_after_clip
    bne .nopers
      ; Width is just a single word, so endmask_2 --> endmask_0
      move.w blsp_bl_endmask_2,blsp_bl_endmask_0
      tst.b blsp_skew_value
      beq .dont_change_src_y_inc
        sub #2,blsp_bl_source_y_inc
        sub.w #2,a0  ; At the time of writing (Oct 23rd, 2018), this
                     ; works on my real STE (TOS 1.62), but not in
                     ; Hatari 2.0.0 or STEem v3.7.2
      .dont_change_src_y_inc:
    .nopers:

  .no_x_min_clip_adjustments:


  tst.b blsp_bl_extra_source_read_flag
  beq .no__extra_src_read
    add.b #$c0,blsp_skew_value  ; set NFSR and FXSR bits
  .no__extra_src_read:

  ;-- X clipping adjustments
  ;------------------------------------------------------------
  ;-- Pointers to mask and sprite data

  ; a0 contains the address to the mask, let's save it
  move.l a0,blsp_mask_pointer
  move.w blsp_width_words_before_clip,d7  ; d7 now contains the width of the mask, in words
  move.w blsp_height_pixels_before_clip,d6  ; d6 now contains the height in pixels
  mulu.w d7,d6  ; should be LUT:ed, or why not part of the sprite struct
  add.w d6,d6  ; we double d6 to convert a word-offset to bytes
  add.w d6,a0  ; a0 now points to sprite data
  move.l a0,blsp_sprite_pointer
  ext.l d6
  move.l d6,blsp_bitplane_offset  ; save offset for sprites with multiple bitplanes

  ;-- Pointers to mask and sprite data
  ;------------------------------------------------------------
  ;-- Blast mask

  ; Clear blaster halftone RAM
    move.l #$ffff8a00,a6
    rept 16/2
      clr.l (a6)+
    endr

  move.l a1,-(sp)  ; save a1 for next bitplane

  move.w blsp_sprite_bitplanes,d7
  subq #1,d7
  ext.l d7
  .blast_mask_loop:
    move.w blsp_bl_source_x_inc,$ffff8a20   ;source x inc
    move.w blsp_bl_source_y_inc,$ffff8a22   ;source y inc
    move.l blsp_mask_pointer,$ffff8a24   ;source address
    move.w blsp_bl_endmask_0,$ffff8a28   ;endmask 0
    move.w blsp_bl_endmask_1,$ffff8a2a  ;endmask 1
    move.w blsp_bl_endmask_2,$ffff8a2c   ;endmask 2
    move.w blsp_bl_dest_x_inc,$ffff8a2e   ;dest x inc
    move.w blsp_bl_dest_y_inc,$ffff8a30   ;dest y inc
    move.l a1,$ffff8a32   ;destination address
    move.w blsp_width_words_after_clip,$ffff8a36   ;x count (n words per line to copy)
    move.w blsp_height_pixels_after_clip,$ffff8a38   ;y count (n lines to copy)
    move.b blsp_skew_value,$ffff8a3d  ; set skew
    move.b #BLASTER_HOP_SOURCE,$ffff8a3a              ; halftone operation
    move.b #BLASTER_OP_SOURCE_AND_TARGET,$ffff8a3b    ; operation
      ;move.b #BLASTER_OP_SOURCE,$ffff8a3b    ; operation
    move.b #BLASTER_COMMAND_START_HOG_MODE,$ffff8a3c  ; start blaster

    addq #2,a1  ; offset to next bitplane
  dbra d7,.blast_mask_loop

  move.l (sp)+,a1

  ;-- Blast mask
  ;------------------------------------------------------------
  ;-- Blast sprite

  move.l blsp_bitplane_offset,d6
  move.w blsp_sprite_bitplanes,d7
  subq #1,d7
  ext.l d7
  .blast_sprite_loop:
    move.w blsp_bl_source_x_inc,$ffff8a20   ;source x inc
    move.w blsp_bl_source_y_inc,$ffff8a22   ;source y inc
    move.l blsp_sprite_pointer,$ffff8a24   ;source address
    move.w blsp_bl_endmask_0,$ffff8a28   ;endmask 0
    move.w blsp_bl_endmask_1,$ffff8a2a  ;endmask 1
    move.w blsp_bl_endmask_2,$ffff8a2c   ;endmask 2
    move.w blsp_bl_dest_x_inc,$ffff8a2e   ;dest x inc
    move.w blsp_bl_dest_y_inc,$ffff8a30   ;dest y inc
    move.l a1,$ffff8a32   ;destination address
    move.w blsp_width_words_after_clip,$ffff8a36   ;x count (n words per line to copy)
    move.w blsp_height_pixels_after_clip,$ffff8a38   ;y count (n lines to copy)
    move.b blsp_skew_value,$ffff8a3d  ; set skew
    move.b #BLASTER_HOP_SOURCE,$ffff8a3a              ; halftone operation
    move.b #BLASTER_OP_SOURCE_OR_TARGET,$ffff8a3b    ; operation
      ;move.b #BLASTER_OP_SOURCE_XOR_TARGET,$ffff8a3b    ; operation
      ;move.b #BLASTER_OP_SOURCE,$ffff8a3b    ; operation
    move.b #BLASTER_COMMAND_START_HOG_MODE,$ffff8a3c  ; start blaster

    add.l d6,blsp_sprite_pointer  ; to get to the next sprite bitplane
    addq #2,a1  ; offset to next bitplane
  dbra d7,.blast_sprite_loop

  ;-- Blast sprite
  ;------------------------------------------------------------

.early_out:
  rts


;------------------------------------------------------------------------------


  .bss
blsp_sprite_bitplanes:            ds.w 1

; Address to mask
blsp_mask_pointer:                ds.l 1
; Address to first bitplane in sprite
blsp_sprite_pointer:              ds.l 1
; Offset to next bitplane in sprite
blsp_bitplane_offset:             ds.l 1

; Width and height variables for clipping version
blsp_width_pixels_before_clip:    ds.w 1
blsp_width_pixels_after_clip:     ds.w 1
blsp_width_words_before_clip:     ds.w 1
blsp_width_words_after_clip:      ds.w 1
blsp_height_pixels_before_clip:   ds.w 1
blsp_height_pixels_after_clip:    ds.w 1

; Flags to determine if X clipping is going on
bslp_clip_x_min_flag:             ds.b 1
bslp_clip_x_max_flag:             ds.b 1

; Flag for extra source read
blsp_bl_extra_source_read_flag:   ds.b 1

blsp_skew_value:                  ds.b 1  

; Blaster shadow variables
blsp_bl_source_x_inc:   ds.w 1
blsp_bl_source_y_inc:   ds.w 1
blsp_bl_endmask_0:      ds.w 1
blsp_bl_endmask_1:      ds.w 1
blsp_bl_endmask_2:      ds.w 1
blsp_bl_dest_x_inc:     ds.w 1
blsp_bl_dest_y_inc:     ds.w 1


  .68000


;------------------------------------------------------------------------------


  .data

bfly_palette:
  include "bfly_160x164_4bpl.pal"

bfly_160:
  dc.w 1  ; bitplanes
  dc.w 160, 164  ; width, height
  incbin "bflymask_160x164.1bp"
  incbin "bfly_160x164.1bp"


blsp_leftmasks:
    dc.w %1111111111111111
    dc.w %0111111111111111
    dc.w %0011111111111111
    dc.w %0001111111111111
    dc.w %0000111111111111
    dc.w %0000011111111111
    dc.w %0000001111111111
    dc.w %0000000111111111
    dc.w %0000000011111111
    dc.w %0000000001111111
    dc.w %0000000000111111
    dc.w %0000000000011111
    dc.w %0000000000001111
    dc.w %0000000000000111
    dc.w %0000000000000011
    dc.w %0000000000000001


blsp_rightmasks:
    dc.w %0000000000000000
    dc.w %1000000000000000
    dc.w %1100000000000000
    dc.w %1110000000000000
    dc.w %1111000000000000
    dc.w %1111100000000000
    dc.w %1111110000000000
    dc.w %1111111000000000
    dc.w %1111111100000000
    dc.w %1111111110000000
    dc.w %1111111111000000
    dc.w %1111111111100000
    dc.w %1111111111110000
    dc.w %1111111111111000
    dc.w %1111111111111100
    dc.w %1111111111111110

  .68000

Limitations of this code

  • This code always masks and draws the same number of bitplanes. It would be trivial to change it to, for example, mask four bitplanes but draw only two.

  • Sprites so wide that they clip on both the left and the right are displayed incorrectly. This should be relatively easy to fix if it is a problem. (Y clipping does not have this issue.)

  • X clipping is currently only allowed on 16-pixel boundaries (to keep things relatively simple). There’s no reason this code couldn’t be expanded to allow arbitrary X clipping boundaries; it’s mainly a lot of extra endmask work.
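
To give an idea of what that extra endmask work could look like, here is a minimal sketch. It is not part of the routine above and has not been tested on real hardware. The two helpers, blsp_clip_left_pixels and blsp_clip_right_pixels, are hypothetical names made up for this illustration; they reuse the blsp_leftmasks and blsp_rightmasks tables and the endmask shadow variables from the listing, and they assume they are called after the normal endmasks have been set up but before the blaster registers are written.

```asm
; Sketch only: both helpers are hypothetical and not part of the routine above.

; In:  d0.w = number of pixels (0-15) to hide at the left edge of the
;             first destination word
; Trashes d0 and a2
blsp_clip_left_pixels:
  and.w #15,d0                  ; keep the pixel count within the table
  add.w d0,d0                   ; table index -> byte offset
  lea blsp_leftmasks,a2
  move.w (a2,d0.w),d0           ; mask with that many leftmost bits cleared
  and.w d0,blsp_bl_endmask_0    ; keep only the pixels inside the clip area
  rts

; In:  d0.w = number of pixels (1-15) to keep at the left of the
;             last destination word
; Trashes d0 and a2
blsp_clip_right_pixels:
  and.w #15,d0                  ; keep the pixel count within the table
  add.w d0,d0                   ; table index -> byte offset
  lea blsp_rightmasks,a2
  move.w (a2,d0.w),d0           ; mask with that many leftmost bits set
  and.w d0,blsp_bl_endmask_2    ; drop the pixels outside the clip area
  rts
```

One thing to watch out for: when the x count is 1, the blaster only uses the first endmask, so in that case a right-hand clip would have to be ANDed into blsp_bl_endmask_0 instead.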

Thanks

  • Awesome butterfly graphics by Acca

  • Teaching-me-everything-I-know and Smacking-me-upside-my-head-when-I-deserve-it by ggn/küa

Code repository at BitBucket

Basic Blaster Usage article

Excellence in Art
