Update on emulator issue, Oct 31st, 2018
Atari legend Nicolas Pomarède points out that it doesn’t really make sense to combine the NFSR and FXSR bits the way I do. And he’s right!
My mistake was relying on emulators while writing this, and using the NFSR and FXSR bits this way was the only way I could make it work. So…
Wanted!
Someone to figure out what is going wrong here. Please email me your findings at per@brainfish.net - thanks in advance!
v 1.04, Oct 23rd, 2018 - There was a bug in X clipping when showing fewer than the 16 rightmost pixels. It displayed fine in STEem, which is why I missed it.
However, at the time of writing, while the corrected code below works fine on real hardware, it doesn’t work in either Hatari or STEem. Lesson: Always test on real hardware.
Introduction
So, you’ve played around a bit with the blaster (we don’t call it a blitter, see my previous article “Basic Blaster Usage”), you’ve used it to clear screen buffers, maybe even written a basic sprite routine, but now you’re ready to move on to the hairy stuff: clipping. Specifically, clipping of sprites.
What is clipping?
Clipping is the act of not drawing the whole sprite. For example, if a sprite is halfway off the screen on the right and we just draw it normally, parts of it will wrap around and end up on the left edge of the screen.
Or imagine a raindrop sprite entering the screen from the top on its way down the screen; we would need to draw only the bottom part of the raindrop.
Do you need clipping?
Well, it may seem an odd question in a document like this, but it’s a valid one. Clipping comes at a cost, and depending on which solution we choose, that cost can come in memory usage, CPU usage, or possibly both.
There are definitely cases where you don’t even have to bother with clipping. Imagine a Pac-Man type game where you don’t implement the tunnel that lets sprites travel from the right end of the screen to the left one (and the other way around). Here, clipping won’t be necessary at all.
In demo development, we often resort to sacrificing memory to gain speed. And we can certainly solve clipping that way too:
Non-clipping clipping
As we know, the visible screen on a 16/32-bit Atari is just a bit of memory, and on most of these machines we can easily make the screen larger than the area shown on our TV/monitor (in fact, vertically we can do this even on the measly ST).
Let’s start with Y. Assume a 64x64-pixel sprite, drawn at position (0, 150) (let’s also assume a standard “ST Low” 320x200 screen). Only 50 rows of pixels (Y=150 to Y=199) will be visible, so 14 rows (64-50) will be drawn… well, into whatever RAM is after the screen buffer. The solution would be to reserve 63 lines of screen space before the actual screen, and 63 lines after it. This means what was a 32000-byte screen buffer will now have to be 52160 bytes ((200+63+63)*160). Another disadvantage is that we will be spending CPU time drawing pixels that will never be seen. But our code will definitely be a lot simpler to write and manage.
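To make the arithmetic concrete, here is a minimal sketch of how such a padded buffer could be declared in rmac. All the names (SPRITE_MAX_HEIGHT, GUARD_LINES, padded_screen and so on) are placeholders of mine and not part of the routines later in this article, and the numbers assume the 64-line sprite from the example above.

```
; Hypothetical names throughout; assumes sprites up to 64 lines tall on an ST Low screen
SPRITE_MAX_HEIGHT equ 64
GUARD_LINES equ SPRITE_MAX_HEIGHT-1 ; 63 lines above the screen and 63 below
VISIBLE_LINES equ 200
LINE_BYTES equ 160
PADDED_BYTES equ (VISIBLE_LINES+2*GUARD_LINES)*LINE_BYTES ; 326*160 = 52160

    ; Point a1 at the first visible line, skipping the top guard band
    lea padded_screen,a1
    adda.l #GUARD_LINES*LINE_BYTES,a1 ; 63*160 = 10080 bytes in

    .bss
padded_screen: ds.b PADDED_BYTES
```

The address in a1 is what you would use as the video base. On a plain ST the video base has to sit on a 256-byte boundary, so the buffer may need a little extra padding in front of it to make that work out.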
How about X then? Well, on a standard ST this is problematic, but on STe/TT/Falcon we can increase the line width of the screen to give us space on both the left and the right of the 320x200 screen, which lets us simply draw sprites at the edge without worrying about part of them appearing on the opposite edge. With 64-pixel-wide sprites, the screen buffer grows from 32000 bytes to 38432 (8x200 bytes for a 16-pixel column, 24 such columns, plus an extra 4 single-pixel-height columns at 8 bytes each = (24x8x200)+(4x8)).
These solutions might be just what you need, but they do carry some downsides, and besides, they’re easy. We don’t want easy. We leave easy to people without ambition and discipline.
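As a rough sketch of the STe case, the fragment below sets up the wider line and reproduces the 38432-byte figure. It assumes an STe in ST Low (4 bitplanes), no hardware scrolling in use, and sprites no wider than 64 pixels. It uses the STe line-width register at $ffff820f, which holds the number of extra words per line. All the names are hypothetical. The margin is shared: a sprite leaving on the right lands in the margin at the end of its own line, and a sprite leaving on the left lands in the margin at the end of the previous line, which is why one extra margin is needed in front of the first line.

```
; Hypothetical names; assumes STe, ST Low, sprites up to 64 pixels wide
MARGIN_COLUMNS equ 4 ; 64 pixels / 16 pixels per column
COLUMN_BYTES equ 8 ; 4 bitplanes * 2 bytes
VISIBLE_COLUMNS equ 20 ; 320 pixels / 16
WIDE_LINE_BYTES equ (VISIBLE_COLUMNS+MARGIN_COLUMNS)*COLUMN_BYTES ; 24*8 = 192
WIDE_BYTES equ WIDE_LINE_BYTES*200+MARGIN_COLUMNS*COLUMN_BYTES ; 38400+32 = 38432

    ; Tell the STe that every line in memory is 16 words longer than what it displays
    move.b #MARGIN_COLUMNS*COLUMN_BYTES/2,$ffff820f ; 32 bytes = 16 words

    ; The visible area starts one margin into the buffer
    lea wide_screen+MARGIN_COLUMNS*COLUMN_BYTES,a1

    .bss
wide_screen: ds.b WIDE_BYTES
```

Any drawing code then has to step 192 bytes per line instead of 160, so a constant like SCREEN_WIDTH_BYTES in the listings further down would need to change accordingly.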
Real clipping
This is where we get into actual code. Finally! Please note that all code here is for rmac, but should be fairly easy to translate to other assemblers.
The code below is not super-optimized, because:
It’s super generic. It works with any resolution, any number of bitplanes (1, 2, 4 or 8), and any size of sprite (I make no guarantees about what happens if a single bitplane in the sprite is larger than 32766 bytes, however)
I want the code to be as easy to read as possible
If you want to make it faster without losing flexibility, there are two multiplications that could be replaced by LUTs (lookup tables), and if you’re ready to go for a fixed resolution or a fixed number of bitplanes, there’s practically no end to how fast you can make this.
Also, I do write a lot of data from registers to memory and back, so that’s another place where you can shave off cycles.
However, this is blaster code, so for any sprite larger than, say, 32x32 (even if it’s just mask+one bitplane), the blaster calls are going to be the majority of the time spent in the subroutine.
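One of the two multiplications mentioned above is the Y position times the screen width in bytes. A minimal sketch of a lookup table for it could look like the following. The labels blsp_init_y_lut, blsp_add_y_offset and y_offset_lut, and the constant SCREEN_HEIGHT_LINES, are inventions for this example; SCREEN_WIDTH_BYTES is the EQU from the listings below. It assumes a 200-line screen, where the largest offset (199*160 = 31840) still fits in a signed word.

```
SCREEN_HEIGHT_LINES equ 200 ; assumption: not defined in the listings

; Call once at startup: y_offset_lut[y] = y * SCREEN_WIDTH_BYTES
blsp_init_y_lut:
    lea y_offset_lut,a6
    moveq #0,d7 ; running offset, starts at line 0
    move.w #SCREEN_HEIGHT_LINES-1,d6
.fill:
    move.w d7,(a6)+
    add.w #SCREEN_WIDTH_BYTES,d7
    dbra d6,.fill
    rts

; Stands in for "mulu #SCREEN_WIDTH_BYTES,d1" followed by "add.w d1,a1".
; Expects 0 <= d1 < SCREEN_HEIGHT_LINES. Trashes d1 and a6.
blsp_add_y_offset:
    add.w d1,d1 ; line number -> byte index into a table of words
    lea y_offset_lut,a6
    adda.w (a6,d1.w),a1 ; a1 += Y * SCREEN_WIDTH_BYTES
    rts

    .bss
y_offset_lut: ds.w SCREEN_HEIGHT_LINES
```

In a real routine you would paste the three instructions of blsp_add_y_offset inline rather than pay for a bsr/rts pair; it is a subroutine here only to keep the example in one piece.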
Before we get into the details
If you’re a demoscene legend and can read complex code while solving a Rubik’s Cube and inventing spaceships at the same time, skip this part.
Now then, for the mere mortals among us: The code below starts out with a bunch of EQUs and then some preprocessor code to weed out user errors.
Then follows the definition of what I call a sprite struct. It’s pretty straightforward: The first three words give the number of bitplanes, the width and the height (in pixels), and then come 2-5 single-bitplane images, where the first is the mask and the following ones are the sprite data. All of these must have the same dimensions, of course.
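To illustrate the layout, here is a tiny hand-made sprite struct and a call to the non-clipping routine from the listing that follows. The sprite itself, the label tiny_sprite and the variable screen_base are made up for this example. Since the routines AND the mask onto the screen and then OR the sprite data on top, a 0 bit in the mask punches a hole and a 1 bit keeps the background. The routines get the width in words with a plain shift by 4, so I am assuming the width should be a multiple of 16.

```
; Hypothetical 16x4-pixel sprite with a single bitplane
tiny_sprite:
    dc.w 1 ; number of sprite bitplanes
    dc.w 16,4 ; width and height in pixels
    ; Mask (1 bitplane): 0 = punch a hole, 1 = keep the background
    dc.w %1111110000111111
    dc.w %1111000000001111
    dc.w %1111000000001111
    dc.w %1111110000111111
    ; Sprite bitplane 0: ORed into the hole
    dc.w %0000001111000000
    dc.w %0000111111110000
    dc.w %0000111111110000
    dc.w %0000001111000000

; Drawing it at (100, 80), which is inside the default clip window
draw_tiny_sprite:
    move.w #100,d0 ; X
    move.w #80,d1 ; Y
    lea tiny_sprite,a0 ; pointer to sprite struct
    move.l screen_base,a1 ; hypothetical variable holding the screen address
    bsr blsp_draw_noclip
    rts
```

The blaster registers live in hardware I/O space, so the caller has to be in supervisor mode when the routine runs.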
Code: Standard sprite routine
;--------------------------------------------------------------
;-- Blaster EQUs
BLASTER_HOP_ONES equ %00000000
BLASTER_HOP_HALFTONE equ %00000001
BLASTER_HOP_SOURCE equ %00000010
BLASTER_HOP_SOURCE_AND_HALFTONE equ %00000011
BLASTER_OP_SOURCE equ %00000011
BLASTER_OP_SOURCE_AND_TARGET equ %00000001
BLASTER_OP_SOURCE_AND_NOT_TARGET equ %00000010
BLASTER_OP_SOURCE_OR_TARGET equ %00000111
BLASTER_OP_SOURCE_XOR_TARGET equ %00000110
BLASTER_OP_SOURCE_NOT_TARGET equ %00000100
BLASTER_OP_ZEROES equ %00000000
BLASTER_OP_ONES equ %00001111
BLASTER_COMMAND_START_HOG_MODE equ %11000000
BLASTER_COMMAND_START_SHARED_MODE equ %10000000
;-- Blaster EQUs
;--------------------------------------------------------------
;-- Screen mode EQUs
SCREEN_WIDTH_PIXELS equ 320
SCREEN_WIDTH_BYTES equ 160
SCREEN_BITPLANES equ 4
;-- Screen mode EQUs
;--------------------------------------------------------------
;-- User setting EQUs
SCREEN_CLIP_Y_MIN equ 32
SCREEN_CLIP_Y_MAX equ 199-32
SCREEN_CLIP_X_MIN equ 32 ; must be on a 16-pixel-boundary!
SCREEN_CLIP_X_MAX equ 319-32 ; SCREEN_CLIP_X_MAX+1 must be on a 16-pixel-boundary!
;-- User setting EQUs
;--------------------------------------------------------------
;-- Verify user settings
macro fatal_error errorstring
print "---[ FATAL ERROR ]-----------------------------------------------------------"
print \{errorstring}
print ""
print ""
.error
endm
; Check clip values
if SCREEN_CLIP_Y_MIN<0
fatal_error "SCREEN_CLIP_Y_MIN must be a positive value!"
endif
if SCREEN_CLIP_X_MIN<0
fatal_error "SCREEN_CLIP_X_MIN must be a positive value!"
endif
if (SCREEN_CLIP_X_MIN & $f)!=0
fatal_error "SCREEN_CLIP_X_MIN must be divisible by 16!"
endif
if ((SCREEN_CLIP_X_MAX+1) & $f)!=0
fatal_error "SCREEN_CLIP_X_MAX+1 must be divisible by 16!"
endif
; Check number of bitplanes, and create bitplane shift value to save a couple of multiplications
BPL_ERROR set 1
BPL_SHIFTER set 0
if SCREEN_BITPLANES=1
BPL_ERROR set 0
BPL_SHIFTER set 1 ; 2
endif
if SCREEN_BITPLANES=2
BPL_ERROR set 0
BPL_SHIFTER set 2 ; 4
endif
if SCREEN_BITPLANES=4
BPL_ERROR set 0
BPL_SHIFTER set 3 ; 8
endif
if SCREEN_BITPLANES=8
BPL_ERROR set 0
BPL_SHIFTER set 4 ; 16
endif
if BPL_ERROR=1
fatal_error "SCREEN_BITPLANES must be 1, 2, 4 or 8!"
endif
;-- Verify user settings
;--------------------------------------------------------------
; Sprite struct:
; Number_of_bitplanes.w
; Width.w (in pixels)
; Height.w (in pixels)
; mask data, 1 bitplane
; sprite data, 1-4 bitplanes, not interleaved
blsp_draw_noclip:
; In: d0.w - X
; d1.w - Y
; a0.l - pointer to sprite struct
; a1.l - pointer to screen
;------------------------------------------------------------
;-- Setup blaster registers
; These are all default values
move.w #2,blsp_bl_source_x_inc
move.w #2,blsp_bl_source_y_inc
move.w #-1,blsp_bl_endmask_1
move.w #SCREEN_BITPLANES*2,blsp_bl_dest_x_inc
;-- Setup blaster registers
;------------------------------------------------------------
;-- Get width and height of sprite
; Get number of bitplanes in sprite
move.w (a0)+,blsp_sprite_bitplanes
; Get sprite width
move.w (a0)+,d7 ; let's not modify d7 for a while, it's going to see use later
; Get sprite height
move.w (a0)+,d6 ; let's not touch a0 or d6, we're going to use them in a bit
move.w d6,blsp_height_pixels
;-- Get width and height of sprite
;------------------------------------------------------------
;-- "Early out" clipping
; X min
cmp.w #SCREEN_CLIP_X_MIN,d0
blt .early_out
; Y min
cmp.w #SCREEN_CLIP_Y_MIN,d1
blt .early_out
;X max
move.w #SCREEN_CLIP_X_MAX,d2
sub.w d7,d2
add.w #1,d2
cmp.w d2,d0
bgt .early_out
; Y max
move.w #SCREEN_CLIP_Y_MAX,d2
sub.w d6,d2
add.w #1,d2
cmp.w d2,d1
bgt .early_out
;-- "Early out" clipping
;------------------------------------------------------------
;-- Get width-in-words
move.w d7,blsp_width_pixels
lsr.w #4,d7
move.w d7,blsp_width_words
;-- Get width-in-words
;------------------------------------------------------------
;-- X position
move.w d0,d5 ; Back up X so we can still access the untouched value later
and.b #$f,d5 ; Get lowest 4 bits from X for skewing
move.b d5,blsp_skew_value
; Mask out so we get a clean multiple of 16
move.w d0,d5
sub.w #SCREEN_CLIP_X_MIN,d5
tst.w d5
bmi .x_position_done
move.w d0,d4
; Adjust screen pointer to correct "16-pixel block"
lsr.w #4,d4
lsl.w #BPL_SHIFTER,d4 ; instead of mulu #2*SCREEN_BITPLANES,d4
add.w d4,a1
.x_position_done:
;-- X position
;------------------------------------------------------------
;-- Y position
; Adjust screen pointer - this multiplication should be LUT:ed, obviously
mulu #SCREEN_WIDTH_BYTES,d1
add.w d1,a1
;-- Y position
;------------------------------------------------------------
;-- Destination Y increment
move.w blsp_width_words,d7
sub.w #1,d7 ; because we need one less for dest y inc
move.w #SCREEN_WIDTH_BYTES,d6
; multiply width-in-words to account for words and bitplanes
lsl.w #BPL_SHIFTER,d7 ; instead of mulu #2*SCREEN_BITPLANES,d7
sub.w d7,d6 ; subtract total sprite width from total screen width...
move.w d6,blsp_bl_dest_y_inc ; ...and that's how much the blaster needs to add each line
;-- Destination Y increment
;------------------------------------------------------------
;-- Pointers to mask and sprite data
; a0 contains the address to the mask, let's save it
move.l a0,blsp_mask_pointer
move.w blsp_width_words,d7 ; d7 now contains the width of the mask, in words
move.w blsp_height_pixels,d6 ; d6 now contains the height in pixels
mulu.w d7,d6 ; should be LUT:ed, or why not part of the sprite struct
add.w d6,d6 ; we double d6 to convert a word-offset to bytes
add.w d6,a0 ; a0 now points to sprite data
move.l a0,blsp_sprite_pointer
ext.l d6
move.l d6,blsp_bitplane_offset ; save offset for sprites with multiple bitplanes
;-- Pointers to mask and sprite data
;------------------------------------------------------------
;-- Skewing and endmasks
tst.b blsp_skew_value ; if skew value is 0, that's one codepath...
bne .skewing ; if not, that's another.
; no skewing
move.w #-1,blsp_bl_endmask_0 ; left endmask
move.w #-1,blsp_bl_endmask_2 ; right endmask
bra .skewing_done
.skewing:
; Fetch endmasks from LUTs
clr.l d7
move.b blsp_skew_value,d7
add.l d7,d7
lea blsp_leftmasks,a6
move.w (a6,d7),blsp_bl_endmask_0
lea blsp_rightmasks,a6
move.w (a6,d7),blsp_bl_endmask_2
; ... and because we've skewed (effectively adding 16 pixels to the right), we need to adapt these values:
add.w #1,blsp_width_words ; one more xcount
sub.w #SCREEN_BITPLANES*2,blsp_bl_dest_y_inc ; adjust dest y inc accordingly
sub.w #2,blsp_bl_source_y_inc ; source y inc
.skewing_done:
;-- Skewing and endmasks
;------------------------------------------------------------
;-- Blast mask
; Clear blaster halftone RAM
move.l #$ffff8a00,a6
rept 16/2
clr.l (a6)+
endr
move.l a1,-(sp) ; save a1 for next bitplane
move.w blsp_sprite_bitplanes,d7
subq #1,d7
ext.l d7
.blast_mask_loop:
move.w blsp_bl_source_x_inc,$ffff8a20 ;source x inc
move.w blsp_bl_source_y_inc,$ffff8a22 ;source y inc
move.l blsp_mask_pointer,$ffff8a24 ;source address
move.w blsp_bl_endmask_0,$ffff8a28 ;endmask 0
move.w blsp_bl_endmask_1,$ffff8a2a ;endmask 1
move.w blsp_bl_endmask_2,$ffff8a2c ;endmask 2
move.w blsp_bl_dest_x_inc,$ffff8a2e ;dest x inc
move.w blsp_bl_dest_y_inc,$ffff8a30 ;dest y inc
move.l a1,$ffff8a32 ;destination address
move.w blsp_width_words,$ffff8a36 ;x count (n words per line to copy)
move.w blsp_height_pixels,$ffff8a38 ;y count (n lines to copy)
move.b blsp_skew_value,$ffff8a3d ; set skew
move.b #BLASTER_HOP_SOURCE,$ffff8a3a ; halftone operation
move.b #BLASTER_OP_SOURCE_AND_TARGET,$ffff8a3b ; operation
;move.b #BLASTER_OP_SOURCE,$ffff8a3b ; operation
move.b #BLASTER_COMMAND_START_HOG_MODE,$ffff8a3c ; start blaster
addq #2,a1 ; offset to next bitplane
dbra d7,.blast_mask_loop
move.l (sp)+,a1
;-- Blast mask
;------------------------------------------------------------
;-- Blast sprite
move.l blsp_bitplane_offset,d6
move.w blsp_sprite_bitplanes,d7
subq #1,d7
ext.l d7
.blast_sprite_loop:
move.w blsp_bl_source_x_inc,$ffff8a20 ;source x inc
move.w blsp_bl_source_y_inc,$ffff8a22 ;source y inc
move.l blsp_sprite_pointer,$ffff8a24 ;source address
move.w blsp_bl_endmask_0,$ffff8a28 ;endmask 0
move.w blsp_bl_endmask_1,$ffff8a2a ;endmask 1
move.w blsp_bl_endmask_2,$ffff8a2c ;endmask 2
move.w blsp_bl_dest_x_inc,$ffff8a2e ;dest x inc
move.w blsp_bl_dest_y_inc,$ffff8a30 ;dest y inc
move.l a1,$ffff8a32 ;destination address
move.w blsp_width_words,$ffff8a36 ;x count (n words per line to copy)
move.w blsp_height_pixels,$ffff8a38 ;y count (n lines to copy)
move.b blsp_skew_value,$ffff8a3d ; set skew
move.b #BLASTER_HOP_SOURCE,$ffff8a3a ; halftone operation
move.b #BLASTER_OP_SOURCE_OR_TARGET,$ffff8a3b ; operation
;move.b #BLASTER_OP_SOURCE_XOR_TARGET,$ffff8a3b ; operation
;move.b #BLASTER_OP_SOURCE,$ffff8a3b ; operation
move.b #BLASTER_COMMAND_START_HOG_MODE,$ffff8a3c ; start blaster
add.l d6,blsp_sprite_pointer ; to get to the next sprite bitplane
addq #2,a1 ; offset to next bitplane
dbra d7,.blast_sprite_loop
;-- Blast sprite
;------------------------------------------------------------
.early_out:
rts
;------------------------------------------------------------------------------
.bss
blsp_sprite_bitplanes: ds.w 1
; Address to mask
blsp_mask_pointer: ds.l 1
; Address to first bitplane in sprite
blsp_sprite_pointer: ds.l 1
; Offset to next bitplane in sprite
blsp_bitplane_offset: ds.l 1
; Width and height variables
blsp_width_pixels: ds.w 1
blsp_height_pixels: ds.w 1
blsp_width_words: ds.w 1
blsp_skew_value: ds.b 1
even
; Blaster shadow variables
blsp_bl_source_x_inc: ds.w 1
blsp_bl_source_y_inc: ds.w 1
blsp_bl_endmask_0: ds.w 1
blsp_bl_endmask_1: ds.w 1
blsp_bl_endmask_2: ds.w 1
blsp_bl_dest_x_inc: ds.w 1
blsp_bl_dest_y_inc: ds.w 1
.68000
;------------------------------------------------------------------------------
.data
bfly_palette:
include "bfly_160x164_4bpl.pal"
bfly_160:
dc.w 1 ; bitplanes
dc.w 160, 164 ; width, height
incbin "bflymask_160x164.1bp"
incbin "bfly_160x164.1bp"
blsp_leftmasks:
dc.w %1111111111111111
dc.w %0111111111111111
dc.w %0011111111111111
dc.w %0001111111111111
dc.w %0000111111111111
dc.w %0000011111111111
dc.w %0000001111111111
dc.w %0000000111111111
dc.w %0000000011111111
dc.w %0000000001111111
dc.w %0000000000111111
dc.w %0000000000011111
dc.w %0000000000001111
dc.w %0000000000000111
dc.w %0000000000000011
dc.w %0000000000000001
blsp_rightmasks:
dc.w %0000000000000000
dc.w %1000000000000000
dc.w %1100000000000000
dc.w %1110000000000000
dc.w %1111000000000000
dc.w %1111100000000000
dc.w %1111110000000000
dc.w %1111111000000000
dc.w %1111111100000000
dc.w %1111111110000000
dc.w %1111111111000000
dc.w %1111111111100000
dc.w %1111111111110000
dc.w %1111111111111000
dc.w %1111111111111100
dc.w %1111111111111110
.68000
The only clipping this code actually performs is an “early out” clipping that skips all drawing if any part of the sprite is outside the clipping frustum.
Code: Clipping sprite routine
Here we get into the real nitty-gritty.
“Early-out clipping”
The first difference you will notice is that the “early-out” clipping has been modified. Instead of exiting if a single pixel is outside the clipping frustum, it now only exits if all of the sprite is outside the clipping frustum.
“Partial clipping”
Rather than draw all or nothing of the sprite (like we do in the code above), we want to draw only parts of the sprite, which doesn’t sound too bad.
Vertically: Y clipping
And vertically, it’s really not too bad! All we do is adjust the start address of the sprite data and the number of lines to draw.
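As a worked example, here is the top-edge (Y min) part of the clipping routine from the listing further down, annotated with the values the registers would hold for one assumed case: a 160x164 sprite (10 words wide) drawn at Y=20 with SCREEN_CLIP_Y_MIN set to 32. The instructions are the ones from the listing; only the input values and the comments are mine.

```
; Assumed inputs: d1 = 20 (Y), d6 = 164 (height), d7 = 10 (width in words),
; a0 -> first line of the mask, SCREEN_CLIP_Y_MIN = 32
    move.w #SCREEN_CLIP_Y_MIN,d2 ; d2 = 32
    sub.w d1,d2 ; d2 = 32-20 = 12 lines fall above the clip edge
    bmi.s .no_y_min_clip ; not taken, since 12 is not negative
    sub.w d2,d6 ; d6 = 164-12 = 152 lines left to draw
    move.w d6,blsp_height_pixels_after_clip
    move.w d7,d3 ; d3 = 10 words per line...
    add.w d3,d3 ; ...= 20 bytes per line
    mulu d3,d2 ; d2 = 12*20 = 240 bytes to skip
    add.w d2,a0 ; a0 -> line 12 of the mask
    move.w #SCREEN_CLIP_Y_MIN,d1 ; drawing now starts at Y = 32
.no_y_min_clip:
```

Because the pointer to the sprite data is later derived from a0 by adding the size of one whole bitplane, the sprite data picks up the same 240-byte offset as the mask. Clipping at the bottom edge is simpler still: it only shortens the line count.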
Horizontally: X clipping
From high above, horizontal clipping (or X clipping) doesn’t seem too bad. We calculate how many pixels to draw, and we adjust the blaster’s “X count” register for that.
Oh, except the “X count” deals with words. And since the “X count” is the width of the sprite, we need to change the “dest y increment” register too. And the “source y increment” register.
Oh, and if we’re cutting the sprite on the right, we have to set the right endmask to all 1’s (we don’t want it to mask off anything).
And if the width drawn is 16 pixels or less, it’s not the right endmask that needs modifying, it’s the left one (because endmask_0 is always used, the others only come into play at widths greater than 16).
Aaand as if that wasn’t enough, there’s a specific case where we have to set the NFSR (No Final Source Read) and FXSR (Force eXtra Source Read) bits of the skew register.
So… Yes, there are a lot of cases we have to be aware of, but I have sorted it all out for you, so go nuts with the code:
;--------------------------------------------------------------
;-- Blaster EQUs
BLASTER_HOP_ONES equ %00000000
BLASTER_HOP_HALFTONE equ %00000001
BLASTER_HOP_SOURCE equ %00000010
BLASTER_HOP_SOURCE_AND_HALFTONE equ %00000011
BLASTER_OP_SOURCE equ %00000011
BLASTER_OP_SOURCE_AND_TARGET equ %00000001
BLASTER_OP_SOURCE_AND_NOT_TARGET equ %00000010
BLASTER_OP_SOURCE_OR_TARGET equ %00000111
BLASTER_OP_SOURCE_XOR_TARGET equ %00000110
BLASTER_OP_SOURCE_NOT_TARGET equ %00000100
BLASTER_OP_ZEROES equ %00000000
BLASTER_OP_ONES equ %00001111
BLASTER_COMMAND_START_HOG_MODE equ %11000000
BLASTER_COMMAND_START_SHARED_MODE equ %10000000
;-- Blaster EQUs
;--------------------------------------------------------------
;-- Screen mode EQUs
SCREEN_WIDTH_PIXELS equ 320
SCREEN_WIDTH_BYTES equ 160
SCREEN_BITPLANES equ 4
;-- Screen mode EQUs
;--------------------------------------------------------------
;-- User setting EQUs
SCREEN_CLIP_Y_MIN equ 32
SCREEN_CLIP_Y_MAX equ 199-32
SCREEN_CLIP_X_MIN equ 32 ; must be on a 16-pixel-boundary!
SCREEN_CLIP_X_MAX equ 319-32 ; SCREEN_CLIP_X_MAX+1 must be on a 16-pixel-boundary!
;-- User setting EQUs
;--------------------------------------------------------------
;-- Verify user settings
macro fatal_error errorstring
print "---[ FATAL ERROR ]-----------------------------------------------------------"
print \{errorstring}
print ""
print ""
.error
endm
; Check clip values
if SCREEN_CLIP_Y_MIN<0
fatal_error "SCREEN_CLIP_Y_MIN must be a positive value!"
endif
if SCREEN_CLIP_X_MIN<0
fatal_error "SCREEN_CLIP_X_MIN must be a positive value!"
endif
if (SCREEN_CLIP_X_MIN & $f)!=0
fatal_error "SCREEN_CLIP_X_MIN must be divisible by 16!"
endif
if ((SCREEN_CLIP_X_MAX+1) & $f)!=0
fatal_error "SCREEN_CLIP_X_MAX+1 must be divisible by 16!"
endif
; Check number of bitplanes, and create bitplane shift value to save a couple of multiplications
BPL_ERROR set 1
BPL_SHIFTER set 0
if SCREEN_BITPLANES=1
BPL_ERROR set 0
BPL_SHIFTER set 1 ; 2
endif
if SCREEN_BITPLANES=2
BPL_ERROR set 0
BPL_SHIFTER set 2 ; 4
endif
if SCREEN_BITPLANES=4
BPL_ERROR set 0
BPL_SHIFTER set 3 ; 8
endif
if SCREEN_BITPLANES=8
BPL_ERROR set 0
BPL_SHIFTER set 4 ; 16
endif
if BPL_ERROR=1
fatal_error "SCREEN_BITPLANES must be 1, 2, 4 or 8!"
endif
;-- Verify user settings
;--------------------------------------------------------------
; Sprite struct:
; Number_of_bitplanes.w
; Width.w (in pixels)
; Height.w (in pixels)
; mask data, 1 bitplane
; sprite data, 1-4 bitplanes, not interleaved
blsp_draw_fullclip:
; In: d0.w - X
; d1.w - Y
; a0.l - pointer to sprite struct
; a1.l - pointer to screen
clr.b bslp_clip_x_min_flag
clr.b bslp_clip_x_max_flag
clr.b blsp_bl_extra_source_read_flag
;------------------------------------------------------------
;-- Setup blaster registers
; These are all default values
move.w #2,blsp_bl_source_x_inc
move.w #2,blsp_bl_source_y_inc
move.w #-1,blsp_bl_endmask_1
move.w #SCREEN_BITPLANES*2,blsp_bl_dest_x_inc
;-- Setup blaster registers
;------------------------------------------------------------
;-- Get width and height of sprite
; Get number of bitplanes in sprite
move.w (a0)+,blsp_sprite_bitplanes
; Get sprite width
move.w (a0)+,d7 ; let's not modify d7 for a while, it's going to see use later
; Get sprite height
move.w (a0)+,d6 ; let's not touch a0 or d6, we're going to use them in a bit
;-- Get width and height of sprite
;------------------------------------------------------------
;-- "Early out" clipping
; Y max
cmp.w #SCREEN_CLIP_Y_MAX,d1
bgt .early_out
; X max
cmp.w #SCREEN_CLIP_X_MAX,d0
bgt .early_out
; Y min
move.w #SCREEN_CLIP_Y_MIN,d2
sub.w d6,d2
cmp.w d2,d1
ble .early_out
; X min
move.w #SCREEN_CLIP_X_MIN,d2
sub.w d7,d2
cmp.w d2,d0
ble .early_out
;-- "Early out" clipping
;------------------------------------------------------------
;-- Get width-in-words
move.w d7,blsp_width_pixels_before_clip
lsr.w #4,d7
move.w d7,blsp_width_words_before_clip
;-- Get width-in-words
;------------------------------------------------------------
;-- X position
move.w d0,d5 ; Back up X so we can still access the untouched value later
and.b #$f,d5 ; Get lowest 4 bits from X for skewing
move.b d5,blsp_skew_value
; Mask out so we get a clean multiple of 16
move.w d0,d5
sub.w #SCREEN_CLIP_X_MIN,d5
tst.w d5
bmi .x_position_done
move.w d0,d4
; Adjust screen pointer to correct "16-pixel block"
lsr.w #4,d4
lsl.w #BPL_SHIFTER,d4 ; instead of mulu #2*SCREEN_BITPLANES,d4
add.w d4,a1
.x_position_done:
;-- X position
;------------------------------------------------------------
;-- Partial clipping
; Write "before clipping" width and height values
move.w d6,blsp_height_pixels_before_clip
move.w d6,blsp_height_pixels_after_clip ; default value, in case no clipping happens
move.w d7,blsp_width_words_after_clip ; default value
; Y max clipping
move.w #SCREEN_CLIP_Y_MAX,d2
move.w d1,d3 ; d3=Y
add.w d6,d3 ; add sprite height to d3...
sub.w d3,d2 ; subtract that from clip value. If result<0, it's clipping time
bpl.s .no_y_max_clip
neg.w d2 ; overshoot in d2
sub.w #1,d2
sub.w d2,d6
move.w d6,blsp_height_pixels_after_clip
.no_y_max_clip:
; Y min clipping
move.w #SCREEN_CLIP_Y_MIN,d2
sub.w d1,d2
bmi.s .no_y_min_clip
sub.w d2,d6
move.w d6,blsp_height_pixels_after_clip
; add offset to mask/sprite data
move.w d7,d3 ; offset = number of words in width...
add.w d3,d3 ; ...times 2...
mulu d3,d2 ; ...times the number of lines
add.w d2,a0 ; add offset
; Adjust Y to clipping
move.w #SCREEN_CLIP_Y_MIN,d1
.no_y_min_clip:
; X max clipping
move.w #SCREEN_CLIP_X_MAX,d2
sub.w blsp_width_pixels_before_clip,d2
add.w #1,d2
cmp.w d2,d0
ble.s .no_x_max_clip
move.b #1,bslp_clip_x_max_flag
; Calculate overshoot
move.w d0,d3
add.w blsp_width_pixels_before_clip,d3
sub.w #SCREEN_CLIP_X_MAX,d3
sub.w #1,d3 ; d3 is now overshoot
move.w blsp_width_pixels_before_clip,d7
sub.w d3,d7
lsr.w #4,d7
move.w d7,blsp_width_words_after_clip
lsr.w #3,d3
add.w d3,blsp_bl_source_y_inc
; Adjust for skew value changing width
tst.b blsp_skew_value
beq .no_fix_src_y_inc
add.w #2,blsp_bl_source_y_inc
.no_fix_src_y_inc:
.no_x_max_clip:
; X min clipping
move.w #SCREEN_CLIP_X_MIN,d2
cmp.w d2,d0
bge.s .no_x_min_clip
move.b #1,bslp_clip_x_min_flag
; Calculate undershoot
move.w d0,d7
sub.w #SCREEN_CLIP_X_MIN,d7
neg.w d7 ; make d7 positive
lsr.w #4,d7 ; divide by 16, because one word = 16 pixels
sub.w d7,blsp_width_words_after_clip ; since we'll be blasting fewer words in X
sub.w #1,blsp_width_words_after_clip ; adjust
; For each 16-pixel block we skip, the source pointer moves on by one word (2 bytes)...
lsl.w #1,d7 ; ...so doubling the word count in d7 gives the byte offset
add.w d7,a0
add.w d7,blsp_bl_source_y_inc ; we also have to adjust source_y_inc for the new width
lsl.w #2,d7 ; d7 = skipped words * 8 = skipped screen bytes with 4 bitplanes
add.w d7,blsp_bl_dest_y_inc ; ...and we adjust dest_y_inc
; Adjust screen offset to clipping
move.w #SCREEN_CLIP_X_MIN,d2
lsr.w #4,d2
lsl.w #BPL_SHIFTER,d2 ; instead of mulu #2*SCREEN_BITPLANES,d2
add.w d2,a1
.no_x_min_clip:
;-- Partial clipping
;------------------------------------------------------------
;-- Y position
; Adjust screen pointer - this multiplication should be LUT:ed, obviously
mulu #SCREEN_WIDTH_BYTES,d1
add.w d1,a1
;-- Y position
;------------------------------------------------------------
;-- Destination Y increment
move.w blsp_width_words_after_clip,d7
sub.w #1,d7 ; because we need one less for dest y inc
move.w #SCREEN_WIDTH_BYTES,d6
; multiply width-in-words to account for words and bitplanes
lsl.w #BPL_SHIFTER,d7 ; instead of mulu #2*SCREEN_BITPLANES,d7
sub.w d7,d6 ; subtract total sprite width from total screen width...
move.w d6,blsp_bl_dest_y_inc ; ...and that's how much the blaster needs to add each line
;-- Destination Y increment
;------------------------------------------------------------
;-- Skewing and endmasks
tst.b blsp_skew_value ; if skew value is 0, that's one codepath...
bne .skewing ; if not, that's another.
; no skewing
move.w #-1,blsp_bl_endmask_0 ; left endmask
move.w #-1,blsp_bl_endmask_2 ; right endmask
bra .skewing_done
.skewing:
; Fetch endmasks from LUTs
clr.l d7
move.b blsp_skew_value,d7
add.l d7,d7
lea blsp_leftmasks,a6
move.w (a6,d7),blsp_bl_endmask_0
lea blsp_rightmasks,a6
move.w (a6,d7),blsp_bl_endmask_2
; ... and because we've skewed (effectively adding 16 pixels to the right), we need to adapt these values:
add.w #1,blsp_width_words_after_clip ; one more xcount
sub.w #SCREEN_BITPLANES*2,blsp_bl_dest_y_inc ; adjust dest y inc accordingly
sub.w #2,blsp_bl_source_y_inc ; source y inc
.skewing_done:
;-- Skewing and endmasks
;------------------------------------------------------------
;-- X clipping adjustments
tst.b bslp_clip_x_max_flag
beq.s .no_x_max_clip_adjustments
move.w #-1,blsp_bl_endmask_2 ; since we're clipping on the right, we want the corresponding endmask to show everything
.no_x_max_clip_adjustments:
tst.b bslp_clip_x_min_flag
beq.s .no_x_min_clip_adjustments
move.w #-1,blsp_bl_endmask_0 ; clipping on the left = leftmost endmask all 1's
tst.b blsp_skew_value
bne.s .no_skew_fix
add.w #1,blsp_width_words_after_clip
sub.w #SCREEN_BITPLANES*2,blsp_bl_dest_y_inc
bra.s .skew_fixing_done
.no_skew_fix:
move.b #1,blsp_bl_extra_source_read_flag
add.w #2,blsp_bl_source_y_inc
.skew_fixing_done:
cmp.w #1,blsp_width_words_after_clip
bne .nopers
; Width is just a single word, so endmask_2 --> endmask_0
move.w blsp_bl_endmask_2,blsp_bl_endmask_0
tst.b blsp_skew_value
beq .dont_change_src_y_inc
sub #2,blsp_bl_source_y_inc
sub.w #2,a0 ; At the time of writing (Oct 23rd, 2018), this
; works on my real STE (TOS 1.62), but not in
; Hatari 2.0.0 or STEem v3.7.2
.dont_change_src_y_inc:
.nopers:
.no_x_min_clip_adjustments:
tst.b blsp_bl_extra_source_read_flag
beq .no__extra_src_read
add.b #$c0,blsp_skew_value ; set NFSR and FXSR bits
.no__extra_src_read:
;-- X clipping adjustments
;------------------------------------------------------------
;-- Pointers to mask and sprite data
; a0 contains the address to the mask, let's save it
move.l a0,blsp_mask_pointer
move.w blsp_width_words_before_clip,d7 ; d7 now contains the width of the mask, in words
move.w blsp_height_pixels_before_clip,d6 ; d6 now contains the height in pixels
mulu.w d7,d6 ; should be LUT:ed, or why not part of the sprite struct
add.w d6,d6 ; we double d6 to convert a word-offset to bytes
add.w d6,a0 ; a0 now points to sprite data
move.l a0,blsp_sprite_pointer
ext.l d6
move.l d6,blsp_bitplane_offset ; save offset for sprites with multiple bitplanes
;-- Pointers to mask and sprite data
;------------------------------------------------------------
;-- Blast mask
; Clear blaster halftone RAM
move.l #$ffff8a00,a6
rept 16/2
clr.l (a6)+
endr
move.l a1,-(sp) ; save a1 for next bitplane
move.w blsp_sprite_bitplanes,d7
subq #1,d7
ext.l d7
.blast_mask_loop:
move.w blsp_bl_source_x_inc,$ffff8a20 ;source x inc
move.w blsp_bl_source_y_inc,$ffff8a22 ;source y inc
move.l blsp_mask_pointer,$ffff8a24 ;source address
move.w blsp_bl_endmask_0,$ffff8a28 ;endmask 0
move.w blsp_bl_endmask_1,$ffff8a2a ;endmask 1
move.w blsp_bl_endmask_2,$ffff8a2c ;endmask 2
move.w blsp_bl_dest_x_inc,$ffff8a2e ;dest x inc
move.w blsp_bl_dest_y_inc,$ffff8a30 ;dest y inc
move.l a1,$ffff8a32 ;destination address
move.w blsp_width_words_after_clip,$ffff8a36 ;x count (n words per line to copy)
move.w blsp_height_pixels_after_clip,$ffff8a38 ;y count (n lines to copy)
move.b blsp_skew_value,$ffff8a3d ; set skew
move.b #BLASTER_HOP_SOURCE,$ffff8a3a ; halftone operation
move.b #BLASTER_OP_SOURCE_AND_TARGET,$ffff8a3b ; operation
;move.b #BLASTER_OP_SOURCE,$ffff8a3b ; operation
move.b #BLASTER_COMMAND_START_HOG_MODE,$ffff8a3c ; start blaster
addq #2,a1 ; offset to next bitplane
dbra d7,.blast_mask_loop
move.l (sp)+,a1
;-- Blast mask
;------------------------------------------------------------
;-- Blast sprite
move.l blsp_bitplane_offset,d6
move.w blsp_sprite_bitplanes,d7
subq #1,d7
ext.l d7
.blast_sprite_loop:
move.w blsp_bl_source_x_inc,$ffff8a20 ;source x inc
move.w blsp_bl_source_y_inc,$ffff8a22 ;source y inc
move.l blsp_sprite_pointer,$ffff8a24 ;source address
move.w blsp_bl_endmask_0,$ffff8a28 ;endmask 0
move.w blsp_bl_endmask_1,$ffff8a2a ;endmask 1
move.w blsp_bl_endmask_2,$ffff8a2c ;endmask 2
move.w blsp_bl_dest_x_inc,$ffff8a2e ;dest x inc
move.w blsp_bl_dest_y_inc,$ffff8a30 ;dest y inc
move.l a1,$ffff8a32 ;destination address
move.w blsp_width_words_after_clip,$ffff8a36 ;x count (n words per line to copy)
move.w blsp_height_pixels_after_clip,$ffff8a38 ;y count (n lines to copy)
move.b blsp_skew_value,$ffff8a3d ; set skew
move.b #BLASTER_HOP_SOURCE,$ffff8a3a ; halftone operation
move.b #BLASTER_OP_SOURCE_OR_TARGET,$ffff8a3b ; operation
;move.b #BLASTER_OP_SOURCE_XOR_TARGET,$ffff8a3b ; operation
;move.b #BLASTER_OP_SOURCE,$ffff8a3b ; operation
move.b #BLASTER_COMMAND_START_HOG_MODE,$ffff8a3c ; start blaster
add.l d6,blsp_sprite_pointer ; to get to the next sprite bitplane
addq #2,a1 ; offset to next bitplane
dbra d7,.blast_sprite_loop
;-- Blast sprite
;------------------------------------------------------------
.early_out:
rts
;------------------------------------------------------------------------------
.bss
blsp_sprite_bitplanes: ds.w 1
; Address to mask
blsp_mask_pointer: ds.l 1
; Address to first bitplane in sprite
blsp_sprite_pointer: ds.l 1
; Offset to next bitplane in sprite
blsp_bitplane_offset: ds.l 1
; Width and height variables for clipping version
blsp_width_pixels_before_clip: ds.w 1
blsp_width_pixels_after_clip: ds.w 1
blsp_width_words_before_clip: ds.w 1
blsp_width_words_after_clip: ds.w 1
blsp_height_pixels_before_clip: ds.w 1
blsp_height_pixels_after_clip: ds.w 1
; Flags to determine if X clipping is going on
bslp_clip_x_min_flag: ds.b 1
bslp_clip_x_max_flag: ds.b 1
; Flag for extra source read
blsp_bl_extra_source_read_flag: ds.b 1
blsp_skew_value: ds.b 1
; Blaster shadow variables
blsp_bl_source_x_inc: ds.w 1
blsp_bl_source_y_inc: ds.w 1
blsp_bl_endmask_0: ds.w 1
blsp_bl_endmask_1: ds.w 1
blsp_bl_endmask_2: ds.w 1
blsp_bl_dest_x_inc: ds.w 1
blsp_bl_dest_y_inc: ds.w 1
.68000
;------------------------------------------------------------------------------
.data
bfly_palette:
include "bfly_160x164_4bpl.pal"
bfly_160:
dc.w 1 ; bitplanes
dc.w 160, 164 ; width, height
incbin "bflymask_160x164.1bp"
incbin "bfly_160x164.1bp"
blsp_leftmasks:
dc.w %1111111111111111
dc.w %0111111111111111
dc.w %0011111111111111
dc.w %0001111111111111
dc.w %0000111111111111
dc.w %0000011111111111
dc.w %0000001111111111
dc.w %0000000111111111
dc.w %0000000011111111
dc.w %0000000001111111
dc.w %0000000000111111
dc.w %0000000000011111
dc.w %0000000000001111
dc.w %0000000000000111
dc.w %0000000000000011
dc.w %0000000000000001
blsp_rightmasks:
dc.w %0000000000000000
dc.w %1000000000000000
dc.w %1100000000000000
dc.w %1110000000000000
dc.w %1111000000000000
dc.w %1111100000000000
dc.w %1111110000000000
dc.w %1111111000000000
dc.w %1111111100000000
dc.w %1111111110000000
dc.w %1111111111000000
dc.w %1111111111100000
dc.w %1111111111110000
dc.w %1111111111111000
dc.w %1111111111111100
dc.w %1111111111111110
.68000
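The two endmask tables in the listing follow a simple pattern: the left mask for a given skew is $ffff shifted right by that many bits, and the right mask is its exact complement. As a sketch, the table lookups in the ".skewing" block could therefore be swapped for a few instructions like these. This is an alternative of mine, not what the routine above does, and it assumes blsp_skew_value still holds a plain 0-15 value at this point (which it does in the ".skewing" block, before any NFSR/FXSR bits are added).

```
; Alternative to the two table lookups; trashes d6 and d7
    moveq #0,d7
    move.b blsp_skew_value,d7 ; skew, 0-15
    moveq #-1,d6 ; d6.w = %1111111111111111
    lsr.w d7,d6 ; shift in "skew" zeroes from the left
    move.w d6,blsp_bl_endmask_0 ; same value as entry "skew" of blsp_leftmasks
    not.w d6 ; the right mask is the complement of the left one
    move.w d6,blsp_bl_endmask_2 ; same value as entry "skew" of blsp_rightmasks
```

For a skew of 3, for example, this gives %0001111111111111 and %1110000000000000, matching the fourth entry of each table.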
Limitations of this code
This code always masks and draws the same number of bitplanes. It is trivial to change it to, for example, mask four bitplanes but draw only two.
Sprites so wide that they clip both on the left and the right display incorrectly. It should be relatively easy to fix, if this is a problem (Y clipping does not have this issue).
X clipping is currently only allowed on 16-pixel boundaries (to keep things relatively simple). There’s no reason this code couldn’t be expanded to allow for any X clipping boundaries; it’s mainly a lot of extra endmask work.
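For the first limitation, a sketch of the change could be as small as the loop counter in the "Blast mask" block: count over all screen bitplanes instead of the sprite's bitplanes, and leave the "Blast sprite" loop as it is. The fragment below only shows the lines around the change and is not a complete routine; the elided part is the unchanged register setup from the listing.

```
; "Blast mask" block, masking every bitplane of the screen
    move.l a1,-(sp) ; save a1 for the sprite pass
    moveq #SCREEN_BITPLANES-1,d7 ; was: move.w blsp_sprite_bitplanes,d7 / subq #1,d7
.blast_mask_loop:
    ; ... same blaster register setup and start as in the listing ...
    addq #2,a1 ; offset to next bitplane
    dbra d7,.blast_mask_loop
    move.l (sp)+,a1
```

With SCREEN_BITPLANES set to 4 and a two-bitplane sprite, this would punch the hole in all four planes and then draw into the first two.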
Thanks
Awesome butterfly graphics by Acca
Teaching-me-everything-I-know and Smacking-me-upside-my-head-when-I-deserve-it by ggn/küa