The correct increment of the instruction address should be performed as follows:
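address=*+1
        lda self_modifying_address ; the 16-bit operand of this lda is the pointer
        inc address+0              ; increment the low byte of the operand
        bne *+5                    ; no wrap: skip the 3-byte inc below
        inc address+1              ; wrap: carry into the high byte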
This probably negates all the memory savings of the self-modifying code.
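For comparison, here is a sketch of the conventional zero-page pointer version. It assumes a hypothetical free two-byte zero-page location `zp_ptr` (placed at $fb purely for illustration) and that Y already holds 0. By my count it takes 8 bytes of code, against 11 for the self-modifying form above, not counting pointer setup in either case:

```
zp_ptr = $fb                    ; assumption: two free zero-page bytes
        lda (zp_ptr),y          ; 2 bytes, fetch through the pointer (Y assumed 0)
        inc zp_ptr              ; 2 bytes, increment the low byte
        bne *+4                 ; 2 bytes, no wrap: skip the 2-byte inc below
        inc zp_ptr+1            ; 2 bytes, wrap: carry into the high byte
```

The zero-page form does tie up Y and two bytes of zero page, so which one wins depends on what else the routine needs.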
I propose a different approach: self-modify an instruction only where an absolute address actually has to change, and keep the working variables in the instruction operands themselves.
.loop
fetch_ptr=*+1
        ldx #0                  ; operand is the table index (fetch_ptr)
        lda filler_bytes,x      ; have two tables: the first contains only filler bytes,
        ldy repeat_bytes,x      ; the second only repeat counts
        beq .exit               ; a repeat count of 0 ends the data
        inc fetch_ptr           ; this way you save 1 increment
fill_ptr=*+1
        ldx #0                  ; operand is the current screen offset (fill_ptr)
.fill
        sta SCREEN_RAM,x
        inx
        bne +
        inc .fill+2             ; only self-modify the high byte of the address in the instruction
+       dey
        bne .fill
        stx fill_ptr            ; remember the offset for the next run
        jmp .loop
.exit
        rts
filler_bytes !byte 1,2,3,4,5,4,3,2,1
repeat_bytes !byte 4,4,5,5,6,6,5,5,4,0
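Because the state lives in the instruction operands, the routine above only works once as written: after it returns, `fetch_ptr` still indexes the terminating 0, so a second call exits immediately. A minimal sketch of a reset follows. The label `.reset` is my own hypothetical name, and the code is assumed to sit in the same zone as `.fill` so that the local label resolves:

```
.reset
        lda #0
        sta fetch_ptr           ; start again at table entry 0
        sta fill_ptr            ; start again at screen offset 0
        lda #>SCREEN_RAM
        sta .fill+2             ; undo any increments of the high byte
        rts
```

Call it before entering `.loop`, or place it directly ahead of `.loop` and drop the `rts` so it falls through.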