Lab 1 - Calculate performance & memory usage
Bitmap Code
lda #$00 ; set a pointer in memory location $40 to point to $0200
sta $40 ; ... low byte ($00) goes in address $40
lda #$02
sta $41 ; ... high byte ($02) goes into address $41
lda #$07 ; colour number
ldy #$00 ; set index to 0
loop: sta ($40),y ; set pixel colour at the address (pointer)+Y
iny ; increment index
bne loop ; continue until done the page (256 pixels)
inc $41 ; increment the page
ldx $41 ; get the current page number
cpx #$06 ; compare with 6
bne loop ; continue until done all pages
Calculate Performance
Alt Cycles and Alt Count cover the branch-not-taken case (2 cycles instead of 3).
     Instruction          Cycles  Count  Alt Cycles  Alt Count  Total
 1.  lda #$00                  2      1                             2
 2.  sta $40                   3      1                             3
 3.  lda #$02                  2      1                             2
 4.  sta $41                   3      1                             3
 5.  lda #$07                  2      1                             2
 6.  ldy #$00                  2      1                             2
 7.  loop: sta ($40),y         6   1024                          6144
 8.  iny                       2   1024                          2048
 9.  bne loop                  3   1020           2          4   3068
10.  inc $41                   5      4                            20
11.  ldx $41                   3      4                            12
12.  cpx #$06                  2      4                             8
13.  bne loop                  3      3           2          1     11
=======================================================================
Total Cycles = 2 + 3 + 2 + 3 + 2 + 2 + 6144 + 2048 + 3068 + 20 + 12 + 8 + 11 = 11325
CPU Speed: 1 MHz
µs per clock: 1
Time: 11325 µs
11.325 ms
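The total can be cross-checked by summing cycles times count for every row. The Python sketch below does that. It is a check under stated assumptions, not a measurement: standard 6502 timings (6 cycles for sta ($40),y, 3 for a zero-page ldx, 3 or 2 for a taken or not-taken branch) and no page-crossing penalty on the branches. The names ORIGINAL, total_cycles and run_time_us are made up for this example.

```python
# Each entry: (instruction, [(cycles, times executed), ...]).
# Branches list the taken case first, then the not-taken case.
# Assumption: standard 6502 timings, branches never cross a page boundary.
ORIGINAL = [
    ("lda #$00",    [(2, 1)]),
    ("sta $40",     [(3, 1)]),
    ("lda #$02",    [(2, 1)]),
    ("sta $41",     [(3, 1)]),
    ("lda #$07",    [(2, 1)]),
    ("ldy #$00",    [(2, 1)]),
    ("sta ($40),y", [(6, 1024)]),
    ("iny",         [(2, 1024)]),
    ("bne loop",    [(3, 1020), (2, 4)]),  # inner branch
    ("inc $41",     [(5, 4)]),
    ("ldx $41",     [(3, 4)]),             # zero-page load: 3 cycles
    ("cpx #$06",    [(2, 4)]),
    ("bne loop",    [(3, 3), (2, 1)]),     # outer branch
]


def total_cycles(rows):
    """Sum cycles * count over every case of every instruction."""
    return sum(cycles * count for _, cases in rows for cycles, count in cases)


def run_time_us(cycles, clock_hz=1_000_000):
    """Execution time in microseconds at the given clock rate."""
    return cycles * 1_000_000 / clock_hz


if __name__ == "__main__":
    cycles = total_cycles(ORIGINAL)
    print(cycles, "cycles,", run_time_us(cycles) / 1000, "ms at 1 MHz")
```

With these counts the sum comes to 11325 cycles, which is 11.325 ms at 1 MHz.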
Modified code to decrease time
lda #$07 ; colour number
ldy #$00 ; set index to 0
loop: sta $0200,y ; set pixel colour at $0200+Y (page 1)
sta $0300,y ; ... page 2
sta $0400,y ; ... page 3
sta $0500,y ; ... page 4
iny ; increment index
bne loop ; continue until Y wraps to 0 (256 iterations)
    Instruction          Cycles  Count  Alt Cycles  Alt Count  Total
1.  lda #$07                  2      1                             2
2.  ldy #$00                  2      1                             2
3.  loop: sta $0200,y         5    256                          1280
4.  sta $0300,y               5    256                          1280
5.  sta $0400,y               5    256                          1280
6.  sta $0500,y               5    256                          1280
7.  iny                       2    256                           512
8.  bne loop                  3    255           2          1    767
=======================================================================
Total Cycles = 2 + 2 + 4 x 1280 + 512 + 767 = 6403
CPU Speed: 1 MHz
µs per clock: 1
Time: 6403 µs
6.403 ms
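Instead of multiplying out the table, the same figure can be reached by stepping through the loop and adding up cycles as they occur. The Python sketch below is a rough model, assuming standard 6502 timings (5 cycles for each sta absolute,Y, 2 for iny, 3 or 2 for a taken or not-taken bne) and no page-crossing penalty on the branch. The function name modified_fill_cycles is hypothetical.

```python
def modified_fill_cycles():
    """Step through the modified fill loop and count cycles as they occur."""
    cycles = 2 + 2                  # lda #$07, ldy #$00
    iterations = 0
    y = 0
    while True:
        cycles += 4 * 5             # sta $0200,y .. sta $0500,y
        y = (y + 1) & 0xFF          # iny: Y is 8 bits and wraps to 0 after 255
        cycles += 2
        iterations += 1
        if y != 0:
            cycles += 3             # bne taken: go round again
        else:
            cycles += 2             # bne not taken: loop ends
            break
    return cycles, iterations


if __name__ == "__main__":
    cycles, iterations = modified_fill_cycles()
    print(iterations, "iterations,", cycles, "cycles")
```

The model runs the loop body 256 times and arrives at 6403 cycles, matching the total computed from the table.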
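Instead of storing through a pointer, I used absolute,Y addressing to write the colour directly to each of the four pages. A single pass of the 256-iteration loop now fills all four pages, so the loop overhead (iny and bne) runs 256 times instead of 1024, and the pointer set-up and page-increment code is no longer needed.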
- Memory Usage
lda #$00 : 2 bytes
sta $40 : 2 bytes
lda #$02 : 2 bytes
sta $41 : 2 bytes
lda #$07 : 2 bytes
ldy #$00 : 2 bytes
loop: sta ($40),y : 2 bytes
iny : 1 byte
bne loop : 2 bytes
inc $41 : 2 bytes
ldx $41 : 2 bytes
cpx #$06 : 2 bytes
bne loop : 2 bytes
Total code size: 25 bytes
- Pointer/Variables: the pointer is stored at addresses $40 and $41 (2 bytes). The X and Y registers hold 1 byte each.
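The same tally can be made for the modified version, which the list above does not cover. The Python sketch below adds up instruction sizes for both versions. The sizes for the modified code are not from the text; they are filled in here from standard 6502 encodings (an absolute,Y store is 3 bytes), and the names ORIGINAL_SIZES, MODIFIED_SIZES and code_size are invented for the example.

```python
# (instruction, size in bytes) for the original pointer-based fill.
ORIGINAL_SIZES = [
    ("lda #$00", 2), ("sta $40", 2), ("lda #$02", 2), ("sta $41", 2),
    ("lda #$07", 2), ("ldy #$00", 2), ("sta ($40),y", 2), ("iny", 1),
    ("bne loop", 2), ("inc $41", 2), ("ldx $41", 2), ("cpx #$06", 2),
    ("bne loop", 2),
]

# Assumed sizes for the modified version (absolute,Y stores are 3 bytes each).
MODIFIED_SIZES = [
    ("lda #$07", 2), ("ldy #$00", 2),
    ("sta $0200,y", 3), ("sta $0300,y", 3),
    ("sta $0400,y", 3), ("sta $0500,y", 3),
    ("iny", 1), ("bne loop", 2),
]


def code_size(listing):
    """Total number of bytes of machine code in a listing."""
    return sum(size for _, size in listing)


if __name__ == "__main__":
    print("original:", code_size(ORIGINAL_SIZES), "bytes of code + 2 bytes for the pointer")
    print("modified:", code_size(MODIFIED_SIZES), "bytes of code, no pointer")
```

Under these assumptions the modified version is smaller as well as faster, because it drops the pointer set-up and the page-increment logic.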
Modifying Code
lda #$00 ; set a pointer in memory location $40 to point to $0200
sta $40 ; ... low byte ($00) goes in address $40
lda #$02
sta $41 ; ... high byte ($02) goes into address $41
ldx #$00 ; refresh page counter
set_color:
cpx #$00
beq color_page_1
cpx #$01
beq color_page_2
cpx #$02
beq color_page_3
cpx #$03
beq color_page_4
color_page_1:
lda #$07 ; yellow color for page 1
jmp fill_page
color_page_2:
lda #$02 ; red color for page 2
jmp fill_page
color_page_3:
lda #$05 ; green color for page 3
jmp fill_page
color_page_4:
lda #$06 ; blue color for page 4
jmp fill_page
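fill_page:
ldy #$00 ; set index to 0
; NOTE: the listing stops at the line above; the lines below are an assumed completion
loop: sta ($40),y ; set pixel colour at the address (pointer)+Y
iny ; increment index
bne loop ; continue until done the page (256 pixels)
inc $41 ; move the pointer to the next page
inx ; advance the page counter
cpx #$04 ; all four pages done?
bne set_color ; if not, pick the next colour and fill the next page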
Experiments (Optional, Recommended)
- Add this instruction after the loop: label and before the sta ($40),y instruction: tya
What visual effect does this cause, and how many colours are on the screen? Why?
-> It creates vertical multi-colour stripes. tya copies the Y register into the A register, so the colour written is no longer the constant yellow ($07): it follows Y as Y counts from 0 to 255. The display uses only the low 4 bits of each byte as the colour, so 16 colours appear, and the same colour lands in the same column on every row.
- Add this instruction after the tya: lsr
What visual effect does this cause, and how many colours are on the screen? Why?
-> Each stripe becomes twice as wide, and all 16 colours still appear. lsr shifts the A register one bit to the right, which has the same effect as dividing A by 2. When Y increases from 0 to 1, A stays at 0; when Y increases from 2 to 3, A stays at 1. Each colour therefore covers two adjacent pixels.
- Repeat the above tests with two, three, four, and five lsr instructions in a row. Describe and explain the effect in each case.
-> Two lsr in a row: each vertical stripe is wider than before, and horizontal bands appear within the stripes. The bands repeat in a cycle of 2 rows, because A is now Y divided by 4.
Three lsr in a row: the same pattern with wider vertical stripes. The horizontal bands repeat in a cycle of 4 rows, because A is Y divided by 8.
Four lsr in a row: the same pattern again with still wider vertical stripes. The horizontal bands repeat in a cycle of 8 rows, because A is Y divided by 16.
- Repeat the tests using asl instructions instead of lsr instructions. Describe and explain the effect in each case.
-> Two asl in a row: vertical stripes in a repeating cycle of 4 colours.
Three asl in a row: vertical stripes in a repeating cycle of 2 colours.
Four asl in a row: the display is entirely black.
asl shifts A one bit to the left, which has the same effect as multiplying by 2. With two asl in a row, A increases by 4 each time Y increases by 1. Each additional asl makes A climb faster, so the colour repeats after fewer pixels and fewer distinct colours appear. After four asl the low 4 bits of A are always zero, which is black.
- The original code includes one iny instruction. Test with one to five consecutive iny instructions. Describe and explain the effect in each case. Note: it is helpful to place the Speed slider on its lowest setting (left) for these experiments.
-> One iny: the display fills with yellow. Y increases by 1 per pass.
Two iny: black vertical stripes appear. Y increases by 2, so the odd-numbered pixels are never written.
Three iny: the display fills with yellow, but more slowly than with one iny. Y increases by 3; because 3 does not divide 256, Y wraps around and reaches every value before it returns to 0.
Four iny: black vertical stripes appear again, thicker than with two iny. Y increases by 4, so only every fourth pixel is written.
Five iny: the display fills with yellow, but more slowly than with three iny. Y increases by 5.
- Make each pixel a random colour. (Hint: use the pseudo-random number generator mentioned on the 6502 Emulator page).
lda #$00
sta $40 ; low byte ($00) goes in address $40
lda #$02
sta $41 ; high byte ($02) goes into address $41
ldy #$00 ; set index to 0
loop: jsr RandomColor ; get a random colour value (between 0 and 255)
sta ($40),y ; set pixel colour at the address (pointer)+Y
iny ; increment index
bne loop ; continue until done the page (256 pixels)
inc $41 ; increment the page
ldx $41 ; get the current page number
cpx #$06 ; compare with 6 (the first page past the display)
bne loop ; continue until done all four pages
brk ; stop here so execution does not fall through into the subroutine
RandomColor:
lda $fe ; read the pseudo-random number generator
rts ; return from subroutine
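The shift and iny experiments can be checked away from the emulator with a small model of the loop. The Python sketch below is illustrative only. It assumes the display takes its colour from the low 4 bits of the stored byte and that A is loaded from Y by tya before the shifts; the function names colours_after_shifts and offsets_written are hypothetical.

```python
def colours_after_shifts(op, count):
    """Colour index (low 4 bits of A) for each Y in 0..255.

    Models tya followed by `count` copies of `op` ("lsr" or "asl").
    Assumption: the display uses only the low 4 bits of the stored byte.
    """
    colours = []
    for y in range(256):
        a = y                          # tya: A takes the value of Y
        for _ in range(count):
            if op == "lsr":
                a >>= 1                # shift right: divide by 2
            else:
                a = (a << 1) & 0xFF    # shift left: multiply by 2, keep 8 bits
        colours.append(a & 0x0F)
    return colours


def offsets_written(iny_count):
    """Offsets within one page that are stored to with `iny_count` iny instructions."""
    written, y = set(), 0
    while True:
        written.add(y)                 # sta ($40),y
        y = (y + iny_count) & 0xFF     # Y is 8 bits and wraps past 255
        if y == 0:                     # bne loop falls through when Y is 0
            return written


if __name__ == "__main__":
    for op in ("lsr", "asl"):
        for n in range(6):
            distinct = len(set(colours_after_shifts(op, n)))
            print(op, "x", n, "->", distinct, "distinct colours")
    for n in range(1, 6):
        print(n, "iny ->", len(offsets_written(n)), "of 256 pixels written per page")
```

In this model an even number of iny instructions leaves gaps (128 pixels written with two, 64 with four), while an odd number eventually reaches all 256 offsets, which is consistent with the stripes and full fills described above.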
Challenges (Optional, Recommended)
- Set all of the display pixels to the same colour, except for the middle four pixels, which will be drawn in another colour.
- Write a program which draws lines around the edge of the display:
- A red line across the top
- A green line across the bottom
- A blue line across the right side.
- A purple line across the left side.
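Both challenges come down to working out which addresses to store to. The Python sketch below computes them as a planning aid before writing the assembly. It assumes the display is 32 pixels wide and 32 rows high, stored row by row from $0200 (consistent with the four 256-byte pages filled above); BASE, WIDTH, HEIGHT and the function names are invented for this example.

```python
BASE = 0x0200    # first display address, as used by the fill code
WIDTH = 32       # assumed pixels per row
HEIGHT = 32      # assumed number of rows (4 pages * 256 bytes / 32)


def pixel_address(row, col):
    """Address of the pixel at (row, col), counting both from 0 at the top left."""
    return BASE + row * WIDTH + col


def middle_four():
    """Addresses of the 2x2 block in the centre of the display."""
    rows = (HEIGHT // 2 - 1, HEIGHT // 2)
    cols = (WIDTH // 2 - 1, WIDTH // 2)
    return [pixel_address(r, c) for r in rows for c in cols]


def edges():
    """Addresses of the four border lines, keyed by side."""
    return {
        "top":    [pixel_address(0, c) for c in range(WIDTH)],
        "bottom": [pixel_address(HEIGHT - 1, c) for c in range(WIDTH)],
        "left":   [pixel_address(r, 0) for r in range(HEIGHT)],
        "right":  [pixel_address(r, WIDTH - 1) for r in range(HEIGHT)],
    }


if __name__ == "__main__":
    print("middle:", [hex(a) for a in middle_four()])
    for side, addresses in edges().items():
        print(side, hex(addresses[0]), "to", hex(addresses[-1]))
```

Under these assumptions the top and bottom lines are contiguous runs of 32 bytes, while the left and right lines step by 32 bytes per row, so in assembly they need the pointer advanced by 32 each time and not a simple iny.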
Overcoming Challenges
Understanding 6502 Assembly: The 6502 processor, with its unique instruction set and memory addressing modes, was a significant departure from the high-level languages I was more familiar with. It took time and dedication to grasp the fundamentals of 6502 assembly and how to leverage its capabilities effectively.
Measuring Execution Time: Determining the exact execution time of my code was a crucial aspect of the optimization process. Figuring out the right techniques, such as using hardware-based timers or cycle-accurate emulators, was a learning experience in itself. It required a deep understanding of the underlying hardware and the ability to interpret the performance data accurately.
Optimizing Loop Structures: Reducing the number of loop iterations was a key strategy in my efforts to decrease the overall execution time. This involved analyzing the algorithm, identifying opportunities for optimization, and implementing more efficient loop structures. It was an iterative process of testing, measuring, and refining the code to achieve the desired performance improvements.
Lessons Learned
Through this experience, I gained valuable insights that will undoubtedly benefit me in future projects:
Embracing Low-Level Optimization: Delving into the world of assembly programming has given me a deeper appreciation for the importance of low-level optimization. Understanding the hardware-level details and being able to fine-tune the code at the assembly level can lead to significant performance gains, especially in time-critical applications.
Importance of Measurement and Profiling: Accurately measuring and profiling the execution time of my code was crucial for identifying bottlenecks and validating the effectiveness of my optimization efforts. Developing the skills to use the right tools and techniques for performance analysis has become an essential part of my problem-solving toolkit.
Iterative Optimization Approach: The process of optimizing the code, measuring the results, and then refining the approach based on the findings was a valuable lesson. It taught me the importance of an iterative, data-driven approach to optimization, where each step builds upon the insights gained from the previous iterations.