Lab 1 - Calculate performance & memory usage

Bitmap Code

        lda #$00        ; set a pointer in memory location $40 to point to $0200
        sta $40         ; ... low byte ($00) goes in address $40
        lda #$02
        sta $41         ; ... high byte ($02) goes into address $41
        lda #$07        ; colour number
        ldy #$00        ; set index to 0
  loop: sta ($40),y     ; set pixel colour at the address (pointer)+Y
        iny             ; increment index
        bne loop        ; continue until done the page (256 pixels)
        inc $41         ; increment the page
        ldx $41         ; get the current page number
        cpx #$06        ; compare with 6
        bne loop        ; continue until done all pages


Calculate Performance

- Time Performance

     Instruction          Cycles   Count   Alt Cycles   Alt Count   Total
  1. lda #$00                2        1                                 2
  2. sta $40                 3        1                                 3
  3. lda #$02                2        1                                 2
  4. sta $41                 3        1                                 3
  5. lda #$07                2        1                                 2
  6. ldy #$00                2        1                                 2
  7. loop: sta ($40),y       6     1024                              6144
  8. iny                     2     1024                              2048
  9. bne loop                3     1020        2            4        3068
 10. inc $41                 5        4                                20
 11. ldx $41                 3        4                                12
 12. cpx #$06                2        4                                 8
 13. bne loop                3        3        2            1          11
=======================================================================

The Alt columns cover the branches: bne takes 3 cycles when it is taken and 2 cycles when it falls through.

Total Cycles = 2 + 3 + 2 + 3 + 2 + 2 + 6144 + 2048 + 3068 + 20 + 12 + 8 + 11 = 11325

CPU Speed: 1 MHz

µs per clock: 1

Time: 11325 µs

          11.325 ms
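The table can be cross-checked with a short tally. The Python sketch below is not part of the lab itself. It assumes the standard 6502 timings (including 3 cycles for ldx from a zero-page address) and that no branch crosses a page boundary; the names ROWS, total_cycles and microseconds are made up for this illustration.

```python
# Cycle tally for the original pointer-based fill loop.
# Assumptions: standard 6502 timings, no branch crosses a page boundary
# (so bne costs 3 cycles when taken and 2 when it falls through).

PAGES = 4               # pages $02..$05
PIXELS = PAGES * 256    # 1024 stores in total

# (instruction, cycles per execution, number of executions)
ROWS = [
    ("lda #$00", 2, 1),
    ("sta $40", 3, 1),
    ("lda #$02", 2, 1),
    ("sta $41", 3, 1),
    ("lda #$07", 2, 1),
    ("ldy #$00", 2, 1),
    ("sta ($40),y", 6, PIXELS),
    ("iny", 2, PIXELS),
    ("inner bne, taken", 3, PIXELS - PAGES),
    ("inner bne, not taken", 2, PAGES),
    ("inc $41", 5, PAGES),
    ("ldx $41", 3, PAGES),
    ("cpx #$06", 2, PAGES),
    ("outer bne, taken", 3, PAGES - 1),
    ("outer bne, not taken", 2, 1),
]


def total_cycles(rows):
    """Sum cycles * executions over every row of the table."""
    return sum(cycles * count for _, cycles, count in rows)


def microseconds(cycles, clock_hz=1_000_000):
    """Convert a cycle count to microseconds at the given clock speed."""
    return cycles * 1_000_000 / clock_hz


if __name__ == "__main__":
    total = total_cycles(ROWS)
    print(total, "cycles =", microseconds(total) / 1000, "ms at 1 MHz")
```

Keeping the taken and not-taken branches as separate rows makes the Alt columns of the table explicit, which is where hand tallies most often slip.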

Modified code to decrease time

        lda #$07        ; colour number
        ldy #$00        ; set index to 0
  loop: sta $0200,y     ; set pixel colour at address $0200+Y (first page)
        sta $0300,y     ; ... and at the same offset in the other three pages
        sta $0400,y
        sta $0500,y
        iny             ; increment index
        bne loop        ; continue until all 256 offsets are done

- Time Performance

     Instruction          Cycles   Count   Alt Cycles   Alt Count   Total
  1. lda #$07                2        1                                 2
  2. ldy #$00                2        1                                 2
  3. loop: sta $0200,y       5      256                              1280
  4. sta $0300,y             5      256                              1280
  5. sta $0400,y             5      256                              1280
  6. sta $0500,y             5      256                              1280
  7. iny                     2      256                               512
  8. bne loop                3      255        2            1         767
=======================================================================

Total Cycles = 2 + 2 + 4 x 1280 + 512 + 767 = 6403

CPU Speed: 1 MHz

µs per clock: 1

Time: 6403 µs

          6.403 ms


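Instead of storing through the pointer in $40/$41, I used absolute indexed addressing (sta $0200,y and so on) to write the colour data directly. I also unrolled the loop so that a single pass of 256 iterations fills all four pages, instead of running the 256-iteration loop four times as before.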
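The saving comes from two places: sta with absolute,Y addressing is one cycle cheaper than sta through a zero-page pointer, and the iny/bne overhead is shared by four stores. The Python sketch below works this out under the standard 6502 timings and the assumption that the branch does not cross a page boundary; the function and constant names are made up for this illustration.

```python
# Cost of the unrolled fill loop, assuming standard 6502 timings.


def loop_branch_cycles(iterations):
    """bne closing a loop: taken (3 cycles) every time except the last (2)."""
    return 3 * (iterations - 1) + 2


def unrolled_total():
    """Total cycles for the version with four sta abs,y per pass."""
    setup = 2 + 2            # lda #$07, ldy #$00
    stores = 4 * 5 * 256     # four 5-cycle stores on each of 256 passes
    index = 2 * 256          # iny once per pass
    return setup + stores + index + loop_branch_cycles(256)


# Steady-state cost per pixel, ignoring setup and the cheaper final branch.
POINTER_LOOP_PER_PIXEL = 6 + 2 + 3          # sta ($40),y + iny + bne
UNROLLED_PER_PIXEL = (4 * 5 + 2 + 3) / 4    # overhead shared by 4 stores

if __name__ == "__main__":
    print("unrolled total:", unrolled_total(), "cycles")
    print("per pixel:", POINTER_LOOP_PER_PIXEL, "vs", UNROLLED_PER_PIXEL)
```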

- Memory Usage

  1. lda #$00 : 2 bytes
  2. sta $40 : 2 bytes
  3. lda #$02 : 2 bytes
  4. sta $41 : 2 bytes
  5. lda #$07 : 2 bytes
  6. ldy #$00 : 2 bytes
  7. loop: sta ($40),y : 2 bytes
  8. iny : 1 byte
  9. bne loop : 2 bytes
  10. inc $41 : 2 bytes
  11. ldx $41 : 2 bytes
  12. cpx #$06 : 2 bytes
  13. bne loop : 2 bytes
  14. Pointer/Variables : the pointer is stored in addresses $40 and $41 = 2 bytes. The X and Y registers are inside the CPU, so they do not take up any memory.
======================================================
Total Program Code Usage = 2 + 2 + 2 + 2 + 2 + 2 + 2 + 1 + 2 + 2 + 2 + 2 + 2 = 25 bytes
Total Pointer/Variable Usage = 2 bytes
Total Memory Usage = 25 + 2 = 27 bytes
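On the 6502 the size of an instruction depends only on its addressing mode, so the byte count can be checked mechanically. The Python sketch below is an illustration only: the mode labels and names are made up here, and it assumes that the zero-page pointer counts as data while CPU registers do not count as memory.

```python
# Byte tally for the original program, grouped by addressing mode.

BYTES_BY_MODE = {
    "immediate": 2,       # opcode + 8-bit operand
    "zero page": 2,       # opcode + 8-bit address
    "(zero page),y": 2,   # opcode + 8-bit pointer address
    "implied": 1,         # opcode only
    "relative": 2,        # opcode + signed 8-bit branch offset
}

PROGRAM = [
    ("lda #$00", "immediate"),
    ("sta $40", "zero page"),
    ("lda #$02", "immediate"),
    ("sta $41", "zero page"),
    ("lda #$07", "immediate"),
    ("ldy #$00", "immediate"),
    ("sta ($40),y", "(zero page),y"),
    ("iny", "implied"),
    ("bne loop", "relative"),
    ("inc $41", "zero page"),
    ("ldx $41", "zero page"),
    ("cpx #$06", "immediate"),
    ("bne loop", "relative"),
]

POINTER_BYTES = 2   # $40 and $41


def code_bytes(program):
    """Total size of the assembled instructions in bytes."""
    return sum(BYTES_BY_MODE[mode] for _, mode in program)


if __name__ == "__main__":
    code = code_bytes(PROGRAM)
    print("code:", code, "bytes, total:", code + POINTER_BYTES, "bytes")
```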

Modifying Code

To fill the display with blue, I changed line 5 from lda #$07 to lda #$06.

Change the code to fill the display with a different colour on each page

        lda #$00        ; set a pointer in memory location $40 to point to $0200
        sta $40         ; ... low byte ($00) goes in address $40
        lda #$02
        sta $41         ; ... high byte ($02) goes into address $41
        ldx #$00        ; initialise the page counter

set_color:
        cpx #$00
        beq color_page_1
        cpx #$01
        beq color_page_2
        cpx #$02
        beq color_page_3
        cpx #$03
        beq color_page_4

color_page_1:
        lda #$07        ; yellow color for page 1
        jmp fill_page

color_page_2:
        lda #$02        ; red color for page 2
        jmp fill_page

color_page_3:
        lda #$05        ; green color for page 3
        jmp fill_page

color_page_4:
        lda #$06        ; blue color for page 4
        jmp fill_page

fill_page:
        ldy #$00        ; set index to 0

loop:
        sta ($40),y     ; set pixel color at the address (pointer) + Y
        iny             ; increment index
        bne loop        ; continue until done the page (256 pixels)
        inc $41         ; increment the page
        inx             ; increment the page counter
        cpx #$04        ; compare with 4 (4 pages)
        bne set_color   ; continue until done all pages

  1. Add this instruction after the loop: label and before the sta ($40),y instruction: tya
    What visual effect does this cause, and how many colours are on the screen? Why?
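    -> It draws vertical multi-colour stripes, with 16 colours on the screen. 'tya' copies the Y register into the A register, so the yellow value that was in A is overwritten on every pass through the loop. As Y counts from 0 to 255, the stored value changes with every pixel. Only the low 4 bits of the value select the colour, so the colours repeat every 16 pixels. Each row is 32 pixels wide, so every row shows the same sequence and the colours line up as vertical stripes.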

  2. Add this instruction after the tya: lsr
    What visual effect does this cause, and how many colours are on the screen? Why?
    -> Each stripe becomes twice as wide, and there are still 16 colours. 'lsr' shifts the value in the A register one bit to the right, which has the same effect as dividing it by 2. When Y increases from 0 to 1, A stays at 0. When Y increases from 2 to 3, A stays at 1. Each colour therefore covers two neighbouring pixels, which makes the stripes wider.

  3. Repeat the above tests with two, three, four, and five lsr instructions in a row. Describe and explain the effect in each case.
    -> two 'lsr' in a row: Each vertical stripe is wider than before (4 pixels), and horizontal bands appear because one row no longer holds all 16 colours. The bands change colour in cycles of 2 rows, because the value in A is Y divided by 4.
        three 'lsr' in a row: The same pattern as with two 'lsr', but each vertical stripe is wider again (8 pixels). The bands change colour in cycles of 4 rows, because the value in A is Y divided by 8.
        four 'lsr' in a row: The same pattern as with three 'lsr', but each vertical stripe is wider again (16 pixels). The bands change colour in cycles of 8 rows, because the value in A is Y divided by 16.

  4. Repeat the tests using asl instructions instead of lsr instructions. Describe and explain the effect in each case.
    -> two 'asl' in a row: Vertical stripes that repeat in a cycle of 4 colours.
        three 'asl' in a row: Vertical stripes that repeat in a cycle of 2 colours.
        four 'asl' in a row: The whole display is black.
        'asl' shifts the value one bit to the left, which has the same effect as multiplying it by 2. With two 'asl' in a row, the value in A increases by 4 each time Y increases by 1. Each extra 'asl' makes A increase faster, so fewer distinct values fit into the low 4 bits that select the colour and the colour cycle gets shorter. After four 'asl' the low 4 bits are always 0, which is black.

  5. The original code includes one iny instruction. Test with one to five consecutive iny instructions. Describe and explain the effect in each case. Note: it is helpful to place the Speed slider on its lowest setting (left) for these experiments.
    -> one iny: Fills the display with yellow. Y increases by 1 on each pass.
         two iny: Draws black vertical stripes. Y increases by 2, so only every second pixel is coloured.
         three iny: Fills the display with yellow, but more slowly than with one iny. Y increases by 3, and because 256 is not a multiple of 3, Y wraps around past 0 several times before it lands exactly on 0 and the loop ends. By then every pixel in the page has been written.
         four iny: Draws black vertical stripes that are thicker than with two iny. Y increases by 4, so only every fourth pixel is coloured.
         five iny: Fills the display with yellow, but more slowly than with three iny. Y increases by 5 and wraps around in the same way.

  6. Make each pixel a random colour. (Hint: use the pseudo-random number generator mentioned on the 6502 Emulator page).
    lda #$00
    sta $40         ; Low byte ($00) goes in address $40
    lda #$02
    sta $41         ; High byte ($02) goes into address $41
    ldy #$00        ; Set index to 0
    loop:
        jsr RandomColor ; Get a random colour value (between 0 and 255) in A
        sta ($40),y  ; Set pixel colour at the address (pointer)+Y
        iny

        ; Check if we've filled the entire page (256 pixels)
        bne loop     ; Continue loop if not done the page
        ; If done the page, increment the page number
        inc $41      ; Increment the page
        ldx $41      ; Get the current page number
        ; Check if we've filled all pages (4 pages, $02 to $05)
        cpx #$06     ; Compare with 6
        bne loop     ; Continue loop if not done all pages
        brk          ; Stop here so execution does not fall through into RandomColor

    RandomColor:
        ; Read the emulator's pseudo-random number generator
        lda $fe      ; Load a random byte from address $fe
        rts          ; Return from subroutine
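The patterns described in experiments 1 to 5 can be reproduced off the emulator with a small simulation. The Python sketch below is an illustration, not part of the lab. It assumes a 32-pixel-wide display and that only the low 4 bits of a stored byte select the colour, and all function names are made up here.

```python
# Simulation of the tya / lsr / asl / iny experiments for one 256-byte page.
# Assumptions: 32 pixels per row, colour = low 4 bits of the stored byte.

WIDTH = 32           # pixels per row (assumed display width)
ROWS_PER_PAGE = 8    # 256 bytes per page / 32 pixels per row


def page_colours(transform):
    """Colour index written for each Y value in one page."""
    return [transform(y) & 0x0F for y in range(256)]


def lsr(n):
    """A = Y shifted right n times (tya followed by n lsr)."""
    return lambda y: y >> n


def asl(n):
    """A = Y shifted left n times, kept to 8 bits (tya followed by n asl)."""
    return lambda y: (y << n) & 0xFF


def stripe_width(colours):
    """Length of the first run of equal colours in the top row."""
    row = colours[:WIDTH]
    return next((i for i, c in enumerate(row) if c != row[0]), WIDTH)


def row_period(colours):
    """Number of rows after which the row pattern repeats within a page."""
    rows = [tuple(colours[r * WIDTH:(r + 1) * WIDTH])
            for r in range(ROWS_PER_PAGE)]
    return next(p for p in range(1, ROWS_PER_PAGE + 1)
                if all(rows[r] == rows[(r + p) % ROWS_PER_PAGE]
                       for r in range(ROWS_PER_PAGE)))


def pixels_written(iny_count):
    """Distinct Y values stored before Y returns to 0 and the loop exits."""
    seen, y = set(), 0
    while True:
        seen.add(y)
        y = (y + iny_count) & 0xFF   # Y is an 8-bit register and wraps
        if y == 0:
            return len(seen)


if __name__ == "__main__":
    for n in range(0, 6):
        c = page_colours(lsr(n))
        print("lsr x", n, "stripe width", stripe_width(c),
              "row period", row_period(c), "colours", len(set(c)))
    for n in range(1, 6):
        print("asl x", n, "colours", sorted(set(page_colours(asl(n)))))
    for n in range(1, 6):
        print("iny x", n, "pixels written per page", pixels_written(n))
```

With zero shifts the first loop models tya on its own. An odd number of iny instructions always reaches every pixel because an odd step shares no factor with 256, while an even step skips pixels and leaves them black.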
  1. Set all of the display pixels to the same colour, except for the middle four pixels, which will be drawn in another colour.
  2. Write a program which draws lines around the edge of the display:
    • A red line across the top
    • A green line across the bottom
    • A blue line across the right side.
    • A purple line across the left side.
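These two challenges are not solved in this post. As a starting point, the addresses involved can be worked out ahead of writing the assembly. The Python sketch below is only an illustration: it assumes a 32 x 32 display whose first pixel is at $0200, stored row by row, and the helper name pixel_address is made up here.

```python
# Address arithmetic for the two challenges.
# Assumptions: 32 x 32 bitmap, row-major, first pixel at $0200.

BASE = 0x0200
WIDTH = 32
HEIGHT = 32


def pixel_address(row, col):
    """Memory address of the pixel at (row, col), both counted from 0."""
    return BASE + row * WIDTH + col


# The middle four pixels sit in rows 15-16 and columns 15-16.
MIDDLE = [pixel_address(r, c) for r in (15, 16) for c in (15, 16)]

# Edge lines: top and bottom are contiguous runs of 32 bytes,
# left and right are one byte every 32 bytes.
TOP = [pixel_address(0, c) for c in range(WIDTH)]
BOTTOM = [pixel_address(HEIGHT - 1, c) for c in range(WIDTH)]
LEFT = [pixel_address(r, 0) for r in range(HEIGHT)]
RIGHT = [pixel_address(r, WIDTH - 1) for r in range(HEIGHT)]

if __name__ == "__main__":
    print("middle:", [f"${a:04x}" for a in MIDDLE])
    print("top:    ", f"${TOP[0]:04x}..${TOP[-1]:04x}")
    print("bottom: ", f"${BOTTOM[0]:04x}..${BOTTOM[-1]:04x}")
    print("left starts at", f"${LEFT[0]:04x}", "right starts at", f"${RIGHT[0]:04x}")
```

The top and bottom lines fit a single indexed loop each, while the left and right lines need the address to advance by 32 per pixel, which crosses page boundaries and so needs the pointer's high byte updated along the way.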

 Overcoming Challenges

  1. Understanding 6502 Assembly: The 6502 processor, with its unique instruction set and memory addressing modes, was a significant departure from the high-level languages I was more familiar with. It took time and dedication to grasp the fundamentals of 6502 assembly and how to leverage its capabilities effectively.

  2. Measuring Execution Time: Determining the exact execution time of my code was a crucial aspect of the optimization process. Figuring out the right techniques, such as using hardware-based timers or cycle-accurate emulators, was a learning experience in itself. It required a deep understanding of the underlying hardware and the ability to interpret the performance data accurately.

  3. Optimizing Loop Structures: Reducing the number of loop iterations was a key strategy in my efforts to decrease the overall execution time. This involved analyzing the algorithm, identifying opportunities for optimization, and implementing more efficient loop structures. It was an iterative process of testing, measuring, and refining the code to achieve the desired performance improvements.

Lessons Learned

Through this experience, I gained valuable insights that will undoubtedly benefit me in future projects:

  1. Embracing Low-Level Optimization: Delving into the world of assembly programming has given me a deeper appreciation for the importance of low-level optimization. Understanding the hardware-level details and being able to fine-tune the code at the assembly level can lead to significant performance gains, especially in time-critical applications.

  2. Importance of Measurement and Profiling: Accurately measuring and profiling the execution time of my code was crucial for identifying bottlenecks and validating the effectiveness of my optimization efforts. Developing the skills to use the right tools and techniques for performance analysis has become an essential part of my problem-solving toolkit.

  3. Iterative Optimization Approach: The process of optimizing the code, measuring the results, and then refining the approach based on the findings was a valuable lesson. It taught me the importance of an iterative, data-driven approach to optimization, where each step builds upon the insights gained from the previous iterations.

                          
