CSC 2210, Note 5: Labels and Addresses

Referring to bits of code

How to write a loop?
- When get to the end, need to jump back to the top
- Could jump the given number of instructions, but then have to modify the jump instruction every time we add a new line of code
- More robust: add a label - a reference to a specific bit of memory
- This is what main is in the code we've been writing

Code to compute factorials:

  long long result = 1;
  while ( num > 1 ) {
    result *= num;
    num--;
  }
  return result;

Translating to x86, assuming num is in %rcx:

	movq	$1, %rax        # result = 1 (movq: %rax is a 64-bit register)
	cmp	$1, %rcx        # while num > 1
	jle	done            # if num <= 1, jump to end
loop_top:
	imul	%rcx, %rax      # result *= num
	sub	$1, %rcx        # num -= 1
	cmp	$1, %rcx        # compare num to 1
	jne	loop_top        # if not equal, return to loop top
done:
	ret                     # done, result in rax

Note the pseudocode in the comments for the instructions!
- It's almost impossible to interpret instructions without a tie to the pseudocode (or in the case of a compiler, the C/C++ source)
What are loop_top, done?
- These are names for addresses in the machine: loop_top is the address of the imul instruction
- This address depends on how instructions are converted to sequences of bytes and words in the machine!
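
C also has labels and goto, so the shape of the assembly loop can be written out directly. This is only a sketch for comparison with the listing above (the name factorial_goto is made up for illustration); normal C code would use the while loop.

```c
/* Factorial written with labels and goto, mirroring the assembly loop. */
long long factorial_goto(long long num)
{
    long long result = 1;      /* mov $1 into %rax */
    if (num <= 1)              /* cmp $1, %rcx ; jle done */
        goto done;
loop_top:
    result *= num;             /* imul %rcx, %rax */
    num -= 1;                  /* sub $1, %rcx */
    if (num != 1)              /* cmp $1, %rcx ; jne loop_top */
        goto loop_top;
done:
    return result;             /* ret, result in %rax */
}
```

Each goto names a label rather than a distance, so adding a line inside the loop does not require touching either jump.
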
Jump instructions
- jmp: jump to label (unconditionally)
- cmp: compare values for conditional jumps
  - cmp X, Y: sets flags based on Y - X
    - SF: sign bit from result (MSB) - allows testing for negative
    - ZF: Y - X == 0
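    - PF: parity flag - set when the low byte of the result has an even number of 1 bits (ignore)
    - CF: carry flag - set when the subtraction needs a borrow (Y < X as unsigned values)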
- Conditional jumps:
  - je: jump if ZF = 1
  - jne: jump if ZF = 0
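  - jg: jump if ZF = 0 and SF = OF (OF: overflow flag; when Y - X does not overflow, OF = 0 and this is SF = 0, ZF = 0)
  - jge: jump if SF = OF (when Y - X does not overflow: SF = 0, which includes the ZF = 1 case)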
  - [see textbook for others, especially jc, jo]
- Note, arithmetic instructions set these bits as well
  - sub is the same as cmp, cmp just tosses the result
- loop, similar items: ignore
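
To make the flag rules concrete, here is a small C model of what cmp X, Y records for 32-bit operands and how a few conditional jumps read it. The struct and function names are made up for illustration; this is a sketch of the rules, not how the hardware is built. It also tracks the overflow flag OF, which the signed jumps jg and jge consult so that comparisons stay correct when Y - X overflows.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the flags set by cmp X, Y on 32-bit operands. */
struct flags {
    bool zf;  /* zero flag: Y - X == 0 */
    bool sf;  /* sign flag: most significant bit of Y - X */
    bool cf;  /* carry flag: borrow needed, Y < X as unsigned */
    bool of;  /* overflow flag: signed Y - X did not fit in 32 bits */
};

struct flags compare(uint32_t x, uint32_t y)   /* models cmp X, Y */
{
    uint32_t diff = y - x;                     /* wraps modulo 2^32 */
    struct flags f;
    f.zf = (diff == 0);
    f.sf = (diff >> 31) != 0;
    f.cf = (y < x);
    /* overflow: X and Y differ in sign and the result's sign differs from Y's */
    f.of = (((y ^ x) & (y ^ diff)) >> 31) != 0;
    return f;
}

bool would_je(struct flags f)  { return f.zf; }
bool would_jne(struct flags f) { return !f.zf; }
bool would_jg(struct flags f)  { return !f.zf && f.sf == f.of; }
bool would_jge(struct flags f) { return f.sf == f.of; }
```

For example, compare(1, 5) models cmp $1, %reg with 5 in the register: the difference is 4, so ZF, SF and OF are all clear and would_jg reports that jg is taken.
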
In same section of book: hlt, nop, lock, wait
- nop: this doesn't actually exist as an instruction!
- Most x86 assemblers emit the single byte 0x90 for it, which is also the encoding of xchg %eax,%eax - it's short and has no effect (including not affecting flags)
- This is an example of a pseudo-instruction - an assembly-language instruction that the assembler translates into other machine instructions
- If you read about the full instruction set, you'll see other pseudo-instructions in the x86 family

Exercise: translate the following to x86 assembly code:

   # count number of bits that are set in an integer
   # eg: 0xA3 (10100011) has 4 bits set
   # assumption: %ebx is the value to count bits in
   # %ecx = 0
   # while %ebx != 0
   #   if lsb(%ebx) == 1, add 1 to %ecx
   #   shift %ebx right by one position (shr $1, %reg)

See counting_bits.asm, counting_bits.cpp for an implementation
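
Before translating, it can help to pin the pseudocode down in C. This is a sketch written for these notes, not the course's counting_bits.cpp, which may be organized differently; the function name is an assumption.

```c
#include <stdint.h>

/* Count the bits that are set in value (value plays the role of %ebx). */
uint32_t count_bits(uint32_t value)
{
    uint32_t count = 0;          /* %ecx = 0 */
    while (value != 0) {         /* while %ebx != 0 */
        if (value & 1u)          /* lsb(%ebx) == 1 */
            count += 1;
        value >>= 1;             /* shift right by one position */
    }
    return count;
}
```

Each line of the loop body corresponds to one or two instructions: a test of the low bit, a conditional jump around the add, and the shift.
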

Using labels for data and storing data in stack frames

Add a label for the count:

        source:
                .space 4        # allow 4 bytes for the info
        count:
                .space 4

In OnlineGDB, would have

	.globl	source
	.globl	count
	.bss
	.align 8
source:
	.space 4
count:
	.space 4

Copy the data there: mov %eax, count

Another way to access global data

See fact.c in factorials demo code - note last_number global variable

See fact.s for declaring last_number:

        .bss            # writable data belongs in a data section, not with the code in .text
        .globl  last_number
        .align 8
last_number:
        .space 8        # quadword

Setting this value: last_number = result;

        movq %rax, last_number(%rip)
        # %rax: result
        # %rip: "instruction pointer" (while this movq executes, it holds the address of the next instruction)
        # last_number: offset from that place - effectively computing an
        #   address based on the offset between instructions and last_number
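
The arithmetic behind label(%rip) can be sketched in C. On x86-64 the %rip value used is the address of the instruction that follows the movq, and the instruction itself stores only a signed 32-bit displacement. The function names below are made up for illustration, and the addresses in the usage note are invented.

```c
#include <stdint.h>

/* Displacement stored in the instruction: distance from the next
 * instruction to the target label. */
int32_t rip_displacement(uint64_t next_instruction, uint64_t target)
{
    return (int32_t)((int64_t)target - (int64_t)next_instruction);
}

/* Address the processor computes at run time: %rip + displacement. */
uint64_t rip_relative_address(uint64_t next_instruction, int32_t displacement)
{
    return next_instruction + (uint64_t)(int64_t)displacement;
}
```

With invented addresses: if the instruction after the movq sits at 0x401010 and last_number at 0x404028, the stored displacement is 0x3018, and adding it back to 0x401010 gives 0x404028. Because only the difference is stored, the code still works if the whole program is loaded at a different address.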

Stack Frame
- Review also the body of fact overall - note that we set up space for local data at the start and return it at the end
- Note the same in main
- This is setting up a stack frame
- This is a critical concept in programming, especially in languages like C & C++
- Memory: heap (growing up), stack (growing down); the stack is broken into stack frames, one per active function call, so depth = current nesting depth of function calls
- Each frame: return address, arguments, saved registers, local variables
- Typically, %ebp or %rbp (depending on 32-bit or 64-bit model) is the base address of a function's frame
- The current frame size (in 32-bit): %ebp - %esp (stack grows towards address 0, so %ebp is always larger than %esp)
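
One way to see frames from C is to record the address of a local variable at each level of a recursive call. This is a sketch with a made-up function name; the volatile qualifier and the use of local after the recursive call are there to discourage the compiler from reusing one frame for every call.

```c
#include <stdint.h>

/* Record the address of one local variable per active call.
 * Returns depth + (depth - 1) + ... + 0. */
int record_frames(int depth, uintptr_t *addresses)
{
    volatile int local = depth;                /* lives in this call's frame */
    addresses[depth] = (uintptr_t)&local;
    int deeper = (depth > 0) ? record_frames(depth - 1, addresses) : 0;
    return deeper + local;                     /* local still in use after the call */
}
```

After record_frames(3, addresses), the four entries are all different because four frames were active at once. On typical x86 builds the entries for deeper calls are numerically smaller, matching a stack that grows toward address 0, but the C language itself does not promise that ordering.
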
Addresses
- Note last_number(%rip) computes an address - it's where to store a result, not a register that holds the result
- This distinction between an address and a value is also critical
- All data (i.e., a variable, array, etc.) has an address
- The address can be well-defined even if the value at that address is not
- For example, int x; declares x and defines its address, but does not define its value
- int x; cout << x; prints a "random" value!
  - The value depends on what was at that memory location before executing the print statement
  - It's random in that it's unpredictable
  - Changing computers, changing code, using different versions of libraries can all affect the value
  - A variable's value is defined if it is guaranteed to be a particular value (ignoring machine failures!)
  - undefined: no guaranteed value
  - The key is that undefined values lead to unpredictable results; something users rarely care for...
  - Note that it is not truly random: you can get the same result on every run, and you can often predict possible values
  - Illustrate: create an empty assembly program in OnlineGDB with breakpoint at return, note values of registers
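
The address/value distinction can be shown in C without ever reading an undefined value. A minimal sketch (the function name is made up): the variable has an address as soon as it is declared, and its value becomes defined only at the assignment.

```c
/* Returns a well-defined value: x is assigned before it is read. */
int defined_value(void)
{
    int x;          /* x has an address now, but no defined value yet */
    int *p = &x;    /* taking the address is fine; nothing is read */
    *p = 42;        /* store through the address: the value is now defined */
    return x;       /* reads 42 */
}
```

Moving the return above the assignment would give the same "random" behavior as the cout example: the address would still be valid, but nothing would guarantee what is stored there.
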
Key takeaways:
- Data declared in a function, and the function's arguments, exist in a stack frame
- A value is (well-)defined if it is predictable based on the code and the rules of the programming language
- Saying something is undefined means it is not useful for computation
See this Rutgers page for another discussion of stack frames
- The concept of a call stack ("run time stack") and stack frames took decades to understand and popularize!
- Don't be surprised if you don't quite understand it all in one go!
- This course: figuring out how to use logic and organization to make sure computations are well-defined
- All of us have to struggle against our less logical selves!
- This is our (joint) mountain to climb: building efficient, working, maintainable programs while wading past our desire to "just get it done"

Processing Arrays

See init_array.c in demo code

Initializes an array of factorials
View optimized_init_array.s
Address of array passed as %rcx
movq $1, (%rcx)
- (%rcx): %rcx is the address of the array, not a value
- (%register): treat this as an address, not a value
- If %rcx holds 5000, will store value at addresses 5000-5007 (a quad word)
- Needs to be movq - can't determine how much to move by simply checking the size of the register
Implementing destination[i] = i * destination[i - 1]:
- i: in %rax
- imulq -8(%rcx,%rax,8), %rdx
  - General form of address: offset(base,index,scale)
  - base, index: registers
  - scale, offset: integer constants
  - Computes effective address:
    base + index * scale + offset
  - Example: -8(%ebp,%esi,4):
```
        address is %ebp + %esi * 4 - 8
```
  - scale: defaults to 1 (if not specified)
  - offset, base, index: default to 0
  - (%rcx): equivalent to 0(%rcx,0,1)
    - Do know the difference between %rax and (%rax) (without notes)
  - -8(%rcx,%rax,8): %rcx + %rax*8 - 8
  - This gives an address in the array parameter (using %rax as an index)
  - The instruction: multiply %rdx by the previous value, destination[i - 1] (so %rdx must already hold the index i), leaving i * destination[i - 1] in %rdx
- movq %rdx, (%rcx,%rax,8): store result in array at address %rcx + %rax*8 (that is, destination[i])
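
The address arithmetic and the loop it supports can both be written in C. This is a sketch, not the contents of init_array.c: effective_address is a made-up helper, and the signature given to initialize_factorials here (including the count parameter, assumed to be at least 1) is an assumption.

```c
#include <stdint.h>

/* offset(base,index,scale) -> base + index * scale + offset */
uint64_t effective_address(int64_t offset, uint64_t base,
                           uint64_t index, uint64_t scale)
{
    return base + index * scale + (uint64_t)offset;
}

/* destination[i] = i * destination[i - 1], starting from destination[0] = 1 */
void initialize_factorials(long long *destination, int count)
{
    destination[0] = 1;                           /* the store through (%rcx) */
    for (long long i = 1; i < count; i++)
        destination[i] = i * destination[i - 1];  /* reads -8(%rcx,%rax,8) */
}
```

For example, with %rcx = 5000 and %rax = 3, -8(%rcx,%rax,8) is effective_address(-8, 5000, 3, 8) = 5016, which is where destination[2] starts when each element takes 8 bytes.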

In the function body, rcx is the address of the array - where from?
- See the call in main: leaq 32(%rsp), %rcx
- lea: load effective address - not a mov!
- Loads address of data at 32 bytes from %rsp - the address of the array factorials in the stack frame
- Following convention, this is passed as the first parameter, %rcx
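
In C terms, lea is the & operator and mov from memory is an ordinary array read. A sketch with made-up function names (the register choices in the comments are only examples):

```c
/* Like leaq (%rcx,%rax,8), %rdx: compute where element i lives.
 * No memory is read. */
const long long *element_address(const long long *array, long i)
{
    return &array[i];
}

/* Like movq (%rcx,%rax,8), %rdx: go to that address and read the
 * 8 bytes stored there. */
long long element_value(const long long *array, long i)
{
    return array[i];
}
```

Passing an array to a function in C always passes the address, which is why main uses leaq rather than a mov to set up the argument.
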
When initialize_factorials returns, have
```
        movl 144(%rsp), %eax
```
- Compiler has computed address of last element of array as being 144 bytes from the current stack pointer
- I'm happy to let it do that computation!
- Again, we move from the address, 144(%rsp), not the value of %rsp itself...
As a side note, the return result is completely predictable: it's the last array entry, factorials[14] = 14!, truncated to 32 bits by the movl
- See fully_optimized_init_array.s in demo code directory
- This is from using gcc -S -O4 init_array.c
- -O4: requests optimization level 4; gcc treats any level above 3 as -O3, its highest numbered level
- This version precomputes all values in the array, initialize_factorials simply loads the values into the array with an "unrolled" loop!
- Final return from main: movl $1278945280, %eax
- The compiler has precomputed everything it can!
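
The constant in that final movl can be checked by hand or with a few lines of C. This sketch assumes the array entries are 64-bit values and that the 32-bit load keeps only the low 32 bits; the function names are made up.

```c
#include <stdint.h>

/* n! computed in 64 bits (exact for n <= 20). */
uint64_t factorial_u64(unsigned n)
{
    uint64_t result = 1;
    for (unsigned i = 2; i <= n; i++)
        result *= i;
    return result;
}

/* What a 32-bit load of a 64-bit value keeps on a little-endian machine. */
uint32_t low_32_bits(uint64_t value)
{
    return (uint32_t)value;
}
```

14! is 87178291200, which does not fit in 32 bits; its low 32 bits are 87178291200 - 20 * 4294967296 = 1278945280, the constant the compiler emitted.
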
I will not expect you to write array processing code in assembly from scratch

Review

Labels: named locations in memory
Using labels as targets of jump instructions
jmp, conditional jumps
cmp, SF (sign flag), ZF (zero flag), CF (carry flag)
pseudo-instruction: not a real instruction, but a convenience
.data, .text, .globl, .align are assembler directives (pseudo-ops): instructions to the assembler itself, not to the machine
stack frame
address and (%rsp) vs. %rsp
effective address offset(base,index,scale)
- computed as base + index * scale + offset
arrays: computed using effective addresses