Flashcards in Linkers and Loaders Deck (35):
Explain the justification for relocation
Many executables are created by combining many different files.
If we attach a label to a print routine at 0x0 but then combine that file with a main procedure that goes before it, we need to relocate the address of print.
What is loading?
Loading is copying a machine language program from secondary storage into RAM and preparing it for execution. The program that does this is the loader.
What is the basic idea behind the relocating loader?
A relocating loader determines where in RAM the program will reside and adjusts addresses (labels) as necessary
A simple loader is implemented as follows:
// P is the program to load and run, P = P[0], P[1], ...
for i = 0 to codeLength-1 // copy P into mem starting at 0x0
    MEM[i] = P[i]
- We don't necessarily know that the program will be loaded at address 0x0
We relocate addresses depending on where in RAM the program is stored
Explain what's going on here in this relocating loader:
// P is the program to load and run, P = P[0], P[1], ...
// determine mem needed, n, and a location in RAM, a
n = codeLength + space for heap and stack
a = findRAM(n) // a is the starting address
for i = 0 to codeLength-1 // copy P into RAM starting at a
    MEM[a + i] = P[i]
$30 = a + n // set addr of stack
place a into $3 // start executing P
1) Determine the size of the program (codeLength)
2) Allocate RAM for the code, stack and heap starting at address a.
3) Copy the program from secondary storage to RAM
4) If needed, set up the program by passing parameters into registers or into its stack
5) Load the address a into some register ($3)
6) Start executing the program (jalr $3)
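The steps above can be sketched in Python. This is only an illustration under stated assumptions: RAM is modeled as a list of words indexed by word rather than by byte, `find_ram` is a hypothetical first-fit allocator, and the `extra` parameter stands in for the heap and stack space. None of these names come from the deck.

```python
# Sketch only: RAM is a list of words; None marks a free word.
RAM_WORDS = 64

def find_ram(n, mem):
    """Hypothetical allocator: return the first index with n free words."""
    for start in range(len(mem) - n + 1):
        if all(w is None for w in mem[start:start + n]):
            return start
    raise MemoryError("no free block of %d words" % n)

def load(program, mem, extra=8):
    """Copy program into mem; return (start address, initial stack pointer)."""
    code_length = len(program)
    n = code_length + extra          # code plus space for heap and stack
    a = find_ram(n, mem)             # a is the starting address
    for i in range(code_length):     # copy P into RAM starting at a
        mem[a + i] = program[i]
    stack_pointer = a + n            # the value that would be placed in $30
    return a, stack_pointer          # a real loader would now jump to a

mem = [None] * RAM_WORDS
mem[0:4] = [0, 0, 0, 0]              # pretend words 0..3 are already in use
start, sp = load([0x11, 0x22, 0x33], mem)
```

Because words 0 to 3 are occupied, the program lands at word 4 instead of 0, which is exactly the situation that makes relocation necessary.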
Which value from the MIPS code below would have to be changed if the program was relocated?
What would it change to?
p: sw $2, -4($30)
On line 2, the word holding the address of p would have to change to the original address of p plus alpha, where alpha is the address at which the program is loaded
Which values need to be changed if a program is relocated?
If a .word instruction refers to a location, you must add alpha to it
If a .word instruction refers to a constant, do nothing.
Do the values in a beq or bne instruction need to be changed during relocation?
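No. Branches are relative: they jump forward or backward by an offset i from the current instruction, not to an absolute address, so the offset stays valid wherever the program is loaded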
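A small worked check of this, assuming the usual MIPS rule that a taken branch goes to the address of the next instruction plus 4 times the offset. The function name and the load address `alpha` are illustrative only.

```python
def branch_target(pc, offset):
    """Address a taken beq/bne jumps to: next instruction plus offset words."""
    return pc + 4 + 4 * offset

alpha = 0x100                                # hypothetical load address
original = branch_target(0x08, -2)           # branch assembled at address 0x08
relocated = branch_target(0x08 + alpha, -2)  # same instruction after relocation
```

The encoded offset (-2) is identical in both cases, and the target simply moves by `alpha` along with the rest of the program.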
How do we know which words are addresses that need to be adjusted and which are constants?
We use object code: machine code augmented with information that tells us which words need to be adjusted if the program is relocated
What does MERL stand for?
MIPS Executable Relocatable Linkable file
What is MERL?
MERL is a format for machine code that includes the information needed to relocate the program to an address other than 0x00
What are the three parts of MERL?
1. Header
2. MIPS machine code
3. Relocation Information
What are the three parts of the MERL header?
1. Cookie
2. FileLength (length of MERL file in bytes)
3. CodeLength (length of header + machine code, in bytes)
What is the value of the cookie? What does it mean?
Value: 0x10000002
It can be interpreted as beq $0, $0, 2, which skips over the rest of the header when the file is executed
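The cookie can be checked by decoding it as a MIPS I-type instruction. The sketch below assumes the standard I-type layout (6-bit opcode, 5-bit rs, 5-bit rt, 16-bit signed immediate); the helper name is made up for illustration.

```python
def decode_i_type(word):
    """Split a 32-bit I-type instruction into (opcode, rs, rt, signed imm)."""
    opcode = (word >> 26) & 0x3F
    rs = (word >> 21) & 0x1F
    rt = (word >> 16) & 0x1F
    imm = word & 0xFFFF
    if imm & 0x8000:                 # sign-extend the 16-bit immediate
        imm -= 0x10000
    return opcode, rs, rt, imm

BEQ_OPCODE = 0b000100                # opcode field of beq
fields = decode_i_type(0x10000002)   # opcode 4 (beq), $0, $0, offset 2
target = 0 + 4 + 4 * fields[3]       # branch taken from address 0
```

With the cookie at address 0, the branch lands at byte 12 (0x0c), the first word after the three-word header.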
In MERL, what RAM location would the MIPS machine code work at without relocation?
0x0c (since the header is 3 words, i.e. 12 bytes)
What does the Relocation Information section of MERL include?
1. Relocation entries
2. External symbol definitions and references
What is the format of Relocation Entries in MERL?
Two words: the format code 0x01, followed by the location of the word in the MERL file that needs to be adjusted in the event of relocation
Write pseudo code for a MERL Loader
read in MERL header
a = findRAM(codeLength + space for heap and stack) // a is the starting address
for i = 0 .. codeLength-1 // copy code into RAM
    MEM[a + i] = instruction[i]
for each REL entry // update words for relocation
    MEM[a + location] += a
initialize $30 // stack ptr
place a into $3 // execute code
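A minimal Python sketch of this loader, under several assumptions: the MERL file is already a list of 32-bit words, memory is a dict from byte address to word, the load address `a` is chosen by the caller (standing in for findRAM), and relocation entries use the format code 0x01. As in the pseudocode above, the header is copied along with the code, so `location` can be used without further adjustment. ESR and ESD entries are not handled.

```python
COOKIE = 0x10000002
REL = 0x01  # assumed format code for a relocation entry

def load_merl(words, mem, a):
    """Copy header + code to byte address a, then apply REL entries."""
    cookie, file_length, code_length = words[0:3]
    if cookie != COOKIE:
        raise ValueError("not a MERL file")
    n_code = code_length // 4                # header + code, in words
    for i in range(n_code):                  # copy code into RAM
        mem[a + 4 * i] = words[i]
    table = words[n_code:file_length // 4]   # relocation table
    i = 0
    while i < len(table):
        if table[i] != REL:
            raise ValueError("only REL entries are supported in this sketch")
        location = table[i + 1]
        mem[a + location] = (mem[a + location] + a) & 0xFFFFFFFF
        i += 2
    return a                                 # address that would go in $3

# Header, then ".word 0x10" (an address) at 0x0c and the constant 42 at 0x10,
# followed by one relocation entry for the word at 0x0c.
merl = [COOKIE, 0x1c, 0x14, 0x10, 42, REL, 0x0c]
mem = {}
start = load_merl(merl, mem, 0x100)
```

After loading at 0x100, the word that used to hold 0x10 holds 0x110, which is where the constant 42 now lives; the constant itself is untouched.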
How would you modify an assembler for MERL?
- Read header
- Count addresses starting at 0x0c
- When a .word [label] instruction is encountered, record the location
- Output the header
- Output the MIPS machine code
- Output the relocation table
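As a rough illustration of these changes, here is a toy two-pass assembler in Python that understands only `label:` prefixes and `.word` lines. It is a sketch, not the course assembler: the function name is hypothetical, and the relocation format code 0x01 is an assumption.

```python
COOKIE = 0x10000002
REL = 0x01            # assumed format code for a relocation entry
HEADER_BYTES = 12     # three header words

def assemble_merl(lines):
    """Toy assembler: return the MERL file as a list of words."""
    symbols, body = {}, []
    pc = HEADER_BYTES                        # count addresses starting at 0x0c
    for line in lines:                       # pass 1: record label addresses
        tokens = line.split()
        while tokens and tokens[0].endswith(":"):
            symbols[tokens.pop(0)[:-1]] = pc
        if tokens:
            body.append((pc, tokens))
            pc += 4
    code, table = [], []
    for addr, tokens in body:                # pass 2: emit words
        if len(tokens) != 2 or tokens[0] != ".word":
            raise ValueError("this sketch only supports .word")
        operand = tokens[1]
        if operand in symbols:               # .word label: record the location
            code.append(symbols[operand])
            table += [REL, addr]
        else:                                # .word constant: no relocation
            code.append(int(operand, 0))
    code_length = HEADER_BYTES + 4 * len(code)
    file_length = code_length + 4 * len(table)
    return [COOKIE, file_length, code_length] + code + table

merl = assemble_merl([".word end", "end: .word 42"])
```

Here `.word end` is used before `end` is defined, which is why the label addresses are collected in a first pass; only that word gets a relocation entry.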
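Why do assemblers make two passes over the code?
Assemblers need to translate labels into addresses, and a label can be used before it is defined. The first pass records the address of every label; the second pass translates the code using those addresses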
Why do Relocating Loaders track and adjust labels that were used in .word instructions?
It allows the program to be loaded anywhere in RAM
Why do we link object code files?
So we can break up large programs into smaller modules
What are the advantages to breaking up larger programs?
- Procedural abstraction: if you are writing subroutines that can be re-used, only need to know the interface, not the implementation
- Collect related subroutines together that can be used in many programs (library)
- Easier to track errors
- Divide work on modules to different groups/people
- Avoid duplication of effort (e.g. don't have to rewrite a print routine)
What are the three requirements for our Linker?
Why are these requirements?
1) Works with multiple MERL files as input
- We don't want to concat all the assembly language files into one and then assemble since that would make it hard to make changes
- We would rather work with the object code
2) Outputs to MERL format
- If we just leave all the MERL files as they are, none of the labels will have the proper address, since each MERL file was assembled as if it started at 0x00
- So we want the input and output to be MERL
3) Labels can be defined in one file and used in another
- This will allow us to validly use subroutines from different files
What is the External Symbol Reference?
A directive, in our case .import, that indicates to the assembler that this label occurs in another file
How does the assembler deal with external symbol references?
When the assembler encounters a label that is imported, it initially assigns it a value of zero, then notes in the MERL file that it is not yet defined.
If the label is not found after linking, an error is thrown.
How is an External Symbol Reference stored in a MERL file?
1) An ESR entry is made in the Relocation and External Symbol Table section of the MERL file
2) The first word is 0x11, indicating it is an ESR entry
3) The second word is the location of the symbol
4) The third word is the length of the label
5) Then each character of the label follows, one ASCII character per word
Why do we need an external symbol definition directive?
We need a way to hide labels that are meant for local use and export labels that are meant for use by other files
What is the external symbol definition directive?
.export [label] - indicates this symbol is meant for use by other files
What is the format for external symbol definition?
In the Relocation and external symbol table section of the MERL file, same as ESR entries except the first word is 0x05
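The entry layout for both kinds of external symbol entries can be sketched as below. The function name is hypothetical, and storing each character as its ASCII code in a separate word is an assumption about the encoding.

```python
ESR = 0x11  # external symbol reference
ESD = 0x05  # external symbol definition

def symbol_entry(kind, location, label):
    """Return the words of an ESR or ESD table entry for the given label."""
    if kind not in (ESR, ESD):
        raise ValueError("kind must be ESR or ESD")
    # format code, location, label length, then one word per character
    return [kind, location, len(label)] + [ord(ch) for ch in label]

entry = symbol_entry(ESR, 0x10, "print")
```

For the label `print` used at location 0x10, this yields the eight words 0x11, 0x10, 5, then the character codes of p, r, i, n, t.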
How does the assembler need to be modified for linking?
- When an .import directive is encountered, record each symbol that needs importing
- Record the locations where the symbols are referenced
- Do the same for export directives
- When outputting the Relocation and External Symbol section of the MERL file, add ESR entries for each external symbol referenced and ESD entries for each symbol that is exported
At a high level, what are the different steps in implementing a Linker?
1) Concatenate the programs
2) Combine all the ESDs in the Reloc & ES Table section, adjust locations as necessary
3) Use ESDs to update ESRs and convert them into REL entries (since now they are just regular labels)
4) Relocate the addresses of labels in both the program body and in the Relocation Table
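The four steps can be sketched for two modules in Python. This is a simplified illustration, not a full linker: each module is assumed to be already parsed into a dict with the hypothetical keys `code` (words), `rel` (locations), `esr` and `esd` (lists of `(location, name)` pairs), all locations are byte addresses that include the 12-byte header, and duplicate definitions are not checked. The steps run in a slightly different order than listed, which does not change the result here.

```python
HEADER_BYTES = 12  # three header words precede the code in a MERL file

def word_index(loc):
    """Convert a byte location in the MERL file to an index into the code."""
    return (loc - HEADER_BYTES) // 4

def link(m1, m2):
    """Link two parsed modules and return the combined module."""
    shift = 4 * len(m1["code"])              # |Prog1| in bytes
    base2 = len(m1["code"])                  # where m2's code starts, in words

    # 1) concatenate the program bodies
    code = list(m1["code"]) + list(m2["code"])

    # 4) relocate m2: adjust the words its REL entries name, and the entries
    rel = list(m1["rel"])
    for loc in m2["rel"]:
        code[base2 + word_index(loc)] += shift
        rel.append(loc + shift)

    # 2) combine ESDs, shifting the locations of those defined in m2
    esd = list(m1["esd"]) + [(loc + shift, name) for loc, name in m2["esd"]]
    defined = {name: loc for loc, name in esd}

    # 3) resolve ESRs with the ESDs; resolved references become REL entries
    esr = []
    pending = list(m1["esr"]) + [(loc + shift, n) for loc, n in m2["esr"]]
    for loc, name in pending:
        if name in defined:
            code[word_index(loc)] = defined[name]
            rel.append(loc)
        else:
            esr.append((loc, name))          # still undefined after linking
    return {"code": code, "rel": sorted(rel), "esr": esr, "esd": esd}

main = {"code": [0, 7], "rel": [], "esr": [(0x0c, "print")], "esd": []}
lib = {"code": [99, 0x0c], "rel": [0x10], "esr": [], "esd": [(0x0c, "print")]}
linked = link(main, lib)
```

In the example, `print` was at 0x0c in its own file and ends up at 0x14 once the two words of `main` are placed before it, so both the imported reference in `main` and the self-reference inside `lib` become 0x14.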
What is the key task when concatenating programs in the linker?
Since we have just concatenated the program bodies, we need to update the addresses for the programs other than the first
e.g. newAddr2 = oldAddr2 + |Prog1|, where |Prog1| is the size of the first program
What happens when we combine and adjust ESDs in Linking?
All the ESDs in the linked programs are added to the Rel & ES Table section.
The ESDs not in the first program will have their locations adjusted by the size of the programs that come before them
The sizes of the programs are obtained from their MERL headers