Loading 32-bit constants
A 32-bit constant is loaded in two steps. Example: load 0xaabbccdd into $t0.
– Load the 16 most significant bits with lui; the 16 least significant bits are set to zero:
    lui $t0, 0xaabb          # 1010 1010 1011 1011 -> $t0 = 0xaabb 0x0000
– Load the 16 least significant bits (ori, addi):
    ori $t0, $t0, 0xccdd     # 1100 1100 1101 1101 -> 0xaabb 0x0000 OR 0x0000 0xccdd = 0xaabb 0xccdd
Note the difference between ori and addi: ori zero-extends its 16-bit immediate, while addi sign-extends it, so only ori is guaranteed to leave the upper half loaded by lui unchanged.
Design principle 4: make the common case fast (constants that are part of the instruction make it faster to execute).

Loading constants
Load 0x12345678 into R8 (li $8, 0x12345678):
    lui $8, 0x1234           # $8 = 1234 0000 (lower half zero filled)
    ori $8, $8, 0x5678       # $8 = 1234 5678
Load 0x1234 into R8 (li $8, 0x1234):
    addiu $8, $zero, 0x1234  # $8 = 0000 1234 (immediate sign extended)
Load -1 into R8 (li $8, -1):
    addiu $8, $zero, -1      # $8 = FFFF FFFF (immediate sign extended)
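The two-step load and the ori/addi difference can be checked with a small Python model. This is only a sketch: the helper names (sign_extend16, lui, ori, addiu) are illustrative functions that mimic 32-bit register arithmetic, not part of any MIPS toolchain.

```python
MASK32 = 0xFFFFFFFF  # registers are 32 bits wide

def sign_extend16(imm):
    """Interpret the low 16 bits of imm as a signed value."""
    imm &= 0xFFFF
    return imm - 0x10000 if imm & 0x8000 else imm

def lui(imm):
    """Load upper immediate: upper half = imm, lower half = 0."""
    return (imm & 0xFFFF) << 16

def ori(rs, imm):
    """OR with a zero-extended 16-bit immediate."""
    return (rs | (imm & 0xFFFF)) & MASK32

def addiu(rs, imm):
    """Add a sign-extended 16-bit immediate, wrapping to 32 bits."""
    return (rs + sign_extend16(imm)) & MASK32

t0 = lui(0xAABB)
with_ori = ori(t0, 0xCCDD)      # upper half preserved
with_addiu = addiu(t0, 0xCCDD)  # 0xCCDD is negative once sign extended

print(hex(t0))          # 0xaabb0000
print(hex(with_ori))    # 0xaabbccdd
print(hex(with_addiu))  # 0xaabaccdd
```

Because 0xccdd has its top bit set, the sign-extended addition borrows from the upper half, which is why ori is the safe choice for the second step.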
Addressing in branches and jumps
Instructions:
    bne $t4,$t5,Label    # branch to Label if $t4 != $t5
    beq $t4,$t5,Label    # branch to Label if $t4 == $t5
    j Label              # jump to Label
Formats: the address fields are shorter than 32 bits
    I:  op | rs | rt | 16 bit address
    J:  op | 26 bit address

PC-relative addressing
Important: a conditional branch instruction has only 16 bits for the branch address, so the target is computed as PC = register + branch address, with the PC used as the register.
– a program can be as large as 2^32 bytes (4 GB)
– a branch can reach a distance of ±2^15 words (±128 KB), which is normally enough (principle of locality)
– the offset is relative to the address of the following instruction (PC+4)
– the offset is expressed in words, so it must be multiplied by 4 before it is added to PC+4
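The branch target rule above can be written as a short Python sketch. The function name branch_target and the example PC value 0x00400000 are assumptions made for illustration.

```python
def branch_target(pc, offset16):
    """Target of a taken beq/bne located at address pc.

    offset16 is the 16-bit field of the I-format instruction:
    a signed word offset relative to PC + 4.
    """
    imm = offset16 & 0xFFFF
    if imm & 0x8000:          # sign extend the 16-bit field
        imm -= 0x10000
    return (pc + 4 + imm * 4) & 0xFFFFFFFF

pc = 0x00400000  # hypothetical address of the branch instruction
print(hex(branch_target(pc, 25)))      # 0x400068, i.e. PC + 4 + 100
print(hex(branch_target(pc, 0xFFFF)))  # 0x400000, offset -1 branches back to itself
```

The largest backward displacement is 0x8000 words, that is 2^15 * 4 = 128 KB, which matches the range quoted above.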
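Addressing modes
Base + offset addressing
– range: ±32 KB from the base
PC-relative addressing (branch)
PC-relative addressing (jump)
– the 26-bit address field is shifted left by 2 and combined with the upper 4 bits of PC+4 (often called pseudo-direct addressing)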
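A minimal Python sketch of how the base + offset and jump addresses are formed. The function names and the example addresses are hypothetical; the arithmetic follows the I and J instruction formats.

```python
MASK32 = 0xFFFFFFFF

def sign_extend16(imm):
    """Interpret the low 16 bits of imm as a signed value."""
    imm &= 0xFFFF
    return imm - 0x10000 if imm & 0x8000 else imm

def effective_address(base, imm16):
    """Base + offset (lw/sw): base register plus a signed 16-bit offset."""
    return (base + sign_extend16(imm16)) & MASK32

def jump_target(pc, addr26):
    """Jump (j/jal): 26-bit word address combined with the top 4 bits of PC + 4."""
    return ((pc + 4) & 0xF0000000) | ((addr26 & 0x03FFFFFF) << 2)

print(hex(effective_address(0x10010000, -4)))  # 0x1000fffc
print(jump_target(0, 2500))                    # 10000
```

The signed 16-bit offset is what limits base + offset addressing to ±32 KB, and the 26-bit field times 4 gives jumps a 256 MB region selected by the upper PC bits.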
Pseudoinstructions
Pseudoinstructions are accepted by the MIPS assembler but have no counterpart in machine language, for one of the following reasons:
– they use an operation that the hardware does not implement
– they use an extended addressing mode that the hardware does not implement
The assembler expands them into short sequences of machine instructions, using register $1 ($at), which is reserved for this purpose.

Pseudoinstructions by category
arithmetic
– abs, neg, negu, mul, mulo, mulou, div, divu, rem, remu
logical
– not
– rol, ror
data transfer
– la, ld, ulh, ulhu, ulw, li
– sd, ush, usw
– move
comparison
– seq, sge, sgeu, sgt, sgtu, sle, sleu, sne
control
– b, beqz, bge, bgeu, bgt, bgtu, ble, bleu, blt, bltu, bnez
coprocessor
– mfc1.d
floating-point
– li.s(.d), l.s(.d), s.s(.d)

Pseudoinstructions: examples
Each pseudoinstruction is followed by its expansion (indented).
la $4, label
    lui $1, UpperPartLabelAddress
    ori $4, $1, LowerPartLabelAddress
bge $8,$9,address
    slt $at,$8,$9
    beq $at,$zero,address
move $8,$10
    addu $8, $0, $10
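Pseudoinstructions: examples
Each pseudoinstruction is followed by its expansion (indented).
abs $9,$8
    addu $9,$0,$8
    bgez $8,$L
    sub $9,$0,$8
$L: ...
abs $8,$8
    bgez $8,$L
    sub $8,$0,$8
$L: ...

Pseudoinstructions: examples
li $8,-1
    lui $1,-1
    ori $8,$1,-1
li $8,-16
    lui $1,-1
    ori $8,$1,-16
li $8,50000
    ori $8,$0,-15536
div $10,$9,$8
    bne $8,$0,$L
    break 0
$L: div $9,$8
    mflo $10
The immediates are printed as signed 16-bit values: -1 is the bit pattern 0xffff, -16 is 0xfff0, and -15536 is 0xc350 (50000).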
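The li expansions above follow a simple rule that can be sketched in Python. The function expand_li is a hypothetical helper, not the code of any real assembler: it emits a single ori when the constant fits in 16 unsigned bits and a lui/ori pair through $1 otherwise, printing the immediates in hexadecimal rather than as signed values.

```python
def expand_li(reg, value):
    """Return the machine instructions for the pseudoinstruction `li reg, value`."""
    value &= 0xFFFFFFFF                  # keep the 32-bit pattern
    upper, lower = value >> 16, value & 0xFFFF
    if upper == 0:
        # Constant fits in 16 bits: one ori with a zero-extended immediate.
        return [f"ori {reg}, $0, {lower:#06x}"]
    # Otherwise build the constant in two steps through $1 ($at).
    return [f"lui $1, {upper:#06x}", f"ori {reg}, $1, {lower:#06x}"]

for constant in (50000, -1, -16):
    print(constant, expand_li("$8", constant))
```

A real assembler may pick shorter sequences in some cases, for example a single addiu for small negative constants.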
Summary
All instructions are 32 bits long (1 word).
There are only three instruction formats:
    R:  op | rs | rt | rd | shamt | funct
    I:  op | rs | rt | 16 bit address
    J:  op | 26 bit address

MIPS operands
Name | Example | Comments
32 registers | $s0-$s7, $t0-$t9, $zero, $a0-$a3, $v0-$v1, $gp, $fp, $sp, $ra, $at | Fast locations for data. In MIPS, data must be in registers to perform arithmetic. MIPS register $zero always equals 0. Register $at is reserved for the assembler to handle large constants.
2^30 memory words | Memory[0], Memory[4], ..., Memory[4294967292] | Accessed only by data transfer instructions. MIPS uses byte addresses, so sequential words differ by 4. Memory holds data structures, such as arrays, and spilled registers, such as those saved on procedure calls.

Pseudo-addressing
Pseudo-addressing occurs when the addressing mode used is not directly provided by the processor.
As with pseudoinstructions, a single (pseudo)instruction is translated into a sequence of several machine instructions.
Example: for lw/sw the hardware only supports imm(register), but the assembler also accepts:
– (register)
– imm
– label
– label +/- imm
– label +/- imm(register)
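As an illustration of pseudo-addressing, the Python sketch below shows one way an assembler could translate `lw rt, label` into the imm(register) form that the hardware supports. The helper names split_address and expand_lw_label and the example addresses are assumptions. Since lw sign-extends its 16-bit offset, the upper half has to be incremented by one whenever the lower half has its top bit set.

```python
def split_address(address):
    """Split a 32-bit address into (hi, lo) with (hi << 16) + lo == address,
    where lo is a signed 16-bit offset usable by lw/sw."""
    address &= 0xFFFFFFFF
    hi, lo = address >> 16, address & 0xFFFF
    if lo & 0x8000:            # lo will be sign extended to a negative value
        lo -= 0x10000
        hi = (hi + 1) & 0xFFFF  # compensate in the upper half
    return hi, lo

def expand_lw_label(rt, address):
    """Expand `lw rt, label`, where the label resolves to `address`."""
    hi, lo = split_address(address)
    return [f"lui $1, {hi:#06x}", f"lw {rt}, {lo}($1)"]

print(expand_lw_label("$8", 0x10010004))  # lower half positive
print(expand_lw_label("$8", 0x10018004))  # lower half needs the +1 adjustment
```

The other accepted forms (label +/- imm, label +/- imm(register)) can be handled the same way after adding the constant to the label address.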
Summary: MIPS assembly language
Category | Instruction | Example | Meaning | Comments
Arithmetic | add | add $s1, $s2, $s3 | $s1 = $s2 + $s3 | Three operands; data in registers
Arithmetic | subtract | sub $s1, $s2, $s3 | $s1 = $s2 - $s3 | Three operands; data in registers
Arithmetic | add immediate | addi $s1, $s2, 100 | $s1 = $s2 + 100 | Used to add constants
Data transfer | load word | lw $s1, 100($s2) | $s1 = Memory[$s2 + 100] | Word from memory to register
Data transfer | store word | sw $s1, 100($s2) | Memory[$s2 + 100] = $s1 | Word from register to memory
Data transfer | load byte | lb $s1, 100($s2) | $s1 = Memory[$s2 + 100] | Byte from memory to register
Data transfer | store byte | sb $s1, 100($s2) | Memory[$s2 + 100] = $s1 | Byte from register to memory
Data transfer | load upper immediate | lui $s1, 100 | $s1 = 100 * 2^16 | Loads constant in upper 16 bits
Conditional branch | branch on equal | beq $s1, $s2, 25 | if ($s1 == $s2) go to PC + 4 + 100 | Equal test; PC-relative branch
Conditional branch | branch on not equal | bne $s1, $s2, 25 | if ($s1 != $s2) go to PC + 4 + 100 | Not equal test; PC-relative
Conditional branch | set on less than | slt $s1, $s2, $s3 | if ($s2 < $s3) $s1 = 1; else $s1 = 0 | Compare less than; for beq, bne
Conditional branch | set less than immediate | slti $s1, $s2, 100 | if ($s2 < 100) $s1 = 1; else $s1 = 0 | Compare less than constant
Unconditional jump | jump | j 2500 | go to 10000 | Jump to target address
Unconditional jump | jump register | jr $ra | go to $ra | For switch, procedure return
Unconditional jump | jump and link | jal 2500 | $ra = PC + 4; go to 10000 | For procedure call

Summary: extension and overflow
Every instruction writes all 32 bits of its destination register (including lui, lb, lh) and reads all 32 bits of its source registers (add, sub, and, or, ...).
Immediate operands of arithmetic and logical instructions are extended as follows:
– logical immediates are zero extended to 32 bits
– arithmetic immediates are sign extended to 32 bits (including addiu)
Data loaded by byte and halfword loads are extended as follows:
– lbu, lhu: zero extended
– lb, lh: sign extended
Overflow can occur with:
– add, sub, addi
Overflow cannot occur with:
– addu, subu, addiu, and, or, xor, nor, shifts, mult, multu, div, divu
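The extension and overflow rules can be illustrated with a short Python sketch. The function names below are hypothetical models of the instructions' behaviour on 32-bit values, not an emulator API.

```python
MASK32 = 0xFFFFFFFF

def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value

def lb(byte):
    """lb: the loaded byte is sign extended to 32 bits."""
    return sign_extend(byte, 8) & MASK32

def lbu(byte):
    """lbu: the loaded byte is zero extended to 32 bits."""
    return byte & 0xFF

def add_overflows(a, b):
    """True if the signed 32-bit sum does not fit (add/addi would trap)."""
    total = sign_extend(a, 32) + sign_extend(b, 32)
    return not (-2**31 <= total < 2**31)

def addu(a, b):
    """addu: same bits as add, but the result simply wraps around."""
    return (a + b) & MASK32

print(hex(lb(0x80)), hex(lbu(0x80)))   # 0xffffff80 0x80
print(add_overflows(0x7FFFFFFF, 1))    # True
print(hex(addu(0x7FFFFFFF, 1)))        # 0x80000000
```

The same sign_extend helper with bits=16 models lh and the immediates of addi and addiu.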
Summary
Instruction complexity is only one variable
– lower instruction count vs. higher CPI / lower clock rate
Design Principles:
– simplicity favors regularity
– smaller is faster
– good design demands compromise
– make the common case fast

Alternative architectures
Design alternative:
– provide more powerful operations
– goal is to reduce number of instructions executed
– danger is a slower cycle time and/or a higher CPI
Sometimes referred to as “RISC vs. CISC”
– virtually all new instruction sets since 1982 have been RISC
– VAX: minimize code size, make assembly language easy
– instructions from 1 to 54 bytes long!
We’ll look at PowerPC and 80x86
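The trade-off between instruction count, CPI and clock rate can be made concrete with the usual relation CPU time = instruction count * CPI / clock rate. The numbers in this Python sketch are invented purely for illustration and do not describe any real processor.

```python
def cpu_time(instruction_count, cpi, clock_rate_hz):
    """Execution time in seconds: instructions * cycles per instruction / clock rate."""
    return instruction_count * cpi / clock_rate_hz

# Hypothetical designs running the same program.
simple_isa = cpu_time(1_000_000_000, 1.5, 1_000_000_000)  # more, simpler instructions
complex_isa = cpu_time(700_000_000, 2.5, 800_000_000)     # fewer, more powerful instructions

print(simple_isa, complex_isa)
```

With these made-up figures the design that executes 30% fewer instructions is still slower, because its higher CPI and lower clock rate outweigh the saving.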
PowerPC
Indexed addressing
– example: lw $t1,$a0+$s3 #$t1=Memory[$a0+$s3]
– What do we have to do in MIPS?
Update addressing
– update a register as part of load (for marching through arrays)
– example: lwu $t0,4($s3) #$t0=Memory[$s3+4];$s3=$s3+4
– What do we have to do in MIPS?
Others:
– load multiple/store multiple
– a special counter register “bc Loop”: decrement counter, if not 0 goto loop

80x86
1978: The Intel 8086 is announced (16 bit architecture)
1980: The 8087 floating point coprocessor is added
1982: The 80286 increases address space to 24 bits, +instructions
1985: The 80386 extends to 32 bits, new addressing modes
1989-1995: The 80486, Pentium, Pentium Pro add a few instructions (mostly designed for higher performance)
1997: MMX is added
This history illustrates the impact of the “golden handcuffs” of compatibility:
– “adding new features as someone might add clothing to a packed bag”
– “an architecture that is difficult to explain and impossible to love”
A dominant architecture: 80x86
See your textbook for a more detailed description
Complexity:
– instructions from 1 to 17 bytes long
– one operand must act as both a source and destination
– one operand can come from memory
– complex addressing modes, e.g., “base or scaled index with 8 or 32 bit displacement”
Saving grace:
– the most frequently used instructions are not too difficult to build
– compilers avoid the portions of the architecture that are slow
“what the 80x86 lacks in style is made up in quantity, making it beautiful from the right perspective”

80386 registers
80x86 instructions
80x86 instruction formats