Processors in Programming Languages

...really come in a version that... sorry... trickle its way through. But Burroughs had this distressing set of traits: they had the best hardware architects, really, in the world for many years, and they could never, ever implement these designs, partly because the hardware architects were a little bit purist. The major figure here is Bob Barton, who is a well-known iconoclast. A remarkable man; he's, I think, still... The B5000 is the third machine I ever learned. I learned it in 1962.

I happened to be in the Air Force, and they had a Burroughs 220. The Burroughs 220 was this huge vacuum tube thing with 5 or... bigger than this room, and you could heat coffee on top of it. It was an old machine, and it had a couple of interesting features on it that actually partially triggered off some of the ideas in the B5000.

Barton, first, was really a philosopher. He was a philosophy major at the University of Connecticut and had gotten interested in proving... he was always reading philosophy books and books on logic. A man of parts. I guess he started coding around 1951 or so.

Burroughs bought a company called Datatron, and they set up... they made a machine called the 204, and the 205, which were true drum-based machines. The 205 and the Burroughs 220 were the same, except that the 205 was completely based on a drum. It had the registers on the drum also, with many copies of the registers placed around the drum so that you didn't have to wait for a full revolution to get things in those days.

They set up this whole lash-up in a Safeway store in Pasadena. Partly because of that location they formed the early association with Caltech. In fact, Knuth's first machine was the 205 and the 220, and this horrible thing known as MIX in Knuth's books is very reminiscent of the 205 and the 220.

In other words, it was a decimal machine. Each position had 10 decimal digits and a sign digit, and each one of these had four bits in it, so it had 44-bit words. And the sign position: for some reason known only to them and to God, they decided to have all ten possibilities for the sign position. Okay, and because there's only plus and minus, and then eight more, they decided to assign some meaning to those. It was the meaning assigned to those extra sign positions that led to ideas in the B5000. I won't go through the whole history, but it's very interesting to watch how these ideas come about. One of the sign positions was sign position six, which marked this thing as a piece of code.

Okay, and in fact, if you tried to fetch through something, if you tried to fetch a piece of code, the machine would not... for instance, it would try and do a jump. In other words, they both protected things in a limited way in memory, and there were certain side effects that could be triggered off by touching things in memory. It was that idea, which was being played around with in the 50s, that led to part of what the B5000 was.

Okay. The most important thing about this is that in '57 and '58 there was a language developed called Algol 58. Algol 58 was a reaction, sort of a purist reaction, of the Europeans, plus Alan Perlis and a few other people in this country, to Fortran, which they thought was absolutely detestable; but Fortran had come earlier. So they sat down and designed this thing. It was cleaner, and except for the fact of not having data structures other than arrays and strings and integers and floating point (typical Algol data structures, but not Pascal data structures), this language is essentially what Pascal is. In other words, it has one level of definition for global procedures, etc.


Some of the versions of this could do recursive subroutine calls and other ones couldn't. The separate procedures have local variables. There are a couple of other features in the language; it was basically a very simple, plain vanilla language.

Barton was head of the project to put this on the Burroughs 220. In doing so they used a thing that had only really been discovered in '56, '57 by a few people in Europe, which is called the stack. The stack was originally invented for doing compiling, and you have to realize that back in those days, in 1952-53, when they were still thinking about compiling arithmetic expressions, this was considered to be an AI problem. Okay.

What Fortran was, was the world's first automatic programming system; that's what it was called. Automatic programming was how to get away from having to give explicit sequence when writing machine code. Okay, so the first task of automatic programming was to not have to specify sequence for things. And the first parsers back in the early 50s actually went in and parenthesized... in fact Fortran, yeah, they went and actually put all the parentheses into the source code, okay, and then went through and removed all the parentheses by going... It was unbelievable, but this is in the days before BNF; there's no such thing as a compiler-compiler.

So some Germans popped up, Bauer and a couple of other people. I can't remember the other guy... Samelson, yeah, Samelson and Bauer invented a stack algorithm for compiling arithmetic expressions that used precedence. It was the first precedence algorithm. Barton just loved this; he just sucked this up.

The basic idea, I won't go... you know, you either know it or you don't at this point. This is part of the stuff that used to be important for people to know but is in the noise now, as far as I'm concerned. But basically the idea is you kept operators and operands from the arithmetic expression. If you had something like A + B * C, as you chugged your way through this thing you would have shoved into the stack everything up until the times sign, which is the... something, and do something with the C and the B. Sometimes they have two stacks: you put the operators in one, so you'd have plus, and there'd be an A and a B. Maybe you push it all the way down, so you'd have times, C, B, A here. As soon as you could find out there's nothing more for you to do over here, then, you know, you could generate code for these guys. Okay, you kept on going through, and the stack kept these intermediate results.

Barton very quickly realized that a stack could be a dandy idea for doing subroutine calls. In fact they were just starting to mumble about the idea of recursive subroutines for doing problems.
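As a rough illustration of the precedence idea described above, here is a minimal Python sketch that turns an infix expression into postfix using an operator stack. It is a modern textbook rendering, not a reconstruction of Bauer and Samelson's original algorithm; the function name `to_postfix` and the precedence table are assumptions made for the example.

```python
# Sketch only: infix tokens -> postfix, driven by an operator stack and precedence.
# PRECEDENCE and to_postfix are illustrative names, not from the talk.
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}

def to_postfix(tokens):
    output = []      # postfix result, in emission order
    operators = []   # operator stack
    for tok in tokens:
        if tok in PRECEDENCE:
            # Emit stacked operators that bind at least as tightly as this one.
            while (operators and operators[-1] != "("
                   and PRECEDENCE[operators[-1]] >= PRECEDENCE[tok]):
                output.append(operators.pop())
            operators.append(tok)
        elif tok == "(":
            operators.append(tok)
        elif tok == ")":
            while operators[-1] != "(":
                output.append(operators.pop())
            operators.pop()  # discard the "("
        else:
            output.append(tok)  # operand goes straight to the output
    while operators:
        output.append(operators.pop())
    return output

print(to_postfix(["A", "+", "B", "*", "C"]))
```

For A + B * C the times is held on the stack above the plus and comes out first, so the multiplication is emitted before the addition.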

So when they implemented the system on the Burroughs 220, they had the idea that they were going to do things in this order and then generate code through an assembler, a multi-pass assembler, and they were going to use a stack for subroutine calls. They also had the idea that this group of global context should probably be represented as some kind of table in storage that had in it either pointers to arrays allocated somewhere else, or had numbers in them. And in this particular implementation the code had to know what these were supposed to be.

Okay, so one fine day, I don't know when it was, '59 or so, after all this had been done, Barton was reading a book by Łukasiewicz, the logician. What Łukasiewicz had done, in order to do certain logical manipulations of things, which requires symbolic manipulation, was invent a notation that had no parentheses. In fact I think Łukasiewicz's notation was a prefix notation. I can't remember; can you remember, Steve? I can't remember whether it was prefix. Anyway, the two possibilities for an expression like this are either saying something like B C × A +... yeah, I think Łukasiewicz's was actually prefix, which there are various ways of writing: you can say + × B C A, or + A × B C. With prefix you use a stack until you get to something that actually has a couple of operands.

The thing about Barton is he's one of these guys who has a mind like a net: almost completely intuitive... because he never knew when he was going to have a good idea. In fact he lived for quite a... and he feared he was never going to have another. In fact he had good ideas all the time. He used to absolutely destroy PhD theses at Utah, where I ran into him, because he'd be working with a thesis student for six months and all of a sudden he'd wake up in the morning with a total solution to that thesis, and that was the end of that thesis. So the smart thesis students like me stayed away from him; we didn't want him to solve our problem. Even though I was a natural to be his thesis student, I went to Dave Evans, and Barton would not talk to me for five years because of that. That's the kind of guy he is. He's sort of like William F. Buckley to talk to: there's an incredible command of the English language, and he's very, very snide, delightfully snide. That's Barton. Okay.

When Barton made the connection between this and this, his mind just started working, and in 48 hours he had designed the B5000. Okay, it was one continuous hit; according to him, he just literally stayed up for 48 hours and everything fell into place. I'll certainly give you an idea of the things that led into it.

One was Barton's basic snobbery that said: I never want to write machine code again. Higher-level languages are a good idea; then, by God, let's write in them. Never want to write an operating system in a lower-level language; never want to write anything in a lower-level language. The compilers weren't... the machines weren't a good environment for this kind of thing. So he had in the back of his mind the idea: well, let's build a machine to run Algol, if we think that's a good language. In fact the Algol that they built it for was this Algol 58. The B5000 doesn't have all the hooks in it for Algol 60; Algol 60 came along while they were in midstream on the machine. So here are some of the things he thought about.

The first thing he did was to say: well, I know we're going to be running lots of different jobs. In fact the first B5000, you have to realize, in fact most of these machines, had multiple processors on them right from the beginning. It was a teeny little machine, too: it was only 4K 48-bit words, and it swapped off a 32K drum. This is a mainframe computer. And they knew they were going to have to do lots of batch jobs and do compilations and all those things. One of the things that always happened in those days was a program would start writing into somebody else's program's core, and that would lead to strange results, and that was not a good thing. So one of the things that they wanted to do was to protect. So he decided that... I don't know how it all came to him at once, because the interesting thing about the B5000 is not that it has a stack. That was the least interesting, though a complete innovation in those days; there was only one other machine that had sort of a stack, which was called the KDF9.

So here's what he did. He said: when a program is running some kind of code here, I only have two kinds of things that I'm really worrying about. There are things that denote operations, and those are things that are going to appeal directly to the machine, so those are pretty safe. The only other things I have to worry about are things that denote variables. Notice he didn't say values, because some higher-level languages don't have values; they have variables. Everything comes down to looking like a variable.

So he asked himself, what is a variable? A variable actually is some item in this list here, or some item in the list of the procedure, and that's all it is. So he immediately (this came out of this table already) said there are two kinds of variables, local and global, and all those things are numbers that are offsets from the beginning of whatever this list is. So if this is the red list here and this is the red procedure, and over here we have another job that has a green list and a green procedure, then any particular variable, like this one down here, is like variable 19 that's red, and this one over here is variable 19 that's green: two completely different things, because they're referenced relative to this guy and to this guy.

They said, okay, that's great; what we need to have now is a register that the code can't see that points to this. That register is called the PRT. Whenever a piece of program is running, this PRT register is pointing to one of these tables. I'm going to draw it as a table now.
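The red-and-green point can be sketched in a few lines of Python. This is only an illustration under assumed names: `memory`, `fetch`, and the two base addresses are invented here, and on the real machine the base would sit in a register the running code cannot read.

```python
# Sketch only: the same offset resolved against two different table bases.
# All names and addresses below are hypothetical.
memory = [0] * 2048

RED_PRT = 100      # assumed base of the red job's table
GREEN_PRT = 1200   # assumed base of the green job's table

def fetch(prt_base, offset):
    # Code supplies only the offset; the base comes from the hidden register.
    return memory[prt_base + offset]

memory[RED_PRT + 19] = 111     # "variable 19 that's red"
memory[GREEN_PRT + 19] = 222   # "variable 19 that's green"

print(fetch(RED_PRT, 19), fetch(GREEN_PRT, 19))
```

Because the code never contains the base, the same code works unchanged wherever the table happens to be placed.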

Okay, and now what code turns out to be is something that can be rather small. In fact this thing, I think, was up to a thousand long; these are 48-bit words. So to worry about any one of these guys, the code for that only needs to be ten bits long. And for various reasons (they could have done this differently; it's really terrible to criticize this, but, you know, after 20 years a few things have been discovered) they decided also to make the operator set be 10 bits long, so they could have a thousand operators. They never had more than a few. Then, to distinguish these two, there are a couple of extra bits on as headers that told you what kind of code syllables these were. There were four kinds of code syllables, if I can still remember these. There's an operator... I think I only want to use three of them. It's basically the operator, what is called a value call, and what is called a name call. I'll tell you what those are in a second.

So code in this machine was 12 bits long. Because it had 48-bit words, the register up here would hold four instructions at a time, so it had a little instruction cache on it. That is one of the consequences on the B5000: you only have to cycle memory every four instructions.
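The arithmetic here (2 header bits plus a 10-bit field gives 12 bits, and four of those fill a 48-bit word) can be checked with a small Python sketch. The tag values and function names are assumptions for illustration; the talk does not give the actual encodings.

```python
# Sketch only: 12-bit syllables = 2-bit kind + 10-bit field, four per 48-bit word.
# The numeric tag values are hypothetical, chosen just for this example.
OPERATOR, VALUE_CALL, NAME_CALL, FOURTH_KIND = 0, 1, 2, 3

def make_syllable(kind, field):
    if not (0 <= kind < 4 and 0 <= field < 1024):
        raise ValueError("kind needs 2 bits, field needs 10 bits")
    return (kind << 10) | field

def pack_word(syllables):
    # The first syllable lands in the most significant 12 bits.
    if len(syllables) != 4:
        raise ValueError("a 48-bit word holds exactly four syllables")
    word = 0
    for s in syllables:
        word = (word << 12) | s
    return word

def unpack_word(word):
    result = []
    for shift in (36, 24, 12, 0):
        s = (word >> shift) & 0xFFF
        result.append((s >> 10, s & 0x3FF))  # (kind, field)
    return result

word = pack_word([
    make_syllable(VALUE_CALL, 3),
    make_syllable(VALUE_CALL, 7),
    make_syllable(OPERATOR, 45),
    make_syllable(NAME_CALL, 50),
])
print(unpack_word(word))
```

One memory fetch of `word` yields four syllables to execute, which is the "little instruction cache" effect described above.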

Now, going back to how this executed: Barton quickly discovered that going to postfix is much more efficient. Then he asked himself, how am I going to implement the stack? I think the first way he thought of it is that he would simply have another register here, called the S register, that would point into the top of memory, and there would be the stack. Then he realized he needed a couple of registers to hold stuff that was going to go through the arithmetic part of the system, and so he put in two hardware registers. Now actually what I should do is... let's be consistent here... put the stack like this.

Okay, so he put in a couple of hardware registers that were logically the top of the stack, called the A and B registers. The A and B registers had an extra bit that marked whether there was something in the register or not. Okay, let's call it the presence bit. So the idea was, if you chugged up to a plus, and a plus requires two operands, and both of these guys weren't on, this machine would know to cycle the stack to make sure there were two operands in the registers. So it did the most efficient thing that it could: it would only cycle the stack to memory when absolutely necessary. That all happened automatically.
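A small Python model makes the presence-bit behaviour concrete. It is a sketch under assumptions: the class, its method names, and the exact spill and fill policy are invented for illustration and are not taken from the machine's documentation.

```python
# Sketch only: two top-of-stack registers with presence bits, backed by memory.
# Memory is touched only when a register must be spilled or refilled.
class TopOfStack:
    def __init__(self):
        self.a = self.b = None
        self.a_full = self.b_full = False   # presence bits
        self.memory = []                    # the rest of the stack
        self.memory_cycles = 0

    def push(self, value):
        if self.a_full:
            if self.b_full:
                self.memory.append(self.b)  # spill B only when both are full
                self.memory_cycles += 1
            self.b, self.b_full = self.a, True
        self.a, self.a_full = value, True

    def _fill(self):
        # Ensure both registers hold operands before a two-operand operator.
        if not self.a_full and self.b_full:
            self.a, self.a_full = self.b, True
            self.b_full = False
        if not self.a_full:
            self.a, self.a_full = self.memory.pop(), True
            self.memory_cycles += 1
        if not self.b_full:
            self.b, self.b_full = self.memory.pop(), True
            self.memory_cycles += 1

    def add(self):
        self._fill()
        self.b = self.a + self.b   # result stays in B
        self.a_full = False        # A is now marked empty

    def top(self):
        return self.a if self.a_full else self.b

s = TopOfStack()
s.push(1)
s.push(2)
s.add()
print(s.top(), s.memory_cycles)
```

With only two operands in flight the addition completes without any memory traffic; a third pushed value forces one spill and, later, one refill.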

Okay, so it's very logical to think of these... these things are called syllables, the code syllables. There would be a register, I don't know what it was called, either C or P or something like that. Here are a bunch of code syllables. So this B C × A + would be turned into something that would look like: value call for wherever B was; maybe B is number three here, so this would be something like value call 3. This would be value call maybe 7, where C is. Operator 50 or something, which is a plus... and value call wherever A is, maybe 30... plus... I'm sorry, this is times; operator 45, which is plus. So that's what that turns into.

And if this thing had originally been an assignment statement into a variable called E, the code would have come out E... assign, and that would have turned into, here, a name call on wherever E is. E's at 50, say: name call 50, op assign. I'll just put numbers there and just draw these things in. Okay, so what name call does is to generate an address that becomes an operand. Assignment is an operation in this machine, and

there are a couple of interesting... this is penetrating, in an instant, to one of the most important things you can ever do in a higher-level language, which is to be able to give other senses for store and fetch, and in the hardware of this machine. It had never been done in a programming language before; it was in the hardware of this machine. We'll talk about what's right and what's wrong with it. It turns out he did it the wrong way, but he can be forgiven because nobody had thought of it before. In fact Burroughs made little use of it, because most of the programmers at Burroughs did not understand this particular feature.

Okay. Now a further wonderful thing about this machine is that if A and B or C were procedures, the code is exactly the same. Okay, the compiler does not care, for the following reason: one of the bits in this program reference table is called a flag bit, and it is there to distinguish direct operands like integers and floating point numbers. The way integers and floating point were stored was such that there's only one plus that you needed; there's only one operation over here, and the system could tell whether it was an integer or floating point. When the flag bit was zero, it meant what was stored here was either an integer or floating point. Okay, when the flag bit was one, it meant there was something that needed to cause an interrupt to be triggered off to other parts of the machine's hardware. Yeah, yeah, what we want to do is look at these guys when the thing is marked as one.

So what I'm saying here is that, number one, there's no such thing as the code knowing what it's going to work on. It doesn't. It's a direct translation of this, in which B could just as easily be an integer procedure, an integer function, as anything else. Okay, and number two, the code is not allowed to see any of its own state. Note that the code cannot see the segment that it's in unless there happens to be a pointer for it in this program reference table. Everybody see that? Okay.
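To make the "code is exactly the same" point concrete, here is a hedged Python sketch of a tiny syllable interpreter. Everything in it is an assumption for illustration: the table is a dictionary of `(flag, payload)` pairs, a Python callable stands in for a procedure descriptor, and the slot numbers are arbitrary.

```python
# Sketch only: a value call consults the flag bit of the table entry.
# Flag 0 means a plain operand; flag 1 means "something else", here a procedure.
import operator

OPS = {"add": operator.add, "mul": operator.mul}

def run(code, prt):
    stack = []
    for kind, arg in code:
        if kind == "value_call":
            flag, payload = prt[arg]
            if flag == 0:
                stack.append(payload)      # plain operand: just load it
            else:
                stack.append(payload())    # flagged word: run the procedure
        elif kind == "op":
            right = stack.pop()
            left = stack.pop()
            stack.append(OPS[arg](left, right))
    return stack.pop()

# One piece of code for B C * A +, with hypothetical slots for B, C and A.
code = [("value_call", 3), ("value_call", 7), ("op", "mul"),
        ("value_call", 30), ("op", "add")]

prt_with_variable = {3: (0, 2), 7: (0, 5), 30: (0, 1)}
prt_with_procedure = {3: (0, 2), 7: (1, lambda: 5), 30: (0, 1)}

print(run(code, prt_with_variable), run(code, prt_with_procedure))
```

The `code` list never changes; only the table entry for slot 7 is swapped from a number to a procedure.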

The code contains no addresses, and in fact there's no way for the code to generate any addresses. The only addresses it can ever use are those that have been given in these registers. This code cannot see these registers; the code cannot see the A and the B register, it can only use them. That's absolutely critical; this is what is so clever. All modern protection schemes are based on this idea. This is the idea that turned into what is now known as capabilities, but done almost perfectly the first time around. When it was done in Multics, it...

Okay, so let's take a look at these 48-bit words with the flag bit of one. They had enough room in here that they decided what they would also do is make the storage conform to the exact objects that the programmer wanted to use, which are little variable-length segments. Okay, and a segment is a thing that not only has a starting address but also has an ending address. So for instance an array would have something like the following format: this would be the base address of the array; this would be the length of the array; this would be the fact that it is an array, as additional type information; and there'd be a couple of extra bits here, one of which would indicate whether the thing was in storage or not. The idea is that this bit, if it were zero when you ran into it here, meant you had to go out and fetch it before you could use it. Okay, so what this was, was a protected segmenting scheme.

Another interesting one here is a procedure. It's one of these guys; think of it as code, if you want. It has a base address, it has a length, it has a presence bit, and so forth. And there are a few other ones.
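The descriptor layout just described can be sketched as a small Python record. This is an illustration only: the field names, the `touch` helper, the dictionary standing in for a drum, and the load address are all assumptions, not the machine's actual word format.

```python
# Sketch only: a descriptor with base, length, kind and a presence bit.
# backing_store and touch() are hypothetical stand-ins for the swap mechanism.
from dataclasses import dataclass

@dataclass
class Descriptor:
    kind: str       # "array" or "procedure": the extra type information
    base: int       # base address in main storage (valid only when present)
    length: int     # number of words in the segment
    present: bool   # presence bit: False means "go fetch it first"

main_storage = {}                        # address -> word
backing_store = {"E": [10, 20, 30]}      # swapped-out segments, keyed by name

def touch(desc, name, load_address):
    """Bring the segment into storage if its presence bit is off."""
    if not desc.present:
        for i, word in enumerate(backing_store[name]):
            main_storage[load_address + i] = word
        desc.base, desc.present = load_address, True
    return desc

e = Descriptor(kind="array", base=0, length=3, present=False)
touch(e, "E", load_address=500)
print(e, main_storage[e.base + 2])
```

Since code only ever reaches the segment through the descriptor, the segment can be loaded anywhere and the descriptor updated, without the code noticing.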

Okay, so what do we have here? I haven't shown everything yet; I've got to show how subroutines are done. Let's review for a second. We've got a scheme where the code only refers to offsets off registers that it can't see. No object in storage can be referenced directly by the code. If the code wants to see something, there has to be one of these guys in its program reference table. What's in the program reference table is in there unambiguously, because of the flag bit; it's not in there as bits of which the code has to remember what the heck they were.

So what can you do with this? Well, one of the things you can do, for instance, is replace C here with a procedure. Suppose we want to enrich this thing without changing the code. You just put a flag bit of one and point this off to another piece of code down here. Every time you go through and execute this code, it will execute the procedure and generate the result from it. This procedure can just as easily access a file. One of these things can be a pointer going to another process; a process is one of these whole conglomerations, another one over here. Okay, so in this machine you can implement directly things like streams in UNIX. Okay, the way you implement it is simply that any variable can act as a stream. But so far we've only talked about them as sources. Let's go through how this executes some code here.

Suppose we have the code E[5]. Okay, right at this point we don't happen to know whether E is an array or a procedure, and if the compiler doesn't know either, what it does is it says... I'll do it in a different color so you can see it... Oh, I know what the other one was. It's just small subscripts; actually it was small immediate operands, that's what it was. 00, 01, 10, 11, yeah: these are small integers, operators... So this thing would be immediate 5; now we've put the 5 in the top of the stack.

Then... I'm not going to go through right now exactly how this does it; in fact the B5000 does it in a slightly more baroque way than it needs to... then it would do a value call, assuming it's on this side of the assignment arrow: do a value call on wherever E is. E is one of these guys in here, down here, so it's a value call on 95 or something. Now it comes in here, and the flag bit is on or the flag bit is off. There's a little more code generated here. If the flag bit is off, it's going to complain; there's an operand waiting in the stack. If the flag bit is on here (remember the 5 is in the stack already), then it's going to look further, and if it sees it's an array, what it will do is automatically index the array against the base address. But what it does is it checks this guy against the length first, and if everything is okay it generates the address; otherwise it generates a bounds check error. So bounds checking is done automatically by the hardware here. If everything is all right, then what it will fetch into the top of the stack is the value from that particular array.

The important thing to realize is that arrays... an array points to things like this program reference table; arrays have flag bits also. Okay, so if it's a multi-dimensional array, fetching one that has a flag bit on is going to trigger off another indexing, and it will start peeling the stack dry of all the indexes of the array. It'll just peel. So arrays are stored as trees on the B5000, until you get to something that's an operand; then it'll finally fetch that, and it will have the answer.

Even more fun: if it comes in here to E and discovers it's a procedure, it does exactly the same thing, except it calls the procedure with this guy as a parameter.
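The value-call behaviour on E[5] described above can be sketched in Python as follows. This is an illustration under assumed names: the descriptor classes and `value_call` are invented here, Python lists stand in for storage segments, and the case of a plain operand with an index waiting (where the talk says the machine complains) is not modelled.

```python
# Sketch only: value call on a flagged word, with automatic bounds checking,
# tree-structured arrays, and the same code path when E is a procedure.
class BoundsError(Exception):
    pass

class ArrayDescriptor:
    def __init__(self, elements):
        self.elements = elements       # stands in for base address + storage
        self.length = len(elements)

class ProcedureDescriptor:
    def __init__(self, body):
        self.body = body

def value_call(entry, stack):
    # Keep indexing while the fetched word is itself an array descriptor.
    while isinstance(entry, ArrayDescriptor):
        index = stack.pop()
        if not 0 <= index < entry.length:   # check against the length first
            raise BoundsError(index)
        entry = entry.elements[index]
    if isinstance(entry, ProcedureDescriptor):
        return entry.body(stack.pop())      # the index becomes the parameter
    return entry                            # finally, a plain operand

e_array = ArrayDescriptor([0, 10, 20, 30, 40, 50])
e_proc = ProcedureDescriptor(lambda i: i * 10)
matrix = ArrayDescriptor([ArrayDescriptor([1, 2]), ArrayDescriptor([3, 4])])

print(value_call(e_array, [5]), value_call(e_proc, [5]))
print(value_call(matrix, [0, 1]))   # indexes are peeled off the stack: [1][0]
```

The caller pushes 5 and issues the same value call whether E is an array or a procedure; a two-dimensional array is just an array of array descriptors.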

So Barton, having a somewhat symmetric mind, asked himself the question: suppose I had something like this; what should I do there? Now the answer is, by being on this side of the assignment operator, this part of the thing is going to be replaced with a name call. What I want to wind up with is an address that this thing can operate on. Okay, so far so good. What happens with an array is it ripples through the same way, except it doesn't fetch the last guy; it just leaves the last guy's address in the top of the stack for this guy to work on.

So for symmetry he decided that if this guy happened to be a procedure and it was called with a name call, it should actually do something rather than complain. This was the master stroke. What it does do is call it in such a way that the procedure itself can discover whether it's being asked for a name or a value. That was a test you could make in this language called Balgol, in the beginning of a procedure, saying: does somebody want me to generate an address, or does somebody want a value from me? So the complex Balgol procedures were actually two-part procedures: one part was a function that would produce a value, and there was also a procedure for calculating what that address was.
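The two-part procedure idea can be sketched in Python like this. It is a minimal illustration under assumptions: the class name, the `"name"`/`"value"` strings, the `storage` dictionary, and the cell numbers are all invented for the example and do not reflect Balgol syntax.

```python
# Sketch only: a "variable" backed by code that can answer either a value call
# or a name call. Plain variables are represented by their address (an int).
class TwoPartProcedure:
    def __init__(self, get_value, get_address):
        self.get_value = get_value
        self.get_address = get_address

    def call(self, wanted):
        # The procedure can test whether a name or a value is being asked for.
        if wanted == "value":
            return self.get_value()
        if wanted == "name":
            return self.get_address()
        raise ValueError(wanted)

storage = {200: 42, 300: 0}

def value_call(entry):
    if isinstance(entry, TwoPartProcedure):
        return entry.call("value")
    return storage[entry]

def name_call(entry):
    if isinstance(entry, TwoPartProcedure):
        return entry.call("name")
    return entry

def assign(address, value):
    storage[address] = value

def read_cell():
    return storage[300]

def locate_cell():
    return 300

simulated = TwoPartProcedure(get_value=read_cell, get_address=locate_cell)

assign(name_call(simulated), 7)   # store through the simulated variable
assign(name_call(200), 5)         # store through an ordinary variable
print(value_call(simulated), value_call(200))
```

The assignment code is identical in both cases, which is what lets a procedure cover up whatever representation is actually being used.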

That works for many of the things you want to do. It turns out it doesn't work for all the things you want to do, but it's very close. It certainly allows you to simulate arrays and other data structures. I think you can see that if you put this thing in, you can simulate the semantics of most data structures and completely cover up whatever representation is actually being used. This was something that was used to a limited extent, as programmers didn't really understand this. In particular, you can replace variables with files; you could have input and output pipes; it would fire off things. That's a lot.

Now the final set of mumbo-jumbo here is how the stack was used to also store subroutine state. That way is as follows. There's another register called the F register down here. Let's think of what has to get shoved into the stack when you call a subroutine. Somehow we have to have the following information. There's a program counter that must be here somewhere that is indexing into this code. Notice the program counter is relative to C. Okay, C could be anywhere; this was the first computer to ever have...

So the various things we need to store are the program counter of where we were, and that counts as, guess what, one procedure segment; these two guys just bottle up in there. We need to know where the previous stack frame was, and we have to know where the current one is, because we have to get at these guys relative to this frame register. There are several ways of doing this, and the B5000 does not do this in the cleanest possible fashion.

A modern way of doing this is as follows: have the frame register point into here, and index local variables negatively relative to this. So the local variables and the frame information, the old program counter and stuff like that, are indexed negatively going that way, and you use this stuff here for the arithmetic operations in the new procedure. Okay, that's the way Smalltalk does it. They didn't realize they could do this on the B5000, so they went through a more complicated thing of marking the stack first and a few other little goodies, but this is good enough to show that what is in part of this thing is the old F, which is a pointer back down the stack into the previous frame. Okay.
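Here is a hedged Python sketch of the "modern way" described above: a frame register F pointing just past the saved frame information, with parameters reached at negative offsets from it. The class, the three-word frame layout, and the choice to save the parameter count in the frame are assumptions for illustration; as noted in the talk, the B5000 itself did this in a more complicated way.

```python
# Sketch only: call/return with a frame register F and a saved old-F link.
# Assumed frame layout, growing upward:
#   [params..., n_params, return_pc, old_f]  <- F points just past old_f
class Machine:
    def __init__(self):
        self.stack = []
        self.f = 0     # frame register
        self.pc = 0    # program counter, relative to the code segment

    def call(self, entry_pc, n_params):
        # Parameters are assumed to be on the stack already (postfix order).
        self.stack += [n_params, self.pc, self.f]
        self.f = len(self.stack)
        self.pc = entry_pc

    def param(self, k):
        # Parameters are indexed negatively relative to F.
        n = self.stack[self.f - 3]
        return self.stack[self.f - 3 - n + k]

    def ret(self, result):
        n, return_pc, old_f = self.stack[self.f - 3:self.f]
        del self.stack[self.f - 3 - n:]      # drop frame info and parameters
        self.f, self.pc = old_f, return_pc   # follow the link to the old frame
        self.stack.append(result)

m = Machine()
m.pc = 10
m.stack += [4, 6]          # two evaluated arguments
m.call(entry_pc=100, n_params=2)
total = m.param(0) + m.param(1)
m.ret(total)
print(m.stack, m.pc, m.f)
```

Everything above F is free for the new procedure's arithmetic, and returning is just a matter of restoring the saved program counter and the old F.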

information... So the thing chugs along as much as it wants. As soon as it hits something that triggers off a procedure call, it bottles up this guy and that guy into the stack, assuming that, in typical Polish postfix form, all of these variables here have been loaded into the stack first. Okay, it doesn't even know it's going to call a procedure; it just goes chugging along, and eventually it will run into a procedure call, and all of a sudden it discovers all of the variables have already been evaluated and they're stuffed in the stack in the right order. When this F register goes in, then the parameters are just... The reason it knows is that these ten bits that this thing is built up into, for these two guys, are actually broken up into two smaller segments, for global and local.

So that's basically how the machine works. When you do an inter-process call, you run into something that's a process; it bundles up more than this. It actually tucks all of the state of what's going on into the top of the last stack that you have, and stashes that in a program reference table, the operating system's program reference table. The operating system is the scheduler, okay, because that's what it is: you want a scheduler with a bunch of things that are just process descriptors in there, and you just go from one to the other, and you're either executing them or closing them up.

So the subroutine call time on this 1961 machine was just a couple of microseconds, almost as fast as the operations themselves, because it didn't have to do much more than what it was doing on operations. The result of this thing is that the procedures that you wrote were very close in thinking to the notion of extending the whole operator set of the computer, like the programmed operator set that you find in lesser machines like the... but fit into this one homogeneous scheme.

So what we have on the B5000 is: one, complete protection. This is one of the few architectures that actually exists today. They're still selling these goddamn machines; I forget what it's called, the 5700, the 5900... Every couple of years they upgrade the hardware on it, but it's basically such a sound design, and they have so much code written for it, that they're still selling the goddamn things. The reason is you can't break the operating system. You just cannot break it, because you have to do extremely special things in order to be able to get any kind of descriptors, which is what these things are called, to any piece of code or any piece of segment.

It was the very first, or one of the first, multiprocessor systems, and again you can see why: because protection is as nearly absolute as you can get, you can afford to have several processors kicking around at the stuff that's there. And the processors were put to good use, because the storage swapping was so slow that one of the processors would be running something while the other guy was trying to swap stuff.

You have programmable data, bounds checking, automatic subroutine calling and process switching, and a limited amount of code cache: one fetch for every four instructions.

Okay, so that's sort of a quick pass through what the B5000 was. Now the funny thing about this, sort of the despair of anybody who actually knows about this machine, is that almost nobody knows about these ideas. These ideas would look goddamn good on a chip from Intel or Motorola today, believe me. I think we can all use them. There are a couple of things that we might like to do to fix it up, going after Pascal-like data structures, but this would be one dandy Pascal machine.

And one of the things you have to realize is that one of Klaus Wirth's teachers was Bob Barton. Klaus Wirth and Dave Evans... okay, Klaus was a graduate student of Dave Evans and Harry Huskey when Dave was at Berkeley, and he then came over to work with Bill McKeeman at Stanford, where they had a B5000. And Wirth's first two languages were... there's a language called Euler, which is a particularly good design. It was implemented as... this whole thing that led to Pascal P-codes is right here. Okay, Pascal P-codes are sort of a crummy software way of doing what this machine did

in hardware. This is what is so amazing about this machine: it has most things that you would like to have in a programming language. They had to extend Balgol to handle features like multiprocessing, the protection, the fact that you could actually change types dynamically here, which Balgol didn't allow. So they came up with a language called ESPOL. I can't remember what ESPOL stands for anymore, but it's basically an Algol extended for systems programming, and it was a language that used all these features directly. That's what they wrote all of the operating system in. They never wrote a line of assembly code; there was never an assembler. The

fate of this machine was kind of funny, because I remember when Burroughs was hawking it. They were so proud of what this machine could do that they went out and hired a bunch of college graduates as salesmen and actually taught them what the machine was. Worst mistake they ever made, because they immediately went out and completely snowed all of the data-processing managers (remember, this is 1961-62), telling them about this machine that was distinctly not like any IBM system they'd ever seen before, and it just scared the hell out of everybody. They had a sense of humor: they made up a game called the Compiler Game. It was a board game that you played like Monopoly, except what it would do was take any piece of ALGOL code and generate Burroughs syllables. You just went through it; it had the little stacks, you know, and you just went through the thing, and by God, the thing showed you how simple compiling was on this system. That scared everybody. That's how I learned how to compile when I was in the Air Force, using this damn compiler game. I've been looking for one ever since; there must be some around. The other thing they

did is they came out without Fortran, on the grounds that ALGOL 58 (soon to be ALGOL 60; they upgraded this to ALGOL 60) was infinitely better than Fortran, which was true, and they could run ALGOL 60 with no appreciable overhead, unlike ALGOL on the other machines. Nobody bought it. In fact, this machine languished for years until they wrote a preprocessor that translated Fortran into ALGOL and then compiled that into these things, and that is how they finally sold the machine. They fired all of the college graduates and got guys like the old SDS salesmen: whenever you asked them a question, they would say, "I don't know the answer, but can I buy you a drink?" They sent them out, and then people finally started noticing that the machines didn't crash very often, like once every two years, and that even these small machines could run twenty-four, thirty jobs at the same time, interleaving all the stuff. They're incredibly efficient.
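The compiler game described above turned ALGOL text into stack code by walking through it with a small stack. A minimal sketch of that idea follows. It handles only arithmetic expressions, and the syllable names (`VALUE`, `LIT`, `ADD`, and so on) are invented for illustration; they are not the actual Burroughs syllables.

```python
import operator
import re

# Hypothetical syllable names; the real B5000 instruction set is not reproduced here.
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}
OPNAME = {"+": "ADD", "-": "SUB", "*": "MUL", "/": "DIV"}
ACTION = {"ADD": operator.add, "SUB": operator.sub,
          "MUL": operator.mul, "DIV": operator.truediv}


def compile_expr(text):
    """Translate an infix expression into postfix (syllable, argument) pairs."""
    tokens = re.findall(r"[A-Za-z_]\w*|\d+|[-+*/()]", text)
    out, pending = [], []          # emitted code, stack of waiting operators
    for tok in tokens:
        if tok.isdigit():
            out.append(("LIT", int(tok)))
        elif tok in PRECEDENCE:
            # Emit waiting operators that bind at least as tightly.
            while (pending and pending[-1] != "("
                   and PRECEDENCE[pending[-1]] >= PRECEDENCE[tok]):
                out.append((OPNAME[pending.pop()], None))
            pending.append(tok)
        elif tok == "(":
            pending.append(tok)
        elif tok == ")":
            while pending[-1] != "(":
                out.append((OPNAME[pending.pop()], None))
            pending.pop()          # discard the "("
        else:
            out.append(("VALUE", tok))   # fetch a named variable
    while pending:
        out.append((OPNAME[pending.pop()], None))
    return out


def run(code, env):
    """Execute the syllables on an operand stack, as the hardware stack would."""
    stack = []
    for syllable, arg in code:
        if syllable == "LIT":
            stack.append(arg)
        elif syllable == "VALUE":
            stack.append(env[arg])
        else:
            right = stack.pop()
            left = stack.pop()
            stack.append(ACTION[syllable](left, right))
    return stack.pop()
```

Because the target is a stack machine, the compiler never allocates registers or temporaries; the postfix order of the output is the whole code generator.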

Okay, so generally speaking, a system like this is sort of the place where you start when you're thinking about higher-level languages. And if you're thinking about doing Pascal, with a couple of modifications this gives you a damn good Pascal machine. It is essentially the algorithm that the Mesa people put in microcode in the Xerox PARC machines (Mesa is sort of a super Pascal). It's the algorithm that Klaus Wirth put in the Lilith machines that were spin-offs of the PARC stuff. It does pretty well. What's wrong with this?

Well, the first thing is that it has such a determined way of doing things that one might ask the question: how does this do Lisp? Okay, the most natural way of doing Lisp here is to have these guys point to segments that are only two words long. It turns out that is a disaster, because remember, the thing thinks it's trying to swap segments. The whole system had an assumption about how long segments are, like they're an average of 40 words long, which is a reasonable swapping size, and strings were longer than that. So the first time Lisp was tried on the Burroughs B5000 was just the last time McCarthy ever looked at a machine like this. He made the incorrect assumption that since Lisp wouldn't run on the B5000, a higher-level architecture is a wrong idea. That is false.

But it was such a failure, and in fact Burroughs has compounded that error over the years. They grew to love this with a passion approaching that of religion, and essentially decided that what didn't run on this wasn't a good system. Okay, which is the usual way: you define the problem out of existence. And so as a result, Burroughs has never entered the mainstream of research, never ever, and the current Burroughs machines don't do much better on Lisp than the old ones did. And this is sort of close to my heart, because I adapted this architecture on the first machine I ever designed, around 1967 or so. I tried to do a desktop machine that directly executed a higher-level language of the Euler type, but with multiple processes as being the basic idea. It turns out it doesn't work. The reason it doesn't work is basically cost performance.

What do I mean by that? Well, you have to balance what this does for you with how much machinery you actually have to put in there, with how many times you're actually using it, with how many times you're going to change your mind, and what the data structures are going to be like. And notice you can make any kind of a data structure you want here with a programmable system; it just gets cumbersome, because this isn't a good way of extending things. So when we finally set out to do Smalltalk, we had a model in mind, and Smalltalk actually, if you squint at it, you discover that this thing in Smalltalk is what we call the instance state, and it's usually much smaller. And these stack frames in most of the Smalltalks are actually separate objects called activation records, which are allocated separately rather than on a single stack. And if you do that, then you wind up having an object-oriented architecture. Okay.

That was partly led to by reflecting on how Simula might be executed on this machine. But the thing that you have to take care of is this basic assumption about what storage is going to be. The real question is how much can you afford to pay for machinery on data structures, most of which you don't know what they're going to be when you start out. What we did at PARC was, instead of building any machinery on this thing, we tried to make the machine run enough faster than main memory so that we could change our minds frequently in the microcode as we went along. And we finally settled on a few standard formats, larger than the number that the B5000 had, that would do about 95% of all the structures that you could construct in Smalltalk. In this particular architecture,

has anybody seen any of the bugs? I know Steve Saunders knows all these. But notice one of the things that happens when you do a process switch, which is pretty unfortunate. Does anybody see what is in the stack that you don't want to have in the stack when you do a process switch? We can have integers in here; that's okay, doesn't matter. What else can be in here? Yeah, we can have some addresses of data, and the addresses are absolute addresses. Okay, and that means that you can have one of these array descriptors or a procedure descriptor in here that's pointing to something in storage. And with a frozen process that you're not going to use for a while, what you might like to do is to clear out core and let the new process run for a while. And the problem is, if you do that, then the chances of these things still pointing to these guys when you come back is going to be zero. So that led to the first set of immense kludges, which persist actually to this day. The 6700 has this horrible instruction, I forget what it's called, but its purpose is to chase down the stack and find these guys and turn them from absolute guys into relative guys, which will then get triggered back. It does this when something is going to be moved that belongs to this guy. Yeah, barf, terrible. That one we can't pin on Barton; he was long gone by the time they put that one in there.

But it brings up the notion here that the mapping scheme, which worked so well on this fairly primitive machine, doesn't extend. I think

the one that yeah

would check to see that

no reason okay the

storage manager made sure that that happened

But one of the problems with it is that in most of the early successful systems they actually used this structure as the map also. In other words, the operating system would look through this thing, and besides being the program reference table for the program, it was also the storage mapping structure of the whole system, which is trying to make it do a little bit too much double duty. You see what I mean? Because, you know, you have integers there; there are too many side effects that can happen and make it go wrong. So the weakest thing on this machine, we realized by the time we did this stuff at PARC, was that the storage mechanism wasn't good. When we sat down and thought about it, we realized that we actually know almost nothing about allocating storage. It wasn't just a problem in the B5000; it's simply that we keep getting caught, and we're always spending the most time trying to figure out how to allocate storage and how to use it. And

in Smalltalk the scheme that we had was to have much shorter addresses, and the addresses weren't base addresses at all but actually names of objects. Okay, so instead of pointing to anything in core, these things denoted the entire object, and the object was found by looking it up in a separate storage table that would tell you where the thing was. It's a cleaner solution to the thing, and it actually worked. Smalltalk's scheme was the first segmenting scheme that actually worked, thanks to Ted Kaehler and Dan Ingalls; they actually figured out how to do it.
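The idea of pointers being names that are resolved through a separate storage table can be sketched as follows. This is only an illustration under stated assumptions: the class `ObjectMemory` and its methods are hypothetical, a Python dictionary stands in for whatever table structure the real system used, and nothing here reproduces the actual Smalltalk-76 storage manager.

```python
class ObjectMemory:
    """Toy object memory: callers hold names, never core addresses."""

    def __init__(self, size):
        self.core = bytearray(size)
        self.table = {}        # name -> (offset in core, length)
        self.free = 0          # next unused byte in core
        self.next_name = 0

    def allocate(self, data):
        if self.free + len(data) > len(self.core):
            raise MemoryError("core is full; compact or swap something out")
        name = self.next_name
        self.next_name += 1
        self.core[self.free:self.free + len(data)] = data
        self.table[name] = (self.free, len(data))
        self.free += len(data)
        return name

    def fetch(self, name):
        offset, length = self.table[name]      # the only place an address appears
        return bytes(self.core[offset:offset + length])

    def release(self, name):
        del self.table[name]

    def compact(self):
        """Slide live objects together; only the table changes, names stay valid."""
        cursor = 0
        by_offset = sorted(self.table.items(), key=lambda item: item[1][0])
        for name, (offset, length) in by_offset:
            self.core[cursor:cursor + length] = self.core[offset:offset + length]
            self.table[name] = (cursor, length)
            cursor += length
        self.free = cursor
```

Because nothing outside the table holds an absolute address, moving or swapping an object never requires chasing down stacks to patch pointers, which is the problem the absolute descriptors on the B5000 stack ran into.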

What you have to do in order to make one of these schemes work is amazing. Why should anybody go to that trouble? Well, if you are swapping objects, the thing you always are interested in knowing is what is this thing called the working set. The working set in most operating systems is the amount of stuff that you have to have in so that you won't swap more than every, like, 10 milliseconds or something like that. In other words, you want to have enough context in storage so that you're not swapping on every other reference. And paging is terribly inefficient, because no matter how good the allocator is, it's always allocating the wrong objects on pages. And the thing we discovered when we did this in Smalltalk was that the packing efficiency of having objects packed in core by a compacting allocator was a factor of 15 over what pages give you. Okay, that is tremendous; it's like having 15 times as much core.

Unfortunately, the overhead for getting objects is much worse, right? You have many more objects, several thousand objects in storage instead of, you know, a small number of pages. You have to go through a much worse mapping scheme; it's not just a table lookup anymore, you have to do hashing on names. You have to fetch small objects off disks, which is not good, and when you're writing things out you have to write small objects back, which is worse. And that was what caused this scheme to fail on the FLEX machine. But then, as I say, Kaehler and Ingalls, with people pitching in from the sidelines, came up with a composite scheme that solved all of those problems, using partly a technique of realizing that the last thing you ever wanted to have in storage was written-on objects. Okay, and so whether the system wanted to or not, every couple of seconds it would jump through and write out every object that had been written on, whether it needed to or not. And so on only, I think, something like 4% of the occasions when you had to bring something in did you ever have to write anything out to make room; it was always clean stuff in core.
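The effect of periodically writing out every written-on object can be illustrated with a small simulation. This is a sketch under simplifying assumptions, not the Smalltalk-76 algorithm: the `Swapper` class is hypothetical, every object counts as one unit of core, and the victim is simply the object that has been resident longest.

```python
class Swapper:
    """Toy resident set that tracks which objects have been written on."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.resident = {}        # name -> dirty flag, in order of arrival
        self.background_writes = 0
        self.forced_writes = 0    # writes that had to happen to make room

    def touch(self, name, write=False):
        if name not in self.resident:
            if len(self.resident) >= self.capacity:
                self._evict()
            self.resident[name] = False
        if write:
            self.resident[name] = True

    def _evict(self):
        victim = next(iter(self.resident))     # longest-resident object
        if self.resident.pop(victim):          # dirty: must be written first
            self.forced_writes += 1

    def sweep(self):
        """Run every so often: write out everything that has been written on."""
        for name, dirty in self.resident.items():
            if dirty:
                self.background_writes += 1
                self.resident[name] = False
```

With regular sweeps, almost everything in core is clean when room is needed, so bringing an object in rarely has to wait for a write; without them, each dirty victim costs a forced write at exactly the moment the system is trying to fetch something.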

That turned out to be the key. That was adapted from the great old algorithm on the SDS 940 that Butler Lampson thought up, but this time for objects rather than pages. So the result of it is that the swapping efficiency of Smalltalk-76 worked out to be a factor of 9 over paging. In fact, that particular system never ran in more than 80K bytes. Okay, every demonstration we ever did in Smalltalk-76 only swapped in 80K bytes, something that is quite small. At PARC, actually, just after we discovered it worked, we decided to keep it at that size, because that was a size that was going to be possible in commercial machines very, very shortly. The equivalent performance for a system like Lisp took almost 10 times that amount of storage. To run Interlisp on the Dorado you have to have something like 350K words, 16-bit words, in order to get the same swapping efficiency. And the difference is almost an order of magnitude, which was remarkable. Okay.

So that is the basis for thinking about these systems. Now let's talk a little bit about the RISC chip. It's very interesting what they tried to do. The RISC chip was designed in innocence, or at least a partial innocence, of the B5000, which is probably just as well. The first thing to do after learning something like this is to forget it, you know, in sort of a useful fashion. You don't want to remember anything directly; you just want it to be in the air. You want it to seep through your consciousness when you're designing, but you don't want to try and imitate it.

One of the things to realize is that most of the early machines were reduced instruction set computers. It's only that IBM and the Remington Rand corporation

and control data had this thing where they sold

computers on the night and

that led to a whole bunch of what we used to call dropped-screwdriver instructions: you're debugging the machine, a screwdriver drops inside and shorts something out, the machine does something, and you call it a feature rather than a bug and write it into the 500 or so instructions. This is literally what they used to do for a long time. Almost none of them can be compiled to, and most of them can't be understood by machine coders either; you know, they just sit there with cubic feet of hardware to try and execute them. Machines like the PDP-1 and other delightful things that sprang out of the Whirlwind computer had these very, very simple ways of going about doing things, and people noticed that they were pretty efficient even though they hardly had any instructions. The reason, of course, for that is that it didn't take any time; the machine was incredibly efficient, it hardly had to do anything. Those machines are really sort of like ticking clocks.

o from in the sixties

what is the entropy of the program

So Patterson and Séquin wanted a student project; they wanted to do a processor. They had a chance to look at this thing called the OM processor at Caltech, which was an attempt to do an extremely clean, computer-science-type processor by Ivan Sutherland and Carver Mead. It turned out that was an utter failure, it really was. It was one of these things where they got so enamored of dealing with what the machine was supposed to do, I mean, you know, all the things that you read about in the books that are good they tried to do in this thing, and it turned out to be a flop, because the chip got very, very large and was still only a 16-bit processor. It just didn't do very much. So

basically what they did was to take a bunch of C code and Pascal code and use a variety of ways of analyzing it. There's a thing called a Gibson mix analyzer, which is a dynamic way of analyzing what percentage of instructions is getting executed, and they also looked at the code statically. They came up with (I think everybody has the document) a set of percentages of what things got executed when and what you had to do, and they decided they would only optimize two things. One was to try and have the instructions that executed 90% of the time execute in one cycle, and the other one was to try to reduce subroutine overhead. Okay, and that is what they did.

And so they came up with this Mead-Conway design, which is as straightforward in architecture as you could want, and there are several interesting things about the architecture. I'm going to need some help here. Sigh, I actually didn't get a chance to prepare for this, because I gave up my copy of that document, which I haven't read for like six months, so call out when I start going astray. Because of the simplicity and regularity of this thing, they got an incredible ratio of control state to action state. I don't know exactly what it is, but I think this is like 25% and this is like 75%; maybe it's 30/70. That's about almost the opposite of what it is in most chips. In most chips, at least half of the stuff that's on the chip is control gates of one kind or another. So control is very, very simple here; almost all of the silicon area on this thing is used for doing real things. And because of that, on a forty-thousand-transistor chip they could build a full 32-bit machine. That's what they did. So this is 32 bits wide

and you can divide it really into two parts. The uninteresting part is sort of this part here, and that is where things are actually done. There are bunches of latches and an ALU, shifters (shifters first, if I recall), and some buffers and so forth. That's where stuff actually gets done on the chip. Then this part of the chip has, I think, 132 registers, if I remember the number correctly. These registers are mapped registers; in other words, they're not registers you can get to directly, but they're registers that are used implicitly by the code in the machine. And the mapping is kind of clever; let's see if I can remember it. The register space for each process is 32, so you can get to 32 registers. If I'm correct, I believe nine of these are global, and the other 24 are local to each subroutine.

Then somebody thought of a real clever idea; this is an incredibly clever idea. They had two problems they had to solve. One problem is how do you pass parameters back and forth; for instance, parameters are passed on the top of the stack, just as operands are on the B5000. And the other problem is how can I map these guys in here in a very rapid way so I don't have to futz around; I want to have zero overhead. Somebody got the idea that what we should do is think of these registers as being overlaid like this. I guess they grow from the bottom to the top, so let's do it that way. Here's 24. What we do is, at 16, where 16 cuts off, we overlap the next set of 24. Okay.

The result of that is that these two guys share six registers. Those registers are what is used for passing parameters forward and back; these are, if you will, the in/out variables of the Pascal-type thing that they're doing. And thus on it goes. Now, what that means is that the start address of each register block that you care about is on a boundary that can be gotten to by straight logic. Okay, in other words, in order to get one of these offsets they only have to OR a number into whatever; they don't have to do any actual additions, just OR. If you want to get variable 5 in this guy here, or variable 5 in this guy, you're just ORing a 0101 into whatever one this thing is in this array. So there's no add here to calculate any of these things; you just slam the things together, and whichever one is active goes up and down. How many of these are there? What is this, 10 frames or something? 10 frames, 12 frames? Does anybody know? 16? So it's probably 9 frames or something like that. So you could store nine frames, and of course in a language like Pascal, which is almost never used recursively, this is a perfectly reasonable depth. Their statistics showed that most processes that were written in Pascal, and especially in C, never went to more than a depth like this. So it means that the thing never touches storage for all of its temporaries, so it just runs like a bat out of hell.
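The overlapping register blocks can be sketched in a few lines. The figures used here are assumptions taken from the description above, which the speaker himself flags as half-remembered: frames start every 16 registers and adjacent frames share 6. With those two numbers each frame sees 22 windowed registers, not 24, so treat the constants as illustrative. The function name `physical` is hypothetical.

```python
WINDOW_STRIDE = 16                      # assumed: each frame starts on a 16-register boundary
SHARED = 6                              # assumed: registers shared by caller and callee
WINDOW_SIZE = WINDOW_STRIDE + SHARED    # 22 registers visible in a frame under these assumptions


def physical(frame, reg):
    """Map (frame number, frame-relative register) to an index in the register file."""
    if not 0 <= reg < WINDOW_SIZE:
        raise ValueError("register number outside the window")
    return frame * WINDOW_STRIDE + reg


def base_or_offset(frame, reg):
    """Same mapping for the first 16 registers, using OR instead of an adder."""
    if not 0 <= reg < WINDOW_STRIDE:
        raise ValueError("the OR trick only covers offsets below the stride")
    return (frame * WINDOW_STRIDE) | reg
```

Because every frame base is a multiple of 16, its low four bits are zero, so ORing in an offset such as `0b0101` gives the same result as adding it. The top six registers of one frame are the bottom six of the next, so parameters are passed by a call without moving any data.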

Right. This particular scheme is not nearly as good for Lisp, for instance, for several reasons we'll look at. It's not very good for Smalltalk, because as we noted, Smalltalk has both an object state and an activation state, and there's no place in here for the object state; it has to stay outside. So if you want to use this chip for doing something like Smalltalk, then you have to do something to it. The other drawback on this is the code. What he thought was, "I want the object code to be like microcode," so the code is humongous: every instruction is 32 bits long whether you want it to be or not. That's ridiculous. Okay, that is what the design of this chip is. So if you're willing to put up with 32-bit code on every instruction, this thing will run each instruction that it has in one cycle through the thing, so there's no appreciable overhead on any of the calculations

that take place in here that use this thing. The cycle on this thing, I think, for the chips that they're making now, their first ones, is 200 nanoseconds a crack, and the design of this thing was to execute instructions at 100 nanoseconds; they're crawling their way up to that now. It's quite easy if you see that there's nothing much the thing is doing. Look at a 68000, which is less than twice the size of this, and it's a nightmare. This thing is just laid out like Manhattan Island, and it's very, very simple; the control is very simple. So

now you look at this thing and you start thinking thoughts, because a hundred nanoseconds of instruction time is not that bad, especially if you know ahead of time that most of those instructions are really working for you. You've got something that's really worthwhile thinking about building. We know, or at least we pay lip service to the idea, that we want not just higher-level languages here but very-high-level languages, so a machine that simply executes Pascal and C very rapidly is not nearly as interesting at Atari research as it is in HDD. It turns out Apple is very interested in this machine; they have been courting Patterson for some time, because their language on the Lisa is a language that is basically Pascal with classes. Okay, so it does some of the things that Smalltalk does. So what can you do to this chip?

I think what I'm going to do is just give you a couple of hints, and in the next one of these that we do on this thing we'll go through the scheme in more detail. One of the things that we can do is make it a little bigger. What we sure would like to feed in here are byte codes, these things like those code syllables. One of the most ridiculous things in the world, if you have a reduced instruction set, is to have the instructions be 32 bits long. Why can't we make use of the fact that we're actually doing a higher-level language, and that we know where everything is going to be? It's really all relative addressing. So we'd like to use that, but to keep the control simple, at some place we have to have a PLA that will translate one of these guys to the wider paths necessary to control everything. So one of the simplest things that we can do here is to think of this area as being an instruction cache.

Now, in all of the studies on caching, only a few facts have remained. One is that caching works where paging doesn't because the ratios between the two memory speeds are different. Okay, that's basically what it comes to. Caching has very small objects, and the ratios are usually no worse than a factor of 10, so it works. Paging has very large objects, and usually the ratios are factors of a thousand or a hundred, so it doesn't work. The other thing that's known about caching is that code behaves much better than data does. Like, in a reasonable setup, machine code has the very important feature that when you overwrite it in the cache you never have to store it back out, so the cache is just sucking stuff in.

Then the question is what does the cache look like. Is it fully associative? Is it an overlapping code segment thing? That has to do with how complex the association paths for interpreting addresses are. Down this way, on this control side here, right up here would be a barrel shifter, actually a limited barrel shifter, that is taking these syllables and feeding them in. You can think of these things as really being of variable size, maybe 8-, 16-, and 32-bit-sized things. What it's doing is looking at the leading-order bits of whatever's there and making a decision as to how much it's going to map down through the PLA that's going to cause control to determine standard sizes.
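Picking the syllable size from the leading bits can be sketched as below. The encoding is an assumption made up for this illustration, since the talk gives no bit layout: a leading `0` means an 8-bit syllable, `10` a 16-bit one, and `11` a 32-bit one. The function names are hypothetical.

```python
def syllable_length(first_byte):
    """Return the syllable length in bytes, decided by the leading bits alone."""
    if (first_byte & 0x80) == 0:      # 0xxxxxxx -> 8-bit syllable
        return 1
    if (first_byte & 0x40) == 0:      # 10xxxxxx -> 16-bit syllable
        return 2
    return 4                          # 11xxxxxx -> 32-bit syllable


def split_syllables(code):
    """Cut a byte string into syllables, as the shifter would feed them onward."""
    syllables = []
    position = 0
    while position < len(code):
        length = syllable_length(code[position])
        if position + length > len(code):
            raise ValueError("truncated syllable at end of code")
        syllables.append(bytes(code[position:position + length]))
        position += length
    return syllables
```

Since the length depends only on the first byte, the decision needs no lookahead, which is what keeps the control path simple even though the code stream is variable-length.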

true enough

Yeah, we should mention that besides being a general man about town and a good fellow, Steve Saunders's thesis was on minimization of code size through various kinds of compiling techniques. And as he says, by going to a variable length, by taking the language that you've got and going a couple of steps further than the RISC people, you can build a compiler that will generate optimal-size code, which can be peeled out just as easily as variable size of a few things. Definitely true, providing that the largest size is no larger than this; otherwise everything else is really quite easy.

Okay, there are two more things that you have to do to make Smalltalk or a language like it work on this chip. There has to be a place where instance state is put; I'm going to talk about that more next time, so let's just think that we put it there. It turns out in a language like Smalltalk there's almost no depth of stack, because control is always passing sideways to objects. You're rarely going deep; you're going sideways all the time, and so the typical stack depth in the Smalltalks is less than five. So you need to have a cache for instance state here; we can talk about what that looks like later

on. Then the final thing that we have to have to make a scheme like this work for Smalltalk or Lisp is we have to let arithmetic run quickly. Or, another way of putting it, there is a subset of these byte codes that have to fulfill two purposes. One is they have to refer to operands that are protected, like in the B5000. Okay, these guys better know they're going after an integer, or all is lost. So what they're picking up has to be something that was protected by one of those flag bits somehow, and cause an interrupt if what it picked up was an address to something more interesting than an integer. Okay, that procedure in Lisp is called unboxing.

Take a typical Lisp or Smalltalk address space. The way Smalltalk was laid out in Smalltalk-76 was with 16-bit pointers, so that gives you a range of 65,000 things you can refer to. Each thing knows how long it is, and the storage manager knows where they are. Okay, so these things are actually names. The average object size in Smalltalk-76 was about 20 bytes, and 20 bytes is about four and a half bits, so this gives you about a 20-and-a-half-bit address space, okay, in equivalent objects. Does everybody understand? In other words, we could refer to about a million words using only 16 bits for a pointer size.
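The address-space arithmetic can be checked directly. The 16-bit pointer width and the 20-byte average object size are the figures quoted in the talk; the variable names are just for this sketch.

```python
import math

POINTER_BITS = 16          # width of an object name in Smalltalk-76, per the talk
AVG_OBJECT_BYTES = 20      # average object size quoted in the talk

names = 2 ** POINTER_BITS                       # distinct objects a pointer can name
size_bits = math.log2(AVG_OBJECT_BYTES)         # bits "contributed" by object length
effective_bits = POINTER_BITS + size_bits       # equivalent byte-address width
reachable_bytes = names * AVG_OBJECT_BYTES      # storage reachable through names

print(f"{names} names, about {effective_bits:.2f} address bits, "
      f"{reachable_bytes} bytes reachable")
```

The exact logarithm of 20 is a little over 4.3 bits, so the "four and a half bits" and "20 and a half bits" in the talk are rounded up slightly; the conclusion that 16-bit names reach on the order of a million bytes holds either way.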

It turns out you'd like to refer to more than a million words. Sixteen is maybe a power of two, but boy, that's all it is; everything else is wrong. 24 is much better, it turns out. But then there are certain small things that you'd like to have implicitly encoded in the addresses, in particular integers. I forget the way it wound up, but at one point the integers in a small range on either side of zero were encoded into a section of that; that is, they never existed as explicit objects in storage. The system knows that if an address is in this range, these things are actually integers that it should unbox. These are called boxed integers; there's a box around them. Other integers were actually encoded as real objects.
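Reserving a section of the name space for small integers can be sketched like this. The bounds and the position of the reserved section are hypothetical, chosen only to make the example concrete; the talk does not give the actual range, and the function names are invented.

```python
NAME_SPACE = 1 << 16                    # 16-bit names, as in the talk
INT_MIN, INT_MAX = -1024, 1023          # hypothetical range of encoded integers
INT_BASE = NAME_SPACE - (INT_MAX - INT_MIN + 1)   # reserved section sits at the top


def encode_small_int(value):
    """Return the name that stands for this integer; no storage is allocated."""
    if not INT_MIN <= value <= INT_MAX:
        raise OverflowError("too large to encode; needs a real object")
    return INT_BASE + (value - INT_MIN)


def is_small_int(name):
    return INT_BASE <= name < NAME_SPACE


def unbox(name):
    """Recover the integer from a name in the reserved section."""
    if not is_small_int(name):
        raise TypeError("ordinary object name; look it up in the storage table")
    return name - INT_BASE + INT_MIN
```

The range check is the whole cost of telling an integer from an object, which is why the translation can be left to the last moment before the arithmetic operation.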

part

smaller okay

small arms in this

okay and the

So whenever these are encountered, they have to be translated into something that's a real integer that can go through the operation. That translation has to be done at the last moment. The 3600 has a fair amount of circuitry to do that.

One thing we could do on this system, supposing we're moseying the thing up to 32 bits wide like this machine is: now I don't think we feel too bad about having a flag bit. The flag bit can be very simple. In fact, I think in the scheme that I came up with I put the flag bit up front there and let it be zero if it was an integer, okay, and if it was a one then the thing was something else. In other words, we give up half the address space for numbers; it's like what the B5000 does. And then I think we can see that we can decide to have a particularly simple encoding between direct operands and addresses. And by the way, I think you all realize that when I say one bit I mean at least one, but one bit is enough for doing all the operating system things we want to do. And essentially what we need to do is to have some of this control state let the action go ahead, because after the ALU there's this latch here that has to be in the right state in order for the operation to mean anything. Okay, and so

one way of handling this problem of how can we write an operating system in a really high-level language is just to say: every time we do a primitive operator like plus, we realize that the plus may have lots of different meanings. In Smalltalk, for instance, there are probably 25 different meanings for plus, 25 different routines scattered through the microcode and Smalltalk code for plus, and 150 meanings for print. But there's one plus that better run as fast as the machine can run it, and that is the one that adds two integers together. Okay, the way to do that on this machine is to just let it go: right here you get a chance to look at both flag bits. Just let the operation proceed, and then the decision whether to recycle it back on the C bus or the B bus, like this machine has, has to do with what the state of those two bits is. If they're both zeros, then you've got the right answer. If they're not both zeros, you're going to have to do something more, in which case the way to do that is to interrupt to something that then looks at these things more closely. Basically the idea is you have the generality you're willing to pay for, but if what you're doing is not general, then you want it to have no appreciable overhead whatsoever. If you do that, you now have a machine that will run, if this thing runs in 100 nanoseconds doing what we said, it'll still run a higher-level language and run all the things that have to work quickly to do an operating system at that very same speed. You've got a very nice little chip, which we'll talk about in more detail next time.
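The "let the add proceed, then look at both flag bits" scheme can be sketched as follows. Several things here are assumptions: the flag is taken to be the top bit of a 32-bit word, with 0 meaning integer as in the talk; integers are treated as unsigned 31-bit values for simplicity; and sending an overflowing sum to the slow path is a choice made for this sketch, not something stated in the talk. The names `fast_add` and `general_plus` are hypothetical.

```python
WORD_MASK = 0xFFFFFFFF
FLAG = 1 << 31            # assumed: top bit 0 = integer, 1 = something else


def fast_add(a, b, slow_path):
    """Add two tagged words; fall back only when the flag bits demand it."""
    raw = (a + b) & WORD_MASK          # the ALU runs without waiting for any check
    both_integers = ((a | b) & FLAG) == 0
    fits = (raw & FLAG) == 0           # a carry into the flag bit would corrupt the tag
    if both_integers and fits:
        return raw                     # common case: the result is already right
    return slow_path(a, b)             # the "interrupt": look at the operands closely


def general_plus(a, b):
    """Stand-in for the many other meanings of plus; only reports what it saw."""
    return ("general", a, b)
```

For example, `fast_add(2, 3, general_plus)` returns `5` without ever calling the general routine, while an operand with the flag bit set is handed to `general_plus` unchanged. The integer case pays for one comparison that hardware would do in parallel with the add.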

Steve probably has some comments. Yes, you might mention something about the kinds of encoding efficiencies that you get when you actually look at it. Instead of push, push, pop, pop, then do an operation, with all the attendant shuffling of bits around, if you tell the operation what the compiler knows about it, namely whether its operands were computed, which basically means they have been pushed on the stack, or whether they're direct, in which case all you feed it is an address and it just gets it, you can save a lot of push-pop.

that combined with

variable

size fields dynamically variable size fields

four addresses the only

happening locally

known names besides the locality varies

just to do those things straightforwardly

roll that together with the overlapping

in fact that's where the small named

What kind of packing efficiencies did you get in your thesis? Over, let's see, BLISS-11 was taken as unity. That's a C-type language, right? Yeah.

thousand

benchmark immunity

my thesis stuff without any

optimization that was the set point just

just not customizing

straightforward nation on this variable

and that's that many fewer benches form

with that much smaller phone sessions

that's

why I say 30% like this the reason

this is a very nice for Atari this

is a very nice area and

generally speaking there are other things you'd rather optimize

first like the graphics situation but

once you sort

you can do there this is the next interesting thing which

is how can you run say

artificial intelligence programs of some scope on

a consumer level machine without going

six year you know the double cycle

to get enough memory on the things

this is one way of doing it

Smalltalk was predicated on the idea that there would be some

sort of solid state swapping device like a bubble memory

that would match

up so it turns out a bubble memory is a perfect

device for the particular algorithm that we developed in

Smalltalk-76 because it's not too fast

it turns out it works better

when the on a crummy disc it does huh yeah

what you have is not as good as you'd

be in a particular way we went about doing things where

it's much better and you

get a system that can do real live you know Ira's PIE

system which is an enormous enormous

system done in Smalltalk it uses everything that the damn

system had there's huge millions of bytes of code swapped

in 80k bytes and it ran quite

acceptably running byte

codes at the rate of about 300,000 per

okay this thing can run what we call byte codes which are those fast ones

at the rate of 10 million per

speaking of aortic recep another piece of work

diversity

this kind of a mechanism which is in fact a general

simple processor the only simple system

that one should wire into it is the exceptionally simple

one of small integer array

there are other simple powerful

subsystems in general simple systems

there are more primitives

out there that ought to run faster we have yeah

absolutely I'd like to I'd like to

some of those some

of the stuff that underlies set

set theory

really be useful yeah well as a

would have the same kind of disadvantage that the B5000 had that

programmer would realize that he needed it

but

I think there are a few

simple systems

on a different style than small

integers that would turn out to be the right ones

for a lot of what we do but

I think that you know the this

whole coprocessor thing it's not done very well right now

I don't

I don't think that's right I think

it should be you know you ought

to be a mother s you

that there are that

there are coherent small

simple systems will provide useful basis for things

address arithmetic is really where all the action

is going on and all the stuff small pogrom

is okay but there are a few other things like

hashing functions okay those

only become address or it would take through the contortions

or not why not pick one that's that's Universalist

and build it in reverse so that hashing

is a cheap thing just like indexing
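
One way to picture hashing as "a cheap thing just like indexing" is to treat the hash as an addressing mode: the key goes through a fixed function and comes out as a table offset, with the same shape as an indexed address. The sketch below is an assumption for illustration only. The talk does not say which hash function would be built in; multiplicative hashing with a golden-ratio constant is swapped in here because it is a multiply and a shift, and the function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF
GOLDEN = 2654435769   # floor(2**32 / golden ratio), the usual multiplicative-hash constant

def indexed_address(base: int, index: int, elem_size: int) -> int:
    """Ordinary indexing: base + index * element size."""
    return base + index * elem_size

def hashed_index(key: int, log2_size: int) -> int:
    """Map a 32-bit key to a slot in a table of 2**log2_size entries."""
    product = ((key & MASK32) * GOLDEN) & MASK32   # keep the low 32 bits
    return product >> (32 - log2_size)             # take the top log2_size bits

def hashed_address(base: int, key: int, log2_size: int, elem_size: int) -> int:
    """Hashed 'addressing mode': same shape as indexing, with the key hashed first."""
    return indexed_address(base, hashed_index(key, log2_size), elem_size)
```

Collision handling is left out on purpose; the only claim is that the address computation itself can be made about as cheap as an index calculation.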

okay think of the power ahead gets you also

reverse rotations for fast

transforms FFTs and stuff

you know my list when I came to Atari I first

started thinking about designing a processor

then it gradually dawned on me that a processor

is actually very low on the totem pole of priorities

the priorities I came up with

the number one and the number two priorities

have to do with sort of frame buffer

and graphics

chip a deadly embrace

between those two because they

trade off against each other so much number three is doing

storage management

tell

you one thing we'd trade this chip and

no storage manager for a 286 and

the storage manager chip any day of the week

and then finally down here is something

like this is a reasonable CPU that

will actually do something for you one of these

interesting things you can get

efficiency here and then give up a factor of 200

once you start swapping if you haven't worked

out a thing so it's it's a question of where

the problem with these things is that they're so much easier to understand

than these things that's

why hardly anybody works on these it's nobody

said nobody really understands you build one of these things and

you don't build it programmable you're dead
