MSM: You joined AT&T in '76?
Weinberger: That's right.
MSM: From what?
Weinberger: From the University of Michigan. I was teaching mathematics. My background is in mathematics. Got my Ph.D. in '69 from Berkeley. And uh...
MSM: What kind of math had you done?
Weinberger: Number theory. Analytic and algebraic number theory. And I worked for a year in Washington, then I went to the University of Michigan, taught math, wrote math papers and stuff. And then I didn't get tenure. And so I looked around for.... It's a big decision... It's what would I rather do? You could pretend you have many choices but the basic choice was would I rather be at a first class place or would I rather keep doing number theory and end up living in a place where perhaps I didn't want to live. I don't know that I could have gotten tenure at the University of Illinois but that was the kind of place that...where I didn't particularly want to live and I managed to get a job at Bell Labs in Development in '76 and worked on computer stuff - databases and things. And in about a year and a half (as I remember - my memory for this stuff is not good) I transferred over into research. That was done by when we started working on AWK.
MSM: I see...so you came in with AWK. How did AWK come about?
Weinberger: Well, as I remember, I came over to ask while I was doing this database work in development (up the hill someplace)...I came over to ask Aho about parsing because there were these...there was this sort of current about extensible languages. And it seemed like it would be convenient to, you know, add.... It was one of those things where the name is more than what the technique actually provided, ok. Extensible languages seems very desirable. The user could adjust the syntax in the language in some way and the compiler would somehow automatically understand it and it would all be more convenient for the user, and so forth. And, um...Al was not very enthusiastic about that approach for...the technical reasons were that he was at that time, and I think as everybody is now, much more in favor of table driven parsers, in particular LR parsers, and if you've compiled a table it's a little hard to adjust the syntax. And I suspect in retrospect he didn't think much of the usefulness of the technique. And I think in retrospect, he was probably right. Extensible languages is not something that worked out well. And the places where people are very enthusiastic about tailoring the syntax of...of...for users. There's two of them that I can think of. One is emacs, which is this editor where you can bind keys in weird ways. And the other is with something called read macros in Lisp. And the effect of both of those is that I can't enter your world, because each person personalizes his world in some...some way they like and nobody else can cope. So in fact it's not at all clear it was a good idea. But he said 'oh, you're in databases?'. He and Kernighan had been talking about some database-y extensions to UNIX. So we sat around and talked about this stuff and there's roughly speaking 2 pieces to databases. One is the question of how you get stuff out of the database. And the other is the question of how you sort of put stuff into the database.
And putting stuff into a database gets involved in these 'are we going to allow for concurrent transactions' and 'do we have to do locking' questions, which UNIX was not particularly good at...was incapable of in those days. And it was just all too weird. Eventually we settled on the idea of what we wanted was some...thing that was...some tool that would let you get stuff out of ordinary UNIX files in a way that was...more general, more useful, more database-like, more report-generator-like, I don't know exactly what, than stuff like grep, which just searched for these patterns. That was all it did. But you didn't want to give that stuff up either. So whatever you were going to do was in this world of.... It wasn't going to just go into the database world, which to this day doesn't understand regular expressions. It believes everything. A lot of them are firmly into fixed size fields. And a lot of other stuff is pushed by a somewhat spurious view of efficiency, at least in small databases. So that's AWK, ok? AWK is like grep, except it understands numbers and it understands that things could be divided into fields, and.... We went through a bunch of versions. As with everything, you get a lot of versions right at the beginning, and then things get stable with only slower changes.
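The idea described above (a grep-style pattern, plus knowledge of fields and numbers) can be sketched roughly as follows. This is an illustration in Python, not AWK itself; the function name `select_and_sum` and the sample records are made up for the example.

```python
import re

def select_and_sum(lines, pattern, field_index):
    """Sum one whitespace-separated field over the lines that match a regex."""
    regex = re.compile(pattern)
    total = 0.0
    for line in lines:
        if regex.search(line):                   # the grep-like part: pick lines by pattern
            fields = line.split()                # the "divided into fields" part
            total += float(fields[field_index])  # the "understands numbers" part
    return total

# Hypothetical records in an ordinary text file: name, department, amount.
sample = [
    "alice sales 120.5",
    "bob   support 80",
    "carol sales 99.5",
]

result = select_and_sum(sample, r"sales", 2)
print(result)
```

Plain pattern search would only print the two matching lines; the extra step here is treating the third field as a number and adding it up.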
MSM: Can you assign responsibilities to the AW&K?
Weinberger: Not so much in the finished object. I can pick out in no particular chronological order some of the things that people did. Always, Aho was in charge of the regular expression stuff, because you need some variant of regular expression stuff and he can just write it. He doesn't have to worry about it.
Weinberger: I did...we discussed syntax a lot. Kernighan likes free-form syntax. That is without extra parentheses and stuff, because it's so much easier to type. The problem with free-form syntax is that after a while the language becomes very ambiguous, and it's hard to see what it all means. I like more parentheses and stuff like that. So we tried out a couple of different syntaxes. We spent a lot of time discussing, all of us discussing, this pattern-action business...was that really what we wanted to do? And several rather more elaborate ways of doing it which were more database-y, which in the end didn't go in because either...we couldn't either...it turned out they didn't add anything. It was just doing the same stuff anyway. Or we couldn't figure out exactly what they meant and how to make them work. I think I wrote the first...a very early translator to try to...all it did was take these provisional languages and put out C code so you could compile it. And...this may not be quite right. I think Kernighan wrote an interpreter, and then I wrote an interpreter and then some years later Kernighan re-wrote the interpreter. Or we both re-wrote the interpreter.
MSM: Am I right that there's a certain YACCy flavor to it? It looks a lot like C-code...
Weinberger: I ought to say a few words about the appearance to the users of the whole thing. Yeah, the pattern action stuff is...I'm not sure that yacc...yacc must have been in our mind. Because in those days especially there were not very many models of that stuff. There's another, I forget whether these are Markov somethings or Post somethings...one of these constructs in theoretical computer science. Maybe they're Post productions...you match the left side and transform according to the right side, which is also vaguely the yacc model. It's not just yacc, it's several things. The C-like appearance of it all is...instead.... Well, the idea is if you made it just like C then you wouldn't have to explain it to the people around here. Or, that's how I remember the idea. We may not have been quite so concrete about it. What's interesting is that at the same time that we were doing AWK, there was a project at Xerox PARC called Poplar. Maybe called Poplar. Anyway, whatever it was called there was this project at Xerox PARC, which had fairly similar goals. That is they were going to take these files consisting of a lot of characters and you break them apart in various ways and you process them using some language and you pass them on. And they put a lot of work into clever ideas. It was a functional programming language. Which, that's...especially then a big deal, now it's less of a big deal. And it was supposed to be user friendly and in addition you wrote the program sort of here and then you worked an example on the other side of the page and the compiler checked that the program worked the example the same way you had. So that you had some feeling for what the program was doing. And there were several other quite nice ideas. It was a real research project as opposed to AWK, which had its research aspects but, you know...I wouldn't call it a hack but what we had in mind was producing this program to use, right...a useful tool.
A lot of the stuff we were just prepared to push on. And, so this is a probably fairly friendly user interface. I don't remember the language but you had to learn it but it wasn't really complicated. Whereas we had all the C-grot and there's lots of C-grot that...particularly bad partly because of power syntax. There's this...the...one of the less...one of the ideas that I pushed fairly hard, which in retrospect is a little mixed, was this idea that to do string concatenation, you just...stick the stuff there. And this just gave the parser complete hell. It's a swell idea but it doesn't mix well with, you know, minus signs or something. And, we really went through hell keeping that working. They had all sorts of stuff. And AWK lived and Poplar died. And I don't know...you know...this has affected my view about people who talk about user interface fairly severely. I'm no longer convinced anybody knows anything about user interface. It's clear some things are easier to use and some things are harder to use. That's okay. But it's also clear that people...people...learn a lot. Okay, and that some kind of...that, although AWK's language is both C-like and not real elegant, a lot of AWK programming isn't done by getting the reference manual and writing code. What is done is by finding some AWK program that's very similar to what you want to do and changing it, okay? And then the fact that it's a little weird isn't so bad. Because, you know...you're just changing it. AWK lives partly, I think, because of its programming by example. The style you write code. Um....
MSM: I have Poly-AWK on my home computer, so I use it quite a bit. Indeed, I use it...I got it because I was just interested in it and I discovered it down in Holmdel. But then I had to deal with the ASCII file output of the database and I didn't know...it proves that I can do anything that....
Weinberger: I know, I know...it's...it's been a real success. It really worked out well and all I can...the only reason I can think of for that is because what we wanted to do with it was something that was worth having a tool that does. A lot of the other aspects of the program are irrelevant compared to the fact that it does something useful.
MSM: Let's take that back for a second, because one of the big criticisms of UNIX is user interface. It's a difficult system to learn. I wonder to what extent what you just said about AWK applies to UNIX as well?
Weinberger: Well, there's 2 pieces to that. Let me do not the one you asked but the one I thought you were about to ask. The way you compose programs in UNIX is this pipe. What the pipe...and...well, that's just what the system provides the way it's used. Almost invariably, is...this guy puts out text and this guy takes in text and this is essentially aside from that very low level thing. This is a technique with no structure. So it's academically completely unrespectable. Because there's no types. Everybody's got to agree. Now, in UNIX the convention that makes this work is everything is line oriented. Or, almost everything is line oriented and everybody at least knows they're getting lines and they have to cope as best they can. But for many years UNIX took a lot of crap on that I think in academic circles because it's so untyped. Academics don't think of ASCII as the type, right? They think of something much stricter. But one of the experiences I think of all that sort of computing is that there is a balance between safety and usefulness and a balance between various flavors of generality. And the kinds of generality I think you have here is that if...if...if you only passed typed objects, of course, some kind of Smalltalk interface...if everything came with the type. That would be one kind of generality. The other kind of generality is you just passed random bits and it's up to everybody to cope. I think UNIX was successful because it turned out that a lot of the safety that people relied on or that academically seemed respectable because you can say things about it...just...just was pointless. There's no need for it in many things. But of course, UNIX programs screw up all the time. Because one guy puts out output in a form the other guy cannot accept as input. That's one of the reasons for AWK, after all...is you gotta transform the output. But...if the...you can view this in retrospect of course...part of the UNIX philosophy as being...that's just inevitable.
And it's important that the system provides a convenient way of coping with that. So the idea that you have...that you can stick an extra guy in the pipeline, shell scripts or AWK or whatever, to somehow adjust all of this is very much in the UNIX flavor of things. On the other hand, it means that UNIX is a system for people who...who...for people to use it like that. People have to be prepared to tinker. And if you provide a system where people can tinker, it's bound to have a lot more stuff at the top level than a system where people can't tinker. And I think you could contrast that with PC interfaces. This is long after the fact. People say 'well the PC interface is uniform'. Well it's not uniform at all, in fact it's really crappy. But it does have properties. That is when you get WordPerfect...WordPerfect has 17 flags...you want Lotus 1-2-3 output, you want text output, you want Harvard Graphics output, you want all these different kinds of graphic output. And you buy something that either has the features you want or you can't have it. There's no chance of adjusting it yourself. And although people say UNIX is hard to learn, I noticed that each application that comes with your PC has a manual that's the size of the full UNIX documentation, at least the full research UNIX documentation. Now it's a hell of a lot less terse. But nonetheless, this idea that, for each dollar you spend on some piece of software you deserve 10 pages of documentation, seems to sort of pervade the PC world. So, yeah, I think it's a little more...somewhat more complicated and that...in...but I don't believe it's all that much more complicated. I do believe nobody explains it very well. And certainly those of us who are experts are not very good at explaining well and probably not capable of explaining it well. My guess is that there is a modest amount to learn and you can use it. And the truth is our secretaries use it. We don't have a special system for secretaries. They just use it.
Now, when you watch them use it you say 'oh, but there's so many easier ways of doing it...there's this and this'...but it doesn't really matter. They don't have to use it perfectly.
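The "extra guy in the pipeline" idea above can be sketched as three line-oriented stages. This is only an illustration in Python: `producer`, `adapter`, and `consumer` are hypothetical stand-ins for real programs, and the two text formats are invented for the example.

```python
def producer():
    """Hypothetical upstream program: emits lines of the form 'name,count'."""
    return ["apple,3", "pear,10", "fig,7"]

def adapter(lines):
    """The extra stage: rewrite 'name,count' as 'count name', one line at a time."""
    for line in lines:
        name, count = line.split(",")
        yield f"{count} {name}"

def consumer(lines):
    """Hypothetical downstream program: expects 'count name', sorts by count."""
    return sorted(lines, key=lambda line: int(line.split()[0]))

# Without the adapter the consumer cannot read the producer's output;
# with it, neither program has to change.
piped = consumer(adapter(producer()))
print(piped)
```

The only contract shared by the stages is "lines of text", which is the untyped convention described in the answer above.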
MSM: Did you know about it before you came here?
Weinberger: When I knew I was coming here, I started looking into it. I'd heard of it, but it wasn't something that was easy to find.
MSM: What was the status in '76?
Weinberger: Well, it was being shipped to universities out of research. I think that was version 5 in those days. Possibly the beginning of version 6. Version 5 I'm pretty sure was the assembly language version.
MSM: It hadn't reached Berkeley by the time you left?
Weinberger: Oh no that was '69, no, no...and I think it went to Berkeley with Ken's sabbatical, which was probably '75. In '69 it was hardly anywhere. Even here. But you could find it at universities. But I went back to Berkeley the summer of '75, I think. I don't remember when I was back in Berkeley. But anyway, that's where I had...I think it probably was in between semesters or something...it was the winter. And Ken was there and I met Ken at the house of a mutual acquaintance and learned to use some UNIX stuff. And found it...you know, I was learning C at that point too...and continued to learn when I got here. But I found it much easier to use than the world I was used to. The...one of the...it had a bunch of really good properties which I suspect are partly Doug's responsibility. You can take the manual which was pretty big even in those days and you could read it once and do some things and you could read it again and read it again and by the third time through you actually understood how the system worked to a large extent. All of a sudden things became much less surprising. And then all the source code was on-line in those days. It wasn't all that much source code. And you could look at it. And a large fraction of it had been written by people with very good style. Right, like say Dennis. And you could learn to write C from it too. So I found it very easy to learn and I found it very easy to switch from FORTRAN to C.
MSM: FORTRAN is what you had been using?
Weinberger: Yes FORTRAN is what I've been using.
MSM: You spent a lot of time around computers in Michigan? Because Michigan had a computer environment.
Weinberger: For an academic computer center, they had a first class place. It was very innovative. They had their own operating system, which was not a bad operating system. Like all home grown operating systems, it had these places with just appalling weakness, but usually users don't notice them...just like our home grown operating system. They had quite a good computer setup and it was all interactive. I'd been doing a fair amount of computing. My advisor D.H. Lehmer did...does...did...does a lot of computing.
MSM: He was one of the first to suggest using computers.
Weinberger: He started using them as soon as you could get on them...which is of course about '46. And kept using them quite productively....very ingenious things. Built special purpose fiber and all sorts of stuff. So I was pretty much into believing that they were useful and certainly in straightforward number theory. You could use them for experiments, which was quite convenient. You could use them to find numbers which...the world of mathematics is a very peculiar world. It's essentially a human creation but it's pretty clearly somehow there. So the experiments you do with computers in mathematics, or the investigations you do using computers, aren't...it isn't like the simulations weather people do, which are just a model. That's reality. It's the one case where you're actually at reality. Hugh Montgomery and I, for instance, wrote a paper where we did substantial calculation. I did the substantial calculation and he did the substantial mathematics and found some numbers that were unnaturally close. They were zeroes of some L-function...all that sort of number theory. They were unnaturally close. It turned out this pair of unnaturally close numbers could be used to...you did a whole bunch of estimates and stuff like that and...and they could be used to cover a range from something like 10 to the 400th or 10 to the 2500 of cases in some other problem. So there's this peculiar business of sort of missing the...ontological gap, which is the normal situation in modeling, when you use computers in mathematics. That's right. The things you discover are there. They're not just hypothetically there. You don't have to pretend the model is the reality, you know....
MSM: You have somebody reading them...reading the introduction to Minsky's 1969 book on mathematics and computation, finite machines, he talks in the beginning there about that, the difference between the computer as a machine and ??????????, one's a model and the other is...
Weinberger: Yeah, but see, in mathematics there are at least...if you're...there are cases where there is no difference in mathematics. What you tell the machine to do it's not doing it on the model it's doing it on the mathematical reality. There's no gap. Which, I think, makes it very appealing. And one of the...there's a difference between a computer program and a theorem, which some people...I think is a fundamental difference, at least to me. Which is that when you prove a theorem you now sort of know something you didn't know before. But when you write a program, the universe has changed. You could do something you couldn't do before. And I think people are attracted to different...in different...to these kinds of things. I've always found thinking a lot of work. And anything you can get the computer to do that's not what you would call thinking. But I've always found it hard to get things right. And once the program is working, it's working and if it can do different forms of the calculation for you, it's not just that it's faster, it's also more accurate than doing them over and over by hand, when you can get it to do it. So I've always been pushed in that direction...by myself...by my weaknesses...
MSM: One of the things I've noticed...
Weinberger: It's going to take us away from UNIX but that's okay. We can get back. As long as your tape holds out.
MSM: One interesting thing about discussions among numerical analysts, Wilkinson...Forsythe...toward the end of the '60's, is a concern that this marvelous tool for numerical analysis was, as it were, luring away good numerical analysts into what the authors were referring to as computerology. That is, they started off using the machine to do computations, but then they were getting interested in the machine itself. Is that happening among number theorists?
Weinberger: I think it happened to some extent among...I think it's a...I don't know. I don't really have an answer to that. There is...
MSM: Did Lehmer ever complain about that?
Weinberger: No. Lehmer...Lehmer...Lehmer had a lot of strong views and as long as you weren't just completely out of touch on logic he didn't care what your views were either. No, I don't think Lehmer complained about that. I think Lehmer was amused by whatever, you know...whatever people did amused Lehmer. But he wouldn't take money from the government. Which I used to sort of wonder about and now I don't think it was a position of principle, I think it was pragmatically valuable. It's too much trouble to deal with the government if you can get by. But lots of people can't get by.
MSM: I've had government funding and had private foundation funding, and believe me, the second is a lot easier.
Weinberger: Yeah, absolutely. But there is a...something very seductive about hacking on the computer. Whatever it is. I'm sure it's different for different people. And that distracts people who...also computing in theoretical computer science and computing in general is another place where mathematically good people could go. And as long as it looks like the rewards are much greater people steer themselves in that direction. I think that...the fascination of programming is only part of that. So, yeah, I think...on the other hand it leads to a lot of...it's not clear what you ought to learn in the computer science education that's going to be of any particular value in 5 or 10 years.
MSM: One of the chapters...search in the '60's for what computer science was a science of.
Weinberger: I think that's well worth it. Especially since I don't think the answers are at all clear now.
MSM: They're not clear yet.
Weinberger: That's right. The optimistic way of looking at this is, like, saying 'well, it's like mathematics was at this point. Nobody was quite sure what the real subject was.' The other point of view is that we're just totally, you know...it's just totally wrong and it's just totally badly organized and we will discover that the subject we think we're studying doesn't...isn't there. And that it's actually some other set of subjects. It just breaks up some other way. Who knows? It's not predetermined. It's just that things may work out that there's a different division. Now, that's not easy because the way subjects develop is, you know, 'why is there a subject like this', 'well, because it's very much the way it was yesterday.' It's a historical development. So probably computer science will look a lot like computer science does. But I'm not sure it ought. I'm not sure what they're teaching is what they ought to be teaching. I don't have any...I'm sure if people thought of better things to teach, they'd teach them. I'm very uncomfortable with the state of computing as computer science education. And it's not just the subject matter. However, this is irrelevant for your book but while I'm talking, I'll talk. Push me back to whatever subject you want to talk about.
MSM: No, I've got a couple of books on the fire. So keep talking.
Weinberger: Okay. The problem I have with computer science education is I think that graduate students are...this is, of course, unfair to condemn them all. But on average lazy. Compared to say the graduate students in...biochemistry, which is a hard science that's moving fast and competitive. And computing science is a something science that's moving fast and potentially competitive. But I think there's just a significant difference in attitude. And part of it is, this is of course all conjecture, part of it is that the students feel they have it made because they're in this leading field. But part of it is that the standards of rigor aren't very high. If you're a biochemist you have an idea about what it takes to do an experiment that's publishable...about what the science of biochemistry is. And if you're a computer scientist I don't know how the hell you'd ever figure that out.
MSM: Hopcroft goes after that in his Turing Award lecture. He points out that the advantage that a new field like molecular biology has is that it's got very strong disciplines behind it. It's clearly an amalgam of the two. So when a graduate student comes in and wants to do molecular biology, 'you have to know this, you have to know this' and 'the courses are fair' and 'if you don't know this you can't do the field'...
Weinberger: But it's also clear what constitutes research. Or how you tell the difference between good research and bad research.
MSM: Yeah, because the problems are clear. And...
Weinberger: I really don't know. In some sense, of course, because the problems are clearer. But I don't understand the real difference. But there's a real difference. I just don't know how to articulate it.
MSM: Let me pick up and take it just a slightly different direction. Because one of the things that struck me about UNIX as a system, but also about the environment in which it has grown up, is that on the one hand there's a lot of very clever computing going on and yet so much of it seems to be, if not theory driven, at least an exploration of theory. You turn around and look at this feature and there's a theoretical paper behind it. You look at this feature and there's a theoretical paper. So there's a discipline behind you that...might be surprising. I think that's the way to which it's a hack....
Weinberger: I think that's very interesting. I think it's true. (phone rings) It's clear that the most powerful pieces of it have that property and that many of the pieces, even where there's no clear theoretical piece behind them, have sort of...the next best cousin in programming language design or stuff like that. But there's a lot of...there certainly was in the past, a lot of push towards solving the whole problem. Not that the program solved the whole problem. UNIX is famous for this theory that it's best to do 80% because the last 20% is way too hard. But if there were a big piece you could chop off then you did it. And that's why you get general regular expressions instead of some other version. Or all those other things that come to mind. But I think regular expressions is clearly the most...the single thing that distinguishes the UNIX way from other ways, the MS-DOS way and many others. Yes I think the compromises that are made are made somewhere else. Not made in these places where there are strong algorithms. They don't seem to put that sort of stuff in. Because there's also a somewhat minimalist tendency, which of course has affected the user interface too.
MSM: Yes. What led me to that was being exposed to a particular computer science curriculum, which is very much a Bell Labs, I think...influenced partly because the people involved in this seem to have a Princeton connection, only that as one moved through the introduction of programming systems, which in my year was taught by Peter Honeyman, one worked on UNIX and one worked one's way up through to theory of computation, theory of automata...one was reading Aho, Hopcroft, and Ullman. There was a sequence where one began with very practical programming problems...and when in doubt...seeing a body of theory and the connection between the two. Which seemed to me the right way to go about it, if one could do a little better job of articulating it than they were doing...but still it was....
Weinberger: I think the other tendency has been a feeling, this is not a dominant tendency, but there's definitely been a tendency is, if you don't know how to do it right, just don't bother in the UNIX development. And I think that's one of the things that leads you...leads one to choose things that are backed by theory.
MSM: That's an ethos, it seems to me.
Weinberger: I think that's right.
MSM: How is it, I don't want to use the word 'enforced', but clearly it was something that you had to acquire when you got here. You acquired when you came into research. Do you have any sense on how that happened? If we can answer....
Weinberger: That's extremely hard to answer. I'll have to think about that. Let me put it this way. I think I was more receptive to it. I don't know how I acquired it and I don't know when I got it and all that sort of stuff. But this business...this point of view that the computer would do my work for me is a very absolutist point of view. Because I have this feeling I make mistakes, everybody makes mistakes, I also have this feeling that you never want to have to touch the programs. So it's important to do it right early and that it always be okay. So it always has to be...it's not just a problem of the minute, although one writes a lot of code that's got to do the problem of the minute, it's got to fill the niche permanently, which is completely unrealistic but it's certainly an attitude. And I think that matches this other. If it's just going to be a slipshod temporary hacked up way of doing it it's just not going to work long enough. And you're going to just have to come back and do it again and it's just too much like work. Not that reality actually matches this in any way but I think that's the attitude. I think that makes it easier to pick up that ethos.
MSM: When you talk to one another about stuff you're doing do the questions tend to drift back to what lies behind the code? What theory are you applying here, what makes this solution general? Can you verify this for me?
Weinberger: Verify it for you....
MSM: You show somebody something and say that's pretty neat but can you show me how it's going to work?
Weinberger: I don't think so. I don't think that's the way that works. My impression is that what happens when people talk about their work.... They explain the theoretical backdrop and all the problems and how they work it out and stuff like that. But...the effect of the sort of...external face of the work is that it's general and does a lot. And frequently people are surprised and disappointed, surprised and pleased, whatever by...things. For instance, the well-known UNIX regular expression stuff is not at all uniform. There's this business about back referencing. Sometimes you can say...later in the pattern, you take a part that matched in the first. It's got to be there again. That sort of stuff. Okay. And some of the pattern matchers do that stuff and some of them don't and people keep saying 'well how about this' or 'how about that?' And somebody has to point out that, well, if you put that in, the potential running time has just gone to unreasonable things. If you only use this class of regular expressions, then we can make the whole thing very fast, but if you use this class, then something else. So there's an awareness of the strengths and weaknesses of that stuff. And there's a sensitivity to the implications. And why...because the half...there's a problem with using theory based solutions. There's an advantage, which is you can explain what the theory gives you. And there's a disadvantage, which is you have to put up with the weaknesses of what the theory gives you. And for many specific problems there's an ad-hoc way that covers much better and we've been on, I think, both sides of that.
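The back referencing mentioned above ("it's got to be there again") can be illustrated with a small sketch. It uses Python's `re` module, a backtracking matcher that supports back references, rather than any of the UNIX tools under discussion; the pattern and sample strings are invented for the example.

```python
import re

# Two words separated by a space: an ordinary regular expression with groups.
any_two_words = re.compile(r"\b(\w+) (\w+)\b")

# The same shape, but \1 demands that the second word repeat whatever the
# first group matched. That "same text again" requirement is the back reference.
doubled_word = re.compile(r"\b(\w+) \1\b")

text = "it is is a test"

first_pair = any_two_words.search(text).group(0)
repeat = doubled_word.search(text)
repeated_word = repeat.group(1)
no_repeat = doubled_word.search("this is fine")

print(first_pair, "|", repeated_word, "|", no_repeat)
```

The first pattern describes a regular language, the kind that can be compiled to an automaton and matched quickly. The second asks the matcher to remember an arbitrary earlier substring, which goes beyond regular languages and is the sort of feature for which the potential running time goes up.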
MSM: When you say weaknesses do you mean the theory tends to have longer run times?
Weinberger: No, I mean...well...The implicit claim for LR parsing...in fact it's an explicit theorem in LR parsing and turns into an implicit claim about LR parsers, is that they find an error at the first possible time. An LR parser does nothing once it comes to a symbol that cannot be legal. It just can't beat them. Unfortunately that doesn't mean that an LR parser does that in any realistic sense. Because the parser doesn't see what you type out...the parser sees the output of some other program. It would just be endlessly confused. So people say you look at your buggy code and you say 'you could see right back there that is not a legal program'. Supervisor says 'yes that's right'. He says 'parser didn't notice until here'...'yes, that's right'... Well, but syntactically, without all this other stuff it was a legal program. The parser is just a little piece of it. There's a...the...when people give the talks that are backed by theory, of course you explain the theorems, and all the theorems...it's fairly typical, I think, of theorems in computing, is that there's a marketing piece to them which is that they.... (I guess it was not your thing clicking....) There's a marketing piece to them. People use terminology that makes their results sound more important. You end up hearing about, you know, absolutely invulnerable oracles. That turns out that just doesn't mean anything. And this LR parser theorem is another example. The theory is correct. The theorems are correct. But the implications aren't what they sound like. You learn...that's one of the annoyances. Now on the other hand, if you don't base it on theory, it's just chaos, you say 'well I can hack it'. We have programs which are not based on theory and some of our quite successful programs are not based on theory. The extremes in that are Mike Lesk's contribution, TBL, one of the more extreme.... Can I get TBL to do this and the answer was 'in 45 minutes'. 
He didn't say that literally but the true answer was in 45 minutes. And you say here's a bug in TBL and it's fixed in 45 minutes. But although that started with a model, Lesk's style, which he was extremely good at, was just to go in and hack it. The model got weaker and distorted and confused and 'well, listen the problem mostly works.' But my feeling about that code was, which I use all the time in the ways that I understand how it works, was...there's this joke about farmlands in New England and farmlands in Iowa, right. If you take a rock out of a farm in Iowa, you have some feeling it's the last rock. The farm in New England is nothing but rocks surrounded by water. So it is with bugs and these two styles of programs. If you have a theory based program you can believe you got the last bug out. If you have a hacked up program, all it is, is bugs. Surrounded by, you know, something that does something. You never get the last bug out. So we've got both kinds. And it's partly the style of the programmer. Both kinds are useful. But one kind's a lot easier to explain and understand even if it's not more useful.
MSM: The community tends to have a tolerance across the reach.
Weinberger: I think the community, sure.... There's some balance between any program that's sufficiently useful can have any number of bad effects...properties. But people prefer small, clean, easy to understand programs. But they will use big horrible grotesque disgusting buggy programs if they're sufficiently useful. And some will complain louder than others but it's a rare few who will say 'this is just so awful I won't use it.' We've got more of those few here than in many places.
MSM: Sandy Fraser was telling me the other day that the UNIX manual was the first one explicitly to cite its bugs, in an open, honest fashion. Is that part of that?
Weinberger: That's interesting. I don't know who did that style. The oral history around here is that's essentially Doug's style. He may be right. There's a couple other things that happened here. It may not be true, of course. There's a couple other things that I think are related to that. Not necessarily having anything to do with Doug, and are possibly the consequences of the fact that we don't have to go out and get funding, which is that there's a tendency not to go out and talk outside about your work a lot until it's finished. As opposed to well before you start.
MSM: When you say go outside, do you mean go outside your own office?
Weinberger: No, I mean go outside your organization. And there's also a tendency not to talk too much formally inside the group about what you're working on until it's done informally. There's a lot of that. And I think that's part of this same kind of flavor...calling it intellectual honesty would be a little pretentious but it's that same flavor of approach that leads to bug sections in the manual. Well, this stuff is good and bad and here's the good and here's the bad.
MSM: I've gotten the sense from several conversations, not the least of which with Doug himself, that the early versions he used the manual as a way of...sort of getting people to clean things up. Go through a new version of the manual...anybody go through your code and want to clean it up a little bit?
Weinberger: I'd like to be able to take it so you can explain it in a short amount of time. I think that's also one of the...a common feel that's common in computing. The story is if you write the documentation early, it's likely it'll be possible to explain what your program does, whereas if you wait until your program is completely finished, you may discover that however coherent it looked while you were writing the various pieces, it's impossible to explain it.
MSM: There's a guy down in Holmdel that had on his door a sign that said documenting a software project is like changing a tire on a truck while it's moving.
Weinberger: That's a bigger scale than we do.
MSM: You came in then on AWK?
MSM: How did you pick up your next mission.... I'm interested in how people pick up missions in this....
Weinberger: I'm not sure I remember. I know one of the things I worked on almost immediately is the FORTRAN I/O library. Stu Feldman and I were sitting around talking and...FORTRAN 77 had just come out, and we had this idea that it would be easy to implement a FORTRAN compiler because we had all these great compiler tools. So we were talking about it and I said...I think this was probably while I was still in development, I started on this also...well, we talked about the parts that are hard. There's essentially two things that are real aggravation in FORTRAN. One is the lexical analysis. Once you've figured where the tokens are, it's not all that hard, and the other is the wretched FORTRAN I/O library dominated by the god awful format statements. I thought about it for a while and said I know how to deal with format statements. So we divided it up. He would write a compiler. I would write the I/O library. And we'd just get this done in almost no time at all. Well, the truth is that our compiler tools are not all that good. What we had fundamentally was lex and yacc, and that's a long way from a compiler. The other truth is that, in addition to figuring out where the pieces of the format statements are, there's quite a lot of work in an I/O library. But basically we just started this thing up and ran. Because it was the first UNIX FORTRAN compiler. It had quite a long life. So it was quite an embarrassment. Another example is it doesn't matter whether your stuff is good or bad, as long as you've filled the niche it'll be a long time before anybody else comes along and starts poking. When we got our Cray 3 or 4 years ago the first one was UNIX...on it. Somebody sat...was running perfect...they said, 'I'm getting these errors from FORTRAN. I don't understand them. Do you understand them? You wrote an I/O library, do you understand?' I said, 'No why would the I/O library have anything? They've got real FORTRAN.' 
But in fact they may have had real FORTRAN but they had my awful FORTRAN I/O library after all these years and I recognized the messages right away because it was still all my code. That's I think a story in favor of portability in writing in C, which is, by now, fairly well understood. After that it's fair to say I drifted a lot off and on. I did little things of one sort or another. And medium sized things, when I was interested. But it was stuff I thought of myself. I said this would be interesting to do and my recollection is that weeks or months would pass while I would be sort of trying things out that weren't working out. And then something I'd do would work out. That would be nice. Get me a raise. So it was very...structured.
MSM: Have you had any continuing themes that you've been pursuing?
Weinberger: Well you sort of end up working on stuff that you've turned out to be good at in the past. No, for a while there was. But no, I don't think so. The answer is there's stuff that I had done that I've sort of stopped doing. There'd be sequences of things, network file systems, servers, and that's still something I think about. But I'm less and less comfortable about doing kernel work. It's gotten more complicated and secondly you have to keep a lot in your head at once and I'm getting interrupted more now. It's hard to debug. So I don't do that much anymore. The research kernel has gotten a lot less modular, despite serious efforts to make it more modular, which of course have just...it's like releasing less carbon dioxide. It's not less carbon dioxide it's just not as much more. So there have been improvements in places but the general trend has not been very satisfactory. It's gotten much more complicated. So I don't know. The stuff I work on, there's always some, some...it's right around the system interface. I'm not sure there's too much more to it than that. My idea of an application program is a compiler. It'd be nice to move up a little from that. But I think I'd write some actual useful code. That's all I can think of that's been constant.
MSM: The prompt is as I was talking with Ken and listening to him talk and thinking of this Turing Award lecture, and so on...this continuing theme of self-replicating code, referencing systems which seem to reflect itself here, and reflect itself there, and the theme he came back to, and I was wondering if you have a theme like that you find yourself coming back to?
Weinberger: No. I'm not sure I would have characterized Ken that way either. There are things he really does keep coming back...file systems, and so forth. That's always been...that's how UNIX started, right, the point was...and he kept rewriting them and now he's just rewritten another one. Another batch. I can't think of...I'm not sure...I mean, that may be...I don't know.
MSM: What happened to the mathematician?
Weinberger: Well, what happened to the mathematician, the mathematician is still sort of there but it's hard to keep up. Keeping up with a single field of mathematics is essentially full time. And so it decays. What it is now is I have a really good education for computer scientists. And it's quite useful. It was interesting. Hopcroft came by a couple of years ago...I think it was Hopcroft. And when he was just starting in robotics and explained all this stuff that people ought to be learning in computer science...'this is the wave of the future' and it was all this stuff I had learned in the first year of graduate school at Berkeley in mathematics, right? It may have been new to him but it was just classical mathematics - period. So a lot of that stuff's good for a long time. What the mathematicians talk about shifts. I can't read the fancy physics books because that's the kind of mathematics I...slowly it all shifts...(tape cuts)...I wrote math papers for a few years after I got here. The last one was probably 5 years out or something. But it's just...you have to be a scholar to write a research paper in mathematics. You have to be up on the field. It's just too hard.
MSM: Now you say you joined Development and you pointed back and it was behind the building.
Weinberger: Yes. There was a building back there.
MSM: That doesn't exist anymore?
Weinberger: Well that particular Development organization isn't there anymore. But yeah, maybe, yeah they moved somewhere else, but yeah, they're still around. Doing quite different things as you would expect after 15 years.
MSM: Were they where the analytic computing group was? When I first visited Bell Labs it was a fellow named Charlie Stenard.
Weinberger: Yes I think so. That's right. I was in Building 5. It was a weird building up there. Yes it was. I knew Jim Downs. He was a peculiar person in many ways. Frequently I couldn't figure out what he was talking about. But he had very enlightened personnel policies. If his people wanted to try something out, he would support them. Which is nice because frequently they're fairly hard to replace. It was nice.
MSM: How easy was it to make the shift? You said...again I'm interested in certain managerial structures. While you over here you found something?
Weinberger: No, there's more to it than that. I had tried to get a job here several years before. They come around and since I was a mathematician, Stan Brown had brought me in and they were working on ALTRAN in those days and I gave a talk on computer algebra. And I talked to Steve Johnson, all those people, and Sam Morgan wasn't hiring...no more mathematicians. Sam had in retrospect had this job of rebuilding this place or keeping it alive after the Multics thing. And there was a whole collection of years right around there where they essentially hired nobody. As recently as the early '80's the place was 20 people. Well 20 was low, but 25. So he had already said no to me once. But I was doing this stuff with Al and Brian who, in retrospect, clearly were very hot properties. And they obviously thought this was a good deal. So we talked about it informally and then started trying to do it. And I guess they talked Sam into it. I don't know anything about it. It was, I think, probably a non-trivial effort on their part or on Doug's part, who was the department head...on somebody's part...probably did a lot of work which I didn't notice at all. Just dropping in on Al was the...yeah, I just came over, and I probably made an appointment and came over and explained my thing. It developed into this other stuff.
MSM: I'm talking with Sam next week but you say he wasn't hiring any more mathematicians, and are there splits?....
Weinberger: No, I didn't mean it was just mathematicians. My feeling is that Sam was not much interested in taking chances...in people it was clear would succeed...given his druthers. And he may also have had feelings about the project that people were likely to work on. In my case there was no telling at that point about whether or not I was going to be good at anything else.
MSM: So in '76, '77, there was still fallout?
Weinberger: No, the first time I had tried to come over was probably '73 or '74. '73 probably. There was still fallout. This is in retrospect, I don't know what the actual situation with Sam was...Sam will have a much clearer idea of what the actual situation was than I do.
MSM: I have a feeling I'm going to get a sort of...you know the story of probably apocryphal ...Sir Walter Raleigh, while in the tower, decided to sit down and write a history of the world. Just get his story down. So he was up there writing this story and there was a scuffle in the courtyard down below, so he went down at mealtime and tried to find out what it was about and talked to a lot of people. Got so many different accounts he went upstairs and ripped up his history.
Weinberger: Well, but all of us are...all of us just remember what we remember. I...
MSM: But also there's a question of what people were seeing from different vantage points.
Weinberger: Yes. I understand. Even if we remembered exactly you'd get different stories. Sure. I wouldn't be surprised if you got a lot of stuff out of Sam. I expect the environment was much more complicated than it looked to young beginning technical people...political environment.
MSM: Well, it also says something about the way....
Weinberger: Oh yes. Sure.
MSM: Shield people from what they don't need to know.... Did you find when you came it was positive?
Weinberger: Oh yes. Sure. I found it quite interesting. As with all these things there are some people you work well with, there are some people you don't work well with...but you talk to. Some people you hardly ever talk to. I think I was here two years before I understood anything Ken said. Because if you ask Ken a question, he sort of gives you a one line answer to the question he thinks you ought to have asked approximately. I don't know what he thinks he's doing. But that's certainly what it seemed to be like he was doing. And you have to know a lot to understand a one line answer to these things. And after a while the answers become extremely informative. But it really took a long time before I just understood anything Ken was saying. That was pretty funny.
MSM: Did you pick a mentor in the group?
Weinberger: No. My guess is if I did that I'd have been more organized probably. More would have gotten done or it all would have been more structured. But I was used to, as a mathematician, working alone. Although there were a fair number of joint publications, it was.... And the style here, not universally, but to a large extent, is even when people are working on the same things they work on different pieces of it. Partly that's because of the little funny schedules. If I'm working on something with, say Dennis, I get in at 7:30 in the morning and work till about just after Dennis shows up and then I go home. Then Dennis comes in and works. You've just got to have some way of not stomping all over it. The most remarkable example of that is Ken and Dennis doing UNIX, where apparently there was only one time they even considered writing the same code. When they wrote it, it was line for line the same code.
MSM: Yeah, I heard about that.
Weinberger: I believe that story.
MSM: It was, character for character, a match of assembly code.
Weinberger: Well, it was the right way to do it, I guess. So, there's a lot that sort of...I think that's part of it too. People talk about what they're doing. But they sit and write code not always...until recently and somewhere else. I have not seen people sit at the terminal and one person type while both of them were writing the program. You just don't see much of that around here. And it's a perfectly reasonable way to work, especially in complicated code. And occasionally here it happens a little bit. But it could happen much more systematically. That's just one of the things that doesn't happen here. Here it only happens for little pieces of a program where you're having some particular trouble, as opposed to, essentially, the whole program.
MSM: This is a part of its history. Where do you think UNIX is going to go?
Weinberger: That's actually quite easy up to a point. I think it's going to be completely dominant, in spirit at least. In part of the spirit. And you can see that in MS DOS or...where all this stuff....
MSM: All those years you knew DOS....
Weinberger: Well, unfortunately, not quite because it's such a botch. But yeah, that's right. All the stuff that was in original MS DOS and of the stuff they've added to DOS, a huge fraction of it, you know, 80% of it, is UNIX-like stuff. And the...open...all this open...all this standard stuff...standard to conform, and it's in POSIX and all that sort of stuff is UNIX stuff. And if you're building an innovative computer, unless you're extremely weird, it's just inconceivable to spend a lot of time building all the aspects of an operating system. So you take UNIX in one of its variants and put it on. So at that level, it's like FORTRAN. We're going to have a lot of the features of the UNIX interface forever.
MSM: I'm going to interrupt you for a second. Because there is another model entirely. You were talking about it when I came in, by the coffee machine....
Weinberger: Right. That's the Apple model in some sense. And we're going to see combinations of those. That's for the user interfaces. But the underlying system stuff, what that runs on top of, is going to be UNIX. Because, roughly speaking, that part of it they got right. And this other part of it, the UNIX people got right, and you're just going to have them both. It's a very...now...that is when you sit down and turn the thing on, right, we've got Windows, right, mice, and menus and all that stuff.
MSM: Also got command lines.
Weinberger: That's right. We've also got command lines. But you can chop way back on that. I can write an interface to this, with a modest amount of work, where one got most of your day's work done. And the last command you typed, where you wouldn't type very many commands, it's not very hard, actually, to decide how much of that you want to do. The AT&T OPEN LOOK interface has a lot more of that. All this stuff that I find so easy just to type comes on these elaborate collections of menus. If you've never typed...if you've typed hardly anything. But the Macintosh interface and the Xerox interface did actually have a lot of typing. You didn't type commands but the stupid systems always want to talk to you. They always want another wretched file name and you never get to say "star dot c 2 dot". Do this a million times on all the dot c files which is something that UNIX is really quite good at. You get to type a lot of stuff over and over again. I think there will be some balance which probably isn't close to being achieved. Because interfaces need to be smarter.
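[Illustration, not part of the interview: a rough sketch of the "do this on all the dot c files" idea, in Python as an assumption. The helper names `select` and `apply_to_matches` and the sample file names are hypothetical. One wildcard pattern picks out every matching name, and a single action is then applied to all of them, instead of entering each file name by hand.]

```python
import fnmatch


def select(names, pattern="*.c"):
    """Return the names that match a shell-style wildcard pattern."""
    # fnmatchcase is case-sensitive on every platform, like a UNIX shell.
    return [n for n in names if fnmatch.fnmatchcase(n, pattern)]


def apply_to_matches(names, action, pattern="*.c"):
    """Apply one action to every name selected by the pattern."""
    return [action(n) for n in select(names, pattern)]


if __name__ == "__main__":
    files = ["main.c", "lex.c", "defs.h", "README"]
    # Derive an object-file name for each C source file in one step.
    print(apply_to_matches(files, lambda n: n[:-2] + ".o"))
```

The point of the sketch is only that the selection is stated once, as a pattern, and the repetition is left to the machine.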
MSM: Because when I got my NeXT coffee cup...for looking at the demonstration...they had introduced the machine. They talked about this was going to be UNIX but with a friendly interface. And I said to the guy 'well, when I think of UNIX and I think of friendly, I think of pipes and macros. Now, are you telling me I can pile up icons in a pipeline' and he said 'no, we don't have to do that, but you can open a command window'. I said 'I can do that in DOS.' It was part of making myself friendly.
Weinberger: The hell with them. They're all so self-satisfied. I don't like vendors.
MSM: Ron knows. He came in with his lowest common denominator thinking they were showing the NeXT interface. But again I stopped him. I said 'wait a minute...you're hooking messages of the objects. Where did the objects come from? You're telling me that all the applications for this have been written in Objective C?' He said yes. I turned to our graphics person and he said 'now, that's raising the lowest common denominator.'
Weinberger: It certainly is. And Objective C's the wrong solution also.
MSM: Yes. But you've got faculty members who are scared by BASIC, you're not going to get them like to Objective C and you're not going to start them maintaining university status. Objective C codes....
Weinberger: No I think that's right.
MSM: Not the scholars working.
Weinberger: No I think that's exactly right. I don't know what to do about that stuff but that's not it. So the rest of the UNIX story and I think, to some extent, all these other stories is that the commercial products, by the standards we're used to, are pretty crappy. They're, for all the reasons that commercial products deviate from esthetic ideals of one sort or another, right...they've got to get them out. You've got to get them out on time. You've got to build them. They're not allowed this 'well, I don't know how to do it', 'well, so let's not do it' attitude that we can indulge in. They've got to do it. They have no choice. If you don't know how to do it, well, you do it in a bad way. Then the bad way has some feature and some customer uses the feature and you can never get rid of the loathsome thing. It's all awful. And the commercial systems show every sign of that. By the standards we use, they're appalling. And they make UNIX more and more like other commercial operating systems, because in fact the same effects are in all of them and they get bigger and harder to maintain and grottier and the manuals become outrageous, badly written, and all that sort of stuff. My only feeling about that...I have two feelings about that. One is it's inevitable. It could be a little better or little worse, but it's inevitable. And if it's gonna...I'd rather complain about what they've done with our work than have to put up with somebody else's work. So, it's relatively easy to complain about this and that but my view is that it's much better than it would be if...without it. Much better. The piece of the promise that does work...as you go out and buy equipment from lots of people, and you run it and the environment is fairly similar and fairly familiar and in fact it's so familiar and so similar, you complain a lot about the differences, instead of having to fight your way through a whole collection of new things. 
So I think permanent is too long, but I think it's here for a long time. I think we're going to be real sick and tired of it after a while. Just sick and tired.
MSM: It's a program, it's a file system, it's a programming environment. Did you hear about it as a software development environment?
Weinberger: No but I think that's what it is basically.
MSM: You think it's already industrial strength?
Weinberger: Well I don't know what industrial strength means, of course.
MSM: All you have to do is take the software cycle. Can one do requirements now as a specification, formal specification, tools, etc.?
Weinberger: One can do all those things. If they're being done, I assume they're done on UNIX systems. I think many of those things are operating system independent. The...it's true that the standard system doesn't come with some of the tools that some people want. I'm skeptical about a certain amount of that stuff. The real world suffers from a lot of, sort of, wretched, grotty problems. Everybody says 'no we don't want to mix languages.' There's always a little something weird. Not much but a little something weird so it has to be capable. And old code...you don't get to start over. And so a lot of the systems that ought to work, a lot of the things people write about as if they ought to work, they work real well if you write in one language and start over each time. Or at least they have a chance of working well if you write them one liners. But anybody can write software if you start over each time. It's this living with the past that's so hard. I'm moderately skeptical about a lot of this stuff. It's clear some people do software projects better than other people, but I'm not sure formal requirements is a piece of it. The people who seem to do the best are the people who are doing pretty much what they did the last time.
MSM: And starting from scratch?
Weinberger: No, not necessarily starting from scratch, but sometimes starting from scratch. As long as you don't get too ambitious and step on second systems and stuff like that. If software development environments means things, you know, like structured, language-sensitive editors and stuff, incremental...incremental somethings and all that sort of stuff then no, we don't have a lot of that stuff. But I would much rather have general purpose editors and very fast compilers...and not have to worry about all this incredible crap that you otherwise have to build to keep your life straight. I think for instance C++, which is going to end up being a big step in that direction. I think C++ will be very popular, probably partly for the wrong reasons. Partly there are things I wish it didn't do, and so on and so forth. But I think this is a tendency, okay. As machines are getting faster and faster and faster, the research community is usually big on...techniques for utilizing machines which have rates of growth that are much faster than the cycles are becoming available. Just impossible horrible things. C++ is less efficient than C if you use it in a natural way. But it's not so much less efficient that it's not going to pay off. This is where...instead of where all the...Xerox PARC is another example of this. They have these Altos which are slow and creaky and wonderful. Just, nobody's seen anything like it. They moved to...through a state of thing to Dorados, which were damn fast machines and the damn machines were no faster than their other machines. The reason was that their software had been very greedy. They were going to do everything the right way. And it was incredibly expensive. All these cycles pissed away on the internal esthetics of the program, you know, layering and abstractions and all that stuff which everybody believes in but which can cost. 
And if you go to functional programming languages, you know, all that sort of stuff dynamic this and Smalltalk that and all that sort of stuff, that stuff just eats the cycles like crazy. You end up with systems that don't work any better than the last systems except that they have lots more software in them and slightly fancier interfaces. And I think C++ is going to be a good compromise. Sure, it's going to be less efficient than C. You could probably make it the equivalent C++ program, written using all the features, half the speed of the C program. Well, that's a factor we can afford. Especially since not everybody's going to pay it. The 10% and 30%'s we can easily afford those if it really means the code is reusable where you can get the programs written faster. So I think it's going to be a big win.
MSM: That's interesting because, again at the NeXT demonstration, one of the stunning figures was what percent...at the time you hadn't loaded how many megabytes on board those operating systems just to get those graphics. And I thought back to the original Mac which was so damn slow because it was all interface.
Weinberger: Yes, that's right. Well, is it worth it? Well, people differ. But if you keep it under control, and it's not clear in advance, whether you're keeping it under control. I think X is an example. There's this unbelievable pile of X crap. On the other hand if you don't use it all but only use some of it, X is reasonably fast. Much to my surprise. Of course they've rewritten it several times and made its standards very complicated so it can be fast. But they succeeded. There's always this question of do you take the functions you want and then learn how to make them fast or do you just try to make it fast from the start? And sometimes one, and sometimes the other seems to be the right answer. It's hard to be general about that. That's why software engineering is in such a terrible state. Civil engineering would be in a terrible state if all the properties...the physical properties and materials we used were changing at an exponential rate. How the hell would you ever learn to do engineering?
MSM: Yes, if you think you have trouble with computer science curriculums, the newest IEEE Computer has suggestions for a software engineering curriculum. Any new curriculum....
Weinberger: Yeah right. I can't imagine what that can teach.
MSM: Essentially courses based on the cycle....
Weinberger: The only way you can get people to do that is you insist they go out and do it in groups and watch what happens. At least then they'd have some practical experience. Because the rules of thumb are just pointless.