SIMS 141 - Search, Google, and Life: Sergey Brin - Google
Sergey Mikhaylovich Brin, co-founder of Google, Inc., gives a talk to UC Berkeley students.
We have a fantastic guest speaker today, Sergey Brin, who is a co-founder of Google. You might have heard of this company. (Laughter) I actually got the Wikipedia article on you, Sergey, in order to give your history, so I could sit here and read things for a couple of minutes, but I don't think I'm going to do that. - I should sit down and do that. Yeah, why don't you do that? That'd be great. I'm not going to sit here and take up the time. We don't have that much time with you so I'm just going to let you take over. OK.
So, Sergey Brin.
(Applause)
So I mostly want to do some Q&A here today, but I wanted to start with a few opening thoughts. And actually you reminded me of one of them, which is Wikipedia - Wikipedia in general. There are things out there that are very simple and that you would never think would work. And that's why you just don't do things that you assume basically won't work. Wikipedia is one of those; it would never have occurred to me that something like that would work. And I assume many of you - has everyone here seen Wikipedia articles? All right.
Yeah, and it's amazing to think that you can build an encyclopedia that everyone can edit anytime. I went to Wikipedia pages at first and said, look, I don't believe they're getting this content this way. Here, I'll hit the edit button and see what happens. I went to a random page. I don't know, it was some artist, 18th century, and I made some stuff up. He really liked the colors brown and orange, something like that. And I punched it in there. And I said, come on, there's no way this is going to work. And of course, I clicked submit, and when I viewed it, there was the change. And then I quickly undid it; I didn't want to pollute it.
But it does work. And it works for several reasons, many that I don't understand for sure. But one of them is scale. By virtue of the fact that there are so many people out there reading these Wikipedia entries, and editing them - well, there is a smaller number editing them. And then a still smaller number that really actively monitors all of them. But still, it's a small fraction of a huge number of people. They're able to keep it a pretty - a very comprehensive, reasonably high quality site. Occasionally, like some of the stuff I think about me is a little bit wrong. But you know, I don't know how it would compare to, like, normal encyclopedia entries. I don't know. So I think that they do very, very well and I'm very impressed.
With internet search as a whole, forget about Google for a second. That too is something which today we take for granted in a sense. But it was a fairly simple idea: that you take all the information out there, which let's say 12 years ago, when the first search engines started being developed, wasn't that much. But the computers were a lot less hefty then too. And you just create an index. Even, you know, a fairly basic inverted index. In fact, in the earliest days, people didn't really worry about ranking even. It wasn't that big a deal. By that I mean there weren't that many matches for most searches. And AltaVista probably made the biggest leap in terms of comprehensiveness and speed and whatnot. And you just index it and you let everybody query it. And today it's just - we all take it for granted. But this was just a short time ago. And it wasn't at all obvious that it would work, that it would be useful or anything like that.
And I would extend the same idea to the web as a whole. There were a number of hypertext experiments and systems that people put up. What was the one with a funny guy, Xanadu? Did you cover that? Yes, Ted Nelson. No, he's a very interesting guy. But anyway, so he had created this thing and it wasn't quite the same as the web. But it was - people had tried that. And yet, with a few simple ideas - and I won't pretend to know how to identify the key features that really allowed the web to grow - it really became a repository of the world's knowledge.
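As a rough illustration of the "fairly basic inverted index" mentioned above, here is a minimal Python sketch. It is not how any particular early search engine was built; the sample documents and the function names `build_inverted_index` and `search` are made up for this example. It maps each word to the set of documents that contain it and returns unranked matches, which mirrors the point that early systems did not worry much about ranking.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercased word to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every query word (unranked)."""
    words = query.lower().split()
    if not words:
        return set()
    # Start with the postings of the first word, then intersect the rest.
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical toy corpus, for illustration only.
docs = {
    1: "berkeley city homepage",
    2: "uc berkeley homepage",
    3: "history of hypertext",
}
index = build_inverted_index(docs)
print(sorted(search(index, "berkeley homepage")))
```

With only a few documents every match can be shown, so order hardly matters; once a query matches thousands of pages, some way of ordering the result set becomes necessary.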
[05:07]
So anyway, I guess I want to finish that intro just with the point that people have taken fairly simple ideas, ones which you might not think would work at all really, and at a certain scale and after they gain a certain amount of momentum, they can really take off and work. And that's really an amazing thing.
Let's see, maybe I should try to relate that to Google a little bit. I want to leave time to ... At Google we had one simple idea which now seems obvious: the idea that the ranking does matter. And in fact that was not a high priority in a lot of information retrieval and web search research at the time. The ranking - I mean some people worked on it, but it wasn't that important of a thing. And we decided that for queries that really return a lot of results we could do something more reasonable. And we sort of stumbled upon a way to do that by studying links. And I don't know if any of you have - what have we presented here in this class to date?
OK. So you've covered a lot of PageRank. Have you? OK. I'll go through it at a high level.
We originally developed PageRank - well, I was kind of playing around with studying all the links on the web. And that too was a pretty revolutionary idea, though it seems very simple: that you could even just collect them and then do anything meaningful. Because as a graph in the computer science sense it was a very large graph compared to computers of that time. Or at least compared to our budget of computers at that time.
And anyhow, I really credit Larry with pursuing that idea, that it's even worth collecting the graph and then that you could run any kind of processing on it. But soon after we had it, and we had a crawler that went out, and we had to kind of develop our own RAID to be able to write the data to the disk fast enough. And it's the kind of thing that is trivial today, even probably on your laptops, but was hard back then. And then we started to play with it and came up with the notion that not all web pages are created equal. People are, but not web pages, and some web pages are inherently not worse than others but at least less important than others. And we developed this analysis of the graph of the link structure of the web that imputed an importance for every web page.
And we use a similar algorithm today. There are many other algorithms that we have to run. And it's evolved a bit over the years. But it is one of the things that we continue to use. And the general concept that not all web pages are created equal is very important. The other thing I want to highlight is that when we were studying this, and we actually weren't sure that we wanted to have search as the big application, at some point we realized that this actually worked really well for search. That if you type Berkeley - there are a lot of pages that mention Berkeley - but some, like the Berkeley homepage, are probably somewhat more important than others. And I guess there's a UC Berkeley homepage and a Berkeley city homepage.
Anyway, all the Berkeley pages. And we decided that was actually very useful for search when you had a lot of results, and that if you wanted ranking to matter, that was a good way to do it. But the other thing we were kind of thinking about at the time is how would you - we weren't kind of thinking of this as how we would let millions or hundreds of millions of people use this. But how would you even make something anyone, a single person, could use, or how could you make a search that would work well.
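The idea of imputing an importance to every page from the link graph can be sketched with the simplified, textbook form of PageRank. This is only a minimal sketch under stated assumptions: the damping factor of 0.85, the fixed iteration count, the handling of pages with no outgoing links, and the tiny example graph are all choices made for illustration, not details given in the talk or a description of what Google runs.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by power iteration.

    links maps each page to a list of pages it links to.
    Returns a dict mapping each page to a score; the scores sum to 1.
    """
    # Collect every page that appears as a source or as a link target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)

    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page gets a small baseline share of importance.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its importance among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Assumption: a page with no outgoing links spreads evenly.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical four-page graph: three pages link to "home".
links = {"a": ["home"], "b": ["home"], "c": ["home"], "home": ["a"]}
scores = pagerank(links)
for page in sorted(scores, key=scores.get, reverse=True):
    print(page, round(scores[page], 3))
```

In this toy graph "home" ends up with the highest score because three pages link to it, and "a" scores well because the important page links to it. That is the sense in which not all pages are created equal.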
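The Berkeley example can be sketched as a two-step process: find every page that mentions the query terms, then order those matches by a precomputed importance score. This is only a sketch; the page addresses, texts, and scores below are invented for illustration, and `ranked_search` is a hypothetical helper, not an actual Google interface.

```python
def ranked_search(pages, importance, query):
    """Return matching page ids, most important first.

    pages maps a page id to its text; importance maps a page id to a score.
    """
    terms = query.lower().split()
    if not terms:
        return []
    # Keep only pages whose text contains every query term.
    matches = [
        page_id
        for page_id, text in pages.items()
        if all(term in text.lower().split() for term in terms)
    ]
    # Order the matches by importance; pages without a score count as 0.
    return sorted(matches, key=lambda p: importance.get(p, 0.0), reverse=True)


# Hypothetical pages and made-up importance scores.
pages = {
    "uc-berkeley.example": "uc berkeley homepage",
    "city-of-berkeley.example": "berkeley city homepage",
    "blog.example/trip": "my weekend trip to berkeley",
}
importance = {
    "uc-berkeley.example": 0.6,
    "city-of-berkeley.example": 0.3,
    "blog.example/trip": 0.1,
}
print(ranked_search(pages, importance, "berkeley"))
```

All three pages mention Berkeley, so matching alone cannot choose between them; the importance score is what puts the homepages ahead of the personal page.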
We had a phrase for it: search for kings. No, you're not searching for kings but a search that a king would use or queen.
But the point was, given the resources that we had, how would we create a really good search engine, not worrying about how many searches it could handle or how large a user base it could support, but making something really, really good for a small number of people. It wasn't that we wanted to make something good for a small number of people particularly.
[10:03]
But we wanted to get rid of that constraint that you had to scale it up to a large number of searches. But ultimately what we developed we were able to scale. And in fact in subsequent years as a company at Google we've had sort of projects which say, well, throw as much compute power at it as you want.
Let's say we just want this to work well for a small number of people. We've ultimately always found ways to scale it up and deliver it to everyone. Which is kind of interesting. It's kind of like technology as an inherent democratizer. Because based on the evolution of hardware, probably more importantly the evolution of algorithms and the system software that supports these, you're able to scale sort of almost anything you can think of up.
Now it takes -- it's not trivial. It takes some hard work and effort. But I think that's an interesting observation -- we'll have to see in our lifetime if that means everybody has more or less tools of equal power. And there's not much way that you can really spend a lot more for search and get much better results, because in a short period of time technologists are able to make it work better for everyone.
So anyhow, that said, I just wanted to quickly go over a little background and open it up to some questions.
[11:45]