Host: The Communicators is at the State of the Net conference in Washington, D.C. We're going to share some of the interviews we conducted with members of Congress, government officials, and technology leaders. So, Brewster Kahle, what do you do for a living?
Guest: I run the Internet Archive, an internet library that gives away books, music, video, web pages, and software for free, trying to build the internet into the Library of Alexandria for the digital age.
Host: That sounds like the internet, doesn't it?
Guest: The internet is getting there, but let's take the web. The average life of a web page is only 100 days before it is changed or deleted. 100 days. We've built our culture on this ever-shifting sand, so what the Internet Archive does is take snapshots of the web pages on websites every two months. Snapshot, snapshot, snapshot. It's been doing this since 1996 and offers this as a service. It's used by hundreds of thousands of people a day. They find all these things that disappear, either maliciously or sometimes they just drop off the net.
Host: How many websites are there today?
Guest: Hundreds of millions, and they're coming and going all of the time. But we collect about 800 million pages every day. The total collection is about 800 billion URLs. It's actually kind of huge. That's only part of what we do. We also archive television: ABC, NBC, CBS, Fox, but also international television. If you go to tv.archive.org you can search to find clips of what other people said and put those in blog posts and the like. The idea is to make it so people can quote, compare, and contrast critically what's happened on television. The old Daily Show with Jon Stewart, he did something like that. Can we do that now? It is now used by journalists, by end users, all the time. It's a free library, a library on the internet.
Host: Why couldn't I go to Google and type in Jon Stewart?
Guest: You will find the Jon Stewart show, and they put up certain clips from their past, or on YouTube you might see a smattering, but you don't know what show it came from. It doesn't have the context of television. Ours is just a run of television. You can only pick bits and pieces of television before we shut it down, to try to make it so the publishers are not unhappy with us. But if you want the whole thing, we put it on a DVD or, now, a thumb drive and lend it to you. Then you have to send it back. People want it for documentaries and the like; then they go to the publishers and say, hey, can I use this clip for my documentary? It's just like a library in the sense that you are borrowing things from the library. We also do this with books. We digitize several thousand books a day, about a million books a year now, and we are digitizing these and then weaving them into the net. More and more on Wikipedia, if you go to a footnote and it has a page number, you click on it and it goes right to the right page. You can page back and page forward, but if you want more of it you have to borrow it, and if someone else has already checked it out you have to wait. But at least you get a couple of pages. You can fact-check and go a little deeper than Wikipedia. Wikipedia is the encyclopedia of the internet. We want to be the library of the internet. Where do you go deeper? How do you get to the published works of humankind: old web pages, television, books, literature, music recordings?
Host: What kind of law department do you have to have at the Internet Archive to handle all the rights?
Guest: We don't actually have any law department at all. We are a library, so we just operate like a library. The idea is not to offend people or make them feel like they've been taken advantage of. We don't make any money. We're a nonprofit library, and we cut things short: with television, it's just clips. With the music collection, we try to link it over to Spotify.
It's only 30 seconds, unless it's old material like 78 rpm records, which happen to be just completely great. It was before my time, but they are just wacky and fun. Those we make downloadable and you can listen to them, but they sound like 78s: you crank it up, the horn, the dog, that era. The first half of the 20th century is largely forgotten. Records, compact discs, Spotify.
Host: How are you funded?
Guest: The same way Wikipedia or NPR is funded. We have the end-of-the-year pleas for donations. We get grants. About a third of our income comes from libraries paying us to collect web pages. We collect, for instance, the web collection for the National Archives of the United States or the Library of Congress. That all comes into the archive. We have a room inside the Adams Building. You should go visit it. It's part of a room in the Library of Congress where we are digitizing all day long. We have 20 locations around the country, actually now the world, digitizing books. Okay, you think, gosh, shouldn't this all be done by robots, or hasn't it all been done by now? It turns out it hasn't been done. If you look at the number of books on the Internet Archive, it goes up, up, up to 1923, and then there's copyright: everything beyond that is to some extent restricted. It goes up, up, up and then crash. Then decades of almost nothing online. Then it comes back up again at the end of the 20th century or in the 21st century. We are missing the 20th century. You say to yourself, okay, so it's not online, I can just buy the book. Go to Amazon: people have studied what books by decade are available new on Amazon, and it goes up, up, up to 1923, then crash. The 20th century basically is not online. What's amazing is we think there's so much information online, and there is. A lot of it is crap, but a lot of it is good. But the 20th century, the published material, is almost nonexistent. It's almost not there. We are raising a generation, and ourselves really, on not the best we have to offer.
So we basically have this collective amnesia about the 20th century. That's a pretty important century not to forget. We will be doomed to repeat it if we just forget the lessons from other times. So we are trying to go through the 20th century. Better World Books is now donating all the books that we don't already have to the Internet Archive. They get them from libraries, and we're trying to basically fill in the 20th century and make it so all those Wikipedia footnotes turn live. We even went and fixed the broken links in Wikipedia. The executive director of Wikipedia, she was worried that truth might fracture: that if we didn't really work on trying to make Wikipedia stronger, cited by better sources, people would start citing sources that were available but not good. What happens behind the scenes on Wikipedia articles is based on how good the citations are and whether you can click and see them. So we committed to going and fixing all the broken links and filling in all the books and the journal literature that is linked from Wikipedia. We fixed 11 million broken links in Wikipedia in the last couple of years, and now we're going through all of the books, finding them and replacing the black text with a blue link so you can click on it and go to it. If the books are missing, we try to find those books, digitize them, and put them up.
Host: How did you come up with this idea?
Guest: It was the vision of the net that a bunch of us, certainly I, had of what we wanted the internet to be. Okay, it was 1980 and I was like, why don't we go and make the Library of Alexandria for the digital age? We had to build the computers and the internet and the World Wide Web, and I helped participate in this. Yeah, Internet Hall of Fame. I've been at this stuff for a long time, building the early thing that came before the web. The web came along, I helped get publishers on the web, and by 1996 we had enough momentum that I thought I could turn to building the library.
The idea is to make all the published works of humankind one click away. If you're in the middle of some rural place, or if you're in Africa someplace, if you want access, you should be able to have access. That was the dream of the internet I signed onto. We are now in 2020 and we're still not there yet. But there is a mounting number of us who are just saying, let's get there. It was a good idea, to make a hyperconnected set of information. Let's do that. Some of us are motivated. People are just making stuff up and they're not being called on it because you can't get to the cited material. You can't actually go and say, no, this is not right, here's better information. People are just making stuff up. We can't live that way. We've convinced a whole generation to turn to the net. We don't go to libraries anymore in the same way. We go there for events and things, but it's probably not to go pull books, except kids' books and things like that. Audiobooks, great. But reference materials? The net. The net isn't good enough yet. We are working on it. We are the 300th most popular website. We have 4 million users every day that come to us and look for information. Some people just want to live in their bubbles, but an awful lot want to go deeper. The Internet Archive is part of that ecosystem.
Host: You had a little invention called Alexa at one point. Whatever happened to it?
Guest: Alexa Internet, the company that Amazon.com bought, but it's not the one that's actually talking with you. Alexa Internet is a web monitor, named for the Library of Alexandria. I worked for Jeff Bezos directly for three years. Terrific time, really smart guy.
Host: Hopefully he paid you in stock.
Guest: He did. The smartest thing I ever did was not sell all of it, so it has helped the Internet Archive grow and grow. Thank you to Jeff Bezos. And Steve Case bought my company before that. He ran America Online. So I sold a company to America Online and Steve Case.
I have been very fortunate, but it was all towards this goal of building the library. I've only had one idea, and so I'm just trying to stay at it. I wanted, by 2020, October of 2020, for us to be able to say welcome to the library, that the internet is now a library. The internet is a library, and it will have all of the features that frankly we grew up with, whether it's the old periodicals, reliable access, or a card catalog so you can find things. Can we actually make the library of the digital age come to be, one that has enough to raise educated citizens? If we don't work on it, we'll end up with a generation of deserters. They will learn from whatever they have in front of them. And if it's paid-for stuff from political points of view or foreign points of view, or just trolling people that are just making stuff up, we are going to end up with a mess. And I'd say we're sort of seeing that play out. So why don't we go and stand up and help out the Facebooks, the Twitters, that are trying to make referenceable material? Sometimes not as much as it should be, but how do we make it possible so people can go and know what it is they're looking at? Yes, it may be made up, but at least you can know it is made up based on the analyses of the authors of the materials. How can we go and build an internet that's a global brain that we can learn to trust? Because right now we are in this position where it's starting to be scary out there. People are starting to worry that maybe the internet is just full of junk, but we don't have an alternative place to go. How do we go and reinforce, and make it so websites that want to be better are able to be better, referenceable? How do we help authors, Wikipedia contributors? How do we give them access to the books in the library so they can reference right into it? How can we give the reader... My favorite thing, with this weaving books into the web with Wikipedia, was with my next-door neighbor.
She's 15 years old and I was telling her, we're going to digitize these books and weave them into Wikipedia. She lit up and said, I want that. I never get a rise out of my 15-year-old next-door neighbor, and I said, why do you want that? She said, well, my school will not let me quote Wikipedia in my research papers. Wikipedia, that's not good enough. You have to go and follow through, and if I could just click on it and open the book, I could do my homework in the middle of the night. She's like, that's good, right? That's what we want. We want people to be able to go deeper, and make it so that publishers still sell books, they may sell even more books, and we get information out of the books, music, video, journal literature, and old periodicals, so that people know where it came from and what they can trust.
Host: You have nine months for your 40-year-old goal. Are you going to make it?
Guest: We are trying to get, as they say in Silicon Valley, the minimum viable product. Can we have enough to do this? So Phillips Andover, Phillips Academy Andover, they went and took their full library and lent it to us so we could digitize it, and we now have the full library of one of the best prep schools in the country as a high school library for anybody that wants access. Isn't that great? Marygrove College, a college in Detroit that unfortunately just went out of business. It was a Catholic girls' school and then became coed. But just last year was its last, and what they did with the library is they donated it to the Internet Archive, and now we're in the process of digitizing it over the next nine months. We will now have a college library and a complete prep school library plus about 1.2 million books, and if we can get up to a total of 4 million books, about an 80 million dollar project, a lot of money but doable, we would have a Yale, Princeton, or Boston Public class library available to anybody who wanted it on the internet.
That's the dream we're going for: start with these first steps and weave them into Wikipedia so people can find them. That's just on the book side. The web side is going well, and we're using it to help journalists be able to know when things are being considered by people, and to keep some of the web referenceable, even though pages may have been taken away.
Host: What are the mechanics of the digitization? Does somebody have to stand there, page by page by page?
Guest: Let's take book digitization. We built our own machine. It holds the book like this so it doesn't break the bookbinding, and it raises and lowers glass with a foot pedal, so there's a pedal; think of this as a workout. You raise and lower the glass, it flattens the page, and it goes click, click. That's it. A person turns the page. Now, click, click. Gosh, shouldn't that all be done with a robot? We tried. I invested in a robot company to try to get this to work, and it ripped books and it was inefficient and broke a lot. We just said, okay, let's just have people do it. People are doing this now at a couple thousand books a day. Google has already digitized an enormous number of books, and some of them are available, but they got caught up in copyright issues. So our approach is digitize and lend: if we have a physical copy, we digitize it and only one reader at a time can read it. So you can get a couple of pages to preview, kind of like Amazon's look inside the book, but if you want the whole thing then you check it out for two weeks. Then it comes back and goes to the next person. Anytime there's one book, or we have two or three copies, or other libraries or the Library of Congress have those, then they can lend them out as well. So it's restricted. It's not even all that great because it's kind of restricted, but it balances the copyright interest to go and make sure there are no more copies floating around than were originally bought and purchased from the publishers.
Host: Brewster Kahle, in 1980 when you came up with this idea, was it a lightning strike or was it just a gradual thought process? What were you doing at the time?
Guest: I was walking over the Charles River. A friend of mine posed this question, which really haunted me, and it has directed me all these years. He said, Brewster, you're a technologist. You're also a utopian idealist. Yes. Paint a portrait that's positive because of your technology. That turned out to be a very hard question. We are really good at complaining about things, whether it's nuclear war or Nicaragua problems, but coming up with a positive vision is much harder. I could only come up with two ideas. One was trying to save people's privacy, even though people are throwing it away. The other was to build a library of everything. I thought the second one, a library of everything, was too obvious, so I started working on the privacy. I found it was too difficult to try to make cost-effective privacy devices by making chips in 1980. So I went to plan B and never turned back. There are a number of us who share this vision of what the internet, the World Wide Web, should be, and it's time to deliver. We've made progress. It's easy to say the internet is just a pile of vile stuff and whatever, but it's also got all sorts of terrific things and participation by lots of people, and we need better tools to make our way through it. It feels confusing. It sometimes feels even threatening to people. And with people actively spreading disinformation and misinformation, we need better tools. We're not going to let this go the wrong way. There is a large number of us. We are 150 people at the Internet Archive, but there are thousands and thousands of others who are all participating in Wikipedia, the Public Library of Science, Mozilla, and the open source world, who all have the same general dream of building something that is more than just ourselves.
It's an information interconnection system that connects people with the information that they need. It gives people an idea of what they can leave behind by writing things that will endure. That's the dream of the internet that I'm still after, and many, many others are as well.
Host: What was your role in the development of the internet and the World Wide Web? Did you have one?
Guest: On the actual internet I was on the side, more or less. I was for a time part of the Engineering Steering Group of the internet, how you build it, but I was not a leader of that. There was a system that got to be the first publishing system on the internet, and I did that. It was called WAIS. It came before Gopher and the web, and that's probably why I'm in the Internet Hall of Fame. But when Tim Berners-Lee got the web going, all of these technologies folded into the web. The web was better, and I decided to get publishers online. I got the Wall Street Journal, the New York Times, Reuters, AP, and Encyclopedia Britannica. I got those all on board so the open world worked. This was a time when it could've been in these very small silos of LexisNexis or CompuServe or AOL. They were very controlled, but we wanted an open environment where everyone could get on, although it was a bit of the Wild West, so I was a key part in that era. Once that era was going and I s