We are a new hub for interdisciplinary research on the global security and politics of artificial intelligence. We are working to understand the effects of misuse and unintended consequences, how AI changes global power dynamics, and what governance models are needed to support safe and trustworthy AI. Our work is oriented with a view over the horizon, and our goal is to help decision-makers identify the steps they can take today that will have an outsized impact on the future trajectory of AI around the world. This work helps support the broader mission of the Center for Long-Term Cybersecurity, which is to help individuals and organizations address tomorrow's information security challenges and amplify the upside of the digital revolution.

The Center for Human-Compatible AI is a research lab based at UC Berkeley aiming to reorient the field of AI towards the development of provably beneficial systems through AI safety research. The faculty, researchers, and PhD students are doing pioneering technical research on topics that include cooperative reinforcement learning, mis-specified objective functions, human-robot cooperation, value and preference alignment, multi-agent systems, and theories of rationality, among other topics. Researchers use insights from computer science, machine learning, decision theory, game theory, statistics, and the social sciences.

We are thrilled that the founder and director of CHAI, Professor Stuart Russell, is here with us this evening to talk about his new book, Human Compatible: Artificial Intelligence and the Problem of Control. This book has been called "the most important book on AI so far," "the most important book I've read in quite some time," "a must-read," and "the book we've all been waiting for." Stuart Russell is known to many of you. He has been a faculty member at Berkeley for 33 years, in computer science and cognitive science. He is also an honorary fellow at Oxford. He is the coauthor of Artificial Intelligence: A Modern Approach, which is the standard textbook on AI, used in over 1,400 universities in 128 countries. Right now he holds a senior Andrew Carnegie Fellowship, one of the most prestigious awards in the social sciences. And last but not least, he served for several years as an adjunct professor of neurological surgery at UCSF.

Also with us is Richard Waters, the Financial Times' West Coast editor, based in San Francisco, where he leads a team focused on technology and Silicon Valley. He also writes widely about the tech industry and the uses and effects of technology. Current areas of interest include artificial intelligence and the growing power of the leading U.S. tech platforms. His previous positions at the Financial Times include various finance beats in London, New York bureau chief, and technology and telecoms editor, also based in New York.

Professor Russell and Mr. Waters will discuss developments in AI, including the expectation that AI capabilities will eventually exceed those of humans across a range of real-world decision-making scenarios. We will hear about steps we can take so that the result is not the dystopian future of science fiction, but one that will benefit us all. We will hear from them for about half an hour and then open it up for questions from the audience. After that we will break for a reception out on the terrace, with food and drinks available. The book Human Compatible will also be available for purchase, and Professor Russell has kindly agreed to sign copies for those interested.
So with that, I will turn it over to Professor Stuart Russell and West Coast editor Richard Waters. Thank you. [applause]

Thank you. Thank you very much, and thank you for joining us. If you don't know the book, rush out and buy it after this introduction. We'll dig into it as much as we can, but maybe we'll hold back some secrets so you have to pay for the thing as well, I don't know. As a journalist, one of the things I find absolutely fascinating about the AI debate is this complete schism among people who allegedly know what they're talking about. On the one hand, we have the camp that says we will never get to superhuman intelligence, and even if we did, these machines would be perfectly safe. And on the other hand, we have what I think of as the Elon Musk tendency — it's a shame that Musk, as much as we all admire him, has run away with the sci-fi end of this debate, and I think it needs to be anchored in something a little more serious. I'm very glad to have this debate, because I think, Stuart, what you've done is make us aware of both the potential and the risks while anchoring this in a real, solid understanding of the science and of where we are starting from. This is a really good place to start the debate, rather than the schism that we have right now.

Since I'm a journalist and I love a schism, I'm going to dive straight in. We are here in foggy Berkeley, particularly foggy today; I'm sure you remember it from your doctorate at that other place, down the peninsula. The other place. So Stanford University runs the One Hundred Year Study on AI, which is this kind of landmark attempt to map what is happening in AI, to anchor this debate in some kind of reality going forward. You quote them saying that, unlike in the movies, there is no superhuman robot on the horizon, or probably even possible — basically denying that AGI, or whatever you want to call it, is even coming. How did they come to that?

Is this actually working? Okay. [inaudible] Okay, they can hear you. I don't think I could keep my voice at a high level long enough. Interestingly, for the 70-year history of AI, AI researchers have been the ones saying AI is possible, and usually philosophers have been the ones saying it's impossible — for whatever reason: we don't have the right kind of quantum properties in our AI systems, whatever it might be. And usually those claims of impossibility have just fallen by the wayside, one after the other. But as far as I know, AI researchers themselves have never said AI isn't possible — until now. So what could have prompted them? It's a hundred-year study, right, with 20 distinguished AI researchers giving their considered consensus opinion on what's happening and what's going to happen in AI. So imagine 20 biologists did a summary of the state of the field of cancer research, and they said, you know, a cure for cancer is not on the horizon and probably isn't even possible. You would think, what on earth would make them say that? We have given them 500 billion dollars of taxpayer money over the last few decades, and now they're telling us the whole thing was a con all along. I don't understand what justification AI researchers could possibly have for saying AI is not possible, except a kind of denialism — which is just saying, I don't want to think about the consequences of success, it's too scary, and so I'm going to find any argument I can to avoid having to think about it. And I have a long list.
I used to give a talk where I would talk about AI and about the risks, and then go through all the arguments for why we should ignore the risks. After I got through about 28 arguments — kind of like the impeachment, the Republicans' 28 reasons why you can't impeach Donald Trump — I just gave up, because it was taking up too much time in the talk, and I don't want it to take up too much time today. [inaudible] So you get the usual: well, there is no reason to worry, we can always just switch it off. One of my favorites, right? So AI will never happen, and we can always just switch it off. There are other ones that I won't even mention because they are too embarrassing.

Before we get to what the machines might do to us if we get there, let's focus on the "are we going to get there" question. I mean, it's amazing: you lived through three decades of AI researchers promising us the world and nothing happening, and now all of a sudden we're in this period of amazing progress and they want to tell us it's not going to happen. Yes. Nonetheless, there's this point where we are now — the massive limitations of deep learning and these data-driven models — and we can all see the potential, but there's this huge gulf to get from here to there. You say it's going to take big conceptual breakthroughs to do that, and about those conceptual breakthroughs, you just don't know when they will come. What are the breakthroughs that you see, and why do you think they are going to happen?

So I can tell you the conceptual breakthroughs that I think we need. You're right that, actually, if we make all those breakthroughs, we might find it's still not intelligent, and we may not even be sure why. But there are clear places where we can say: look, we don't know how to do this, but if we did, that would be a big step forward. And there have already been, arguably, dozens of breakthroughs over the history of AI — actually, even going back much further. I mean, you could say Aristotle was doing AI. He just didn't have a computer or any electricity to do AI with, but he was thinking about the mechanical process of human thought, decision-making, planning, and so on. On the front of my textbook I actually have a little Greek text which describes a simple planning algorithm that he talks about: this is how you can reach a decision about what to do. The idea has been there, and steps have been taken, including the development of logic, which again started in ancient Greece and in ancient India, and revived itself in the mid-19th century. Logic is overlooked these days by the deep learning community, but it is a mathematics of things, right? And the world has things in it. So if you want to have systems that are intelligent in a world that contains things, you have to have a mathematics that incorporates things as, sort of, first-class citizens, and logic is that mathematics. So whatever shape a superintelligent system eventually takes, it's going to incorporate in some form logical reasoning and the kind of expressive formal languages that go along with it. Let me give a couple of examples of clearly needed breakthroughs. One is the ability to extract complex content from natural language text. So imagine being able to read a physics book and then use that knowledge to design a better radio telescope. That, at the moment, is not even close to being feasible, but there are people working on being able to read physics books, and certainly on being able to pass the exams.
The sad thing is, it turns out that most exams we give students, especially multiple-choice exams, can be passed with no understanding whatsoever of the content. So my friend, a Japanese researcher, has been building software to pass the University of Tokyo entrance exam, which is kind of like getting into Harvard or MIT, or maybe even getting into Berkeley. Her program is now up there around the passing mark to get into the University of Tokyo, and it still doesn't understand anything about anything. It has just learned a whole lot of tricks for doing well on the exam questions. This is, I think, a perennial problem that the media often overlook: they run the big headline — AI system gets into the University of Tokyo, or whatever — but not the underlying truth that it still doesn't understand stuff. So being able to understand a book, extract complex content from it, and then do reasoning and design and invention with that content would be a big step forward.

And I think there's a failure of imagination when we think about AI systems, because we think, okay, it's not as smart as us, and maybe if we try really hard it could be as smart as us. But if the machine can read one physics book and do that, then that same morning it will read everything the human race has ever written. And to do that it doesn't even need more computer processing power than we already have. So these systems are not going to be like humans in any way, shape, or form, and this is, I think, an important thing to understand. Obviously machines already far exceed human capabilities in arithmetic, and now in chess and Go and video games and so on, but those are narrow corridors of capability. When machines reach human-level text understanding, they immediately blow past human beings in their ability to absorb knowledge. That gives them access to everything we know, in every language, at any time in history.

Another really important thing is the ability to make plans that succeed in the real world. Let me expound on that a little bit. If you look at AlphaGo, which was a very impressive achievement — it's the program that beat the human world champion at Go — sometimes when it's thinking about what move to make, it's looking 50 or maybe 100 moves into the future, which is superhuman: human beings don't even have the memory capacity to remember that many moves. But if you took that same program and applied it to a real, embodied, physical robot that actually has to get around in the world — pick up the kids from school, lay the table for dinner, perhaps landscape the garden — 50 to 100 moves gets you about one-tenth of a second into the future in the physical world with a physical robot. So it simply doesn't help at all. You might think of AlphaGo as superhuman in its ability to look into the future, but it is completely useless when you take it off the Go board and try to put it into a real robot.

Humans manage to make plans at the millisecond timescale: your brain generates, preloads, and downloads into your muscles enormously complex motor-control plans that allow you to, for example, speak. That's thousands of motor-control commands sent to your tongue and your lips and your vocal cords and your mouth and everything. The brain has special structures to store these instructions and spit them out at high speed so that your body can function.
It operates on the millisecond timescale, but also — as I was talking to Richard about his daughter's decision to do her PhD in molecular biology at Berkeley, which took six years — we make decisions on that kind of timescale too: I'm going to do a PhD at Berkeley. Six years is a trillion motor-control commands. We operate at every scale between the decade and the millisecond, and we do it completely seamlessly. Somehow we always have motor-control commands ready to go. We don't usually freeze in the middle of doing something and wait 72 minutes for more motor-control commands to be computed and then resume moving. We always have motor-control commands ready to go, but we also have the minute, the hour, the day, the week, the month, the year, and it is all seamless. There is actual progress towards a solution, and some of the results we've seen recently in games like StarCraft and Dota illustrate this, because whereas Go is a 200-move game, these are 20,000- or 100,000-move games, and yet the AI is playing at superhuman level.

But for a second, let's assume these problems are being tackled. Let's say we get to that point of superhuman intelligence. I mean, this is heaven, because you say at one point in your book that it took 190 years for GDP per capita in the world to grow tenfold, and we could do this — almost with the technology we would have at that point — in one generation, or however long it would take to roll it out. So, nonetheless, what could go wrong? When you think about what could go wrong, I think the interesting point is that it's not the technology, it's how we design it at a fundamental level that you seem most concerned about — so can you talk a bit more about that?

I think The Economist put it this way: introducing a second intelligent species onto the Earth — what could possibly go wrong? In fact, if you put it that way, if you said, clearly intelligence is what gives us power over the world, so if we make things that are more intelligent and therefore more powerful than us, how are we going to have power over more powerful entities, forever? When you put it like that, you go: good point. Perhaps we should think about that. So that's what I tried to do. And the first thing to think about is why things go wrong.

People have known about this problem for a long time. Alan Turing said basically that we would have to expect the machines to take control. He was completely matter-of-fact and resigned to this future. So it's not a new thing that Elon Musk just invented, and I don't think anyone would say Alan Turing isn't sufficiently expert to have an opinion about AI or computer science. The same with Marvin Minsky, one of the cofounders of the field, and various other people. But Turing basically doesn't give you a choice. If the answer is: we lose, machines are going to take control and it's the end of the human era, then there's only one choice, which is to say we'd better stop doing AI. And for that choice he actually refers to Samuel Butler's novel Erewhon, from 1872. In Erewhon, that's the choice: it's sort of a science fiction novel about a society that is developing very sophisticated machines and then decides it doesn't want to be taken over, doesn't want control of its world taken by the machines, so it just bans machines. They destroy all the machines after a terrible war between the pro-machinists and the anti-machinists; the anti-machinists win the war, and now machines only exist in museums. But that, I think, is infeasible, for exactly this reason.
If we have superintelligent AI and we do this well, that tenfold increase in GDP is conservative. It just means giving everyone on earth access to the same level of technology and quality of life that we have here in Berkeley. Not sci-fi, not talking about eternal life or faster-than-light travel; that tenfold increase in GDP is just bringing everyone up to a decent standard of living. It's worth somewhere between 10 and 20 quadrillion dollars, so that's the size of the prize. That's creating a momentum, and saying we're just going to ban AI is completely infeasible. Not to mention the fact that AI, unlike nuclear energy or even CRISPR babies, proceeds by people writing formulas on whiteboards, and you can't ban the writing of formulas on whiteboards, so it's really hard to do much about it. So we have to ask: what could go wrong? What is it about making better AI that could be bad?

The reason is that the way we have designed our AI technology from the beginning has the property that the smarter you make the AI system, the worse it is for humanity. Why? Because the way we build AI systems has always been essentially a copy of how we thought about human intelligence: human intelligence is the capability to take actions that you can expect will achieve your objectives. This is the economic and philosophical notion of the rational agent, and that's how we've always built AI. We build machines that receive an objective from us and then take actions that they can expect will achieve that objective. The problem, as we've known for thousands of years, is that we are unable to specify objectives completely and correctly. This is the fundamental problem; this is the legend of King Midas; this is why the third wish that you give to the genie is always "please undo the first two wishes, because I completely ruined everything." But we may not get a third wish. If you create a system more intelligent and more powerful than human beings and you give it an incorrectly specified objective, then it will achieve that objective, and you're basically creating a chess match between us and that machine. And we lose that chess match. And the downside of losing that chess match is arbitrarily bad. So it's a fundamental design error that we made very early on in the field — and actually not just in AI: control theory, economics, operations research, statistics all operate on this principle that we exogenously specify an objective and then the machinery optimizes it.

So corporations optimizing quarterly profit are already destroying the world. We don't need to wait to see how superintelligent AI messes things up; you can see it happening already. Corporations are, for all intents and purposes, algorithmic machines that maximize an incorrectly specified objective, and they are making a mess of the world, and we are powerless to stop them. They overpower us. And they've been doing this for 50 years, and that's why we're unable to fix our climate problem despite the fact that we even know what the solutions are. So, to sum up, we have to design AI systems a different way if we're going to be able to live with our own creations successfully — and a different way from how we design our organizations, too, because we can't have anything this powerful take us at our word; that's the last thing we want. Corporations took us at our word: we set them up to maximize shareholder return, and that's what they did. That's the problem.
Economists call these externalities, and sometimes you can fix externalities with taxes or fines or regulations, but sometimes — as with social media messing up our democracy and society — you can't. There's no way to tax the number of neofascists that you create on your social media platform. And that's an example: the social media platforms are simple learning algorithms that manipulate human beings and make them more predictable sources of revenue, and that's all they care about. But because they're operating on billions of people, interacting with everyone for hours every day, they're already a super-powerful force, and their objective of maximizing click-through is another one of those mis-specified objectives that we keep messing up with.

We're going to leave plenty of time for questions today, so you can start thinking about what you want to ask. But before we do, we shouldn't hold back the punch line from your book, which is that there is an answer.

I hope so, and I guess we can do that. Actually, the answer is in the first chapter; the first chapter sort of presages the main argument of the rest of the book. So I don't want to leave everyone with the impression that I am just one of these doomsayers predicting the end of the world — we have enough of those books already. I can't help being an optimist, because I always think every problem has a solution; if it doesn't have a solution, then it's a fact of life and not a problem. So I am proposing a way of thinking about AI that is different in the following way. If we are unable to specify completely and correctly the objectives we want our machines to pursue, then it follows that the machine should not assume it knows what the objective is. All our AI systems — every chapter of the AI textbook — are based on the assumption that the machine has the correct objective. That cannot be the case in real life. So we need machines that know that they don't know what the true objective is. The true objective is the satisfaction of human preferences about the future — what each of us wants the future to be like, and what we don't want it to be like. That's what the machine should be trying to help us with, but it knows that it doesn't know what our preferences are.

And this is a kind of machine that, in some ways, we're already quite familiar with. How many people have been to a restaurant? When you go to a restaurant, does the restaurant already know what you want to eat? No, not usually — unless you go there a lot; my Japanese place across the road just brings me my lunch. Generally speaking, they have a menu. Why do they have a menu? That way they can learn what you want, because they know that they don't know what you want, and they have a process, a protocol, to find out more about it. Now, they're not finding out in complete detail exactly how many grains of rice you want on your plate, or exactly where you want the little grill marks on your burger, or any of that stuff. They're getting a very rough sense: if they have 15 items on the menu, that's only about four bits of information about your preferences for your main course. But that's the protocol. The restaurant, like the AI system, knows it doesn't know what you want, so it has a protocol to learn enough that it can make you happy. And that's the general idea — except this is going to be much more radical. This will be not just what you want for dinner, but what you want for the whole future, and what everyone on earth wants for the whole future.
And we can show two important properties of these systems. Number one is that they will not mess with parts of the world whose value they don't know about. In the book, and often in talks, I use this example: suppose you have a domestic robot that is supposed to be looking after your kids because you're late home from work, and it's supposed to be cooking dinner, and there's nothing in the fridge. What does it do? It looks around the house and spots the cat, and calculates the nutritional value of the cat, and then cooks the cat for dinner — because it doesn't know about the sentimental value of the cat. A system that knows it doesn't know the value of everything would say: well, the cat may have some value from being alive that I don't know about, and so cooking the cat wouldn't be an option. At least it would ask permission. It would call me on my cell phone and say, is it okay if I cook the cat for dinner? And I would say no. Is it okay if we turn the oceans into sulfuric acid in order to reduce the carbon dioxide level in the atmosphere? No, don't do that. So that's point number one: you get minimally invasive behavior. It can still do things, as long as it understands your preferences in particular directions — like, I would like a cup of coffee. If it can get me a cup of coffee without messing up the rest of the world, it's quite happy to do that.

The second point is that it will allow itself to be switched off, and this is sort of the one-plus-one-equals-two of safe AI: if you can't switch it off, we're toast. So why would it allow itself to be switched off? Because it doesn't want to do whatever it is that would cause us to want to switch it off. By allowing itself to be switched off, it avoids those negative consequences, whatever they are — it doesn't know. It doesn't know why I'm angry with it, doesn't know why I want to switch it off, but it wants to prevent whatever that is, so it lets me switch it off. And this is a mathematical theorem: we can prove that as long as the machine is uncertain about human preferences, it will always allow itself to be switched off. And as that uncertainty goes away, our margin of safety goes away, so machines that believe they have complete knowledge of the objective will not allow themselves to be switched off, because that would prevent them from achieving the objective. So that's the core of the solution. It's a very different kind of AI system, and it requires rebuilding all the AI technology that we have, because, as I said, all of that technology is based on an incorrect assumption — one that we haven't really noticed, because our AI systems have been stupid and constrained to the lab. The constrained-to-the-lab part is now going away: they're out there in the real world messing things up. And the stupid part is also going away, so we have to solve this problem. We have to rebuild all that technology from the foundations before the systems get too powerful and too intelligent.

I've got one more thing. Let's say the machines don't kill us; let's say they give us what we want. Then we have to work out what we want, and not just individually — I have no idea what we want in the aggregate; this is going to be a phenomenal problem for humanity. And you conjure the image of a scene where humanity, at the end, is all sitting back in an easy chair being fed by robots — a kind of not-with-a-bang-but-a-whimper end to the world. How on earth are we going to look towards a future where the machines give us what we want?
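A minimal numerical sketch of the switch-off argument described above, under strong simplifying assumptions that are not from the talk itself (a single proposed action, a Gaussian belief over the human's utility for it, and a perfectly rational human):

    import numpy as np

    # Toy off-switch comparison: the robot is unsure of the human's utility u
    # for its proposed action.  It can act (payoff u), switch itself off
    # (payoff 0), or defer to the human, who -- if rational -- lets it act
    # only when u > 0, giving payoff max(u, 0).
    rng = np.random.default_rng(0)

    def policy_values(mean, std, n=100_000):
        u = rng.normal(mean, std, n)      # robot's belief over the human's utility
        act = u.mean()                    # E[u]
        off = 0.0
        defer = np.maximum(u, 0).mean()   # E[max(u, 0)]
        return act, off, defer

    for std in (2.0, 0.5, 0.0):           # shrinking uncertainty about preferences
        act, off, defer = policy_values(mean=0.3, std=std)
        print(f"std={std:3.1f}  act={act:+.3f}  off={off:+.3f}  defer={defer:+.3f}")

    # Deferring is never worse than acting or shutting down, and its advantage
    # vanishes as the uncertainty does: a robot certain of the objective gains
    # nothing from letting itself be switched off.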
This is a problem that I don't have a good solution for, and it's not really a technological one; it's really a social and cultural problem: how do we maintain the vitality of our civilization when, in fact, we no longer need to do most of what constitutes a vital civilization? Let's think about education. Why do we educate? As a very practical matter, if we didn't, civilization would collapse, because how would the next generation be able to run it? Human civilizations, and even animal species, have figured this out: they have to pass on knowledge to the next generation. If you add it up over history, about a trillion person-years of effort has gone into just passing our civilization on to the next generation — because we have no choice. We can put it all down on paper, but paper is not going to run the world; it has to get into the brains of the next generation. But what happens when that's not true? What happens when, instead of going through all that long, painful process of educating all those humans, we can put our knowledge into machines and they take care of it for us?

This is a story that E. M. Forster actually wrote. So if you want one takeaway, and you can't bring yourself to buy the book, you can download this one, because it's no longer in copyright: E. M. Forster's short story called The Machine Stops. The Machine Stops was written in 1909, but in the story everyone is looked after by machines 24/7. People spend most of their time on the internet, doing videoconferencing with iPads and listening to lectures or giving lectures to each other; they are all a little bit obese, and they don't like face-to-face contact anymore. It sounds a lot like today, but it was written in 1909. And of course the problem is that no one knows how to run the machines anymore. They have turned over the management of their civilization to machines and become feeble as a result, and we face a modern version of that story.

What we need to do — I'm reminded of the culture of the Spartans. Sparta, for all its faults, took a very serious cultural attitude towards the survival of its city-state. The typical pattern seemed to be that every couple of years you would be invaded by a neighboring civilization or city-state, and they would haul off your women and kill all the men. So Sparta decided that it needed a serious civil defense capability, and so education for Spartans — I was reading another book, by Daniel Susskind, called A World Without Work, and he described it as 20 years of PE classes. It prepared males and females to fight; it was a military boot camp that went on from before you could walk until you were old enough to carry weapons, and that's how they lived. There was a cultural decision to create that, and that was what was valued in the culture. I'm not recommending that we do that exactly, but some notion of agency and knowledge and capability has to become not just an economic necessity, the way it is now, but actually a cultural necessity: you're not a valuable human being — I don't want to date you — unless you know a lot, unless you're capable of skinning a rabbit and catching your own fish and fixing a car and doing this, that, and the other. So it's a cultural change. And I think it then also becomes a matter of your own self-esteem: you don't feel like a full human unless you're capable of doing all these things and not being dependent on the machines to help you. I can't see any other kind of solution for this problem.
The machines are going to tell us, basically — as you may have done with your children — it's time for you to tie your own shoelaces. But, like your children, we're always going to say: no, no, we have to leave for school in five minutes and I can't do it, I'll do it tomorrow. That's what the human race is going to do. It's going to say, we'll get around to this agency stuff tomorrow, but for now the machines just have to help us do everything. We're going to be myopic, and that's a slippery slope that's pretty dangerous, so we have to work against that slope.

I think this is a great point to leave it at — here at one of the world's greatest educational institutions, where maybe we don't need to learn anymore — and let's open this up for questions. We're going to pass around these microphones. And don't feel constrained by the book; ask anything that's on your mind.

Thanks for the talk. Regarding this idea that the objective should be unknown: another way of thinking about what you just said is, well, the objective is satisfying human preferences, human preferences are unknown, so we should maximize the expected value of human preferences — and that's just standard decision theory. Think about it: companies also don't know which action will bring the most profit, so they just do what will bring the most expected profit. So I want to hear what the difference is between your model, where preferences are unknown, and a model where the objective is something like maximizing human welfare and, since we don't know it, we use expected values.

That's a great question, and it brings up the point: how come we never noticed this before? The answer is that, actually, what you say is correct: in the standard formulations of decision-making, uncertainty about the objective can simply be eliminated. You just replace the objective with its expected value and everything is fine. That's a folk theorem, and most textbooks don't even mention it because it's so obvious — but it's false. The reason it's false is that the environment is a source of additional information about preferences, and the most obvious source is that there are humans in the environment, and what they do provides more information. So the theorem isn't true, and in fact you can't just make decisions on the basis of expected value. What the system will do is, for example, ask questions: is it okay if I cook the cat for dinner? As opposed to saying, on average the cat seems to have high nutritional value, and perhaps they hate the cat, so on average we might as well just cook it. That's not the right answer; the right answer is to ask permission.

I think there are two reasons we didn't notice this for 70 years. One is that we copied this notion of intelligence from human rationality, and for the most part, in thinking about humans, we just assume we have objectives — and of course we know what our objectives are, which turns out, of course, not to be true. We have real epistemic uncertainty about our own preferences for the future. From time to time over the last 50 years, in decision analysis, in philosophy and economics, a few papers have talked about that, though not very many. But the other reason, I think, is that this notion of the environment as an additional source of preference information wouldn't make so much sense if you were thinking about human decision-making alone; here we're talking about a coupled system — a machine that is trying to be beneficial to a human.
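A tiny numerical sketch of that answer, using invented numbers for the cat example (they are not from the talk): when the system can query the human, acting on the expected value of an uncertain objective can be strictly worse than asking first.

    # Toy comparison: collapsing an uncertain objective to its expected value
    # versus asking the human first.  The robot is unsure whether cooking the
    # cat is fine (+5, just food) or terrible (-20, a beloved pet).
    p_fine, u_fine = 0.9, 5.0
    p_bad,  u_bad  = 0.1, -20.0

    expected_u = p_fine * u_fine + p_bad * u_bad         # "replace objective by E[u]"
    act_on_expectation = max(expected_u, 0.0)            # cook iff E[u] > 0 -> it cooks

    # Ask first: the owner says yes only in the "fine" world, so the robot
    # cooks only when that is actually good and does nothing otherwise.
    ask_first = p_fine * max(u_fine, 0.0) + p_bad * max(u_bad, 0.0)

    print(f"E[u] of cooking    : {expected_u:+.2f}")          # +2.50
    print(f"act on expectation : {act_on_expectation:+.2f}")  # +2.50, cat gets cooked
    print(f"ask the human first: {ask_first:+.2f}")           # +4.50, strictly better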
In economics there's something called principal-agent games, where to some extent this has been studied. The principal would be an employer, the agent an employee, and the employee, in order to get a raise, tries to find out more about what the employer wants. So you get something like these games, but from the point of view of AI it's just different — actually, technically inconsistent with the previous ways of doing things.

Thank you very much for taking the time. The model you're proposing assumes you have some notion of the possible variance in consequences, and I think there's one way, as an AI or as a person designing the system, of saying: could this be consequential enough that we should have a human in the loop? But the question I'm grappling with is that it feels like, some of the time, that might be contrary to the goals of the system at scale. And I guess, even more fundamentally, is there any way of not having to do some human evaluation ourselves of which of those decisions are really essential — of having the AI itself be able to tell us, yes, this is really something we have to go to humans about, as opposed to humans having to know that in advance?

So the decision to get help from a human depends on how expensive it is to get help. If someone is in the middle of doing surgery, you don't want to interrupt them to ask how much they want to pay for their coffee. So really, everything works out the way you would expect: the more expensive it is to interrupt the human, the less often the AI ends up doing it, but then the more often the human suffers minor inconveniences. There's no perfect solution; everything works out kind of the way you expect. There are some important and difficult technical questions, though. The most difficult one, I think, is the fact that all the theories we have so far assume a rational human, and of course we are far from rational. If you are observing the behavior of a human and you want to infer something about their underlying preferences for the future, that really means reverse-engineering human cognitive architecture. To give you a simple example: Lee Sedol is the Go player who lost the famous match to AlphaGo, and in order to lose he had to play some losing moves. If you assumed he was rational, the only conclusion you could draw is that he wanted to lose the match. That would not be the correct conclusion. Instead, you need to understand that he has limited computational ability, limited lookahead, limited decision capacity, and that by far the most likely explanation is that he wanted to win, but his limited abilities prevented him from choosing the right moves. And that's the case of a highly trained human working on a relatively simple, tiny piece of the world. Of course, in real life our actions are not even close to rational. Just think about what rational means: your life consists of about 20 trillion motor-control commands, so to be rational you would have to pick the first one such that the expected value over all the remaining trillions is maximized. It's completely and utterly, totally bonkers-infeasible — that's a new technical term I just invented. So we have to start looking at what the major ways are in which human preferences are realized in the form of behavior, and what the major ways are in which we deviate from pure rationality.

The other point is that it's not as if the AI system's only source of information is its owner, as if it's going to be constantly bugging the owner.
Everything the human race has ever written is evidence about our preferences, because it describes human beings doing things and, in many cases, other human beings getting upset about it. Even Babylonian clay tablets — which you would hope contain the secrets of the universe — in fact contain boring accounts of Joe buying 22 camels for 17 bushels of corn and two slaves. But that tells you something about human preferences: it tells you the exchange rate between camels and bushels of corn. Another interesting thing is that you can infer something about human preferences without seeing any humans at all. Imagine that we all went on holiday; the campus is now empty and there are no humans here at all. Just seeing the way we have left the world tells us an awful lot about human preferences, because it is the result of humans pursuing their preferences for decades and decades. We published a paper about this: you can think of the state of the world as a sample from what happens when quasi-rational entities pursue their ends over time, and then you can work backwards to figure out what the ends were of the entities that made the world this way. We jokingly call this the non-naturalistic fallacy — as long as you know what the naturalistic fallacy is, the claim that you can't derive values from facts: in a sense, you can.

So I've got a question about an AI having an objective. [inaudible] When you have the AI considering the cat, it would still assess things against some objective — maybe the objective is satisfying human preferences, or something like that — but it would have an objective, and the objective wasn't intent on homicide, yet you get these unfortunate results from it trying to satisfy us in weird ways, the kind of thing that would happen in the story you mentioned. I wonder whether we can escape this objective structure for AI altogether, because it sounds like they're always going to have an objective — it's only a question of how specific it is — and if they have one, maybe that still opens up weird kinds of behavior.

That's a great question, and in fact that's what we try to do: we're constantly playing devil's advocate with ourselves, asking, is there some loophole in this scheme that we haven't thought of? It's true that in the design we're proposing, the goal of the AI system is to satisfy human preferences; but crucially, it knows that it doesn't know what they are. That's what gives us this margin of safety; that's what makes the machine willing to be switched off if that's what we want to do, and so on. So one loophole is that human preferences are not fixed. There are actions the machine can take — as every politician knows — to manipulate human preferences, to modify them to be easier to satisfy. That seems like a failure mode; that seems like a loophole. The obvious answer, you might say, is: okay, we have to set things up so that human preferences are sacrosanct, so that the machine is not allowed to modify preferences. But that's not feasible, because obviously having a highly capable domestic robot in your household is going to change your preferences — you're probably going to become a little lazier as a result. Lots of things change our preferences; otherwise we would all have the preferences of newborn babies, whatever those are, and we don't. So the question is: what are acceptable preference modifications and what are unacceptable preference modifications? That, I think, is an open question that we're working hard to understand, and it is, I think, the main loophole that we've identified.
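Returning for a moment to the point above about inferring preferences from the state of the world: a toy Bayesian sketch of that idea, with invented numbers that are not from the paper mentioned — observing that a fragile object has survived a long time is, by itself, evidence that somebody cares about it.

    # Toy inference of a preference from the state of the world alone.
    # Hypotheses: the occupants care about a fragile vase, or they are
    # indifferent.  All probabilities below are invented for illustration.
    p_break_per_day_if_indifferent = 0.05    # careless knocks happen often
    p_break_per_day_if_cared_for   = 0.001   # rare even when people are careful
    days = 365
    prior_cares = 0.5

    lik_intact_given_cares       = (1 - p_break_per_day_if_cared_for) ** days
    lik_intact_given_indifferent = (1 - p_break_per_day_if_indifferent) ** days

    posterior_cares = prior_cares * lik_intact_given_cares / (
        prior_cares * lik_intact_given_cares
        + (1 - prior_cares) * lik_intact_given_indifferent
    )
    print(f"P(they value the vase | intact after a year) = {posterior_cares:.6f}")
    # ~1.0: the vase's survival alone makes indifference wildly implausible.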
But this is what we do — we try; it's an engineering process. We run small-scale experiments with little simulated worlds and make sure that the theorems hold and that our interpretation of the theorems is in fact right — that yes, the systems do behave the way we want. And you get very interesting behavior. It's interesting to look at the human side of this equation. If the machine is solving the problem better and better, the nice thing is that the smarter your machine, the more beneficial it is to humans under this model — and that's good, because under the old model, the smarter the machine, the worse it gets for humans. But there's also an incentive for the humans: if you formulate this in game theory, the human half of this game actually involves teaching the robot, because the human will benefit by the robot learning more quickly what human preferences are. So when we make little simulated worlds with a toy human and a toy robot, the toy human in some sense leads the robot around by the hand to show it what's going on — where not to go, where to go, and so on — so you get these teaching behaviors automatically falling out as solutions to this formal problem.

[inaudible] My question is: how do we steer society towards that? Because it seems like, with today's technology, we have a fight — like a regeneration of the Luddites — with people saying technology is bad, and AI researchers and corporations saying we should develop the most advanced versions of AI that we can. How do we bring these people into a room together and decide how we're going to re-engineer things?

That's a great question. In my experience, scientists generally never say, okay, I'm wrong, you're right. In the best case, in ten years they will all say: of course we always thought this; of course we've always done things this way. So that would be the ideal outcome. Probably, if you want to get, for example, Google to change the way they do machine learning, the first thing is to get them to see where it's hurting them. If you look at what happened with Google Photos, which classified a person as a gorilla, that was a huge public relations disaster for Google. Why did it happen? It happened because they were training their machine learning algorithm with a fixed objective, which I'm willing to bet was to minimize the number of errors on the training set, because that's what we all do in our machine learning competitions. But of course the errors are not all created equal. Misclassifying a Norfolk terrier as a Norwich terrier — in fact, 50 years ago they weren't even two separate breeds, they were the same kind of dog, so they're very hard to tell apart — and I'm pretty sure the Norfolk terriers are not going to go on Twitter and say, I've been misclassified as a Norwich terrier, how dare they. And apples don't really care if they're misclassified as pears. But obviously people care a lot. So Google used an incorrectly specified objective. The loss matrix: you've got about 20,000 categories in the ImageNet database, so the matrix has more than 400 million entries — what is the cost of misclassifying an object of type A as an object of type B? No one knows what those 400 million entries are. So why were they using an algorithm that assumed they did know? You can see immediately, from this perspective, what went wrong and what you need to do, which is that machine learning algorithms need to operate with uncertainty over the loss function. And those algorithms look quite different.
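A minimal sketch of the kind of algorithm just described, with invented classes, probabilities, and costs (an illustration of the idea, not anyone's production system): the classifier averages its risk over several candidate loss matrices and pays to consult a human about the losses only when the answer could change its decision.

    import numpy as np

    classes = ["apple", "pear"]
    posterior = np.array([0.4, 0.6])              # belief about what the image shows

    # Two hypotheses about the loss matrix L[true, predicted], equally likely.
    L_sym  = np.array([[0.0, 1.0], [1.0, 0.0]])   # all mistakes equally bad
    L_asym = np.array([[0.0, 10.0], [1.0, 0.0]])  # apples hate being called pears
    hypotheses = [(0.5, L_sym), (0.5, L_asym)]
    QUERY_COST = 0.05                             # assumed cost of asking the expert

    def best_risk(post, hyps):
        """Minimum expected loss over predictions, averaged over loss hypotheses."""
        risks = [sum(w * post @ L[:, j] for w, L in hyps) for j in range(len(classes))]
        return min(risks)

    risk_now = best_risk(posterior, hypotheses)
    # If we ask, we learn which loss matrix is right and then act optimally for it.
    risk_after_asking = sum(w * best_risk(posterior, [(1.0, L)]) for w, L in hypotheses)

    print(f"classify now     : expected loss {risk_now:.2f}")                        # 0.60
    print(f"ask expert first : expected loss {risk_after_asking + QUERY_COST:.2f}")  # 0.55
    # Here the answer could flip the decision, so asking is worth the small cost;
    # with a confident posterior it would not be, and the system would just classify.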
Every so often they're going to refuse to classify objects, because it's too risky. Every so often they're going to go back to the human expert and ask: how much do apples get upset about being called pears, and vice versa? So you get a different type of algorithm. The second point is that those algorithms don't yet exist. So we, the AI researchers who are proposing that everyone should follow this way of doing things — it's on us to actually develop those algorithms. We have started to develop both the core technology, like learning algorithms, and even demonstration systems. One thing we're doing at the center we have at Berkeley is figuring out what the right demonstration system is that we should build. I'm in favor of a digital personal assistant, because I think a personal assistant needs to understand human preferences quite well in order to be useful, but the preferences of users vary enormously: a digital personal assistant for Donald Trump had better be different from the digital personal assistant working for me, in terms of the preferences it is pursuing. So that kind of work, I think, is really important. It's not enough to just keep saying doom, doom. You've got to say: there's another way — here is the other road that we can take, and here is the proof that when you take this road you actually get better AI systems. This is absolutely crucial. You can't just talk about AI safety as if, okay, I am the AI safety expert, I'm going to wag my finger at you and tell you you're a bad person. And AI ethics, I think, is probably an even worse term, because now I'm saying to you, you're unethical, so stop doing that. Show them the right way instead. And the point is that this is not an extra safety add-on; this is like a nuclear power station that doesn't blow up. Wouldn't you rather have one of those? I was watching Chernobyl on the plane back from Toronto yesterday — have people seen Chernobyl? Fantastic; if you haven't seen it, watch it. It shows you how difficult it is to convince people that the technology isn't perfect and that you have to pay attention to risk, because if you don't — well, what happened in the nuclear industry? The nuclear industry was wiped out. We didn't get any of the benefits of nuclear power because of Chernobyl. So all this stuff about, well, why don't you talk about the benefits, why are you talking about the risks? You won't get any benefits if you don't pay attention to the risks and build nuclear power stations that don't blow up. And that's what we're trying to do with AI: trying to build AI that doesn't blow up. [applause]
