
Welcome. Welcome to the Keystone Strategy Transformative Ideas lecture series. We're lucky today to be hosted with C-SPAN2. A couple of housekeeping notes: when we get to Q&A there are microphones, so raise your hand and the microphone will come to you so that the questions can be captured. Today we are extremely lucky to have Michael Kearns and Aaron Roth, both from the University of Pennsylvania, here to talk about their book The Ethical Algorithm. I think a day does not go by, in the news or in our own work, when the subject of algorithmic fairness or privacy is not front-page news. Today we're going to hear from two leading lights in that area, and they're going to help us understand what the state of the art is now and what it will be going forward. With that, I think we will welcome Professor Michael Kearns first to the stage, is that right? Great. Michael and Aaron, welcome to the stage.

Okay, good morning. Thanks to everyone for coming. My name is Michael Kearns, and with my close friend and colleague Aaron Roth I have coauthored a general-audience book called The Ethical Algorithm: The Science of Socially Aware Algorithm Design. What we want to do for roughly half an hour is take you at a high level through some of the major themes of the book, and then we will open it up, as Jeff said, to Q&A.

I think many, many people, and certainly this audience, are well aware that in the past decade or so machine learning has gone from a relatively obscure corner of AI to mainstream news. I would characterize the first half of this decade as the glory period, when all the news reports were positive and we were hearing about amazing advances in areas like deep learning, with applications in speech recognition, image processing, image categorization, and many other areas. We all enjoyed the great benefits of this technology and the advances that were made, but the last few years or so have been more of a buzzkill. There have been many, many articles written, and now even some popular books, on essentially the collateral damage that can be caused by algorithmic decision-making, especially decision-making powered by AI and machine learning. Here are a few of those books. Weapons of Math Destruction was a big bestseller from a couple of years ago that did a good job of making very real, visceral, and personal the ways in which algorithmic decision-making can result in discriminatory predictions, like gender discrimination, racial discrimination, or the like. Data and Goliath is a book about the fact that we've essentially become a commercial surveillance state, and about the breaches of privacy, trust, and security that accompany that. Aaron and I have read these books and we like them very much, and many others like them. But one of the things we found lacking in these books, which was much of the motivation for writing our own, was that when you get to the solutions section, i.e., what should we do about these problems, the solutions suggested are what we would call traditional ones. They say we need better laws, we need better regulations, we need watchdogs, we need to keep an eye on this stuff, and we agree with all of that.
But as computer scientists and machine learning researchers working directly in the field, we also know there has been a movement in the past 5 to 10 years to design algorithms that are better in the first place. Rather than waiting, after the fact, for some predictive model to exhibit racial discrimination in criminal sentencing, you can think about making the algorithm better in the first place, and there is now a fairly large scientific community in the machine learning research area and many adjacent areas trying to do exactly that. So think of our book as a popular science book: we're trying to explain to the reader how you would go about encoding and embedding social norms that we care about directly into algorithms themselves.

A couple of preparatory remarks. We got a review on an early draft of the book that basically said: I think your title is a conundrum, or possibly even an oxymoron. What do you mean, an ethical algorithm? How can an algorithm be any more ethical than a hammer? This reviewer pointed out that an algorithm, like a hammer, is a human-designed artifact built for particular purposes. And while it's possible to make unethical use of a hammer, for instance I might decide to hit you on the hand with it, nobody would make the mistake of ascribing any unethical behavior or immoral activity to the hammer itself. If I hit you on the hand with the hammer, you would blame me, and you and I would both know that real harm had come to you because of my hitting you on the hand with a hammer. So this reviewer said: I don't see why these same arguments don't apply to algorithms.

We thought about this for a while and decided we disagree. We think algorithms are different, even though they are indeed just tools, human artifacts built for particular purposes, for a couple of reasons. One of them is that it's difficult to predict outcomes and difficult to ascribe blame, and part of the reason is that algorithmic decision-making, when powered by AI and machine learning, is a pipeline. Let me quickly review what that pipeline is. You usually start off with some complicated data, complicated in the sense that it's high dimensional, has many variables, and might have many rows; think of a medical database of individual citizens' medical records, for instance. We may not understand this data in any detail and may not understand where it came from in the first place; it may have been gathered from many disparate sources. The usual pipeline or methodology of machine learning is to take that data and turn it into some sort of optimization problem. We have an objective landscape over a space of models and want to find the model that does well on the data in front of us, and usually that objective is primarily, often exclusively, concerned with predictive accuracy or some notion of utility or profit. There's nothing more natural in the world, if you're a machine learning researcher or practitioner, than to take a data set and say: let's find the neural network that, on this data, makes the fewest mistakes in deciding who to give a loan. So you do that, and what results is some perhaps very complicated, high-dimensional model. This slide shows classic clip art of deep learning from the internet: a neural network with many layers between the input and the output and lots of transformations of the data and variables. A couple of things are worth noting about this pipeline.
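To make that pipeline concrete, here is a minimal sketch in Python of the exercise just described: fit a model to historical loan data so as to minimize prediction error and nothing else. The file name, column names, and model choice are hypothetical illustrations, not anything from the book.

```python
# A minimal sketch of the standard machine learning pipeline described above.
# The data file and column names are invented; note that the only thing being
# optimized and measured is predictive accuracy -- no privacy or fairness terms.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

data = pd.read_csv("loan_applications.csv")   # complicated, high-dimensional data
X = data.drop(columns=["repaid"])             # applicant features
y = data["repaid"]                            # 1 if the loan was repaid, 0 otherwise

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# "Find the neural network that makes the fewest mistakes on this data."
model = MLPClassifier(hidden_layer_sizes=(64, 64), max_iter=500)
model.fit(X_train, y_train)

print("held-out accuracy:", model.score(X_test, y_test))
```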
First, it's very diffuse. If something goes wrong in this pipeline, it might not be easy to pin down the blame. Was it the data? Was it the objective function? Was it the optimization procedure, or was it the neural network itself? Even worse, if the predictive model we use at the end causes harm to somebody, if you are falsely denied a loan, for instance, because the neural network said you should be denied the loan, when this is happening at scale behind the scenes we may not even be aware that I've hit you on the hand with a hammer. And because we give algorithms so much autonomy, there's another difference: to hit you on the hand with a hammer, I have to pick the thing up and swing it, but these days algorithms run autonomously without any human intervention, so we may not even realize the harms being caused unless we know to explicitly look for them.

So our book is about how to make things better, not through regulation and laws and the like, but by actually revisiting this pipeline and modifying it in ways that give us the various social norms we care about, like privacy, fairness, accountability, and so on. One of the interesting and important things about this endeavor is that even though many, many scholarly communities have thought about these social norms before us, philosophers, for instance, have been thinking about fairness since time immemorial, and lots of people have thought about things like privacy, they've never had to think about these things in such a precise way that you could actually write them into a computer program or an algorithm. Sometimes just the act of forcing yourself to be that precise can reveal flaws in your intuitions about these concepts that you weren't going to discover any other way, and we will give concrete examples of that during our presentation.

So the whirlwind, high-level tour of the book is a series of chapters about different social norms, some of which I've written down here: what the science looks like of actually going in and giving a precise, mathematical definition to these things, then encoding that mathematical definition in an algorithm, and, importantly, what the consequences of doing that are, in particular the tradeoffs. In general, if I want an algorithm that's more fair or more private, that might come at the cost of less accuracy, for example, and we will talk about this. You'll notice that I've written these different social norms in increasing shades of gray, and what that roughly represents is our subjective view of how mature the science is. In particular, we think that when it comes to privacy, this is the field that is, in relative terms, the most mature, in that there's what we think is the right definition of data privacy and quite a bit known about how to embed that definition in powerful algorithms, including machine learning algorithms. Fairness, which is a little bit lighter, is a more recent, more nascent field, but it is off to a good start.
Things like accountability, interpretability, or morality are in grayer shades because in those cases we feel there aren't even good technical definitions yet, so it's hard to get started on encoding them in algorithms. And I promise you there's a bottom bullet here which says "the singularity," but it's entirely in white so you can't even see it. So what we're going to do with the rest of our time is talk about privacy and fairness, which cover roughly the first half of the book, and then we will spend a few words telling you about the game-theoretic twist that the book takes midway through. So I'm going to turn it over to Aaron for a bit now.

As Michael mentioned, privacy is by far the most well-developed theme we talk about in the book, so I want to spend a few minutes giving you a brief history of the study of data privacy, which is about 20 years old now, and in the process try to go through a case study of how we might think precisely about definitions. It used to be, 20 or 25 years ago, that when people talked about releasing data sets in a way that was privacy preserving, what they had in mind was some attempt at anonymization. I would have some data set of individual people's records, and the data set might have people's names in it, and if I wanted to release it I would try to anonymize the records by removing the names and, maybe if I was careful, unique identifiers like Social Security numbers, but I would keep things like age or zip code, features about people that weren't enough to uniquely identify anyone.

So in 1997 the state of Massachusetts decided to release a data set that would be useful for medical researchers. It was a good thing; medical data sets are hard for researchers to get their hands on because of privacy concerns, and Massachusetts had an enormous data set of medical records, records corresponding to every state employee in Massachusetts, and it released this in a way that was anonymous. There were no names, no Social Security numbers, but there were ages, there were zip codes, and there were genders. It turns out that although age is not enough to uniquely identify you, zip code is not enough, and gender is not enough, in combination they can be, and there was a graduate student named Latanya Sweeney, now a professor at Harvard, who figured this out. In particular, she figured out that you could cross-reference the supposedly anonymized data set with voter registration records, which also had demographic information like zip code, age, and gender, but together with names. She cross-referenced the medical data set with the voter registration records and was able, with this triple identifier, to identify the medical record of Bill Weld, who was the governor of Massachusetts at the time, and make a point.

Okay. So this was a big deal in the study of data privacy, and for a long time people tried to fix this problem with little band-aids, trying to most directly fix whatever the most recent attack was. For example, people thought: all right, if combinations of zip code and gender and age can uniquely identify someone in a record, why don't we try coarsening that information? Instead of reporting age exactly, maybe we will report it only up to an interval of 10 years; maybe we will only report zip code up to three digits; and we will do this so that any combination of attributes in the table we release doesn't correspond to just one person.
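As a toy illustration of the kind of cross-referencing Sweeney did, here is a small hedged sketch in Python. The two tables are entirely invented; the point is only that joining a "de-identified" medical release with a public voter file on the quasi-identifiers they share can pin a supposedly anonymous record to a name.

```python
# Toy linkage attack: no single attribute identifies anyone, but the
# combination of zip code, age, and gender does. All data here is made up.
import pandas as pd

medical = pd.DataFrame({
    "zip":       ["02138", "02138", "19104"],
    "age":       [57, 34, 29],
    "gender":    ["M", "F", "F"],
    "diagnosis": ["heart disease", "asthma", "flu"],
})

voters = pd.DataFrame({
    "name":   ["Alice Adams", "Beth Brown", "Carl Clark"],
    "zip":    ["02138", "19104", "02138"],
    "age":    [34, 29, 57],
    "gender": ["F", "F", "M"],
})

# Cross-reference the "anonymized" release with the public voter file.
reidentified = medical.merge(voters, on=["zip", "age", "gender"])
print(reidentified[["name", "diagnosis"]])
```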
So for example, if I know that my 56-year-old neighbor, who is a woman, attended some hospital, maybe the Hospital of the University of Pennsylvania, and they've released an anonymized data set in this way, then they've got the guarantee that I cannot connect the attributes I know about my neighbor to just one record; I can only connect them to, say, two records. So for a little while people tried doing this. But if you think about it, if you look at this data set, you might already begin to realize this isn't getting at what we mean by privacy, because even though, knowing that my 56-year-old female neighbor attended the Hospital of the University of Pennsylvania, I can't figure out exactly what her diagnosis is, since she corresponds to two records, I can figure out that she had one of two conditions, which might already be something she didn't want me to know. But the problem goes much deeper than that. Suppose I know she has been a patient not just at that one hospital but at two hospitals, and the other hospital has also released records anonymized in the same way, in fact maybe a little better, because in its release my 56-year-old neighbor matches not just two but three records. If both of these data sets have been released, I can just cross-reference them, and there's a unique record, only one record, that could correspond to my neighbor, and all of a sudden I've got her diagnosis.

The overall problem here is the same as it was when we just tried removing names: maybe attempts at privacy like this would work if the data set I was releasing was the only thing out there, but that's never the case, and small amounts of idiosyncratic information are enough to identify you in ways that I can uncover if I cross-reference the data set that's been released with all the other stuff that's out there. So people tried patching this up as well, but for a long time the history of data privacy was a cat and mouse game, where researchers would try heuristic fixes, patching up whatever vulnerability led to the most recent attack, and attackers would try new clever things, and this was a losing game for privacy researchers. Part of the problem is that we were trying to do things we hoped were private without ever really defining what we meant by privacy. So this was an approach that was too weak. Let me now, in an attempt to think about what privacy might mean, talk about an approach that errs in the other direction, and then we will find the right answer.

You might say: okay, let's think about what privacy should mean. Maybe, if I'm going to use data sets to conduct, for example, medical studies, what I want is that nobody should be able to learn anything about you as a particular individual that they couldn't have learned about you had the study not been conducted. That would be a strong notion of privacy if we could promise it. To make it more concrete, let's think about what has come to be known as the British doctors study, a study carried out by Doll and Hill in the 1950s; it was the first piece of evidence that smoking and lung cancer had strong associations. It's called the British doctors study because every doctor in the UK was invited to participate, and two thirds of them did, so two thirds of the doctors agreed to have their medical records included. Very quickly it became apparent that there was a strong association between smoking and lung cancer. So imagine you are one of the doctors who participated in the study.
Say you're a smoker, and this is the '50s, so you definitely made no attempt to hide the fact that you're a smoker; you'd probably be smoking during this presentation. So everyone knows that you're a smoker, but when the study is published, all of a sudden everyone knows something else about you that they didn't know before: they know that you are at an increased risk for lung cancer, because all of a sudden we have learned a new fact about the world, that smoking and lung cancer are correlated. If you're in the US, this might have caused you concrete harm at the time, in the sense that your health insurance rates might have gone up; this could have caused you concrete, quantifiable harm. So if we were going to say that what privacy means is that nothing new should be learned about you as the result of conducting a study, we would have to call the British doctors study a violation of your privacy.

But there are a couple of things wrong with that. First of all, observe that the story could have played out in exactly the same way even if you were one of the doctors who decided not to have your data included. The supposed violation of your privacy in this case, the fact that I learned you are at higher risk of lung cancer, wasn't something I learned from your data in particular; I already knew you were a smoker before the study was carried out. The violation of privacy would have to be attributed to the fact about the world that I learned, that smoking and lung cancer are correlated, and that wasn't your secret to keep. The way we know that is that I could have discovered it without your data; I could have discovered it from any sufficiently large sample of the population. And if we were going to call things like that a violation of privacy, we couldn't do any data analysis at all, because there are always going to be correlations between things that are publicly observable about you and things you didn't want people to know, and I couldn't uncover any correlation in the data at all without causing a privacy violation of this type. So this was an attempt at thinking about what privacy should mean, at getting at semantics, but it was one that was too strong.

The real breakthrough came in 2006, when a team of mathematical computer scientists had the idea for what is called differential privacy. The goal of differential privacy is to promise something similar to what we wanted to promise in the British doctors study, but with a slight twist. Again, think about two possible worlds, but now don't compare the world in which the study is carried out with the world in which it is not carried out. Instead, compare the world in which the study is carried out with an alternative world in which the study is still carried out, but without your data.
Everything is the same except your data was removed from the data set. The idea is that in this ideal world, where your data wasn't used at all, we can assert that there was no privacy violation for you, because we didn't even look at your data. Of course, in the real world your data was used, but if there is no way for me to tell, substantially better than random guessing, whether we're in the real world where your data actually was used or in the idealized world where there was no privacy violation, then we should think of your privacy as having been only minimally violated. And this is a parameterized definition, because that word "substantially" is something we can quantify, and it's a knob that we can tune to trade off accuracy against privacy.

When you think about it for a bit, it sounds like a satisfying definition, but you might worry that this, like the definition we attempted in the British doctors study, is too strong to allow anything useful to be done. It turns out that's not the case. I won't go through the simple example here unless we have questions about it in the Q&A, but suffice it to say that 15 years of research have shown that essentially any statistical task, any statistical analysis you'd want to carry out on a data set, which includes machine learning, can be done subject to the protections of differential privacy, albeit at a cost that typically manifests itself in the need for more data or in diminished accuracy. And while there have been 15 years of academic work on this topic, in the last few years it has moved from, let's say, the whiteboard to become a real technology, something that's widely deployed. If you have an iPhone, it might as we speak be reporting statistics back to the mothership subject to the protections of differential privacy, and Google has tools that report statistics in similar ways. A real moonshot for this technology is going to come in just about a year: the US 2020 Census is going to release all of its statistical products subject to the protections of differential privacy. So this is the sense in which we say that, of the topics we talk about in the book, this is the most well developed. It's not that we understand everything there is to know about differential privacy, but we have a strong definition that has real meaning, we understand the algorithms you need to satisfy this definition while still doing useful things with data, we understand a lot about what the tradeoffs are, and this has become a technology that's used in practice.
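For reference, here is the formal statement that underlies the two-worlds description Aaron just gave. This is the standard textbook form of the definition, not a quotation from the talk, with the "knob" appearing as the parameter ε.

```latex
% A randomized algorithm M is \varepsilon-differentially private if, for every
% pair of data sets D and D' that differ in one person's record, and for every
% set S of possible outputs,
\Pr[\,M(D) \in S\,] \;\le\; e^{\varepsilon} \, \Pr[\,M(D') \in S\,].
% Small \varepsilon means the world with your data and the world without it are
% nearly indistinguishable; raising \varepsilon trades privacy for accuracy.
```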
Now I want to do something similar for algorithmic fairness. As I said at the beginning, the study of fairness in algorithmic decision-making is considerably less mature than privacy, and differential privacy in particular, and we already know that, beyond just being less mature, it is going to be messier. In our book we argue that anybody who thinks long and hard enough about data privacy will arrive at a definition similar to differential privacy, in the sense that differential privacy is the right definition of data privacy. We already know there's not going to be a single monolithic right definition of algorithmic fairness. In the past few years there have been a couple of publications with the following broad form. They start off by saying: can we all agree that any good definition of fairness should meet the following three mathematical properties? The sensible reader looks at these three properties and says yes, of course we would want these; these are weak, minimal properties; I would want these and even stronger ones. But then the punchline is: guess what, here's a theorem proving that there is no definition of fairness that can simultaneously achieve all three properties. To make this a little more concrete, in real applications this might mean, for instance, that if you are trying to reduce, let's say, the discriminatory behavior of your algorithm by gender, that might come at the cost of increased discrimination by race. You might face these very difficult moral and conceptual tradeoffs. But this is the reality of the way things are, and so we still propose proceeding as scientists: carefully studying alternative definitions and what their consequences are.

So what I want to do with most of my time is something similar to what Aaron did, which is to show you how things can go wrong, not with anonymization this time, but with how machine learning can result in things like racial or gender discrimination, and then have that lead to a particular proposal for how one might try to address these sorts of collateral damages. So first, why might machine learning be unfair? Many of you, even in just the past few weeks, will have heard of notable instances: one in which a health risk assessment model, a predictive model widely used in large American hospitals and healthcare systems, was shown to have systematic racial discrimination in it; and, perhaps less scientifically, there was a Twitter storm recently over the recently introduced Apple credit card, underwritten by Goldman Sachs. There were a number of reports of married couples in which the husband said: hey, my wife and I file taxes jointly, she has a higher credit rating than I do, yet I got 10 times the credit limit on the Apple Card that she did. Aaron and I, a week ago Friday, spent an hour in the office of the New York State regulator that is investigating this particular issue, and, unlike with the health risk assessment model, we don't know whether these are just a couple of tweets or whether there is systematic underlying gender discrimination. But these are the kinds of concerns we're talking about when we talk about algorithmic fairness.

So, somewhat like Aaron's medical database example, I want to take you through a toy example of how things can go wrong in building predictive models from data. Let's imagine that Aaron and I were asked by the Penn admissions office to help them develop a predictive model for collegiate success based on only two variables:
your high school GPA and your SAT score. What I'm showing you is a sample of data points. For each of these green pluses or minuses, the x value represents the high school GPA of a former applicant to Penn and the y value represents their SAT score, and let's say that this is a sample of individuals who were actually admitted to Penn, so we know whether they succeeded at Penn. By "succeed," take any quantifiable, objective definition that we can measure in hindsight. One example would be that success means you graduated within five years of matriculating with at least a 3.0 GPA; a different definition would be that you donate at least 10 million dollars within 20 years of leaving. As long as we can verify it in hindsight, that's fine; that's what the plus and minus mean. For each point we have the GPA and SAT score of the applicant, and a plus indicates a student who succeeded at Penn, while a minus means a student who didn't succeed.

A couple of things about this cloud of green points. First of all, if you counted carefully you'd see that slightly less than half of these historical admits succeeded; there are slightly more minuses than pluses, by a handful. That's observation number one. Observation number two is that if I show you this cloud of points and ask whether you could build a predictive model from this data, to use on a forward-going basis to predict whether an applicant will succeed or not, there's a line you can draw through this cloud. Predicting that everybody above that blue line will be successful and everybody below it will not, you can see we do a pretty good job. It's not perfect; there are a couple of false accepts and false rejects down here, but for the most part we are doing a good job. And this is, of course, in simplified form, exactly what the entire enterprise of machine learning is about, even including things like neural networks: you're trying to find some model, perhaps more complicated than a line, that does a good job of separating positives from negatives.

Now let's suppose that in this same historical applicant pool there was another subpopulation besides the greens; let's call them the orange population, and here's their data. I want you to notice a few things about the orange population. First of all, they are a minority in the literal mathematical sense, in that there are fewer orange points in this historical data set than there were green points. Observation number two is that the data looks different. It looks like the SAT scores of the orange population are systematically lower, but also note that they are no less qualified for college: in fact there are exactly the same number of orange pluses as orange minuses, so it's not the case that the orange population is less successful, even though they have statistically lower SAT scores. One reason you might imagine for this is that perhaps in this minority orange population there's less wealth. The green population, which is wealthier, can afford SAT preparation courses, multiple retakes of the exam and taking the max of their scores, while the orange population, which is less wealthy and has fewer resources, just does self-study, takes the exam once, and takes what they can get. If we had to build a predictive model for just the orange population, there's a good one; in fact there's a perfect model on the historical data: this line perfectly separates the positives from the negatives. So what's the problem? The problem arises if we look at the combined data set and ask: what is the predictive model that does best on the combined data set?
It is again the single model that did best on the green population, and you can see that visually. If I tried to move this line down in order to catch the orange pluses, I'm going to pick up so many green minuses that the error will increase. So this is the optimal model on the underlying aggregated data, and you can see that it's intuitively unfair, in that we rejected all of the qualified orange applicants. We might call this the false rejection rate: the false rejection rate on the orange population is close to 100 percent, and the false rejection rate on the green population is close to zero percent.

Of course, you might say that what we should do is just notice that the orange population has systematically lower SAT scores even though they are not less qualified for college, and build a two-part model. We should basically say: if you're green, we're going to apply this line, and if you're orange, we're going to apply this other line. By doing this, compared to the single model on the aggregate data, we would not only make the model more fair, we would also make it more accurate. The problem with this, of course, is that if we think of green and orange as being races, for instance, there are many areas of law and regulation that forbid the use of race as an input to the model, and this two-part model has race as an input, because the model says: first look at race, and then decide which of these two models to apply. And of course these laws and regulations that prevent the use of things like race or gender or other seemingly irrelevant variables are usually meant to protect the minority population. So here is a concrete example in which regulations meant to protect the minority population guarantee that we will harm that minority population if we just do the most sensible machine learning exercise. In the same way that Aaron argued that definitions of privacy based on anonymization don't make sense, we argue in the book that trying to get fairness in algorithmic decision-making by restricting the inputs is fundamentally misguided; what you should do instead is not restrict the inputs to an algorithm but constrain its output behavior in the ways that you want.

In particular, one thing you can imagine doing, even if we were forced to pick a single model, is changing the objective function. I could say there are two criteria I care about. On the one hand, I care about making accurate predictions, minimizing the error of my model. On the other hand, I also care about this other objective, which is fairness, and in this particular application I might define fairness as approximate equality of the false rejection rates. So I might say I'm worried about the orange population being mistreated, and the particular type of mistreatment I'm talking about is false rejections:
students who would have succeeded but whom our model rejected. So I can define a numerical measure, namely the difference between the false rejection rates on the green population and the orange population, and instead of just saying "minimize the error on the data set," I can say "minimize the error on the data set subject to the constraint that the difference in false rejection rates between these two populations is at most," let's say, zero percent, or I could relax that and say at most five percent or 10 percent. Of course, if I let this go all the way to 100 percent disparity, it's as if I'm not asking for fairness at all anymore and I'm back to just minimizing error. So in the same way that differential privacy gave us a knob that lets you tune between how strong your privacy demands are and how strong your accuracy demands are, this definition of fairness lets us interpolate between asking for the strongest type of fairness, zero disparity in the false rejection rates, and no fairness whatsoever.

Once you're armed with a quantitative definition, you can plot the quantitative tradeoffs you might face in any real application. So for three different real data sets in which fairness is a consideration, I'm showing you actual numerical plots here, in which the x value for each of these red points is the error of some particular predictive model and the y value is the unfairness of that model, in the sense of the disparity in false rejection rates between two populations. Of course, smaller is better for both of these criteria: where I'd like to be is in the corner where my error is zero and my unfairness is also zero. You can see that's not happening on any one of these data sets, and in real machine learning applications, even ignoring fairness, you are not going to get to zero error. What you see is that we face a numerical tradeoff. We can choose to essentially ignore fairness and take this point up here, which gives us the smallest error; at the other extreme, we can ask for zero unfairness and get much larger error; and in between we can get things that are in between. We argue in the book that it's important, as a society, that we become quantitative enough that people, even nontechnical people, can look at these tradeoffs and understand their implications, because we do not propose that some algorithm should decide which one of these models we should pick; it should depend on what's at stake. In particular, there's a big difference between what's at stake in, for instance, medical decision-making, which might have life or death consequences, and the ad that you're shown on Facebook or Google, which many of you may never even look at. Furthermore, you can see that the shapes of these curves are quite different. For a couple of them, like this one, it's possible near the left end of the curve to get big reductions in unfairness for only very small increases in error, and that might seem like it's worth it, whereas for this one here you face hard tradeoffs right from the beginning.
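Written down as an optimization problem, the fairness-constrained exercise Michael describes looks roughly like the following. This is a schematic statement rather than a formula from the book, with γ playing the role of the allowed disparity.

```latex
% Choose a model f from the class F to minimize error, subject to a bound \gamma
% on the disparity in false rejection rates between the two groups:
\min_{f \in F} \; \mathrm{err}(f)
\quad \text{subject to} \quad
\bigl|\, \mathrm{FRR}_{\mathrm{green}}(f) - \mathrm{FRR}_{\mathrm{orange}}(f) \,\bigr| \;\le\; \gamma .
% \gamma = 0 demands exact equality of false rejection rates; letting \gamma grow
% recovers ordinary unconstrained error minimization. Sweeping \gamma traces out
% the error-versus-unfairness curves shown on the slides.
```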
So this is an example of the kind of thing we discuss in the book, where you start by thinking conceptually about what fairness should mean and what you're trying to accomplish. You might go through bad definitions based on things like anonymity, or on not using certain inputs or variables in a computation, and eventually you arrive at a more satisfying definition, and at algorithms that implement that particular social norm on real data sets. So let me turn it over to Aaron to talk about all the warm fuzzy stuff later in the book.

We talked in depth about privacy and fairness, which make up the first half of the book. I'm not going to talk in much depth about any particular thing, but I want to give you a quick survey of what's in the second half. At a high level, you can think about the first half of the book as studying algorithms in isolation: we have some machine learning algorithm and we can ask whether it is private or fair without thinking about the larger context in which the algorithm is embedded. But that context is often important, because what the algorithm is doing affects the behavior of people, and it's important to think about how all those things interact. So in the third chapter we start thinking about this using the tools of game theory: if you change an algorithm, how will it change the particular decisions people make, in a way that might reverberate out to have larger societal consequences? We talk about an example which is perhaps not the most consequential socially, but which makes clear what we're talking about. Many of you will have experience using apps like Google Maps to plan your daily commute. In the morning I can type in where I want to go, and it will not just find directions but look up traffic reports and give me a route that will minimize my commute time given the current traffic.

If you think about it, this aspect of Google Maps, this integration with traffic reports, turns the interaction I'm having with it into what an economist would call a game, in the sense that the actions I take, which route I choose to drive along, impose negative externalities on other people in the form of traffic. Selfishly, I would prefer that everyone else stay home and I be the only one on the road; I would take a straight shot to work. But other people wouldn't agree to that solution, so different people have competing interests, and their choices affect the well-being of other people. The choices I make have a small effect on any particular other person, I don't contribute too much to traffic, but collectively the choices we make have large effects on everybody. So one way to view these apps is that they are helping us play the game better, at least in a myopic sense. Before these apps were around, I would have had access to minimal traffic information, so I would probably take the same route every day; now I can selfishly respond to what other people are doing, and what a game theorist would say the app is doing is helping me compute my best response: what can I do that will selfishly and myopically optimize for me? Everyone else is doing the same thing, though, so the result is that these apps are driving global behavior toward what would be called a competitive equilibrium, a Nash equilibrium, which is stable in the sense that everybody is myopically and selfishly optimizing for themselves. Now, if you've taken a class on game theory, or even just read the right books, you will know that just because something is a competitive equilibrium does not mean it's necessarily a good social outcome, and the prisoner's dilemma is the most famous example of this.
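A standard toy example from the selfish-routing literature (not one of the book's own case studies) makes the point numerically: suppose one route always takes an hour, while a second route's delay equals the fraction of drivers using it. Selfish, app-guided drivers all pile onto the second route, yet a coordinated split would be better on average. A minimal sketch:

```python
# Pigou's classic selfish-routing example, as a tiny numerical illustration.
# Route A always takes 1.0 hour; route B takes x hours when a fraction x of
# drivers use it. These numbers are a textbook toy, not data from the book.

def average_commute(fraction_on_b: float) -> float:
    """Average travel time when `fraction_on_b` of all drivers take route B."""
    cost_a = 1.0
    cost_b = fraction_on_b
    return (1 - fraction_on_b) * cost_a + fraction_on_b * cost_b

# At equilibrium everyone takes route B, since it is never worse than route A
# for any individual driver, and everyone spends a full hour commuting.
print("equilibrium average commute:", average_commute(1.0))   # 1.0

# A coordinator who splits traffic 50/50 cuts the average commute to 0.75,
# even though the half sent to route A is individually slightly worse off.
print("coordinated average commute:", average_commute(0.5))   # 0.75
```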
So it's not at all obvious, and in fact you can come up with clear case studies where these apps, even though they are selfishly optimizing for each individual person, are making things worse globally for the population at large, in the sense of larger average commute times. That might not be an enormous deal when we're talking about traffic, but it is just one example of a phenomenon that is much more pervasive now that algorithms mediate social interactions, which happens all the time. For example, you might think of the content curation algorithms that drive things like the Facebook News Feed in a similar way. Ostensibly, Facebook's interests are not so misaligned with my own. Their algorithms are optimized to drive engagement; what Facebook wants me to do is stay on Facebook as long as possible so that I will view more ads, and the way they do that is by showing me the content I would like to engage with, that I would like to click on and read, and myopically that seems to align with my interests; I have a choice of what website I want to go to. But when Facebook simultaneously does this for everybody, even though it's myopically optimizing for each person, it might have global consequences that we don't like. In particular, it might lead to the filter bubble phenomenon that people do a lot of handwringing about, and drive us globally toward, for example, a society that is less deliberative. In this chapter we go through a bunch of examples, trying to think about and point out the ways in which algorithmic decisions can have widespread consequences for social behavior, and how game theory is a useful tool for thinking about those things.

In the last chapter we start talking about another important problem, the statistical crisis in science, which some of you might have heard about and which is actually not so disconnected from the equilibrium kind of behavior we talk about in the game theory chapter. There have been a bunch of news articles showing, for example, that fields like food science or social psychology are emblematic literatures where, if you flip through a scientific journal and put your finger down at random, more likely than not the study you have picked will not replicate: if you try to reproduce the results with new data and new subjects, it's not nearly as likely as it should be that you will find the same results. There are lots of spurious results in these literatures. There's a nice xkcd cartoon, which they were kind enough to let us include in the book, that gets at exactly this problem. We've got our scientist here, and he's got a tip: someone tells him jelly beans cause acne. So he tests the hypothesis, and the p-value he gets is above .05, the standard level of statistical significance, so he says sorry, no result. But then someone suggests it's only a certain color of jelly bean, and he starts testing them, brown jelly beans and purple ones and pink ones, and for all of these he's finding a p-value greater than or equal to .05. But then he finds one, green jelly beans, that appears to be significant. There seems to be a correlation between green jelly beans and acne at a statistical significance level of 95 percent, which means that if you tested 20 null hypotheses you would expect about one of them to incorrectly appear significant, and of course he did test 20, and here's the headline:
green jelly beans linked to acne, only a five percent chance of coincidence. This is called the multiple hypothesis testing problem, and it's relatively well understood how to deal with it when it's just a single scientist conducting the study and what's going on is really statistical malfeasance: someone has checked a bunch of hypotheses but is only publishing the most interesting one without mentioning the others. But of course this is just as much of a problem if, rather than one scientist studying 20 hypotheses, we have 20 scientists each testing one hypothesis and each following proper statistical hygiene. If only the surprising finding gets published, it is just as much of a problem, and that is exactly what the incentives underlying the game of scientific publishing are designed to produce, because if you find that blue jelly beans do not cause acne, that is not going to be published. You won't even try to publish it, because it's not a result that any prestigious journal is going to want to present. But if you find something surprising, that green jelly beans cause acne, that's a big finding. So the problem is that if you view scientific publishing as a game, even if each individual player is following proper statistical hygiene, you get the same result. In the chapter we talk about how these phenomena are exacerbated by the tools of machine learning, which make it easy to check many different hypotheses very quickly and which promote data sharing, and how tools from this literature, in particular tools from differential privacy, which we talk about in the first chapter, can be used to mitigate this problem. And that's it. So thank you.

[applause]

I know we have a lot of folks in the room who regularly work in this space, so we would love examples of problems that you have faced or questions that you have. Maybe I'll start with this. You talked a little bit, Michael, about the limitations of computer science when it comes to answering these questions of fairness. Having now talked about the book for probably a couple of months, have you found that the public kind of wants the computer scientists to solve this?

No. I mean, I think that in our experience they appreciate the fact that people like us, the community we come from, can identify the point at which there is judgment involved and moral decisions to be made, and that mistakes matter, and so I think they are generally appreciative of the fact that both sides need to come toward each other a little bit. So just like those Pareto curves I was showing between error and unfairness: it takes a little bit of explanation to understand what such a plot is saying, but in general I think that people from nonquantitative areas who are stakeholders in problems like these, policy people and think tanks and the like, they like that. But I don't think they want computer scientists per se to take a leading role in picking out a point and saying: here's your best tradeoff between error and unfairness. It depends on the data and the problem, and I don't think even we think that computer scientists should be exclusively, or even in large part, the ones making many of these judgments. We're careful to say in the book that there's the scientific problem and there's the part of the problem that requires moral judgments of various sorts, and those are different.
We do not propose that it should be algorithms, or necessarily computer scientists, who define what it is we mean by fairness, and once you pick a definition we certainly don't propose that it's computer scientists who should be picking out, in various circumstances, how we want to trade off things like privacy and fairness and accuracy. But what computer scientists do have to be involved in is figuring out, first of all, what those tradeoffs are, and how to make them as manageable as possible. For example, at the U.S. Census Bureau right now there is literally a room full of people, a committee whose job is to look at these curves and figure out how we should trade off these very different things, one of which is privacy, which the census is legally obligated to promise to American citizens, and the other of which is the statistical validity of the data; this is extremely useful data, used to allocate resources, school lunch programs, important things. So there are different stakeholders who disagree about how these things should be traded off, and they're in a room hashing it out as we speak, but their work is made very much easier because we can precisely quantify what those tradeoffs are and we can manage them, and that's where I think computer scientists have to play an important role.

That leads to another question I had while listening, which is: in an ideal universe where The Ethical Algorithm is on every computer scientist's shelf and the frameworks you described are actually used in action, and I think much of this is happening in industry and some of it obviously in government as well, what does it look like to have a community of people living these principles? Is there a public API that we all can see? Or, to use kind of a rudimentary example, is it like when we go to the grocery store and look at the side of the box, and we know how much fat there is and how much sugar there is? In a world where some people might comply, some people might have read the book, and some people might not, what is success?

So, while we don't talk a lot about this in the book, we continue to procrastinate on writing a so-called policy brief for the Brookings Institution where we're going to talk a little more about regulatory implementations of these kinds of ideas, and the reason I mention that in response is that once you have a precise definition of fairness or privacy, you can do what we mainly discuss in the book, which is embed it in algorithms to make them better in the first place. But you can also use it for auditing purposes. In particular, if we're specifically worried about, say, gender discrimination in the targeting of STEM job ads on Google, something that was demonstrably shown to exist a few years ago, you can run controlled studies, and we believe some of that should happen. You can anticipate what the objections of the technology companies might be: that's our intellectual property, this is our secret sauce, you can't run automated queries, which currently, of course, violate the terms of service. Our response is: this is your regulator. They aren't going to take this access and use it to start a competing search engine, in the same way that the SEC has all kinds of very sensitive counterparty trading data but is not allowed to use it to go start its own hedge fund, for example.
In a world where the ideas we discuss in the book become widespread and embedded, a big part of it, the sort of thing on the side of the cereal box, might be: okay, on the side of the Google cereal box, here are the rates of discrimination in advertising by race, by gender, by age, by income, et cetera. You could really imagine having some sort of quantitative notion, or scorecard if you like, for different technology services and how well or poorly they are doing on different social norms. Also, what we're going to see is that regulations for things like privacy and fairness will have to become more quantitative. At the moment there's this disconnect where people in industry are not sure exactly what is expected of them, what is going to count as algorithmic unfairness. For example, the Apple Card issue with seeming gender discrimination would have been easy to find had people only thought to look for it. When we were chatting with the New York regulators a few weeks back, one interesting thing we heard is that sometimes companies will explicitly avoid running checks like this, because if they don't check, there's plausible deniability; if they do check, there's discoverability if there's a lawsuit. This is the kind of thing that flourishes when there's ambiguity. But if you're precise about what exactly is going to constitute discrimination in the state of New York, then companies will look for it.

I think our view is that even apparently strong regulatory documents like the GDPR are really ill-formed documents. They look strong, but they push words like privacy and fairness around on the page, and nowhere in those pages do they say what those words mean. It's a bit of a catch-22, a chicken-and-egg problem: it looks like strong regulation because they're demanding interpretability everywhere, but nobody has committed to what it means. I do think that, as is often the case, even the nascent science we discuss in the book is running ahead of things like laws and regulation. Before the kinds of changes we're discussing can take place on the regulatory side, much of regulatory law has to be rewritten, and there needs to be cultural change at the regulators.

That makes sense. Shifting gears just a bit: I was struck by the fact that differential privacy, as you said, has an objectively preferable answer, an answer that can be defended almost like a theorem. Is that because privacy is more important to folks, so it's ahead because privacy matters more than fairness? Is there some almost subterranean choice going on, where it got more attention earlier and so it could get solved faster? Or is it no choice at all?

Two short comments, and Aaron can chime in. There are differences in how long these things have been studied, but as I said when I was talking about fairness, I really think there's a technical difference. It just so happens that privacy is lucky, in the sense that there's a well-grounded, very general mathematical definition of privacy that is very satisfying, and subsequent research has shown you can do a lot with it: you can meet that definition and still do lots of the things we want to do in terms of data analysis and the like. Fairness isn't like that, and it's not a matter of time. These theorems I mentioned, which say here are three properties you would like but cannot simultaneously achieve, that's a theorem, right? We don't talk about this as much in the book, but I do think privacy is lucky in the same sense that public key cryptography was lucky.
There turns out to be a parallel between the development of public key cryptography and differential privacy. There was a period with the same kind of cat and mouse game: people would invent encryption schemes that looked random until they didn't. Things would look random, and then the advent of public key cryptography put the whole field on a much firmer algorithmic and definitional footing, and then it was off to the races. Which doesn't mean those tools give you everything you want from security, or that they're perfectly implemented every time. I don't think we are ever going to get there with fairness, and that's just life.

It's hard to project into the future. Privacy is maybe 15 years ahead of fairness in terms of its academic study. We've had data sets for a long time, and so privacy violations have been going on for a long time, whereas algorithmic fairness only becomes relevant when you start using machine learning algorithms to make important decisions about people, and it's only in the last decade or so that we have both gathered enough data about individual people's daily interactions with the internet to make those decisions, and had learning algorithms become good enough that we can start to automate some of them. As Michael says, it's already clear there's not going to be one definition of fairness, but I do think that if you look 15 years down the road, which is how far you would have to look before fairness is, at least chronologically, as mature as privacy is now, you might still hope for a mature science. It will not have one definition, but perhaps we will have isolated a small number of precise definitions that correspond to different kinds of fairness, and we will understand more precisely how they trade off in different circumstances. It will look different, but I'm optimistic that there will be a lot you will be able to say, given as much time as privacy has had.

One other comment, on something I didn't appreciate until we started working in algorithmic fairness a lot. Another difference between privacy and fairness, one that will persist and that has nothing to do with maturity or technological aspects, is that discussions about fairness always become politicized very quickly. In principle, everybody agrees that privacy is a good thing and that everybody should have it. As soon as you start talking about fairness, you immediately find yourself debating with people who want to talk about affirmative action or redressing past wrongs. All of these definitions require that you identify who you are worried about being harmed, what constitutes harm to that group, and often why you think that constitutes harm. Some of the things we talk about, like forbidding the use of race in, say, lending, or, very much in the news the past couple of years, in college admissions and the like: these definitions also require you to pick groups to protect. This always becomes politicized, regardless of what definition you're talking about, and I don't think that will change in 15 years. So privacy and fairness are different in a social or cultural sense as well.

It sounds like we're expecting algorithms to do something that society hasn't figured out for itself. Yeah, or conversely, people don't think algorithms should play any role whatsoever, not only in deciding those things but in mediating them or enforcing them or the like, and we take pains in the book to point out that, look, racism was not invented with the advent of algorithms and computers. It was around before. You can just talk about it more precisely now.
You can have problems of fairness at a bigger scale, but you can also have solutions at a bigger scale, now that things are automated.

Questions in the room? A show of hands. Anyone? There's a question in the back there.

First of all, thanks so much for the talk. We really enjoyed reading the book. I have a question about differential privacy and aggressive data acquisition. In the book you talk about how Google and Apple have been collecting user statistics subject to differential privacy, but the type of data collected is not the type of data they used to collect, so it's a new area of data acquisition. I wonder what your comment is on this tradeoff, on using differential privacy as a kind of shield for collecting user data, especially since I don't know how secure differential privacy is against adversarial attacks. Do you see a possibility that users, under the impression of differential privacy, are willing to give out more data, only to find the data compromised in the end?

That's a good question, and it relates to having to think about algorithms not just in isolation but in their game theoretic context. You're right that in both the Apple and Google deployments, differential privacy wasn't used to add further protections to data they already had available, and that turns out to be a hard sell to engineers: if they already have a data set available, then adding privacy protections corresponds to taking away some of the access they had, since they are now given access only to a noisier version of the data. What's a much easier sell, and this is why it worked in the first deployments, is to say: look, here's some data set you previously had no access to at all because of the privacy concerns; here's a technology that can mitigate those privacy concerns, and that will give you access. So you're right that one thing that happens when you introduce technologies that allow you to make use of data while mitigating the harm is that you make more use of data, which makes sense. And so, yes, one aspect is that Apple and Google are now collecting a little bit more data. On the other hand, they are using an extremely strong model of privacy, which we talk about in the book, the local model, and what it really means is that they are not collecting your data in the clear at all. They are collecting some random signal derived from your data, and the randomization is added on the device, so Apple, for example, never collects your data itself; it collects the result of coin flips applied to your data. Although more data is being collected, differential privacy in this context is offering an extremely strong guarantee of plausible deniability, and for that reason it's not subject to data breaches, for example. You might worry that differential privacy causes companies to collect more data, and sure, maybe that's okay while they're using it responsibly, but as soon as some hacker gets into the system or a data set is released, all of a sudden things are worse off. But that's not how Google and Apple are using differential privacy; they are doing it in a way that doesn't collect the raw data at all. At the census it is different: they are collecting the data, they have always collected the data, but they are adding these protections in a way they didn't before, giving researchers in 2020 access to data that is more privacy preserving than it was in 2010. So there are lots of tradeoffs and interesting aspects here, but I think these are two different use cases that show different ways in which this can play out.

Just to follow up on that:
For kind of a lay audience, can you explain the coin flip? Sure, I can give the picture of that. Suppose, like the toy example we use in the book, I wanted to conduct a survey of the residents of Philadelphia about something embarrassing; say I want to figure out how many people have cheated on their spouse. One thing I could do is call up some random subsample of people, ask them if they've cheated on their spouse, write down the answers, aggregate them, compute the average, and call it a day. But I might not get the responses I want, because people might legitimately be worried about telling me this over the phone. In particular, they might not trust me. They might worry someone is going to break into my house and steal the list. They might worry that in divorce proceedings it will be subpoenaed.

Here's another way to carry out the same survey. I call people up and say, okay, have you cheated on your spouse? Wait, wait, wait, don't tell me just yet. First flip a coin, and don't tell me how it comes up. If the coin comes up heads, tell me the truth, tell me whether you cheated on your spouse. But if it comes up tails, just tell me a random answer. Again, don't tell me the result of the coin flip. People do this, and now they have a very strong form of plausible deniability: since I didn't see how the coin flip came out, then for any particular answer they gave me, they can legitimately and convincingly say, that wasn't my real answer, that was just the random answer the protocol instructed me to give. I can't form strong beliefs about any particular person. Everyone has this strong statistical guarantee of plausible deniability, and it's something you can formalize. But that's okay, because the question I cared about wasn't about any particular person; it was about the population-level average. And it turns out, as a consequence of the law of large numbers, that even though I've only collected these noisy signals from each person, in aggregate I can figure out quite precisely the population-level average, because I know the process by which the noise was added, so in aggregate I can subtract it off.

This is not so different from what's happening on your iPhone right now. Your iPhone is reporting much more complicated statistics than yes-or-no questions, but in the end, take text completion: our texts are sensitive, but Apple would like to know, for example, which is the most likely next word given what you've typed so far. They collect data that helps them do that by basically hashing this text data down into a bunch of yes-or-no questions, which are binary data, and running a coin-flipping procedure that looks not so different from this.

Just to put it in the context of something embarrassing: maybe you are embarrassed that you still play seven hours of Bejeweled on your phone a decade on, so you'd be reluctant to report that directly. But if all of our phones add a large random positive or negative number to our weekly usage of Bejeweled, then if I look at any individual person's noisy report, say Jeff played 17 hours last week, I can't tell whether he played zero hours and a random 17 was added, or he played 30 hours and 13 was subtracted. For any particular person, the reported Bejeweled usage has that same plausible deniability, but if I add up all of these very, very noisy reports, the noise averages out and I get a very good estimate of aggregate or average Bejeweled usage.
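To make the coin-flip protocol concrete, here is a minimal sketch in Python of randomized response and the aggregate debiasing step described above. The 50/50 coin probabilities follow the protocol as described; the simulated population size and the assumed 10 percent true rate are made-up illustrative values, not figures from the talk or from any real deployment.

    import random

    def randomized_response(true_answer: bool) -> bool:
        """One respondent's report: heads (prob 1/2) -> tell the truth,
        tails -> report a fresh random yes/no instead."""
        if random.random() < 0.5:          # heads: answer honestly
            return true_answer
        return random.random() < 0.5       # tails: random answer, gives deniability

    def estimate_true_rate(noisy_reports):
        """Debias the aggregate. If p is the true 'yes' rate, the expected
        fraction of noisy 'yes' reports is 0.5 * p + 0.25, so invert that."""
        observed = sum(noisy_reports) / len(noisy_reports)
        return (observed - 0.25) / 0.5

    # Simulate a survey where the true rate of "yes" is 10%.
    random.seed(0)
    population = [random.random() < 0.10 for _ in range(100_000)]
    reports = [randomized_response(answer) for answer in population]
    print(round(estimate_true_rate(reports), 3))   # prints a value close to 0.10

No individual report reveals much on its own, but the law of large numbers makes the debiased average accurate; with 100,000 respondents the estimate typically lands within a fraction of a percentage point of the truth.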
I don't play Bejeweled, by the way. [laughter] Great answer. Other questions? We are waiting for the mic. Jennifer. That's quite a mic. Thanks so much for the talk today. You spoke about how introducing differential privacy or fairness constraints changes an algorithm. Does that affect the commercial potential of the algorithm?

The short answer is definitely. For instance, Google has used machine learning at massive scale for decades now to do things like click-through-rate prediction. The more accurate the predictions, the better the targeted ads they can show, and that directly translates into revenue and profit. Going in and insisting on things like not discriminating against this or that group in your advertising, or more privacy in the way the machine learning is deployed, is going to reduce those accuracy rates and reduce profits. I don't know how to put a number on it yet, but I think we can be sure it's going to happen. And to relate this to what we've just been discussing, I also think this is why a lot of the commercial deployments we have seen so far are in experimental areas that are not part of the core business of these companies. They are experimenting: they would like to know emoji usage statistics, say. That's not a core part of their business, but they're sticking a toe in the water, and I think it's to their credit that they're dipping a toe in the water. I'm kind of waiting for the first big tech company that says, we're not just going to adopt these technologies around the edges, we'll put them in our core services. And by the way, we have many, many excellent colleagues at all of the big companies who do research in this exact area. It's not that any of the big tech companies don't know a lot about differential privacy, algorithmic fairness, and these topics, but of course there's a disconnect between the researchers who study these things and the people in the business units. So I don't have strong priors on how this will sort itself out. I hope maybe there's some organic adoption by tech companies, and not just tech companies but other large consumer-facing companies, voluntarily biting the bullet and saying, we're going to take the lead on this in our core services or products. But I think the more likely answer is that it's going to need regulatory pressure, and that will take time.

[inaudible] In reference to the last slide, should we not be using machine learning algorithms on big health data sets? Do they need to be very basic algorithms to substantiate the outcomes without knowing [inaudible]?

No, we would say you should use machine learning algorithms, but you do have to be careful. First of all, it's not that machine learning is entirely atheoretical. If you have some result, say you train some classifier and it seems to be pretty good on your holdout set at predicting tumors of some sort, then you can legitimately put a confidence interval around that and estimate, with statistical validity, what its out-of-sample error is. The problem comes when you start sharing data sets and, in particular, reusing holdout sets.
When you take Machine Learning 101, the way you usually avoid being fooled by overfitting, when you are, unlike in statistics, not explicitly assuming that the data fits a linear model, is via this holdout set: a piece of the data set that you've never looked at, entirely independent of everything you trained on. That looks fine on paper. But if I read a paper of yours and send you an email saying, that was a great paper, could you send me your data set so I can do something else with it, then even if I myself follow all of the rules of statistical hygiene, I have read your paper, and implicitly everything I do is a function of, and a response to, the findings you wrote about, which were themselves a function of the data. As soon as anything like that happens, all of the guarantees that come with a holdout set go entirely out the window. Now, the easy way to solve this problem, and I say easy meaning theoretically easy but practically very hard, is what people advocate for when they talk about preregistration: I should make sure I cannot look at the data at all before I commit to the experiment I'm going to conduct. But if you take that seriously, it rules out data sharing for exactly those reasons. So although that works, it's rather draconian and would rule out a lot of interesting studies. What we talk about in the chapter is a nascent algorithmic science that allows you to share and reuse data in a way that doesn't give up on rigorous statistics.

I'm betting that when you said atheoretical, you were referring more to causality. There is this split between the machine learning community and other communities, including medicine and economics, about causality, and some machine learning people are militantly anti-causal: let's just get the data, and if we get a good fit to the data and we've practiced sound statistical techniques, we're done. I think certainly having strong priors, and having a causal model in your head is an example of what I would consider strong priors, can help in that it reduces the number of things you try on the data. But I still don't think it's a substitute for the kinds of things we discuss in the chapter, because it's again a matter of discipline. I may think I have some causal model, but of course I don't literally have it; usually there will be parameters to that causal model, and I'll start playing around with the strength of the causal connections, and as soon as I do that I'm going down the same route, testing many, many hypotheses sequentially or in parallel on the same data set, and I'm prone to false discovery if I'm not very, very careful. It's very, very early days for this kind of stuff, even earlier days than fairness, but we think disciplined algorithmic approaches, including ones that apply tools like differential privacy and other statistical methods, are better than a human being telling themselves, well, I'm not contributing to the reproducibility crisis because I have strong priors and a causal model.
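One concrete proposal in that line of work is the reusable holdout, sometimes called Thresholdout. Below is a rough, simplified sketch of the idea in Python; the specific threshold, the noise scale, and the omission of the budget accounting are simplifying assumptions made here for illustration, not the exact mechanism described in the book or in the original papers.

    import numpy as np

    def thresholdout_answer(train_vals, holdout_vals, threshold=0.04, sigma=0.01,
                            rng=None):
        """Answer one adaptively chosen query, given its per-example values on
        the training set and on the holdout set. The holdout estimate is only
        'spent' (revealed, with noise) when it visibly disagrees with the
        training estimate; otherwise the training estimate is returned."""
        rng = rng if rng is not None else np.random.default_rng()
        train_mean = float(np.mean(train_vals))
        holdout_mean = float(np.mean(holdout_vals))
        # Compare the two estimates through a noisy threshold.
        if abs(train_mean - holdout_mean) > threshold + rng.laplace(0.0, sigma):
            # Disagreement suggests overfitting: answer from the holdout, noisily.
            return holdout_mean + rng.laplace(0.0, sigma)
        # Agreement: the training-set estimate is safe enough to return as-is.
        return train_mean

    # Toy usage: the "query" is the accuracy of some classifier, represented
    # here by per-example 0/1 correctness indicators (synthetic stand-ins).
    rng = np.random.default_rng(0)
    train_correct = rng.integers(0, 2, size=5000)
    holdout_correct = rng.integers(0, 2, size=5000)
    print(thresholdout_answer(train_correct, holdout_correct, rng=rng))

The point of the noisy comparison is that an analyst who asks many queries, each chosen after seeing earlier answers, learns very little about the holdout set beyond what the training set already shows, so the holdout retains most of its statistical validity across reuse. The full mechanism also tracks a budget of how many times the noisy holdout answer has been returned and stops once it is exhausted.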
What's your perspective on even taking the next step on machine learning, and how easily we would be able to reach [inaudible]?

I think that's very important. Computer science has been unwittingly thrust into policymaking, just informally. If you're a software engineer at Facebook and you tweak a parameter and go to lunch and don't think about it, you are affecting all sorts of things for millions of people. Facebook in many ways is informally making policy in ways that are not even precisely thought out. And given that we're already in that situation, I think it's important we work to make this more explicit, to make it clearer how software decisions affect policy, and therefore it's important to help as broad an audience as possible understand, at a high level, what algorithms are and what they are trying to accomplish. That's in large part what we are trying to do with this book.

It's funny, because I've been around a lot longer than Aaron, and the phrase machine learning is actually in my dissertation. At the time this was a very obscure area to be studying. In fact, even majoring in computer science when I was an undergraduate was viewed as an odd thing. Sometimes I joke that, through no foresight or merit of my own, the world was delivered to the doorstep of computer science sometime in the last 20 or 30 years. That was thrilling for a long time because it had no downside in many ways: all kinds of interesting new jobs, all kinds of interesting new science. In many ways the bill is now coming due, and our book is about that bill and how we might pay it. The other part of it is, and I don't love this term, but I think more computer scientists need to think about, if not becoming public intellectuals, at least getting more involved in the uses and misuses of the technology, and trying to help society solve the problems created by those technologies. There's still not a lot of that yet. A lot of it is still fairly superficial, and I'm not criticizing; you're starting to see computer scientists do things like write op-ed pieces, for example. But I think we really are going to need technically trained people who are willing to spend their entire careers, or much of their careers, mediating between the technical part of computer science and its policy and social implications. Within the community of people who work on these types of problems, there is starting to be a generation willing to make that career choice. Maybe people like us are at the point in our careers where we can say, okay, I can go do this and not worry about whether I will still be able to have a research career. I think it's very important, and while it is starting to happen organically, it's very early on.

So I think we have time for just one last question. I thought I might ask something completely personal, which is: you work on research, as we all know, for a long time, and then you work on a book for a long time, and there's a lot of yourself in there. I know it's important work, and we've talked a lot about that, but besides that, what is it about this subject matter, when you could have chosen any subject matter, that speaks to the 16-year-old version of yourself, that really makes this what you want to spend your days doing?

Yeah, good question. The 16- and 18-year-old version of myself wanted to be very mathematical.
I started college as a math major and drifted toward computer science as I realized you could think mathematically about computation. Moving further in that direction, I realized there was something called, not just machine learning, but learning theory. I thought that was really cool: you can sit there and prove mathematical theorems about how people learn, or at least how machines might learn. Then I got to grad school and realized you can apply this mathematical, computational lens to all sorts of things. Differential privacy was just being defined, and it was an exciting time, and I enjoyed thinking about, and again proving theorems about, privacy. Wow, you can think about privacy using math. More recently, you can do the same thing with fairness. There's something very different about writing for an expert academic audience: there's a lot of math, and you try to define ideas precisely, in a rather dry way, to be concise. Writing this book was quite different. It's fun, liberating, to try to write in an engaging way. It's difficult but rewarding to try to describe these ideas, which are at root mathematical, without equations. We tried very hard to remove all the equations from the book. I hope in the end we succeeded in conveying not just the natural interest of these topics, and we were lucky that they are not just mathematical curiosities but real, meaningful, important questions of the day, but also the excitement of doing research in these fields, because in the book we can take people right up to the frontiers of knowledge, since so little is known so far.

My origin story is a little different. If you had told my 16-year-old self that at some point later in life you're going to write a general-audience nonfiction book, that would have made a lot more sense to me than if you had told me, oh, you will be a professor of computer science and machine learning. Because in high school I was a very indifferent math student. I didn't like it very much, I didn't try very hard, and I wasn't especially good at it then. I started college as an English major, a declared English major, and pretty quickly realized that I had chosen English because I wanted to learn how to write, and that majoring in English was going to teach me how to read. But at the same time I managed to hang on by my fingernails in math classes long enough that when I got to Berkeley I started taking more of them. If any of you have studied math and computer science through high school and all the way through college, you know that at some point there's this phase transition where things become much more interesting and you start to become aware of the creative aspects of it. I think I first discovered this in computer science, just because of the buzz of being able to program a computer to do something you couldn't possibly achieve yourself in your entire lifetime, and it takes ten seconds. Something simple like sorting a list of numbers, for instance. I enjoyed that, and then, having hung around math long enough, the purely mathematical aspects became interesting too. So in some vague way, writing this book does fulfill the kind of thing I wanted to do when I was very, very young.

One more comment about all this work we've been doing on fairness. I remember, maybe six years ago or so, the specific moment when Aaron and I were sitting in a cafe and first started talking about some problem in algorithmic fairness.
That led to our first publication, which was interesting but flawed. And when you work on something like that, you'd like to say, oh, six years ago I realized this was really important for society, and that we as machine learning researchers had a responsibility to fix the problem. I'd like to say that even if the research had turned out to be boring, maybe because all the problems were easy, or all too hard, or there was nothing you could do, or the solutions were clear and technically straightforward and it was just a matter of going out and convincing people to adopt them, I'd like to claim that, come hell or high water, I would have said, no, this is what we have to do as responsible citizens. Luckily, I don't have to know what choice I would have made, because it turned out to be a mathematically and algorithmically very, very rich field. But it is great to be able to work on a topic that, first, society is interested in, in both positive and negative ways, where there is really interesting technical work to do that is creative and satisfying, and also to be able to do it with somebody you are so simpatico with, from both a technical and an expository standpoint. It's been a great deal of fun.

Thank you. I think we all feel the same way about being able to hear your ideas and have them shared directly with us. It means a lot. Congratulations on the book, and thank you for coming. Thanks for hosting us. Yeah, thanks for having us. [applause] Thanks, everybody.

