205. Tutoring

Equity gaps in educational outcomes play a major role in perpetuating economic inequality. In this episode, Philip Oreopoulos joins us to discuss his research examining how tutoring and computer-aided instruction can be used to reduce disparities in educational outcomes. Philip is a Distinguished Professor of Economics and Public Policy at the University of Toronto, the Education co-chair of MIT’s Abdul Latif Jameel Poverty Action Lab, and an award-winning researcher who has conducted a wide variety of studies relating to education and educational policy.

Transcript

John: Equity gaps in educational outcomes play a major role in perpetuating economic inequality. In this episode, we discuss research examining how tutoring and computer-aided instruction can be used to reduce disparities in educational outcomes.

[MUSIC]

John: Thanks for joining us for Tea for Teaching, an informal discussion of innovative and effective practices in teaching and learning.

Rebecca: This podcast series is hosted by John Kane, an economist…

John: …and Rebecca Mushtare, a graphic designer…

Rebecca: …and features guests doing important research and advocacy work to make higher education more inclusive and supportive of all learners.

[MUSIC]

John: Our guest today is Philip Oreopoulos. Philip is a Distinguished Professor of Economics and Public Policy at the University of Toronto, the Education co-chair of MIT’s Abdul Latif Jameel Poverty Action Lab, and an award-winning researcher who has conducted a wide variety of studies relating to education and educational policy. Welcome, Philip.

Philip: Thanks so much for having me.

Rebecca: Today’s teas are:…Philip, are you drinking tea?

Philip: My tea is coffee. I love coffee. I once looked for a reason not to drink coffee, I couldn’t find one. I love my black coffee.

Rebecca: A true researcher at heart. [LAUGHTER]

John: And I am drinking a bing cherry black tea, a custom Tea Republic tea made for Harry & David.

Rebecca: And I have Irish breakfast tea. I really need to get some new tea [LAUGHTER]. I’m going to a tea store this weekend, so I’m looking forward to getting some new options.

John: And we have lots of tea in the office, some of which may not be as fresh as it was a year and a half ago. But this one still is good. It was purchased right before the shutdown.

Philip: You guys are inspiring me. I think I’m gonna have some tea sometime today.

Rebecca: All right, good, good.

John: In a November 2020 Scientific American article, you describe a meta analysis that you worked on with some colleagues that found that tutoring results in significant improvements in student learning. Could you describe this meta analysis a bit and what you found?

Philip: To backtrack a little bit, how it got started: my colleagues at J-PAL, Vincent Quan, Andre Nickow, and I, had heard about the potential of tutoring to be an effective form for increasing test score learning performance. For example, there’s Benjamin Bloom’s seminal article in the 80s, where he had two very small studies done by his students that both found off the charts improvement from offering tutoring in randomized control trials. In fact, that’s why he called it the “2 sigma problem”: the estimated impacts he found from these two small studies were raising learning performance by enough to potentially solve most of the problems that we would be having in education policy. There were a number of recent studies as well, a randomized control trial coming out from the University of Chicago’s Ed lab, also finding very promising results from an RCT looking at providing in-class tutoring to grade nine students. And so we wanted to explore whether there was some consistency in these results, so we decided to try to take a more systematic look, and we gathered up all the RCTs, randomized control trials, in the last 40 years, about 96 studies, and we took a look and we found that the consensus was quite remarkable. About 80% of those studies found significant effects larger than .2 of a standard deviation, and the average effect size was .38 of a standard deviation, which is like the equivalent of almost an entire extra year of school, from receiving these programs. And not only were the impacts really quite meaningful, about as large as you get from education interventions, but they were consistent across the board. I think that this is about as much consistency as you’re ever going to get in an education policy intervention. So we were quite excited about that. We found that the effects were pretty consistent no matter which type of program that you looked at.
They were larger for things like in-school delivery, three days a week, one-to-one delivery, full time tutors, but even in cases where that wasn’t the case, usually there were still significant effects.

Rebecca: Can you talk about what age the students were, what grades they were in?

Philip: It was for K-12.

John: I think it’s probably safe to assume, though, that the same effect would hold in the college environment as well. Those are some pretty dramatic effects.

Philip: Of course, to some extent, maybe it’s not that surprising. Giving instruction one-to-one leads to higher learning gains, and the biggest challenge, of course, is cost. We can’t all have our own teacher when we go to school. And so the biggest challenge, which gets back to Bloom’s point calling this the 2 sigma problem, is I think we have a powerful intervention to help education, it’s just that it costs too much to implement it on a larger scale. So the fundamental problem is to figure out a way to scale this in a way that can complement the classroom instruction.

John: And so that’s one of the things I think you’re looking at now, how this can be scaled up in a more cost effective manner. Could you tell us a little bit about your current research in terms of computer-assisted learning?

Philip: Sure. So computer-assisted learning or computer-assisted instruction is a type of educational software designed to help students progress through topics at their own pace. It has many of the same features as what you might receive when you’re receiving tutoring. So a typical example might be Khan Academy, MATHia, there’s lots of other types of software designed to help with different topics, math and reading, but they all have these sort of common features that allow students to progress through topics at their own pace. You receive immediate feedback from trying to work through your own problems and a chance to understand where you went wrong. If you do make a mistake, there’s data that’s generated from going through it that someone like a teacher might be able to follow and respond to. And so computer-assisted learning can, in some ways, simulate the tutoring experience, but of course, at a much lower cost. The challenge is you don’t have a real person guiding you through it. So even though a platform like Khan Academy is easily accessible, your willingness or motivation to go through it on your own is probably not as great as if you had a real person guiding you through the same material. So there has been some experimental evidence on computer-assisted learning, not as much as there is on tutoring, but of the 15 or 20 randomized control trials that have been done in this area, they have also been showing quite promising results. In cases where computer-assisted learning is provided, especially during a school setting, those receiving it also seemed to be performing at significantly higher rates than those in the comparison group. So there does seem to be some promise at using computer-assisted learning to generate the gains that we see from tutoring. But the way to introduce it, the instructions that teachers need to learn how to use it effectively, are not yet maybe as developed as we’d like them to be.
So getting to, I guess, jump into what I’m working on, I think that there’s a lot of potential for leveraging existing resources to combine with computer-assisted learning in a way that might come close to the tutoring experience. And so what I’m thinking of is in the classroom, that the kind of facilitated practice that might go on, say, in a math subject might be much better through a tool like Khan Academy than paper and pencil that we often give students. And so the question I’m investigating is around reshuffling the classroom in a way where the teacher is trained how to use computer-assisted learning more effectively in the classroom to generate that type of experience. So in the context of the program that I’m looking at now, which tries to integrate Khan Academy more into math classes, the teacher is still instructing and presenting topics, but now emphasizing the students following an individualized roadmap that allows the students to progress at their own pace, rather than having to keep up even if they’re missing on topics and not understanding. So the program which we’re calling “Coaching with Khan Academy,” or CWK, has students receive a roadmap of incremental topics and videos to follow at the start of school that roughly proceed in the same order that the teacher is going through. Now, the teacher has the students try to work on this roadmap for at least an hour, an hour and a half a week, and tries to facilitate that time during the class and encourage more done at home, and the students then have the ability to hopefully get into a routine of watching a video and taking the exercises, and if they don’t score high enough on the exercises they’re asked to try to understand why they made the mistake using the hints and tips and guidance that Khan provides or get help from the teacher, and then repeat it so that they don’t move on to the next topic until they’ve mastered that.
So the students are not proceeding all in the same pace, but it is just a much better way to learn math such that the students don’t go on to the next topic until they’ve established a strong enough foundation on the first one.

John: During the global pandemic, most high schools moved to emergency remote instruction for an extended period, and there’s quite a bit of evidence that that led to a decline in overall learning, but also some growing achievement gaps which are tied to household wealth and the wealth of the school districts in which the students reside. What types of policies could be implemented at the K-12 level so that students are more equally prepared for entry into college?

Philip: On COVID, we’ve all been exposed to online learning now, and most research suggests that it’s not a great substitute for in-person but there are certain benefits from being able to speak with a real person over a computer in regards to tutoring. So the biggest one is convenience, both for the tutor and the tutee. It’s nice to be able to jump in on a call and spend just 30 minutes on that or an hour, and not have to drive to the person’s location or do this after school. The opportunity to facilitate more tutoring, I think, is increased by having this online access. So I think there’s a lot of interesting promises from that. This one particularly interesting study that was done during COVID last summer, where a group of Italian faculty organized a volunteer tutoring experiment where they got the Deans of their respective universities to invite university students to volunteer their time, three to six hours a week to reach out and connect with students who have been struggling in the high schools and lower grades. And on the flip side, they got the school districts of several locations in Italy to ask teachers to identify students that they thought could benefit from having this one-on-one instruction. And then the response was great in both ways, there were a lot of people willing to volunteer their time for this effort, and there was also a lot of perceived need for students that needed this. And so from this large set-up, they randomized who they were able to give this offer of assistance to. And it was done all online, sometimes over the phone, but more often through Zoom, or Skype, or whatever was most convenient for the match to take place. The tutors met with tutees, for three hours a week, over six weeks. The topics were either math, Italian, or English, and then at the end, the researchers collected the survey and found similar gains to what we were finding in the online overall. 
Not only that, but they also collected data on mental health and found improvements in feelings of connection, more positive outlook on life. And what’s also interesting is they seem to show improvements and positive outcomes for the tutors themselves, as well. So it stands the potential for a win-win, and this was all done online. So it’s like the only online study I know, but it seems to show the potential that it might be done there. One other example I should mention is Khan Academy has also initiated another organization that facilitates free volunteer online tutoring. It’s called ‘schoolhouse.world’ and it’s been interesting to watch that trying to get up and running. Their system allows anyone in the world to volunteer their time as a tutor, and then they try to connect anyone in the world wanting to receive that tutoring. And you get some sense of some of the challenges from doing that. How do you screen for quality? And also, how do you screen for safety? So they’ve had to go away from a one-to-one model to more of a group model. They’ve had to have systems in place to check the quality of the tutoring, what’s being discussed. They’ve had to switch to allowing only high school students to receive the tutoring and a few other challenges. And so there’s challenges but also a lot of potential in this that wasn’t available from always having to meet your tutor in school or after school or face-to-face. So the potential scalability is enormous, and that’s where the intriguing possibilities are with that tool.

Rebecca: So if we’re looking to reduce achievement gaps, we’ve talked a little bit about COVID and the mix of instruction that students might’ve had during COVID, the quality of instruction, access to technology, to even have interactions with teachers in some cases, and historically even, differences in ability when students arrive in higher ed. What are some of the things that the higher ed community might be thinking about in terms of this research? Should we be advocating for certain kinds of policies or programs in K-12? Should we be trying to institute some of these things in higher ed? What are your thoughts on that?

Philip: So just in terms of advocacy and thinking about facilitating more equality, there’s no question that tutoring has, in general, been an unequal program. There’s the whole private sector of tutoring where more affluent households seem to receive a lot more of it than less affluent households. And so one thing we can do as policy-makers is to try to facilitate more tutoring to happen in schools, especially at schools serving students from more disadvantaged backgrounds. We can also focus on providing tutoring to those who need it most. I think that there is a growing awareness of the potential for tutoring to make a real difference in helping address the learning loss that may have occurred with the pandemic and just helping address education inequalities in general. And so a lot of resources have started going towards trying to increase the amount of tutoring happening in schools. I think that the more we understand how to implement it successfully, the more guidance that we can provide the K-12 sector in trying to introduce that. I think that there is a lot of optimism now around its potential. I think tutoring is one of the most effective programs that we can offer to make a meaningful difference at scale, such that we can get more students arriving into post-secondary ready to handle it and succeed well there. So that’s on that end. I think that there’s no reason why we also can’t consider tutoring at the post-secondary level as well, and the potential benefits that might come from that. Even if we just look at first-year calculus, or other subjects in math, computer-assisted learning is well developed even at that level, the need for tutoring at that level is there as well. And so it really does go from that importance of establishing a foundation that one might benefit from tutoring at earlier ages.
But even at the post-secondary level, regardless of what level the student is, we can all benefit from one-on-one instruction compared to being in a calculus class of 500, right? I think there has been less research that’s been done in that area, but the evidence certainly points to the direction that tutoring at the post-secondary level would be also effective and important to consider.

John: And you mentioned that Italian experiment where college students were providing tutoring, and you mentioned that that was a very positive experience for the college students as well. That might be an interesting model where college students could improve their own skills and develop a bit more automaticity and more practice in basic concepts, while helping bring students up to a higher level in secondary schools. That’s a program that I think offers a lot of potential.

Philip: So I would agree, absolutely, the expression is you don’t really understand something until you teach it. I think that there’s something to be said for that. I think that there’s also a lot of skills and experience that is gained from trying to help others, from trying to connect with perhaps younger individuals that have not had the same background as you. I think that the experience is also attractive to employers looking at who to hire. I think there’s huge gains from all the things that you might volunteer or use your time for in college, spending some time to volunteer to do something like tutoring could be a very rewarding thing as well. So I’m also excited about that model. I think that there are ways to try to facilitate that kind of model at scale and more research needs to be done to explore how to do that.

Rebecca: One of the things that I heard you mention early on in the conversation is the idea that, historically, folks who had access to tutoring are more affluent. So the students who most need the tutoring are the ones that aren’t always getting it, because they can’t afford it. So I love the idea of having it in schools or it’s a part of our programs. But also I think sometimes tutoring has a negative connotation to it. It’s like a deficit model. Especially I’ve seen this in higher ed, students don’t want to go to a tutor because it makes them feel like they’re dumb or something.

Philip: My first reaction to that is that tutoring can be beneficial at any level. For example, in the Khoaching with Khan project that I’m looking at, the potential is to help all students in the class regardless of their level, because every student can be given their own individual roadmap. And that not only includes those that are behind grade level that benefit from establishing a stronger foundation in that earlier material so that they can catch up, it also includes those at a higher level that don’t have to be held back or wait for the instructor to cover new material and can use a platform like Khan Academy or a tutor to work on more challenging material that interests them. And so how to remove that stigma that exists in general, I agree the usual perception is when someone asks, “Do you need a tutor?” it’s because you’re struggling. It doesn’t need to be that way, but at the same time, I think the more we become aware of the benefits from the tutoring, the more we realize that it’s a great resource to take advantage of. Getting back at the college level, I don’t know about your own experiences, but it always amazes me how few students take advantage of all the free tutoring that’s being offered by the universities through, like, office hours. The opportunity for receiving one-on-one discussion is often there, and yet so few students seem to take advantage of it, perhaps because of that stigma or perhaps they’re too busy. Some of us, when we went through college, were pleasantly surprised by how much you can get with office hours of graduate students and extra tutoring and how much you can learn from that process.

John: In a lot of classes, students are treated as if education were one-size-fits-all. Especially in subjects such as math where there is a very rigid structure, if students come in without a solid foundation in the concepts, learning new topics is not going to be very productive, because they don’t have that foundation to connect to. And I see that in my own classes, and it’s a bit of a challenge to try to do that. Because of issues of scale, since I often teach large classes, I try to rely on peer instruction as much as possible with small group activities. Could small group peer interactions in working through problems and problem sets achieve something similar to the one-on-one attention?

Philip: In the literature, it’s called peer-to-peer, we did not look at peer-to-peer in our meta analysis on tutoring, but there is some literature and there’s some effort to consider that. It’s a little bit of a different model, because you’re relying on slightly older students or similar students to help assist other students. I think more research needs to be done on how to make that happen effectively. On one hand, the potential is there to make this a scalable, effective program that doesn’t cost very much. On the other hand, monitoring quality and the potential to train to be a tutor and to do a good job with it may not be there as much as with the regular type of tutoring program.

John: In particular, I was thinking of activities in class where students work on problems in groups, and they try to argue out solutions. They work together and they can explain to each other things they don’t understand, but the key aspect of that is they get feedback on whether they’re correct or not, some constructive feedback on where they went astray. But I was just thinking that those types of small group interactions could provide some of the benefits without that stigma of needing to go to tutoring and perhaps at a higher scale than tutoring might work.

Philip: The advice that I often give my students is to study until you feel you can explain it to someone else. And so there’s a similar, perhaps, mechanism at play when we’re thinking about that. When you try to write down a concept or explain it, even to yourself, out loud or to someone else, you quickly realize what you understand and what you don’t. There does seem to be a lot of potential there.

Rebecca: Sounds like one of the keys to reducing stigma around all of this is making the coaching or this tutoring model just something that’s normalized. Maybe it’s normalized in class, it’s normalized through the school day, and then people might be more apt to take advantage of it because they have access to it. But also, it becomes a standard way of being, that’s what other people around them are also doing.

Philip: Absolutely! I think if we can reframe tutoring as just individualized instruction or personalized instruction, then we can all understand the potential benefits of receiving more personal help than in a classroom setting, and that goes for pretty much anyone.

Rebecca: It really also matches up well with a lot of universal design for learning principles of flexibility as well, and allowing students to go at their own pace and finding ways of teaching and learning that match well for students and where they’re at.

Philip: And of course, the issue is scale. Getting children to learn in a classroom of 25 to 30 students, when these students vary enormously in academic levels, is just really difficult. And trying to figure out a way to provide that individual attention is the challenge that all teachers face and have been facing for many, many years. And if we can find a way to scale adding on or providing more and more individualized attention, it has the potential, I think, to make a real difference in education. Of all the potential policies that we can be looking at, I do think that, at the school level, leaning towards more individualized instruction is where we should be looking at, for a solution.

Rebecca: It’s so interesting to me that we’re having this conversation early on in our semester, because after teaching online for a year, which I hadn’t done previously, I’ve really worked to make my classes more flexible and actually offer some of those kinds of models that you’re describing where students are going more at their own pace, and that they can get some individualized instruction when they need it and that they need to do this mastery learning so that they build on things over time. It looks to me like maybe I need to look more into tutoring and coaching models that have worked really well to see if I can’t implement some of that more during class time.

Philip: There may be different ways to do it. Some may be more effective than others, but I do think, getting back at what John was saying, it’s harder to provide that individual support or help to students arriving in college without that foundation. I have done some other work at the college level, trying to facilitate more personal attention to students arriving, trying to help them out and encourage them to get into better habits, and it has proved quite difficult to change behavior, and so I have found myself reacting to that by focusing more on earlier grades to see if there might be more promise on trying to foster better study habits, better learning habits, earlier on with the hope that students arrive in college more prepared.

John: I think that’s one of the things a lot of behavioral economic studies have found. Interventions that result in long-term changes of behavior are challenging in general.

Philip: Absolutely.

John: And I think you’ve done some research on that.

Philip: Absolutely. So if we have to change one-time actions, like helping students through applying for college, applying for financial aid, those types of interventions are much more promising at affecting one-time goals than to change habits or routines that involve much more continuous behavior. So helping someone study more effectively, spend more time studying, these are much harder problems to solve. And maybe low-cost nudges that we’ve been looking at in the literature may not be as effective. I think that does tie back into how my perspective has changed over time. It’s hard to have significant influence without personal connection. It’s a lot more expensive, but there’s only so far you can go with sending an email or a text message or a one-time meeting in trying to change someone’s learning trajectory or life trajectory. And the more you sort of look at education policies that have been successful, the more you notice that they often come with this personal connection that’s been important for making that meaningful change.

Rebecca: It seems like we should all be really advocating then for these much more early interventions. It’s much more cost effective if we get those habits in place really early [LAUGHTER].

Philip: I will say there’s surprisingly not enough research on the long-term effects of tutoring. I’ve seen one study that has found that the benefits of receiving that tutoring continued one year past the program ended; the effects faded, but not by that much, and that’s the only study I’m aware of that actually does a long-term study. So on the question of whether we can have these life-changing impacts from targeting earlier ages, certainly, there’s a literature for the very young… like, almost helping at the household, but at the school, I think that more work could be done.

John: And that could be a really productive research area. Before we started recording, we were talking a little bit about, with the pandemic, creating our own videos. Could you talk a little bit about how you try to implement what you’ve learned in your own classes at the college level?

Philip: Yeah, I think that using the situation last year to put my lectures online has freed up space in the actual lectures to be more interactive. So I think it was a benefit both ways. The videos of the lectures themselves became more streamlined, I got a chance to break them up into smaller parts, sort of like Khan Academy videos, where instead of one video that’s two hours long, that goes all over the place, and you’re staring at me and the Blackboard, I created five- to ten-minute videos of vignettes that I could focus on with slides and have a series of these videos that students could watch at their own pace. I could edit them and make sure that the video is as succinct as possible and gets across what I really want to say. So that was good on the video side, and then on the actual lecture side, we spent that time going through problem sets and answering questions and it was much more interactive, closer to the spirit of more personalized instruction. So there was more opportunity for questions, more opportunities for the students to get more involved, and I think it did lead to more satisfaction of that approach. Obviously, the big question is, ‘Do they really watch the videos when they’re asked to do it on their own?’ I think there are ways to try to incentivize that, but just like any class, the students really perk up when they’re working on a problem that was, say, a previous exam question.

John: I’ve used a very similar approach. I’ve used videos for like 20 some years in my classes, but one thing I started doing last year is I embedded questions in the middle of the videos, and that’s a pretty effective incentive structure. It does get them all watching the videos, and at least thinking about it and trying to make some connections while they do it, and that’s worked pretty well.

Philip: Not only that, but you can make them mandatory for class participation. So you stick those questions in and they have to watch the video to find the questions when they pop up, there’s software that can do that. And then you can make it as a way to encourage them to have to watch the video.

John: Do you think that more use of computer-aided instruction is going to be helpful in allowing more students to be successful?

Philip: I’m very optimistic on this potential of leveraging computers with teachers and parents working together on trying to facilitate high-dosage practice. We’ve been talking mostly in math, but it could also be language as well, and maybe other topics. But I think this really is a good way to learn, as long as the practice time is long enough, and the student’s not stuck. I think that it takes a while to get into the habit, getting used to the software, getting used to the routine, both for the teacher providing this and for the student doing it, and so that, for me, right now, is the biggest challenge. I am optimistic that if we can facilitate a way to help teachers and students get to that higher-dose practice using computers, then very good things will happen. I think that the evidence is highly suggestive that the high dosage is a worthwhile thing to get done. I’m hoping that we can generate evidence that that’s the case, but we are finding that there are challenges because there’s a learning curve, it is changing the way that the classroom is done and changing the way the student usually learns, but I’m optimistic that if we can get past that, the students and the teachers will come to like this approach, and that we can do more of it at scale.

John: And I think a lot of people began experimenting with some sort of a flipped approach where they created videos and then used the classroom for more interactive activities, at least at the college level. I don’t think that’s happened quite as much at the secondary school level. But I think that has helped provide at least some professional development for faculty. But it is an adjustment that students are not adjusting to perhaps as easily as I would like. I know I always have trouble getting across to students that there is some benefit of working through problems in class and watching videos and learning some of the basic concepts outside of class. Students would rather be lectured to. There was that big study that was done at Harvard not too long ago, where students were asked about active learning classes versus lecture classes, and the research certainly showed that active learning in the classroom led to significant learning gains, but students perceived a higher learning gain from lecture classes, and that’s where I think that issue of students’ adjustment is a challenge, and until we get to see a large amount of this occurring, it’s going to be a while convincing students of this, because it’s really easy to sit there in a lecture and nod and smile and have it all make sense and it seems to fit together very logically, but then when you try to apply it, there’s a bit of a problem, and then the questions are somehow unfair. But when students are faced with problems and interactive work in class, they’re confronted by not knowing things as well as perhaps they thought they did, and it’s not as pleasant of an experience. And I think that’s the source of that metacognition, that students perceive that lectures are more effective, because it’s easy to sit there and listen in, and it all seems reasonable. But the problem is when they try to work through problems and realize they don’t quite have those connections fully there yet.

Philip: The lecture seems to make so much sense until you sit down when you get home and try to go over it again, but I do think there’s the potential for this middle ground that even in the experiment we’re looking at, we’re not entirely flipping the class, in fact, we want to work with the teacher to understand what their own preferences are, while still trying to hit this high dosage of practice, which may occur in class, but also could occur at home as well. And I think that there is something to be said for having a lecture on a new topic being done in class, in person, with a real person. It gets back to that importance of personal connection that the computer is not able to provide. And so maybe there is a sweet spot around providing real instruction, real empathy, but also enough time to be working through these problems at your own pace. My vision for the Khan project is that students, say, in grade four, getting 90 minutes of math a day, maybe half an hour of that would be the teacher’s own instruction of a new topic, but then a lot of the other time would be students working on their own devices, while the teacher takes the time… instead of just sitting up at their desk… walks around and spends a lot of time looking over the student’s shoulder, using the data that they’re seeing to understand who’s struggling and where, and spends a lot of time working individually while the student is using the computer. So there’s still that interaction going on and taking advantage of the personalization. I think the two can go really well together.

Rebecca: That’s definitely something I’ve been experimenting with. I went all the way flipped before, and right now I think I’m right in the middle. There’s some flipped, there’s some demos that are live so that people can interact and ask questions, and then there’s lots of practice with individualized attention. And it does take a little time to get everyone on board, to get everyone trained to do things in a new way. So in a 15-week semester, it might take two full weeks to develop new habits and workflows for everyone, but really after we get over that two-week hurdle at the beginning of the semester, my classes tend to settle into a routine that seems really productive and that students have been pretty positive about.

Philip: A key feature of the coaching with Khan program, is that every teacher gets their own coach that we spell with a “kh,” and our coaches meet with the teacher prior to school to go over our suggested recipe to follow, but then they don’t just leave it at that, they keep working with the teachers to check in and try to troubleshoot or brainstorm or reassure and remind the teacher until things are going smoothly. But it can take longer than two weeks to figure out how things are going, and then on the student side, it can take a while for them to adapt and understand that there’s some independence on their own for wanting to do it. The hope is that the students start to gain confidence when they see their own progress, when they see that maybe they didn’t consider themselves a strong math student, but if you start them at the right spot on this roadmap, and then they proceed incrementally, and they can see that they are advancing, then they start to understand the potential benefits and internalize the desire to keep going on their own.

Rebecca: Yeah, that autonomy and that empowerment, I think, is really key to the whole puzzle. And I think something that probably tutoring historically helps students achieve is that they can do this. They might have a little extra guidance initially, but then they achieve it and can do it, and that’s really empowering.

John: That’s our hope.

Rebecca: We always wrap up by asking: “What’s next?”

Philip: What’s next? I think I made some notes on that. [LAUGHTER] So I think the issue around tutoring and individualized learning is all about, now, scale. I don’t think we need another study to demonstrate that one-on-one instruction, or one-on-two is an effective additional tool for learning, that more should be done if it were possible. A lot of resources are now going into trying to provide individualized instruction. I think a lot of policymakers and governments are looking to tutoring as a way to address some of the learning loss that may have gone on during the pandemic, and I think, in that space, there’s some optimism by researchers and policymakers to try to understand what types of scale up are better than others in a way that we can make a meaningful difference at the aggregate level.

Rebecca: Well, thanks so much. I’m really excited to hear more as your research develops and more information becomes available!

Philip: It was a pleasure to get a chance to chat with you guys. It’s a topic I’ve been spending a lot of time on and losing a bit of sleep on trying to get things to work. The experiment that we have going on, this is going on in Texas, and one of the challenges of doing a field experiment is that so many things go wrong while you’re trying to deal with real people, real students, and provide evidence that this is a good idea. And it’s always a bit frustrating to face these challenges, like just account issues, students have trouble getting on to Khan Academy and the teachers getting frustrated, and it would be a shame to have those issues that can be worked out actually create this wedge from the program going smoothly and making the difference between having these great impacts or not. So it is stressful, but I think it’s worth it to try to keep at it, and I hope to be able to do so. With funding and policy support we’ll just keep trying. I think there’s a lot of interest in it, I think that it hasn’t been difficult to motivate these ideas and wanting to do more on it. So thanks a lot for giving me the chance to share these thoughts.

John: Your work is incredibly important. And so much income inequality is associated with differences in educational attainment, that understanding these achievement gaps and what we can do to narrow them can have a really dramatic impact on society.

Philip: Fingers crossed!

[MUSIC]

John: If you’ve enjoyed this podcast, please subscribe and leave a review on iTunes or your favorite podcast service. To continue the conversation, join us on our Tea for Teaching Facebook page.

Rebecca: You can find show notes, transcripts and other materials on teaforteaching.com. Music by Michael Gary Brewer.

John: Editing assistance provided by Anna Croyle.

[MUSIC]

119. Faculty Incentives

If faculty were paid more when their students learned more, would student learning increase? In this episode, Sally Sadoff and Andy Brownback join us to discuss their recent study that provides some interesting results on this issue. Sally is an Associate Professor of Economics and Strategic Management in the Rady School of Management at the University of California at San Diego. Andy’s an Assistant Professor of Economics in the Sam M. Walton College of Business at the University of Arkansas.

Show Notes

Transcript

John: If faculty were paid more when their students learned more, would student learning increase? In this episode, we discuss a recent study that provides some interesting results on this issue.

[MUSIC]

John: Thanks for joining us for Tea for Teaching, an informal discussion of innovative and effective practices in teaching and learning.

Rebecca: This podcast series is hosted by John, an economist…

John: …and Rebecca Mushtare, a graphic designer.

Rebecca: Together, we run the Center for Excellence in Learning and Teaching at the State University of New York at Oswego.

[MUSIC]

Rebecca: Our guests today are Sally Sadoff and Andy Brownback. Sally is an Associate Professor of Economics and Strategic Management in the Rady School of Management at the University of California at San Diego. Andy’s an Assistant Professor of Economics in the Sam M. Walton College of Business at the University of Arkansas. Welcome.

Andy: Thank you.

Sally: Thanks. Great to be here.

John: Our teas today are:

Andy: I wanted to represent Fayetteville, so I went to the tea shop and I got what I have been told is the world’s greatest cup of Earl Grey tea. [LAUGHTER] It’s an award winning cup. They promised me this. [LAUGHTER]

Rebecca: Does it taste award winning?

Andy: I haven’t had enough of it yet. [LAUGHTER]

Rebecca: Reserve judgment?

Andy: I don’t give these awards out lightly.

Rebecca: And a nice lineup of mugs on your desk too.

Andy: Yes, many, too many. So this is just a way I avoid doing dishes. [LAUGHTER]

John: And Sally?

Sally: I’m drinking coffee but I’m on California time, so I’m excused.

Rebecca: And I’m drinking Spice of Life today, a white tea, John.

John: Pretty good.

Rebecca: Unusual, right?

John: And I’m drinking Oolong tea.

Rebecca: You’re drinking nothing cause you forgot the cup of tea. [LAUGHTER]

John: If I remember where I put it, I think I may have left it in the office before I came over here. But I did make a cup of Oolong tea and I did have a sip of it before and I will have it right after this.

Rebecca: I intended to drink tea. [LAUGHTER]

John: We invited you here to talk about your forthcoming article on improving college instruction through incentives. Could you start by giving us a general overview of this study?

Andy: Our study, we partnered with a large community college in Indiana called Ivy Tech. And what Ivy Tech wanted to do was incentivize instructors based on student performance. At the same time, they were rolling out a new set of large end-of-semester comprehensive, and importantly, objective exams. And so we were able to partner with them to use those exams to incentivize instructors based on the outcomes of students. So, that’s kind of the high level overview of what we were doing. I know we’ll get into more detail in a bit.

Rebecca: Can you talk a little bit about what motivated the study in the first place?

Andy: Yeah, absolutely. So, community colleges are obviously really important. It’s thought of as a sort of pathway to the middle class. At the same time, the rates of success at the community college level have been relatively low. And so if we think of community colleges as a particularly good tool for upward mobility, then it needs to be the case that they achieve better outcomes. And with the low current rates of success, it also leads to long times of accruing debt without receiving the benefits of these higher incomes from having that college education. So, there’s a whole host of factors that are kind of coming into play to make these both important and potentially underachieving tools for upward mobility. And then the other side of the equation is also that the faculty at community colleges are predominantly or at least, there’s a large percentage of adjunct faculty with really low pay and sort of what could be seen as an unsustainable business model where you’re relying on people to work in short-term, non-guaranteed contracts regularly and teach these classes. So, we wanted to address both sides, both the student achievement side, as well as the sort of personnel side of the community college setting.

John: And in terms of student success, specifically, I think you’re referring to the proportion of students that move through to a four-year degree program as being lower than what students intended. Is that the primary metric?

Andy: Yes, that’s one of the primary metrics. You can think of the community colleges as having two goals: one being graduating students with associate degrees and another being transferring students to four-year degrees. Now, Sally will know the exact number, but a large percentage of students attending community colleges, I forget what the number is, but their ultimate goal is to eventually transfer and graduate from a four year-college with a bachelor’s degree. So, there’s kind of two ultimate goals. In the process of achieving those goals there’s also gains from simply taking additional classes or receiving accreditation in certain skills, and that’s something that a lot of people go to community college to do. But, our primary long-term concerns are graduation rates and transfer rates.

Sally: Yeah, I think it’s really fascinating. Most of my work up until now has been at the K-12 level. And I think most economists, if you look at education economists, there’s a lot of focus on the K-12 level and looking at teacher quality at the K-12 level and how can we improve teacher quality at the K-12 level? When we came to the college level, there’s been work showing how important it is who your instructor is. Instructor quality matters a lot. But we couldn’t find any work looking at how can we improve instructor quality at the college level? I think it’s really interesting because community colleges are getting a lot of attention from policymakers because they’re low cost, they expand access to underrepresented populations that normally don’t have as much access to college: minority students, students who are first generation college goers, students who are working and so they can’t travel necessarily to go to a college. And so we think that community colleges provide amazing opportunities to students, but as Andy was saying, they really struggle with success rates. And so 80% of students entering a community college say they intend to transfer to a four-year school and fewer than 30% end up doing so. Fewer than 40% of students graduate with any kind of degree within six years. And so these colleges, and we see this working with Ivy Tech, they are incredibly dedicated. The administrators and the teachers there are incredibly dedicated, but they’re working with students who are struggling, and so there’s a lot of room for improvement. And what we found actually that’s interesting, I think, at community colleges, is that there’s actually more room to think about how to structure employment contracts than there is at the K-12 level. Because often, the instructors aren’t unionized, as Andy was saying they work under these short-term, flexible contracts. And so there’s a lot of flexibility.
And really, people haven’t thought much about how to structure these contracts in a way that can improve performance and motivate both instructors and students.

John: It’s a fascinating study. For those of our listeners who aren’t familiar with field experiments, could you tell us a little bit about what a field experiment is?

Andy: Yeah, absolutely. So a field experiment is, in our case, a test of policy. And the way it’s experimentally designed is through what would be known as a randomized controlled trial, meaning that you take a sample of people from a population and you split that sample into a treatment and a control group, and you do this randomly… and that’s the really important part. Because if you test a policy with an assignment that’s anything but random, then you can’t guarantee that these two groups are otherwise equal. But in our case, we’re going to randomly assign people to be in the treatment group or the control group. So, the treatment group will receive the policy, the control group will continue in the current status quo. And then what we will do is look at outcomes and how they differ between the two groups. Now, since the assignment to the two groups is random, again, there’s no mechanical correlation between treatment assignment and any of the characteristics of the groups themselves. Then we can know that any differences subsequent to the assignment are results of the treatment itself and not any sort of spurious correlations or selection biases.
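
[Editor’s sketch: The random assignment Andy describes can be illustrated in a few lines of Python. This is a minimal sketch, not the study’s actual procedure: the function names, the instructor IDs, the equal split, and the fixed seed are all assumptions made for illustration, and the outcomes passed to `difference_in_means` would be whatever measure is being compared.]

```python
import random


def assign_groups(unit_ids, seed=0):
    """Randomly split units into treatment and control groups of (nearly) equal size.

    Hypothetical helper: an even split with a fixed seed is an assumption,
    not a description of how the study assigned instructors.
    """
    rng = random.Random(seed)  # fixed seed so the assignment can be reproduced
    shuffled = list(unit_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"treatment": sorted(shuffled[:half]), "control": sorted(shuffled[half:])}


def group_mean(outcomes, ids):
    """Average outcome across the given unit IDs."""
    return sum(outcomes[i] for i in ids) / len(ids)


def difference_in_means(outcomes, groups):
    """Estimated effect: mean treatment outcome minus mean control outcome."""
    return group_mean(outcomes, groups["treatment"]) - group_mean(outcomes, groups["control"])


# Hypothetical usage: ten made-up instructor IDs split at random.
instructors = [f"instructor_{n}" for n in range(10)]
groups = assign_groups(instructors, seed=42)
print(len(groups["treatment"]), len(groups["control"]))
```

[Because the shuffle ignores every characteristic of the units, which group a unit lands in is unrelated to those characteristics, and that is what allows a later difference in average outcomes to be attributed to the treatment.]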

Sally: Yes, I think listeners are probably familiar with this kind of experiment when you think about testing a drug or a vaccine, those kinds of clinical trials. And more and more economists have brought those models in for testing policies. And I think they gained a lot of attention recently because of the recent Nobel Prize, which highlighted how powerful these experiments can be for evaluating policies. And so I think that they gained a lot of attention from economists, they’re growing in their use, and it’s really thanks to partners like Ivy Tech that are willing to let us come in and test things in this way. Because, I think although people are very comfortable with the idea of testing a drug in a clinical trial, sometimes there’s discomfort with testing policies in this randomized way. And so we’re really grateful when we have partners who are willing to let us come in and try these new policies and implement them in this randomized way where some instructors receive incentives and some won’t.

John: And in a sense, we’re always testing things. It’s just, we don’t always measure the effect of it. When you try something new in your class, you are doing an experiment. But unless you have a control group to compare it to, you can’t really assess whether the gain is due to that particular intervention or something else that was happening.

Sally: That’s exactly right and we really try to emphasize to people exactly that, that you’re always trying things, rolling out new policies or stopping one thing and doing it differently. And if you’re going to be making these changes, do it in a way where you can learn from them instead of just trying something, trying to step back and try to understand whether it worked or not. How do you know whether something is working or not unless you can compare it to a proper control group?

Andy: And just to emphasize the importance of this methodology, there’s a lot of policy that gets rolled out based on bad data and bad evidence. And so if you’re using a poorly designed experiment, or simply looking at correlational data and rolling out policy, what you could be doing might not be effective, it might be actively detrimental to students. But once you have this clear causal evidence, we can be really confident in the policies we roll out and understand the cost-benefit analysis of the policies prior to implementation.

Rebecca: Can you talk a little bit about the policy that you were testing in this particular experiment?

Andy: Yeah, so as we talked about, we wanted to roll out incentives for instructors based on student performance. And we base these incentives on objective, comprehensive exams for a variety of courses in a variety of departments. The exams are designed outside of the classroom in the sense that it was designed by deans and department heads and represented the types of material that they wanted the students to master by the end of the semester. So, those form the basis of our incentives that we would be giving to instructors. Now, we didn’t just want to offer incentives based on outcomes. We wanted these to be potentially as powerful as possible. So, we leveraged an approach that Sally’s researched in the past in a paper with Roland Fryer, John List, and Steve Levitt, where they looked at loss contracts.

Our incentives were actually such that every instructor would receive $50 for every student who passed the exam, and passing the exam is defined as receiving a 70% or higher on the exam.
So, we framed these as losses. And we delivered incentives at the beginning of the semester, as if half of the students in an instructor’s course had passed the exam. Now, this established it as sort of a target, but it also allowed us to leverage this idea of loss aversion, that instructors would value keeping money potentially more than they value gaining an equivalent amount of money. So, as the students progressed through the semester, at the end of the semester they would take this exam, we would have these objective evaluations for how many students passed the exam, and then we calculate their final payments. If their final payments exceeded this initial payment, they would receive additional payments. If their final payment was less than this initial payment, we would claw back some of that payment. And this was all explained at the outset of the experiment. And again, this sort of loss framing is leveraging a long line of research in behavioral economics, about how much more motivating it can be to face potential losses than equivalent gains.

Sally: Yeah, so just to give an example, if you have 20 students, and you get $50 per student who passes, half of your students passing, that would be $500. So we would send you a check for $500 at the beginning of the year, the beginning of the semester. At the end of the semester, if fewer than 10 of your students pass the exam, say only eight students pass the exam, you have to write us a check back for $100. If more than 10 of your students pass the exam, say 12 of your students pass the exam, then we send you a check for an additional hundred dollars. And we found in previous work that having this money in your bank account and knowing that you potentially could lose it if your students don’t pass the exam can be very motivating, compared with rewards that you only receive at the end of the semester.
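
[Editor’s sketch: Sally’s arithmetic can be written out as a short Python sketch. The $50 bonus and the “half the students pass” upfront rule come from the conversation; the function and constant names are hypothetical, and how the study handled odd enrollment counts is not stated, so the sketch simply allows fractional dollar amounts.]

```python
BONUS_PER_PASS = 50        # dollars per passing student, as stated in the conversation
UPFRONT_PASS_SHARE = 0.5   # upfront payment is made as if half the students had passed


def upfront_payment(enrolled):
    """Amount paid at the start of the semester, based on enrollment."""
    return BONUS_PER_PASS * enrolled * UPFRONT_PASS_SHARE


def settlement(enrolled, passed):
    """End-of-semester adjustment.

    Positive: additional amount paid to the instructor.
    Negative: amount the instructor pays back.
    """
    return BONUS_PER_PASS * passed - upfront_payment(enrolled)


# The example from the conversation: a class of 20 students.
print(upfront_payment(20))   # 500.0 paid up front
print(settlement(20, 8))     # -100.0: the instructor returns $100
print(settlement(20, 12))    # 100.0: the instructor receives another $100
```

[Under either framing the instructor’s total is $50 times the number of students who pass; the loss framing changes only when the money arrives and whether falling short means paying some of it back.]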

Andy: Yeah. And one point about the logistics real quick is that these initial targets were based on enrollment as of what they call the census date. It’s not the drop deadline in the sense that you can’t drop afterwards, but it’s the deadline at which point dropping a course is no longer costless. All the students at this point in the course are enrolled sort of formally, and instructors will receive the upfront incentives based on that number of students. So, there’s multiple margins at which the instructors can influence student outcomes.

John: One thing I think that’s probably worth noting is that one advantage of doing it in a community college is that it’s much easier to have that standardized testing. I know in a lot of four-year colleges, faculty would object to having to assign an externally designed exam at the end of the term, while in community colleges that type of standardization is much more common, which makes it a bit easier to design a study like this, I would think.

Sally: Yeah, that may be the case. Interestingly, I think, even for accreditation, for example, often you need to show that the test has certain questions on it. I know in large classes with many sections, they often write the exam together. The goal at Ivy Tech was to sort of create this bank of questions that every year tests would be drawn from, and I think moving classes over to that model is interesting. And there’s more openness to it than I thought. So, for example, when we started this study, I thought, “Oh, the only courses we’re going to get in this study are going to be math and maybe some science courses.” And what’s really interesting to me about this study, is unlike at the K-12 level, where it’s primarily focused on math and reading, we have a really wide range of courses. We have anatomy and physiology, art history, nursing, psychology, criminology, sociology, psychology. And so what it showed to me was that you can really get a wide range of courses into this kind of framework. And it doesn’t cover every element of the course. But, for example, in the English courses, one thing they were moving toward was evaluating the essays in a more objective way where you’d have two readers that would both rate the essays and compare ratings. And as colleges move toward those models, I think that this kind of framework will be more and more implementable.

John: It’s certainly good for assessment, and it’s certainly good for evaluating the effectiveness of innovations in instruction. There’s a lot to be said for it. I’m just thinking, at my college I know in many departments there’d be some objections to this. We used to have a standardized common final in the economics department where I teach and people objected to that for a long time, and we eventually moved away from it, but we are talking about doing something similar with at least some subset of questions that would be standard, for that sort of purpose.

Sally: Right. And I think always a concern about these kinds of studies is if the incentive is based on the objective part of the exam that can be tested and assessed in that way, does it take away from the other parts of the course that are more qualitative or more specific to each instructor? And so one thing we were really careful about in this study was to look at not just performance on the test, but how did students do in the class overall, how did they do on the other courses they were taking at the same time? How did they do in future coursework? And I think that’s really important that it’s not just all about teaching to this one assessment that’s going to be used for the incentive.

John: Given the strong findings on loss aversion in terms of how people find losses much more painful than gains of equivalent value, how did faculty react to that incentive structure? I believe you surveyed them on that early on, and then again later.

Andy: Yes, at the outset or at the baseline, the faculty did not like the idea of these incentives. This is both evidence-based where we have survey information and people were willing to sacrifice a rather large amount of money to have these contracts converted into contracts that were gain-based contracts that wouldn’t be paid out until the end of the semester. Anecdotally, this fits with my experiences, I went to explain these contracts. There was quite a bit of pushback in asking why these were framed in this way, and some people potentially wanting to approach them differently. Interestingly, this was very heterogeneous across departments. The accountants were like, Okay, well, I know what to do with this, [LAUGHTER] and put it away, and the psychologists were particularly upset because they knew exactly what we were doing. But, the data show that with experience, our treatment group, on average, has no preference between a loss contract and a gain contract, meaning that a large amount of this distrust of the contract could be attributable to just a lack of experience with this style of contract. And that as instructors gained more experience, they also gained a comfort level with the contracts as well.

John: I still wouldn’t rule out loss aversion as being a factor, but it is interesting that it gets reduced after they’ve experienced it.

Andy: Oh, absolutely. So, that’s not to say that loss aversion isn’t still a factor. But, as you gain experience with these contracts, maybe you start to appreciate the motivating qualities of loss aversion. So, maybe you understand that although these contracts cause you to work harder, or cause you to exert more effort around a certain goal, that by increasing that effort, you’re actually achieving greater outcomes for yourself. And if that’s the case, then they’re still motivating you through loss aversion, but you may not be as averse to the contracts as you were ex ante.

Sally: Yeah, so it may be that people are using them as a type of commitment contract where they know that yes, it will be painful while I’m in the contract, but it’s a way to motivate me to work harder, and I’ll walk home with more money than I would otherwise.

John: Just a couple of months ago, we did a podcast on commitment devices with Dean Karlan…

Sally: Oh nice.

John: …and we talked a little bit about that, and StickK.com, the site he created for that. Now, we’ve talked a little bit about the incentives for faculty, but you also introduce an incentive for students. Could you talk a little bit about that as well?

Andy: Yeah. So, on the student side, this was only in the spring semester. We rolled it out in the fall semester, where we had a pure control group and instructor incentives only. As we moved to the spring, we then cross randomized those two groups with student incentives. The students were incentivized with the following possibility. If they pass the exam, that is receive a 70% or higher, they would get a voucher for free tuition for a summer course. And this could be worth up to about $400 worth of tuition. So, now students are incentivized alongside the faculty. And we wanted to test whether 1. student incentives were effective and 2. if they made the instructor incentives even more effective.

Sally: Yes, we were interested in whether there’s complementarities between student incentives and instructor incentives. We knew from prior work that offering student incentives alone has, at best, modest effects. But, we thought that maybe if we put them in combination with instructor incentives, we could imagine the instructor saying to the students, “Look, guys, you guys have something at stake here too…” and it could create this positive cycle.

Rebecca: So can you tell us a little bit about the results?

Andy: That’s on page 22. [LAUGHTER] We found that the instructor incentives were really effective. They increased student outcomes by about 0.2 standard deviations on those exams. It’s a really nice effect in this literature. What’s also exciting is, suppose you don’t believe our tests or don’t like our tests, they also reduce course dropouts by 3.7 percentage points, which is about a 17% decline in the course dropout rate. They raised grades in the course by over a 10th of a standard deviation. And that’s even if you take out the effect of the exam itself, the course grades still go up by about a 10th of a standard deviation. And these positive results spill over into other courses. They complete other courses at higher rates, they accumulate more credits, and they even go on to transfer at higher rates. So, that’s in the faculty incentives or the instructor incentives branch of the study. When we look at the student incentives by themselves, we see essentially no effects on any key outcomes that we care about. When we look at them in combination, they actually don’t improve the impact of instructor incentives. If anything, we see a pretty small negative effect that wouldn’t be any significant difference at all. But, there simply doesn’t seem to be any impact of the student incentives. Now, this could be attributable to our specific student incentives. But, you’d have to believe essentially that they have either no value or very limited value to say that it’s just the fact that we’re incentivizing students in a very specific way.
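
[Editor’s sketch: Andy quotes the dropout result two ways, as a 3.7 percentage point reduction and as roughly a 17% decline. The two figures together imply a baseline dropout rate, which the short Python sketch below backs out. The function name is hypothetical, and because both inputs are rounded in conversation, the result is only approximate.]

```python
def implied_baseline_rate(point_drop, relative_decline):
    """Baseline rate (in percent) implied by an absolute drop in percentage
    points and the same drop expressed as a fraction of the baseline."""
    return point_drop / relative_decline


baseline = implied_baseline_rate(3.7, 0.17)   # roughly 22 percent
with_incentives = baseline - 3.7              # roughly 18 percent

print(round(baseline, 1), round(with_incentives, 1))
```

[A baseline in the low twenties is broadly in line with the remark elsewhere in this conversation that about a quarter of students fail to complete courses they have enrolled in.]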

John: When you first were talking about it, one of the things that struck me as… I think it was W.C. Fields who was talking about a contest where he said the first prize was a week in Philadelphia. Second prize was two weeks in Philadelphia. [LAUGHTER]

Sally: So, Andy and I are doing a separate study on summer school. And we do find that students do not want to attend school in the summer. But, interestingly, if we can get them to attend school in the summer, it has a really big impact on helping them graduate sooner. So, we’re really fascinated with understanding how we can address this aversion to summer school. But, that may be for another podcast. But, we agree that, I think that the incentive for students may not have been very motivating. I think just to return to the results about the instructor incentives, I think there’s some really interesting results there. First, something that’s unique to the college setting that you don’t find in the K-12 setting, is this really large problem with students enrolling in a course, paying for the course, and then not completing the course. So, about a quarter of students fail to complete courses that they’ve enrolled in and paid for. And this is a big struggle at community colleges. So, just increasing these rates of persistence in the course we think has a really large impact. And what it seems like is happening is instructor incentives get students to keep coming to their course, and so students go to their other classes as well. And so it has this really positive reinforcement effect on students completing all of their courses that they’re taking that semester. I think another really exciting result is that a year after our program ends, when we’ve stopped giving anybody incentives, you see these really large impacts on transfers to four-year schools… about a 20% increase in the rate of transferring into a four-year school, which we think is really exciting, which is the primary goal… as we talked about the primary goal of community college is to get these students to transfer to four-year schools. They really struggle with that. And so we see that this could have a really large impact.

John: And education is costly. And if we get more people finishing, the private and social returns, both go up significantly. And the cost of doing this is relatively low. It’s substantially less costly than the student intervention.

Sally: Yeah, it’s incredibly low, about $25 per student. One thing that’s interesting, again, about community colleges, because adjunct faculty are not paid very well, you can offer relatively cheap incentives that represent a significant bonus. So for these adjunct instructors, the average bonus represented a 20% increase on their baseline salary. Our adjuncts are making about $1,700 for a 16-week course. So, you can get a lot of bang for your buck with adjunct instructors, and we see the largest impact among adjunct instructors. Those are the instructors that really responded to the incentives. And adjunct instructors are increasingly becoming the model for schools, not just community colleges, but four-year schools as well. So, they represent about 50 to 80% of instructors at four-year and two-year schools, respectively. And that’s on the rise. So we expect that to increase in the future.

John: And that’s another topic we actually address in a podcast that was released on December 18.

Andy: So, I think the adjunct effect is also one that’s worth emphasizing, just because of the model of using adjunct faculty or increasingly using adjunct faculty is unsustainable at the current pay rates. So, if we think about these contracts as being more flexible as these adjunct instructors are more used to working on temporary contracts, if it turns out to be the case that you can’t continue to pay people such small amounts for so much work, then how do you design contracts in the future that can maximize student outcomes? So, if we’re in a world where we know we have to redesign these contracts, what we wanted to be able to do with this study is say, “This is a way you can redesign the contracts and achieve the outcomes that you hope to achieve.”

John: That works well, when the test is administered or designed externally. There would be some incentive issues, though, if the instructors had more control over the test or that assessment of how well their students did, I would think.

Andy: Yeah, absolutely. And that was at the front of our minds while we were designing the study was, “Are we not simply motivating people to either teach to the test, or to lie to us outright,” and based on the way the exams were designed, these are both objective and for the most part, externally graded. So, it’s still possible, for example, for a teacher to just erase answers and write in the correct answer if they wanted. But, there’s a certain point at which you have to start trusting your subjects, that they’re not attempting to deceive you. And so we kept that sort of in mind as we were thinking about how to design the study.

Rebecca: Did you have any feedback from faculty at the end of the study, when they discovered that your incentive worked, for example?

Andy: So, we have been in touch with our partner in the administration, we haven’t been in touch with the faculty themselves with our working paper or now the forthcoming paper. So, we hadn’t gotten feedback at that point. We did get feedback in the process of the study that is like at the end of the fall semester, and at the end of the spring semester, and just like the preferences for these contracts, the feedback was, of course, not universally positive. But for the most part, the majority of people appreciated the extra money. And I guess this is something that we haven’t emphasized yet, but we didn’t really change anyone’s contract, they were still operating under the existing contracts. And these served as a bonus on top of those contracts. So, there was very little room to think of these as sort of a really detrimental change toward your contract. Because the worst-case scenario is that you were under the exact same contract as you were previously.

John: If everybody failed, or if everybody came in below the threshold.

Andy: If literally zero percent of your students were able to pass this exam, you were in the same world you were previously.

Sally: We had high rates of sign up in the fall, and then even in the spring semester, there were people in the fall who hadn’t signed up that chose to sign up when they had a chance again, and all but one instructor continued the study from the fall to the spring. So, I think that instructors did like participating and we generally got positive feedback.

John: So, you got really strong results for the incentives for instructors with larger results for the lower-paid instructors… for adjuncts. Was there any evidence of the mechanism by which this affected student outcomes?

Andy: So, we look into mechanisms in two ways. One, we look at self reports of time use. And we really don’t see any significant differences between the treatment and control groups. So nothing that would clearly identify a change in behavior. Now we have one caveat to this, and that’s that when we put the time-use survey out, we limited each activity to 16 hours, not thinking how many of our instructors might spend more than 16 hours on a given activity. And that was made pretty obvious with the outside-work option. And so it is possible that we are top coded there and unable to differentiate between the two. And we also look at student evaluations, and we don’t see any significant differences between the way students evaluate instructors that were in the treatment versus the control group. So, we don’t really see a specific mechanism that’s driving these differences in student outcomes. And if we really wanted to try to isolate these things, we would need to maybe have some better or more objective data about instructor practices or a more fine-grained approach to looking at time use, I think.

John: That could be an interesting follow-up study.

Sally: Yeah, I think now that we’ve shown that these incentives work and can be very powerful, getting inside the black box of the mechanisms is our next step. And we’re currently working with an online university where everything instructors do and everything students do is passively recorded because they’re interacting online. And we think that will give us more fine-grained data. If you think about it… If I asked you last week, “How many hours did you spend on email? How many hours did you spend prepping your course?” It’s really hard to recall that without a lot of noise in there. And I think the other thing we discovered after presenting the results, talking to instructors, talking to administrators, talking to other people who work in this area, is that a lot of it might not be captured by time spent. Some of it might be… you learn the names of the students in your class… when you saw a student in your class who was on their phone, instead of letting them be on their phone, you said, “Please put your phone away, please close your laptop.” And so it might be much more subtle practices that we need to either observe classrooms or do focus groups or really get more qualitative data. And that’s something we’re really interested in doing.

John: Because it could be motivational, it could be that instructors who know that they’re going to get paid more might put a little more effort into those things that may not be captured by those measures. One hypothesis I was thinking is that it could also be that the existence of the incentives might perhaps encourage people to develop a growth mindset. And there’s a lot of evidence that faculty that have a growth mindset tend to have students that do better, or at least that have narrower performance gaps.

Sally: That would be really interesting, I think, for evaluating. We’re already surveying instructors at baseline and throughout and so we could see if the characteristics of the instructors change or their attitudes. We do ask them their attitudes about teaching and their view of students. For instance, questions like “most of my students’ achievement is determined by background” or “I’m able, with enough effort, to change how my students achieve.” And so we can look more closely at those questions. We use them mainly as baseline questions to characterize teachers about their attitudes. I don’t think we’ve looked to see whether their attitudes change. So that might be an interesting approach, we should take a look at those data.

Andy: One other mechanism that’s opened up by our incentives is that what we’re doing is essentially giving people a big influx of cash at the beginning of the semester. And so this could also just open up resources or capacity constraints that they had without these incentives. So for example, you could imagine someone who’s also working part time, who now gets a check at the beginning of the semester based on all of these potential student gains and doesn’t have to spend as much time working in their other job. Things like that could be potential mechanisms and could also explain why adjunct faculty have this really large differential effect. But again, we don’t have that hard data. And so it’s something that’s really interesting to us. But, unfortunately, not cleanly identified by our data.

Sally: One thing that we received is an unsolicited text message exchange between an instructor and their student, which I thought was interesting, because my students don’t have my cell phone number. But, things like that, giving out your number, exchanging text messages, the sort of individual support that I think, especially for community college students who may be less connected to campus, less connected to the community, could be really important. And so we want to think more about that sort of sense of connection to the community, to your instructor, to your fellow students.

Rebecca: I’m really excited to find out what your next round of studies reveals, because you have interesting directions that you can go in right now. And then really valuable information that you’ve already discovered.

Sally: Yeah, I think another interesting direction that we’re very interested in… is Andy’s talked about this model of being sustainable, especially as schools move more and more over to this adjunct model. So, another thing we want to understand is if a school offers these kinds of incentives, what kinds of people do you attract? Are you better able to retain your high quality instructors? Do you recruit higher quality instructors? So, that’s another question we’d really like to answer in future studies.

John: Because you’re offering higher pay to the faculty that are more effective, which could have an interesting self-selection effect on the faculty composition.

Sally: Exactly.

Andy: Yeah. And if anything, our results suggest that it takes a little bit of experience with these contracts to really appreciate them. So, moving to a model where you have these types of contracts, there might be a transition period where it was challenging before it became something that people understood as beneficial to themselves.

Rebecca: And not just to themselves, to the bigger educational community. Yeah.
So, we always wrap up by asking, what’s next?

Andy: I can talk about a project Sally and I are working on right now. As we talked about earlier, summer enrollment was seen as this potential mechanism to drive student success. And so we did a really simple experiment where we just randomly assigned people to receive a free summer course and then tracked their outcomes for the two years subsequent to that summer course. So, we’re wrapping up a working paper on that. And it looks like summer has this really nice long-term effect that would be kind of hidden in the short-term data because of the fact that you don’t see impacts on retention between spring and fall. But, you do see these impacts on credit accumulation in the short run and then graduation and transfers over these shorter windows as well.

Sally: So, I think as behavioral economists, something that Andy and I are really interested in is the intersection between preferences for contracts, preferences to attend in the summer, and the impact of those kinds of contracts on your future outcome. For example, we find that instructors don’t really like these loss contracts, but they perform really well under them. We find that students don’t really want a summer scholarship, but it has a really big impact on their future outcomes. And so trying to understand this intersection of your preferences for the here and now, and how these things may or may not translate into your future outcomes, is something that I think will be really interesting for future research.

John: This is a topic we keep coming back to in other contexts, that in terms of student metacognition, that the approaches that we know are most effective for learning are the things that students tend to value the least, and tend to perceive as being less important. So this is a pretty general problem, I think.

Andy: And isn’t there data showing how students give worse evaluations to teachers that cause greater amounts of learning?

John: There was that Harvard study a few months ago in a physics program there, where they found that students believed active learning to be less effective in terms of their learning. And yet the students who were exposed to active learning techniques ended up with larger learning gains. And that was also a randomized controlled trial.

Andy: Yeah.

Rebecca: People just don’t know what’s good for them.

Sally: But it’s hard, because I was trained at the University of Chicago. I am a behavioral economist, but I’m also a University of Chicago economist. And I believe in respecting people’s preferences and their choices. And so we have to be very careful about how to sort of take these complex and think about how to translate them into policy.

John: In terms of gentle nudges that work well.

Rebecca: Well thank you so much for joining us, it’s been really interesting.

John: It’s always better when there’s economists on.

Rebecca: I’m always outnumbered.

John: This has been fascinating, thank you.

Andy: Thank you.

Sally: Thank you so much for having us.

[MUSIC]

John: If you’ve enjoyed this podcast, please subscribe and leave a review on iTunes or your favorite podcast service. To continue the conversation, join us on our Tea for Teaching Facebook page.

Rebecca: You can find show notes, transcripts and other materials on teaforteaching.com. Music by Michael Gary Brewer.

[MUSIC]