Part 3 of a 3-Part Series on How to Make AEA, and Evaluation, Relevant in the Future: Evolution, Diversity and Change from the Middle

Common Introduction to all Three Parts

I have been thinking about what will happen to AEA, and to evaluation, in the future. I can conjure scenarios where AEA and evaluation thrive, and I can imagine scenarios where they wither. What I cannot envision is a future in which AEA and evaluation, as we know them now, stay the same. What I want to do is to start a conversation about preparing for the future. AEA is already active in efforts to envision its future: What will AEA be in 2020? My intent is to inject another perspective into that discussion.

What I’m about to say draws on some thinking I have been doing on two subjects: 1) AEA’s development in terms of evolutionary biology (see Ideological Diversity in Evaluation. We Don’t Have It, and We Do Need It, and Using an evolutionary biology view to connect the intellectual development of evaluation and the development of the evaluation community); and 2) the nature of diversity in complex systems. (If you have not read Scott Page’s Diversity and Complexity, I recommend it.)

Part 1: What do I mean by diversity?



Evaluation use by people opposed to the program

Recently I posted a message to Evaltalk, the American Evaluation Association’s listserv, about a story in the NY Times that dealt with the Administration’s plan to cut funding for teenage pregnancy prevention programs: Programs That Fight Teenage Pregnancy Are at Risk of Being Cut. That message led to an interesting back and forth, which in turn led me to write a response. I’m reproducing it here.

What I took from this has very little to do with teen pregnancy or the transferability of program effectiveness across contexts. For me it was a bit of a values crisis. I strongly oppose the Administration’s policy on these programs. But I also strongly favor …


Are Policy Makers, Program Designers and Managers Doing a Good Job if they Rely Too Much on Evaluation?

We like to complain about evaluation use.
People in my business (me included) like to lament the lack of attention that people pay to evaluation. If only we did a better job of identifying stakeholders. If only we could do a better job of engaging them. If only we understood their needs better. If only we had a different relationship with them. If only we presented our information in a different way. If only we chose the appropriate type of evaluation for the setting we were working in. If only we fit the multiple definitions of “evaluation use” to our setting. And so on and so forth. I’m in favor of asking these questions. I do it myself and I am convinced that asking them leads to more and better evaluation use.

Lately I have been thinking differently.
I’m writing this blog post for two reasons. One is that I want to begin a discussion in the evaluation community that may lead to more and better evaluation use. The second is that writing this post is giving me a chance to discern the logic underlying a behavior pattern that I seem to have fallen into. As far as I can tell, that logic has two roots: continuous process improvement, and complexity.


Some Musings on Evaluation Use in the Current Political Context

This post is my effort to consolidate and organize some back-and-forth I have been having about evaluation use. It was spurred by a piece on NPR about the Administration’s position on an after-school program. (Trump’s Budget Proposal Threatens Funding For Major After-School Program.) In large measure the piece dealt with whether the program was effective. Arguments abounded about stated and unstated goals, and about the messages contained in a variety of evaluations. Needless to say, the political inclinations of different stakeholders had a lot to do with which evaluations were cited. Below are the notions that popped into my head as a result of hearing the piece and talking to others about it.

Selective Use of Data
Different stakeholders glommed onto different evaluations to make their arguments.


Another post on joint optimization of uncorrelated program goals as a way to minimize unintended negative consequences

Recently I have been pushing the notion that one reason why programs have unintended consequences, and why those consequences tend to be undesirable, is that programs attempt to maximize outcomes that are highly correlated, to the detriment of multiple other benchmarks that recipients of program services need to meet in order to thrive. Details of what I have been thinking are at:

Blog posts
Joint Optimization of Uncorrelated Outcomes as a Method for Minimizing Undesirable Consequences of Program Action

A simple recipe for improving the odds of sustainability: A systems perspective

Article
From Firefighting to Systematic Action: Toward A Research Agenda for Better Evaluation of Unintended Consequences

Despite all this writing, I had not been able to come up with a graphic to illustrate what I have in mind. I think I finally might have. The top of the picture illustrates the various benchmarks that the blue thing in the center needs to meet in order to thrive. (The “thing” is what the program is trying to help – people, school systems, county governments, whatever.)

[Figure: joint_optimization – top: the benchmarks (A–F) the blue thing must meet before the program; bottom: the same system after the program improves B]

The picture on the top depicts the situation before the program is implemented. There is an assumption made (an implicit one, of course) that A, C, D, E, and F can be left alone, but that the blue thing would be better off if B improved. The program is implemented. It succeeds. The blue thing gets a lot better with respect to B. (Bottom of picture.)

The problem is that getting B to improve distorts the resources and processes needed to maintain all the other benchmarks. The blue thing can’t let that happen, so it acts in odd ways to maintain its “health”. Either it works in untested and uncertain ways to maintain those benchmarks (hence the squiggly lines), or it fails to meet them, or both. Programs have unintended consequences because they force the blue thing into this awkward and dysfunctional position.

What I’d like to see is programs that pursue the joint optimization of at least somewhat uncorrelated outcomes. I don’t think it has to be more than one other outcome, but even that would help a lot. My belief is that doing so would minimize the distortion in the system, and thus minimize unintended negative outcomes.
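To make the resource-distortion logic concrete, here is a toy sketch of my own (not a model of any real program; the six benchmarks, the 0.4 program share, and the linear shortfall measure are all illustrative assumptions). It compares how far the untargeted benchmarks fall below their maintenance levels when a program pushes B alone versus splitting the same push between B and D:

```python
# Toy illustration only: six benchmarks share a fixed pool of "effort".
# Aiming all program effort at one benchmark starves the rest more than
# splitting the same effort across two benchmarks.

BENCHMARKS = ["A", "B", "C", "D", "E", "F"]
BASELINE = 1.0 / len(BENCHMARKS)   # effort each benchmark gets pre-program

def allocate(targets, program_share=0.4):
    """Give `program_share` of total effort to the targeted benchmark(s),
    spreading the remainder evenly over the untargeted ones."""
    rest = [b for b in BENCHMARKS if b not in targets]
    effort = {b: program_share / len(targets) for b in targets}
    effort.update({b: (1.0 - program_share) / len(rest) for b in rest})
    return effort

def distortion(effort):
    """Total shortfall below baseline: how far the system is pushed away
    from the maintenance levels it needs in order to thrive."""
    return sum(max(0.0, BASELINE - e) for e in effort.values())

print("push B alone:", round(distortion(allocate(["B"])), 3))       # 0.233
print("push B and D:", round(distortion(allocate(["B", "D"])), 3))  # 0.067
```

In this toy, the joint push leaves the other benchmarks much closer to their maintenance levels, which is the distortion-minimizing effect I am after. It captures only the resource side of the argument, not the correlation between outcomes.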

Ideological diversity in Evaluation. We don’t have it, and we do need it

I’m about to make a case that the field of Evaluation would benefit from theoreticians and practitioners who are more diverse than they are now with respect to beliefs about what constitutes the social good, and how to get there. Making this argument is not easy for me because it means putting head over heart. But I’ll do my best because I think it does matter for the future of Evaluation.

Examples from the Social Sciences
Think of the social sciences – Economics, Sociology, Political Science.

One does not have to have left-wing inclinations to appreciate Marxian critiques of society and the relationships among classes. That understanding can inform anyone’s view of the world, whether or not one thinks that, overall, Capitalism is a good organizing principle for society. On the other end of the spectrum, even a dyed-in-the-wool lefty would (should?) appreciate that self-interest and the profit motive are useful concepts for understanding why society works as it does, and that Capitalism does (might?) produce some social good despite its faults. Would the contribution of the field of Economics be as rich as it is if one of those perspectives did not exist?

Or to take an example from Sociology. Functionalists like Talcott Parsons and Robert Merton lean toward the notion that social change can lead to dysfunction. The existence of theory like that can shape (support? further?) go-slow views about the pace of social change. Or think of the conflict theories of people like Max Weber and C. Wright Mills. Those views support the idea that conflict and inequality are inherent in Capitalism. That’s the kind of theory that could support or shape a rather different view about the need for social change.

So what we have is a diversity of theory that is, in some combination, based on and facilitative of different views of how society should operate. I think the disciplines of Economics and Sociology are better off because of that diversity. More important, we are all better off for having access to these different perspectives as we try to figure out how to do the right thing, or even what the right thing is.

Evaluation
I am convinced that over the long run, if Evaluation is going to make a contribution to society, it has to encompass the kind of diversity I’m giving examples of above. Why?

One reason is that stakeholders and interested parties have different beliefs about programs – their very existence, choices of which ones to implement, their makeup, and their desired outcomes. How can Evaluation serve the needs of that diversity if there is too much uniformity in our ranks? Also, what kind of credibility do we have if the world at large comes to see our professional associations and evaluations as supportive of only one perspective on the social good and the role of government?

The argument above deals with the design of evaluations and the collection and interpretation of data. But the importance of diversity extends to Evaluation theory as well.

Explaining the value of diversity in Evaluation theory is harder for me because I don’t have a good idea of how it might play out, but I’ll try. It seems to me that right now, all existing Evaluation theory carries the implicit belief that change is a good thing. Change may not work out as we wish because programs may be weak or have unintended consequences. But fundamentally, change is good and the reason to evaluate is to make the change better. Well, what would Evaluation look like if we had evaluation theory that drew from the Functionalist school of Sociology, which takes such a jaundiced view of social change? I have no idea, and emotionally, I’m not sure I want to know because personally I am in favor of intervention in the service of the social good. But on an intellectual level, I know that evaluation based on a conservative (small “c”) view of change would end up producing some very worthwhile insight that I am sure would not come from our present theory.

Moving from Blather to Action
There are numerous impediments to working toward ideological diversity. Mostly, I am convinced that almost everyone in our field has politics that are not too much different from mine. We go into the evaluation business because we think that government is good and we want to make it better. That self-selection bias makes us a pretty homogeneous group that forms into associations that do not throw out the welcome mat for divergent opinion. Maybe the best we can do is make it known that ideological dimensions of diversity are welcome. That itself is not so easy because what does “dimension of diversity” even mean? Still, I think it’s worth a shot.

Invitation to a Conversation Between Program Funders and Program Evaluators: Complex Behavior in Program Design and Evaluation

Effective programs and useful evaluations require much more appreciation of complex behavior than is currently the case. This state of affairs must change. Evaluation methodology is not the critical inhibitor of that change. Program design is. Our purpose is to begin a dialog between program funders and evaluators to address this problem.

Current Practice: Common Sense Approach to Program Design and Evaluation
There is sense to successful program design, but that sense is not common sense. And therein lies a problem for program designers and, by extension, for the evaluators who are paid to evaluate the programs envisioned by their customers.

What is common sense?  
“Common sense is a basic ability to perceive, understand, and judge things, which is shared by (“common to”) nearly all people and can reasonably be expected of nearly all people without need for debate.”

What is the common sense of program design?
The common sense of program design is usually expressed in one of two forms. One form is a set of columns with familiar labels such as “input”, “throughput”, and “output”. The second is a set of shapes that are connected with 1:1, 1:many, many:1, and many:many relationships. These relationships may be cast in elaborate forms, as for example a system dynamics model complete with buffers and feedback loops, or a tangle of participatory impact pathways.

But no matter what the specific form, the elements of these models, and hypothesized relationships among them, are based on our intuitive understandings of “cause and effect”, mechanistic views of how programs work. They also assume that the major operative elements of a program can be identified.
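To make the second form concrete, here is a minimal sketch (the element names are invented for illustration, not taken from any real program) of shapes as nodes and hypothesized cause-and-effect links as edges, with 1:1, 1:many, and many:1 relationships all representable:

```python
# A minimal sketch of the "shapes and relationships" form of a program model.
# Keys are program elements; values are the elements they are hypothesized
# to cause. All names here are invented for illustration.
model = {
    "training":          ["teacher knowledge"],                    # 1:1
    "coaching":          ["teacher knowledge"],                    # many:1 (with training)
    "teacher knowledge": ["student scores", "teacher retention"],  # 1:many
}

for cause, effects in model.items():
    for effect in effects:
        print(f"{cause} -> {effect}")
```

Notice that the structure itself encodes the mechanistic assumption: every edge asserts “if this happens, it will make that happen.”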

To be sure, program designers are aware that their models are simplifications of reality, that models can never be fully specified, and that uncertainties cannot be fully accounted for. Still, inspection of the program models that are produced makes it clear that the thinking behind them is predominantly in the cause-and-effect, mechanistic mode. We think about the situation and say to ourselves: “If this happens, it will make (or has made) that happen.” Because the models are like that, so too are the evaluations.

Our common sense conceptualization of programs is based on deep knowledge about the problems being addressed and the methods available to address those problems. Common sense does not mean ignorance or naiveté. It does, however, mean that common sense logic is at play. There is no shame in approaching problems in this manner. We all do it. We are all human.

Including Complex Behavior in Program Design and Evaluation
When it comes to the very small, the very large, or the very fast, 20th Century science has succeeded in getting us to accept that the world is not commonsensical. But we have trouble accepting a non-common-sense view of the world at the scale experienced by human beings. Specifically, we do not think in terms of the dynamics of complex behavior. Complex behavior has much to say about why change happens, about patterns of change, and about program theory. We do not routinely consider these behaviors when we design programs and their evaluations.

There is nothing intuitively obvious about complex behavior. Much of it is not very psychologically satisfying. Some of it has uncomfortable implications for people who must commit resources and bear responsibility for those commitments. Still, program designers must appreciate complex behavior if they are ever going to design effective programs and commission meaningful evaluations of those programs.
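As one standard illustration from complexity science (not an example from this post itself), the logistic map shows how a rule simple enough to fit on one line can behave in ways no common-sense extrapolation would predict:

```python
# The logistic map x -> r * x * (1 - x): one deterministic line of arithmetic.
# At r = 2.8 it settles to a steady value. At r = 4.0 it never settles, and
# two starting points differing by one part in a million end up far apart.
def trajectory(r, x0, steps=50):
    x = x0
    for _ in range(steps):
        x = r * x * (1 - x)
    return x

print("r=2.8 from 0.2:      ", round(trajectory(2.8, 0.2), 4))       # settles near 0.6429
print("r=4.0 from 0.200000: ", round(trajectory(4.0, 0.200000), 4))
print("r=4.0 from 0.200001: ", round(trajectory(4.0, 0.200001), 4))  # a very different value
```

Sensitivity of that kind is part of the point: it is real, it is simple to state, and it is deeply unsatisfying to anyone who must commit resources against a prediction.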

Pursuing Change
There is already momentum in the field of evaluation to apply complexity. Our critique of that effort is that current discussions of complexity do not tap the richness of what complexity science has discovered, and also that some of the conversation reflects an incorrect understanding of complexity. The purpose of this panel is to bring a more thorough, more research-based understanding of complexity into the conversation.

By “conversation” we mean dialogue between program designers and evaluators with respect to the role that complexity can play in a program’s operations, outcomes, and impacts. This conversation matters because, as we said at the outset, the inhibiting factor is whether designers recognize that complex behavior may be at play in the workings of programs. Methodology is not the problem. Except for a few exotic situations, the familiar tools of evaluation will more than suffice. The question is what program behavior evaluators have license to consider.

Our goal is to pursue a long-term effort to facilitate the necessary discourse. Our strategy is to generate a series of conferences, informal conversations, and empirical tests that will lead to a critical mass of program funders and evaluators who can bring about a long-term change in the rigor with which complexity is applied to program design and evaluation.