Measuring to Improve vs. Improving Measurement

This blog post originally appeared on the Stanford Social Innovation Review website.

Measurement was once again a hot topic at this year’s Skoll Forum; with seven measurement-related sessions over three days, it eclipsed other perennially popular topics like funding and innovation. And yet there was a marked difference in the discourse this year, with many speakers and attendees questioning whether social sector organizations are thinking too narrowly about the whole paradigm of measurement. Put another way, there seemed to be a real tension over whether the greatest bang for the buck in measurement will come from organizations measuring for their own improvement, or from the social sector improving on the measurement tools and techniques available to organizations in the first place.

A session I co-led, “Measuring to Improve (and Not Just to Prove),” fell decidedly in the first camp. With most social sector organizations under-resourcing and under-prioritizing measurement, the session argued that organizations get the best return when they: a) collect a small number of easily verifiable measures linked to their theories of change, b) do this regularly at every level, and c) couple data collection with analysis, learning, and action. The session used One Acre Fund, an NGO that boosts incomes of smallholder farmers in East Africa (and where I’m the founding board chair), as an example. At the lowest level, field officers, who work directly with farmers, collect and work in groups to analyze data each week on farmer repayments, farmer attendance at trainings, and farmer adoption of One Acre Fund techniques. Middle managers are trained to look at aggregate data around these measures and quickly take action to fix anomalies. And at the highest level, leadership focuses on simple organizational measures, such as average increase in farmer income and farm income generated per donor dollar invested, rather than every possible outcome.

Other Skoll sessions and content drove home a similar view. Caroline Fiennes, director of Giving Evidence, talked about the “operational uselessness” of collecting impact data solely on your organization’s current model, without comparison to other approaches you or others are utilizing or testing that might deliver better results, lower costs, or both. Ehren Reed, Skoll Foundation’s research and evaluation officer, argued that the most successful social entrepreneurs are constantly tweaking their business models by scanning their environments and internalizing the implications for their strategies. One social enterprise leader perhaps put it best when she noted, “We decided that if we couldn’t name a meaningful action we would take as a result, we would stop collecting the data.”

On the other hand, several Skoll sessions were devoted to new measurement tools and techniques that could theoretically propel a giant leap forward in the social sector’s use of data. Big data, for one, arose time and again, with proponents arguing for its ability to turbo-charge social sector impact in much the same way that it has turbo-charged profits for the Facebooks and Amazon.coms of the corporate world. While presenters shared several promising examples, including Global Giving’s Storytelling Tools and Benetech’s new Bookshare Web Reader, there seemed to be a dangerous extrapolation from these examples to a prevailing belief that “big data” would plug the “big gap” in social impact potential across the sector.

Similarly, funding vehicles such as impact investing and social impact bonds were highlighted extensively as new tools meant to accelerate impact in the social sector. And yet the data suggests that both are struggling to gain traction, given the small number of interventions that can absorb these funding types.

At the end of the day, the usefulness of any measurement tool depends on whether it is the best at addressing a high-priority question that a decision-maker at any level of an organization is seeking to answer. Most social sector organizations are still struggling to answer basic questions about their program models: Do a high proportion of the clients they reach meet the organization’s own selection criteria? Do clients that participate more realize higher levels of outcomes? Does the organization’s model produce greater impact per dollar than the other models available for their target clients? These basic questions require basic measurement tools, coupled with a much greater leadership commitment to—and a culture that embraces—data-driven decision-making. For this reason, newer tools like big data may be more of a big distraction than a big opportunity for the typical social sector organization.

What is your experience applying these new kinds of measurement tools and approaches to your organization? Has it worked, and if so, why?
 
Matthew Forti is director, One Acre Fund USA, where he coordinates all US functions and oversees performance measurement for One Acre Fund, a nonprofit that assists over 135,000 smallholder farming families in East Africa to double their farm profits and eliminate persistent hunger. Matt is also advisor to the Performance Measurement Capability Area at The Bridgespan Group (@BridgespanGroup), an advisory firm to mission-driven leaders and organizations that supports the design of performance measurement systems for continuous learning and improvement.

Ten Years of Performance Measurement

This is part of a series of reflective posts by regular Stanford Social Innovation Review (SSIR) bloggers in honor of SSIR’s 10th anniversary.
 
In my December 2012 entry recapping the year in performance measurement, I shared two studies that suggest nonprofits and funders continue to struggle with the kind of measurement that drives continuous improvement. But viewed over a longer time period, I’d argue that the social sector has made some pretty impressive gains in this area.

So in honor of SSIR’s anniversary, let’s raise a glass to celebrate five improvements in performance measurement during the last 10 years:

  1. From overhead to outcomes. Prevailing wisdom used to be that the best way to judge the efficiency of nonprofits was by looking at the proportion of their budgets that they spent on overhead. Today, websites such as GiveWell and Coalition for Evidence-Based Policy are rating nonprofits on the quality of their outcomes. Even Charity Navigator has dramatically changed course—“results reporting” is now a pillar of its rating system.
  2. From ideology to evidence, particularly in global development. In the early 2000s, fierce ideological debates about how to bring developing countries out of poverty were the norm. Then a new breed of economists at the Poverty Action Lab and its sister networks proposed letting the evidence decide. Though the meteoric growth in the use of randomized controlled trials is controversial, it has helped the largest global development funders make more decisions based on evidence about what works, a shift we are also starting to see in some domains, such as education, in the United States.
  3. From isolated to shared measurement. The past decade has seen renewed interest in partnership and collaboration across agencies and sectors to achieve common goals. Especially exciting is that many of these local collaborations, as well as national networks, are now setting common indicators, reporting through a common data system, and analyzing and learning from data together.
  4. From straightforward to complex interventions. Innovation in measurement used to focus mainly on direct-service interventions with fairly linear logic models (for example, how to rigorously measure outcomes of a summer literacy program for children). But lately there has been an explosion of innovative approaches (such as developmental evaluation, outcome mapping, and policymaker ratings) for measuring what’s happening in more dynamic environments, such as advocacy work, systems change, or neighborhood revitalization. These and other approaches are allowing organizations and initiatives with complex interventions to learn more from their measurement, undergirded by more sophisticated theories of change.
  5. From external evaluation to performance management. In the past, as described by the 2006 SSIR article “Drowning in Data,” social sector measurement was dominated by funder requests for endless lists of metrics or proof of program impact from external evaluations. While many nonprofits will tell you these requests haven’t necessarily slowed, there has been a more concerted effort by evaluation firms and other measurement providers to develop techniques and tools that support performance management. Examples include the adaptation of the balanced scorecard to the nonprofit sector, the proliferation of performance management data systems tailored to nonprofits, and the launch of the PerformWell website.

And now for you “glass half empty” types, five areas where we need to make much more progress. We need:

  1. Greater focus on long-term outcomes. Nonprofits still get by with measuring only what’s easiest to measure: short-term outcomes such as school attendance or job placement rates. But if nonprofits want to change lives, they (and their funders) should also care about whether their programs are helping people achieve more meaningful goals, such as completing college or making sustained economic gains. If more nonprofits asked the question of whether the people they serve actually end up in a better situation over the long haul, there would perhaps be a greater focus on holistic interventions or collaborations that create solid pathways to those more meaningful outcomes.
  2. Fewer organizations getting big on the basis of limited evidence. A recent study by Veris Consulting found that only 39 percent of nonprofits that are scaling their programs have evaluated the impact of their work in any way; fewer than a fifth have had a third-party outcome or impact evaluation. While having multiple sites can help an organization test and improve its programs, more extensive growth should require rigorous evaluation.
  3. Greater recognition of organizational and contextual factors that drive strong performance. In the rare instances when models are proven highly effective, funders understandably get excited about scaling those models. But most evaluation studies devote little, if any, attention to underlying organizational factors (such as culture and leader characteristics) and contextual factors (such as regulatory climate and the presence of high-capacity partners) that play a role in success. Without insight into those factors, organizations or funders often require that replicators follow the original model with full fidelity; yet this precludes important adaptations and improvements that could increase the odds of success.
  4. A bigger role for an organization’s constituents in performance measurement. Nonprofits have a hard enough time implementing a measurement system that works for senior leaders and program staff; they rarely tackle the question of how measurement can work for the individuals, families, and communities they seek to benefit. But nonprofits often do a real disservice to these constituents—and their own success—when they fail to involve their constituents in reflecting on results, setting goals, and deciding how best to achieve those goals.
  5. As much attention given to “performance audits” as financial audits. Though exact requirements vary by state, most nonprofits with revenues greater than $500,000 are required to obtain an annual independent financial audit. Yet only a handful ever get an outside assessment of their self-reported program performance data. Over time, the social sector would benefit from independent social impact analysts to help clarify for funders and nonprofits the extent to which program data is accurate and representative.

As you reflect on the past decade, what other gains or gaps in social sector measurement would you put on the list?

How Leading Philanthropists Fail Well

(This blog post originally appeared on the Stanford Social Innovation Review website.)

By Matt Forti and Matt Plummer

The philanthropic sector seems to be changing its tune about failure. While some, like former Hewlett Foundation President Paul Brest, have been encouraging philanthropists to talk about their failures (of grants, initiatives, or entire strategies) for years, only more recently has the sector more widely adopted the view that failure can be something positive, an indicator of a willingness to take risks, experiment, and adapt. A number of recent initiatives demonstrate this new outlook: the Case Foundation’s Be Fearless campaign, the Institute of Brilliant Failures Award for Best Learning Moment in international development, the Admitting Failure online community, and the FailFaire conferences. All of these have launched in just the last three years.

While failure can be an incredibly valuable learning tool, research from the private sector suggests that most organizations don’t take a systematic approach to experimentation, and therefore don’t reap the benefits of failure. In 2011, Bridgespan began a series of blogs based on a decade of close client work with philanthropists called “Does Your Philanthropy Have an Adaptive Strategy?” These blogs chronicled an emerging redefinition of strategy from a static to a more flexible view of what constitutes success, and a greater willingness to prototype ideas, learn from mistakes, and adapt in light of new information and opportunities. A video series of candid conversations with more than 60 philanthropists, recently released by Bridgespan, echoes this approach and provides five insights into how to diagnose, learn from, and improve after failures.

  1. Start with a clear definition of success. In the videos, Paul Brest notes: “You can’t know whether you’re succeeding or failing unless you’re pretty clear about what outcomes you’re achieving.” Philanthropists can be especially challenged in being clear about outcomes, since they must typically consider outcomes at multiple levels: what their grantees are achieving for the people they serve, how the capacity of the grantee organizations themselves may be increasing, and whether the philanthropist and grantees are collectively achieving a broader set of outcomes for populations or systems. In another short video, Michael Steinhardt shares a great example of how the initiative he co-founded, Taglit-Birthright Israel, defined success and failure upfront so that he could know if his investment was actually making a difference.

    Proven results: Michael Steinhardt gives Jewish kids the experience of a lifetime.
     
  2. Measure along the way to learn and adapt. Since even the best strategies are based on an imperfect understanding of future conditions, plans and initiatives need to be regularly evaluated against new information. When the Robert Wood Johnson Foundation (RWJF) first got involved in end-of-life care, it funded a study to test whether an intervention it planned to support would result in the outcomes it desired. According to RWJF President Risa Lavizzo-Mourey, the study revealed that, “What we thought was going to happen absolutely didn’t happen,” and this allowed RWJF to change course and help advance the movement that changed the way physicians deal with death and dying. Performance measurement systems that behave more like instant feedback mechanisms than long-term evaluation studies alert philanthropies when a strategy is not working as planned, and provide the input to reflect and adapt when necessary.

    Risa Lavizzo-Mourey on how early measurement redirected RWJF's strategy.
     
  3. Resist seeing results as black or white. With increasing efforts to publicize failures, there is a growing pressure to label initiatives and grants as either a success or a failure. However, as President of the Silicon Valley Community Foundation Emmett Carson explains, “The reality is, very few evaluations, under the best of circumstances, are unambiguous ... There’s always some failure and there’s always some success.” Rather than simply plowing ahead with an initiative, or abandoning it, identifying which parts were successes and which were failures can help philanthropists move forward more effectively.

    Emmett Carson says evaluation results are not black or white.
     
  4. Create space for good failures. Just as initiatives can combine success and failure, failures can be good, bad, or somewhere in between. Many “bad failures” happen because of avoidable errors. Good failures are often the result of taking risks that could lead to transformative change. Inevitably such risks increase the chances of failure, but potentially also the chances of breakthrough success. Pierre Omidyar creates space for these types of failure by empowering each of his teams at the Omidyar Network to spend 5-10 percent of their budgets on “things that aren’t very clear that they’ll have impact.”

    Take smart risks: The Omidyars' belief in innovation means every dollar may not have impact.
     
  5. Talk about failure. When Paul Brest introduced The Worst Grant Contest at the Hewlett Foundation, in which staff nominate and discuss their worst grant of the year, he initially met with great resistance from some of the program staff. But over time, the contest has taken root. The program responsible for the “winning failure” gets a dinner. But the real motivator for staff, says Brest, is “the intrinsic motivation of being able to learn something and help the rest of the foundation learn something.” After a while, Brest and his colleagues realized that there was too much focus on grantee organizations that had failed, rather than on potentially broader strategy failures by the foundation. The emphasis has now shifted to how the foundation itself has failed and what it can learn from this failure. Encouraging open and purposeful conversation about failure is one of the best ways that a philanthropic—or any other—organization can get better at what it does and achieve more impact in the world.

    Paul Brest incentivizes discussions of failures.
     

Matthew Plummer is a Senior Associate Consultant in Bridgespan’s Boston office, playing an integral role in assembling Bridgespan’s video collection “Conversations with Remarkable Givers.” Prior to joining Bridgespan, Matthew worked as an operations manager at McMaster-Carr Supply Company.

What Obama’s Campaign Can Teach Nonprofits about Measurement

This blog originally appeared on the Stanford Social Innovation Review website. It is co-authored by Bridgespan Associate Consultant Colin Murphy.

As President Obama was recently inaugurated for his second term, it is worth asking what made his campaign succeed in the face of such strong economic and political headwinds. Nearly every analysis we’ve read suggests that the use of data and analytics was a key factor.

Nonprofits can learn a lot from the way the Obama campaign approached performance measurement. For although the campaign’s resources dwarfed those of the typical nonprofit, the measurement practices it followed mirror those of high-performing organizations.

  1. Focus on cost per outcome. Dan Wagner, the campaign’s chief analytics officer and the man credited with much of the success of Obama’s data team, considered his scope “the study and practice of resource optimization for the purpose of...earning votes more efficiently.” With this mandate, the campaign’s advertising team bought ads on programs that offered the greatest number of persuadable voters per dollar, instead of simply trying to reach the biggest audience. This practice led to unorthodox ad buys in smaller markets that diverged from the strategy of the Romney campaign.

    High-performing nonprofits have a similarly relentless focus on improving their productivity, defined as cost to achieve their primary outcome. For instance, Jumpstart, an early education nonprofit, defines its success as cost per child to achieve proven gains in school readiness. By standardizing best practices, investing in good overhead, and using measurement to learn and adjust, Jumpstart and others achieve sustained improvement in the one measure that best captures what they are aiming to achieve.
  2. Tap into the best available evidence and expertise when designing programs. When Obama volunteers in swing states knocked on doors, they read from a script that asked potential voters either to describe their plan to get to the polls or to sign a small voter commitment card with a picture of Obama. Both techniques were drawn from social science research about what actually gets people to take action. In fact, the campaign solicited advice from a team of behavioral scientists, including Professor Richard Thaler at the University of Chicago, co-author of the much-discussed 2008 book Nudge.

    High-performing nonprofits are also constantly scouring the research, keeping in contact with evaluators and other experts, and ensuring that their practices and programs integrate the best knowledge from the field—all of which can help improve the quality of their work.
  3. Segment and target. According to one account, the campaign learned some important lessons from looking closely at its data. Its data system could assemble individual profiles of voters and donors, allowing for an unprecedented level of “micro-targeting.” For instance, the campaign found that George Clooney had a strong influence among 40- to 49-year-old women, the demographic group most likely to hand over cash. It therefore offered a chance to dine in Hollywood with Clooney and Obama, raising huge sums of money. It then replicated the event on the East Coast with Sarah Jessica Parker, an East Coast celebrity with similar appeal to this demographic of women.

    Nonprofits shouldn’t just measure outcomes. They also need to measure inputs and outputs, such as demographic information on their constituents. High-performing nonprofits go further by analyzing the relationships among these inputs, outputs, and outcomes—a practice often overlooked in the end-of-year reporting rush. Thoughtful analysis and segmentation can allow leaders to see which types of interventions work best for which groups of beneficiaries, and ultimately to make data-driven decisions that can improve their impact.
  4. Invest in a cross-functional data system. Before the Obama campaign even got underway, the Democratic National Committee invested in a data system that connected its voter database to the Obama campaign’s. By doing so, it learned who had volunteered, made a donation, and visited the campaign website—data that informed the kinds of segmenting and targeting activities described above.

    Nonprofits should make use of all the data at their fingertips to manage and improve their programs. When its performance management data system can integrate program data with data from government surveys, volunteers, peers, and the like, a nonprofit can achieve a much more nuanced understanding of how to reach constituents and create impact.
  5. Make measurement a priority. Obama’s internal data science team was reported to be more than 10 times larger than that of Romney, who outsourced some of his analysis to less-responsive consulting firms. After painful losses for Democrats in the 2010 midterm elections, the campaign believed a stronger investment in data science would be critical; it made the difficult decision to invest more resources here and less elsewhere.

    Most nonprofits see measurement as a discretionary investment that can be delayed or eliminated in tough times. But many of today’s most effective nonprofits became high-performing in part by making the tough decision to invest in data systems, measurement staff, and evaluation, even when it might mean having less available for current services.
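
For readers who want to see the arithmetic behind the first and third practices, here is a minimal Python sketch of computing cost per outcome overall and by segment. Everything in it is hypothetical: the records, the segment labels, and the function names are made up for illustration and are not drawn from the campaign or from any nonprofit mentioned here.

```python
from collections import defaultdict

# Hypothetical program records, invented for illustration:
# (participant segment, dollars spent, 1 if the primary outcome was achieved else 0)
records = [
    ("segment A", 400.0, 1),
    ("segment A", 400.0, 0),
    ("segment B", 500.0, 1),
    ("segment B", 500.0, 1),
]


def cost_per_outcome(rows):
    """Total dollars spent divided by the number of outcomes achieved."""
    total_cost = sum(cost for _, cost, _ in rows)
    outcomes = sum(achieved for _, _, achieved in rows)
    # With no outcomes achieved, the ratio is undefined, so return None.
    return total_cost / outcomes if outcomes else None


def cost_per_outcome_by_segment(rows):
    """Group the records by segment, then compute cost per outcome for each group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[0]].append(row)
    return {segment: cost_per_outcome(group) for segment, group in groups.items()}


overall = cost_per_outcome(records)
by_segment = cost_per_outcome_by_segment(records)
print(overall)
print(by_segment)
```

In this made-up data the overall figure hides a gap between the two segments, which is the kind of pattern that segmenting is meant to surface. A real analysis would also need enough records per segment for the comparison to mean anything.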

By following these measurement practices, the Obama campaign focused its resources on the most effective interventions, made smart resource allocation decisions, and adjusted rapidly as the context changed. One telling example of the latter: Late in the campaign, Obama made a highly successful appearance on the social networking website Reddit, which many of the President’s senior aides had never heard of, because the data team had determined that its users represented key turnout targets.

The Obama campaign took what author Sasha Issenberg, who closely observed the campaign’s data strategy, called “a decisive break with 20th-century tools for tracking public opinion.” What do you believe it will take for nonprofits to follow a similar course in their measurement approaches?

Social Sector Measurement: Down but Not Out

This blog originally appeared on the Stanford Social Innovation Review website.

For those pining to see a revolution in how social sector organizations view and use measurement, two reports were launched towards the end of last year with good insight into where we are and how far we still have to travel. To sum up: Buckle your seatbelts—it’s going to be a long ride!

Innovation Network’s “State of Evaluation 2012” study (October 2012) draws from a representative sample of US nonprofits and provides a bevy of statistics and insight, including comparisons to results from a similar 2010 survey. The good news is that 90 percent of nonprofits report measuring their work, and while funders and boards remain the primary audiences for measurement, more than three quarters also report using their measurement to plan and revise strategies and programs.

That said, the proportion of nonprofits that have at least one full-time employee devoted to measurement remains frighteningly low (18 percent), as does the proportion of them that are spending at least 5 percent of their budget on measurement (27 percent)—figures that have increased only a few, potentially statistically insignificant percentage points since the 2010 survey. Further, the proportion of nonprofits that have revised their theories of change or logic models within the past year has actually declined, with fewer than half of nonprofits using measurement to adjust their programs at least annually. Finally, nonprofits continue to rate research and evaluation as their two lowest organizational priorities.

Not surprisingly, the study paints funders as one of the biggest culprits, both in how they support grantees in using measurement and how they use it themselves. (One particularly telling datapoint: Funders are nearly half as likely as nonprofits to use measurement to plan and revise their programs.)

Taken together, these and other findings suggest a lot of measurement is happening, but it’s unlikely to be of the quality required for organizations to truly learn what’s working and continuously improve their programs. With fewer than one-third of nonprofits demonstrating promising capacities and behaviors to meaningfully engage in measurement, the authors conclude that the state of evaluation in the social sector can best be judged only as “fair.”

The disappointing role of funders is echoed in the Center for Effective Philanthropy’s “Room for Improvement” (September 2012). The headlines from this study: only 32 percent of nonprofits believe foundation funders have been helpful to their ability to measure progress, and only 29 percent have foundation funders that provide financial and/or non-monetary support for measurement efforts. And given that foundations place relatively high importance on nonprofits’ evidence bases (according to recent Bridgespan research), the figures above likely overstate the support nonprofits are receiving from their typical funders.

Both reports conclude that funders can play a pivotal role by changing their approaches to working with and supporting grantees in measurement. But if 2012 taught us anything, the greater promise may lie in funders supporting initiatives that get nonprofits learning from one another. For instance, Mario Morino’s Leap of Reason, which recently surpassed the threshold of 50,000 books in circulation, is chock-full of practitioner essays and examples that provide rich, often first-person accounts of how measurement has changed the trajectories of nonprofit organizations. PerformWell, which launched earlier this year, is attracting hundreds of nonprofit leaders to its free webinars to hear how their peers are establishing performance cultures and regularly using data to improve. And the “Saving Philanthropy” documentary, with its compelling footage of leaders and staff of Nurse-Family Partnership, Roca, and others describing how they use performance measurement, has met with great acclaim from those who have participated in screenings and workshops.

We believe the biggest returns on investment in educating and inspiring nonprofit leaders around performance measurement will tend to accrue at the earliest stages of a nonprofit’s development, when leaders are still forming their operating philosophies and cultures. To see greater improvement by the time “State of Evaluation 2014” is released, we’ll need many more initiatives like the ones described above—initiatives that help nonprofit leaders (particularly of young organizations) learn directly from their peers about how measurement can first and foremost be a tool to improve outcomes for those they serve and better advance their missions.

What are your views on the future of social sector measurement? What other exciting initiatives have you seen?