- Although considerable research exists on digital curation costs, stakeholders have adopted few of the resulting tools and methods, often because they are organization-specific and not easily applicable in other contexts.
- The Collaboration to Clarify the Costs of Curation (4C) project seeks to better understand and support digital curation and ensure that stakeholders across Europe understand — and know how to apply — existing research and tools.
- The 4C's Curation Costs Exchange (CCEx) was launched to support these efforts; its goal is to bring together stakeholders from across organizations to share knowledge and resources, develop cost modeling exercises, and support smarter digital curation investments.
- The CCEx is intended as an independent brand that will outlast the 4C project and be owned, cultivated, and driven by its user community.
Sarah Middleton is head of Communications and Advocacy, Digital Preservation Coalition, Innovation Centre, York University Science Park and a contributor to the 4C Project.
A sizeable canon of research exists on cost modeling for digital curation. Although this research typically emphasizes the cost and complexity of digital curation and preservation, it is in many ways preliminary; the tools and methods developed have seen little uptake. Tools to manage and estimate costs, for example, have not been integrated into other digital curation processes or tools. The question is why? To answer it, a consortium of 13 European partners and cost modeling specialists launched the Collaboration to Clarify the Costs of Curation (4C) project.
4C seeks to help organizations better understand the costs and benefits of digital curation and preservation. As I describe here, the Curation Costs Exchange (CCEx), a platform for the exchange and comparison of digital curation costs and cost information, is a key 4C project deliverable developed to support this aim. Building upon and drawing together existing work, such as AV Preserve's Cost of Inaction tool and the work of the Blue Ribbon Task Force, the project also evaluates and presents models like NASA's Cost Estimation Tool, the California Digital Library's Total Cost of Preservation Model, and David Rosenthal's Economic Model for Long-Term Storage through the CCEx's accessible platform.
The 4C project builds on existing research on cost modeling — particularly the work by the Blue Ribbon Task Force (BRTF) on Sustainable Digital Preservation and Access. 4C's main objective is to ensure that stakeholders realize and understand how to employ relevant existing work. The project also seeks to examine more closely how these resources might be made more fit-for-purpose, relevant, and useable by a wide range of organizations operating at different scales in both the public and private sectors.
4C project partners include 13 digital curation cost modeling specialists from across Europe, five or six of whom have developed a cost model to satisfy requirements specific to their own organizations. The project's Evaluation of Cost Models and Needs & Gaps Analysis sought to discover how well some of these models and tools meet stakeholders' needs for calculating and comparing financial information. In doing so, the evaluation found that, while a wide range of models existed, there were significant gaps in their applicability to organizations outside those for which they were created.
To avoid creating yet another cost model that would be equally specialized, the CCEx seeks to help users draw together existing and useful resources so they can both make their own assessment of existing models and develop their own cost modeling exercises. CCEx also seeks to support smarter investments in digital curation by enabling knowledge transfer and cost comparisons among organizations of all types.
Methodology and Description
4C calls itself an "open and social" project, and the key to its success has been its ongoing engagement with a wide range of stakeholders. Starting with little more than an idea — that greater transparency around the supply and demand of curation services could change how all organizations think about and sustainably manage their digital assets — the project team consulted staff from memory institutions, data-intensive research groups, and industry and commerce to see how best this transparency could be achieved.
Building on the concept of an open and social online crowd-sourced database of curation cost information, the team set to work identifying the kind of information that such a platform might contain to facilitate knowledge exchange. Rather than waiting for perfect and polished results, the team shared its findings and blogged about its development process, inviting comment and input along the way, and released the beta version of the CCEx on August 5, 2014 (figure 1).
Figure 1. The Curation Costs Exchange (CCEx) home page
From a concept of transparency, the CCEx is now a framework of useful information designed to help anyone who creates or curates digital content, or funds the process, to make smarter investments in digital curation. In unprecedented fashion, the CCEx encourages its users to be bold and share their cost data; in return, they will be rewarded with cost information from their peers. The idea? The more we share, the more we learn.
At the heart of CCEx is the Cost Comparison Tool (known as Compare Costs), which enables this exchange of sensitive data. Compare Costs gives users the opportunity to identify greater efficiencies, better practices, and valuable information exchanges among peers. The tool addresses an acknowledged reluctance to share this data by anonymizing and amalgamating the data into a user average for comparison. The CCEx reveals identity details only for those organizations that wish to share them.
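The anonymize-and-amalgamate step described above can be illustrated with a minimal Python sketch. It is not the CCEx implementation: the record layout, the `share_identity` flag, the organization names, and all figures are assumptions invented for illustration.

```python
from statistics import mean

# Hypothetical submissions; names, categories, and amounts are invented.
submissions = [
    {"org": "Archive A", "share_identity": False,
     "costs": {"technology": 40000, "human_resources": 60000}},
    {"org": "Library B", "share_identity": True,
     "costs": {"technology": 20000, "human_resources": 90000}},
    {"org": "Research Group C", "share_identity": False,
     "costs": {"technology": 30000, "human_resources": 30000}},
]


def amalgamate(records):
    """Pool all records into one average per cost category.

    Organization names are never copied into the result, so the
    output cannot be traced back to an individual contributor.
    """
    categories = sorted({c for r in records for c in r["costs"]})
    return {
        c: mean(r["costs"][c] for r in records if c in r["costs"])
        for c in categories
    }


def visible_identities(records):
    """Return only the names of organizations that opted in to sharing."""
    return [r["org"] for r in records if r["share_identity"]]


user_average = amalgamate(submissions)
shared_names = visible_identities(submissions)
```

In this sketch a user would see only `user_average` and the opted-in names in `shared_names`; the individual cost records stay private.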
Figure 2 shows a screen from the CCEx Cost Comparison Tool interface that invites users to add details of their costs, such as technology and human resources over a specified time frame.
Figure 2. A screen inviting users to add cost details
Acknowledging that some organizations are not able to share digital curation costs, the CCEx also draws together a wealth of other information designed to help users better understand these costs.
Whether organizations are planning to build a business case for investments in digital curation, carry out a cost appraisal or analysis, or simply want to learn about cost modeling, the CCEx offers the Understand Your Costs toolkit, drawn from the 4C project. The toolkit contains the following resources to ground and support users throughout the digital curation costs lifecycle.
The Core Cost Concepts model, for example, explains the key concepts in digital curation's costs and benefits, which is helpful when putting together a business case for any spending, whether for staffing, activities, or hardware and software. Linked to the core cost concepts, Indirect Cost Drivers pinpoint a set of significant values in the practice of digital curation today. These values might incur a cost, but such costs will either realize a specific benefit or mitigate a particular risk. Applied at the organizational management level, these drivers and their related risks and benefits can help inform decision making about curation investment; they can also help shape business cases and sustainability strategies, and help users derive the best value from their digital assets.
The Economic Sustainability Reference model highlights key digital curation concepts, relationships, and decision points in a complex problem space, helping users benchmark and compare their own local models and invest strategically to preserve data for the long term. And, for those considering certification, the Quality and Trustworthiness report examines the costs and benefits of investing in a trusted digital repository.
Drawing from the Evaluation of Cost Models and Needs & Gaps Analysis deliverable and providing an overview of selected cost and benefit models, the Summary of Cost Models helps users identify which model might best suit their needs. Then, if users still want to create their own cost model, the Model Requirements Specification provides concepts and a generic specification for this task. The value built into this specification leverages the 4C project's comprehensive engagement with various user communities and all of the detailed analysis of the requirements, drivers, obstacles, and objectives related to that engagement.
Figure 3 shows a screen presenting cost comparisons as a "global average" (against all other CCEx users' costs) in terms of a financial accounting breakdown. Figure 4 shows a peer-to-peer comparison in terms of an activity breakdown. Presenting the information this way makes it easy for users to see where they might introduce efficiencies, or where they are most efficient. Furthermore, the Read More, Discuss and Share, and Find Services tools give users interested in digital curation's costs a starting point to read, discuss, share, and research the existing work in this area.
Figure 3. Global average comparison
Figure 4. Peer-to-peer comparison
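A comparison of the kind shown in figures 3 and 4 can be approximated by expressing each cost category as a share of total spending and then subtracting the reference shares. The Python sketch below is an illustration only, not the tool's actual calculation; the category names and amounts are hypothetical, and the reference breakdown could stand for either the global average or a single peer.

```python
def shares(costs):
    """Convert absolute costs into each category's fraction of the total."""
    total = sum(costs.values())
    return {category: amount / total for category, amount in costs.items()}


def compare(own, reference):
    """Percentage-point difference between own and reference shares.

    A positive value means the organization spends a larger share of
    its budget on that category than the reference does.
    """
    own_shares = shares(own)
    ref_shares = shares(reference)
    return {
        category: round((own_shares[category] - ref_shares.get(category, 0.0)) * 100, 1)
        for category in own_shares
    }


# Hypothetical figures, invented for illustration.
own_costs = {"hardware": 30000, "software": 10000, "staff": 60000}
global_average = {"hardware": 20000, "software": 20000, "staff": 60000}

difference = compare(own_costs, global_average)
```

Working with shares rather than absolute amounts lets organizations of very different sizes be compared on the same scale, which is one plausible reason to present the breakdowns this way.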
Although it consolidates many of the 4C deliverables, the CCEx was developed as an independent brand, designed to outlast the 4C project's life and go on to be owned, cultivated, and driven by its user community. It is therefore entirely reliant on continuous user engagement and feedback.
During beta testing, we asked users from 4C's stakeholder network to provide feedback on the CCEx. We also collected further responses through an online survey, the Usersnap feedback tool, and targeted usability testing. Based on this feedback, we continued development throughout the summer; in October we presented the full version of the CCEx at the International Conference on Preservation of Digital Objects (iPRES) in Melbourne, Australia, as part of a soft release.
Since the beta release, the CCEx has had more than 9,000 page views, and more than 100 people have registered on behalf of their organizations. However, only eight organizations have submitted their data, and it is this reluctance to share costs that 4C must address.
Feedback from the initial consultation, beta testing, and iPRES all indicated a need for and anticipation of a platform such as the CCEx, as well as a willingness to use it and become involved in its community. Some digital curation practitioners at iPRES even asked whether the CCEx could be expanded to incorporate analog curation.
High-level funders — such as Ron Dekker of the Netherlands Organisation for Scientific Research (NWO) — have also recognized the benefits of using the CCEx cost comparison tool. "The CCEx is the platform to help funders realize the benefit of their investments," said Dekker. "By being transparent about their costs and plugging them into this platform, projects can demonstrate that the taxpayer is getting value for money."
It is early days for the CCEx, but the 4C team naturally wants to maximize the number of users who share their data through the cost comparison tool. We face something of a catch-22, however: most people introduced to the CCEx are interested in using it, even keen to, but would really be compelled to use it only when it holds more cost data sets for comparison.
As the Danish National Archives' Alex Thirifays, who leads the 4C team that is collating cost data, observes:
"The value we can derive from the CCEx depends entirely on the willingness of organizations to submit their cost data, but more so on their understanding of the benefits that sharing will bring about. The more costs are shared, the more we can all learn about making smarter investments in digital curation."
However, unless organizations are willing to share first, such data will remain unavailable to attract others. Luis Faria of KEEP Solutions, who led the development team behind the CCEx, reassures those wary of sharing their cost data:
"All cost data shared with the 4C project will be treated confidentially and will be used solely for the purposes of building up data sets for the CCEx. We would like as many organizations to share their data as possible, and we know sensitivity is an issue so, even as outputs of the CCEx, data will remain anonymous."
The digital curation community recognizes the potential of the CCEx — and especially its Cost Comparison Tool — to demystify digital curation's costs and bring about greater cost efficiencies all around. This distinguishing CCEx component makes the website different from others promoting similar but more specific topics. The degree of the CCEx platform's success will thus be measured both by stakeholder readiness to share sensitive cost data and by the Cost Comparison Tool's adequacy and pertinence: Does it work properly? Is it intuitive? Are the results interesting?
One way 4C has aimed to achieve adequacy and pertinence has been to ensure users' high-level involvement in the tool's development and to examine, in-depth, user behavior around the costing of digital curation. The CCEx is consequently a product of the user requirements that 4C identified through its deliverables, user consultations, and iterations such as workshops, webinars, focus groups, beta-testing, individual usability tests, Advisory Board consultations, internal meetings, and conference sessions.
Two key challenges remain for CCEx: we must gather cost data, and then we must translate that data into information that all types of organizations find meaningful and engaging. The meaningfulness of the data does not depend only on how results are presented, but also on how many cost data sets the CCEx can attract to generate statistically reliable and useable figures. Put simply, the results' meaningfulness is the compelling argument to attract others to submit new cost data sets.
In short, the CCEx needs community support. It needs digital curators, creators, vendors, and funders to use it, add to it, take care of it — and, above all, to share their data with it to make it work. The full release of the CCEx will coincide with the 4C/Digital Preservation Coalition's Investing in Opportunity: Policy, Practice and Planning for a Sustainable Digital Future Conference at the Wellcome Trust in London, November 17–18, 2014.