Information Technology for Open Science: Innovation for Research


Colleges and universities are better at moving ideas and inventions outside of the institution than they are at achieving change within. One way to accomplish the latter is to innovate research practices by aligning information technology with open science.

Credit: Lightspring / Shutterstock.com © 2020

What does innovation mean for higher education? I've asked many college/university and association presidents this question. The typical response points to successful startups, economic development programs, and entrepreneurship training. Yes, these are important ways that we serve society through innovation. But how do higher education institutions themselves innovate?

Colleges and universities are much more facile at outside innovation than inside innovation. By that, I mean that they have created cultures and rewards for moving ideas and inventions outside of the institution through technology transfer or student education but are not as adept at achieving change within.

For fifteen years, while serving as vice president of research at the University of Southern California (USC), I witnessed—and encouraged—a transformation in research practices. Under the umbrella of the "Creativity and Collaboration in the Academy" initiative, USC explored and experimented with how higher education institutions might evolve to a research culture that, aligned with information technology, could achieve innovation inside the institution so as to be more impactful on the outside.

The Catalyst

As a teenager growing up in the 1970s, I (along with what seemed like almost everyone else in the United States) glued my eyes to the weekly episodes of All in the Family. Airing for nine seasons, the TV program held the number-one position in the Nielsen ratings for five consecutive years.

All in the Family was not known for sticking to tradition. Using humor as the vehicle, it challenged us to consider our own biases as we watched the dynamics of the Bunker family of Queens, New York. Through this lens, the program spotlighted changes in society and our struggles with the notions of equality, justice, and respect.

Flash forward 35 years or so, to December 2010. The creator of All in the Family, Norman Lear, had joined a group of USC faculty and visitors for a dinner to discuss changing research practices and how these changes might relate to entertainment. Cable TV and social media had already eroded network TV. The music industry had transitioned from physical media to much-harder-to-monetize digital content. Movies were increasingly watched from home rather than in theaters.

Thinking about these changes in entertainment, the group discussed how research might change and whether there was a threat on the horizon to colleges and universities. Would publications in online (rather than print) journals be recognized as scholarly? If someone used USC research data, would the university lose control? Should faculty aim toward products that continued to look like journal articles, or should something more dynamic and engaging be imagined? What do digital analytical tools mean for humanities scholarship?

Through the USC Norman Lear Center, our faculty had studied creative industries and how they were being affected by technology. Johanna Blakley, managing director at the Norman Lear Center, had completed "Ready to Share," a fascinating study that documented how the fashion industry thrived in the absence of intellectual property protection. The attendees at the dinner saw how mash-ups of creative ideas could inspire new work—for instance, music created through sampling—and how Creative Commons licensing had changed the way credit can be attributed to individual contributors.

But are researchers also "ready to share"?

Changing How Research Is Done

When we think about modernizing higher education institutional processes with information technology, administrative systems generally come to mind—for instance, purchasing, student registration, or patient records. But technology is also changing the fundamentals of scientific research, enabling new forms of creativity and collaboration. To appreciate the change, let's first turn back the clock—more than 500 years.

Science is a process of discovery derived from evidence through which theories are formed and tested in scientific communities. But this hasn't always been the case. As David Wootton has noted, "there was no concept of discovery" before 1492 and the discovery of the Americas.1 Instead, to the degree that science existed at all, knowledge was deduced from the past and from assumptions rather than from rigorous experiments. For example, the geocentric model that placed Earth at the center of the universe was not displaced until the 16th and 17th centuries, when it was contradicted by new theory and telescopic observation.

The scientific revolution that occurred five centuries ago was not just a matter of new instruments for collecting data. It was a revolution in the technology of communication, specifically the printing press. With the creation of a mechanism for recording and conveying knowledge, science was born as a system in which one scientist's findings would inform and inspire the findings of others, who might build evidence in support of foundational theories or sometimes disprove such theories. Without a mechanism for managing, communicating, and preserving information, these basic steps of science could not occur.

Like the printing press of the past, information technology is again changing the scientific process, producing new models of collaboration. Technology to find, store, and share data and information is (1) expediting the transfer of knowledge and speeding research, and (2) making it easier to validate the quality and accuracy of prior studies. Although printed articles, journals, and books permitted ideas and results to be shared, findings flowed slowly among scientists through peer review and publication. Research data flowed even more slowly, if at all.

The alternative, today, is "open science." Through open science, data are contributed to shared repositories as they are produced and collected, methods and designs are revealed at the onset of a study, and papers are shared widely in their draft form. As a consequence, the slow and sequential process of traditional research (see figure 1) is transformed into a system of parallel research that occurs more rapidly and collaboratively (see figure 2). When the underlying data, materials, equipment, and detailed experimental design are shared, research from one laboratory can be more readily validated by another.

Boxes connected with arrows leading from one to the next: Literature Review; Hypothesis; Methods; Experiment; Data; Analysis; Conclusions; Publication; Literature.
Figure 1. Traditional Sequential Research
Two rows with the same boxes connected with arrows leading from one to the next: Literature Review; Hypothesis; Methods; Experiment; Data; Analysis; Conclusions; Publication. Both rows point into the same final box: Literature. Between the two rows is a third row. Both Hypothesis and Methods boxes point to the central Registration area. Both Experiment boxes point to the Shared Apparatus area. Both Data boxes are linked by a Data Aggregation area. The two Analysis boxes have a Common Analysis area between them.
Figure 2. Collaborative Parallel Research

Current Research Issues

As a new style of research, open science requires culture changes, within higher education institutions and disciplines and among research sponsors. Federal sponsors (e.g., the National Institutes of Health), as well as foundations, are increasing their expectations for data management and sharing and also are investing in data repositories. Discipline specialists are taking fresh looks at how research contributions are assessed, while institutions are supporting open science through the combined effort of research, library, and IT offices, in alignment with departments and schools. Best practices for open science are being developed through several national programs.

Open science (or, more broadly, open scholarship) is a work in progress but is already having an impact, as is evident from the race to stop COVID-19. The transparency of open science, along with the required culture change, is also revealing challenges in the college/university research enterprise.

Reproducibility. As reported in 2015, higher education research faces a "reproducibility crisis." Earlier, in 2012, C. Glenn Begley and Lee M. Ellis concluded that among 53 "landmark" preclinical cancer studies, "scientific findings were confirmed in only 6 (11%) cases." Other studies of reproducibility through the Open Science Collaboration and by John P. A. Ioannidis (the latter study provocatively titled "Why Most Published Research Findings Are False") have raised doubts as to the quality of academic scientific research.2 Whether the quality bar has changed over time is unknown. What is known is that the transparency surrounding open science is making it easier for others to test research findings and assess their validity, allowing evaluation of reproducibility.

Conflicts of Interest. Faculty members' outside financial interests, sometimes unmanaged and undisclosed, have come to light. Particularly troubling have been cases where US scientists have been handsomely rewarded by foreign institutions for dual employment or where clinical investigators have not disclosed their financial interests in the companies that support their work and its communication. The fluidity of data and research across locations and organizations has made engaging in conflicted activities much easier.

Pre-Prints. Higher education investigators now routinely post early versions of their research papers on pre-print sites (e.g., arXiv, medRxiv, SSRN) to accelerate the availability of findings. Although pre-prints are not peer-reviewed, they receive media attention, sometimes prematurely. Notably, because the research is so easily findable and shareable, the journal's traditional role in the distribution and accumulation of knowledge has waned.

Privacy. Respect for the privacy of individuals—including both conventional Institutional Review Board (IRB) regulated research and research drawn from large datasets amassed from social media, devices, and apps—has challenged colleges and universities. By cross-referencing datasets, investigators can determine the identities of individuals even when their contributed data are ostensibly anonymized. Or an individual might be identified by examining data contributed from others in the same family tree. The "Golden State Killer" was found not because of a match to his own data but because of a match to related people who had contributed to a public dataset.
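The cross-referencing risk described above can be illustrated with a minimal sketch. Everything in it is hypothetical: the records, the field names, and the choice of ZIP code, birth year, and sex as linking fields are assumptions made for illustration, not data or methods from any actual study. The idea is simply that a record stripped of names can still be matched to a named public record when the shared fields point to exactly one person.

```python
from collections import defaultdict

# Hypothetical "anonymized" study records: names removed, other fields kept.
study_records = [
    {"zip": "90007", "birth_year": 1961, "sex": "F", "diagnosis": "condition A"},
    {"zip": "90007", "birth_year": 1985, "sex": "M", "diagnosis": "condition B"},
    {"zip": "90089", "birth_year": 1985, "sex": "M", "diagnosis": "condition C"},
]

# Hypothetical public roster that includes names and the same fields.
public_roster = [
    {"name": "Person 1", "zip": "90007", "birth_year": 1961, "sex": "F"},
    {"name": "Person 2", "zip": "90007", "birth_year": 1985, "sex": "M"},
    {"name": "Person 3", "zip": "90089", "birth_year": 1985, "sex": "M"},
    {"name": "Person 4", "zip": "90089", "birth_year": 1985, "sex": "M"},
]

# Assumed linking fields shared by both datasets.
LINKING_FIELDS = ("zip", "birth_year", "sex")


def key_of(row):
    """Build the tuple of linking-field values for one row."""
    return tuple(row[field] for field in LINKING_FIELDS)


def reidentify(records, roster):
    """Return (name, diagnosis) pairs for records that match exactly one person."""
    names_by_key = defaultdict(list)
    for person in roster:
        names_by_key[key_of(person)].append(person["name"])

    matches = []
    for record in records:
        candidates = names_by_key.get(key_of(record), [])
        if len(candidates) == 1:  # a unique match reveals the identity
            matches.append((candidates[0], record["diagnosis"]))
    return matches


reidentified = reidentify(study_records, public_roster)
for name, diagnosis in reidentified:
    print(f"{name}: {diagnosis}")
```

In this toy example, two of the three study records link to a single named person, while the third stays ambiguous because two people in the roster share the same linking fields. This is why removing names alone may not be enough to protect participants when datasets can be combined.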

Access. Researchers, research sponsors, libraries, and the public have advocated for free access to the products of federally sponsored research, including both data and actual publications, which have traditionally been copyright-protected by journals. Provisions for freely sharing publications are upsetting the financial model of journals, which rely on subscriptions to pay the costs of peer review, publication, and distribution. Major library systems are balking at paying for traditional subscriptions and are advocating for free open access to the research produced by their faculty, students, and staff.

These examples demonstrate that higher education research needs openness to shed light on successes and problems and also to motivate research of the highest integrity and rigor. The examples also illustrate the dilemmas that arise as colleges and universities work to modernize their research practices.

USC

Looking back to the USC symposium "Creativity and Collaboration in the Academy," we immediately saw the value of an integrated strategy for technology-enabled research and scholarship (not yet called "open science") for the entire university, in all disciplines. But we needed to re-envision the fundamentals of our academic work.

Surely IT tools for data management and sharing, multimedia, analytics, and so on could improve research. But these new tools did not match up with the university culture. Policies and incentives reflected the past, when research was communicated on paper and when data and other research objects were not portable and easily shareable. They did not align with the creative and collaborative approach of technology-enabled open science.

From a policy perspective, we took a close look at our promotion and tenure (P&T) process. Our P&T manual's language at the time emphasized "independence." That is, a faculty member being considered for tenure needed to demonstrate success in an independent research program. We changed that language to emphasize research contribution, through either independent or collaborative work. Language was also added to recognize born-digital work and nontraditional research products, such as multimedia, "interactive works that are impossible to publish in print form," and digital databases or repositories. As a complement to the P&T manual, the USC Academic Senate adopted standards for attribution of diverse research products, recognizing that assignment of credit is essential to understanding contributions.

To meet our obligation to protect the privacy and rights of human subjects who contribute specimens and data to research studies, USC also adopted a policy on biorepositories. The University Office of Research took on the issue of "Rigor, Transparency, and Reproducibility," which was coupled with resources and training for the research community. The university also developed guidance and training on data management and data sharing.

Resources are important. Following the symposium, we established a "collaboration fund" and a "core instrumentation fund" to support groups of faculty building novel research communities. We developed an online portal [https://digital.usc.edu] to assemble "4-C" resources: Communicate, Collaborate, Create, and Catalogue. We also designed and implemented the TARA (Total Access for Research Administration) system as an integrated university-wide system, going completely digital for activities such as proposal submission, ethics review, intellectual property management, and core lab management.

USC benefited from its Information Sciences Institute, through which work on scientific workflows led to multiple national repositories, built expressly for scientific communities (ranging from earthquakes to facial imaging), as well as recruitment efforts focused on faculty leading collaborative research communities (e.g., neural imaging and clinical trials for Alzheimer's). Through this concerted effort—aiming toward resources for scientific communities rather than individuals—USC has been well supported by federal agencies, including receiving the Clinical and Translational Science Award from the National Institutes of Health in 2010.

Achieving Change

Information technology has incredible power for changing the paradigm for how research is done, shifting from the paper-driven sequential process of the past to a process that is more rapid, creative, and collaborative. The change is not a simple matter of buying technology, however. The change requires a strategic reinvention of the culture of research, supported by policies and resources and inspired by a desire to innovate. Nevertheless, there are challenges:

  • Sharing data requires ongoing effort by the creators, distracting them from the primary work of experimentation.
  • Sharing data may enable others to "scoop" the creators, getting credit for their hard work.
  • Open-access publication pushes the costs of publication to the researcher.
  • Governments and companies may steal the best ideas from one another in order to gain a military edge or move ahead in the market.
  • Data-management requirements from agencies are an unfunded compliance mandate, not helping research.

These criticisms are fair, but mostly they reflect a failure of organizations and cultures to adapt to new collaborative research methods that, ultimately, advance science.

Change requires cooperation among the offices supporting research, particularly the research office, the library, and the IT organization. The research office needs to reconsider styles of research, the nature of research products, how contributions are recognized and rewarded, and how intellectual property is protected. The library needs to support curation of, preservation of, and access to new types of research products. The IT organization needs to develop infrastructure for storage, computation, privacy protection, and administration as new tools to facilitate the creative aspects of research. The IT organization should also recognize that research data are meant to be shared (with privacy exceptions), as in Harvard University's Information Security Policy: "Harvard is equally committed to preserving an environment that encourages academic and research collaboration through the responsible use of information technology resources."

Change is also coming through research sponsors. On October 29, 2020, the National Institutes of Health broadened its Policy on Data Management and Sharing, expanding requirements for open-science data-sharing plans and making grant recipients accountable for executing their plans in the grant-review process. Higher education institutions need to prepare now for the new rules, which take effect in January 2023.

Conclusions

Research and scholarship are central to the mission of colleges and universities. Information technology can transform practices in these activities, but doing so depends on culture change and on a reimagination of how faculty and staff in higher education work and how they serve.

Transformation is also needed because of the competition facing not-for-profit higher education. Privately owned companies aim to capture slices of the academic market. Whereas colleges and universities have prospered by offering disparate campus activities (including cultural events, athletics, housing, and hospitals, along with research and education), leaders will need to work harder to sustain this "multiversity." Cross-subsidies underlying higher education make institutions vulnerable within those areas that generate surpluses and for which innovative technologies can deliver superior offerings at lower cost.

It is time for IT innovation to happen inside the institution. Like All in the Family, higher education needs to break from tradition. Are we ready?

For more on this topic, see Randolph Hall, "Innovation by Open Scholarship," EDUCAUSE Annual Conference, October 27–29, 2020, on-demand session.

Notes

  1. David Wootton, The Invention of Science: A New History of the Scientific Revolution (London: Allen Lane, 2015).
  2. Kirsten Weir, "A Reproducibility Crisis?" Monitor on Psychology 46, no. 9 (October 2015); C. Glenn Begley and Lee M. Ellis, "Raise Standards for Preclinical Cancer Research," Nature 483 (March 28, 2012); Open Science Collaboration, "Estimating the Reproducibility of Psychological Science," Science 349, no. 6251 (August 28, 2015); John P. A. Ioannidis, "Why Most Published Research Findings Are False," PLoS Medicine 2, no. 8 (August 30, 2005).

Randolph Hall is Professor in the Epstein Department of Industrial and Systems Engineering at the University of Southern California. He served as Vice President of Research at USC from 2005 to 2019.

© 2020 Randolph Hall. The text of this work is licensed under a Creative Commons BY 4.0 International License.