Key Takeaways
- The new NIST SP 800-171 requirements can place a heavy burden on a higher education infrastructure not designed to support such strict security standards.
- Using infrastructure-as-a-service allows Purdue to create a separate domain for restricted research, as the standard suggests, without purchasing additional hardware.
- The ability to logically separate each project helps Purdue more accurately allocate cost, scale on-demand, and protect from insider threats.
- Encryption both in-transit and at-rest provides strong risk mitigation against data theft.
With data breaches becoming an epidemic, the U.S. government decided to take a more active role in defining security requirements. Two new contract clauses, DFARS 252.204-7012 and FAR 52.204-21, together outline the information security requirements placed on nonfederal entities that possess and process sensitive data, known as controlled unclassified information (CUI), under contract with the U.S. government. These clauses reference NIST Special Publication 800-171, which describes the IT security controls the government expects when a contractor processes federal data.
Previously, universities and research institutes enjoyed wide academic freedom with respect to how they handle data, particularly in the realm of fundamental research, which may be publicly released. However, that precedent is changing.
Academic institutions dealing with unclassified controlled technical information are not exempt from the controls of this rule. The protection of the information is equally necessary, regardless of whether the contractor is a university or a business concern. [1]
In central Indiana, Purdue University has been working to evaluate these new regulations and determine the best course of action. Purdue participates in a great deal of U.S. government research, and we quickly realized that implementing these new security requirements would undermine the collaborative, sharing-centric infrastructure our researchers and students currently rely on. To meet NIST 800-171, Purdue would need a new, separate environment for this kind of research data. The real question was whether the university should build a data center or lease space in the cloud.
While attempting to understand the scope of this problem, Purdue realized the biggest issue with its research systems was how widely needs differ from contract to contract. Some contracts are long-term, while others last only a few months. Some need just a few workstations, while others need high-performance computing clusters.
Ultimately, the variety of research Purdue takes on, combined with other factors, contributed to our decision that using a cloud solution would be our best option.
Getting There
While looking for a path forward, we found several elements within the NIST SP 800-171 to be informative for how best to go about building a compliant environment for CUI data. The following excerpt from 800-171 heavily influenced our thought process:
If nonfederal organizations entrusted with protecting CUI designate specific information systems or system components for the processing, storage, or transmission of CUI, then the organizations may limit the scope of the CUI security requirements to those particular systems or components. Isolating CUI into its own security domain by applying architectural design principles or concepts (e.g., implementing subnetworks with firewalls or other boundary protection devices) may be the most cost-effective and efficient approach for nonfederal organizations to satisfy the requirements and protect the confidentiality of CUI. [2]
Purdue's network, like that of many other universities, is designed to provide an open environment conducive to data sharing, not compartmentalization. Furthermore, certain contracts come with International Traffic in Arms Regulations (ITAR) restrictions that limit access by foreign nationals. This conflicts with Purdue's inclusive hiring practices, as there are non-U.S. persons with administrative privileges in our network whom we don't want to displace. With that in mind, the paragraph recommending the creation of a separate environment (domain) specifically for CUI data made good sense.
Choosing a Platform
In the world of IT, agility and scalability are key to survival in the 21st century. [3] Research using sensitive data is no different. If a university cannot respond to the needs of the various U.S. government entities paying for research, it will lose those opportunities to other, more capable institutions. Purdue found that there was no one-size-fits-all model, and the diverse requirements of our researchers would put an incredible burden on existing facilities as contracts came and went, limiting the flexibility and infrastructure support for other resources.
After weighing the pros and cons of building a new data center against using an IaaS cloud provider, Purdue University ultimately chose AWS GovCloud as the provider for this initiative. As noted on its website, GovCloud is an Amazon cloud-based platform for use with government contracts and subcontracts. Its data centers are housed on continental U.S. soil and staffed by U.S. persons only.
Purdue codenamed this new endeavor the Research Environment for Encumbered Data, or "REED."
Designing a Compartmentalized Infrastructure in the Cloud
Purdue's design for this new environment was inspired by the isolated/gloved boxes used in bioresearch labs.
- All sensitive data will be housed within the REED environment.
- Data ingress and egress will be tightly controlled, monitored, and logged. Data coming in will be screened for malware, and data going out must be reviewed by two team members before being permitted to leave REED.
- Using the Purdue VPN encrypts data traffic to REED and provides multifactor authentication as required by NIST SP 800-171.
- The remote desktop interface keeps data from being stored on the local computer and allows us to disable external devices, like USB drives.
- Each project and its data are contained within their own instance; data cannot be transferred (accidentally or maliciously) to another project without approval.
- MS Terminal Server allows centralized management of user systems. Group Policy (GPO) keeps user systems aligned with the U.S. Government Configuration Baseline (USGCB).
- Amazon EBS storage volumes provide encrypted storage for each instance.
The purpose behind REED is to create a contained environment for CUI data that has the tools our researchers need and the security the U.S. government requires: one point where data can be housed, secured, processed, and monitored, supported by one team. This allows Purdue to meet both current and future-state IT security requirements levied by the U.S. government for CUI and other sensitive data without placing an unnecessary burden on our university network and its infrastructure. We can scale to meet any demand much more rapidly than if we had to stand up physical systems.
Purdue also uses virtual private clouds (VPCs) to separate the environment by project, which allows stronger access control and a more accurate cost assessment. Individual projects are responsible for the direct costs of their resources, plus an overhead cost to share the burden of the administrative and audit systems.
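The per-project cost model can be sketched in a few lines of Python. This is only an illustration: it assumes overhead is shared in proportion to each project's direct costs, which the article does not specify, and the project names and dollar amounts are hypothetical.

```python
from decimal import Decimal


def allocate_costs(direct_costs, shared_overhead):
    """Return each project's total bill: direct cost plus a share of overhead.

    Assumption (not from the article): overhead for the administrative and
    audit systems is split in proportion to each project's direct cost.
    """
    total_direct = sum(direct_costs.values(), Decimal("0"))
    if total_direct == 0:
        raise ValueError("no direct costs to apportion overhead against")
    bills = {}
    for project, direct in direct_costs.items():
        share = shared_overhead * direct / total_direct
        bills[project] = (direct + share).quantize(Decimal("0.01"))
    return bills


# Hypothetical monthly figures for two projects.
example = allocate_costs(
    {"project-a": Decimal("300.00"), "project-b": Decimal("100.00")},
    Decimal("80.00"),
)
```

An even split per project, or a split by user count, would be an equally plausible rule; only the line that computes `share` would change.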
REED is built as a series of individual environments, each one compartmented. Cross-traffic is limited to the administrative systems for overhead services such as authentication and updates, and to the audit systems for log collection and monitoring. This can be accomplished with proper configuration of the security groups, virtual private clouds, and subnets within AWS GovCloud.
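The cross-traffic rule can be written down as a small decision function. The sketch below is a plain Python model of the policy, not an AWS configuration; the environment names are hypothetical, and it assumes the administrative and audit environments may also exchange traffic with each other, which the article does not state.

```python
ADMIN = "admin"  # authentication, updates (hypothetical environment name)
AUDIT = "audit"  # log collection and monitoring (hypothetical environment name)
SHARED = {ADMIN, AUDIT}


def is_flow_allowed(src_env, dst_env):
    """Decide whether traffic between two environments is permitted.

    Traffic inside one environment is allowed, and any environment may
    exchange traffic with the shared admin/audit environments. Two different
    project environments may never talk to each other directly.
    """
    if src_env == dst_env:
        return True
    return src_env in SHARED or dst_env in SHARED
```

In a real deployment the same rule would be expressed through security group and subnet configuration; a model like this is mainly useful for reviewing that configuration against the intended policy.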
Within these project environments, Purdue can easily accommodate multiple types of computing systems based on project need. From simple data processing environments with just a few computing systems, to high-performance clusters for modeling and simulations, we can stand up computing instances of varying performance and price to meet our requirements.
Figure 1 helps illustrate Purdue's model for REED. "It's like a bucket of Legos. You can build whatever you want…" [4]
Purdue used Amazon EBS volumes for storage in order to leverage their ability to encrypt data with AES-256 both at rest and in transit between the volume and the instance. After reviewing Amazon's security policies and our contract agreement with AWS, we concluded that the native AWS key management solution meets the requirements of NIST 800-171. However, AWS supports other key management options if higher security is ever required.
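As an illustration of what requesting encrypted storage involves, the sketch below builds the parameters for an encrypted volume. The parameter names (AvailabilityZone, Size, Encrypted, KmsKeyId) are those of the EC2 CreateVolume API; the helper function, zone, and size are hypothetical, and this is not Purdue's actual provisioning code.

```python
def encrypted_volume_request(availability_zone, size_gib, kms_key_id=None):
    """Build keyword arguments for an encrypted EBS volume request.

    The keys mirror the parameters of the EC2 CreateVolume API. Omitting
    KmsKeyId lets AWS use the account's default EBS encryption key.
    """
    if size_gib <= 0:
        raise ValueError("size_gib must be positive")
    request = {
        "AvailabilityZone": availability_zone,
        "Size": size_gib,
        "Encrypted": True,  # this helper never builds an unencrypted request
    }
    if kms_key_id is not None:
        request["KmsKeyId"] = kms_key_id
    return request


# Hypothetical values for a zone in the AWS GovCloud (US-West) region.
params = encrypted_volume_request("us-gov-west-1a", 100)
```

With boto3 installed and credentials configured, these parameters could be passed along as `boto3.client("ec2", region_name="us-gov-west-1").create_volume(**params)`. Supplying `kms_key_id` is where a separately managed key would be selected.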
To connect to REED, users will authenticate to the Purdue VPN with a password and one-time PIN from either a keyfob or smartphone application. This provides Federal Information Processing Standard (FIPS) Publication 140-2 validated encryption in-transit and a secure gateway for accessing the environment.
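For readers unfamiliar with one-time PINs, the sketch below shows how a time-based code is typically derived. It assumes the smartphone application follows the time-based one-time password (TOTP) scheme of RFC 6238; the article does not say which scheme Purdue's VPN uses, and the shared secret in the example is the RFC's published test value, not a real key.

```python
import hashlib
import hmac
import struct


def totp(secret, unix_time, step=30, digits=6):
    """Compute an RFC 6238 time-based one-time password using HMAC-SHA1.

    `secret` is the shared key as bytes; `unix_time` is seconds since 1970.
    """
    counter = int(unix_time) // step
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F  # dynamic truncation, RFC 4226 section 5.3
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


# Test vector from RFC 6238, appendix B (SHA-1, time = 59 seconds).
pin = totp(b"12345678901234567890", 59, digits=8)
```

Because the code depends on both the shared secret and the current 30-second window, a captured PIN is of little use to an attacker once the window has passed.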
REED is being designed and built to be as uniform as possible. While add-on components may differ, the basic blocks in each instance (EBS volumes for storage, terminal servers for the user interface) remain similar, which facilitates security review and makes unauthorized changes or improper configurations more readily apparent. This also allows for rapid reconstruction in the event of system failure and rapid deployment when new instances are called for.
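One way uniformity makes unauthorized changes apparent is by comparing each instance against a common baseline. The following is a minimal sketch of that idea; the setting names and values are hypothetical and do not describe REED's actual configuration.

```python
def find_drift(baseline, actual):
    """Return settings whose value differs from the baseline.

    Maps each differing key to an (expected, found) pair; a setting missing
    on either side is reported with None in that position.
    """
    drift = {}
    for key in sorted(baseline.keys() | actual.keys()):
        expected = baseline.get(key)
        found = actual.get(key)
        if expected != found:
            drift[key] = (expected, found)
    return drift


# Hypothetical baseline for the basic building blocks of a project instance.
BASELINE = {
    "storage_encrypted": True,
    "usb_redirection": False,
    "ui": "terminal-server",
}
observed = dict(BASELINE, usb_redirection=True)
report = find_drift(BASELINE, observed)
```

Approved add-on components would need to be listed in a per-project baseline so that they are not flagged as drift.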
Using Separation of Duties to Protect Against Insider Threats
In the wake of insider breaches like those of Chelsea Manning [5] and Edward Snowden, [6] the focus has shifted to data theft and system abuse by trusted users and system administrators. Separation of duties is a requirement in NIST SP 800-171 [7] and, even where not required, is a business best practice that prevents one person from inadvertently (or purposefully [8]) becoming a single point of failure. To that end, REED's administrative functions will be divided by task, with those in the administrative environment lacking rights to the audit environment and vice versa. Also, one group of analysts will be tasked with monitoring the systems, and another group with certification and inspection of the required controls.
Purdue understands that these security safeguards may be overcome, but it will be difficult to do so without the audit logs reflecting these actions, which are monitored. These efforts are based on a risk analysis and evaluated against the cost-feasibility of more stringent measures, which ultimately proved prohibitive.
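The division of duties described in this section can be checked mechanically. The sketch below flags any user who holds two roles that are meant to be kept apart; the role names are hypothetical labels for the groups described above, not REED's actual role definitions.

```python
# Pairs of roles that one person must not hold together (hypothetical names).
CONFLICTING_ROLES = (
    ("admin", "audit"),
    ("monitoring", "certification"),
)


def duty_conflicts(assignments):
    """Return (user, role_a, role_b) for every conflicting pair a user holds.

    `assignments` maps a user name to the set of roles granted to that user.
    """
    conflicts = []
    for user in sorted(assignments):
        roles = assignments[user]
        for role_a, role_b in CONFLICTING_ROLES:
            if role_a in roles and role_b in roles:
                conflicts.append((user, role_a, role_b))
    return conflicts
```

Running such a check whenever role assignments change gives the certification group a simple, repeatable piece of evidence that the separation still holds.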
Data Ingress and Egress
The most difficult element to address in REED is the ingress and egress of sensitive data. Data coming into the system presents a risk of malicious intent, and data going out needs to be cleared for release. Both processes will use systems in REED being designed by Purdue's infrastructure and applications teams.
Data handling has posed several challenges to Purdue's infrastructure and security teams, and we have had discussions with representatives from other universities on how best to address this issue. Ultimately, Purdue chose to follow an "airlock" approach to control the access points data can use to get into REED.
Data will never be directly deposited from outside the REED environment into a project instance. Each project will include a hardened secure FTP server that sits in public IP space, where incoming data will be deposited, scanned for malicious content, and logged. From there, an automated system function will move the files into the project's main storage container. Figure 2 sketches the data ingress process.
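The ingress steps (deposit, scan, log, move) might look like the following in outline. This is a sketch under stated assumptions rather than Purdue's implementation: the scanner is a caller-supplied stand-in for a real malware scanner, the quarantine directory for rejected files is an assumption not described in the article, and the logger name is hypothetical.

```python
import logging
import shutil
from pathlib import Path

log = logging.getLogger("reed.ingress")  # hypothetical logger name


def ingest(landing_dir, project_dir, quarantine_dir, is_clean):
    """Move scanned files from the landing area into project storage.

    `is_clean` is a caller-supplied function (a stand-in for a real malware
    scanner) that takes a Path and returns True when the file is safe.
    Returns the names of the files that were accepted.
    """
    accepted = []
    for path in sorted(Path(landing_dir).iterdir()):
        if not path.is_file():
            continue
        if is_clean(path):
            shutil.move(str(path), str(Path(project_dir) / path.name))
            log.info("accepted %s", path.name)
            accepted.append(path.name)
        else:
            # Assumption: rejected files are set aside for review, not deleted.
            shutil.move(str(path), str(Path(quarantine_dir) / path.name))
            log.warning("quarantined %s", path.name)
    return accepted
```

The important property is that nothing reaches project storage except through this function, so every incoming file is both scanned and logged.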
Controlling data going out will be more difficult. We intend not only to ensure that data remains secure but also to prevent insider theft of sensitive data. As noted, requiring separation of duties reduces the risk of data theft. [9] With that in mind, the outgoing "airlock" system requires review and consent by two different individuals in the project in order to move data out of REED and to the secure FTP site.
This system will require that multiple users inside the project environment work together in removing data from REED. The first user will deposit the selected data files into a holding area, then a second user will review the files (they will be unable to make changes) and either approve or deny the transfer. This two-step approach forces each data export to be staged and reviewed by two separate users.
Once the data is on the secure FTP server, credentials can be issued to the receiving party. Requiring two people to verify outgoing data greatly reduces the risk of inadvertent disclosure and data theft. Figure 3 sketches the data egress process.
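The two-person egress rule can be modeled as a small state machine. The sketch below is illustrative only: the class, field, and function names are hypothetical, and it captures just the rules stated above (the staged files cannot be changed by the reviewer, and the reviewer must be a different person from the submitter).

```python
from dataclasses import dataclass


@dataclass
class ExportRequest:
    """One staged export waiting in the holding area (hypothetical model)."""
    files: tuple  # names of the staged files; a tuple so they cannot be edited
    submitter: str
    reviewer: str = ""
    status: str = "pending"  # pending -> approved | denied


def review(request, reviewer, approve):
    """Record a second user's decision on a staged export."""
    if request.status != "pending":
        raise ValueError("request has already been reviewed")
    if reviewer == request.submitter:
        raise PermissionError("submitter cannot review their own export")
    request.reviewer = reviewer
    request.status = "approved" if approve else "denied"
    return request


def may_release(request):
    """Only an export approved by a different user may leave the environment."""
    return request.status == "approved" and request.reviewer != request.submitter
```

For example, a request staged by one user stays unreleasable until a second user calls `review(request, reviewer, True)`; an attempt by the submitter to approve their own request raises `PermissionError`.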
Recommendations
For universities and other research institutions, it can be difficult to separate CUI data from other material on an existing infrastructure. Creating a separate domain configured to meet the NIST SP 800-171 requirements offers an effective solution, but operating and maintaining multiple data centers can be difficult when higher security is required. Using an IaaS solution allows an institution to offload the burden of maintaining physical hardware to an organization specializing in that area and instead focus on the operational tasks of running the environment.
Further, the long lead times associated with expanding or constructing a data center put pressure on timelines for grant applications and other contracts. Leveraging IaaS provides the agility to move as quickly and as far or near as your needs require, without the traditional concerns of a physical footprint, HVAC, power, hardware life cycle, and other elements required of a data center. Likewise, when those grants and contracts end, you will see further cost avoidance because those elements will not need to be reallocated or torn down. [10]
Finally, a principal concern with IaaS is data ownership. Does a company truly own and have control of its data on another entity's infrastructure? The answer is, simply, yes. Legal agreements between the cloud vendor and the data owner make ownership rights very clear, and with the multitude of security tools and options available for data protection and ownership, an institution can take measures to ensure its data are as private in the cloud as in the local data center. Most major cloud vendors offer encryption key management solutions with strong legal access control language, but in the event that an institution needs to retain absolute privacy, there are a number of third-party vendors offering separated key management solutions that integrate with cloud services, allowing the data owner to retain complete access control to the data. With one of these in place, the cloud vendor would be unable to access the stored data, even under subpoena.
Key Questions
As institutions think about how to protect controlled data that they receive from the federal government, IT leaders need to ask the following questions:
- What data do you need to protect? How are you receiving it from the federal government, and where will it be stored?
- Consider your resources: what technological, financial, and human resources are available?
- Consider your constraints: how do existing structures and modes of operation limit your flexibility in employing a solution?
- Who are your stakeholders? Do you have support from institutional leadership and researchers to consider alternate solutions to help you achieve regulatory compliance?
Conclusion
REED is Purdue University's attempt to provide a secure, agile, and scalable solution for its researchers working in the ever-tightening world of U.S. government research, one that will meet the data security requirements of the U.S. government and NIST SP 800-171 and create an environment suitable for the demands of academic research.
Currently, REED is online and testing with live project data. Purdue is planning to integrate its remaining DFARS 7012 contracts prior to the December 2017 deadline dictated by the Department of Defense. Cost models are being generated so that our researchers can accurately build costs into their grant and contract proposals, and Purdue has been working with other universities engaged in similar efforts to exchange ideas and findings regarding this method of handling regulated data.
These new regulations were not written with consideration of academia's unique position. We at Purdue hope that leveraging cloud systems under this model will allow universities to continue pursuing controlled research without affecting their native infrastructure in a way that hinders academic freedom and fundamental research.
Notes
1. Federal Register, Vol. 78, No. 222 (November 18, 2013); Rules and Regulations: 69273, "Defense Federal Acquisition Regulation Supplement: Safeguarding Unclassified Controlled Technical Information (DFARS Case 2011-D039)," part II, question #13 "Applicability to Universities."
2. NIST Special Publication 800-171, "Protecting Controlled Unclassified Information in Nonfederal Information Systems and Organizations," June 2015, 3, para. 3.
3. Susan Grajek, "Top 10 IT Issues, 2016: Divest, Reinvest, and Differentiate," EDUCAUSE Review, January 11, 2016.
4. Co-author Michael Shuey, infrastructure architect at Purdue, uses the "Lego" simile to explain the flexible and scalable on-demand nature of the REED environment.
5. Julie Tate, "Bradley Manning sentenced to 35 years in WikiLeaks case," Washington Post, August 21, 2013.
6. Barton Gellman, Aaron Blake, and Greg Miller, "Edward Snowden comes forward as source of NSA leaks," Washington Post, June 9, 2013.
7. NIST Special Publication 800-171, "Protecting Controlled Unclassified Information in Nonfederal Information Systems and Organizations," June 2015, ch. 3, 9: requirement 3.1.4.
8. David Kravets, "San Francisco Admin Charged With Hijacking City's Network," WIRED, July 15, 2008.
9. NIST Special Publication 800-53 (rev. 4), "Security and Privacy Controls for Federal Information Systems and Organizations," April 2013, appendix F-AC, F-18, AC-5 "Separation of Duties."
10. Tom Gillis, "Cost Wars: Data Center vs. Public Cloud," Forbes Online, September 2, 2015.
Jason Stein is an IT security analyst and ISSM at Purdue University.
Michael Shuey is an infrastructure architect at Purdue University.
© 2016 Jason Stein and Michael Shuey. This EDUCAUSE Review article is licensed under the Creative Commons BY-NC-SA 4.0 International license.