Campus Cyberinfrastructure: A Crucial Enabler for Science

© 2005 Peter A. Freeman and Guy T. Almes

EDUCAUSE Review, vol. 40, no. 1 (January/February 2005): 64–65.

Peter A. Freeman is Assistant Director of the National Science Foundation, responsible for the Computer and Information Science and Engineering Directorate and for Cyberinfrastructure. Guy T. Almes is a Program Director at the National Science Foundation, working within the Shared Cyberinfrastructure Division of the Computer and Information Science and Engineering Directorate. Comments on this article can be sent to the authors at [email protected].

Driven by the needs of college/university researchers and guided by a blue-ribbon advisory panel chaired by Daniel E. Atkins, the National Science Foundation (NSF) has initiated a broad, multi-directorate activity to create modern cyberinfrastructure and to apply it to transforming the effectiveness of the scientific research enterprise in higher education.1 National cyberinfrastructure resources—including multi-teraflop computers, multi-terabyte online storage, rich discipline-specific data collections, advanced instruments, connecting networks, and the grid, database, and middleware software technology needed to weave this all together—represent a valuable enabler for researchers on campus.

We make two claims about cyberinfrastructure and the campus:

  1. Cyberinfrastructure is of strategic value to the college/university as an institution.
  2. Cyberinfrastructure’s "last mile," what we might call campus cyberinfrastructure, presents both challenges and enhanced relevance for campus IT organizations.

Strategic Value

The report by the NSF blue-ribbon advisory panel makes excellent points about how advanced computing and storage resources, data, and instruments transform the capabilities of campus-based scientists. It also presents compelling examples funded from a variety of NSF directorates:

  • Kelvin Droegemeier’s team of atmospheric scientists at the University of Oklahoma is able to receive real-time data from 150 of the National Weather Service’s Doppler radars and use these data to initialize mesoscale forecast models executed at the Pittsburgh Supercomputing Center.
  • The Spallation Neutron Source, now being constructed at Oak Ridge National Laboratory, is being integrated into NSF’s TeraGrid facility to provide real-time access to a unique "neutron accelerator" for scientists in such fields as materials science and biology and to make data immediately available to other TeraGrid elements such as the new SGI Altix cluster at the National Center for Supercomputing Applications (NCSA) and data-intensive computing facilities and advanced storage facilities at the San Diego Supercomputer Center (SDSC).

Note that remote instruments, remote computation, and remote storage facilities are involved in these cases. Networks thus play a crucial role in their success.

Challenges and Relevance

Although campuses can contribute local resources in the areas of advanced instruments, computation, and storage, the focus of this column is on networking. For even if a given campus has advanced instrument, computation, and storage resources, researchers on that campus will still have a compelling need to access larger-scale advanced computing and storage resources and/or different advanced instruments. In fact, those campuses with advanced local resources are particularly likely to have researchers engaged in activities whose scale of requirements or whose breadth of collaborations will require high-speed data transfers among collaborating sites.

Further, there is now evidence that when the campus network and host computers are well engineered, wide-area flows of surprisingly high performance are possible. As an extreme example, researchers at Caltech recently demonstrated single TCP flows of about 7.4 Gbps between Caltech and the CERN laboratories in Geneva. The wide-area facilities used were the same advanced production networks (CENIC within California, Internet2’s Abilene backbone between Los Angeles and Chicago, the StarLight exchange point in Chicago, and the LHCnet link used by the Large Hadron Collider community between Chicago and CERN) used by many academic researchers. What was unusual in this test were the very well engineered host hardware and operating software and the very cleanly deployed local area networks (LANs). More typically, in cases where the slowest link along a path is a lightly loaded gigabit Ethernet, large fractions of the available bandwidth are achievable, again when the LAN and the host components are well engineered.
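
At these speeds, the binding constraint is usually the amount of data a TCP sender must keep in flight, which is roughly the product of the target bandwidth and the round-trip time. The back-of-the-envelope sketch below (in Python) illustrates the arithmetic; the 160-millisecond round-trip time is an assumed figure for an illustrative Caltech-to-CERN path, not a value taken from the test itself.

    # Back-of-the-envelope sketch of the bandwidth-delay product.
    # The round-trip time is an assumed, illustrative figure, not a
    # measurement from the Caltech-CERN test described above.
    bandwidth_bps = 7.4e9    # target single-flow rate: 7.4 Gbps
    rtt_seconds = 0.160      # assumed round-trip time (about 160 ms)

    # A TCP sender must keep roughly one bandwidth-delay product in flight.
    bdp_bytes = bandwidth_bps / 8 * rtt_seconds
    print(f"Data in flight required: {bdp_bytes / 1e6:.0f} MB")
    # About 150 MB, far beyond default socket buffers, which is why careful
    # host tuning and clean, loss-free LAN paths matter so much at this scale.

Sustaining such a window also means that even occasional packet loss collapses throughput, so clean LAN engineering, and not just raw link speed, determines the end-to-end result.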

The claim here is not that such end-to-end speeds should be supported everywhere on campus. After all, these speeds are probably not needed everywhere, and the costs of supporting them everywhere would be large. The claim, rather, is that campus IT organizations should organize to support such speeds to those sites on their campus where the speeds are needed by specific research projects.

Doing so requires several things. First, at the level of campus-wide plumbing, an institution needs a physical plant that can support high-end LANs, specifically the trenches and conduits between buildings, the conduits and closets within buildings, and the single-mode fiber either provisioned or easy to add within those conduits. Georgia Tech, for example, was fortunate to be able to do this early on as part of the preparation for the 1996 Olympics; the resulting ability to project high-performance LANs to any part of campus with programmatic demand has been of great value.

Second, the IT organization needs to be aware of and talk with those faculty members whose research and education projects properly motivate high-end cyberinfrastructure requirements. What are their network requirements? When might elevated performance requirements arise in the future? Understanding these requirements opens up funding sources, whether associated with the institution or with the project. Equally important, those whose work requires the high performance are typically the ones who provide, engineer, and operate the host computers whose proper engineering is necessary for effective end-to-end performance. Further, working with people who care deeply about positive outcomes is probably the only way to achieve success.

Third, the IT organization should implement a program of performance quality control to guide these efforts. It can start by working with wide-area networking organizations, such as gigaPoPs and Internet2, to measure, on an ongoing basis, the performance between the campus and key points within the core of the wide-area network fabric. This will allow the organization to know when things are working well and to complain convincingly when they aren't. It will also allow an explanation of what performance levels are achievable between high-end parts of the campus LAN and those same key points within the wide-area core and, by implication, end-to-end to collaborating sites that have similar campus-to-core performance. By making these measurements on an ongoing basis, the IT organization can notice problems quickly and is more likely to fix them soon after they arise.
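
As a concrete illustration of what such ongoing measurement might look like, the minimal sketch below (in Python) periodically runs throughput tests toward a small set of measurement points and appends the results to a log. The host names are hypothetical, and the sketch assumes cooperating iperf servers have been arranged with the gigaPoP or backbone operators; production measurement infrastructures are more elaborate, but the idea of measuring and logging over time is the same.

    # Minimal sketch: ongoing campus-to-core throughput measurement.
    # Host names are hypothetical; each must run a cooperating iperf server.
    import datetime
    import subprocess
    import time

    MEASUREMENT_POINTS = [
        "iperf.example-gigapop.net",    # hypothetical regional gigaPoP host
        "iperf.example-backbone.net",   # hypothetical backbone core host
    ]

    def measure(host):
        """Run a 10-second iperf client test and return its report text."""
        result = subprocess.run(
            ["iperf", "-c", host, "-t", "10", "-f", "m"],
            capture_output=True, text=True, timeout=60,
        )
        return result.stdout

    while True:
        stamp = datetime.datetime.now().isoformat()
        for host in MEASUREMENT_POINTS:
            try:
                report = measure(host)
            except Exception as exc:
                report = f"measurement failed: {exc}"
            # Append to a running log so trends and sudden drops stand out.
            with open("campus_to_core.log", "a") as log:
                log.write(f"{stamp} {host}\n{report}\n")
        time.sleep(3600)    # repeat hourly

Reviewed over weeks, such a log shows what campus-to-core performance is normal, so a regression stands out quickly and can be reported to the regional or national operator with evidence in hand.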

Fourth, this performance quality-control work should be applied to key high-end projects on campus. These projects will be less interested in measurements to the core but will be very interested in monitoring the performance achieved to their collaborating sites.

Fifth, the institution should work closely with other campuses to strengthen regional structures, such as the gigaPoPs and the Regional Optical Networks (RONs) emerging across the country, and also national structures, such as Internet2 and National LambdaRail. The campus cyberinfrastructure requirements and the results of ongoing performance quality control can drive these relationships.

Conclusion

The balance of this decade will be characterized by uncertainty and turmoil in the networking arena. Uncertainty will result because, although the key role of fiber is clear, there is little technical consensus on exactly what pattern of routers, Ethernet switches, circuit switches, and optical switches will best exploit these fiber facilities. Investing in the fiber plant while remaining nimble in switching equipment can help manage this uncertainty. Still, turmoil will arise as new forms of switching are added to the traditional packet-switching base. Investing in performance quality control and in communication with the cyberinfrastructure drivers on campus is key to managing this uncertainty and turmoil. Staying focused on driving applications and on "doing what works" will prevent confusion from the rapid pace of technology change likely over the next few years. By doing so, a college or university can empower its students, faculty, and staff in their most demanding cyberinfrastructure-based efforts and, in turn, establish the campus as an attractive place for such efforts to succeed.

Note

1. Revolutionizing Science and Engineering through Cyberinfrastructure: Report of the National Science Foundation Blue-Ribbon Advisory Panel on Cyberinfrastructure, January 2003, http://www.communitytechnology.org/nsf_ci_report/report.pdf.