Updates from the San Diego Supercomputer Center

The CNI Interviews Podcast | Season 1, Episode 4

Director Frank Wuerthwein, of the San Diego Supercomputer Center, discusses their goals and strategies, hardware and software investment, as well as the Open Science Grid.

Gerry Bayne: On this episode we feature Frank Wuerthwein, Director of the San Diego Supercomputer Center.

In this discussion we talk about the Open Science Grid, the hardware and software investment at the SDSC, and more. I started our discussion by asking about the current goals and strategies for the SDSC in 2022 and beyond.

Frank Würthwein: My name's Frank Würthwein. And I'm the director of the San Diego Supercomputer Center, which is located at the University of California, San Diego.

Gerry Bayne: What are some of the current goals and priorities for the San Diego Supercomputer Center?

Frank Würthwein: The fundamental objective of the San Diego Supercomputer Center is to translate innovation into practice. Innovation may come from anywhere in industry or academia or foundations, and have to do with anything out of the broad area of hardware, software systems, data science algorithms, the whole stack in a way. And practice can literally be any domain. And in particular currently, we're very excited about societal impact, the entire direction of convergence research, where we take societal challenges, like for example wildfire in the Western United States, and develop and apply modern technologies to the problem, develop systems, platforms, research programs, to aid both the people on the ground who fight the fires as well as planning capacity, prescribed burns and that sort of thing.

Another example in that vein is working with the school district on a COVID-19 app in order to aid decision making by parents, by teachers, by principals, on the ever evolving COVID pandemic. Those kinds of things are things that we're very excited about among the many things that we do.

Gerry Bayne: Could you give us a sense of the SDSC's hardware and software investment strategies?

Frank Würthwein: That's an interesting question. The investment strategy, I think the short answer is, opportunistic. We have, in a way, three types of avenues to invest in hardware. One is bidding on solicitations, winning awards.

The next one is to provide value, to pool resources at the university. What that means is a classic condo model, where PIs at the university can buy their own hardware. We help them choose what it is. We then operate it for them, and we offer services that go from system administration all the way through the stack to higher level issues of porting software to new architectures, of working with PIs on modern data science issues, making trainings available to the groups, et cetera.

And then the third leg of the hardware investment is all about being relevant and useful to both local and national industry, as well as foundations, non-profits, the entire public-private sector, being able to provide value to that and making hardware investments backed up by funding from those entities.

Gerry Bayne: Now this is a very naive, outside-the-industry question. When we talk about hardware, what kind of hardware are we talking about?

Frank Würthwein: Again, a very good question. Well, there's the traditional workhorses for high performance computing, which basically is x86, it's CPUs from Intel or AMD with a few GPUs thrown in, but that landscape is very rapidly changing. In fact, one of the biggest challenges to the field at large in my mind, and the biggest opportunities, is that every single dumb device of old is now programmable, or exists in a high end version that includes some kind of programmable device in it. Think of the network interface cards, NICs. They now come with FPGAs on them, with GPUs on them, with CPUs on them. Think storage. You can now buy an NVMe drive that has a built-in FPGA, making the NVMe drive itself programmable. Think of switches. Switches now are programmable, have individual ports that are programmable. Each and every one of these new categories of programmable devices comes with its own programming paradigm.

You have computational storage, you have data flow programming, and all of these kinds of things need to be translated into actual systems that do something useful for either science or societal impact. And where exactly that goes is a very active, very dynamic, very heterogeneous landscape. So I see that more and more hardware investments at SDSC are not just the traditional clusters that are confined to the data center and are used by thousands or even hundreds of thousands of users, but also these very esoteric, one-of-a-kind types of hardware that exist in niches to do certain things very, very well. For example, we have a piece of hardware, which is a DRAGEN system from Illumina, that accelerates gene assembly. We use that for all our COVID pipelines for the university. So the university runs its own COVID infrastructure, wastewater testing for students and faculty, the whole nine yards, through specialized hardware at SDSC in order to achieve its goals, to allow Return to Learn to happen.

Gerry Bayne: With all these additions to devices, programmable devices, does that add a level of complexity that creates backup or any sort of problems because it's much more complex?

Frank Würthwein: Totally. You... We are looking at a massive explosion of complexity.

Gerry Bayne: Is that good news or is it bad news?

Frank Würthwein: It's bad news. On some level, it's bad news, but it's also an opportunity. The way that I look at it is that Moore's law is kind of slowing down for the traditional CPU hardware. We've benefited from decades of scaling, of hardware getting cheaper, computing getting cheaper because CPUs got cheaper and faster and so forth. And that has slowed down a little bit.

It's rescued by GPUs. GPUs were sort of the first diversification of our hardware, making things more heterogeneous. After the GPUs come the DPUs, the TPUs, the whatever-PUs that will come, and all of this, ultimately, is driven by that slowdown.

If we were still in the position that we were in 15 years ago, when things just got so much faster, so much cheaper, then we wouldn't need all of this heterogeneity. But we are at a level of miniaturization where progress is no longer made as fast. So now you have to be making that progress elsewhere. And that includes heterogeneous architectures. That includes AI architectures dedicated to machine learning. We have a brand new machine where the hardware architecture is entirely optimized for machine learning.

Gerry Bayne: Wow.

Frank Würthwein: And so I'm seeing a future where large scale computing will happen more and more on a heterogeneous set of hardware, where some things get done on a special thing here. Then some things get done on the workhorse, which is the standard that everybody uses. And then some things get done over there on this little special thing. Assembling the workflows and the platforms that make all of this, again, easy to use is a huge challenge. And it's a challenge that is, in a way, a good thing for a place like SDSC, because it makes us have enormous added value. It's a bad thing for the scientists who have to deal with all of that complexity and need to employ us, in a way, to make that complexity go away, be hidden behind easy-to-use interfaces.

Gerry Bayne: You kind of covered a little bit of this, but what are some of the biggest challenges in your organization in 2022?

Frank Würthwein: There's two types of challenges. One is always the case. Sustainability is always a challenge if you are a soft money organization like us. SDSC has 269 employees right now, and a revenue of about 50-some million dollars a year. Now sustaining that is a constant hustling for new awards, service agreements, customers, the whole bang. We're not a regular department, 90% of which sustains itself just by teaching and the university paying the faculty for nine months of their salary. SDSC is a very different kind of beast. So, for us, sustainability is always a challenge, will always be a challenge, is always the number one challenge.

Now, acknowledging that, the big technical challenge is the one that I mentioned earlier: the diversification of computing, building infrastructures that can make use of these diverse devices, these heterogeneous ways of computing, in a way that is easy to use for the end user, scientist or researcher. That is the big challenge. And it'll be a lot of fun to sort that out.

Gerry Bayne: Can you talk a little bit about your involvement with the Open Science Grid?

Frank Würthwein: The open-

Gerry Bayne: First can you tell us what it is?

Frank Würthwein: Exactly, let's start with what is the Open Science Grid? The Open Science Grid is a consortium dedicated to the advancement of all of open science via the practice of high-throughput computing and the advancement of the state of the art of high-throughput computing. What does that mean? It's basically dedicated to all science that can use heavily distributed computing by virtue of encapsulating every single computational unit into something that can be run independently and can be re-run independently on a single node or device. I was one of the founders of this entity in 2005. We created this entity by first creating a framework and a structure to deal with sustainability, to have an unfunded consortium and consortium leadership in place that can govern itself via some kind of bylaws mechanism, and then guide the funded part of the consortium through ups and downs, changes in funding paradigms, changes in solicitations, et cetera, over the years.

And we managed to make this work since 2005 continuously, and sustainability has worked even though there were periods in between where we had to cobble together the funding from six, seven, eight different grants. And that's basically what OSG is. As for my role in it, they put me in charge of writing the bylaws, and I was the founding executive because I didn't want to be the executive. I was the commonly agreed-on person because it was totally clear that I only wanted to use it as a scientist, and I was going to get out of that position as quickly as possible. That way I became part of it. And then for the longest time, I was the spokesperson for applications that use it, spokesperson for the user community, and negotiated interfaces with the CS side and with the IT side of the organization. And then I became, again, the executive in 2015. I am currently the executive director in charge of the day-to-day operations, basically.

Gerry Bayne: So I've got one final question for you. I come from EDUCAUSE, which is more of a general higher ed IT organization. Is there anything you'd like folks to know about what you're doing at SDSC that may be of interest to those outside the data and computing field?

Frank Würthwein: As I described the biggest challenge earlier, some of that is non-technical. Some of the things that we need to do today is reduce the gap of opportunity. I used to call it the geek gap, because I used to think that it was technical, that some people just didn't know how to do things, and therefore there was this ever-growing geek gap developing that made it difficult for some institutions to participate. In the recent past, I've come around to looking at it as much more of a social phenomenon and an opportunity gap phenomenon than a geek gap phenomenon. It's not that people don't know. People don't know because they don't have the resources to know.

And so bridging the gap, democratizing access, in order for us as a society to maximally benefit from the human capital that we have inherently, because more people have access to make the most out of their lives and make choices, career choices, that advance themselves broadly, because they have access to broadband, they have access to computing, they have access to software. They can participate in the knowledge industry that is built on the technology industry. All of those things are places where I think EDUCAUSE can play a role and help achieve the larger goal of creating open infrastructure, open science, open data, open software for an open society, ultimately in the interest of providing means such that everybody has the opportunity to make the most out of their lives. And that is, I think, something that goes well beyond just software and computing. It requires social organizations. It requires us thinking about what we are willing to share, thinking about what structures we are building. We have the opportunity to build this now, because there's enough awareness that that gap has been growing and we need to put effort into shrinking it again for our own self-interest.

Gerry Bayne: That's a great note to end on. Thank you very much for your time, Frank.

Frank Würthwein: Thank you.

Gerry Bayne: Very much appreciate it.

Frank Würthwein: Pleasure talking with you.

This episode features:

Frank Würthwein
San Diego Supercomputer Center
UC San Diego