Empowering the Use of Open-Source Software for Research

The CNI Interviews Podcast | Season 2, Episode 1

Sayeed Choudhury is Director of the Open Source Programs Office at Carnegie Mellon University Libraries. In this conversation, Choudhury discusses his new role—created to raise awareness of and stimulate collaboration among open-source projects across the university—and the impact of this new office on his institution.

Gerry Bayne: This is Gerry Bayne at the Coalition for Networked Information Spring 2023 Meeting, and I'm joined in conversation today with Sayeed Choudhury, Director of the Open Source Programs Office at Carnegie Mellon University. Thanks for being here, Sayeed.

Sayeed Choudhury: Thank you, Gerry. It's a pleasure to be here.

Gerry Bayne: So we're going to talk a little bit about the Open Source Program Office and your new role there. What is the Open Source Program Office, and why have you decided to dedicate yourself to this work?

Sayeed Choudhury: Yeah, so the Open Source Programs Office is a fairly new construct within the university setting. It's something that's been in the private sector for a few years. So we're still defining in many ways what an Open Source Programs Office is, but I think of it as a convener of the community and a center of competency within a university to help better manage, curate, and share software. So raising awareness, providing capacity, consultation expertise, bringing people together, acting as sort of a clearinghouse for resources and support.

And then once you have better managed software, there are lots of different things that a university or different sets of universities might want to do, and we're exploring a lot of that. There's six so far, funded through the Alfred P. Sloan Foundation. There'll be another round coming soon. So we're starting to see a network and a community form around them.

Why have I devoted my time to it? Well, I've always been focused on building infrastructure to support open scholarship, open science. And while in previous roles, I had responsibility for looking across articles, data, and software, this one is much more focused, as the name implies, on the software piece of it. I happen to think that there are some fundamental differences in characteristics around how software's produced and how it's shared and how communities are built, and also the impact in terms of supporting open science through things like reproducibility or better engagement with communities. So in the previous funder session, we heard about being more transparent, more inclusive, more accessible. I think open-source software is a great way to make those things happen, so I'm really eager to just explore how to make those things work.

Gerry Bayne: Can you describe what your role is at Carnegie Mellon?

Sayeed Choudhury: Yeah, so I'm the director of the Open Source Programs Office. So I've been hired to basically help build out the office and to build out a set of services around that, with some support from, as I mentioned, the Sloan Foundation, but also from CMU. Keith Webster, who is the Dean of Libraries at CMU, has been thinking about open science for a long time and has done some really great work at Carnegie Mellon, even beyond the libraries, in terms of building interests and capacity for it. So how the OSPO fits into that is a key part of my role.

Also, trying to think about sort of the external impact and potential and partnerships and collaboration for CMU researchers, CMU students, local community, all the way to federal agencies is a key part of it. We've hired recently a community manager who's going to help look at more of the internal-facing community-building efforts of the faculty and students at CMU.

Gerry Bayne: So what should folks in higher education know, or why should they care about what has primarily been a practice in technology companies?

Sayeed Choudhury: Yeah. So it's an interesting question, and I agree with you that the OSPO itself has been primarily something in the tech sector, or in the private sector. Although, I will say, at this point, companies like Walmart, American Airlines, whatever, also have OSPOs. So open source is everywhere. Recent estimates are that anywhere from 95% to 98% of all software uses open source.

Gerry Bayne: So you're saying that there's a trend toward open source that's been happening?

Sayeed Choudhury: Yeah, so the interesting thing is while the OSPO has not been a university focus, open source has been in universities over the last decades as well. The fact that the private sector has been more intentional and strategic about it, I'll say, is to their credit, and they have very clear business reasons for doing this. There's a perception of they only do it for altruism and community relations. That's simply not true. I mean, that is a benefit, but there are bottom line decisions these companies have made.

Now, obviously, universities are different in terms of the mission and the way we work and the composition of the roles, but we haven't been intentional and strategic about open-source software, even though open-source software started in universities well before the private sector picked up. There are cases of open-source software coming out of universities, like Hadoop, that got adopted by the private sector and the Apache Foundation and so on. But universities are a little bit late, in my opinion, to come to realize that there is open-source software being produced, there is open-source software being used, and that there's real value in being intentional and strategic and organizing around that. So it's more a matter of it's already been happening, and now wouldn't you like this to be a little bit more intentional, particularly given a lot of the emphasis on open science?

Gerry Bayne: Right. So I'm just thinking out loud here about the OSPO office. Ten years ago, you didn't see the CSO role very much.

Sayeed Choudhury: Correct. Exactly.

Gerry Bayne: Is this sort of the same sort of thing? You didn't see this around-

Sayeed Choudhury: That's right.

Gerry Bayne: ... and it's sort of a new... Could you trace roughly how long this has been fomenting?

Sayeed Choudhury: Yeah. No, it's a really good observation. So I think one of the things that Josh Greenberg, the program officer at the Sloan Foundation who's funding these OSPOs, had said to me about previous efforts when they funded data science institutes is new roles were created within universities, right? So fifteen years ago, there was no such thing as a data management consultant or a data wrangler, whatever those terms might be. So I think we'll start to see a lot of these new roles evolve in the university context.

The advantage we have in some sense, given what the private sector's done, is they've kind of blazed the trail. They have community managers, they have maintainers, they recognize the value of those roles, and I think that'll start to happen in the university context.

So sort of the fomenting of it in some sense is, as I said, open source has been in universities for a long time, but as there's been more of a shift over time through OSTP memos and public access, the White House declaring 2023 as the Year of Open Science, we're seeing a shift from articles to data to software. And in many ways, the software piece is critical for reproducibility, transparency, things of that nature, but we don't have as much of an understanding of what's an appropriate policy structure, what are the appropriate licenses, what should the federal funders expect of universities. So there's a greater recognition that it's a critical piece, but there isn't as much awareness or capacity yet.

Gerry Bayne: You talked about your role. Can you talk about the role of open source at a university? I mean, when I think about it as just a layman, I think, "I'm going to get Blender. Great. Maybe I'll help develop Blender." What do we need you to do? And I don't mean to be insulting, just-

Sayeed Choudhury: No, not at all.

Gerry Bayne: ... I want to understand what are you tracking, what are you directing. If it's open source and it's just kind of out there, it's free, what needs to be wrangled?

Sayeed Choudhury: Not insulting at all. As I said, these are early days. You're not the only one with these questions. So better to get this out now than to keep it hidden.

So in essence, the way I've argued about open-source software is it is a primary research object. Just like articles, just like data, there's open hardware, there's other kinds of things. But open-source software is a primary research object.

I think you won't get a lot of disagreement about that, right? So in a university context, if you ask someone, "Do you care about open source?" you'll get a wide degree of answers. But if you ask, "Do you care about research outputs?" everyone will say, "Yes, of course. I care about that." So it has the benefit of activating the whole university support around research interest and support on research.

And when you start to think of, "Well, yeah, it is a primary research object, therefore, we should know how we're producing it, we should know how we share it and what impact it's having not only in the academic sense of citation and connections to data and so on, but local communities, other universities, other companies." A really important partnership is with tech transfer, which typically looks at commercialization, which is an option around open-source software. But ultimately, they are about new forms of impact as well. So trying to measure those, I think, is key.

But where are we today? I couldn't tell you all the open-source software that's being produced at Carnegie Mellon. And I would submit no university could tell you that right now. So we have some foundational types of information and pieces of infrastructure to put into place before we can start really addressing those things. But they can happen in parallel, right? We don't have to wait until we have this full inventory to start all those other kinds of useful strategic things.

Gerry Bayne: That makes a lot of sense. Can you talk about the connections between open source and cybersecurity?

Sayeed Choudhury: Yeah, so I can, but I will point out a talk that was given recently at Carnegie Mellon by the director of CISA, the Cybersecurity and Infrastructure Security Agency, Jen Easterly. It was called Unsafe at Any CPU Speed. I would encourage your listeners to look it up and either watch the talk or read the transcript, because she did a great job talking about exactly this question.

Gerry Bayne: I'll be sure to put that in the show notes for this.

Sayeed Choudhury: That'd be great. So I think the key question is that open-source software is a part of critical infrastructure now. Digital infrastructure, but it's increasingly connected to physical objects, right? So my television is now an open-source software platform. It runs Android OS, and quite frankly, sometimes it's flaky and whatever, right? So it's not to say that there still aren't questions to explore around this, but open-source software is becoming a key part of everyday life. And therefore, any issues around its security or its vulnerabilities can affect your everyday life.

I don't mean to speak for the director of CISA. I do think people should listen to the talk directly. But a fundamental premise was the comparison to the auto industry in the past, right? So decades ago, if people had accidents and they were hurt and they died, there was this sense of, "Well, you're just a bad driver, it's your fault." But over time, we recognize that putting that burden on the user is not appropriate, and that the auto industry has a much more robust capability to make structural changes cascade throughout the system. And now you have seat belts and airbags and crumple zones and crash testing and so on. And her premise was the same thing needs to happen with software. And there's an educational aspect to it, there's a regulatory aspect to it. There's multiple dimensions.

But the good thing is open-source software is indeed open. So a lot of people talk about the Log4j security incident that happened, and it was a very serious issue, but it was addressed openly by the community as well.

Going back to automobiles, I used to own a Honda. I guess I shouldn't pick on that company. I'm sure this happens to all companies. They sent me this very nice letter saying, "We need you to come in to get your airbags replaced." So I'm sitting here, thinking, "Wait, what does that mean? I've been-

Gerry Bayne: I hope I'm...

Sayeed Choudhury: ... driving this car."

Gerry Bayne: I hope I'm okay on the drive over there.

Sayeed Choudhury: Exactly. And I think they gave me a free oil change as compensation. It ultimately comes down to trust. I'm trusting this auto manufacturer to do the right thing and have done all the things before that recall happened to make sure it shouldn't have happened, and then after the recall happens, that they'll do the right thing.

Well, okay, is that better than putting trust into a community of universities or even companies who are working openly and saying, "Here are the security issues that are involved, and here's how we're addressing them"? And you only hear about the ones that hit the news, right? You don't hear about the hundreds of cases that come up where there are processes in place, there are organizations in place that make sure it never happens.

So it's a key part of how we think about security. Cybersecurity particularly gets integrated into things we do every day. But there's nobody, CISA, Department of Defense, you name it, that's saying, "Don't use open-source software." Everybody's saying, "We have to continue to get the benefits here, we just need to do it in a safe and secure way."

Gerry Bayne: So can you talk about any suggestions you might have for any institutions that are interested in establishing an OSPO at their university?

Sayeed Choudhury: Sure. At a session earlier today, we had a panel, and I would encourage people to look at the recording. The six institutions that have been funded by the Sloan Foundation were represented there. We also have produced a playbook, or a guide, about exploring open-source software within your university, and then if desirable, how to create an open-source programs office.

I had mentioned the Sloan Foundation's recent call for OSPOs. As part of an informational session for that, we recorded a description of that guide and an overview of those questions.

But beyond those resources, I think there's a community that's starting to form, basically, around university OSPOs. So you've got these six that are funded now, there'll be a few more. I'm hoping that that group, obviously, will have its own internal kinds of goals and considerations, but can become a place where people start to see common patterns evolve and even differentiating patterns that might resonate with the type of institution that people are working in.

But it is early days. I would be remiss if I said, "We've got a clear pattern, and this is what you do." A lot of this is learning, sharing those learnings and evolving from there. But I think we've gotten enough of a foundation through those resources I mentioned that people could start.

Gerry Bayne: And where might someone find that guide you referenced?

Sayeed Choudhury: So the guide is available on a website for an organization called OSPO++. I can send you those resources-

Gerry Bayne: Sure.

Sayeed Choudhury: ... as well. And then the informational session is on the Sloan Foundation's site for launching the next set of OSPOs, and I can point you to that as well.

Gerry Bayne: Great. So Sayeed, is there anything about this that we haven't touched on that you'd like to mention?

Sayeed Choudhury: Yeah, one thing in particular, the role of working with communities. So we've heard this throughout the CNI, which I think is really great, is how do universities better engage with people outside of the walls of the university, right? But speaking very broadly. Whether that's within your city or throughout the world. And I was involved in some work previously when I was at Hopkins, and we're starting to explore this at CMU as well, where using open source helped build the connection with the local community center in terms of trust, in terms of transparency, in terms of believing that we weren't here to just say, "We know what your problems are and we're going to solve them for you."

There's a group at Carnegie Mellon called the Center for Shared Prosperity, and the director of that group said it very eloquently. He said, "Our goal is to give people in Pittsburgh the agency to solve their own problems." And one of the ways universities can act as equals or partners, right, not say, "We're the experts, you're the victims," is through open source. So anyone can look at the code, anyone can join, typically, the Slack channels where the development is happening, anyone can contribute. Anyone can say, "I don't think this is the way they should be." Anyone can test the user interface. Anyone can participate in the design. That's a really powerful way of signaling to the community that we're not here just to tell you what to do or how to do it, we want to work with you. I think open-source software plays a key role in that.

There's somebody in Pittsburgh who has many roles, but he's sort of a force of nature in the community. His name is Majestic Lane, and he said this incredible phrase where he said, "What we need to do is flatten the landscape." It's not about helping people come up a hill, right? It's about flattening the landscape so they can walk on their own. I think open-source software can play a key role in that.

Gerry Bayne: A great way to end. Thanks so much for your time, Sayeed.

Sayeed Choudhury: Thanks. I appreciate it.

This episode features:

Sayeed Choudhury
Director, Open Source Programs Office
Carnegie Mellon University