Building The Public Interest Corpus for AI and Computational Research
This episode explores an effort to create a public-interest corpus for AI training using digitized materials from research libraries, archives, and special collections. Dan Cohen, dean of the Libraries at Northeastern University, and Thomas Padilla, public interest artificial intelligence strategist for Authors Alliance, outline their goal of expanding AI’s access to high-value, long-form academic content, which is typically absent from commercial models.