From 2022: ‘Reflections on guest editing a Frontiers journal’

Serge Horbach, Michael Ochsner, and Wolfgang Kaltenbrunner, in a Leiden Madtrics post, detail a vexing guest-editing role at a Frontiers journal, circa late 2022:

Reviewers are selected by an internal artificial intelligence algorithm on the basis of keywords automatically attributed by the algorithm based on the content of the submitted manuscript and matched with a database of potential reviewers, a technique somewhat similar to the one used for reviewer databases of other big publishers. While the importance of the keywords for the match can be manually adjusted, the fit between submissions and the actually required domain expertise to review them is often less than perfect. This would not be a problem were the process of contacting reviewers fully under the control of the editors. Yet the numerous potential reviewers are contacted by means of a preformulated email in a quasi-automated fashion, apparently under the assumption that many of them will reject anyway. We find this to be problematic because it ultimately erodes the willingness of academics to donate their time for unpaid but absolutely vital community service. In addition, in some cases it resulted in reviewers being assigned to papers in our Research Topic that we believed were not qualified to perform reviews. Significant amounts of emailing and back-and-forth with managing editors and Frontiers staff were required to bypass this system, retract review invitations and instead focus only on the reviewers we actually wanted to contact.

Their post appeared just one month before ChatGPT’s public roll-out. How many AI peer-review “solutions” like this are in the works now?

‘Towards Robust Training Data Transparency’

As if on cue, Open Future releases a new brief calling for meaningful training data transparency:

Transparency of the data used to train AI models is a prerequisite for understanding how these models work. It is crucial for improving accountability in AI development and can strengthen people’s ability to exercise their fundamental rights. Yet, opacity in training data is often used to protect AI-developing companies from scrutiny and competition, affecting both copyright holders and anyone else trying to get a better understanding of how these models function.

The brief invokes core Mertonian science norms in its argument to put muscle behind Europe’s AI Act:

The current situation highlights the need for a more robust and enabled ecosystem to study and investigate AI systems and critical components used to train them, such as data, and underscores the importance of policies that allow researchers the freedom to conduct scientific research. These policies must include a requirement that AI providers be transparent about the data used to train models […] as it will allow researchers to critically evaluate the implications and limitations of AI development, identify potential biases or discriminatory patterns in the data, and reduce the risk of harm to individuals and society by encouraging provider accountability.

‘AI Act fails to set meaningful dataset transparency standards for open source AI’

Open Future’s Alek Tarkowski, writing in March about Europe’s AI Act:

Overall, the AI Act does not introduce meaningful obligations for training data transparency, despite the fact that they are crucial to the socially responsible development of what the Act defines as general purpose AI systems.

Tarkowski’s post is nuanced, and well worth a read. My mind kept drifting to the scholarly-publishing case—in which scholars’ tracked behavior, citation networks, and full-text works might train proprietary models built by the likes of Elsevier. As Tarkowski hints here—echoing Open Future’s July 2023 position paper—open science norms around data sharing should be brought to bear on legislation and regulation. The case for FAIR-like principles to apply to models trained on scholarly data is stronger still.

‘Publishers can’t be blamed for clinging to the golden goose’

I missed this Stevan Harnad piece from last May. It is trademark Harnad:

So, you should ask, with online publishing costs near zero, and quality control provided gratis by peer reviewers, what could possibly explain, let alone justify, levying a fee on S&S [scientists and scholars] authors trying to publish their give-away articles to report their give-away findings? The answer is not as complicated as you may be imagining, but the answer is shocking: the culprits are not the publishers but the S&S authors, their institutions and their funders! The publishers are just businessmen trying to make a buck. […] Under mounting ‘open access’ pressure from S&S authors, institutional libraries, research funders and activists, the publishers made the obvious business decision: ‘You want open access for all users? Let the authors, their institutions or their research funders pay us for publication in advance, and you’ve got it!’

Harnad, the original (and wittiest) advocate for the “green” repository route, is basically right. It’s not just scholars, of course—we’re not free agents when it comes, say, to productivity metrics imposed by university managers. But the academic system as a whole (funders included) is responsible for letting the oligopolist publishers laugh, as Harnad has it, all the way to the bank.

‘He Wanted Privacy. His College Gave Him None’

I missed this great Markup piece when it was published last November. It tells the story of dorm-to-classroom surveillance through the lens of a California college student:

By the time Natividad went to bed that night, Google and Facebook had data about which Mt. SAC webpages he’d visited, and a company called Instructure had gathered information for his professors about how much time he’d spent looking at readings for his classes and whether he had read messages about his courses. Campus police and a company called T2 Systems potentially had information about what kind of car he was driving and where he parked. And as he drifted off to sleep, Natividad had to contend with the worry that, later this semester, his professors could subject him to the facial detection software incorporated into the remote proctoring tools used at Mt. SAC.

The Markup story touches on textbook surveillance:

This semester, one of Natividad’s professors assigned a digital textbook through Cengage, a publishing company turned ed tech behemoth. […] According to Cengage’s online privacy policy, the company collects information about a student’s internet network and the device they use to access online textbooks as well as webpages viewed, links clicked, keystrokes typed, and movement of their mouse on the screen, among other things. The company then shares some of that data with third parties for targeted advertising. For students who sign into Cengage websites with their social media accounts, the company collects additional information about them and their entire social networks.

The Markup story might have added: When a student turns to research a term paper, they’re also being tracked there. Surveillance publishers like Elsevier harvest a shocking amount of data through their article-delivery platforms. Your journals, to paraphrase Sarah Lamdan, are spying on you.

‘Thomson Reuters announces expanded vision to provide GenAI assistant for every professional it serves’

The information conglomerate Thomson Reuters, in a press release announcing an “expanded vision” for its “professional-grade GenAI assistant”:

CoCounsel is an AI assistant that acts like a team member – handling complex tasks with natural language understanding. Completing tasks at superhuman speeds, CoCounsel provides high-quality information at the right time, maintains multiple threads of work, as well as keeping context and memory across the different tasks and products customers use each day. By augmenting professional work with GenAI skills, CoCounsel delivers accelerated and streamlined workflows, enables professionals to produce higher-quality work more quickly, all while keeping customer data secure.1

The CoCounsel name is, it seems, a nod to Thomson Reuters’ Westlaw and other legal businesses—and a lazy riff on Microsoft’s Copilot. Either way, it’s another publishing-adjacent colossus picking up the pace in the race to re-monetize “content” through AI.


  1. Probably written by CoCounsel.

Jeff Pooley is affiliated professor of media & communication at Muhlenberg College, lecturer at the Annenberg School for Communication at the University of Pennsylvania, director of mediastudies.press, and fellow at Knowledge Futures.
