GraspOS Chats #11: Get to know Kumar Guha and Samuel Scalbert

03 April 2025

IN THIS SERIES OF INTERVIEWS WE FEATURE GRASPOS MEMBERS TO PRESENT THE PEOPLE BEHIND THE SCENES. WHAT IS THEIR ROLE IN THE PROJECT, THEIR BACKGROUND, AND THEIR VIEW ON THE NEED FOR AN OPEN SCIENCE-AWARE RESPONSIBLE RESEARCH ASSESSMENT SYSTEM?

In this interview we feature Kumar Guha and Samuel Scalbert from Inria. Kumar is responsible for the Open Archive Unit at Inria's Scientific Information Department, and Samuel is an expert in text and document data mining. They provide an overview of their work in the GraspOS project, both as part of the Computer Science pilot and on the Open Science Assessment Framework and the GraspOS catalogue of tools and services.

Hello Kumar and Samuel, first of all, could you briefly introduce yourselves?

Kumar: I am a research librarian as well as a translator by training, and I am in charge of the open archive at Inria, the French Institute for Computer Science. My work focuses mostly on publications and software. Throughout my career, my motivation has been to facilitate information and communication across communities, and to give priority to quality over quantity. A famous French philosopher, Michel Serres, used to say that now that computers can be our memory, our brains have more space for developing intelligence. I coordinate a group in charge of data quality in the national open archive HAL, and I also take part in a data analysis group which answers complex questions from science directors to help them define their strategic policies.

In GraspOS, Samuel and I, along with our Director, Laurent Romary, take part in various Work Packages as part of the Computer Science Pilot, in which Inria, the University of Bologna and Athena Research Center collaborate. We contribute our expertise on the specifics of the computer science field and are developing a tool for identifying software mentions in the full text of publications. We also take part in the CoARA Working Group Towards Open Infrastructures for Responsible Research Assessment (OI4RRA) (subgroups 1 and 2), and try to attend the OpenAIRE Community calls as often as our schedule allows.

Samuel: I’m an engineer and historian, and I mainly work on software mentions at Inria. Through my work, we aim to identify, disambiguate, and display software cited in research articles from Inria. I am committed to making my work open and accessible so that it can be used by institutions, laboratories, and authors alike.

Reflecting on the past year, what are the most significant achievements you have been part of so far?

Kumar: A great deal of work was done last year. I am happy to have taken part in the work on the Open Science Assessment Framework (OSAF). The OSAF gives recommendations on the right questions evaluators should ask themselves. It also provided solid ground for the infrastructure GraspOS is going to make available. I also think the joint working groups with the CoARA project are a great way to ensure the consistency of Open Science services as a whole. On the computer science side, we were able to give institutional pilots some insights into the specifics of our field. And I hope we contributed fruitfully to the general reflection on the GraspOS services and tools catalogue. At Inria, we also developed a tool to identify software mentions in the full text of scientific publications. Samuel is the main creator of this tool, so I will let him give more details about this work.

Samuel: For me, the most significant achievement is that my tools are actively being used at Inria. In a few months, we will have the resources to process Inria’s full dataset, which contains approximately 1.4 million documents.
I also work on a project called SOFAIR (Making Software FAIR: A Machine-Assisted Workflow for the Research Software Lifecycle). We are currently training the AI used for identifying software mentions in order to improve its performance.

What challenges have you or your team faced, and how did you overcome them?

Kumar: At the very beginning, during the conceptual part of the project, progress seemed a bit slow, but gradually, on the basis of the OSAF definition, the group gained momentum. The challenge now is to make the definitions, the infrastructure, and the tools and services catalogue readily available, so that evaluators, whether institutions or researchers, correctly understand the proposed framework and can apply it to their varied realities and goals. The diversity of the Pilots enables us to offer a comprehensive and qualitatively satisfying framework for a more transparent and responsible research evaluation.
Technically, we need to define metadata that is as exhaustive as possible and as flexible as possible at the same time.

Samuel: The main challenge we face is the disambiguation process. We are currently conducting experiments to identify the most effective methods. Through these experiments, we hope to develop a solution that is both efficient and minimises noise in our dataset.

Looking forward, what legacy do you think GraspOS will leave behind? If you could highlight one key contribution of GraspOS to supporting an Open Science-aware research assessment system, what would it be? 

Kumar: It is hard to predict the future of a project that embraces a theme as large as Research Assessment. Let us hope that it will be the beginning of a new way of thinking about and doing research assessment, based on Open Science. What I take away from participating in this project is the need to keep in mind interoperability, consistency with other Open Science standards, and adaptability to various platforms. It is hard to single out one main contribution, for the scope of the project is wide and all its components are interlinked. Providing a framework and recommendations, and proposing a set of tools and services, should greatly serve the scientific community.

Samuel: Our work with HAL will make connections between software and publications. This paves the way for similar work to connect research data as well. One of the tools I developed allows anyone to launch the full SOFTware pipeline with a single command line, resulting in a web app displaying all the mentions. It is called SOFTware-Hub.

On a lighter note, what is your favorite way to relax and recharge after a busy week?

Kumar: I recently started watercolor painting and found it quite relaxing! After a busy week, meeting with friends and family, or going together to an exhibition or a show, usually helps recharge the batteries!

Samuel: I usually just meet with friends or skate in Paris to discover the city. 

Thank you for your time Kumar and Samuel!

Written by

Lottie Provost