Science cluster & challenges
Summary
FASCA builds on the GoTriple platform, a multilingual discovery service developed as an EOSC service by the TRIPLE project, which was led by the OPERAS* Research Infrastructure (RI), and catering specifically to the needs of researchers in the Social Sciences and Humanities (SSH). GoTriple provides SSH researchers with access to over 15 million publications and datasets, promoting international collaboration and Open Science. The FASCA project enhances this platform by integrating data science tools to support comprehensive analysis of scholarly communication, with a specific focus on linking research to the Sustainable Development Goals (SDGs). Through two pilot studies, the project aims to deliver FAIR-compliant research outputs, including publications, data, and code, while establishing a sustainable GoTriple-based support service for data-driven SSH research.
*OPERAS is the RI supporting open scholarly communication in SSH across the European Research Area. Its mission is to coordinate and federate both technical and non-technical resources in Europe. OPERAS is an ESFRI research infrastructure on the path to becoming an ERIC in 2027. It is part of the SSH Open Cluster and has a representative on the SSHOC Governance Board.
Challenge
Open Science project, Open Science Service
The need for accessible and comprehensive research tools in the SSH has become increasingly evident as scholars struggle to navigate a vast array of multilingual resources. Traditional research methods often fall short of leveraging the full potential of large datasets and diverse research outputs. Moreover, there is a growing demand for making scholarly communication more aligned with global challenges such as the SDGs. Without efficient tools for data-driven research, the SSH community faces obstacles in fostering cross-disciplinary collaborations and ensuring Open Science practices.
Solution
FASCA aims to establish a data science support service for multilingual, data-driven research by implementing a pipeline that offers tailored, personalised research support. It will run two pilot research projects investigating the relationship between scholarly communication and the SDGs. The selected teams will receive support from the Data Science team and publish their manuscript/preprint, data and code in a FAIR-compliant way. The pilots will be used to test and develop a GoTriple Data Science support service offered on top of the GoTriple discovery platform.
The project will build a GoTriple Data Science pipeline that facilitates the use of GoTriple data (publications, projects, profiles, datasets) with the Data Science Universal Toolbox (JupyterLab, Neo4j) and improved GoTriple functionalities (GoTriple API, Chatbot, Pundit). It will enable users to run software and research pilots and to publish their research results (on GitHub, Zenodo, BinderHub) in a FAIR-compliant way by linking the results with datasets. The results will be presented on a dedicated GoTriple Project page as a manuscript/preprint, research data and code, which together form a FAIRified research object connecting publication, research data and research code.
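To make the idea of a research object concrete, the following minimal Python sketch shows one way the three components could be cross-linked. The `ResearchObject` class and its methods are hypothetical illustrations, not part of GoTriple or Zenodo; the identifiers are those of the Pilot 1 outputs listed under Research outputs.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Roles a complete research object is assumed to contain (illustrative choice).
REQUIRED_ROLES = {"publication", "dataset", "code"}


@dataclass
class ResearchObject:
    """Hypothetical container linking a publication, its data and its code."""
    title: str
    components: Dict[str, str] = field(default_factory=dict)  # role -> identifier

    def add(self, role: str, identifier: str) -> None:
        """Register the DOI or repository URL that fills a given role."""
        self.components[role] = identifier

    def related(self, role: str) -> List[str]:
        """Identifiers of all other components, i.e. what `role` should link to."""
        return [ident for r, ident in self.components.items() if r != role]

    def is_complete(self) -> bool:
        """True once publication, dataset and code are all present."""
        return REQUIRED_ROLES <= set(self.components)


pilot_1 = ResearchObject(
    "GREANISSH - How green are full open-access publications "
    "in Translation Studies journals? A corpus study"
)
pilot_1.add("publication", "https://doi.org/10.5281/zenodo.20080088")
pilot_1.add("dataset", "https://doi.org/10.5281/zenodo.20081489")
pilot_1.add("code", "https://github.com/nwolczuk/fasca-pilot-1")

print(pilot_1.is_complete())
print(pilot_1.related("publication"))
```

In practice such links would be recorded in the metadata of each deposit rather than in a separate structure; the sketch only shows the relationship that makes the three outputs findable from one another.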
Scientific Impact
The project is expected to enhance the EOSC service GoTriple by developing and providing a Data Science support service. This service will allow the SSH community to build cross-disciplinary collaborations between the SSH Open Cluster and other EOSC Thematic Clusters on the basis of a shared background in data-driven methodologies. The results of the two data-driven research pilots, carried out by SSH researchers seeking data science support to investigate relationships between scholarly communication and the SDGs, will be published in a FAIR-compliant way (manuscript/preprint, code and data). The project’s FAIR-compliant outputs will serve as models for future studies, promoting best practices in Open Science.
Results
- Chatbot design and first implementation: FASCA designed and implemented the first production-ready version of the GoTriple chatbot, integrating Retrieval-Augmented Generation (RAG) and Large Language Models to support conversational interaction with scholarly resources. The solution introduced a layered architecture with API-based backend services and an interactive JavaScript frontend integrated directly into the GoTriple platform. The chatbot enables researchers to analyse collections of documents, generate summaries, perform question-answering tasks, identify concepts and entities, and export structured outputs in formats such as JSON. The service was integrated with GoTriple authentication and deployed in production as part of the GoTriple Data Services environment.
- GoTriple Data Science Toolbox and JupyterHub deployment: The project successfully deployed a browser-based JupyterHub environment integrated with the GoTriple platform. Researchers can now access personalised notebook workspaces for computational analysis, graph exploration, and reproducible workflows without requiring local installation. The environment integrates Neo4j graph technologies, API access, notebook templates, and FAIR-oriented publication workflows connected to GitHub and Zenodo.
- Development of the GoTripleHelper Python package: FASCA developed the gotriple_helper Python package to simplify access to the GoTriple APIs within notebooks and analytical workflows. The package supports retrieval and search across documents, projects, researchers, and users while handling pagination and metadata processing automatically. This significantly lowers the technical barrier for SSH researchers wishing to perform computational analysis using GoTriple data.
- Pilot projects validating reproducible SSH workflows: Two external pilot projects were selected through an open call and supported by the FASCA Data Science team. The pilots demonstrated how GoTriple data and computational workflows can support multilingual SSH research, FAIR data practices, metadata analysis, environmental sustainability studies, and reproducible scholarly communication workflows.
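The pagination handling mentioned for the helper package can be illustrated with a short sketch. This is not the gotriple_helper API, whose functions are not described here; `iter_records`, `fake_fetch_page` and the record fields are hypothetical names, and the stand-in fetcher returns made-up records so the example runs without network access.

```python
from collections import Counter
from typing import Callable, Dict, Iterator, List

# Hypothetical fetcher signature: (query, page, page_size) -> list of records.
PageFetcher = Callable[[str, int, int], List[Dict]]


def iter_records(fetch_page: PageFetcher, query: str,
                 page_size: int = 25) -> Iterator[Dict]:
    """Yield every record for `query`, requesting pages until one comes back short."""
    page = 1
    while True:
        records = fetch_page(query, page, page_size)
        yield from records
        if len(records) < page_size:  # a short or empty page marks the end
            break
        page += 1


def fake_fetch_page(query: str, page: int, page_size: int) -> List[Dict]:
    """Stand-in for a real API call: serves 60 made-up records page by page."""
    total = 60
    languages = ["en", "fr", "pl"]
    start = (page - 1) * page_size
    end = min(start + page_size, total)
    return [{"id": i, "query": query, "language": languages[i % 3]}
            for i in range(start, end)]


# The caller sees one flat stream of records and never deals with page numbers.
records = list(iter_records(fake_fetch_page, "translation studies"))
language_counts = Counter(r["language"] for r in records)
print(len(records), dict(language_counts))
```

Passing the fetcher in as a parameter keeps the paging loop independent of any particular endpoint, so the same loop could wrap a real HTTP call inside a notebook.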
Research outputs
Research outputs include scientific article drafts, datasets, and reusable Jupyter Notebook repositories.
- Pilot 1 produced the article draft “GREANISSH - How green are full open-access publications in Translation Studies journals? A corpus study” (https://doi.org/10.5281/zenodo.20080088), a dataset containing metadata and full-text corpora from Translation Studies journals (https://doi.org/10.5281/zenodo.20081489), and a repository of notebooks for data collection and processing (https://github.com/nwolczuk/fasca-pilot-1).
- Pilot 2 produced the article draft “Multilingualism at the Semantic Layer: Metadata Evidence from the GoTriple Research Infrastructure” (https://doi.org/10.5281/zenodo.19262149), datasets derived from GoTriple samples (https://doi.org/10.5281/zenodo.19731084), and open repositories for metadata harvesting and multilingual metadata analysis workflows (https://github.com/hibernator11/fasca-ssh-gotriple; https://github.com/nwolczuk/fasca-pilot-2).
Events
- 20-21 May 2026 - Poster at the OPERAS & SCIROS 2026 conference, presenting the notebook and chatbot to both general and scholarly audiences
Publications
- Facilitating Scholarly Communication Analysis: the GoTriple Pipeline - DOI pending
- A scholar-driven AI notebook for assisted and transparent research [Conference poster - OPERAS & SCIROS 2026 Conference]
Other material and resources
- FASCA online presentation in GoTriple
- Open call presentation page in GoTriple
- Beginner’s guide to the Data Science Toolbox [ready, to be included inside the Jupyter Environment]
- Use Case analysis for the Data Services, based on user research [can supply additional material for dissemination and further research]
- Examples of Jupyter Notebooks with scripts (ready-to-use notebooks as templates) [to facilitate use of the Toolbox]
- Results of the user evaluation of the services [can supply additional material for dissemination and further research]
Principal investigator
Sy is the OPERAS Chief Technology Officer (CTO), in charge of the technical vision and service strategy, coordinating the teams behind the solutions and aligning their work with the overall organisational strategy. Prior to joining OPERAS, Sy spent more than fifteen years in EU-funded projects related to the development and implementation of e-Infrastructures for research and innovation, as well as their commercial exploitation. He is a certified expert, trainer and auditor (ISO 19011) in both the FitSM and ISO/IEC 27001 standards and volunteers as Co-chair of the ITEMO working group, which evolves the FitSM standard.
- FASCA presentation in GoTriple
- Open Call presentation page in GoTriple
- Homepage of the FASCA services in GoTriple