Vilnius University Researcher: “Open Data Enables Global Technological Progress”
The 2024 Annual Review of the European Organization for Nuclear Research (CERN) includes highlights from the major experiments at CERN. The lead contributor to one of them is Dr. Mindaugas Šarpis, a researcher at the Vilnius University Faculty of Physics and the head of CERN LHCb Vilnius, a research group for experimental particle physics.
Dr. Šarpis was responsible for the preparation of the Run1 (2011-2012) data set and the technical implementation of the Open Data Release for the Large Hadron Collider Beauty (LHCb) experiment. The data collected by the LHCb detector at CERN during Run1 (the first phase of data taking) were made available to everyone in early 2024. The data set consists of proton-proton collision parameters obtained over a two-year period. The particle physicist said that the total amount of Run1 data exceeds 1 PB and includes 800 TB of physics data, pointing out that “this huge amount of data would be the equivalent of 250,000,000 digital photographs, or so many high-definition videos that it would take 35 years to view them.”
CERN has extensive experience in particle physics data analysis and in the preservation of data and analysis techniques. “Data availability enables global technological advances in a wide range of sectors, reaching beyond science and business. Science is a resource for all of humanity and data is a common resource for all of us,” emphasised the scientist, who follows the principles of open science in his daily work at Vilnius University.
Dr. Šarpis said that LHCb is a leading experiment at CERN in terms of open data. Although these data sets are very specific and require special software to "be read", the software itself is also fully and freely available to anyone in the world who is interested. “Such large amounts of data can be useful and applicable to a wide range of technological fields. The data I have been working with since 2014 is curated by an automated system, preserving the relevant information for each file. There are over 100,000 files. To make the open data more accessible to people without a background in particle physics or experience at CERN, the open data portal also includes an interactive glossary of 1,000 terms and about 10,000 web pages describing the requirements for different kinds of physics analyses,” he said.
Under the FAIR principles (an acronym for Findable, Accessible, Interoperable, and Reusable), the data collected at CERN is gradually being made available to everyone. “We are already working on the Run2 data release, and are currently collecting the Run3 data set, which will be bigger than what we have collected in the entire history of data collection at the LHC,” the physicist said.
CERN is the world's largest particle physics laboratory, bringing together scientists from more than 100 countries. Located on the border between Switzerland and France, CERN is where scientists carry out experiments to understand the particles of the Universe and the interactions between them. One of CERN's most important projects is the Large Hadron Collider (LHC), which allows the study of proton collisions to search for new particles and phenomena.