\documentclass[a4paper]{jpconf}
\usepackage{graphicx}
\usepackage{hyperref}
\begin{document}
\title{The KM3NeT neutrino telescope network and CNAF}
\author{C. Bozza$^1$, T. Chiarusi$^2$, K. Graf$^3$, A. Martini$^4$ for the KM3NeT Collaboration}
\address{$^1$ Department of Physics of the University of Salerno and INFN Gruppo Collegato di Salerno, via Giovanni Paolo II 132, 84084 Fisciano, Italy}
\address{$^2$ INFN, Sezione di Bologna, v.le C. Berti-Pichat, 6/2, Bologna 40127, Italy}
\address{$^3$ Friedrich-Alexander-Universit{\"a}t Erlangen-N{\"u}rnberg, Erlangen Centre for Astroparticle Physics, Erwin-Rommel-Stra{\ss}e 1, 91058 Erlangen, Germany}
\address{$^4$ INFN, LNF, Via Enrico Fermi, 40, Frascati, 00044 Italy}
\ead{cbozza@unisa.it}

\begin{abstract}
The KM3NeT Collaboration is building a new generation of neutrino detectors in the Mediterranean Sea. The scientific goal is twofold: with the ARCA programme, KM3NeT will study the flux of neutrinos from astrophysical sources; the ORCA programme is devoted to investigating the ordering of the neutrino mass eigenstates. The unprecedented size of the detectors implies PByte-scale datasets and calls for large computing facilities and high-performance data centres. The data management and processing challenges of KM3NeT are reviewed, as well as the computing model. Specific attention is given to describing the role and contributions of CNAF.
\end{abstract}

\section{Introduction}
Water-Cherenkov neutrino telescopes have a recent history of great scientific success. Deep-sea installations provide naturally high-quality water and screening from cosmic rays coming from above. The KM3NeT Collaboration \cite{web} aims at evolving this well-proven technology to reach two scientific goals, in neutrino astronomy and in particle physics, through two parallel and complementary research programmes \cite{ESFRI, LoI}. The first, named ARCA (Astroparticle Research with Cosmics in the Abyss), aims at studying the neutrino emission from potential sources of high-energy cosmic rays, such as active galactic nuclei, supernova remnants and regions where high fluxes of gamma rays originate (including supernovae). It received a boost of interest after the IceCube discovery of a high-energy, cosmic diffuse neutrino flux. The goals of directional identification of the sources of high-energy neutrinos and of good energy resolution, together with the small flux, require a detector with a total volume beyond $1\,\mathrm{km}^{3}$. The second line of research is devoted to studying the ordering of the neutrino mass eigenstates (Oscillation Research with Cosmics in the Abyss -- ORCA). A detector technically identical to the ARCA one, but with smaller spacing between sensitive elements, will be used to detect atmospheric neutrinos oscillating while crossing the Earth's volume: the modulation pattern of the oscillation is influenced by a term that is sensitive to the ordering (normal or inverted), hence allowing discrimination between the two models.

The KM3NeT detector technology originates from the experience of previous underwater Cherenkov detectors (such as ANTARES and NEMO), but it takes a big step forward with the new design of the digital optical modules (DOM), which use strongly directional 3-inch photomultiplier tubes (PMT) to build up a large photocathode area. For management simplicity and technical reasons, the whole detector is divided into \emph{building blocks}, each made of 115 \emph{Detection Units} (DU). A DU is in turn made of 18 DOMs, each hosting 31 PMTs.
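For reference, the photomultiplier count per building block follows directly from this granularity:
\[
N_{\mathrm{PMT}} = 115~\mathrm{DU} \times 18~\mathrm{DOM/DU} \times 31~\mathrm{PMT/DOM} = 64\,170 .
\]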
Hence, a building block will contain 64,170 PMTs. With an expected lifetime of at least 15 years of round-the-clock operation, and a single-photoelectron rate of a few kHz per PMT, online, quasi-online and offline computing are challenging activities in themselves. In addition, each detector installation will include instruments dedicated to Earth and Sea Science (ESS), which will be operated by the KM3NeT Collaboration. The data flow from these instruments is negligible compared to the optical data and is not explicitly accounted for in this document.

\section{Project status}
At the ARCA site, two DUs operated for part of 2017, continuously taking physics data as well as providing the KM3NeT Collaboration with a wealth of technical information, including long-term effects and stability indicators. Data processing also proved able to work in a stable way. One ORCA DU was also deployed and operated in 2017, with smooth data flow and processing. At present, most of the computing load is due to simulations for the full building block, now being enriched with feedback from real-data analysis. As a first step, this was done at CC-IN2P3 in Lyon, but the usage of other computing centres is increasing and is expected to soon spread to the full KM3NeT computing landscape. This process is being driven in accordance with the goals envisaged in setting up the computing model. The KM3NeT Collaboration is now preparing for the so-called ``phase 2.0'', which aims at increasing the detector size.

\section{Computing model}
The computing model of KM3NeT follows the strategy of the LHC experiments, i.e.\ it is a three-tier architecture, as depicted in Fig.~\ref{fig:threetier}.
\begin{figure}[h]
\includegraphics[width=0.9\textwidth]{threetier.png}
\caption{Three-tier model for KM3NeT computing.}
\label{fig:threetier}
\end{figure}
With the detector on the deep-sea bed, all data are transferred to data acquisition (DAQ) control stations on the shore. Tier 0 is composed of a computer farm running triggering algorithms on the full raw data flow, with a reduction from $5\,\mathrm{GB/s}$ to $5\,\mathrm{MB/s}$ per \emph{building block}. Quasi-online reconstruction is performed for selected events (alerts, monitoring). The output data are temporarily stored on a persistent medium and distributed with fixed latency (typically less than a few hours) to various computing centres, which altogether constitute Tier 1, where events are reconstructed by various fitting models (mostly searching for shower-like or track-like patterns). Reconstruction further reduces the data rate to about $1\,\mathrm{MB/s}$ per \emph{building block}. Tier 1 also takes care of continuous detector calibration, to optimise pointing accuracy (by working out the detector shape, which changes because of water currents) and photomultiplier operation. Local analysis centres, logically allocated in Tier 2 of the computing model, perform physics analysis tasks. A database system interconnects the three tiers by distributing detector structure, qualification and calibration data, run book-keeping information, and slow-control and monitoring data.
\begin{figure}[h]
\includegraphics[width=0.9\textwidth]{compmodel.png}
\caption{Data flow in the KM3NeT computing model.}
\label{fig:compmodel}
\end{figure}
KM3NeT exploits computing resources in several centres and in the GRID, as sketched in Fig.~\ref{fig:compmodel}.
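As a rough indication of the storage volumes involved, the post-trigger and post-reconstruction rates quoted above can be converted into yearly data volumes per \emph{building block}, assuming continuous data taking and neglecting replication across sites:
\[
5\,\mathrm{MB/s} \times 3.15\times 10^{7}\,\mathrm{s/yr} \approx 160\,\mathrm{TB/yr}, \qquad
1\,\mathrm{MB/s} \times 3.15\times 10^{7}\,\mathrm{s/yr} \approx 30\,\mathrm{TB/yr}.
\]
These back-of-the-envelope figures are indicative only; the actual storage and CPU allocations are detailed in the section on data size and CPU requirements below.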
The conceptually simple flow of the three-tier model is then realised by splitting the tasks of Tier 1 among different processing centres, also optimising the data flow and the network path. In particular, CNAF and CC-IN2P3 aim at being mirrors of each other, containing the full data set at any moment. The implementation of the data transfer from CC-IN2P3 to CNAF (via an iRODS-to-gridFTP interface at CC-IN2P3) has been established, but it was soon found that the stability of the solution could be improved. A joint effort, whose outcome is to be used also by Virgo, is underway. The usage of the GRID by KM3NeT is expected to ramp up quickly as software is distributed through containers. CNAF supports Singularity, a relatively popular and very convenient containerisation platform, and KM3NeT is converging on that solution. With a view to boosting the use of the GRID, KM3NeT envisages using the DIRAC interware for distributed computing management, but activity on this development front has lagged for a while because of the lack of human resources.

\section{Data size and CPU requirements}
Calibration and reconstruction work in batches. The raw data related to a batch are transferred to the centre in charge of the processing before it starts. In addition, a rolling buffer of data, e.g.\ the last year of data taking, is stored at each computing centre. Simulation has special needs: its input is negligible, but the computing power required is very large compared to the needs of data taking; indeed, for every year of detector lifetime, KM3NeT envisages producing 10 years of simulated events (with exceptions). Also, the output of the simulation has to be stored at several stages. While the total data size is dominated by real data, the size of reconstructed data is dictated mostly by the amount of simulated data. Table~\ref{tier1} details how the tasks are allocated.
\begin{table}
\caption{\label{tier1}Task allocation in Tier 1.}
\begin{center}
\begin{tabular}{lll}
\br
Computing Facility&Main Task&Access\\
\mr
CC-IN2P3&general processing, central storage&direct, batch, GRID\\
CNAF&central storage, simulation, reprocessing&GRID\\
ReCaS&general processing, simulation, interim storage&GRID\\
HellasGRID&reconstruction&GRID\\
\br
\end{tabular}
\end{center}
\end{table}
Thanks to the modular design of the detector, it is possible to quote the computing requirements of KM3NeT per \emph{building block}, keeping in mind that the ARCA programme corresponds to two \emph{building blocks} and ORCA to one. Not all software could be benchmarked, and some estimates are derived by scaling from ANTARES figures. When needed, a conversion factor of about 10 between cores and HEPSpec2006 (HS06) is used in the following, as exemplified at the end of this section.
\begin{table}
\caption{\label{cpu}Yearly resource requirements per \emph{building block}.}
\begin{center}
\begin{tabular}{lll}
\br
Processing stage&Storage (TB)&CPU time (MHS06$\cdot$h)\\
\mr
Raw Filtered Data&300&-\\
Monitoring and Minimum Bias Data&150&150\\
Calibration (+ Raw) Data&1500&48\\
Reconstructed Data&300&238\\
DST&150&60\\
Air shower sim.&50&7\\
Atmospheric muons&25&0.7\\
Neutrinos&20&0.2\\
\br
\end{tabular}
\end{center}
\end{table}
KM3NeT detectors are still in an initial construction stage, although the concepts have been successfully demonstrated and tested in small-scale installations \cite{PPMDU}. Because of this, the usage of resources at CNAF has so far been below the figures for a \emph{building block}, but it is going to ramp up as more detection units are added in the following years.
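To put the building-block figures in perspective, the conversion factor quoted above can be used to translate, for example, the reconstruction entry of Table~\ref{cpu} into core time:
\[
238\,\mathrm{MHS06\cdot h} \;\approx\; \frac{238\times 10^{6}\,\mathrm{HS06\cdot h}}{10\,\mathrm{HS06/core}} \;\approx\; 2.4\times 10^{7}\,\mathrm{core\cdot h} \;\approx\; 2.7\times 10^{3}\,\mathrm{core\cdot yr},
\]
i.e.\ of the order of a few thousand cores kept busy throughout the year for a single \emph{building block}. This conversion is purely indicative, since the actual HS06 rating per core depends on the hardware in use.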
KM3NeT software that runs on the GRID can use CNAF computing nodes in opportunistic mode. Already now, the data handling policy to safeguard the products of Tier 0 is in place: automatic synchronisation from each shore station to both CC-IN2P3 and CNAF runs daily and provides two maximally separated paths from the data production site to the final storage places. Mirroring and redundancy preservation between CC-IN2P3 and CNAF are foreseen and currently at an early stage. CNAF has already made relevant contributions to KM3NeT in terms of know-how for the deployment of IT solutions, e.g.~the above-mentioned synchronisation, software development solutions and the software-defined network at the Tier 0 of the Italian site.

Setting up Software Defined Networks (SDN) for data acquisition deserves a special mention. The SDN technology \cite{SDN} is used to configure and operate the mission-critical fabric of switches/routers that interconnects all the on-shore resources in the Tier 0 stations. The KM3NeT DAQ is built around switches compliant with the OpenFlow 1.3 protocol and managed by dedicated controller servers. With a limited number of Layer-2 forwarding rules, developed on purpose for the KM3NeT use case, the SDN technology allows effective handling of the KM3NeT asymmetric-hybrid network topology, optimising the layout of the connections on shore and ensuring data-taking stability. Scaling up to a large number of DOMs, as many as two \emph{building blocks}, will not require any change of the DAQ system design with the present technology.

\section{References}
\begin{thebibliography}{9}
\bibitem{web} KM3NeT homepage: \href{http://km3net.org}{http://km3net.org}
\bibitem{ESFRI} ESFRI 2016 {\it Strategy Report on Research Infrastructures}, ISBN 978-0-9574402-4-1
\bibitem{LoI} KM3NeT Collaboration: S. Adri\'an-Mart\'inez et al. 2016 {\it KM3NeT 2.0 -- Letter of Intent for ARCA and ORCA}, arXiv:1601.07459 [astro-ph.IM]
\bibitem{PPMDU} KM3NeT Collaboration: S. Adri\'an-Mart\'inez et al. 2016 {\it The prototype detection unit of the KM3NeT detector}, {\it EPJ C} {\bf 76} 54, arXiv:1510.01561 [astro-ph.HE]
\bibitem{SDN} T. Chiarusi et al. for the KM3NeT Collaboration 2017 {\it The Software Defined Networks implementation for the KM3NeT networking infrastructure}, in {\it 35\textsuperscript{th} International Cosmic Ray Conference (ICRC2017), 10--20 July 2017, Bexco, Busan, Korea}, {\it PoS(ICRC2017)} {\bf 940}, \href{https://pos.sissa.it/301/940/pdf}{https://pos.sissa.it/301/940/pdf}
\end{thebibliography}
\end{document}