\documentclass[a4paper]{jpconf}
\usepackage{graphicx}
\usepackage{hyperref}
\begin{document}
\title{The KM3NeT neutrino telescope network and CNAF}
\author{C. Bozza$^1$, T. Chiarusi$^2$, K. Graf$^3$, A. Martini$^4$ for the KM3NeT Collaboration}
\address{$^1$ University of Salerno and INFN Gruppo Collegato di Salerno, Fisciano (SA), IT}
\address{$^2$ INFN Sezione di Bologna, Bologna, IT}
\address{$^3$ Friedrich-Alexander-Universit{\"a}t Erlangen-N{\"u}rnberg, Erlangen, DE}
\address{$^4$ INFN-LNF, Frascati, IT}
\ead{cbozza@unisa.it}
\begin{abstract}
The KM3NeT Collaboration is building a new generation of neutrino
detectors in the Mediterranean Sea. The scientific goal is twofold:
with the ARCA programme, KM3NeT will study the flux of neutrinos
from astrophysical sources; the ORCA programme is devoted to
investigating the ordering of the neutrino mass eigenstates. The
unprecedented size of the detectors implies PByte-scale data sets and
calls for large computing facilities and high-performance data
centers. The data management and processing challenges of KM3NeT are
reviewed, together with the computing model. Specific attention is given
to the role and contributions of CNAF.
\end{abstract}
\section{Introduction}
Water-Cherenkov neutrino telescopes have a recent history of great
scientific success. Deep-sea installations provide naturally
high-quality water and screening from cosmic rays from above. The
KM3NeT Collaboration \cite{web} aims at evolving this well-proven
technology to reach two scientific goals in neutrino astronomy and
particle physics, through two parallel and complementary research
programmes \cite{ESFRI, LoI}. The first, named ARCA (Astroparticle
Research with Cosmics in the Abyss), envisages studying the neutrino
emission from potential sources of high-energy cosmic rays, like
active galactic nuclei, supernova remnants and regions where high
fluxes of gamma rays originate (including supernovae). It received a
boost of interest after the IceCube discovery of a high-energy, cosmic
diffuse neutrino flux. The goals of directional identification
of the sources of high-energy neutrinos and of good energy resolution,
together with the small expected flux, require a detector with a total
volume beyond $1\,\mathrm{km}^{3}$. The second line of research is devoted to
studying the ordering of neutrino mass eigenstates (Oscillation
Research with Cosmics in the Abyss - ORCA). A detector technically
identical to the ARCA one but with smaller spacing between sensitive
elements will be used to detect atmospheric neutrinos oscillating
while crossing the Earth's volume: the modulation pattern of the
oscillation is influenced by a term that is sensitive to the ordering
(normal or inverted), hence allowing discrimination between the
models. The KM3NeT detector technology originates from the experience
of previous underwater Cherenkov detectors (like ANTARES and NEMO),
but it takes a big step forward with the new design of the digital
optical modules (DOM), using strongly directional 3-inch
photomultiplier tubes (PMT) to build up a large photocathode area. The whole
detector is divided, for management simplicity and technical reasons, into
\emph{building blocks}, each made of 115 \emph{Detection Units} (DU).
A DU is in turn made of 18 DOMs, each hosting 31 PMTs. Hence, a
building block will contain 64,170 PMTs. With an
expected lifetime of at least 15 years of round-the-clock operation, and a
single-photoelectron rate of a few kHz per PMT, online, quasi-online
and offline computing are challenging activities in themselves. In
addition, each detector installation will include instruments that
will be dedicated to Earth and Sea Science (ESS), and will be operated
by the KM3NeT Collaboration. The data flow from these instruments is
negligible compared to optical data and is not explicitly accounted
for in this document.
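As an order-of-magnitude illustration of the scale of the optical data
stream (the aggregate rate below is only a rough estimate derived from
the per-PMT rate quoted above), one \emph{building block} comprises
\[
N_{\mathrm{PMT}} = 115 \times 18 \times 31 = 64\,170
\]
PMTs, so a single-photoelectron rate of a few kHz per PMT translates
into an aggregate raw hit rate of order $10^{8}$ hits per second per
\emph{building block}.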
\section{Project status}
At the ARCA site, two DUs operated for part of 2017, continuously taking
physics data as well as providing the KM3NeT Collaboration with a wealth of
technical information, including long-term effects and stability
indicators. Data processing also proved to work in a stable
way. One ORCA DU was also deployed and operated in 2017, with smooth
data flow and processing. At present, most of the computing load is
due to simulations for the full building block, now being enriched with
feedback from real-data analysis. As a first step, this
was done at CC-IN2P3 in Lyon, but usage of other computing centers is
increasing and is expected to soon spread to the full KM3NeT
computing landscape. This process is being driven in accordance with the
goals envisaged in setting up the computing model. The KM3NeT
Collaboration is now preparing for the so-called ``phase 2.0'', which aims at
increasing the detector size.
\section{Computing model}
The computing model of KM3NeT follows the strategy of the LHC
experiments, i.e.\ it is a three-tier architecture, as depicted in
Fig.~\ref{fig:threetier}.
\begin{figure}[h]
\includegraphics[width=0.9\textwidth]{threetier.png}
\caption{Three-tier model for KM3NeT computing.}
\label{fig:threetier}
\end{figure}
With the detector on the deep-sea bed, all data are transferred to data
acquisition (DAQ) control stations on the shore. Tier 0 is composed of
a computer farm running triggering algorithms on the full raw data
flow with a reduction from $5\,\mathrm{GB/s}$ to $5\,\mathrm{MB/s}$ per \emph{building
block}. Quasi-online reconstruction is performed for selected
events (alerts, monitoring). The output data are temporarily stored on
a persistent medium and distributed with fixed latency (typically less
than a few hours) to various computing centers, which altogether
constitute Tier 1, where events are reconstructed by various fitting
models (mostly searching for shower-like or track-like
patterns). Reconstruction further reduces the data rate to about
$1\,\mathrm{MB/s}$ per \emph{building block}. In addition, Tier 1 also takes care
of continuous detector calibration, to optimise pointing accuracy (by
working out the detector shape that changes because of water currents)
and photomultiplier operation. Local analysis centers, logically
allocated in Tier 2 of the computing model, perform physics analysis
tasks. A database system interconnects the three tiers by distributing
detector structure, qualification and calibration data, run
book-keeping information, and slow-control and monitoring data.
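As an order-of-magnitude check of these rates (assuming, for simplicity,
a duty cycle close to 100\%), the post-trigger stream of
$5\,\mathrm{MB/s}$ integrates over one year to
\[
5\,\mathrm{MB/s} \times 3.15\times 10^{7}\,\mathrm{s} \approx 1.6\times 10^{2}\,\mathrm{TB}
\]
per \emph{building block}, while the reconstructed stream of about
$1\,\mathrm{MB/s}$ corresponds to a few tens of TB per year.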
\begin{figure}[h]
\includegraphics[width=0.9\textwidth]{compmodel.png}
\caption{Data flow in KM3NeT computing model.}
\label{fig:compmodel}
\end{figure}
KM3NeT exploits computing resources in several centers and in the
GRID, as sketched in Fig.~\ref{fig:compmodel}. The conceptually simple
flow of the three-tier model is then realised by splitting the tasks
of Tier 1 among different processing centers, also optimising the data
flow and the network path. In particular, CNAF and CC-IN2P3 aim at being
mirrors of each other, containing the full data set at any moment. The
implementation for the data transfer from CC-IN2P3 to CNAF (via an
iRODS-to-gridFTP interface at CC-IN2P3) has been established, but it was
soon found that the stability of the solution could be improved. A joint
effort, whose outcome will also be used by Virgo, is underway. Usage of the GRID by
KM3NeT is expected to ramp up quickly as software is distributed through
containers. CNAF supports Singularity, which is a relatively popular
and very convenient containerisation platform, and KM3NeT is
converging to that solution. In view of boosting the use of the GRID,
KM3NeT envisages using the DIRAC interware for distributed computing
management, but activity on this development front has lagged
for a while because of the lack of human resources.
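As an illustration of how containerised KM3NeT software could be
executed on a worker node, the following minimal Python sketch wraps a
call to the \texttt{singularity exec} command line; the image name and
the command run inside the container are hypothetical placeholders, not
the actual KM3NeT software distribution.
\begin{verbatim}
import subprocess

# Hypothetical image and command: placeholders for illustration only.
IMAGE = "km3net-processing.sif"
CMD = ["reconstruct", "--input", "run_00042.root"]

# "singularity exec <image> <command>" runs the command inside the
# container, so the same software stack is used on every site.
subprocess.run(["singularity", "exec", IMAGE] + CMD, check=True)
\end{verbatim}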
\section{Data size and CPU requirements}
Calibration and reconstruction work in batches. The raw data related
to the batch are transferred, before processing starts, to the center
in charge of that processing. In addition, a rolling buffer of data is
stored at each computing center, e.g.\ the last year of data taking.
Simulation has special needs because the input is negligible, but the
computing power required is very large compared to the needs of
data-taking: indeed, for every year of detector lifetime, KM3NeT
envisages producing 10 years of simulated events (with
exceptions). Also, the output of the simulation has to be stored at
several stages. While the total data size is dominated by real data,
the size of reconstructed data is dictated mostly by the amount of
simulated data.
Table~\ref{tier1} details how the tasks are allocated.
\begin{table}
\caption{\label{tier1}Task allocation in Tier 1.}
\begin{center}
\begin{tabular}{lll}
\br
Computing Facility&Main Task&Access\\
\mr
CC-IN2P3 & general processing, central storage &direct, batch, GRID\\
CNAF&central storage, simulation, reprocessing&GRID\\
ReCaS&general processing, simulation, interim storage&GRID\\
HellasGRID&reconstruction&GRID\\
\br
\end{tabular}
\end{center}
\end{table}
Thanks to the modular design of the detector, it is possible to quote
the computing requirements of KM3NeT per \emph{building block}, bearing
in mind that the ARCA programme corresponds to two \emph{building
blocks} and ORCA to one. Not all software could be benchmarked, and
some estimates are derived by scaling from ANTARES ones.
In the following, the standard conversion factor of about 10 HS06 per core is used to relate cores and HEPSpec2006 (HS06).
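As a back-of-the-envelope illustration of this conversion (not an
official benchmark),
\[
1\,\mathrm{MHS06\cdot h} \approx \frac{10^{6}\,\mathrm{HS06\cdot h}}{10\,\mathrm{HS06/core}}
 = 10^{5}\,\mathrm{core\cdot h},
\]
i.e.\ roughly 11 cores running continuously for one year; the
reconstruction figure of 238 MHS06$\cdot$h per \emph{building block} in
Table~\ref{cpu} thus corresponds to a few times $10^{7}$ core hours per year.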
\begin{table}
\caption{\label{cpu}Yearly resource requirements per \emph{building block}.}
\begin{center}
\begin{tabular}{lll}
\br
Processing stage&Storage (TB)&CPU time (MHS06$\cdot$h)\\
\mr
Raw Filtered Data&300&-\\
Monitoring and Minimum Bias Data&150&150\\
Calibration (+ Raw) Data&1500&48\\
Reconstructed Data&300&238\\
DST&150&60\\
Air shower sim.&50&7\\
Atmospheric muons&25&0.7\\
Neutrinos&20&0.2\\
\br
\end{tabular}
\end{center}
\end{table}
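A naive sum of the storage column of Table~\ref{cpu} (neglecting
possible overlap between the raw and calibration entries) gives
\[
300+150+1500+300+150+50+25+20 \approx 2.5 \times 10^{3}\,\mathrm{TB}
\]
per \emph{building block} per year, consistent with the PByte scale
anticipated in the abstract; with two ARCA \emph{building blocks} and
one for ORCA, the total is roughly three times larger.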
KM3NeT detectors are still in an initial construction stage, although
the concepts have been successfully demonstrated and tested in
small-scale installations \cite{PPMDU}. Because of this, the usage of
resources at CNAF has so far been below the figures for a
\emph{building block}, but it is going to ramp up as more detection
units are added in the coming years. KM3NeT software that
runs on the GRID can use CNAF computing nodes in opportunistic mode.
The data handling policy to safeguard the products of Tier 0 is
already in place. Automatic synchronization from each shore station to both
CC-IN2P3 and CNAF runs daily and provides two maximally separated
paths from the data production site to final storage places. Mirroring
and redundancy preservation between CC-IN2P3 and CNAF are foreseen and
currently at an early stage.
CNAF has already made relevant contributions to KM3NeT in terms of
know-how for IT solution deployment, e.g.\ the above-mentioned synchronisation,
software development solutions and the software-defined network at the Tier 0 of
the Italian site. Setting up Software Defined Networks (SDN) for data
acquisition deserves a special mention. The SDN technology \cite{SDN} is used to
configure and operate the mission-critical fabric of switches/routers
that interconnects all the on-shore resources in Tier 0 stations. The
KM3NeT DAQ is built around switches compliant with the OpenFlow 1.3
protocol and managed by dedicated controller servers. With a limited
number of Layer-2 forwarding rules, developed specifically for the KM3NeT
use-case, the SDN technology allows effective handling of the KM3NeT
asymmetric-hybrid network topology, optimising the layout of connections
on shore and ensuring data-taking stability. Scaling to a large number of DOMs,
up to two \emph{building blocks}, will not require any change
to the DAQ system design with the present technology.
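To give a flavour of the rules involved (a purely schematic sketch in
Python, not the actual KM3NeT controller configuration; all field
values are hypothetical), an OpenFlow 1.3 Layer-2 rule can be thought
of as a match on Ethernet-level fields plus a forwarding action:
\begin{verbatim}
# Schematic representation of a Layer-2 OpenFlow rule; illustration
# only, not the configuration deployed in the KM3NeT shore stations.
flow_rule = {
    "priority": 100,
    "match": {
        "in_port": 1,                    # hypothetical ingress port
        "eth_dst": "aa:bb:cc:dd:ee:ff",  # hypothetical MAC address
    },
    "actions": [
        {"type": "OUTPUT", "port": 7},   # forward to a hypothetical port
    ],
}
\end{verbatim}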
\section{References}
\begin{thebibliography}{2}
\bibitem{web} KM3NeT homepage: \href{http://km3net.org}{http://km3net.org}
\bibitem{ESFRI}ESFRI 2016 {\it Strategy Report on Research Infrastructures}, ISBN 978-0-9574402-4-1
\bibitem{LoI} KM3NeT Collaboration: S. Adri\'an-Mart\'inez et al. 2016 {\it KM3NeT 2.0 -- Letter of Intent for ARCA and ORCA}, arXiv:1601.07459 [astro-ph.IM]
\bibitem{PPMDU} KM3NeT Collaboration: S. Adri\'an-Mart\'inez et al. 2016 {\it The prototype detection unit of the KM3NeT detector}, {\it EPJ C} {\bf 76} 54, arXiv:1510.01561 [astro-ph.HE]
\bibitem{SDN} T. Chiarusi et al. for the KM3NeT Collaboration 2017 {\it The Software Defined Networks implementation for the KM3NeT networking infrastructure}, in {\it Proc. 35th International Cosmic Ray Conference (ICRC2017), 10--20 July 2017, Bexco, Busan, Korea}, {\it PoS(ICRC2017)} {\bf 940}, \href{https://pos.sissa.it/301/940/pdf}{https://pos.sissa.it/301/940/pdf}
\end{thebibliography}
\end{document}