\documentclass[a4paper]{jpconf}
\usepackage{graphicx}
\begin{document}
\title{INFN Corporate Cloud - Management and Evolution}
\author{C. Duma$^1$, A. Costantini$^1$, D. Michelotto$^1$, D. Salomoni$^1$}
\address{$^1$ INFN-CNAF, Bologna, IT}
%\address{$^2$IFCA, Consejo Superior de Investigaciones Cientificas-CSIC, Santander, Spain}
\ead{ds@cnaf.infn.it}
\begin{abstract}
This paper describes the achievements and the evolution of INFN Corporate Cloud (INFN-CC), the geographically
distributed private Cloud infrastructure aimed at providing ICT services starting at the Infrastructure as a Service
(IaaS) level and based on OpenStack. In particular, the contribution provided by CNAF in terms of operations and possible evolution
is described and analysed.
\end{abstract}
\section{Introduction}
The INFN Cloud Working Group has been active for almost three years within the so-called ``Commissione Calcolo e Reti'' (CCR)
of INFN. Its activity has been to test and acquire expertise on technologies related to Cloud Computing and to select
solutions that can be adopted at INFN sites in order to meet the computing needs of the INFN scientific community and, more
generally, to ease information sharing inside and outside INFN. In the recent past, a number of projects related to Cloud
Computing started in INFN thanks to the knowledge and expertise gained through the activity of the Cloud Working
Group. A restricted team has been working in the last two years on the deployment of a distributed private cloud infrastructure
hosted in a limited number of INFN sites. The INFN-CC working group planned and tested possible architectural designs for
the implementation of a distributed private cloud infrastructure and implemented the prototype that is described hereafter.
INFN-CC \cite{infncc-chep2018, infncc-wiki} is intended to represent a part of the INFN Cloud infrastructure, with peculiar features
that make it the optimal cloud facility for a number of use cases that are of great importance for INFN.
While the INFN Cloud ecosystem will be able to federate heterogeneous installations that will necessarily adopt a loosely coupled
scheme, INFN-CC tightly couples a few homogeneous OpenStack installations that share a number of services, while remaining
independent, but still coordinated, on other aspects.
The focus of INFN-CC is on resource replication, distribution and high availability, both for network services and for user applications. INFN-CC
represents a single, though distributed, administrative domain.
\section{INFN-CC - a distributed cloud}
As already mentioned, INFN Corporate Cloud (INFN-CC) is the INFN geographically distributed private Cloud infrastructure
aimed at providing services starting from the IaaS level; it is based on OpenStack and has been deployed in three of the
major INFN data centers in Italy (INFN-CNAF, INFN-Bari and INFN-LNF). INFN-CC has a twofold purpose: on one hand, its fully
redundant architecture and its resiliency make it the perfect environment for providing critical network services
to the INFN community; on the other hand, the fact that it is hosted in modern and large data centers makes INFN-CC the
platform of choice for a number of scientific computing use cases. INFN-CC also deploys a higher PaaS layer, developed within
the EU-funded project INDIGO-DataCloud \cite{indigo-dc}, in order to provide the INFN scientific communities
not only with easier access to computing and storage resources, but also with the automatic instantiation and configuration
of the services and applications used in their everyday work, such as batch systems on demand or big data analytics facilities. The PaaS
layer, together with the underlying IaaS, is able to provide automatic scalability of the instantiated clusters and fault tolerance
in case of single-node or complete site failures.
\subsection{Architecture and services}
Technically speaking, from the OpenStack \cite{openstack} point of view INFN-CC is a multi-region cloud composed of different OpenStack
installations sharing a set of services that are managed globally, while other services remain local, as shown in Figure \ref{infncc-services}.
The available INFN-CC services can be summarized in the following categories.
Ancillary services, such as:
\begin{itemize}
\item a distributed Percona XtraDB Cluster, which relies on the INFN-CC private network and is the back-end for both the identity service and the image service catalogue;
\item a distributed DNS, dynamically updated by administrators as well as by monitoring processes, so that clients are pointed only to working endpoints.
\end{itemize}
Local services, implemented independently at each site, have a local scope; they include compute, volume and network.
In particular, the Compute and Volume services rely on a CEPH \cite{ceph} back-end. Each site hosts a CEPH instance with a different priority, and
CEPH RBD mirroring is employed to replicate data across the INFN-CC sites for disaster recovery.
Global services, implemented at all sites for high availability and backed by common databases when needed, have a global scope and are listed here:
\begin{itemize}
\item OpenStack Horizon, providing the Web GUI used to access INFN-CC resources and services;
\item OpenStack Keystone access points, backed by the above-mentioned distributed DBMS, available at all INFN-CC sites;
\item OpenStack Swift, which relies on the INFN-CC private network and is deployed geographically;
\item OpenStack Glance, which relies on CEPH as its storage back-end and on the Percona cluster as its catalogue, and is therefore fully distributed as well.
\end{itemize}
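As an aside, the RBD mirroring mentioned above for the local CEPH back-ends can be enabled with the standard \texttt{rbd} command-line tool. The sketch below (wrapped in Python only for illustration) is not the actual INFN-CC configuration; the pool and peer names are hypothetical.
\begin{verbatim}
#!/usr/bin/env python3
"""Illustrative sketch: enable RBD mirroring for a pool between two
Ceph clusters.  Pool and peer names are hypothetical."""
import subprocess

POOL = "volumes"                        # hypothetical pool backing the volumes
PEER = "client.mirror@remote-site"      # hypothetical remote cluster peer

def run(cmd):
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

# Enable mirroring on the whole pool (every image with the journaling
# feature enabled gets mirrored).
run(["rbd", "mirror", "pool", "enable", POOL, "pool"])

# Register the remote cluster as a peer of the local one.
run(["rbd", "mirror", "pool", "peer", "add", POOL, PEER])

# Check the replication status of the pool.
run(["rbd", "mirror", "pool", "status", POOL])
\end{verbatim}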
\begin{figure}[h]
\centering
\includegraphics[width=15cm,clip]{infncc-services.png}
\caption{INFN-CC architecture and related services.}
\label{infncc-services}
\end{figure}
In order to provide the above-mentioned services, particular care was taken in defining the network setup among the INFN-CC sites, so as
to provide standard connectivity to the VMs and cross-site connectivity among the three INFN sites.
The connectivity, a Layer-3 distributed private network provided by GARR, allows easy cloud management and fast data exchange.
As shown in Figure \ref{infncc-net}, in the INFN-CC model the VM networks remain private and do not cross the border of their own ``region''.
Public networks are also separate and managed locally, although they may benefit from a cross-site DNS domain namespace in order to allow
for easy service migration.
Hosts on the management networks of the different sites, on the other hand, must be able to communicate with each other, possibly taking advantage
of a set of loose firewall rules, in order to speed up system setup and maintenance. Moreover, a cross-site DNS domain namespace
makes it possible to dynamically migrate cloud services, when needed, for high availability.
\begin{figure}[h]
\centering
\includegraphics[width=15cm,clip]{infncc-net.png}
\caption{INFN-CC networking and related components.}
\label{infncc-net}
\end{figure}
\subsection{INFN-CC functionalities}
INFN-CC provides some interesting functionalities and features thanks to its geographical distribution across different sites:
\begin{itemize}
\item Single point of access to distributed resources, fully exploiting the native functionalities of OpenStack with little or no
need for external integration tools (a minimal access example is sketched after this list).
\item Single Sign-On (SSO) and a common authorization platform. User roles and projects are the same throughout the infrastructure,
while project quotas vary from site to site.
\item Secure dashboard and API access to all services for all users. The dashboard, OpenStack APIs and EC2 APIs are available.
All services are implemented on top of an SSL layer, in order to secure resource access and data privacy.
\item Easy sharing of VM images and snapshots through a common Object Storage deployment.
A single image/snapshot database is used by all the project sites. This means that all VM images and snapshots are available in all sites.
\item Common DNS namespace for distributed resources, with DNS-based high availability.
\item Block device sharing across remote sites; a rough way to implement this is through CEPH back-end volume backups, while faster and more
efficient approaches are under investigation. The final approach will mainly depend on the WAN bandwidth and latency among the INFN-CC sites.
\item Self-service backup for instances and block storage. Backed-up data can be accessed/restored transparently from/to any site.
Final users and tenant administrators are responsible for backing up their instances and the attached block devices; adequate tools,
native to OpenStack, are provided. As the backup storage back-ends, both for instance images and snapshots and for block devices,
are replicated and distributed, backed-up data are transparently available at all cloud sites and remain available in case of a site failure.
\end{itemize}
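As a minimal illustration of the single point of access, a client configured against the shared Keystone endpoint can reach any INFN-CC region simply by selecting it. The sketch below uses the community \texttt{openstacksdk} library; the cloud entry and region names are placeholders, not the actual INFN-CC configuration.
\begin{verbatim}
#!/usr/bin/env python3
"""Illustrative sketch: browse the INFN-CC regions through the shared
Keystone identity service.  The cloud entry and region names are
placeholders, not the real INFN-CC configuration."""
import openstack

# Credentials and the Keystone URL are read from a clouds.yaml entry.
for region in ("CNAF", "Bari", "LNF"):
    conn = openstack.connect(cloud="infn-cc", region_name=region)
    # Images come from the shared, fully replicated Glance catalogue,
    # so the same list is visible from every region.
    images = [img.name for img in conn.image.images()]
    # Compute resources, instead, have a local scope.
    servers = [srv.name for srv in conn.compute.servers()]
    print(f"{region}: {len(images)} images, {len(servers)} servers")
\end{verbatim}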
\subsection{The management model}
As resources in INFN-CC are so closely coupled and interdependent, they must be managed carefully by expert staff and must always work correctly.
For this reason, a limited number of cloud administrators, no matter where they are based, are allowed to administer the hosts offering
OpenStack services at any INFN-CC site, both for routine maintenance and in case of emergency.
This approach is eased by the homogeneity of the infrastructure, but requires a trust agreement that breaks the barriers of the
single site: remote administrators must be trusted exactly like resident ones.
On the other hand, infrastructure and hardware management is not easily performed from a remote site and should be the full responsibility
of the local IT staff. Cloud administrators and local IT staff should of course interact for better problem detection and solving.
This management model brings issues that go beyond the technical and organizational problems of a distributed management team: hosting
sites must agree on having external people manage part of their infrastructure as if they were local staff.
\section{Use cases for the INFN-CC}
The architecture of INFN-CC is particularly fit for a wide range of use cases where a strict relation exists between resources that are
distributed over different sites.
Most of these use cases are related to the delivery of computing services for the INFN community, be they of local interest for users
belonging to a single INFN site or of general interest for the whole community.
This does not mean that scientific computing is unfit for the INFN-CC, but often scientific computing environments do not need the
high availability features provided by INFN-CC and can take advantage of other cloud deployments.
\subsection{Scientific computing}
Massive data analysis or simulations do not usually need an environment like that of INFN-CC, but this does not mean that the
INFN-CC doors are closed to scientific computing.
Tier 3 virtualization, the last mile of data analysis, as well as software development environments are the first use cases that
might take advantage of INFN-CC and use it efficiently.
Further use cases might be applicable in the future, according to the available resources and to the project development.
\subsection{Distributed Web application}
In this use case, a generic distributed web application can use a distributed SQL database (accessed through HAProxy) and a
distributed object storage data back-end (with almost no single point of failure).
The failure of one instance does not affect final users, who are still able to use the application.
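A minimal sketch of such an application is given below: it reads from the replicated SQL database through per-site HAProxy endpoints and fails over transparently when one site is unavailable. Host names, credentials and the query are hypothetical.
\begin{verbatim}
#!/usr/bin/env python3
"""Illustrative sketch: a web application reading from the replicated SQL
database through per-site HAProxy endpoints, with transparent fail-over.
Host names, credentials and the query are hypothetical."""
import pymysql

# One HAProxy front-end per INFN-CC site; a common DNS namespace could
# hide this list behind a single name.
ENDPOINTS = ["db.cnaf.example", "db.ba.example", "db.lnf.example"]

def query(sql):
    last_error = None
    for host in ENDPOINTS:                      # try the sites in turn
        try:
            conn = pymysql.connect(host=host, user="webapp",
                                   password="secret", database="appdb",
                                   connect_timeout=3)
            try:
                with conn.cursor() as cur:
                    cur.execute(sql)
                    return cur.fetchall()
            finally:
                conn.close()
        except pymysql.MySQLError as err:       # endpoint down: fail over
            last_error = err
    raise RuntimeError(f"all endpoints failed: {last_error}")

print(query("SELECT COUNT(*) FROM visits"))
\end{verbatim}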
\section{Operations and evolution}
As previously described, the INFN-CC infrastructure is a multi-region cloud composed of different OpenStack installations.
The current configuration of the CNAF region is the following:
\begin{itemize}
\item a network node: a bare-metal resource hosting the OpenStack Networking service, which deploys several processes across a number of nodes. These processes
interact with each other and with other OpenStack services. The main process of the OpenStack Networking service is neutron-server, a Python
daemon that exposes the OpenStack Networking API and passes tenant requests to a suite of plug-ins for additional processing;
\item two compute nodes: bare-metal resources on which the VMs are actually deployed. Each compute node runs a hypervisor to deploy and run the VMs;
\item a storage node: a bare-metal resource hosting a cluster distribution of CEPH object storage, which provides interfaces for object-, block- and file-level storage;
\item a Top-of-Rack (ToR) switch.
\end{itemize}
To date, CNAF provides the following IaaS resources to INFN-CC: 20 VCPUs, 50 GB of RAM, 50 floating IPs and 50 TB of volume storage.
New resources are expected to be acquired in 2019 and to become part of the CNAF region of the INFN-CC infrastructure.
In the next year, a contribution from CNAF in terms of development and integration of new services is also expected, in particular in the deployment and testing of
CEPH distributions.
\section{Conclusions}
In this contribution, the INFN-CC cloud infrastructure has been briefly presented from different points of view. Its architecture, services, maintenance model
and possible use cases have also been discussed.
In particular, the contribution of CNAF in terms of operations and evolution, both of the CNAF region and of the INFN-CC cloud infrastructure as a whole,
has been described.
In the coming years, the evolution of the services offered by INFN-CC is expected to bring new and challenging use cases.
In this respect, CNAF aims to contribute manpower and expertise to improve both the reliability and the quality of the services offered to INFN and to the worldwide community.
\section{References}
\begin{thebibliography}{}
\bibitem{infncc-chep2018}
Web site: https://indico.cern.ch/event/587955/contributions/2935944
\bibitem{infncc-wiki}
Web site: https://wiki.infn.it/cn/ccr/cloud/infn\_cc
\bibitem{indigo-dc}
Web site: www.indigo-datacloud.eu
\bibitem{openstack}
Web site: https://www.openstack.org/
\bibitem{ceph}
Web site: https://ceph.com
\bibitem{deep}
Web site: https://deep-hybrid-datacloud.eu/
\bibitem{xdc}
Web site: www.extreme-datacloud.eu
\end{thebibliography}
%\section*{Acknowledgments}
%eXtreme-DataCloud has been funded by the European Commision H2020 research and innovation program under grant agreement RIA XXXXXXX.
\end{document}
\documentclass[a4paper]{jpconf}
\usepackage{graphicx}
\usepackage{tikz}
\usepackage{hyperref}
\usepackage{eurosym}
%%%%%%%%%% Start TeXmacs macros
\newcommand{\tmem}[1]{{\em #1\/}}
\newcommand{\tmop}[1]{\ensuremath{\operatorname{#1}}}
\newcommand{\tmtextit}[1]{{\itshape{#1}}}
\newcommand{\tmtt}[1]{\texttt{#1}}
%%%%%%%%%% End TeXmacs macros
\begin{document}
\title{The INFN-Tier1: the computing farm}
\author{Andrea Chierici$^1$, Stefano Dal Pra$^1$, Diego Michelotto$^1$}
\address{$^1$ INFN-CNAF, Bologna, IT}
\ead{andrea.chierici@cnaf.infn.it, stefano.dalpra@cnaf.infn.it, diego.michelotto@cnaf.infn.it}
%\begin{abstract}
%\end{abstract}
\section{Introduction}
The farming group is responsible for the management of the computing resources of the centre.
This implies the deployment of installation and configuration services, monitoring facilities and the fair distribution of the resources to the experiments that have agreed to run at CNAF.
%\begin{figure}
%\centering
%\includegraphics[keepaspectratio,width=10cm]{ge_arch.pdf}
%\caption{Grid Engine instance at INFN-T1}
%\label{ge_arch}
%\end{figure}
\section{Farming status update}
During 2018 the group was reorganized: Antonio Falabella left the group and Diego Michelotto took over from him. This turnover was smooth, since Diego was already familiar with many of the procedures adopted by the farming group as well as with the collaborative tools used internally.
\subsection{Computing}
As is well known, in November 2017 we suffered a flood in our data center, so the largest part of 2018 was dedicated to restoring the facility
and understanding how much of the computing power was damaged and how much was recoverable.
We were quite lucky with the blade servers (2015 tender), while for the 2016 tender most of the nodes we initially thought reusable broke after some time and were unrecoverable. We were able to recover working parts from the broken servers (such as RAM, CPUs and disks) and with those we assembled some nodes to be used as service nodes: the parts were accurately tested by a system integrator that guaranteed the stability and reliability of the resulting platform.
As a result of the flood, approximately 24 kHS06 were damaged.
In spring we finally installed the new tender, composed of AMD EPYC nodes providing more than 42 kHS06, each with 256 GB of RAM, 2x1 TB SSDs and a 10 Gbit Ethernet connection. This is the first time we adopt a 10 Gbit connection for WNs and we think that from now on it will be a basic requirement: modern CPUs provide several cores, enabling us to pack more jobs into a single node, where a 1 Gbit network speed may be a significant bottleneck. The same applies to HDDs vs.\ SSDs: we think that modern computing nodes can provide 100\% of their capabilities only with SSDs.
General job execution trend can be seen in Figure~\ref{farm-jobs}.
\begin{figure}
\centering
\includegraphics[keepaspectratio,width=15cm]{farm-jobs.png}
\caption{Farm job trend during 2018}
\label{farm-jobs}
\end{figure}
\subsubsection{CINECA extension}
Thanks to an agreement between INFN and CINECA\cite{ref:cineca}, we were able to integrate a portion (3 racks, for a total of 216 servers providing $\sim$180 kHS06) of the Marconi cluster into our computing farm, reaching a total computing power of 400 kHS06 and almost doubling the power we provided last year. Each server is equipped with a 10 Gbit uplink to the rack switch, while each rack switch, in turn, is connected to the aggregation router with 4x40 Gbit links.
Due to the proximity of CINECA, we set up a highly reliable fiber connection between the two computing centers, with very low latency
(the RTT\footnote{Round-trip time (RTT) is the time it takes for a network request to go from a starting point to a destination and back
to the starting point.} is 0.48 ms vs.\ 0.28 ms measured on the CNAF LAN), and could avoid setting up cache storage on the CINECA side:
all the remote nodes access the storage resources hosted at CNAF in exactly the same manner as the local nodes do.
This greatly simplifies the setup and increases global farm reliability (see Figure~\ref{cineca} for details on the setup).
\begin{figure}
\centering
\includegraphics[keepaspectratio,width=12cm]{cineca.png}
\caption{INFN-T1 farm extension to CINECA}
\label{cineca}
\end{figure}
These nodes have undergone several reconfigurations, due both to the hardware and to the type of workflow of the experiments.
In April we had to upgrade the BIOS to overcome a bug which was preventing full resource usage,
limiting what we were getting from the nodes to $\sim$78\% of the total.
Moreover, since the nodes at CINECA are set up with standard HDDs and so many cores are available per node, we hit an I/O bottleneck.
To mitigate this limitation, the local RAID configuration of the disks was
changed\footnote{The initial choice of using RAID-1 for local disks instead of RAID-0 proved to slow down the system, even if safer from an operational point of view.} and the number of jobs per node was slightly reduced (generally it equals the number of logical cores). It is worth noting that we do not hit this limit with the latest tender we purchased, since it comes with two enterprise-class SSDs.
During 2018 we also kept using the Bari ReCaS farm extension,
with a reduced set of nodes providing approximately 10 kHS06\cite{ref:ar17farming}.
\subsection{Hardware resources}
The hardware resources of the farming group are quite new, and no refresh was foreseen for 2018. The main concern is the two virtualization infrastructures, which only required a warranty renewal. Since we were able to recover a few parts from the flood-damaged nodes, we acquired a 2U, 4-node enclosure to be used as the main resource provider for the forthcoming HTCondor instance.
\subsection{Software updates}
During 2018 we completed the migration from SL6 to CentOS7 on all the farming nodes. The configurations have been stored in our provisioning system:
for the WNs the migration process was rather simple, while for CEs and UIs we took extra care and proceeded one at a time, in order to guarantee continuity
of service. The same configurations have been used to upgrade LHCb-T2 and INFN-BO-T3, with minimal modifications.
All the modules produced for our site can easily be exported to other sites willing to perform the same update.
As already said, the update involved all the services with just a small number of exceptions: the CMS experiment uses PhEDEx\cite{ref:phedex}, the system providing data placement and file transfer, which is incompatible with CentOS7. Since the system will be phased out in mid-2019, we agreed with the experiment not to perform any update. The same happened with a few legacy UIs and some services for the CDF experiment that are involved in an LTDP project (more details in next year's report).
In any case, if an experiment needs a legacy OS, such as SL6, on the Worker Nodes we provide a container solution based on the Singularity\cite{ref:singu} software.
Singularity enables users to have full control of their environment through containers: it can be used to package entire scientific workflows, software, libraries, and even data. T1 users therefore no longer need to ask the farming sysadmins to install software, since everything can be packaged in a container and run. Users are in control of the extent to which containers interact with their host: there can be seamless integration, or little to no communication at all.
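As an illustration, a job wrapper can run its payload inside an SL6 container with a single \texttt{singularity exec} invocation; in the sketch below the image path is hypothetical.
\begin{verbatim}
#!/usr/bin/env python3
"""Illustrative sketch: run a legacy SL6 payload inside a Singularity
container on a CentOS 7 worker node.  The image path is hypothetical."""
import subprocess

IMAGE = "/cvmfs/example/images/sl6-wn.img"   # hypothetical SL6 image

# "singularity exec" runs a command inside the container; the user's
# home directory is bind-mounted by default, so the job sees its files.
subprocess.run(
    ["singularity", "exec", IMAGE, "cat", "/etc/redhat-release"],
    check=True,
)
\end{verbatim}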
Year 2018 was terrible from a security point of view.
Several critical vulnerabilities were discovered, affecting data-center CPUs and major software stacks:
the main ones were Meltdown and Spectre~\cite{ref:meltdown} (see Figures~\ref{meltdown} and~\ref{meltdown2}).
These discoveries required us to intervene promptly in order to mitigate and/or correct the vulnerabilities,
applying software updates (mostly boiling down to updating the Linux kernel and firmware) that most of the time required rebooting the whole farm.
This has a great impact in terms of resource availability, but it is mandatory in order to prevent security incidents and possible disclosure of sensitive data.
Thanks to our internally developed dynamic update procedure, patch application is smooth and almost automatic, avoiding wasted time for the farm staff.
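A much simplified sketch of such a rolling update is given below; the real procedure is more elaborate and integrated with our provisioning system, and the LSF commands, messages and host names used here are assumptions for illustration only.
\begin{verbatim}
#!/usr/bin/env python3
"""Illustrative sketch of a rolling kernel update: close a worker node in
the batch system, wait for it to drain, patch and reboot it, then reopen
it.  LSF commands/messages and host names are assumptions, not the real
internal procedure."""
import subprocess, time

HOSTS = ["wn-001", "wn-002"]                   # hypothetical worker nodes

def drained(host):
    # bjobs reports "No unfinished job found" when nothing runs on the host.
    out = subprocess.run(["bjobs", "-u", "all", "-r", "-m", host],
                         capture_output=True, text=True)
    return "No unfinished job found" in (out.stdout + out.stderr)

for host in HOSTS:
    subprocess.run(["badmin", "hclose", host], check=True)  # stop dispatching
    while not drained(host):
        time.sleep(600)                        # wait for running jobs to end
    subprocess.run(["ssh", host, "yum -y update kernel"], check=True)
    subprocess.run(["ssh", host, "reboot"])    # connection drops here
    time.sleep(900)                            # give the node time to come back
    subprocess.run(["badmin", "hopen", host], check=True)   # reopen the host
\end{verbatim}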
\begin{figure}
\centering
\includegraphics[width=0.5\textwidth]{meltdown.jpg}
%\includegraphics[keepaspectratio,width=12cm]{meltdown.jpg}
\caption{Meltdown and Spectre comparison}
\label{meltdown}
\end{figure}
\begin{figure}
\centering
%\includegraphics[keepaspectratio,width=12cm]{meltdown2.jpg}
\includegraphics[width=0.5\textwidth]{meltdown2.jpg}
\caption{Meltdown attack description}
\label{meltdown2}
\end{figure}
\subsection{HTCondor update}
INFN-T1 decided to migrate from LSF to HTCondor for several reasons.
The main one is that this software has proved to be extremely scalable and ready to stand the forthcoming challenges that High-Luminosity LHC will raise
in our research community in the near future. Moreover, many of the other T1s involved in LHC have announced the transition to HTCondor or have already completed it,
not to mention the fact that our current batch system, LSF, is no longer under support, since INFN decided not to renew the contract with IBM
(the provider of this software, now re-branded ``Spectrum LSF''), in order to save money and pursue the alternative offered by HTCondor.
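As an indication of how the pool will be inspected once the migration is completed, the sketch below uses the official HTCondor Python bindings; the attribute choices are only illustrative of what a replacement for the usual LSF commands could look like and do not describe our final configuration.
\begin{verbatim}
#!/usr/bin/env python3
"""Illustrative sketch: inspect an HTCondor pool with the official Python
bindings (package "htcondor").  Attribute choices are only indicative."""
import htcondor

# Count the slots advertised by the execute nodes, similar to "bhosts".
coll = htcondor.Collector()
slots = coll.query(htcondor.AdTypes.Startd, projection=["Name", "State"])
claimed = sum(1 for s in slots if s.get("State") == "Claimed")
print(f"{len(slots)} slots, {claimed} claimed")

# List the jobs known to the local schedd, similar to "bjobs".
schedd = htcondor.Schedd()
for ad in schedd.query(projection=["ClusterId", "ProcId", "Owner", "JobStatus"]):
    print(ad.get("ClusterId"), ad.get("ProcId"),
          ad.get("Owner"), ad.get("JobStatus"))
\end{verbatim}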
\section{DataBase service: Highly available PostgreSQL}
In 2013 INFN-T1 switched to a custom solution for the job accounting
system~\cite{DGAS} based on a PostgreSQL back-end. The database was
made more robust over time by introducing redundancy, reliable hardware
and storage. This architecture was powerful enough to also host other
database schemas, or even independent instances, to meet the
requirements of user communities (CUORE, CUPID) for their computing
model. A MySQL-based solution is also in place, to accommodate the needs of
the AUGER experiment.
\subsection{Hardware setup}
A High Availability PostgreSQL instance has been deployed on two
identical SuperMicro hosts, ``dbfarm-1'' and ``dbfarm-2'', each one equipped as
follows:
\begin{itemize}
\item Intel(R) Xeon(R) CPU E5-2603 v2 @ 1.80GHz,
\item 32 GB of RAM,
\item two Fibre Channel controllers,
\item a Storage Area Network volume of 2 TB,
\item two redundant power supplies.
\end{itemize}
The path to the SAN storage is also fully redundant, since each Fibre Channel
controller is connected to two independent SAN switches.
One node also hosts two HDDs (1.8 TB, configured with software RAID-1) acting as a service storage
area for supplementary database backups and other maintenance tasks.
\subsection{Software setup}
PostgreSQL 11.1 has been installed on the two hosts; dbfarm-1
has been set up to work as the master and dbfarm-2 as a hot standby
replica. With this configuration the master is the main database,
while the replica can be accessed in read-only mode. This instance is
used to host the accounting database of the farming group, the hardware inventory of
the T1 centre (DOCET~\cite{DOCET}) and a database used by the CUPID
experiment. The content of the latter database is updated directly by
authorized users of the experiment, while jobs running on our worker
nodes access its data from the standby replica.
A second, independent instance has also been installed on dbfarm-2,
working as a hot standby replica of a remote master instance managed
by the CUORE collaboration and located at INFN-LNGS. The continuous
synchronization with the master database happens through a VPN channel.
Local read access from our Worker Nodes to this
instance can be quite intense: the standby server has been
sustaining up to 500 connections without any evident problem.
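As an illustration, a job running on a worker node can read from the standby while verifying that it is indeed querying a replica; the sketch below uses \texttt{psycopg2}, and the host, database, credentials and table names are hypothetical placeholders.
\begin{verbatim}
#!/usr/bin/env python3
"""Illustrative sketch: read-only access to the hot standby replica.
Host, database, credentials and table are hypothetical placeholders."""
import psycopg2

# The standby accepts read-only connections; writes are rejected.
conn = psycopg2.connect(host="dbfarm-2.example", dbname="experimentdb",
                        user="reader", password="secret")
with conn, conn.cursor() as cur:
    # pg_is_in_recovery() is true on a standby, false on the master.
    cur.execute("SELECT pg_is_in_recovery()")
    print("standby:", cur.fetchone()[0])
    cur.execute("SELECT count(*) FROM measurements")  # hypothetical table
    print("rows:", cur.fetchone()[0])
conn.close()
\end{verbatim}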
\subsection{Distributed MySQL service}
A different solution has been in place for the AUGER experiment for several
years now, and it was recently redesigned when moving our Worker
Nodes to CentOS7. Several jobs of the AUGER experiment need
concurrent read-only access to a MySQL (actually MariaDB, with CentOS7
and later) database. A single server instance cannot sustain the
overall load generated by the clients. For this reason, we have
configured a reasonable subset of Worker Nodes (two racks) to host a
local binary copy of the AUGER database. The ``master'' copy of this database
is available from a dedicated User Interface and
users can update its content when they need to.
The copy on the Worker Nodes can be updated every few months, upon
request from the experiment. To do so, we must, in order (a minimal automation sketch is given after the list):
\begin{itemize}
\item drain any running job accessing the database
\item shut down every MariaDB instance
\item update the binary copy using rsync
\item restart the database
\item re-enable normal AUGER activity
\end{itemize}
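A minimal automation of the central steps could look like the following sketch; the draining and re-enabling of jobs is left out, and the host names, paths and MariaDB service name are placeholders rather than our actual configuration.
\begin{verbatim}
#!/usr/bin/env python3
"""Illustrative sketch of the AUGER database refresh on the worker nodes.
Job draining is assumed to have been done already; host names, paths and
the MariaDB service name are placeholders."""
import subprocess

WORKER_NODES = ["wn-101", "wn-102"]            # hypothetical WN subset
SOURCE = "ui-auger.example:/var/lib/mysql/"    # hypothetical master copy

for wn in WORKER_NODES:
    # 1. stop the local MariaDB instance
    subprocess.run(["ssh", wn, "systemctl stop mariadb"], check=True)
    # 2. refresh the binary copy (-a preserves ownership and permissions)
    subprocess.run(["ssh", wn, f"rsync -a --delete {SOURCE} /var/lib/mysql/"],
                   check=True)
    # 3. restart the database
    subprocess.run(["ssh", wn, "systemctl start mariadb"], check=True)
\end{verbatim}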
\section{Helix Nebula Science Cloud}
During the first part of 2018, the farming group was directly involved in the pilot phase of the Helix Nebula Science Cloud project~\cite{ref:hnsc}, whose aim was to allow research institutes like INFN to test commercial clouds against HEP use cases, identifying strong and weak points.
The pilot phase has seen some very intense interaction between the public procurers and both commercial and public service providers.
\subsection{Pilot Phase}
The pilot phase of the HNSciCloud PCP is the final step in the implementation of the hybrid cloud platform proposed by the contractors that were selected. During the period from January to June 2018, the technical activities of the project focused on
scalability of the platforms and on training of new users that will access the pilot at the end of this phase.
Farming members guided the contractors throughout the first part of the pilot phase,
testing the scalability of the proposed platforms, organizing the procurers’ hosted events and assessing the deliverables produced by the contractors together with the other partners of the project.
\subsection{Conclusions of the Pilot Phase}
Improvements to the platforms have been implemented during this phase and, even though
some R\&D activities still had to be completed, the general evaluation of the first part of the pilot phase is positive.
In particular, the Buyers Group reiterated the need for a fully functioning cloud storage service and highlighted the commercial advantage such a transparent data service represents for the Contractors. Coupled with a flexible voucher scheme, such an offering will encourage a greater uptake within the Buyers Group and the wider public research sector. The increase in demand for GPUs, even if not originally considered critical during the design phase, has become more important and highlighted a weak point in the current offer.
\section{References}
\begin{thebibliography}{9}
\bibitem{ref:cineca} Cineca webpage: https://www.cineca.it/
\bibitem{ref:ar17farming} Chierici A. et al. 2017 INFN-CNAF Annual Report 2017, edited by L. dell’Agnello, L. Morganti, and E. Ronchieri, pp. 111
\bibitem{ref:phedex} PhEDEx webpage: https://cmsweb.cern.ch/phedex/about.html
\bibitem{ref:singu} Singularity website: https://singularity.lbl.gov/
\bibitem{ref:meltdown} Meltdown attack website: https://meltdownattack.com/
\bibitem{ref:hnsc} Helix Nebula The Science Cloud website: https://www.hnscicloud.eu/
\bibitem{DGAS} Dal Pra, Stefano. ``Accounting Data Recovery. A Case Report from
INFN-T1'' Nota interna, Commissione Calcolo e Reti dell'INFN,
{\tt CCR-48/2014/P}
\bibitem{DOCET} Dal Pra, Stefano, and Alberto Crescente. ``The data operation centre tool. Architecture and population strategies'' Journal of Physics: Conference Series. Vol. 396. No. 4. IOP Publishing, 2012.
\end{thebibliography}
\end{document}
\documentclass[a4paper]{jpconf}
\usepackage{graphicx}
\usepackage{xspace}
\newcommand{\Fermi}{\emph{Fermi}\xspace}
\begin{document}
\title{The \Fermi-LAT experiment}
\author{
M. Kuss$^{1}$,
F. Longo$^{2}$,
on behalf of the \Fermi LAT collaboration}
\address{$^{1}$ INFN Sezione di Pisa, Pisa, IT}
\address{$^{2}$ University of Trieste and INFN Sezione di Trieste, Trieste, IT}
\ead{michael.kuss@pi.infn.it}
\begin{abstract}
The \Fermi Large Area Telescope is a current generation experiment dedicated to gamma-ray astrophysics.
\end{abstract}
\section{The \Fermi LAT Experiment}
The Large Area Telescope (LAT) is the primary instrument on the \emph{Fermi Gamma-ray Space Telescope} mission, launched on June 11, 2008.
It is the product of an international collaboration between DOE, NASA and academic US institutions as well as international partners in France,
Italy, Japan and Sweden.
The LAT is a pair-conversion detector of high-energy gamma rays covering the energy range from 20 MeV to more than 300 GeV \cite{LATinstrument}.
It has been designed to achieve a good position resolution ($<$10 arcmin) and an energy resolution of $\sim$10 \%.
Thanks to its wide field of view ($\sim$2.4 sr at 1 GeV), the LAT has been routinely monitoring the gamma-ray sky and has shed light on the extreme, non-thermal Universe.
This includes gamma-ray sources such as active galactic nuclei, gamma-ray bursts, galactic pulsars
and their environment, supernova remnants, solar flares, etc.
% ref:
% triggers: last roll-over integral + telemetry trending of LHKGEMSENT
% events: datacat integral
% photons: current value on https://fermi.gsfc.nasa.gov/cgi-bin/ssc/LAT/LATDataQuery.cgi
% 2018: I took Rob's presentation at the collaboration meeting
By the end of 2018, the LAT had registered 640 billion triggers (1800 Hz average trigger rate).
An on-board filter analyses the event topology and discards about 80\% of them.
Of the 128 billion events transferred to ground, 1.2 billion were classified as photons.
All photon data are made public almost immediately;
downlink, processing, preparation and storage take about 8 hours.
\section{Scientific Achievements Published in 2018}
% source: Fermi LAT publications page and links there and Peter's presentation
In 2018, 48 collaboration papers (Cat.\ I and II) were published, keeping the impressive pace of about 50--60 per year since the launch in 2008.
Independent publications by LAT collaboration members (Cat.\ III) amount to 18.
% source: NASA Fermi Bibliography Search Tool
External scientists are also able to analyse the public \Fermi data, resulting in about 100 external publications.
The latter number is also constant over the years, demonstrating the continuous interest of the astrophysics community in LAT science.
\section{The Computing Model}
The \Fermi-LAT offline processing system is hosted by the LAT ISOC (Instrument Science Operations Center)
based at the SLAC National Accelerator Laboratory in California.
The \Fermi-LAT data processing pipeline (e.g.\ see \cite{LATp2} for a detailed technical description)
was designed with the focus on allowing the management of arbitrarily complex work flows and handling multiple tasks simultaneously
(e.g., prompt data processing, data reprocessing, MC production, and science analysis).
The available resources are used for specific tasks: the SLAC batch farm for data processing, high level analysis, and smaller MC tasks,
the batch farm of the CC-IN2P3 at Lyon and the grid resources for large MC campaigns.
The grid resources \cite{CHEP2013} are accessed through a DIRAC (Distributed Infrastructure with Remote Agent Control) \cite{DIRAC} interface to the LAT data pipeline \cite{LAT-DIRAC}.
This setup was put into production mode in April 2014.
\section{Conclusions and Perspectives}
\label{conclusions}
The prototype setup based on the DIRAC framework described in the INFN-CNAF Annual Report 2013 \cite{CNAF2013} proved to be successful.
In 2014 we transitioned into production mode.
However, \Fermi{} is in its full maturity and does not plan major upgrades
that would need to be supported by massive MC simulations.
The regular data processing and possible re-processing are managed by SLAC.
Non-regular MC production is currently taken care of by the Lyon farm.
Note, however, that we expect our usage may rise again,
partially due to changes in the computing model associated with the long-term mission plans of \Fermi-LAT,
and partially to satisfy specialized simulation needs.
\section*{References}
\begin{thebibliography}{99}
\bibitem{LATinstrument} Atwood W B et al.\ 2009 {\it Astrophysical Journal} {\bf 697}, 1071
\bibitem{LATp2} Dubois R 2009 {\it ASP Conference Series} {\bf 411} 189
\bibitem{CHEP2013} Arrabito L et al.\ 2013 CHEP 2013 conference proceedings arXiv:1403.7221
\bibitem{DIRAC} Tsaregorodtsev A et al.\ 2008 {\it Journal of Physics: Conference Series} {\bf 119}, 062048
\bibitem{LAT-DIRAC} Zimmer S et al.\ 2012 {\it Journal of Physics: Conference Series} {\bf 396}, 032121
\bibitem{CNAF2013} Arrabito L et al.\ 2014 {\it INFN-CNAF Annual Report 2013}, edited by L.\ dell'Agnello, F.\ Giacomini, and C.\ Grandi, pp.\ 46
\end{thebibliography}
\end{document}
\documentclass[a4paper]{jpconf}
\usepackage{url}
\usepackage[]{color}
\usepackage{graphicx}
\usepackage{makecell}
\usepackage{booktabs}
\usepackage{subfig}
\usepackage{float}
\usepackage{graphicx}
\usepackage{tikz}
\usepackage[binary-units=true,per-mode=symbol]{siunitx}
% \usepackage{pgfplots}
% \usepgfplotslibrary{patchplots}
% \usepackage[binary-units=true,per-mode=symbol]{siunitx}
\begin{document}
\title{GAMMA experiment}
\author{AGATA collaboration}
\ead{Benedicte.Million@mi.infn.it, Silvia.Leoni@mi.infn.it, Daniel.R.Napoli@lnl.infn.it}
\begin{abstract}
AGATA (Advanced GAmma Tracking Array) represents the state of the art in gamma-ray spectroscopy and is an essential precision tool underpinning a broad program of studies in nuclear structure, nuclear astrophysics and nuclear reactions. Consisting of a full shell of segmented germanium detectors instrumented with digital electronics, its projected detection sensitivity is much higher than that of previous shielded germanium arrays, by up to a factor of 1000 in specific configurations.
\end{abstract}
\section{The GAMMA experiment and the AGATA array}
The strong interaction described by quantum chromodynamics (QCD) is responsible for binding neutrons and protons into nuclei and for the many facets of nuclear structure and reaction physics. Combined with the electroweak interaction, it determines the properties of all nuclei in a similar way as quantum electrodynamics shapes the periodic table of elements. While the latter is well understood, it is still unclear how the nuclear chart emerges from the underlying strong interactions. This requires the development of a unified description of all nuclei based on systematic theories of strong interactions at low energies, advanced few- and many-body methods, as well as a consistent description of nuclear reactions. Nuclear structure and dynamics have not reached the discovery frontier yet (e.g. new isotopes, new elements, …), and a high precision frontier is also being approached with higher beam intensities and purity, along with better efficiency and sensitivity of instruments. The access to new and complementary experiments combined with theoretical advances allows key questions to be addressed such as:
\begin{itemize}
\item How does the nuclear chart emerge from the underlying fundamental interactions?
\item Where are the limits of stability and what is the heaviest element that can be created?
\item How does nuclear structure evolve across the nuclear landscape and what shapes can nuclei adopt?
\item How does nuclear structure change with temperature and angular momentum?
\item How can nuclear structure and reaction approaches be unified?
\item How complex are nuclear excitations?
\item How do correlations appear in dilute neutron matter, both in structure and reactions?
\item What is the density and isospin dependence of the nuclear equation of state?
\end{itemize}
\noindent The experiment GAMMA is addressing most of these questions through gamma-ray spectroscopy measurements, in particular studying: properties of nuclei far from stability by measuring observables (energy, spin, transition probabilities, beta-decay lifetimes) to constrain theoretical models in exotic regions of the nuclear chart; collective modes like Pygmy Dipole Resonances, Giant Resonances; symmetries, including dynamical symmetries; Nuclear shapes; Magnetic moments, as a test of the nature of specific isomeric configurations; conversion electrons (EC), probing electromagnetic decays and E0 transitions. The experimental activity is based on the use of multi-detector arrays at stable (LNL Legnaro, GANIL Caen) and radioactive (GSI Darmstadt, RIKEN, SPES/LNL in future) ion beam facilities. The experimental setups involve the use of Ge detector arrays coupled to complementary detectors such as large scintillators (for high-energy gamma-rays), small scintillators (for multiplicity measurements or fast timing techniques), charged particle detectors, neutron detectors and magnetic spectrometers. The main activities involving CNAF are carried out with AGATA, presently installed at GANIL and coupled now to the VAMOS spectrometer and the segmented charged particle detector MUGAST, and previously to the NEDA neutron array.
\noindent AGATA \cite{ref:gamma_first,ref:gamma_second} is the European Advanced GAmma Tracking Array for nuclear spectroscopy, consisting of a full shell of high-purity segmented germanium detectors. Being fully instrumented with digital electronics, it exploits the novel technique of gamma-ray tracking. AGATA will be employed at all the large-scale radioactive and stable beam facilities and in the long term will be completed in its full 60-detector-unit geometry, in order to realize the envisaged scientific program. AGATA is being realized in phases, with the goal of completing the first phase of 20 units by 2020. AGATA has been successfully operated since 2009 at LNL, GSI and GANIL, taking advantage of different beams and powerful ancillary detector systems. It will be used at LNL again in 2022, with stable beams and later with SPES radioactive beams, and in future years it is planned to be installed at GSI/FAIR, Jyv\"askyl\"a, GANIL (again) and HIE-ISOLDE.
\section{AGATA computing model and the role of CNAF}
At present the array consists of 15 units, each composed of a cluster of 3 HPGe crystals.
Each individual crystal is divided into 36 segments, for a total of 38 associated electronics channels per crystal.
The data acquisition, including Pulse Shape Analysis (PSA), can sustain up to 4--5 kHz of events per crystal.
The bottleneck is presently the Pulse Shape Analysis procedure used to extract the interaction positions from the HPGe detector traces.
With future, faster processors one expects to be able to run the PSA at 10 kHz per crystal. The amount of raw data per experiment, including traces,
is about 20 TB for a standard data taking of about one week and can increase to 50 TB for specific experimental configurations.
The collaboration is thus acquiring about 250 TB of data per year. During data taking, raw data are temporarily stored
in a computer farm located at the experimental site and, later on, dispatched on the GRID to two different centers, CC-IN2P3 (Lyon) and CNAF (INFN Bologna),
used as Tier 1 sites: the duplication provides a safeguard in case of failures or losses at one of the Tier-1 sites.
The GRID itself is seldom used to re-process the data; users usually download their data set to local storage,
where they can run emulators able to manage part of or the full workflow.
\section{References}
\begin{thebibliography}{9}
\bibitem{ref:gamma_first}
S. Akkoyun et al., Nucl. Instrum. Methods A 668, 26 (2012).
\bibitem{ref:gamma_second} A. Gadea et al., Nucl. Instrum. Methods A 654, 88 (2011).
\end{thebibliography}
\end{document}
\documentclass[a4paper]{jpconf}
\usepackage[font=small]{caption}
\usepackage{graphicx}
\begin{document}
\title{ICARUS}
\author{A. Rappoldi$^1$, on behalf of the ICARUS Collaboration}
\address{$^1$ INFN Sezione di Pavia, Pavia, IT}
\ead{andrea.rappoldi@pv.infn.it}
\begin{abstract}
After its successful operation at the INFN underground laboratories
of Gran Sasso (LNGS) from 2010 to 2013, ICARUS has been moved to
the Fermilab laboratory near Chicago (FNAL),
where it represents an important element of the
Short Baseline Neutrino Project (SBN).
Indeed, the ICARUS T600 detector, which has undergone various technical upgrades
at CERN to improve its performance and make it more suitable
for operation at shallow depth, will constitute one of three Liquid Argon (LAr) detectors
exposed to the FNAL Booster Neutrino Beam (BNB).
The purpose of this project is to provide adequate answers to the
``sterile neutrino puzzle'', prompted by the anomalies claimed by various
other experiments in the measurement of the parameters that govern
neutrino flavor oscillations.
\end{abstract}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{The ICARUS project}
\label{ICARUS}
The technology of the Liquid Argon Time Projection Chamber (LAr TPC)
was first proposed by Carlo Rubbia in 1977. It was conceived as a tool for
detecting neutrinos in a way that would result in completely uniform imaging, with high
accuracy, of massive volumes (several thousand tons).
ICARUS T600, the first large-scale detector exploiting this detection technique,
is the biggest LAr TPC ever realized, with a cryostat containing 760 tons of liquid argon.
Its construction was the culmination of many years of ICARUS collaboration R\&D studies,
with larger and larger laboratory and industrial prototypes, mostly developed thanks
to the Italian National Institute for Nuclear Physics (INFN), with the support of CERN.
Nowadays, it represents the state of the art of this technique, and it marks a major
milestone in the practical realization of large-scale liquid-argon detectors.
The ICARUS T600 detector was previously installed in the underground Italian INFN Gran
Sasso National Laboratory (LNGS) and was the first large-mass LAr TPC operating as a continuously
sensitive general-purpose observatory.
The detector was exposed to the CERN Neutrinos to Gran Sasso (CNGS) beam,
a neutrino beam produced at CERN and
traveling undisturbed straight through Earth for 730 km.
This very successful run lasted three years (2010-2013),
during which $8.6 \cdot 10^{19}$ protons on target were collected with a
detector live time exceeding 93\%, recording 2650 CNGS neutrino interactions
(in agreement with expectations) as well as cosmic rays (with a total exposure of 0.73 kton$\cdot$year).
ICARUS T600 demonstrated the effectiveness of the so-called {\it single-phase} TPC technique
for neutrino physics, providing a series of results, both from the technical and from the
physical point of view.
Besides the excellent detector performance, both as a tracking device and as a homogeneous calorimeter,
ICARUS demonstrated a remarkable capability in electron-photon separation and particle
identification, exploiting the measurement of dE/dx versus range and including the
reconstruction of the invariant mass of photon pairs (coming from $\pi^0$ decay), to reject at an unprecedented level
the Neutral Current (NC) background to $\nu_e$ Charged Current (CC) events (see Fig.~\ref{Fig1}).
\begin{figure}[ht]
\centering
\includegraphics[width=0.8\textwidth]{icarus-nue.png}\\
\includegraphics[width=0.6\textwidth]{ICARUS-nue-mip.png}
\caption{\label{Fig1} {\it Top:} a typical $\nu_e$ CC event recorded during the ICARUS operation
at LNGS. The neutrino, coming from the right, interacts with an Ar nucleus and produces a
proton (short, heavily ionizing track) and an electron (light grey track) which initiates an electromagnetic
shower developing to the left. {\it Bottom:} the accurate analysis of {\it dE/dx} allows one to
easily distinguish the parts of the track where several particles overlap,
locating the beginning of the shower with precision.}
\end{figure}
The tiny intrinsic $\nu_e$ component in the CNGS $\nu_{\mu}$
beam allowed ICARUS to perform a sensitive search for anomalous LSND-like $\nu_\mu \rightarrow \nu_e$ oscillations.
Globally, seven electron-like events have been observed, consistent with the $8.5 \pm 1.1$ events
expected from the intrinsic beam $\nu_e$ component and standard oscillations, providing the limits on
the oscillation probability $P(\nu_\mu \rightarrow \nu_e) \le 3.86 \cdot 10^{-3}$ at 90\% CL and
$P(\nu_\mu \rightarrow \nu_e) \le 7.76 \cdot 10^{-3}$ at 99\% CL, as shown in
Fig.~\ref{Fig2}.
\begin{figure}[ht]
\centering
\includegraphics[width=0.5\textwidth]{ICARUS-sterile-e1529944099665.png}
\caption{\label{Fig2} Exclusion plot for the $\nu_\mu \rightarrow \nu_e$ oscillations.
The yellow star marks the best-fit point of MiniBooNE.
The ICARUS limits on the oscillation probability are shown with the red lines. Most of the
LSND allowed region is excluded, except for a small area around $\sin^2 2 \theta \sim 0.005$,
$\Delta m^2 < 1\ \mathrm{eV}^2$.
}
\end{figure}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{ICARUS at FNAL}
\label{FNAL}
After its successful operation at LNGS, the ICARUS T600 detector was planned
to be included in the Short Baseline Neutrino Project (SBN) at Fermilab\cite{SBN},
near Chicago, aiming to give a definitive answer to the so-called
{\it sterile neutrino puzzle}.
In this context, it will operate as the {\it far detector}, placed along the
Booster Neutrino Beam (BNB) line, 600 meters from the target (see Fig.~\ref{Fig3}).
\begin{figure}[h]
\centering
\includegraphics[width=0.8\textwidth]{SBN.png}
\caption{\label{Fig3} The Short Baseline Neutrino Project (SBN) at
Fermilab (Chicago) will use three LAr TPC detectors, exposed to the
Booster Neutrino Beam at different distances from the target.
The ICARUS T600 detector, placed at 600 m, will operate as the {\it far detector},
devoted to detecting any anomaly in the beam flux and spectrum with respect to
the initial beam composition measured by the {\it near detector}
(SBND).
Such anomalies, due to neutrino flavour oscillations, would consist of
either $\nu_e$ appearance or $\nu_\mu$ disappearance.
}
\end{figure}
For this purpose, the ICARUS T600 detector underwent an intensive
overhaul at CERN, before being shipped to FNAL,
in order to make it better suited to operation at the surface (rather than in
an underground environment).
These important technical improvements took place in the CERN
Neutrino Platform framework (WA104) from 2015 to 2017.
In addition to significant mechanical improvements, notably
a new cold vessel with purely passive thermal insulation,
some important innovations have been applied to the scintillation
light detection system\cite{PMT} and to the readout
electronics\cite{Electronics}.
% The role of ICARUS will be to detect any anomaly in the neutrino beam flux and
% composition that can occour during its propagation (from the near to the
% far detector), caused by neutrino flavour oscillation.
% This task requires to have an excellent capability to detect and identify
% neutrino interaction within the LAr sensitive volume, rejecting any other
% spurious event with a high level of confidence.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{ICARUS data amount}
\label{sec:icarus-data}
% The new ICARUS T600 detector (that has been modified and improved to operate
% at FNAL) contains about 54,000 sensitive wires (that give an electric signal
% proportional to the charge released into the LAr volume by ionizing particles)
% and 180 large PMTs, producing a prompt signal coming from the scintillation light.
% Both these analogic signal types are then converted in digital form, by mean of
% fast ADC modules.
%
% During normal run conditions, the trigger rate is about 0.5 Hz, and
% a full event, consisting of the digitized charge signals of all wires
% and all PMTs, has a size of about 80 MB (compressed).
% Therefore, the expected acquisition rate is about 40 MB/s, corrisponding
%to 1 PB/yr.
The data produced by the ICARUS detector (a LAr Time Projection Chamber)
basically consist of a large number of waveforms, generated by sampling the electric
signals induced on the sensing wires by the drift of the charge deposited along
the trajectories of charged particles within the LAr sensitive volume.
The waveforms recorded on about 54,000 wires and 360 PMTs are digitized
(at sample rates of 2.5 MHz and 500 MHz, respectively) and compressed,
resulting in a total size of about 80 MB/event.
Considering the foreseen acquisition rate of about 0.5 Hz (in normal
run conditions), the expected data flow is about 40 MB/s, which
corresponds to a data production of about 1 PB/yr.
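Assuming continuous data taking, the quoted yearly volume follows directly from these figures:
\[
0.5\ \mathrm{Hz} \times 80\ \mathrm{MB/event} = 40\ \mathrm{MB/s},
\qquad
40\ \mathrm{MB/s} \times 3.15 \cdot 10^{7}\ \mathrm{s/yr} \simeq 1.3\ \mathrm{PB/yr},
\]
i.e.\ of the order of 1 PB/yr once the actual duty cycle is taken into account.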
The raw data are then processed by automated filters that recognize
and select the various event types (cosmic, beam, background, etc.) and rewrite
them in a more flexible format suitable for the subsequent analysis,
which is also supported by interactive graphics programs.
% The experiment is expected to start commissioning phase at the end of 2018,
% with first data coming as soon as the Liquid Argon filling procedure is completed.
% Trigger logic tuning will last not less than a couple of months during which
% one PB of data is expected.
Furthermore, the ICARUS Collaboration is actively working on
producing the Monte Carlo events needed
to design and test the trigger conditions to be implemented on the detector.
This is done by using the same analysis and simulation tools
developed at Fermilab for the SBN detectors (the {\it LArSoft} framework), in
order to have a common software platform and to facilitate algorithm testing
and performance checking by all members of the collaboration.
During 2018 many activities related to the detector installation
were still ongoing, and the start of data acquisition
is scheduled for 2019.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Role and contribution of CNAF}
\label{CNAF}
All the data (raw and reduced) will be stored at Fermilab using the local facility;
however, the ICARUS collaboration agreed to have a mirror site in Italy
(located at the INFN Tier 1 at CNAF) retaining a full replica of the preselected
raw data, both for redundancy and to provide more direct data access
to the European part of the collaboration.
The CNAF Tier 1 computing resources assigned to ICARUS for 2018 consist of
4000 HS06 of CPU, 500 TB of disk storage and 1500 TB of tape archive.
A small fraction of the available storage has been used to
keep a copy of all the raw data acquired at LNGS,
which are still subject to analysis.
During 2018 the ICARUS T600 detector was still in preparation, so
only a limited fraction
of these resources has been used, mainly to perform data transfer tests
(from FNAL to CNAF) and to check the installation of the LArSoft framework
in the Tier 1 environment. For this last purpose, a dedicated virtual
machine with a custom environment was also used.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section*{References}
\begin{thebibliography}{1}
\bibitem{SBN}
R. Acciarri et al.,
{\it A Proposal for a Three Detector Short-Baseline Neutrino
Oscillation Program in the Fermilab Booster Neutrino Beam},
arXiv:1503.01520 [physics.ins-det]
\bibitem{PMT}
M. Babicz et al.,
{\it Test and characterization of 400 Hamamatsu R5912-MOD
photomultiplier tubes for the ICARUS T600 detector}.
JINST 13 (2018) P10030
\bibitem{Electronics}
L. Bagby et al.,
{\it New read-out electronics for ICARUS-T600 liquid
argon TPC. Description, simulation and tests of the new
front-end and ADC system}.
JINST 13 (2018) P12007
\end{thebibliography}
\end{document}
......@@ -6,13 +6,13 @@
\author{C. Bozza$^1$, T. Chiarusi$^2$, K. Graf$^3$, A. Martini$^4$ for the KM3NeT Collaboration}
\address{$ˆ1$ Department of Physics of the University of Salerno and INFN Gruppo Collegato di Salerno, via Giovanni Paolo II 132, 84084 Fisciano, Italy}
\address{$ˆ1$ University of Salerno and INFN Gruppo Collegato di Salerno, Fisciano (SA), IT}
\address{$ˆ2$ INFN, Sezione di Bologna, v.le C. Berti-Pichat, 6/2, Bologna 40127, Italy}
\address{$ˆ2$ INFN Sezione di Bologna, Bologna, IT}
\address{$ˆ3$ Friedrich-Alexander-Universit{\"a}t Erlangen-N{\"u}rnberg, Erlangen Centre for Astroparticle Physics, Erwin-Rommel-Stra{\ss}e 1, 91058 Erlangen, Germany}
\address{$ˆ3$ Friedrich-Alexander-Universit{\"a}t Erlangen-N{\"u}rnberg, Erlangen, GE}
\address{$ˆ4$ INFN, LNF, Via Enrico Fermi, 40, Frascati, 00044 Italy}
\address{$ˆ4$ INFN-LNF, Frascati, IT}
\ead{cbozza@unisa.it}
......@@ -24,7 +24,7 @@ from astrophysical sources; the ORCA programme is devoted to
investigate the ordering of neutrino mass eigenstates. The
unprecedented size of detectors will imply PByte-scale datasets and
calls for large computing facilities and high-performance data
centres. The data management and processing challenges of KM3NeT are
centers. The data management and processing challenges of KM3NeT are
reviewed as well as the computing model. Specific attention is given
to describing the role and contributions of CNAF.
\end{abstract}
......@@ -80,7 +80,7 @@ way. One ORCA DU was also deployed and operated in 2017, with smooth
data flow and processing. At present time, most of the computing load is
due to simulations for the full building block, now being enriched with
feedback from real data analysis. As a first step, this
was done at CC-IN2P3 in Lyon, but usage of other computing centres is
was done at CC-IN2P3 in Lyon, but usage of other computing centers is
increasing and is expected to soon spread to the full KM3NeT
computing landscape. This process is being driven in accordance to the
goals envisaged in setting up the computing model. The KM3NeT
......@@ -105,14 +105,14 @@ flow with a reduction from $5 GB/s$ to $5 MB/s$ per \emph{building
block}. Quasi-on-line reconstruction is performed for selected
events (alerts, monitoring). The output data are temporarily stored on
a persistent medium and distributed with fixed latency (typically less
than few hours) to various computing centres, which altogether
than few hours) to various computing centers, which altogether
constitute Tier 1, where events are reconstructed by various fitting
models (mostly searching for shower-like or track-like
patterns). Reconstruction further reduces the data rate to about $1
MB/s$ per \emph{building block}. In addition, Tier 1 also takes care
of continuous detector calibration, to optimise pointing accuracy (by
working out the detector shape that changes because of water currents)
and photomultiplier operation. Local analysis centres, logically
and photomultiplier operation. Local analysis centers, logically
allocated in Tier 2 of the computing model, perform physics analysis
tasks. A database system interconnects the three tiers by distributing
detector structure, qualification and calibration data, run
......@@ -124,10 +124,10 @@ book-keeping information, and slow-control and monitoring data.
\label{fig:compmodel}
\end{figure}
KM3NeT exploits computing resources in several centres and in the
KM3NeT exploits computing resources in several centers and in the
GRID, as sketched in Fig.~\ref{fig:compmodel}. The conceptually simple
flow of the three-tier model is then realised by splitting the tasks
of Tier 1 to different processing centres, also optimising the data
of Tier 1 to different processing centers, also optimising the data
flow and the network path. In particular, CNAF and CC-IN2P3 aim at being
mirrors of each other, containing the full data set at any moment. The
implementation for the data transfer from CC-IN2P3 to CNAF (via an
......@@ -144,9 +144,9 @@ for a while becuse of the lack of human resources.
\section{Data size and CPU requirements}
Calibration and reconstruction work in batches. The raw data related
to the batch are transferred to the centre that is in charge of the
to the batch are transferred to the center that is in charge of the
processing before it starts. In addition, a rolling buffer of data is
stored at each computing centre, e.g.\ the last year of data taking.
stored at each computing center, e.g.\ the last year of data taking.
Simulation has special needs because the input is negligible, but the
computing power required is very large compared to the needs of
......@@ -179,9 +179,8 @@ Thanks to the modular design of the detector, it is possible to quote
the computing requirements of KM3NeT per \emph{building block}, having
in mind that the ARCA programme corresponds to two \emph{building
blocks} and ORCA to one. Not all software could be benchmarked, and
some estimates are derived by scaling from ANTARES ones. When needed,
a conversion factor about 10 between cores and HEPSpec2006 (HS06) is
used in the following.
some estimates are derived by scaling from ANTARES ones.
In the following, the standard conversion factor ($\sim$10) between cores and HEPSpec2006 (HS06) is used.
\begin{table}
\caption{\label{cpu}Yearly resource requirements per \emph{building block}.}
......@@ -211,7 +210,7 @@ resources at CNAF has been so far below the figures for a
units are added in the following years. KM3NeT software that
runs on the GRID can use CNAF computing nodes in opportunistic mode.
Already now, the data handling policy to safeguard the products of Tier-0
Already now, the data handling policy to safeguard the products of Tier 0
is in place. Automatic synchronization from each shore station to both
CC-IN2P3 and CNAF runs daily and provides two maximally separated
paths from the data production site to final storage places. Mirroring
......@@ -219,11 +218,11 @@ and redundancy preservation between CC-IN2P3 and CNAF are foreseen and
currently at an early stage.
CNAF has already added relevant contributions to KM3NeT in terms of
know-how for IT solution deployment, e.g.~the above-mentioned synchronisation, software development solutions and the software-defined network at the Tier-0 at
know-how for IT solution deployment, e.g.~the above-mentioned synchronisation, software development solutions and the software-defined network at the Tier 0 at
the Italian site. Setting up Software Defined Networks (SDN) for data
acquisition deserves a special mention. The SDN technology\cite{SDN} is used to
configure and operate the mission-critical fabric of switches/routers
that interconnects all the on-shore resources in Tier-0 stations. The
that interconnects all the on-shore resources in Tier 0 stations. The
KM3NeT DAQ is built around switches compliant with the OpenFlow 1.3
protocol and managed by dedicated controller servers. With a limited
number of Layer-2 forwarding rules, developed on purpose for the KM3NeT
......