
Compare revisions

Showing
with 270 additions and 159 deletions
......@@ -10,7 +10,6 @@
\address{$^1$ INFN-CNAF, Bologna, Italy}
\ead{alessandro.costantini@cnaf.infn.it}
\begin{abstract}
......@@ -98,7 +97,7 @@ The DEEP Hybrid DataCloud project is structured into six different work packages
Activities (NA) devoted to the coordination, communication and community liaison; Service Activities (SA)
focused on the provisioning of services and resources for the execution of the data analysis challenges; and
Joint Research Activities (JRAs), dealing with the development of new components and technologies to
support data analysis. Figure \ref{DEEP-WP} describes the interaction between the different work packages.
support data analysis. Figure \ref{fig-wp} describes the interaction between the different work packages.
\begin{figure}[h]
\centering
......
......@@ -237,7 +237,7 @@ components whose goal is to maximise the accessibility of data to clients while
are aggregated globally through a federation.
To such purpose, various technologies are available to the project to serve as the basis of an implementation:
\begin{itemize}
\item The system runs native dCache \cite{dcache} or EOS, but operates in a "caching mode" staging data in
\item The system runs native dCache \cite{dcache} or EOS, but operates in a ``caching mode'' staging data in
when a cache miss occurs.
\item A service such as Dynafed \cite{dynafed} will be augmented to initiate data movement. While it would
hold only metadata, it would use a local storage system for this.
......
contributions/storage/danni.PNG

1.52 MiB

......@@ -69,13 +69,13 @@ A list of storage systems in production as of 31.12.2018 is given in Table \ref{
The first three months of 2018 were completely dedicated to recovery of the hardware and restoring of the services after the flood event which
happened on November $9^{th}$ 2017.
At that time, the Tier-1 storage at CNAF consisted of the resources listed in Table \ref{table:1}. Almost all storage resources were damaged or contaminated by dirty water.
At that time, the Tier-1 storage at CNAF consisted of the resources listed in Table \ref{table:1}. Almost all storage resources were damaged or contaminated by dirty water (Figure \ref{fig:danni}).
\begin{table}[h!]
\centering
\begin{tabular}{|c|c|c|c|}
\hline
System & Quantity & Net Capacity, TB & (\%) of Use, \\
System & Quantity & Net Capacity, TB & (\%) of Use \\
\hline
DDN SFA 12K & 2 & 10240 & 95 \\
DDN SFA 10K & 1 & 2500 & 96 \\
......@@ -90,7 +90,15 @@ At that time, the Tier-1 storage at CNAF consisted of the resources listed in Ta
\label{table:1}
\end{table}
The recovery started as soon as the flooded halls became accessible. As the first step, we extracted all tape cartridges and hard disks which went in contact with water respectively from the tape library and from disk enclosures. After extraction, all of them were marked with respective position, cleaned, dried and stored in secure place.
\begin{figure}
\centering
\includegraphics[width=0.6\textwidth]{danni.PNG}
\caption[]{ Almost all storage resources were damaged or contaminated by dirty water.}
\label{fig:danni}
\end{figure}
The recovery started as soon as the flooded halls became accessible. As a first step, we extracted all tape cartridges and hard disks that had come into contact with water from the tape library and from the disk enclosures, respectively. After extraction, all of them were labelled with their original position, cleaned, dried and stored in a secure place.
\subsection{Recovery of disk storage systems}
The strategy for recovering disk storage systems varied depending on redundancy configuration and availability of technical support.
......@@ -99,50 +107,50 @@ The strategy for recovering disk storage systems varied depending on redundancy
All DDN storage systems consisted of a pair of controllers and 10 Disk Enclosures and were configured with RAID6 (8+2) level of data protection in such a way that every RAID group was distributed over all 10 enclosures. Thus, having one Disk Enclosure damaged in every DDN storage system means reduced level of redundancy. In this case we decided to operate systems with reduced redundancy for the time needed to evacuate data to newly installed storage or substitute damaged enclosures and relative disks with new ones and rebuild missing parity.
For the most recent and still maintained systems, we decided to replace all potentially damaged parts, and specifically 3 Disk Enclosures and 3x84 8TB disks.
After cleaning and drying ,we tested several disk drives in our lab and found that Helium filled HDD being well insulated are mostly immune to water contamination.
The only sensitive part on such drives is the electronic board and connectors which are easily cleanable even without special equipment.
After cleaning and drying, we tested several disk drives in our lab and found that helium-filled HDDs, being well insulated, are mostly immune to water contamination.
The only sensitive parts on such drives are the electronic board and connectors which are easily cleanable even without special equipment.
Cleaning of Disk Enclosures is much more complicated or even impossible.
For this reason, we decided to replace only DAE and populate them with old but cleaned HDDs, startup the system and then replace and reconstruct old disks one by one while in production. In this way we were able to start using the biggest part of our storage immediately after restore of our power plant.
For this reason, we decided to replace only the Disk Array Enclosures (DAE) and populate them with the old but cleaned HDDs, start up the system and then replace and reconstruct the old disks one by one while in production. In this way we were able to start using the largest part of our storage immediately after our power plant was restored.
For the older DDN systems like SFA10000 and S2A 9900, we decided to disconnect the contaminated enclosures (one in each system) and to run them with reduced redundancy (RAID5 8+1 instead of RAID6 8+2) while moving data to the new storage systems.
\subsubsection{Dell}
Air-filled disks after cleaning and drying demonstrated limited operability (up to 2-3 weeks), usually enough for data evacuation.
For Dell MD3860f storage system the situation was quite different since there were only 3 DAE of 60 HDD each, 24 contaminated disks in each system and data protection was based on Distributed RAID technology.
After cleaning and drying, air-filled disks demonstrated limited operability (up to 2-3 weeks), usually enough for data evacuation.
For Dell MD3860f storage system the situation was quite different since there were only 3 DAE of 60 HDD each, 24 contaminated disks in each system
and data protection was based on Distributed RAID technology.
In this case, working in close connection with Dell Support Service and trying to minimize costs, we decided to replace only contaminated elements like electronics boards, backplanes and chassis, leaving original (cleaned and dried) disks in their places and replacing them with new ones after powering-on the system one-by-one, so to allow the rebuild of missing parity. Replacement and rebuild tookn about 3 weeks for each MD3860f system. During this time, we observed only 3 failures (distributed in time) of “wet” HDDs successfully recovered by automated rebuild using reserved capacity.
In this case, working in close contact with Dell Support Service and trying to minimize costs,
we decided to replace only the contaminated elements, such as electronic boards, backplanes and chassis, leaving the original (cleaned and dried) disks
in place and, after powering on the system, replacing them one by one with new ones so as to allow the rebuild of missing parity.
Replacement and rebuild took about 3 weeks for each MD3860f system. During this time, we observed only 3 failures (distributed in time) of ``wet'' HDDs, all successfully recovered by automated rebuild using reserved capacity.
\subsubsection{Huawei}
The Huawei OceanStor 6800v5 storage system consisting of 12 disk enclosures of 75 HDD each were installed in 2 cabinets and ended up with two disks enclosures on the lowest level. Therefore, they were contaminated by water. The two contaminated disk enclosures belonged to two different Storage pools.
The Huawei OceanStor 6800v5 storage system, consisting of 12 disk enclosures of 75 HDD each, was installed in 2 cabinets and ended up with two disk enclosures on the lowest level. Therefore, they were contaminated by water. The two contaminated disk enclosures belonged to two different Storage pools.
The data protection in this case was similar to that adopted for Dell MD3860, i.e. three Distributed Raid groups built on top of Storage pools of four disk enclosures. For the recovery we followed the procedure described above, and replaced two disk enclosures. The spare parts were delivered and installed, the disks were cleaned and installed in their original places. However, when powered on, the system did not recognize new enclosures. It turned out that delivered enclosures were incompatible on firmware level with the controllers. While debugging this issue, the system remained powered on and the disks began deteriorating. Finally, when the compatibility issue was solved after two weeks, the number of failed disks had exceeded the supported redundancy. Hence, two out of three RAID-set became permanently damaged, and two third of all data stored on this system were permanently lost.
The data protection in this case was similar to that adopted for the Dell MD3860f, i.e. three Distributed RAID groups built on top of Storage pools of four disk enclosures. For the recovery we followed the procedure described above, and replaced two disk enclosures. The spare parts were delivered and installed, and the disks were cleaned and installed in their original places. However, when powered on, the system did not recognize the new enclosures. It turned out that the delivered enclosures were incompatible with the controllers at the firmware level. While this issue was being debugged, the system remained powered on and the disks began deteriorating. Finally, when the compatibility issue was solved after two weeks, the number of failed disks had exceeded the supported redundancy. Hence, two out of three RAID sets became permanently damaged, and two thirds of all data stored on this system were permanently lost.
The total volume of lost data amounts to 1.4 PB out of 22 PB stored at the CNAF data center at the time of the flood.
\subsection{Recovery of tapes and tape library}
The SL8500 tape library was contaminated by water in its lowest 20 cm, enough to damage several components and 166 tape cartridges that were stored in the first two levels of slots (out of a total of 5500 cartridges in the tape library).
The SL8500 tape library was contaminated by water in its lowest 20 cm, enough to damage several components and 166 cartridges that were stored in the first two levels of slots (out of a total of 5500 cartridges in the tape library).
Part of the damaged tapes (16) were still empty.
As a first intervention, wet tapes were removed and placed in a safe place, so to let them dry and to start evaluating the potential data loss.
The TSM database was restored from a backup copy saved on a separate storage system, evacuated to CNR site. This operation permitted to individuate the content of all wet tapes.
As a first intervention, wet tapes were removed and put in a safe place, so as to let them dry and to start evaluating the potential data loss. The Spectrum Protect database was restored from a backup copy saved on a separate storage system, evacuated to the CNR site. This operation made it possible to identify the content of all wet tapes.
We communicated the content of each wet tape to the experiments, asking them whether the data on those tapes could be recovered from other sites or possibly be reproduced.
It turned out that data contained in 75 tapes were unique and non-reproducible, so those cartridges were sent to a laboratory of an external company to be recovered.
The recovery process lasted 6 months and 6 tapes resulted partially unrecoverable (20 TB lost out of a total of 630 TB).
In parallel, a not-trivial work started to clean, repair and certify again the library, finally reinstating the maintenance contract that we still had in place (though temporarily suspended) with Oracle. External technicians disassembled and cleaned all the library and its modules, which also allowed the underlying damaged floating floor to be replaced. Main powers and two robot hands were replaced, and one T10kD tape drive went lost.
When the SL8500 was finally ready and turned on again, a control board placed in the front door panel got burned, and was therefore replaced, clearly damaged by the moisture.
In parallel, non-trivial work started to clean, repair and re-certify the library, finally reinstating the maintenance contract that we still had in place (though temporarily suspended) with Oracle. External technicians disassembled and cleaned the whole library and its modules, which also allowed the underlying damaged floating floor to be replaced. Main powers and two robot hands were replaced, and one T10kD tape drive was lost. When the SL8500 was finally ready and turned on again, a control board placed in the front door panel, clearly damaged by the moisture, burned out and was therefore replaced.
Once the tape system was put back in production, we audited a sample of non-wet cartridges in order to understand whether the humidity had damaged the tapes during the period immediately after the flood. 500 cartridges (4.2 PB), heterogeneous per experiment and age, were chosen. As a result, 90 files resulted unreadable from 2 tapes, that is a normal error rate compared to production, so no issue related to the exposure to the water has been observed.
Once the tape system was put back in production, we audited a sample of non-wet cartridges in order to understand whether the humidity had damaged the tapes during the period immediately after the flood. 500 cartridges (4.2 PB), heterogeneous in experiment and age, were chosen. As a result, 90 files from 2 tapes turned out to be unreadable, which is a normal error rate compared to production, so no issue related to the exposure to water was observed.
The flood affected also several tapes (of 8 GB each) containing data taken from the RUN1 of the CDF experiment, that ran at Fermilab since 1990. When the flood happened, CNAF team had been working to replicate CDF data stored on those old media tapes to modern and reliable storage technologies, in order to make them accessible for further usage. Those tapes were dried in the hours immediately after the flood, but their legibility was not verified afterwards.
The flood also affected several tapes (of 8 GB each) containing data taken during RUN1 of the CDF experiment, which ran at Fermilab from 1990 to 1995. When the flood happened, the CNAF team had been working to replicate the CDF data stored on those old tapes to modern and reliable storage technologies, in order to make them accessible for further usage. Those tapes were dried in the hours immediately after the flood, but their readability was not verified afterwards.
\subsection{Recovery of servers, switches, etc.}
In total 15 servers were damaged by contact with water, mainly by leak of acid from on-board batteries which happens in prolonged presence of moisture. In fact, recovery of servers was not of our priority and all contaminated servers remained untouched for about a month. Only one server has been recovered, 6 servers were replaced by already dismissed ones still in working conditions, and 8 servers were purchased as new.
In total 15 servers were damaged by contact with water, mainly by acid leaking from on-board batteries, which happens in the prolonged presence of moisture. In fact, recovery of servers was not our priority, and all contaminated servers remained untouched for about a month. Only one server was recovered, 6 servers were replaced by already decommissioned ones still in working condition, and 8 servers were purchased new.
Also, three Fibre Channel switches were affected by the flood: one Brocade 48000 (384 ports) and two Brocade 5300 (96 ports each). All three switches were successfully recovered after cleaning and replacement of power supply modules.
\subsection{Results of hardware recovery}
......@@ -153,7 +161,7 @@ At the end, after the restart of the Tier1 data center, we have completely recov
\begin{tabular}{|c|c|c|c|c|p{4cm}|}
\hline
Category & Device & qty & Tot. Capacity & Status & Comment \\
Category & Device & Q.ty & Tot. Capacity & Status & Comment \\
\hline
SAN & Brocade 48000 & 1 & 384 ports & recovered & repaired power distribution board \\
SAN & Brocade 5300 & 2 & 196 ports & recovered & replaced power supply units \\
......@@ -172,7 +180,7 @@ Servers & & 15 && recovered & 1 recovered and 14 replaced\\
\section{Storage infrastructure resiliency}
Considering the increase in single disk, capacity we have moved from RAID6 data protection to Distributed RAID in order to speed up the rebuild of the eventually failed disks. On the other hand, given the foreseen (huge) increase of the installed disk capacity, we are doing a consolidation of the disk-server infrastructure with a sharp decrease in their number: in the last two tenders, each server was configured with 2x100 Gbps Ethernet and 2x56 Gbps (FDR) IB connections while the disk density has been increased from ~200 TB-N/server to ~1000 TB-N/server.
Considering the increase in single disk capacity, we have moved from RAID6 data protection to Distributed RAID in order to speed up the rebuild of disks that eventually fail. On the other hand, given the foreseen (huge) increase of the installed disk capacity, we are consolidating the disk-server infrastructure, sharply decreasing the number of servers: in the last two tenders, each server was configured with 2x100 Gbps Ethernet and 2x56 Gbps (FDR) IB connections, while the disk density has been increased from $\sim$200 TB-N/server to $\sim$1000 TB-N/server.
Currently, we have about 45 disk servers to manage $\sim$37 PB of storage capacity.
......@@ -182,45 +190,38 @@ We are trying to keep all our infrastructures redundant: the dual-path connectio
The StoRM instances have been virtualized both allowing the implementation of HA.
\section{GEMSS}
GEMSS is the Mass Storage System used at the Tier-1, a full HSM integration of the General Parallel File System (GPFS), the Tivoli Storage Manager (TSM), both from IBM, and StoRM (developed at INFN); its primary advantages are a high reliability and a low effort needed for its operation.
The GPFS and TSM interaction is the main component of the GEMSS system: a thin software layer has been developed in order to optimize the migration (disk to tape data flow) and, in particular, the recall (tape to disk data flow) operations.
While the native GPFS and TSM implementation of HSM performs recalls file by file, GEMSS collects all the requests in a configurable time lapse and then re-orders them to minimize the number of mount/dismount operations in the tape library and unnecessary tape ``seek'' operations on a single tape.
The migrations from disk to tape are driven through configurable GPFS policies.
The TSM core component is the TSM server (with a ``warm'' standby machine ready), which relies on a database (replicated and backed up every 6 hours over the SAN) that keeps all metadata information.
StoRM implements the SRM interface and it is designed to support guaranteed space reservation and direct access using native Posix I/O calls to the storage.
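The recall re-ordering described in this section can be illustrated with a minimal Python sketch. It is not GEMSS code: the `RecallRequest` fields (`tape`, `position`) and the function name `reorder_recalls` are hypothetical, and the sketch only assumes that each request collected in a time window is known to live on a given cartridge at a given position. Requests are grouped per tape, so that each tape is mounted once, and sorted by position, so that the tape is read in one forward pass.

```python
from collections import defaultdict
from typing import NamedTuple


class RecallRequest(NamedTuple):
    path: str      # file to be recalled from tape to disk
    tape: str      # label of the cartridge holding the file (hypothetical field)
    position: int  # location of the file on that tape (hypothetical field)


def reorder_recalls(requests):
    """Group the requests of one time window by tape and sort them by position."""
    by_tape = defaultdict(list)
    for request in requests:
        by_tape[request.tape].append(request)
    # One entry per tape means one mount per tape; increasing positions
    # avoid seeking back and forth on the same cartridge.
    return {
        tape: sorted(group, key=lambda r: r.position)
        for tape, group in sorted(by_tape.items())
    }


if __name__ == "__main__":
    window = [
        RecallRequest("/cdf/file_a", "T00002", 50),
        RecallRequest("/cdf/file_b", "T00001", 90),
        RecallRequest("/cdf/file_c", "T00002", 10),
    ]
    for tape, group in reorder_recalls(window).items():
        print(tape, [r.path for r in group])
```

Grouping before sorting keeps the two goals separate: the outer dictionary bounds the number of mounts, while the inner ordering limits seek operations within each mounted tape.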
\section{Tape library}
At present, a single tape library SL8500 is installed. The library has undergone various upgrades and it is now fully populated with tape cartridges having 8.4 TB of capacity. In the period 2014-2016 a complete repack has been performed moving all the data to the current technology tapes. After the flooding in 2017, one tape drive and several tapes were damaged: now the library is equipped with 16 T10kD drives, all interconnected via 16 Gbps FC to the TAN.
Since the present library is expected to be completely filled over 2019, a tender is ongoing for a new one. In the meanwhile, the TAN infrastructure has been upgraded to FC 16 Gbps.
\section{Tape library and drives}
At present, a single tape library Oracle SL8500 is installed.
The library has undergone various upgrades and it is now populated with tape cartridges having 8.4 TB of capacity each,
for a total installed capacity of 70 PB at the end of 2018.
The 16 T10kD tape drives are shared among the file systems handling the scientific data. Currently, there is no way to allocate dynamically more or less drives to recall or migration activities on the different file systems.
In fact, the HSM system administrators can only set manually the maximum number of migration or recall threads for each file system by modifying the GEMSS configuration file. Due to this static setup, we experience that frequently some drives are idle and, at the same time, we notice a certain number of pending recall threads that could become running by using those free drives. In order to overcome this inefficiency, we designed a software solution, and namely a GEMSS extension, to automatically assign free tape drives to accomplish pending recalls and to perform administrative tasks on tape storage pools, such as space reclamations or repack.
Since the present library is expected to be completely filled over 2019, a tender is ongoing for a new one.
In the meanwhile, the TAN infrastructure has been upgraded to FC 16 Gbps.
\section{Backup and recovery service}
The Data Management group is also responsible for the backup and recovery service that is running to protect different kinds of CNAF IT services data (mail servers, repositories, service configurations, logs, documents, etc.).
The 16 T10kD tape drives are shared among the file systems handling the scientific data.
In our current production configuration, there is no way to allocate dynamically more or less drives to recall or migration activities on the different file-systems. In fact, the HSM system administrators can only set manually the maximum number of migration or recall threads for each file system by modifying the GEMSS configuration file. Due to this static setup, we experience that frequently some drives are idle and, at the same time, we notice a certain number of pending recall threads that could become running by using those free drives. In order to overcome this inefficiency, we designed a software solution, namely a GEMSS extension, to automatically assign free tape drives to accomplish pending recalls and to perform administrative tasks on tape storage pools, such as space reclamations or repack. We plan to put this solution in production during 2019.
This service was re-designed during 2016 after a couple of episodes of data loss that needed restore of backed-up data from the system. In those cases, data were recovered successfully, but that experience convinced the system administrators to make the service more efficient and secure. Data are stored as multiple copies on both disk and tape, with different retention times.
\section{Data preservation}
CNAF provides the Long Term Data Preservation of the CDF RUN-2 dataset (~4 PB) collected between 2001 and 2011 and already stored on CNAF tapes since 2015. 140 TB of CDF data were unfortunately lost because of the flood occurred at CNAF on November 2017; however now all these data have been successfully re-transferred from Fermilab via GridFTP protocol. The CDF database (based on Oracle), containing information about CDF datasets such as their structure, file locations and metadata, has been imported from FNAL to CNAF.
CNAF provides the Long Term Data Preservation of the CDF RUN-2 dataset (4 PB) collected between 2001 and 2011 and stored on CNAF tapes since 2015. 140 TB of CDF data were unfortunately lost because of the flood that occurred at CNAF in November 2017; however, all these data have now been successfully re-transferred from Fermilab via the GridFTP protocol. The CDF database (based on Oracle), containing information about CDF datasets such as their structure, file locations and metadata, has been imported from FNAL to CNAF.
The Sequential Access via Metadata (SAM) station, a data-handling tool specific to CDF data management and developed at Fermilab, has been installed on a dedicated SL6 server at CNAF. This is fundamental step in the perspective of a complete decommissioning of CDF services by Fermilab. The SAM station allows to manage data transfers and to retrieve information from the CDF database; it also provides a SAMWeb tool which uses HTTP protocol for accessing the CDF database.
The Sequential Access via Metadata (SAM) station, a data-handling tool specific to CDF data management and developed at Fermilab,
has been installed on a dedicated SL6 server at CNAF. This is a fundamental step in the perspective of a complete decommissioning of CDF services by Fermilab.
The SAM station makes it possible to manage data transfers and to retrieve information from the CDF database;
it also provides a SAMWeb tool which uses the HTTP protocol for accessing the CDF database.
Work is ongoing to verify the availability and the correctness of all CDF data stored on CNAF tapes: we are reading all files from the tapes, calculating their checksum and comparing it with the one stored in the database and retrieved through the SAM station. Recent tests showed that CDF analysis jobs, using CDF software distributed via CVMFS and requesting delivery of CDF files stored on CNAF tapes, work properly. When some minor issues regarding the use of X.509 certificates for authentication on CNAF farm will be completely solved, CDF users will be able to access CNAF nodes and submit their jobs via LSF or HTCondor batch systems.
Work is ongoing to verify the availability and the correctness of all CDF data stored on CNAF tapes: we are reading all files from the tapes,
calculating their checksum and comparing it with the one stored in the database and retrieved through the SAM station.
Recent tests showed that CDF analysis jobs, using CDF software distributed via CVMFS and requesting delivery of CDF files stored on CNAF tapes, work properly.
Once some minor issues regarding the use of X.509 certificates for authentication on the CNAF farm are completely solved, CDF users will be able to access CNAF nodes and submit their jobs via the LSF or HTCondor batch systems.
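The checksum verification mentioned above (read each file, compute its checksum, compare it with the value recorded in the database) could look like the following Python sketch. The text does not state which checksum algorithm is used; Adler-32 is assumed here purely for illustration, and the function names `adler32_of_file` and `verify_file` are hypothetical. The expected value is taken as a plain hexadecimal string, standing in for whatever would actually be retrieved through the SAM station.

```python
import zlib


def adler32_of_file(path, chunk_size=1 << 20):
    """Return the Adler-32 checksum of a file as 8 lowercase hex digits."""
    value = 1  # Adler-32 starting value
    with open(path, "rb") as handle:
        # Read in chunks so that large files never need to fit in memory.
        while chunk := handle.read(chunk_size):
            value = zlib.adler32(chunk, value)
    return f"{value & 0xFFFFFFFF:08x}"


def verify_file(path, expected_hex):
    """Compare the computed checksum with the one recorded in the catalogue."""
    return adler32_of_file(path) == expected_hex.strip().lower().zfill(8)
```

A file for which `verify_file` returns `False` would be a candidate for re-transfer from the original site.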
\section{Third Party Copy activities in DOMA}
At the end of the summer, we joined the TPC (Third Party Copy) subgroup of the WLCG’s DOMA\footnote{Data Organization, Management, and Access. see https://twiki.cern.ch/twiki/bin/view/LCG/DomaActivities} project, dedicated to improving bulk transfers between WLCG sites using non-GridFTP protocols. In particular, the INFN-Tier1 is involved in these activities for what concerns StoRM WebDAV.
At the end of the summer, we joined the TPC (Third Party Copy) subgroup of the WLCG’s DOMA\footnote{Data Organization, Management, and Access. see https://twiki.cern.ch/twiki/bin/view/LCG/DomaActivities} project, dedicated to improving bulk transfers between WLCG sites using non-GridFTP protocols. In particular, the Tier 1 is involved in these activities for what concerns StoRM WebDAV.
In October, the two StoRM WebDAV servers used in production by the ATLAS experiment have been upgraded to a version that implements basic support for Third-Party-Copy, and both endpoints entered the distributed TPC testbed of volunteer sites.
......
contributions/summerstudent/MLalgorithms.png

25.4 KiB

contributions/summerstudent/StoRM-full-picture.png

381 KiB

contributions/summerstudent/StoRM.png

17 KiB

contributions/summerstudent/kibana.png

388 KiB

\documentclass[a4paper]{jpconf}
\usepackage{graphicx}
\begin{document}
\title{INFN CNAF log analysis: a first experience with summer students}
\author{D. Bonacorsi$^1$, A. Ceccanti$^2$, T. Diotalevi$^1$, A. Falabella$^2$, L. Giommi$^2$, B. Martelli$^2$, D. Michelotto$^2$, L. Morganti$^2$, E. Ronchieri$^2$, S. Rossi Tisbeni$^1$, E. Vianello$^2$}
\address{$^1$ University of Bologna, Bologna, IT}
\address{$^2$ INFN-CNAF, Bologna, IT}
\ead{barbara.martelli@cnaf.infn.it}
\begin{abstract}
In 2018 the INFN CNAF computing center started to investigate predictive and preventive maintenance solutions in order to improve fault diagnosis by applying machine learning techniques to hardware and service logs. A first, very positive experience was carried out by three students who dedicated three summer months to collecting logs of the StoRM services and of the resources that host them, preprocessing these logs in order to remove all bias information, and performing an initial data analysis. Here we present the activities carried out by these students, the initial outcome and the ongoing work at the INFN CNAF data center.
\end{abstract}
\section{Introduction}
In recent years INFN CNAF has put great effort into defining and implementing a common monitoring infrastructure based on Sensu, InfluxDB and Grafana and into centralizing logs from the most relevant services \cite{bovina2015, bovina2017}. Nowadays, this unified infrastructure has been fully integrated in the data center \cite{fattibene2018} and the intention is to face the new challenge and opportunity of correlating this vast volume of data and extracting actionable insights.
During summer 2018 a first investigation was carried out with the help of three summer students \cite{seminario}. Once a specific system to analyze had been identified, i.e. StoRM, the following activities were addressed:
\begin{itemize}
\item Log collection and harmonization
\item Log parsing of various services, such as StoRMfrontend, StoRMbackend, heartbeat, messages, GridFTP and GPFS (not covered in our study, but potentially interesting)
\item Addition of metrics data (from the Tier 1 InfluxDB)
\end{itemize}
However, to provide a first proof of concept for predictive and preventive maintenance, data categorization and the application of machine learning techniques are two key activities, which were carried out between the end of 2018 and the middle of 2019.
\section{Log collection and harmonization}
The first part of the work consisted in the collection of StoRM logs from the StoRM servers dedicated to the ATLAS experiment.
Subsequently, the most relevant information was extracted from the logs using the ELK Stack suite \cite{elk}. The ELK stack consists of four components: Beats, used for data collection from multiple sources; Logstash, used for data aggregation and processing; Elasticsearch, used to store and index data; and Kibana, used for data analysis and visualization. In particular, Logstash has been used to ingest data from Beats in a continuous live-feed stream, filter relevant entries and parse each event, identifying named fields to build a user-defined structure, and ship the parsed data to the Elasticsearch engine. Most data were filtered using a \textit{grok} filter, which is based on regular expressions and provides predefined filters together with the ability to define customized ones.
Finally, several dashboards were created using Kibana in order to show in a human-friendly way a summary of the most relevant information derived from StoRM logs (see for example Figure \ref{fig3}).
\begin{figure}[h]
\includegraphics[width=20pc]{kibana.png}\hspace{2pc}
\begin{minipage}[b]{14pc}\caption{\label{fig3}An example of Kibana dashboard created.}
\end{minipage}
\end{figure}
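Since a grok filter is essentially a set of named regular expressions, the parsing step described in this section can be mimicked with a short Python sketch based on named groups. This is not the Logstash configuration used in the work: the pattern, the field names (`timestamp`, `level`, `message`) and the sample line are invented for illustration and do not reproduce the actual StoRM log format.

```python
import re

# Hypothetical line layout: "<date> <time> <LEVEL> <free text message>"
LINE_PATTERN = re.compile(
    r"(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\s+"
    r"(?P<level>[A-Z]+)\s+"
    r"(?P<message>.*)"
)


def parse_line(line):
    """Return the named fields of a log line, or None if it does not match."""
    match = LINE_PATTERN.match(line)
    return match.groupdict() if match else None


if __name__ == "__main__":
    sample = "2018-07-01 12:00:00 INFO request completed"
    print(parse_line(sample))
```

Lines that do not match return `None`, which plays a role similar to the tag that Logstash adds to events a grok filter fails to parse, so they can be inspected separately instead of being silently dropped.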
\section{Log parsing}
Among the INFN Tier 1 services hosted at the INFN CNAF computing center, there are efficient storage systems such as StoRM, which is a grid Storage Resource Manager (SRM) solution. Figure \ref{fig1} shows the StoRM architecture: the frontend service manages user authentication and stores request data, while the backend service executes SRM functionalities and takes care of space and authorization.
The log files contain basically three types of information: timestamps, metrics, and messages.
\begin{figure}[h]
\includegraphics[width=20pc]{StoRM-full-picture.png}\hspace{2pc}%
\begin{minipage}[b]{14pc}\caption{\label{fig1}The StoRM architecture.}
\end{minipage}
\end{figure}
At the beginning of this work (mid 2018), StoRM at Tier 1 was monitored by InfluxDB and Grafana. Metrics monitored included CPU, RAM, network and disk usage; number of sync SRM requests per minute per host; average duration of async PTG and PTP per host. We wanted to add information derived from the analysis of StoRM logs to the monitoring information already available, in order to derive new insights potentially useful to enhance service availability and efficiency, with the long-term intent of implementing a global predictive maintenance solution for Tier 1. In order to build a Machine Learning model for anomaly prediction, logs from two different periods were analyzed: a normal behavior period and a critical behavior period (due to wrong configuration of the file system and wrong configuration of the queues coming from the farm).
A four-step activity has been carried out:
\begin{enumerate}
\item Parsing: log files were parsed and deconstructed, converting them to CSV format
\item Feature selection: messages were grouped based on their common content (the core part of the message). The grouping phase resulted in 20 \textit{Request Types} (Connection, Run, Ping, Ls, Check permission, PTG, PTG status, Get space tokens, PTP, PTP status, BOL status, Put done, Release files, Mv, Mkdir, BOL, Abort request, Abort files, Get space metadata, nan) and 15 \textit{Result Types} (SRM\_SUCCESS, SRM\_FAILURE, SRM\_NOT\_SUPPORTED, SRM\_REQUEST\_QUEUED, SRM\_REQUEST\_INPROGRESS, Protocol check failed, Received 4 protocols, Some protocols supported, SRM\_DUPLICATION\_ERROR, rpcResponseHandler\_AbortFiles, SRM\_INVALID\_REQUEST, SRM\_INVALID\_PATH, Received 5 protocols, SRM\_INTERNAL\_ERROR, nan). A first data exploration phase was performed by counting occurrences of messages in each group.
Techniques used for the feature selection procedure were: SelectKBest with the chi-squared statistical test, Recursive Feature Elimination, Principal Component Analysis (PCA) and Feature Importance from ensembles of decision tree methods.
\item One-hot encoding: CSV rows were encoded as binary vectors (feature vectors). Each vector summarizes the log contents of a 15-minute interval.
\item Labelling: this operation, specific to StoRM log files, was done manually by discriminating between normal and critical periods based on help-desk tickets.
\end{enumerate}
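The encoding step can be sketched as follows. This is a minimal illustration, assuming that the parsed rows carry an epoch timestamp, a request type and a result type, and that a bit is set whenever a category appears at least once in a 15-minute window; the sample rows and the reduced category list are invented for the example and do not reproduce the actual dataset or encoding code.

```python
from collections import defaultdict

WINDOW_SECONDS = 15 * 60

# Hypothetical parsed rows: (epoch seconds, request type, result type).
rows = [
    (1543659300, "Ping", "SRM_SUCCESS"),
    (1543659420, "PTG", "SRM_REQUEST_QUEUED"),
    (1543660300, "Ls", "SRM_FAILURE"),
]

# Fixed vocabulary so that every vector has the same layout (subset for brevity).
CATEGORIES = ["Ping", "PTG", "Ls", "SRM_SUCCESS", "SRM_REQUEST_QUEUED", "SRM_FAILURE"]
INDEX = {name: position for position, name in enumerate(CATEGORIES)}


def encode_windows(rows):
    """Map each 15-minute window start to a binary vector of observed categories."""
    vectors = defaultdict(lambda: [0] * len(CATEGORIES))
    for epoch, request_type, result_type in rows:
        # Align the timestamp to the start of its 15-minute window.
        window_start = epoch - epoch % WINDOW_SECONDS
        for name in (request_type, result_type):
            vectors[window_start][INDEX[name]] = 1
    return dict(vectors)


feature_vectors = encode_windows(rows)
```

Each resulting vector could then be paired with a manually assigned normal or critical label for its time window before training.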
Feature vectors obtained in (iii) and labeled datasets built in (iv) were used to train several ML algorithms and to test their accuracy. Figure \ref{fig2} depicts the results of tests performed on the following algorithms: LogisticRegression (LR), LinearDiscriminantAnalysis (LDA), KNeighborsClassifier (KNN), GaussianNB (GNB), DecisionTreeClassifier (CART), BaggingClassifier (BgDT), RandomForestClassifier (RF), ExtraTreesClassifier (ET), AdaBoostClassifier (AB), GradientBoostingClassifier (GB), XGBoostClassifier (XGB), MultiLayerPerceptronClassifier (MLP).
\begin{center}
\begin{figure}[h]
\includegraphics[width=20pc]{MLalgorithms.png}\hspace{2pc}
\begin{minipage}[b]{14pc}\caption{\label{fig2}Machine Learning Algorithms Comparison (scorer=accuracy).}
\end{minipage}
\end{figure}
\end{center}
\section{Adding metrics data}
This activity was mainly focused on collecting metric data from InfluxDB in order to correlate them with the StoRM logs obtained through the activities explained in the previous sections and to extract new insights.
Key components of log files were identified, parsed and structured in a CSV file with the following columns: timestamp, metric, message, descriptive keys and separators. All timestamps were converted to UNIX epoch time in order to be comparable. On the one hand, InfluxDB stores information with different granularity depending on the age of the data collected; on the other hand, StoRM front-end and back-end logs are produced with different frequencies (one line each minute for heartbeat logs, multiple lines every minute for metrics logs, one line every five minutes for the more recent InfluxDB data, and so on). Therefore, some concatenation rules have been implemented in order to correctly relate all data sources based on the time of occurrence of the event: backend metrics are split by type, timestamps are rounded off to one-minute precision, in case of overlap the more recent entry is kept, and all CSV files are concatenated and ordered by timestamp.
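The concatenation rules can be sketched as follows. This is an illustrative outline only: the row layout (timestamp string, metric name, value), the UTC assumption, the choice of rounding to the nearest minute, and all metric names are assumptions made for the example, not the code used in this work.

```python
from datetime import datetime, timezone


def to_epoch_minute(timestamp):
    """Convert 'YYYY-MM-DD HH:MM:SS' (assumed UTC) to epoch seconds, rounded to the nearest minute."""
    parsed = datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S").replace(tzinfo=timezone.utc)
    epoch = int(parsed.timestamp())
    return (epoch + 30) // 60 * 60


def merge_sources(*sources):
    """Concatenate rows from all sources, keep the more recent row on overlap, sort by time."""
    latest = {}
    for rows in sources:
        for timestamp, metric, value in rows:
            key = (to_epoch_minute(timestamp), metric)
            # Timestamps share one fixed format, so string comparison orders them in time.
            if key not in latest or timestamp >= latest[key][0]:
                latest[key] = (timestamp, value)
    return sorted((minute, metric, value) for (minute, metric), (_, value) in latest.items())


# Hypothetical rows from two sources with different sampling frequencies.
storm_rows = [
    ("2018-12-01 10:15:10", "ptg_duration", 120),
    ("2018-12-01 10:15:25", "ptg_duration", 95),
]
influx_rows = [("2018-12-01 10:15:00", "cpu_usage", 0.42)]

merged = merge_sources(storm_rows, influx_rows)
```

Keying on the pair (minute, metric) keeps metrics of different types separate, so only rows of the same type that fall in the same minute are treated as overlapping.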
\section{Conclusion}
This experience is a good example of mutually beneficial collaboration between university students and INFN CNAF. The outcome has allowed master's students (i.e. Diotalevi T. and Giommi L.) to publish papers at international conferences \cite{diotalevi, giommi20191}, to win Giulia Vita Finzi's award \cite{giommi20192}, and to start their PhD courses successfully. Furthermore, the undergraduate student (i.e. Rossi Tisbeni S.) will obtain a master's degree in Physics in July 2019. On the other hand, the INFN CNAF data center managers have decided to continue exploring predictive and preventive maintenance, in order to establish where and when to use it to keep services running optimally.
\section*{References}
\begin{thebibliography}{9}
\bibitem{seminario} Martelli B, Giommi L, Rossi Tisbeni S, Diotalevi T, https://agenda.infn.it/event/17430/, 2018.
\bibitem{bovina2015} Bovina S, Michelotto D, Misurelli G, \emph{CNAF Annual Report}, pp. 111--114, 2015.
\bibitem{bovina2017} Bovina S, Michelotto D, In Proc of CHEP 2017.
\bibitem{fattibene2018} Fattibene E, Dal Pra S, Falabella A, De Cristofaro T, Cincinelli G, Ruini M, In Proc of CHEP 2018.
\bibitem{diotalevi} Diotalevi T, Bonacorsi D, Michelotto D, Falabella A, In Proc of International Symposium on Grids \& Clouds (ISGC), Taipei, Taiwan, 2019 (under review).
\bibitem{giommi20191} Giommi L, Bonacorsi D, Diotalevi T, Rossi Tisbeni S, Rinaldi L, Morganti L, Falabella A, Ronchieri E, Ceccanti A, Martelli B, In Proc of International Symposium on Grids \& Clouds (ISGC), Taipei, Taiwan, 2019 (under review).
\bibitem{giommi20192} Giommi L, In INFN CCR Workshop, La Biodola, 3-7 June 2019.
\bibitem{elk} https://www.elastic.co/, site visited in June 2019.
\end{thebibliography}
\end{document}
contributions/sysinfo/deps_scan.png

108 KiB | W: 0px | H: 0px

contributions/sysinfo/deps_scan.png

4.3 MiB | W: 0px | H: 0px

......@@ -6,16 +6,16 @@
\title{The INFN Information System}
\author{
Stefano Bovina$^1$,
Marco Canaparo$^1$,
Enrico Capannini$^1$,
Fabio Capannini$^1$,
Claudio Galli$^1$,
Guido Guizzunti$^1$,
Barbara Demin$^1$
S. Bovina$^1$,
M. Canaparo$^1$,
E. Capannini$^1$,
F. Capannini$^1$,
C. Galli$^1$,
G. Guizzunti$^1$,
B. Demin$^1$
}
\address{$^1$ INFN CNAF, Viale Berti Pichat 6/2, 40126, Bologna, Italy}
\address{$^1$ INFN-CNAF, Bologna, IT}
\ead{
stefano.bovina@cnaf.infn.it,
......@@ -28,114 +28,140 @@
}
\begin{abstract}
The Information System Service's mission is the implementation, management and optimization of all the infrastructural and application components of the administrative services of the Institute. In order to guarantee high reliability and redundancy, the same systems are replicated in an analogous infrastructure at the National Laboratories of Frascati (LNF).
The Information System's team manages all the administrative services of the Institute, both from the hardware and the software point of view and they are in charge of carrying out several software projects.
The mission of the Information System Service is the implementation, management and optimization of all the infrastructural and application components of the administrative services of the Institute. In order to guarantee high reliability and redundancy, the same systems are replicated in an analogous infrastructure at the National Laboratories of Frascati (LNF).
The Information System's team manages all the administrative services of the Institute,
both from the hardware and the software point of view, and it is in charge of carrying out several software projects.
The core of the Information System is made up of the salary and HR systems.
Connected to the core there are several other systems reachable from a unique web portal: firstly, the organizational chart system (GODiVA); secondly, the accounting, the time and attendance, the trip and purchase order and the business intelligence systems. Finally, there are other systems which manage: the training of the employees, their subsidies, their timesheet, the official documents, the computer protocol, the recruitment, the user support etc.
Connected to the core, there are several other systems reachable from a unique web portal:
firstly, the organizational chart system (GODiVA); secondly, the accounting, the time and attendance,
the trip and purchase order and the business intelligence systems.
Finally, there are other systems which manage the training of the employees, their subsidies, their timesheet, the official documents,
the computer protocol, the recruitment, the user support etc.
\end{abstract}
\section{Introduction}
The INFN Information System project was set up in 2001 with the purpose of digitizing and managing all the administrative and accounting processes of the INFN Institute, and of carrying out a gradual dematerialization of documents.\\
In 2010, INFN decided to transfer the accounting system, based on the Oracle Business Suite (EBS) and the SUN Solaris operating system, from the National Laboratories of Frascati (LNF) to CNAF, where the SUN Solaris platform was migrated to a RedHat Linux Cluster and implemented on commodity hardware.\\
The Service “Information System” was officially established at CNAF in 2013 with the aim of developing, maintaining and coordinating many IT services which are critical for INFN. Together with the corresponding office in the National Laboratories of Frascati, it is actively involved in fields related to INFN management and administration, developing tools for business intelligence and research quality assurance; it is also involved in the dematerialization process and in the provisioning of interfaces between users and INFN administration.\\
The Information System service team at CNAF in 2018 was composed of 8 people, both developers and system engineers.\\
The INFN Information System project was set up in 2001 with the purpose of digitizing and managing all the administrative and accounting processes of the INFN Institute,
and of carrying out a gradual dematerialization of documents.\\
In 2010, INFN decided to transfer the accounting system, based on the Oracle Business Suite (EBS) and the SUN Solaris operating system,
from the National Laboratories of Frascati (LNF) to CNAF, where the SUN Solaris platform was migrated to a RedHat Linux Cluster and implemented on commodity hardware.\\
The Service “Information System” was officially established at CNAF in 2013 with the aim of developing, maintaining and coordinating many IT services which are critical
for INFN. Together with the corresponding office at the National Laboratories of Frascati, it is actively involved in fields related to INFN management and administration, developing tools for business intelligence and research quality assurance; it is also involved in the dematerialization process and in the provisioning of interfaces between users and INFN administration.\\
Over the years, other services have been added, leading to a complex infrastructure that covers all aspects of people's life working at INFN.
In 2018, the Information System service team at CNAF was composed of 8 people, both developers and system engineers.\\
\section{Infrastructure}
In 2018, the infrastructure-related activity was composed of various tasks that can be summarized as follows: firstly, the consolidation of the Disaster Recovery site in Bari and the restore of CNAF as primary site; secondly, the finalization of Puppet 3 phase out and related Foreman upgrades; thirdly, the improvement of our ELK (Elasticsearch/Logstash/Kibana) and monitoring infrastructure and finally, several "Misure Minime" AGID and GDPR compliance adjustment.
In 2018, the infrastructure-related activity was composed of various tasks that can be summarized as follows:
firstly, the consolidation of the Disaster Recovery site in Bari and the restore of CNAF as primary site;
secondly, the finalization of Puppet 3 phase out and related Foreman upgrades;
thirdly, the improvement of our ELK (Elasticsearch/Logstash/Kibana) and monitoring infrastructure and finally, several ``Misure Minime'' AGID and GDPR compliance adjustments.
\newline
After the complete revisiting and upgrade of the ELK stack to version 5 last year, many activities have been done to enhance systems and applications monitoring using this set of tools. To improve the discovery and resolution of problems, several views and dashboards (see Fig.~\ref{fig:presenze_kibana}) have been created on Kibana, as well as a deep analysis and customizations of application logs to introduce useful information.
After the complete revisiting and upgrade of the ELK stack to version 5 last year,
many activities have been done to enhance systems and applications monitoring using this set of tools.
To improve the discovery and resolution of problems, several views and dashboards (see Figure~\ref{fig:presenze_kibana}) have been created on Kibana,
as well as a deep analysis and customization of application logs to introduce useful information.
\begin{figure}[htbp]
\begin{center}
\includegraphics[scale=0.5]{presenze_kibana.png}
\end{center}
\caption{\label{fig:presenze_kibana} Time and attendance system manual squaring statistics on Kibana (ELK)}
\caption{\label{fig:presenze_kibana} Time and attendance system manual squaring statistics on Kibana (ELK).}
\end{figure}
With the aim of enhancing our cronjobs management, improving its monitoring and management, avoiding cronjob overlap and in order to identify "dead-man-switches" a new cronjob management tool has been adopted.
Cronjob executions are available both on Kibana and Grafana (as annotation), so they can be used to be correlated with system events (see Fig.~\ref{fig:cronjob_annotation}); In the same way, software releases are also displayed on Grafana.
With the aim of improving the management and monitoring of our cronjobs, avoiding cronjob overlap and identifying ``dead-man-switches'',
a new cronjob management tool has been adopted.
Cronjob executions are available both on Kibana and Grafana (as annotation),
so they can be correlated with system events (see Figure~\ref{fig:cronjob_annotation}); in the same way, software releases are also displayed on Grafana.
\begin{figure}[htbp]
\begin{center}
\includegraphics[scale=0.5]{cronjob_annotation.png}
\end{center}
\caption{\label{fig:cronjob_annotation} Annotations for cronjobs on Grafana}
\caption{\label{fig:cronjob_annotation} Annotations for cronjobs on Grafana.}
\end{figure}
\newpage
Because of the recent regulations that came into force ("Misure Minime" AGID and GDPR), many audits and related adjustments were made, also relying on both official Center for Internet Security (CIS) guides and Openscap scan, using the Payment Card Industry - Data Security Standard (PCI-DSS) profile.
Because of the recent regulations that came into force (``Misure Minime'' AGID and GDPR), many audits and related adjustments were made, also relying on both official Center for Internet Security (CIS) guides and Openscap scan, using the Payment Card Industry - Data Security Standard (PCI-DSS) profile.
Afterwards, we introduced a proactive security model on some pilot projects, adopting tools for static code analysis and dependency scanning (see Fig.~\ref{fig:deps_scan}).
Afterwards, we introduced a proactive security model on some pilot projects, adopting tools for static code analysis and dependency scanning (see Figure~\ref{fig:deps_scan}).
\begin{figure}[htbp]
\begin{center}
\includegraphics[width=1.0\textwidth]{deps_scan.png}
\end{center}
\caption{\label{fig:deps_scan} Dependencies scan tool in action on Gitlab-CI}
\caption{\label{fig:deps_scan} Dependencies scan tool in action on Gitlab-CI.}
\end{figure}
In addition to this, the Platform as a Service (PaaS) infrastructure based on RedHat Openshift Origin (3.x) was upgraded to release 3.11 and for all container-based projects, a signature/scan services was deployed at container registry level (see Fig.~\ref{fig:container_ci}).
In addition to this, the Platform as a Service (PaaS) infrastructure based on RedHat Openshift Origin (3.x) was upgraded to release 3.11
and a signature/scan service was deployed at container registry level for all container-based projects (see Figure~\ref{fig:container_ci}).
\begin{figure}[htbp]
\begin{center}
\includegraphics[width=1.0\textwidth]{container_ci.png}
\end{center}
\caption{\label{fig:container_ci} Container registry details and related Gitlab-CI pipeline}
\caption{\label{fig:container_ci} Container registry details and related Gitlab-CI pipeline.}
\end{figure}
\newpage
In 2018, Oracle databases related activities concerned their maintenance, an initial analysis about the necessary activities to upgrade to Oracle to databases’ later versions and the study about how to achieve real time replication between the Oracle databases of the Accounting application. Periodic recovery tests were also conducted on the Bari Disaster Recovery site.
In 2018, Oracle databases related activities concerned their maintenance,
an initial analysis about the necessary activities to upgrade to later versions and the study on how to achieve real-time replication
between the Oracle databases of the Accounting application. Periodic recovery tests were also conducted on the Bari Disaster Recovery site.
\section{Time and attendance system improvements}
The time and attendance system allows employees to clock in and out electronically via swipe card. The data is instantly transferred into a database and shown in a web-based application. This system tracks the working hours and offers employees self-service that allows them to handle many time-tracking tasks on their own all subjected to customizable approval workflows and which include reviewing the hours they have worked, the current and future schedule and requests of paid or unpaid leaves.
The time and attendance system allows employees to clock in and out electronically via swipe card.
The data is instantly transferred into a database and shown in a web-based application.
This system tracks the working hours and offers employees self-service that allows them to handle many time-tracking tasks on their own,
all subjected to customizable approval workflows, which include reviewing the hours they have worked, the current and future schedule and requests of paid or unpaid leaves.
In 2018, the Time and Attendance system related activities concerned both the introduction of new features and the modifications of the existing ones. Furthermore, developers focused on the performance improvement of the system through the optimization of some common procedures.
The Time and attendance system was enabled to "read" codes introduced together with the clock in/out: through this mechanism, employees can specify the reasons for their leave of absence without using the web-based application.
The Time and Attendance system was enabled to ``read'' codes introduced together with the clock in/out: through this mechanism, employees can specify the reasons for their leave of absence without using the web-based application.
Some modifications have been carried out to implement changes that occurred in the national collective agreement. This activity included two new leaves of absence and an extension from three to four months of the period for the check of the average weekly working hours.
As concerns performance, the developers' team has optimized the procedure that manages the clock in/out by web portal, and the report that shows the paid overtime aggregated by sector, employee and month.
\section{Oracle EBS improvements}
In 2018, a new Electronic Payments and Receipts (EPR) Framework was introduced, in compliance with the standard set by the Agency for Digital Italy (Agenzia per l'Italia Digitale, AgID) and transmitted through SIOPE+.
In 2018, a new Electronic Payments and Receipts (EPR) Framework was introduced,
in compliance with the standard set by the Agency for Digital Italy (Agenzia per l'Italia Digitale, AgID) and transmitted through SIOPE+.
SIOPE+ is the new infrastructure that enables general government entities and banks that provide treasury services to exchange information with the objective of improving the quality of the data used for monitoring government expenditure and tracking the payment times to firms that supply general government entities.
SIOPE+ is the new infrastructure that enables general government entities and banks that provide treasury services
to exchange information, with the aim of improving the quality of the data used for monitoring government expenditure and tracking the payment times to firms that supply general government entities.
SIOPE+ responds to the following needs:
\begin{itemize}
\item Availability of detailed information on payments made by general government bodies without burdening the entities involved in the flow of outlays and collections. This will make it easier to obtain information on the payments of trade receivables and, more broadly, to monitor public sector financial flows in real time.
\item Standardization of information exchange between government bodies and treasury service providers by adopting a single digital standard OPI (Ordinativo di Pagamento e Incasso) in place of the previous local standard OIL (Ordinativo Informatico Locale), with the aim of raising the quality of treasury services, facilitating further integration between the accounting systems of the entities and between payment processes, and supporting the development of electronic payments services.
\item availability of detailed information on payments made by general government bodies without burdening the entities involved in the flow of outlays and collections. This will make it easier to obtain information on the payments of trade receivables and, more broadly, to monitor public sector financial flows in real time.
\item standardization of information exchange between government bodies and treasury service providers by adopting a single digital standard OPI (Ordinativo di Pagamento e Incasso) in place of the previous local standard OIL (Ordinativo Informatico Locale), with the aim of raising the quality of treasury services, facilitating further integration between the accounting systems of the entities and between payment processes, and supporting the development of electronic payments services.
\end{itemize}
\section{Business Intelligence improvements}
In 2018, the main task was investigating technical solutions as alternatives to the current Business Intelligence installation, with the aim of reducing licensing costs, while remaining on an open source solution, preserving functionalities and compatibility with other INFN tools and platforms.
In 2018, the main task was investigating alternative technical solutions to the current Business Intelligence installation,
with the aim of reducing licensing costs, while remaining on an open source solution and preserving functionalities and compatibility with other INFN tools and platforms.
At the end of this activity, the current solution, based on the TIBCO platform, was confirmed as the best one.
%At present, we are converting reports that are using deprecated features. Once all reports are converted, the Business Intelligence infrastructure will be upgraded to the last version.
\section{Contratti}
Contratti (previously named Repertorio Contratti) is a new Java application (in test phase) for long term preservation of contract made between INFN and an external supplier, based on Alfresco and mDM protocol.
Each contract is enriched with a full set of metadata which describe the Contract in its relevant parts and suppliers are extracted automatically from the central supplier registry, together with details of the contract signer.
Contratti (previously named Repertorio Contratti) is a new Java application (in test phase) for long term preservation of contracts made between INFN and an external supplier, based on Alfresco and mDM protocol.
Each contract is enriched with a full set of metadata which describe the contract in its relevant parts, and suppliers are extracted automatically from the central supplier registry, together with details of the contract signer.
Last year, several bugfix and improvements has been made, in order to respect our customers requirements. Improvements, can be summarized as following:
Last year, several bug fixes and improvements have been made in order to meet our customers' requirements. Improvements can be summarized as follows:
\begin{enumerate}
\item integration with mDM protocol:
\begin{itemize}
\item it is now possible to manage a set of folder where to store the contract file, as if it was a complete folder explorer;
\item before the contract file is stored in mDM, a protocol signature is written onto the document, without invalidating PAdES signature of the issuer.
\item it is now possible to manage a set of folders where to store the contract file, as if it was a complete folder explorer;
\item before the contract file is stored in mDM, a protocol signature is written onto the document, without invalidating PAdES (PDF Advanced Electronic Signatures) signature of the issuer.
\end{itemize}
\item complete refactoring of ACLs mechanism, used to manage document and app permissions;
\item complete refactoring of the ACLs mechanism used to manage document and app permissions;
\item added email notification in order to send a contract link to a set of recipients, extracted automatically from Godiva;
\item it is now possible to print a label containing the relevant characteristics of the contract;
\item complete UI restyling in order to improve both readability and usability of the product.
......
contributions/tier1/pledge.png

180 KiB

......@@ -13,42 +13,30 @@
\begin{document}
\title{The INFN Tier-1}
\title{The INFN Tier 1}
\author{Luca dell'Agnello}
\address{INFN-CNAF, Bologna, IT}
\author{Luca dell'Agnello$^1$}
\address{$^1$ INFN-CNAF, Bologna, IT}
\ead{luca.dellagnello@cnaf.infn.it}
\section{Introduction}
CNAF hosts the Italian Tier-1 data center for WLCG: over the years, Tier-1 has become the main computing facility for INFN.
Nowadays, besides the four LHC experiments, the INFN Tier-1 provides services and resources to 30 other scientific collaborations, including BELLE2 and several astro-particle experiments (Tab.\ref{T1-pledge})\footnote{CSN 1, CSN 2 and CSN 3 are the National Scientific Committees of the INFN, respectively, for experiments in high energy physics with accelerators, astro-particle experiments and experiments in nuclear physics with accelerators.}. As showns in Fig.~\ref{pledge2018}, besides LHC, the main users are the astro-particle experiments.
CNAF hosts the Italian Tier 1 data center for WLCG: over the years, Tier 1 has become the main computing facility for INFN.
Nowadays, besides the four LHC experiments, the INFN Tier 1 provides services and resources to 30 other scientific collaborations,
including BELLE2 and several astro-particle experiments (see Table \ref{T1-pledge}).
As shown in Fig.~\ref{pledge2018}, besides LHC, the main users are the astro-particle experiments.
\begin{figure}[h]
\begin{center}
\begin{minipage}{35pc}
\includegraphics[width=15pc]{cpu2018.png}\hspace{2pc}%
% \caption{\label{cpu2018}xxx}
% \end{minipage}\hspace{2pc}%
% \begin{minipage}{30pc}
\includegraphics[width=15pc]{disk2018.png}\hspace{2pc}%
% \caption{\label{disk2018}xxx}
% \end{minipage}
\vspace{2pc}%
% \begin{minipage}{20pc}
\begin{center}
\includegraphics[width=15pc]{tape2018.png}\hspace{2pc}%
% \caption{\label{tape2018}xxx}
\caption{\label{pledge2018}Relative requests of resources at INFN Tier-1}
\end{center}
\end{minipage}\hspace{2pc}%
\includegraphics[keepaspectratio,width=15cm]{pledge.png}
\caption{\label{pledge2018}Relative requests of resources at INFN Tier 1}
\end{center}
\end{figure}
Despite the flooding that occurred at the end of 2017, we were able to provide the resources committed to the experiments for 2018, almost in time.
Despite the flooding that occurred at the end of 2017, we were able to provide the resources committed to the experiments for 2018 almost in time.
......@@ -166,7 +154,7 @@ Despite the flooding that occurred at the end of 2017, we were able to provide t
\br
\end{tabular}
\end{center}
\caption{Pledged and installed resources at INFN Tier-1 in 2018 (for the CPU power an overlap factor is applied)}
\caption{Pledged and installed resources at INFN Tier 1 in 2018 (for the CPU power an overlap factor is applied). CSN 1, CSN 2 and CSN 3 are the National Scientific Committees of the INFN, respectively, for experiments in high energy physics with accelerators, astro-particle experiments and experiments in nuclear physics with accelerators.}
\label{T1-pledge}
\hfill
\end{table}
......@@ -174,12 +162,16 @@ Despite the flooding that occurred at the end of 2017, we were able to provide t
\subsection{Out of the mud}
The year 2018 began with the recovery procedures of the data center after the flooding of Novembrer 2017.
The year 2018 began with the recovery procedures of the data center after the flooding of November 2017.
Despite the serious damages to the power plants (both power lines were compromised), immediately after the flooding we started the recovery procedures of both the infrastructure and the IT equipment. The first mandatory intervention was to restore, at least, one of the two power lines (with a leased UPS in the first period). This goal was achieved during December 2017.
In January, after also the chillers were restarted, we could proceed to re-open all services, including part of the farm (at the beginning only $\sim$ 50 kHS06, 1/5 of the total power capacity, were online, while 13\% was lost) and, one by one, the storage systems.
The first experiments to resume operations at CNAF were Alice, Virgo, Darkside: in fact, the storage system used by Virgo and Darkside had been easily recovered after Christmas break, while Alice is able to use computing resources relaying on remote storage. During February and March, we were able to progressively re-open the services for all other experiments. %(Fig.\ref{farm2018} shows the restart of the farm). Meanwhile, we had setup a new partition of the farm hosted at CINECA super-computing center premises (see Par.~\ref{CINECAext}).
In January, after the restart of the chillers, we could proceed to re-open all services, including part of the farm (at the beginning only $\sim$ 50 kHS06, 1/5 of the total power capacity, were online, while 13\% was lost) and, one by one, the storage systems.
The first experiments to resume operations at CNAF were Alice, Virgo and Darkside:
in fact, the storage system used by Virgo and Darkside had been easily recovered after the Christmas break, while Alice is able to use computing resources relying on remote storage. During February and March, we were able to progressively re-open the services for all other experiments.
%(Fig.\ref{farm2018} shows the restart of the farm). Meanwhile, we had setup a new partition of the farm hosted at CINECA super-computing center premises (see Par.~\ref{CINECAext}).
The final damage inventory shows the loss of $\sim$ 30 kHS06, 4 PB of data and 60 tapes: on the other hand, it was possible to repair all the other systems recovering $\sim$ 20 PB of data; for the infrastructure, the second line was recovered (see \cite{FLOODCHEP} for details).
The final damage inventory shows the loss of $\sim$ 30 kHS06,
1.4 PB of data and 60 tapes: on the other hand, it was possible to repair all the other systems recovering $\sim$ 20 PB of data;
with respect to the infrastructure, the second line was recovered (see \cite{FLOODCHEP} for details).
%\begin{figure}[h]
% \begin{center}
......@@ -190,22 +182,24 @@ The final damage inventory shows the loss of $\sim$ 30 kHS06, 4 PB of data and 6
\subsection{The long-term consequences of the flooding}
The data center was designed taking into account all possible accidents (e.g. fires, power outages ...), except at least this.
In fact, it was believed that the only threat due to water could come from a very heavy rain and, indeed, waterproof doors were installed some years ago (after a heavy rain).
The post-mortem analysis showed that the causes, beside the breaking of the tube, are to be found in the unfavorable position (2 underground levels) and in the excessive permeability of the perimeter (while the anti-flood doors worked). Therefore, an intervention has been carried out to increase the waterproofing of the data center and, moreover, work is planned for summer 2019 to strengthen the perimeter of the building and build a second water collection tank.
The data center was designed taking into account all possible accidents, e.g. fires, power outages... except very unlikely events
such as the breaking of one of the main water pipelines in Bologna, located in a road next to CNAF,
which is precisely what happened in November 2017.
In fact, it was believed that the only threat due to water could come from a very heavy rain and, indeed,
waterproof doors were installed some years ago, after a heavy rain.
The post-mortem analysis showed that the causes, besides the breaking of the pipe, are to be found in the unfavorable position (2 underground levels) and in the excessive permeability of the perimeter (while the anti-flood doors worked). Therefore, an intervention has been carried out to increase the waterproofing of the data center and, moreover, work is planned for summer 2019 to strengthen the perimeter of the building and build a second water collection tank.
Even if the search for a new location for the data center had started before the flooding (the main driver being its limited expandability, which cannot cope with the requirements foreseen for the HL-LHC era, when we should scale up to 10 MW of power for IT), the flooding gave us a second strong reason to move.
An opportunity is given by the new ECMWF center which will be hosted in Bologna, in a new Technopole area, starting from 2019. In the same area the INFN Tier-1 and the CINECA computing centers can be hosted too: funding has been guaranteed to INFN and CINECA by the Italian Government for this. The goal is to have the new data center for the INFN Tier-1 fully operational by the end of 2021.
An opportunity is given by the new ECMWF center which will be hosted in Bologna, in a new Technopole area, starting from 2019.
In the same area the INFN Tier 1 and the CINECA\footnote{CINECA is the Italian Supercomputing center, also located near Bologna ($\sim17$ km from CNAF). See \url{http://www.cineca.it/}} computing centers can be hosted too: funding has been guaranteed to INFN and CINECA by the Italian Government for this. The goal is to have the new data center for the INFN Tier 1 fully operational by the end of 2021.
\section{INFN Tier-1 extension at CINECA}\label{CINECAext}
As mentioned in the previous Paragraph, part of the farm is hosted at CINECA\footnote{CINECA is the Italian Supercomputing center, also located near Bologna ($\sim17$ far km from CNAF). See \url{http://www.cineca.it/}}.
\section{INFN Tier 1 extension at CINECA}\label{CINECAext}
Out of the 400 kHS06 CPU power (340 kHS06 pledged) of the CNAF farm, $\sim180$ kHS06 are provided by servers installed in the CINECA data center.
%Each server is equipped with a 10 Gbit uplink connection to the rack switch while each of them, in turn, is connected to the aggregation router with 4x40 Gbit links.
The logical network of the farm partition at CINECA is set as an extension of INFN Tier-1 LAN: a dedicated fiber couple interconnects the aggregation router at CINECA with the core switch at the INFN Tier-1 (see Farm and Network Chapters for more details). %Fig.~\ref{cineca-t1}).
The logical network of the farm partition at CINECA is set as an extension of the INFN Tier 1 LAN: a dedicated fiber pair interconnects the aggregation router at CINECA with the core switch at the INFN Tier 1 (see Farm and Network Chapters for more details). %Fig.~\ref{cineca-t1}).
%The transmission on the fiber is managed by a couple of Infinera DCI, allowing to have a logical channel up to 1.2 Tbps (currently it is configured to transmit up to 400 Gbps).
%\begin{figure}
% % \begin{minipage}[b]{0.45\textwidth}
......@@ -227,7 +221,7 @@ Since this partition have been installed from the beginning with CentOS 7, legac
\section*{References}
\begin{thebibliography}{9}
\bibitem{FLOODCHEP} L. dell'Agnello, "Disaster recovery of the INFN Tier-1 data center: lesson learned" to be published in Proceedings of the 23rd International Conference on Computing in High Energy and Nuclear Physics - EPJ Web of Conferences
\bibitem{FLOODCHEP} L. dell'Agnello, ``Disaster recovery of the INFN Tier 1 data center: lesson learned'', to be published in Proceedings of the 23rd International Conference on Computing in High Energy and Nuclear Physics - EPJ Web of Conferences
\bibitem{singularity} \url{http://singularity.lbl.gov}
\end{thebibliography}
......
File added
......@@ -2,34 +2,34 @@
\usepackage{graphicx}
\begin{document}
\title{User and Operational Support at CNAF}
\author{D. Cesini, E. Corni, F. Fornari, L. Morganti, C. Pellegrino, M. V. P. Soares, M. Tenti, L. Dell'Agnello}
\address{INFN-CNAF, Bologna, IT}
\author{D. Cesini$^1$, E. Corni$^1$, F. Fornari$^1$, L. Morganti$^1$, C. Pellegrino$^1$, M. V. P. Soares$^1$, M. Tenti$^1$, L. Dell'Agnello$^1$}
\address{$^1$ INFN-CNAF, Bologna, IT}
\ead{user-support@lists.cnaf.infn.it}
\begin{abstract}
Many different research groups, typically organized in Virtual Organizations (VOs),
exploit the Tier-1 Data center facilities for computing and/or data storage and management. Moreover, CNAF hosts two small HPC farms and a Cloud infrastructure. The User Support unit provides to the users of all CNAF facilities with a direct operational support, and promotes common technologies and best-practices to access the ICT resources in order to facilitate the usage of the center and maximize its efficiency.
exploit the Tier 1 Data center facilities for computing and/or data storage and management. Moreover, CNAF hosts two small HPC farms and a Cloud infrastructure. The User Support unit provides the users of all CNAF facilities with direct operational support, and promotes common technologies and best practices to access the ICT resources in order to facilitate the usage of the center and maximize its efficiency.
\end{abstract}
\section{Current status}
Born in April 2012, the User Support team in 2018 was composed of one coordinator and up to five fellows with post-doctoral education or equivalent work experience in scientific research or computing.
The main activities of the team include:
\begin{itemize}
\item providing a prompt feedback to VO-specific issues via ticketing systems or official mail channels;
\item forwarding to the appropriate Tier-1 units those requests which cannot be autonomously satisfied, and taking care of answers and fixes, e.g. via the tracker JIRA, until a solution is delivered to the experiments;
\item forwarding to the appropriate Tier 1 units those requests which cannot be autonomously satisfied, and taking care of answers and fixes, e.g. via the tracker JIRA, until a solution is delivered to the experiments;
\item supporting the experiments in the definition and debugging of computing models in distributed and Cloud environments;
\item helping the supported experiments by developing code, monitoring frameworks and writing guides and documentation for users (see e.g. https://www.cnaf.infn.it/en/users-faqs/);
\item solving issues on experiment software installation, access problems, new accounts creation and any other daily usage problems;
\item porting applications to new parallel architectures (e.g. GPUs and HPC farms);
\item providing the Tier-1 Run Coordinator, who represents CNAF at the Daily WLCG calls, and reports about resource usage and problems at the monthly meeting of the Tier-1 management body (Comitato di Gestione del Tier-1).
\item providing the Tier 1 Run Coordinator, who represents CNAF at the Daily WLCG calls, and reports about resource usage and problems at the monthly meeting of the Tier 1 management body (Comitato di Gestione del Tier 1).
\end{itemize}
People belonging to the User Support team represent INFN Tier-1 inside the VOs.
People belonging to the User Support team represent INFN Tier 1 inside the VOs.
In some cases, they are directly integrated in the supported experiments. Moreover, they can play the role of a member of any VO for debugging purposes.
The User Support staff is also involved in different CNAF internal projects, notably the Computing on SoC Architectures (COSA) project (www.cosa-project.it) dedicated to the technology tracking and benchmarking of the modern low-power architectures for computing applications.
\section{Supported experiments}
The LHC experiments represent the main users of the data center, handling more than 80\% of the total computing and storage resources funded at CNAF. Besides the four LHC experiments (ALICE, ATLAS, CMS, LHCb) for which CNAF acts as Tier-1 site, the data center also supports an ever increasing number of experiments from the Astrophysics, Astroparticle physics and High Energy Physics domains, and specifically Agata, AMS-02, Argo-YBJ, Auger, Belle II, Borexino, CDF, Compass, COSMO-WNEXT CTA, Cuore, Cupid, Dampe, DarkSide-50, Enubet, Famu, Fazia, Fermi-LAT, Gerda, Icarus, LHAASO, LHCf, Limadou, Juno, Kloe, KM3Net, Magic, NA62, Newchim, NEWS, NTOP, Opera, Padme, Pamela, Panda, Virgo, and XENON.
The LHC experiments represent the main users of the data center, handling more than 80\% of the total computing and storage resources funded at CNAF. Besides the four LHC experiments (ALICE, ATLAS, CMS, LHCb) for which CNAF acts as a Tier 1 site, the data center also supports an ever increasing number of experiments from the Astrophysics, Astroparticle physics and High Energy Physics domains, and specifically Agata, AMS-02, Auger, Belle II, Borexino, CDF, Compass, COSMO-WNEXT, CTA, Cuore, Cupid, Dampe, DarkSide-50, Enubet, Famu, Fazia, Fermi-LAT, Gerda, Icarus, LHAASO, LHCf, Limadou, Juno, Kloe, KM3Net, Magic, NA62, Newchim, NEWS, NTOP, Opera, Padme, Pamela, Panda, Virgo, and XENON.
Clearly, a bigger effort from the User Support team is needed to answer the varied and diverse needs of these non-LHC experiments and to encourage them to adopt more modern technologies, e.g. FTS, Dirac, token-based authorization.
\begin{figure}[ht]
......@@ -60,12 +60,13 @@ The following figures show resources pledged and used by the supported experimen
Unfortunately, the accounting data for storage, both disk and tape statistics, are available only after summer 2018, given that the restoration of the complex system of sensors for accounting after the 2017 flooding had a lower priority with respect to the activities needed for a complete recovery of the storage resources involved in the flood.
\section{Support to HPC and cloud-based experiments}
Apart from Tier-1 facilities, CNAF hosts two small HPC farms and a cloud infrastructure. The first HPC cluster, in production since 2015, is composed of 27 nodes, some of them also equipped with one or more GPUs (NVIDIA Tesla K20, K40 and K1). All nodes are infiniband interconnected and equipped with 2 Intel CPUs, 8 physical cores each, HyperThread enabled. The cluster is accessible via the LSF batch system. It is open to various INFN communities, but the main users are theoretical physicist dealing with plasma laser acceleration simulations. The cluster serves as testing infrastructure to prepare the high resolution runs submitted to supercomputers.
Apart from Tier 1 facilities, CNAF hosts two small HPC farms and a cloud infrastructure. The first HPC cluster, in production since 2015, is composed of 27 nodes, some of them also equipped with one or more GPUs (NVIDIA Tesla K20, K40 and K1). All nodes are interconnected via InfiniBand and equipped with 2 Intel CPUs, with 8 physical cores each and Hyper-Threading enabled. The cluster is accessible via the LSF batch system. It is open to various INFN communities, but the main users are theoretical physicists dealing with plasma laser acceleration simulations. The cluster is used as a testing infrastructure to prepare the high resolution runs to be submitted afterwards to supercomputers.
A second HPC cluster entered into production in 2017 to serve the CERN accelerators R/D groups. The cluster consists of 12 nodes OmniPath interconnected. Can be access through batch queues managed by the IBM LSF system.
A second HPC cluster entered into production in 2017 to serve the CERN accelerators R/D groups. The cluster consists of 12 nodes interconnected via OmniPath. It can be accessed through batch queues managed by the IBM LSF system.
Support is provided on a daily basis for software installation, access problems, new account creation and any other usage problems.
The User Support team manages an OpenStack-based tenant hosted within the Cloud@CNAF. This tenant, provided with 300 vCPUs, is mostly devoted to support peculiar use cases which require unusual software configurations and only for a limited amount of time. The most important of these use cases is the FAZIA experiment, for which 256 vCPUs were provided, distributed over 16 worker nodes with 8GB of RAM each, where the Debian 8.4 operating system has been installed and configured with LDAP+Kerberos for user authentication and authorization, and NFS 4 for network storage sharing. Recently, other experiments started accessing the Cloud infrastructure: AMS, EEE, FAZIA, Icarus and NTOF.
The User Support team manages an OpenStack-based tenant hosted within the Cloud@CNAF. This tenant, provided with 300 vCPUs, is mostly devoted to support peculiar use cases which require unusual software configurations and only for a limited amount of time. The most important of these use cases is the FAZIA experiment, for which 256 vCPUs were provided, distributed over 16 worker nodes with 8GB of RAM each, where the Debian 8.4 operating system has been installed and configured with LDAP and Kerberos for user authentication and authorization, and NFS 4 for network storage sharing.
Recently, other experiments started accessing the Cloud infrastructure: AMS, EEE, Icarus and NTOF.
\end{document}
......
......@@ -5,18 +5,18 @@
%\author{P. Astone$^1$, F. Badaracco$^{2,3}$, S. Bagnasco$^4$, S. Caudill$^5$, F. Carbognani$^6$, A. Cirone$^{7,8}$, G. Fronz\'e$^{4}$, J. Harms$^{2,3}$, I. LaRosa$^1$, C. Lazzaro$^9$, P. Leaci$^1$, S. Lusso$^4$, C. Palomba$^1$, R. DePietri$^{11,12}$, M. Punturo$^{10}$, L. Rei$^8$, L. Salconi$^6$, S. Vallero$^{4}$, on behalf of the Virgo collaboration}
\author{P. Astone$^1$, F. Badaracco$^{2,3}$, S. Bagnasco$^4$, S. Caudill$^5$, F. Carbognani$^6$, A. Cirone$^{7,8}$, M. Drago$^{2,3}$, G. Fronz\'e$^{4}$, J. Harms$^{2,3}$, I. LaRosa$^1$, C. Lazzaro$^9$, P. Leaci$^1$, S. Lusso$^4$, C. Palomba$^1$, R. DePietri$^{11,12}$, M. Punturo$^{10}$, L. Rei$^8$, L. Salconi$^6$, S. Vallero$^{4}$, on behalf of the Virgo collaboration}
\address{$^1$ INFN, Roma, IT}
\address{$^2$ Gran Sasso Science Institute (GSSI), IT}
\address{$^3$ INFN, Laboratori Nazionali del Gran Sasso, IT}
\address{$^4$ INFN, Torino, IT}
\address{$^5$ Nikhef, Science Park, NL}
\address{$^6$ EGO-European Gravitational Observatory, Cascina, Pisa, IT}
\address{$^7$ Universit\`a degli Studi di Genova, IT}
\address{$^8$ INFN, Genova, IT}
\address{$^9$ INFN, Padova, IT}
\address{$^{10}$ INFN, Perugia, IT}
\address{$^{11}$ Universit\`a degli Studi di Parma, IT}
\address{$^{12}$ INFN, Gruppo Collegato Parma, IT}
\address{$^1$ INFN Sezione di Roma, Roma, IT}
\address{$^2$ Gran Sasso Science Institute (GSSI), L'Aquila, IT}
\address{$^3$ INFN Laboratori Nazionali del Gran Sasso, L'Aquila, IT}
\address{$^4$ INFN Sezione di Torino, Torino, IT}
\address{$^5$ Nikhef, Amsterdam, NL}
\address{$^6$ EGO-European Gravitational Observatory, Cascina (PI), IT}
\address{$^7$ Universit\`a degli Studi di Genova, Genova, IT}
\address{$^8$ INFN Sezione di Genova, Genova, IT}
\address{$^9$ INFN Sezione di Padova, Padova, IT}
\address{$^{10}$ INFN Sezione di Perugia, Perugia, IT}
\address{$^{11}$ Universit\`a degli Studi di Parma, Parma, IT}
\address{$^{12}$ INFN Gruppo Collegato Parma, Parma, IT}
%\address{Production Editor, \jpcs, \iopp, Dirac House, Temple Back, Bristol BS1~6BE, UK}
......@@ -32,7 +32,7 @@ The amount of data processed during the last few years has emphasized the fact t
\section{Advanced Virgo computing model}
\subsection{Data production and data transfer}
The Advanced Virgo data acquisition system is writing about 35MB/s of data (so-called ``bulk data'') during O3. CNAF and CC-IN2P3 are the Virgo Tier-0: during the science runs, bulk data is stored in a circular buffer located at the Virgo site, and simultaneously transferred to the remote computing centres where they are archived in tape libraries. The transfer is realized through an ad-hoc procedure based on GridFTP (at CNAF) and iRods (at CC-IN2P3). Other data fluxes reach CNAF during science runs:
The Advanced Virgo data acquisition system is writing about 35 MB/s of data (so-called ``bulk data'') during O3. CNAF and CC-IN2P3 are the Virgo Tier 0: during the science runs, bulk data is stored in a circular buffer located at the Virgo site, and simultaneously transferred to the remote computing centers where it is archived in tape libraries. The transfer is realized through an ad-hoc procedure based on GridFTP (at CNAF) and iRODS (at CC-IN2P3). Other data fluxes reach CNAF during science runs:
\begin{itemize}
\item trend data (few GB/day), periodically transferred using the system described above;
......@@ -42,23 +42,23 @@ The Advanced Virgo data acquisition system is writing about 35MB/s of data (so-c
\subsection{Data Analysis at CNAF}
%The analysis of the LIGO and Virgo data was made jointly by the two collaborations; the analysis pipelines are distributed among the worldwide network of computing facilities offering computing resources to the GW experiments. CNAF was mainly used for CW analysis, looking for continuous gravitational wave signals, developed by INFN–Roma people (see hereafter more details). But at CNAF is also running part of the pyCBC pipeline, submitted via OSG, looking for compact binaries signals. pyCBC has a crucial role in the detection of the coalescence of BBH and BNS. CNAF contributed to the computation performed through pyCBC for the analysis of the events GW170814, the first BBH coalescence detected also by Virgo, and GW170817, the BNS coalescence. During the last month a new extension of CVMFS, \emph{big cvmfs} was mounted at cnaf to support another OSG pipeline, \emph{BayesWave}. The big cvmfs is able to export, in a posix fashion, big file of data from nearby cache in Amsterdam instead of accessing data directly from Nebraska. BayesWave is a Bayesian algorithm designed to robustly distinguish gravitational wave signals from noise and instrumental glitches without relying on any prior assumptions of waveform morphology. In the last year coherent WaveBurst \emph{cwb} was ported to cnaf and made available to run. cwb is a pipeline based on coherent algorithm for detection and reconstruction of modelled and unmodelled GW bursts. A new newtonian noise cancellation algoritmh, developed by the group of Gran Sasso Science Institute (\emph{GSSI}) was made available very recently. The increased number of LVC pipelines running at cnaf has led to saturate advance virgo pledge at cnaf, cnaf promptly rensponded to advance virgo needed enlargin our quota and giving experimental access to gpu.
LIGO-Virgo data analysis is organized jointly, meaning that the analysis pipelines are made available to the computing facilities related to the LVC network, ready to be distributed to each GW detector. CNAF has been mainly used for Continuous Wave(\emph{CW}) analysis, led by the Roma INFN group, and for the Compact Binary Coalescence python-based analysis (\emph{pyCBC}), submitted via OSG. In particular CNAF computationally contributed to GW170814 and GW170817 events, respectively the first BBH coalescence detected by Virgo and the first BNS merger ever observed. During the last month a new extension of CVMFS, so-called ``big cvmfs'', was mounted at CNAF to support another OSG-based pipeline, Bayes Wave. The former is able to make available, in a POSIX-like fashion, big data files from a cache in Amsterdam, instead of accessing the data directly from Nebraska. The latter is a Bayesian algorithm, designed to robustly distinguish GW signals from noise and instrumental glitches, without relying on any prior assumptions on the waveform shape. During the last year, coherent WaveBurst(\emph{cWB}), an algorithm dedicated to the detection and reconstruction of GW Bursts, was also ported to CNAF. Furthermore, new Newtonian Noise cancellation algorithms, which are currently being developed by the GSSI group, were made recently available. The increasing number of LVC pipelines running at CNAF has led to resource saturation, and consequently to a demand for enlarged computing power, together with access to GPUs.
LIGO-Virgo data analysis is organized jointly, meaning that the analysis pipelines are made available to the computing facilities related to the LVC network, ready to be distributed to each GW detector. CNAF has been mainly used for Continuous Wave (\emph{CW}) analysis, led by the Roma INFN group, and for the Compact Binary Coalescence Python-based analysis (\emph{pyCBC}), submitted via OSG. In particular, CNAF computationally contributed to the GW170814 and GW170817 events, respectively the first BBH coalescence detected by Virgo and the first BNS merger ever observed. During the last month a new extension of CVMFS, so-called ``big cvmfs'', was mounted at CNAF to support another OSG-based pipeline, BayesWave. The former is able to make available, in a POSIX-like fashion, big data files from a cache in Amsterdam, instead of accessing the data directly from Nebraska. The latter is a Bayesian algorithm, designed to robustly distinguish GW signals from noise and instrumental glitches, without relying on any prior assumptions on the waveform shape. During the last year, coherent WaveBurst (\emph{cWB}), an algorithm dedicated to the detection and reconstruction of GW Bursts, was also ported to CNAF. Furthermore, new Newtonian Noise cancellation algorithms, which are currently being developed by the GSSI group, were recently made available. The increasing number of LVC pipelines running at CNAF has led to resource saturation, and consequently to a demand for enlarged computing power, together with access to GPUs.
\subsubsection{CW pipeline}
CNAF has been in 2018 the main computing center for Virgo all-sky continuous wave (CW) searches. The search for this kind of signals, emitted by spinning neutron stars, covers a large portion of the source parameter space and consists of several steps organized in a hierarchical analysis pipeline. CNAF has been mainly used for the ``incoherent'' stage, based of a particular implementation of the Hough transform, which is the heaviest part of the analysis from a computational point of view. The code implementing the Hough transform has been written in such a way that the exploration of the parameter space can be split in several independent jobs, each covering a range of signal frequencies and a portion of the sky. This is an embarrassingly parallel problem, very well suited to be run in a distributed computing environment. The analysis jobs have been run using the EGI UMD grid middleware, with input and output files stored in a StoRM-based Storage Element at CNAF. Candidate post-processing, consisting of clusterisation, coincidences and ranking, and parts of the candidate follow-up analysis have been also carried on at CNAF. Typical Hough transform jobs needs about 4GB of memory (with a fraction requiring more, up to 8GB). Past year most of the resources have been used to analyze Advanced LIGO O2 data. Overall, in 2018 more than 10M CPU hours have been used at CNAF for CW searches, by running O($10^5$) jobs, with duration from a few hours to ~3 days.
In 2018 CNAF was the main computing center for Virgo all-sky continuous wave (CW) searches. The search for this kind of signal, emitted by spinning neutron stars, covers a large portion of the source parameter space and consists of several steps organized in a hierarchical analysis pipeline. CNAF has been mainly used for the ``incoherent'' stage, based on a particular implementation of the Hough transform, which is the heaviest part of the analysis from a computational point of view. The code implementing the Hough transform has been written in such a way that the exploration of the parameter space can be split into several independent jobs, each covering a range of signal frequencies and a portion of the sky. This is an embarrassingly parallel problem, very well suited to be run in a distributed computing environment. The analysis jobs have been run using the EGI UMD grid middleware, with input and output files stored in a StoRM-based Storage Element at CNAF. Candidate post-processing, consisting of clusterisation, coincidences and ranking, and parts of the candidate follow-up analysis have also been carried out at CNAF. A typical Hough transform job needs about 4 GB of memory (with a fraction requiring more, up to 8 GB). Last year most of the resources were used to analyze Advanced LIGO O2 data. Overall, in 2018 more than 10M CPU hours were used at CNAF for CW searches, by running O($10^5$) jobs, with durations from a few hours to $\sim$3 days.
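The decomposition of the parameter space into independent jobs can be illustrated with a minimal sketch. This is not the Virgo CW code: the function name, the frequency range, the band width and the number of sky patches are all hypothetical and chosen only to show how each job is fully described by one frequency band and one sky patch.

```python
import math


def split_parameter_space(f_min, f_max, band_width, n_sky_patches):
    """Return one job descriptor per (frequency band, sky patch) pair.

    All arguments are illustrative assumptions, not values from the
    actual CW search configuration.
    """
    if band_width <= 0 or f_max <= f_min or n_sky_patches < 1:
        raise ValueError("invalid search ranges")

    n_bands = math.ceil((f_max - f_min) / band_width)
    jobs = []
    for band in range(n_bands):
        f_lo = f_min + band * band_width
        f_hi = min(f_lo + band_width, f_max)  # last band may be shorter
        for patch in range(n_sky_patches):
            # Each descriptor is self-contained: jobs share no state,
            # which is what makes the problem embarrassingly parallel.
            jobs.append({"f_lo": f_lo, "f_hi": f_hi, "sky_patch": patch})
    return jobs


# Hypothetical example: 20-128 Hz in 1 Hz bands, sky divided into 4 patches.
jobs = split_parameter_space(20.0, 128.0, 1.0, 4)
```

Since no job depends on the output of another, the resulting list can be handed to any batch or grid submission system in arbitrary order.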
\subsubsection{cWB pipeline}
Starting in 2019, the coherent WaveBurst based pipelines have been ported and adapted to run at CNAF to reproduce the cWB environment setup on the worker nodes, without the constraint to read the user home account during running. It is planned to run at CNAF all Virgo offline long duration all-sky searches on the data that will be collected during the Observational Run 3 (03) that started April 1st, 2019. cWB is a data-analysis tool to search for a broad range of gravitational-wave (GW) transients. The pipeline identifies coincident events in the GW data from earth-based interferometric detectors and reconstructs the gravitational wave signal by using a constrained maximum likelihood approach. The algorithm performs a time-frequency analysis of the data, using wavelet representation, and identifies the events by clustering time-frequency pixels with significant excess coherent power. The likelihood statistics is built as a coherent sum over the responses of different detectors and estimates the total signal to noise ratio of the GW signal in the network. The pipeline splits the total analysis time into sub-periods to be analyzed in parallel jobs, using HTCondor tools and it is expected to use a consistent amount of CPU hours during 2019.
Starting in 2019, the coherent WaveBurst based pipelines have been ported and adapted to run at CNAF to reproduce the cWB environment setup on the worker nodes, without the constraint to read the user home account during running. It is planned to run at CNAF all Virgo offline long duration all-sky searches on the data that will be collected during the Observational Run 3 (O3) that started on April 1, 2019. cWB is a data-analysis tool to search for a broad range of gravitational-wave (GW) transients. The pipeline identifies coincident events in the GW data from earth-based interferometric detectors and reconstructs the gravitational wave signal by using a constrained maximum likelihood approach. The algorithm performs a time-frequency analysis of the data, using wavelet representation, and identifies the events by clustering time-frequency pixels with significant excess coherent power. The likelihood statistic is built as a coherent sum over the responses of different detectors and estimates the total signal-to-noise ratio of the GW signal in the network. The pipeline splits the total analysis time into sub-periods to be analyzed in parallel jobs, using HTCondor tools, and it is expected to use a considerable amount of CPU hours during 2019.
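The splitting of the total analysis time into sub-periods can be sketched as follows. This is only an illustration under stated assumptions: the function name and the segment length are hypothetical, the time values are arbitrary integers standing in for GPS seconds, and the actual cWB segmentation (which also has to account for data quality and detector availability) is not reproduced here.

```python
def split_into_segments(t_start, t_stop, segment_length):
    """Split [t_start, t_stop) into contiguous, non-overlapping sub-periods.

    Times are integer seconds; the segment length is an illustrative
    assumption, not a cWB configuration value.
    """
    if segment_length <= 0 or t_stop <= t_start:
        raise ValueError("invalid time range or segment length")

    segments = []
    t0 = t_start
    while t0 < t_stop:
        t1 = min(t0 + segment_length, t_stop)  # last segment may be shorter
        segments.append((t0, t1))
        t0 = t1
    return segments


# Hypothetical example: 2500 s of data in sub-periods of at most 600 s.
segments = split_into_segments(0, 2500, 600)
```

Each tuple would then correspond to one job of the parallel analysis; how the jobs are described to the batch system is left out on purpose.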
\subsubsection{Newtonian noise pipeline}
The cancellation of gravitational noise from seismic fields will be a major challenge from both the theoretical and the computational point of view, since the involved simulations are very demanding. This activity requires the accurate positioning of a large number of seismometers. A cluster at CNAF was used to run position optimisations of the seismic arrays used for cancellation and to determine the cancellation performance as a function of the number of sensors and its robustness with respect to sensor-positioning accuracy.
\subsection{outlook}
The first detection of gravitational waves (GW) and the birth of multi-messenger astrophysics have opened a new field of scientific research. With the possibility to detect GW from various kind of sources we can probe new physical phenomena in regions of the Universe we couldn't explore before, with new perspectives on our knowledge about how it works.
Indeed, so far only signals from the coalescence of compact objects have been detected, while one of the most interesting and promising class of continuous GW signals, coming from asymmetrical rotating neutron stars, is still missing. Wide searches of this kind of signals require a huge amount of computational power due to the Doppler effect of the Earth motion, which disrupts the incoming signal dramatically increases the parameters space. This means that it is necessary to develop complex algorithms to reduce the computational power needed, at the price of significantly reducing the sensitivity of the search.
\subsection{Outlook}
The first detection of gravitational waves (GW) and the birth of multi-messenger astrophysics have opened a new field of scientific research. With the possibility to detect GW from various kinds of sources we can probe new physical phenomena in regions of the Universe we couldn't explore before, with new perspectives on our knowledge about how it works.
Indeed, so far only signals from the coalescence of compact objects have been detected, while one of the most interesting and promising classes of continuous GW signals, coming from asymmetric rotating neutron stars, is still missing. Wide searches for this kind of signal require a huge amount of computational power due to the Doppler effect of the Earth's motion, which disrupts the incoming signal and dramatically increases the parameter space. This means that it is necessary to develop complex algorithms to reduce the computational power needed, at the price of significantly reducing the sensitivity of the search.
The development of new algorithms, which use the high efficiency and computational power of modern GPUs, showed that the new codes on a single GPU can run with a factor of ten speed-up with respect to the older ones on a ten times more expensive multi-core CPU.
For the CW case, using real data from the 9 months long run of the LIGO detectors we have estimated that on a cluster of about 200 GPUs a complete search can be done in about a couple of months, to be confronted with the several months required by the older code on a 2000 CPUs cluster.\\ A GPU cluster would be also extremely useful to test and train Machine Learning algorithms, which in the recent years were shown to be able to face very complex analyses with high efficiency and speed.\\
For the CW case, using real data from the 9-month-long run of the LIGO detectors we have estimated that on a cluster of about 200 GPUs a complete search can be done in about a couple of months, to be compared with the several months required by the older code on a 2000-CPU cluster.\\ A GPU cluster would also be extremely useful to test and train Machine Learning algorithms, which in recent years were shown to be able to face very complex analyses with high efficiency and speed.\\
Advanced Virgo and Advanced LIGO are also exploring different technologies to face the new challenges of GW physics. The growing number of computing centers involved in GW research forces us to relax our idea on computing, searching for a way to uniformly run different pipelines in complex and heterogeneous infrastructures. For example, the de-supporting of GridFTP pushes towards the use of Rucio, a well-supported and flexible tool for data transfer and management, while the de-supporting of the Cream-CE suggests a redesign of the job submission strategy, possibly under the control of an overall management system like DIRAC. \\ CNAF staff is intensively supporting Virgo members in all these tests.
......
......@@ -20,9 +20,9 @@
\title{XENON computing model}
%\pagestyle{fancy}
\author{M. Selvi}
\author{Marco Selvi$^1$}
\address{INFN - Sezione di Bologna}
\address{$^1$ INFN Sezione di Bologna, Bologna, IT}
\ead{marco.selvi@bo.infn.it}
......
immagini/Additional-Information_18_web.jpg

752 KiB