\documentclass[a4paper]{jpconf}
\usepackage{graphicx}
\usepackage{hyperref}
\usepackage{todonotes}

\begin{document}
\title{DAMPE data processing and analysis at CNAF}

\author{G. Ambrosi$^1$, G. Donvito$^5$, D.F. Droz$^6$, M. Duranti$^1$, D. D'Urso$^{2,3,4}$, F. Gargano$^{5,\ast}$, G. Torralba Elipe$^{7,8}$}


\address{$^1$ INFN, Sezione di Perugia, I-06100 Perugia, Italy}
\address{$^2$ Universit\`a di Sassari, I-07100 Sassari, Italy}
\address{$^3$ ASDC, I-00133 Roma, Italy}
\address{$^4$ INFN-LNS, I-95123 Catania, Italy}
%\address{$^3$ Universit\`a di Perugia, I-06100 Perugia, Italy}
\address{$^5$ INFN, Sezione di Bari, I-70125 Bari, Italy}
\address{$^6$ University of Geneva, D\'epartement de physique nucl\'eaire et corpusculaire (DPNC), CH-1211 Gen\`eve 4, Switzerland}

\address{$^7$ Gran Sasso Science Institute, L'Aquila, Italy}
\address{$^8$ INFN - Laboratori Nazionali del Gran Sasso, L'Aquila, Italy}


\address{DAMPE experiment \url{http://dpnc.unige.ch/dampe/},
\url{http://dampe.pg.infn.it}}

\ead{* fabio.gargano@ba.infn.it}

\begin{abstract}
DAMPE (DArk Matter Particle Explorer) is one of the five satellite missions in the framework of the Strategic Pioneer Research Program in Space Science of the Chinese Academy of Sciences (CAS). DAMPE was launched on 17 December 2015 at 08:12 Beijing time into a sun-synchronous orbit at an altitude of 500 km. The satellite is equipped with a powerful space telescope for high-energy gamma-ray, electron and cosmic-ray detection.
The CNAF computing center hosts the mirror of DAMPE data outside China and is the main data center for Monte Carlo production. It also supports the data analysis of the Italian DAMPE Collaboration.
\end{abstract}

\section{Introduction}

\begin{figure}[ht]
\begin{center}
\includegraphics[width=20pc]{dampe_layout_2.jpg}
\end{center}
\caption{\label{fig:dampe_layout} DAMPE telescope scheme: a double layer of plastic scintillator strips (PSD);
the silicon-tungsten tracker-converter (STK), made of 6 double tracking layers; the imaging calorimeter, about 31 radiation lengths thick, made of 14 layers of Bismuth Germanium Oxide (BGO) bars in a hodoscopic arrangement; and
the neutron detector (NUD), placed just below the calorimeter.}
\end{figure}

DAMPE is a space telescope for high-energy cosmic-ray detection.
A scheme of the DAMPE telescope is shown in Fig.~\ref{fig:dampe_layout}. At the top, the plastic scintillator strip detector (PSD) consists of one double layer of scintillating plastic strips; it serves as an anti-coincidence detector and measures the particle charge. It is followed by the silicon-tungsten tracker-converter (STK), made of 6 tracking layers. Each tracking layer consists of two planes of single-sided silicon strip detectors measuring the position in the two orthogonal views perpendicular to the pointing direction of the apparatus. Three tungsten plates, 1~mm thick, are inserted in front of tracking layers 3, 4 and 5 to promote photon conversion into electron-positron pairs. The STK is followed by an imaging calorimeter, about 31 radiation lengths thick, made of 14 layers of Bismuth Germanium Oxide (BGO) bars placed in a hodoscopic arrangement. The total thickness of the BGO and the STK corresponds to about 33 radiation lengths, making it the deepest calorimeter ever used in space. Finally, in order to detect the delayed neutrons resulting from hadronic showers and to improve the electron/proton separation power, a neutron detector (NUD) is placed just below the calorimeter. The NUD consists of 16 boron-doped plastic scintillator plates, each 1~cm thick and 19.5 $\times$ 19.5 cm$^2$ in size, read out by a photomultiplier.

The primary scientific goal of DAMPE is to measure electrons and photons with much higher energy resolution and energy reach than achievable with existing space experiments. This will help to identify possible Dark Matter signatures, may advance our understanding of the origin and propagation mechanisms of high-energy cosmic rays, and could lead to new discoveries in high-energy gamma-ray astronomy.

DAMPE was designed to have unprecedented sensitivity and energy reach for electrons, photons and heavier cosmic rays (protons and heavy ions). For electrons and photons, the detection range is 2 GeV--10 TeV, with an energy resolution of about 1.5\% at 100 GeV. For protons and heavy ions, the detection range is 100 GeV--100 TeV, with an energy resolution better than 40\% at 800 GeV. The geometrical factor is about 0.3 m$^2$ sr for electrons and photons, and about 0.2 m$^2$ sr for heavier cosmic rays. The angular resolution is 0.1$^{\circ}$ at 100 GeV.

\section{DAMPE Computing Model and Computing Facilities}
As a Chinese satellite, DAMPE transmits its data via the Chinese space communication system to the China National Space Administration (CNSA) center in Beijing. From Beijing, the data are then transmitted to the Purple Mountain Observatory (PMO) in Nanjing, where they are processed and reconstructed.
On the European side, the DAMPE collaboration consists of research groups from INFN and the Universities of Perugia, Lecce and Bari, and from the Department of Particle and Nuclear Physics (DPNC) at the University of Geneva in Switzerland.


\subsection{Data production}
PMO is the deputed center for DAMPE data production. Data are downloaded 4 times per day, each time the DAMPE satellite passes over the Chinese ground stations (roughly every 6 hours). Once transferred to PMO, the binary data downloaded from the satellite are processed to produce a stream of raw data in ROOT \cite{root} format (the {\it 1B} data stream, $\sim$ 7 GB/day) and a second stream that includes the orbital and slow-control information (the {\it 1F} data stream, $\sim$ 7 GB/day). The {\it 1B} and {\it 1F} streams are used to derive calibration files for the different subdetectors ($\sim$ 400 MB/day). Finally, data are reconstructed using the DAMPE official reconstruction code, and the so-called {\it 2A} data stream (ROOT files, $\sim$ 85 GB/day) is produced. The total data volume produced per day is $\sim$ 100 GB.
Data processing and reconstruction activities are currently supported by a computing farm consisting of more than 1400 computing cores, able to reprocess 3 years of DAMPE data in 1 month.
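
As a cross-check of the quoted total, the per-stream volumes add up as in the following minimal Python sketch (the stream sizes are those quoted above; the script itself is purely illustrative):
\begin{verbatim}
# Approximate daily data volumes per stream, as quoted above (GB/day).
daily_gb = {
    "1B raw":             7,
    "1F orbit/slow-ctrl": 7,
    "calibration":        0.4,   # ~400 MB/day
    "2A reconstructed":   85,
}
total = sum(daily_gb.values())
print(f"total: ~{total:.0f} GB/day")  # ~99 GB/day, i.e. the ~100 GB quoted
\end{verbatim}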

\subsection{Monte Carlo Production}
The analysis of DAMPE data requires large amounts of Monte Carlo simulation to fully understand the detector capabilities, measurement limits and systematics. In order to facilitate work-flow handling and management, and to enable effective monitoring of a large number of batch jobs in various states, a NoSQL meta-data database based on MongoDB \cite{mongo} was developed, with a prototype currently running at the Physics Department of the University of Geneva. Database access is provided through a web frontend and command-line tools based on the Flask web toolkit \cite{flask}, with a client backend of cron scripts that run on the selected computing farm. The design and implementation of this work-flow system were heavily influenced by the Fermi-LAT data processing pipeline \cite{latpipeline} and the DIRAC computing framework \cite{dirac}.
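
As an illustration of this metadata approach, a job record could be created and queried with pymongo along the following lines (a minimal sketch: the connection string, collection layout and field names are hypothetical, not the actual DAMPE schema):
\begin{verbatim}
from pymongo import MongoClient

# Hypothetical connection, database and collection names.
client = MongoClient("mongodb://localhost:27017")
jobs = client["dampe_workflow"]["jobs"]

# A minimal job document: the payload definition plus bookkeeping fields.
jobs.insert_one({
    "task":       "mc_allProton_v6r0",  # hypothetical production task name
    "site":       "CNAF",
    "status":     "submitted",
    "depends_on": [],  # jobs are submitted only once dependencies are met
})

# The cron-based client backend could select runnable jobs like this:
runnable = jobs.find({"status": "submitted", "depends_on": []})
\end{verbatim}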

Once submitted, each batch job continuously reports its status to the database through outgoing HTTP requests; to that end, computing nodes need to allow outgoing internet access. Each batch job implements a work-flow in which the input and output data transfers are performed (and their return codes reported), as well as the actual running of the job payload (which is defined in the metadata description of the job). Dependencies between productions are implemented at the framework level, and jobs are only submitted once their dependencies are satisfied.
Once a sample is generated, a secondary job performs the digitization and reconstruction of the existing MC data in bulk with a given software release. This process is set up via a cron job at DPNC and occupies up to 200 slots in a 6-hour limited computing queue.
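
The status reporting described above could look like the following sketch (the endpoint URL and payload fields are assumptions, not the actual frontend API):
\begin{verbatim}
import requests

def report_status(job_id, status, return_code=None):
    """Report a job state transition to the work-flow frontend."""
    payload = {"job_id": job_id, "status": status,
               "return_code": return_code}
    # Hypothetical endpoint on the Flask-based web frontend.
    requests.post("https://dampe-workflow.example.org/api/status",
                  json=payload, timeout=30)

# Typical life cycle inside a batch job:
report_status("job-001234", "running")               # payload started
report_status("job-001234", "done", return_code=0)   # stage-out finished
\end{verbatim}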

\subsection{Data availability}
DAMPE data are available to the Chinese Collaboration through the PMO institute, and are made accessible to the European Collaboration by transferring them from PMO to CNAF and from there to the DPNC.
Every time new {\it 1B}, {\it 1F} or {\it 2A} data files are available at PMO, they are copied, using the GridFTP \cite{gridftp} protocol,
to a server at CNAF, \texttt{gridftp-plain-virgo.cr.cnaf.infn.it}, into the DAMPE storage area. From CNAF, every 4 hours a copy of each stream is triggered towards the Geneva computing farm via rsync. Dedicated LSF jobs are submitted once per day to asynchronously verify the checksums of the data newly transferred from PMO to CNAF and from CNAF to Geneva.
Data verification and copy processes are managed through a dedicated User Interface (UI), \texttt{ui-dampe}.
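
The checksum verification step can be sketched as follows (MD5 via hashlib is an assumption; the text does not specify the algorithm used by the LSF jobs):
\begin{verbatim}
import hashlib

def md5sum(path, chunk_size=1 << 20):
    """MD5 of a file, read in 1 MB chunks to keep memory use flat."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def verify(path, expected):
    """Compare a transferred file against the checksum at the source."""
    return md5sum(path) == expected
\end{verbatim}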

The connection to China passes through the OrientPlus \cite{orientplus} link of the G\'eant Consortium \cite{geant}. The data transfer rate is currently limited by the connection of the PMO to the China Education and Research Network (CERNET) \cite{cernet}, which has a maximum bandwidth of 100 Mb/s, so the PMO-CNAF copy process is used only for the daily data production.
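
A back-of-the-envelope calculation shows why the link supports the daily production volume but not bulk transfers:
\begin{verbatim}
# Ideal sustained throughput of a 100 Mb/s link, ignoring overheads.
mbit_per_s = 100
gb_per_day = mbit_per_s / 8 * 86400 / 1000
print(f"~{gb_per_day:.0f} GB/day")  # ~1080 GB/day at 100% utilisation

# The ~100 GB/day of routine production fits comfortably; a multi-TB
# re-processing campaign would instead saturate the link for weeks.
\end{verbatim}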

To transfer data towards Europe in case of DAMPE data re-processing, and to share in China the Monte Carlo generated in Europe,
a dedicated DAMPE server has been installed at the Institute of High Energy Physics (IHEP) in Beijing, connected to CERNET with a 1 Gb/s bandwidth. Data synchronization between this server and PMO is done by a manual hard-drive exchange.
 
To simplify user data access across Europe, an XRootD federation has been implemented: an XRootD redirector has been set up in Bari, with end-point XRootD server installations (providing the actual data) at CNAF, Bari and Geneva. These end-points provide unified read access for users in Europe.
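
From the user's point of view, federated access means a file is opened through the redirector by its logical path; a minimal PyROOT sketch follows (the redirector hostname and file path are placeholders, not the real federation names):
\begin{verbatim}
import ROOT

# The redirector in Bari forwards the request to whichever end-point
# (CNAF, Bari or Geneva) actually holds the file.
f = ROOT.TFile.Open(
    "root://xrootd-redirector.example.infn.it//dampe/2A/run_000001.root")
if f and not f.IsZombie():
    f.ls()  # list the contents fetched over the federation
\end{verbatim}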

\section{CNAF contribution}
The CNAF computing center hosts the mirror of DAMPE data outside China and is the main data center for Monte Carlo production.\\
In 2018, a dedicated user interface, 300 TB of disk space and 7.8k HS06 of CPU power were allocated for the DAMPE activities.

    
\section{Activities in 2018}
The DAMPE activities at CNAF in 2018 concerned data transfer, Monte Carlo production and data analysis.



\subsection{Data transfer}
The daily transfer of data from PMO to CNAF, and from there to Geneva (GVA), was carried out throughout the year.
The transfer rate has been about 100 GB per day from PMO to CNAF, plus another 100 GB per day from CNAF to Geneva.
The step between PMO and CNAF is performed, as described in the previous sections, via the \texttt{gridftp} protocol.
Two strategies have instead been used to copy data from CNAF to Geneva: via \texttt{rsync} run from the UI, and via \texttt{rsync} managed by batch (LSF) jobs.
DAMPE data have been reprocessed three times during the year, and a dedicated copy task was carried out to transfer the new production releases, in addition to the ordinary daily copy.
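
The two copy strategies differ only in where the transfer process runs (interactively on the UI, or inside an LSF job); the transfer step itself can be sketched as follows (paths and hosts are placeholders):
\begin{verbatim}
import subprocess

def rsync_stream(src, dest):
    """Copy one data stream with rsync; this call can be run directly
    from the UI or wrapped in an LSF batch job (e.g. via bsub)."""
    cmd = ["rsync", "-a", "--partial", src, dest]
    return subprocess.run(cmd, check=False).returncode

# Placeholder paths: the real storage-area layout is not given here.
rc = rsync_stream("/storage/dampe/2A/", "dampe@farm.unige.ch:/data/2A/")
\end{verbatim}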


\subsection{Monte Carlo Production}

\iffalse
\begin{figure}
\begin{center}
\includegraphics[width=30pc]{CNAF_HS06_2017}
\end{center}
\caption{\label{fig:hs06_2017} CPU time consumption, in terms of HS06 (blue solid for daily computation, dashed for the average over the entire year). The red solid line corresponds to the annual pledge and the green dotted line corresponds to the job efficiency computed in a 14-day sliding window.}
\end{figure}
\fi

\begin{figure}[ht]
\begin{center}
\includegraphics[width=35pc]{figure_cnaf.png}
\end{center}
\caption{\label{fig:figure_cnaf} Status of completed simulation production at CNAF.}
\end{figure} 

\begin{figure}[ht]
\begin{center}
\includegraphics[width=35pc]{figureCNAF2018.png}
\end{center}
\caption{\label{fig:figure_cnaf_2018} Status of completed simulation production at CNAF in 2018.}
\end{figure} 

\iffalse
\begin{figure}[ht]
\begin{center}
\includegraphics[width=35pc]{figure_all.png}
\end{center}
\caption{\label{fig:figure_all} Status of completed simulation production at all DAMPE simulation sites.}
\end{figure} 
\fi

As the main data center for Monte Carlo production, CNAF has been strongly involved in the Monte Carlo campaign.

At CNAF, almost 300 thousand jobs have been executed, for a total of about 3 billion Monte Carlo events.

The Monte Carlo campaign is still ongoing for different particle species and different energy ranges.
The status of the completed simulation production at CNAF is shown in figure \ref{fig:figure_cnaf}, while figure \ref{fig:figure_cnaf_2018} shows the production completed in 2018.
During 2019 we will perform a new full simulation campaign with an improved version of our simulation code: this is crucial for all the forthcoming analyses.

\subsection{Data Analysis}
Most of the DAMPE analysis in Europe is performed at CNAF, whose role has been crucial for all the DAMPE publications, such as the Nature paper on the direct detection of a break in the TeV cosmic-ray spectrum of electrons and positrons \cite{nature}.


\section{Acknowledgments}
The DAMPE mission was funded by the Strategic Priority Science and Technology Projects in Space Science of the Chinese Academy of Sciences, and in part by the National Key Program for Research and Development and the 100 Talents Program of the Chinese Academy of Sciences. In Europe, the work is supported by the Italian National Institute for Nuclear Physics (INFN), the Italian University and Research Ministry (MIUR), and the University of Geneva. We extend our gratitude to CNAF-T1 for their continued support, also beyond the provision of computing resources.


\section*{References}
\begin{thebibliography}{9}
\bibitem{root} Antcheva I {\it et al.} 2009 {\it Computer Physics Communications} {\bf 180} 2499--2512 \newline https://root.cern.ch/guides/reference-guide
\bibitem{mongo} https://www.mongodb.org
\bibitem{flask} http://flask.pocoo.org
\bibitem{latpipeline} Dubois R. 2009 {\it ASP Conference Series} {\bf 411} 189
\bibitem{dirac} Tsaregorodtsev A {\it et al.} 2008 {\it Journal of Physics: Conference Series} {\bf 119} 062048
\bibitem{gridftp} Allcock W, Bresnahan J, Kettimuthu R and Link M 2005 The Globus striped GridFTP framework and server {\it ACM/IEEE SC 2005 Conference (SC'05)} p 54, doi:10.1109/SC.2005.72 \newline http://www.globus.org/toolkit/docs/latest-stable/gridftp/
\bibitem{nature} Ambrosi G {\it et al.} (DAMPE Collaboration) 2017 Direct detection of a break in the teraelectronvolt cosmic-ray spectrum of electrons and positrons {\it Nature} {\bf 552} 63--66
\bibitem{orientplus} http://www.orientplus.eu
\bibitem{geant} http://www.geant.org
\bibitem{cernet} http://www.cernet.edu.cn/HomePage/english/index.shtml
\bibitem{asdc} http://www.asdc.asi.it

\end{thebibliography}
 

\end{document}