%% This BibTeX bibliography file was created using BibDesk.
%% http://bibdesk.sourceforge.net/
%% Created for Fabio Bellini at 2017-02-28 14:54:59 +0100
%% Saved with string encoding Unicode (UTF-8)
@article{Alduino:2017ehq,
author = "Alduino, C. and others",
title = "{First Results from CUORE: A Search for Lepton Number
Violation via $0\nu\beta\beta$ Decay of $^{130}$Te}",
collaboration = "CUORE",
journal = "Phys. Rev. Lett.",
volume = "120",
year = "2018",
number = "13",
pages = "132501",
doi = "10.1103/PhysRevLett.120.132501",
eprint = "1710.07988",
archivePrefix = "arXiv",
primaryClass = "nucl-ex",
SLACcitation = "%%CITATION = ARXIV:1710.07988;%%"
}
@article{Alduino:2016vtd,
author = "Alduino, C. and others",
title = "{Measurement of the two-neutrino double-beta decay half-life
of $^{130}$Te with the CUORE-0 experiment}",
collaboration = "CUORE",
journal = "Eur. Phys. J.",
volume = "C77",
year = "2017",
number = "1",
pages = "13",
doi = "10.1140/epjc/s10052-016-4498-6",
eprint = "1609.01666",
archivePrefix = "arXiv",
primaryClass = "nucl-ex",
SLACcitation = "%%CITATION = ARXIV:1609.01666;%%"
}
@article{Artusa:2014lgv,
author = "Artusa, D. R. and others",
title = "{Searching for neutrinoless double-beta decay of $^{130}$Te
with CUORE}",
collaboration = "CUORE",
journal = "Adv. High Energy Phys.",
volume = "2015",
year = "2015",
pages = "879871",
doi = "10.1155/2015/879871",
eprint = "1402.6072",
archivePrefix = "arXiv",
primaryClass = "physics.ins-det",
SLACcitation = "%%CITATION = ARXIV:1402.6072;%%"
}
@inproceedings{Adams:2018nek,
author = "Adams, D. Q. and others",
title = "{Update on the recent progress of the CUORE experiment}",
booktitle = "{28th International Conference on Neutrino Physics and
Astrophysics (Neutrino 2018) Heidelberg, Germany, June
4-9, 2018}",
collaboration = "CUORE",
url = "https://doi.org/10.5281/zenodo.1286904",
year = "2018",
eprint = "1808.10342",
archivePrefix = "arXiv",
primaryClass = "nucl-ex",
SLACcitation = "%%CITATION = ARXIV:1808.10342;%%"
}
\documentclass[a4paper]{jpconf}
\usepackage{graphicx}
\bibliographystyle{iopart-num}
%\usepackage{citesort}
\begin{document}
\title{CUORE experiment}
\author{CUORE collaboration}
%\address{}
\ead{cuore-spokesperson@lngs.infn.it}
\begin{abstract}
CUORE is a ton-scale bolometric experiment searching for neutrinoless double beta decay in $^{130}$Te.
The detector started taking data in April 2017 at the Laboratori Nazionali del Gran Sasso of INFN, in Italy.
The projected CUORE sensitivity to the neutrinoless double beta decay half-life of $^{130}$Te is 9$\times$10$^{25}\,$y after five years of live time.
In 2018 the CUORE computing and storage resources at CNAF were used for the data processing and for the production of the Monte Carlo simulations used for a preliminary measurement of the 2$\nu$ double-beta decay of $^{130}$Te.
\end{abstract}
\section{The experiment}
The main goal of the CUORE experiment~\cite{Artusa:2014lgv} is to search for Majorana neutrinos through the neutrinoless double beta decay (0$\nu$DBD): $(A,Z) \rightarrow (A, Z+2) + 2e^-$.
The 0$\nu$DBD has never been observed, and its half-life is expected to be longer than 10$^{25}$\,y.
CUORE searches for 0$\nu$DBD in a particular isotope of Tellurium ($^{130}$Te), using thermal detectors (bolometers). A thermal detector is a sensitive calorimeter which measures the
energy deposited by a single interacting particle through the temperature rise induced in the calorimeter itself.
This is accomplished by using suitable materials for the detector (dielectric crystals) and by running it at very low temperatures (in the 10 mK range) in a dilution refrigerator. In such conditions a small energy release in the crystal results in a measurable temperature rise. The temperature change is measured by means of a dedicated thermal sensor, an NTD germanium thermistor glued onto the crystal.
The bolometers act at the same time as source and detector of the sought signal.
The CUORE detector is an array of 988 TeO$_2$ crystals operated as bolometers, for a total TeO$_2$ mass of 741$\,$kg.
The tellurium used for the crystals has natural isotopic abundances ($\sim$\,34.2\% of $^{130}$Te), thus the CUORE crystals contain overall 206$\,$kg of $^{130}$Te.
The bolometers are arranged in 19 towers; each tower is composed of 13 floors of 4 bolometers.
A single bolometer is a cubic TeO$_2$ crystal with a 5$\,$cm side and a mass of 0.75$\,$kg.
CUORE will reach a sensitivity to the $^{130}$Te 0$\nu$DBD half-life of $9\times10^{25}$\,y.
The cool down of the CUORE detector was completed in January 2017, and after a few weeks of pre-operation and optimization, the experiment started taking physics data in April 2017.
The first CUORE results were released in summer 2017 and were followed by a second data release with an extended exposure in autumn 2017~\cite{Alduino:2017ehq}.
The same data release was used in 2018 to produce a preliminary measurement of the 2-neutrino double-beta decay~\cite{Adams:2018nek}.
In 2018 CUORE acquired less than two months' worth of physics data, due to cryogenic problems that required a long stop of the data taking.
\section{CUORE computing model and the role of CNAF}
The CUORE raw data consist of ROOT files containing the continuous data stream of $\sim$1000 channels recorded by the DAQ at a sampling frequency of 1 kHz. Triggers are implemented in software, and the triggered events are saved in a custom format based on the ROOT data analysis framework.
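
As a minimal illustration of how such a software trigger can operate on the continuous stream, the following Python sketch fires whenever the sample-to-sample derivative of a channel exceeds a fixed threshold; the noise level, threshold and dead window below are hypothetical and do not reproduce the actual CUORE trigger implementation.
\begin{verbatim}
import numpy as np

def derivative_trigger(samples, threshold, window):
    # Fire where the sample-to-sample derivative exceeds the
    # threshold, keeping triggers at least `window` samples apart.
    triggers = []
    for idx in np.flatnonzero(np.diff(samples) > threshold):
        if not triggers or idx - triggers[-1] >= window:
            triggers.append(int(idx))
    return triggers

# Hypothetical 10 s of a 1 kHz channel: baseline noise plus one
# injected pulse-like rise starting at sample 5000.
rng = np.random.default_rng(0)
stream = rng.normal(0.0, 0.05, 10_000)
stream[5_000:5_010] += np.linspace(0.0, 5.0, 10)
print(derivative_trigger(stream, threshold=0.4, window=1_000))
\end{verbatim}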
The non-event-based information is stored in a PostgreSQL database that is also accessed by the offline data analysis software.
The data taking is organized in runs, each run lasting about one day.
Raw data are transferred from the DAQ computers to the permanent storage area at the end of each run.
CUORE produces about 20$\,$TB of raw data per year.
A full copy of the data is maintained at CNAF and also preserved on tape.
The main instance of the CUORE database is located on a computing cluster at the Laboratori Nazionali del Gran Sasso and a replica is synchronized at CNAF.
The full analysis framework is operational at CNAF and is kept up to date with the official CUORE software releases.
The CUORE data analysis flow consists of two steps.
In the first level analysis the event-based quantities are evaluated, while in the second level analysis the energy spectra are produced.
The analysis software is organized in sequences.
Each sequence consists of a collection of modules that scan the events in the ROOT files sequentially, evaluate the relevant quantities and store them back in the events.
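
The organization in sequences of modules can be sketched as follows; the module names and the dictionary-based event representation are hypothetical and only mirror the structure described above, not the actual CUORE classes.
\begin{verbatim}
class Module:
    # A module reads an event, evaluates a quantity and
    # stores it back in the event.
    def process(self, event):
        raise NotImplementedError

class AmplitudeEstimator(Module):
    def process(self, event):
        wf = event["waveform"]
        event["amplitude"] = max(wf) - wf[0]

class EnergyCalibrator(Module):
    def __init__(self, kev_per_adc):
        self.kev_per_adc = kev_per_adc
    def process(self, event):
        event["energy_kev"] = event["amplitude"] * self.kev_per_adc

def run_sequence(events, modules):
    for event in events:        # scan the events sequentially
        for module in modules:  # each module adds its quantities
            module.process(event)
    return events

events = [{"waveform": [0.0, 1.0, 4.0, 3.0]}]
print(run_sequence(events,
                   [AmplitudeEstimator(), EnergyCalibrator(0.4)]))
\end{verbatim}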
The analysis flow consists of several fundamental steps: pulse amplitude estimation, detector gain correction, energy calibration, search for events in coincidence among multiple bolometers, and evaluation of the pulse-shape parameters used to select physical events.
The CUORE simulation code is based on the GEANT4 package, of which the 4.9.6 release and the 10.x releases up to 10.03 have been installed.
The goal of this work is to evaluate, given the present knowledge of material contaminations, the background index reachable by the experiment in the region of interest of the energy spectrum (0$\nu$DBD is expected to produce a peak at 2528\,keV).
Depending on the specific efficiency of the simulated radioactive sources (sources located outside the lead shielding have very low efficiency), the Monte Carlo simulations can exploit from 5 to 500 computing nodes, with durations of up to a few weeks.
Recently Monte Carlo simulations of the CUORE calibration sources were also performed at CNAF.
Thanks to these simulations, it was possible to produce calibration sources with an activity specifically optimized for the CUORE needs.
In 2018 the CNAF computing resources were exploited for the production of a preliminary measurement of the 2-neutrino double-beta decay of $^{130}$Te.
In order to obtain this result, which was based on the 2017 data, both the processing of the experimental data and the production of Monte Carlo simulations were required.
In the last two months of the year a data reprocessing campaign was performed with an updated version of the CUORE analysis software.
This reprocessing campaign, which also included the new data acquired in 2018, allowed us to verify the scalability of the CUORE computing model to the amount of data that CUORE will have to process a few years from now.
\section*{References}
\bibliography{cuore}
\end{document}
%% This BibTeX bibliography file was created using BibDesk.
%% http://bibdesk.sourceforge.net/
%% Created for Fabio Bellini at 2018-02-24 11:10:52 +0100
%% Saved with string encoding Unicode (UTF-8)
@article{Azzolini:2018tum,
author = "Azzolini, O. and others",
title = "{CUPID-0: the first array of enriched scintillating
bolometers for $0\nu\beta\beta$ decay investigations}",
collaboration = "CUPID",
journal = "Eur. Phys. J.",
volume = "C78",
year = "2018",
number = "5",
pages = "428",
doi = "10.1140/epjc/s10052-018-5896-8",
eprint = "1802.06562",
archivePrefix = "arXiv",
primaryClass = "physics.ins-det",
SLACcitation = "%%CITATION = ARXIV:1802.06562;%%"
}
@article{Azzolini:2018dyb,
author = "Azzolini, O. and others",
title = "{First Result on the Neutrinoless Double-$\beta$ Decay of
$^{82}Se$ with CUPID-0}",
collaboration = "CUPID-0",
journal = "Phys. Rev. Lett.",
volume = "120",
year = "2018",
number = "23",
pages = "232502",
doi = "10.1103/PhysRevLett.120.232502",
eprint = "1802.07791",
archivePrefix = "arXiv",
primaryClass = "nucl-ex",
SLACcitation = "%%CITATION = ARXIV:1802.07791;%%"
}
@article{Azzolini:2018yye,
author = "Azzolini, O. and others",
title = "{Analysis of cryogenic calorimeters with light and heat
read-out for double beta decay searches}",
journal = "Eur. Phys. J.",
volume = "C78",
year = "2018",
number = "9",
pages = "734",
doi = "10.1140/epjc/s10052-018-6202-5",
eprint = "1806.02826",
archivePrefix = "arXiv",
primaryClass = "physics.ins-det",
SLACcitation = "%%CITATION = ARXIV:1806.02826;%%"
}
@article{Azzolini:2018oph,
author = "Azzolini, O. and others",
title = "{Search of the neutrino-less double beta decay of$^{82}$
Se into the excited states of$^{82}$ Kr with CUPID-0}",
collaboration = "CUPID",
journal = "Eur. Phys. J.",
volume = "C78",
year = "2018",
number = "11",
pages = "888",
doi = "10.1140/epjc/s10052-018-6340-9",
eprint = "1807.00665",
archivePrefix = "arXiv",
primaryClass = "nucl-ex",
SLACcitation = "%%CITATION = ARXIV:1807.00665;%%"
}
@article{DiDomizio:2018ldc,
author = "Di Domizio, S. and others",
title = "{A data acquisition and control system for large mass
bolometer arrays}",
journal = "JINST",
volume = "13",
year = "2018",
number = "12",
pages = "P12003",
doi = "10.1088/1748-0221/13/12/P12003",
eprint = "1807.11446",
archivePrefix = "arXiv",
primaryClass = "physics.ins-det",
SLACcitation = "%%CITATION = ARXIV:1807.11446;%%"
}
@article{Beretta:2019bmm,
author = "Beretta, M. and others",
title = "{Resolution enhancement with light/heat decorrelation in
CUPID-0 bolometric detector}",
year = "2019",
eprint = "1901.10434",
archivePrefix = "arXiv",
primaryClass = "physics.ins-det",
SLACcitation = "%%CITATION = ARXIV:1901.10434;%%"
}
@article{Azzolini:2019nmi,
author = "Azzolini, O. and others",
title = "{Background Model of the CUPID-0 Experiment}",
collaboration = "CUPID",
year = "2019",
eprint = "1904.10397",
archivePrefix = "arXiv",
primaryClass = "nucl-ex",
SLACcitation = "%%CITATION = ARXIV:1904.10397;%%"
}
\documentclass[a4paper]{jpconf}
\usepackage{graphicx}
\bibliographystyle{iopart-num}
%\usepackage{citesort}
\begin{document}
\title{CUPID-0 experiment}
\author{CUPID-0 collaboration}
%\address{}
\ead{stefano.pirro@lngs.infn.it}
\begin{abstract}
With their excellent energy resolution, efficiency, and intrinsic radio-purity, cryogenic calorimeters are prime instruments for the search for neutrino-less double beta decay (0$\nu$DBD).
CUPID-0 is an array of 24 Zn$^{82}$Se scintillating bolometers used to search for 0$\nu$DBD of $^{82}$Se.
It is the first large mass 0$\nu$DBD experiment exploiting a double read-out technique: the heat signal to accurately measure particle energies and the light signal to identify the particle type.
CUPID-0 has been taking data since March 2017 and has obtained several outstanding scientific results.
The CUPID-0 data processing environment configured on the CNAF computing cluster has been used for the analysis of the first period of data taking.
\end{abstract}
\section{The experiment}
Neutrino-less Double Beta Decay (0$\nu$DBD) is a hypothesized nuclear transition in which a nucleus decays emitting only two electrons.
This process cannot be accommodated in the Standard Model, as the absence of emitted neutrinos would violate lepton number conservation.
Among the several experimental approaches proposed for the search of 0$\nu$DBD, cryogenic calorimeters (bolometers) stand out for the possibility of achieving excellent energy resolution ($\sim$0.1\%), efficiency ($\ge$80\%) and intrinsic radio-purity. Moreover, the crystals that are operated as bolometers can be grown starting from most of the 0$\nu$DBD emitters, enabling the test of different nuclei.
The state of the art of the bolometric technique is represented by CUORE, an experiment composed of 988 bolometers for a total mass of 741 kg, presently taking data at the Laboratori Nazionali del Gran Sasso.
The ultimate limit of the CUORE background suppression resides in the presence of $\alpha$-decaying isotopes located in the detector structure.
The CUPID-0 project \cite{Azzolini:2018dyb,Azzolini:2018tum} was conceived to overcome these limits.
The main breakthrough of CUPID-0 is the addition of independent devices to measure the light signals emitted from scintillation in ZnSe bolometers.
The different properties of the light emission of electrons and $\alpha$ particles will enable event-by-event rejection of $\alpha$ interactions, suppressing the overall background in the region of interest for 0$\nu$DBD by at least one order of magnitude.
The detector is composed of 26 ultra-pure ZnSe bolometers of $\sim$500\,g each, enriched at 95\% in $^{82}$Se (the 0$\nu$DBD emitter) and faced by Ge-disk light detectors operated as bolometers.
CUPID-0 is hosted in a dilution refrigerator at the Laboratori Nazionali del Gran Sasso and started data taking in March 2017.
The first scientific run (Phase I) ended in December 2018, collecting a ZnSe exposure of 9.95 kg$\times$y.
These data were used to set new limits on the $^{82}$Se 0$\nu$DBD~\cite{Azzolini:2018dyb,Azzolini:2018oph} and to develop a full background model of the experiment~\cite{Azzolini:2019nmi}.
Phase II will start in June 2019 with an improved detector configuration.
\section{CUPID-0 computing model and the role of CNAF}
The CUPID-0 computing model is similar to the CUORE one, the only differences being the sampling frequency and the working point of the light-detector bolometers.
The full data stream is saved in ROOT files, and a derivative trigger is generated in software with a channel-dependent threshold.
%Raw data are saved in Root files and contain events in correspondence with energy releases occurred in the bolometers.
Each event contains the waveform of the triggering bolometer and those geometrically close to it, plus some ancillary information.
The non-event-based information is stored in a PostgreSQL database that is also accessed by the offline data analysis software.
The data taking is arranged in runs, each run lasting about two days.
Details of the CUPID-0 data acquisition and control system can be found in \cite{DiDomizio:2018ldc}.
Raw data are transferred from the DAQ computers (LNGS) to the permanent storage area (located at CNAF) at the end of each run.
A full copy of data is also preserved on tape.
The data analysis flow consists of two steps; in the first level analysis, the event-based quantities are evaluated, while in the second level analysis the energy spectra are produced.
The analysis software is organized in sequences.
Each sequence consists of a collection of modules that scan the events in the ROOT files sequentially, evaluate some relevant quantities and store them back in the events.
The analysis flow consists of several key steps: pulse amplitude estimation, detector gain correction, energy calibration, and the search for events in coincidence among multiple bolometers.
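
As a sketch of the coincidence step, the snippet below groups events from different bolometers whose trigger times fall within a fixed window; the window length and the event format are illustrative, not the CUPID-0 values.
\begin{verbatim}
def find_coincidences(events, window_s=0.1):
    # Group events whose trigger times fall within `window_s`
    # of the first event of the group; multiplicity > 1 means
    # energy shared among several bolometers.
    events = sorted(events, key=lambda e: e["time"])
    groups, current = [], []
    for ev in events:
        if current and ev["time"] - current[0]["time"] > window_s:
            groups.append(current)
            current = []
        current.append(ev)
    if current:
        groups.append(current)
    return [g for g in groups if len(g) > 1]

evts = [{"ch": 1, "time": 10.00}, {"ch": 7, "time": 10.02},
        {"ch": 3, "time": 55.30}]
print(find_coincidences(evts))
\end{verbatim}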
The new tools developed for CUPID-0 to handle the light signals are introduced in \cite{Azzolini:2018yye,Beretta:2019bmm}.
The main instance of the database was located at CNAF, and the full analysis framework was used to analyze data until November 2017; a web page for offline reconstruction monitoring was also maintained.
Since the flooding of the INFN Tier 1, we have been using the database on our DAQ servers at LNGS.
%During 2017 a more intense usage of the CNAF resources is expected, both in terms of computing resourced and storage space.
\section*{References}
\bibliography{cupid-biblio}
\end{document}
\documentclass[a4paper]{jpconf}
\usepackage{graphicx}
\usepackage{hyperref}
\usepackage{todonotes}
\begin{document}
\title{DAMPE data processing and analysis at CNAF}
\author{G. Ambrosi$^1$, G. Donvito$^5$, D.F.Droz$^6$, M. Duranti$^1$, D. D'Urso$^{2,3,4}$, F. Gargano$^{5,\ast}$, G. Torralba Elipe$^{7,8}$}
\address{$^1$ INFN Sezione di Perugia, Perugia, IT}
\address{$^2$ Universit\`a di Sassari, Sassari, IT}
\address{$^3$ ASDC, Roma, IT}
\address{$^4$ INFN - Laboratori Nazionali del Sud, Catania, IT}
%\address{$^3$ Universit\`a di Perugia, I-06100 Perugia, Italy}
\address{$^5$ INFN Sezione di Bari, Bari, IT}
\address{$^6$ University of Geneva, Gen\`eve, CH}
\address{$^7$ Gran Sasso Science Institute, L'Aquila, IT}
\address{$^8$ INFN - Laboratori Nazionali del Gran Sasso, L'Aquila, IT}
\address{DAMPE experiment \url{http://dpnc.unige.ch/dampe/},
\url{http://dampe.pg.infn.it}}
\ead{* fabio.gargano@ba.infn.it}
\begin{abstract}
DAMPE (DArk Matter Particle Explorer) is one of the five satellite missions in the framework of the Strategic Pioneer Research Program in Space Science of the Chinese Academy of Sciences (CAS). DAMPE was launched on 17 December 2015 at 08:12 Beijing time into a sun-synchronous orbit at an altitude of 500 km. The satellite is equipped with a powerful space telescope for high energy gamma-ray, electron and cosmic-ray detection.
The CNAF computing center hosts the mirror of the DAMPE data outside China and is the main data center for Monte Carlo production. It also supports the user data analysis of the Italian DAMPE Collaboration.
\end{abstract}
\section{Introduction}
\begin{figure}[ht]
\begin{center}
\includegraphics[width=20pc]{dampe_layout_2.jpg}
\end{center}
\caption{\label{fig:dampe_layout} DAMPE telescope scheme: a double layer of the plastic scintillator strip detector (PSD);
the silicon-tungsten tracker-converter (STK) made of 6 tracking double layers; the imaging calorimeter with about 31 radiation lengths thickness, made of 14 layers of Bismuth Germanium Oxide (BGO) bars in a hodoscopic arrangement and finally
the neutron detector (NUD) placed just below the calorimeter.}
\end{figure}
DAMPE is a space telescope for high energy cosmic-ray detection.
In Fig. \ref{fig:dampe_layout} a scheme of the DAMPE telescope is shown. At the top, the plastic scintillator strip detector (PSD) consists of one double layer of scintillating plastic strips, which serves as an anti-coincidence detector and measures the particle charge. It is followed by a silicon-tungsten tracker-converter (STK), which is made of 6 tracking layers. Each tracking layer consists of two layers of single-sided silicon strip detectors measuring the position in the two orthogonal views perpendicular to the pointing direction of the apparatus. Three layers of tungsten plates with a thickness of 1~mm are inserted in front of tracking layers 3, 4 and 5 to promote photon conversion into electron-positron pairs. The STK is followed by an imaging calorimeter of about 31 radiation lengths thickness, made up of 14 layers of Bismuth Germanium Oxide (BGO) bars placed in a hodoscopic arrangement. The total thickness of the BGO and the STK corresponds to about 33 radiation lengths, making it the deepest calorimeter ever used in space. Finally, in order to detect the delayed neutrons resulting from hadron showers and to improve the electron/proton separation power, a neutron detector (NUD) is placed just below the calorimeter. The NUD consists of 16 boron-doped plastic scintillator plates, 1~cm thick and 19.5 $\times$ 19.5 cm$^2$ in area, each read out by a photomultiplier.
The primary scientific goal of DAMPE is to measure electrons and photons with much higher energy resolution and energy reach than achievable with existing space experiments. This will help to identify possible Dark Matter signatures, but may also advance our understanding of the origin and propagation mechanisms of high energy cosmic rays and possibly lead to new discoveries in high energy gamma-ray astronomy.
DAMPE was designed to have unprecedented sensitivity and energy reach for electrons, photons and heavier cosmic rays (protons and heavy ions). For electrons and photons, the detection range is 2 GeV-10 TeV, with an energy resolution of about 1.5\% at 100 GeV. For protons and heavy ions, the detection range is 100 GeV-100 TeV, with an energy resolution better than 40\% at 800 GeV. The geometrical factor is about 0.3 m$^2$ sr for electrons and photons, and about 0.2 m$^2$ sr for heavier cosmic rays. The angular resolution is 0.1$^{\circ}$ at 100 GeV.
\section{DAMPE Computing Model and Computing Facilities}
As a Chinese satellite, DAMPE data are collected via the Chinese space communication system and transmitted to the China National Space Administration (CNSA) center in Beijing. From Beijing data are then transmitted to the Purple Mountain Observatory (PMO) in Nanjing, where they are processed and reconstructed.
On the European side, the DAMPE collaboration consists of research groups from INFN and the Universities of Perugia, Lecce and Bari, and from the Department of Particle and Nuclear Physics (DPNC) of the University of Geneva in Switzerland.
\subsection{Data production}
PMO is the deputed center for DAMPE data production. Data are collected 4 times per day, each time the DAMPE satellite passes over the Chinese ground stations (almost every 6 hours). Once transferred to PMO, the binary data downloaded from the satellite are processed to produce a stream of raw data in ROOT \cite{root} format ({\it 1B} data stream, $\sim$7 GB/day), and a second stream that includes the orbital and slow control information ({\it 1F} data stream, $\sim$7 GB/day). The {\it 1B} and {\it 1F} streams are used to derive calibration files for the different subdetectors ($\sim$400 MB/day). Finally, data are reconstructed using the DAMPE official reconstruction code, and the so-called {\it 2A} data stream (ROOT files, $\sim$85 GB/day) is produced. The total data volume produced per day is $\sim$100 GB.
Data processing and reconstruction activities are currently supported by a computing farm consisting of more than 1400 computing cores, able to reprocess three years of DAMPE data in one month.
\subsection{Monte Carlo Production}
The analysis of DAMPE data requires large amounts of Monte Carlo simulation to fully understand the detector capabilities, measurement limits and systematics. In order to facilitate easy work-flow handling and management, and to enable effective monitoring of a large number of batch jobs in various states, a NoSQL meta-data database based on MongoDB \cite{mongo} was developed, with a prototype currently running at the Physics Department of Geneva University. Database access is provided through a web frontend and command-line tools based on the Flask web toolkit \cite{flask}, with a client backend of cron scripts that run on the selected computing farm.
The design and completion of this work-flow system were heavily influenced by the implementation of the Fermi-LAT data processing pipeline \cite{latpipeline}
and the DIRAC computing framework \cite{dirac}.
Once submitted, each batch job continuously reports its status to the database through outgoing HTTP requests.
To that end, the computing nodes must have outgoing connectivity enabled. Each batch job implements a work-flow in which input and output data transfers are performed (and their return codes reported), as well as the actual running of the job payload (which is defined in the metadata description of the job). Dependencies between productions are implemented at the framework level, and jobs are only submitted once their dependencies are satisfied.
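
The reporting mechanism can be sketched as follows; the endpoint URL and the payload schema are hypothetical, as the actual interface of the DAMPE work-flow database is not reproduced here.
\begin{verbatim}
import requests

# Hypothetical endpoint of the Flask-based web frontend.
ENDPOINT = "https://dampe-workflow.example.org/api/jobs/status"

def report_status(job_id, status, return_code=None):
    # Outgoing HTTP request carrying the current job state,
    # as each batch job issues while it runs.
    payload = {"job_id": job_id, "status": status, "rc": return_code}
    r = requests.post(ENDPOINT, json=payload, timeout=10)
    r.raise_for_status()

# Called at the transitions of the job work-flow, e.g.:
# report_status("mc-job-1234", "RUNNING")
# report_status("mc-job-1234", "DONE", return_code=0)
\end{verbatim}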
Once the MC data are generated, a secondary job is initiated which performs their digitization and reconstruction in bulk with a given release. This process is set up via a cron job at DPNC and occupies up to 200 slots in a computing queue with a 6-hour limit.
\subsection{Data availability}
DAMPE data are available to the Chinese Collaboration through the PMO institute, while they are made accessible to the European Collaboration by transferring them from PMO to CNAF, and from there to the DPNC.
Every time new {\it 1B}, {\it 1F} or {\it 2A} data files are available at PMO, they are copied, using the GridFTP \cite{gridftp} protocol,
into the DAMPE storage area at CNAF. From CNAF, a copy of each stream
to the Geneva computing farm via rsync is triggered every 4 hours. Dedicated batch jobs are submitted once per day to asynchronously verify the checksums of the data newly transferred from PMO to CNAF and from CNAF to Geneva.
Data verification and copy processes are managed through a dedicated User Interface (UI), \texttt{ui-dampe}.
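
A minimal sketch of such an asynchronous verification, assuming a plain \texttt{<checksum> <path>} manifest produced at the source site (the actual DAMPE tooling may differ), is the following.
\begin{verbatim}
import hashlib
from pathlib import Path

def md5sum(path, chunk=1 << 20):
    # Compute the MD5 checksum of a file, reading it in chunks.
    h = hashlib.md5()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk), b""):
            h.update(block)
    return h.hexdigest()

def verify(manifest):
    # Compare local files against "<md5>  <relative path>" lines,
    # e.g. as produced by the md5sum command at the source site.
    bad = []
    for line in Path(manifest).read_text().splitlines():
        expected, name = line.split(maxsplit=1)
        if md5sum(name) != expected:
            bad.append(name)
    return bad
\end{verbatim}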
The connection to China passes through the Orientplus \cite{orientplus} link of the G\'{e}ant Consortium \cite{geant}. The data transfer rate is currently limited by the connection of the PMO to the China Education and Research Network (CERNET), which has a maximum bandwidth of 100 Mb/s, so the PMO-CNAF copy process is used only for the daily data production.
To transfer data towards Europe in case of DAMPE data re-processing, and to share in China the Monte Carlo data generated in Europe,
a dedicated DAMPE server has been installed at the Institute of High Energy Physics (IHEP) in Beijing, which is connected to CERNET with a 1 Gb/s bandwidth. Data synchronization between this server and PMO is performed via manual hard-drive exchange.
To simplify user data access across Europe, an XRootD federation has been implemented: an XRootD redirector has been set up in Bari, with end-point XRootD server installations (providing the actual data) at CNAF, Bari and Geneva. These end-points provide unified read access for users in Europe.
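
From the user side, reading a file through the federation reduces to opening a redirector URL, as in the following sketch; the redirector host and the file path are hypothetical, and a ROOT installation with PyROOT is assumed.
\begin{verbatim}
import ROOT  # PyROOT, shipped with ROOT

# Hypothetical redirector host and file path.
url = "root://xrootd-redirector.example.org//dampe/2A/run123.root"

# TFile::Open contacts the redirector, which points the client
# to the end-point server (CNAF, Bari or Geneva) holding the file.
f = ROOT.TFile.Open(url)
if f and not f.IsZombie():
    f.ls()
\end{verbatim}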
\section{CNAF contribution}
The CNAF computing center is the mirror of DAMPE data outside China and the main data center for Monte Carlo production.\\
In 2018, a dedicated user interface, 300 TB of disk space and 7.8 kHS06 of CPU power were allocated to the DAMPE activities.
\section{Activities in 2018}
DAMPE activities at CNAF in 2018 have been related to data transfer, Monte Carlo production and data analysis.
\subsection{Data transfer}
The daily transfer of data from PMO to CNAF, and from there to Geneva, has been performed throughout the year.
The daily transfer rate has been about 100 GB per day from PMO to CNAF and more than 100 GB per day from CNAF to Geneva.
The step between PMO and CNAF is performed, as described in the previous sections, via the GridFTP protocol.
Two strategies have instead been used for the copy from CNAF to Geneva: via \texttt{rsync} from the UI and via \texttt{rsync} managed by batch jobs.
DAMPE data have been reprocessed three times during the year, and a dedicated copy task was carried out to transfer the new production releases, in addition to the ordinary daily copy.
\subsection{Monte Carlo Production}
\iffalse
\begin{figure}
\begin{center}
\includegraphics[width=30pc]{CNAF_HS06_2017}
\end{center}
\caption{\label{fig:hs06_2017} CPU time consumption, in terms of HS06 (blue solid for daily computation, dashed for the average over the entire year). The red solid line corresponds to the annual pledge and the green dotted line corresponds to the job efficiency computed in a 14-day sliding window.}
\end{figure}
\fi
\begin{figure}[ht]
\begin{center}
\includegraphics[width=35pc]{figure_cnaf.png}
\end{center}
\caption{\label{fig:figure_cnaf} Status of completed simulation production at CNAF.}
\end{figure}
\begin{figure}[ht]
\begin{center}
\includegraphics[width=35pc]{figureCNAF2018.png}
\end{center}
\caption{\label{fig:figure_cnaf_2018} Status of completed simulation production at CNAF in 2018.}
\end{figure}
\iffalse
\begin{figure}[ht]
\begin{center}
\includegraphics[width=35pc]{figure_all.png}
\end{center}
\caption{\label{fig:figure_all} Status of completed simulation production at all DAMPE simulation sites.}
\end{figure}
\fi
As the main data center for Monte Carlo production, CNAF has been strongly involved in the Monte Carlo campaign.
At CNAF almost 300 thousand jobs have been executed, for a total of about 3 billion Monte Carlo events.
The Monte Carlo campaign is still ongoing for different particle species and energy ranges.
In figure \ref{fig:figure_cnaf} the status of completed simulation production at CNAF is shown.
During 2019 we will perform a new full simulation campaign with an improved version of our simulation code: this is crucial for all the forthcoming analyses.
\subsection{Data Analysis}
Most of the analysis in Europe is performed at CNAF, and its role has been crucial for all the DAMPE publications, such as the Nature paper on the direct detection of a break in the TeV cosmic-ray spectrum of electrons and positrons \cite{nature}.
\section{Acknowledgments}
The DAMPE mission was funded by the strategic priority science and technology projects in space science of the Chinese Academy of Sciences and in part by the National Key Program for Research and Development, and the 100 Talents program of the Chinese Academy of Sciences. In Europe, the work is supported by the Italian National Institute for Nuclear Physics (INFN), the Italian University and Research Ministry (MIUR), and the University of Geneva. We extend our gratitude to INFN-T1 for their continued support also beyond providing computing resources.
\section*{References}
\begin{thebibliography}{9}
\bibitem{root} Antcheva I. {\it et al.} 2009 {\it Computer Physics Communications} {\bf 180} 12, 2499 - 2512, \newline https://root.cern.ch/guides/reference-guide.
\bibitem{mongo} https://www.mongodb.org
\bibitem{flask} http://flask.pocoo.org
\bibitem{latpipeline} Dubois R. 2009 {\it ASP Conference Series} {\bf 411} 189
\bibitem{dirac} Tsaregorodtsev A. et al. 2008 {\it Journal of Physics: Conference Series} {\bf 119} 062048
\bibitem{gridftp} Allcock W, Bresnahan J, Kettimuthu R and Link M 2005 The Globus striped GridFTP framework and server {\it ACM/IEEE SC 2005 Conference (SC'05)} p 54, doi:10.1109/SC.2005.72 \newline http://www.globus.org/toolkit/docs/latest-stable/gridftp/
\bibitem{nature} Ambrosi G {\it et al.} 2017 Direct detection of a break in the teraelectronvolt cosmic-ray spectrum of electrons and positrons {\it Nature} {\bf 552} 63--66
\bibitem{orientplus} http://www.orientplus.eu
\bibitem{geant} http://www.geant.org
\bibitem{cernet} http://www.cernet.edu.cn/HomePage/english/index.shtml
\bibitem{asdc} http://www.asdc.asi.it
\end{thebibliography}
\end{document}
\documentclass[a4paper]{jpconf}
\usepackage{graphicx}
\bibliographystyle{iopart-num}
%\usepackage{citesort}
\begin{document}
\title{DarkSide program at CNAF}
\author{S. Bussino, S. M. Mari, S. Sanfilippo}
\address{INFN and Universit\`{a} degli Studi Roma Tre}
\ead{bussino@fis.uniroma3.it; stefanomaria.mari@uniroma3.it; simone.sanfilippo@roma3.infn.it}
\begin{abstract}
DarkSide is a direct dark matter research program based at the underground Laboratori Nazionali del Gran Sasso
(\textit{LNGS}) and it searches for the rare nuclear recoils (possibly) induced by the so-called Weakly
Interacting Massive Particles (\textit{WIMPs}). It is based on a dual-phase Time Projection Chamber filled with liquid
Argon (\textit{LAr-TPC}) from underground sources. The prototype project is a LAr-TPC with a $(46.4\pm0.7)$\,kg
active mass, the DarkSide-50 (\textit{DS-50}) experiment, which is installed inside a 30 t organic liquid scintillator
neutron veto, which is in turn installed at the center of a 1\,kt water Cherenkov veto for the residual flux of cosmic
muons. DS-50 has been taking data since November 2013 with Atmospheric Argon (\textit{AAr}) and, since April 2015, has
been operated with Underground Argon (\textit{UAr}) highly depleted in radioactive ${}^{39}Ar$. The exposure of 1422
kg\,d of AAr has demonstrated that operating DS-50 for three years in a background-free condition is achievable,
thanks to the excellent performance of the pulse shape analysis. The first release of results from an exposure
of 2616 kg\,d of UAr has shown no dark matter candidate events. This is the most sensitive dark matter search performed
with an Argon-based detector, corresponding to a 90\% CL upper limit on the WIMP-nucleon spin-independent cross section
of $2\times10^{-44}\ cm^2$ for a WIMP mass of 100 $GeV/c^2$. DS-50 will be operated until the end of 2019.
Building on the experience of DS-50, the DS-20k project, based on a new LAr-TPC of more than 20 tonnes, has been proposed.
\end{abstract}
\section{The DS-50 experiment}
The existence of dark matter is now established from different gravitational effects, but its nature is still a deep mystery. One possibility, motivated by other considerations in elementary particle physics, is that dark matter consists of new undiscovered elementary particles. A leading candidate explanation, motivated by supersymmetry theory (\textit{SUSY}), is that dark matter is composed of as-yet undiscovered Weakly Interacting Massive Particles (\textit{WIMPs}) formed in the early universe and subsequently gravitationally clustered in association with baryonic matter \cite{Good85}. Evidence for new particles that could constitute WIMP dark matter may come from upcoming experiments at the Large Hadron Collider (\textit{LHC}) at CERN or from sensitive astronomical instruments that detect radiation produced by WIMP-WIMP annihilations in galaxy halos. The thermal motion of the WIMPs comprising the dark matter halo surrounding the galaxy and the Earth should result in WIMP-nuclear collisions of sufficient energy to be observable by sensitive laboratory apparatus. WIMPs could in principle be detected in terrestrial experiments through their collisions with ordinary nuclei, giving observable low-energy ($<$100 keV) nuclear recoils. The predicted low collision rates require ultra-low background detectors with large (0.1-10 ton) target masses, located in deep underground sites to eliminate the neutron background from cosmic-ray muons. The DarkSide program is the first to employ a Liquid Argon Time Projection Chamber (\textit{LAr-TPC}) with low levels of ${}^{39}Ar$, together with innovations in photon detection and background suppression.
The DS-50 detector is installed in Hall C at the Laboratori Nazionali del Gran Sasso (\textit{LNGS}) at a depth of 3800 m.w.e.\footnote{The meter water equivalent (m.w.e.) is a standard measure of cosmic ray attenuation in underground laboratories.}, and it will continue taking data until the end of 2019. The project will continue with DarkSide-20k (\textit{DS-20k}) and \textit{Argo}, a multi-ton detector with an expected sensitivity improvement of two orders of magnitude. The DS-50 target volume is hosted in a dual-phase TPC that contains Argon in both phases, liquid and gaseous, the latter on top of the former. The scattering of WIMPs or background particles in the active volume induces a prompt scintillation light, called S1, and ionization. Electrons which do not recombine are drifted by an electric field of 200 V/cm applied along the z-axis. They are then extracted into the gaseous phase above the extraction grid and accelerated by an electric field of about 4200 V/cm. Here a secondary, larger signal due to electroluminescence takes place, the so-called S2. The light is collected by two arrays of 19 3-inch PMTs on each side of the TPC, corresponding to a 60\% geometrical coverage of the end plates and 20\% of the total TPC surface. The detector is capable of reconstructing the position of the interaction in 3D. The z-coordinate, in particular, is easily computed from the electron drift time, while the time profile of the S2 light collected by the top-plate PMTs allows the reconstruction of the \textit{x} and \textit{y} coordinates. The LAr-TPC can exploit Pulse Shape Discrimination (\textit{PSD}) and the ratio of scintillation to ionization (S1/S2) to reject $\beta/\gamma$ background in favor of the nuclear recoil events expected from WIMP scattering \cite{Ben08, Bou06}.\\ Events due to neutrons from cosmogenic sources and from radioactive contamination in the detector components, which also produce nuclear recoils, are suppressed by the combined action of the neutron and cosmic-ray vetoes. The first one, in particular, is a 4.0 meter-diameter stainless steel sphere filled with 30 t of borated liquid scintillator acting as a Liquid Scintillator Veto (\textit{LSV}). The sphere is lined with \textit{Lumirror} reflecting foils and is equipped with an array of 110 Hamamatsu 8-inch PMTs with low-radioactivity components and high-quantum-efficiency photocathodes. The cosmic-ray veto, on the other hand, is an 11 m-diameter, 10 m-high cylindrical tank filled with high purity water which acts as a Water Cherenkov Detector (\textit{WCD}). The inside surface of the tank is covered with a laminated \textit{Tyvek-polyethylene-Tyvek} reflector and is equipped with an array of 80 ETL 8-inch PMTs with low-radioactivity components and high-quantum-efficiency photocathodes.
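
Two of the quantities discussed above can be illustrated with a short sketch: the depth of the interaction from the drift time, and a prompt-light fraction of the S1 signal of the kind used for pulse shape discrimination. The drift velocity and the waveform below are illustrative numbers, not the calibrated DS-50 values.
\begin{verbatim}
# Illustrative drift velocity for LAr at ~200 V/cm (order of
# magnitude only, not the calibrated DS-50 value), in mm/us.
DRIFT_VELOCITY = 0.93

def z_coordinate(drift_time_us):
    # Depth of the interaction from the S1-S2 time difference.
    return DRIFT_VELOCITY * drift_time_us

def prompt_fraction(s1_hits, prompt_ns=90.0):
    # Fraction of S1 light in the first `prompt_ns` nanoseconds;
    # nuclear recoils give a faster S1 (higher fraction) than
    # beta/gamma events, which is the basis of the PSD.
    total = sum(a for _, a in s1_hits)
    prompt = sum(a for t, a in s1_hits if t < prompt_ns)
    return prompt / total

print(z_coordinate(200.0))                                  # ~186 mm
print(prompt_fraction([(10, 8.0), (50, 4.0), (400, 3.0)]))  # 0.8
\end{verbatim}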
The exposure of 1422 kg\,d of AAr has demonstrated that operating DS-50 for three years in a background-free condition is achievable, thanks to the excellent performance of the pulse shape analysis. The first release of results from an exposure of 2616 kg\,d of UAr has shown no dark matter candidate events. This is the most sensitive dark matter search performed with an Argon-based detector, corresponding to a 90\% CL upper limit on the WIMP-nucleon spin-independent cross section of $2\times10^{-44}\ cm^2$ for a WIMP mass of 100 $GeV/c^2$ \cite{Dang16}.
\section{DS-50 at CNAF}
The data readout in the three detector subsystems is managed by dedicated trigger boards: each subsystem is equipped with a user-customizable FPGA unit, in which the trigger logic is implemented. The inputs and outputs from the different trigger modules are processed by a set of electrical-to-optical converters, and the communication between the subsystems uses dedicated optical links. To keep the TPC and the Veto readouts aligned, a pulse per second (\textit{PPS}) generated by a GPS receiver is sent to the two systems, where it is acquired and interpolated with a resolution of 20 ns to allow offline confirmation of event matching.
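
The offline confirmation of event matching can be sketched as a pairing of PPS-derived timestamps within a small tolerance; the tolerance and the timestamps below are illustrative.
\begin{verbatim}
def match_events(tpc_times, veto_times, tolerance_s=1e-7):
    # Pair TPC and veto events whose GPS/PPS-derived timestamps
    # agree within `tolerance_s` (both lists sorted in time).
    pairs, j = [], 0
    for t in tpc_times:
        while j < len(veto_times) and veto_times[j] < t - tolerance_s:
            j += 1
        if j < len(veto_times) and abs(veto_times[j] - t) <= tolerance_s:
            pairs.append((t, veto_times[j]))
    return pairs

print(match_events([1.000000, 2.500000], [1.00000002, 3.0]))
\end{verbatim}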
To acquire data, the DarkSide detector uses a DAQ machine equipped with a storage buffer of 7 TB. Raw data are processed and automatically sent to the CNAF farm via a 10 Gbit optical link (with approximately 7 hours of delay). At CNAF the data are housed on a disk storage system of about 1 PB net capacity, with a part of the data (300 TB) backed up on the tape library. Raw data from CNAF, and processed ones from LNGS, are then semi-automatically copied to the Fermi National Accelerator Laboratory (\textit{FNAL}) via a 100 Gbit optical link. Part of the reconstructed data is sent back to CNAF via the same link at a rate of about 0.5 TB/month (RECO files). Data processed and analyzed at FNAL are compared with the analysis performed at CNAF. The INFN Roma Tre group plays an active role in maintaining and following, step by step, the overall transfer procedure and in arranging the data management.
\section{The future of DarkSide: DS-20k}
Building on the successful experience in operating the DS-50 detector, the DarkSide program will continue with DS-20k, a direct WIMP search detector using a two-phase Liquid Argon Time Projection Chamber (LAr-TPC) with an active (fiducial) mass of 23 t (20 t), which will be built in the next years. The optical sensors will be Silicon Photomultiplier (\textit{SiPM}) matrices with very low radioactivity. The operation of DS-50 demonstrated a major reduction in the dominant ${}^{39}Ar$ background when using Argon extracted from an underground source, even before applying pulse shape analysis. Data from DS-50, in combination with MC simulations and analytical modelling, also show that a rejection factor for discrimination between electron and nuclear recoils greater than $3\times10^9$ is achievable. The expected large rejection factor, along with the use of the veto system and the adoption of silicon photomultipliers in the LAr-TPC, are the keys to unlock the path to large LAr-TPC detector masses, while maintaining an experiment in which fewer than 0.1 events are expected to occur within the WIMP search region during the planned exposure.
Thanks to the measured ultra-low background, DS-20k will have sensitivity to WIMP-nucleon cross sections of
$1.2\times10^{-47}\ cm^2$ and $1.1\times10^{-46}\ cm^2$ for WIMPs respectively of
$1\ TeV/c^2$ and $10\ TeV/c^2$ mass, to be achieved during a 5 yr run producing an exposure of 100 t yr free from any instrumental background.
DS-20k could then extend its operation to a decade, increasing the exposure to 200 t yr, reaching a sensitivity of $7.4\times10^{-48}\ cm^2$ and $6.9\times10^{-47}\ cm^2$ for WIMPs respectively of $1\ TeV/c^2$ and $10\ TeV/c^2$ mass.
DS-20k will be more than two orders of magnitude larger in size compared to DS-50 and will utilize SiPM technologies. Therefore, the collaboration plans to build a prototype detector of intermediate size, called DS-proto, incorporating the new technologies for their full validation. The choice of a mass scale of about 1 t allows a full validation of the technological choices for DS-20k. DS-proto will be built at CERN; data taking is foreseen to start in 2020.
\section{DS-proto at CNAF}
Data from DS-proto will be stored and managed at CNAF. The construction, operation, and commissioning of DS-proto will allow the validation of the major innovative technical features of DS-20k. Data taking will start in 2020. The computing resources have been evaluated according to the data throughput, trigger rate and duty cycle of the experiment. A computing power of about 1 kHS06 and 300 TB of net disk space are needed to fully support the DS-proto data taking and data analysis in 2020. In order to perform the CPU-demanding Monte Carlo production at CNAF, 30 TB net and 2 kHS06 are needed. The DS-proto data taking is foreseen to last a few years, requiring a total disk space of the order of a few PB and a computing capacity of several kHS06.
%However, the goal of DS-20k is a background free exposure of 100 ton-year of liquid Argon which requires further suppression of ${}^{39}Ar$ background with respect to DS-50. The project \textit{URANIA} involves the upgrade of the UAr extraction plant to a massive production rate suitable for multi-ton detectors. The project \textit{ARIA} instead involves the construction of a very tall cryogenic distillation column in the Seruci mine (Sardinia, Italy) with the high-volume capability of chemical and isotopic purification of UAr.\\ The projected sensitivity of DS-20k and Argo reaches a WIMP-nucleon cross section of $10^{-47}\ cm^2$ and $10^{-48}\ cm^2$ respectively, for a WIMP mass of 100 $GeV/cm^2$, exploring the region of the parameters plane down to the irreducible background due to atmospheric neutrinos.
\section*{References}
\begin{thebibliography}{9}
\bibitem{Good85} M.~W.~Goodman, E.~Witten, Phys. Rev. D {\bf 31} 3059 (1985);
\bibitem{Loo83} H.~H.~Loosli, Earth Plan. Sci. Lett. {\bf 63} 51 (1983);
\bibitem{Ben07} P.~Benetti et al. (WARP Collaboration), Nucl. Inst. Meth. A {\bf 574} 83 (2007);
\bibitem{Ben08} P.~Benetti et al. (WARP Collaboration), Astropart. Phys. {\bf 28} 495 (2008);
\bibitem{Bou06} M.~G.~Boulay, A.~Hime, Astropart. Phys. {\bf 25} 179 (2006);
\bibitem{Dang16} D.~D'Angelo et al. (DARKSIDE Collaboration), Il nuovo cimento C {\bf 39} 312 (2016).
\end{thebibliography}
\end{document}
\ No newline at end of file
File added
% Encoding: UTF-8
@article{ex1,
author = "A. Cisneros",
journal = "Astrophys. Space Sci.",
volume = 10,
pages = 87,
year = 1971
}
@article{ex2,
author = "S. Carlip and R. Vera",
journal = "Phys. Rev. D",
section = "D",
volume = 58,
pages = 011345,
year = 1998
}
@article{ex3,
author = "K. Davies and G. Brown",
journal = "J. High Energy Phys.",
pages = "JHEP12(1997)002",
year = 1997
}
@article{ex4,
author = "D. Neilson and M. Choptuik",
journal = "Class. Quantum Grav.",
volume = 17,
pages = 761,
year = 2000,
eprint = "gr-qc/9812053"
}
@unpublished{ex5,
author = "M. Harrison",
title = "Dipheomorphism-invariant manifolds",
year = 1999,
eprint = "hep-th/9909196"
}
@inbook{ex6,
author = "L. I. Dorman",
title = "Variations of Galactic Cosmic Rays",
publisher = "Moscow State University Press",
address = "Moscow",
year = 1975,
pages = 103
}
@inbook{ex7,
author = "R. Caplar and P. Kulisic",
title = "Proc. Int. Conf. on Nuclear Physics (Munich)",
publisher = "North-Holland/American Elsevier",
address = "Amsterdam",
year = 1973,
volume = 1,
pages = 517
}
@incollection{ex8,
author = "M. Morse",
title = "Supersonic beam sources",
booktitle = "Atomic Molecular and Optical Physics",
editor = "F. B. Dunning and R. Hulet",
series = "Experimental Methods in the Physical Sciences",
volume = 29,
publisher = "Academic",
address = "San Diego",
year = 1996
}
@article{bardeen1957:bcs,
author = "J. Bardeen and L. N. Cooper and J. R. Schrieffer",
journal = "Phys. Rev.",
volume = 108,
number = 5,
pages = 1175,
year = 1957
}
@article{caprio2005:coherent,
author = "M. A. Caprio",
journal = "J. Phys. A",
section = "A",
volume = 38,
number = 28,
pages = 6385,
year = 2005
}
@article{zamfir2005:132te-beta-enam04,
author = "N. V. Zamfir and others",
journal = "Eur. Phys. J. A",
section = "A",
volume = 25,
number = "s01",
issue = "s01",
year = 2005,
pages = 389
}
@book{rose1957:am,
author = "M. E. Rose",
title = "Elementary Theory of Angular Momentum",
publisher = "Wiley",
address = "New York",
year = 1957,
}
@book{dirac1958:qm,
author = "P. A. M. Dirac",
title = "The Principles of Quantum Mechanics",
series = "The International Series of Monographs on Physics",
number = 27,
edition = 4,
publisher = "Clarendon Press",
address = "Oxford",
year = 1967
}
@book{siegbahn1965:v1,
editor = "K. Siegbahn",
title = "Alpha-, Beta-, and Gamma-Ray Spectroscopy",
booktitle = "Alpha-, Beta-, and Gamma-Ray Spectroscopy",
publisher = "North-Holland",
address = "Amsterdam",
year = 1965,
volume = 1
}
@incollection{bell1965:coin-lifetime,
author = "R. E. Bell",
title = "Coincidence Techniques and the Measurement of Short Mean Lives",
editor = "K. Siegbahn",
booktitle = "Alpha-, Beta-, and Gamma-Ray Spectroscopy",
publisher = "North-Holland",
address = "Amsterdam",
year = 1965,
volume = 2,
pages = 905
}
@phdthesis{caprio2003:diss,
author = "M. A. Caprio",
school = "Yale University",
year = 2003,
eprint = "nucl-ex/0502004",
archive = "arXiv"
}
@misc{doePC,
author = "J. Doe",
year = 2006,
note = "private communication"
}
@Article{wenaus,
author = {T. Wenaus},
title = {The {HEP} {S}oftware and {C}omputing {K}nowledge {B}ase},
journal = {J. Phys.: Conf. Ser.},
year = {2017},
volume = {898},
number = {102018},
pages = {1-6},
doi = {10.1088/1742-6596/898/10/102018},
}
@Article{Briand1995,
author = {L. C. Briand and V. R. Basili and C. J. Hetmanski},
title = {Developing Interpretable Models with Optimized Set Reduction for Identifying High-Risk Software Components},
journal = {IEEE Trans. Softw. Eng.},
year = {1995},
volume = {19},
number = {11},
pages = {1028--1044},
month = nov,
__markedentry = {[marco:6]},
}
@Article{Emam2001,
author = {K. El Emam and S. Benlarbi and N. Goel and S. N. Rai},
title = {Comparing case-based reasoning classifiers for predicting high risk software components},
journal = {The Journal of Systems and Software},
year = {2001},
volume = {55},
number = {3},
pages = {301--320},
__markedentry = {[marco:6]},
}
@Article{Ganesan2000,
author = {K. Ganesan and T. M. Khoshgoftaar and E. B. Allen},
title = {Case-Based Software Quality Prediction},
journal = {INT J SOFTW ENG KNOW},
year = {2000},
volume = {10},
number = {2},
pages = {139--152},
__markedentry = {[marco:6]},
}
@Article{Khoshgoftaar1995,
author = {T. M. Khoshgoftaar and A. S. Pandya and D. L. Lanning},
title = {Application of neural networks for predicting program faults},
journal = {Annals of Software Engineering},
year = {1995},
volume = {1},
number = {1},
pages = {141--154},
__markedentry = {[marco:6]},
}
@Article{Lanubile1997,
author = {F. Lanubile and G. Visaggio},
title = {Evaluating predictive quality models derived from software measures: lessons learned},
journal = {J. Syst. Softw.},
year = {1997},
volume = {38},
number = {225--234},
__markedentry = {[marco:6]},
}
@Article{Porter1990,
author = {A. A. Porter and R. W. Selby},
title = {Empirically guided software development using metric-based classification trees},
journal = {IEEE Softw.},
year = {1990},
volume = {7},
number = {2},
pages = {46--54},
__markedentry = {[marco:6]},
}
@Article{Dwivedi2016,
author = {V. K. Dwivedi and M K. Singh},
title = {Software Defect Prediction Using Data Mining Classification Approach},
journal = {Int. J. Tech. Res. Appl.},
year = {2016},
volume = {4},
number = {6},
pages = {31--35},
__markedentry = {[marco:6]},
}
@Article{Almeida1999,
author = {M. A. De Almeida and H. Lounis and W. Melo},
title = {An investigation on the use of machine learned models for estimating software correctability},
journal = {Int. J. Softw. Eng. Knowl. Eng.},
year = {1999},
__markedentry = {[marco:6]},
}
@Article{Suresh2014,
author = {Y. Suresh and L. Kumar and S. Ku Rath},
title = {{Statistical and Machine Learning Methods for Software Fault Prediction Using CK Metric Suite: A Comparative Study}},
journal = {ISRN Software Engineering},
year = {2014},
volume = {2014},
number = {251083},
pages = {15},
__markedentry = {[marco:6]},
url = {http://dx.doi.org/10.1155/2014/251083},
}
@Article{Arisholm2010,
author = {E. Arisholm and L. C. Briand and E. B. Johannessen},
title = {A systematic and comprehensive investigation of methods to build and evaluate fault prediction models},
journal = {J. Syst. Softw.},
year = {2010},
volume = {83},
number = {1},
pages = {2--17},
__markedentry = {[marco:6]},
}
@Article{Gondra2008,
author = {I. Gondra},
title = {Applying machine learning to software fault-proneness prediction},
journal = {Journal of Systems and Software},
year = {2008},
volume = {81},
number = {1},
pages = {186--195},
__markedentry = {[marco:6]},
}
@Article{Malhotra2015,
author = {R. Malhotra},
title = {A systematic review of machine learning techniques for software fault prediction},
journal = {Appl. Soft. Comput.},
year = {2015},
volume = {27},
pages = {504--518},
__markedentry = {[marco:6]},
}
@Article{Whiteson2009,
author = {S. Whiteson and D. Whiteson},
title = {Machine learning for event selection in high energy physics},
journal = {Engineering Applications of Artificial Intelligence},
year = {2009},
volume = {22},
number = {8},
pages = {1203-1217},
month = dec,
__markedentry = {[marco:6]},
}
@Article{Denby1988,
author = {B. H. Denby},
title = {Neural networks and cellular automata in experimental high energy physics},
journal = {Comput. Phys. Commun.},
year = {1988},
volume = {49},
number = {3},
pages = {429--448},
month = jun,
__markedentry = {[marco:6]},
}
@Article{Peterson1994,
author = {C. Peterson and T. Rognvaldsson and L. Lonnbladb},
title = {JETNET 3.0—A versatile artificial neural network package},
journal = {Comput. Phys. Commun.},
year = {1994},
volume = {81},
number = {1--2},
pages = {185--220},
month = jun,
__markedentry = {[marco:6]},
}
@Article{Baldi2016,
author = {P. Baldi and K. Cranmer and T. Faucett and P. Sadowski and D. Whiteson},
title = {Parameterized neural networks for high-energy physics},
journal = {Eur. Phys. J. C.},
year = {2016},
volume = {76},
number = {235},
__markedentry = {[marco:6]},
doi = {https://doi.org/10.1140/epjc/s10052-016-4099-4},
}
@Article{Kolanoski1995,
author = {H. Kolanoski},
title = {Application of artificial neural networks in particle physics},
journal = {Nucl. Instrum. Methods Phys. Res. A},
year = {1995},
volume = {367},
number = {1--3},
pages = {14--20},
month = dec,
__markedentry = {[marco:6]},
}
@Article{Collaboration2014,
author = {The ALTAS Collaboration},
title = {A neural network clustering algorithm for the ATLAS silicon pixel detector},
journal = {J. Instrum.},
year = {2014},
volume = {9},
month = sep,
__markedentry = {[marco:6]},
}
@Article{Denby1990,
author = {B. H. Denby and M. Campbell and F. Bedeschi and N. Chriss and C. Bowers and F. Nesti},
title = {Neural Betworks for Triggering},
journal = {IEEE Trans. Nucl. Sci.},
year = {1990},
volume = {37},
number = {2},
pages = {248--254},
month = apr,
__markedentry = {[marco:6]},
}
@Article{McCabe1976,
author = {T. McCabe},
title = {A Complexity Measure},
journal = {IEEE Trans. Softw. Eng.},
year = {1976},
volume = {Se-2},
number = {4},
pages = {308--320},
month = dec,
__markedentry = {[marco:6]},
}
@Book{Halstead1977,
title = {Elements of Software Science (Operating and programming systems series)},
publisher = {Elsevier Science Inc.},
year = {1977},
author = {M. H. Halstead},
address = {New York, NY, USA},
__markedentry = {[marco:6]},
}
@Article{Chidamber1994,
author = {S. R. Chidamber and C. F. Kemerer},
title = {Metrics suite for Object Oriented Design},
journal = {IEEE Trans. Softw. Eng.},
year = {1994},
volume = {20},
number = {6},
pages = {476--493},
month = jun,
__markedentry = {[marco:6]},
}
@Article{Azar2009,
author = {D. Azar and H. Harmanani and R. Korkmaz},
title = {A hubrid heuristic approach to optimize rule-based software quality estimation models},
journal = {Inf. Softw. Technol.},
year = {2009},
volume = {51},
number = {1},
pages = {1365--1376},
month = jun,
__markedentry = {[marco:6]},
}
@Article{Xie2009,
author = {T. Xie and S. Thummalapenta and D. Lo and C. Liu},
title = {Data mining for software engineering},
journal = {IEEE Computer},
year = {2009},
volume = {42},
number = {1},
pages = {55--62},
__markedentry = {[marco:6]},
}
@Book{Hofmann2013,
title = {RapidMiner: Data Mining Use Cases and Business Analytics Applications},
publisher = {CRC Press},
year = {2013},
author = {M. Hofmann and R. Klinkenberg},
address = {Boca Raton},
__markedentry = {[marco:6]},
}
@Book{Zhao2012,
title = {R and Data Mining},
publisher = {Avademic Press},
year = {2012},
author = {Y. Zhao},
address = {San Diego},
__markedentry = {[marco:6]},
}
@Article{Hall2009,
author = {M. Hall and E. Frank and G. Holmes and B. Pfahringer and P. Reutemann and I. H. Witten},
title = {The WEKA data mining software: an update},
journal = {SIGKDD Explorations},
year = {2009},
volume = {11},
number = {1},
pages = {10--18},
__markedentry = {[marco:6]},
}
@Article{Pedregosa2011,
author = {F. Pedregosa and G. Varoquaux and A. Gramfort and V. Michel and B. Thirion and O. Grisel et al.},
title = {Scikit-learn: Machine Learning in Python},
journal = {J. Mach. Learn. Res.},
year = {2011},
volume = {12},
number = {1},
pages = {2825--2830},
__markedentry = {[marco:6]},
}
@Book{Fenton2014,
title = {Software {M}etrics: {A} {R}igorous and {P}ractical {A}pproach, Third Edition},
publisher = {CRC Press},
year = {2014},
author = {N. Fenton and J. Bieman},
edition = {third},
month = nov,
}
@Article{McCabe1989,
author = {T. J. McCabe and C.W. Butler},
title = {Design Complexity Measurement and Testing},
journal = {Communications of the ACM},
year = {1989},
volume = {32},
number = {12},
pages = {1415--1425},
month = dec,
}
@InProceedings{Conde2017,
author = {P. P. Conde and I. S. Carrillo},
title = {Comparison of {C}lassifiers {B}ased on {N}eural {N}etworks and {S}upport {V}ector {M}achines},
booktitle = {5th {I}nternational {C}onference in {S}oftware {E}ngineering {R}esearch and {I}nnovation (CONISOFT)},
year = {2017},
month = apr,
publisher = {IEEE},
doi = {10.1109/CONISOFT.2017.00020},
}
@Article{Burges1998,
author = {C. J. C. Burges},
title = {A {T}utorial on {S}upport {V}ector {M}achines for {P}attern {R}ecognition},
journal = {Data Min. Knowl. Discov.},
year = {1998},
volume = {2},
number = {2},
pages = {121--167},
month = jun,
}
@Book{Haykin1999,
title = {Neural {N}etworks: {A} {C}omprehensive {F}oundation},
year = {1999},
author = {S. Haykin},
publisher = {Prentice Hall International},
}
@InProceedings{Vlahovic2016,
author = {N. Vlahovic},
title = {An {E}valuation {F}ramework and a {B}rief {S}urvey of {D}ecision {T}ree {T}ools},
booktitle = {39th {I}nternational {C}onference on {I}nformation and {C}ommunication {T}echnology, {E}lectronics and {M}icroelectronics},
year = {2016},
pages = {1299--1304},
month = jun,
}
@InProceedings{Liu2016,
author = {J. Liu and Z. Tian and P. Liu and J. Jiang and Z. Li},
title = {An {A}pproach of {S}emantic {W}eb {S}ervice {C}lassification {B}ased on {N}aive {B}ayes},
booktitle = {IEEE International Conference on Services Computing},
year = {2016},
pages = {356--362},
publisher = {IEEE},
doi = {10.1109/SCC.2016.53},
}
@Book{Han2006,
title = {Data Mining Concepts and Techniques},
publisher = {Morgan Kaufmann},
year = {2006},
author = {J. Han and M. Kamber},
}
@InProceedings{Santos2014,
author = {C. N. dos Santos and M. Gatti},
title = {Deep {C}onvolutional {N}eural {N}etworks for {S}entiment {A}nalysis of {S}hort {T}exts},
booktitle = {25th International Conference on Computational Linguistics (COLING)},
year = {2014},
pages = {69--78},
month = aug,
}
@InProceedings{Kim2014,
author = {Y. Kim},
title = {Convolutional {N}eural {N}etworks for {S}entence {C}lassification},
booktitle = {{C}onference on {E}mpirical {M}ethods in {N}atural {L}anguage {P}rocessing ({EMNLP})},
year = {2014},
pages = {1746--1751},
month = oct,
}
@Article{Polikar2006,
author = {R. Polikar},
title = {Ensemble {B}ased {S}ystems in {D}ecision {M}aking},
journal = {{IEEE} {C}ircuits and {S}ystems {M}agazine},
year = {2006},
volume = {6},
number = {3},
pages = {21--45},
month = sep,
doi = {10.1109/MCAS.2006.1688199},
}
@InProceedings{Wong2010,
author = {S. Wong and M. Aaron and J. Segall and K. Lynch and S. Mancoridis},
title = {Reverse {E}ngineering {U}tility {F}unctions using {G}enetic {P}rogramming to {D}etect {A}nomalous {B}ehavior in {S}oftware},
booktitle = {17th {W}orking {C}onference on {R}everse {E}ngineering},
year = {2010},
pages = {141--149},
publisher = {IEEE Computer Society},
doi = {10.1109/WCRE.2010.23},
}
@InProceedings{Bartsch-Spoerl1999,
author = {B. Bartsch-Sp\"{o}rl and M. Lenz and A. H\"{u}bner},
title = {Case-Based Reasoning - Survey and Future Directions},
booktitle = {Lecture Notes in Artificial Intelligence},
year = {1999},
editor = {F. Puppe},
pages = {67--89},
month = mar,
publisher = {Springer},
}
@Article{Ali2013,
author = {N. S. Ali and V.P. Pawar},
title = {The use of data {M}ining {T}echniques for {I}mproving {S}oftware {R}eliability},
journal = {International {J}ournal of {A}dvanced {R}esearch in {C}omputer {S}cience},
year = {2013},
volume = {4},
number = {4},
pages = {172--178},
month = mar,
}
@InProceedings{Conde2013,
author = {P. P. Conde and J. de la Calleja and A. Benitez and Ma. A. Medina},
title = {Image-based classification of diabetic retinopathy using machine learning},
booktitle = {12th {I}nternational {C}onference on {I}ntelligent {S}ystems {D}esign and {A}pplications (ISDA)},
year = {2013},
pages = {826--830},
publisher = {IEEE},
}
@Article{Shepperd2013,
author = {M. Shepperd and Q. Song and Z. Sun and C. Mair},
title = {Data {Q}uality: {S}ome {C}omments on the {NASA} {S}oftware {D}efect {D}atasets},
journal = {IEEE Trans. Softw. Eng.},
year = {2013},
volume = {39},
number = {9},
pages = {1208--1215},
month = sep,
}
@Article{Aleem2015,
author = {S. Aleem and L. F. Capretz and F. Ahmed},
title = {{Benchmarking Machine Learning Techniques for Software Defect Detection}},
journal = {International Journal of Software Engineering \& Applications (IJSEA)},
year = {2015},
volume = {6},
number = {3},
pages = {11--23},
month = may,
}
@InProceedings{Gray2011,
author = {D. Gray and D. Bowes and N. Davey},
title = {The misuse of the {NASA} metrics data program data sets for automated software defect prediction},
booktitle = {15th {A}nnual {C}onference on {E}valuation \& {A}ssessment in {S}oftware {E}ngineering ({EASE})},
year = {2011},
publisher = {IET},
doi = {10.1049/ic.2011.0012},
}
@Article{kaur,
author = {A. Kaur and I. Kaur},
title = {An empirical evaluation of classification algorithms for fault prediction in open source projects},
journal = {Journal of {K}ing {S}aud {U}niversity - {C}omputer and {I}nformation {S}ciences},
year = {2016},
volume = {30},
pages = {2--17},
month = apr,
}
@Article{taylor,
author = {Q. Taylor and C. Giraud-Carrier},
title = {Applications of data mining in software engineering},
journal = {International Journal of Data Analysis Techniques and Strategies},
year = {2010},
volume = {2},
number = {3},
pages = {243--257},
month = jul,
doi = {10.1504/IJDATS.2010.034058},
}
@Article{bengio,
author = {Y. Bengio},
title = {Learning {D}eep {A}rchitectures for {AI}},
journal = {Foundations and Trends in Machine Learning},
year = {2009},
volume = {2},
number = {1},
pages = {1--127},
doi = {10.1561/2200000006},
}
@Article{lecun,
author = {Y. LeCun and Y. Bengio and G. Hinton},
title = {Deep {L}earning},
journal = {{N}ature},
year = {2015},
volume = {521},
pages = {436--444},
month = may,
}
@InProceedings{deshmukh,
author = {J. Deshmukh and K. M. Annervaz and S. Podder},
title = {Towards {A}ccurate {D}uplicate {B}ug {R}etrieval {U}sing {D}eep {L}earning {T}echniques},
booktitle = {{IEEE} {I}nternational {C}onference on {S}oftware {M}aintenance and {E}volution ({ICSME})},
year = {2017},
pages = {115--124},
publisher = {IEEE},
doi = {10.1109/ICSME.2017.69},
}
@Article{zhang,
author = {Z. Zhang and X. Jing and T. Wang},
title = {Label propagation based semi-supervised learning for software defect prediction},
journal = {Autom. Softw. Eng.},
year = {2017},
volume = {24},
number = {1},
pages = {47--69},
month = mar,
}
@Article{song,
author = {Q. Song and Z. Jia and M. Shepperd and S. Ying and J. Liu},
title = {A {G}eneral {S}oftware {D}efect-{P}roneness {P}rediction {F}ramework},
journal = {IEEE Trans. Softw. Eng.},
year = {2011},
volume = {37},
number = {3},
pages = {356--370},
month = may,
}
@Article{zhou,
author = {Y. Zhou and H. Leung},
title = {Empirical {A}nalysis of {O}bject-{O}riented {D}esign {M}etrics for {P}redicting {H}igh and {L}ow {S}everity {F}aults},
journal = {IEEE Trans. Softw. Eng.},
year = {2006},
volume = {32},
number = {10},
pages = {771--789},
month = oct,
}
@InProceedings{mccallum,
author = {A. McCallum and K. Nigam},
title = {A comparison of {E}vent {M}odels for {N}aive {B}ayes {T}ext {C}lassification},
booktitle = {Learning for Text Categorization: Papers from the 1998 {AAAI} Workshop},
year = {1998},
pages = {41--48},
url = {http://www.kamalnigam.com/papers/multinomial-aaaiws98.pdf},
}
@Book{vapnik,
title = {The {N}ature of {S}tatistical {L}earning {T}heory},
year = {2000},
author = {V.N. Vapnik},
publisher = {Springer},
edition = {Second},
isbn = {978-0387987804},
}
@Book{campbell,
title = {Learning with {S}upport {V}ector {M}achine},
publisher = {Morgan \& Claypool Publishers},
year = {2011},
author = {C. Campbell and Y. Ying},
isbn = {9781608456161},
}
@InCollection{dietterich,
author = {T. G. Dietterich},
title = {Ensemble Learning},
booktitle = {The {H}andbook of {B}rain {T}heory and {N}eural {N}etworks},
year = {2002},
editor = {M. A. Arbib},
publisher = {MIT Press},
edition = {Second},
pages = {405--408},
isbn = {0-262-01197-2},
}
@InProceedings{he,
author = {Q. He and B. Shen and Y. Chen},
title = {Software {D}efect {P}rediction {U}sing {S}emi-{S}upervised {L}earning with {C}hange {B}urst {I}nformation},
booktitle = {IEEE 40th Annual Computer Software and Applications Conference ({COMPSAC})},
year = {2016},
pages = {113--122},
month = aug,
publisher = {IEEE},
doi = {10.1109/COMPSAC.2016.193},
}
@TechReport{breiman,
author = {L. Breiman},
title = {Bagging {P}redictors},
institution = {Department of Statistics, University of California, Berkeley, California 94720},
year = {1994},
month = sep,
url = {https://www.stat.berkeley.edu/~breiman/bagging.pdf},
}
@Article{breiman2,
author = {L. Breiman},
title = {Random {F}orests},
journal = {Machine Learning},
year = {2001},
volume = {45},
number = {1},
pages = {5--32},
month = oct,
doi = {10.1023/A:1010933404324},
}
@Article{freund,
author = {Y. Freund and R. E. Schapire},
title = {A {D}ecision-{T}heoretic {G}eneralization of {O}n-{L}ine {L}earning and an {A}pplication to {B}oosting},
journal = {J. Comput. Syst. Sci.},
year = {1997},
volume = {55},
pages = {119--139},
}
@InProceedings{raina,
author = {R. Raina and A. Battle and H. Lee and B. Packer and A. Y. Ng},
title = {Self-taught learning: transfer learning from unlabeled data},
booktitle = {Proceedings of the 24th international conference on Machine Learning},
year = {2007},
pages = {759--766},
doi = {10.1145/1273496.1273592},
}
@InProceedings{collobert,
author = {R. Collobert and J. Weston},
title = {A {U}nified {A}rchitecture for {N}atural {L}anguage {P}rocessing: {D}eep {N}eural {N}etworks with {M}ultitask {L}earning},
booktitle = {Proceedings of the 25th International Conference on Machine Learning},
year = {2008},
}
@TechReport{shen,
author = {V. Y. Shen and S. D. Conte},
title = {{S}oftware {S}cience {R}evisited: {A} {C}ritical {A}nalysis of the {T}heory and Its {E}mpirical {S}upport},
institution = {Department of Computer Science - Purdue University},
year = {1981},
}
@InProceedings{li,
author = {W. Li and S. Henry},
title = {Maintenance {M}etrics for the {O}bject {O}riented {P}aradigm},
booktitle = {First International Software Metrics Symposium},
year = {1993},
publisher = {IEEE},
doi = {10.1109/METRIC.1993.263801},
}
@Article{catal,
author = {C. Catal and B. Diri},
title = {A systematic review of software fault prediction studies},
journal = {Expert Systems with Applications},
year = {2009},
volume = {36},
pages = {7346--7354},
doi = {10.1016/j.eswa.2008.10.027},
}
@Article{xu,
author = {J. Xu and D. Ho and L. F. Capretz},
title = {An {E}mpirical {S}tudy on the {P}rocedure to {D}erive {S}oftware {Q}uality {E}stimation {M}odels},
journal = {International Journal of Computer Science \& Information Technology ({IJCSIT})},
year = {2010},
volume = {2},
number = {4},
pages = {1--16},
month = aug,
}
@Article{kumaresh,
author = {S. Kumaresh and R. Baskaran},
title = {Defect {A}nalysis and {P}revention for {S}oftware {P}rocess {Q}uality {I}mprovement},
journal = {International Journal of Computer Applications},
year = {2010},
volume = {8},
number = {7},
pages = {42--47},
month = oct,
}
@Article{ahmad,
author = {K. Ahmad and N. Varshney},
title = {On {M}inimizing {S}oftware {D}efects during {N}ew {P}roduct {D}evelopment {U}sing {E}nhanced {P}reventive {A}pproach},
journal = {International Journal of Soft Computing and Engineering (IJSCE)},
year = {2012},
volume = {2},
number = {5},
pages = {9--12},
}
@Article{andersson,
author = {C. Andersson},
title = {A replicated empirical study of a selection method for software reliability growth models},
journal = {Empirical Software Engineering},
year = {2007},
volume = {12},
number = {2},
pages = {161--182},
}
@Article{fenton,
author = {N. E. Fenton and N. Ohlsson},
title = {Quantitative analysis of faults and failures in a complex software system},
journal = {IEEE Trans. Softw. Eng.},
year = {2000},
volume = {26},
number = {8},
pages = {797--814},
doi = {10.1109/32.879815},
}
@InProceedings{dam,
author = {H. K. Dam and T. Tran and A. Ghose},
title = {Explainable {S}oftware {A}nalytics},
booktitle = {40th International Conference on Software Engineering: New Ideas and Emerging Results (ICSE-NIER)},
year = {2018},
pages = {53--56},
address = {New York, NY, USA},
publisher = {ACM},
doi = {10.1145/3183399.3183424},
}
@Misc{weka,
title = {Weka 3: {D}ata {M}ining {S}oftware in {J}ava},
url = {https://www.cs.waikato.ac.nz/ml/weka/},
}
@Misc{scikit,
title = {scikit-learn, {M}achine {L}earning in {P}ython},
url = {http://scikit-learn.org/stable/},
}
@Misc{r,
title = {The {R} {P}roject for {S}tatistical {C}omputing},
url = {https://www.r-project.org/},
}
@Misc{AR,
title = {Machine {L}earning \& {D}ata {M}ining {A}lgorithms},
url = {http://tunedit.org/repo/PROMISE/DefectPrediction},
}
@Misc{nasadataset,
title = {NASA {D}efect {D}ataset},
url = {https://github.com/klainfo/NASADefectDataset},
}
@Misc{eclipsedataset,
title = {Bug prediction dataset, {E}valuate your bug prediction approach on our benchmark},
url = {http://bug.inf.usi.ch},
}
@Misc{androiddataset,
title = {GitHub Bug DataSet},
url = {http://www.inf.u-szeged.hu/$\sim$ferenc/papers/GitHubBugDataSet},
}
@InProceedings{joulin,
author = {A. Joulin and T. Mikolov},
title = {Inferring algorithmic patterns with stack-augmented recurrent nets},
booktitle = {Proceedings of 28th International Conference on Neural Information Processing Systems},
year = {2015},
volume = {1},
pages = {190--198},
publisher = {MIT Press Cambridge, MA, USA},
comment = {arXiv:1503.01007},
}
@Article{tong,
author = {H. Tong and B. Liu and S. Wang},
title = {Software defect prediction using stacked denoising autoencoders and two-stage ensemble learning},
journal = {Information and Software Technology},
year = {2018},
volume = {96},
}
@InProceedings{canaparo,
author = {M. Canaparo and E. Ronchieri},
title = {{Data Mining Techniques for Software Quality Prediction in Open Source Software: An Initial Assessment}},
booktitle = {Proc. of EPJ Web of Conferences},
year = {2018},
note = {under publication},
}
@InProceedings{ronchieri,
author = {E. Ronchieri and M. Canaparo and A. Costantini and D. C. Duma},
title = {{Data mining techniques for software quality prediction: a comparative study}},
booktitle = {Proceedings of IEEE NSS/MIC},
note = {under publication},
year = {2018},
}
@Article{salomoni,
author = {E. Ronchieri and M. Canaparo and D. Salomoni},
title = {{Machine Learning Techniques for Software Analysis of Unlabelled Program Modules}},
journal = {Proceedings of Science},
note = {under publication},
year = {2019},
}
\documentclass[a4paper]{jpconf}
\usepackage{graphicx}
\usepackage{booktabs}
\bibliographystyle{iopart-num}
\begin{document}
\title{Comparing Data Mining Techniques for Software Defect Prediction}
\author{M. Canaparo$^1$, E. Ronchieri$^1$}
\address{$^1$ INFN-CNAF, Bologna, IT}
\ead{marco.canaparo@cnaf.infn.it, elisabetta.ronchieri@cnaf.infn.it}
\begin{abstract}
In the last decades, the role of Data Mining techniques has grown in the field of software engineering, where they cover various tasks such as software defect prediction and test code generation. In this contribution, we describe the work done in 2018 to compare these techniques for software defect prediction in order to identify the ones that perform best.
\end{abstract}
%--------------------------------------------------------------------------------
\section{Introduction}
\label{sec_intro}
Over the past years, the use of Data Mining techniques has been growing in various applications of software engineering. In this field, typical tasks \cite{Ali2013} are source code generation \cite{joulin} and software defect prediction \cite{tong}: the former usually supports the test activity, the latter the quality assessment. Both rely on software datasets composed of a set of features for the various instances. The features include software metrics and other information, such as the defectiveness of each instance. Through data mining algorithms, software developers may detect violations in the code. Existing literature shows promising approaches to address software engineering issues. Our contribution aims to compare these techniques for software defect prediction in order to identify the ones that perform best.
During 2018, the main focus of our work was to define a methodology to compare data mining techniques in the context of open source and HEP software.
The following paragraphs summarize the research procedure and the preliminary results.
\section{Research Procedure}
\label{sec_rproc}
The research procedure is composed of five steps:
\begin{enumerate}
\item collection of data mining techniques in the software engineering field;
\item collection of software metrics;
\item collection of datasets;
\item identification of data mining tools and libraries for this study;
\item identification of performance criteria.
\end{enumerate}
\subsection{Data Mining Techniques}
For the first step, we have considered existing literature focusing on defect prediction.
\noindent\textbf{Support Vector Machine} (e.g. SMO): is a supervised technique that searches for the optimal hyperplane to separate the training data. The hyperplane found is intuitive: it is the one which is maximally distant from the two classes of labelled points located on each side \cite{vapnik, campbell}.
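As an illustrative sketch only (assuming scikit-learn and synthetic data, not the exact configuration of our study), such a classifier can be trained as follows:
\begin{verbatim}
# Minimal sketch: train a linear-kernel SVM on synthetic data.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=500, n_features=20,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
svm = SVC(kernel="linear").fit(X_tr, y_tr)  # separating hyperplane
print("test accuracy:", svm.score(X_te, y_te))
\end{verbatim}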
\noindent\textbf{Decision Tree} (e.g. J48): is a flow-chart like tree structure. It is composed of: nodes, which represent a test on an attribute value; branches, which show the outcome of the tests; leaves, which indicate the resulting classes \cite{Han2006}.
\noindent\textbf{Naive Bayes}: relies on the Bayesian rule of conditional probability. It assumes that all the attributes are independent and analyses each of them individually \cite{mccallum}.
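A minimal sketch of both techniques on the same synthetic data, again with scikit-learn (whose trees are CART-style, whereas J48 implements C4.5):
\begin{verbatim}
# Illustrative only: a decision tree and a Naive Bayes classifier.
from sklearn.datasets import make_classification
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=20,
                           random_state=0)
for clf in (DecisionTreeClassifier(random_state=0), GaussianNB()):
    clf.fit(X, y)
    print(type(clf).__name__, clf.score(X, y))
\end{verbatim}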
\noindent\textbf{Ensemble Classifier} (e.g. Random Forest): consists of training multiple classifiers and then combining their predictions \cite{dietterich}. This technique leads to a generalized improvement of the ability of each classifier \cite{he}. According to the way the component classifiers are trained, in parallel or sequentially, we can distinguish two different categories of ensemble. Bagging \cite{breiman} and Random Forest \cite{breiman2} are both parallel classifiers. Bagging creates multiple versions of the classifier by replicating the learning set in parallel from the original one, and the final decision is made by a majority voting strategy. Random Forest adopts a combination of tree predictors, each depending on the values of a random vector sampled independently and with the same distribution for all trees in the forest. Adaboost \cite{freund} is an example of a sequential classifier, since each component classifier is applied on the training samples misclassified by the previous one.
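The sketch below contrasts the two parallel ensembles with the sequential one, using scikit-learn defaults on synthetic data (illustrative values, not the configurations of our study):
\begin{verbatim}
# Parallel (Bagging, Random Forest) vs sequential (AdaBoost)
# ensembles, compared by 5-fold cross-validation.
from sklearn.datasets import make_classification
from sklearn.ensemble import (AdaBoostClassifier, BaggingClassifier,
                              RandomForestClassifier)
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, n_features=20,
                           random_state=0)
for clf in (BaggingClassifier(random_state=0),
            RandomForestClassifier(random_state=0),
            AdaBoostClassifier(random_state=0)):
    print(type(clf).__name__, cross_val_score(clf, X, y, cv=5).mean())
\end{verbatim}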
\noindent\textbf{Deep Learning}: relies on feature hierarchies, where features at higher levels are formed by the composition of lower level ones. Deep learning techniques leverage the learning of intermediate representations that can be shared across tasks; as a consequence, they can exploit unsupervised data and data from similar tasks to improve performance on problems characterised by a scarcity of labelled data \cite{bengio, raina, collobert}.
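As a minimal stand-in for this family of techniques, the sketch below trains a small multi-layer network with scikit-learn; actual deep learning studies rely on dedicated frameworks and far deeper architectures:
\begin{verbatim}
# A small multi-layer perceptron, a shallow stand-in for the
# deep architectures discussed above.
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=500, n_features=20,
                           random_state=0)
mlp = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500,
                    random_state=0).fit(X, y)
print("training accuracy:", mlp.score(X, y))
\end{verbatim}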
\subsection{Software Metrics}
Concerning software metrics, we have collected all the metrics used in the literature over time; some of them are:
\noindent\textbf{McCabe} (e.g. Cyclomatic Complexity, Essential Complexity): is used to evaluate the complexity of a software program. It is derived from a flow graph and is mathematically computed using graph theory. Basically, it is determined by counting the number of decision statements in a program \cite{McCabe1976, McCabe1989}.
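For instance, a simplified approximation of the cyclomatic complexity of a Python function can be obtained by counting its decision points with the standard ast module (a sketch; production tools handle many more constructs):
\begin{verbatim}
# Simplified: cyclomatic complexity ~ 1 + number of decision points.
import ast

DECISIONS = (ast.If, ast.For, ast.While, ast.BoolOp,
             ast.ExceptHandler)

def cyclomatic_complexity(source):
    return 1 + sum(isinstance(node, DECISIONS)
                   for node in ast.walk(ast.parse(source)))

code = "def f(x):\n    if x > 0:\n        return 1\n    return 0"
print(cyclomatic_complexity(code))  # -> 2
\end{verbatim}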
\noindent\textbf{Halstead} (e.g. Base Measures, Derived Measures): is used to measure some characteristics of a program module, such as the ``Length'', the ``Potential Volume'', the ``Difficulty'' and the ``Programming Time'', by employing some basic metrics such as the number of unique operators, the number of unique operands, the total occurrences of operators and the total occurrences of operands \cite{shen, Halstead1977}.
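Given the four base counts, the derived measures follow directly from Halstead's formulas, as in this sketch with made-up counts:
\begin{verbatim}
# Halstead derived measures from the four base counts
# (the counts below are illustrative, not measured).
from math import log2

n1, n2 = 10, 15  # unique operators, unique operands
N1, N2 = 40, 55  # total operators, total operands

vocabulary = n1 + n2                # n
length = N1 + N2                    # N
volume = length * log2(vocabulary)  # V = N * log2(n)
difficulty = (n1 / 2) * (N2 / n2)   # D
effort = difficulty * volume        # E
print(volume, difficulty, effort)
\end{verbatim}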
\noindent\textbf{Size} (e.g. Lines of Code, Comment Lines of Code): the Lines of Code (LOC) metric is used to measure a software module, while the accumulated LOC of all the modules measures a program \cite{li}.
\noindent\textbf{Chidamber and Kemerer} (e.g. Number of Children, Depth of Inheritance): is used for object-oriented programs and is the most popular suite for performing software analysis and prediction. It has been adopted by many software tool vendors and computer scientists \cite{Chidamber1994, catal}. Some metrics of the suite are: Weighted Methods per Class, which counts the methods in each class; Depth of Inheritance Tree, which measures the length of the longest path from a class to the root of the inheritance tree; Number Of Children, which counts the classes that are direct descendants of each class.
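As a toy example, two of these metrics can be computed for a small, hypothetical Python class hierarchy:
\begin{verbatim}
# Toy sketch: Depth of Inheritance Tree (DIT) and Number of
# Children (NOC) for a hypothetical single-inheritance hierarchy.
class Base: pass
class Middle(Base): pass
class Sibling(Base): pass
class Leaf(Middle): pass

def dit(cls):  # depth below the hierarchy root (object excluded)
    return len(cls.mro()) - 2

def noc(cls):  # number of direct descendants
    return len(cls.__subclasses__())

for c in (Base, Middle, Leaf):
    print(c.__name__, "DIT:", dit(c), "NOC:", noc(c))
\end{verbatim}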
For the third step, we have focused on the NASA Defect Dataset \cite{Shepperd2013, zhang, song, Gray2011, zhou, Aleem2015, nasadataset, AR}, Eclipse \cite{eclipsedataset}, and Android and Elastic Search \cite{androiddataset}. Table \ref{tab:datasets} summarizes the most important characteristics of these datasets in terms of number of projects, metrics, modules and percentage of defective modules per project, reporting their range whenever possible. Modules correspond to instances and represent, e.g., classes, files or functions.
%\vspace*{-\baselineskip}
\begin{table}[h]
\begin{center}
\caption{Summary of the datasets employed}
\label{tab:datasets}
% \scriptsize
\begin{tabular}{rrrrr}
\toprule
Repository & \#Projects & \#Metrics & \#Modules & \%Defective Modules\\
\midrule
NASA Defect Datasets & 11 & [30,41] & [101, 5589] & [0.41, 48.80]\%\\
Eclipse Datasets & 5 & 17 each & [324, 1863] & [9.26, 39.81]\%\\
Android Datasets & 6 & 102 each & [74, 124] & [0, 27.02]\%\\
Elastic Search Datasets & 12 &102 each & [1860, 7263]& [0.16, 11.47]\%\\
\bottomrule
\end{tabular}
\end{center}
\end{table}
%\vspace*{-\baselineskip}
\subsection{Data Mining Tools}
In relation to the data mining tools, we have employed Weka \cite{weka}, scikit-learn \cite{scikit} and R \cite{r}. They are based on Java, Python and R, respectively, and are characterized by different learning curves.
\subsection{Performance Criteria}
Finally, for the performance criteria we have taken into account what is available in the literature. All the criteria are defined on the basis of the \textbf{confusion matrix}, which summarizes the performance of a classification algorithm. According to the confusion matrix, we have defined: True Positives (TP), all the instances predicted as defective that are actually defective; True Negatives (TN), all the instances predicted as non-defective that are actually non-defective; False Positives (FP), all the instances predicted as defective that are actually non-defective; False Negatives (FN), all the instances predicted as non-defective that are actually defective. The criteria are reported in the following.
\textbf{Accuracy} is the percentage of modules correctly classified as either faulty or non-faulty.
\textbf{Precision} is the percentage of modules classified as faulty that are actually faulty.
\textbf{Recall} or \textbf{Completeness} is the percentage of faulty modules that are predicted as faulty.
\textbf{Mean Absolute Error} measures how much the predicted and the actual fault rates differ.
\textbf{F-measure} is a combined measure of recall and precision: the higher the value of this indicator, the better the quality of the learning method for software prediction.
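All of these criteria can be computed directly from the four confusion-matrix counts, as in this sketch with hypothetical values:
\begin{verbatim}
# Criteria derived from the confusion matrix
# (hypothetical counts, for illustration only).
TP, TN, FP, FN = 40, 130, 20, 10

accuracy = (TP + TN) / (TP + TN + FP + FN)
precision = TP / (TP + FP)
recall = TP / (TP + FN)
f_measure = 2 * precision * recall / (precision + recall)
print(accuracy, precision, recall, f_measure)
\end{verbatim}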
\section{Preliminary Results and Future Works}
\label{sec_pr}
Bagging and Random Forest have achieved the best average accuracy over all the datasets \cite{canaparo}. In the future, we will focus on (semi-)unsupervised machine learning techniques, which can be included in the software development process even though the majority of software datasets lack instance categorizations \cite{ronchieri}. Furthermore, we will investigate the adoption of various machine learning frameworks on different resource infrastructures, such as cloud and GPU-equipped resources \cite{salomoni}.
% ------------------------------------------------------------------------
\section*{References}
\bibliography{ar2018}
\end{document}