\documentclass[a4paper]{jpconf}
\usepackage{graphicx}
\title{EOSC-hub: contributions to project achievements}
\author{C. Duma$^1$, A. Costantini$^1$, D. Michelotto$^1$,
A. Ceccanti$^1$, E. Vianello$^1$, L. Morganti$^1$,
A. Fallabella$^1$, L. Dell'Agnello$^1$ and D. Salomoni$^1$}
\address{$^1$INFN Division CNAF, Bologna, Italy}
\ead{ds@cnaf.infn.it}
\begin{abstract}
EOSC-hub is the H2020 EINFRA12 project submitted by a consortium of
74 partners under the coordination of EGI, EUDAT and INDIGO-DataCloud.
It brings together multiple service providers to create {\bf the Hub}: a single
contact point for European researchers and innovators to discover, access, use
and reuse a broad spectrum of resources for advanced data-driven research.
After a short description of the project's objectives and of its work-packages,
we present the various aspects of the CNAF contributions to the project achievements.
\end{abstract}
The EOSC-hub project creates the integration and management system of the future European
Open Science Cloud (EOSC) that delivers a catalogue of services, software and data from the EGI Federation,
EUDAT CDI, INDIGO-DataCloud and major research e-infrastructures. This integration and
management system (the Hub) builds on mature processes, policies and tools from the
leading European federated e-Infrastructures to cover the whole life-cycle of services,
from planning to delivery. The Hub aggregates services from local, regional and national
e-Infrastructures in Europe, Africa, Asia, Canada and South America.
The Hub acts as a single contact point for researchers and innovators to discover, access,
use and reuse a broad spectrum of resources for advanced data-driven research. Through
the virtual access mechanism, more scientific communities and users have access to services
supporting their scientific discovery and collaboration across disciplinary and geographical boundaries.
The project mission is to contribute to addressing the EOSC challenges and objectives by implementing a number of actions:
\begin{itemize}
\item Increase the ability to {\bf exploit research data across scientific disciplines} and between the public and private sector, by:
\begin{itemize}
\item {\bf Publishing, discovering and accessing} services and resources for all scientific disciplines, and defining and adopting {\bf a service
portfolio management process} (activities and governance)
\item Being {\bf open} to national, regional and pan-European providers, and supporting different exploitation models (e.g. free at point of use, commercial)
\item Building a Joint Digital Innovation Hub (https://eosc-hub.eu/digital-innovation-hub)
\end{itemize}
\item Support {\bf open science}, by:
\begin{itemize}
\item Offering services to share and discover research artefacts (datasets, software, notebooks) and research artefact sources
\item Establishing an EOSC-hub and OpenAIRE collaboration
\end{itemize}
\end{itemize}
as well as to contribute to the European Open Science Agenda:
\begin{itemize}
\item Develop research infrastructures for Open Science, by:
\begin{itemize}
\item making major data infrastructures available for dissemination and exploitation in Open Science. Data resources include: Copernicus, ESA/Landsat, ERS and Envisat/Meris datasets for earth observation; the CLARIN metadata infrastructure for Arts and Humanities; the Compact Muon Solenoid (CMS) experiment data and its distributed processing infrastructure; EISCAT experimental data; the ELIXIR core data resources; and others.
\item providing well-defined and non-discriminatory access modes for all researchers across all disciplines, following the G8 principles.
\item applying long-term, persistent care for a given resource by improving the stewardship from the funding agencies and the resource owners, through active maintenance of open science resources, such as certification of data repositories, and maintenance of training and education programmes to increase the amount and quality of knowledge held by the community on required topics such as data preservation, curation and sharing.
\end{itemize}
\item Enable data-intensive research in secured virtual environments for Open Science, by:
\begin{itemize}
\item enabling big data solutions in secured virtual environments to generate smart solutions for analysing complex data from different sources
\item defining and adopting a corpus of harmonised access and reuse policies for research infrastructures and e-Infrastructures, based on one market offering clear points of access and support
\end{itemize}
\item Mainstream Open Access to research results, by:
\begin{itemize}
\item offering services for research data discoverability and an application store where cloud virtual appliances can be freely registered and shared as research objects linked to publications, to be executed on a distributed cloud environment for repeatability of science
\end{itemize}
\end{itemize}
\section{Project implementation and CNAF contributions}
\label{sec:first}
Following a service lifecycle approach, the EOSC-hub work packages are organised to cover the various stages of service management: planning (left), integration, management and delivery (middle), and adoption (right) in Figure~\ref{eoschub_wps}.
The {\bf Service Planning} WPs are devoted to collecting requirements from new user communities and research collaborations, defining and implementing strategy and policies ({\bf WP2}), engaging with stakeholders ({\bf WP3}), and defining appropriate business models and strategies and identifying procurement frameworks ({\bf WP12}).
The {\bf Service Integration} WPs are devoted to service integration ({\bf WP5}, {\bf WP6} and {\bf WP7}) and to maintenance and control ({\bf WP4}). {\bf WP10} provides technical coordination and consistency for these activities.
The {\bf Service Adoption} WPs include activities directed at early adopters, pre-selected via an open call during proposal preparation, from research communities ({\bf WP8}) and the business world ({\bf WP9}). {\bf WP12} handles business model innovation and service procurement and purchase in order to provide access to all user categories targeted by the project. {\bf WP11} manages a training programme to stimulate a knowledge network and facilitate adoption. {\bf WP1} supports the overall project management and coordination, complemented by Technology Coordination, executed in {\bf WP10}.
\begin{figure}[h]
\centering
\includegraphics[width=15cm,clip]{eoschub_wps_2.png}
\caption{EOSC-hub project work-packages and their relationship}
\label{eoschub_wps}
\end{figure}
The main work-packages that see the involvement of the CNAF teams are:
\subsubsection {WP2 - Strategy and Business Development}
This work-package deals with the overall policy, business and service strategy of the project. The CNAF team is:
\begin{itemize}
\item contributing to the definition and management of the project strategy
\item managing the service roadmap, service portfolio and service catalogue, in collaboration with the Technology Committee
\item formulating policies to facilitate the sharing and safe processing of both open and ``restricted'' data from across European research infrastructures, and promoting current models of good practice on the management of restricted data from around the EOSC-hub consortium.
\end{itemize}
\subsubsection {WP6 - Common Services: Integration and Maintenance}
This work package maintains and integrates the common services based on an evolving service catalogue, starting from an initial set of mature common services and technologies from the EGI, EUDAT and INDIGO service catalogues. Among other things, it aims to maintain the high quality of the baseline and advanced common services from the evolving service catalogue according to a maintenance plan, to ensure that these services develop according to the requirements of users, thematic services and competence centres, and to provide support and contribute to the documentation.
The CNAF team is involved in particular in task T6.1 ``Discovery and Access'', supporting the integration of the INDIGO IAM service on request, e.g. with thematic services, and its testing according to the work plans defined by the relevant competence centres. INDIGO IAM will act as attribute authority and IdP for the EOSC-hub federated AAI.
\subsubsection {WP7 - Thematic Services: Integration, Maintenance and Exploitation}
The Thematic Services envisaged by the EOSC-hub project provide community-specific capabilities including research core data, data products, scientific software, and pipelines from 18 international research collaborations and infrastructures: CLARIN, CMS, DARIAH, ELIXIR, EISCAT, EIDA, ENES, EPOS, GEOSS, ICOS, ITER, IFREMER and SeaDataNet, LifeWatch, LNEC, LOFAR, ICOS, WeNMR, and earth observation data from Copernicus, ESA and Envisat missions. The work-package, through its different tasks, aims to vertically integrate thematic services with common components, to foster their deployment and operation, to enable user access to the respective services and to integrate monitoring and accounting systems in these thematic services.
In particular, the CNAF team is involved in {\bf T7.2 - DODAS TS (Dynamic On Demand Analysis Service)}\cite{dodas}, which provides the end user with an automated system that simplifies the process of provisioning, creating, managing and accessing a pool of heterogeneous (possibly opportunistic) computing resources, as described in Figure~\ref{dodas}. The work in this task concerns support for a wider range of infrastructure providers, improving the following features through the integration with EOSC-hub: Data Management, Data Caching, PaaS-level cross-site cluster deployment, Web User Interface, and Authentication, Authorisation and Accounting.
DODAS is an open-source Platform-as-a-Service tool that allows software applications to be deployed over heterogeneous and hybrid clouds. It instantiates on-demand container-based clusters through Apache Mesos \cite{mesos} and offers users a high level of abstraction, allowing them to exploit any cloud infrastructure with almost zero effort, since it requires very limited knowledge of the underlying technical details.
During 2018 a lot of effort was spent on supporting communities. Several initiatives in this respect required support from the DODAS team:
\begin{itemize}
\item AMS ran scale tests over a couple of weeks:
\begin{itemize}
\item 1.5k cores were used; about 200k jobs were run, producing about 20~TB of output data
\item input data were read from EOS at CERN and the produced data were moved to CNAF through a third-party copy mechanism
\item a cache was integrated into the AMS data analysis workflow (with DODAS)
\item feedback was collected for improvements such as AuthN/Z based on GSI (certificates from JWT), improved remote access (e.g. introducing CCB) and improved CVMFS server management
\item overall, testing and usage were very successful
\end{itemize}
\item CERN CMS-OpenData:
\begin{itemize}
\item went from a design to a first test, with end-to-end validation; support was provided for design integration and implementation
\item results will be presented at the CMS Offline and Computing week
\item a report was provided by a CERN summer student
\end{itemize}
\item Work on a BigData platform for Machine Learning as a Service:
\begin{itemize}
\item initially focussed on the inference and data reduction parts, integrating respectively the TFaaS of CMS and Spark (which was already available)
\item followed by activities on the training part, where model training aims to use all the available results, services and APIs from the DEEP-HybridDataCloud project \cite{deep}
\end{itemize}
\item Finalised the Onedata \cite{onedata} scale and validation tests. A long-running exercise was carried out with the latest Onedata release (rc11) in order to provide a comprehensive summary of performance evaluations.
\item Concluded the Google cloud integration, exploiting a third-party grant:
\begin{itemize}
\item the integration was done successfully
\item basic validation concluded with positive results
\item validation was carried out with all the use cases supported by DODAS (AMS and CMS)
\item both workflows, with and without cache usage (xcache), were validated
\end{itemize}
\end{itemize}
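As a rough reading of the AMS scale-test figures quoted above, and assuming decimal units ($1\,\mathrm{TB} = 10^{6}\,\mathrm{MB}$) and that the approximate totals are representative, the average output per job and the average number of jobs per core are
\[
\frac{20\times10^{6}\,\mathrm{MB}}{2\times10^{5}\ \mbox{jobs}} = 100\ \mbox{MB per job},
\qquad
\frac{2\times10^{5}\ \mbox{jobs}}{1.5\times10^{3}\ \mbox{cores}} \approx 133\ \mbox{jobs per core}.
\]
These are order-of-magnitude averages derived only from the quoted totals, not measured per-job values.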
\begin{figure}[h]
\centering
\includegraphics[width=15cm,clip]{dodas_ar.png}
\caption{High level schema of the DODAS architecture}
\label{dodas}
\end{figure}
\subsubsection {WP10 - Technical Coordination}
The main objectives of this work-package are: to define the technical roadmap, with external and internal input from user and service providers' requirements, existing and planned services, technologies, standards and frameworks; to contribute to external standardisation bodies and relevant initiatives; to define the criteria for the inclusion of new services in the catalogue and to assess conformance; and to identify solutions to community requirements and lead their integration into the service portfolio.
The main CNAF team contributions to this work-package are in:
\begin{itemize}
\item Task T10.1 - Technical roadmap, which deals with the definition, maintenance and enforcement of the technical roadmap, and with defining and maintaining the overall architecture of the services in the EOSC-hub portfolio. CNAF takes part in the technological groups driving the activities of their respective technological areas, such as:
\begin{itemize}
\item AAI, which oversees the technical development in the AAI area, in particular supporting one of the Community AAI services, INDIGO-IAM
\item Software Quality Area, which works on the definition of best practices for service deployment and interoperability checks, software release, support and management
\end{itemize}
\item Task T10.2 - Service Catalogue Technical Evolution, which defines the project's Rules of Engagement (RoE), including guidelines, policies and procedures to assess the conformance of services to the RoE and to the FAIR principles. This task also monitors the evolution of reference standards (contributing where relevant) and tracks the evolution of the main software technologies in the Open Source communities. CNAF participates in the RDA Software Source Code Identification Working Group \cite{rda}, which brings together a broad panel of stakeholders directly involved in software identification, working to define concrete recommendations for the academic community so that the solutions adopted by academic players are compatible with each other and especially with the software development practice of tens of millions of developers worldwide.
\item Task T10.3 - Community Requirement Analysis and Technical, which analyses the requirements collected by the other work packages and provides support to all the user communities engaged before or during the project, including Thematic Services, Competence Centres and Joint Digital Innovation Hubs, in particular during their starting phase. The CNAF team is involved in particular in the technical support team of the DODAS Thematic Service.
\end{itemize}
\subsubsection {WP13 - Access Provisioning}
This WP manages the Virtual Access to services of the EOSC-hub catalogue in the following four categories: Common Services, Thematic Services, Collaborative Services and Federation Services. In the context of this WP, CNAF provides the Virtual Access installation for the DODAS Thematic Service. Some of the metrics showing the 2018 contributions are presented in Table~\ref{tab:1}.
\begin{table}[ht]
\resizebox{\textwidth}{!}{\begin{tabular}{|l|c|l|l|l|}
\hline
\textbf{Metric Name} & \multicolumn{1}{l|}{\textbf{Baseline}} & \textbf{Description} & \textbf{Period 1 (M3-8)} & \textbf{Period 2 (M9-17)} \\ \hline
\begin{tabular}[c]{@{}l@{}}Usage: CPU time and storage \\ consumed by DODAS \\ at CNAF\end{tabular} & 0 & \begin{tabular}[c]{@{}l@{}}CNAF resources made \\ available for the TS.\\ Data will be collected both from \\ the DODAS monitoring system \\ and accounting at two sites. \\ (These are new resources, \\ installed and configured for\\ the EOSC-hub project)\end{tabular} & \begin{tabular}[c]{@{}l@{}}CPU hours: 308068.78, \\ Disk GB-hours: 5986141.3 \\ (values taken from the underlying \\ OpenStack provider)\end{tabular} & \begin{tabular}[c]{@{}l@{}}CPU hours: 541744.2, \\ Disk GB-hours: 9798183.68\end{tabular} \\ \hline
\begin{tabular}[c]{@{}l@{}}Usage: Total Number \\ of Cluster deployments\end{tabular} & 0 & \begin{tabular}[c]{@{}l@{}}Number of cluster \\ deployments made through\\ the DODAS Core Services. \\ Metric will be based on the \\ deployments registered on \\ the DODAS PaaS Orchestrator \\ and Infrastructure Manager\end{tabular} & \begin{tabular}[c]{@{}l@{}}622 distinct cluster deployments. \\ Value taken from IM Database\end{tabular} & \begin{tabular}[c]{@{}l@{}}1084 distinct cluster \\ deployments\end{tabular} \\ \hline
\begin{tabular}[c]{@{}l@{}}Visits: number of visits/requests \\ to the DODAS core services\end{tabular} & 0 & \begin{tabular}[c]{@{}l@{}}Number of people registered \\ in the DODAS-IAM service\end{tabular} & 31 & 56 \\ \hline
\end{tabular}}
\caption{Metrics regarding CNAF contribution as the main DODAS Virtual Access infrastructure}
\label{tab:1}
\end{table}
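The growth between the two reporting periods in Table~\ref{tab:1} can be made explicit with a simple ratio of the Period 2 value to the Period 1 value. This is only a sketch: the table does not state whether the Period 2 values are cumulative, and the two periods have different lengths (6 and 9 months, assuming inclusive month ranges), so the ratios should be read as indicative.
\begin{eqnarray*}
\frac{541744.2}{308068.78} & \approx & 1.76 \quad \mbox{(CPU hours)} \\
\frac{9798183.68}{5986141.3} & \approx & 1.64 \quad \mbox{(disk GB-hours)} \\
\frac{1084}{622} & \approx & 1.74 \quad \mbox{(cluster deployments)} \\
\frac{56}{31} & \approx & 1.81 \quad \mbox{(registered users)}
\end{eqnarray*}
Under the additional assumption that the values are per-period totals rather than cumulative ones, the monthly CPU usage would be about $308068.78/6 \approx 5.1\times10^{4}$ CPU hours in Period 1 and $541744.2/9 \approx 6.0\times10^{4}$ CPU hours in Period 2.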
The first project year mainly saw the organisation of the various technological groups, the collection of the requirements regarding the services developed and/or supported by INFN, and the deployment and operation of the DODAS Thematic Service. The following year will continue with the definition and consolidation of the EOSC-hub Technical Roadmap, the Technical Architecture and the standards roadmap, and will aim to enlarge the DODAS user community and integrate the service into the EOSC-hub production environment. Many training activities are also foreseen, both for the services under development by the CNAF team and for the DODAS TS.
EOSC-hub has been funded by the European Commission H2020 research and innovation
programme under grant agreement RIA 777536.
\section{References}
\begin{thebibliography}{}
\bibitem{dodas}
Web site: https://dodas-ts.github.io/dodas-doc/
\bibitem{mesos}
Web site: https://open.mesosphere.com/
\bibitem{deep}
Web site: https://deep-hybrid-datacloud.eu/
\bibitem{onedata}
Web site: https://onedata.org/
\bibitem{rda}
Web site: https://www.rd-alliance.org/groups/software-source-code-identification-wg
\end{thebibliography}