Commit ba490fff authored by Alessandro Costantini

Deep contribution - draft

parent 5c976f15
1 merge request: !1 DS contributions
contributions/sdds-deep/DEEP-WP.png

103 KiB

......@@ -4,13 +4,13 @@
\title{DEEP-Hybrid DataCloud project: Hybrid services for distributed e-infrastructures}
%\address{Production Editor, \jpcs, \iopp, Dirac House, Temple Back, Bristol BS1~6BE, UK}
\author{A. Costantini$^1$, D.C. Duma$^1$, G. Donvito$^2$, J. Marco de Lucas$^8$,
\author{A. Costantini$^1$, D.C. Duma$^1$, G. Donvito$^2$, \dots
% etc.
}
\address{$^1$ INFN-CNAF, Bologna, Italy}
\address{$^2$ INFN Bari, Bari, Italy}
\address{$^{11}$ Univ. de Cantabria, Spain}
\ead{alessandro.costantini@cnaf.infn.it}
......@@ -52,65 +52,95 @@ In order to achieve these objectives, we propose to evolve existing cloud servic
\item Assure the scalability and performance of the solution developed, which is key to guarantee the interest both of resource providers and users.
\end{itemize}
\section{Project structure}
The DEEP Hybrid DataCloud project is structured into six different work packages, covering Networking
Activities (NA) devoted to the coordination, communication and community liaison; Service Activities (SA)
focused on the provisioning of services and resources for the execution of the data analysis challenges; and
Joint Research Activities (JRAs), dealing with the development of new components and technologies to
support data analysis. Figure \ref{fig-wp} describes the interaction between the different work packages.
\section{Research Communities and requirements}
The Research Communities participating in the DEEP-HDC project cover different scientific areas: Biological and Medical Science, Computing Security, Physical Sciences, Citizen Science and Earth Observation.
\begin{figure}[h]
\centering
\includegraphics[width=10cm,clip]{DEEP-WP.png}
\caption{Diagram showing WPs interrelation.}
\label{fig-wp}
\end{figure}
\subsection{Medical science}
Deep learning approaches to biomedical image analysis have opened new opportunities in how diseases are diagnosed and treated. However, one drawback of automated analysis of medical images with deep learning is the requirement for sophisticated IT infrastructures (hardware, software, network). In this project, we will explore various ways to apply deep learning to analyse images in the context of retinopathy \cite{Yau2012}.
Requirements injected into the project:
It is important to note the key role of WP2/NA2 and WP3/SA1 in defining the requirements for the solutions
to be developed, channelling them towards the JRA work packages (WP4, WP5 and WP6), and then providing feedback.
The direct interaction between WP2 and WP6 will promote an agile interaction on the design and
implementation of the services for final users.
{\bf Work Package 1, WP1 (NA1) - Project Management and Exploitation.}
This work package will perform the global oversight of the activities carried out within the project, ensuring
that they are aligned with the DEEP Hybrid DataCloud work programme. WP1 will also coordinate the
consortium management through the governance structure, including the promotion
of an adequate interaction between the WPs through the steering committee.
{\bf Work Package 2, WP2 (NA2) - Intensive Computing Pilot Applications.}
This work package is responsible for the definition and correct understanding of the pilot usage scenarios
regarding the project’s technical architecture and will propose an architecture that is
applicable for the identified applications. Moreover, NA2 will interact with SA1 and JRA3 to ensure that the
delivered outcomes are aligned with the expectations of the user communities, are compliant with the
proposed scenarios and validated against the user applications.
{\bf Work Package 3, WP3 (SA1) - Testbed and integration with EOSC services.}
The service activities of the project will be supported by WP3, which will guarantee that the project pilot
testbeds are correctly integrated with other state-of-the-art e-Infrastructures and services from the
European Open Science Cloud (EOSC), so that the project can exploit their services in an easy way.
Moreover, this work package will supervise the software development within the project, providing a
continuous software improvement process that will involve quality assurance activities, software release
management, maintenance and support.
\begin{itemize}
\item develop and evaluate a deep learning tool facilitating the classification of retinopathy stage and progression based on digital color fundus retinal photography images.
\item improve automated classification of retinopathy stage (Healthy, Mild, Medium, Severe) and reconstruct disease progression by means of deep learning.
\item explore construction (training) of deep learning models using inherently distributed training data.
\item address the need for a comprehensive and automated method for large-scale screening programs based on medical images.
\item In particular, INFN leads Task 3.2 - Software quality assurance, release, maintenance and support - aimed at
\item Increase the quality levels of the software by contributing to the implementation and automation of the Quality Assurance (QA) and Control procedures defined by the project.
\item Boost the software delivery process, relying on automation.
\item Guarantee the stability of services already deployed in production and the increase of their readiness levels, where needed, from TRL6 to TRL8.
\end{itemize}
\subsection{Computing Security}
The use of specialised hardware to speed up packet capturing, pre-processing and classification for network analysis has recently become very popular. Intrusion detection and deep packet inspection systems are application frameworks in high demand among network security analysts.
The common feature of these network applications is the need to process a continuous flow of information and react promptly to the generated events.
Requirements injected into the project:
INFN contributes also to Task 3.1 - Pilot testbeds and integration with EOSC platform and their services - by providing resources and services for the pilot testbeds.
{\bf Work Package 4, WP4 (JRA1) - Accelerated and High Performance Computing in the Cloud.}
This key research activity will be carried out close to the hardware and infrastructure, addressing the gaps
that currently exist in the support of accelerators (like GPU), specialized hardware (such as low-latency
interconnects) and HPC systems in general.
INFN contributes to all the JRA1 activities and tasks, making available its experience on virtualization technologies and cloud middleware frameworks
to develop advanced solutions that enable the delivery of bare-metal like performance and the resource sharing in multi-tenancy environments.
{\bf Work Package 5, WP5 (JRA2) - High Level Hybrid Cloud solutions.}
Led by INFN, WP5 will take care of the provisioning of the platform exploiting the outcomes from JRA1 in a hybrid
approach, delivering an execution platform for JRA2, ensuring that applications can be spawned across
several cloud infrastructures.
In particular, this can be done by
\begin{itemize}
\item Definition of an architecture for data intake, analysis and storage.
\item Research and development of different tools with cloud support that enable data analysis, e.g. monitoring tools, decision support modules and prediction modules.
\item Research and development of intelligent modules using ML/DL to analyse massive online data and gain meaningful insights from it, applying and testing different approaches to cyber-security focused on event detection.
\item Accelerate use case development by exploiting the advantages of modern cloud e-infrastructure technologies, such as isolated, independent environments that provide built-in security, portability and flexibility.
\item Adopting and/or extending existing INDIGO PaaS orchestration solution for Cloud infrastructures with support for hybrid deployments across multiple Cloud sites,
\item Scaling out a given virtual infrastructure deployed on a cloud to obtain access to Computing and Storage resources in 3rd party clouds,
\item Building software-based secure virtual networks enabling seamless connection between on-premises
(or local) resources and 3rd party providers, transparently for the user, providing the level
of security and privacy required by users.
\item Providing a PaaS layer to WP6/JRA3 for the automated provision and configuration of complex
hybrid virtual infrastructures, including Container Orchestration Platforms such as Apache Mesos and Kubernetes.
\end{itemize}
\subsection{Physical Sciences}
Quantum Chromodynamics (QCD) is the theory that describes the interaction responsible for the confinement of quarks inside hadrons, the so-called strong interaction. Investigating the properties of QCD requires different techniques depending on the scale of energy we are interested in.
In this respect, data analysis presents technical challenges to the researchers involved. Those challenges can be traced back to the lack of a flexible environment in which to move data across the network and analyse them, which limits the efficiency of the configuration analysis process.
Requirements injected into the project:
\begin{itemize}
\item designing a data configuration tool to have general applicability for similar usage scenarios in other scientific areas.
\end{itemize}
{\bf Work Package 6, WP6 (JRA3) - DEEP as a Service.}
This activity focuses on bridging the outcomes of NA2, JRA1 and JRA2 so as to deliver the final solution to
the users in the form of a DEEP as a Service solution. This service will ensure that scientists have an easy
way to deploy and execute their intensive compute applications based on containers (from NA2) that will be
executed in a hybrid cloud platform (JRA2), exploiting the specialized hardware that their application requires (JRA1).
INFN contributes to all the JRA3 activities and tasks by supporting integration and testing activities aimed at
composing a set of defined building blocks that will model the user application and deploying these applications as services that
can be offered to final users, as a way to deliver scientific results to a wider scope of stakeholders.
\subsection{Citizen Science}
The potential of applying deep learning techniques for plant classification and its usage for citizen science in large-scale biodiversity monitoring has been discussed in recent publications \cite{Heredia2017}. The predictions can be confidently used as a baseline classification in citizen science communities which in turn can share their data with biodiversity portals.
Requirements injected into the project:
\begin{itemize}
\item Produce a tool that is able to classify plant species from images.
\item Have the results produced by the developed tools validated by biodiversity experts.
\item Deploy this tool for automated monitoring of biodiversity.
\end{itemize}
\subsection{Earth Observation}
The application of neural networks (NN) to pattern recognition in satellite images, and their combination with other in-situ measurements, opens new possibilities in areas like Ecosystems and Biodiversity. The EU Copernicus Programme relies on a family of dedicated Earth Observation missions called the Sentinels. The data acquired from these missions are systematically downlinked and processed to operational user products, and made available under an open and free license \cite{Copernicus}.
\begin{itemize}
\item Enable the monitoring of ecosystems and the design of better environmental policies.
\item Enable the capabilities of the European biodiversity platforms such as LifeWatch.
\item Develop a detection and prediction system that combines the latest Deep Learning techniques with satellite data, providing an environment in which users can process and analyse different satellite maps, choosing the place, the date, etc.
\end{itemize}
\section{DEEP Overall architecture}
The DEEP PaaS layer is based on the components developed and integrated in the INDIGO-DataCloud project \cite{indigo}. The architecture is depicted in Figure~\ref{fig-1} and the main components are briefly described hereafter:
The DEEP PaaS layer is based on the components developed and integrated in the INDIGO-DataCloud project \cite{indigo}. The architecture is depicted in Figure~\ref{fig-arch} and the main components are briefly described hereafter:
\begin{figure}[h]
\centering
\includegraphics[width=10cm,clip]{DEEP-fig1.PNG}
\includegraphics[width=10cm,clip]{DEEP-arch.png}
\caption{The architecture of the DEEP PaaS layer is based on the building blocks provided by the INDIGO-DataCloud project.}
\label{fig-1}
\label{fig-arch}
\end{figure}
The PaaS Orchestrator is the core component of the PaaS layer. It receives high-level deployment requests and coordinates the deployment process over the IaaS platforms.
......@@ -131,8 +161,12 @@ The Data Management Services is a collection of services that provide an abstrac
The Information Provider and Accounting System collects detailed information from an IaaS provider about the current status of its resources, ranging from the amount of CPU, RAM or storage available to the availability of a service.
\section{DEEP as a Service solution}
The high-level decomposition of the DEEP as a Service design is depicted in Figure~\ref{fig-2} and consists of the following key components:
The high-level decomposition of the DEEP as a Service design is depicted in Figure~\ref{fig-DEEPaas} and consists of the following key components:
\begin{itemize}
\item The DEEP open Catalog, where users, communities, etc. can browse, store and download relevant modules for building up their applications (such as ready-to-use machine learning frameworks, complex application topologies, etc.).
\item An application modeler or composition tool, that will be used to build up complex application topologies in an easy way.
......@@ -146,9 +180,9 @@ The system is designed with extensibility in mind, taking great care in designin
\begin{figure}[h]
\centering
\includegraphics[width=12cm,clip]{DEEP-fig2.PNG}
\includegraphics[width=12cm,clip]{DEEP-aas.PNG}
\caption{DEEPaaS high level architecture.}
\label{fig-2}
\label{fig-DEEPaas}
\end{figure}
\section{Conclusions}
......