During 2018 we completed the migration from SL6 to CentOS7 on all the farming nodes. The configurations have been stored in our provisioning system: with the WNs the migration process was rather simple, while with CEs and UIs we took extra care and proceeded one at a time in order to guarantee continuity of service. The same configurations have been used to upgrade LHCb-T2 and INFN-BO-T3, with minimal modifications. All the modules produced for our site can easily be exported to other sites willing to perform the same update.
As already mentioned, the update involved all the services with only a few exceptions: the CMS experiment uses PhEDEx\cite{ref:phedex}, the system providing data placement and file transfer, which is incompatible with CentOS7. Since PhEDEx will be phased out in mid-2019, we agreed with the experiment not to perform the update. The same applies to a few legacy UIs and to some services for the CDF experiment that are involved in an LTDP project (more details in next year's report).
In any case, if an experiment needs a legacy OS, such as SL6, on the Worker Nodes, we provide a container solution based on the Singularity\cite{ref:singu} software.
Singularity enables users to have full control of their environment through containers: it can be used to package entire scientific workflows, software, libraries, and even data. This spares T1 users from asking the farming sysadmins to install software, since everything can be placed in a container and run. Users are in control of the extent to which a container interacts with its host: there can be seamless integration, or little to no communication at all.
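As an illustrative sketch, a user on a CentOS7 Worker Node could run a legacy payload as follows (the image path and job script here are hypothetical placeholders, not the actual ones deployed at the site):

```shell
# Run a command inside an SL6 container image; the image location is
# illustrative -- sites often distribute such images via CVMFS.
singularity exec /cvmfs/example.org/images/sl6.img cat /etc/redhat-release

# Users decide how much of the host the container sees: here a host
# directory with experiment software is explicitly bind-mounted.
singularity exec --bind /opt/exp_software \
    /cvmfs/example.org/images/sl6.img /opt/exp_software/run_job.sh
```

With no extra bind mounts the container sees only its own filesystem plus the few paths Singularity mounts by default, which is the "little to no communication" end of the spectrum described above.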
Year 2018 was a difficult one from the security point of view. Several critical vulnerabilities affecting data-center CPUs and major software stacks were discovered, the most notable being Meltdown and Spectre~\cite{ref:meltdown} (see figures~\ref{meltdown} and~\ref{meltdown2}). These discoveries required us to intervene promptly in order to mitigate or correct the vulnerabilities, applying software updates (mostly Linux kernel and firmware updates) that in most cases required rebooting the whole farm. This has a significant impact on resource availability, but it is mandatory in order to prevent security incidents and possible disclosure of sensitive data. Thanks to our internally developed dynamic update procedure, patch application is smooth and almost automatic, saving the farming staff a considerable amount of time.