HPC Cluster @ DCMC

The cluster consists of servers and blades organized in three racks, physically located in the Cluster Room and the Datacenter Room at the Mancinelli site. The main hardware components were part of the former Labs cluster, together with new acquisitions and one blade from the Mathematics Dept. datacenter donated by a research group of DICA.

Hardware

Server or Blade | # nodes and processors | # cores/processor | RAM (GB) | Local storage | Network interfaces
Blade Gandalf | 6 Dell nodes with 2 Intel Xeon | 4 | 8 | 1 SAS HD 73 GB | 1 Gbit Ethernet for management and data
Blade Legolas | 16 HP nodes with 2 Intel Xeon | 4 | 24 | 1 SAS HD 73 GB | 1 Gbit Ethernet for management and data
Blade Merlino | 9 Dell nodes with 2 AMD Opteron | 4 | 16 | 1 SATA HD 80 GB | 1 Gbit Ethernet for management, 10 Gbit Ethernet for data
Blade Morgana | 11 Dell nodes with 2 Intel Xeon | 4 | 24 | 2 SAS HD 146 GB in RAID 0 | 1 Gbit Ethernet for management, 20 Gbit Mellanox InfiniBand for MPI, 10 Gbit Ethernet for data
Covenant | 1 HP node with 4 Intel Xeon, 2 Dell nodes with 2 Xeon CPUs | 10, 20 | 256, 320 | 2 SAS HD 1 TB in RAID 0 | 1 Gbit Ethernet for management, 10 Gbit Ethernet for data
Masternode | 1 Dell node with 2 Intel Xeon | 4 | 24 | 2 SAS HD 1 TB in RAID 1 for the OS, 4 SAS HD 2 TB in RAID 5 for scratch, 35 TB external storage for homes | 1 Gbit Ethernet for management, 1 Gbit Ethernet for frontend/login, 1 Gbit Ethernet for node consoles (iDRAC, iLO), 2x 10 Gbit Ethernet for data, 8 Gbps Fibre Channel for storage, 1x 10 Gbit Mellanox InfiniBand for InfiniBand control
GPU nodes | 5 Dell T630/T640 nodes with 2 Intel Xeon and 1 or 2 NVIDIA GPUs | 8 | 32/64 | 1 SAS HD 1 TB | 1 Gbit Ethernet for management, 1 Gbit Ethernet for data

The Masternode provides:

  1. A file system /opt/ohpc/pub/ on local RAID storage for installed applications, shared via NFS over the Gbit network
  2. A file system /homes on Dept. storage (up to 35 TB) for user homes, shared via NFS over the Gbit or 10 Gbit network (Morgana nodes have a guaranteed 1 Gbit bandwidth thanks to their 10 Gbit Ethernet hardware)
  3. A file system /scratch on local RAID storage (up to 5.5 TB) for scratch data, shared via NFS over the Gbit or 10 Gbit network (Morgana nodes have a guaranteed 1 Gbit bandwidth thanks to their 10 Gbit Ethernet hardware)
  4. Login for users via SSH and via Web SSH interface
  5. Node provisioning software (installation of new nodes and node rebuild in less than 5 min.)
  6. PBS Pro master for scheduling and resource management (a sample job script is shown after this list)
  7. Support for Singularity container technology
  8. Infiniband network controller (via software)
  9. Application software for compute nodes
  10. Monitoring and utility software: Ganglia, Nagios
  11. Documentation website: http://masternode.chem.polimi.it accessible only from the wired DCMC network or via DCMC VPN
  12. Web SSH interface https://masternode.chem.polimi.it/webssh accessible only from the wired DCMC network or via DCMC VPN
  13. Web interface to PBSPro submission: planned for 2019
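
As a quick orientation, below is a minimal sketch of how a batch job could be submitted from the Masternode. The login hostname is taken from the documentation URL above; the queue name (workq), the resource request and the program name are illustrative assumptions, so check qstat -Q, module avail and the usage guides for the actual values.

  #!/bin/bash
  # job.pbs - minimal PBS Pro job script (all values are examples)
  #PBS -N test_job              # job name
  #PBS -l select=1:ncpus=4      # one chunk with 4 cores
  #PBS -l walltime=01:00:00     # maximum wall-clock time
  #PBS -q workq                 # assumed queue name, check qstat -Q
  #PBS -j oe                    # merge stdout and stderr into one file

  cd $PBS_O_WORKDIR             # start in the directory the job was submitted from
  ./my_program                  # hypothetical executable

The script would then be submitted and monitored from a login shell:

  ssh username@masternode.chem.polimi.it   # reachable only from the wired DCMC network or the DCMC VPN
  qsub job.pbs                             # submit the job; prints the job id
  qstat -u username                        # list your jobs and their state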

Software

The cluster was installed following the directions of the OpenHPC project, an open source project backed by Intel and many HPC software and hardware players, and supported by a strong developer community. The main reasons for this choice are that the project is committed to keeping the software platform up to date as its individual components evolve (OS, compilers, MPI distributions, toolchain components, utility software, hardware support, container technology), and that it is independent of any specific vendor or technology. The installed software is divided into management and operations software and application software. At the cluster startup on April 1st, 2019 the installed software was:

Management and operations software

  1. OpenHPC 1.3.6 (October 2018), with node images based on CentOS Linux 7.5
  2. PBS Pro 18.1 (Altair released PBS Pro as open source in mid-2016)
  3. Compilers: GCC 5.4.0, GCC 7.3.0, GCC 8.2.0, Intel C++/Fortran/MPI 19.0.3.199 (2019 update 3)
  4. MPI: OpenMPI 1.10.7, OpenMPI 3.1.2, Intel MPI 2019.3, MPICH 3.2.1, MVAPICH2 2.2, MVAPICH2 2.3 (see the compile-and-run sketch after this list)
  5. Ganglia for monitoring cluster performance (public access)
  6. Nagios for monitoring nodes health status (restricted access)
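
To illustrate how the toolchains are typically combined, the following sketch compiles and runs a small MPI program with the GCC 8.2.0 and OpenMPI 3.1.2 stack. The module names (gnu8, openmpi3) follow the usual OpenHPC naming convention and are assumptions, as is the source file name; verify them with module avail.

  module avail                      # list the compilers and MPI stacks actually installed
  module load gnu8 openmpi3         # assumed module names for GCC 8.2.0 + OpenMPI 3.1.2
  mpicc -O2 -o hello hello_mpi.c    # compile a (hypothetical) MPI C source file
  mpirun -np 4 ./hello              # quick interactive test on 4 processes

Inside a PBS Pro job the same module load and mpirun lines can be reused; an MPI library built with scheduler support normally picks up the allocated nodes from PBS automatically.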

Application software

  1. Abaqus 2017
  2. Abaqus 2018
  3. Abaqus 2019
  4. Ansys 19.3 (2019 R1) with Fluent and LS-DYNA
  5. Comsol 5.4
  6. Matlab R2019a with parallel support
  7. Python 2.x stack with SciPy, NumPy and mpi4py
  8. Python 3.4 stack with SciPy, NumPy and mpi4py (see the example after this list)
  9. ADIOS
  10. HDF5 and pHDF5
  11. NetCDF and pNetCDF
  12. SIONLIB
  13. R 3.5
  14. Trilinos 12.12.1

Many more libraries and tools for parallel and serial scientific programming are also installed.
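
For the Python stacks, a parallel script based on mpi4py can be launched in much the same way as a compiled MPI program. The module names below follow the usual OpenHPC convention (py3-numpy, py3-scipy, py3-mpi4py) and are assumptions, as is the script name; check module avail for what is actually provided.

  module load gnu8 openmpi3                    # assumed toolchain modules
  module load py3-numpy py3-scipy py3-mpi4py   # assumed Python 3 scientific stack modules
  mpirun -np 8 python3 my_analysis.py          # my_analysis.py is a hypothetical mpi4py script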

Usage guides
