
HPC Cluster @ DCMC

The cluster consists of servers and blades organized in 3 racks, physically located in the Cluster Room and the Datacenter Room at the Mancinelli site. The main hardware components were part of the former Labs cluster, plus new acquisitions and one blade from the Mathematics Dept. datacenter donated by a research group of DICA.

Hardware

Server or Blade | Nodes and processors | Cores per processor | RAM (GB) | Local storage | Network interfaces
Gandalf | 6 Dell nodes with 2 Intel Xeon | 4 | 8 | 1 HD SAS 73 GB | 1 Gbit Ethernet for Management and Data
Merlino | 9 Dell nodes with 2 AMD Opteron | 4 | 16 | 1 HD SATA 80 GB | 1 Gbit Ethernet for Management, 10 Gbit Ethernet for Data
Blade Morgana | 11 Dell nodes with 2 Intel Xeon | 4 | 24 | 2 HD SAS 146 GB in RAID 0 | 1 Gbit Ethernet for Management, 20 Gbit Mellanox Infiniband for MPI, 10 Gbit Ethernet for Data
Covenant | 1 HP node with 4 Intel Xeon | 10 | 256 | 2 HD SAS 1 TB in RAID 0 | 1 Gbit Ethernet for Management, 10 Gbit Ethernet for Data
Masternode | 1 Dell node with 2 Intel Xeon | 4 | 24 | 2 HD SAS 1 TB in RAID 1 for the OS, 4 HD SAS 2 TB in RAID 5 for the shared scratch, 35 TB storage for the homes | 1 Gbit Ethernet for Management, 1 Gbit Ethernet for Frontend/login, 1 Gbit Ethernet for node consoles (iDRAC, iLO), 2x 10 Gbit Ethernet for Data, 8 Gbps Fibre Channel for Storage, 1x 10 Gbit Mellanox Infiniband for Infiniband control

The Masternode provides:

  1. A file system /opt/ohpc/pub/ on local RAID 5 storage (up to 2 TB) for applications and process management, shared via NFS on the Gigabit network
  2. A file system /homes on the Department's high-performance storage (up to 5 TB) for home directories and user data, shared via NFS on the Gigabit network (Morgana nodes have 1 Gbit of guaranteed bandwidth thanks to their 10 Gbit Ethernet hardware); a quick check of both shared file systems is sketched after this list
  3. Login for users
  4. Node provisioning software (installation of new nodes and node rebuild in less than 5 min.)
  5. PBSPro master for scheduling and resource management
  6. Support for Singularity container technology
  7. Infiniband network controller (via software)
  8. Application software for compute nodes
  9. Monitoring and utility software (Ganglia, Nagios, the HPC website and a web interface for PBSPro job submission)
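
A quick way to verify that the shared file systems from items 1 and 2 are visible on the node you are logged into is to query them with statvfs(). The following is only a minimal illustrative sketch (it assumes a C compiler such as the installed GCC, uses the paths documented above, and prints sizes in decimal GB):

  /* shared_fs_check.c - report size and free space of the NFS-shared
   * file systems listed above.
   * Build:  gcc -o shared_fs_check shared_fs_check.c
   * Run:    ./shared_fs_check
   */
  #include <stdio.h>
  #include <sys/statvfs.h>

  static void report(const char *path)
  {
      struct statvfs vfs;
      if (statvfs(path, &vfs) != 0) {
          perror(path);                      /* not mounted or not readable */
          return;
      }
      double total = (double)vfs.f_blocks * vfs.f_frsize / 1e9;
      double avail = (double)vfs.f_bavail * vfs.f_frsize / 1e9;
      printf("%-16s %8.1f GB total, %8.1f GB available\n", path, total, avail);
  }

  int main(void)
  {
      report("/opt/ohpc/pub");               /* shared applications (item 1) */
      report("/homes");                      /* home directories (item 2)    */
      return 0;
  }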

Software

The cluster has been installed following the directions of the OpenHPC project, an open source project backed by Intel and many HPC software and hardware players and supported by a strong developer community. The main reasons for this choice are that OpenHPC is committed to keeping the software platform up to date as its individual components evolve (OS, compilers, MPI distributions, toolchain components, utility software, hardware support, container technology) and that it is independent of any specific vendor or technology. The installed software is divided into Management and operations software and Application software.

Management and operations software

  1. OpenHPC 1.3.1 (July 2017) providing the CentOS 7.3 Linux node image
  2. PBS Pro 14.1 (Altair released PBSPro as open source in mid 2016)
  3. Compilers: GCC 7.1.0, Intel C++/Fortran 17.0.4 (2017 Update 4)
  4. MPI: OpenMPI 1.10.7, Intel MPI 2017.3, MPICH 3.2, MVAPICH2 2.2 (see the example after this list)
  5. Ganglia for monitoring cluster performance (public access)
  6. Nagios for monitoring node health status (no public access)
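
As a minimal sanity check of a compiler and MPI combination from the list above, a classic MPI hello-world such as the sketch below can be built with the stack's compiler wrapper and launched with mpirun; on the cluster such a run would normally be submitted through PBS Pro rather than started interactively. This is an illustrative sketch, not a cluster-specific recipe, and the build/run commands in the comments are the generic ones.

  /* mpi_hello.c - minimal MPI example; works with any of the MPI stacks
   * listed above (OpenMPI, Intel MPI, MPICH, MVAPICH2).
   * Build (generic wrapper):  mpicc -o mpi_hello mpi_hello.c
   * Run (example):            mpirun -np 4 ./mpi_hello
   */
  #include <mpi.h>
  #include <stdio.h>

  int main(int argc, char **argv)
  {
      int rank, size, len;
      char name[MPI_MAX_PROCESSOR_NAME];

      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);  /* rank of this process     */
      MPI_Comm_size(MPI_COMM_WORLD, &size);  /* total number of ranks    */
      MPI_Get_processor_name(name, &len);    /* node this rank runs on   */

      printf("Hello from rank %d of %d on %s\n", rank, size, name);

      MPI_Finalize();
      return 0;
  }

With an OpenHPC installation, which compiler and MPI stack the wrapper resolves to is typically selected by loading the corresponding environment modules.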

Application software

  1. Abaqus 6.14
  2. Abaqus 2017
  3. Ansys 18.2.2 (Fluent and LSDYNA)
  4. Comsol 5.3
  5. Matlab R2017b with parallel support

Usage guides

Application-specific instructions
