^Server or Blade ^# nodes and processors ^# cores/processor ^RAM (GB) ^Local storage ^Network interfaces^
| **Blade Gandalf** | 6 Dell nodes with 2 Intel Xeon | 4 | 8 | 1 HD SAS 73GB | 1Gbit Ethernet for Management and Data|
| **Blade Legolas** | 16 HP nodes with 2 Intel Xeon | 4 | 24 | 1 HD SAS 73GB | 1Gbit Ethernet for Management and Data|
| **Blade Merlino** | 9 Dell nodes with 2 AMD Opteron | 4 | 16 | 1 HD SATA 80GB | 1Gbit Ethernet for Management, 10Gbit Ethernet for Data|
| **Blade Morgana** | 11 Dell nodes with 2 Intel Xeon | 4 | 24 | 2 HD SAS 146GB RAID0 | 1Gbit Ethernet for Management, 20Gbit Mellanox InfiniBand for MPI, 10Gbit Ethernet for Data|
| **Covenant** | 1 HP node with 4 Intel Xeon, 2 Dell nodes with 2 Intel Xeon | 10, 20 | 256, 320 | 2 HD SAS 1TB RAID0 | 1Gbit Ethernet for Management, 10Gbit Ethernet for Data|
| **Masternode** | 1 Dell node with 2 Intel Xeon | 4 | 24 | 2 HD SAS 1TB RAID1 for OS, 4 HD SAS 2TB RAID5 for Scratch, 35TB Storage for Home | 1Gbit Ethernet for Management, 1Gbit Ethernet for Frontend/login, 1Gbit Ethernet for node consoles (iDRAC, iLO), 2 10Gbit Ethernet for Data, 8Gbps Fibre Channel for Storage, 1 10Gbit Mellanox InfiniBand for InfiniBand control|
| **GPU nodes** | 5 Dell T630/T640 nodes with 2 Intel Xeon and 1 or 2 NVIDIA GPUs | 8 | 32/64 | 1 HD SAS 1TB | 1Gbit Ethernet for Management, 1Gbit Ethernet for Data|
The Masternode provides:
  - A file system **/opt/ohpc/pub/** on local RAID storage for installed applications, shared via NFS on the Gbit network
  - A file system **/homes** on the Department storage (up to 35 TB) for user homes, shared via NFS on the Gbit or 10Gbit network (Morgana nodes have 1 Gbit bandwidth guaranteed thanks to their 10Gbit Ethernet hardware)
  - A file system **/scratch** on local RAID storage (up to 5.5 TB) for scratch space, shared via NFS on the Gbit or 10Gbit network (Morgana nodes have 1 Gbit bandwidth guaranteed thanks to their 10Gbit Ethernet hardware)
  - Login for users via SSH and via a Web SSH interface
  - Node provisioning software (installation of new nodes and node rebuild in less than 5 minutes)
  - PBSPro master for scheduling and resource management
  - InfiniBand network controller (via software)
  - Application software for compute nodes
  - Monitoring and utility software: Ganglia, Nagios
  - Documentation website: [[http://masternode.chem.polimi.it|http://masternode.chem.polimi.it]], accessible only from the wired DCMC network or via the DCMC VPN
  - Web SSH interface: [[https://masternode.chem.polimi.it/webssh|https://masternode.chem.polimi.it/webssh]], accessible only from the wired DCMC network or via the DCMC VPN
  - Web interface to PBSPro submission: planned for 2019
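As a hedged sketch of how these services look from the user side — `<user>` is a placeholder, and the SSH and `df` commands only work from the wired DCMC network or via the DCMC VPN:

```shell
# Hedged sketch of a first session; <user> is a placeholder username and the
# commented lines below only work from the DCMC network or DCMC VPN:
#
#   ssh <user>@masternode.chem.polimi.it
#   df -h /homes /scratch /opt/ohpc/pub   # the three NFS shares listed above
#
# Paths that are shared across all nodes, so jobs can rely on them:
printf '%s\n' /homes /scratch /opt/ohpc/pub
```

All three paths are exported by the Masternode via NFS, so files written there are visible from every compute node.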
===== Software =====
- | |||
The cluster has been installed following the directions of the **OpenHPC Project**, an open-source project backed by Intel and many HPC software and hardware players, and supported by a strong developer community.
The main reason for this choice is the project's commitment to keep the software platform up to date as its individual components evolve (OS, compilers, MPI distributions, toolchain components, utility software, hardware support, container technology), and its independence from any specific vendor or technology.
The installed software is divided into management and operations software and application software.
At startup on April 1st, 2019, the installed software was:
==== Management and operations software ====
  - **OpenHPC 1.3.6** (October 2018) for the Linux CentOS 7.5 node image
  - **PBS Pro 18.1** (Altair released PBSPro as open source in mid 2016)
  - **Compilers:** **GCC 5.4.0**, **GCC 7.3.0**, **GCC 8.2.0**, **Intel C++/Fortran/MPI 19.0.3.199** (ver. 2019 update 3)
  - **MPI:** **OpenMPI 1.10.7**, **OpenMPI 3.1.2**, **Intel MPI 2019.3**, **MPICH 3.2.1**, **MVAPICH2 2.2**, **MVAPICH2 2.3**
  - **Ganglia** for monitoring cluster performance (public access)
  - **Nagios** for monitoring node health status (restricted access)
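On OpenHPC systems the compilers and MPI stacks listed above are normally selected through environment modules. A minimal sketch of building an MPI program follows; the module names (`gnu7`, `openmpi3`) follow common OpenHPC naming conventions and are assumptions to be checked with `module avail` on the cluster:

```shell
# Write a trivial MPI program locally; the build/run lines further down are
# assumptions (OpenHPC-style module names) and only work on the cluster.
cat > hello_mpi.c <<'EOF'
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank = 0;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    printf("hello from rank %d\n", rank);
    MPI_Finalize();
    return 0;
}
EOF

# On the cluster (assumed module names -- check `module avail`):
#   module load gnu7 openmpi3            # GCC 7.3.0 + OpenMPI 3.1.2
#   mpicc -O2 hello_mpi.c -o hello_mpi
#   mpirun -np 4 ./hello_mpi
```

Swapping toolchains is then a matter of loading a different pair of modules (e.g. the Intel compilers and Intel MPI) and recompiling.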
==== Application software ====
  - **Abaqus 2017**
  - **Abaqus 2018**
  - **Abaqus 2019**
  - **Ansys 19.3 2019 R1** (Fluent and LS-DYNA)
  - **Comsol 5.4**
  - **Matlab R2019a** with parallel support
  - **Python 2.x** stack with SciPy, NumPy, mpi4py
  - **Python 3.4** stack with SciPy, NumPy, mpi4py
  - **ADIOS**
  - **HDF5** and **pHDF5**
  - **NetCDF** and **pNetCDF**
  - **SIONlib**
  - **R 3.5**
  - **Trilinos 12.12.1**
Many more libraries and tools for parallel and serial scientific programming are also available.
===== Usage guides =====
[[Queues and Resources|Queues and access to the cluster]]\\
[[pbs_jobfile_structure|PBS jobfile structure]]\\
[[Modules|Modules]]
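The pages above give the authoritative details; as a rough, hedged sketch, a PBSPro jobfile combines `#PBS` directives, module loads, and the program invocation. The queue name (`workq`), resource request, and module names below are placeholders, not this cluster's actual values:

```shell
# Generate a minimal PBSPro jobfile; every value below is a placeholder --
# see the Queues and Resources page for the real queue names and limits.
cat > job.pbs <<'EOF'
#!/bin/bash
#PBS -N example_job
#PBS -q workq
#PBS -l select=2:ncpus=4:mpiprocs=4
#PBS -l walltime=01:00:00
#PBS -j oe

cd "$PBS_O_WORKDIR"
module load gnu7 openmpi3
mpirun ./my_program
EOF

# On the cluster:
#   qsub job.pbs           # submit the job
#   qstat -u "$USER"       # check its status
```

`select=2:ncpus=4:mpiprocs=4` asks for 2 chunks of 4 cores each with 4 MPI ranks per chunk; adjust it to the node types in the hardware table above.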