Back in April we did a piece on the digital storage used in imaging the event horizon of the black hole in the neighboring galaxy M87, 55 million light-years away, including the Western Digital helium-sealed HDDs used to capture data at the eight radio telescopes. That data was analyzed by astronomers at MIT and the Max Planck Institute (MPI) in Germany. Since then, Supermicro contacted us about the servers and storage used to process the event horizon data into the images. What follows is some of what we found out.
According to Helge Rottmann of MPI, “For the 2017 observations the total data volume collected was about 4 PB.” He also said that starting in 2018 the data collected by the EHT doubled to about 8 PB. The digital backends (DBEs) acquired data from the upstream detection equipment using two 10 Gb/s Ethernet network interface cards at 128 Gb/s. Data was written using a time-sliced, round-robin algorithm across the 32 HDDs. The drives were mounted in groups of eight in four removable modules. After the data was collected, the HDD modules were flown to the Max Planck Institute for Radio Astronomy in Bonn, Germany, for high-frequency-band data analysis and to the MIT Haystack Observatory in Westford, Massachusetts, for low-frequency-band data analysis.
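To make the recording scheme concrete, here is a minimal sketch of time-sliced, round-robin writing across a bank of drives. It is not the actual recorder firmware; the mount paths, file name and slice size are illustrative assumptions.

```python
import itertools
import os

# Hypothetical layout: 4 removable modules of 8 HDDs each (32 drives total).
DRIVE_MOUNTS = [f"/mnt/module{m}/disk{d}" for m in range(4) for d in range(8)]
SLICE_BYTES = 8 * 1024 * 1024  # assumed size of one time slice per write

def record(stream, drives=DRIVE_MOUNTS, slice_bytes=SLICE_BYTES):
    """Write successive time slices of `stream` to each drive in turn."""
    files = [open(os.path.join(d, "capture.vdif"), "ab") for d in drives]
    try:
        chunks = iter(lambda: stream.read(slice_bytes), b"")
        for f, chunk in zip(itertools.cycle(files), chunks):
            f.write(chunk)  # slice n lands on drive n mod 32
    finally:
        for f in files:
            f.close()
```

Striping slices round-robin lets the aggregate write rate scale with the number of drives, since no single HDD has to keep up with the full input stream.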
Vincent Fish from the MIT Haystack Observatory said, “It has traditionally been too expensive to keep the raw data, so the disks get erased and sent out again for recording. This could change as disk prices continue to come down. We still have the 2017 data on disk in case we find a compelling reason to re-correlate it, but in general, once you’ve correlated the data correctly, there isn’t much need to keep petabytes of raw data around anymore.”
The imaging utilized Supermicro servers and storage systems in supercomputers, called correlators, at MIT and MPI. The correlator uses an a priori Earth geometry and a clock/delay model to align the signals from each telescope to a common time reference. In addition, the sensitivity of the antennas had to be taken into account to derive correlation coefficients between the different antennas.
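The core idea can be illustrated with a short sketch: shift each station's samples by the delay predicted by the geometry/clock model, then compute a normalized correlation coefficient between station pairs. The integer-sample delay, the synthetic signals and the function names are simplifying assumptions, not the DiFX implementation.

```python
import numpy as np

def align(samples, delay_samples):
    """Shift a station's sample stream by the model-predicted delay."""
    return np.roll(samples, -delay_samples)

def correlation_coefficient(x, y):
    """Normalized cross-correlation at zero lag after alignment."""
    x = x - x.mean()
    y = y - y.mean()
    return np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))

# Two stations see the same sky signal, but one with a geometric delay and
# each with its own receiver noise.
rng = np.random.default_rng(0)
common = rng.standard_normal(1_000_000)
station_a = common + 0.5 * rng.standard_normal(common.size)
station_b = np.roll(common, 240) + 0.5 * rng.standard_normal(common.size)

aligned_b = align(station_b, 240)  # delay taken from the a priori model
print(correlation_coefficient(station_a, aligned_b))
```

Without the alignment step the correlation coefficient collapses toward zero, which is why an accurate Earth geometry and clock model is essential before any correlation is attempted.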
The actual processing was performed with the DiFX software running on high-performance computing clusters at MPI and MIT. The clusters are composed of hundreds of servers, thousands of cores, high-performance networking (25 GbE and FDR InfiniBand) and RAID storage servers. The photograph below shows the MIT correlator.
The MIT cluster is housed in 10 racks. Three generations of Supermicro servers were used, the newest having two 10-core Intel® Xeon® CPUs. The network consists of Mellanox® 100/50/40/25 GbE switches, with the majority of nodes connected to the high-speed network at 25 GbE or higher through Mellanox PCIe add-on NICs. In addition to the recorders, there is half a petabyte of storage spread across the various Supermicro storage servers for staging raw data and archiving correlated data products.
The MPI cluster, located in Bonn, Germany, comprises 3 Supermicro head-node servers, 68 compute nodes (20 cores each, for 1,360 cores), 11 Supermicro RAID storage servers running the BeeGFS parallel file system with a capacity of 1.6 petabytes, FDR InfiniBand networking, and 21 playback units spanning a couple of hardware generations.
Because of the high resource demands of DiFX, 2×2 clustered Supermicro head nodes (4 total), shown in the figure below, are required to launch the correlations, farm out the pieces of each correlation to the compute nodes, collect and combine the processed correlation pieces, and write out the correlated data products. These 4U, two-processor systems with 24 3.5” drive bays use the onboard hardware SAS RAID controller to achieve high output data rates and data protection.
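The head-node role follows a familiar scatter/gather pattern, sketched below under stated assumptions: split the observation into chunks, farm each chunk out to workers (standing in for compute nodes), then gather and write the combined product. The function `correlate_chunk` is a hypothetical stand-in for the per-chunk correlation, not DiFX itself.

```python
from concurrent.futures import ProcessPoolExecutor
import numpy as np

def correlate_chunk(chunk_id):
    """Placeholder for correlating one time slice of baseline data."""
    rng = np.random.default_rng(chunk_id)
    return chunk_id, rng.standard_normal(64)  # stand-in visibility spectrum

def run_correlation(n_chunks=1024, n_workers=32, out_path="correlated.npy"):
    results = {}
    with ProcessPoolExecutor(max_workers=n_workers) as pool:
        for chunk_id, spectrum in pool.map(correlate_chunk, range(n_chunks)):
            results[chunk_id] = spectrum            # collect pieces as they return
    combined = np.stack([results[i] for i in range(n_chunks)])
    np.save(out_path, combined)                     # write the correlated data product

if __name__ == "__main__":
    run_correlation()
```

Gathering and writing the combined result on the head node is what drives its need for hardware RAID and high output bandwidth.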
A total of 60 compute nodes make up the MIT cluster. Thirty-eight of them are Supermicro TwinPro multi-node systems (19 chassis in total), shown below, with Intel Xeon E5-2640 v4 processors. Each Twin multi-node system contains two independent dual-processor compute nodes in a single chassis, doubling the density of traditional rackmount systems, with shared power and cooling for improved power efficiency and serviceability.
The cluster also contains 16 previous-generation compute nodes, 3U dual-socket Supermicro servers with Intel Xeon E5-2680 v2 processors, shown below.
The clustered storage nodes are configured with redundant high-efficiency power supplies and optimized redundant cooling to save energy, SAS3 expander options for ease of interconnection, and a variety of drive bay options depending on the task.
At the core of the MIT DiFX correlator is a high-performance data storage cluster based on four Supermicro storage systems, which delivers high I/O throughput and data availability through 10 Gigabit Ethernet networking fabrics and RAID controllers.
These systems are built from a selection of Supermicro serverboards and chassis supporting dual or single Intel® Xeon® processors, SAS3 drives with onboard hardware RAID controllers, onboard dual 10 GbE for efficient networking, up to 2 TB of DDR4 memory, and 7 PCI-E 3.0 expansion slots for external drive expansion.
EHT observations result in data spanning a wide range of signal-to-noise ratio (S/N) due to the heterogeneous nature of the array, and the high observing frequency produced data that were particularly sensitive to systematics in the signal chain. These factors, along with the typical challenges associated with VLBI, motivated the development of specialized processing and calibration techniques.
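One simple way to cope with data spanning such a wide range of S/N is to weight visibilities by their estimated signal-to-noise ratio and flag points below a threshold. The sketch below illustrates only that generic idea; it is not the EHT collaboration's calibration pipeline, and the threshold value is an arbitrary assumption.

```python
import numpy as np

def snr_weight(visibilities, noise_sigma, snr_floor=3.0):
    """Return weights proportional to S/N^2, zeroing low-S/N samples."""
    snr = np.abs(visibilities) / noise_sigma
    weights = snr ** 2
    weights[snr < snr_floor] = 0.0   # flag points too noisy to trust
    return weights
```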
The end result of all this work, involving an intense international collaboration, was the first image of a black hole event horizon.
Also, Supermicro just released a line of servers and storage systems using the EDSFF NVMe form factor. There are two of these standardized form factors (formerly known as the Intel Ruler): E1.L (Long), which is 1U x 318.75 mm long, and E1.S (Short), which is 1U x 111.49 mm long. An image of the two form factors is shown below. Both form factors support hot plugging and are built to enable effective cooling and high space utilization.
The EHT black hole event horizon imaging involved international teams of astronomers and sophisticated data processing on Supermicro server and storage hardware. Future generations of EHT images may use NVMe SSD storage as well as HDDs, particularly for data from a future space radio telescope.