HPC fabrics math.
For all its complexities, high-performance computing (HPC) nevertheless occasionally runs on relatively straightforward math. For instance, when it comes to balancing CPU/GPU workloads, whole numbers can be better than fractions.
In other words, when a cluster is routing its threads across a non-whole number of nodes, the scheduling just becomes that much more complicated. As an analogy, think of a taxi company that services three regions of a city. Ideally, they would want to have the number of taxis out at any given time be a whole-number multiple of three. Otherwise, presuming all three regions of the city have approximately the same amount of activity on any given day, some taxis will be covering more than one region and thus wasting time, money, and petrol needlessly shuttling back and forth.
Likewise, an HPC cluster with a non-integer ratio of ports to nodes is going to suffer unnecessary lag and jitter from the splitting and rerouting of threads it has to do simply as a matter of course.
So HPC high-speed interconnects provide an application of the taxi dispatcher problem, as it were, etched in silicon and carried across copper and fiber-optic cables. Hewlett Packard Enterprise’s Apollo 6000 is a flagship HPC cluster that powers leading-edge supercomputer deployments, from the German chemical company BASF to Denmark’s Centre for Biological Sequence Analysis to the New Zealand-based, Academy Award®-winning visual effects studio Weta Digital.
HPE Apollo 6000 and Intel® OPA integration.
In its architecture, Apollo 6000 is particularly tuned to interconnect fabrics whose port counts are multiples of 24. As a counterexample, consider an interconnect fabric with 36 ports. In the Apollo 6000, such a fabric would connect 24 ports to the server’s nodes, leaving just 12 uplinks to the rest of the network. This is called the 2-to-1 bisection problem: the nodes can offer twice as much traffic as the uplinks can carry, which makes it difficult to route traffic so that the system keeps running at optimum speed.
The best way to avoid the 2-to-1 bisection problem, of course, is to avoid such interconnect topologies in the first place. Intel’s Omni-Path Architecture provides 48 ports, which for Apollo 6000 means one can connect 24 ports to the nodes and 24 ports to the uplinks, maintaining full bandwidth.
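The port arithmetic above can be checked with a short sketch. The function name `oversubscription` is hypothetical, and the model assumes every port runs at the same speed, so the ratio of node-facing ports to uplink ports stands in for the ratio of offered to available bandwidth.

```python
from fractions import Fraction


def oversubscription(total_ports: int, node_ports: int) -> Fraction:
    """Ratio of node-facing ports to uplink ports.

    Assumes all ports run at the same line rate (an illustrative
    simplification, not a statement about any specific product).
    """
    uplinks = total_ports - node_ports
    if uplinks <= 0:
        raise ValueError("no ports left for uplinks")
    return Fraction(node_ports, uplinks)


# Compare the two port counts discussed in the text, with 24 node ports each.
for ports in (36, 48):
    ratio = oversubscription(ports, node_ports=24)
    print(f"{ports} ports: {ratio.numerator}-to-{ratio.denominator}")
```

With 24 node ports, the 36-port case comes out at 2-to-1 and the 48-port case at 1-to-1, which is the whole-number balance the article is describing.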
For cluster reliability and availability, not to mention speed and consistency, the simplest solution to the bisection problem remains the best solution.
As can be seen in the network diagram discussed in this Intel® product brief, the difference between 36 ports and 48 ports can mean a whole world of performance improvements.
To interconnect a sample 768-node cluster, a 36-port fabric would require forty-three 36-port edge switches plus two 648-port director switches, arranged as a five-hop fat tree. This same 36-port cluster would also require 1,542 cables (50% more than with the simple 48-port Omni-Path Architecture) and 99u of rack space. Worst of all is the 680-nanosecond switch latency this five-hop fabric would entail.
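The switch and cable counts for the 36-port design can be reproduced with simple arithmetic. This is a sketch under one assumption that the text does not state: each 36-port edge switch splits its ports evenly, 18 down to nodes and 18 up to the director switches. The function name `edge_layer` is hypothetical.

```python
import math


def edge_layer(nodes: int, ports_per_switch: int):
    """Return (edge_switches, cables) for an edge layer built from
    fixed-size switches.

    Assumption: half of each edge switch's ports face nodes and the
    other half are uplinks.
    """
    down = ports_per_switch // 2
    up = ports_per_switch - down
    switches = math.ceil(nodes / down)   # round up: a partial switch is still a switch
    cables = nodes + switches * up       # one cable per node plus one per uplink
    return switches, cables


switches, cables = edge_layer(nodes=768, ports_per_switch=36)
print(switches, cables)  # 43 1542
```

Under that assumption, 768 nodes do not divide evenly across 18 node-facing ports, so the last edge switch is only partly filled: the same whole-number mismatch as in the taxi analogy.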
By contrast, the simple whole-number interconnects enable the Omni-Path Architecture to get by with no edge switches, 20u of rack space (79% less than the 36-port fabric), and a switch latency of 300-330 nanoseconds, which is 51-55% lower. (For details, see Intel’s comparison cluster architecture diagrams, linked to above.)
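The percentages quoted here follow directly from the rack-space and latency figures in the two designs. A minimal check, using a hypothetical helper named `percent_lower`:

```python
def percent_lower(baseline: float, value: float) -> float:
    """How much lower `value` is than `baseline`, as a percentage."""
    return (baseline - value) / baseline * 100


rack = percent_lower(99, 20)            # rack units: 36-port design vs. Omni-Path
latency_low = percent_lower(680, 330)   # slower end of the 300-330 ns range
latency_high = percent_lower(680, 300)  # faster end of the 300-330 ns range

print(f"rack space: {rack:.1f}% less")
print(f"switch latency: {latency_low:.1f}% to {latency_high:.1f}% lower")
```

The results are roughly 79.8% for rack space and 51.5% to 55.9% for latency; the article's 79% and 51-55% are these values truncated to whole percentages.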
Such streamlining enables Omni-Path interconnects in many applications to not only speed up a cluster but also reduce its operating budget.
So, like the straightforward math behind Omni-Path, the speedups and savings are easy to relate to.
To take charge of your HPC’s interconnect fabric, reduce your hardware cost and increase your cluster’s performance, explore HPE and Intel® OPA HPC solutions today.