In my previous article (Evolution of modern data center), I described how modern data centers are evolving by adopting distributed computing on modular commodity servers, which has yielded higher levels of service at lower cost. To that end, near-100% uptime and the agility to repurpose IT infrastructure are the most desirable attributes of a modern data center.
In this article, I’ll describe the technologies (compute, networking and storage) that are shaping the modern data center and their impact on software architectures. The article contains some technical detail, which I have tried to make easy to assimilate for everyone in the IT industry.
Figure 1 shows the improvements in CPU architecture with respect to number of transistors, performance, frequency, power and number of logical cores over the last four decades. From the graph, it’s clear that single-thread performance (blue dots in the graph) has improved only slightly in the last 5-6 years; for all practical purposes it has remained flat. Similarly, CPU frequency (green dots) has not shown much upward gain in the last few years. Higher frequencies raise the CPU temperature beyond operational limits, forcing the CPU either to be shut down or, failing that, to suffer thermal damage.
This limit on scaling up CPU frequency has forced processor manufacturers to add more and more independent cores (horizontal scaling) to the processor die (black dots in the graph). Thus, today, the Intels and the AMDs of the compute industry have already shipped CPUs with 16-72 cores per die. Each core on the die has a small private cache and shares a larger common cache with the other cores.
Impact on Software Architecture: Multiple cores mean that applications need to be written with parallelism in mind. Shared data accessed from multiple parallel threads must be guarded with locks; this is where software design becomes complex and tedious to debug and maintain. Also, a greedy, resource-intensive thread can hog the shared cache, depriving the other threads of effective use of it.
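As a minimal sketch of the lock-guarded shared access described above, the following Python example has several threads increment one shared counter. The names Counter and run_workers are hypothetical and chosen only for illustration. In standard CPython the global interpreter lock limits true parallel execution, so this shows the locking pattern rather than a multicore speedup.

```python
import threading


class Counter:
    """A shared counter whose updates are guarded by a lock."""

    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def increment(self):
        # The read-modify-write below is not atomic, so it runs under the lock.
        with self._lock:
            self._value += 1

    @property
    def value(self):
        with self._lock:
            return self._value


def run_workers(num_threads=8, increments_per_thread=10_000):
    """Start num_threads threads that all update the same Counter."""
    counter = Counter()

    def worker():
        for _ in range(increments_per_thread):
            counter.increment()

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value


if __name__ == "__main__":
    # With the lock in place, no increment is lost.
    print(run_workers())
```

Even in this tiny case, every access to the shared value has to go through the lock, which hints at why larger designs built this way become hard to debug and maintain.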
Thus, the overall performance of a multicore system is highly dependent on how well the application is architected for concurrency and controlled access to CPU resources, something that system developers and veteran architects agree is hard to achieve and demands seasoned experts in the field.
Hence, to achieve higher concurrency, the preferred software architecture in the modern data center is to break monolithic software modules into smaller independent tasks (microservices) and deploy them to run on those cores in parallel, like swim lanes.
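The swim-lane idea can be sketched in a few lines of Python, assuming the work can be partitioned into chunks that share no state. The function names (split_into_lanes, process_lane, run_in_lanes) and the sum-of-squares workload are hypothetical stand-ins for real independent tasks.

```python
import os
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def split_into_lanes(items, lanes):
    """Partition items round-robin into `lanes` independent chunks."""
    return [items[i::lanes] for i in range(lanes)]


def process_lane(chunk):
    """Hypothetical unit of work: reads only its own chunk, no shared state."""
    return sum(x * x for x in chunk)


def run_in_lanes(items, lanes=None, executor_cls=ProcessPoolExecutor):
    """Run one task per lane and combine the partial results."""
    lanes = lanes or os.cpu_count() or 1
    chunks = split_into_lanes(items, lanes)
    with executor_cls(max_workers=lanes) as pool:
        partials = list(pool.map(process_lane, chunks))
    return sum(partials)


if __name__ == "__main__":
    # One worker process per logical core by default.
    print(run_in_lanes(list(range(1_000_000))))
```

Because each lane owns its data, no locks are needed; the only coordination point is the final combination of partial results. Passing ThreadPoolExecutor as executor_cls keeps the same structure for I/O-bound tasks.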
In summary, process and resource isolation techniques such as server virtualization (hypervisors) and lightweight containers are the preferred and most practical ways to increase the utilization of multicore systems.
Networking: Ethernet converging both storage and networking
Today, 10 Gbps Ethernet is the norm in the data center, and it is poised to make relatively rapid leaps to 40 and 100 Gbps Ethernet, with 400 Gbps switch interconnects between data centers. That’s how quickly data center bandwidth demands are spiraling upward.
According to research by the IEEE, the growth in bandwidth requirements is being driven across several application spaces by simultaneous increases in users, access methodologies and services (such as video on demand and social media).
Owing to this demand, leading Ethernet vendors have announced a spectrum of 10, 25, 40, 50 and 100 GbE switches and adapters. These Ethernet devices offer not only higher bandwidth but also lower latency (< 10 microseconds). Perhaps the most important factor is the ability of the Ethernet community to keep the cost per bit falling over time, so that the exponential rise in traffic does not result in unsupportable costs.
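To put these link speeds in perspective, here is a small back-of-the-envelope calculation of the ideal time to move a payload across each of them. It assumes the full line rate is available and ignores protocol overhead, congestion and latency; the 1 TB payload and the function name transfer_seconds are illustrative choices, not figures from any vendor.

```python
def transfer_seconds(payload_bytes, link_gbps):
    """Ideal time to move payload_bytes over a link of link_gbps (10^9 bits/s)."""
    bits = payload_bytes * 8
    return bits / (link_gbps * 1e9)


if __name__ == "__main__":
    one_terabyte = 10**12  # decimal terabyte, chosen only for illustration
    for gbps in (10, 25, 40, 50, 100):
        seconds = transfer_seconds(one_terabyte, gbps)
        print(f"{gbps:>3} GbE: {seconds:6.1f} s")
```

Under these assumptions, the same terabyte that needs 800 seconds on 10 GbE needs 80 seconds on 100 GbE.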
The lower latency (< 10 microseconds) and higher bandwidth of Ethernet have led system designers to layer high-bandwidth data access protocols such as RDMA over Ethernet (RoCE, RDMA over Converged Ethernet). RoCE has demonstrated data transfer capabilities between systems that were possible only with InfiniBand or Fibre Channel networks a few years ago.
Thus, storage protocols like NVMe (Non-Volatile Memory Express) are being implemented over RoCE, extending Ethernet to handle high-performance computing workloads.
Impact on Software Architecture: The availability of higher network bandwidth with lower latency enables distributed computing to scale to thousands of nodes. It also means that many redundant microservices can span thousands of nodes, increasing compute throughput and aiding real-time processing of data.
Thus, faster networks have enabled distributed computing at hyperscale (> 1000 nodes) and allowed bandwidth-hungry storage protocols to converge on Ethernet.
Storage: Non volatile memory; new storage tier
In recent years, the IT industry has seen a proliferation of new storage media, providing a much wider range of performance and price points.
Figure 2 shows a graphical representation of the emerging landscape of storage media by comparing them in price and access latency. Today’s single-level cell (SLC) flash SSDs provide 1000x the IOPS of near-line disks for 32x the price per GB. The important thing to note here is that flash SSDs are actually cheaper than disks when measured in dollars per IOPS. These gaps will widen in the coming years with the introduction of storage-class memory (SCM) and of less expensive, denser cloud and archive disk drives.
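The dollars-per-IOPS claim follows directly from the two ratios quoted above. The short calculation below makes the step explicit, assuming the comparison is between an SSD and a near-line disk of equal capacity; the function name relative_cost_per_iops is a hypothetical helper.

```python
def relative_cost_per_iops(price_ratio_per_gb, iops_ratio):
    """Cost per IOPS of the SSD relative to the disk, at equal capacity.

    At equal capacity the SSD costs price_ratio_per_gb times as much and
    delivers iops_ratio times the IOPS, so its cost per IOPS scales by
    price_ratio_per_gb / iops_ratio.
    """
    return price_ratio_per_gb / iops_ratio


if __name__ == "__main__":
    ratio = relative_cost_per_iops(price_ratio_per_gb=32, iops_ratio=1000)
    print(f"SSD cost per IOPS relative to disk: {ratio:.3f}")
    print(f"Disk is about {1 / ratio:.0f}x more expensive per IOPS")
```

With the figures above, the SSD costs 32/1000 = 0.032 of what the disk costs per IOPS, so the disk is roughly 31x more expensive on that measure even though it is 32x cheaper per GB.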
SCM is a class of memories that are byte-addressable and non-volatile, ideally accessed over the memory bus (DDR) in the system. SCM access latencies are comparable to DRAM access latencies. SCM can be consumed either as a replacement for DRAM on the DDR bus (byte-addressed like memory) or as a better flash on the PCIe bus using the Non-Volatile Memory Express (NVMe) parallel block protocol.
Impact on Software Architecture: Using SCM as a drop-in replacement for DRAM is the most disruptive option. Today’s applications don’t expect a corrupted memory state to survive a crash, so a lot will have to change in the application development ecosystem before persistence becomes a first-class citizen on the memory bus. To that end, the SNIA NVM Programming Technical Workgroup is focused on standardizing the programming model for persistent memory.
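To illustrate why persistence changes how updates must be written, here is a minimal sketch that uses an ordinary memory-mapped file as a stand-in for persistent memory. The record layout (a one-byte valid flag followed by an 8-byte value), the region size and all function names are assumptions made for this example, and it assumes the single-byte flag write is not torn. On real persistent memory the flush step would typically be CPU cache-flush instructions issued through a dedicated library rather than mmap.flush().

```python
import mmap
import os
import struct
import tempfile

REGION_SIZE = 4096   # hypothetical size of the persistent region
VALUE_FMT = "<Q"     # one 8-byte unsigned integer
VALUE_OFFSET = 8     # payload starts after the flag byte (plus padding)


def open_region(path):
    """Map a file into memory; the file stands in for persistent memory."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        os.ftruncate(fd, REGION_SIZE)
        return mmap.mmap(fd, REGION_SIZE)
    finally:
        os.close(fd)


def store(region, value):
    """Update the record so that a crash never leaves a half-written value marked valid."""
    region[0] = 0                                   # 1. mark the record invalid
    region.flush()
    struct.pack_into(VALUE_FMT, region, VALUE_OFFSET, value)  # 2. write payload
    region.flush()
    region[0] = 1                                   # 3. publish the record
    region.flush()


def load(region):
    """Return the stored value, or None if no complete record is present."""
    if region[0] != 1:
        return None
    return struct.unpack_from(VALUE_FMT, region, VALUE_OFFSET)[0]


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "region.bin")
    region = open_region(path)
    store(region, 42)
    region.close()
    # Reopening the mapping simulates a restart: the data is still there.
    region = open_region(path)
    print(load(region))
    region.close()
```

The ordering of the three flushes is the point: with volatile DRAM an interrupted update simply disappears at restart, whereas with persistent memory the application itself has to make sure a partial update is recognizable as such.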
Another way to view SCM is as a better flash. Its endurance is about 30x better than that of the best SLC NAND flash on the market today. This media can also be packaged as an NVMe device accessed over the standard PCIe bus. In that form, 4K random read and random write performance are expected to be symmetrical, quite unlike NAND flash. With 10x lower latency and much higher bandwidth than NAND flash, SCM introduces a new performance tier in the application data access path, making caching and tiering even more attractive.
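A rough way to see the benefit of an SCM tier in front of NAND flash is the usual weighted-average latency of a two-tier system. In the sketch below, the 100-microsecond NAND latency and the 90% hit ratio are hypothetical values chosen for illustration; only the 10x latency ratio comes from the discussion above.

```python
def effective_latency_us(hit_ratio, fast_us, slow_us):
    """Average access latency when hit_ratio of accesses are served by the fast tier."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * fast_us + (1.0 - hit_ratio) * slow_us


if __name__ == "__main__":
    nand_us = 100.0          # hypothetical NAND flash access latency
    scm_us = nand_us / 10    # 10x lower latency for the SCM tier
    for hit_ratio in (0.0, 0.5, 0.9, 0.99):
        avg = effective_latency_us(hit_ratio, scm_us, nand_us)
        print(f"hit ratio {hit_ratio:4.2f}: {avg:6.1f} us average")
```

The average falls quickly as more of the working set fits in the faster tier, which is why a new tier between DRAM and NAND flash is attractive for caching.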
In summary, multicore systems and blazing-fast Ethernet, coupled with non-volatile memory storage, are challenging the software stack, which in some cases needs to be refactored or re-architected for maximum system utilization.