The host CPU determination is an early and important decision for Oxide. As this document will elucidate, this determination is both essential and nuanced: there are two options that seem to have very different tradeoffs, each carrying with it high levels of risk and uncertainty.
The following image relates this RFD to other ones in a hierarchical fashion.
Constraints
Instruction set architecture
While we laud alternative instruction set architectures, server-side computing remains dominated by x86. This dominance stems from multiple historical factors, not least that it triumphed in personal computing nearly four decades ago. Now, there is reason to believe that this dominance is more fragile than ever: with ARM’s Ares core (the basis for Amazon’s Graviton 2) there is at last a non-x86 core that can meaningfully compete with (Intel) x86 on the server — but it is not clear to what degree this will get adoption in the long tail of software development. (For example, will software developers who are increasingly returning to a binary model embrace cross compilation and two different instruction set targets?) In our target market of the enterprise space, it seems likely that any adoption of hybrid instruction sets will be slower still; Oxide will have x86-based host CPUs for the foreseeable future.
Evaluation Criteria
Performance
The most obvious criterion for the host CPU is performance. While quantifiable, performance is also incredibly nuanced: with so many different ways of describing the performance of a system, it’s easy to measure (or emphasize) an aspect of the system that is practically irrelevant. Ultimately, the performance that matters is the performance of a customer workload, which may or may not be what vendors measure or design around! That said, there remain obvious axes of microprocessor performance: clock rate; number of cores; cache size; cache architecture. (In terms of quantifying these, SPECrate is generally reasonable.) These axes are directly affected by density/process: barring architectural differences, one cannot expect (say) a 14nm part to ever be broadly competitive with a 7nm one. There are more architectural aspects to performance as well: on-die acceleration (e.g., SIMD extensions), compiler optimization, cache coherence architecture, etc. And then there are system aspects of performance that can be implied by processor choice: bus speed, memory speed, memory sizing, etc. Performance certainly isn’t our only criterion, and its importance must be kept in perspective, but it also must be viewed as central: Oxide racks will compete with extant infrastructure; the higher their performance (however measured, observed or perceived), the more those racks will enjoy a tail wind in our customer base.
Price
The second obvious criterion for any microprocessor is price. Historically, the CPU is the most expensive single component in the BOM, but the price can vary widely within a single family. For example, when it launched, Intel’s Cascade Lake ranged in price from $1600 to $17900 for a single socket, and that only includes the "Gold" and "Platinum" classes of Xeon. At the quantities that we are seeking (on the order of 96 sockets per rack), these price differences have enormous impact on the economics of the entire rack. In this regard, price almost always needs to be denominated, e.g., as $/core or $/SPECrate.
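As a minimal sketch of what denominating price looks like in practice, the following computes $/core and price per SPEC point for two rows copied from the comparison table later in this document. The `Part` class and its field names are illustrative assumptions, not part of any existing tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Part:
    """One row of the comparison table (only the fields needed here)."""
    name: str
    cores: int
    spec: float        # value of the table's SPEC column
    price_usd: float   # list price from the table

    @property
    def usd_per_core(self) -> float:
        return self.price_usd / self.cores

    @property
    def usd_per_spec(self) -> float:
        return self.price_usd / self.spec


# Two rows copied from the comparison table; other rows work the same way.
PARTS = [
    Part("EPYC 7702P", cores=64, spec=249.6, price_usd=4425),
    Part("CLX 8280", cores=28, spec=138.0, price_usd=10009),
]

for part in PARTS:
    print(f"{part.name}: ${part.usd_per_core:.0f}/core, "
          f"${part.usd_per_spec:.2f} per SPEC point")
```

Denominating by SPEC score as well as by core count matters because cores are not equivalent across families: a part that looks cheap per core can still be expensive per unit of delivered throughput.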
Thermal design point (TDP)
Critical for us (or anyone looking to achieve any kind of density) is the thermal design point (TDP). This is, more or less, the draw of the part that needs to be cooled. Historically, dense form factors like the OCP Tioga Pass form factor have allowed TDP per socket of around 165W; beyond that, a different form factor would need to be considered. Note that 250W is generally the threshold for air cooling; beyond a TDP of 250W, a CPU must be water cooled. Moreover, power for the rack must be considered: assuming an Open Rack, we should assume no more than ~15kW at ~32OU, or about 450W per OU. This must be all in (DRAM, NVMe, NIC, etc.), so the CPU power budget must be less than this, although to what degree is to be determined.
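The power arithmetic above can be sketched as follows. The rack figures and the 250W air-cooling threshold come from the text; `sockets_per_ou` is a hypothetical packaging parameter introduced only for illustration. The exact quotient of the two approximate rack figures is 468.75W per OU, in line with the roughly 450W quoted.

```python
RACK_POWER_W = 15_000       # "~15kW" from the text
RACK_HEIGHT_OU = 32         # "~32OU" from the text
AIR_COOLING_LIMIT_W = 250   # air-cooling threshold from the text


def power_per_ou(rack_power_w: float = RACK_POWER_W,
                 rack_height_ou: int = RACK_HEIGHT_OU) -> float:
    """All-in power budget for one OU (CPU, DRAM, NVMe, NIC, ...)."""
    return rack_power_w / rack_height_ou


def non_cpu_headroom_w(cpu_tdp_w: float, sockets_per_ou: float) -> float:
    """Power left in one OU after the CPUs; negative means over budget.

    sockets_per_ou is a hypothetical parameter, not a figure from
    this document.
    """
    return power_per_ou() - cpu_tdp_w * sockets_per_ou


def needs_water_cooling(cpu_tdp_w: float) -> bool:
    """True when the TDP is beyond the air-cooling threshold."""
    return cpu_tdp_w > AIR_COOLING_LIMIT_W


print(power_per_ou())
print(non_cpu_headroom_w(cpu_tdp_w=200, sockets_per_ou=1))
print(needs_water_cooling(280))
```

Whatever the eventual packaging, the headroom figure is what has to cover DRAM, NVMe, NIC and everything else in the OU.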
Transistor density/process
Transistor density has many implications for the system, in terms of performance, power and price. Regrettably, for historical reasons, transistor density (which ideally should be expressed in terms of transistors per fixed unit area) is expressed by the industry in terms of feature size, which doesn’t actually correspond to anything. (That is, in a 7nm process, and especially in the forthcoming 5nm processes, there is nothing that is in fact as small as the putative feature size.) This has given rise to absurd process nodes like Intel’s "14nm+++" (yes, three plusses). Making things worse, vendors make claims that are difficult to verify ("our 10nm is equivalent to their 7nm"). Density is critically important (there is unquestionably a density difference between, say, 14nm and 7nm), but it is likely best measured through its manifestation in the artifact: core count, cache sizing, TDP, etc.
Core density
Closely related to TDP and transistor density is core density. We are seeking a dense product offering, with more than 1344 cores in a 15 kW rack. Core density should be thought of per U — and if achieving that desired core density necessitates multiple sockets, issues around multiple sockets must be considered (see, e.g. "Cross-connect", below).
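To make the density target concrete, here is a small sketch that turns the 1344-core floor and the 15 kW rack budget into a minimum socket count and an all-in power figure per core. The function names are illustrative; since the text asks for more than 1344 cores, the results are lower bounds on sockets and upper bounds on watts per core.

```python
import math

TARGET_CORES = 1344     # core floor from the text
RACK_POWER_W = 15_000   # rack power budget from the text


def sockets_needed(cores_per_socket: int,
                   target_cores: int = TARGET_CORES) -> int:
    """Minimum number of sockets that reaches the core target."""
    return math.ceil(target_cores / cores_per_socket)


def all_in_watts_per_core(rack_power_w: float = RACK_POWER_W,
                          target_cores: int = TARGET_CORES) -> float:
    """Rack power divided across cores, covering everything in the rack."""
    return rack_power_w / target_cores


for cores in (64, 32, 28):
    print(f"{cores}-core parts: at least {sockets_needed(cores)} sockets")
print(f"{all_in_watts_per_core():.2f} W per core, all in")
```

The per-core figure includes DRAM, storage and networking, so the CPU's own W/core has to sit well below it.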
Yield
Related to transistor density is yield. We have no control over yield and yield often isn’t disclosed, but there are architectural decisions that can materially affect yield. In particular, chiplet designs can result in (much) higher yield because of their (much) smaller size. Similarly, tiled designs can circumvent some of the yield problems that may be present in a large monolithic design.
Platform
CPUs do not exist on their own; they exist in a larger system that must support them. CPUs are mated to specific platforms in several key attributes, but especially via I/O (i.e., generation of PCIe) and memory (i.e., generation of DDR SDRAM). This is particularly germane for us because we happen to be targeting a product that is arguably at the tail end of one generation or the leading edge of the next. Specifically, both Intel and AMD have roadmap CPUs (Sapphire Rapids and Genoa, respectively) that are targeting PCIe Gen 5 and DDR5; to build on either of these parts is to build a dependency on both of these technologies — each of which has economic, performance, complexity, and schedule risk implications.
Roadmap/schedule
Microprocessors take a long time to develop, especially given dense processes like 7nm. A roadmap can therefore be a concrete way of predicting the future, in that one can often know years ahead of time whether a particular CPU is going to be competitive. Roadmaps should be taken as a best case; schedules certainly can slip. In general, late projects get later, and when there is a history of slip, more can be reasonably expected.
Strategic relationship
Our strategic relationship with a microprocessor vendor can have an outsized effect: the ability to get the support we need when we need it can often come down to the depth of our relationship. As a startup, it is a challenge for us to be strategically relevant to much larger companies.
Security
Microprocessor security was brought to the fore with Spectre and Meltdown, which presaged a deluge of microprocessor vulnerabilities. There are many aspects to security: some practical (degree of historical vulnerability, history of effective embargoed communication) and some more architectural (role of secure enclaves, secure virtualization, etc.).
Firmware
Microprocessors require firmware to boot and operate correctly. Firmware considerations include: degree of openness, degree of support, degree of platform specificity, etc. The single most important axis for us with respect to firmware is our ability to control and attest our entire stack: we want to minimize our dependency on third parties (that is, firmware delivered neither by us nor by the component vendor). In particular, nearly all computing systems have a dependency on third-party BIOS vendors such as AMI (formerly American Megatrends, Inc.); we view it as a constraint to avoid this spurious dependency, for myriad reasons.
Hyperprivileged software
x86 architectures contain software that runs in a hyperprivileged state: for Intel, this is the Intel Management Engine (ME); for AMD, it is the Platform Security Processor (PSP). Understanding this software — its scope, its firmware, the conditions under which it operates — can be critical for understanding many different aspects of the system (not least its security).
Cross-connect
Where multi-socket systems are to be considered, cross-connect architecture (protocol, bandwidth, latency) should be understood. AMD and Intel have different architectures for cache coherence (PCIe Gen 4 vs. UPI in Rome and Cascade Lake, respectively), which can introduce asymmetries in the system. Specifically, in two-socket systems, the I/O path must be understood to know if sockets are asymmetric with respect to I/O.
ISA extensions
The two x86 vendors extend the ISA in different ways. Some of these extensions can be significant (e.g., AVX-512) and can have an outsized impact on performance.
Top-of-rack considerations
The Oxide rack integrates a top-of-rack controller. It is desirable for this controller to have the same CPU architecture as the rest of the rack. Some microprocessors may lend themselves better to this integration than others.
Root-of-trust implications
Different microprocessors have different mechanisms for executing the first instruction. We believe that our root-of-trust will be implementable with either Intel or AMD, but the effort involved in each path will surely not be identical.
RAS features
Features for reliability, availability and serviceability differ in ways that can affect the ability for system software to be resilient to certain kinds of failure.
Microcode architecture
Microprocessors have microcode that the host can dynamically load. How the microprocessor is able to load microcode can have ramifications for security and availability.
Evaluation
Given the constraints, the evaluation boils down to AMD (Rome, Milan, Genoa) versus Intel (Cascade Lake, Ice Lake, Sapphire Rapids).
Density/Performance/TDP/Price
This table compares publicly available CPUs: AMD’s Rome (EPYC) to Intel’s Cascade Lake (CLX):
Processor | Cores | Threads | Base Freq (GHz) | L3 | TDP (W) | W/core | SPEC | Price | $/core |
---|---|---|---|---|---|---|---|---|---|
EPYC 7702P | 64 | 128 | 2 | 256 MB | 200 | 3.13 | 249.6 | $4,425 | $69 |
EPYC 7702 | 64 | 128 | 2 | 256 MB | 200 | 3.13 | 234.9 | $6,450 | $101 |
EPYC 7742 | 64 | 128 | 2.25 | 256 MB | 225 | 3.52 | 261.7 | $6,950 | $109 |
EPYC 7552 | 48 | 96 | 2.2 | 192 MB | 200 | 4.17 | 201.8 | $4,025 | $84 |
EPYC 7642 | 48 | 96 | 2.3 | 256 MB | 225 | 4.69 | 230.5 | $4,775 | $99 |
EPYC 7452 | 32 | 64 | 2.35 | 128 MB | 155 | 4.84 | 175.9 | $2,025 | $63 |
CLX 6262V | 24 | 48 | 1.9 | 33 MB | 135 | 5.63 | 103.8 | $2,900 | $121 |
CLX 6238T | 22 | 44 | 1.9 | 30.25 MB | 125 | 5.68 | 104.5 | $2,742 | $125 |
CLX 6222V | 20 | 40 | 1.8 | 27.5 MB | 115 | 5.75 | 91.6 | $1,600 | $80 |
CLX 5220T | 18 | 36 | 1.9 | 24.75 MB | 105 | 5.83 | 90.1 | $1,727 | $96 |
CLX 8276 | 28 | 56 | 2.2 | 38.5 MB | 165 | 5.89 | 127.2 | $8,719 | $311 |
CLX 8276L | 28 | 56 | 2.2 | 38.5 MB | 165 | 5.89 | 128 | $11,722 | $419 |
CLX 8276M | 28 | 56 | 2.2 | 38.5 MB | 165 | 5.89 | 127 | $11,722 | $419 |
CLX 4216 | 16 | 32 | 2.1 | 22 MB | 100 | 6.25 | 83.9 | $1,002 | $63 |
EPYC 7502P | 32 | 64 | 2.5 | 128 MB | 200 | 6.25 | 184.1 | $2,300 | $72 |
EPYC 7502 | 32 | 64 | 2.5 | 128 MB | 200 | 6.25 | 183.5 | $2,600 | $81 |
CLX 6252 | 24 | 48 | 2.1 | 35.75 MB | 150 | 6.25 | 118.4 | $3,655 | $152 |
CLX 6252N | 24 | 48 | 2.3 | 35.75 MB | 150 | 6.25 | 119 | $3,984 | $166 |
CLX 6230 | 20 | 40 | 2.1 | 27.5 MB | 125 | 6.25 | 102.5 | $1,894 | $95 |
CLX 6230T | 20 | 40 | 2.1 | 27.5 MB | 125 | 6.25 | 101.5 | $1,988 | $99 |
CLX 6230N | 20 | 40 | 2.3 | 27.5 MB | 125 | 6.25 | 102.8 | $2,046 | $102 |
EPYC 7402 | 24 | 48 | 2.8 | 128 MB | 155 | 6.46 | 169.8 | $1,783 | $74 |
CLX 6238 | 22 | 44 | 2.1 | 30.25 MB | 140 | 6.36 | 109.1 | $2,612 | $119 |
CLX 6238M | 22 | 44 | 2.1 | 30.25 MB | 140 | 6.36 | 110.9 | $5,615 | $255 |
CLX 6238L | 22 | 44 | 2.1 | 30.25 MB | 140 | 6.36 | 110.3 | $5,615 | $255 |
CLX 5218T | 16 | 32 | 2.1 | 22 MB | 105 | 6.56 | 85.2 | $1,349 | $84 |
CLX 5218N | 16 | 32 | 2.3 | 22 MB | 110 | 6.88 | 86.5 | $1,375 | $86 |
CLX 8260 | 24 | 48 | 2.4 | 35.75 MB | 165 | 6.88 | 119.3 | $4,702 | $196 |
CLX 8260Y | 24 | 48 | 2.4 | 35.75 MB | 165 | 6.88 | 122.5 | $5,320 | $222 |
CLX 8260L | 24 | 48 | 2.4 | 35.75 MB | 165 | 6.88 | 122.7 | $7,705 | $321 |
CLX 8260M | 24 | 48 | 2.4 | 35.75 MB | 165 | 6.88 | 112.8 | $7,705 | $321 |
CLX 5220 | 18 | 36 | 2.2 | 24.75 MB | 125 | 6.94 | 94.5 | $1,555 | $86 |
CLX 5220S | 18 | 36 | 2.7 | 24.75 MB | 125 | 6.94 | 95.5 | $2,000 | $111 |
EPYC 7542 | 32 | 64 | 2.9 | 128 MB | 225 | 7.03 | 183.2 | $3,400 | $106 |
CLX 4214 | 12 | 24 | 2.2 | 16.5 MB | 85 | 7.08 | 68 | $694 | $58 |
CLX 4214Y | 12 | 24 | 2.2 | 16.5 MB | 85 | 7.08 | 67.4 | $768 | $64 |
CLX 8280 | 28 | 56 | 2.7 | 38.5 MB | 205 | 7.32 | 138 | $10,009 | $357 |
CLX 8280L | 28 | 56 | 2.7 | 38.5 MB | 205 | 7.32 | 138 | $13,012 | $465 |
CLX 8280M | 28 | 56 | 2.7 | 38.5 MB | 205 | 7.32 | 136.6 | $13,012 | $465 |
EPYC 7282 | 16 | 32 | 2.8 | 64 MB | 120 | 7.50 | 92.8 | $650 | $41 |
EPYC 7352 | 24 | 48 | 2.3 | 128 MB | 180 | 7.50 | 160.3 | $1,350 | $56 |
CLX 6248 | 20 | 40 | 2.5 | 27.5 MB | 150 | 7.50 | 110 | $3,072 | $154 |
CLX 5218B | 16 | 32 | 2.3 | 22 MB | 125 | 7.81 | 89 | $1,273 | $80 |
CLX 5218 | 16 | 32 | 2.3 | 22 MB | 125 | 7.81 | 87.5 | $1,273 | $80 |
CLX 8253 | 16 | 32 | 2.2 | 22 MB | 125 | 7.81 | 79.1 | $3,115 | $195 |
CLX 8270 | 26 | 52 | 2.7 | 35.75 MB | 205 | 7.88 | 131.6 | $7,405 | $285 |
EPYC 7402P | 24 | 48 | 2.8 | 128 MB | 200 | 8.33 | 169.1 | $1,250 | $52 |
CLX 6240 | 18 | 36 | 2.6 | 24.75 MB | 150 | 8.33 | 103.5 | $2,445 | $136 |
CLX 6240Y | 18 | 36 | 2.6 | 24.75 MB | 150 | 8.33 | 103.6 | $2,726 | $151 |
CLX 6240M | 18 | 36 | 2.6 | 24.75 MB | 150 | 8.33 | 104.1 | $5,448 | $303 |
CLX 6240L | 18 | 36 | 2.6 | 24.75 MB | 150 | 8.33 | 103.4 | $5,448 | $303 |
CLX 4210 | 10 | 20 | 2.2 | 13.75 MB | 85 | 8.50 | 57.9 | $501 | $50 |
CLX 5215 | 10 | 20 | 2.5 | 13.75 MB | 85 | 8.50 | 61.8 | $1,221 | $122 |
CLX 5215M | 10 | 20 | 2.5 | 13.75 MB | 85 | 8.50 | 62.1 | $4,224 | $422 |
CLX 5215L | 10 | 20 | 2.5 | 13.75 MB | 85 | 8.50 | 61.8 | $4,224 | $422 |
CLX 8268 | 24 | 48 | 2.9 | 35.75 MB | 205 | 8.54 | 129.2 | $6,302 | $263 |
CLX 4209T | 8 | 16 | 2.2 | 11 MB | 70 | 8.75 | 46.4 | $501 | $63 |
CLX 6242 | 16 | 32 | 2.8 | 22 MB | 150 | 9.38 | 99.1 | $2,529 | $158 |
EPYC 7302P | 16 | 32 | 3 | 128 MB | 155 | 9.69 | 141.1 | $825 | $52 |
EPYC 7302 | 16 | 32 | 3 | 128 MB | 155 | 9.69 | 140.8 | $978 | $61 |
CLX 6226 | 12 | 24 | 2.7 | 19.25 MB | 125 | 10.42 | 82.8 | $1,776 | $148 |
CLX 4208 | 8 | 16 | 2.1 | 11 MB | 85 | 10.63 | 45.3 | $417 | $52 |
CLX 4215 | 8 | 16 | 2.5 | 11 MB | 85 | 10.63 | 52.4 | $794 | $99 |
CLX 6254 | 18 | 36 | 3.1 | 24.75 MB | 200 | 11.11 | 112.9 | $3,803 | $211 |
EPYC 7272 | 12 | 24 | 2.9 | 64 MB | 155 | 12.92 | 83.5 | $625 | $52 |
CLX 6246 | 12 | 24 | 3.3 | 24.75 MB | 165 | 13.75 | 92.1 | $3,286 | $274 |
CLX 3204 | 6 | 6 | 1.9 | 8.25 MB | 85 | 14.17 | 27 | $213 | $36 |
CLX 5217 | 8 | 16 | 3 | 11 MB | 115 | 14.38 | 56.5 | $1,522 | $190 |
EPYC 7232P | 8 | 16 | 3.1 | 32 MB | 120 | 15.00 | ? | $450 | $56 |
EPYC 7252 | 8 | 16 | 3.1 | 64 MB | 120 | 15.00 | 67 | $475 | $59 |
EPYC 7262 | 8 | 16 | 3.2 | 128 MB | 120 | 15.00 | 89 | $575 | $72 |
CLX 6234 | 8 | 16 | 3.3 | 24.75 MB | 130 | 16.25 | 68.9 | $2,214 | $277 |
CLX 6244 | 8 | 16 | 3.6 | 24.75 MB | 150 | 18.75 | 73.4 | $2,925 | $366 |
CLX 5222 | 4 | 8 | 3.8 | 16.5 MB | 105 | 26.25 | 38.6 | $1,221 | $305 |
CLX 8256 | 4 | 8 | 3.8 | 16.5 MB | 105 | 26.25 | 38.6 | $7,007 | $1,752 |
Intel’s forthcoming Ice Lake will likely have a 6262V equivalent (dubbed 6362V) with better density (32 or 36 cores), but with no better TDP/core than Cascade Lake. In essentially every conceivable way of looking at this data, Rome is superior to Cascade Lake: in price, in performance, in density, in TDP — and in all of these things when viewed in terms of the other.
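Because the W/core and $/core columns are derived from the TDP, price, and core-count columns, they can be rechecked mechanically. The following is a minimal sketch that parses one pipe-delimited row in the layout used above and confirms that the derived columns agree with the raw ones to within rounding. The function names and tolerances are assumptions chosen for illustration.

```python
def _money(cell: str) -> float:
    """Convert a price cell such as '$4,425' or '$694.00' to a float."""
    return float(cell.replace("$", "").replace(",", ""))


def parse_row(row: str) -> dict:
    """Split one pipe-delimited table row into the fields checked below."""
    cells = [c.strip() for c in row.strip().strip("|").split("|")]
    if len(cells) != 10:
        raise ValueError(f"expected 10 cells, got {len(cells)}")
    return {
        "name": cells[0],
        "cores": int(cells[1]),
        "tdp_w": float(cells[5]),
        "w_per_core": float(cells[6]),
        "price_usd": _money(cells[8]),
        "usd_per_core": _money(cells[9]),
    }


def derived_columns_consistent(row: str) -> bool:
    """Check W/core (to 0.01) and $/core (to the nearest dollar)."""
    r = parse_row(row)
    w_ok = abs(r["tdp_w"] / r["cores"] - r["w_per_core"]) <= 0.01
    usd_ok = abs(r["price_usd"] / r["cores"] - r["usd_per_core"]) <= 0.5
    return w_ok and usd_ok


ROW = "EPYC 7702P | 64 | 128 | 2 | 256 MB | 200 | 3.13 | 249.6 | $4,425 | $69 |"
print(parse_row(ROW)["name"], derived_columns_consistent(ROW))
```

This only tests internal consistency: a row whose TDP or price was entered incorrectly would still pass as long as its derived columns were computed from the same figures.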
Security
The following table depicts known vulnerabilities, and the degree that they historically affected Intel CPUs vs. AMD CPUs circa February 2020:
Vulnerability | Intel | AMD |
---|---|---|
? | Vulnerable | Vulnerable |
? | Vulnerable | Not vulnerable |
? | Vulnerable | Vulnerable |
? | Vulnerable | Not vulnerable |
? | Vulnerable | Not vulnerable |
? | Vulnerable | Vulnerable |
? | Vulnerable | Not vulnerable |
? | Vulnerable | Not vulnerable |
? | Vulnerable | Not vulnerable |
? | Vulnerable | Not vulnerable |
? | Vulnerable | Not vulnerable |
? | Vulnerable | Not vulnerable |
? | Vulnerable | Not vulnerable |
? | Vulnerable | Vulnerable |
Determinations
Based on the comparison table and its subsequent analysis, we are opting to go with AMD’s Milan CPUs. As we are trying to minimize the W/core and $/core, while still retaining a reasonable amount of frequency, we have chosen to build a platform that targets the SP3 Group D CPUs (§2.2.1 [amd-irm]) and a configurable TDP (cTDP) and package power limit (PPL) up to 240 W in a single socket (1P aka 1S) configuration. 64-core processors are the desired launch target.
Although AMD’s 64-core product stack has changed somewhat, first from Rome to our previous expectations for Milan and then to the Milan set documented at launch, this determination has not changed. As shown in [amd-rome-ptds] and [amd-milan-ptds], AMD offered 4 different 64-core Rome parts, of which we considered the 7702P (group A) and 7742 (group D) most likely to represent the type of processor we want to offer in our product. There is at present no Milan analogue to the 7742; AMD are planning a 7763 (group X, analogous to the 7H12 in group Z) and the two 7713 variants (the 1P 7713P and the 2P 7713, both in group D and analogous to the 7702P/7702 that were both in group A).
Therefore, the launch target is the 7713P. Due to availability constraints, bringup and early validation phases may utilize the 7713 and/or inexpensive Rome processors as required.
Non-Goals
Group X/Z processor support.
This does leave out processors such as the slightly higher-clocked 7763 and the frequency-optimized processors such as the 32-core 2.95 GHz 75F3; however, these all run hot with a default TDP and PPL of 280W and are designed for more specialty use cases that we do not believe make sense for us to target in this version of the product. Group X (Milan-only) parts also include configurable EDC up to 300 A. Support for these processors in our platform is mildly desirable for future flexibility but not a requirement and we do not expect to offer configurations with these processors.
2S configurations.
A 2S SP3 platform is not planned; however, every 1S platform automatically provides hardware support for both 1P and 2P processors of the same power/thermal infrastructure groups.
Other group D and lower processors.
By virtue of AMD’s infrastructure design, every group D platform will also support all group A, B, and C processors simply by meeting the group D requirements. Software support for and testing of the platform with group D or lower processors other than the 7713P (and 7713, due to availability constraints) is not planned but may be undertaken later with business justification.
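The scoping rules in this section can be summarized in a short sketch. The group assignments below are only those stated explicitly in this document; the data structures and function names are hypothetical and do not reflect any AMD or Oxide tooling.

```python
# Groups covered by a group D platform, per the text: D plus A, B and C.
SUPPORTED_GROUPS = frozenset("ABCD")

# Infrastructure groups for parts whose group is stated in this document.
PART_GROUPS = {
    "7702P": "A", "7702": "A", "7742": "D",   # Rome
    "7713P": "D", "7713": "D",                # Milan
    "7763": "X", "7H12": "Z",                 # out of scope (non-goal)
}

# Parts with planned software support and testing.
PLANNED_VALIDATION = frozenset({"7713P", "7713"})


def in_platform_scope(part: str) -> bool:
    """True if the part's group is covered by the group D platform."""
    return PART_GROUPS[part] in SUPPORTED_GROUPS


def planned_for_testing(part: str) -> bool:
    """True only for the launch target and its 2P sibling."""
    return part in PLANNED_VALIDATION


for part in ("7713P", "7702P", "7763"):
    print(part, in_platform_scope(part), planned_for_testing(part))
```

The distinction between the two functions mirrors the text: hardware compatibility follows from the infrastructure group, while software support and testing are a separate and narrower commitment.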
Documentation and Sources
[amd-irm] Advanced Micro Devices. Infrastructure Roadmap (IRM) for Socket SP3 Processors. Publication number 55418, revision 1.18. 2020. Distributed only under NDA.
[amd-rome-ptds] Advanced Micro Devices. Power and Thermal Data Sheet for AMD Family 17h Models 30h-3Fh Socket SP3 Processors. Publication number 56585, revision 0.87. 2019. Distributed only under NDA.
[amd-milan-ptds] Advanced Micro Devices. Power and Thermal Data Sheet for AMD Family 19h Models 00h-0Fh Socket SP3 Processors. Publication number 56958, revision 0.88. January 2021. Distributed only under NDA.