RFD 203
Standard units for counting bits

This RFD proposes standard units and prefixes for counting "amounts of data" in our product, including code, APIs, the console, documentation, and other materials both internal and external. This might seem pedantic, particularly at small scale (the difference between 1 KB and 1 KiB is only about 2.4%), but rigor in this area is important at rack scale (the difference between 1 PB and 1 PiB is over 12%!), and transparency is essential for customers to understand the behavior of their Oxide system as well as their application. Imagine the experience of working a support case with a customer who is confused about poor performance and finding out that one of you misunderstood the other’s number and was an order of magnitude off.
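The gap between the two prefix families grows with every step up the scale. The arithmetic can be sketched as follows; the function name `prefix_gap_percent` is illustrative only and not part of any existing API.

```python
# Percentage by which an IEC (base-2) unit exceeds the SI (base-10) unit
# at the same prefix step: 1 = kilo/kibi, 2 = mega/mebi, ... 5 = peta/pebi.
# The function name is hypothetical, chosen for this illustration.
def prefix_gap_percent(step: int) -> float:
    si = 1000 ** step
    iec = 1024 ** step
    return (iec - si) / si * 100


names = [("KB", "KiB"), ("MB", "MiB"), ("GB", "GiB"), ("TB", "TiB"), ("PB", "PiB")]
for step, (si_name, iec_name) in enumerate(names, start=1):
    gap = prefix_gap_percent(step)
    print(f"1 {iec_name} is {gap:.1f}% larger than 1 {si_name}")
```

At the kilo step the gap is roughly 2.4%, and at the peta step it is roughly 12.6%.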

There are two main considerations:

  • Bits vs. bytes. Network throughput is traditionally measured in "bits per second", while everything else usually uses "bytes per second". When crossing layers of the stack, it’s easy for one person to write "30 Mbps" (or "30 MBps") and for someone else to misunderstand by a factor of eight.

  • SI (base-10) vs. IEC (base-2) prefixes. History makes this messy:

    • Networking has traditionally used SI (base-10) prefixes. 1 Mbit = 1 Megabit = 1,000,000 bits.

    • Blocks of memory are usually sized to powers of 2, but have historically been described using base-10 prefixes. A "1 MB block of memory" is usually really 1 MiB or 1,048,576 bytes, not 1,000,000 bytes.

    • Storage manufacturers have long reported device capacity using powers of 10 (a 1 TB hard drive is really 1,000,000,000,000 bytes). Operating systems often report storage using powers of 2 with base-10 prefixes (just like memory).
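To make both considerations concrete, here is a small sketch of the arithmetic behind them. The variable names are illustrative, and the figures are the ones used in the bullets above.

```python
BITS_PER_BYTE = 8

# The same "30 M...ps" figure, read as bits vs. bytes per second.
rate_if_bits = 30 * 1000**2                    # 30 Mbit/sec, in bits per second
rate_if_bytes = 30 * 1000**2 * BITS_PER_BYTE   # 30 MByte/sec, in bits per second
print(rate_if_bytes // rate_if_bits)           # the factor-of-eight misunderstanding

# A "1 MB block of memory" is usually really 1 MiB.
print(1024**2)                                 # 1048576 bytes, not 1000000

# A drive sold as 1 TB, expressed in the base-2 units an OS may report.
drive_bytes = 1 * 1000**4
print(f"{drive_bytes / 1024**4:.3f} TiB")
print(f"{drive_bytes / 1024**3:.1f} GiB")
```

The last two lines show why a "1 TB" drive can appear as roughly 0.909 TiB (about 931.3 GiB) when the operating system counts in powers of 2.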

Determinations

In all cases, when we quote a number, we should use the correct power-of-two or power-of-ten prefix for that quantity. If we’re quoting the exact size of something that is 2048 bytes, it should always say 2 KiB and never 2 KB.

The rest of this section gives suggestions for when to use powers of two vs. powers of ten and bits vs. bytes.

For network throughput, the base unit is "bits per second", summarized using decimal prefixes. So 1,000 bits per second would be 1 Kilobit per second. While some conventions use "bps" to mean "bits per second" and "Bps" to mean "bytes per second", we propose always being explicit: the most abbreviated way to say "1,000 bits per second" would be "1 Kbit/sec".
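A formatter that follows this convention might look like the sketch below. The helper name `format_bit_rate` and the choice of prefixes up to peta are assumptions made for illustration, not an existing API.

```python
# Hypothetical helper: render a rate given in bits per second using decimal
# (base-10) prefixes, spelling out "bit" instead of the ambiguous "bps".
def format_bit_rate(bits_per_second: float) -> str:
    prefixes = ["", "K", "M", "G", "T", "P"]
    value = float(bits_per_second)
    index = 0
    # Step up one decimal prefix for every factor of 1,000.
    while value >= 1000 and index < len(prefixes) - 1:
        value /= 1000
        index += 1
    return f"{value:g} {prefixes[index]}bit/sec"


print(format_bit_rate(1_000))        # 1 Kbit/sec
print(format_bit_rate(30_000_000))   # 30 Mbit/sec
```

Because the unit is always written as "bit/sec", a reader never has to guess whether a lowercase or uppercase "b" was intended.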

For memory and storage, the base unit is "bytes", summarized using base-2 prefixes. So 1024 bytes would be 1 Kibibyte, abbreviated KiB (not KB); 1024 * 1024 bytes would be 1 Mebibyte, abbreviated MiB (not MB); and so on. People who haven’t been burned by this before sometimes look sideways at these prefixes, but it’s just one extra letter, and it’s very clarifying for those who have been burned.
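The byte-count equivalent can be sketched the same way. Again, `format_bytes_iec` is a hypothetical name, and the use of "B" for values under 1 KiB is an assumption of this example.

```python
# Hypothetical helper: render a byte count using IEC (base-2) prefixes,
# so the label always matches the power-of-two arithmetic behind it.
def format_bytes_iec(num_bytes: int) -> str:
    prefixes = ["", "Ki", "Mi", "Gi", "Ti", "Pi"]
    value = float(num_bytes)
    index = 0
    # Step up one binary prefix for every factor of 1,024.
    while value >= 1024 and index < len(prefixes) - 1:
        value /= 1024
        index += 1
    return f"{value:g} {prefixes[index]}B"


print(format_bytes_iec(2048))          # 2 KiB
print(format_bytes_iec(1024 * 1024))   # 1 MiB
```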

| Quantity | Base unit | Prefixes | Example | Write this | Never write this |
|---|---|---|---|---|---|
| Network throughput | bits per second | base 10 | 1,000 bits per second | 1 Kbit/s, 1 Kbit/sec, 1 Kilobit/sec | 1 Kbps, 1 KBps, 1 KiBps, 1 Kibps |
| Memory, storage | bytes | base 2 | 1024 bytes | 1 KiB, 1 Kibibyte | 1 Kilobyte, 1 KB |

There are sometimes implementation-specific reasons to diverge from these suggestions, like if you’re working closely with a hardware device that measures its size in Kibits. The existence of such parts in the system means we will always have some components measured differently, which makes it all the more important that we use the correct label (and abbreviation) for each number.

Where there is a compelling reason to label a quantity in some alternate form (like disks that are exactly 3.2 TB, so showing "2.9 TiB" would be an approximation), we recommend using both forms, as in: "3.2 TB (2.9 TiB)".
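A dual label like this one can be produced from the raw byte count. The sketch below assumes terabyte-scale quantities and one decimal place of precision; `format_dual_tb` is a hypothetical name.

```python
# Hypothetical helper: show a capacity in both SI terabytes and IEC tebibytes.
# Assumes the quantity is large enough that TB/TiB are the natural units.
def format_dual_tb(num_bytes: int) -> str:
    tb = num_bytes / 1000**4    # SI: 1 TB = 10^12 bytes
    tib = num_bytes / 1024**4   # IEC: 1 TiB = 2^40 bytes
    return f"{tb:.1f} TB ({tib:.1f} TiB)"


print(format_dual_tb(3_200_000_000_000))   # 3.2 TB (2.9 TiB)
```

Putting the exact figure first and the approximation in parentheses keeps the manufacturer's number intact while still giving a value comparable to other base-2 quantities.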