RFD 634
Git stub files for Dropshot versioned APIs

[RFD 421] and [RFD 532] introduced versioning for Dropshot HTTP APIs, enabling automated system software updates [RFD 418]. As part of this, we’ve built a Dropshot API manager which stores OpenAPI documents corresponding to each version of an API within a directory. This document introduces Git stub files and describes how they address issues with versioned OpenAPI document storage and diffing.

Motivation

As part of the self-service update work, we introduced a distinction between lockstep and versioned APIs. For the purpose of this RFD, we’re focused on versioned APIs.

Versioned APIs

A versioned API is one whose clients are not all expected to be at the same version, typically because an automated update is in progress. Unlike a lockstep API, which is stored as a single file, a versioned API is stored as a directory containing one file per version. Once a version of an API is integrated into main, it is considered blessed: an immutable source of truth.

For example, the Sled Agent API is a versioned API, since a newer version of Sled Agent might have older versions of Nexus and other system components as clients. The Dropshot API manager stores each version of the Sled Agent API as a separate JSON file, and manages a latest symlink pointing to the latest version.

% tree openapi/sled-agent
openapi/sled-agent
├── sled-agent-10.0.0-898597.json
├── sled-agent-1.0.0-2da304.json
├── sled-agent-11.0.0-5f3d9f.json
├── sled-agent-12.0.0-ffacab.json
├── sled-agent-2.0.0-a3e161.json
├── sled-agent-3.0.0-f44f77.json
├── sled-agent-4.0.0-fd6727.json
├── sled-agent-5.0.0-253577.json
├── sled-agent-6.0.0-d37dd7.json
├── sled-agent-7.0.0-62acb3.json
├── sled-agent-8.0.0-0e6bcf.json
├── sled-agent-9.0.0-12ab86.json
└── sled-agent-latest.json -> sled-agent-12.0.0-ffacab.json

The [progenitor] client for Sled Agent refers to the latest symlink, so that when a new version of the Sled Agent API is added, the in-tree client automatically picks up the update.

Blessed, generated, and local APIs

Dropshot is designed so that the source of truth for OpenAPI documents is in Rust server types and endpoints, as expressed via the corresponding API trait [RFD 479]. The Dropshot API manager is responsible for generating OpenAPI documents from the API trait.

For each versioned API, the Dropshot API manager performs a three-way resolution process between:

  • Blessed documents for versions on the upstream branch (origin/main in Git). Blessed documents are canonical and immutable.

  • Generated documents, which are produced for each version from the Rust API trait.

  • Local documents as currently stored on disk.

The resolution process upholds several invariants:

  • There’s one OpenAPI document corresponding to each API version.

  • Blessed documents haven’t changed compared to the upstream branch.

  • Generated versions of blessed documents are only changed in semantically equivalent ways.

  • For non-blessed versions, local documents reflect generated ones.

Issues with versioned API storage

New API versions result in large diffs

When a new version of a versioned API is added, Git detects it as a brand new file. For example, Omicron PR #9434 added version 12 of the Sled Agent API. Though the actual diff between versions 11 and 12 was small, GitHub displayed over 9,000 added lines for the new sled-agent-12.0.0-ffacab.json file. GitHub’s web UI could not detect that the new version was a modification of the previous one.

Even simple doc comment updates result in extremely large diffs: Omicron PR #9623 changes a few lines of formatting, but results in a diff of approximately 30,000 lines.

This issue has two important consequences: it is hard to review API changes, and diff line counts become misleading.

For more about why Git cannot detect new versions of APIs as copies of previous versions, see Appendix: Why can Git not detect copies?.

Blame on versioned API documents is not meaningful

Running blame on an OpenAPI document—typically via GitHub’s blame view—is useful for determining when a type was added or changed. For lockstep APIs, blame works as expected. But for versioned APIs, since Git treats each version of an API as a brand new file, blame attributes every line to a single commit.

As an alternative, blame can be run on the .rs files that serve as the source of truth. But deeply nested type hierarchies can be scattered across many files, and being able to blame a single top-level file is valuable.

Working copy size

Each new version of an API increases working copy size. In Omicron, as of revision 7c56d21 (shortly before Git stub storage was enabled), the openapi subdirectory was 33 MiB (43%) out of a total working copy size of 76 MiB. This was a minor issue compared to the first two, but worth noting since the size grows with each new version.

After Git stub storage was enabled in Omicron (see Current status), the working copy size dropped to 46 MiB. This is a 39% reduction in working copy size, and approximately 16% of overall checkout size[1]. Though modest in absolute terms, this has practical benefits: tools like rg no longer search through every prior API version.

Determinations

To resolve the Issues with versioned API storage outlined above, this RFD proposes a new method of storage for older API versions: Git stub files.

Git stub files

A Git stub is a text file with a .gitstub extension, containing a single line of the form <commit-hash>:<path-to-file> followed by a newline.

An example Git stub is:

99c3f3ef97f80d1401c54ce0c625af125d4faef3:openapi/sled-agent/sled-agent-11.0.0-5f3d9f.json

The <commit-hash> field is a full Git commit hash: 40 hexadecimal characters for SHA-1 repositories, or 64 for SHA-256 repositories. The <path-to-file> field is the path to a file at that commit, using forward slashes on all platforms, including Windows.
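The format is simple enough to validate with a few lines of shell. The sketch below is an illustration only, not the parser in the git-stub crate; the function name parse_gitstub and its error messages are hypothetical.

```shell
# Hypothetical helper (not part of git-stub): validate the contents of a
# Git stub file and print the commit hash and the path on separate lines.
parse_gitstub() {
    # Split on the first colon only, so a path containing ':' stays intact.
    # read fails if the line is not newline-terminated, as the format requires.
    IFS=: read -r stub_commit stub_path < "$1" || return 1

    # The hash must consist solely of hexadecimal characters...
    case "$stub_commit" in
        ''|*[!0-9a-fA-F]*) echo "invalid commit hash" >&2; return 1 ;;
    esac
    # ...and be 40 (SHA-1) or 64 (SHA-256) characters long.
    case "${#stub_commit}" in
        40|64) ;;
        *) echo "commit hash must be 40 or 64 hex characters" >&2; return 1 ;;
    esac

    # Paths are non-empty and use forward slashes on every platform.
    case "$stub_path" in
        ''|*\\*) echo "invalid path" >&2; return 1 ;;
    esac

    printf '%s\n%s\n' "$stub_commit" "$stub_path"
}
```

Running parse_gitstub on the example stub above prints the hash and the path on two lines; a truncated hash or a path containing a backslash makes the function fail.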

A Git stub can be dereferenced by reading the file contents at the specified revision. The format is designed so it can be provided directly as an argument to commands like git show:

% git show $(cat sled-agent-11.0.0-5f3d9f.json.gitstub)
{
  "openapi": "3.0.3",
  "info": {
    "title": "Oxide Sled Agent API",
    "description": "API for interacting with individual sleds",
    "contact": {
      "url": "https://oxide.computer",
      "email": "api@oxide.computer"
    },
    "version": "11.0.0"
  },
  // ...
}

For commands that don’t accept arguments in a <commit>:<path> format similar to git show, standard Unix tools such as cut can be used to extract the components:

% jj file show -r $(cut -d: -f1 sled-agent-11.0.0-5f3d9f.json.gitstub) root:$(cut -d: -f2- sled-agent-11.0.0-5f3d9f.json.gitstub)
{
  "openapi": "3.0.3",
  "info": {
    "title": "Oxide Sled Agent API",
    "description": "API for interacting with individual sleds",
    "contact": {
      "url": "https://oxide.computer",
      "email": "api@oxide.computer"
    },
    "version": "11.0.0"
  },
  // ...
}

Alternatively: IFS=: read -r c p < file.gitstub; jj file show -r "$c" "root:$p".

(The Jujutsu root: fileset resolves paths from the root of the repository, regardless of the current working directory.)

To meet other needs described in this RFD, such as Progenitor integration through build scripts, this RFD also introduces two Rust libraries for parsing and interacting with Git stubs:

  1. git-stub for core types.

  2. git-stub-vcs for VCS (Git and Jujutsu) integration.

The git-stub-vcs crate contains facilities for dereferencing Git stubs, as well as for materializing them by writing them out to disk.

Conversion to Git stub files

This RFD proposes that the Dropshot API manager optionally convert eligible API versions to Git stub files. This capability is called Git stub storage.

A versioned API document is stored as a Git stub if all of the following are true:

  1. Git stub storage is enabled for this API.

  2. The API version is blessed (i.e., present in main).

  3. The API version is not the latest version.

  4. The API version was not introduced in the same commit as the latest version. (Otherwise, the Git stub would point to a commit not yet on main; see Special case: multiple API versions added in the same commit.)

If an API should be stored as a Git stub, but is currently stored as a JSON file, then the Dropshot API manager will automatically convert the JSON to a Git stub, storing it as filename.json.gitstub. The <commit-hash> refers to the commit that most recently introduced the file. (If a file is removed and later re-added, the commit that re-added it is used, not the original.)

If an API should be stored as JSON, but is currently stored as a Git stub, then the Dropshot API manager will automatically convert the Git stub back to a JSON file.

If both .json and .json.gitstub are found for an API, the Dropshot API manager will delete whichever one is redundant, based on the above rules.
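The eligibility conditions and the conversion rules above amount to a small decision table. The following sketch spells it out; the function names are hypothetical and the real logic in the Dropshot API manager is more involved (for example, it also has to determine blessedness from Git history).

```shell
# Hypothetical helpers mirroring the rules above.

# Succeeds if an API version should be stored as a Git stub. Each argument
# is the literal string "true" or "false".
should_be_gitstub() {
    storage_enabled=$1      # condition 1: Git stub storage is enabled
    blessed=$2              # condition 2: the version is present in main
    latest=$3               # condition 3: must be false
    added_with_latest=$4    # condition 4: must be false
    [ "$storage_enabled" = true ] && [ "$blessed" = true ] &&
        [ "$latest" = false ] && [ "$added_with_latest" = false ]
}

# Prints the action needed to reach the desired state, given whether a
# .json file and a .json.gitstub file currently exist for the version.
# Combinations not listed (such as a missing document) are outside the
# scope of this sketch.
reconcile_action() {
    want_stub=$1 has_json=$2 has_stub=$3
    case "$want_stub/$has_json/$has_stub" in
        true/true/false)  echo "convert JSON to Git stub" ;;
        true/true/true)   echo "delete redundant JSON" ;;
        false/false/true) echo "convert Git stub to JSON" ;;
        false/true/true)  echo "delete redundant Git stub" ;;
        *)                echo "nothing to do" ;;
    esac
}
```

For instance, should_be_gitstub true true false true fails, because a version introduced in the same commit as the latest version stays as JSON.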

Desired behavior with Git stub storage enabled

On main

  • The latest version of an API is stored as a JSON file.

  • All prior versions of the API are stored as Git stub files, with each Git stub’s <commit-hash> component being the commit the corresponding API version was introduced in.

For example:

% tree openapi/sled-agent
openapi/sled-agent
├── sled-agent-10.0.0-898597.json.gitstub
├── sled-agent-1.0.0-2da304.json.gitstub
├── sled-agent-11.0.0-5f3d9f.json.gitstub
├── sled-agent-12.0.0-ffacab.json
├── sled-agent-2.0.0-a3e161.json.gitstub
├── sled-agent-3.0.0-f44f77.json.gitstub
├── sled-agent-4.0.0-fd6727.json.gitstub
├── sled-agent-5.0.0-253577.json.gitstub
├── sled-agent-6.0.0-d37dd7.json.gitstub
├── sled-agent-7.0.0-62acb3.json.gitstub
├── sled-agent-8.0.0-0e6bcf.json.gitstub
├── sled-agent-9.0.0-12ab86.json.gitstub
└── sled-agent-latest.json -> sled-agent-12.0.0-ffacab.json

The -latest.json symlink still points to a valid JSON file, so existing clients are unaffected.

When adding a new API version

When a developer adds a new API version:

  • A new OpenAPI document corresponding to the new version is added, as before.

  • The -latest.json symlink is updated to the new version, as before.

  • New: The previous latest document is converted into a Git stub, with the <commit-hash> component set to the commit where the document was introduced.

For example, if a developer adds a new version 13 of the Sled Agent API, the Dropshot API manager:

  • adds sled-agent-13.0.0-<hash>.json;

  • adds sled-agent-12.0.0-ffacab.json.gitstub, with the contents 63c01899a7668044841021075711919160c90b1e:openapi/sled-agent/sled-agent-12.0.0-ffacab.json (the commit hash being where version 12 was added); and

  • removes sled-agent-12.0.0-ffacab.json.

Removing the old JSON file is what enables Git to detect the new version as a rename rather than a new file (see Rationale for Git stub files for details). An example is Omicron PR #9572: the GitHub diff correctly shows version 13 as a rename of version 12, and blame preserves history across versions.
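The effect of removing the old JSON file can be reproduced in a scratch repository. The sketch below uses made-up file names and a 200-line stand-in for an OpenAPI document; the function name demo_rename_detection is hypothetical, and the similarity score Git prints may vary.

```shell
# Illustrative demo: print how Git classifies a new API version, first with
# the old JSON kept in place, then with the old JSON replaced by a Git stub
# in the same commit.
demo_rename_detection() (
    set -e
    repo=$(mktemp -d)
    cd "$repo"
    git init -q
    git config user.name demo
    git config user.email demo@example.com

    # "Version 1": a 200-line stand-in for an OpenAPI document.
    seq 1 200 | sed 's/^/"line": /' > api-1.0.0.json
    git add . && git commit -q -m "add v1"

    # Case 1: version 2 is added next to an unchanged version 1.
    sed 's/^"line": 100$/"line": changed/' api-1.0.0.json > api-2.0.0.json
    git add . && git commit -q -m "add v2, keep v1"
    echo "old JSON kept:"
    git --no-pager diff --name-status -M HEAD~1 HEAD

    # Case 2: the same change, but version 1 becomes a stub pointing at the
    # commit that most recently added it.
    git reset -q --hard HEAD~1
    added_in=$(git log -n 1 --diff-filter=A --format=%H -- api-1.0.0.json)
    sed 's/^"line": 100$/"line": changed/' api-1.0.0.json > api-2.0.0.json
    printf '%s:%s\n' "$added_in" api-1.0.0.json > api-1.0.0.json.gitstub
    git rm -q api-1.0.0.json
    git add . && git commit -q -m "add v2, stub v1"
    echo "old JSON replaced by stub:"
    git --no-pager diff --name-status -M HEAD~1 HEAD

    cd / && rm -rf "$repo"
)
```

In the first case Git should report api-2.0.0.json as added (A). In the second it should report a rename (R) from api-1.0.0.json to api-2.0.0.json, with only the small stub file listed as added.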

When the latest version is removed

In the situation where the latest version of an API is removed, e.g. if a developer added a new version locally and then changed their mind:

  • The Dropshot API manager will convert the previous-latest version of the API from a Git stub to a JSON file.

  • If other API versions were introduced in the same commit as the previous-latest version of the API, the Dropshot API manager will convert those to JSON files as well.

This behavior follows the eligibility conditions listed in Conversion to Git stub files above, avoiding path dependence (where the final state depends on the sequence of operations): adding an API version and then immediately removing it should leave the repository in the same state as if the version had never been added.

Resolving merge conflicts in API documents

With the proposal in this RFD, Git will see version bumps as renames. If, in parallel branches, the API is changed in different ways, a rename-aware merge operation will result in conflicting changes.

See Rename/rename conflicts for some discussion of alternatives.

Merge conflicts with Git

Git’s merge algorithm is rename-aware by default, so when two branches both bump an API version, git merge or git rebase will produce a rename/rename conflict with conflict markers in the OpenAPI document JSON files.

The RFD’s scope of work includes merge conflict resolution support in the Dropshot API manager. With this in place, developers resolve rename/rename conflicts by running cargo xtask openapi generate. Because OpenAPI documents are generated from source code as described in [RFD 479], regeneration produces the correct result regardless of which conflicting version is present on disk.

Merge conflicts with Jujutsu

As of version 0.36, Jujutsu's merge algorithm is not rename-aware, so it won’t produce a rename/rename conflict. However, the scenario described in Merge conflicts with Git above will result in a conflict in the -latest.json symlink. In such cases, Jujutsu turns the conflicting symlink into a regular file.

This RFD extends the Dropshot API manager to handle -latest.json symlinks becoming regular files. When Jujutsu’s merge algorithm becomes rename-aware, it will automatically benefit from the rename-based conflict handling described above.

Merge resolution mechanics

The mechanics of merge conflict resolution are not discussed in this RFD: the algorithm is somewhat tricky, with many edge cases, though each has a deterministic resolution. For an implementation of the resolution algorithm, see dropshot-api-manager PR #39.

Special case: multiple API versions added in the same commit

In the uncommon situation where multiple versions of the same API are added in the same commit, the Dropshot API manager treats all such versions as the latest, leaving them as JSON files. They are converted to Git stubs only when a subsequent version is added.

This is condition 4 in Conversion to Git stub files—without it, CI would fail when multiple new versions land in one commit.

Progenitor integration through build scripts

Most Progenitor clients are only generated against the latest version of each API via the -latest.json symlink. However, as discussed in the Testability section of [RFD 532], we will sometimes want to generate Rust methods for old clients to test against new servers.

To meet this need, we choose a composable approach using Cargo’s build scripts as the integration point:

  • The git-stub-vcs project contains helpers to resolve Git stub files to their referenced contents (e.g., OpenAPI documents) and write those contents to the build script output directory.

  • Progenitor gains the ability to read OpenAPI documents from the build output directory through a new relative_to parameter.

With these two blocks in place, crates that need access to older OpenAPI documents will:

  1. Add a build script which materializes the corresponding Git stubs to the output directory.

  2. Then, configure Progenitor to read from that directory.

An example is shown in Omicron:

Rationale

Rationale for Git stub files

This RFD introduces an approach for referring to an old version of a file. The general strategy is inspired by [git-lfs], which stores pointer files with the format:

version https://git-lfs.github.com/spec/v1
oid sha256:4bd049d85f06029d28bd94eae6da2b6eb69c0b2d25bac8c30ac1b156672c4082
size 3771098624

The key difference is that Git LFS stores referred-to files on an external server, whereas Git stub files refer to objects already in the repository. The respective choices make sense: Git LFS is meant for files too large to comfortably fit in a Git repository, while OpenAPI documents are small enough to store directly.

To our knowledge, using pointer files to refer to objects within the same repository is a new technique. But it is not a particularly large leap: the core idea is a logical extension of the concepts behind Git LFS. The chain of reasoning is roughly:

  1. New API versions result in large diffs and break git blame.

  2. This is because Git doesn’t track copies; it detects renames heuristically (see Appendix: Why can Git not detect copies?).

  3. To trigger the rename detection heuristics, the old file must be deleted.

  4. But old API documents still need to be discoverable and retrievable.

  5. Therefore, replace the old document with a reference to it, like Git LFS does.

  6. What should the reference point to? The old document exists in history, so that is the most natural candidate.

  7. The representation format is the one that makes retrieval easiest: <commit>:<path>, since that’s the format git show accepts.

Alternatives to Git stubs

The following table compares the proposed approach against the most notable alternatives, evaluating each against the issues identified in Motivation and some additional concerns. Each alternative is discussed in detail below.

Criterion                                      Git stubs  External tooling  Current copy  Diff chain  Git LFS  History only
Meaningful diffs                               Yes        Partial [a]       Yes [b]       Yes         No       Yes
Meaningful blame                               Yes        No                Partial [c]   Yes         No       Yes
Reduced diff line count                        Yes        No                No            Yes         Yes      Yes
Reduced working copy size                      Yes        No                No            Yes         No [d]   Yes
Discoverability of old versions                Yes        Yes               Yes           Yes         Yes      No
Simple retrieval of old versions               Yes [e]    Yes               Yes           No [f]      No [g]   No [e]
Shallow clone compatible (see Shallow clones)  No [h]     Yes               Yes           Yes         No [g]   No
Avoids merge conflicts                         No         Yes               No            No          Yes      No

a External tooling can produce diffs, but GitHub’s UI still shows new API versions as new files.
b The current copy shows a meaningful diff, but the new versioned file still appears as new.
c Blame on the current file is meaningful; blame on individual versioned files is not.
d LFS smudge filters fetch and materialize files by default, so the working copy is not reduced.
e Requires Git history (see Dereferencing Git stub files requires Git history).
f Requires reconstructing from the latest version through a chain of diffs.
g Requires a configured LFS server.
h Dereferencing stubs requires the referenced Git objects, which shallow clones typically lack.

Alternative (external tooling): keep current storage, tooling on top. We could leave the storage model unchanged and instead write a utility to perform the appropriate diffs (and leave, e.g., a comment on the PR). That is simpler to implement, but it doesn’t solve the problem of large diffs on GitHub. It also doesn’t restore useful blame in the GitHub UI.

Alternative (current copy): store a complete copy of the current API document. This approach stores a full copy of the current OpenAPI document (such as sled-agent-current.json) alongside all versioned files. When a new version is added, this file is updated in place, producing a meaningful diff and preserving blame on the current-copy file. However, it doesn’t help with:

  • inflated diff sizes, since the new version is still added as a wholly new file;

  • working copy size (and in fact increases it); or,

  • merge conflicts on the current-copy file, when parallel branches bump the API version.

Alternative (diff chain): replace older versions with diffs against the subsequent version. Instead of storing full JSON for older versions, each non-latest version could be replaced with a textual diff (patch file) against the next version. When a new version is added:

  • The old JSON file is deleted and replaced with a patch file, so Git’s rename detection treats the new version as a rename of the old; this produces meaningful diffs and preserves blame via the same mechanism as Git stubs.

  • Working copy size is also reduced, since diffs are much smaller than full JSON.

  • However, retrieving old versions becomes significantly harder: rather than a single git show as with Git stubs, you must chain diffs backward from the latest version.

Alternative (Git LFS): use Git LFS to store previous versions. There are two main downsides to using Git LFS: first, an external server is required to serve objects; and second, LFS files can only be diffed with some difficulty. These tradeoffs are reasonable for the kinds of large binary assets Git LFS is designed for, but not so much for OpenAPI documents.

Alternative (history only): do not store Git stub files, always do history lookups. We could store only the latest version, and use git log on a directory to find old versions of APIs. However, that significantly reduces discoverability: listing old API versions requires inspecting Git history. It also becomes unclear when support for an API version is dropped. With Git stub files, dropping support for an API version means deleting the corresponding .gitstub file.

Other alternatives

Alternative: don’t store Git stub files as regular files. There are several alternatives to storing Git stub files as regular files, such as Git notes. Regular files on disk are significantly more legible than these less common Git features.

Alternative: smudge/clean filters. We could keep fully materialized files for each API version on disk, but use a Git clean filter to convert them to Git stub files when committing. Then, when updating to a revision, a smudge filter would materialize them. This option has a few issues:

  1. Smudge and clean filters only work on file contents. They cannot change the names of files, i.e., they cannot turn sled-agent-4.0.0-abcdef.json into sled-agent-4.0.0-abcdef.json.gitstub, or vice versa. As discussed in Appendix: Why can Git not detect copies? below, for rename detection to work, the old files must be deleted. Merely changing the contents is insufficient.

  2. They introduce additional complexity. Developers would have to configure a filter driver in their own .git/config. This can be done as part of the setup script, but it introduces one more way things can go wrong.

  3. Smudge and clean filters are incompatible with Jujutsu (as of version 0.36), which several developers working on Omicron—including this RFD’s author—use.

By being regular files with a .gitstub suffix, Git stub files address all these concerns.

Alternative: lobby GitHub to turn on --find-copies-harder. Given the computational cost of --find-copies-harder and GitHub’s priorities, this seems unlikely to succeed. The problem discussed in Appendix: Why can Git not detect copies? below—that a copy might be detected against a different version than desired—also remains.

Alternative: convert API versions to Git stub files via a separate command. This alternative has the downside of inconsistent results when one side of a merge converts API versions to Git stub files and the other side does not. Having a uniform, automatic strategy removes this possibility and reduces developer burden.

Rationale for Git stub format

The Git stub format is chosen to make git show $(cat file.gitstub) work without manipulation of the Git stub’s contents. This is the most natural format for Git stub files.

Alternative: a single manifest listing all prior versions. One option is to have, for example, an old-versions.json file listing all prior versions. This is reasonable, but it would complicate git show usage. It also has a greater chance of resulting in merge conflicts, particularly when one side of a merge adds multiple versions while the other side adds one. Separate Git stub files per API version address both these concerns.

Rationale for commit hash selection

Alternative: use the merge base. This RFD specifies storing the commit where a file was most recently introduced. An alternative is to store the merge base between the current branch and main. However, this leads to merge conflicts when multiple developers work on the same API but with different merge bases:

In this scenario, the merge base of feature1 and main is M2, and the merge base of feature2 and main is M3. If the Dropshot API manager stored Git stub files using the merge base, the two branches’ Git stub files would have different contents, leading to a merge conflict. Using the commit in which the file was most recently introduced produces results independent of developers’ merge bases.

Alternative: use the commit in which the file was first introduced. This RFD specifies that we use the commit in which the file was most recently introduced. Another option is to use the commit in which it was first introduced. For example, for an API version v1:

In this scenario, we store commit C within the Git stub. The alternative approach would store commit A.

A core assumption behind API versioning is that blessed versions are immutable. Under this assumption, in the situation described above, both A and C are valid options for the Git stub. But we have to pick a consistent approach, and commit C is slightly better because it represents an unbroken history.

If, for some reason, this assumption is not upheld, then the hash in the file name will almost certainly be different, and C would be the only valid option.

Rationale for Progenitor integration through build scripts

Alternative: tightly couple Progenitor with Git stubs. We could have added Git stub integration directly to Progenitor. The upside is smoother integration, but the downside is that Progenitor becomes aware of a niche (though pertinent to Oxide) concern.

Instead, we choose to have the user integrate Git stubs and Progenitor using a build script. (The git-stub-vcs library does most of the work, so the build script contains a very small amount of code.) The integration of Progenitor with build scripts is useful outside Git stubs as well.

Tradeoffs

Git stub storage for older versions introduces notable tradeoffs. On balance, the benefits are large and the drawbacks can be mitigated, but the tradeoffs are worth documenting and considering. While the change impacts the whole team, if we eventually decide the costs outweigh the benefits, it is reversible: the Dropshot API manager can convert Git stub files back to JSON files, and the original JSON contents remain in Git history.

Dereferencing Git stub files requires Git history

Git stub files are pointers to older versions in Git history. Dereferencing a Git stub requires that the history be available.

Shallow clones

One common situation where history is unavailable is with the --depth option, which creates a shallow clone. Shallow clones typically lack the objects corresponding to old API versions, and with insufficient history, merge-base computations may be incorrect.

This is particularly relevant for GitHub Actions, which does shallow clones with depth 1 (i.e., just the commit being checked out) by default.
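A script that depends on history can detect a shallow clone and fetch what is missing before dereferencing any stubs. The following is a minimal sketch using the standard git rev-parse --is-shallow-repository and git fetch --unshallow commands (available in reasonably recent Git releases); the function name ensure_full_history is hypothetical.

```shell
# Hypothetical CI helper: fetch the missing history if the current
# repository is a shallow clone; do nothing otherwise.
ensure_full_history() {
    if [ "$(git rev-parse --is-shallow-repository)" = true ]; then
        echo "shallow clone detected; fetching full history" >&2
        git fetch --unshallow
    fi
}
```

Fetching the full history costs time and bandwidth on every run, so configuring the checkout step for a full clone up front is usually the simpler option where it is available.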

The Dropshot API manager has an existing, intrinsic dependency on Git history: the defining characteristic of a blessed API version is whether it has landed on main, so inspecting history is fundamentally necessary. Git stub storage introduces an additional dependency on history as a tradeoff. The following table summarizes the impact on shallow clones:

Operation                      Without Git stubs                               With Git stubs
Build the repo                 Works                                           Works, unless Progenitor clients reference older versions (see Progenitor integration through build scripts)
cargo xtask openapi            Requires Git history (merge base computations)  Requires Git history
View old API versions on disk  Works (JSON files present)                      Requires Git history (to dereference stubs)

For GitHub Actions, a full clone can be done through fetch-depth: 0:

- uses: actions/checkout@v6
  with:
    fetch-depth: 0

Source archives

Source archives (tarballs) without Git history, such as those traditionally distributed by open source projects, will not have materialized JSON files in them. At the moment, we do not have a need for source archives at Oxide. Should the need arise, and if prior versions of APIs need to be made available as part of that archive, then whatever tooling builds the archive will need to dereference Git stub files.

Tool support for dereferencing Git stub files

For APIs with Git stub storage enabled, if a tool or other program needs access to older API versions, it will need to know how to dereference Git stub files. This is relatively straightforward: git cat-file blob $(cat file.gitstub). But tools that previously only relied on files on disk would now have a dependency on Git. (In practice, accessing old API versions is uncommon.)

For Rust, we’ve built the git-stub and git-stub-vcs libraries, and used them in Progenitor integration through build scripts.

Libraries in other languages can be built as needed.
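As an indication of how little such tooling needs, the sketch below materializes every Git stub in a directory using only git cat-file. It is not the git-stub-vcs implementation; the function name materialize_gitstubs and its arguments are hypothetical.

```shell
# Hypothetical helper: write the document referenced by each Git stub in
# stub_dir to out_dir, dropping the .gitstub suffix from the file name.
# Must be run from inside the repository that the stubs refer to.
materialize_gitstubs() {
    stub_dir=$1
    out_dir=$2
    mkdir -p "$out_dir" || return 1
    for stub in "$stub_dir"/*.gitstub; do
        [ -e "$stub" ] || continue    # no stubs in this directory
        name=$(basename "$stub" .gitstub)
        # A stub's contents are already in <commit>:<path> form.
        if ! git cat-file blob "$(cat "$stub")" > "$out_dir/$name"; then
            echo "cannot dereference $stub (is history available?)" >&2
            rm -f "$out_dir/$name"
            return 1
        fi
    done
}
```

For example, materialize_gitstubs openapi/sled-agent /tmp/sled-agent-docs would write one JSON file per stub into /tmp/sled-agent-docs, failing on the first stub whose referenced object is missing.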

Rename/rename conflicts

As discussed in Resolving merge conflicts in API documents above, the fact that Git will observe renames means that users making changes to an API in parallel will see rename/rename conflicts. This is a notable downside. The current system was carefully designed—with features like hashes in file names—to avoid merge conflicts in OpenAPI documents.

As part of this RFD, we’ve extended the Dropshot API manager to handle merge conflicts (whether rename/rename, or for other reasons). This capability may not be obvious to developers, so it will need clear documentation. The implementation also introduces notable complexity.

Another option is for users to disable rename detection during merges: pass -X no-renames, or set merge.renames to false. But most of the time, rename detection during merges is valuable, so turning it off globally is inadvisable.

A third option is for users to resolve conflicts themselves with git restore or jj restore. This approach will continue to work, and if users make a mistake in this process, the API manager will correct any errors the next time generate is run.

Future work

History rewrites

In the rare scenario where Git history is rewritten with commands like [git-filter-repo], the Git stub files may become invalid. Should that need arise, we can build the required tooling to map Git stub files over to their new versions.

Current status

As of 2026-03-12, the implementation is complete:

  • dropshot-api-manager version 0.4.0 and above.

  • git-stub repo for core types and VCS integration.

  • Omicron PR #9933 (with more than 1,000,000 lines removed) converts Omicron to Git stubs.

  • Most other Oxide repositories with Dropshot APIs have been converted as well.

Security considerations

None. This RFD introduces no external dependencies or new attack surfaces.

External References

Appendix: Why can Git not detect copies?

To understand why Git cannot detect copies in this situation, a comparison with other version control systems is instructive. Many systems (Subversion, Mercurial) allow developers to mark a file as a copy of another file, tracking this relationship in the version control system. (A rename is generally tracked as a copy and a delete.) If a file is copied and this copy is changed in the same commit, these systems can show the appropriate diff, log, and blame output[2].

In contrast, Git’s data layer is entirely unaware of renames and copies as concepts. It stores snapshots of working copies; rename and copy relationships are inferred heuristically by the UI layer when commands like diff, log, or blame are run. A common misconception is that git mv records a rename. Unlike Mercurial or Subversion, where hg mv or svn mv records the relationship in the repository, git mv is shorthand for a delete and an add. Because these relationships must be inferred by comparing file contents, searching for copies can be algorithmically expensive, so Git performs limited detection by default.

  • By default, Git only detects renames, where a file is deleted and another file is added with similar contents. This is algorithmically cheapest, because only removed files need to be compared with added ones. In other words, for each added file, O(removed files) comparisons need to be performed. GitHub’s web UI has rename detection turned on.

  • With the optional --find-copies argument, Git also detects copies when the original file is changed. For each added file, this can be done in O(changed files) comparisons, so it is somewhat more expensive, though not prohibitively so. GitHub’s web UI has --find-copies detection turned off.

  • With --find-copies-harder, Git detects copies even when the original file is unchanged. For each added file, this requires O(number of files in repository) comparisons. This is prohibitively expensive for large projects, so it is rarely used.

Note: For more details on copy tracking, see [jj-copy-tracking].

The predicament is clear: since, with our current storage model, blessed API versions are immutable by definition, Git cannot detect copies across API versions without --find-copies-harder. GitHub’s web UI has no option for --find-copies-harder, even for specific paths, so a new API version would always register as a brand new file. GitHub’s blame UI also becomes unhelpful when used on OpenAPI documents.

And even with --find-copies-harder, there’s no guarantee that the previous version is the most similar to the new one. Git might detect version 12 as a copy of version 10, 9, or even earlier, leading to misleading output.

Footnotes
  • 1

    A checkout consists of the repository store (the .git directory; 108 MiB for a fresh Omicron clone as of March 2026) and the working copy (files on disk). Git stub storage does not affect repository store size. Git’s pack file algorithm already deduplicates similar content across files via delta compression.

  • 2

    If we used one of these systems, the Dropshot API manager could mark the file as a copy, and we wouldn’t need this RFD.
