Reorganizing Repositories

Repository "appendable directories" (those below /vesta, both "package-parents" created with mkdir or vmkdir, and "packages" created with vcreate) and "immutable directories" (those created with vadvance and vcheckin) can never be renamed. They can be deleted, but they leave behind ghosts. (See: RepositoryRules.)

Essentially, when you create something under /vesta (a directory, a package, a version), it's immortal. Versions, once created, are frozen forever.

There are two main motivations for these restrictions:

  1. Build repeatability. In order for a build to do precisely the same thing in the future that it does now, once a version is created, it must always refer to the same contents. So, once you create /vesta/example.com/dir/pkg/1, that version must always have the same contents. If you could rename /vesta/example.com/dir/pkg to /vesta/example.com/otherdir/otherpkg, your build descriptions would still refer to the old name (via an import), and you wouldn't be able to repeat past builds. If you could delete /vesta/example.com/dir without leaving behind a ghost, you could re-create /vesta/example.com/dir/pkg/1 with different contents, which would change the result of builds you had created in the past.

  2. Replication invariance. Just as /vesta/example.com/dir/pkg/1 is supposed to have the same contents for all time in the repository where it is first created, it's also supposed to have the same contents in any repository you replicate it to. If you could delete objects without leaving ghosts, you could replace /vesta/example.com/dir/pkg/1 in one repository and leave the old contents in another repository. (For more on this see: "Vesta Replication Design" in the vrepl(1) man page and Partial Replication in the Vesta Software Repository.)
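The role ghosts play in both motivations can be illustrated with a small toy model. This is only a sketch, not how the repository is implemented: the AppendableDir class and its methods are hypothetical names invented for illustration. It shows that a deleted name stays reserved, so it can never be re-created with different contents.

```python
class AppendableDir:
    """Toy model of an appendable directory: names can be added or ghosted, never reused."""

    GHOST = object()  # sentinel standing in for a ghost left behind by a deletion

    def __init__(self):
        self._entries = {}

    def create(self, name, contents):
        # Any existing entry, live or ghost, blocks re-creation of the name.
        if name in self._entries:
            raise FileExistsError(f"{name!r} already exists (possibly as a ghost)")
        self._entries[name] = contents

    def delete(self, name):
        # Deleting replaces the entry with a ghost rather than freeing the name.
        if name not in self._entries:
            raise KeyError(name)
        self._entries[name] = self.GHOST

    def is_ghost(self, name):
        return self._entries.get(name) is self.GHOST


pkg = AppendableDir()
pkg.create("1", "original contents")
pkg.delete("1")
try:
    pkg.create("1", "different contents")
    recreated = True
except FileExistsError:
    recreated = False
```

In this model the second create fails, which is the property that keeps past builds and replicas consistent.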

Nonetheless, the desire to rename or delete repository directories remains.

Reasons for Renaming or Deleting Repository Directories

There are several common reasons why people want to delete or rename repository directories and packages.

Typo on vmkdir or vcreate

For example, you type vmkdir progrm when you meant vmkdir program.

Splitting up packages

Since programs tend to get bigger over time as more source files are added, packages tend to grow in file count. Eventually one may want to split a package up into two or more separate packages.

Refactoring

This is just a general reorganization of dirs, packages, and files, for readability, or because the human team structure changed, or for many other reasons.

Packages Becoming Obsolete

Maybe some component of your design is no longer used. You probably don't want to completely delete it (as old builds may refer to it), but maybe you don't want users to see it when looking through the contents of the repository.

(Really this is a sub-problem of both splitting up packages and refactoring.)

Current Options

Create New Dirs

If you vmkdir progrm by accident, you can just say to yourself "oops" and vmkdir program and just live with both of those lying around, one of which you never use again. You can vrm progrm, of course, but this still leaves progrm lying around as a ghost.

Number Your Root Dir

When you create the top-level dir of your tree, even the namespace, create it as project1.example.com instead of just project.example.com so that when you eventually want to refactor, you just move to project2.example.com. A disadvantage of this approach is that, except for possibly old-version attributes, your new project2.example.com tree has lost connection to your old tree.

It's relatively easy to automate such a migration. The mirror-legacy-version.pl script attached to this page is an example of one way to do a simple migration to a new top-level directory (though it won't do any re-organization, so probably isn't appropriate for most uses).

A potential problem with this is that paths to objects below /vesta may get hard-coded into various files (scripts, replicator instructions, weeder instructions, etc.). The mirror-legacy-version.pl script only helps with re-writing SDL imports.
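For the hard-coded paths that the script does not cover, a simple text rewrite may be enough. The following is a minimal sketch, assuming the only change is the top-level directory name. The function rewrite_root and the sample line are hypothetical and are not part of mirror-legacy-version.pl or of Vesta.

```python
import re


def rewrite_root(text, old_root, new_root):
    """Replace old_root with new_root where it appears as a complete path prefix."""
    # Require that old_root is followed by '/', by a character that cannot
    # continue a host-style name, or by the end of the text. This keeps
    # /vesta/project1.example.com from matching /vesta/project1.example.com.au.
    pattern = re.compile(re.escape(old_root) + r"(?=/|[^\w.\-]|$)")
    # A function replacement avoids treating backslashes in new_root specially.
    return pattern.sub(lambda match: new_root, text)


sample = (
    "cd /vesta/project1.example.com/dir/pkg\n"
    "cd /vesta/project1.example.com.au/dir/pkg\n"
)
rewritten = rewrite_root(
    sample,
    "/vesta/project1.example.com",
    "/vesta/project2.example.com",
)
```

Only the first line of the sample is rewritten. Any file changed this way should still be reviewed by hand, since a path can also be assembled from pieces that a textual match will not find.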

Other Ideas

Optionally Make Ghosts Invisible

An idea that's been suggested in the past (and has a tracker entry) is to make ghosts invisible when listing a directory through the NFS interface. This could be implemented without too much difficulty. The ghosts would still exist and still perform their essential function of preventing names from being recreated with different contents. However, they wouldn't show up when listing a directory.

This would probably be affected by an attribute which would either be required to hide a ghost or could be set to make a ghost visible again.

We could make this applicable to non-ghosts as well, which would make it possible to hide an obsolete package but leave it in the repository so that old builds could use it.

An alternative suggestion is to implement a Vesta-specific vls command that can optionally hide ghosts.
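The intended behavior of hidden entries can be sketched with a toy listing function. This is an illustration only: the 'hidden' attribute name, the dictionary layout, and both functions are assumptions, and the sketch picks just one of the attribute schemes mentioned above (an attribute that must be set to hide an entry).

```python
def visible_names(entries, show_hidden=False):
    """Return sorted names, omitting entries whose hypothetical 'hidden' attribute is set.

    entries maps a name to a dict with two keys:
      'ghost'  - True if the name was deleted and only its ghost remains
      'hidden' - True if the (hypothetical) hide attribute is set
    """
    return sorted(
        name for name, info in entries.items()
        if show_hidden or not info["hidden"]
    )


def can_create(entries, name):
    # Hidden or not, any existing entry (including a ghost) still reserves its name.
    return name not in entries


entries = {
    "progrm": {"ghost": True, "hidden": True},     # typo, deleted, ghost hidden
    "program": {"ghost": False, "hidden": False},  # live directory
    "oldpkg": {"ghost": False, "hidden": True},    # obsolete but kept for old builds
}
default_listing = visible_names(entries)
full_listing = visible_names(entries, show_hidden=True)
```

The default listing shows only program, while progrm and oldpkg remain present and still block re-creation of their names.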

Don't "Freeze" Until Needed

In the conversation which prompted the creation of this page, JohnVk suggested the idea of allowing changes to appendable directories until they need to be "frozen". There are two obvious times when you would need to "freeze" something:

  1. When a source is first used by a build
  2. When a source is replicated to a peer repository

It seems like this would be difficult to implement, and would actually be more difficult for users to understand than the current rules. They would need some way to find out whether something can be renamed, and would probably have difficulty understanding why something suddenly changes from being renameable/replaceable to being frozen/immortal.
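The user-visible behavior being discussed can be sketched as a small state model. Everything here is hypothetical (the Source class and its method names are invented for illustration), and the sketch ignores the hard part, which is detecting these events reliably inside the repository.

```python
class Source:
    """Toy model of an object that stays renameable until it is frozen."""

    def __init__(self, name):
        self.name = name
        self.frozen = False

    def rename(self, new_name):
        # Renaming is allowed only while nothing depends on the current name.
        if self.frozen:
            raise PermissionError(f"{self.name!r} is frozen and cannot be renamed")
        self.name = new_name

    def mark_used_by_build(self):
        # First freeze trigger: a build has referred to this source.
        self.frozen = True

    def mark_replicated(self):
        # Second freeze trigger: a peer repository now holds a copy.
        self.frozen = True


src = Source("progrm")
src.rename("program")     # allowed: not yet frozen
src.mark_used_by_build()  # someone else's build may trigger this
try:
    src.rename("programs")
    renamed_after_freeze = True
except PermissionError:
    renamed_after_freeze = False
```

From the point of view of the user calling rename, the same operation succeeds at one moment and fails the next because of an event they may not have observed.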

Mount A Historical View of the Repository

AdamMartin suggested the idea of mounting the repository as it was at some point in the past. I (KenSchalk) responded later on IRC with a few reasons why this would be difficult.

Really I'm not clear on how much utility this would have, as it would mostly help with the behavior of scripts/programs which are not part of a Vesta build but which want to access the repository via NFS or look at attributes.

Transparently Follow Renames

JimHuggins suggests:

This seems problematic to me (KenSchalk) for several reasons:

  1. It makes the evaluator's behavior dependent upon data which changes over time. Even though the intent is not to introduce anything which could break repeatability, this seems inherently unwise.
  2. It requires a new query-able database that's part of the repository's stable, transactional data store. This definitely has to be made part of the repository, as changes to it would have to be atomic with the , but it's a non-trivial amount of work just to add this database.

A better option for supporting this would be to have SDL refer to its imports not by a path below /vesta, but by something persistent (like the fingerprint of the immutable directory). This would essentially mean treating the path to a version (vesta/vestasys.org/vesta/release/12) as a "pet name" for a more permanent referent, like its fingerprint (e8ca04dd4c895ee654ef9ceaec291e39). The downside to this approach would be that SDL imports would become far less readable:

from e8ca04dd4c895ee654ef9ceaec291e39 import
  vesta_release_12 = build.ves;

This would also make the output of vimports unreadable, and would probably completely break vupdate.
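The "pet name" idea can be sketched as two tables: one keyed by fingerprint and one mapping paths to fingerprints. This is a toy model under stated assumptions. The table layout and all function names are hypothetical, the fingerprint is treated as an opaque string, and nothing here reflects how the repository or evaluator actually stores data.

```python
store = {}      # fingerprint -> immutable contents
pet_names = {}  # path below /vesta -> fingerprint


def add_version(path, fingerprint, contents):
    # Contents are filed under the fingerprint; the path is only a label for it.
    store[fingerprint] = contents
    pet_names[path] = fingerprint


def rename(old_path, new_path):
    # Renaming moves the label and leaves the stored contents untouched.
    pet_names[new_path] = pet_names.pop(old_path)


def resolve_import(fingerprint):
    # An import written against the fingerprint never consults pet_names.
    return store[fingerprint]


fp = "e8ca04dd4c895ee654ef9ceaec291e39"
add_version("/vesta/example.com/dir/pkg/1", fp, "build.ves contents")
before = resolve_import(fp)
rename("/vesta/example.com/dir/pkg/1", "/vesta/example.com/otherdir/otherpkg/1")
after = resolve_import(fp)
```

In this model a rename cannot affect an import, because the import never goes through the path. The cost is the one noted above: a human reading the import sees only the fingerprint.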

It's probably better to just leave the package in its original position even after migrating to using it at the new position.

Having said all this, there are other reasons why it would be very helpful to have an accessible data structure which stores the DAG of versions for a package. That's not quite what Jim said, but such a data structure would probably include information about where all copies of the exact same version exist, which could be used for a purpose like this.

Transparently Re-write Old SDL Imports

JimHuggins suggests:

My (KenSchalk) instinct is to reject any idea that involves re-writing history. I like the previous idea more.

Unrelated: Tracking Renames Between Versions

There's an RFE about automatically handling tracking of renames, but this is about a different problem: keeping track of renames of files/directories within a package (i.e. the changes from one version to the next). It has nothing to do with renaming in appendable directories.