Node-based vs. Edge-based Systems

The revision history of a source can be viewed as a directed acyclic graph. Versions are the nodes, and the changes leading from one version to another are the edges. When a node has more than one outgoing edge, that represents a branch in the version history. When a node has more than one incoming edge, that represents a merge operation.
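
As a minimal sketch of this graph view, the following Python fragment counts outgoing and incoming edges to find branch and merge points. The version names and the `classify` helper are hypothetical, invented only for illustration.

```python
from collections import defaultdict

# Hypothetical history: each pair is (parent version, child version).
edges = [
    ("v1", "v2"),
    ("v2", "v3"),      # main line continues
    ("v2", "v2.b1"),   # second outgoing edge from v2: a branch
    ("v3", "v4"),
    ("v2.b1", "v4"),   # second incoming edge to v4: a merge
]

def classify(edges):
    """Return (branch_points, merge_points) for a list of (parent, child) edges."""
    out_degree = defaultdict(int)
    in_degree = defaultdict(int)
    for parent, child in edges:
        out_degree[parent] += 1
        in_degree[child] += 1
    branches = sorted(v for v, n in out_degree.items() if n > 1)
    merges = sorted(v for v, n in in_degree.items() if n > 1)
    return branches, merges

branch_points, merge_points = classify(edges)
print("branches at:", branch_points)
print("merges at:", merge_points)
```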

Some systems are designed around storing versions as the primary object. People call these node-based systems. Vesta is clearly in this category. Some others in this category are:

Other systems are designed around storing the changes that lead from one version to another as the primary object. These systems usually have features for applying changes in contexts other than the ones they were originally created in. People call these edge-based systems. Some systems in this category are:

Some systems use a delta encoding for storage, but don't treat edges as first-class objects.

Some people would claim that this distinction is a false dichotomy. For example, monotone actually stores both nodes (the complete state of the source at a particular point in time) and edges (representations of the changes leading from one state to another). Also, it's possible to derive edges from nodes (by comparing the source states) or to derive nodes from edges (by applying the changes to construct the source state at a particular point in time).
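
The two derivations can be sketched in a few lines of Python. This is only an illustration under simple assumptions: a node is a list of text lines, an edge is a list of line-range replacements, and `derive_edge` and `apply_edge` are hypothetical helper names. It does not describe how monotone or any other system actually stores its data.

```python
from difflib import SequenceMatcher

def derive_edge(old, new):
    """Compare two node states (lists of lines) and return an edge: a list of
    (start, end, replacement_lines) edits expressed against `old`."""
    matcher = SequenceMatcher(a=old, b=new, autojunk=False)
    return [
        (i1, i2, new[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]

def apply_edge(old, edge):
    """Apply an edge to the node it was derived from, yielding the next node."""
    result = list(old)
    # Work from the end so that earlier indices remain valid.
    for start, end, replacement in reversed(edge):
        result[start:end] = replacement
    return result

v1 = ["alpha", "beta", "gamma"]
v2 = ["alpha", "BETA", "gamma", "delta"]

edge = derive_edge(v1, v2)           # edge derived from two nodes
assert apply_edge(v1, edge) == v2    # node derived from a node plus an edge
```

Note that the edge produced here is only meaningful relative to the exact node it was derived from, which is the limitation that edge-based systems work to overcome.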

Perhaps a good way to distinguish the systems is by which kind of objects (nodes and/or edges) the system gives names to and allows the user to refer to, manipulate, or utilize in some way. Since Vesta has no representation or name for edges, it seems clearly in the node-based class.

Edge Representations

Edges are often called patches or changesets.

One tricky problem with edges has to do with the desire to apply them in a context different from the one they were created in. The UNIX patch(1) command has a very simple method of doing this: recognizing the text to change based on the context of surrounding lines. Some edge-based systems put a lot of design effort into this problem. One example worth looking at is Darcs' theory of patches.
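
The idea of locating a change by its surrounding lines can be sketched as follows. This is a deliberately simplified illustration, not the actual algorithm used by patch(1), which also uses line number hints and tolerates partial context mismatches. The hunk layout (a dict with 'before', 'old', 'new', and 'after' keys) and the `apply_hunk` name are assumptions made for the example.

```python
def apply_hunk(target, hunk):
    """Apply a hunk wherever its context and old lines appear in `target`.

    `hunk` is a hypothetical dict with keys 'before', 'old', 'new', 'after',
    each a list of lines. Returns a new list of lines, or raises ValueError
    if the context cannot be found.
    """
    pattern = hunk["before"] + hunk["old"] + hunk["after"]
    for pos in range(len(target) - len(pattern) + 1):
        if target[pos:pos + len(pattern)] == pattern:
            start = pos + len(hunk["before"])
            end = start + len(hunk["old"])
            return target[:start] + hunk["new"] + target[end:]
    raise ValueError("context not found; hunk does not apply")

hunk = {
    "before": ["def f():"],
    "old": ["    return 1"],
    "new": ["    return 2"],
    "after": [""],
}

# The target has two extra header lines that were absent when the hunk was
# created, so line numbers alone would point at the wrong place.
target = ["# header", "# more", "def f():", "    return 1", "", "x = f()"]
patched = apply_hunk(target, hunk)
```

Matching on content rather than position is what lets the change survive unrelated edits elsewhere in the file, and it is also why the change fails to apply once the surrounding lines themselves have been modified.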

Another complication comes from the desire to separate the changes represented by some branch or other line of development in order to apply only a subset of its changes to a different line of development. This is often referred to as cherry picking.
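
A rough sketch of cherry picking, assuming each change on a branch is named and records the single line it replaces. The change format and the `apply_change` and `cherry_pick` helpers are hypothetical; real systems use much richer change representations.

```python
def apply_change(lines, change):
    """Replace the first occurrence of change['old'] with change['new'].
    Raises ValueError if the old line is absent from `lines`."""
    i = lines.index(change["old"])
    return lines[:i] + change["new"] + lines[i + 1:]

def cherry_pick(base, branch_changes, wanted):
    """Apply only the changes whose names are in `wanted`, in branch order."""
    result = list(base)
    for change in branch_changes:
        if change["name"] in wanted:
            result = apply_change(result, change)
    return result

# Changes made on a hypothetical development branch.
branch_changes = [
    {"name": "fix-typo", "old": "helo", "new": ["hello"]},
    {"name": "add-feature", "old": "end", "new": ["feature", "end"]},
]

# A different line of development takes the fix but not the feature.
release = ["start", "helo", "end"]
picked = cherry_pick(release, branch_changes, {"fix-typo"})
```

This only works cleanly when the selected changes do not depend on the ones left behind, which is part of what makes cherry picking hard in general.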

Most of the existing algorithms and well-understood methods for manipulating edges and applying them outside their original context only work on files like source code (ASCII text separated by newlines and edited directly by humans). (In fact, some systems deal poorly with files that don't conform to this limited definition.) This is unfortunate, as there are many other kinds of files which people need to keep under revision control.
