How is DVCS (git/mercurial) branching and merging support better than svn's?

Lots of articles on DVCSs claim superior branching and merging support as one reason to move from svn to a DVCS. How exactly do these systems handle branching and merging differently, and what makes their approach better?

Historically, the difference between merge-tracking in git and svn was this: git has merge-tracking, and until version 1.5, svn didn't. At all. To merge, you always had to specify exactly which changes were to be merged. If you merged one branch into another more than once, you had to keep track by hand of which revisions had and hadn't been merged, and manually select only the changes that hadn't been merged yet, to avoid conflicts. Good luck with that if you ever cherry-picked any changes.

Beginning with version 1.5 (released in 2008), if your client, server, and repository are all up-to-date, then svn is capable of acting a lot more intelligently: it uses properties (svn:mergeinfo) to keep track of where a branch came from and what changes have already been merged into it. The upshot is that in many cases you can just svn merge BRANCHNAME and have the right thing happen. But due to its "bolted on" nature it's still not very fast and not entirely robust. Git, on the other hand, has to handle merge scenarios well because of its DVCS nature, and it was designed from the beginning with data structures (like the particular kind of DAG it uses) and algorithms (such as the recursive and octopus merge strategies) that are suited to the task.
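To make the bookkeeping concrete, here is a minimal sketch of what merge tracking has to remember, whether you do it by hand or a tool does it for you. It is not how Subversion actually stores or computes anything; the function names and revision numbers are made up for illustration.

```python
# Hypothetical model of merge tracking: names and data are illustrative only.

def eligible_revisions(branch_revisions, already_merged):
    """Return the branch revisions that still need to be merged, in order."""
    return [rev for rev in branch_revisions if rev not in already_merged]


def record_merge(already_merged, revisions):
    """Remember which revisions were merged, so they are not re-applied later."""
    return already_merged | set(revisions)


# Revisions committed on a feature branch (made-up numbers).
feature_revisions = [101, 104, 107, 110]

# One change (104) was cherry-picked earlier; nothing else has been merged yet.
merged_so_far = record_merge(set(), [104])

# First full merge: everything except the cherry-picked revision.
first_merge = eligible_revisions(feature_revisions, merged_so_far)
merged_so_far = record_merge(merged_so_far, first_merge)

# More work lands on the branch; a second merge picks up only the new revision.
feature_revisions.append(115)
second_merge = eligible_revisions(feature_revisions, merged_so_far)

print(first_merge)
print(second_merge)
```

Before 1.5 the `merged_so_far` set lived in your head or in commit messages; with merge tracking the tool keeps the equivalent record for you.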

The difference is not, contrary to popular perception, due to the distributed nature of DVCSs versus Subversion's centralised model. There is nothing inherent in a centralised model that entails that branching and merging will be substandard.

My take is that Subversion made a massive design gaffe by deciding to model code-base directory structure, branching and tagging (and all manner of other code management patterns) in a single, unified directory tree, which made the problem of reliably detecting branching activity one hundred times more difficult than it would have been if branching were explicit in the model.

Don't forget the human components of any version control system. Earlier today I created and deleted 3 different local git branches, because that was the least disruptive way to accomplish cleaning up a problem on the main branch. Try that on centralized version control and you are likely to get a lecture from the server admin or a storm of angry emails, if you even have privileges to do it at all. The very fact that you can have branches in a private repo removes many cultural barriers to using branches effectively. The algorithms used by centralized systems are catching up to DVCS. Those human factors will remain.

From Joel's hginit:

Here’s the difference. Imagine that you and I are working on some code, and we branch that code, and we each go off into our separate workspaces and make lots and lots of changes to that code separately, so they have diverged quite a bit.

When we have to merge, Subversion tries to look at both revisions—my modified code, and your modified code—and it tries to guess how to smash them together in one big unholy mess. It usually fails, producing pages and pages of “merge conflicts” that aren’t really conflicts, simply places where Subversion failed to figure out what we did.

By contrast, while we were working separately in Mercurial, Mercurial was busy keeping a series of changesets. And so, when we want to merge our code together, Mercurial actually has a whole lot more information: it knows what each of us changed and can reapply those changes, rather than just looking at the final product and trying to guess how to put it together.
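The general idea that knowing what each side changed beats comparing two end results can be sketched with a toy line-by-line merge. This is only an illustration, not the algorithm Mercurial or Subversion actually uses; the function name is hypothetical, and it assumes all versions have the same number of lines, which real tools do not require.

```python
def three_way_merge(base, mine, yours):
    """Merge two edited versions line by line against their common starting point.

    Simplification: all three versions must have the same number of lines.
    """
    if not (len(base) == len(mine) == len(yours)):
        raise ValueError("this sketch only handles equal-length versions")
    merged, conflicts = [], []
    for index, (b, m, y) in enumerate(zip(base, mine, yours)):
        if m == y:        # both sides agree (unchanged, or the same edit)
            merged.append(m)
        elif m == b:      # only "yours" changed this line
            merged.append(y)
        elif y == b:      # only "mine" changed this line
            merged.append(m)
        else:             # both changed it differently: a real conflict
            merged.append(None)
            conflicts.append(index)
    return merged, conflicts


base = ["a = 1", "b = 2", "c = 3"]
mine = ["a = 10", "b = 2", "c = 3"]
yours = ["a = 1", "b = 2", "c = 30"]

# Looking only at the two end results, lines 0 and 2 simply differ.
differing = [i for i, (m, y) in enumerate(zip(mine, yours)) if m != y]

# Knowing the starting point shows that each line was changed by one side only.
merged, conflicts = three_way_merge(base, mine, yours)

print(differing)
print(merged, conflicts)
```

With only the two final versions there are two lines to guess about; with the starting point as well, neither of them is a conflict.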

Branching or tagging in SVN is merely copying a particular directory and its subdirs to another location within the same repository. In git, branches (and tags) are instead described as metadata (much like CVS), except that git does not put all of this data in a single file but spreads it across many, which allows much faster updates since you don't have to rewrite a huge "foo.c,v", for example. Furthermore, git makes heavy use of pointers (http://eagain.net/articles/git-for-computer-scientists/), so there is in fact little to update in the first place when something changes (e.g. when a commit is made).
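A toy model of the "branches are just pointers" idea might look like the following. It is a rough sketch, not git's real object format: the class and method names are invented here, and the commit id is just a short hash of the parents and message rather than of real content.

```python
import hashlib


class TinyRepo:
    """Toy model: commits are immutable records, branches are names pointing at one."""

    def __init__(self):
        self.commits = {}  # commit id -> (parent ids, message)
        self.refs = {}     # branch or tag name -> commit id

    def commit(self, branch, message):
        parent = self.refs.get(branch)
        parents = (parent,) if parent else ()
        # Illustrative id only; real git hashes the full commit object.
        commit_id = hashlib.sha1(repr((parents, message)).encode()).hexdigest()[:8]
        self.commits[commit_id] = (parents, message)
        self.refs[branch] = commit_id  # the only thing updated: one pointer
        return commit_id

    def branch(self, new_name, from_name):
        # Creating a branch copies no files; it adds one more name for a commit.
        self.refs[new_name] = self.refs[from_name]


repo = TinyRepo()
first = repo.commit("main", "initial import")
repo.branch("feature", "main")
second = repo.commit("feature", "work on feature")

print(repo.refs)
```

Creating `feature` adds one entry to `refs`, and committing on it moves only that entry; `main` keeps pointing at the first commit.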

The difference lies in the repository format used by most DVCSs: the Directed Acyclic Graph.

SVN will just store your history as a series of lines, each branch its own line. A DVCS, by contrast, stores it in a DAG, which carries much better information for merging.
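As a rough illustration of the extra information a DAG carries, the sketch below stores each commit with its parents and looks up the best common ancestor of two commits. The history and the function names are made up, and real tools use much more efficient algorithms than this brute-force search.

```python
# Hypothetical history: commit id -> list of parent ids.
history = {
    "A": [],
    "B": ["A"],
    "C": ["B"],       # main continues
    "D": ["B"],       # feature branches off at B
    "E": ["D"],
    "M": ["C", "E"],  # merge commit: it records both parents
    "F": ["M"],       # more work on main
    "G": ["E"],       # more work on feature
}


def ancestors(history, commit):
    """Return the commit itself plus everything reachable through parent links."""
    seen, stack = set(), [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(history[current])
    return seen


def merge_bases(history, left, right):
    """Common ancestors that are not ancestors of another common ancestor."""
    common = ancestors(history, left) & ancestors(history, right)
    return {
        c for c in common
        if not any(c != other and c in ancestors(history, other) for other in common)
    }


first_base = merge_bases(history, "C", "E")   # before the first merge
second_base = merge_bases(history, "F", "G")  # merging feature into main again

print(first_base, second_base)
```

Because the merge commit M records E as a parent, the second merge starts from E rather than from the original branch point B, so nothing that was already merged has to be considered again.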