Best practice tools and techniques for merging a derived code snapshot with updated upstream code?

The situation is as follows: it is necessary to merge in changes from an upstream code base (from V1, to V2), into a third code base S1 that is derived/branched from V1, to produce a new code base S2.

We have access to version control logs and revisions between V1 and V2, and to the source of V1, V2 and S1. However, S1 does not come with a version control repository or a history: we cannot treat this as a merge between a branch and an evolved trunk, because the intermediate changes that led from V1 to S1 are not known individually.

We are therefore performing an incremental 3-way merge to produce S2, with the changes made in S1 updated to work on the basis of V2. (Our evolving V2 is naturally kept under version control.)

I have found WinMerge to be of use in identifying files that are simply different / missing / added between directory structures, and p4merge as a good 3-way merge tool at the file level.

What tools and techniques do you suggest? It is worth noting that the code bases are large, the number of intermediate revisions between V1 and V2 is large, and the changes between V1 and S1 are also large.

Personally, while probably not as fancy as the aforementioned Clone Detector, I'd start by running diff -ruN V1 S1 >/tmp/diffs.patch, which should tell you what was changed in S1 relative to V1 (old tree first, new tree second; -r recurses into subdirectories and -N includes added files). I'd suspect a high percentage of the diffs can be merged onto V2 by running patch -p1 </tmp/diffs.patch from the top of the V2 tree (-p1 strips the leading V1/ or S1/ directory from each path). Files that can't be patched cleanly will be patched as far as possible, and the rejected hunks are left in .rej files for hand-merging.

This should handle all the "easy" pieces in a few minutes. You may want to do a trial run first, then remove the trickier files from the .patch file and merge those entirely by hand with a 3-way merge tool.
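To get a feel for how many files are "easy" before touching anything, the three trees can be compared by content hash. This is only a minimal sketch, assuming V1, V2 and S1 are unpacked as plain directories and that files keep the same relative path in all three (so moved or renamed files show up as a delete plus an add). The function and bucket names are made up for illustration.

```python
import hashlib
from pathlib import Path


def tree_hashes(root):
    """Map each file's path (relative to root) to a SHA-256 of its contents."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in root.rglob("*")
        if p.is_file()
    }


def triage(v1_dir, v2_dir, s1_dir):
    """Sort every path into one of four buckets based on who changed it."""
    v1, v2, s1 = (tree_hashes(d) for d in (v1_dir, v2_dir, s1_dir))
    buckets = {"identical": [], "take_v2": [], "take_s1": [], "merge_by_hand": []}
    for path in sorted(set(v1) | set(v2) | set(s1)):
        # A missing file is represented by None, so adds and deletes
        # fall out of the same comparisons as edits.
        base, theirs, ours = v1.get(path), v2.get(path), s1.get(path)
        if theirs == ours:
            bucket = "identical"      # V2 and S1 already agree
        elif ours == base:
            bucket = "take_v2"        # only upstream touched it
        elif theirs == base:
            bucket = "take_s1"        # only the derived tree touched it
        else:
            bucket = "merge_by_hand"  # both sides changed it differently
        buckets[bucket].append(path)
    return buckets
```

Only the merge_by_hand list needs patch or a 3-way merge tool; the other buckets can be resolved by copying a file (or honouring a deletion) from one side.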

If the changes are too extensive (massive refactoring or moving large amounts of code from file to file), then you may need to use tools like the ones Ira mentioned.

What you want to know are the deltas between V2 and S1, and where they are.

WinMerge tells you whether files are exactly the same, missing, or different. If they are different, it won't tell you what they have in common, if anything, which is the basis for a merge.

I'd use a (our) Clone Detector across V2 and S1 to find out what they have in common at the granularity of language structures. Code blocks that are clones from V2 into the same file in S1 are in some sense "already merged"; where there are clones of V2 code in a different file in S1, there has likely been code movement. Where there are parameterizable differences, the clone detector (at least ours) will be able to tell you what the parameters ("edits") are and you can decide how to merge them. Where the code is very different, the clone detector won't say anything, but you can get that list by subtracting the files the clone detector says are mostly clones from those that WinMerge says are different. These very different files will likely be difficult to merge.

For files that are mostly clones of one another, you can use our Smart Differencer to tell you how the V1 file could have been modified to produce the S1 file; that will provide fine-grained change information.
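As a very rough stand-in for the "mostly clones" versus "very different" split, here is a sketch that scores same-named files by line similarity using Python's difflib. To be clear, this is not how the Clone Detector works: it compares raw lines rather than language structures, it only pairs files with the same relative path, and it will not notice code that moved to another file. The function names and the 0.5 threshold are arbitrary assumptions.

```python
import difflib
from pathlib import Path


def similarity(path_a, path_b):
    """Return a 0..1 line-based similarity ratio between two text files."""
    a = Path(path_a).read_text(errors="replace").splitlines()
    b = Path(path_b).read_text(errors="replace").splitlines()
    return difflib.SequenceMatcher(None, a, b, autojunk=False).ratio()


def rank_by_similarity(v2_dir, s1_dir, threshold=0.5):
    """Split files present in both trees into mostly-same and very-different."""
    v2_dir, s1_dir = Path(v2_dir), Path(s1_dir)
    mostly_same, very_different = [], []
    for v2_file in sorted(p for p in v2_dir.rglob("*") if p.is_file()):
        rel = v2_file.relative_to(v2_dir)
        s1_file = s1_dir / rel
        if not s1_file.is_file():
            continue  # added or removed files are a separate question
        score = round(similarity(v2_file, s1_file), 2)
        target = mostly_same if score >= threshold else very_different
        target.append((rel.as_posix(), score))
    return mostly_same, very_different
```

The very_different list is a first guess at the files that will be painful to merge. On a large code base the line matching can be slow, so it may be worth running it only on files already known to differ.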

If V is not in a good VCS, it might pay to import V into a whole new VCS to start. Then create a branch at V1 and import every old copy of S1 you can find from backups, old build trees, working copies, etc. Build as much of a history from V1 to S1 as possible in the VCS. If someone has done some drastic reformatting along the way it may be possible to normalize that by applying the same tool to the revisions of the V2 branch.

Then use the merge tool and work your way through the conflicts. If there are a lot (and there probably will be), you may want to merge incrementally at stages along the V1..S1 and V1..V2 histories so you can commit intermediate work.

I really recommend Beyond Compare. It has a clean GUI, great comparison algorithms, 3-file comparison, directory structure comparison and more.

You should use Git to do the merge. Check out S1, then git merge v2.

Even if S1 has lots of changes between it and V1, Git should handle renames and the like with about as few conflicts as you can hope for. Alternatively, you can rebase (replay changes) from the range v1 to v2 on top of s1, or from the range v1 to s1 on top of v2; it depends on what you want to attain. A rebase is not really a 3-way merge.

How you get the v1..v2 and v1..s1 history into Git is up to you. If no migration tool exists, scripting the export at each revision should do it. Then just commit your results on top of S1 as S2 in the SCM you are using.
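If all you have is the three snapshots, the simplest version of this can be scripted. The following is a sketch only, assuming git is on the PATH, Python 3.8 or later, and that the snapshot directories contain no .git directory of their own. The branch names (upstream, derived), the tag v1, the placeholder commit identity and the function names are all made up for illustration. It imports each tree as a single commit, so V1 becomes the merge base; importing the intermediate upstream revisions as well would give Git more to work with.

```python
import shutil
import subprocess
from pathlib import Path

# Placeholder identity so commits work even where git is not configured.
IDENTITY = ["-c", "user.name=merge-sketch",
            "-c", "user.email=merge-sketch@example.invalid"]


def git(repo, *args, check=True):
    """Run one git command inside repo and return the CompletedProcess."""
    return subprocess.run(["git", "-C", str(repo), *IDENTITY, *args],
                          check=check, capture_output=True, text=True)


def commit_snapshot(repo, snapshot, message):
    """Replace the working tree with a snapshot's contents and commit it."""
    for entry in repo.iterdir():
        if entry.name == ".git":
            continue
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)
        else:
            entry.unlink()
    shutil.copytree(snapshot, repo, dirs_exist_ok=True,
                    ignore=shutil.ignore_patterns(".git"))
    git(repo, "add", "-A")  # stages additions, edits and deletions
    git(repo, "commit", "-q", "--allow-empty", "-m", message)


def merge_snapshots(repo, v1_dir, v2_dir, s1_dir):
    """Build a repo with V1 as base, then merge V2 into S1.

    Returns True if the merge was clean, False if conflicts remain
    in the working tree for hand resolution.
    """
    repo = Path(repo)
    repo.mkdir(parents=True, exist_ok=True)
    git(repo, "init", "-q")
    commit_snapshot(repo, v1_dir, "V1 (common base)")
    git(repo, "tag", "v1")
    git(repo, "checkout", "-q", "-b", "upstream", "v1")
    commit_snapshot(repo, v2_dir, "V2 (upstream)")
    git(repo, "checkout", "-q", "-b", "derived", "v1")
    commit_snapshot(repo, s1_dir, "S1 (derived snapshot)")
    merge = git(repo, "merge", "-m", "S2: merge V2 into S1", "upstream",
                check=False)
    return merge.returncode == 0
```

When it returns False, the repository is left in the middle of the merge on the derived branch, so git status lists the conflicted files and git mergetool (or p4merge, Beyond Compare, etc.) can be used to work through them before committing S2.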

Hope this helps,

You can find me on the #git IRC channel if you want my help; plenty of others there are VERY good at merging things as well.