Given a branch, I'd like to see a list of commits that exist only on that branch. In this question we discuss ways to see which commits are on one branch but not one or more specified other branches.
This is slightly different. I'd like to see which commits are on one branch but not on any other branches.
The use case is in a branching strategy where some branches should only be merged to, and never committed directly on. This would be used to check if any commits have been made directly on a "merge-only" branch.
EDIT: Below are steps to set up a dummy git repo to test:
git init
echo foo1 >> foo.txt
git add foo.txt
git commit -am "initial valid commit"
git checkout -b merge-only
echo bar >> bar.txt
git add bar.txt
git commit -am "bad commit directly on merge-only"
git checkout master
echo foo2 >> foo.txt
git commit -am "2nd valid commit on master"
git checkout merge-only
git merge master
Only the commit with message "bad commit directly on merge-only", which was made directly on the merge-only branch, should show up.
We just found this elegant solution
git log --first-parent --no-merges
In your example, of course, the initial commit still shows up.
This answer does not exactly answer the question, because the initial commit still shows up. On the other hand, many people coming here seem to find the answer they are looking for.
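To see what this prints on the test repository from the question, here is a minimal sketch. It assumes a POSIX shell with git on the PATH; the scratch directory and the demo identity are made up for illustration, and the `--no-edit` flag is only there so the merge does not open an editor.

```shell
# Rebuild the question's dummy repository in a scratch directory (illustrative setup).
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # force the branch name used in the question
git config user.name "Demo User"          # hypothetical identity, only for this sketch
git config user.email "demo@example.com"

echo foo1 >> foo.txt && git add foo.txt
git commit -q -m "initial valid commit"
git checkout -q -b merge-only
echo bar >> bar.txt && git add bar.txt
git commit -q -m "bad commit directly on merge-only"
git checkout -q master
echo foo2 >> foo.txt
git commit -q -am "2nd valid commit on master"
git checkout -q merge-only
git merge -q --no-edit master

# Follow only first parents from merge-only and hide merge commits.
first_parent_log=$(git log --first-parent --no-merges --format=%s)
echo "$first_parent_log"
# bad commit directly on merge-only
# initial valid commit
```

The merge commit is hidden and "2nd valid commit on master" is never visited, but the root commit is still listed, which is the caveat mentioned above.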
Courtesy of my dear friend Redmumba:
git log --no-merges origin/merge-only \
--not $(git for-each-ref --format="%(refname)" refs/remotes/origin |
grep -Fv refs/remotes/origin/merge-only)
...where origin/merge-only is your remote merge-only branch name. If working on a local-only git repo, substitute refs/remotes/origin with refs/heads, and substitute the remote branch name origin/merge-only with the local branch name merge-only, i.e.:
git log --no-merges merge-only \
--not $(git for-each-ref --format="%(refname)" refs/heads |
grep -Fv refs/heads/merge-only)
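As a rough check of the local-only form, the sketch below runs it against a repository shaped like the one in the question. The scratch directory, the demo identity and the empty commits are assumptions made for brevity. It also uses `grep -Fxv` instead of `grep -Fv`; that is my own variation so that only the exact ref name is filtered out, not every ref that merely contains it.

```shell
# Scratch repository with the same shape as the question's example (illustrative).
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"
git commit -q --allow-empty -m "initial valid commit"
git checkout -q -b merge-only
git commit -q --allow-empty -m "bad commit directly on merge-only"
git checkout -q master
git commit -q --allow-empty -m "2nd valid commit on master"
git checkout -q merge-only
git merge -q --no-edit master

# Every local branch except merge-only becomes a negated starting point.
# -x matches whole lines, so a branch such as merge-only-2 would not be dropped by accident.
others=$(git for-each-ref --format="%(refname)" refs/heads |
         grep -Fxv refs/heads/merge-only)
only_here=$(git log --no-merges --format=%s merge-only --not $others)
echo "$only_here"
# bad commit directly on merge-only
```

Unlike the first-parent approach, the initial commit is not listed here, because it is also reachable from master.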
git log origin/dev..HEAD
This will show you all the commits on your current branch that are not reachable from origin/dev.
@Prakash's answer works. Just for clarity ...
git checkout feature-branch
git log master..HEAD
lists the commits on feature-branch but not the upstream branch (typically your master).
Maybe this could help:
git show-branch
Try this:
git rev-list --all --not $(git rev-list --all ^branch)
Basically, git rev-list --all ^branch gets all revisions that are not in branch. You then take all the revisions in the repo and subtract that list, which leaves the revisions that are only in branch.
After @Brian's comments:
From git rev-list's documentation:
List commits that are reachable by following the parent links from the given commit(s)
So a command like git rev-list A, where A is a commit, will list commits that are reachable from A, inclusive of A.
With that in mind, something like
git rev-list --all ^A
will list commits not reachable from A.
So git rev-list --all ^branch will list all commits not reachable from the tip of branch. That removes every commit that is in branch, leaving the commits that exist only in other branches.
Now let's come to git rev-list --all --not $(git rev-list --all ^branch)
This will be like git rev-list --all --not {commits only in other branches}
So we want to list all commits that are not reachable from the commits that exist only in other branches.
That is the set of commits that are only in branch. Let's take a simple example:
             master
             |
A------------B
 \
  \
   C--------D--------E
                     |
                     branch
Here the goal is to get C, D and E, the commits not in any other branch.
git rev-list --all ^branch
gives only B.
Now, git rev-list --all --not B is what we come down to, which is also git rev-list --all ^B: we want all commits not reachable from B. In our case those are C, D and E, which is what we want.
Hope this explains how the command works.
Edit after comment:
git init
echo foo1 >> foo.txt
git add foo.txt
git commit -am "initial valid commit"
git checkout -b merge-only
echo bar >> bar.txt
git add bar.txt
git commit -am "bad commit directly on merge-only"
git checkout master
echo foo2 >> foo.txt
git commit -am "2nd valid commit on master"
After the above steps, if you run git rev-list --all --not $(git rev-list --all ^merge-only) you will get the commit you were looking for: the "bad commit directly on merge-only" one.
But once you do the final step in your steps, git merge master, the command will not give the expected output. At that point no commit exists outside merge-only, because the one extra commit on master has also been merged into merge-only. So git rev-list --all ^branch gives an empty result, and hence git rev-list --all --not $(git rev-list --all ^branch) will give all the commits in merge-only.
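The before/after behaviour described here can be reproduced with a small sketch. The helper name `only_on_branch`, the scratch directory, the demo identity and the empty commits are all assumptions made for illustration.

```shell
# Scratch repository following the steps above, stopping before the merge (illustrative).
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"
git commit -q --allow-empty -m "initial valid commit"
git checkout -q -b merge-only
git commit -q --allow-empty -m "bad commit directly on merge-only"
git checkout -q master
git commit -q --allow-empty -m "2nd valid commit on master"

# Hypothetical helper wrapping the command from this answer.
only_on_branch() { git rev-list --all --not $(git rev-list --all ^"$1"); }

# Before the merge: exactly one commit, the bad one.
before=$(only_on_branch merge-only)
before_subject=$(git log -1 --format=%s $before)
echo "$before_subject"
# bad commit directly on merge-only

# After the merge the inner rev-list is empty, so nothing is excluded any more.
git checkout -q merge-only
git merge -q --no-edit master
after_count=$(only_on_branch merge-only | grep -c .)
echo "$after_count"
# 4
```

The four commits after the merge are the initial commit, the bad commit, the second commit on master and the merge commit itself.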
This is not exactly a real answer, but I need access to formatting, and a lot of space. I'll try to describe the theory behind what I consider the two best answers: the accepted one and the (at least currently) top-ranked one. But in fact, they answer different questions.
Commits in Git are very often "on" more than one branch at a time. Indeed, that's much of what the question is about. Given:
...--F--G--H   <-- master
         \
          I--J   <-- develop
where the uppercase letters stand in for actual Git hash IDs, we're often looking for only commit H or only commits I-J in our git log output. Commits up through G are on both branches, so we'd like to exclude them.
(Note that in graphs drawn like this, newer commits are towards the right. The names select the single right-most commit on that line. Each of those commits has a parent commit, which is the commit to their left: the parent of H is G, and the parent of J is I. The parent of I is G again. The parent of G is F, and F has a parent that simply isn't shown here: it's part of the ... section.)
For this particularly simple case, we can use:
git log master..develop # note: two dots
to view I-J, or:
git log develop..master # note: two dots
to view H only. The right-side name, after the two dots, tells Git: yes, these commits. The left-side name, before the two dots, tells Git: no, not these commits. Git starts at the end (at commit H or commit J) and works backwards. For (much) more about this, see Think Like (a) Git.
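A minimal sketch of the two-dot form on a graph of this shape, assuming a scratch repository where empty commits named F through J stand in for the letters in the drawing (the `commit` helper and the demo identity are made up for the example):

```shell
# Build ...--F--G--H (master) with I--J (develop) forking at G (illustrative).
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"
commit() { git commit -q --allow-empty -m "$1"; }   # hypothetical helper

commit F; commit G
git branch develop            # develop forks at G
commit H                      # master moves on to H
git checkout -q develop
commit I; commit J

dev_only=$(git log --format=%s master..develop)     # yes develop, not master
master_only=$(git log --format=%s develop..master)  # yes master, not develop
echo "$dev_only"
# J
# I
echo "$master_only"
# H
```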
The way the original question is phrased, the desire is to find commits that are reachable from one particular name, but not from any other name in that same general category. That is, if we have a more complex graph:
               O--P   <-- name5
              /
             N   <-- name4
            /
...--F--G--H--I---M   <-- name1
         \       /
          J-----K   <-- name2
           \
            L   <-- name3
we could pick out one of these names, such as name4 or name3, and ask: which commits can be found by that name, but not by any of the other names? If we pick name3 the answer is commit L. If we pick name4, the answer is no commits at all: the commit that name4 names is commit N but commit N can be found by starting at name5 and working backwards.
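The question "which commits can be found by this name, but not by any other name?" can be tried out on a graph of this shape. The following is only a sketch: the helper names `commit` and `only_on`, the scratch directory and the empty commits are assumptions, and local branches stand in for the names.

```shell
# Rebuild the drawn graph with local branches name1..name5 (illustrative).
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/name1
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"
commit() { git commit -q --allow-empty -m "$1"; }   # hypothetical helper

commit F; commit G
git branch name2                          # forks at G
commit H
git branch name4                          # forks at H
commit I
git checkout -q name2; commit J
git branch name3                          # forks at J
commit K
git checkout -q name3; commit L
git checkout -q name4; commit N
git checkout -q -b name5; commit O; commit P
git checkout -q name1
git merge -q --no-ff -m M name2           # merge commit M: parents I and K

# Commits reachable from branch $1 but from no other local branch.
only_on() {
    git log --format=%s "$1" --not $(git for-each-ref --format="%(refname)" refs/heads |
                                     grep -Fxv "refs/heads/$1")
}

on_name3=$(only_on name3)
on_name4=$(only_on name4)
on_name1=$(only_on name1)
echo "name3: $on_name3"
# name3: L
echo "name4: $on_name4"
# name4:
echo "name1:" $on_name1
# name1: M I
```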
The accepted answer works with remote-tracking names, rather than branch names, and allows you to designate one (the one spelled origin/merge-only) as the selected name and look at all other names in that namespace. It also avoids showing merges: if we pick name1 as the "interesting name", and say show me commits that are reachable from name1 but not any other name, we would otherwise see merge commit M as well as regular commit I; with --no-merges, only I is shown.
The most popular answer is quite different. It's all about traversing the commit graph without following both legs of a merge, and without showing any of the commits that are merges. If we start with name1, for instance, we won't show M (it's a merge), but assuming the first parent of merge M is commit I, we won't even look at commits J and K. We'll end up showing commit I, and also commits H, G, F, and so on. None of these are merge commits and all are reachable by starting at M and working backwards, visiting only the first parent of each merge commit.
The most-popular answer is pretty well suited to, for instance, looking at master when master is intended to be a merge-only branch. If all "real work" was done on side branches which were subsequently merged into master, we will have a pattern like this:
I---------M---------N   <-- master
 \       / \       /
  o--o--o   o--o--o
where all the un-letter-named o commits are ordinary (non-merge) commits and M and N are merge commits. Commit I is the initial commit: the very first commit ever made, and the only one that should be on master that isn't a merge commit. If git log --first-parent --no-merges master shows any commit other than I, we have a situation like this:
I---------M----*----N   <-- master
 \       / \       /
  o--o--o   o--o--o
where we want to see commit * that was made directly on master, not by merging some feature branch.
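Here is a small sketch of that situation. The branch names feature1 and feature2, the `commit` helper, the scratch directory and the empty commits are all invented for the example; `--no-ff` is used so that each merge really produces a merge commit.

```shell
# Merge-only master with one stray direct commit, the "*" in the drawing (illustrative).
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"
commit() { git commit -q --allow-empty -m "$1"; }   # hypothetical helper

commit I
git checkout -q -b feature1; commit o1; commit o2; commit o3
git checkout -q master
git merge -q --no-ff -m M feature1        # merge commit M
git branch feature2                       # second feature forks at M
commit "direct commit on master"          # the commit that should not exist
git checkout -q feature2; commit o4; commit o5; commit o6
git checkout -q master
git merge -q --no-ff -m N feature2        # merge commit N

suspects=$(git log --first-parent --no-merges --format=%s master)
echo "$suspects"
# direct commit on master
# I
```

Anything listed besides the initial commit I was committed directly on master.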
In short, the popular answer is great for looking at master when master is meant to be merge-only, but is not as great for other situations. The accepted answer works for these other situations.
Are remote-tracking names like origin/master branch names?
Some parts of Git say they're not:
git checkout master
...
git status
says on branch master, but:
git checkout origin/master
...
git status
says HEAD detached at origin/master. I prefer to agree with git checkout / git switch: origin/master is not a branch name because you cannot get "on" it.
The accepted answer uses remote-tracking names origin/* as "branch names":
git log --no-merges origin/merge-only \
--not $(git for-each-ref --format="%(refname)" refs/remotes/origin |
grep -Fv refs/remotes/origin/merge-only)
The middle line, which invokes git for-each-ref, iterates over the remote-tracking names for the remote named origin.
The reason this is a good solution to the original problem is that we're interested here in someone else's branch names, rather than our branch names. But that means we've defined branch as something other than our branch names. That's fine: just be aware that you're doing this, when you do it.
git log traverses some part(s) of the commit graph
What we're really searching for here are series of what I have called daglets: see What exactly do we mean by "branch"? That is, we're looking for fragments within some subset of the overall commit graph.
Whenever we have Git look at a branch name like master, a tag name like v2.1, or a remote-tracking name like origin/master, we tend to want to have Git tell us about that commit and every commit that we can get to from that commit: starting there, and working backwards.
In mathematics, this is referred to as walking a graph. Git's commit graph is a Directed Acyclic Graph or DAG, and this kind of graph is particularly suited for walking. When walking such a graph, one will visit each graph vertex that is reachable via the path being used. The vertices in the Git graph are the commits, with the edges being arcs—one-way links—going from each child to each parent. (This is where Think Like (a) Git comes in. The one-way nature of arcs means that Git must work backwards, from child to parent.)
The two main Git commands for graph-walking are git log and git rev-list. These commands are extremely similar (in fact they're mostly built from the same source files) but their output is different: git log produces output for humans to read, while git rev-list produces output meant for other Git programs to read.[1] Both commands do this kind of graph-walk.
The graph walk they do is specifically: given some set of starting point commits (perhaps just one commit, perhaps a bunch of hash IDs, perhaps a bunch of names that resolve to hash IDs), walk the graph, visiting commits. Particular directives, such as --not or a prefix ^, or --ancestry-path, or --first-parent, modify the graph walk in some way.
As they do the graph walk, they visit each commit. But they only print some selected subset of the walked commits. Directives such as --no-merges or --before <date> tell the graph-walking code which commits to print.
In order to do this visiting, one commit at a time, these two commands use a priority queue. You run git log or git rev-list and give it some starting point commits. They put those commits into the priority queue. For instance, a simple:
git log master
turns the name master into a raw hash ID and puts that one hash ID into the queue. Or:
git log master develop
turns both names into hash IDs and—assuming these are two different hash IDs—puts both into the queue.
The priority of the commits in this queue is determined by still more arguments. For instance, the argument --author-date-order tells git log or git rev-list to use the author timestamp, rather than the committer timestamp. The default is to use the committer timestamp and pick the newest-by-date commit: the one with the highest numerical date. So with master develop, assuming these resolve to two different commits, Git will show whichever one came later first, because that will be at the front of the queue.
In any case, the revision walking code now runs in a loop:
- While there are commits in the queue:
  - Remove the first queue entry.
  - Decide whether to print this commit at all. For instance, --no-merges: print nothing if it is a merge commit; --before: print nothing if its date does not come before the designated time. If printing isn't suppressed, print the commit: for git log, show its log; for git rev-list, print its hash ID.
  - Put some or all of this commit's parent commits into the queue (as long as it isn't there now, and has not been visited already [2]). The normal default is to put in all parents. Using --first-parent suppresses all but the first parent of each merge.
(Both git log and git rev-list can do history simplification with or without parent rewriting at this point as well, but we'll skip over that here.)
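The loop can be imitated with a toy shell function. This is only a sketch of the idea, not how Git is implemented: the function name `walk` and the variables FIRST_PARENT and NO_MERGES are invented here, negated references and history simplification are left out, and ties between equal timestamps may be broken differently than Git breaks them, so the comparison below checks the set of visited commits rather than their order.

```shell
# Small scratch repository: A, then B on side, C on master, merge M (illustrative).
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"
commit() { git commit -q --allow-empty -m "$1"; }   # hypothetical helper
commit A
git checkout -q -b side; commit B
git checkout -q master; commit C
git merge -q --no-ff -m M side

# Toy version of the revision walk. FIRST_PARENT=1 and NO_MERGES=1 are
# made-up switches imitating --first-parent and --no-merges.
walk() {
    queue=$(git rev-parse "$@")           # starting commits, one hash per line
    seen=""
    while [ -n "$queue" ]; do
        # Remove the queue entry with the newest committer timestamp.
        cur=$(for c in $queue; do
                  echo "$(git log -1 --format=%ct "$c") $c"
              done | sort -rn | head -n 1 | cut -d' ' -f2)
        queue=$(printf '%s\n' $queue | grep -v "$cur")
        seen="$seen $cur"
        set -- $(git log -1 --format=%P "$cur")     # positional parameters = parents
        # Print unless suppressed: a merge commit has more than one parent.
        if [ "$NO_MERGES" != 1 ] || [ $# -le 1 ]; then echo "$cur"; fi
        # With FIRST_PARENT=1, keep only the first parent of a merge.
        if [ "$FIRST_PARENT" = 1 ] && [ $# -gt 1 ]; then set -- "$1"; fi
        for p in "$@"; do
            case "$seen $queue" in
                *"$p"*) ;;                # already visited or already queued
                *) queue="$queue $p" ;;
            esac
        done
    done
}

all_mine=$(walk master | sort)
all_git=$(git rev-list master | sort)
FIRST_PARENT=1
fp_mine=$(walk master | sort)
fp_git=$(git rev-list --first-parent master | sort)
FIRST_PARENT=0
NO_MERGES=1
nm_mine=$(walk master | sort)
nm_git=$(git rev-list --no-merges master | sort)
NO_MERGES=0
```

Note how the two switches act at different points: NO_MERGES only affects the printing step, while FIRST_PARENT changes which parents enter the queue at all.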
For a simple chain, like start at HEAD and work backwards when there are no merge commits, the queue always has one commit in it at the top of the loop. There's one commit, so we pop it off and print it and put its (single) parent into the queue and go around again, and we follow the chain backwards until we reach the very first commit, or the user gets tired of git log output and quits the program. In this case, none of the ordering options matter: there is only ever one commit to show.
When there are merges and we follow both parents (both "legs" of the merge), or when you give git log or git rev-list more than one starting commit, the sorting options matter.
Last, consider the effect of --not or ^ in front of a commit specifier. There are several ways to write these:
git log master --not develop
or:
git log ^develop master
or:
git log develop..master
all mean the same thing. The --not is like the prefix ^ except that it applies to more than one name:
git log ^branch1 ^branch2 branch3
means not branch1, not branch2, yes branch3; but:
git log --not branch1 branch2 branch3
means not branch1, not branch2, not branch3, and you have to use a second --not to turn it off:
git log --not branch1 branch2 --not branch3
which is a bit awkward. The two "not" directives are combined via XOR, so if you really want, you can write:
git log --not branch1 branch2 ^branch3
to mean not branch1, not branch2, yes branch3, if you want to obfuscate.
These all work by affecting the graph walk. As git log or git rev-list walks the graph, it makes sure not to put into the priority queue any commit that is reachable from any of the negated references. (In fact, they affect the starting setup too: negated commits can't go into the priority queue right from the command line, so git log master ^master shows nothing, for instance.)
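The equivalences above are easy to check on a toy repository. The sketch below assumes a scratch repository with branches master and develop that diverge after a shared commit; the `commit` helper, the demo identity and the empty commits are made up for the example.

```shell
# F--G shared, H only on master, I only on develop (illustrative).
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"
commit() { git commit -q --allow-empty -m "$1"; }   # hypothetical helper
commit F; commit G
git branch develop
commit H
git checkout -q develop; commit I
git checkout -q master

a=$(git rev-list master --not develop)
b=$(git rev-list ^develop master)
c=$(git rev-list develop..master)
d=$(git rev-list --not develop --not master)   # second --not switches negation off
e=$(git rev-list --not develop ^master)        # ^ under --not flips back to "yes"
none=$(git rev-list master ^master)            # a negated start point shows nothing

h=$(git rev-parse master)                      # commit H
for out in "$a" "$b" "$c" "$d" "$e"; do
    [ "$out" = "$h" ] && echo "same: only H"
done
[ -z "$none" ] && echo "master ^master: nothing"
```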
All of the fancy syntax described in the gitrevisions documentation makes use of this, and you can expose this with a simple call to git rev-parse. For instance:
$ git rev-parse origin/pu...origin/master # note: three dots
b34789c0b0d3b137f0bb516b417bd8d75e0cb306
fc307aa3771ece59e174157510c6db6f0d4b40ec
^b34789c0b0d3b137f0bb516b417bd8d75e0cb306
The three-dot syntax means commits reachable from either left or right side, but excluding commits reachable from both. In this case the origin/master commit, b34789c0b, is itself reachable from origin/pu (fc307aa37...) so the origin/master hash appears twice, once with a negation, but in fact Git achieves the three-dot syntax by putting in two positive references (the two non-negated hash IDs) and one negative one, represented by the ^ prefix.
Similarly:
$ git rev-parse master^^@
2c42fb76531f4565b5434e46102e6d85a0861738
2f0a093dd640e0dad0b261dae2427f2541b5426c
The ^@ syntax means all the parents of the given commit, and master^ itself (the first parent of the commit selected by branch-name master) is a merge commit, so it has two parents. These are the two parents. And:
$ git rev-parse master^^!
0b07eecf6ed9334f09d6624732a4af2da03e38eb
^2c42fb76531f4565b5434e46102e6d85a0861738
^2f0a093dd640e0dad0b261dae2427f2541b5426c
The ^! suffix means the commit itself, but none of its parents. In this case, master^ is 0b07eecf6.... We already saw both parents with the ^@ suffix; here they are again, but this time, negated.
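Since the hash IDs above are specific to one clone of the Git repository, here is a sketch that reproduces the same three expansions on a throwaway repository and compares them structurally. The repository layout, the `commit` helper and the demo identity are assumptions; here master itself is the merge commit, so the suffixes are applied to master rather than master^.

```shell
# A, then B on side, C on master, and merge commit M at master (illustrative).
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"
commit() { git commit -q --allow-empty -m "$1"; }   # hypothetical helper
commit A
git checkout -q -b side; commit B
git checkout -q master; commit C
git merge -q --no-ff -m M side

m=$(git rev-parse master)
p1=$(git rev-parse master^1)              # first parent: C
p2=$(git rev-parse master^2)              # second parent: B (tip of side)
base=$(git rev-parse master~2)            # A, the common ancestor of B and C

parents=$(git rev-parse master^@)         # all parents of M
self_only=$(git rev-parse master^!)       # M itself, parents negated
symmetric=$(git rev-parse side...master^) # B and C, minus what both can reach

echo "$parents"       # the hashes of C and B
echo "$self_only"     # the hash of M, then ^C and ^B
echo "$symmetric"     # two positive hashes and one negated merge base
```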
[1] Many Git programs literally run git rev-list with various options, and read its output, to know what commits and/or other Git objects to use.
[2] Because the graph is acyclic, it's possible to guarantee that none have been visited already, if we add the constraint never show a parent before showing all of its children to the priority. --date-order, --author-date-order, and --topo-order add this constraint. The default sort order (which has no name) doesn't. If the commit timestamps are screwy (if for instance some commits were made "in the future" by a computer whose clock was off) this could in some cases lead to odd looking output.
If you made it this far, you now know a lot about git log.
Summary:
- git log is about showing some selected commits while walking some or all of some part of the graph.
- The --no-merges argument, found in both the accepted and the currently-top-ranked answers, suppresses showing some commits that are walked.
- The --first-parent argument, from the currently-top-ranked answer, suppresses walking some parts of the graph, during the graph-walk itself.
- The --not prefix to command line arguments, as used in the accepted answer, suppresses ever visiting some parts of the graph at all, right from the start.
We get the answers we like, to two different questions, using these features.
Another variation of the accepted answer, to use with master:
git log origin/master --not $(git branch -a | grep -Fv master)
This filters out all commits that are reachable from any branch other than master.