I am writing a Python script to get a list of commits that are about to be applied by a git pull operation. The excellent GitPython library is a great base to start from, but the subtle inner workings of git are killing me. Here is what I have at the moment (simplified and annotated version):
repo = git.Repo(path)                       # get the local repo
local_commit = repo.commit()                # latest local commit
remote = git.remote.Remote(repo, 'origin')  # remote repo
info = remote.fetch()[0]                    # fetch changes
remote_commit = info.commit                 # latest remote commit
if local_commit.hexsha == remote_commit.hexsha:  # local is up to date; end
    return
# for every remote commit
while remote_commit.hexsha != local_commit.hexsha:
    authors.append(remote_commit.author.email)  # note the author
    remote_commit = remote_commit.parents[0]    # navigate up to the parent
Essentially, it collects the authors of all commits that will be applied by the next git pull. This works well in the simple case, but it has the following problems:
- When the local commit is ahead of the remote, my code never finds it and just walks through every commit back to the first one.
- A remote commit can have more than one parent, and the local commit can be the second parent. This means that my code will never find the local commit in the remote repository.
I can deal with remote repositories being behind the local one: just look in the other direction (local to remote) at the same time. The code gets messy, but it works. But this last problem is killing me: now I need to navigate a (potentially unlimited) tree to find a match for the local commit. This is not just theoretical: my latest change was a repo merge which presents this very problem, so my script is not working.
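To make the merge problem concrete, here is a minimal sketch on a toy history. It uses plain dicts rather than GitPython objects, and every name in it (PARENTS, ancestors, incoming, first_parent_walk) is hypothetical, made up for illustration only. It shows a first-parent walk like the one in my script missing the local commit, and what I assume the full traversal would have to compute instead: the commits reachable from the remote head but not from the local one.

```python
from collections import deque

# Toy history: commit id -> list of parent ids (hypothetical data).
# "M" is a merge commit on the remote whose SECOND parent is the local commit "L".
PARENTS = {
    "root": [],
    "A": ["root"],
    "L": ["A"],       # local head
    "B": ["root"],
    "C": ["B"],
    "M": ["C", "L"],  # remote head
}


def ancestors(start, parents):
    """Return every commit reachable from start, including start itself."""
    seen = set()
    queue = deque([start])
    while queue:
        commit = queue.popleft()
        if commit in seen:
            continue
        seen.add(commit)
        queue.extend(parents[commit])  # follow ALL parents, not just the first
    return seen


def incoming(local, remote, parents):
    """Commits reachable from remote but not from local."""
    return ancestors(remote, parents) - ancestors(local, parents)


def first_parent_walk(start, target, parents):
    """Follow only parents[0], as the script above does.

    Returns the visited path and whether target was reached.
    """
    path = []
    commit = start
    while commit != target:
        path.append(commit)
        if not parents[commit]:
            return path, False  # hit the root without meeting target
        commit = parents[commit][0]
    return path, True


if __name__ == "__main__":
    print(first_parent_walk("M", "L", PARENTS))
    print(sorted(incoming("L", "M", PARENTS)))
```

The set difference also covers the first problem: when the local commit is ahead of the remote, the result is simply empty. Note that a set loses the commit order, so this is only the membership part of what I need.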
Getting an ordered list of commits in the remote repository, as repo.iter_commits() does for a local Repo, would be a great help, but I haven't found out how to do that in the documentation. Can I just get a Repo object for the remote repository?
Or is there another approach that might get me there, and am I using a hammer to drive screws?