Reading progress of Git operation using GitPython stuck (or not printing) #871

Closed
t89 opened this issue May 6, 2019 · 5 comments

@t89 t89 commented May 6, 2019

I'm trying to access the progress of time-consuming Git operations using GitPython. I tried the sample solution taken from the official documentation, and also tried passing in a method following the exact signature of the update method below. Every time I call fetch(), push(), or pull() with the parameter progress=<anything>, the program gets stuck and the update method is never called. If I call those operations without setting the progress parameter, everything works flawlessly.

  • $ git --version is 2.21.0
  • Calling sys.stdout.flush() after print() does not help either
  • I use assert to assure my repo objects are available and in the expected state
  • ProgressPrinter() does not evaluate to None
  • I tried calling the functions from the main thread and multithreaded
  • I took a look at the implementation (line 350) of RemoteProgress and also the implementation (line 815) of push() and do not see a reason why it would not continue execution
  • I found out that when I assign my ProgressPrinter instance to a variable and pass that variable, the program is not stuck anymore. Yet the update() method still does not get called and no progress is printed:
# Not stuck anymore, yet no progress
pp = ProgressPrinter()
fetch_info = origin.fetch(progress=pp)

Core of my implementation:

from git import RemoteProgress

class ProgressPrinter(RemoteProgress):
    def update(self,
               op_code,
               cur_count,
               max_count=None,
               message=''):
        print("Is this even called?")

And later on:

origin = repo.remotes.origin
assert origin.exists()
fetch_info = origin.fetch(progress=ProgressPrinter())

Any recommendations on how to investigate this problem further? I've been debugging this for several days now and feel like I am missing something.

@jeking3 jeking3 commented May 6, 2019

Try an older version of Git (v2.20.0). Git v2.21.0 changed its output strings to allow for localization, and that has caused at least one bug; let's see if it caused others...

@t89 t89 commented May 6, 2019

I switched to both the default macOS Git 2.14.ish and to 2.20.1. Neither version gets stuck anymore. Yet there is still no progress being printed.

I'll create a stripped-down script to check whether I can reproduce it reliably.

@t89 t89 commented May 6, 2019

Since I switched to the older Git version and later reinstalled Git 2.21.0, the very same code does not get stuck anymore. I don't know what caused this hiccup. I will keep an eye on it and let you know if I can reproduce this behaviour in the upcoming weeks.

I did solve the second problem, the progress not being logged, and have a proposal for an implementation change, or at least a documentation change:

If the git process finishes quickly, the update() method is never called. I expected it to be called at least once. May I suggest adding a method did_finish(exit_code, message) to the RemoteProgress class?

This way we do not break existing implementations and keep the progress handling in one place. Yes, I could check whether my fetch() returned a string and conclude from that that the process is done, but it would be a lot nicer if the RemoteProgress subclass could handle this as well.
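Until something like this exists in GitPython, the idea can be approximated on the caller's side. The following is only a rough sketch: did_finish() is the method proposed above and is not part of GitPython, and FinishAwareProgress and run_with_finish are hypothetical names made up for this illustration. Since the caller does not see Git's real exit code, the sketch reports 0 on success and otherwise falls back to the exception's status attribute if it has one (an assumption), else 1.

```python
try:
    from git import RemoteProgress
except ImportError:
    # Stand-in so the sketch also runs where GitPython is unavailable.
    class RemoteProgress:
        def update(self, op_code, cur_count, max_count=None, message=''):
            pass


class FinishAwareProgress(RemoteProgress):
    """Progress handler with a hypothetical did_finish() hook."""

    def __init__(self):
        super().__init__()
        self.update_calls = 0
        self.finished = None

    def update(self, op_code, cur_count, max_count=None, message=''):
        # May never run if the git process finishes quickly.
        self.update_calls += 1

    def did_finish(self, exit_code, message=''):
        # Hypothetical hook; here it is invoked by run_with_finish().
        self.finished = (exit_code, message)


def run_with_finish(operation, progress):
    """Call operation(progress=...) and always notify did_finish()."""
    try:
        result = operation(progress=progress)
    except Exception as exc:
        # Assumption: use the exception's `status` if present, else 1.
        progress.did_finish(getattr(exc, 'status', 1), str(exc))
        raise
    progress.did_finish(0, '')
    return result
```

Usage would look like run_with_finish(origin.fetch, FinishAwareProgress()), which keeps the "finished" handling inside the progress class even when update() is never called.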

Also, a note in the documentation for update() would be awesome, maybe with a hint for developers to grab a large repository and prepare it for a longer-running operation:

$ git reset --hard @~100
$ git remote remove origin
$ git reflog expire --expire=now --all && git gc --prune=now --aggressive
$ git remote add origin <url>
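
For a repeatable test setup, the same steps could be scripted. This is a minimal sketch that simply replays the commands above through subprocess; the function names and the commits_back and remote parameters are made up for this example. Note that the steps are destructive (hard reset, pruned reflog), so this should only be pointed at a throwaway clone.

```python
import subprocess


def slow_fetch_prep_commands(remote_url, commits_back=100, remote="origin"):
    """Return the git commands listed above as argv lists."""
    return [
        ["git", "reset", "--hard", f"@~{commits_back}"],
        ["git", "remote", "remove", remote],
        ["git", "reflog", "expire", "--expire=now", "--all"],
        ["git", "gc", "--prune=now", "--aggressive"],
        ["git", "remote", "add", remote, remote_url],
    ]


def prepare_repo_for_slow_fetch(repo_dir, remote_url, commits_back=100):
    """Run the commands in repo_dir, stopping at the first failure."""
    for argv in slow_fetch_prep_commands(remote_url, commits_back):
        subprocess.run(argv, cwd=repo_dir, check=True)
```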

@jeking3 jeking3 commented May 6, 2019

Perhaps update() should always be called at least once?
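
As a caller-side illustration of "at least once", one could count the progress calls and fire a single fallback call if none arrived. This is only a sketch and not how GitPython behaves or was changed: it assumes, as the original post suggests, that a plain callable with update()'s signature is accepted as the progress argument. CountingProgress, call_with_guaranteed_update, the fallback op code 0, and the fallback message are all invented for this example.

```python
class CountingProgress:
    """Callable wrapper that counts how often progress is reported."""

    def __init__(self, callback):
        self._callback = callback
        self.calls = 0

    def __call__(self, op_code, cur_count, max_count=None, message=''):
        self.calls += 1
        self._callback(op_code, cur_count, max_count, message)


def call_with_guaranteed_update(operation, callback, fallback_op_code=0):
    """Run operation(progress=...); invoke callback once if it never fired."""
    counter = CountingProgress(callback)
    result = operation(progress=counter)
    if counter.calls == 0:
        # Hypothetical fallback values; nothing here comes from Git itself.
        callback(fallback_op_code, 0, None, 'finished without progress output')
    return result
```

With this, call_with_guaranteed_update(origin.fetch, my_update_function) would report at least one update even for an operation that is already up to date.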

t89 added a commit to t89/GitPython that referenced this issue May 7, 2019

@t89 t89 commented May 7, 2019

I opened a pull request that changes the behaviour so that update() is called once even when Git answers with = [up to date]. My project, which depends on GitPython, has a deadline in ~6 weeks. Please let me know whether this change will be implemented so that I have time to adjust my planning. Thank you in advance!

@Byron Byron added this to the v2.1.12 - Bugfixes milestone Jul 20, 2019
@ghost ghost closed this in 687c8f0 Jul 20, 2019
ghost pushed a commit that referenced this issue Jul 20, 2019
Sebastian Thiel
ghost pushed a commit that referenced this issue Jul 20, 2019
Sebastian Thiel
…871""

This reverts commit 3bf002e.

Try again
ghost pushed a commit that referenced this issue Jul 20, 2019
Sebastian Thiel
…to date" #871"""

This reverts commit 9b628dc.

Definitely doesn't work
https://travis-ci.org/gitpython-developers/GitPython/builds/561361507
This issue was closed.