New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Detect cycle during projectivize #10877
Detect cycle during projectivize #10877
Conversation
|
Discussed with @kadarakos : I'll make some changes, so putting it in draft temporarily. |
Communicate presence of cycles through an output argument.
The has_cycle pointer is too easy to misuse. Ideally, we would have a sum type like Rust's `Result` here, but C++ is not there yet.
We are now explicitly checking for cycles, so the algorithm must always terminate. Either we encounter the head, we find a root, or a cycle.
|
|
||
|
|
||
| cdef bool _is_nonproj_arc(int tokenid, const vector[int]& heads) nogil: | ||
| cdef pair[bool, bool] _is_nonproj_arc(int tokenid, const vector[int]& heads) nogil: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we could use C-enumerations (or even C++11 enum class for additional type-safety, if Cython supports it) in place of pair[bool, bool]? That would be more readable/explicit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would be possible from the pair[bool, bool] functions, but not for pair[int, bool] (unless we hack the semantics like before). I am not sure if I like using different types for different related functions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was saying to @kadarakos, I wish we had sum types
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After discussion with @shadeMe and @kadarakos, redid it with C++ exceptions. Now we have to decide which of the TMTOWTDI's we want.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was telling @danieldk that I think its nice to have the exception, because even though it might be an overkill here, we could use it to refactor maybe some of the clunky errors with raise from cython.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be nice in some places, but it also means that we have to be more mindful about except signatures. I found out today if you call a noexcept + function (can throw a C++ exception) in a cdef nogil function, the exception is swallowed? So every place we call a noexcept + function, we have to to take care that we have an except specification. Also, the exception is not propagated as a C++ exception, but as a Python exception from that point, so you need something like except * or except -1 (depending on your return value).
There are probably places in our code base where we are calling functions that may throw (except +) in cdef nogil functions, where we do not take proper care of propagating the exception.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am still not sure if this is a net improvement over returning errors as values and maybe having a template class for errors. At least, something like
cdef Result[int, string] myfunction(...) nogil
is harder to mis than an except + that is only visible in some pxd.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm. What do you think of having always tuple[Any, bool, bool, bool, ....] where the Any is the actual value and the bools refer to various errors that can be encountered?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That has very weak semantics. The benefit of the Result approach is that the error could be anything strongly-typed (like an enum class or string).
|
So what's left now is to add some cycle-detection test cases to |
Yeah and |
Description
Attempting to detect cycles during
projectivizestep inget_aligned_parse. Fixes #10859.Types of change
Small bugfix
Checklist