Return a DataFrame from HPAT Jited function #173
Comments
@bigwater thanks for your report. We'll look into it.
Hi,
I am trying to use HPAT to accelerate an ETL process. HPAT gives a significant speedup for the data frame transformations on a multi-core CPU, but I have run into an issue that I have not been able to figure out.
When the jitted function returns the data frame, I get either no speedup or an error. The example with minimal code is listed as follows.
In the baseline case, `time python test_hpat3.py` takes 30.31s. We found that using more MPI processes for this example program only makes it slower.
The observation is different when I remove the `return df` from the JITted function: in that case the speedup grows with the number of processes used. In addition, if I use even more processes, an error is reported.
I am not sure whether the slowdown and the error are expected, since I am quite new to HPAT.
Could you explain what is happening and suggest how to proceed? Let me know if other information is needed.
Since I need to pass the data frame on after the ETL process, how can I return it from the HPAT jitted function?
Thank you so much.
Best regards,
Hongyuan Liu
Software configuration:
hpat 0.30.0 py37hc547734_15 intel/label/test
numba 0.45.0 py37h962f231_0