calculating cumsums on a groupby object #3141

Open
tommylees112 opened this issue Jul 18, 2019 · 5 comments · May be fixed by #3417

@tommylees112 commented Jul 18, 2019

How do I go about calculating cumsums on a groupby object?

I have a Dataset that looks like the following:

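import numpy as np
import pandas as pd
import xarray as xr

# coordinates: a regular lat/lon grid and month-end timestamps
lat = np.linspace(-5.175003, 5.9749985, 224)
lon = np.linspace(33.524994, 42.274994, 176)
time = pd.date_range(start='1981-01-31', end='2019-04-30', freq='M')

# random values standing in for precipitation, shaped (time, lat, lon)
data = np.random.randn(len(time), len(lat), len(lon))
dims = ['time', 'lat', 'lon']
coords = {'time': time, 'lat': lat, 'lon': lon}

ds = xr.Dataset({'precip': (dims, data)}, coords=coords)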

Out[]:
<xarray.Dataset>
Dimensions:  (lat: 224, lon: 176, time: 460)
Coordinates:
  * time     (time) datetime64[ns] 1981-01-31 1981-02-28 ... 2019-04-30
  * lat      (lat) float64 -5.175 -5.125 -5.075 -5.025 ... 5.875 5.925 5.975
  * lon      (lon) float64 33.52 33.57 33.62 33.67 ... 42.12 42.17 42.22 42.27
Data variables:
    precip   (time, lat, lon) float64 0.006328 0.2969 1.564 ... 0.6675 2.32

I need to group by year and calculate the cumsum within each year, so that I have a value for each month (timestep) and each pixel (lat-lon pair).

But the cumsum operation doesn't work on a groupby object:

ds.groupby('time.year').cumsum(dim='time')

Out[]:
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-12-dceee5f5647c> in <module>
      9 display(ds_)
     10 
---> 11 ds_.groupby('time.year').cumsum(dim='time')

AttributeError: 'DatasetGroupBy' object has no attribute 'cumsum'

Is there a workaround?

INSTALLED VERSIONS
------------------
commit: None
python: 3.7.0 | packaged by conda-forge | (default, Nov 12 2018, 12:34:36) 
[Clang 4.0.1 (tags/RELEASE_401/final)]
python-bits: 64
OS: Darwin
OS-release: 18.2.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: en_US.UTF-8
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.4
libnetcdf: 4.6.2

xarray: 0.12.2
pandas: 0.24.2
numpy: 1.16.4
scipy: 1.3.0
netCDF4: 1.5.1.2
pydap: None
h5netcdf: None
h5py: 2.9.0
Nio: None
zarr: None
cftime: 1.0.3.4
nc_time_axis: None
PseudonetCDF: None
rasterio: 1.0.17
cfgrib: 0.9.7
iris: None
bottleneck: 1.2.1
dask: 1.2.2
distributed: 1.28.1
matplotlib: 3.1.0
cartopy: 0.17.0
seaborn: 0.9.0
numbagg: None
setuptools: 41.0.1
pip: 19.1
conda: None
pytest: 4.5.0
IPython: 7.1.1
sphinx: 2.0.1

@dcherian (Contributor) commented Jul 18, 2019

I wonder if this is as easy as adding ops.inject_cum_methods(Dataset.groupby) at the end of core/groupby.py?
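
To illustrate the general idea of injecting cumulative methods into a groupby class, here is a minimal toy sketch. `ToyGroupBy` and `inject_cumulative_methods` are hypothetical names invented for this example; this is not xarray's `ops.inject_cum_methods` and does not reflect its internals. It only shows the pattern of attaching `cumsum`/`cumprod` methods that delegate to a per-group apply.

```python
import numpy as np


class ToyGroupBy:
    """Hypothetical stand-in for a groupby object (not xarray's class)."""

    def __init__(self, values, labels):
        self.values = np.asarray(values)
        self.labels = np.asarray(labels)

    def apply(self, func):
        # Run func on each group separately and write results back in place.
        out = np.empty_like(self.values)
        for label in np.unique(self.labels):
            mask = self.labels == label
            out[mask] = func(self.values[mask])
        return out


def inject_cumulative_methods(cls):
    """Attach cumsum/cumprod methods that delegate to cls.apply (illustrative only)."""
    for name in ("cumsum", "cumprod"):
        np_func = getattr(np, name)

        # Bind np_func as a default argument so each method keeps its own function.
        def method(self, _func=np_func):
            return self.apply(_func)

        method.__name__ = name
        setattr(cls, name, method)


inject_cumulative_methods(ToyGroupBy)

toy = ToyGroupBy([1, 2, 3, 4, 5, 6], [2000, 2000, 2000, 2001, 2001, 2001])
toy_result = toy.cumsum()  # the running sum restarts at each new label
```

In this toy version the running sum restarts whenever the label changes, which is the behaviour requested for the yearly groups in the question.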

@shoyer (Member) commented Jul 18, 2019

It looks like ds.groupby('time.year').apply(lambda x: x.cumsum(dim='time')) mostly works for now.

But yes, it would be great to add this.
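
The workaround above can be sketched on a small synthetic dataset so the result is easy to check by eye. The dataset here (two years of monthly ones on a 2 x 3 grid) and the helper name `yearly_cumsum` are assumptions made for illustration. The sketch also assumes that newer xarray releases expose the same operation as `.map`, and falls back to `.apply` when `.map` is not available.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Small synthetic dataset: 24 monthly timesteps of ones on a 2 x 3 grid.
time = pd.date_range("2000-01-01", periods=24, freq="MS")
data = np.ones((len(time), 2, 3))
ds = xr.Dataset(
    {"precip": (("time", "lat", "lon"), data)},
    coords={"time": time, "lat": [0.0, 1.0], "lon": [10.0, 20.0, 30.0]},
)


def yearly_cumsum(group):
    # Cumulative sum along time within a single year's group.
    return group.cumsum(dim="time")


grouped = ds.groupby("time.year")

# Prefer .map when it exists (assumed to be the newer spelling); otherwise use .apply.
apply_per_group = getattr(grouped, "map", None) or grouped.apply
result = apply_per_group(yearly_cumsum)

# Values at one pixel: the running total should restart at the start of each year.
first_pixel = result["precip"].isel(lat=0, lon=0).values
print(first_pixel)
```

Because every input value is 1, the running total at each pixel should count up through the months of one year and then restart in the next. Depending on the xarray version, the `time` coordinate may not be carried through `cumsum`, which may be part of why this only "mostly works".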

@dcherian (Contributor) commented Jul 19, 2019

@tommylees112 Are you up for sending in a PR? It's an easy fix...

@tommylees112 (Author) commented Jul 21, 2019

Would love to! Sorry, I have been away this weekend. Do I just clone the repo, write the code, and send in a PR from a new branch?

(first PR on a public repo!)

@nbren12 (Contributor) commented Jul 22, 2019

Xarray has a pretty extensive contributor's guide that you might find helpful. In short, the way to contribute changes is to create your own fork of xarray, commit/push some changes, and finally submit a pull request (PR).

@dcherian dcherian pinned this issue Sep 7, 2019
VladSkripniuk added a commit to VladSkripniuk/xarray that referenced this issue Oct 19, 2019
@VladSkripniuk VladSkripniuk linked a pull request that will close this issue Oct 19, 2019
3 of 3 tasks complete