Update indexer instantiation. Allow loc from index with duplicates. #46
Codecov Report
@@            Coverage Diff             @@
##           master      #46      +/-   ##
==========================================
- Coverage   88.36%   88.16%   -0.21%
==========================================
  Files           7        7
  Lines        1143     1174      +31
==========================================
+ Hits         1010     1035      +25
- Misses        133      139       +6
sparsity/sparse_frame.py
Outdated
| new_data = SparseFrame(new_mat, columns=index, | ||
| index=self.index) | ||
| else: | ||
| raise ValueError('Only supported axes are 0 and 1.') |
| assert res.index.date.min() == dt.date(2016, 1, 15) | ||
| res = res.compute(get=get_sync) | ||
| assert res.index.levels[0].max().date() == dt.date(2016, 2, 15) | ||
| assert res.index.levels[0].min().date() == dt.date(2016, 1, 15) |
No need to use datetime if you already need/have pandas, but that's so lightweight I don't think it justifies changing it.
michcio1234 left a comment
Awesome that we can use new pandas now. I have just a few comments and one concern (the one with __getitem__).
sparsity/sparse_frame.py
Outdated
| """Create an indexer like _name in the class.""" | ||
| if getattr(cls, name, None) is None: | ||
| _v = int(pd.__version__.split('.')[1]) | ||
| if _v >= 23: |
I am okay with this for now, but it would be prettier to compare version tuples, like this:
In [6]: tuple(map(int, pd.__version__.split('.')))
Out[6]: (0, 22, 0)
In [7]: tuple(map(int, pd.__version__.split('.'))) >= (0, 23, 0)
Out[7]: False
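One caveat with the tuple comparison suggested above: `int()` raises on components such as `0rc1`, so pre-release or dev version strings would break it. Below is a minimal sketch of a more tolerant parser. The helper name `version_tuple` is hypothetical and not part of this PR; it keeps only the leading digits of each component, which means a release candidate compares equal to its final release.

```python
import re


def version_tuple(version, parts=3):
    """Return the leading numeric components of a version string as ints.

    Hypothetical helper for illustration; suffixes such as 'rc1' are ignored.
    """
    numbers = []
    for piece in version.split('.')[:parts]:
        match = re.match(r'\d+', piece)
        if match is None:
            # Stop at the first component that does not start with a digit.
            break
        numbers.append(int(match.group()))
    # Pad with zeros so that '0.23' compares equal to (0, 23, 0).
    return tuple(numbers) + (0,) * (parts - len(numbers))


# Usage, assuming pandas is imported as pd:
#     if version_tuple(pd.__version__) >= (0, 23, 0): ...
```

Comparing whole tuples also keeps working across a major version bump, where looking only at the second component would not.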
sparsity/sparse_frame.py
Outdated
| if item is not None and len(item) > 0: | ||
| return self.reindex_axis(item, axis=1) | ||
| else: | ||
| return self |
So if I do my_frame[[]], I'll get the whole frame in return? I think it should return a frame with no columns.
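To make the expected behaviour concrete, here is a minimal sketch using a toy class. `ToyFrame` is a hypothetical stand-in and not the `SparseFrame` API; it only illustrates that dropping the `len(item) > 0` special case makes an empty selection return a frame with no columns.

```python
class ToyFrame:
    """Hypothetical column-indexed frame; not the SparseFrame implementation."""

    def __init__(self, data):
        # data maps column name -> list of values
        self.data = dict(data)

    @property
    def columns(self):
        return list(self.data)

    def __getitem__(self, item):
        # Wrap a single label so frame['a'] and frame[['a']] behave alike.
        if not isinstance(item, (tuple, list)):
            item = [item]
        # No special case for an empty selection: it yields an empty frame.
        return ToyFrame({name: self.data[name] for name in item})


frame = ToyFrame({'a': [1, 2], 'b': [3, 4]})
empty = frame[[]]
```

Here `empty.columns` is an empty list rather than the full column set, which is the behaviour the comment above asks for.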
| res = sp.SparseFrame.concat(res.compute(get=get_sync).tolist()) | ||
| assert res.index.date.max() == dt.date(2016, 2, 15) | ||
| assert res.index.date.min() == dt.date(2016, 1, 15) | ||
| res = res.compute(get=get_sync) |
Yeah, it's been deprecated:
dask/dask@2036040
| assert len(sf.loc['A'].index) == 3 | ||
| assert len(sf.loc['B'].index) == 2 | ||
| assert np.all(sf.loc['A'].todense().values == np.identity(5)[:3]) | ||
| assert np.all(sf.loc['B'].todense().values == np.identity(5)[3:]) |
It looks like you wanted to test columns too, but didn't do it.
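For reference, a pure-Python sketch of what the assertions above check: selecting a label that occurs several times in the index should return every matching row. The helper `loc_rows` is hypothetical and works on plain lists, not on a `SparseFrame`; the final check hints at the kind of column assertion the test could also make.

```python
def loc_rows(index, rows, label):
    """Return (sub_index, sub_rows) for every position whose label matches."""
    positions = [i for i, lab in enumerate(index) if lab == label]
    return [index[i] for i in positions], [rows[i] for i in positions]


# Index with duplicates, mirroring the test: three 'A' rows, two 'B' rows.
index = ['A', 'A', 'A', 'B', 'B']
identity = [[1 if r == c else 0 for c in range(5)] for r in range(5)]

a_index, a_rows = loc_rows(index, identity, 'A')
b_index, b_rows = loc_rows(index, identity, 'B')

# A row selection should leave the number of columns untouched.
widths = {len(row) for row in a_rows + b_rows}
```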
@michcio1234 update
| if not isinstance(item, (tuple, list)): | ||
| item = [item] | ||
| if item is not None and len(item) > 0: | ||
| if len(item) > 0: |
@kayibal Previously you took care of the situation when None is passed, now you don't. I'm not sure if it's necessary, just wanted to drop a hint.
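A small sketch of the wrapping step from the hunk above, isolated in a hypothetical helper `normalize_item`. It suggests that once a non-list, non-tuple item has been wrapped, `item` can no longer be `None`, so the removed `item is not None` check was always true at that point. A `None` argument ends up as a one-element list and would be looked up as a column label.

```python
def normalize_item(item):
    """Wrap a single label in a list, as the first two lines of the hunk do."""
    if not isinstance(item, (tuple, list)):
        item = [item]
    return item


wrapped_none = normalize_item(None)   # None becomes a one-element list
wrapped_empty = normalize_item([])    # an empty list is passed through unchanged
```

Whether `None` should instead raise or return the whole frame is a design decision this sketch does not settle.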
Ended up updating to pandas>=0.23.0 and also fixing CircleCI.