Update indexer instantiation. Allow loc from index with duplicates. by kayibal · Pull Request #46 · datarevenue-berlin/sparsity

kayibal · 2018-08-21T12:52:32Z

Ended up updating to pandas>=0.23.0 and also fixing CircleCI.

codecov · 2018-08-21T13:39:23Z

Codecov Report

Merging #46 into master will decrease coverage by 0.2%.
The diff coverage is 92.68%.

@@            Coverage Diff             @@
##           master      #46      +/-   ##
==========================================
- Coverage   88.36%   88.16%   -0.21%     
==========================================
  Files           7        7              
  Lines        1143     1174      +31     
==========================================
+ Hits         1010     1035      +25     
- Misses        133      139       +6

Impacted Files	Coverage Δ
sparsity/sparse_frame.py	`87.42% <92.68%> (-0.45%)`	⬇️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 4d5fd2b...c21824a.

vitords

Looks great to me!

vitords · 2018-08-21T19:03:04Z

sparsity/sparse_frame.py

+                new_data = SparseFrame(new_mat, columns=index,
+                                       index=self.index)
+            else:
+                raise ValueError('Only supported axes are 0 and 1.')


vitords · 2018-08-21T19:09:55Z

sparsity/test/test_dask_sparse_frame.py

-    assert res.index.date.min() == dt.date(2016, 1, 15)
+    res = res.compute(get=get_sync)
+    assert res.index.levels[0].max().date() == dt.date(2016, 2, 15)
+    assert res.index.levels[0].min().date() == dt.date(2016, 1, 15)


No need to use datetime if you already need/have pandas, but that's so lightweight I don't think it justifies changing it.

michcio1234

Awesome that we can use new pandas now. I have just a few comments and one concern (the one with __getitem__).

michcio1234 · 2018-08-22T07:06:24Z

sparsity/sparse_frame.py

+        """Create an indexer like _name in the class."""
+        if getattr(cls, name, None) is None:
+            _v = int(pd.__version__.split('.')[1])
+            if _v >= 23:


I am okay with this for now, but it would be prettier to do it like this:

In [6]: tuple(map(int, pd.__version__.split('.')))
Out[6]: (0, 22, 0)
In [7]: tuple(map(int, pd.__version__.split('.'))) > (0, 23, 0)
Out[7]: False
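A minimal sketch of the tuple comparison suggested above, written so that it can be tried without pandas installed. The helper name `version_tuple` is hypothetical and not part of this PR. Two points motivate it: comparing only the minor component (`_v >= 23`) would return False for a `1.0.0` release, and a plain `int()` over every component fails on suffixes such as `rc1`. The sketch uses `>=` to match the check in the diff.

```python
import re


def version_tuple(version, parts=3):
    """Return the leading numeric components of a version string as ints.

    Hypothetical helper for illustration. Suffixes such as 'rc1' are
    ignored, so a release candidate compares equal to its final release.
    """
    numbers = []
    for piece in version.split('.')[:parts]:
        match = re.match(r'\d+', piece)
        if match is None:
            break  # stop at the first component with no leading digits
        numbers.append(int(match.group()))
    # Pad short versions such as '1.0' so tuples have a uniform length.
    numbers += [0] * (parts - len(numbers))
    return tuple(numbers)


# In real code the argument would be pd.__version__.
for candidate in ('0.22.0', '0.23.0', '0.23.0rc1', '1.0.0'):
    print(candidate, version_tuple(candidate) >= (0, 23, 0))
```

Tuples compare element by element, so `(1, 0, 0) >= (0, 23, 0)` holds even though the minor component alone is smaller than 23.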

michcio1234 · 2018-08-22T07:10:54Z

sparsity/sparse_frame.py

+        if item is not None and len(item) > 0:
+            return self.reindex_axis(item, axis=1)
+        else:
+            return self


So if I do my_frame[[]], I'll get the whole frame in return? I think it should return a frame with no columns.
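To illustrate the behaviour requested here, the following toy class shows a `__getitem__` that treats an empty selection like any other list of labels, so `frame[[]]` yields a frame with no columns instead of the whole frame. `ToyFrame` is a hypothetical stand-in written for this example; it is not the `SparseFrame` implementation and does not use `reindex_axis`.

```python
class ToyFrame:
    """Hypothetical column-labelled container, for illustration only."""

    def __init__(self, data):
        # Maps column name -> list of values; insertion order is kept.
        self.data = dict(data)

    @property
    def columns(self):
        return list(self.data)

    def __getitem__(self, item):
        # Wrap a single label so both frame['a'] and frame[['a']] work.
        if not isinstance(item, (tuple, list)):
            item = [item]
        # No special case for an empty selection: the comprehension
        # then builds a frame with zero columns.
        return ToyFrame({name: self.data[name] for name in item})


frame = ToyFrame({'a': [1, 2], 'b': [3, 4]})
print(frame[[]].columns)
print(frame['a'].columns)
print(frame[['b', 'a']].columns)
```

Dropping the `len(item) > 0` branch removes the surprising case where an empty key silently means "everything".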

michcio1234 · 2018-08-22T07:17:29Z

sparsity/test/test_dask_sparse_frame.py

-    res = sp.SparseFrame.concat(res.compute(get=get_sync).tolist())
-    assert res.index.date.max() == dt.date(2016, 2, 15)
-    assert res.index.date.min() == dt.date(2016, 1, 15)
+    res = res.compute(get=get_sync)


Isn't get kwarg deprecated?

No, but it's superfluous.

Yeah, it's been deprecated:
dask/dask@2036040

michcio1234 · 2018-08-22T07:22:24Z

sparsity/test/test_sparse_frame.py

+    assert len(sf.loc['A'].index) == 3
+    assert len(sf.loc['B'].index) == 2
+    assert np.all(sf.loc['A'].todense().values == np.identity(5)[:3])
+    assert np.all(sf.loc['B'].todense().values == np.identity(5)[3:])


It looks like you wanted to test columns too, but didn't do it.

kayibal · 2018-08-22T11:07:09Z

@michcio1234 update

michcio1234 · 2018-08-22T11:12:36Z

sparsity/sparse_frame.py

        if not isinstance(item, (tuple, list)):
            item = [item]
-        if item is not None and len(item) > 0:
+        if len(item) > 0:


@kayibal Previously you took care of the situation when None is passed, now you don't. I'm not sure if it's necessary, just wanted to drop a hint.

It should raise an error

Now it's great.
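A sketch of the outcome agreed in this thread: indexing with None is rejected explicitly rather than being wrapped into `[None]` by the scalar-to-list step. The helper name `normalize_column_key` and the choice of `ValueError` are assumptions made for illustration; the exception type actually raised in commit c21824a is not shown on this page.

```python
def normalize_column_key(item):
    """Turn a column key into a list of labels (hypothetical helper).

    None is rejected up front; without that check it would be wrapped
    into [None] and pass the length test in the diff above.
    """
    if item is None:
        # Assumed exception type, chosen for this sketch only.
        raise ValueError('Cannot index with None.')
    if not isinstance(item, (tuple, list)):
        item = [item]
    return list(item)


print(normalize_column_key('a'))
print(normalize_column_key(('a', 'b')))
try:
    normalize_column_key(None)
except ValueError as err:
    print('rejected:', err)
```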

michcio1234

All good.

kayibal added 4 commits July 25, 2018 16:54

Update indexer instantiation. Allow loc from index with duplicates.

21b6011

Fix integer based indexing in newer pandas version

a610018

Fix location based indexing

4be9628

Support older (<0.23.0) versions of pandas

6ec1810

kayibal requested review from michcio1234 and vitords August 21, 2018 13:39

kayibal added 6 commits August 21, 2018 15:47

Fix broken test due to new pandas behaviour when passed to np.all

9d74e6c

Update circleci configuration

c9329fb

Misc fixes

aa4bc8f

fix indexing error with older scipy versions (<1.0.0)

ecf3762

Support column indexing in _xs method

74075ca

Additional test for getitem

f1866f1

vitords approved these changes Aug 21, 2018

michcio1234 requested changes Aug 22, 2018

Implement PR feedback

c854ce6

michcio1234 reviewed Aug 22, 2018

michcio1234 approved these changes Aug 22, 2018

raise error if sparse frame is indexed with None

c21824a

kayibal merged commit 8d2f8f6 into master Aug 22, 2018

kayibal deleted the fix/indexing-duplicates branch August 23, 2018 14:07

michcio1234 mentioned this pull request Sep 5, 2018

[WIP] Update to pandas 0.23 #44

Closed

3 participants
