Change sparse matrix to array #96

Lakshmi-bashyam · 2025-07-29T10:37:40Z

Replace `spmatrix` with `sparray` for SciPy compatibility

This pull request updates all references to scipy.sparse.spmatrix to use the new scipy.sparse.sparray class, in line with SciPy's ongoing deprecation of spmatrix. This change ensures compatibility with recent and future versions of SciPy.

Changes Made

Replaced all instances of spmatrix type checks and imports with sparray.
Modified test cases to make them compatible with the new change.

Related Issue

Fixes stwfsapy#89

codecov · 2025-07-29T11:20:15Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 100.00%. Comparing base (cf652c4) to head (04e04fc).
⚠️ Report is 5 commits behind head on master.

Additional details and impacted files

@@            Coverage Diff            @@
##            master       #96   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           16        16           
  Lines          943       943           
=========================================
  Hits           943       943

gmmajal

Can you make the changes specified for the docstrings and clarify whether text_features needs to be modified?

gmmajal · 2025-08-04T15:17:28Z

stwfsapy/predictor.py

+    def predict(self, X) -> csr_array:
        """
        Predicts binary concept match labels for each input text.



I think in the docstring for the predict() method we can replace "A sparse matrix of shape ..." with "A sparse array of shape ..." for the sake of consistency.

gmmajal · 2025-08-04T15:33:32Z

stwfsapy/predictor.py

+                    txt_vec = self.text_vectorizer_.transform([inp])
                else:
                    txt_vec = 0
                txt_feat = self.text_features_.transform([text])[0]


Can you explain why, here (line 433) and in line 458, index 0 is accessed after the transform method is applied? I see that this was changed for the text_vectorizer attribute, i.e. index 0 is no longer accessed. Should it also be changed for the text_features attribute, or does that have a different data structure?

I looked at what the transform methods of text_vectorizer and text_features do. The one for text_vectorizer returns a sparse matrix, whereas the one for text_features returns a NumPy array, so the data structures are indeed different.

The text vectorizer produces a csr_matrix from scikit-learn, so we can’t switch it to a sparray at this point.
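
The indexing difference discussed above can be sketched as follows. This is only an illustration with stand-in data: `dense_feats` and `sparse_vec` are hypothetical names, not the real outputs of text_features_ or text_vectorizer_, and the sketch assumes a SciPy version that provides `csr_array`.

```python
import numpy as np
from scipy.sparse import csr_array

# Stand-in for a dense feature block: one row per input text.
dense_feats = np.array([[1.0, 0.0, 3.0]])

# Stand-in for a sparse vectorizer output: one row per input text.
sparse_vec = csr_array(np.array([[0.0, 2.0, 0.0, 4.0]]))

# Indexing a NumPy array with [0] drops the first axis.
dense_row = dense_feats[0]

# Explicit 2D indexing keeps the sparse row two-dimensional.
sparse_row = sparse_vec[[0], :]

# .nnz reports the number of stored entries.
stored = sparse_row.nnz

shapes = (dense_row.shape, sparse_row.shape)
```

Here `dense_row` has shape `(3,)` while `sparse_row` keeps shape `(1, 4)`, which is why the two attributes cannot simply be indexed the same way.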

Lakshmi-bashyam · 2025-08-14T11:02:23Z

We can’t safely transition from spmatrix to the sparray hierarchy just yet. Our dependency on scikit-learn still poses compatibility risks. While scikit-learn has begun its migration toward supporting sparray, the internal transition is still in progress.

Specifically, scikit-learn’s PR #31072 (“First steps toward sparray migration pass 2”) is still open, indicating that full adoption isn’t complete yet.

gmmajal · 2025-08-14T11:23:04Z

> We can’t safely transition from spmatrix to the sparray hierarchy just yet. Our dependency on scikit-learn still poses compatibility risks. While scikit-learn has begun its migration toward supporting sparray, the internal transition is still in progress.
>
> Specifically, scikit-learn’s PR #31072 (“First steps toward sparray migration pass 2”) is still open, indicating that full adoption isn’t complete yet.

Good catch! I had a look at the pull request you referenced. The scikit-learn team is in the process of migrating, as you mentioned. Their v1.8 release milestone includes this pull request: scikit-learn/scikit-learn#31177. The release is expected to be available by mid-November 2025; see https://github.com/scikit-learn/scikit-learn/milestone/66. We can wait until scikit-learn has a transition mechanism in place before migrating to sparray ourselves.

Lakshmi-bashyam added 3 commits July 25, 2025 11:17

Change sparse matrix to array

ece6767

Merge conflict resolution

c669ab5

use 2D indexing and .nnz for sparray

04e04fc

Lakshmi-bashyam marked this pull request as ready for review July 29, 2025 11:22

Lakshmi-bashyam requested a review from gmmajal July 29, 2025 11:22

gmmajal assigned Lakshmi-bashyam Aug 4, 2025

gmmajal requested changes Aug 4, 2025

Lakshmi-bashyam marked this pull request as draft August 14, 2025 10:41
