CFT Heuristic for Set Covering #4607

c4v4 · 2025-03-28T16:09:19Z

Initial Implementation of the CFT Heuristic for Set Covering

This PR introduces the first steps toward implementing the Caprara, Fischetti, and Toth (CFT) heuristic for the Set Covering problem. It is an evolution and port of an existing implementation of the same algorithm.

This is still a work in progress. The goal of this PR is to create a shared space where we can discuss the development and refine the approach together (as we agreed with the OR-Tools developers working on the Set Cover module).

Current State

The current implementation focuses on the 3-Phase algorithm described in the paper, excluding the Refinement phase for now. Since the Refinement phase repeatedly calls the 3-Phase as a subroutine, it can be added later if needed.

Subgradient

The subgradient method (currently sequential) is implemented with flexibility in mind: it uses the SubgradientCBs interface to define the key customization points.
The main reason for this approach is that the algorithm uses two types of subgradient methods:

One to improve the current dual bound (Subgradient Phase in the paper).
One to generate multipliers for the Greedy algorithm (Heuristic Phase in the paper).

Thus, to avoid code duplication, the parts that differ between these two are isolated in callbacks.
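To make the callback idea concrete, here is a minimal, self-contained sketch of a generic subgradient loop whose variable parts live behind an interface. It is not the SubgradientCBs interface from this PR: `ToyModel`, `ToyCallbacks`, `FixedIterations`, and `RunSubgradient` are hypothetical names, and the step rule is the textbook one (step size scaled by the gap to an upper bound), assumed here only for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical toy instance: columns[j] lists the rows covered by column j.
struct ToyModel {
  int num_rows;
  std::vector<double> costs;
  std::vector<std::vector<int>> columns;
};

// Hypothetical customization points (not the PR's SubgradientCBs).
struct ToyCallbacks {
  virtual ~ToyCallbacks() = default;
  // Called once per iteration with the current multipliers and bound.
  virtual void OnIteration(const std::vector<double>& u, double bound) = 0;
  // Returns true when the loop should stop.
  virtual bool ShouldStop(int iteration) = 0;
};

// Example policy: run a fixed number of iterations and count the calls.
struct FixedIterations : ToyCallbacks {
  explicit FixedIterations(int n) : max_iterations(n) {}
  void OnIteration(const std::vector<double>&, double) override { ++calls; }
  bool ShouldStop(int iteration) override {
    return iteration + 1 >= max_iterations;
  }
  int max_iterations;
  int calls = 0;
};

// Generic loop: returns the best Lagrangian lower bound found.
double RunSubgradient(const ToyModel& model, double upper_bound, double step,
                      std::vector<double>& u, ToyCallbacks& cbs) {
  assert(static_cast<int>(u.size()) == model.num_rows);
  double best = -std::numeric_limits<double>::infinity();
  for (int iteration = 0;; ++iteration) {
    // L(u) = sum_i u_i + sum_j min(0, reduced_cost_j).
    double bound = 0.0;
    for (double ui : u) bound += ui;
    std::vector<double> subgradient(model.num_rows, 1.0);
    for (std::size_t j = 0; j < model.columns.size(); ++j) {
      double reduced_cost = model.costs[j];
      for (int i : model.columns[j]) reduced_cost -= u[i];
      if (reduced_cost < 0.0) {
        bound += reduced_cost;
        for (int i : model.columns[j]) subgradient[i] -= 1.0;
      }
    }
    best = std::max(best, bound);
    cbs.OnIteration(u, bound);
    if (cbs.ShouldStop(iteration)) break;

    double squared_norm = 0.0;
    for (double s : subgradient) squared_norm += s * s;
    if (squared_norm == 0.0) break;  // Zero subgradient: nothing to update.
    const double scale = step * (upper_bound - bound) / squared_norm;
    for (int i = 0; i < model.num_rows; ++i) {
      u[i] = std::max(0.0, u[i] + scale * subgradient[i]);
    }
  }
  return best;
}
```

A second `ToyCallbacks` implementation could, for instance, store the multipliers seen in `OnIteration` and hand them to a greedy heuristic, while the loop itself stays unchanged.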

This design also simplifies experimentation with stabilization techniques, especially for the phase that improves the dual bound. More generally, it should simplify future work on subgradient-based algorithms for Set Covering.

Greedy

The Lagrangian multiplier-based Greedy algorithm is implemented, including its column-scoring method based on the median-finding algorithm, as described in the paper.
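As a rough illustration, the sketch below shows a multiplier-based greedy on a toy instance. The scoring rule (gamma / mu when gamma > 0, gamma * mu otherwise, with gamma the cost minus the multipliers of the still-uncovered rows and mu the number of such rows) is an assumption based on how the CFT score is usually stated and should be checked against the paper. `Score`, `LowestScoreColumns`, and `MultiplierGreedy` are hypothetical names, not functions from this PR, and the redundant-column removal step is left out. `std::nth_element` stands in for the median-finding selection: it isolates the k best-scoring columns without a full sort.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <numeric>
#include <vector>

// Assumed CFT-style score: lower is better.
double Score(double gamma, int mu) {
  assert(mu > 0);
  return gamma > 0.0 ? gamma / mu : gamma * mu;
}

// Returns the indices of the k lowest scores (in no particular order),
// using selection instead of a full sort.
std::vector<int> LowestScoreColumns(const std::vector<double>& scores, int k) {
  std::vector<int> ids(scores.size());
  std::iota(ids.begin(), ids.end(), 0);
  if (k < static_cast<int>(ids.size())) {
    std::nth_element(ids.begin(), ids.begin() + k, ids.end(),
                     [&scores](int a, int b) { return scores[a] < scores[b]; });
    ids.resize(k);
  }
  return ids;
}

// Repeatedly picks the column with the lowest score until all rows are
// covered. columns[j] lists the rows covered by column j.
std::vector<int> MultiplierGreedy(int num_rows,
                                  const std::vector<double>& costs,
                                  const std::vector<std::vector<int>>& columns,
                                  const std::vector<double>& u) {
  std::vector<bool> covered(num_rows, false);
  int num_uncovered = num_rows;
  std::vector<int> solution;
  while (num_uncovered > 0) {
    int best_column = -1;
    double best_score = std::numeric_limits<double>::infinity();
    for (std::size_t j = 0; j < columns.size(); ++j) {
      double gamma = costs[j];
      int mu = 0;
      for (int i : columns[j]) {
        if (!covered[i]) {
          gamma -= u[i];
          ++mu;
        }
      }
      if (mu > 0 && Score(gamma, mu) < best_score) {
        best_score = Score(gamma, mu);
        best_column = static_cast<int>(j);
      }
    }
    assert(best_column >= 0);  // Otherwise the instance is infeasible.
    solution.push_back(best_column);
    for (int i : columns[best_column]) {
      if (!covered[i]) {
        covered[i] = true;
        --num_uncovered;
      }
    }
  }
  return solution;
}
```

This version rescans every column at each step for clarity; a selection such as `LowestScoreColumns` is one way to restrict the scan to a short list of good candidates.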

One missing (small) part is the enumeration step, which the paper suggests using when there are few redundant columns to remove (fewer than 10). In our previous implementation, we noticed that this step added complexity without much benefit, so I left it out for now. It can be added later if needed.

Core Model

The paper describes a core model technique that improves performance by around an order of magnitude. The idea is to focus on a smaller set of high-quality columns, determined through a periodic pricing procedure during the subgradient phase.
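A simplified sketch of what a pricing step could look like: given the current multipliers, compute reduced costs and keep, for each row, the few cheapest columns covering it. The per-row rule and the `per_row` parameter are assumptions chosen for brevity; the paper's actual pricing criteria are more detailed, and `PriceCoreColumns` is a hypothetical name, not the function used in this PR.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the sorted indices of the columns selected for the core model.
// columns[j] lists the rows covered by column j; u holds one multiplier
// per row.
std::vector<int> PriceCoreColumns(int num_rows,
                                  const std::vector<double>& costs,
                                  const std::vector<std::vector<int>>& columns,
                                  const std::vector<double>& u, int per_row) {
  assert(per_row > 0);
  assert(static_cast<int>(u.size()) == num_rows);
  assert(costs.size() == columns.size());
  const int num_columns = static_cast<int>(columns.size());

  // Reduced cost of column j: c_j minus the multipliers of its rows.
  std::vector<double> reduced(costs);
  std::vector<std::vector<int>> row_columns(num_rows);
  for (int j = 0; j < num_columns; ++j) {
    for (int i : columns[j]) {
      reduced[j] -= u[i];
      row_columns[i].push_back(j);
    }
  }

  // Order by reduced cost, breaking ties by index for determinism.
  auto cheaper = [&reduced](int a, int b) {
    return reduced[a] != reduced[b] ? reduced[a] < reduced[b] : a < b;
  };

  std::vector<bool> in_core(num_columns, false);
  for (std::vector<int>& candidates : row_columns) {
    const std::size_t k = std::min<std::size_t>(per_row, candidates.size());
    std::partial_sort(candidates.begin(), candidates.begin() + k,
                      candidates.end(), cheaper);
    for (std::size_t p = 0; p < k; ++p) in_core[candidates[p]] = true;
  }

  std::vector<int> core;
  for (int j = 0; j < num_columns; ++j) {
    if (in_core[j]) core.push_back(j);
  }
  return core;
}
```

Keeping at least one column per row guarantees that the core model stays feasible whenever the full model is; calling this periodically with fresh multipliers lets the core follow the subgradient trajectory.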

This part is already implemented, but only in a temporary form. It will need to be properly integrated with the "sub-model system", which is still to be developed.

Sub-Model

This is the main missing piece. The CFT heuristic works by gradually fixing columns that are likely to be in the optimal solution. This reduces the problem size and shifts the focus to the remaining uncovered elements and their covering columns.
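The effect of a fixing step can be sketched as follows: once some columns are fixed, only the rows they leave uncovered, and the columns that still cover at least one of those rows, remain relevant. `ResidualProblem` and `FixColumns` are hypothetical names for this illustration and say nothing about how the sub-model will actually be designed.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// What is left of the instance after fixing some columns.
struct ResidualProblem {
  double fixed_cost = 0.0;          // Total cost of the fixed columns.
  std::vector<int> uncovered_rows;  // Rows not covered by any fixed column.
  std::vector<int> active_columns;  // Unfixed columns covering such rows.
};

// columns[j] lists the rows covered by column j.
ResidualProblem FixColumns(int num_rows, const std::vector<double>& costs,
                           const std::vector<std::vector<int>>& columns,
                           const std::vector<int>& fixed) {
  assert(costs.size() == columns.size());
  ResidualProblem residual;
  std::vector<bool> covered(num_rows, false);
  std::vector<bool> is_fixed(columns.size(), false);

  for (int j : fixed) {
    assert(j >= 0 && static_cast<std::size_t>(j) < columns.size());
    if (is_fixed[j]) continue;  // Ignore duplicates.
    is_fixed[j] = true;
    residual.fixed_cost += costs[j];
    for (int i : columns[j]) covered[i] = true;
  }

  for (int i = 0; i < num_rows; ++i) {
    if (!covered[i]) residual.uncovered_rows.push_back(i);
  }

  for (std::size_t j = 0; j < columns.size(); ++j) {
    if (is_fixed[j]) continue;
    const bool useful =
        std::any_of(columns[j].begin(), columns[j].end(),
                    [&covered](int i) { return !covered[i]; });
    if (useful) residual.active_columns.push_back(static_cast<int>(j));
  }
  return residual;
}
```

For example, with columns {0,1}, {1}, {2}, {1,2} and column 0 fixed, only row 2 remains uncovered, and the residual problem keeps the last two columns while the column covering only row 1 drops out.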

From past experience, this part needs to be designed carefully, since a poor design could make future extensions and maintenance tricky. I'm currently discussing this with one of the developers to understand the possible designs and decide which one to pick.

Future Possibilities

One potential long-term improvement would be replacing the "static model" with an online column generation approach. The current structure already allows for this: instead of selecting columns, the core model pricing could be replaced with an actual column generation step.

This is beyond the current scope, but I’m keeping it in mind to ensure the implementation remains flexible for such an extension in the future (since this seems to be the most sensible design for many contexts where generating a large enough set of columns is not feasible).

Let me know what you think! Any feedback is welcome :)

c4v4 · 2025-04-05T09:51:36Z

As discussed privately, I’ve completed a prototype based on views over the original model to track column fixings and focus on a core model (i.e., a subset of columns).

An alternative approach using model copies that contain only the active rows/columns would probably be more computationally efficient, since it would work with small vectors instead of iterating over the full lists and skipping inactive items.
However, it is a poor fit when memory is a constraint: since fixings are incremental, early iterations would have to handle a nearly full copy of the original instance, roughly doubling peak memory usage.

The current view-based system is in a minimal but working state and needs to be improved and hardened.
With this, the sequential 3-Phase prototype is complete, covering about 90% of the CFT logic.

Next steps:

Test, polish, and stabilize the current implementation. Multipliers can be erratic in edge cases, so we need to identify and handle those.
Experiment with the model-copy approach for the core model only. Since the core is much smaller than the original, this might not increase memory usage significantly, and could even reduce it compared to the view-based version.

c4v4 · 2025-04-09T12:30:52Z

Current State Update

The 3-Phase algorithm is still in a prototype stage, but it's now working reasonably efficiently (still sequential for now).

Two representations for the core model are available:

SubModelView: a view-based version, storing only the focused item lists and "is-focused" vectors.
CoreModel: an explicit SetCoverModel, wrapped with the necessary components to keep it up to date.
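The difference between the two representations can be sketched with a small filtered view: it stores nothing but pointers to the original item list and an "is-focused" vector, and skips unfocused items while iterating, whereas an explicit copy materializes only the focused items. `FocusedView` and `CopyFocused` are hypothetical names for this illustration and are not the classes in set_cover_views.h.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Non-owning view over `items` that yields only the focused ones.
// Both vectors must outlive the view.
class FocusedView {
 public:
  class Iterator {
   public:
    Iterator(const int* pos, const int* end, const std::vector<bool>* focused)
        : pos_(pos), end_(end), focused_(focused) {
      SkipUnfocused();
    }
    int operator*() const { return *pos_; }
    Iterator& operator++() {
      ++pos_;
      SkipUnfocused();
      return *this;
    }
    bool operator!=(const Iterator& other) const { return pos_ != other.pos_; }

   private:
    // Advances to the next focused item (or to the end).
    void SkipUnfocused() {
      while (pos_ != end_) {
        assert(*pos_ >= 0 &&
               static_cast<std::size_t>(*pos_) < focused_->size());
        if ((*focused_)[*pos_]) break;
        ++pos_;
      }
    }
    const int* pos_;
    const int* end_;
    const std::vector<bool>* focused_;
  };

  FocusedView(const std::vector<int>& items,
              const std::vector<bool>& is_focused)
      : items_(&items), is_focused_(&is_focused) {}

  Iterator begin() const {
    return Iterator(items_->data(), items_->data() + items_->size(),
                    is_focused_);
  }
  Iterator end() const {
    const int* last = items_->data() + items_->size();
    return Iterator(last, last, is_focused_);
  }

 private:
  const std::vector<int>* items_;
  const std::vector<bool>* is_focused_;
};

// Explicit-copy alternative: materializes the focused items once, so later
// passes touch a compact vector instead of skipping over unfocused entries.
std::vector<int> CopyFocused(const FocusedView& view) {
  std::vector<int> copy;
  for (int item : view) copy.push_back(item);
  return copy;
}
```

The view reflects any change to the "is-focused" vector immediately and at no extra memory cost, but every pass pays for the skipped items; the copy is faster to traverse but has to be rebuilt or updated whenever the focus changes.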

Special care was taken to ensure the search trajectory remains consistent regardless of the chosen representation.

Preliminary testing on rails instances shows that the CoreModel is about 2x faster, with memory usage roughly the same (within a 1% delta, sometimes slightly better, sometimes worse) compared to the SubModelView.

The view system has been refined. That said, if we eventually settle on a specific strategy for column fixing and core model pricing, the abstractions in set_cover_views.h could be replaced with manual filtering of focused items (similar to what one would do in a lower-level language without zero-cost abstractions). For now, they’re helpful for easily switching between design ideas at a high level without having to tweak every implementation detail.

Next steps:

Continue with cleanup, organization, and adding comments.
Start implementing tests (I'll need help integrating them with OR-Tools' testing system).
Begin experimenting with basic subgradient improvements and stabilization techniques.

c4v4 · 2025-04-14T14:44:19Z

I’ve applied most of the suggestions from the code review (thanks again for the thorough feedback!).

A few quick notes:

`SubModelView`

At some point (possibly even now), it might be worth considering dropping support for some of the features currently implemented, mainly the SubModelView class and the specific views used only within it. Removing those could significantly reduce boilerplate without much loss in functionality (note that SubModelView is a less performant alternative to CoreModel).

Composable Views

Another area that could benefit from some cleanup is the handling of strongly typed indices in the full model, particularly in how the views interact with them. While things work as they are, the abstraction is a bit leaky and requires some extra handling inside FullToCoreModel. One possible improvement would be to generalize the current views to support composition, essentially replacing the current absl::Span usage with a templated view type (with absl::Span as the identity-view base case). That said, it would push us closer to template proliferation, which might not be a great fit for the codebase and probably isn’t necessary right now.

`SetCoverInvariant` in Multiplier-Based Greedy

For now, I’ve avoided using SetCoverInvariant inside the CFT greedy algorithm, mainly to avoid potential overhead from unneeded calculations. But I plan to take a closer look: rewriting the greedy logic around SetCoverInvariant could simplify the code and reduce duplication.

c4v4 and others added 12 commits March 27, 2025 22:56

3-phase layout and some common utils

516ba8f

Written down hih-level greedy code

41b4442

Adding greedy scores computation

cb33040

Adding redundant columns removal to greedy

f2fa389

Generic subgradient without callbacks implementation

7c734b3

Completed Subgradient phase impl

0c8d644

Heuristic phase impl

ce117fc

Full to core model with pricing

bab5097

Column fixing selection (without sub-model definition)

0d5da13

Merge branch 'google:main' into main

7475848

Merge branch 'google:main' into main

6acce83

After merge fix

bc321b3

c4v4 changed the title ~~Main~~ Initial Implementation of the CFT Heuristic for Set Covering Mar 28, 2025

c4v4 changed the title ~~Initial Implementation of the CFT Heuristic for Set Covering~~ CFT Heuristic for Set Covering Mar 28, 2025

c4v4 and others added 8 commits April 1, 2025 19:00

Work in progress: implemented a sub-model view

cbe587a

Work in progress: fixed some views details, but still missing filtered-columns sizes

f7be541

Work in progress: added size to IndexListFilter & solved last compile issues

81d4afe

Merge branch 'google:main' into main

9d901a3

Work in progress: working identity views

9742366

Work in progress: bugfixes + started SubModelView fixing system

ab4b013

Work in progess: removed SubModelView premature optimization (look diff with prev commit)

bbb555e

Completed: 3phase prototype based on "lightweight" model views

d1703f2

WIP: started implementing explicit sub model

f11424e

c4v4 force-pushed the main branch from 7ea45af to f11424e Compare April 5, 2025 15:26

c4v4 and others added 5 commits April 5, 2025 17:27

Merge branch 'google:main' into main

d409c19

Refactored Views: bounds check + compatibility with STL

d13a506

New "bit-mask"-like view

46feedd

Normal & lightweight submodel views

6630e0a

Adapting set_cover_cft to new views

9d252ae

c4v4 added 2 commits April 7, 2025 00:24

Same search trajectory using either SubModel or SubmodelView as CoreModel

406b9a4

Started commenting + housekeeping

b9a626c

Mizux added Solver: Set Cover Solver in set_cover/ Feature Request Missing Feature/Wrapper labels Apr 8, 2025

c4v4 added 2 commits April 9, 2025 13:48

More comments + small CoreModel::FixColumns bugfix

9eee197

Removed sizes from SubModel view interface

21bcdc7

+ added asserts

c4v4 and others added 2 commits April 9, 2025 14:31

Merge branch 'google:main' into main

5fb2678

Fixing SubModelView compatibility issues

7f3b80c

c4v4 marked this pull request as ready for review April 9, 2025 13:15

c4v4 added 2 commits April 9, 2025 23:48

Refactored CoreModel::FixColumns to improve clarity

a774a24

Experimental simple subgradient stabilization

fe363c6

dourouc05 approved these changes Apr 12, 2025

c4v4 added 4 commits April 13, 2025 12:33

View refactor: pointers->absl::Span and separate file

54604fa

Full model strong typed indices

aaed375

Isolating sub-model classes in a separate file

8a6863b

Removed unnecessary use of absl::Status

15345c5

dourouc05 merged commit 311151a into google:main Apr 15, 2025
1 check passed

BrewTestBot mentioned this pull request Jun 19, 2025

or-tools 9.14 Homebrew/homebrew-core#227389

Merged

dependabot bot mentioned this pull request Aug 27, 2025

Bump Google.OrTools from 9.13.4784 to 9.14.6206 Altafraner/afra-app#99

Merged
