Future of Pangeo-Forge? #799

@TomNicholas

Over in nsidc/earthaccess#956 (reply in thread) I argued that Pangeo-Forge as a project is winding down, in favour of new tools and services.

Specifically:

  1. the Pangeo-Forge project is winding down, since its original creator (@cisaacstern) is no longer funded to work on it,
  2. the catalog functionality in PGF has been gone/unmaintained for a while, and cataloging is, or hopefully will be, better served by services such as Earthmover and Source Cooperative (or maybe by a community effort via FROST),
  3. the kerchunking recipes are now better done by VirtualiZarr (to which many of the same devs have now switched their focus),
  4. PGF's pattern of writing out the zarr metadata as a schema first then filling in the chunks in parallel is much better served by Icechunk, which can basically do the same thing but with transactional version control.
  5. moving actual chunks at scale (instead of just references to chunks) is likely now better done by Xarray-Beam or Cubed, though these are much less mature.
  6. the VirtualiZarr/Cubed approaches both have the huge advantage of not needing an ETL API that is entirely separate from the one used for analytics.
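
To make point 4 concrete, here is a toy sketch of the "write the schema first, then fill in the chunks in parallel" pattern. It uses only the Python standard library and a plain dict as a stand-in for a store; the names `init_store`, `write_chunk`, `build` and `read_all` are made up for illustration and are not the Pangeo-Forge, Zarr or Icechunk API. Note that nothing here is transactional: a reader could see the metadata while some chunks are still missing, which is the gap that point 4 says Icechunk closes.

```python
import json
from concurrent.futures import ThreadPoolExecutor


def init_store(length, chunk_len):
    """Step 1: write the metadata (the "schema") before any data exists."""
    n_chunks = -(-length // chunk_len)  # ceiling division
    meta = {"length": length, "chunk_len": chunk_len, "n_chunks": n_chunks}
    # A dict stands in for an object store; keys play the role of object paths.
    return {"meta.json": json.dumps(meta)}


def write_chunk(store, source, index):
    """Step 2: each worker fills in exactly one chunk, independently."""
    meta = json.loads(store["meta.json"])
    start = index * meta["chunk_len"]
    stop = min(start + meta["chunk_len"], meta["length"])
    store[f"chunk/{index}"] = list(source[start:stop])


def build(source, chunk_len, max_workers=4):
    """Initialise the store, then write all chunks in parallel."""
    store = init_store(len(source), chunk_len)
    n_chunks = json.loads(store["meta.json"])["n_chunks"]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Each task writes to a distinct key, so the writes never conflict.
        list(pool.map(lambda i: write_chunk(store, source, i), range(n_chunks)))
    return store


def read_all(store):
    """Reassemble the full array by walking the chunks in order."""
    meta = json.loads(store["meta.json"])
    out = []
    for i in range(meta["n_chunks"]):
        out.extend(store[f"chunk/{i}"])
    return out


store = build(list(range(10)), chunk_len=4)
```

Because the chunk layout is fixed by the metadata up front, the workers need no coordination with each other, only with the schema.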

So generally I would advise against anyone starting a big project now that is tied to Pangeo-Forge: although the stack is in a bit of a transitionary period right now (with Icechunk/VirtualiZarr gaining more maturity daily), the future trajectory is already clear.

I think Pangeo-Forge is an awesome project but I would now like to see something more robust and sustainable built using the many lessons learned.

Do people think this is fair / reasonable? If so do you disagree with the vision I sketched above? Is there a part of the PGF project that isn't represented by that sketch? Should we consider at what point the project could be officially deprecated? Or could it instead transition to use new tools?

cc @keewis @jbusecke @abarciauskas-bgse @sharkinsspatial @alxmrs @rabernat @TomAugspurger
