Add xcube support #2917

Draft
bouweandela wants to merge 21 commits into main from add-xcube-support

Conversation

@bouweandela
Member

@bouweandela bouweandela commented Dec 4, 2025

Description

Add support for loading data with xcube

Related to #2584

To use the data source, run

esmvaltool config copy data-xcube-esacci.yml

and check out the ESMValTool branch here: ESMValGroup/ESMValTool#4447.

Currently available CMORizers are located in https://github.com/ESMValGroup/fixer-prototype/blob/main/packages/fixer-esa-cci/fixer_esa_cci/fixes.yaml

Example recipe (run it with --max-parallel-tasks 1 for now):

documentation:
  description: Example recipe that plots a map.

  title: Recipe that runs an example diagnostic written in Python.

  authors:
    - andela_bouwe

  maintainer:
    - andela_bouwe

  references:
    - acknow_project

  projects:
    - esmval
    - c3s-magic

datasets:
  - dataset: "ESACCI-WATERVAPOUR-L3C-TCWV-meris-005deg-2002-2017-fv3.2.zarr"

diagnostics:
  map:
    description: Global map of total column of water in January 2010.
    themes:
      - phys
    realms:
      - atmos
    variables:
      prw:
        project: ESACCI
        mip: atmos
        frequency: mon
        timerange: 2010/P1M
        caption: Global map of {long_name} in January 2010 according to {dataset}.
    scripts:
      script1:
        script: examples/diagnostic.py
        quickplot:
          plot_type: pcolormesh
          cmap: viridis

Example output:
[image: example output plot]

TODO:

  • Investigate why max_parallel_tasks > 1 does not work and solve if possible.
Error message when running with max_parallel_tasks > 1:
  File "/home/bandela/src/esmvalgroup/esmvalcore/esmvalcore/_task.py", line 993, in _copy_results
    task.output_files, task.products = future.get()
                                       ^^^^^^^^^^^^
  File "/home/bandela/mambaforge/envs/esmvaltool-xcube/lib/python3.12/multiprocessing/pool.py", line 774, in get
    raise self._value
  File "/home/bandela/mambaforge/envs/esmvaltool-xcube/lib/python3.12/multiprocessing/pool.py", line 540, in _handle_tasks
    put(task)
  File "/home/bandela/mambaforge/envs/esmvaltool-xcube/lib/python3.12/multiprocessing/connection.py", line 206, in send
    self._send_bytes(_ForkingPickler.dumps(obj))
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/bandela/mambaforge/envs/esmvaltool-xcube/lib/python3.12/multiprocessing/reduction.py", line 51, in dumps
    cls(buf, protocol).dump(obj)
TypeError: cannot pickle '_thread.RLock' object
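
The TypeError above is what pickle raises when an object sent to a worker process holds a lock somewhere in its attributes. The traceback does not show which object in this pull request carries the lock, so the sketch below is only a minimal reproduction with hypothetical names (Store, PicklableStore, can_pickle), not ESMValCore or xcube code. It also shows one common workaround: dropping the lock in __getstate__ and recreating it in __setstate__.

```python
import pickle
import threading


class Store:
    """Hypothetical stand-in for any object that owns a lock."""

    def __init__(self, name):
        self.name = name
        self._lock = threading.RLock()  # locks cannot be pickled


class PicklableStore(Store):
    """Same object, but the lock is left out of the pickled state."""

    def __getstate__(self):
        state = self.__dict__.copy()
        del state["_lock"]  # drop the unpicklable attribute
        return state

    def __setstate__(self, state):
        self.__dict__.update(state)
        self._lock = threading.RLock()  # fresh lock in the receiving process


def can_pickle(obj):
    """Return True if pickle.dumps succeeds, False on a TypeError."""
    try:
        pickle.dumps(obj)
    except TypeError:
        return False
    return True


if __name__ == "__main__":
    print(can_pickle(Store("plain")))           # lock is part of the state
    print(can_pickle(PicklableStore("fixed")))  # lock is excluded
```

Whether this pattern applies here depends on where the lock lives. If it sits inside a third-party object, creating that object inside the worker instead of passing it between processes may be the simpler route.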

Link to documentation: https://esmvaltool--2917.org.readthedocs.build/projects/ESMValCore/en/2917/api/esmvalcore.io.xcube.html


Before you get started

Checklist

It is the responsibility of the author to make sure the pull request is ready to review. The icons indicate whether the item will be subject to the 🛠 Technical or 🧪 Scientific review.


To help with the number of pull requests:

@bouweandela bouweandela added the enhancement (New feature or request) label Dec 4, 2025
Comment thread environment.yml Outdated
- xcube-cci
- yamale
- zarr >3
- zarr >2
Contributor

zarr3 is perfectly able to read zarr2 datasets, bud

Member Author

xcube requires zarr==2

Contributor

@valeriupredoi valeriupredoi Dec 4, 2025

well that's a bummer - that means it can't read Zarr3 spec?

Contributor

also Zarr2 is borderline archaic - good luck to us trying to maintain such an environment

Member Author

Looks like it's on the TODO list: xcube-dev/xcube#1182

Contributor

good, otherwise without a pixi env this would be diabolically hard to maintain 😁

@valeriupredoi
Contributor

this is shaping up nicely - couple quick questions: why not use the standard xarray - to iris via ncdata path? Is XCube really needed to load the ESA-CCI Zarr files, and if not, do we know what database they are in so we can bolt on an eg intake-esm functionality?

@bouweandela
Member Author

bouweandela commented Dec 4, 2025

why not use the standard xarray - to iris via ncdata path?

That is exactly what is used:

return dataset_to_iris(dataset)
and
conversion_func = ncdata.iris_xarray.cubes_from_xarray

Is XCube really needed to load the ESA-CCI Zarr files

No, but it is convenient

do we know what database they are in

It looks like the data is available here: https://github.com/esa-cci/xcube-cci/blob/fe8ac26405bd36b0176e1a0cae30238f52009a10/xcube_cci/zarraccess.py#L47

so we can bolt on an eg intake-esm functionality?

That might be possible, but I'm not sure if that will be easier to maintain or less work. xcube supports an interesting range of data sources from ESA, Copernicus, Climate Data Store, and even Zenodo, so I think that having support for xcube will be an interesting feature for our users. Of course, that shouldn't stop us from adding support for intake-esm as well.

@CLAassistant

CLAassistant commented May 12, 2026

CLA assistant check
All committers have signed the CLA.

@codecov

codecov Bot commented May 12, 2026

Codecov Report

❌ Patch coverage is 12.50000% with 98 lines in your changes missing coverage. Please review.
✅ Project coverage is 95.60%. Comparing base (77013ca) to head (b8710db).

Files with missing lines | Patch % | Lines
esmvalcore/io/xcube.py | 7.54% | 98 Missing ⚠️

❌ Your patch check has failed because the patch coverage (12.50%) is below the target coverage (100.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2917      +/-   ##
==========================================
- Coverage   96.15%   95.60%   -0.56%     
==========================================
  Files         270      271       +1     
  Lines       15807    15915     +108     
==========================================
+ Hits        15200    15215      +15     
- Misses        607      700      +93     

Labels

enhancement New feature or request

3 participants