Add CRAM support #69
Looking at this now.
So far it is looking good. I would ask you to bump the version number and add yourself as an author (role aut) in Authors@R in DESCRIPTION. Some information on the provenance of the test CRAM resources would be good, even if trivial. User-visible documentation should indicate the new capability. You were clear that chunked sequential reading is not supported. GPT-5.2 told me that
So you can support chunked reading by:
GitHub Copilot offered to do some refactoring that would accommodate the yieldSize feature. Is
Updated the PR with those annotation changes and some basic documentation. As for the provenance of the test data: it's completely synthetic. I generated three random sequences, then created simulated 2x150 bp reads and aligned them to the synthetic reference with minimap2.
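For anyone wanting to reproduce similar test data, the generation steps described above could look roughly like the following Python sketch. This is not the script used for this PR: the seed, the reference length (2000 bp), the fragment size (400 bp), the number of pairs per sequence, and all function names are assumptions chosen for illustration. Only the three random sequences and the 2x150 bp read layout come from the description. Alignment with minimap2 and conversion to CRAM are not shown.

```python
import random

# Translation table for complementing DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def random_sequence(length, rng):
    """Return a uniformly random DNA string of the given length."""
    return "".join(rng.choice("ACGT") for _ in range(length))


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def simulate_pairs(reference, pairs_per_seq, rng, read_len=150, fragment_len=400):
    """Sample error-free paired-end reads from each reference sequence.

    Read 1 is the first read_len bases of a random fragment (forward strand);
    read 2 is the reverse complement of the last read_len bases.
    """
    pairs = []
    for name, seq in reference.items():
        for i in range(pairs_per_seq):
            start = rng.randrange(len(seq) - fragment_len + 1)
            fragment = seq[start:start + fragment_len]
            read1 = fragment[:read_len]
            read2 = reverse_complement(fragment[-read_len:])
            pairs.append((f"{name}_{start}_{i}", read1, read2))
    return pairs


def to_fasta(reference):
    """Format the reference dictionary as FASTA text."""
    return "".join(f">{name}\n{seq}\n" for name, seq in reference.items())


def to_fastq(pairs, mate):
    """Format mate 1 or mate 2 of each pair as FASTQ text with a constant quality."""
    records = []
    for name, read1, read2 in pairs:
        seq = read1 if mate == 1 else read2
        records.append(f"@{name}/{mate}\n{seq}\n+\n{'I' * len(seq)}\n")
    return "".join(records)


# Assumed parameters: fixed seed, three 2000 bp sequences, 10 pairs each.
rng = random.Random(42)
reference = {f"seq{i + 1}": random_sequence(2000, rng) for i in range(3)}
pairs = simulate_pairs(reference, pairs_per_seq=10, rng=rng)
```

The strings returned by `to_fasta(reference)`, `to_fastq(pairs, 1)`, and `to_fastq(pairs, 2)` can be written to files and passed to an aligner. Fixing the seed keeps the output reproducible, which helps if the test resources ever need to be regenerated.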
This PR adds basic support for CRAM files. CRAM and BAM files are handled (almost) identically by newer versions of htslib, so all that was needed was an (optional) reference= parameter to BamFile, which is then set on the underlying htsFile struct, plus various adaptations to index handling, because CRAM indices are a bit different.

One limitation: mate-pair handling with asMates is not yet supported, because we can't read within the bgzf block in the same way as for BAM.

I added a test CRAM file and reference, along with various unit tests to check that it works as expected.