Handle gapless MP3 Xing/Info durations#3198
Conversation
|
GitHub isn't letting me add review comments for some reason, so here's a quick bit of initial feedback The part of this PR related to the In the |
Xing/Info frame counts describe the raw MP3 frame timeline, but SeekMap#getDurationUs should expose the gapless playback duration when encoder delay and padding are present. Keep both values in MP3 seekers: use the raw duration for bitrate and byte-position calculations, and use the gapless duration for the public duration and end-of-stream seek points. This makes getPosition(durationUs) return a seek point whose timeUs is the requested public duration. Issue: androidx#3183 Test: ./gradlew :lib-extractor:testDebugUnitTest --tests androidx.media3.extractor.mp3.ConstantBitrateSeekerTest --tests androidx.media3.extractor.mp3.Mp3ExtractorTest --tests androidx.media3.extractor.mp3.XingFrameTest
|
Yeah sorry I did not fully verify that part, I completely underestimated the complete mess that is mp3 seeking and gapless before going into this. I've moved the lame part in another PR, but used AI to rebase it on main to not depend on this one. It's probably best to not really look at the second one deeply until this is one is done or not. |
| * Returns the raw duration before subtracting encoder delay/padding, or {@link C#TIME_UNSET} if | ||
| * not known. | ||
| */ | ||
| default long getRawDurationUs() { |
There was a problem hiding this comment.
i'm wondering if we can avoid introducing this method (and therefore avoid having to distinguish between these concepts of 'duration' and 'raw duration').
It looks like it's used in two ways:
- To calculate a bitrate in
Mp3Extractoras part of the 'enhanced CBR' logic - In the implementation of
ConstantBitrateSeeker.getSeekPointsto ensure the final byte position is reported correctly.
For (1), I see that Seeker already has a getAverageBitrate() method, which should fit for the use-case (based on its documentation of Returns the average bitrate (usually derived from the duration and length of the file)) - except it seems that XingSeeker and VbriSeeker implement this incorrectly by directly returning the bitrate of the MPEG frame that contains the seek metadata.
For (2) I think you can use getDataEndPosition() instead?
If this sounds right, would you be up for:
- Sending a PR to fix the bug in the
getAverageBitrate()impls inXingSeekerandVbriSeeker - Rebasing this PR on top once that's merged, and then removing
getRawDurationUsand usinggetAverageBitrate()andgetDataEndPosition()instead.
There was a problem hiding this comment.
I can use Codex to split things and move things yes if that's not a problem for you. Way less time consuming.
Also Codex: Caveat: getDataEndPosition() is documented as immediately after audio data, so using it directly may be one byte/frame past the desired seek point. The implementation should likely use the last valid seek position derived from data end, for example dataEndPosition - normalizedFrameSize, while returning timeUs = durationUs. we start to be a little outside of my confort zone so I don't really have value on the decisions, just did tests and debug to try to reach solutions. I now mostly rely on your expertise.
LAME Xing/Info headers can include encoder delay and padding. Mp3Extractor already propagates those values into Format so decoded audio is rendered gaplessly, but the SeekMap duration was still computed from the untrimmed MPEG frame count. For CBR Info files this can expose a source duration that is longer than the samples that will actually be played, which can break gapless transitions.
Split the Xing/Info duration into two concepts: raw duration from the frame count, used for bitrate and byte-position calculations, and gapless duration with encoder delay and padding removed, exposed from SeekMap.
Keep CBR average-bitrate derivation on the raw duration to avoid changing byte-position seeking. When a CBR seeker is given an explicit gapless duration, map seeks at the advertised end of the stream to the raw data end so the SeekMap endpoint contract is preserved.
Add focused tests for raw vs gapless duration calculation, CBR seek endpoint handling, and Info-frame duration/bitrate behavior, and update affected extractor dumps.
Fixes #3183.