FSDP2 Support #32

@garrett361

This issue tracks progress on running Bamba models with FSDP2.

WIP branch and PR here.

Top-level goals:

  • FSDP2 + torch kernel implementation
  • FSDP2 + mamba_ssm kernel implementation
  • FSDP2 + TP (tensor parallelism)
  • FSDP2 + CP (context parallelism)
  • FSDP2 + FP8
  • FSDP2 + PP (pipeline parallelism)

Known Issues/Open Questions

  • Requires custom op registration for FSDP2 compatibility
  • DTensor currently only has minimal/experimental support for convolutions. Because of the depthwise convolution in the mamba layers, this is a blocker for TP/CP support. @garrett361 is working on more robust conv support.

cc @raghukiran1224
