Reproduction Steps for TransMLA-llama3-8b-8k #42

@haok1402

Description

https://huggingface.co/fxmeng/TransMLA-llama3-8b-8k/

Hi, thanks for the amazing work. I see that a TransMLA model has been released on Hugging Face. Could you clarify how to reproduce the conversion from Meta-Llama3-8B to TransMLA-llama3-8b-8k? What training data was used? In particular, the experimental setup in the paper seems to focus primarily on SmolLM 1.7B and Llama 2 7B, with little mention of Llama 3 8B.
