Even lower bitwidth kernel #3

@xzyaoi

Hi,

Thanks for the great work! Just wondering if there are any plans to support lower-bitwidth kernels (e.g., 2-bit + 2:4 sparsity).

For a bit of context: we have been working on a project that compresses the difference (delta) between a fine-tuned model and its base model, and it turned out that this delta can be compressed more aggressively (see: https://arxiv.org/abs/2312.05215). It would be great if we could leverage marlin & sparse marlin to accelerate inference.
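
To make the request more concrete, here is a rough pure-Python sketch of what "2-bit + 2:4 sparsity" on a delta could look like. The function names (`prune_2_4`, `quantize_2bit`, `reconstruct`) and the single min/max scale per tensor are assumptions made for illustration only; this is not how marlin, sparse marlin, or the linked paper implement it, and a real kernel would likely use group-wise scales and packed storage.

```python
def prune_2_4(row):
    """Per group of 4, keep the 2 largest-magnitude entries.

    Returns (values, indices): the kept values and their positions (0..3)
    inside each group, i.e. the metadata a 2:4 sparse format needs.
    """
    assert len(row) % 4 == 0, "2:4 sparsity needs a length divisible by 4"
    values, indices = [], []
    for start in range(0, len(row), 4):
        group = row[start:start + 4]
        by_magnitude = sorted(range(4), key=lambda j: abs(group[j]), reverse=True)
        keep = sorted(by_magnitude[:2])
        values.extend(group[j] for j in keep)
        indices.extend(keep)
    return values, indices


def quantize_2bit(values):
    """Uniform quantization to 4 levels (codes 0..3) with one scale and offset."""
    lo, hi = min(values), max(values)
    scale = (hi - lo) / 3 if hi > lo else 1.0
    codes = [min(3, max(0, round((v - lo) / scale))) for v in values]
    return codes, scale, lo


def reconstruct(base, codes, indices, scale, lo):
    """Add the dequantized sparse delta back onto the base weights."""
    out = list(base)
    for k, (code, j) in enumerate(zip(codes, indices)):
        position = (k // 2) * 4 + j  # two kept values per group of four
        out[position] += code * scale + lo
    return out


base = [1.0] * 8
finetuned = [1.0, 1.3, 1.0, 0.7, 1.0, 1.0, 1.3, 0.7]
delta = [f - b for f, b in zip(finetuned, base)]

values, indices = prune_2_4(delta)
codes, scale, lo = quantize_2bit(values)
approx = reconstruct(base, codes, indices, scale, lo)
```

In this toy case each group of four delta values is stored as two 2-bit codes plus two 2-bit position indices, which is the kind of layout we would hope a lower-bitwidth sparse kernel could consume directly.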

Thanks in advance!

Best regards,
Xiaozhe

cc: @alexm-neuralmagic (since I saw there was a PR for 8-bit, but it was closed)
