v0.3.39 Windows wheel failed to load ggml-cpu-alderlake.dll on i9-12900K, causing ValueError at load_tensors #129

@qqba

Environment:

OS: Windows 11 Pro [Version 10.0.26200.8457]

CPU: Intel Core i9-12900K (Alder Lake Architecture)

GPU: NVIDIA GeForce RTX 3090 (24 GB, CUDA 12.4)

Python: 3.12.10

Package: llama_cpp_python-0.3.39+cu124-cp312-cp312-win_amd64.whl

Issue Description:
After upgrading to v0.3.39, model loading fails during load_tensors with ValueError: Failed to load model from file.

My CPU is a 12th Gen i9-12900K (Alder Lake), so the loader should pick up ggml-cpu-alderlake.dll. However, the logs suggest that ggml_backend_load_all_from_path() failed to load any CPU backend DLL, leaving a registry count of only 1 (CUDA only):

load_backend: loaded CUDA backend from ...\llama_cpp\lib\ggml-cuda.dll
Llama.init: Loaded ggml backend registry count: 1
...
load_tensors: loading model tensors...
ValueError: Failed to load model from file
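
To check a longer log for the same symptom, the following is a minimal sketch that extracts the "load_backend: loaded ... backend from ..." lines and reports whether any CPU backend appears among them. It assumes that a successfully loaded CPU backend would be logged in the same format as the CUDA line above, with a DLL name starting with ggml-cpu; the function names are hypothetical helpers, not part of llama-cpp-python.

```python
import re

# Matches lines such as:
#   load_backend: loaded CUDA backend from ...\llama_cpp\lib\ggml-cuda.dll
LOAD_LINE = re.compile(r"load_backend: loaded (?P<name>.+?) backend from (?P<path>.+)")


def loaded_backends(log_text: str) -> list[tuple[str, str]]:
    """Return (backend name, DLL path) pairs found in the log text."""
    return [
        (m.group("name"), m.group("path").strip())
        for m in map(LOAD_LINE.search, log_text.splitlines())
        if m
    ]


def has_cpu_backend(log_text: str) -> bool:
    """True if a backend is named 'CPU' or its DLL file name starts with 'ggml-cpu'.

    Assumption: CPU backends are logged like the CUDA line in this report.
    """
    for name, path in loaded_backends(log_text):
        file_name = re.split(r"[\\/]", path)[-1].lower()
        if name.lower() == "cpu" or file_name.startswith("ggml-cpu"):
            return True
    return False


if __name__ == "__main__":
    sample = r"""load_backend: loaded CUDA backend from ...\llama_cpp\lib\ggml-cuda.dll
Llama.init: Loaded ggml backend registry count: 1"""
    print(loaded_backends(sample))
    print(has_cpu_backend(sample))
```
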

Downgrading to v0.3.38+cu124.basic resolved the issue completely, and the model loads fine. This suggests a regression, or missing runtime dependencies (such as LLVM/Clang runtime artifacts), when dynamically loading the alderlake CPU backend on some Windows environments.
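
One way to test the missing-dependency hypothesis is to try loading each ggml DLL directly with ctypes and read the OS error. This is only a rough diagnostic sketch: probe_dlls and find_lib_dir are hypothetical helpers, the lib subdirectory is taken from the path in the log above, and ctypes may resolve dependent DLLs differently from ggml's own loader, so a success here does not prove the backend would register.

```python
from __future__ import annotations

import ctypes
import importlib.util
from pathlib import Path


def probe_dlls(lib_dir: Path, pattern: str = "ggml*.dll") -> dict[str, str | None]:
    """Try to load each matching library; map file name -> error text (None on success)."""
    results: dict[str, str | None] = {}
    for dll in sorted(lib_dir.glob(pattern)):
        try:
            ctypes.CDLL(str(dll))
            results[dll.name] = None
        except OSError as exc:
            # On Windows this typically carries the WinError code and message.
            results[dll.name] = str(exc)
    return results


def find_lib_dir() -> Path | None:
    """Locate <site-packages>/llama_cpp/lib without importing the package.

    Assumption: the wheel keeps its DLLs in a 'lib' subdirectory, as the log suggests.
    """
    spec = importlib.util.find_spec("llama_cpp")
    if spec is None or spec.origin is None:
        return None
    return Path(spec.origin).parent / "lib"


if __name__ == "__main__":
    lib_dir = find_lib_dir()
    if lib_dir is None or not lib_dir.is_dir():
        print("llama_cpp/lib not found in this environment")
    else:
        for name, error in probe_dlls(lib_dir).items():
            print(f"{name}: {'OK' if error is None else 'FAILED: ' + error}")
```

If ggml-cpu-alderlake.dll fails here with an error about a module or one of its dependencies not being found, that would be consistent with a missing runtime library rather than a problem in the model file.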

Labels: bug