# GGUF Troubleshooting Guide
## Installation Issues
### Build Failures
**Error**: `make: *** No targets specified and no makefile found`
**Fix**:
```bash
# Make sure you are in the llama.cpp directory
cd llama.cpp
make
```
**Error**: `fatal error: cuda_runtime.h: No such file or directory`
**Fix**:
```bash
# Install the CUDA toolkit
# Ubuntu
sudo apt install nvidia-cuda-toolkit
# Or set the CUDA path
export CUDA_PATH=/usr/local/cuda
export PATH=$CUDA_PATH/bin:$PATH
make GGML_CUDA=1
```
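Before rebuilding, it can save time to confirm that `nvcc` is actually reachable on `PATH`. A tiny check (the helper name `cuda_toolchain_present` is our own, not part of any toolkit):

```python
import shutil

def cuda_toolchain_present():
    """Return True if the CUDA compiler (nvcc) is reachable on PATH."""
    return shutil.which("nvcc") is not None

if __name__ == "__main__":
    print("nvcc found:", cuda_toolchain_present())
```

If this prints `False` after installing the toolkit, the `export PATH=...` step above has not taken effect in the current shell.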
### Python Binding Issues
**Error**: `ERROR: Failed building wheel for llama-cpp-python`
**Fix**:
```bash
# Install build dependencies
pip install cmake scikit-build-core
# CUDA support
CMAKE_ARGS="-DGGML_CUDA=on" pip install llama-cpp-python --force-reinstall --no-cache-dir
# Metal support (macOS)
CMAKE_ARGS="-DGGML_METAL=on" pip install llama-cpp-python --force-reinstall --no-cache-dir
```
**Error**: `ImportError: libcudart.so.XX: cannot open shared object file`
**Fix**:
```bash
# Add the CUDA libraries to the library path
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
# Or reinstall against the correct CUDA version
pip uninstall llama-cpp-python
CUDACXX=/usr/local/cuda/bin/nvcc CMAKE_ARGS="-DGGML_CUDA=on" pip install llama-cpp-python
```
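When this error persists, a common cause is a stale or broken entry in `LD_LIBRARY_PATH`. A minimal diagnostic sketch that reports which entries actually exist on disk (the `check_ld_library_path` helper is our own name, not part of llama-cpp-python):

```python
import os

def check_ld_library_path(value=None):
    """Return {entry: exists?} for each directory in an LD_LIBRARY_PATH-style string."""
    if value is None:
        value = os.environ.get("LD_LIBRARY_PATH", "")
    # Empty entries (e.g. from a trailing ':') are skipped
    return {entry: os.path.isdir(entry) for entry in value.split(":") if entry}

if __name__ == "__main__":
    for entry, exists in check_ld_library_path().items():
        print(f"{'OK     ' if exists else 'MISSING'} {entry}")
```

Any `MISSING` entry points at a path that was exported but never created, or a CUDA install in a different location.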
## Conversion Issues
### Unsupported Model
**Error**: `KeyError: 'model.embed_tokens.weight'`
**Fix**:
```bash
# Check the model architecture
python -c "from transformers import AutoConfig; print(AutoConfig.from_pretrained('./model').architectures)"
# Use the appropriate conversion script
# Most models:
python convert_hf_to_gguf.py ./model --outfile model.gguf
# For older models, check whether a legacy script is required
```
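If `transformers` is unavailable (or the model fails to load through it), the architecture can also be read straight from the model's `config.json`. A small sketch assuming a standard Hugging Face model directory (the function name `read_architectures` is ours):

```python
import json
from pathlib import Path

def read_architectures(model_dir):
    """Read the `architectures` field from a Hugging Face model's config.json."""
    config = json.loads((Path(model_dir) / "config.json").read_text())
    return config.get("architectures", [])
```

Compare the reported architecture against the list supported by `convert_hf_to_gguf.py` before filing a bug.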
### Vocabulary Mismatch
**Error**: `RuntimeError: Vocabulary size mismatch`
**Fix**:
```python
# Make sure the tokenizer matches the model
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("./model")
model = AutoModelForCausalLM.from_pretrained("./model")
print(f"Tokenizer vocab size: {len(tokenizer)}")
print(f"Model vocab size: {model.config.vocab_size}")
# If they differ, resize the embeddings before converting
model.resize_token_embeddings(len(tokenizer))
model.save_pretrained("./model-fixed")
```
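The same comparison can be done without loading model weights at all, by reading the JSON files directly. A sketch assuming a fast tokenizer whose base BPE vocab lives under `model.vocab` in `tokenizer.json` (the helper name is ours; added special tokens are not counted here, so a small difference can still be legitimate):

```python
import json
from pathlib import Path

def quick_vocab_check(model_dir):
    """Compare config.json vocab_size against the base vocab in tokenizer.json."""
    p = Path(model_dir)
    config = json.loads((p / "config.json").read_text())
    tokenizer = json.loads((p / "tokenizer.json").read_text())
    model_vocab = config.get("vocab_size")
    tokenizer_vocab = len(tokenizer["model"]["vocab"])
    return model_vocab, tokenizer_vocab
```

This avoids pulling multi-gigabyte weights just to diagnose a metadata mismatch.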
### Out of Memory During Conversion
**Error**: `torch.cuda.OutOfMemoryError`
**Fix**:
```bash
# Run the conversion on CPU
CUDA_VISIBLE_DEVICES="" python convert_hf_to_gguf.py ./model --outfile model.gguf
# Or use low-memory mode
python convert_hf_to_gguf.py ./model --outfile model.
```
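CPU conversion can still thrash swap if the model is larger than available RAM, so it helps to check free memory before starting a long conversion. A Linux-only sketch that parses `/proc/meminfo` (the function name `available_memory_gib` is ours):

```python
def available_memory_gib(meminfo_text=None):
    """Return MemAvailable from /proc/meminfo in GiB, or None if the field is absent."""
    if meminfo_text is None:
        with open("/proc/meminfo") as f:
            meminfo_text = f.read()
    for line in meminfo_text.splitlines():
        if line.startswith("MemAvailable:"):
            kib = int(line.split()[1])  # reported in kB (actually KiB)
            return kib / (1024 ** 2)
    return None
```

As a rough rule, converting in fp16 needs on the order of 2 bytes per parameter of free RAM, plus working overhead.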