
RAG Pinecone Deployment


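# Pinecone Deployment Guide

Production deployment patterns for Pinecone.

## Serverless vs Pod-based

### Serverless (Recommended)

```python
from pinecone import Pinecone, ServerlessSpec

pc = Pinecone(api_key="your-key")

# Create a serverless index
pc.create_index(
    name="my-index",
    dimension=1536,
    metric="cosine",
    spec=ServerlessSpec(
        cloud="aws",  # or "gcp", "azure"
        region="us-east-1"
    )
)
```

**Benefits:**
- Automatic scaling
- Pay-per-use pricing
- No infrastructure to manage
- Cost-effective for variable load

**Use when:**
- Traffic fluctuates heavily
- Cost optimization is a priority
- Consistently uniform latency is not required

### Pod-based

```python
from pinecone import PodSpec

# Reuses the `pc` client created in the serverless example
pc.create_index(
    name="my-index",
    dimension=1536,
    metric="cosine",
    spec=PodSpec(
        environment="us-east1-gcp",
        pod_type="p1.x1",  # or p1.x2, p1.x4, p1.x8
        pods=2,            # number of pods
        replicas=2         # high availability
    )
)
```

**Benefits:**
- Consistent performance
- Predictable latency
- Higher throughput
- Dedicated resources

**Use when:**
- Running production workloads
- Consistent p95 latency is required
- High throughput is required

## Hybrid Search

### Dense + Sparse Vectors

```python
# Handle to the index created above
index = pc.Index("my-index")

# Upsert dense and sparse vectors together
index.upsert(vectors=[
    {
        "id": "doc1",
        "values": [0.1, 0.2, ...],  # dense (semantic)
        "sparse_values": {
            "indices": [10, 45, 123],  # token IDs
            "values": [0.5, 0.3, 0.8]  # TF-IDF/BM25 scores
        },
        "metadata": {"text": "..."}
    }
])

# Hybrid query. The dense/sparse balance (alpha: 0 = sparse only,
# 1 = dense only, 0.5 = balanced) is not a documented query() argument;
# apply it by scaling the dense values by alpha and the sparse values
# by (1 - alpha) before calling query().
results = index.query(
    vector=[0.1, 0.2, ...],  # dense query
    sparse_vector={
        "indices": [10, 45],
        "values": [0.5, 0.3]
    },
    top_k=10
)
```

**Benefits:**
- Combines the strengths of both approaches
- Semantic + keyword matching
- Better recall than either method alone

## Namespaces for Multi-tenancy

```python
# Isolate data by user/tenant
index.upsert(
    vectors=[{"id": "doc1", "values": [...]}],
    namespace="user-123"
)

# Query a specific namespace
results = index.query(
    vector=[...],
    namespace="user-123",
    top_k=5
)

# List namespaces
stats = index.describe_index_stats()
print(stats['namespaces'])
```

**Use cases:**
- Multi-tenant SaaS
- Per-user data isolation
- A/B testing (production/staging namespaces)

## Metadata Filtering

### Exact Match

```python
results = index.query(
    vector=[...],
    filter={"category": "...
```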
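The alpha weighting described under Hybrid Search (0 = sparse only, 1 = dense only, 0.5 = balanced) can be applied client-side, before the query is sent. The following is a minimal sketch, not the Pinecone client's own API: `hybrid_scale` is a hypothetical helper name, and it assumes a convex combination in which dense values are multiplied by `alpha` and sparse values by `1 - alpha`. It also assumes the index scores with a dot product, so that scaling the query vectors scales each component's contribution linearly.

```python
from typing import Dict, List, Tuple


def hybrid_scale(
    dense: List[float],
    sparse: Dict[str, list],
    alpha: float,
) -> Tuple[List[float], Dict[str, list]]:
    """Weight a dense/sparse query pair (hypothetical helper, not a Pinecone API).

    alpha=1.0 keeps only the dense signal, alpha=0.0 only the sparse signal.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")

    # Scale the dense (semantic) component by alpha
    scaled_dense = [v * alpha for v in dense]

    # Scale the sparse (keyword) component by 1 - alpha; indices are unchanged
    scaled_sparse = {
        "indices": list(sparse["indices"]),
        "values": [v * (1.0 - alpha) for v in sparse["values"]],
    }
    return scaled_dense, scaled_sparse


# Example with the small illustrative vectors used in this guide
dense_q, sparse_q = hybrid_scale(
    [0.1, 0.2],
    {"indices": [10, 45], "values": [0.5, 0.3]},
    alpha=0.5,
)
```

The scaled pair can then be passed as `vector=dense_q` and `sparse_vector=sparse_q` to `index.query(...)`. Keeping the weighting in a small pure function makes it easy to tune `alpha` per query type without touching the index.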

Source: claude-code-templates (MIT). The Chinese translation was generated by AI. See About Us for details.

BAGUA AI