TensorZero + Cortex Matrix Evaluation

Run a financial-research RAG experiment that exercises the Parse, Storage, Knowledge, Evaluation, and Synthesis workflows.

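The examples/tensorzero-cortex project demonstrates a financial-research pipeline that is close to a real production workload:

  1. Parse 20 macroeconomic and financial URLs with multiple Cortex Parse engines;
  2. Upload each parsed Markdown document to Cortex Storage;
  3. Ingest the documents into Cortex Knowledge to build graph-aware context;
  4. Use TensorZero to run exhaustive or adaptive A/B tests across OpenAI, Gemini, Kimi, or other variants;
  5. Convert TensorZero inference and feedback records into Cortex Evaluation cases;
  6. Optionally use Cortex Synthesis to extend the QA, RAG golden, or structured benchmark datasets.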

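API mapping

| Cortex API | How the example uses it |
| --- | --- |
| Parse | Calls /v1/parse/sync or /v1/parse/jobs for each URL and parse engine. |
| Storage | Calls /v1/storage/files to persist the normalized Markdown. |
| Knowledge | Creates a dataset, submits Add/Cognify jobs, then uses Search to build the RAG context. |
| Evaluation | Submits the generated RAG/custom cases to /v1/eval/sync or /v1/eval/jobs. |
| Synthesis | After a run, can generate QA pairs, RAG goldens, or structured benchmark records from the artifacts. |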
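
Parse and Evaluation each expose a synchronous endpoint and a job-based one. The sketch below shows one way to pick between them. The helper name `endpoint_for` is hypothetical and not part of the example project, and treating the `/jobs` paths as the "async" mode is an assumption; only the four paths and the default base URL appear in this page.

```python
# Hypothetical helper: only the endpoint paths come from the API mapping.
# Mapping "async" to the /jobs endpoints is an assumption.
ENDPOINTS = {
    "parse": {"sync": "/v1/parse/sync", "async": "/v1/parse/jobs"},
    "eval": {"sync": "/v1/eval/sync", "async": "/v1/eval/jobs"},
}


def endpoint_for(api: str, mode: str, base_url: str = "http://127.0.0.1:8080") -> str:
    """Return the full URL for an API family ("parse" or "eval") and a mode."""
    try:
        path = ENDPOINTS[api][mode]
    except KeyError:
        raise ValueError(f"unsupported api/mode: {api}/{mode}") from None
    # Strip a trailing slash so the joined URL never contains "//v1".
    return base_url.rstrip("/") + path


print(endpoint_for("parse", "sync"))
print(endpoint_for("eval", "async"))
```

Keeping the paths in one table makes it easy to switch a whole run between sync and async without touching the call sites.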

1. Prepare Cortex

Start the stack from the Cortex repository root:

cd /path/to/cortex
test -f .env || cp .env.local.example .env

docker compose --env-file .env -p cortex-local -f compose.local.yaml \
  --profile docling \
  --profile eval-runtime \
  --profile synthesis-runtime \
  up -d --build

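If you only changed the runtime configuration, you can recreate just the services the full example depends on, without rebuilding the images: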

docker compose --env-file .env -p cortex-local -f compose.local.yaml \
  --profile docling \
  --profile eval-runtime \
  --profile synthesis-runtime \
  up -d --no-build --force-recreate \
  cortex-api \
  cortex-parse-worker-docling \
  cortex-knowledge-worker \
  cortex-evaluation-worker-runtime \
  cortex-synthesis-worker-runtime

2. Configure the TensorZero example

cd /path/to/cortex/examples/tensorzero-cortex
test -f .env || cp .env.example .env
uv sync

Fill in the API keys for the model providers you want to test:

OPENAI_API_KEY=...
GEMINI_API_KEY=...
KIMI_API_KEY=...
OPENROUTER_API_KEY=...
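
Before a long run it can help to confirm that every selected variant has a key. The following is a minimal sketch, not part of the example project: the function `missing_keys` and the variant-to-key mapping are assumptions inferred from the matching names, and OPENROUTER_API_KEY is left out because this page does not say which variant uses it.

```python
import os

# Assumed mapping from variant name to the environment variable it needs.
VARIANT_KEYS = {
    "openai": "OPENAI_API_KEY",
    "gemini": "GEMINI_API_KEY",
    "kimi": "KIMI_API_KEY",
}


def missing_keys(variants, env=None):
    """Return the key names that are unset or empty for the given variants."""
    env = os.environ if env is None else env
    missing = []
    for variant in variants:
        key = VARIANT_KEYS.get(variant)
        # Variants without a known key are skipped rather than reported.
        if key is not None and not env.get(key):
            missing.append(key)
    return missing


print(missing_keys(["openai", "gemini", "kimi"]))
```

An empty list means every known variant has a non-empty key; the check cannot tell whether a key is actually valid.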

Typical settings for a full local experiment:

PARSE_ENGINES=auto,crawl4ai,jina_reader,markitdown,llama_parse,docling
TENSORZERO_STRATEGY=exhaustive
TENSORZERO_VARIANTS=openai,gemini,kimi
TENSORZERO_CONTEXT_GROUPING=by_parse_engine
SUBMIT_CORTEX_EVAL=true
CORTEX_EVAL_MODE=async
CORTEX_EVAL_TYPES=rag,custom
CORTEX_EVAL_METRIC_PROFILE=deepeval_rag_core
KNOWLEDGE_GRAPH_VISUALIZATION=true
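
To get a feel for the size of the experiment, the comma-separated settings can be expanded into matrix cells. This sketch assumes that the exhaustive strategy with by_parse_engine grouping pairs every parse engine's context with every variant once per query; the helper names are hypothetical.

```python
from itertools import product


def parse_csv(value: str) -> list[str]:
    """Split a comma-separated setting, dropping blanks and stray spaces."""
    return [item.strip() for item in value.split(",") if item.strip()]


def matrix_cells(parse_engines: str, variants: str) -> list[tuple[str, str]]:
    """Return every (parse engine, variant) pair, assuming a full cross product."""
    return list(product(parse_csv(parse_engines), parse_csv(variants)))


cells = matrix_cells(
    "auto,crawl4ai,jina_reader,markitdown,llama_parse,docling",
    "openai,gemini,kimi",
)
print(len(cells))  # 6 engines x 3 variants = 18
```

Under that assumption, each extra engine or variant grows the inference count multiplicatively, which is worth keeping in mind for provider costs.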

Render the TensorZero config and start it:

uv run tensorzero-cortex render-config
docker compose --env-file .env -f tensorzero/docker-compose.tensorzero.yaml up -d

The TensorZero Gateway is usually at http://127.0.0.1:3002, and the TensorZero UI at http://127.0.0.1:4000.

3. Run a smoke test first

Start with 1 URL and 1 parser:

uv run tensorzero-cortex run --max-urls 1 --parse-engines markitdown --skip-knowledge-jobs

This minimal path covers Parse, Storage, TensorZero inference, and the fallback RAG context; it does not require the Knowledge worker to be available.

4. Run the full matrix

uv run tensorzero-cortex run \
  --max-urls 5 \
  --parse-engines auto,crawl4ai,jina_reader,markitdown,llama_parse,docling \
  --parse-mode sync \
  --tensorzero-strategy exhaustive \
  --tensorzero-variants openai,gemini,kimi \
  --context-grouping by_parse_engine \
  --submit-cortex-eval \
  --cortex-eval-mode async \
  --cortex-eval-types rag,custom \
  --query "What macroeconomic and financial stability risks are highlighted across these documents?"

Docling is a heavyweight worker. Even when the global parse mode is sync, the example overrides docling to async.
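
The per-engine override can be pictured as follows. This is only an illustration of the rule stated above, not the example's actual code; `HEAVY_ENGINES` and `effective_parse_mode` are hypothetical names, and docling is the only engine the page names as overridden.

```python
# Hypothetical illustration of the override rule.
HEAVY_ENGINES = {"docling"}


def effective_parse_mode(engine: str, global_mode: str) -> str:
    """Return the mode actually used for an engine given the global setting."""
    if engine in HEAVY_ENGINES:
        return "async"
    return global_mode


engines = ["auto", "crawl4ai", "jina_reader", "markitdown", "llama_parse", "docling"]
modes = {engine: effective_parse_mode(engine, "sync") for engine in engines}
print(modes)
```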

5. Inspect the output

Each run writes to:

examples/tensorzero-cortex/artifacts/{run_id}

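Key files:

| File | Meaning |
| --- | --- |
| report.md | Human-readable experiment report. |
| report.json | Structured report. |
| tensorzero_eval_dataset.jsonl | Evaluation set generated from the TensorZero records. |
| raw_tensorzero_result.json | TensorZero inference summary. |
| raw_search_result.json | Knowledge Search response, or the parse fallback. |
| knowledge_graph.html | Optional Cognee knowledge-graph visualization. |
| parse/*.json | Parse and Storage responses for each URL + engine. |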
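
The structured artifacts can be loaded with a few lines of Python for further analysis. The sketch below assumes only the file names listed above and that the .jsonl file holds one JSON object per line; the helper names `read_jsonl` and `load_run` are hypothetical, and nothing is assumed about the fields inside each record.

```python
import json
from pathlib import Path


def read_jsonl(path):
    """Read a JSON Lines file into a list, skipping blank lines."""
    records = []
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records


def load_run(run_dir):
    """Load the structured report and the evaluation cases for one run."""
    run_dir = Path(run_dir)
    report = json.loads((run_dir / "report.json").read_text(encoding="utf-8"))
    cases = read_jsonl(run_dir / "tensorzero_eval_dataset.jsonl")
    return report, cases
```

A typical call would be `report, cases = load_run("examples/tensorzero-cortex/artifacts/<run_id>")`, with the run ID taken from the directory created by the run.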

6. Add Synthesis after the run

The example currently focuses on Parse, Storage, Knowledge, TensorZero, and Evaluation. To cover the fifth Cortex API, submit a Synthesis job based on the same parsed content:

Python:

import os

import requests

BASE_URL = os.getenv("CORTEX_URL", "http://127.0.0.1:8080")
TOKEN = os.getenv("CORTEX_TOKEN", "replace_with_token")


def auth_headers():
    return {"Authorization": f"Bearer {TOKEN}"}


payload = {
    "name": "tensorzero-rag-goldens",
    "synthesis_type": "qa_pairs",
    "engine_id": "deepeval",
    "source": {
        "type": "documents",
        "documents": [
            "Paste a representative parsed Markdown excerpt from report artifacts."
        ],
    },
    "config": {
        "sample_count": 10,
        "include_expected_output": True,
    },
    "output": {
        "output_format": "jsonl",
        "include_preview": True,
    },
}

response = requests.post(
    f"{BASE_URL}/v1/synthesis/jobs",
    headers={**auth_headers(), "Content-Type": "application/json"},
    json=payload,
)
response.raise_for_status()
data = response.json()
print(data)

JavaScript:

const BASE_URL = process.env.CORTEX_URL ?? "http://127.0.0.1:8080";
const TOKEN = process.env.CORTEX_TOKEN ?? "replace_with_token";

const authHeaders = {
  Authorization: `Bearer ${TOKEN}`,
};

const payload = {
  name: "tensorzero-rag-goldens",
  synthesis_type: "qa_pairs",
  engine_id: "deepeval",
  source: {
    type: "documents",
    documents: [
      "Paste a representative parsed Markdown excerpt from report artifacts.",
    ],
  },
  config: {
    sample_count: 10,
    include_expected_output: true,
  },
  output: {
    output_format: "jsonl",
    include_preview: true,
  },
};

const response = await fetch(`${BASE_URL}/v1/synthesis/jobs`, {
  method: "POST",
  headers: { ...authHeaders, "Content-Type": "application/json" },
  body: JSON.stringify(payload),
});
if (!response.ok) throw new Error(await response.text());
const data = await response.json();
console.log(data);

Java:

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class CortexExample {
  static final String BASE_URL = System.getenv().getOrDefault("CORTEX_URL", "http://127.0.0.1:8080");
  static final String TOKEN = System.getenv().getOrDefault("CORTEX_TOKEN", "replace_with_token");
  static final HttpClient HTTP = HttpClient.newHttpClient();

  static void print(HttpResponse<String> response) {
    System.out.println(response.statusCode());
    System.out.println(response.body());
  }

  public static void main(String[] args) throws Exception {
    String json = """
      {
        "name": "tensorzero-rag-goldens",
        "synthesis_type": "qa_pairs",
        "engine_id": "deepeval",
        "source": {
          "type": "documents",
          "documents": [
            "Paste a representative parsed Markdown excerpt from report artifacts."
          ]
        },
        "config": {
          "sample_count": 10,
          "include_expected_output": true
        },
        "output": {
          "output_format": "jsonl",
          "include_preview": true
        }
      }
      """;

    HttpRequest request = HttpRequest.newBuilder()
      .uri(URI.create(BASE_URL + "/v1/synthesis/jobs"))
      .header("Authorization", "Bearer " + TOKEN)
      .header("Content-Type", "application/json")
      .POST(HttpRequest.BodyPublishers.ofString(json))
      .build();

    print(HTTP.send(request, HttpResponse.BodyHandlers.ofString()));
  }
}

The generated QA pairs can be used as new Evaluation cases, or persisted as a dataset for reuse in later TensorZero experiments.
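
Turning synthesized QA pairs into evaluation cases is mostly a field-renaming step. The sketch below is illustrative only: this page does not document the Synthesis output schema or the Evaluation case schema, so the keys `question`, `answer`, `input`, and `expected_output` are all assumptions, as is the helper name `qa_pairs_to_cases`. Check the real field names in a job's output before relying on it.

```python
import json


def qa_pairs_to_cases(jsonl_text, question_key="question", answer_key="answer"):
    """Map JSONL QA pairs to case dicts; all field names are assumed."""
    cases = []
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue  # tolerate blank lines between records
        pair = json.loads(line)
        cases.append(
            {
                "input": pair[question_key],
                "expected_output": pair[answer_key],
            }
        )
    return cases


sample = '{"question": "Q1", "answer": "A1"}\n{"question": "Q2", "answer": "A2"}\n'
print(qa_pairs_to_cases(sample))
```

Passing the key names as parameters keeps the function usable once the actual schema is known.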
