How to Reduce Token Usage

Agent-CoreX already cuts token usage by routing only the most relevant tools. Here are five practical techniques to push your savings even further.

Last updated: April 2026

1. Lower top_k

The top_k parameter controls how many tools are returned per query. The default is 5. If your agent only reliably uses 1–2 tools per task, lowering this to 3 or even 2 reduces the schema size sent to the LLM.

example.py
# Assumes `client` is an already-initialized Agent-CoreX client.
# Retrieve only the top 2 tools to keep the context small.
tools = client.retrieve_tools(
    query="search the web for recent news",
    top_k=2
)

Via the REST API:

bash
curl "https://api.agent-corex.com/retrieve_tools?query=search+the+web&top_k=2" \
  -H "Authorization: Bearer acx_your_key"

Setting top_k=1 returns only the single best-matching tool. Use this carefully: if the top-ranked tool is wrong for the query, your agent has no fallback.
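If you call the REST endpoint from Python rather than curl, the same request might look like the sketch below. It uses only the standard library. The helper names (build_retrieve_url, retrieve_tools) are illustrative and not part of any Agent-CoreX SDK, and the response is assumed to be JSON, which this guide does not confirm.

```python
import json
import urllib.parse
import urllib.request

# Endpoint taken from the curl example above.
BASE_URL = "https://api.agent-corex.com/retrieve_tools"


def build_retrieve_url(query: str, top_k: int = 5) -> str:
    """Build the GET URL for a retrieval request (hypothetical helper)."""
    if top_k < 1:
        raise ValueError("top_k must be at least 1")
    # urlencode escapes spaces as '+', matching the curl example.
    params = urllib.parse.urlencode({"query": query, "top_k": top_k})
    return f"{BASE_URL}?{params}"


def retrieve_tools(query: str, api_key: str, top_k: int = 5):
    """Send the request and return the parsed body.

    Assumption: the response body is JSON; its exact shape is not
    described in this guide, so it is returned as-is.
    """
    request = urllib.request.Request(
        build_retrieve_url(query, top_k),
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)
```

Keeping the URL construction in its own function makes it easy to check the top_k value you are actually sending before any network call is made.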

2. Use specific queries

The retrieval engine uses a hybrid of vector similarity and keyword matching. A more specific query narrows the candidate set and produces higher-confidence results — meaning the returned tools are more likely to all be useful, so you can safely lower top_k.

Less effective

"do something with files"

More effective

"read a local text file from disk"

Treat the query like a search engine — include the action verb, the object, and any relevant constraint (local, remote, read-only, etc.).
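One way to keep queries consistently specific is to assemble them from the three parts named above. The helper below is a hypothetical convenience, not an Agent-CoreX feature; it only builds a string and makes no claim about how the retrieval engine weighs each word.

```python
def build_query(action: str, obj: str, constraints: tuple[str, ...] = ()) -> str:
    """Compose a query as: action verb, constraints, then object."""
    action, obj = action.strip(), obj.strip()
    if not action or not obj:
        # A query without a verb or an object is the vague case to avoid.
        raise ValueError("both an action verb and an object are required")
    # Drop empty constraints so the result has no stray spaces.
    cleaned = [c.strip() for c in constraints if c.strip()]
    return " ".join([action, *cleaned, obj])


query = build_query("read", "text file", constraints=("local",))
print(query)
```

The resulting string can be passed as the query argument of a retrieval call in place of a hand-written one.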

3. Enable only relevant packs

Agent-CoreX filters retrieval results to tools in your enabled packs (when using the V2 endpoint with user scoping). If you have 10 packs enabled but your agent only works with web and GitHub tools, disable the other 8 packs.

Go to Dashboard → Tools and toggle off any packs your current workflow doesn't need.

Fewer enabled tools means a smaller candidate pool for the retrieval engine — and a tighter match between what's returned and what your agent needs.

4. Create custom packs

For the lowest possible token overhead, create a custom pack containing only the exact tools your agent uses. Navigate to Dashboard → Custom Packs to build and manage custom packs.

A custom pack with 8 tools will return at most 5 tools per query (with top_k=5), far fewer schemas than passing all 150 tools of a broad pack to the LLM.
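The arithmetic behind that comparison can be sketched as follows. The number of schemas sent is capped by both top_k and the size of the tool pool. The figure of 200 tokens per schema is a made-up placeholder for illustration only; real schema sizes vary, and this is not how the dashboard computes its savings figure.

```python
# Hypothetical average; measure your own tool schemas for a real estimate.
ASSUMED_TOKENS_PER_SCHEMA = 200


def schemas_sent(pool_size: int, top_k: int) -> int:
    """Retrieval can never return more tools than exist in the pool."""
    return min(pool_size, top_k)


def estimated_schema_tokens(num_schemas: int,
                            tokens_per_schema: int = ASSUMED_TOKENS_PER_SCHEMA) -> int:
    """Rough token cost of sending num_schemas tool schemas to the LLM."""
    return num_schemas * tokens_per_schema


custom_pack = estimated_schema_tokens(schemas_sent(pool_size=8, top_k=5))
all_tools = estimated_schema_tokens(150)  # sending every schema, no retrieval
print(custom_pack, all_tools)
```

Under these assumed numbers the retrieved set costs a small fraction of the full 150-tool payload. The value of the custom pack is that every one of those retrieved tools comes from a pool you chose.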

5. Disable unused tools

Within a pack, you can individually disable tools that aren't applicable to your environment. For example, if you're running a read-only agent, disable all write-capable tools. Go to Dashboard → Tools, expand a pack, and toggle off individual tools. Disabled tools are never returned by the retrieval engine, even if their semantic score is high.
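The effect of disabling a tool can be pictured with a toy model: filter out disabled tools first, then rank what remains. This is a simplified illustration, not the actual retrieval engine. The tool names and scores are invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    name: str
    score: float          # stand-in for the engine's relevance score
    enabled: bool = True


def top_enabled(tools: list[Tool], top_k: int) -> list[str]:
    """Return the names of the top_k highest-scoring enabled tools."""
    candidates = [t for t in tools if t.enabled]  # disabled tools never compete
    ranked = sorted(candidates, key=lambda t: t.score, reverse=True)
    return [t.name for t in ranked[:top_k]]


tools = [
    Tool("write_file", 0.95, enabled=False),   # highest score, but disabled
    Tool("read_file", 0.90),
    Tool("delete_file", 0.85, enabled=False),
    Tool("list_dir", 0.60),
]
print(top_enabled(tools, top_k=2))  # ['read_file', 'list_dir']
```

In this model a read-only agent never sees write_file, even though it scores highest, which mirrors the behavior described above.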

Monitor your savings

After applying any of these changes, check the Usage → Estimated cost savings card. Click Refresh to force a live update. See Token Usage & Cost Savings for an explanation of how the savings figure is calculated.