pi-search-hub
Health Warn
- No license — Repository has no license file
- Description — Repository has a description
- Active repo — Last push 0 days ago
- Community trust — 11 GitHub stars
Code Fail
- process.env — Environment variable access in benchmark/benchmark.mjs
- network request — Outbound network request in benchmark/benchmark.mjs
- execSync — Synchronous shell command execution in extensions/search-hub.ts
- process.env — Environment variable access in extensions/search-hub.ts
- network request — Outbound network request in extensions/search-hub.ts
Permissions Pass
- Permissions — No dangerous permissions requested
This tool is a unified web search extension that provides a single interface to 12 search backend providers, with automatic fallback and rate limiting.
Security Assessment
The tool requires outbound network requests to function, which is expected for a search aggregator. API keys and configuration are handled via environment variables rather than hardcoded secrets. However, the scan detected synchronous shell command execution (`execSync`) in the main extension file (`extensions/search-hub.ts`), which is a significant security concern. The README attributes this to resolving `!`-prefixed credential commands from the config file, but any use of `execSync` in core code could be exploited depending on how inputs are handled.
Overall Risk: Medium — The shell execution capability in the main code requires careful code review before trusting.
Quality Assessment
The project is very new and has low community visibility, with only 11 GitHub stars. It was recently updated (pushed within the last day), indicating active development. A major red flag is that the repository has no license file, even though the README names MIT; until a license file is added, the legal permission to use, modify, or distribute the code is unclear. The documentation is excellent, providing clear installation instructions, backend comparisons, and recent benchmark results.
Verdict
Use with caution — The missing license and the presence of synchronous shell execution in the core code are significant concerns. Review the source code manually before installing.
Unified web search + content extraction extension for pi — 12 backends (DuckDuckGo, Jina AI, Tavily, Brave, Exa, Serper, Firecrawl, Marginalia, LangSearch, WebSearchAPI, Perplexity Sonar, SearXNG) with auto-fallback, RRF combine mode, web_read tool, and env/shell credential resolution.
pi-search-hub
Unified web search + content extraction extension for pi with 12 backend providers (all working). One web_search tool, one web_read tool, auto-fallback, RRF-ranked combine mode, and credential resolution via env/shell/literal.
Installation
pi install npm:pi-search-hub
Note for DuckDuckGo backend: Requires the `ddgs` Python package. Install with:
- Linux/macOS: `pip3 install ddgs`
- Windows: `pip install ddgs`
Usage
Web Search
After installing, just ask naturally:
Search for recent AI agent frameworks.
What's the latest news on Llama 4?
Or use the tools directly — the agent picks the best configured backend automatically:
- `web_search`: search the web with auto-fallback or parallel combine mode
- `web_read`: fetch any URL as clean markdown
Combine Mode
Set combine=true to query ALL enabled backends in parallel with Reciprocal Rank Fusion (RRF) ranking:
Search for "Rust vs Go performance benchmarks" with combine=true to get results from all backends
Combine mode benefits:
- Broader coverage across multiple search indexes
- Results ranked by RRF — position-based scoring across all backends
- Each result shows which backend found it
- URL deduplication with content-aware merge (prefers richest result)
- Useful for comprehensive research or when you want diverse sources
Tradeoff: Uses more API quota per query (all backends are called), but you get more comprehensive results.
Read Web Pages
Fetch any URL as clean markdown — great for extracting article content, docs, or reference pages:
Read https://docs.example.com/api-reference
The web_read tool supports:
- `objective`: specific question to focus extraction
- `keywords`: relevant terms to highlight on long pages
- `mode`: `rush` for speed (returns innerText) or `smart` (markdown extraction)
- `fresh`: bypass cache when freshness matters
Supported Backends
| # | Backend | Free Tier | API Key? | How to get key |
|---|---|---|---|---|
| 1 | DuckDuckGo | Unlimited (rate-limited) | No | pip install ddgs (Linux/macOS: pip3) |
| 2 | Jina AI | Free tier (API key req.) | Yes | jina.ai |
| 3 | Marginalia Search | Unlimited (rate-limited) | No† | marginalia.nu |
| 4 | Tavily | 1,000 calls/month | Yes | tavily.com |
| 5 | Serper (Google) | 2,500 queries/month | Yes | serper.dev |
| 6 | Brave | 2,000 queries/month | Yes | brave.com/search/api |
| 7 | Firecrawl | 500 free credits | Yes | firecrawl.dev |
| 8 | Exa | 10 QPS rate-limited | Yes | exa.ai |
| 9 | LangSearch | Genuinely free, no CC | Yes | langsearch.com |
| 10 | WebSearchAPI.ai | 2,000 free credits | Yes | websearchapi.ai |
| 11 | Perplexity Sonar | Unlimited free queries | Yes | perplexity.ai |
| 12 | SearXNG | Self-hosted, unlimited | No | docs.searxng.org |
† Marginalia Search uses `public` as a shared API key: no registration required, but subject to a shared rate limit.
Jina AI (s.jina.ai) returns full markdown content. Free tier requires a free API key from jina.ai.
SearXNG is a self-hosted metasearch engine. Run your own instance (or use a public one), no API key required. Configure the instance URL in `.pi/search.json`.
Removed: Stract, UnSearch, BoardReader, EntireWeb, Search1API, FreeAPITools.dev — no longer viable (public API removed, requires payment, or endpoint not implemented).
Configuration
Configure backends globally (all projects) or per-project:
Global: ~/.pi/agent/extensions/search.json
Project: .pi/search.json (project takes precedence)
```json
{
  "defaultBackend": "auto",
  "backends": {
    "duckduckgo": { "enabled": true },
    "jina": { "enabled": true, "apiKey": "JINA_API_KEY" },
    "marginalia": { "enabled": true },
    "serper": { "enabled": true, "apiKey": "SERPER_API_KEY" },
    "tavily": { "enabled": true, "apiKey": "TAVILY_API_KEY" },
    "brave": { "enabled": true, "apiKey": "BRAVE_API_KEY" },
    "exa": { "enabled": true, "apiKey": "EXA_API_KEY" },
    "firecrawl": { "enabled": true, "apiKey": "FIRECRAWL_API_KEY" },
    "langsearch": { "enabled": true, "apiKey": "LANGSEARCH_API_KEY" },
    "websearchapi": { "enabled": true, "apiKey": "WEBSEARCHAPI_API_KEY" },
    "perplexity": { "enabled": true, "apiKey": "PERPLEXITY_API_KEY" },
    "searxng": { "enabled": true, "instanceUrl": "http://localhost:8888" }
  }
}
```
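The precedence rule above (project config over global config) can be sketched as a small merge function. This is a minimal illustration, not the extension's actual loader: the type names, `mergeConfigs`, and the choice to merge each backend field by field rather than replace the whole file are assumptions.

```typescript
// Hypothetical shapes mirroring the JSON config; not the extension's real types.
interface BackendConfig {
  enabled?: boolean;
  apiKey?: string;
  instanceUrl?: string;
}
interface SearchConfig {
  defaultBackend?: string;
  backends?: Record<string, BackendConfig>;
}

// Assumption: project values override global ones field by field, per backend.
function mergeConfigs(globalCfg: SearchConfig, projectCfg: SearchConfig): SearchConfig {
  const globalBackends = globalCfg.backends ?? {};
  const projectBackends = projectCfg.backends ?? {};
  const merged: Record<string, BackendConfig> = {};
  const names = Object.keys(globalBackends).concat(Object.keys(projectBackends));
  for (const name of names) {
    // Later spreads win, so project settings take precedence.
    merged[name] = { ...globalBackends[name], ...projectBackends[name] };
  }
  return {
    defaultBackend: projectCfg.defaultBackend ?? globalCfg.defaultBackend,
    backends: merged,
  };
}
```

Under these assumptions, a project file that only sets `"brave": { "enabled": false }` would disable Brave for that project while keeping the globally configured key.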
Credential Resolution
The apiKey field supports four formats (following pi-web-providers convention):
| apiKey value | Resolved from | Example |
|---|---|---|
| `"SERPER_API_KEY"` | `process.env.SERPER_API_KEY` | ALL_CAPS → env var |
| `"!pass show api/serper"` | stdout of shell command (cached) | `!` prefix → exec |
| `"sk-abc123..."` | Used as-is | Literal key (backwards compatible) |
| (unset) | `SEARCH_<BACKEND>_API_KEY` env fallback | Auto-enables backend |
Env var references: Any ALL_CAPS string is treated as an environment variable name (not a literal). If the referenced env var is unset, a warning is printed (your literal key is not silently discarded).
Shell commands: Commands prefixed with ! are executed via execSync with a 5s timeout. Results are cached and invalidated when config is reloaded (editing the config file clears the cache).
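The resolution rules described so far can be sketched as a single function. This is an illustration under stated assumptions, not the code in `extensions/search-hub.ts`: `resolveApiKey`, `Env`, and `ShellRunner` are hypothetical names, the exact "ALL_CAPS" pattern is a guess, and the shell runner is injected so the sketch stays free of side effects. In the real extension the runner would presumably wrap `execSync` with its 5s timeout and a cache.

```typescript
type Env = Record<string, string | undefined>;
// Hypothetical: runs a shell command and returns its stdout.
type ShellRunner = (command: string) => string;

// Assumption: "ALL_CAPS" means uppercase letters, digits, and underscores only.
const ENV_NAME = /^[A-Z][A-Z0-9_]*$/;

function resolveApiKey(
  backend: string,
  apiKey: string | undefined,
  env: Env,
  runShell: ShellRunner,
): string | undefined {
  if (apiKey === undefined || apiKey === "") {
    // Unset: fall back to SEARCH_<BACKEND>_API_KEY.
    return env["SEARCH_" + backend.toUpperCase() + "_API_KEY"];
  }
  if (apiKey.charAt(0) === "!") {
    // "!" prefix: run the rest as a shell command and use its trimmed stdout.
    return runShell(apiKey.slice(1)).trim();
  }
  if (ENV_NAME.test(apiKey)) {
    // ALL_CAPS: treated as an environment variable name, never as a literal.
    const value = env[apiKey];
    if (value === undefined) {
      console.warn("Environment variable " + apiKey + " is not set");
    }
    return value;
  }
  // Anything else is used as a literal key.
  return apiKey;
}
```

Passing `process.env` as `env` and a thin wrapper around `execSync` as `runShell` would connect this sketch to a real environment.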
Convenience env vars: Backends are auto-enabled when these env vars are set (even with no config entry):
```bash
export SEARCH_SERPER_API_KEY="sk-..."
export SEARCH_TAVILY_API_KEY="sk-..."
export SEARCH_EXA_API_KEY="sk-..."
# ...
```
An explicit config entry can instead reference an env var by name:
```json
{
  "backends": {
    "serper": { "enabled": true, "apiKey": "SERPER_API_KEY" }
  }
}
```
To rotate a shell-command key: Update the secret in your password manager, then trigger a config reload (edit the config file, or wait 10s for automatic refresh).
Or use the interactive setup:
/search-setup
Commands
| Command | Description |
|---|---|
| `/search-setup` | Interactive prompt to configure API keys for any backend |
| `/search-status` | Show which backends are active and which have keys |
How auto mode works
Fallback Mode (default, combine=false)
- Tries each enabled backend in order from your config
- If a backend fails (rate limit, auth error, etc.), moves to the next one
- DuckDuckGo requires no API key; Jina AI needs a free API key. Both serve as safety nets
- Returns results from the first backend that succeeds
- If all backends fail, reports the collected errors
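The fallback steps above can be sketched as a simple loop. This is a minimal sketch, not the extension's implementation: `Backend`, `SearchResult`, and `searchWithFallback` are hypothetical names, and it treats only thrown errors as failures (whether an empty result list also triggers fallback is not stated here).

```typescript
interface SearchResult {
  title: string;
  url: string;
}

// Hypothetical backend shape: a name plus an async search function.
interface Backend {
  name: string;
  search: (query: string) => Promise<SearchResult[]>;
}

async function searchWithFallback(backends: Backend[], query: string) {
  const errors: string[] = [];
  // Try backends in config order; the first success wins.
  for (const backend of backends) {
    try {
      const results = await backend.search(query);
      return { backend: backend.name, results, errors };
    } catch (err) {
      // Remember why this backend failed, then move on to the next one.
      const reason = err instanceof Error ? err.message : String(err);
      errors.push(backend.name + ": " + reason);
    }
  }
  // Every backend failed: report the collected errors together.
  throw new Error("All backends failed:\n" + errors.join("\n"));
}
```

Returning the `errors` array alongside a successful result keeps earlier failures visible, which can help when a preferred backend is silently rate-limited.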
Combine Mode (combine=true)
- Queries ALL enabled backends in parallel
- Each backend receives `numResults / numBackends` as a target
- Results are merged using Reciprocal Rank Fusion (RRF), a position-based scoring that works across incompatible ranking systems
- Each result shows its source backend (e.g., `*Source: Tavily*`)
- URL dedup prefers the result with the richest content (content > snippet)
- Backend statistics are displayed (which succeeded, result counts, errors)
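The per-backend target and the URL deduplication rule can be illustrated as follows. This is a sketch under assumptions: the names are hypothetical, the rounding of `numResults / numBackends` (here rounded up) is not specified in this README, and URLs are compared as exact strings with no normalization.

```typescript
// Hypothetical result shape for combine mode.
interface CombinedResult {
  url: string;
  title: string;
  snippet?: string;
  content?: string;
  source: string; // backend that returned this result
}

// Assumption: round up so the targets add up to at least numResults.
function perBackendTarget(numResults: number, numBackends: number): number {
  return Math.max(1, Math.ceil(numResults / numBackends));
}

// Richness order from the list above: full content beats a snippet.
function richness(r: CombinedResult): number {
  if (r.content) return 2;
  if (r.snippet) return 1;
  return 0;
}

// Keep one result per URL, preferring the richest; first-seen order is preserved.
function dedupeByUrl(results: CombinedResult[]): CombinedResult[] {
  const best = new Map<string, CombinedResult>();
  for (const r of results) {
    const existing = best.get(r.url);
    if (existing === undefined || richness(r) > richness(existing)) {
      best.set(r.url, r);
    }
  }
  return Array.from(best.values());
}
```

A `Map` keeps insertion order and updating an existing key does not move it, so a richer duplicate replaces the earlier entry in place.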
RRF Scoring
RRF assigns each result a score of Σ(1 / (60 + rank_i)) across all backends that returned it. Results are ranked by score, then by number of backends that found them. This means a result ranked #1 by one backend and #5 by another beats a result ranked #4 by two backends.
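The scoring rule can be written out in a few lines. This is a minimal sketch of Reciprocal Rank Fusion with the constant 60 from the formula above; `fuseRankings` and `FusedResult` are hypothetical names, and each backend's list is assumed to contain a given URL at most once.

```typescript
// Constant from the formula above: score = sum of 1 / (60 + rank).
const RRF_K = 60;

interface FusedResult {
  url: string;
  score: number;
  backends: string[]; // backends that returned this URL
}

// rankings maps a backend name to its result URLs, best first.
function fuseRankings(rankings: Record<string, string[]>): FusedResult[] {
  const fused = new Map<string, FusedResult>();
  for (const backend of Object.keys(rankings)) {
    rankings[backend].forEach((url, index) => {
      const rank = index + 1; // ranks are 1-based
      const entry = fused.get(url) ?? { url, score: 0, backends: [] };
      entry.score += 1 / (RRF_K + rank);
      entry.backends.push(backend);
      fused.set(url, entry);
    });
  }
  // Higher score first; ties go to the URL found by more backends.
  return Array.from(fused.values()).sort(
    (a, b) => b.score - a.score || b.backends.length - a.backends.length,
  );
}
```

With the example from the text, a URL ranked #1 and #5 scores 1/61 + 1/65 ≈ 0.0318, ahead of a URL ranked #4 by two backends at 2/64 ≈ 0.0313.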
Security
- API keys are stored in local config files only (`~/.pi/agent/extensions/search.json` or `.pi/search.json`), never sent to any third party besides the chosen backend
- Env vars and shell commands are supported for credential resolution; the config file is trusted (you own it), but never commit plain API keys to version control
- DuckDuckGo queries use a spawned Python subprocess (abortable via signal)
- All HTTP backends have a 30-second timeout; shell commands for credentials have a 5-second timeout
- Error messages are sanitized: API response bodies are truncated and key-like patterns are redacted
- The `.pi/` directory is in `.gitignore`; never commit API keys to version control
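The error sanitization mentioned in the list above might look roughly like this. The patterns and the length limit are assumptions chosen for illustration; the extension's actual rules are not shown in this README, and `sanitizeError` is a hypothetical name.

```typescript
// Assumption: cap error text at 200 characters.
const MAX_ERROR_LENGTH = 200;

// Assumption: what counts as "key-like" here is a guess, not the extension's list.
const KEY_PATTERNS: RegExp[] = [
  /\bBearer\s+[A-Za-z0-9._-]{8,}/gi, // Authorization header values
  /\bsk-[A-Za-z0-9_-]{8,}/g, // "sk-..." style keys
  /\b[A-Za-z0-9]{32,}\b/g, // long opaque tokens
];

function sanitizeError(message: string): string {
  let out = message;
  // Redact first, so a key that straddles the cut-off cannot leak partially.
  for (const pattern of KEY_PATTERNS) {
    out = out.replace(pattern, "[REDACTED]");
  }
  // Then truncate long response bodies.
  if (out.length > MAX_ERROR_LENGTH) {
    out = out.slice(0, MAX_ERROR_LENGTH) + "...";
  }
  return out;
}
```

Redacting before truncating is the safer order: truncating first could leave the leading part of a key that no longer matches any pattern.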
Testing
```bash
# Run the full benchmark against all backends
node benchmark/benchmark.mjs

# Quick test Jina AI (with your free API key)
curl -s -H "Authorization: Bearer $JINA_API_KEY" "https://s.jina.ai/?q=test&format=json" | jq .

# Quick test Exa (with your configured key in $KEY)
curl -X POST "https://api.exa.ai/search" \
  -H "Content-Type: application/json" \
  -H "x-api-key: $KEY" \
  -d '{"query": "test", "numResults": 3, "contents": {"text": true}}'

# Quick test Perplexity Sonar
curl -X POST "https://api.perplexity.ai/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $KEY" \
  -d '{"model": "sonar", "messages": [{"role": "user", "content": "test"}], "search_context_size": "low"}'

# Quick test SearXNG (replace URL with your instance)
curl "http://localhost:8888/search?q=test&format=json&count=3"
```
Adding a new backend
Backends are registered via the BACKEND_DEFS registry in extensions/search-hub.ts. Define a search function and add one entry to the registry:
```typescript
const BACKEND_DEFS: Record<string, BackendRunner> = {
  // ... existing entries
  mybackend: {
    needsKey: true,
    needsKeyFromConfig: false,
    needsInstanceUrl: false,
    label: "My Backend",
    setupLabel: "My Backend (free tier description)",
    search: async (query, numResults, { key, signal }) => {
      const result = await searchMyBackend(query, numResults, key!, signal);
      return { results: result.results };
    },
  },
};
```
The registry handles dispatching, key resolution, formatting labels, and setup menu — no other edits needed.
License
MIT
Proudly created with pi