Web Search
How SurfSense web search works and how to configure it for production with residential proxies
SurfSense uses SearXNG as a bundled meta-search engine to provide web search across all search spaces. SearXNG aggregates results from multiple search engines (Google, DuckDuckGo, Brave, Bing, and more) without requiring any API keys.
How It Works
When a user triggers a web search in SurfSense:
- The backend sends a query to the bundled SearXNG instance via its JSON API
- SearXNG fans out the query to all enabled search engines simultaneously
- Results are aggregated, deduplicated, and ranked by engine weight
- The backend receives merged results and presents them to the user
SearXNG runs as a Docker container alongside the backend. It is never exposed to the internet. Only the backend communicates with it over the internal Docker network.
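The first step of the flow above can be sketched in a few lines. This is a minimal illustration, not SurfSense's backend code: it assumes SearXNG's standard `/search` endpoint with `format=json` (the JSON format has to be enabled on the instance), and the function names are hypothetical.

```python
import json
import urllib.parse
import urllib.request


def build_search_url(host: str, query: str) -> str:
    """Build a SearXNG JSON search URL (assumes the /search endpoint)."""
    params = urllib.parse.urlencode({"q": query, "format": "json"})
    return f"{host.rstrip('/')}/search?{params}"


def search(host: str, query: str, timeout: float = 20.0) -> list:
    """Query SearXNG and return the merged result entries."""
    url = build_search_url(host, query)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        payload = json.load(resp)
    # SearXNG's JSON response carries the merged hits under "results".
    return payload.get("results", [])
```

From a container on the same Docker network, `search("http://searxng:8080", "surfsense")` would return the merged list; from the host it would not work, since the container is only reachable internally unless a port is published.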
Docker Setup
SearXNG is included in both docker-compose.yml and docker-compose.dev.yml and works out of the box with no configuration needed.
The backend connects to SearXNG automatically via the SEARXNG_DEFAULT_HOST environment variable (defaults to http://searxng:8080).
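How the backend reads this variable is not shown here; the sketch below only illustrates the described behavior (use SEARXNG_DEFAULT_HOST if set, otherwise fall back to the bundled default). The helper name `searxng_host` is hypothetical.

```python
import os

DEFAULT_SEARXNG_HOST = "http://searxng:8080"


def searxng_host(env=None) -> str:
    """Return the SearXNG base URL, falling back to the bundled default."""
    env = os.environ if env is None else env
    # An empty value is treated like an unset one.
    value = env.get("SEARXNG_DEFAULT_HOST") or DEFAULT_SEARXNG_HOST
    return value.rstrip("/")
```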
Disabling SearXNG
If you don't need web search, you can skip the SearXNG container entirely:
docker compose up --scale searxng=0
Using Your Own SearXNG Instance
To point SurfSense at an external SearXNG instance instead of the bundled one, set the following in docker/.env:
SEARXNG_DEFAULT_HOST=http://your-searxng:8080
Configuration
SearXNG is configured via docker/searxng/settings.yml. The key sections are:
Engines
SearXNG queries multiple search engines in parallel. Each engine has a weight that influences how its results rank in the merged output:
| Engine | Weight | Notes |
|---|---|---|
| Google | 1.2 | Highest priority, best general results |
| DuckDuckGo | 1.1 | Strong privacy-focused alternative |
| Brave | 1.0 | Independent search index |
| Bing | 0.9 | Different index from Google |
| Wikipedia | 0.8 | Encyclopedic results |
| StackOverflow | 0.7 | Technical/programming results |
| Yahoo | 0.7 | Powered by Bing's index |
| Wikidata | 0.6 | Structured data results |
| Currency | default | Currency conversion |
| DDG Definitions | default | Instant answers from DuckDuckGo |
All engines are free: SearXNG scrapes public search pages, so no API keys are required.
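To make the effect of weights concrete, here is a simplified illustration of weight-based merging. It is not SearXNG's actual scoring code: it assumes each result earns `weight / position` from every engine that returned it, and the names `ENGINE_WEIGHTS` and `merge_results` are hypothetical.

```python
from collections import defaultdict

# A subset of the weights from the table above.
ENGINE_WEIGHTS = {"google": 1.2, "duckduckgo": 1.1, "brave": 1.0, "bing": 0.9}


def merge_results(per_engine: dict) -> list:
    """Merge per-engine URL lists into one list ordered by weighted score."""
    scores = defaultdict(float)
    for engine, urls in per_engine.items():
        weight = ENGINE_WEIGHTS.get(engine, 1.0)  # unlisted engines use 1.0
        for position, url in enumerate(urls, start=1):
            # Higher-weight engines and earlier positions contribute more;
            # a URL returned by several engines accumulates score.
            scores[url] += weight / position
    return sorted(scores, key=lambda url: scores[url], reverse=True)
```

With `{"google": ["a", "b"], "bing": ["b", "a"]}`, result `a` scores 1.2 + 0.45 and `b` scores 0.6 + 0.9, so Google's higher weight decides the order.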
Engine Suspension
When a search engine returns an error (CAPTCHA, rate limit, access denied), SearXNG suspends it for a configurable duration. After the suspension expires, the engine is automatically retried.
The suspension times SurfSense configures are tuned for use with rotating residential proxies. They are shorter than SearXNG's defaults because each retry goes through a different IP:
| Error Type | Configured Suspension | SearXNG Default (without override) |
|---|---|---|
| Access Denied (403) | 1 hour | 24 hours |
| CAPTCHA | 1 hour | 24 hours |
| Too Many Requests (429) | 10 minutes | 1 hour |
| Cloudflare CAPTCHA | 2 hours | 15 days |
| Cloudflare Access Denied | 1 hour | 24 hours |
| reCAPTCHA | 2 hours | 7 days |
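The suspend-then-retry behavior can be modeled as a simple time check. The sketch below is illustrative only: the dictionary keys are hypothetical labels rather than SearXNG setting names, and the durations are the configured values from the table above, converted to seconds.

```python
# Configured suspension durations in seconds (from the table above).
SUSPENSION_SECONDS = {
    "access_denied": 3600,             # 403, 1 hour
    "captcha": 3600,                   # 1 hour
    "too_many_requests": 600,          # 429, 10 minutes
    "cloudflare_captcha": 7200,        # 2 hours
    "cloudflare_access_denied": 3600,  # 1 hour
    "recaptcha": 7200,                 # 2 hours
}


def suspended_until(error_type: str, failed_at: float) -> float:
    """Timestamp at which an engine becomes eligible for retry."""
    return failed_at + SUSPENSION_SECONDS[error_type]


def is_suspended(error_type: str, failed_at: float, now: float) -> bool:
    """True while the engine is still inside its suspension window."""
    return now < suspended_until(error_type, failed_at)
```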
Timeouts
| Setting | Value | Description |
|---|---|---|
| request_timeout | 12s | Default timeout per engine request |
| max_request_timeout | 20s | Maximum allowed timeout (must be ≥ request_timeout) |
| extra_proxy_timeout | 10s | Extra seconds added when using a proxy |
| retries | 1 | Retries on HTTP error (uses a different proxy IP per retry) |
Production: Residential Proxies
In production, search engines may rate-limit or block your server's IP. To avoid this, configure a residential proxy so SearXNG's outgoing requests appear to come from rotating residential IPs.
Step 1: Build the Proxy URL
SurfSense uses anonymous-proxies.net-style residential proxies, where the password is a base64-encoded JSON object. Build the URL from your proxy credentials:
# Encode the password (replace with your actual values)
echo -n '{"p": "YOUR_PASSWORD", "l": "LOCATION", "t": PROXY_TYPE}' | base64
The full proxy URL format is:
http://<username>:<base64_password>@<hostname>:<port>/
Step 2: Add to SearXNG Settings
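The proxy URL from Step 1 can also be built in a script before it goes into the settings file. The sketch below mirrors the shell command and URL format shown in Step 1; it assumes PROXY_TYPE is a number (it is unquoted in the JSON), and the function names are hypothetical.

```python
import base64
import json


def encode_proxy_password(password: str, location: str, proxy_type: int) -> str:
    """Base64-encode the JSON password object, like `echo -n ... | base64`."""
    # json.dumps produces the same '{"p": ..., "l": ..., "t": ...}' layout
    # as the shell example, with proxy_type emitted as a bare number.
    payload = json.dumps({"p": password, "l": location, "t": proxy_type})
    return base64.b64encode(payload.encode("utf-8")).decode("ascii")


def build_proxy_url(username: str, password: str, location: str,
                    proxy_type: int, hostname: str, port: int) -> str:
    """Assemble http://<username>:<base64_password>@<hostname>:<port>/."""
    encoded = encode_proxy_password(password, location, proxy_type)
    return f"http://{username}:{encoded}@{hostname}:{port}/"
```

Unlike the `base64` command-line tool on some systems, `base64.b64encode` never inserts line breaks, so the result is always a single line.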
In docker/searxng/settings.yml, add the proxy URL under outgoing.proxies:
outgoing:
  proxies:
    all://:
      - http://username:base64password@proxy-host:port/
The all://: key routes both HTTP and HTTPS requests through the proxy. If you have multiple proxy endpoints, list them and SearXNG will round-robin between them:
proxies:
  all://:
    - http://user:pass@proxy1:port/
    - http://user:pass@proxy2:port/
Step 3: Restart SearXNG
docker compose restart searxng
Verify
Check that SearXNG is healthy:
curl http://localhost:8888/healthz
Troubleshooting
SearXNG Fails to Start
ValueError: Invalid settings.yml - Check the error line above the traceback. Common causes:
- extra_proxy_timeout must be an integer (use 10, not 10.0)
- KeyError: 'engine_name' means an engine was removed but other engines reference its network. Remove all variants (e.g., removing qwant also requires removing qwant news, qwant images, qwant videos)
Engines Getting Suspended
If an engine is suspended (visible in SearXNG logs as suspended_time=N), it will automatically recover after the suspension period. With residential proxies, the next request after recovery goes through a different IP and typically succeeds.
No Web Search Results
- Check SearXNG health: curl http://localhost:8888/healthz
- Check SearXNG logs: docker compose logs searxng
- Verify the backend can reach SearXNG: the SEARXNG_DEFAULT_HOST env var should point to http://searxng:8080 (Docker) or http://localhost:8888 (local dev)
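The health check can also be scripted, for example as part of a monitoring job. This is a minimal sketch that assumes the `/healthz` endpoint used by the curl command above and treats any connection error as unhealthy; the function names are hypothetical.

```python
import urllib.error
import urllib.request


def healthz_url(base_url: str) -> str:
    """Return the health endpoint for a SearXNG base URL."""
    return f"{base_url.rstrip('/')}/healthz"


def is_healthy(base_url: str = "http://localhost:8888", timeout: float = 5.0) -> bool:
    """True if the health endpoint answers with HTTP 200."""
    try:
        with urllib.request.urlopen(healthz_url(base_url), timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, or an HTTP error status.
        return False
```

Run it against `http://localhost:8888` from the host, or against `http://searxng:8080` from a container on the Docker network.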
Proxy Not Working
- Verify the base64 password is correctly encoded
- Check that extra_proxy_timeout is set (proxies add latency)
- Ensure max_request_timeout is high enough to accommodate request_timeout + extra_proxy_timeout
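The timeout rules mentioned in this guide can be checked with simple arithmetic before restarting SearXNG. The sketch below encodes only those rules (integer `extra_proxy_timeout`, `max_request_timeout` at least `request_timeout`, and room for the proxy overhead); it does not reproduce SearXNG's own validation, and `check_timeouts` is a hypothetical helper.

```python
def check_timeouts(request_timeout, max_request_timeout, extra_proxy_timeout):
    """Return a list of problems found in the timeout settings (empty if none)."""
    problems = []
    if not isinstance(extra_proxy_timeout, int):
        problems.append("extra_proxy_timeout must be an integer")
    if max_request_timeout < request_timeout:
        problems.append("max_request_timeout must be >= request_timeout")
    elif max_request_timeout < request_timeout + extra_proxy_timeout:
        # The cap is valid on its own but smaller than request + proxy overhead.
        problems.append("max_request_timeout leaves no room for extra_proxy_timeout")
    return problems
```

With the values from the Timeouts table (12, 20, 10), the last rule reports that 12 + 10 exceeds 20, so it may be worth confirming how your SearXNG version applies the cap when a proxy is in use.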
Environment Variables Reference
| Variable | Location | Description | Default |
|---|---|---|---|
| SEARXNG_DEFAULT_HOST | docker/.env | URL of the SearXNG instance | http://searxng:8080 |
| SEARXNG_SECRET | docker/.env | Secret key for SearXNG | surfsense-searxng-secret |
| SEARXNG_PORT | docker/.env | Port to expose SearXNG UI on the host | 8888 |