SurfSense Docs

Web Search

How SurfSense web search works and how to configure it for production with residential proxies

SurfSense uses SearXNG as a bundled meta-search engine to provide web search across all search spaces. SearXNG aggregates results from multiple search engines (Google, DuckDuckGo, Brave, Bing, and more) without requiring any API keys.

How It Works

When a user triggers a web search in SurfSense:

  1. The backend sends a query to the bundled SearXNG instance via its JSON API
  2. SearXNG fans out the query to all enabled search engines simultaneously
  3. Results are aggregated, deduplicated, and ranked by engine weight
  4. The backend receives merged results and presents them to the user
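
The request in step 1 can be sketched as follows. This is a minimal illustration, not SurfSense's actual backend code: it assumes SearXNG's standard /search endpoint with format=json (which must be enabled under search.formats in settings.yml), and the helper names build_search_url, extract_results, and search are hypothetical.

```python
import json
import os
import urllib.parse
import urllib.request

# Same default this page gives for the bundled container.
SEARXNG_HOST = os.environ.get("SEARXNG_DEFAULT_HOST", "http://searxng:8080")


def build_search_url(query: str, host: str = SEARXNG_HOST) -> str:
    """Return the SearXNG JSON search URL for a query."""
    params = urllib.parse.urlencode({"q": query, "format": "json"})
    return f"{host.rstrip('/')}/search?{params}"


def extract_results(payload: dict) -> list[dict]:
    """Keep a few commonly used fields from each merged result."""
    return [
        {"title": r.get("title", ""), "url": r["url"], "engines": r.get("engines", [])}
        for r in payload.get("results", [])
        if "url" in r  # skip entries without a URL
    ]


def search(query: str, host: str = SEARXNG_HOST, timeout: float = 20.0) -> list[dict]:
    """Send the query to SearXNG and return the merged results."""
    with urllib.request.urlopen(build_search_url(query, host), timeout=timeout) as resp:
        return extract_results(json.load(resp))
```

Calling search("surfsense") from a container on the same Docker network would return the merged result list, assuming the SearXNG container is running and reachable at the configured host.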

SearXNG runs as a Docker container alongside the backend. It is never exposed to the internet. Only the backend communicates with it over the internal Docker network.

Docker Setup

SearXNG is included in both docker-compose.yml and docker-compose.dev.yml and works out of the box.

The backend connects to SearXNG automatically via the SEARXNG_DEFAULT_HOST environment variable (defaults to http://searxng:8080).

Disabling SearXNG

If you don't need web search, you can skip the SearXNG container entirely:

docker compose up --scale searxng=0

Using Your Own SearXNG Instance

To point SurfSense at an external SearXNG instance instead of the bundled one, set the following in docker/.env:

SEARXNG_DEFAULT_HOST=http://your-searxng:8080

Configuration

SearXNG is configured via docker/searxng/settings.yml. The key sections are:

Engines

SearXNG queries multiple search engines in parallel. Each engine has a weight that influences how its results rank in the merged output:

| Engine | Weight | Notes |
| --- | --- | --- |
| Google | 1.2 | Highest priority, best general results |
| DuckDuckGo | 1.1 | Strong privacy-focused alternative |
| Brave | 1.0 | Independent search index |
| Bing | 0.9 | Different index from Google |
| Wikipedia | 0.8 | Encyclopedic results |
| StackOverflow | 0.7 | Technical/programming results |
| Yahoo | 0.7 | Powered by Bing's index |
| Wikidata | 0.6 | Structured data results |
| Currency | default | Currency conversion |
| DDG Definitions | default | Instant answers from DuckDuckGo |

All engines are free. SearXNG scrapes public search pages, so no API keys are required.
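
To make the effect of weights concrete, here is a simplified sketch of weighted merging. It is an illustration only and not SearXNG's actual scoring code: it assumes each appearance of a URL contributes weight / position, that engines listed as "default" count as 1.0, and that engines are keyed by the lowercase names used below. The function name merge_and_rank is hypothetical.

```python
from collections import defaultdict

# Weights from the engine table; engines without an entry are assumed to be 1.0.
ENGINE_WEIGHTS = {
    "google": 1.2, "duckduckgo": 1.1, "brave": 1.0, "bing": 0.9,
    "wikipedia": 0.8, "stackoverflow": 0.7, "yahoo": 0.7, "wikidata": 0.6,
}


def merge_and_rank(per_engine: dict[str, list[str]]) -> list[tuple[str, float]]:
    """Merge per-engine URL lists, deduplicate by URL, and rank by weighted score.

    Each appearance adds weight / position, so URLs returned by several
    engines accumulate score and a top hit from a heavier engine counts more.
    """
    scores: dict[str, float] = defaultdict(float)
    for engine, urls in per_engine.items():
        weight = ENGINE_WEIGHTS.get(engine, 1.0)
        for position, url in enumerate(urls, start=1):
            scores[url] += weight / position
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

For example, if Google returns pages A then B, and Bing returns B then C, then B scores 1.2/2 + 0.9/1 = 1.5 and ranks above A (1.2) and C (0.45).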

Engine Suspension

When a search engine returns an error (CAPTCHA, rate limit, access denied), SearXNG suspends it for a configurable duration. After the suspension expires, the engine is automatically retried.

SurfSense's bundled settings override SearXNG's default suspension times with shorter ones, tuned for rotating residential proxies (shorter bans are safe because each retry goes through a different IP):

| Error Type | SurfSense Suspension | SearXNG Default (without override) |
| --- | --- | --- |
| Access Denied (403) | 1 hour | 24 hours |
| CAPTCHA | 1 hour | 24 hours |
| Too Many Requests (429) | 10 minutes | 1 hour |
| Cloudflare CAPTCHA | 2 hours | 15 days |
| Cloudflare Access Denied | 1 hour | 24 hours |
| reCAPTCHA | 2 hours | 7 days |
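
The suspension behaviour can be pictured with a small sketch. It only restates the table above as data and computes when an engine becomes eligible again. The dictionary keys are the table's labels rather than real settings.yml keys, and retry_time is a hypothetical helper, not part of SearXNG.

```python
from datetime import datetime, timedelta

# (configured suspension, default without override) per error type, from the table.
SUSPENSION = {
    "Access Denied (403)": (timedelta(hours=1), timedelta(hours=24)),
    "CAPTCHA": (timedelta(hours=1), timedelta(hours=24)),
    "Too Many Requests (429)": (timedelta(minutes=10), timedelta(hours=1)),
    "Cloudflare CAPTCHA": (timedelta(hours=2), timedelta(days=15)),
    "Cloudflare Access Denied": (timedelta(hours=1), timedelta(hours=24)),
    "reCAPTCHA": (timedelta(hours=2), timedelta(days=7)),
}


def retry_time(error_type: str, failed_at: datetime, use_override: bool = True) -> datetime:
    """Return when an engine becomes eligible again after the given error."""
    override, default = SUSPENSION[error_type]
    return failed_at + (override if use_override else default)
```

An engine that hits a 429 at 12:00 is retried from 12:10 with the shortened setting, compared with 13:00 under the default.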

Timeouts

| Setting | Value | Description |
| --- | --- | --- |
| request_timeout | 12s | Default timeout per engine request |
| max_request_timeout | 20s | Maximum allowed timeout (must be ≥ request_timeout) |
| extra_proxy_timeout | 10s | Extra seconds added when using a proxy |
| retries | 1 | Retries on HTTP error (uses a different proxy IP per retry) |
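
The constraints this page states for these settings can be checked before restarting the container. The sketch below is a hypothetical helper (check_timeouts is not part of SearXNG or SurfSense); it encodes only the rules given on this page: max_request_timeout must be at least request_timeout, extra_proxy_timeout must be an integer, and with a proxy the maximum should accommodate request_timeout + extra_proxy_timeout.

```python
def check_timeouts(request_timeout: float, max_request_timeout: float,
                   extra_proxy_timeout: int, using_proxy: bool = True) -> list[str]:
    """Return a list of problems found in the outgoing timeout settings."""
    problems = []
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(extra_proxy_timeout, bool) or not isinstance(extra_proxy_timeout, int):
        problems.append("extra_proxy_timeout must be an integer (use 10, not 10.0)")
    if max_request_timeout < request_timeout:
        problems.append("max_request_timeout must be >= request_timeout")
    if using_proxy:
        needed = request_timeout + extra_proxy_timeout
        if max_request_timeout < needed:
            problems.append(
                f"max_request_timeout ({max_request_timeout}s) is below "
                f"request_timeout + extra_proxy_timeout ({needed}s)"
            )
    return problems
```

With the values from the table (12, 20, 10) and a proxy in use, the check reports that 12 + 10 = 22 exceeds 20. Treat that as a prompt to review max_request_timeout rather than as proof of a misconfiguration, since how the cap is applied depends on SearXNG itself.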

Production: Residential Proxies

In production, search engines may rate-limit or block your server's IP. To avoid this, configure a residential proxy so SearXNG's outgoing requests appear to come from rotating residential IPs.

Step 1: Build the Proxy URL

SurfSense uses anonymous-proxies.net-style residential proxies, where the password is a base64-encoded JSON object. Build the URL from your proxy credentials:

# Encode the password (replace with your actual values)
echo -n '{"p": "YOUR_PASSWORD", "l": "LOCATION", "t": PROXY_TYPE}' | base64

The full proxy URL format is:

http://<username>:<base64_password>@<hostname>:<port>/
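
The same URL can be assembled programmatically, which avoids shell quoting mistakes. This is a sketch under assumptions: the JSON keys p, l, and t come from the command above, the proxy type is assumed to be a number (it appears unquoted there), and build_proxy_url is a hypothetical helper. No percent-encoding is applied, matching the format shown above.

```python
import base64
import json


def build_proxy_url(username: str, password: str, location: str, proxy_type: int,
                    hostname: str, port: int) -> str:
    """Build the proxy URL with a base64-encoded JSON password."""
    # Same JSON shape as the echo command: {"p": ..., "l": ..., "t": ...}
    payload = json.dumps({"p": password, "l": location, "t": proxy_type})
    encoded = base64.b64encode(payload.encode("utf-8")).decode("ascii")
    return f"http://{username}:{encoded}@{hostname}:{port}/"
```

Decoding the password part of the result with base64.b64decode and json.loads should give back the original values, which is a quick way to confirm the encoding is correct.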

Step 2: Add to SearXNG Settings

In docker/searxng/settings.yml, add the proxy URL under outgoing.proxies:

outgoing:
  proxies:
    all://:
      - http://username:base64password@proxy-host:port/

The all://: key routes both HTTP and HTTPS requests through the proxy. If you have multiple proxy endpoints, list them and SearXNG will round-robin between them:

  proxies:
    all://:
      - http://user:pass@proxy1:port/
      - http://user:pass@proxy2:port/
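
Round-robin here means each outgoing request takes the next proxy in the list and wraps around at the end. The following sketch shows the idea only; it is not SearXNG's implementation, and proxy_rotation is a hypothetical name.

```python
from itertools import cycle

# Placeholder endpoints, mirroring the list above.
PROXIES = [
    "http://user:pass@proxy1:port/",
    "http://user:pass@proxy2:port/",
]


def proxy_rotation(proxies: list[str]):
    """Return an endless iterator that walks the proxies in order."""
    return cycle(proxies)
```

Successive next() calls on the returned iterator alternate between proxy1 and proxy2.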

Step 3: Restart SearXNG

docker compose restart searxng

Verify

Check that SearXNG is healthy:

curl http://localhost:8888/healthz
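
The same check can be scripted, for example in a deployment smoke test. This sketch assumes the /healthz endpoint and the default host port 8888 from this page; is_healthy is a hypothetical helper.

```python
import urllib.request


def is_healthy(base_url: str = "http://localhost:8888", timeout: float = 5.0) -> bool:
    """Return True if GET <base_url>/healthz answers with HTTP 200."""
    try:
        with urllib.request.urlopen(f"{base_url.rstrip('/')}/healthz", timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # Connection refused, timeout, DNS failure, or an HTTP error status.
        return False
```

The function returns False rather than raising when SearXNG is unreachable, so it can be used directly in a retry loop.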

Troubleshooting

SearXNG Fails to Start

If startup fails with ValueError: Invalid settings.yml, check the error line above the traceback. Common causes:

  • extra_proxy_timeout must be an integer (use 10, not 10.0)
  • KeyError: 'engine_name' means an engine was removed but other engines reference its network. Remove all variants (e.g., removing qwant also requires removing qwant news, qwant images, qwant videos)

Engines Getting Suspended

If an engine is suspended (visible in SearXNG logs as suspended_time=N), it will automatically recover after the suspension period. With residential proxies, the next request after recovery goes through a different IP and typically succeeds.

No Web Search Results

  1. Check SearXNG health: curl http://localhost:8888/healthz
  2. Check SearXNG logs: docker compose logs searxng
  3. Verify the backend can reach SearXNG: the SEARXNG_DEFAULT_HOST env var should point to http://searxng:8080 (Docker) or http://localhost:8888 (local dev)

Proxy Not Working

  • Verify the base64 password is correctly encoded
  • Check that extra_proxy_timeout is set (proxies add latency)
  • Ensure max_request_timeout is high enough to accommodate request_timeout + extra_proxy_timeout

Environment Variables Reference

| Variable | Location | Description | Default |
| --- | --- | --- | --- |
| SEARXNG_DEFAULT_HOST | docker/.env | URL of the SearXNG instance | http://searxng:8080 |
| SEARXNG_SECRET | docker/.env | Secret key for SearXNG | surfsense-searxng-secret |
| SEARXNG_PORT | docker/.env | Port to expose SearXNG UI on the host | 8888 |
