Security & Privacy in IntentForge v2
Privacy is not a feature in IntentForge; it is a foundational mandate. This document explains how we protect user data and ensure the integrity of the search discovery process.
Privacy Pillars
1. Zero-Knowledge Search
IntentForge is designed to operate without ever knowing who the user is.
- No Tracking: We do not use cookies, fingerprinting, or session tracking.
- Local Caching: Search history (if enabled) is stored locally on your instance and never synced to a central server.
- Anonymized Upstreams: All requests to external providers (Google, Bing, etc.) are routed through the Tor Network or Cloudflare Edge Proxies, so the provider never sees your instance's IP address.
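The local-caching rule above can be illustrated with a minimal sketch. It assumes history is kept in a SQLite file on the instance's own disk. The `LocalHistory` class, its table layout, and the `enabled` flag are hypothetical names chosen for illustration, not IntentForge's actual storage code.

```python
import sqlite3
import time


class LocalHistory:
    """Opt-in search history kept in a local SQLite file (hypothetical helper)."""

    def __init__(self, path, enabled=False):
        # History is off unless explicitly enabled, matching "if enabled" above.
        self.enabled = enabled
        # The database is a local file (or ":memory:"); nothing here opens a socket.
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS history (ts REAL NOT NULL, query TEXT NOT NULL)"
        )

    def record(self, query):
        # When history is disabled the query is dropped without being written.
        if not self.enabled:
            return
        with self.db:
            self.db.execute(
                "INSERT INTO history VALUES (?, ?)", (time.time(), query)
            )

    def recent(self, limit=10):
        # Newest entries first, ordered by insertion.
        rows = self.db.execute(
            "SELECT query FROM history ORDER BY rowid DESC LIMIT ?", (limit,)
        )
        return [query for (query,) in rows]
```

Because the store has no sync or upload path at all, "never synced" holds by construction rather than by configuration.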
2. Outbound Network Anonymity (Tor & Snowflake)
We use Tor Snowflake (v2.12.1+) as the primary transport layer for meta-search requests. Note: this anonymizes outgoing requests from the engine to search providers, which also helps prevent IP blocking; it does not route user-to-server traffic through Tor.
- Censorship Circumvention: Snowflake bridges allow IntentForge to function even in environments where Tor is blocked.
- Provider IP Rotation: Every search fan-out utilizes fresh circuits, preventing upstream providers from profiling or blocking your instance based on request patterns.
- SQS Rendezvous: In Docker environments, we use SQS-based rendezvous for Snowflake to ensure reliable bridge discovery.
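One common way to get a fresh circuit per fan-out is Tor's stream isolation by SOCKS credentials: with the default `IsolateSOCKSAuth` behavior, streams that present different SOCKS usernames and passwords are not placed on the same circuit. The sketch below assumes a local Tor client listening on `127.0.0.1:9050`. The function name is hypothetical, and IntentForge's actual rotation mechanism may differ.

```python
import secrets


def isolated_proxy_url(host="127.0.0.1", port=9050):
    """Return a SOCKS5 proxy URL with throwaway credentials for one fan-out."""
    # Tor does not authenticate these values; it only uses them to decide
    # which streams are allowed to share a circuit.
    user = secrets.token_hex(8)
    password = secrets.token_hex(8)
    # The "socks5h" scheme asks the HTTP client to resolve hostnames through
    # the proxy too, so DNS lookups do not bypass Tor.
    return f"socks5h://{user}:{password}@{host}:{port}"


# One URL per search: every provider request in that fan-out reuses it,
# and the next search gets new credentials and therefore new circuits.
proxy_for_this_search = isolated_proxy_url()
```

The returned URL can be passed to any HTTP client that accepts SOCKS proxy URLs. The host and port are assumptions about a typical local setup.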
3. Zero-Trust Architecture (Planned)
We are moving towards a fully Zero-Trust layer for all internal and external communication:
- Request Integrity: Cryptographic signing of requests so that tampering is detectable, guarding against "Man-in-the-Middle" (MITM) attacks in which an ISP or other attacker injects malicious results.
- Result Verification: Validating the source of discovered content through decentralized reputation signals.
- Secure Microservices: All internal traffic between the Rust core and services like query_layer or trafilatura will be mTLS (mutual TLS) encrypted.
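Since request signing is still planned, the following is only a sketch of the general idea, not the project's design. It assumes a shared secret key and HMAC-SHA256 over a canonical JSON encoding; a real deployment might use asymmetric signatures instead. The function names and payload fields are hypothetical.

```python
import hashlib
import hmac
import json


def sign_request(key, payload):
    """Return a hex HMAC-SHA256 tag over a canonical JSON encoding of payload."""
    # Sorted keys and fixed separators make the encoding deterministic,
    # so sender and receiver sign exactly the same bytes.
    body = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hmac.new(key, body, hashlib.sha256).hexdigest()


def verify_request(key, payload, tag):
    """Return True only if the tag matches the payload under this key."""
    expected = sign_request(key, payload)
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(expected, tag)
```

A receiver that gets a payload whose tag fails `verify_request` can discard it, which is how tampering in transit becomes detectable.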
Security Features
Anti-Detection & Bot Mitigation
When IntentForge crawls the web to enrich its index, it employs advanced anti-detection techniques:
- Browser Fingerprint Randomization: Mimics legitimate browser headers and behavior.
- Rate Limit Adherence: Respects robots.txt and implements intelligent back-off strategies to avoid stressing target servers.
- TLS Fingerprint Masking: Uses specialized TLS stacks (via utls) to match common browser fingerprints, making crawler traffic indistinguishable from human traffic.
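The robots.txt check and back-off behavior described above can be sketched with the Python standard library. This is an illustration under assumptions: the user-agent string `IntentForgeBot`, the base delay, and the cap are hypothetical values, and the back-off shown is exponential with full jitter, which may not be the exact strategy the crawler uses.

```python
import random
from urllib.robotparser import RobotFileParser


def load_rules(robots_txt):
    """Parse robots.txt text that has already been fetched."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def backoff_delay(attempt, base=1.0, cap=60.0):
    """Pick a random delay in [0, min(cap, base * 2**attempt)] seconds."""
    # The ceiling doubles with each failed attempt until it reaches the cap.
    ceiling = min(cap, base * (2 ** attempt))
    # Full jitter spreads retries out so they do not arrive in bursts.
    return random.uniform(0.0, ceiling)


rules = load_rules("User-agent: *\nDisallow: /private/\n")
# Check every URL against the parsed rules before fetching it.
allowed = rules.can_fetch("IntentForgeBot", "https://example.com/private/page")
```

A crawler loop would skip any URL for which `can_fetch` returns `False`, and sleep for `backoff_delay(attempt)` seconds after a throttled or failed response before retrying.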
Content Safety
- Spam Filtering: Automatic detection of SEO-spam and low-quality domains.
- Malware Protection: Pre-scanning URLs against known threat databases before discovery.
- Binary Quantization Security: Using hardware-accelerated vector lookups that are resistant to side-channel timing attacks.
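For readers unfamiliar with binary quantization, here is a minimal sketch of the idea, assuming sign-based quantization and Hamming distance. The function names are hypothetical. Pure Python makes no constant-time guarantees; the point is only that each comparison is one XOR plus one bit count, with no early exit that depends on the stored values, which is the property hardware popcount instructions provide natively.

```python
def quantize(vector):
    """Pack the sign of each component into one bit of an integer."""
    bits = 0
    for i, value in enumerate(vector):
        # Bit i is set when component i is positive.
        if value > 0:
            bits |= 1 << i
    return bits


def hamming(a, b):
    """Number of bit positions where two quantized vectors differ."""
    return bin(a ^ b).count("1")


def nearest(query, codes):
    """Index of the closest code; scans every code, with no early exit."""
    distances = [hamming(query, code) for code in codes]
    return distances.index(min(distances))
```

For example, `quantize([0.5, -1.0, 2.0, -0.1])` sets bits 0 and 2, and `nearest` compares that code against every stored code before picking the one with the smallest distance.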
Contributing to Privacy
If you find a potential privacy leak or security vulnerability, please report it immediately. We prioritize "privacy-by-design" and welcome audits of our transport and indexing logic.