IntentForge Architecture — How We Built a Privacy-First Search Engine with Tor
“Google knows what you searched for. Your ISP sees every site you visit. IntentForge changes the game.” Most search engines treat privacy as an afterthought. IntentForge was built with Tor integration from day one: routing every query through Snowflake bridges, matching intent instead of keywords, and running on a self-improving binary-quantized index. No logs. No tracking. No manipulation. Just a search engine that respects you. Read how we built a privacy-first search engine on a $20 VPS.
Google knows what you searched for. Your ISP knows every site you visited. Your government can demand logs. The current web is a surveillance infrastructure with a search interface.
IntentForge was built to change this.
The Core Problem: Metadata
Even when you use HTTPS, your DNS queries and connection metadata leak your intent. Your ISP sees what domains you resolve. Cloudflare sees what IPs you connect to. Using a search engine without Tor is like mailing a sealed letter: the content is hidden, but the address on the envelope is visible to everyone who handles it.
Tor Integration: First-Class from Day One
IntentForge routes all queries through the Tor network by default. Not as an option. Not as a "privacy mode." As the primary interface.
We use Snowflake bridges to connect to Tor, making it harder for network observers to detect Tor usage patterns. Every search request is routed through a different exit node, making query correlation across requests practically impossible.
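One way to keep consecutive queries on separate circuits can be sketched as follows. This is not IntentForge's actual code: it assumes a local Tor client listening on the usual SOCKS port (127.0.0.1:9050) and relies on Tor's IsolateSOCKSAuth behaviour, where streams presenting different SOCKS credentials are kept on different circuits. The function names are hypothetical. Separate circuits make it likely, but do not by themselves guarantee, that two requests leave through different exit nodes.

```python
import secrets

# Assumption: a local Tor client exposes its SOCKS port here (Tor's usual default).
TOR_SOCKS_HOST = "127.0.0.1"
TOR_SOCKS_PORT = 9050


def isolated_proxy_url(host: str = TOR_SOCKS_HOST, port: int = TOR_SOCKS_PORT) -> str:
    """Build a SOCKS proxy URL with throwaway credentials.

    Tor isolates streams that present different SOCKS credentials onto
    different circuits, so a fresh random username and password per search
    keeps consecutive queries from sharing a circuit. The "socks5h" scheme
    asks the HTTP client to resolve DNS through the proxy as well, so
    lookups do not leak to the local resolver.
    """
    user = secrets.token_hex(8)
    password = secrets.token_hex(8)
    return f"socks5h://{user}:{password}@{host}:{port}"


def proxies_for_request() -> dict:
    """Return a proxies mapping in the shape many HTTP clients accept."""
    url = isolated_proxy_url()
    return {"http": url, "https": url}
```

The returned mapping can be handed to an HTTP client with SOCKS support for each outgoing search. Generating it per request, rather than once at startup, is what provides the isolation.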
Intent-First Architecture
Traditional search engines match keywords. IntentForge matches intent.
When you search "best laptop for coding," a keyword engine returns pages with those exact words. IntentForge understands that you're evaluating purchasing decisions, so it surfaces reviews, comparisons, and developer forum discussions — even if none contain the phrase "best laptop for coding."
How it works:
- Query parsing: the intent extraction layer breaks queries into structured intent objects of the form { action, target, constraints, context }
- Tor-routed meta-search: the structured intent is sent through Tor to multiple search backends simultaneously
- Vector scoring: results are embedded and scored using binary quantized vectors (384 dimensions packed into 48 bytes per embedding)
- Self-improving index: implicit feedback from clicks, dwell time, and reformulations updates the index in real time
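The structured intent object from the first step can be sketched as a small data class. The field names come from the list above. Everything else is an assumption for illustration: the cue-word table, the constraint markers, and the `parse_intent` helper are hypothetical, and a real extractor would more likely be a learned model than a lookup table.

```python
from dataclasses import dataclass, field


@dataclass
class Intent:
    action: str                                       # what the user is trying to do
    target: str                                       # the thing the query is about
    constraints: list = field(default_factory=list)   # qualifiers such as "for coding"
    context: str = ""                                 # here: simply the original query


# Hypothetical cue words mapped to actions (illustration only).
ACTION_CUES = {"best": "evaluate", "vs": "compare", "how": "learn", "buy": "purchase"}
# Hypothetical words that introduce a constraint.
CONSTRAINT_MARKERS = ("for", "under", "with", "without")


def parse_intent(query: str) -> Intent:
    """Toy rule-based parser: cue word -> action, marker word -> constraint."""
    words = query.lower().split()
    # The first cue word decides the action; fall back to a generic "find".
    action = next((ACTION_CUES[w] for w in words if w in ACTION_CUES), "find")
    # Everything from the first constraint marker onward is treated as a constraint.
    split_at = next(
        (i for i, w in enumerate(words) if w in CONSTRAINT_MARKERS), len(words)
    )
    target = " ".join(w for w in words[:split_at] if w not in ACTION_CUES)
    constraints = [" ".join(words[split_at:])] if split_at < len(words) else []
    return Intent(action=action, target=target, constraints=constraints, context=query)
```

With these toy rules, "best laptop for coding" becomes an `evaluate` action on the target `laptop` with the constraint `for coding`, which is the kind of structure the later stages can match against instead of raw keywords.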
Binary Quantized Vectors: 32× Compression
Storing full 384-dimensional float vectors for every indexed document is expensive: at 4 bytes per float32 value, each embedding takes 1,536 bytes. IntentForge uses binary quantization, mapping each float vector to a 48-byte binary code (one bit per dimension, 384 / 8 = 48 bytes) while retaining ~92% of retrieval accuracy. That is a 32× reduction in vector storage, which lets us run the full index on modest hardware while maintaining sub-50ms P95 latency.
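The simplest form of binary quantization keeps only the sign of each dimension and compares codes by Hamming distance. The sketch below shows that baseline in pure Python. It is not necessarily the scheme IntentForge uses, and the function names are hypothetical.

```python
def quantize(vec) -> bytes:
    """Sign-threshold binary quantization: one bit per dimension.

    A dimension contributes a 1 bit if its value is positive, else a 0 bit,
    so a 384-dimensional vector packs into 384 / 8 = 48 bytes.
    """
    if len(vec) % 8 != 0:
        raise ValueError("vector length must be a multiple of 8")
    out = bytearray(len(vec) // 8)
    for i, x in enumerate(vec):
        if x > 0:
            out[i // 8] |= 1 << (i % 8)
    return bytes(out)


def hamming(a: bytes, b: bytes) -> int:
    """Number of differing bits between two codes (lower means more similar)."""
    if len(a) != len(b):
        raise ValueError("codes must have the same length")
    diff = int.from_bytes(a, "big") ^ int.from_bytes(b, "big")
    return bin(diff).count("1")
```

Hamming distance needs only an XOR and a bit count, which is why binary codes are cheap to compare on CPUs without any GPU support.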
Self-Improving Index: Learning from Searches
Most search engines update their index on a fixed schedule — hourly, daily, weekly. IntentForge updates its index based on query intent signals. When users consistently reformulate a query in a certain way, the intent extractor learns. When users click results lower in the ranking, the vector scorer adjusts. When new content matches a query pattern, the crawler prioritizes it. This creates a feedback loop where the search engine gets better at matching intent without manual curation.
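The click part of this feedback loop can be illustrated with a small scorer that learns a per-intent boost for each document. This is a sketch under stated assumptions, not the project's implementation: the class name, the learning rate, and the rank-based signal are all hypothetical, and the boosts are keyed by intent rather than by user, on the assumption that this fits the no-logs design.

```python
from collections import defaultdict


class FeedbackScorer:
    """Toy feedback loop: clicks nudge a learned boost per (intent, document)."""

    def __init__(self, learning_rate: float = 0.1):
        self.learning_rate = learning_rate
        self.boost = defaultdict(float)  # (intent_key, doc_id) -> learned boost

    def record_click(self, intent_key: str, doc_id: str, rank: int) -> None:
        """Register a click on the result shown at 1-based position `rank`.

        A click far down the list suggests the ranking undervalued that
        document, so the signal grows with rank: 0.5 at rank 1, 0.8 at rank 4.
        """
        signal = 1.0 - 1.0 / (rank + 1)
        key = (intent_key, doc_id)
        # Exponential moving average: move the boost a fraction toward the signal.
        self.boost[key] += self.learning_rate * (signal - self.boost[key])

    def score(self, intent_key: str, doc_id: str, base_score: float) -> float:
        """Combine the vector-similarity score with the learned boost."""
        return base_score + self.boost.get((intent_key, doc_id), 0.0)
```

Because each update moves the boost only a fraction of the way toward the new signal, a single stray click has limited effect, while consistent behaviour across many searches accumulates.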
Anti-Signals Filtering: No Manipulation
We actively filter anti-signals — SEO manipulation, paid placements, clickbait patterns, and known misinformation sources. This is expensive to compute but essential for maintaining result quality.
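A heavily simplified version of such a filter might assign each result a penalty and drop anything above a threshold. The patterns, the blocked domain, the weights, and the result fields below are placeholders chosen for illustration. The source does not describe the actual rules, which are presumably far more elaborate.

```python
import re

# Hypothetical clickbait patterns (illustration only).
CLICKBAIT_PATTERNS = [
    re.compile(p, re.IGNORECASE)
    for p in (
        r"you won'?t believe",
        r"\bshocking\b",
        r"\d+ (?:tricks|secrets|hacks)\b",
        r"doctors hate",
    )
]
# Hypothetical blocklist of known low-quality sources.
BLOCKED_DOMAINS = {"example-content-farm.test"}


def anti_signal_penalty(title: str, domain: str, is_sponsored: bool) -> float:
    """Return a penalty in [0, 1]; higher means more likely manipulated."""
    if domain in BLOCKED_DOMAINS or is_sponsored:
        return 1.0
    hits = sum(1 for pattern in CLICKBAIT_PATTERNS if pattern.search(title))
    return min(1.0, 0.3 * hits)


def filter_results(results: list, threshold: float = 0.5) -> list:
    """Keep results whose penalty stays below the threshold."""
    return [
        r
        for r in results
        if anti_signal_penalty(r["title"], r["domain"], r.get("sponsored", False))
        < threshold
    ]
```

Pattern matching like this is cheap. The expensive part the text alludes to would be the signals that need page content or link analysis, which this sketch leaves out.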
What's Under the Hood
- Backend: FastAPI + Redis caching
- Crawler: Go-based with BadgerDB for persistence
- Search: Meilisearch with custom intent scoring
- Embedding: ONNX Runtime (local inference, no data leaves the stack)
- Privacy layer: Tor + Snowflake bridges
- Vector storage: Binary quantized embeddings
Open Source
IntentForge is fully open source under the IECL license. The code, architecture docs, and research notes are available at github.com/oxiverse-labs/intentforge.
The Bigger Picture
IntentForge is one piece of Oxiverse, a complete privacy-first ecosystem that includes a browser, productivity tools, and more. We believe privacy isn't a feature. It's the default.