knowledge graph infrastructure for AI

turn the open web into agent-ready intelligence.

Most AI agents are blind to the real-time web — traditional crawlers return messy, unstructured data. contextforce discovers, crawls, and parses multi-modal data across social, commerce, and map ecosystems, transforming the open web into structured, agent-ready intelligence endpoints.

// the problem

Agents need structure. The web hands them noise.

Today's LLMs can reason brilliantly — but only over what they're handed. Drop a raw URL into a crawler and you get walls of boilerplate HTML, obfuscated SPA payloads, and JS-rendered junk that has to be re-parsed every time.

contextforce closes the gap. One layer that handles discovery, proxy chaining, JS execution, and schema-aware extraction — so your agents call a clean endpoint and receive entities, not soup.

Five stages from raw web to structured intelligence.

Every endpoint runs the same backbone. Pick a stage as a primitive, or chain the whole flow.

01 Discover
Find candidate sources via search, autocomplete, AI mode, and knowledge graph.

02 Crawl
Direct, residential, datacenter, and headless-browser proxies behind one spec, with automatic routing.

03 Parse
Site-aware handlers for SPAs, embedded JSON, XHR capture, and markdown extraction.

04 Extract
Schema-driven entity extraction: POIs, products, posts, transcripts, flights.

05 Enrich
Cross-reference identifiers, cache canonical assets, attach geocoded + LLM-graded metadata.
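
To make "pick a stage as a primitive, or chain the whole flow" concrete, here is a minimal sketch of the five stages as composable functions. Everything in it is an assumption made for illustration: the `Context` fields, the stage bodies, and the `example.com` URL are hypothetical stand-ins, not contextforce's actual implementation or API.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Context:
    """State handed from stage to stage (all field names are hypothetical)."""
    query: str
    urls: list[str] = field(default_factory=list)
    pages: dict[str, str] = field(default_factory=dict)
    parsed: dict[str, dict] = field(default_factory=dict)
    entities: list[dict] = field(default_factory=list)


Stage = Callable[[Context], Context]


def discover(ctx: Context) -> Context:
    # Stand-in for search/autocomplete discovery: one fake candidate URL.
    ctx.urls = [f"https://example.com/search?q={ctx.query}"]
    return ctx


def crawl(ctx: Context) -> Context:
    # Stand-in for the proxy layer: return canned HTML instead of fetching.
    ctx.pages = {u: "<html><title>Senso-ji Temple</title></html>" for u in ctx.urls}
    return ctx


def parse(ctx: Context) -> Context:
    # Toy parser: pull the <title> text out of each page.
    for url, html in ctx.pages.items():
        start = html.find("<title>") + len("<title>")
        end = html.find("</title>")
        ctx.parsed[url] = {"title": html[start:end]}
    return ctx


def extract(ctx: Context) -> Context:
    # Map parsed documents onto a single entity shape.
    ctx.entities = [
        {"type": "poi", "name": doc["title"], "source": url}
        for url, doc in ctx.parsed.items()
    ]
    return ctx


def enrich(ctx: Context) -> Context:
    # Attach a stable identifier derived from the entity name.
    for entity in ctx.entities:
        entity["id"] = "poi:" + entity["name"].lower().replace(" ", "-")
    return ctx


PIPELINE: list[Stage] = [discover, crawl, parse, extract, enrich]


def run(ctx: Context, stages: list[Stage] = PIPELINE) -> Context:
    """Apply each stage in order; pass a shorter list to run a subset."""
    for stage in stages:
        ctx = stage(ctx)
    return ctx


result = run(Context(query="asakusa"))
print(result.entities)
```

Because every stage takes and returns the same `Context`, a single stage can be called on its own (for example `parse(ctx)` over pages you already hold) or the full list can be run end to end.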

Multi-modal data across the ecosystems agents need.

Three domains, one consistent shape. Every result speaks the same entity graph.

Social
Short-form video, posts, transcripts, vibes, and creator-driven POI signals: the layer where culture and intent live.
tiktok instagram youtube shorts

Commerce
Product search, visual lookup, pricing, and merchant entities, turning catalogs and listings into queryable rows.
amazon google-lens stylesnap serper

Maps & Travel
Places, hotels, flights, activities: geocoded, ranked, and stitched to the social mentions that originated them.
google-maps flights klook place-search

REST & MCP — same graph, two front doors.

One call, one entity, zero parsing.

Every endpoint returns the same normalized shape — IDs, geocoordinates, canonical images, source provenance. Cache-first, deterministic keys, ready for retrieval pipelines.

  • REST — flat HTTPS endpoints, OpenAPI-discoverable.
  • MCP — drop-in tool servers for Claude, Cursor, and agent frameworks.
  • R2-backed cache — canonical assets persisted; second call is free.
  • Auto proxy chain — direct → datacenter → residential → headless, transparently.
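
The cache-first lookup, deterministic keys, and proxy escalation described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the service's code: the key derivation (SHA-256 over sorted JSON parameters), the in-memory `dict` standing in for the R2 cache, and the fetcher callables are all hypothetical. Only the strategy order is taken from the list above.

```python
import hashlib
import json
from typing import Callable, Optional

# Escalation order as listed above; cheapest strategy first.
CHAIN = ["direct", "datacenter", "residential", "headless"]

Fetcher = Callable[[str], Optional[str]]


def cache_key(endpoint: str, params: dict) -> str:
    """Deterministic key: the same endpoint and params always hash the same.

    Sorting keys makes the result independent of parameter order.
    The hashing scheme itself is an assumption.
    """
    canonical = json.dumps(params, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(f"{endpoint}?{canonical}".encode("utf-8")).hexdigest()


def fetch_with_fallback(
    url: str,
    fetchers: dict[str, Fetcher],
    cache: dict[str, str],
    key: str,
) -> tuple[str, str]:
    """Return (body, origin), where origin is 'cache' or the strategy used."""
    if key in cache:
        return cache[key], "cache"
    for strategy in CHAIN:
        body = fetchers[strategy](url)  # a fetcher returns None on failure
        if body is not None:
            cache[key] = body
            return body, strategy
    raise RuntimeError(f"all proxy strategies failed for {url}")


# Fake fetchers: the first two strategies fail, the third succeeds.
fake_fetchers: dict[str, Fetcher] = {
    "direct": lambda url: None,
    "datacenter": lambda url: None,
    "residential": lambda url: "<html>ok</html>",
    "headless": lambda url: "<html>rendered</html>",
}

cache: dict[str, str] = {}
key = cache_key("/api/tiktok/video", {"url": "https://example.com/v/1"})

first = fetch_with_fallback("https://example.com/v/1", fake_fetchers, cache, key)
second = fetch_with_fallback("https://example.com/v/1", fake_fetchers, cache, key)
print(first[1], second[1])
```

In this sketch the second call never touches a fetcher, which is the behaviour the "second call is free" bullet describes.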
GET /api/tiktok/video

    // extract a TikTok video: itemStruct, POI, transcript, signed URLs.
    curl "https://mcp-hub.contextforce.com/api/tiktok/video?url=..."

    // response (excerpt)
    {
      "videoId": "7401234567890123456",
      "videoUrl": "https://v16-.../play.mp4",
      "poi": { "name": "Senso-ji Temple", "address": "2-3-1 Asakusa, Taito City" },
      "transcript": { "lang": "en", "lines": [...] },
      "item": { /* full itemStruct */ }
    }
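
The same call can be made from Python with only the standard library. The sketch below takes the host, path, and `url` parameter from the curl example and reads only the fields shown in the response excerpt. The helper names, the sample TikTok URL, and the assumption that the body is plain JSON with no auth header are illustrative, so check the endpoint catalog before relying on them.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "https://mcp-hub.contextforce.com"  # host from the curl example


def build_video_url(video_url: str) -> str:
    """Build the request URL, percent-encoding the target video URL."""
    return f"{BASE_URL}/api/tiktok/video?{urlencode({'url': video_url})}"


def fetch_video(video_url: str, timeout: float = 30.0) -> dict:
    """Call the endpoint and decode the JSON body (performs a network request)."""
    with urlopen(build_video_url(video_url), timeout=timeout) as resp:
        return json.load(resp)


def summarize(payload: dict) -> dict:
    """Pick out the fields shown in the excerpt, tolerating missing ones."""
    poi = payload.get("poi") or {}
    transcript = payload.get("transcript") or {}
    return {
        "video_id": payload.get("videoId"),
        "place": poi.get("name"),
        "address": poi.get("address"),
        "transcript_lang": transcript.get("lang"),
        "line_count": len(transcript.get("lines", [])),
    }


# Offline demo using values from the excerpt; the transcript lines are left
# empty because the excerpt elides them.
sample = {
    "videoId": "7401234567890123456",
    "poi": {"name": "Senso-ji Temple", "address": "2-3-1 Asakusa, Taito City"},
    "transcript": {"lang": "en", "lines": []},
}
request_url = build_video_url("https://www.tiktok.com/@user/video/7401234567890123456")
summary = summarize(sample)
print(request_url)
print(summary)
```

`fetch_video` is defined but not invoked here, so the demo runs without network access. In an agent you would call `summarize(fetch_video(url))`.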

Production-grade plumbing, built for agents.

30+ structured endpoints
5 proxy strategies
3 data ecosystems
cached on Cloudflare R2

Stop parsing HTML. Start shipping agents.

Pick an endpoint from the catalog and call it from your agent or notebook in under a minute. Local dev needs no key.