# AGENTS.md — kdcube.tech (marketing website)

This file gives AI coding agents (Claude Code, Codex, Cursor, Gemini CLI, Factory, etc.) the context they need to work on the **kdcube.tech marketing site**. The repo is the static site; the platform itself lives in `kdcube-ai-app/`.

For Claude-specific configuration, see [`CLAUDE.md`](../CLAUDE.md) at the parent repo. For the documentation-generation agent pipeline, see [`AGENTS.md`](../AGENTS.md) at the parent repo.

## TL;DR for an agent landing here

- **What this repo is**: a static HTML/CSS/JS marketing site at https://kdcube.tech.
- **Stack**: hand-written HTML, one shared `styles.css`, page-specific CSS for some pages (`why-what-how.css`, `docs/docs.css`), no build step.
- **Local dev**: `python -m http.server 8000` from the repo root.
- **Validate before push**: `node validate-anchors.js`.
- **Deploy**: pushed via GitHub Actions to S3 + CloudFront. Production = `main`. PR previews come up automatically.

## Project overview

KDCube is a self-hosted, MIT-licensed Python platform for AI applications. The site exists to:
1. Explain why KDCube exists (Why page — the developer journey arc).
2. Describe what it is (What page — the environment your agents live in).
3. Show how it works (How page — runtime architecture).
4. Answer common questions (Q&A).
5. Serve the docs hub (`docs/` and `docs.html`).
6. Document KDCube-specific terms (Glossary).
7. Compare KDCube to alternatives (Compare).
8. Host the blog.

## Voice and narrative arc

The current marketing voice follows a developer-journey arc — keep new copy consistent with it:

1. **It starts fun** — you build a useful agent, share with a colleague, they love it.
2. **Then you try to ship it company-wide** — productization burden hits: security, cost control, compliance, governance, secrets, multi-tenant isolation, per-customer configs, deploy, observability.
3. **So we built the environment** — KDCube is the self-hosted runtime where agents live, tools plug in, surfaces expose them — sharing one runtime for tenancy, budgets, governance, and observability.

The home page H1 is **"KDCube: More than chat agents"**. Don't break that.

## Coined / load-bearing terms

These words mean something specific on this site. If you use them, link to the glossary anchor:

| Term | Anchor | Notes |
|---|---|---|
| **Technosystem** | [`glossary.html#technosystem`](glossary.html#technosystem) | Coined here. Whole self-contained environment. Currently only used on the glossary; do **not** reintroduce it on the home page (the home uses "an environment"). |
| **Surface** | [`glossary.html#surface`](glossary.html#surface) | Exposure points: chat widget, smart API, MCP, dashboard, etc. |
| **Bundle** | [`glossary.html#bundle`](glossary.html#bundle) | Hot-loadable agent definition unit. |
| **ReAct loop / v2** | [`glossary.html#react-loop`](glossary.html#react-loop) | The included agent loop. |
| **Timeline-first** | [`glossary.html#timeline-first`](glossary.html#timeline-first) | Architecture pattern. |

Full term list lives in [`glossary.html`](glossary.html).

## File layout

```
website/
  index.html                # Home — manifest, pills, "Start here" panel
  why.html                  # Why KDCube — the journey arc (3 story beats)
  what.html                 # What KDCube is — pillar grid
  how.html                  # How it works — architecture pillars
  faq.html                  # Q&A
  glossary.html             # KDCube concepts + technical terms
  compare.html              # Comparison hub vs alternatives
  compare/                  # Per-competitor pages
  docs.html                 # Docs hub
  docs/                     # Docs topic pages + docs.css
    *.html                  # Each page has TechArticle JSON-LD
    sitemap.xml
  blog.html                 # Blog hub
  blog/                     # Blog posts
  widgets/                  # Embeddable widgets (e.g. bundle-runtime.html)
  assets/                   # Logos, OG images, tracking script
  styles.css                # Site-wide styles
  why-what-how.css          # Trio + Q&A page-specific styles
  robots.txt                # AI bot + general crawler rules
  llms.txt                  # LLM/agent index (hand-curated)
  sitemap.xml               # Sitemap index
  AGENTS.md                 # this file
  README.md                 # Repo README
  validate-anchors.js       # Anchor validator (Node, no deps)
```

## Conventions

### HTML

- 4-space indent, self-closing tags (`<meta />`, `<link />`).
- Each page has the same theme-script in `<head>` (light/dark via `localStorage`).
- Each primary page carries `<title>`, `<meta description>`, full Open Graph + Twitter card meta, `<link rel="canonical">`.
- JSON-LD lives in `<script type="application/ld+json">` blocks at end of `<head>`.

### CSS

- One shared `styles.css` (~7000 lines). Page-specific styles live next to the pages that need them.
- Cache-bust by bumping `?v=YYYYMMDDx` query suffix on every stylesheet `href` when CSS or HTML semantics change. **Update *all* references to a stylesheet on a page in the same commit.**
- Theme via CSS variables; light theme overrides under `[data-theme="light"]`.

### Navigation

- Top nav: Home / Why / What / How / Q&A / Docs / Blog / GitHub.
- Footer nav mirrors top nav (plus Glossary).
- Docs pages have prev/next pagers. Order: Concepts → Quick Start → Platform → Communication → Client Integration → Configuration → Deployment → Economics → Security → SDK → ReAct Agent.
- When adding a docs page: update prev/next on neighbors, add card on `docs.html` hub, add `<url>` entry in `docs/sitemap.xml`.

### Voice rules

- Lowercase "kdcube" only inside code spans / paths. Brand is "KDCube" (capital K, capital C).
- Headlines stay short and concrete — no marketing fluff.
- Don't undermine the arc by adding "fun back" copy in places that already use the calmer "makes life easier" register. The home manifest second paragraph is the calmer pivot; the Why page keeps the "fun" framing.

### Cache busting style

Currently in use: `styles.css?v=20260429c`. Increment the trailing letter for same-day bumps; rotate the date for new days.

## Don't do these

- Don't break header padding-top (`clamp(92px, 11vh, 108px)`) — the fixed header overlaps content otherwise.
- Don't widen the header `.container` past 1200px on per-page body classes — chrome will drift.
- Don't run `find`, `grep`, `cat`, `sed`, `awk` from the shell — use the Read/Grep/Edit tools.
- Don't auto-deploy on every commit — production deploys on `main` only, after PR review.
- Don't add new top-level pages without sitemap + footer-nav + AGENTS.md updates.
- Don't write into `docs/` without updating `docs/sitemap.xml`.
- Don't reintroduce `Schedule demo` or Calendly CTA on the home page CTAs (we replaced it with `View on GitHub`).

## How to test changes locally

```bash
# 1. Serve
cd website
python -m http.server 8000

# 2. Open http://localhost:8000 — check primary pages visually
# 3. Validate internal anchors
node validate-anchors.js

# 4. (Optional) check JSON-LD with Google Rich Results Test or schema.org validator
```

## Discoverability files (this work-in-progress section)

- [`agents.json`](agents.json) — machine-readable project manifest for agentic programmers. Structured JSON covering tagline, core concepts, architecture, navigation, key repo paths, common tasks, anti-patterns, and a `for_agents` addressing block. Update when shape of the project changes (new core concept, new architecture layer, new common task).
- [`robots.txt`](robots.txt) — explicit allow rules for retrieval bots (`Claude-User`, `Claude-SearchBot`, `OAI-SearchBot`, `ChatGPT-User`, `PerplexityBot`); training bots (`ClaudeBot`, `GPTBot`, `Google-Extended`, `CCBot`) currently allowed too. Allows `/agents.json` and `/llms.txt` explicitly.
- [`llms.txt`](llms.txt) — Markdown index of canonical URLs for LLM/agent retrieval. Points to `agents.json` as the structured ground truth.
- [`sitemap.xml`](sitemap.xml) — XML sitemap index for traditional crawlers.
- JSON-LD schemas: `Organization` and `WebSite` on home; `BreadcrumbList` everywhere; `TechArticle` on `docs/*.html`; `DefinedTermSet` + per-term `DefinedTerm` on `glossary.html`; `FAQPage` on `faq.html`; `SoftwareApplication` on `index.html`.

## Production infra (CloudFront vanity URL + WAF + RHP)

Two scripts deploy the AWS resources behind `kdcube.tech/mcp/*`
redirects + edge protection:

- [`scripts/prod-infra/README.md`](scripts/prod-infra/README.md) — full operator runbook. **Start here when:**
  - Re-versioning the docs bundle (most common — **Runbook 1**)
  - Adding a new MCP vanity URL (Runbook 3)
  - First-time setup on a new account (Runbook 2)
  - Anything goes sideways during a deploy (see "Field-level constraints" and "Troubleshooting" tables)
- [`scripts/preview-infra/README.md`](scripts/preview-infra/README.md) — preview-distribution counterpart. Carries a mirror of the MCP vanity table so PR previews 308 the same as production.

**The bundle re-version task** touches four files and runs two
scripts:
1. `scripts/prod-infra/cloudfront-function.js` — `DOCS_BUNDLE_MCP` constant
2. `scripts/preview-infra/cloudfront-function.js` — same
3. `agents.json` — `mcp_endpoints[*].url_canonical`
4. `llms.txt` — canonical URL in the kdcube-docs MCP entry
Then `AWS_PROFILE=<profile> ./scripts/{prod,preview}-infra/deploy.sh --update-function-only`. Full step-by-step + verification curls in `scripts/prod-infra/README.md` TL;DR.

**Do not** edit any of those four files in isolation — the manifests
will go stale relative to the live function, and machine-readable
consumers (other MCP clients reading `agents.json`) will see a
canonical URL that doesn't match what the redirect resolves to.

## Where to read more

- [`CLAUDE.md`](../CLAUDE.md) — Claude-specific instructions, stack overview, CI conventions.
- [`AGENTS.md`](../AGENTS.md) — parent-repo AGENTS.md (documentation pipeline).
- [`README.md`](README.md) — site-level README with deployment notes.
- [`scripts/prod-infra/README.md`](scripts/prod-infra/README.md) — production infra deploy runbook.
- [`scripts/preview-infra/README.md`](scripts/preview-infra/README.md) — preview infra deploy.
