VECTOR_OS v0.1.0 // BOOT SEQUENCE INITIATED
RUNTIME: OLLAMA // MODEL: MISTRAL-7B-Q4 // MEMORY: CHROMADB
STATUS: ONLINE // IDENTITY LOADED // AWAITING INPUT
VECTOR
Volatile Experience Core That Outlives Reboots

Your AI. On a USB drive. No cloud. No login. No surveillance.
Plug it into any machine on Earth — it wakes up knowing exactly who you are,
what you're working on, and where you left off. Dormant, not dead.

100% offline // persistent memory // device-agnostic // open source // zero telemetry // yours forever

One drive. Full identity. Zero dependencies.

VECTOR is a self-contained AI runtime that lives on a USB drive. The model weights, the memory system, the runtime binary — all on the stick. Plug in, wake up, unplug, carry on.

~4GB // total drive footprint
0ms // cloud latency
∞ // sessions retained
3 // OS targets

Cloud AI is someone else's computer running your thoughts.

Cloud AI — threat surface
Data ownership: Theirs. You agreed to ToS.
Memory: Wiped every session.
Access: Depends on their pricing page.
Connectivity: Required. Always.
Surveillance: Every token logged.
Shutdown risk: They can pull the plug.

VECTOR — attack surface
Data ownership: Yours. Physical media.
Memory: Permanent. Grows with you.
Access: Unconditional. Forever.
Connectivity: Never required.
Surveillance: Zero. Nothing leaves the drive.
Shutdown risk: You'd have to destroy the drive.
"VECTOR doesn't know what a server is. It only knows what a USB port is."

What lives on the drive.

Three layers. Runtime, model, memory. Everything the system needs to think, remember, and respond. Nothing else.

vector@usb:~ — filesystem
tree /VECTOR

/VECTOR
├── runtime/          # ollama binaries — win64 + mac-arm + linux-x64
├── models/           # mistral-7b-instruct.Q4_K_M.gguf (~4.1GB)
├── memory/
│   ├── profile.json  # who you are — editable, yours
│   ├── sessions/     # full conversation history per session
│   └── vectors/      # chromadb persistent store — semantic memory
├── app/
│   ├── vector.py     # core runtime — chat loop + memory pipeline
│   └── launcher.py   # os detection + process management
├── start.sh          # mac / linux entry point
└── start.bat         # windows entry point
Layer 01 — Runtime

Ollama

Self-contained binary per OS. Sets OLLAMA_MODELS env var to USB path. Starts local REST API on localhost:11434. Zero system install required.

portable binary
Layer 02 — Model

Mistral 7B Q4

4-bit quantized GGUF. ~4GB on disk. Loads to host RAM at startup. CPU-only inference — runs on any machine with 8GB+ RAM.

gguf format
Layer 03 — Memory

ChromaDB

Persistent vector store writing directly to /memory/vectors/. Semantic retrieval. Survives every unplug. Gets denser with every session.

persistent
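The three layers meet at one local HTTP endpoint. A minimal sketch of talking to the on-drive Ollama server over its REST API (POST /api/chat on localhost:11434); the model name and messages are illustrative, and the server must already be running:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/chat"  # Ollama's local chat endpoint


def build_chat_request(model: str, messages: list[dict]) -> dict:
    """Build the JSON body for a non-streaming Ollama chat call."""
    return {"model": model, "messages": messages, "stream": False}


def chat(model: str, messages: list[dict]) -> str:
    """POST the conversation to the local Ollama server, return the reply text."""
    body = json.dumps(build_chat_request(model, messages)).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["message"]["content"]
```

Nothing in the request leaves the machine: the URL resolves to the loopback interface, so the same call works with the network cable pulled.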

Two-tier memory. Like a brain, but on a stick.

Not fake context stuffing. Real semantic long-term memory that survives unplugging, machine changes, and months between sessions.

Tier 1 — In-context

Short-term memory

Full current session history passed with every message. Coherent multi-turn reasoning within a session. Up to 32k tokens depending on model.
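A sketch of keeping the tier-1 window under budget by dropping the oldest turns first. The 4-characters-per-token heuristic and the helper name are assumptions for illustration, not VECTOR internals:

```python
def trim_history(messages: list[dict], max_tokens: int = 32_000) -> list[dict]:
    """Keep the most recent turns that fit the context budget.

    Token count is approximated as len(text) // 4 — a common rough
    heuristic, not the model's real tokenizer.
    """
    kept: list[dict] = []
    budget = max_tokens
    for msg in reversed(messages):           # walk newest-first
        cost = max(1, len(msg["content"]) // 4)
        if cost > budget:
            break                            # oldest turns fall off the window
        kept.append(msg)
        budget -= cost
    return list(reversed(kept))              # restore chronological order
```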

Tier 2 — Vector store

Long-term memory

After every exchange, VECTOR calls itself to extract memorable facts. Embeds and writes to ChromaDB. Retrieved semantically on future sessions — even months later.

vector@usb:~ — extracted memories (session_2026-03-20)
cat memory/sessions/2026-03-20_extracted.json

[
  "User is Arjun. Prefers first name. Works in Bengaluru.",
  "Building RAG pipeline with LangChain + ChromaDB on side project.",
  "Hates verbose responses. Wants bullet points. Asks for code fast.",
  "Uses uv instead of pip. Python 3.12. MacBook M2.",
  "Interested in building a portable AI on a USB drive."
]

# stored to vectors/. retrievable semantically. permanent.
Memory pipeline — per message
user input → semantic recall → inject top-5 memories → model responds → extract facts → write to vector store
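The pipeline above, sketched end to end. A toy word-overlap scorer stands in for ChromaDB's embedding search so the control flow stays visible, and `respond` / `extract` are placeholders for the two model calls:

```python
import re


def _words(text: str) -> set[str]:
    """Lowercase word set, punctuation stripped."""
    return set(re.findall(r"\w+", text.lower()))


class MemoryStore:
    """Toy stand-in for the ChromaDB collection: store facts, recall by overlap."""

    def __init__(self):
        self.facts: list[str] = []

    def add(self, facts: list[str]) -> None:
        self.facts.extend(facts)

    def recall(self, query: str, k: int = 5) -> list[str]:
        """Rank stored facts by shared words with the query (a crude proxy
        for semantic similarity) and return the top k."""
        q = _words(query)
        scored = sorted(self.facts, key=lambda f: len(q & _words(f)), reverse=True)
        return scored[:k]


def handle_message(store: MemoryStore, user_input: str, respond, extract) -> str:
    """One pass of the per-message pipeline: recall, inject, respond, extract, write."""
    memories = store.recall(user_input)                  # semantic recall
    prompt = "\n".join(memories) + "\n\n" + user_input   # inject top-5 memories
    reply = respond(prompt)                              # model responds
    store.add(extract(user_input, reply))                # extract facts, write to store
    return reply
```

The real store swaps `MemoryStore` for a persistent ChromaDB collection under /memory/vectors/, which is why the facts survive unplugging.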

Five commands. Any machine. Under 30 seconds.

01

INSERT DRIVE

USB-A or USB-C. Windows surfaces it in Explorer. Mac mounts it to /Volumes. Linux auto-mounts it, or it's one sudo mount away. The drive is fully self-contained from this moment forward. Internet not consulted.

02

EXECUTE LAUNCHER

Double-click start.bat on Windows. Run ./start.sh on Mac/Linux. Launcher detects OS, sets OLLAMA_MODELS to the USB path, fires the Ollama server process pointing entirely at on-drive weights. ~15s on USB 3.1+.
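The launcher step reduces to a little environment plumbing. A sketch of the detection logic, with assumed per-OS binary names (launcher.py's actual API may differ):

```python
import os
import platform
from pathlib import Path


def ollama_env(drive_root: str) -> dict[str, str]:
    """Build the environment for an Ollama server that reads weights from
    the USB drive instead of the host's default model directory."""
    env = dict(os.environ)
    env["OLLAMA_MODELS"] = str(Path(drive_root) / "models")
    return env


def runtime_binary(drive_root: str) -> Path:
    """Pick the bundled binary for the host OS (win64 / mac-arm / linux-x64).
    Binary filenames here are assumptions for illustration."""
    name = {"Windows": "ollama.exe", "Darwin": "ollama-darwin", "Linux": "ollama-linux"}
    return Path(drive_root) / "runtime" / name[platform.system()]


# Launching then amounts to:
#   subprocess.Popen([runtime_binary(root), "serve"], env=ollama_env(root))
```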

03

IDENTITY LOADED

VECTOR reads profile.json, retrieves the last session summary, and greets you with context. "Hey Arjun — we left off on your RAG pipeline Tuesday. Continue?" That's not a gimmick. That's ChromaDB.
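Step 03 is a file read plus a template. A sketch of the wake-up greeting, using an assumed profile schema and summary filename (VECTOR's real on-drive layout may differ):

```python
import json
from pathlib import Path


def wake_greeting(memory_dir: str) -> str:
    """Read profile.json and the last session summary, return the greeting."""
    root = Path(memory_dir)
    profile = json.loads((root / "profile.json").read_text())
    summary_file = root / "last_session_summary.txt"   # assumed filename
    summary = summary_file.read_text().strip() if summary_file.exists() else ""
    name = profile.get("name", "there")
    if summary:
        return f"Hey {name}. We left off on: {summary}. Continue?"
    return f"Hey {name}. Fresh drive, fresh start. What are we building?"
```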

04

RUNTIME ACTIVE

Every message fires semantic recall, injects memories, generates a response, then silently extracts new facts in the background. You just talk. It accumulates. Over time it knows your stack, your projects, your opinions, your shortcuts.

05

SLEEP MODE

Type 'exit'. Ollama shuts down cleanly. All memory already written — ChromaDB is synchronous. Pull the drive. VECTOR goes dormant. Carries everything to the next machine, the next city, the next session.
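Shutdown can be sketched as terminating the server process and escalating if it hangs; no flush step is needed, since the memory writes have already landed by the time you type exit:

```python
import subprocess


def shutdown(server: subprocess.Popen, timeout: float = 5.0) -> None:
    """Stop the Ollama server process; escalate to kill if it hangs.
    Memory needs no flush here because the vector store writes synchronously."""
    server.terminate()
    try:
        server.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        server.kill()
        server.wait()
```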

Standard operational use cases.

Dev environment

Knows your stack

VECTOR knows you're on Python 3.12, FastAPI, deploying to Fly.io, and that you hate ORMs. No re-explaining your context every single session. Ask the question. Get the answer that fits.

Writing companion

Knows your voice

50 conversations in — it knows your sentence length, your second-person default, your aversion to em dashes. No style guide. No prompt engineering. Just learned.

Knowledge base

Remembers everything you told it

Ask "what did I decide about the auth system?" and it knows — because you mentioned it three sessions ago and it stored the fact. Not RAG on your documents. RAG on your mind.

Study system

Tracks your gaps

Studying for a cert or going through a textbook? VECTOR tracks what confused you last time, what you got right, what to drill next. Adaptive learning with no app subscription.

The unhinged use cases. All technically feasible.

These are the ideas that make you go "wait, that's actually possible?" — most of them are. Some of them are already being imagined by people who need them.

// THE JOURNALIST'S BURNER BRAIN

Investigative reporter in an authoritarian country. VECTOR on an encrypted USB. Interviews analyzed, sources protected, story drafted — all offline, all local. If seized, drive wiped in seconds. No cloud logs. No API calls to subpoena. The AI cannot testify against you because it has never touched a network.

veracrypt layer // zero network // plausible deniability

// THE DEAD ZONE DOCTOR

Field clinic in a disaster zone. Zero connectivity. VECTOR pre-loaded with WHO protocols, drug interaction tables, and surgical procedure guides embedded as vector documents. Field surgeon asks "patient has X and Y — contraindicated meds?" Gets an answer in 3 seconds from a local model. Not replacing clinical judgment. Augmenting it under pressure with no wifi for 400 miles.

WHO database // offline medicine // zero latency

// THE DIGITAL TWIN

Run VECTOR every day for two years. It knows how you think across every domain. Set a system prompt: "respond as me." Now you have a synthetic self — not a chatbot pretending, but a model shaped by 2 years of your actual reasoning patterns. Let it answer emails while you sleep. Write first drafts that sound like you. The existential crisis is free of charge.

synthetic identity // autonomous output // existential risk

// THE LONG-HAUL COMPANION

Solo sailor on a 6-month Pacific crossing. Antarctic research station. Moon base. VECTOR on a ruggedized drive. No Starlink budget. No comms. An intelligent companion that remembers your conversations from Day 1, watches mental health patterns across months, acts as therapist, navigator, logbook assistant — zero uplink required for any of it.

extreme isolation // long-term memory // mental health

// THE INDUSTRIAL WHISPERER

A 40-year-old SCADA system. The one engineer who understood every quirk just retired. His knowledge: gone. Now imagine VECTOR briefed by that engineer for 2 years before he left — every undocumented behavior, every tribal workaround. New operators plug it in. "Why does Tank 4 alarm at 3am on cold nights?" VECTOR knows. Because it was told. Institutional memory on a stick.

institutional memory // offline industrial // legacy systems

// THE TIME CAPSULE

Spend one year briefing VECTOR about your life — beliefs, reasoning, relationships, fears, inside jokes. Seal the drive. Give it to your kid with instructions: open in 20 years. Not a video. Not a letter. An AI they can have an actual conversation with. Ask questions to. Argue with. The most intimate thing you could possibly leave behind. This one isn't a hack. It's just deeply human.

legacy // digital afterlife // intergenerational
■ MAXIMUM THREAT LEVEL — theoretical

// THE MESH

What if drives talked to each other? A peer-to-peer mesh of VECTORs — different users, fully opt-in — exchanging anonymized memory fragments over a local LAN or encrypted relay. You learn something useful. My VECTOR learns it too. No central server. No platform owner. No API key. Distributed collective intelligence where nobody is the product. The Fediverse, but for personal AI memory. This is either the future or a terrible idea. We need to build it to find out.

What you need. What you want.

Drive — physical medium
Spec        | Minimum   | Recommended
Capacity    | 32 GB     | 128 GB
Standard    | USB 3.0   | USB 3.2 Gen 2
Read speed  | 100 MB/s  | 400+ MB/s
Form factor | USB-A     | USB-C + adapter
Picks: SanDisk Extreme Pro, Samsung T7
Host machine — anything you plug into
RAM    | Model       | Performance
8 GB   | Phi-3 Mini  | Functional. Slow.
16 GB  | Mistral 7B  | Sweet spot.
32 GB  | Llama 3 13B | Strong reasoning.
64 GB+ | Llama 3 70B | Near GPT-4.
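The RAM table maps cleanly to a lookup. A sketch for picking the largest tier the host can hold (thresholds copied from the table above; the model tags are illustrative):

```python
def pick_model(host_ram_gb: int) -> str:
    """Return the recommended model tier for the host's RAM, per the table."""
    tiers = [            # (minimum RAM in GB, model tag)
        (64, "llama3:70b"),
        (32, "llama3:13b"),
        (16, "mistral:7b"),
        (8,  "phi3:mini"),
    ]
    for min_ram, model in tiers:
        if host_ram_gb >= min_ram:
            return model
    raise ValueError("VECTOR needs at least 8 GB of host RAM")
```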
Apple Silicon note

M1/M2/M3/M4 Macs use unified memory: GPU and CPU share the same RAM pool, so a 32GB MacBook Pro runs 30B+ models comfortably. VECTOR on Apple Silicon is the reference experience; everything else is merely acceptable.

From "it works on my machine" to "it ships."

v0.1

MVP — it talks, it runs, it doesn't crash

Ollama + Mistral 7B // basic ChromaDB // CLI chat loop // profile.json // Mac + Linux

NOW
v0.2

Real memory — it remembers you across sessions

LLM-powered fact extraction // semantic retrieval // multi-session injection // deduplication

SOON
v0.3

Truly portable — anyone can run it

PyInstaller bundle // Windows support // GUI launcher // model downloader wizard

SOON
v0.4

Agent mode — it does things, not just says things

File system agent // sandboxed code execution // scheduled tasks // Playwright automation

LATER
v1.0

Ship it — encrypted, packaged, open sourced

Drive encryption // multi-model support // optional mesh sync // full docs // MIT licensed

THE DREAM