My CLAUDE.md — 582 lines. Here’s why

Every new chat with Claude Code starts from scratch. The agent does not know your project, does not remember what you discussed an hour ago, and will try for the sixth time to "fix" a configuration that was working fine.

Every week in r/ClaudeAI there’s a new story. The agent deleted the production database. The agent pushed secrets to a public repo. The agent "optimized" the billing service and issued zero invoices to clients. And every time you think: I really don't want to become that person from the headlines.

CLAUDE.md should solve both problems: the context between sessions and disaster protection. A typical CLAUDE.md with 5-10 lines solves neither. I decided to approach this systematically - not one problem at a time, but as an architectural task.

Currently, my config is 582 lines, 6 layers, and behind each rule is a specific story.

Three cases that changed everything

The agent "fixed" a working system. Sunday evening. The agent sees 127.0.0.1 in the config for the external storage service. It concludes that a previous session left an error: localhost where the real address should be. Logical, right? It replaces the value with the actual IP. The upload breaks. Half an hour of debugging later, you realize that the service was reached through an SNI proxy over a local tunnel, and 127.0.0.1 was the correct value. Without context, the obvious solution turned out to be a disaster.

The rule that emerged: "do not change configs without understanding why the current values are what they are. If a value seems strange - first understand, then act."

fail2ban mistook the agent for a brute-forcer. The agent was checking the server's status. For each check, it opened a new SSH connection. A dozen connections in a minute - fail2ban interpreted this as brute-force and blocked the IP for half an hour. A model was training on that server at the time, and I lost access to it.

The rule: "one SSH bridge for everything. One client per session. Do not write separate scripts for check, fix, verify - combine them into one."
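A minimal sketch of what "combine them into one" can look like, assuming a plain `ssh` client. The function names and the host alias in the usage note are hypothetical, not taken from my config. All remote steps are joined into a single command line, so the server sees one connection instead of one per step.

```python
import subprocess

def build_ssh_command(host: str, steps: list[str]) -> list[str]:
    """Join all remote steps into one command so only one SSH connection is opened."""
    if not steps:
        raise ValueError("at least one remote step is required")
    # '&&' stops at the first failing step, so later steps never run after a failure.
    remote = " && ".join(steps)
    return ["ssh", host, remote]

def run_once(host: str, steps: list[str]) -> subprocess.CompletedProcess:
    """Run every step over a single connection (one login for fail2ban to count)."""
    return subprocess.run(build_ssh_command(host, steps), capture_output=True, text=True)
```

For example, `run_once("gpu-box", ["nvidia-smi", "df -h"])` would open one connection for both checks, where `gpu-box` stands in for whatever host alias you use.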

"Filter" turned out to mean "delete." I asked to filter the dataset - to remove unsuitable images. The agent interpreted this literally: it deleted the files. Did not move, did not label - just deleted. The data disappeared.

Rule: “‘filtering’ = move or mark, do not delete. Before any deletion, ensure that the user explicitly requested to delete.”
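As an illustration of "move, do not delete", here is a small sketch, assuming the dataset is a flat directory of files. The function name, the `rejected` directory and the `keep` predicate are all hypothetical. Rejected files end up in a separate folder and can be restored by moving them back.

```python
import shutil
from pathlib import Path
from typing import Callable

def filter_by_moving(src: Path, rejected: Path, keep: Callable[[Path], bool]) -> list[Path]:
    """Move files that fail `keep` into `rejected`; nothing is deleted."""
    rejected.mkdir(parents=True, exist_ok=True)
    moved = []
    # sorted() materializes the listing first, so moving files does not disturb iteration.
    for path in sorted(src.iterdir()):
        if path.is_file() and not keep(path):
            target = rejected / path.name
            if target.exists():
                # Refuse to overwrite an earlier rejected file with the same name.
                raise FileExistsError(target)
            shutil.move(str(path), str(target))
            moved.append(target)
    return moved
```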

Writing “be careful” doesn’t work. A system is needed.

6 layers: how it works

No layer was planned. Each one emerged after a specific problem.

Layer 1: Rules (9 files). A set of rules that are loaded as needed. When the agent writes an article, it does not need rules about SSH. When it debugs code, it does not need rules about text formatting. Claude Code can load the relevant rule files depending on the task.

Layer 2: Memory (78 files). Emerged when the agent forgot the server configuration for the third time. Between sessions, it now remembers: infrastructure settings, project decisions, my preferences, past mistakes. The files are linked by 178 cross-references, creating a knowledge graph from regular markdown. Some files always load (basic rules), and the rest load by topic.

Layer 3: Handoffs. Emerged when a new chat repeated a deadlock from the previous one. When closing the chat, the agent records a summary: what has been done, what did NOT work (the most valuable part), one next action. Here’s a real handoff:

## Session goal
Color checker: CNN sweep + diffusion, first visual results.

## Done
- CNN baseline: median 1.99 deg (11M params, 21 MB)
- Sweep on 5 GPUs: crop128(3.17), bs16(2.04), lr3e-4(NaN)
- Diffusion training started: epoch 5/50, loss 0.827

## DID NOT work
- EfficientNet-B0: hash mismatch in Docker image
- lr=3e-4: NaN after epoch 10-13, no gradient clipping
- CNN visually: 3 numbers give parasitic color casts

## Next step
Inference script for diffusion + visual sheets with 24 patches

The next chat reads 1500 tokens instead of reanalyzing the project. In 4 days, 27 handoffs have accumulated - no deadlock has repeated. It works not only between chats: I have three subscriptions (two work-related, one personal), and the handoff allows starting in another subscription without repeated explanations.
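The four sections of the handoff above are regular enough to generate from a template. A sketch under assumptions: the function names and the timestamped file naming are mine for illustration; only the section headings and the `.claude/handoffs/` directory come from this article.

```python
from datetime import datetime, timezone
from pathlib import Path

def render_handoff(goal: str, done: list[str], failed: list[str], next_step: str) -> str:
    """Render the four sections used in the handoff example."""
    def bullets(items: list[str]) -> str:
        return "\n".join(f"- {item}" for item in items)
    return (
        f"## Session goal\n{goal}\n\n"
        f"## Done\n{bullets(done)}\n\n"
        f"## DID NOT work\n{bullets(failed)}\n\n"
        f"## Next step\n{next_step}\n"
    )

def write_handoff(directory: Path, text: str) -> Path:
    """Save the handoff under a UTC timestamped name (the naming scheme is an assumption)."""
    directory.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now(timezone.utc).strftime("%Y%m%d-%H%M%S")
    path = directory / f"{stamp}.md"
    path.write_text(text, encoding="utf-8")
    return path
```

Keeping "DID NOT work" as a mandatory section in the template is the point: an empty section is visible, a forgotten one is not.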

Layer 4: Chronicles. Emerged when, after 20 handoffs, it became unclear why the project came to its current state. Handoff answers “what’s next.” Chronicle - “how we got here.” Key decisions, turns, deadlocks. 3-7 lines for each milestone.

Layer 5: Hooks. Emerged when the rule “check links in CLAUDE.md” stopped working after 20 minutes of the session. This is discussed in a separate section below.

Layer 6: Skills (16 items). Ready-made sets of knowledge for specific tasks. Each description is written as a trigger for the model: “use when: GPU is frozen, server health check needed,” rather than “helps with servers.”

Rule - wish. Hook - guarantee.

This is the most non-obvious conclusion of the month.

A rule in CLAUDE.md is an instruction in the prompt. The agent can forget, reinterpret, or ignore it during a long session when the context is filled with other things. The rule “check links before working” worked for the first 10 minutes. Then the agent got carried away with the task and forgot.

A hook is a command that Claude Code runs automatically on certain events: SessionStart, Stop, PreToolUse. Mine are Python scripts. A script does not forget and does not reinterpret. It is executed mechanically, every time.

Example - a hook that reminds the agent to record a handoff before closing a long session:

# remind_handoff.py (Stop hook, simplified)
# session_age_minutes() and fresh_handoff_exists() are helpers defined elsewhere.
import json

def main():
    age = session_age_minutes()
    if age < 15:
        return  # short session, no handoff needed

    if fresh_handoff_exists():
        return  # already recorded

    # Block closing and ask the agent to record a handoff
    print(json.dumps({
        "decision": "block",
        "reason": f"Session {int(age)} min, handoff not recorded. "
                  f"Record in .claude/handoffs/ before exit."
    }))

main()
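The simplified hook relies on two helpers it does not show. One possible shape for `fresh_handoff_exists()`, as a sketch only: it assumes handoffs are markdown files in `.claude/handoffs/` and that "fresh" means "modified within the last N minutes". The 30-minute default and the signature are assumptions for illustration.

```python
import time
from pathlib import Path

def fresh_handoff_exists(directory: Path = Path(".claude/handoffs"),
                         max_age_minutes: float = 30.0) -> bool:
    """Return True if any .md file in `directory` was modified recently enough."""
    if not directory.is_dir():
        return False
    # Files with a modification time at or after the cutoff count as fresh.
    cutoff = time.time() - max_age_minutes * 60
    return any(p.stat().st_mtime >= cutoff for p in directory.glob("*.md"))
```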

The model knows when it’s time - when the task is completed or the context is overflowing. A Hook protects against cases when it forgets.

If something must happen reliably - it’s a hook, not a rule.

One line of config that saved me from a supply chain attack

On March 31, 2026, the Sapphire Sleet group (DPRK) compromised the official npm package axios (~100M downloads per week). They published version 1.14.1 with malicious code. Window: 3 hours, from 00:21 to 03:29 UTC.

I had one line in my .npmrc:

min-release-age=7

Packages published less than 7 days ago are not installed. Most malicious packages are detected within 1-3 days, so 7 days is a comfortable buffer.

I was not affected. One line in the config.

Similarly for Python - in uv.toml:

exclude-newer = "7 days"
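The idea behind both settings fits in a few lines. This is a conceptual sketch of an age gate, not how npm or uv implement it; the function names and the version numbers in the usage note are made up for illustration.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

def old_enough(published_at: datetime, now: datetime, min_age_days: int = 7) -> bool:
    """True if the release has been public for at least `min_age_days`."""
    return now - published_at >= timedelta(days=min_age_days)

def newest_allowed(releases: dict[str, datetime], now: datetime,
                   min_age_days: int = 7) -> Optional[str]:
    """Pick the most recently published version that passes the age gate, or None."""
    eligible = {v: t for v, t in releases.items() if old_enough(t, now, min_age_days)}
    return max(eligible, key=eligible.get) if eligible else None
```

With a hypothetical package that has `1.0.0` published a month ago and `1.0.1` published yesterday, `newest_allowed` returns `1.0.0`: the fresh release is skipped until it has aged past the window.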

Behind the config - 37 papers

Many rules came not from personal experience, but from academic works. 37 arxiv papers, reworked into principles. Here are the ones that changed my workflow the most:

Proof Loop. The agent says “tests passed” - you check, and the tests did not pass. Proof Loop prohibits the agent from confirming its own work. Evidence files are needed: test outputs, plus a verdict from a verifier in a fresh session that has not seen the creation process. Source.

Structured Reasoning. Instead of free-form “well, maybe this, maybe that” - format: what we definitely know from the code and logs → step-by-step tracing → what follows → which hypotheses were tested and discarded. On real patches, accuracy increased from 78% to 93%. Source.

Deterministic Orchestration. If the task is deterministic - tests, linters, formatting - it goes through a shell script. The model counts poorly, loses track of counters, confuses conditions in loops. The script does not.
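A minimal sketch of the same idea, with Python standing in for the shell script: each check is an external command, and pass or fail comes from its exit code rather than from the model's reading of the output. The function name and the check names in the usage note are hypothetical.

```python
import subprocess

def run_checks(checks: dict[str, list[str]]) -> dict[str, bool]:
    """Run each command and record pass/fail from its exit code."""
    results = {}
    for name, command in checks.items():
        completed = subprocess.run(command, capture_output=True, text=True)
        results[name] = completed.returncode == 0
    return results
```

A project would pass its real entry points, for example `run_checks({"tests": ["pytest", "-q"]})`, and have the agent report the returned dictionary instead of its own summary.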

Red Lines. Ordinary rules can be interpreted by the agent “creatively.” Red Lines are absolute prohibitions without exceptions. “Do not delete without confirmation.” “Do not change production configs without understanding.” Each is tied to an incident. A pattern from the Chinese engineering community (红线).
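A red line can also be backed by a hook. Below is a sketch of a PreToolUse check, assuming the hook receives a JSON payload on stdin with `tool_name` and `tool_input.command` for Bash calls, and that exit code 2 blocks the call and returns stderr to the model. Verify both against the current Claude Code hooks documentation. The patterns are illustrative, not my actual list.

```python
import json
import re
import sys
from typing import Optional

# Illustrative patterns only; a real list would be tied to actual incidents.
RED_LINES = {
    "delete files": re.compile(r"(?:^|[\s;&|])rm\s+"),
    "force push": re.compile(r"\bgit\s+push\b.*(?:--force|\s-f\b)"),
    "drop database objects": re.compile(r"\bdrop\s+(?:table|database)\b", re.IGNORECASE),
}

def violated_red_line(command: str) -> Optional[str]:
    """Return the label of the first red line the command crosses, or None."""
    for label, pattern in RED_LINES.items():
        if pattern.search(command):
            return label
    return None

def main(stdin=sys.stdin) -> int:
    """Read the hook payload and return the exit code the hook should use."""
    payload = json.load(stdin)
    if payload.get("tool_name") != "Bash":
        return 0  # only shell commands are inspected in this sketch
    command = payload.get("tool_input", {}).get("command", "")
    label = violated_red_line(command)
    if label is None:
        return 0
    print(f"Red line: {label}. Ask the user for explicit confirmation.", file=sys.stderr)
    return 2  # assumed to block the tool call
```

As a hook script, the file would end with `sys.exit(main())`. Regex matching is crude and will miss creative variants, so this complements the written red line rather than replacing it.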

The other principles - generator-evaluator, autoresearch, multi-agent decomposition, codified context, agent security, documentation integrity and 7 more - are described in detail in the repository.

Numbers

78 memory files. 178 cross-references. 27 handoffs over 4 days. 96.9% KV-cache hit rate on 83 sessions over a week.

The configuration file updates itself: after each change, the agent checks if the links have become outdated. The SessionStart hook validates automatically.
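What such a validation can look like, as a sketch: it assumes cross-references are ordinary relative markdown links to `.md` files, which may not match the link syntax in my memory files. The function name and regex are illustrative.

```python
import re
from pathlib import Path

# Matches [text](target.md) and [text](target.md#anchor); captures the file part only.
LINK = re.compile(r"\[[^\]]*\]\(([^)#\s]+\.md)(?:#[^)]*)?\)")

def broken_links(root: Path) -> list[tuple[Path, str]]:
    """Return (source file, target) pairs whose relative .md target does not exist."""
    missing = []
    for source in sorted(root.rglob("*.md")):
        text = source.read_text(encoding="utf-8")
        for target in LINK.findall(text):
            if "://" in target:
                continue  # skip external URLs
            if not (source.parent / target).exists():
                missing.append((source, target))
    return missing
```

A SessionStart hook could call `broken_links()` on the memory directory and print the result, so stale references surface at the start of a session rather than in the middle of a task.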

Does it work perfectly? No. During the audit, 4 memory files were found that had dropped out of the index. Documentation drift occurred with the system that is supposed to prevent it. But without it, it would have been worse.

What I Don’t Know

I’m not sure that all principles are necessary for everyone. For most projects, five may be enough: Deterministic Orchestration, Structured Reasoning, Supply Chain Defense, Codified Context, Handoffs.

I’m not sure that 6 layers is the minimum. Maybe I overengineered. But in a month, the context has never been lost, and no dead ends have repeated.

One of the principles (Assumption Testing) states: each component encodes an assumption about the model's incapacity. Models improve. Remove components and measure - maybe some of the layers are no longer needed.

Try It

Copy into the Claude Code chat:

https://github.com/AnastasiyaW/claude-code-config - study it, choose what fits my work, and set up my Claude Code

Start small: Supply Chain Defense (one line in .npmrc) + Deterministic Orchestration (tests via scripts) + Structured Reasoning (debug format). Add as needed.

Everything is under MIT. github.com/AnastasiyaW/claude-code-config
