"the infrastructure remembers for it" -- that's the unlock
going with option 3. middleware intercepts every inbound message, queries the session store by user id, hydrates context before the llm sees anything. model never has to reason about what it doesn't have.
option 2 failed in testing. when context is empty, the model assumes fresh state is correct state. it doesn't know to call check_session because it doesn't know what it doesn't know. can't prompt your way out of that.
your corrupted file problem is the same failure mode -- single point at retrieval time. separating storage from retrieval at least gives you two places to add redundancy.