While OpenAI continues to enhance ChatGPT’s capabilities, user experiences highlight persistent challenges with session stability, memory retention, and performance, particularly during complex, document-heavy workflows.
A small case study published on GitHub in mid-April 2025 by user `sks38317`, identified as an 18-year-old South Korean student, provides a granular look at these alleged issues. The repository, described by the author as a “Cache failure and memory workaround case study using GPT-4o (authored by a 18-years-old student from Korea),” sheds some light on the practical challenges that OpenAI’s subsequent memory enhancements aim to address.
User-Documented Failures and Performance Metrics
The `sks38317` repository outlines several critical operational failures observed while using GPT-4o. According to the user’s detailed report, persistent PDF rendering failures were misinterpreted by ChatGPT as successful operations. According to the user, this led to faulty responses being stored in the system’s cache, subsequently triggering repetitive, unproductive loops as the chatbot attempted to reuse the broken cache entries.
Further issues documented by him included the redundant storage of multiple edited document versions within the session’s context, creating content conflicts, and noticeable response delays and session slowdowns, which the user attributed to cache overload.
The student quantified the impact before attempting a self-devised fix: PDF loops occurred more than four times per hour, requiring 4-6 retry attempts; redundant documents averaged five to six per session; the estimated cache token load swelled to between 17,000 and 18,500 tokens; and redundant phrases constituted roughly 22% of this cache.
Faced with these workflow disruptions—the user noted, “starting a new session deletes the conversation history, which seriously disrupts my workflow when working on documents”—they implemented a manual workaround involving session analysis and logic circuits to auto-delete failed outputs and prune redundancy.
This intervention reportedly yielded substantial improvements: PDF loop frequency was halved (-50%), retry frequency dropped by approximately 66% (to ≤2 occurrences), redundant document count fell by 50-60% (to ≤3), cache token load decreased by 13.7% (below 14,200 tokens), the redundant phrase rate dropped below 7%, and response delays were eliminated.
The user suggested OpenAI should “allow users to retain a limited number of previous document versions—such as 1 or 2—rather than automatically deleting all old versions. Ideally, this could be made configurable…” The repository also contains a file presented as a response from OpenAI support acknowledging the detailed feedback.
OpenAI’s Layered Approach To Memory
These user-reported issues provides some context for OpenAI’s multi-stage rollout of memory features apparently designed to mitigate such problems. The company first began testing a base Memory capability in February 2024, allowing users to explicitly provide facts for ChatGPT to remember. This base feature saw expanded availability for Plus subscribers later.
A distinct, more implicit memory function arrived around April 10, 2025, when OpenAI enabled ChatGPT (starting with Pro subscribers) to reference a user’s entire chat history for personalization, which CEO Sam Altman described as progressing towards “AI systems that get to know you over your life.”
Shortly thereafter, the “Memory with Search” feature was detailed in company release notes, which allows ChatGPT to use stored memory (both explicit facts and implicit context from chat history, controllable via settings detailed in the Memory FAQ) to customize web search queries conducted through partners like Microsoft Bing.
The update coincided with the release of new o3 and o4-mini models noted for improved reasoning. It’s important to distinguish these user-facing memory features from OpenAI’s separate, server-side Prompt Caching used via the API for performance optimization on repetitive calls.
Persistent Data, Persistent Risks?
While improving user experience, persistent memory capabilities inherently introduce security considerations. Prompt injection, where malicious instructions hidden within user input or external data sources manipulate LLM behavior, is ranked as a top AI security risk by groups like OWASP, leveraging the difficulty models have in separating trusted instructions from untrusted data.
This risk can be amplified by memory features, potentially allowing malicious instructions or extracted data to persist across sessions.
Researchers demonstrated such vulnerabilities in 2024. A June arXiv paper explored how memory could facilitate data exfiltration attacks, while a another report highlighted the injection of persistent “spyware” instructions into the ChatGPT macOS app’s memory via malicious documents, enabling potential long-term chat data theft.
While OpenAI reportedly addressed that specific macOS vulnerability, the fundamental challenge of securing persistent AI memory remains an industry-wide concern, impacting similar features from competitors like Google Gemini, Microsoft Copilot, and xAI’s Grok.
OpenAI provides controls for users to disable memory features entirely. The company states that specific user account details are not shared with search partners, though generalized location data inferred from IP addresses might be used for improving result relevance. The rollout of Memory with Search is gradual. Paid subscribers should note that search function usage, including memory-assisted searches, counts against their GPT-4o message limits. For tighter browser integration, OpenAI also offers a Chrome Extension.
Funny timing. I documented and reported this exact cache loop problem last week, and even published a fix suggestion. OpenAI said “Thanks, but we won’t say if we read it.”
Now the system acts like my logic was implemented… and articles like this appear.
For context: I’m not part of any team. I’m 18. I just tested it until it broke.
Release: https://github.com/sks38317/gpt-cache-optimization/releases/tag/v2025.04.19
Post: https://news.ycombinator.com/item?id=43741181
User feedback doesn’t disappear. It just reappears—quietly.