How Joy protects user data
How do you ensure safety and ethics in the age of AI? Take a look behind the scenes of Joy, the mental health AI developed in-house at teale.
The rise of artificial intelligence, and in particular large language models (LLMs), is opening up new opportunities for mental health. Easier access to first-line support, better information quality, therapeutic assistance, and lighter administrative burdens: the potential benefits are significant. But when it comes to mental health, one set of requirements stands above all others: confidentiality, data security, and ethical deployment.
This principle guided the creation of Joy, our AI dedicated to mental health. In this article, we explain how we built a secure LLM system for mental health—focusing on the ethical, regulatory, and technical choices that shaped its development. What role does European regulation play? What risks come with relying on third-party LLM providers? And why did we opt for internal deployment on open-source solutions?
The General Data Protection Regulation (GDPR) applies to any system processing personal data in the EU. Mental health data falls into the “special categories of personal data,” which require reinforced protection.
Since August 2024, the European AI Act has categorized AI systems by risk level. “High-risk” systems face strict obligations. AI used in healthcare or mental health—especially for diagnosis, therapeutic support, or patient follow-up—almost certainly falls into this category.
For high-risk systems, obligations include:
- a documented risk management system covering the entire lifecycle;
- data governance and quality requirements;
- technical documentation, logging, and traceability;
- transparency towards users and effective human oversight;
- guarantees of accuracy, robustness, and cybersecurity.
As we can see, GDPR and the AI Act provide a strict regulatory framework. But ensuring compliance in practice requires more than legal knowledge—it requires concrete technical and organizational choices. That is why we chose internal deployment: to maintain full control over risks, data security, and ethical safeguards.
Using third-party cloud AI platforms (e.g., ChatGPT) is incompatible with GDPR and AI Act requirements, particularly regarding:
- data retention (where data is stored, for how long, and whether it is reused);
- the integrity and traceability of processing;
- liability in the event of an incident;
- the ability to carry out continuous compliance assessments.
For these reasons, external platforms could not meet our requirements. Our choice was clear: internal deployment on teale’s infrastructure.
Handling unstructured data—which is, by nature, likely to contain personal information—comes with significant responsibility.
To strictly limit access and control data flows, we chose to deploy our LLM services on teale’s private internal network, with no internet access.
Indeed, under certain conditions, an LLM can be configured to query external resources to enrich its context. When using a cloud service, it is impossible to guarantee that no request ever leaves the system.
At teale, our LLM services operate as isolated network endpoints: no external requests are ever allowed.
“LLMs have the ability to remember.”
That statement is misleading.
LLMs are based on neural networks. A neural network, and therefore an LLM, is typically trained only once, because training requires massive computational resources. When it generates a response, it does so by inference: the input text passes through the same fixed computation learned from billions of training examples, without modifying the model's weights.
This process is stateless: an LLM retains nothing from the requests it processes. Producing an answer does not alter its state, so it cannot memorize information.
So where does the idea of “memory” come from? It is actually a technical device: user interaction history is re-injected into the prompt to provide context. The longer this history, the more coherent the responses.
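To make this concrete, here is a minimal, illustrative sketch of the mechanism, not teale's actual implementation: the stored history is simply concatenated into the prompt on every call, so a stateless model appears to remember. The `Turn` structure and the system prompt text are invented for the example.

```python
from dataclasses import dataclass

@dataclass
class Turn:
    role: str      # "user" or "assistant"
    content: str

def build_prompt(system_prompt: str, history: list[Turn], new_message: str) -> str:
    """Re-inject past turns into the prompt so a stateless model appears to remember."""
    lines = [system_prompt]
    for turn in history:
        lines.append(f"{turn.role}: {turn.content}")
    lines.append(f"user: {new_message}")
    lines.append("assistant:")
    return "\n".join(lines)

# Each call starts from the same frozen weights; only the prompt carries the context.
history = [
    Turn("user", "I've been feeling stressed at work lately."),
    Turn("assistant", "Thanks for sharing. What part of work feels most stressful?"),
]
print(build_prompt("You are Joy, a supportive mental health companion.",
                   history, "Mostly the weekly team meetings."))
```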
This raises a crucial question: where, and how, is this interaction history stored—knowing that it necessarily contains personal data?
At teale, we chose a secure design:
- interaction history is stored exclusively on teale's internal infrastructure, never with a third party;
- it is encrypted at rest, with encryption keys rotated automatically;
- it is systematically deleted after 24 hours, as described in the forgetting policy below.
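As an illustration of the encryption and key rotation point, here is a minimal sketch assuming the Python `cryptography` library's Fernet scheme; teale's actual key management, rotation schedule, and storage backend are not described here.

```python
from cryptography.fernet import Fernet, MultiFernet

# MultiFernet encrypts with the first key but can still decrypt tokens produced
# under older keys, which is what makes automatic key rotation practical.
current = Fernet(Fernet.generate_key())
previous = Fernet(Fernet.generate_key())
crypto = MultiFernet([current, previous])

def encrypt_turn(text: str) -> bytes:
    """Encrypt a conversation turn before it is written to internal storage."""
    return crypto.encrypt(text.encode("utf-8"))

def decrypt_turn(token: bytes) -> str:
    """Decrypt a stored turn, including tokens created under the previous key."""
    return crypto.decrypt(token).decode("utf-8")

token = encrypt_turn("I've been sleeping badly this week.")
# During a rotation pass, rotate() re-encrypts old tokens under the current key.
token = crypto.rotate(token)
print(decrypt_turn(token))
```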
There is no data more sensitive than unstructured free text, unconstrained by any schema, typed freely into a text box.
Interactions with online services generally occur via APIs or graphical interfaces (web or native apps) that strongly contextualize the data. For example, entering your age in a form saves a numeric value into the “Age” column of a database. In such cases, sensitive data can be inventoried and categorized.
With an LLM, the user faces a blank text field with no context. They can write anything. It is then up to the system to “understand” the request and produce an answer.
This paradigm shift breaks the ability to inventory and categorize data at the platform level. Since information can no longer be categorized contractually or contextually, every request must be treated at the highest level of criticality, and access to it granted as narrowly as possible, in line with the principle of least privilege.
At teale, we apply a strict forgetting policy, in addition to the encryption measures described earlier:
- every request sent to the LLM, along with the generated response, is automatically deleted after 24 hours;
- the interaction history used to contextualize conversations follows the same 24-hour rule.
This forgetting policy does not affect explicitly categorized platform data—such as Wellbeing Tracker scores or activity logs—which are stored securely with appropriate protection and access controls.
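One common way to enforce such a retention window is to let the storage layer expire data on its own. The sketch below assumes a Redis-like store with a time-to-live, an assumption made purely for illustration rather than a description of teale's stack; the history is assumed to be encrypted before it reaches this layer.

```python
import json
import redis  # assumed Redis-like store; teale's actual backend is not specified

HISTORY_TTL_SECONDS = 24 * 3600  # the 24-hour retention window described above

r = redis.Redis(host="localhost", port=6379)

def save_history(conversation_id: str, encrypted_turns: list[str]) -> None:
    """Persist the (already encrypted) interaction history with an automatic expiry.

    Once the TTL elapses, the key disappears from the store without any
    application-side cleanup job: forgetting is enforced by the storage layer.
    """
    r.setex(f"history:{conversation_id}", HISTORY_TTL_SECONDS, json.dumps(encrypted_turns))

def load_history(conversation_id: str) -> list[str]:
    """Return the history if it still exists, or an empty list once it has expired."""
    raw = r.get(f"history:{conversation_id}")
    return json.loads(raw) if raw is not None else []
```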
Performing inference on an LLM takes time, from a few seconds to several minutes. During inference, a large amount of GPU memory must be allocated on the server. This allocation reduces the server's ability to handle other requests, and the memory footprint remains high for the duration of the computation.
To reduce this load, many systems implement a cache: storing generated responses so they can be reused if an identical or very similar request is made later.
But this raises a problem: how do you search the cache for a matching answer when requests are unstructured?
The common solution is to use semantic similarity search. Instead of storing raw text, the system stores a vector embedding of the request, computed by the LLM itself. This approach is also widely used in Retrieval-Augmented Generation (RAG).
Embeddings encode the meaning of a sentence in numerical form. For example, "I can't fall asleep at night" and "I have trouble sleeping" produce vectors that sit close to each other, while an unrelated sentence such as "What's the weather tomorrow?" produces a distant one.
This allows the system to quickly retrieve the right cached answer for semantically similar queries.
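As a sketch of how such a cache can be organized (not teale's implementation), the example below stores an embedding per request and answers a new query when the cosine similarity to a cached entry exceeds a threshold. The `embed` function is a deterministic stand-in; in a real system it would call the LLM or a dedicated embedding model, so that semantically similar texts land near each other.

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    """Stand-in embedding function returning a unit-length vector.

    A real system would call the LLM (or a dedicated embedding model) here;
    this fake version only gives exact-match hits, but keeps the example runnable.
    """
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.standard_normal(384)
    return v / np.linalg.norm(v)

class SemanticCache:
    """Cache keyed by embeddings rather than raw request text."""

    def __init__(self, threshold: float = 0.9):
        self.threshold = threshold
        self.entries: list[tuple[np.ndarray, str]] = []

    def store(self, query: str, answer: str) -> None:
        self.entries.append((embed(query), answer))

    def lookup(self, query: str) -> str | None:
        q = embed(query)
        for vec, answer in self.entries:
            # Vectors are unit length, so the dot product is the cosine similarity.
            if float(np.dot(q, vec)) >= self.threshold:
                return answer
        return None

cache = SemanticCache()
cache.store("How can I manage stress before a meeting?", "<previously generated answer>")
print(cache.lookup("How can I manage stress before a meeting?"))  # cache hit
```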
But there is a problem.
If an embedding encodes the user’s request, then in theory it could be decoded—potentially exposing the original input. And as we explained earlier, we apply a strict forgetting policy: requests are deleted after 24 hours. If we applied the same deletion policy to the cache, it would quickly lose its usefulness.
So what’s the solution?
At teale, we developed a mathematical method that allows us to:
- keep cached embeddings fully usable for semantic similarity search, preserving the performance benefits of the cache;
- make it impossible to reconstruct the original request from a stored embedding without the corresponding encryption key.
This means that even if an attacker gained access to the cache database, they would not be able to reconstruct user requests or extract personal data.
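teale does not publish the details of this method, so purely as an illustration of the family of techniques involved, here is a sketch of a key-dependent, similarity-preserving transform: a secret orthogonal rotation derived from a key. Cosine similarities between transformed vectors match those between the originals, so cache lookups still work, while mapping a stored vector back to the original embedding space requires the key. Every name and parameter below is hypothetical, and a production scheme would need more than this sketch.

```python
import numpy as np

def keyed_rotation(dim: int, key: int) -> np.ndarray:
    """Derive a secret orthogonal matrix from a key via QR decomposition.

    Orthogonal transforms preserve dot products, so cosine similarity between
    rotated embeddings equals cosine similarity between the originals.
    """
    rng = np.random.default_rng(key)
    q, _ = np.linalg.qr(rng.standard_normal((dim, dim)))
    return q

SECRET_KEY = 123456789   # hypothetical; would come from a key management system
DIM = 384
R = keyed_rotation(DIM, SECRET_KEY)

def protect(embedding: np.ndarray) -> np.ndarray:
    """Only the rotated embedding is persisted; the original is never stored."""
    return R @ embedding

# Similarity is preserved in the protected space:
rng = np.random.default_rng(0)
a, b = rng.standard_normal(DIM), rng.standard_normal(DIM)
cos = lambda x, y: float(np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y)))
assert abs(cos(a, b) - cos(protect(a), protect(b))) < 1e-6
```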
Navigating the complex intersection of AI and mental health requires not only innovation, but also strict compliance with existing and emerging regulations.
As we have seen, both the GDPR and the EU AI Act impose demanding requirements for data protection and security, particularly when handling sensitive mental health information.
While the appeal of ready-to-use cloud AI platforms is understandable, our analysis shows that they cannot provide the necessary levels of control, transparency, and accountability in this critical domain. Risks related to data retention, integrity, liability, and continuous compliance assessments make them unsuitable for our standards.
At teale, our decision to deploy Joy internally was not just a technical choice—it was a necessity, dictated by our commitment to safeguarding users’ data.
The measures we implemented—network isolation of our LLM services, advanced encryption with automatic key rotation, systematic deletion of interaction history after 24 hours, and a redesigned approach to semantic cache management—all reflect this commitment. By making it impossible to decode user requests without an encryption key, we ensure both high performance and absolute respect for privacy.
These architectural and operational safeguards are not just “best practices.” They are at the core of our value proposition and responsibility. They enable us to deliver support that respects confidentiality while guaranteeing a level of data protection that meets the unique challenges of mental health.
Responsible innovation is our compass, and protecting your data has been our priority since day one at teale.
If you want to test Joy, it's over here.