Do you have multiple Symfony applications and need to anonymize user data? Tired of handling anonymization manually? Why not create a custom bundle to automate the process? In this article, we’ll explore how to achieve this.
If you live or work in the EU, you’ve likely heard of the GDPR, the regulation that was adopted in 2016 and has applied since May 2018. It protects personal data by law and grants individuals rights over it, including the right to erasure (the “right to be forgotten”), which is the requirement anonymization addresses in this article.
Companies that fail to comply can face significant fines of up to €20 million or 4% of global annual turnover, whichever is higher.
But how do you translate this requirement into code, especially when faced with multiple applications that have been built and evolved over the years, now requiring automated GDPR compliance?
Let’s see how a custom bundle can help integrate GDPR compliance across multiple Symfony applications.

What is a Symfony bundle?
Even if you’re new to Symfony, you’ve probably heard of Symfony Bundles. Introduced in Symfony 2.0 (2011), they are like packages of code, used to organize and reuse functionality across projects. They are similar to plugins or libraries, containing everything needed to implement a specific feature.
Project context
We had a request to implement GDPR rules across multiple applications where user data is stored.
This client had different PHP apps, almost all under Symfony.
Their request was simple:
- Anonymize a given user across all applications from a single interface, using a first name, last name, email, or phone number as input.
- Search for a user and choose which one to anonymize.
- Automatically anonymize users who haven’t been updated in over three years.
The challenges:
- Different PHP and framework versions across applications.
- Inconsistent database structures (e.g., fields for “first name” could be first_name, firstName, etc.).
- Multiple tables storing user data (e.g., user, address).
- Ensuring a unified approach while handling varied application architectures.
What If Someone Already Solved This?
Our first step was to explore the open-source world to see if an existing implementation met our requirements.
We looked at several existing solutions, such as DbToolsBundle and the gdpr-bundle by Superbrave. However, they either lacked key features (such as Collection anonymization), were no longer maintained, or did not expose the configurable search endpoints we needed to find the users to anonymize. After confirming that none of them fully met our needs, we decided to build our own solution.
Solution
The solution was to create a Symfony bundle that could be installed across applications, providing all necessary anonymization functionalities.
When the bundle is installed on an application that needs anonymization, a migration creates a new table. This table stores all incoming anonymization requests: each row contains the user ID and the name of the entity to anonymize (one application can have several entities to anonymize).
Two main routes are then added to the app automatically:
- Search GET route: given some user information as input, it returns the matching users found in the database.
- Anonymize POST route: given a user ID as input, it creates an Anonymization Request entry in the table created by the migration.
But how do we know in which entities and properties to search, and which ones to anonymize? That’s where attributes come into play.
First comes the Search class attribute. As seen earlier, the bundle is installed on many different apps with different field names, so the mapping below ensures that everything we return to the front end has the same format.
#[GDPR\Search(
    filter: [
        "firstName" => ["firstname"],
        "lastName" => ["lastname"],
        "email" => ["email"],
        "phone" => ["phone", "mobile"],
    ],
    output: [
        "firstName" => ["firstname"],
        "lastName" => ["lastname"],
        "email" => ["email"],
        "phoneNumber" => ["phone", "mobile"],
        "zipCode" => ["zipcode"],
        "city" => ["city"],
        "updatedDate" => ["updated"],
    ]
)]
In this attribute, the filter option determines which fields are searched for a given input, and the output option formats the user result for the front end.
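To make the role of the output mapping concrete, here is a minimal sketch in plain PHP of how a raw database row could be turned into the unified format. The function name normalizeRow and the rule that the first non-empty candidate column wins are assumptions made for illustration; the article does not show how the bundle resolves several candidate columns.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: builds a normalized result from one raw row.
 * For each output key, the first candidate column holding a non-empty
 * value is used; if none matches, the key is set to null.
 *
 * @param array<string, list<string>> $outputMap output key => candidate columns
 * @param array<string, mixed>        $row       raw column => value
 * @return array<string, mixed>
 */
function normalizeRow(array $outputMap, array $row): array
{
    $result = [];
    foreach ($outputMap as $outputKey => $candidateColumns) {
        $result[$outputKey] = null;
        foreach ($candidateColumns as $column) {
            // isset() is false for missing keys and for null values
            if (isset($row[$column]) && $row[$column] !== '') {
                $result[$outputKey] = $row[$column];
                break;
            }
        }
    }

    return $result;
}

$outputMap = [
    'firstName'   => ['firstname'],
    'phoneNumber' => ['phone', 'mobile'],
];
$row = ['firstname' => 'Ada', 'phone' => null, 'mobile' => '0601020304'];

$normalized = normalizeRow($outputMap, $row);
// $normalized is ['firstName' => 'Ada', 'phoneNumber' => '0601020304']
```

Whatever the column names are in a given application, the front end only ever sees the keys declared in the output mapping.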
Alright, we’ve found some users to anonymize. Now we need to know what fields the anonymization will be performed on.
For this purpose, we created the Anonymize property attribute, which can be applied to any field that needs to be anonymized.
#[GDPR\Anonymize(type: AnonymizationType::STRING, value: 'anonymized')]
private string $value;
The attribute takes an anonymization type as a parameter, and there is one type per kind of field. For instance, we have a string type for string values, a collection type for Collections, an object type for Objects, a null type to set fields to null, and so on.
Some anonymization types have options that control how the data is transformed: set it to null where possible, fill it with random data using the Faker library, replace it with a specific value, or replace it with a value containing a wildcard.
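As a rough illustration of the wildcard option, a placeholder written in braces, such as {id}, could be resolved from the object’s own properties. The function resolveTemplate below is hypothetical, and reading public properties directly is an assumption; the bundle may resolve values differently (for example through getters).

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical resolver: replaces every {property} placeholder in the
 * template with the value of the same-named readable property on the
 * object. Placeholders that cannot be resolved are left untouched.
 */
function resolveTemplate(string $template, object $object): string
{
    $resolved = preg_replace_callback(
        '/\{(\w+)\}/',
        static function (array $match) use ($object): string {
            $name = $match[1];

            return property_exists($object, $name) && isset($object->$name)
                ? (string) $object->$name
                : $match[0];
        },
        $template
    );

    // preg_replace_callback() returns null only on a regex engine error
    return $resolved ?? $template;
}

$address = new stdClass();
$address->id = 10;

echo resolveTemplate('some-value-{id}', $address), PHP_EOL; // prints "some-value-10"
```

The attribute example that follows uses this kind of template in its value option.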
#[ORM\Column(type: 'integer')]
public int $id;
#[ORM\Column(type: 'string')]
#[GDPR\Anonymize(type: AnonymizationType::STRING, value: 'some-value-{id}')]
public string $address;
When the address field is anonymized here, its value becomes “some-value-10”, where 10 is the object ID.
Each anonymization type behaves differently. With the collection type, anonymization is applied recursively: in every item of the collection, each field that carries an anonymization attribute is anonymized. With the right attributes on every field and every entity related to a User, we can be sure that nothing is missed.
For example, a User entity may own a collection of addresses stored in a separate table. By applying the Anonymize attribute to the User’s addresses property, we ensure that all related data is anonymized.
// src/Entity/User.php
#[ORM\Entity]
class User {
    // mappedBy and the inverse side are omitted for brevity
    #[ORM\OneToMany(targetEntity: Address::class)]
    #[GDPR\Anonymize(type: AnonymizationType::COLLECTION)]
    public Collection $addresses;
}
// src/Entity/Address.php
#[ORM\Entity]
class Address {
    #[ORM\Column(type: 'string')]
    #[GDPR\Anonymize(type: AnonymizationType::STRING)]
    public string $street;
}
In this example, the street from each address of the User will be anonymized too.
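The recursive behavior can be sketched without Doctrine. The example below is a simplified stand-in, not the bundle’s code: the attribute takes plain strings for its type, the collection is a PHP array rather than a Doctrine Collection, and the function anonymizeObject is a hypothetical name.

```php
<?php
declare(strict_types=1);

// Simplified stand-in for the bundle's property attribute (illustrative only).
#[Attribute(Attribute::TARGET_PROPERTY)]
final class Anonymize
{
    public function __construct(
        public string $type,
        public ?string $value = null,
    ) {}
}

final class Address
{
    #[Anonymize(type: 'string', value: 'anonymized')]
    public string $street = '';
}

final class User
{
    #[Anonymize(type: 'string', value: 'anonymized')]
    public string $email = '';

    /** @var list<Address> */
    #[Anonymize(type: 'collection')]
    public array $addresses = [];
}

/**
 * Walks every property carrying the attribute. String properties are
 * overwritten; 'collection' properties are traversed and each item is
 * anonymized with the same function.
 */
function anonymizeObject(object $object): void
{
    foreach ((new ReflectionObject($object))->getProperties() as $property) {
        $attributes = $property->getAttributes(Anonymize::class);
        if ($attributes === []) {
            continue; // not marked for anonymization
        }

        $config = $attributes[0]->newInstance();

        if ($config->type === 'collection') {
            foreach ($property->getValue($object) as $item) {
                anonymizeObject($item);
            }
            continue;
        }

        $property->setValue($object, $config->value ?? 'anonymized');
    }
}

$home = new Address();
$home->street = '1 Main Street';

$user = new User();
$user->email = 'ada@example.com';
$user->addresses = [$home];

anonymizeObject($user);
// $user->email and $home->street now both hold 'anonymized'
```

Because the traversal is driven only by attributes, a related entity is covered as soon as its own fields are annotated.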
Processing Anonymization Requests
We’ve seen that when an anonymization request is submitted, an entry is created in the anonymization requests table. Then, a cron command will check each entry in the table that has not been anonymized and will perform the anonymization; this way, everything is done asynchronously.
In summary, the command loads all pending anonymization requests, uses Doctrine’s metadata to find every property carrying an attribute, and then, with the help of Reflection, assigns a new value to each of those properties.
public function anonymizeField(object $object, string $property, Anonymize $attribute): AnonymizedProperty
{
    $anonymizer = $this->anonymizerCollection->getAnonymizer($attribute->type);
    $propertyValue = $this->propertyManipulator->getPropertyValue($object, $property);
    if (null === $propertyValue) {
        return new AnonymizedProperty($property, null, $propertyValue);
    }
    $options = new AnonymizeOptions($attribute->value, $object);
    $task = new AnonymizeTask($propertyValue, $property, $options);
    $anonymizedProperty = $anonymizer->anonymize($task);
    $this->propertyManipulator->setPropertyValue($object, $property, $anonymizedProperty->getPropertyValue());
    return $anonymizedProperty;
}
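The method above relies on an anonymizer collection that returns the right anonymizer for a given type. One way such a registry could look is sketched below. Only AnonymizationType::STRING and the getAnonymizer method name come from the article; the TO_NULL case, the Anonymizer interface, and its simplified anonymize signature (which does not use AnonymizeTask) are assumptions.

```php
<?php
declare(strict_types=1);

// STRING appears in the article; TO_NULL is a hypothetical case name
// standing in for the "null type" it mentions.
enum AnonymizationType: string
{
    case STRING = 'string';
    case TO_NULL = 'null';
}

// Hypothetical, simplified contract for one anonymization strategy.
interface Anonymizer
{
    public function supports(AnonymizationType $type): bool;

    public function anonymize(mixed $value, ?string $replacement): mixed;
}

final class StringAnonymizer implements Anonymizer
{
    public function supports(AnonymizationType $type): bool
    {
        return $type === AnonymizationType::STRING;
    }

    public function anonymize(mixed $value, ?string $replacement): mixed
    {
        return $replacement ?? 'anonymized';
    }
}

final class NullAnonymizer implements Anonymizer
{
    public function supports(AnonymizationType $type): bool
    {
        return $type === AnonymizationType::TO_NULL;
    }

    public function anonymize(mixed $value, ?string $replacement): mixed
    {
        return null;
    }
}

final class AnonymizerCollection
{
    /** @param iterable<Anonymizer> $anonymizers */
    public function __construct(private iterable $anonymizers) {}

    /** Returns the first registered anonymizer that supports the type. */
    public function getAnonymizer(AnonymizationType $type): Anonymizer
    {
        foreach ($this->anonymizers as $anonymizer) {
            if ($anonymizer->supports($type)) {
                return $anonymizer;
            }
        }

        throw new InvalidArgumentException(
            sprintf('No anonymizer registered for type "%s".', $type->value)
        );
    }
}

$collection = new AnonymizerCollection([new StringAnonymizer(), new NullAnonymizer()]);
$masked = $collection->getAnonymizer(AnonymizationType::STRING)->anonymize('Ada', 'anonymized');
```

With this shape, supporting a new field type means adding one class, without touching the code that walks the entities.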
A single interface to control it all
But that was not enough: we also needed to group all calls behind a single interface. For this purpose, we created an app that works as a gateway. It receives all calls from the front end and dispatches them to the applications that either have the bundle installed or expose the two API endpoints, search and anonymize. The second option is useful for applications that don’t run on Symfony, or that aren’t even built in PHP.
To ensure security and restrict access to anonymization requests, the gateway also checks a third app that stores all user access information — let’s call it UserApp. A unique token per application is also used to secure each API call.
Here is a diagram that summarizes everything we have seen so far:

Application-Specific Requirements
Since each application had unique requirements, we needed custom hooks for pre-search filtering, post-search modifications, and pre- and post-anonymization processing.
For example, one application required re-indexing after any update to the User table. To handle such cases, we introduced custom events that the bundle dispatches when needed.
Automatic Anonymization
But we also talked about automatic anonymization, right? Another attribute joins the party to handle this:
#[Attribute(Attribute::TARGET_CLASS)]
final class AutomaticAnonymization
{
    public function __construct(
        public string $name = '',
    ) {}
}
This attribute is placed on the main User class of each application and takes the name of the field that stores the last-updated date (the name varies between applications, for example updateAt or updatedDateTime). This tells us which field tracks when the user was last updated. Then, a dedicated command that runs daily finds the users last updated more than three years ago and creates anonymization requests, which are later processed by the first command we discussed.
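As a sketch of that daily check, the snippet below reads the field name from the class attribute and filters users by age of last update. The helper names lastUpdateField and findStaleUsers are hypothetical, and the in-memory filter is an assumption made to keep the example self-contained; a real command would more likely express the threshold in a database query.

```php
<?php
declare(strict_types=1);

// Attribute as shown in the article.
#[Attribute(Attribute::TARGET_CLASS)]
final class AutomaticAnonymization
{
    public function __construct(
        public string $name = '',
    ) {}
}

// Hypothetical entity whose last-update field has a non-standard name.
#[AutomaticAnonymization(name: 'updatedDateTime')]
final class User
{
    public function __construct(
        public int $id,
        public DateTimeImmutable $updatedDateTime,
    ) {}
}

/** Reads the date field name declared on the class, or null if absent. */
function lastUpdateField(string $class): ?string
{
    $attributes = (new ReflectionClass($class))
        ->getAttributes(AutomaticAnonymization::class);

    return $attributes === [] ? null : $attributes[0]->newInstance()->name;
}

/**
 * Returns the users whose last update is more than three years before $now.
 *
 * @param iterable<object> $users
 * @return list<object>
 */
function findStaleUsers(iterable $users, string $field, DateTimeImmutable $now): array
{
    $threshold = $now->modify('-3 years');
    $stale = [];
    foreach ($users as $user) {
        if ($user->$field < $threshold) {
            $stale[] = $user;
        }
    }

    return $stale;
}

$now = new DateTimeImmutable('2025-01-01');
$users = [
    new User(1, new DateTimeImmutable('2021-06-01')), // older than three years
    new User(2, new DateTimeImmutable('2023-06-01')), // recent enough
];

$field = lastUpdateField(User::class);
$stale = findStaleUsers($users, $field, $now);
// $stale contains only the user with id 1
```

Each stale user would then get an anonymization request, so the existing processing command handles manual and automatic requests in the same way.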
Limitations
Everything worked as expected until we encountered an old application still running on PHP 7.4 that also needed GDPR anonymization. But here’s the catch: PHP 7.4 doesn’t support attributes, which only arrived in PHP 8.0. To work around this, we introduced another version of the bundle in which everything is done via annotations. Thanks to Rector, the migration was not as hard as expected.
Then, we added some database load control – to prevent crashes, we capped anonymization batches at 1000 records per run.
Finally, we needed a backup mechanism in case something went wrong — because even with robust unit and kernel tests, surprises can still happen! To address this, we added an original_data field to the anonymization_request table, storing all anonymized data in JSON format, encrypted with a secret key. Each application and environment has its own dedicated key to ensure security. Additionally, we implemented a command that automatically deletes the encrypted data after one month, preventing unnecessary data retention.
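The article does not say which cipher or library protects the original_data field. As one possible approach, the sketch below uses PHP’s bundled sodium extension (secret-key authenticated encryption) to encrypt the JSON snapshot and restore it later. The function names are hypothetical, and storing the nonce in front of the ciphertext is an assumption of this sketch.

```php
<?php
declare(strict_types=1);

/**
 * Encrypts a snapshot of the original values as JSON.
 * The random nonce is stored in front of the ciphertext.
 */
function encryptOriginalData(array $originalData, string $key): string
{
    $nonce = random_bytes(SODIUM_CRYPTO_SECRETBOX_NONCEBYTES);
    $plaintext = json_encode($originalData, JSON_THROW_ON_ERROR);

    return base64_encode($nonce . sodium_crypto_secretbox($plaintext, $nonce, $key));
}

/** Reverses encryptOriginalData(); throws if the key is wrong or the data was altered. */
function decryptOriginalData(string $payload, string $key): array
{
    $raw = base64_decode($payload, true);
    if ($raw === false || strlen($raw) <= SODIUM_CRYPTO_SECRETBOX_NONCEBYTES) {
        throw new RuntimeException('Malformed payload.');
    }

    $nonce = substr($raw, 0, SODIUM_CRYPTO_SECRETBOX_NONCEBYTES);
    $ciphertext = substr($raw, SODIUM_CRYPTO_SECRETBOX_NONCEBYTES);

    $plaintext = sodium_crypto_secretbox_open($ciphertext, $nonce, $key);
    if ($plaintext === false) {
        throw new RuntimeException('Decryption failed: wrong key or tampered data.');
    }

    return json_decode($plaintext, true, 512, JSON_THROW_ON_ERROR);
}

// In practice the key would come from per-application, per-environment
// configuration; it is generated here only to keep the example runnable.
$key = sodium_crypto_secretbox_keygen();

$original = ['email' => 'ada@example.com', 'street' => '1 Main Street'];
$stored = encryptOriginalData($original, $key);
$restored = decryptOriginalData($stored, $key);
// $restored equals $original
```

Since the key is needed to read the snapshot, an application or environment with a different key cannot restore it, which matches the one-key-per-application-and-environment setup described above.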
Final words
By creating a custom Symfony bundle, we were able to:
- Standardize GDPR compliance across multiple PHP applications.
- Ensure flexibility for different database structures.
- Handle manual and automatic anonymization seamlessly.
- Provide a centralized interface for all GDPR requests.
- Implement security measures to protect access to the anonymization system.
So handling GDPR compliance across multiple applications may not seem easy, but with the right tools, architecture and automation it’s definitely achievable!
GDPR Made Easy: Automating Anonymization in Symfony was originally published in ekino-france on Medium.