How upcoming EU laws can shake up the machine learning pipeline
From just regulating personal data
The EU has moved from regulating only personal data via GDPR to introducing several laws that affect the machine learning pipeline at different stages. What is that going to look like? Let’s take a look!
Defining the machine learning pipeline
Before diving deeper, let’s assume the following definition of a machine learning pipeline. The goal of this definition isn’t to provide a definitive list, but to offer a model that can be referenced later on, like the OSI model, but for machine learning.
- Project: Scoping the problem
- Data: Defining, Cleansing, Ingestion, Analysis, Validation, Feature Engineering, Splitting
- Model: Building, Training, Validation
- Deployment: Serving, Monitoring, Finetuning
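To make the reference model easier to point at later, the four layers above can be written down as a small lookup structure. This is only an illustrative sketch: the names `Layer`, `PIPELINE`, and `layers_for` are made up for this example and do not come from any library or regulation.

```python
from __future__ import annotations

from enum import Enum


class Layer(Enum):
    """The four layers of the pipeline model (hypothetical naming)."""
    PROJECT = "Project"
    DATA = "Data"
    MODEL = "Model"
    DEPLOYMENT = "Deployment"


# Stages per layer, copied from the list above.
PIPELINE: dict[Layer, list[str]] = {
    Layer.PROJECT: ["Scoping the problem"],
    Layer.DATA: [
        "Defining", "Cleansing", "Ingestion", "Analysis",
        "Validation", "Feature Engineering", "Splitting",
    ],
    Layer.MODEL: ["Building", "Training", "Validation"],
    Layer.DEPLOYMENT: ["Serving", "Monitoring", "Finetuning"],
}


def layers_for(stage: str) -> list[Layer]:
    """Return every layer that lists the given stage, in pipeline order."""
    return [layer for layer, stages in PIPELINE.items() if stage in stages]
```

Note that a stage name does not always identify a single layer: `layers_for("Validation")` returns both the data and the model layer, so it helps to name the layer explicitly when discussing which law applies where.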
Pipeline: the project layer
- Regulating AI systems based on the risk they pose: new law - AI Act
Pipeline: the data layer
- Stronger enforcement of GDPR, including against US companies not based in the EU, as in the case of the fine against Clearview AI
- Regulating handling of non-personal data: new laws - Data Act (access/use rights on data), Data Governance Act (how the data is shared)
- Banning EU-US data transfers
- Linking privacy with competition law
Pipeline: the model layer
Unlike the data layer, where the EU is trying to establish a single market for all data, the model layer itself seems to stay largely unregulated.
- How do you share the model with other companies?
- How do you train on 3rd party models?
Those questions have no explicit answers yet. A single market for models is not here.
Pipeline: the deployment layer
- Regulating the single market in the EU: new laws - Digital Services Act, Digital Markets Act
- Establishing accountability for harm caused by AI systems: new law - AI Liability Directive