Artificial intelligence (“AI”), machine learning (“ML”), and the metaverse all depend on very large amounts of data, a phenomenon known as big data. More precisely, AI, ML and the metaverse are applications or application environments that require large amounts of data as an essential condition of operation. AI and ML select and analyze these volumes of data to capture trends, behaviors and patterns that are useful for making optimal decisions and adopting new strategies. The metaverse uses them to power virtual realities in which users, and the avatars they create, add to the volume of information with new data produced through their interaction in the simulated environment.
Big data are characterized by a kind of information bulimia fueled by a multiplicity of sources. This appetite risks blurring, if not outright nullifying, the principles of purpose limitation and data minimization, which are cornerstones of the data protection framework.
In this installment we will look at the main ways in which big data can be fed and at the related implications under the GDPR and the laws of the EU data strategy.
Main data acquisition modalities
The main modalities for acquiring data, and ‘personal’ data in particular, from both a technical and a legal point of view, are:
- Data sharing: that is, the provision of data by the sources that hold it. In the case of personal data, the main source is the data subject; the EU Data Governance Act (“DGA”) also covers cases in which data is made available by:
  - public entities, for data they hold in the performance of their institutional functions, with the possible assistance of the competent bodies for reuse;
  - data intermediation service providers, which facilitate interactions, again from both a technical and a legal perspective, between data holders (data controllers and data subjects) and data users (companies that use the data for their own purposes).
- Commercialization or ‘services versus data’: the provision of services or the granting of incentives in exchange for the data subject's consent to the provider's use of his or her personal data for the provider's own commercial purposes, other than the provision of the service itself.
- Data monetization or ‘data capitalization’: the provision of personal data, and the associated authorization to use it for multiple purposes, is rewarded with financial remuneration, in effect recognizing personal data as a commercial asset.