Trump Towers, Ofis Kule:2 Kat:18, No:12, Sisli, Istanbul, Turkey

Publication

Regulating Text and Data Mining in the European Union: Issues and Challenges

Text and Data Mining (“TDM”) is the process of deriving information from machine-readable material by copying it in large quantities, extracting the data and recombining it to identify patterns. The importance of TDM lies in its potential to accelerate scientific progress exponentially. It may speed up the search for cures for autoimmune and genetic diseases or provide data for new solutions to current technical problems. TDM was regulated in the Directive on Copyright in the Digital Single Market (“the DSM Directive”), which came into force on June 6, 2019 and aims to amend European Union (“EU”) copyright and database legislation. Certain provisions of the DSM Directive were drafted exclusively for TDM, which the Directive describes as any automated analytical technique aimed at analyzing text and data in digital form in order to generate information, including but not limited to patterns, trends and correlations. By generating large amounts of data, miners can obtain extensive know-how in a particular research field, which enables cumulative innovation. The “text” refers to copyright-protected works within the intellectual property law regime, while the “data” refers to data drawn from databases, which are subject to database rights. The EU copyright regime has therefore provided exceptions to copyright protection and database rights for text and data mining carried out for non-commercial scientific purposes.

Background and the Current Digital Single Market Directive

Before the DSM Directive came into force, the directives in force were the Information Society Directive (“InfoSoc Directive”), covering copyright and related rights in the digital world, and the Database Directive, covering sui generis database rights. The InfoSoc Directive gave rights holders an all-inclusive reproduction right: the exclusive right to authorize every direct or indirect, temporary or permanent reproduction by any means and in any form. TDM therefore created potential conflicts with copyright law and sui generis database rights whenever protected content was mined by third parties, as the process was not mentioned in either directive. The DSM Directive aimed to extend the TDM exceptions to sui generis database rights by aligning them with the general exceptions of copyright law. The EU’s approach towards text and data mining in the commercial realm is more ambivalent, as the mandatory TDM exception was provided for the benefit of non-commercial research organizations and cultural heritage institutions. As a result, the DSM Directive now comprises two mandatory provisions: i) acts of reproduction and extraction by research organizations and cultural heritage institutions for the purposes of scientific research (Article 3); and ii) acts of reproduction and extraction for the purposes of text and data mining in general (Article 4).

Exceptions Granted by the DSM Directive

The DSM Directive permits mining activities by research organizations and cultural heritage institutions per se, and in addition allows them to store and retain copies of mined works and other subject matter for the purposes of scientific research, including the verification of research results. The exemptions provided by the DSM Directive are particularly important for empirical scientific research, which requires research data to remain available for corroboration. The DSM Directive also defines research organizations and cultural heritage institutions. “Research organizations” are non-profit entities or entities tasked with a public service research mission, while “cultural heritage institutions” are publicly accessible libraries, museums, archives and film or audio heritage institutions. Public broadcasting organizations and commercial research organizations are therefore excluded from the exemptions granted to research organizations and cultural heritage institutions. They may still benefit from the copyright and database right exemptions under the DSM Directive if they comply with the conditions set by the rights holder. In fact, all types of organizations and institutions are allowed to reproduce, extract and retain text or data for mining purposes, regardless of any underlying commercial motives. However, to protect their commercial interests, rights holders can opt out of the TDM exemption in their contracts with such miners, which they cannot do in contracts signed with non-profit research organizations and cultural heritage institutions.

With regard to accessing works and databases, the abovementioned research organizations and cultural heritage institutions are required to have lawful access to the content they mine. Lawful access covers access to content pursuant to contractual arrangements (e.g. open access licensing or subscription), as well as content that is freely available online. The source used must also be indicated during TDM activities in order to credit the rights holder. As an important side note, rights holders are prevented from contractually ruling out TDM in their terms of agreement. The DSM Directive expressly states that contractual provisions contrary to the digital use of works and other subject matter for the sole purpose of illustration in digital and cross-border educational activities (under non-commercial use) are unenforceable. Nevertheless, because national legislators enact their own national laws on the basis of an EU directive, they can specify which works do not fall under the educational use exemptions. For commercial TDM activities, for example, users are required to provide fair compensation to the rights holder, and the option to opt out of the TDM exemption is provided only in respect of non-research uses. Additionally, rights holders remain free to apply measures to ensure the security and integrity of the networks and databases where the works or other subject matter are hosted. However, such measures must be limited to what is justified by the security and integrity objectives, such as controlling access requests and the downloading of works and other subject matter.

Positive and Negative Impacts on the EU Innovation Policies

Harmonization of TDM under copyright and database regulations is also supported by the extended scope of the limitations, which cover commercial and non-commercial uses, and by the unenforceability of contrary contractual provisions under the DSM Directive. These measures are expected to prevent rights holders from depriving users of the TDM exceptions. Without them, publishers could contractually rule out mining in a licensing agreement, and the transaction costs of obtaining permission from multiple rights holders could make TDM projects unsustainable. However, the TDM provisions of the DSM Directive are criticized for securing less freedom to text and data mine, particularly in the R&D process. For instance, the opt-out clause for rights holders in the second provision leaves for-profit miners at the mercy of content owners. While it may be reasonable from an intellectual property law perspective to constrain third-party access to protected content, this puts Artificial Intelligence developers, journalists, commercial research labs, and other innovators at a competitive disadvantage in comparison with the United States, where text and data mining is deemed fair use even if it is done for profit. This problem raises questions about the criteria for qualifying as a “research organization” under the DSM Directive. As explained above, to qualify for the exception the research organization must operate on a not-for-profit basis, reinvest all its profits in its scientific research, or act pursuant to a public interest mission. Since these exceptions do not apply to research institutions controlled by commercial undertakings, this policy might, from a practical market-based perspective, cripple opportunities for start-ups and individual researchers in this area.

Yet the DSM Directive does provide for a limited, indirect application of the new exception to private parties. Research organizations that engage in public-private partnerships may benefit from the “research organization” exception. The DSM Directive refers to TDM research carried out by private businesses within the framework of a collaboration with a research organization, unless the research organization is itself a commercial entity or is controlled by a private business. In any event, public-private partnerships might be a limited option for start-ups, as the transaction costs and financing involved may not be feasible for small teams. The narrow application of the limitation to research organizations does not fully provide the EU market with the legal framework needed to close the gap with other jurisdictions. It has also been suggested that the DSM Directive might have adopted opening clauses or fair use models to allow a broader range of research players to perform TDM research and promote related innovation. Given the global nature of the modern economy, the Article 3 exceptions of the DSM Directive may leave the EU behind other leading innovative economies that enable all undertakings to carry out TDM under fair use or fair dealing models (e.g., the USA, Canada, and Israel).

Secondly, some critics have found the DSM Directive’s approach to “lawful access” insufficiently purpose-specific, as it disregards unspecified applications of TDM that might be construed as falling within the exclusive rights of copyright holders. This criticism also raises subtle questions about how the new limitation applies to research organizations that already enjoy lawful access to a database. For example, would a public university granted lawful access to a database under an “educational purposes” license need to pay an additional licensing fee to use it for “scientific research” purposes? If so, would this obligation contradict the prohibition on contractually overriding the TDM exception? Since the scope of the new limitation covers both purposes, and since the DSM Directive prevents contractual override, the answer is probably not. Still, research institutions might find these legal uncertainties a constraint on the deployment of TDM research, given the potential liability that might arise and the related transaction costs that should be considered before running a TDM research project. Rights holders can effectively deny the exception to certain users by refusing to grant lawful access to works or by granting such access on a conditional basis only. Such measures will make TDM research projects harder to run by raising transaction costs. Subjecting TDM research to market access discriminates between research projects according to the market power of the research organization. Only a few research organizations will be able to acquire licenses for all the databases or copyrighted works relevant to a TDM research project, which raises regulatory questions (e.g. under competition law) for national administrations.

What Does the DSM Directive Mean for Turkish Legislation?

Currently, Turkish legislation does not address TDM in its Law on Intellectual and Artistic Works (Fikir ve Sanat Eserleri Kanunu, “FSEK”), although the law does refer to exceptional uses of copyrighted works and has integrated database rights into its provisions. As part of the Continental European civil law tradition, Turkey mostly follows the EU legislator’s position in lawmaking. This is relevant for FSEK, which was amended in parallel with the former Copyright and Database Directives of the EU. However, this leaves TDM exemptions for researchers in ambiguity. Given the criticism of the DSM Directive’s TDM measures within the EU, and given that the Directive is aimed at the EU internal market, Turkish legislators may choose not to implement a similar TDM provision in intellectual property legislation, since it may hinder commercial R&D by start-ups and commercial research organizations. Given the potential innovative effects of, and market for, AI-based technologies in the information technology industry and disease prevention methods in the medical sector, alternative approaches may be needed in future lawmaking.

 

Author: Sinan Erkan

 

© 2025 HERDEM | All Rights Reserved. Powered by Stingreys
