In an eagerly anticipated judgment dated 27 September 2024 (case number 310 O 227/23), the Hamburg District Court dismissed the complaint by photographer Robert Kneschke, who had asserted claims for copyright infringement against the non-profit Large-scale Artificial Intelligence Open Network (LAION) based on the use of his photograph in a data set for training AI image generators. Mr. Kneschke had uploaded his photo to a microstock photography website. The website’s terms of use restrict the use of automated programs, applets, bots or the like to access content on the website, including for scraping.

Whilst this contractual exclusion would, at first sight, appear to prevent LAION’s use of the photograph, the Hamburg District Court ruled that LAION’s use was covered by the statutory limitation under section 60d of the German Copyright Act (Urheberrechtsgesetz – UrhG) allowing text and data mining for scientific research purposes.

By way of a side note (obiter dictum), the Hamburg District Court confirmed with convincing arguments that the general limitation for text and data mining under section 44b UrhG was applicable to the use of works as AI training data, a point that has been controversial in German legal literature. In particular, the Court dismissed the argument that section 44b UrhG only covered the exploitation of “information hidden in the data” but not the use of the “content of the intellectual creation”, given that it was not sufficiently clear where the demarcation line falls in the case of digital works. The judges further rejected the argument that the EU legislator did not yet have AI issues on its radar when creating the underlying provision of Article 4 of EU Directive 2019/790 on Copyright in the Digital Single Market (DSM Directive) in 2019. In any event, the EU legislator had unambiguously expressed in the EU AI Act (EU Regulation 2024/1689) that the creation of data sets intended for the training of AI is also subject to the limitation provision of Article 4 of the DSM Directive. This was generally also in line with the assessment of the German legislator when implementing the limitation in 2021.

Although not relevant for the decision in the case at hand, the Hamburg District Court addressed another key question of section 44b UrhG, namely when a right holder’s reservation of use against text and data mining is made in “machine-readable format”. According to a broader understanding, the reservation would be machine-readable if it is included in the respective website’s terms of use. A more restrictive understanding suggests that such a reservation is only effective if a machine can actually read it (e.g. via robots.txt). In this regard, the judges confirmed that the reservation worded in “natural language” in the terms of use of the scraped website would meet the statutory requirement of a reservation in machine-readable format, although this was a question of the individual case, to be assessed in light of the state of technical development at the time the work was used. In fact, it seemed somewhat contradictory to enable the providers of AI models to develop increasingly powerful text-understanding and text-generating AI models via the limitation under section 44b UrhG on the one hand, but not to require them to use these models for detecting right holders’ reservations on the other.
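To illustrate what the more restrictive understanding has in mind when it refers to a reservation that “a machine can actually read”, the following is a minimal sketch of how a crawler could evaluate a robots.txt file before accessing a work. It uses Python’s standard `urllib.robotparser` module. The robots.txt content, the crawler name `ExampleAIBot` and the URL are purely hypothetical assumptions for illustration; the sketch does not reflect the facts of the case and says nothing about which form of reservation is legally sufficient.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content (an assumption for illustration only).
# It bars a fictitious crawler named "ExampleAIBot" from the whole site
# and allows every other user agent.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""


def may_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    # Parse the rules from text, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/photos/sample.jpg"  # hypothetical URL
    for agent in ("ExampleAIBot", "OtherBot"):
        print(agent, "may crawl:", may_crawl(ROBOTS_TXT, agent, url))
```

A reservation worded in natural language in a website’s terms of use cannot be evaluated by such a simple rule check; detecting it would require the kind of text-understanding capability the Court referred to.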

Coming back to the limitation for text and data mining for non-commercial scientific research purposes under section 60d UrhG, which was the provision actually applied in the case at hand, the Hamburg District Court took a rather generous approach. As followed from Recital 42 of EU Directive 2001/29/EC on the harmonization of certain aspects of copyright and related rights in the information society (InfoSoc Directive), the non-commercial nature of the activity in question should be determined by that activity as such. The organizational structure and the means of funding of the establishment concerned were not decisive. On that basis, the judges affirmed the absence of commercial activity, as the data set was made freely available to the public. The fact that commercially active companies also use the training data set was irrelevant. The plaintiff had also not substantiated that LAION fell under the counter-exception for research organizations that cooperate with a private enterprise which exerts a certain degree of influence on the research organization and has preferential access to its findings.

Apart from clearly confirming (with good arguments) that the statutory limitations under sections 44b and 60d UrhG apply to the use of works as AI training data, the judges set the bar for assuming text and data mining for non-commercial scientific research purposes rather low. In individual cases, it may even be sufficient for those using works as AI training data to implement intermediate steps in order to put aside all copyright concerns.

That said, the Hamburg District Court will most likely not have the last word on these questions. The case may still move up to the Hamburg Court of Appeals, the German Federal Court of Justice and even the European Court of Justice. However, the decision is another cornerstone of the international case law now emerging in this uncharted and fast-moving territory.

Our multinational team advises clients on the risks of data scraping on a daily basis so that they can create global internal policies which set out their approach to data scraping in the light of the most recent case law.