Employing corpus linguistics in German private law (Working title)

The habilitation project aims to introduce and explore potential fields of application of corpus linguistics in German private law.

Corpus linguistics is concerned with the computational analysis of large collections of text (so-called “corpora”). For example, the German Reference Corpus (DeReKo) of the Leibniz Institute for the German Language in Mannheim contained over 50 billion words in February 2021. In addition, a German legal reference corpus (JuReKo) and court-specific corpora have recently been compiled for the first time. With the help of numerous tools, corpus linguistics makes it possible to support statements on a wide range of linguistic issues with empirical evidence. One of those issues, and one of particular interest to lawyers, is the meaning of a (legal) term.
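To illustrate what such tools do, the following is a minimal sketch of two basic corpus-linguistic techniques: a keyword-in-context (KWIC) concordance and a simple collocate count. The three example sentences, the function names and the window size are assumptions made for illustration only; the sketch does not access DeReKo, JuReKo or any other real corpus.

```python
import re
from collections import Counter

# Toy corpus: invented sentences standing in for a real legal corpus.
CORPUS = [
    "The buyer must notify the seller of the defect without undue delay.",
    "A defect exists if the goods lack the agreed quality.",
    "The seller is liable for any defect present at the passing of risk.",
]


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def kwic(corpus, keyword, window=3):
    """Return (left context, keyword, right context) for every hit."""
    hits = []
    for sentence in corpus:
        tokens = tokenize(sentence)
        for i, token in enumerate(tokens):
            if token == keyword:
                left = " ".join(tokens[max(0, i - window):i])
                right = " ".join(tokens[i + 1:i + 1 + window])
                hits.append((left, token, right))
    return hits


def collocates(corpus, keyword, window=3):
    """Count the words that occur within `window` tokens of the keyword."""
    counts = Counter()
    for sentence in corpus:
        tokens = tokenize(sentence)
        for i, token in enumerate(tokens):
            if token == keyword:
                counts.update(tokens[max(0, i - window):i])
                counts.update(tokens[i + 1:i + 1 + window])
    return counts


# Print each occurrence of "defect" with its immediate context.
for left, word, right in kwic(CORPUS, "defect"):
    print(f"{left:>20} [{word}] {right}")
```

On a real corpus, the concordance lines show how a term is actually used, and the collocate counts indicate which words typically accompany it; both are common starting points for an empirical account of a term’s meaning.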

Fields of research

Unlike their counterparts in the United States, German lawyers have hardly used corpus linguistics in practice so far. This is not due to a lack of fields of application but to the fact that the empirical analysis of large data sets has been largely unknown in German legal scholarship. The habilitation project will address this research gap. Within the project’s scope, numerous potential fields of research shall be considered:

1. Legal interpretation: Interpretation traditionally begins with a legal term’s literal (ordinary) meaning and concludes by considering its sense and purpose. The project will discuss to what extent corpus linguistics might provide traditional legal interpretation with an empirical foundation.

2. Predictability: A lawyer’s central challenge is to predict future instances of the law’s application, e.g. in the form of a hypothetical court decision on a particular case. It shall be investigated to what extent past court decisions on various widespread problems of private law can be identified by employing corpus linguistics and whether these might serve to predict the outcome of future cases more accurately.

3. Clarifying indeterminate legal concepts, particularly within formalised legal norms: It is already possible to convert legal norms into computer-readable code based on systems of deontic logic. This enables automated compliance checks in experimental applications, for example, with regard to the European General Data Protection Regulation. However, the law’s machine-readability is obstructed by its frequent use of indeterminate terms (e.g. “legitimate interest”; “without hesitation”). Since more specific and tangible groups of cases usually lie behind these indeterminate legal terms, it shall be examined to what extent this case law can be identified on the basis of court-specific corpora.

4. Comparative law and international law: Due to the recent emergence of legal reference corpora in numerous foreign legal systems, it shall also be explored to what extent corpus linguistics might offer a new approach to comparative law. In particular, it will be considered whether a legal comparison obtained through corpus linguistics might solve interpretation problems within international and internationalised law.
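The problem described under point 3 can be sketched in code. The following is a minimal illustration, not an existing system: a norm is modelled as a deontic modality attached to an action and a list of conditions, and each condition evaluates to true, false or undetermined. All names, the simplified norm and the case groups are hypothetical and invented for illustration; they are not taken from the General Data Protection Regulation or from actual case law.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# A condition returns True, False, or None.
# None means: this cannot be decided mechanically.
Condition = Callable[[dict], Optional[bool]]


@dataclass
class Norm:
    modality: str      # "obligation", "prohibition" or "permission"
    action: str
    conditions: list   # list of Condition functions

    def applies(self, facts):
        """Three-valued conjunction over all conditions of the norm."""
        results = [condition(facts) for condition in self.conditions]
        if any(result is False for result in results):
            return False
        if any(result is None for result in results):
            return None
        return True


def is_controller(facts):
    """A determinate condition: decidable directly from the facts."""
    return facts.get("role") == "controller"


# Hypothetical case groups (invented placeholders, not real case law).
# In the project's terms, such groups would be derived from court corpora.
CASE_GROUPS = {
    "fraud prevention": True,
    "undisclosed resale of data": False,
}


def legitimate_interest(facts):
    """An indeterminate term: decidable only where a case group exists."""
    return CASE_GROUPS.get(facts.get("purpose"))


norm = Norm(
    modality="permission",
    action="process personal data",
    conditions=[is_controller, legitimate_interest],
)

print(norm.applies({"role": "controller", "purpose": "fraud prevention"}))
print(norm.applies({"role": "controller", "purpose": "market research"}))
```

In this sketch, the second call cannot be decided because no case group covers the stated purpose. The more case groups can be identified for an indeterminate term, the fewer such undetermined results remain.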

Benefits

Tackling legal problems with corpus linguistics might prove highly beneficial for several reasons. Because an empirical approach to the law’s meaning leaves little room for the subjectivity of the person applying the law (e.g. lawyers, judges), its results are highly persuasive and might eliminate interpretative ambiguities. In turn, this would significantly increase legal certainty and the predictability of court decisions, thus avoiding expensive judicial clarification in many cases. While this might increase the practical effectiveness of the law overall, it could particularly benefit economically weaker parties in mass proceedings, since costly litigation might become unnecessary or at least cheaper. In addition, the empirical predictability of future instances of the law’s application is a core condition of a fully or semi-automated application of the law.