Identification and Visualization of Legal Definitions and their Relations Based on European Regulatory Documents

Anastasiya Damaratskaya

Verified Thesis

Abstract

Analyzing regulatory documents is a continuous challenge for numerous companies, especially when it is a manual process. Given the exponential growth in legal acts, legal practitioners must invest vast amounts of time examining legal text for relevant information. Nevertheless, manual analysis remains susceptible to errors and misinterpretation. This thesis concentrates on semi-automating this procedure and presents an approach for extracting legal definitions and their semantic relations from European regulatory documents using natural language processing techniques. We further visualize the obtained data in the implemented web service, which serves as a practical application of the approach. Since existing methodologies for legal information retrieval struggle with interpreting legal text and lack semantic analysis and visualization, our method intends to cover this research gap and deepen the understanding of regulatory documents. To identify legal definitions, we first investigated the structure of legal acts that regulatory documents attempt to follow. After recognizing similar formats, we focused on a single article specifying legal terms, extracted the definitions, and analyzed all semantic relations that occur, such as hyponymy, meronymy, and synonymy. For this purpose, depending on the type of semantic relation, we applied pattern matching and natural language processing techniques, emphasizing dependency parsing and noun phrase chunking. For visualization, the prototype collected the data into separate files and extracted the sentences mentioning legal definitions for each related term. To rapidly discover these sentences in the text and obtain an overview of each term’s frequency, the prototype listed the articles where the definitions occur and counted the number of retrieved sentences.
Additionally, it assigned annotations to the regulatory documents, explaining the legal definitions in each paragraph to facilitate comprehension. The evaluation outcomes demonstrated that the prototype could detect 99.9% of legal definitions and 96.7% of their semantic relations correctly, thereby delivering accurate results for the introduced approach. The study further fulfilled the established requirements intended to simplify the platform’s usage. Consequently, these results demonstrate that natural language processing techniques perform well in the classification phase and are suitable for definition and relation extraction.
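
The pattern-matching and sentence-counting steps described in the abstract can be illustrated with a minimal Python sketch. This is not the thesis prototype: the regular expression, the function names `extract_definitions` and `count_mentions`, and the sample text are all hypothetical, and the sketch assumes definitions follow the common "(n) 'term' means explanation;" layout of a definitions article. The dependency parsing and noun phrase chunking used for semantic relations are not covered here.

```python
import re
from collections import Counter

# Assumed layout of one definition line: (n) 'term' means explanation;
# Both straight and typographic quotation marks are accepted.
DEFINITION_PATTERN = re.compile(
    r"\(\d+\)\s*['‘](?P<term>[^'’]+)['’]\s+means\s+(?P<explanation>.+?)\s*[;.]\s*$",
    re.MULTILINE,
)


def extract_definitions(article_text: str) -> dict[str, str]:
    """Map each defined term to its explanation (hypothetical helper)."""
    return {
        match.group("term"): match.group("explanation")
        for match in DEFINITION_PATTERN.finditer(article_text)
    }


def count_mentions(terms, sentences) -> Counter:
    """Count the sentences that mention each term as a whole word or phrase."""
    counts = Counter()
    for sentence in sentences:
        lowered = sentence.lower()
        for term in terms:
            if re.search(r"\b" + re.escape(term.lower()) + r"\b", lowered):
                counts[term] += 1
    return counts


# Illustrative sample text, not taken from any actual regulation.
sample_article = """(1) 'personal data' means any information relating to an identified natural person;
(2) 'processing' means any operation performed on personal data;
(3) 'controller' means the natural or legal person which determines the purposes of the processing."""

sample_sentences = [
    "The controller shall document each processing activity.",
    "Personal data shall be processed lawfully.",
    "The controller is responsible for compliance.",
]

definitions = extract_definitions(sample_article)
mentions = count_mentions(definitions, sample_sentences)
```

Here `definitions` maps each extracted term to its explanation, and `mentions` holds the number of sample sentences in which each term appears. Whole-word matching is a deliberate simplification: inflected forms such as "processed" are not counted as mentions of "processing".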

Topics

legal definitions, legal information extraction, natural language processing

Research Methods

Publication Data

Author: Anastasiya Damaratskaya

Signing Author Pub-Key: 0xAa685918A7c03ea05F1d0df9504Da79f3f233643

Thesis Type: Bachelor's Thesis

Pages: 156

Language: English

DOI:

About the Author:

Major / Study Program: Informatics

Primary Field of Study:

Additional Study Interests:

Publication Contract: 0x852EDAC65DD61b20328F465A85bfe7D432308665

License: CC BY-NC-ND 4.0

Date of Publication: 11/10/23

Status: Available

Date of Grading: 05/16/23

Institution: Technical University of Munich, Germany

Endorsements

#	Name	Details	Endorsement
1	Catherine Sai Supervisor	Technical University of Munich Email: catherine.sai@tum.de Web: https://www.cs.cit.tum.de/bpm/staff/ Pub-Key: 0x955Ae4A7a39Eb4130185980ccE06046D7AC2Db3b	11/09/23 12:00:00 AM

Thesis Documents and Supplemental Materials

11/04/25 06:32:08 AM

#	Description	Type	Upload Date	Location
1	Thesis Document	PDF (33.42MB)	11/06/23 12:00:00 AM	IPFS
