2024 LegalPro-BERT: Classification of Legal Provisions by Fine-tuning BERT Large Language Model
- (Tewari, 2024) ⇒ Amit Tewari. (2024). “LegalPro-BERT: Classification of Legal Provisions by Fine-tuning BERT Large Language Model.” doi:10.48550/arXiv.2404.10097
Subject Headings: Fine-Tuned BERT Model.
Notes
- The paper introduces LegalPro-BERT, a fine-tuned BERT model specialized in classifying legal provisions within contracts, demonstrating significant performance improvements over existing benchmarks.
- The research employs the LEDGAR dataset, sourced from LexGLUE, consisting of 80,000 labeled paragraphs from legal contracts, to fine-tune and evaluate the LegalPro-BERT model.
- The paper reports that LegalPro-BERT achieves a micro-F1 score of 0.93 and a macro-F1 score of 0.88, surpassing previous benchmark models on the same classification task.
- The methodology includes the use of transfer learning techniques where the BERT model, pre-trained on general language data, is fine-tuned on a legal-specific corpus, enabling it to effectively capture the unique lexicon found in legal texts.
- The experimental setup detailed in the paper uses various evaluation metrics like the F1-score to assess the class-wise performance and predictive power of the classification model.
- A key preprocessing step retains only the top-100 most frequent words in each category, which reduces noise in the training data and improves classification accuracy.
- The paper also suggests future work on fine-tuning with different subsets of words and on applying the model to domains beyond law, such as finance and healthcare, which could likewise benefit from specialized document classification.
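The micro- vs. macro-F1 distinction behind the scores cited in the notes above can be made concrete with a small, self-contained sketch. The labels and counts here are hypothetical toy values, not the paper's evaluation code:

```python
from collections import Counter

def micro_macro_f1(y_true, y_pred):
    """Micro- and macro-averaged F1 for single-label classification.

    Micro-F1 pools true/false positives across all classes, so frequent
    classes dominate; macro-F1 averages per-class F1, weighting rare
    classes equally.
    """
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gets a false positive
            fn[t] += 1  # gold class gets a false negative
    TP, FP, FN = sum(tp.values()), sum(fp.values()), sum(fn.values())
    micro = 2 * TP / (2 * TP + FP + FN)
    per_class = [
        2 * tp[c] / (2 * tp[c] + fp[c] + fn[c])
        if (tp[c] or fp[c] or fn[c]) else 0.0
        for c in labels
    ]
    macro = sum(per_class) / len(per_class)
    return micro, macro

# Toy example with three hypothetical provision labels
y_true = ["governing_law", "governing_law", "indemnification",
          "termination", "termination", "termination"]
y_pred = ["governing_law", "indemnification", "indemnification",
          "termination", "termination", "governing_law"]
micro, macro = micro_macro_f1(y_true, y_pred)
```

For a single-label task like LEDGAR, micro-F1 coincides with accuracy, so the two averages diverge (as with the paper's 0.93 vs. 0.88) precisely when some provision labels are rare and harder to classify.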
Cited By
Quotes
Abstract
A contract is a type of legal document commonly used in organizations. Contract review is an integral and repetitive process to avoid business risk and liability. Contract analysis requires the identification and classification of key provisions and paragraphs within an agreement. Identification and validation of contract clauses can be a time-consuming and challenging task demanding the services of trained and expensive lawyers, paralegals or other legal assistants. Classification of legal provisions in contracts using artificial intelligence and natural language processing is complex due to the requirement of domain-specialized legal language for model training and the scarcity of sufficient labeled data in the legal domain. Using general-purpose models is not effective in this context due to the use of specialized legal vocabulary in contracts which may not be recognized by a general model. To address this problem, we propose the use of a pre-trained large language model which is subsequently calibrated on legal taxonomy. We propose LegalPro-BERT, a BERT transformer architecture model that we fine-tune to efficiently handle classification task for legal provisions. We conducted experiments to measure and compare metrics with current benchmark results. We found that LegalPro-BERT outperforms the previous benchmark used for comparison in this research.
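The fine-tuning setup the abstract describes, a classification head placed on top of a pre-trained encoder, can be sketched in miniature. The dimensions, weights, and label count below are hypothetical placeholders (BERT-base actually produces a 768-dimensional [CLS] vector, and the LexGLUE version of LEDGAR defines 100 provision labels); only the head's forward pass is shown, not the encoder or the training loop:

```python
import math
import random

random.seed(0)

HIDDEN = 8       # stand-in for BERT's 768-dimensional hidden size
NUM_LABELS = 3   # stand-in for LEDGAR's 100 provision labels

# Hypothetical pooled [CLS] embedding, as would come from the BERT encoder
cls_embedding = [random.uniform(-1.0, 1.0) for _ in range(HIDDEN)]

# Randomly initialized classification head (weight matrix + bias).
# Fine-tuning would train these parameters on labeled contract paragraphs.
W = [[random.uniform(-0.1, 0.1) for _ in range(HIDDEN)]
     for _ in range(NUM_LABELS)]
b = [0.0] * NUM_LABELS

def classify(x):
    """Linear layer over the pooled embedding followed by softmax."""
    logits = [sum(w_i * x_i for w_i, x_i in zip(row, x)) + b_k
              for row, b_k in zip(W, b)]
    m = max(logits)                      # subtract max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

probs = classify(cls_embedding)          # one probability per provision label
```

The predicted provision is then simply the label with the highest probability; during fine-tuning, a cross-entropy loss over these probabilities updates both the head and the encoder weights.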
References
| Author | volume | Date Value | title | type | journal | titleUrl | doi | note | year |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Amit Tewari | | | LegalPro-BERT: Classification of Legal Provisions by Fine-tuning BERT Large Language Model | | | | 10.48550/arXiv.2404.10097 | | 2024 |