2021 CLAUSERECAClauseRecommendationF
- (Aggarwal, Garimella et al., 2021) ⇒ Vinay Aggarwal, Aparna Garimella, Balaji Vasan Srinivasan, and Rajiv Jain. (2021). “CLAUSEREC: A Clause Recommendation Framework for AI-aided Contract Authoring.” In: arXiv preprint arXiv:2110.15794. doi:10.48550/arXiv.2110.15794
Subject Headings: Legal Clause Relevance Prediction, CONTRACTBERT.
Notes
Cited By
Quotes
Abstract
Contracts are a common type of legal document that features frequently in day-to-day business workflows. However, there has been very limited NLP research on processing such documents, and even less on generating them. Contracts are made up of clauses, and the unique nature of these clauses calls for specific methods to understand and generate such documents. In this paper, we introduce the task of clause recommendation as a first step to aid and accelerate the authoring of contract documents. We propose a two-staged pipeline that first predicts whether a specific clause type is relevant to be added to a contract, and then recommends the top clauses of the given type based on the contract context. We pre-train BERT on an existing library of clauses with two additional tasks and use it for our prediction and recommendation. We experiment with classification methods and similarity-based heuristics for clause relevance prediction, and with generation-based methods for clause recommendation, and evaluate the results of the various methods on several clause types. We provide analyses of the results, and further outline the advantages and limitations of the various methods for this line of research.
1 Introduction
A contract is a legal document between at least two parties that outlines the terms and conditions of their agreement. Contracts are typically textual, offering significant potential for NLP applications in the space of legal documents. However, unlike most natural language corpora typically used in NLP research, contract language is repetitive, with high inter-sentence similarities and sentence matches (Simonson et al., 2019), calling for new methods specific to legal language to understand and generate contract documents. A contract is essentially made up of legal clauses, which are provisions addressing specific terms of the agreement and which form its legal essence. Drafting a contract involves selecting an appropriate template, with a skeletal set of clauses, and customizing it for the specific purpose, typically by adding, removing, or modifying clauses. Both stages involve manual effort and domain knowledge, and hence can benefit from NLP methods trained on large collections of contract documents. In this paper, we take a first step towards AI-assisted contract authoring: we introduce the task of clause recommendation and propose a two-staged approach to solve it.

There have been some recent works on item-based and content-based recommendation. Wang and Fu (2020) reformulated the next sentence prediction task in BERT (Devlin et al., 2019) as a next-purchase prediction task to build a collaborative filtering based recommendation system for an e-commerce setting. Malkiel et al. (2020) introduced RecoBERT, which leverages textual descriptions of items, such as titles, to build an item-to-item recommendation system for the wine and fashion domains. In the space of text-based content recommendation, Bhagavatula et al. (2018) proposed a method to recommend citations in academic paper drafts without using metadata. Legal documents, however, remain unexplored, and it is not straightforward to extend these methods to recommend clauses in contracts: such documents are heavily domain-specific, and recommending content in them requires a specific understanding of their language.

In this paper, clause recommendation is defined as the process of automatically recommending clauses that may be added to a given contract while authoring it. We propose a two-staged approach: first, we predict whether a given clause type is relevant to be added to the given input contract; examples of clause types include governing laws, confidentiality, etc. Next, if a clause type is predicted as relevant, we provide context-aware recommendations of clauses of that type for the input contract. We develop CONTRACTBERT by further pre-training BERT with two additional tasks and use it as the underlying language model in both stages to adapt it to contracts. To the best of our knowledge, this is the first effort towards developing AI assistants for authoring and generating long, domain-specific legal contracts.
2 Methodology
A contract can be viewed as a collection of clauses, with each clause comprising (a) a clause label that represents the type of the clause and (b) the clause content. Our approach consists of two stages: (1) clause type relevance prediction: predicting whether a given clause type that is not present in the given contract may be relevant to it, and (2) clause recommendation: recommending clauses of the given type that may be relevant to the contract.
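For concreteness, this view of a contract can be captured with a minimal structure such as the following sketch (the names Clause and Contract are ours, purely for illustration; they are not from the paper):

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Clause:
    label: str    # the clause type, e.g., "governing laws"
    content: str  # the clause text itself

@dataclass
class Contract:
    clauses: List[Clause]
```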
- Figure 1: CLAUSEREC pipeline: Binary classification + generation for clause recommendation.
First, we build a model to effectively represent a contract by further pre-training BERT, a pre-trained Transformer-based encoder, on contracts to bias it towards legal language. We refer to the resulting model as CONTRACTBERT. In addition to masked language modelling and next sentence prediction, CONTRACTBERT is trained to predict (i) if the words in a clause label belong to a specific clause, and (ii) if two sentences belong to the same clause, enabling the embeddings of similar clauses to cluster together.
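The paper does not publish its sampling procedure for these two objectives, but the training pairs could plausibly be constructed along the following lines, reusing the Clause/Contract sketch above (the negative-sampling choices here are our assumptions):

```python
import random
from typing import List, Tuple

def auxiliary_pretraining_pairs(
    contracts: List[Contract],
) -> Tuple[List[Tuple[str, str, int]], List[Tuple[str, str, int]]]:
    """Build labeled pairs for the two auxiliary objectives:
    (i)  does this label text belong to this clause?
    (ii) do these two sentences come from the same clause?
    """
    label_pairs, sentence_pairs = [], []
    all_clauses = [c for ct in contracts for c in ct.clauses]
    for clause in all_clauses:
        # (i) positive: the clause's own label; negative: a random other label.
        label_pairs.append((clause.label, clause.content, 1))
        other = random.choice(all_clauses)
        if other.label != clause.label:
            label_pairs.append((other.label, clause.content, 0))
        # (ii) positive: two sentences drawn from the same clause.
        sents = clause.content.split(". ")
        if len(sents) >= 2:
            s1, s2 = random.sample(sents, 2)
            sentence_pairs.append((s1, s2, 1))
            # Negative: pair with a sentence from a different clause.
            neg = random.choice(all_clauses)
            if neg is not clause:
                sentence_pairs.append((s1, neg.content.split(". ")[0], 0))
    return label_pairs, sentence_pairs
```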
- Figure 2: Clustering of clauses using BERT embeddings.
- Figure 3: Clustering of clauses using CONTRACTBERT embeddings.
While vanilla BERT embeddings do not separate clause types well (Figure 2), CONTRACTBERT clusters similar clause types closely while ensuring separation between clauses of two different types (Figure 3).
2.1 Clause Type Relevance Prediction
Given a contract and a specific target clause type, the first stage predicts whether the given type may be relevant to be added to the contract. We train a binary classifier for relevance prediction for each target clause type. Given an input contract, we obtain its CONTRACTBERT representation as shown in Figure 1. Since the number of tokens in a contract is usually very large (> 512, BERT's maximum input length), we obtain the contextual representation of each clause present and average their [CLS] embeddings to obtain the contract representation ct_rep. This representation is fed to a binary classifier, a small fully connected neural network trained with binary cross-entropy loss. We treat a probability score above 0.5 as a positive prediction, i.e., the target clause type is relevant to the input contract.
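A minimal sketch of this stage using the Hugging Face transformers API follows; since CONTRACTBERT's weights are not publicly released, vanilla bert-base-uncased stands in for it, and the classifier's hidden size is our choice (the paper only calls it a small fully connected network):

```python
import torch
import torch.nn as nn
from transformers import AutoTokenizer, AutoModel

# Vanilla BERT as a stand-in for CONTRACTBERT, whose weights are not public.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")

def contract_representation(clause_texts):
    """ct_rep: the average of the [CLS] embeddings of a contract's clauses."""
    cls_embeddings = []
    with torch.no_grad():
        for text in clause_texts:
            inputs = tokenizer(text, truncation=True, max_length=512,
                               return_tensors="pt")
            outputs = encoder(**inputs)
            cls_embeddings.append(outputs.last_hidden_state[:, 0])  # (1, 768)
    return torch.cat(cls_embeddings).mean(dim=0)  # (768,)

class RelevanceClassifier(nn.Module):
    """One binary classifier per target clause type, trained with BCE loss."""
    def __init__(self, dim=768, hidden=128):  # hidden size is illustrative
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, 1))

    def forward(self, ct_rep):
        # Probability that the target type is relevant; > 0.5 counts as positive.
        return torch.sigmoid(self.net(ct_rep))
```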
2.2 Clause Content Recommendation
Once a target clause type is predicted as relevant, the next stage recommends clause content of the given type for the contract. We model this as a sequence-to-sequence generation task, where the input comprises the given contract and the clause label, and the output is relevant clause content that may be added to the contract. We start with a Transformer-based encoder-decoder architecture (Vaswani et al., 2017), following Liu and Lapata (2019), and initialize our encoder with CONTRACTBERT. We then train the Transformer decoder to generate clause content. As mentioned above, the encoder inputs comprise a contract and a target clause type. We compute the CONTRACTBERT representations of all clauses of the given type in the dataset and average their [CLS] embeddings to obtain a target clause type representation trgt_cls_rep. This trgt_cls_rep and the contract representation ct_rep are averaged to obtain the encoding of the given contract and target clause type, which serves as input to the decoder. Note that since CONTRACTBERT is already pre-trained on the contracts, we do not need to train the encoder again for clause generation. Given the average of the contract and target clause type representations as input, the decoder is trained to generate an appropriate clause of the target type that may be relevant to the contract. Note that our generation method provides a single clause as the recommendation; retrieval-based methods, on the other hand, can provide multiple suggestions for a given clause type using similarity measures.
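The paper gives only this high-level description, so the following PyTorch sketch of the conditioning is an assumption of ours: the averaged representation is used as a single-vector "memory" for the decoder's cross-attention, and the layer counts and sizes are illustrative:

```python
import torch
import torch.nn as nn

class ClauseDecoder(nn.Module):
    """Sketch of the generation stage; not the authors' exact architecture."""
    def __init__(self, vocab_size, dim=768, heads=8, layers=6):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        layer = nn.TransformerDecoderLayer(d_model=dim, nhead=heads,
                                           batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, num_layers=layers)
        self.lm_head = nn.Linear(dim, vocab_size)

    def forward(self, prev_tokens, ct_rep, trgt_cls_rep):
        # Average the contract and target-type representations (Section 2.2)
        # and expose the result as a one-token memory for cross-attention.
        memory = ((ct_rep + trgt_cls_rep) / 2).view(1, 1, -1)  # (1, 1, dim)
        tgt = self.embed(prev_tokens)                           # (1, T, dim)
        t = tgt.size(1)
        # Standard causal mask so each position attends only to its past.
        causal = torch.triu(torch.full((t, t), float("-inf")), diagonal=1)
        hidden = self.decoder(tgt, memory, tgt_mask=causal)
        return self.lm_head(hidden)                             # token logits
```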
3 Experiments and Evaluation
We evaluate three methods for clause type relevance prediction + clause recommendation: (1) binary classification + clause generation, which is our proposed approach; (2) collaborative filtering + similarity-based retrieval; and (3) document similarity + similarity-based retrieval.
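As a sketch of the similarity-based retrieval step shared by methods (2) and (3), one plausible implementation ranks the library of clauses of the target type by cosine similarity to the contract representation (our assumption, not the authors' exact code):

```python
import torch
import torch.nn.functional as F
from typing import List, Tuple

def retrieve_top_k(ct_rep: torch.Tensor,
                   candidate_reps: torch.Tensor,
                   candidate_texts: List[str],
                   k: int = 5) -> List[Tuple[str, float]]:
    """Return the k clauses whose representations are most similar to ct_rep."""
    sims = F.cosine_similarity(ct_rep.unsqueeze(0), candidate_reps)  # (N,)
    top = torch.topk(sims, k=min(k, len(candidate_texts)))
    return [(candidate_texts[i], sims[i].item()) for i in top.indices]
```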
...
...
Metrics
We evaluate the performance of clause type relevance prediction using precision, recall, accuracy, and F1-score, and that of clause content recommendation using ROUGE scores (Lin, 2004).
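For instance, with the rouge-score package (pip install rouge-score), a recommended clause can be scored against the held-out ground-truth clause; the example texts below are invented for illustration:

```python
from rouge_score import rouge_scorer

scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"],
                                  use_stemmer=True)
reference = ("This Agreement shall be governed by the laws of the "
             "State of Delaware.")
generated = "This Agreement is governed by Delaware law."
scores = scorer.score(reference, generated)
# Each entry is a Score(precision, recall, fmeasure) namedtuple.
print({metric: round(s.fmeasure, 3) for metric, s in scores.items()})
```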
Data
We use the LEDGAR dataset introduced by Tuggener et al. (2020). It contains contracts from the U.S. Securities and Exchange Commission (SEC) filings website, and includes material contracts (Exhibit-10), such as shareholder agreements, employment agreements, etc. The dataset contains 12,608 clause types and 846,274 clauses from around 60,000 contracts. Further details on the dataset are provided in the appendix.
Since this dataset cannot be used for our work readily, we preprocess it to create proxy datasets for the clause type relevance prediction and clause recommendation tasks. For the former, for a target clause type t, we consider the labels relevant and not relevant for binary classification. For the relevant class, we take contracts that contain a clause of type t and remove that clause; given such a contract as input, in which t is no longer present, the classifier is trained to predict t as relevant to be added to the contract. For the not relevant class, we randomly sample an equal number of contracts that do not contain t. For recommendation, we use the contracts that contain t (i.e., the relevant-class contracts); the input consists of the contract with the specific clause removed, together with t, and the output is the removed clause. For both tasks, we partition these proxy datasets into train (60%), validation (20%), and test (20%) sets. The held-out ground truth ({relevant, not relevant} for the first task and the removed clause content for the second) is used for evaluation. The implementation details are provided in the appendix.
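Our reading of this construction can be sketched as follows, reusing the Contract/Clause structures from Section 2 (sampling details, such as which matching clause is removed, are assumptions):

```python
import random
from typing import List

def build_proxy_examples(contracts: List[Contract], target_type: str):
    """Proxy-dataset construction as described above.
    relevant: contracts containing the target type, with that clause removed
              (the removed clause becomes the recommendation ground truth).
    not relevant: an equal-sized random sample of contracts without the type.
    """
    relevant, recommendation, without_type = [], [], []
    for ct in contracts:
        matches = [c for c in ct.clauses if c.label == target_type]
        if matches:
            removed = matches[0]  # assume the first match is removed
            remaining = [c for c in ct.clauses if c is not removed]
            relevant.append((remaining, 1))                      # classifier input
            recommendation.append((remaining, removed.content))  # seq2seq pair
        else:
            without_type.append(ct.clauses)
    n = min(len(relevant), len(without_type))
    not_relevant = [(cls, 0) for cls in random.sample(without_type, n)]
    return relevant + not_relevant, recommendation
```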
- Table 2: Clause content recommendation results (columns read as ROUGE-1 / ROUGE-2 / ROUGE-L).

| Clause type | Method | ROUGE-1 | ROUGE-2 | ROUGE-L |
|---|---|---|---|---|
| Governing Laws | Sim-based (w/o cls_rep) | 0.441 | 0.213 | 0.327 |
| Governing Laws | Sim-based (with cls_rep) | 0.499 | 0.280 | 0.399 |
| Governing Laws | Generation-based | 0.567 | 0.395 | 0.506 |
| Severability | Sim-based (w/o cls_rep) | 0.419 | 0.142 | 0.269 |
| Severability | Sim-based (with cls_rep) | 0.444 | 0.155 | 0.288 |
| Severability | Generation-based | 0.521 | 0.264 | 0.432 |
| Notices | Sim-based (w/o cls_rep) | 0.341 | 0.085 | 0.207 |
| Notices | Sim-based (with cls_rep) | 0.430 | 0.144 | 0.309 |
| Notices | Generation-based | 0.514 | 0.271 | 0.422 |
| Counterparts | Sim-based (w/o cls_rep) | 0.466 | 0.214 | 0.406 |
| Counterparts | Sim-based (with cls_rep) | 0.530 | 0.279 | 0.474 |
| Counterparts | Generation-based | 0.666 | 0.495 | 0.667 |
| Entire Agreements | Sim-based (w/o cls_rep) | 0.433 | 0.183 | 0.306 |
| Entire Agreements | Sim-based (with cls_rep) | 0.474 | 0.201 | 0.331 |
| Entire Agreements | Generation-based | 0.535 | 0.312 | 0.485 |
4 Results and Discussion
Table 1 summarizes the results of the three methods (CF-based, document similarity-based, and binary classification) for the clause type relevance prediction task. For each task, we report results for the thresholds, k, and learning rate that gave the best results on the validation set (ablation results are reported in the appendix).
...
...
5 Conclusions
We addressed AI-assisted authoring of contracts via clause recommendation. We proposed the CLAUSEREC pipeline to predict clause types relevant to a contract and to generate appropriate content for them based on the contract content. The results of comparing our approach with similarity-based heuristics and collaborative filtering techniques are promising, indicating the viability of AI solutions for automating tasks in the legal domain. Efforts in generating long contracts are still in their infancy, and we hope our work can pave the way for more research in this area.
- Figure 4: Some clause types in the LEDGAR dataset.
References
- (Bhagavatula et al., 2018) ⇒ Chandra Bhagavatula, Sergey Feldman, Russell Power, and Waleed Ammar. (2018). “Content-based Citation Recommendation.” In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pages 238–251, New Orleans, Louisiana.
- (Devlin et al., 2019) ⇒ Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. (2019). “BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding.” In: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 4171–4186, Minneapolis, Minnesota.
- (Lin, 2004) ⇒ Chin-Yew Lin. (2004). “ROUGE: A Package for Automatic Evaluation of Summaries.” In Text Summarization Branches Out, pages 74–81, Barcelona, Spain.
- (Linden et al., 2003) ⇒ Greg Linden, Brent Smith, and Jeremy York. (2003). “Amazon.com Recommendations: Item-to-item Collaborative Filtering.” IEEE Internet Computing, 7(1):76–80.
- (Liu & Lapata, 2019) ⇒ Yang Liu and Mirella Lapata. (2019). “Text Summarization with Pretrained Encoders.” In: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 3730–3740, Hong Kong, China.
- (Malkiel et al., 2020) ⇒ Itzik Malkiel, Oren Barkan, Avi Caciularu, Noam Razin, Ori Katz, and Noam Koenigstein. (2020). “RecoBERT: A Catalog Language Model for Text-based Recommendations.” In Findings of the Association for Computational Linguistics: EMNLP 2020, pages 1704–1714, Online.
- (Simonson et al., 2019) ⇒ Dan Simonson, Daniel Broderick, and Jonathan Herr. (2019). “The Extent of Repetition in Contract Language.” In: Proceedings of the Natural Legal Language Processing Workshop 2019, pages 21–30, Minneapolis, Minnesota.
- (Tuggener et al., 2020) ⇒ Don Tuggener, Pius von Däniken, Thomas Peetz, and Mark Cieliebak. (2020). “LEDGAR: A Large-scale Multi-label Corpus for Text Classification of Legal Provisions in Contracts.” In: Proceedings of the Twelfth Language Resources and Evaluation Conference, Marseille, France. https://aclanthology.org/2020.lrec-1.155
- (Vaswani et al., 2017) ⇒ Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, and Illia Polosukhin. (2017). “Attention is All You Need.” In: Advances in Neural Information Processing Systems 30 (NIPS 2017).
- (Wang & Fu, 2020) ⇒ Tian Wang and Yuyangzi Fu. (2020). “Item-based Collaborative Filtering with BERT.” In: Proceedings of The 3rd Workshop on e-Commerce and NLP, pages 54–58, Seattle, WA, USA.