From jargon to clarity: bridging understanding through graded simplification of legal data - Artificial Intelligence and Law

Abstract
Legal documents are notorious for their length, density, and jargon-heavy language, making them challenging to navigate and comprehend. This highlights a strong need for clear and accessible documentation for a diverse audience. Text simplification at multiple levels, tailored to individuals with diverse backgrounds and expertise, is essential to making legal content universally accessible. To this end, we focus on paragraph-level simplification of legal contracts and introduce Graded Simplification for Legal Data, a framework that adapts contract clauses across three competency levels: Skilled, Intermediate, and Basic. We employ large language models (LLMs) to perform graded simplification, supported by a token-efficient compression mechanism that incrementally encodes document context across paragraphs within a fixed token budget, making it well suited to lengthy contracts. To address the challenge of reliably evaluating legal simplification at scale, we design a multi-criteria evaluation framework that jointly assesses readability, lexical simplicity, semantic preservation, and entailment. This framework enables the creation of our key resource, the SimpLegal dataset, an English-language preference dataset of paragraph-level contract simplifications. Using this dataset for Direct Preference Optimization (DPO), we achieve notable gains (+5 points) in readability and simplicity over zero-shot prompting-based baselines. Collectively, these contributions underscore the importance of graded, paragraph-level simplification for contracts and demonstrate that small and medium-scale LLMs, when fine-tuned on preference data, can achieve performance comparable to larger models, providing a scalable pathway to accessible and comprehensible legal documentation. Our code and dataset are available at https://github.com/GSLD-SimpLegal/FromJargonToClarity.git.




Data Availability
The link to code and data has been shared in the manuscript.
Notes
In our experiments, we observed that the models typically converge to their best response by the third or fourth iteration. We set \(i=2\) and \(j=10\), balancing computational constraints and convergence.
Empirically, scores between 0.3 and 0.5 indicated semantic drift; thus, 0.6 was adopted as a stricter cutoff.
https://scikit-learn.org/stable/modules/generated/sklearn.metrics.cohen_kappa_score.html (Cohen’s Kappa implementation in scikit-learn)
96 candidates × 6K paragraphs = 576K total candidate paragraphs.
The top 15 responses were selected per paragraph (6000 × 15); not all pairs were retained, yielding a final set of ~65,000.
https://pypi.org/project/textstat/ (textstat library for readability metrics)
https://pingouin-stats.org/generated/pingouin.intraclass_corr.html (Intraclass Correlation Coefficient implementation in Pingouin)
We reused publicly available implementations where possible and otherwise implemented faithful reproductions based on the methodologies described in the original papers.
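As a concrete illustration of the inter-annotator agreement statistic referenced in the notes, the following is a minimal pure-Python sketch of Cohen's kappa. The experiments used the scikit-learn implementation linked above; this sketch only shows the underlying formula (observed agreement corrected for chance agreement).

```python
from collections import Counter

def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa: chance-corrected agreement between two raters.

    ratings_a, ratings_b: equal-length sequences of category labels,
    one pair of labels per rated item.
    """
    assert len(ratings_a) == len(ratings_b)
    n = len(ratings_a)
    # Observed agreement: fraction of items both raters labeled identically.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Expected agreement if the two raters labeled items independently.
    freq_a, freq_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)
```

For example, identical ratings yield a kappa of 1.0, while agreement at exactly the chance rate yields 0.0.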
Author information
Authors and Affiliations
Contributions
H.S. played a major role in data collection, conducting experiments, and preparing the manuscript. A.M. helped with the evaluation framework and manuscript editing. M.S. provided guidance and reviewed the manuscript.
Corresponding author
Correspondence to
Hiranmai Sri Adibhatla.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
A Appendix
A.1 Examples of simplification across grades
This section presents Table 8, a collection of whole clauses illustrating how simplification is performed across the three defined grades. Consider the second example in detail. In the original sentence, “Executive agrees that Executive shall not directly or indirectly solicit an employee of the Company to terminate their employment relationship with the Company,” the core semantic intent is to prohibit attempts to induce an employee to leave the company. The Basic version simplifies this to “You promise not to try to get an employee to quit their jobs. You won’t do it on your own or help someone else do it.” The intent is preserved through explicit phrasing such as “try to get an employee to quit” and “help someone else do it,” which together capture both the prohibited action and the indirect-solicitation aspect of the original clause.
A.2 Implementation details
This section outlines the implementation details of our approach. The models employed, along with their configurations, are summarized in Table 9, while the evaluation metrics and their corresponding signatures are presented in Table 10, ensuring transparency in model selection and reproducibility of results.
A.3 Simplification settings
Table 11 details the different context lengths (no context, i.e., paragraph-only, and paragraph with token-restricted context), variant types, simplification grades, and models; each unique combination results in a generated simplification.
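The settings grid in Table 11 is a Cartesian product over these four dimensions. As a sketch only, with placeholder variant and model names (the actual values are those listed in Table 11):

```python
from itertools import product

# Placeholder setting values for illustration; Table 11 lists the actual ones.
contexts = ["paragraph-only", "token-restricted-context"]
variants = ["variant-1", "variant-2"]
grades = ["Skilled", "Intermediate", "Basic"]
models = ["model-A", "model-B"]

# Each unique (context, variant, grade, model) tuple corresponds to
# one generated simplification.
settings = list(product(contexts, variants, grades, models))
print(len(settings))  # 2 * 2 * 3 * 2 = 24 combinations
```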
A.4 Prompts
Prompts designed to guide the models effectively are detailed here. Prompts define the task clearly, ensuring that the model focuses on the intended aspects of the text. This section includes the prompts corresponding to the three levels of simplification (Figs. 5 and 6), allowing for graded outputs that vary in detail and complexity. It also provides a dedicated prompt (Fig. 7) for multi-criteria evaluation, enabling assessment of outputs along dimensions such as readability, simplicity, and fidelity to the original text.
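The exact prompt wording appears in Figs. 5–7; purely as an illustration of the structure such graded prompts take, a minimal sketch follows. The instruction texts below are hypothetical paraphrases, not the paper's actual prompts.

```python
# Hypothetical grade-specific instructions; the paper's prompts are in Figs. 5-7.
GRADE_INSTRUCTIONS = {
    "Skilled": "Lightly simplify the clause, keeping legal terminology intact.",
    "Intermediate": "Simplify the clause, explaining legal terms in plain language.",
    "Basic": "Rewrite the clause in everyday language a layperson can follow.",
}

def build_prompt(grade: str, paragraph: str, context: str = "") -> str:
    """Assemble a graded simplification prompt from its parts.

    When compressed document context is available, it is included between
    the instruction and the clause to be simplified.
    """
    parts = [GRADE_INSTRUCTIONS[grade]]
    if context:
        parts.append(f"Document context so far:\n{context}")
    parts.append(f"Clause to simplify:\n{paragraph}")
    return "\n\n".join(parts)
```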
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Sri Adibhatla, H., Mukherjee, A. & Shrivastava, M. From jargon to clarity: bridging understanding through graded simplification of legal data.
Artif Intell Law (2026). https://doi.org/10.1007/s10506-026-09503-y
Received:
Accepted:
Published:
Version of record:




