Unsupervised-Based Information Extraction from Unstructured Arabic Legal Documents

  • Khudhair J. Kadhim
  • Ahmed T. Sadiq
  • Hasanen S. Abdulah


To make traditional unstructured or semi-structured legal texts meet the requirements of high-level applications, such as AI applications in the legal domain, one challenge must be overcome: how to extract and analyze structured information from legal documents automatically. This paper proposes an architecture based on a combined approach that uses feature-based, lexical, and rule-based methods to extract the needed information from traditional legal documents. The research uses a dataset collected from decision documents of the Iraq Federal Court of Cassation to extract two sets of information. The first is a set of general information, called valuable attribute information, which includes the reference law category, date of decision, name of the court of jurisdiction, document number, and decision type. The second is the document essence, the focused legal information that includes the principle, arguments, legal opinions, and facts of the case, which can be used in any later analysis phase. This research is part of a larger project entitled “The Arabic documents opinion extraction using argumentation mining”, and the preliminary results were quite promising.
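
As a rough illustration of how lexical cues and rules can be combined to pull general attributes such as the document number and decision date out of raw text, the following minimal sketch may help. It is not the architecture proposed in the paper: the cue words, the value patterns, and the function name extract_attributes are all assumptions made for this example, and real court decisions may be laid out differently.

```python
import re

# Translation table from Arabic-Indic digits (U+0660..U+0669) to ASCII digits,
# so that a single set of numeric patterns can be applied to the text.
ARABIC_INDIC_DIGITS = str.maketrans("٠١٢٣٤٥٦٧٨٩", "0123456789")

# Hypothetical lexicon of cue words that may precede each attribute.
# These cues are assumptions for illustration, not taken from the paper.
CUES = {
    "document_no": ["العدد", "Decision No"],
    "decision_date": ["التاريخ", "Date"],
}

# Hypothetical rules describing the shape of each attribute value.
VALUE_PATTERNS = {
    "document_no": r"(\d+\s*/\s*\d{4})",
    "decision_date": r"(\d{1,2}/\d{1,2}/\d{4})",
}


def extract_attributes(text):
    """Return the attributes found after a known cue word in the text."""
    text = text.translate(ARABIC_INDIC_DIGITS)
    found = {}
    for field, cues in CUES.items():
        # Lexical step: any of the cue words may introduce the value.
        cue_alternatives = "|".join(re.escape(cue) for cue in cues)
        # Rule step: the cue, an optional separator, then the value pattern.
        pattern = rf"(?:{cue_alternatives})\s*[:.]?\s*{VALUE_PATTERNS[field]}"
        match = re.search(pattern, text)
        if match:
            # Drop internal whitespace so values have a uniform form.
            found[field] = re.sub(r"\s+", "", match.group(1))
    return found


if __name__ == "__main__":
    sample = "Decision No: 152 / 2019\nDate: 14/3/2019"
    print(extract_attributes(sample))
```

Keeping the cue lexicon separate from the value rules means either can be extended independently, for example by adding cues for the court name or decision type, without changing the matching loop.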

Khudhair J. Kadhim
Computer Sciences Department, University of Technology, Baghdad, Iraq
Ahmed T. Sadiq
Computer Sciences Department, University of Technology, Baghdad, Iraq
Hasanen S. Abdulah
Computer Sciences Department, University of Technology, Baghdad, Iraq


