
IRLeD 2017 : Information Retrieval from Legal Documents


When: Dec 8, 2017 - Dec 10, 2017
Where: Bangalore, India
Abstract Registration Due: Jul 24, 2017
Submission Deadline: Sep 20, 2017
Notification Due: Oct 10, 2017
Final Version Due: Oct 30, 2017
Categories: citation analysis, information retrieval, legal documents analysis, text mining

Call For Papers

Information Retrieval from Legal Documents (IRLeD) 2017

In conjunction with FIRE 2017
Forum for Information Retrieval Evaluation
Indian Institute of Science, Bangalore
8th - 10th December

Track description:
In a common law system, great importance is given to prior cases. A prior case (also called a precedent) is an older court case related to the current case, which discusses similar issue(s) and which can be used as a reference in the current case. A prior case is treated as being as important as any written law (called a statute). This ensures that similar situations are treated similarly in every case. If an ongoing case involves a legal issue that has already been decided, then the court is expected to follow the interpretation made in the prior case. For this purpose, it is critical for legal practitioners to find and study previous court cases, so as to examine how the ongoing issues were interpreted in the older cases.

With the recent developments in information technology, the number of digitally available legal documents has rapidly increased. It is, hence, imperative for legal practitioners to have an automatic precedent retrieval system. The task of precedence retrieval can be modeled as a task of information retrieval, where the current document (or a description of the current situation) will be used as the query, and the system should return relevant prior cases as results.

Generally, legal texts (e.g., court case descriptions) are long and have complex structures. This makes their thorough reading time-consuming and strenuous. So, apart from a precedence retrieval system, it is also essential for legal practitioners to have a concise representation of the core legal issues described in a legal text. One way to list the core legal issues is by keywords or key phrases, which are known as “catchphrases” in the legal domain.

Motivated by the requirements described above, we have the following two tasks:
* Catchphrase extraction
* Precedence retrieval

Task 1:
Catchphrases are short phrases from within the text of the document. Catchphrases can be extracted by selecting certain portions from the text of the document. A set of legal documents (Indian Supreme Court decisions) will be provided. For a few of these documents (training set), the catchphrases (gold standard) will also be provided. These catchphrases have been obtained from a well-known legal search system, Manupatra, which employs legal experts to annotate case documents with catchphrases. The rest of the documents will be used as the test set. The participants will be expected to extract the catchphrases for the documents in the test set.

Task 2:
For the precedent retrieval task, two sets of documents shall be provided:
* Current cases: A set of cases for which the prior cases have to be retrieved.
* Prior cases: For each “current case”, we have obtained a set of prior cases that were actually cited in the case decision. These cited prior cases are present in the second set of documents along with other (not cited) documents.

For each document in the first set, the participants are to form a list of documents from the second set in a way that the cited prior cases are ranked higher than the other (not cited) documents.
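As an illustration of the expected output format only, the ranking step can be sketched with a simple term-weighting baseline. This is a minimal sketch, not the organizers' method: the function name `rank_prior_cases`, the tokenizer, the smoothed TF-IDF weighting, and the toy documents are all assumptions made for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Assumption: lowercase alphabetic tokens are a good enough unit here.
    return re.findall(r"[a-z]+", text.lower())


def rank_prior_cases(query_text, prior_cases):
    """Rank prior-case ids by TF-IDF cosine similarity to the query.

    prior_cases maps a (hypothetical) document id to its text.
    Ties are broken by document id so the output is deterministic.
    """
    doc_counts = {doc_id: Counter(tokenize(text))
                  for doc_id, text in prior_cases.items()}
    n_docs = len(doc_counts)

    # Document frequency of every term in the prior-case collection.
    df = Counter()
    for counts in doc_counts.values():
        df.update(counts.keys())

    def weight(counts):
        # Smoothed idf, so terms unseen in the collection still get a weight.
        return {term: tf * (math.log((1 + n_docs) / (1 + df[term])) + 1.0)
                for term, tf in counts.items()}

    def cosine(a, b):
        dot = sum(w * b.get(term, 0.0) for term, w in a.items())
        norm_a = math.sqrt(sum(w * w for w in a.values()))
        norm_b = math.sqrt(sum(w * w for w in b.values()))
        if norm_a == 0.0 or norm_b == 0.0:
            return 0.0
        return dot / (norm_a * norm_b)

    query_vec = weight(Counter(tokenize(query_text)))
    scored = [(cosine(query_vec, weight(counts)), doc_id)
              for doc_id, counts in doc_counts.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored]


# Toy data, invented for illustration.
prior_cases = {
    "p1": "land acquisition compensation dispute",
    "p2": "criminal appeal murder conviction",
    "p3": "income tax assessment",
}
ranking = rank_prior_cases("appeal against murder conviction", prior_cases)
```

In this toy example only `p2` shares terms with the query, so it is ranked first. A real submission would likely need something stronger than surface term overlap, since court decisions are long and citing and cited cases may use different wording.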

A team may participate in either or both of the tasks. Each team can have at most 4 participants.

Evaluation plan:
For Task 1, a set of catchphrases is expected as the result for each document in the test data. We plan to use set-based IR measures such as Precision, Recall, F-Score, etc., to check how well the set of extracted catchphrases matches the set of gold standard catchphrases.
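The set-based measures for a single document can be sketched as follows. This is a minimal sketch under stated assumptions: the function name `set_metrics` is hypothetical, phrases are compared by exact match after lowercasing and trimming whitespace, and the F-score shown is the balanced F1. The official evaluation may normalize or match phrases differently.

```python
def set_metrics(extracted, gold):
    """Return (precision, recall, f1) for one document.

    Assumption: two catchphrases match only if they are identical
    after lowercasing and stripping surrounding whitespace.
    """
    extracted_set = {phrase.strip().lower() for phrase in extracted}
    gold_set = {phrase.strip().lower() for phrase in gold}

    true_positives = len(extracted_set & gold_set)
    precision = true_positives / len(extracted_set) if extracted_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0

    if precision + recall == 0.0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy phrases, invented for illustration.
extracted = ["writ petition", "Natural Justice", "bail"]
gold = ["natural justice", "writ petition", "habeas corpus", "detention"]
precision, recall, f1 = set_metrics(extracted, gold)
```

Here two of the three extracted phrases appear in the gold set of four, giving a precision of 2/3, a recall of 1/2, and an F1 of 4/7.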

For Task 2, a ranked list of documents is expected as the result for each document in the Current Cases set. Measures like Precision, Recall, Mean Average Precision (MAP), Discounted Cumulative Gain (DCG) and Mean Reciprocal Rank (MRR) will be used to check how well the documents that were actually cited are ranked in the retrieved list of documents.
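Two of the ranking measures named above, MAP and Mean Reciprocal Rank, can be sketched as follows. This is a minimal sketch, not the official scorer: the function names, the `runs` and `qrels` dictionaries, and the document ids are hypothetical, relevance is assumed to be binary (cited or not cited), and each ranked list is assumed to contain no duplicate ids.

```python
def average_precision(ranked, relevant):
    """Average precision of one ranked list against a set of cited ids."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this cut-off
    return precision_sum / len(relevant)


def reciprocal_rank(ranked, relevant):
    """1 / rank of the first cited document, or 0.0 if none is retrieved."""
    relevant = set(relevant)
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def mean_over_queries(metric, runs, qrels):
    """Average a per-query metric over all current cases in qrels."""
    scores = [metric(runs.get(query_id, []), cited)
              for query_id, cited in qrels.items()]
    return sum(scores) / len(scores) if scores else 0.0


# Toy data, invented for illustration.
runs = {"c1": ["p3", "p1", "p7", "p2"], "c2": ["p5", "p9"]}
qrels = {"c1": {"p1", "p2"}, "c2": {"p5"}}

map_score = mean_over_queries(average_precision, runs, qrels)
mrr_score = mean_over_queries(reciprocal_rank, runs, qrels)
```

For `c1` the cited cases sit at ranks 2 and 4, so its average precision is (1/2 + 2/4) / 2 = 0.5 and its reciprocal rank is 0.5. For `c2` both values are 1.0, so MAP and MRR are both 0.75 on this toy data.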

Important dates:
July 24 – data released
September 20 – run submission deadline
October 10 – results declared
October 30 – working notes due

Organizers:
Arpan Mandal, IIEST Shibpur, India
Kripabandhu Ghosh, IIT Kanpur, India
Arnab Bhattacharya, IIT Kanpur, India
Arindam Pal, TCS Research, Kolkata, India
Saptarshi Ghosh, IIT Kharagpur and IIEST Shibpur, India
