Wednesday, January 28, 2009

Sunday, January 18, 2009

Indian Kanoon - The road so far and the road ahead

I was quite pleased to find law information publicly available on the judis and
the indiacode. However, it was too difficult to look for anything on these
websites, so I started building tool sets to play with law data. At a
certain point I felt that integrating these small software pieces would be
very interesting. I was still skeptical as to whether search on law documents
meant anything to common people who do not know the legal jargon. In any case, I
integrated the tool sets into a search engine and was pleasantly surprised when
many of my common queries were well answered. So I deployed it as a publicly
available service, called it Indian Kanoon, and fortunately many people have
found it useful over time.

When actual people start using a service (whether free or fee-based), the
demand for correctness and usability increases significantly. The need to
understand the problems, think about the issues and fix them has kept me in
a tight grip. Indian Kanoon was announced last January in a very crude form, and a
number of changes have gone in over the past year. So this post is mostly to
highlight the work that has gone into Indian Kanoon in the last year, what the
challenges were and what features are planned for the future.

Integrating more legal documents

Indian Kanoon started with only Supreme Court judgments and central laws.
Clearly this was not sufficient for the many people who wanted to search high
court judgments, law commission reports and law journals. Over the last year, a
number of other legal documents have been added. First, the law commission
reports and a law journal were added. The law journal, "Central India Law
Quarterly", was digitized and put up on the Internet by Devaranjan. The only
problem in integrating them was that many of these documents were images
scanned from books. So I used tesseract, a free OCR software supported by
Google, to extract text from these images. However, the text extraction quality
was just 90%, and I doubt that Google uses tesseract for its own Google Books
project.

Tarunabh pointed out the availability of the constituent assembly debates,
which could also be integrated. He pointed out two main problems in integrating
them. First, the article numbers in the debates differ from those in the
Constitution. Second, court judgments cite the debates using page numbers in
the official books. Neither of these numbers was available in the digital copy
provided by the government, so the only way out was to go back to the actual
books. We did not want to give up on the digital route yet, so we went to
books.google.com, which had a scanned copy of the debates. Tarunabh emailed
Google asking them to release those books into the public domain, as the
copyright on them had expired the previous year. Google replied that they were
not sure about the copyright expiration and would be conservative in making
books publicly available. Finally, I borrowed the books from a library,
manually copied the page numbers and the mapping between the article numbers in
the debates and the article numbers in the Constitution, and integrated the
constituent assembly debates.

Indian Kanoon was highly deficient in terms of high court judgments, and even in
Supreme Court judgments, as Dilip earlier pointed out on my blog. So I
integrated the high court judgments and made Indian Kanoon more comprehensive.

Features

Besides making Indian Kanoon comprehensive in terms of legal documents, a number
of features that make searching easier have been added. The most common problem
was the misspelling of Indian names, so I first added the most critical
feature: spelling suggestions. The ability to search and order documents by
date was added next. The search and forum pages were redesigned to look
aesthetically appealing. In order to provide notifications for new judgments,
an RSS feed for court judgments was recently added. Finally, people may like to
monitor documents related to certain words or phrases. So, on Tarunabh's
suggestion, I added an RSS feed for any arbitrary query.
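
As a rough illustration of how spelling suggestions for names can work, here is a minimal Python sketch. It is not the Indian Kanoon implementation: the function name suggest_spelling, the cutoff value and the tiny vocabulary are all assumptions made for this example. It simply compares a query word against a list of known words using the standard library's difflib.

```python
import difflib

def suggest_spelling(word, vocabulary, max_suggestions=3, cutoff=0.75):
    """Return known words that closely resemble `word`, best match first.

    `vocabulary` is a list of lowercase words assumed to have been
    collected from the indexed documents (hypothetical for this sketch).
    """
    lowered = word.lower()
    if lowered in vocabulary:
        return []  # the word is already known, nothing to suggest
    return difflib.get_close_matches(
        lowered, vocabulary, n=max_suggestions, cutoff=cutoff
    )

# Illustrative vocabulary only; a real one would come from the corpus.
vocabulary = ["jethmalani", "gilani", "kanoon", "judgment", "constitution"]
print(suggest_spelling("Jethmalni", vocabulary))
```

A higher cutoff gives fewer but safer suggestions; for a large vocabulary, a real system would probably need an index rather than a linear scan like this one.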

Contributing code back

Developing the Indian Kanoon software has been possible because of the
availability of a large amount of free software, which I was able to modify and
customize for law search. Indian Kanoon uses a feature-rich open source
database, PostgreSQL, as the backend. When users submit a query, matching
documents are found, ordered and the top few are shown. For each document, the
search engine also displays a small text excerpt where the query terms appear.
The text excerpt allows people to quickly evaluate whether the document is
relevant to the query. The headline function developed for Indian Kanoon was
contributed back to Postgres and has been added to the Postgres CVS head.
Besides that, a bug in Postgres was fixed as well. I also sent the phrase
search function to the Postgres list. However, Teodor Sigaev, who merged
OpenFTS into PostgreSQL, wants a generic operator that can check for an
arbitrary distance between the lexemes. I have not yet had time to work on this
operator.
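
To make the idea of a text excerpt concrete, here is a small Python sketch of one simple way to pick it: slide a fixed-size window over the document and keep the window that covers the most distinct query terms. This is only an illustration and not the code contributed to Postgres; the function name best_excerpt, the window size and the <b> highlighting are assumptions, and it matches exact words rather than stemmed lexemes.

```python
import re

def best_excerpt(text, query, window=12):
    """Return the `window`-word span of `text` that covers the most
    distinct query terms, with the matching words wrapped in <b> tags."""
    words = text.split()
    terms = {t.lower() for t in query.split()}
    # Normalise each word for matching: lowercase, punctuation stripped.
    normalised = [re.sub(r"\W+", "", w).lower() for w in words]

    best_start, best_score = 0, -1
    for start in range(max(1, len(words) - window + 1)):
        span = normalised[start:start + window]
        score = len(terms.intersection(span))
        if score > best_score:  # strict ">" keeps the earliest best window
            best_start, best_score = start, score

    end = best_start + window
    return " ".join(
        f"<b>{w}</b>" if n in terms else w
        for w, n in zip(words[best_start:end], normalised[best_start:end])
    )

sample = ("The court discussed many matters. The right to freedom of "
          "speech is guaranteed under the constitution of India.")
print(best_excerpt(sample, "freedom speech", window=5))
```

Counting distinct terms rather than total hits favours excerpts that show all the query words together, which is usually what helps a reader judge relevance.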

Besides development on the database, the Indian Kanoon forum software has been
released as djangobb, a Django bulletin board that uses the Django web
application framework. The judis recently moved to a really obfuscated website
where judgments did not have a stable URL. Prashant Iyengar pointed out that we
were not getting the live feed from the judis. So I reverse engineered the
website and released the judis reverse engineering code.

Future work

Even after so much work, a number of things need to be improved on Indian
Kanoon. Here is a list of changes that I think are required to make Indian
Kanoon more comprehensive, richer and better at search. Please feel free to
suggest more.

1. Reverse engineering different court and tribunal websites so that Indian
Kanoon can provide a live feed of all Indian court and tribunal judgments.

2. Currently Indian Kanoon cannot answer questions like "list of judgments in
which a particular law section was held" and "search only in family law
judgments". The problem is that we do not have enough semantic information
about judgments. So I want to enable common users to start tagging documents.
There will be two kinds of tagging: first, categorizing court judgments and
laws into broad categories like family law, constitutional law, right to
equality etc.; and second, tagging whether a judgment explains, bolsters, or
overturns a given law or judgment. The tags generated by the users will be
available to everyone under the Creative Commons Attribution-Share Alike 3.0
license.

3. A number of people type natural language into the search box. For example,
someone will type "recent judgments from delhi high court". Even though we can
answer such questions, we currently match the query words directly against the
documents. For example, the above query could have been reduced to "doctypes:
delhi sortby: mostrecent". So what we need is a small natural language
processor that can automatically convert such natural language queries into a
more precise query that the engine can evaluate.

4. I only support searching for a set of words in the documents. Roy wanted a
more sophisticated query language that supports boolean queries. This will
enable people to issue more complicated queries like (freedom OR speech) AND
(NOT expression).

5. With the addition of more data over time, Indian Kanoon takes more than a
second to evaluate some queries. A number of software changes (or possibly a
hardware upgrade) are required to bring the evaluation time back to sub-second.
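
The natural language conversion described in item 3 could start out as a handful of rewrite rules. The Python sketch below is only that: a guess at a first step, not a planned design. The function rewrite_query and the word lists are hypothetical, and the only court mapping included is the "delhi" doctype taken from the example in item 3.

```python
# Hypothetical phrase-to-doctype table; only the "delhi" entry comes from
# the example in this post, more courts would have to be added by hand.
COURT_DOCTYPES = {
    "delhi high court": "delhi",
}

# Assumed word lists for this sketch.
RECENCY_WORDS = {"recent", "latest", "newest"}
FILLER_WORDS = {"judgment", "judgments", "from"}

def rewrite_query(query):
    """Turn a natural language query into keywords plus search filters."""
    text = " ".join(query.lower().split())
    filters = []

    # Replace a recognised court phrase with a doctype filter.
    for phrase, doctype in COURT_DOCTYPES.items():
        if phrase in text:
            filters.append(f"doctypes: {doctype}")
            text = text.replace(phrase, " ")

    words = text.split()
    # Words like "recent" become a sort order instead of a search term.
    if any(w in RECENCY_WORDS for w in words):
        filters.append("sortby: mostrecent")

    keywords = [
        w for w in words
        if w not in RECENCY_WORDS and w not in FILLER_WORDS
    ]
    return " ".join(keywords + filters)

print(rewrite_query("recent judgments from delhi high court"))
```

Queries that match no rule pass through unchanged, so a rule set like this can grow gradually without breaking ordinary keyword searches.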

Tuesday, January 13, 2009

Challenges with constitutional democracy in South Asia

In A constitutional state, Rasul Bakhsh Rais points out the reasons for weak democracy in Pakistan.
While the commonly cited problem is the military's excessive involvement in civilian government, Rais points out other aspects that have made civilian governments weak. He writes: "The answer lies in the undemocratic mindset of traditional leaders who control the political parties and, through them, the electoral process." He continues: "One point that is debated often but never understood is why there is no democracy within political parties; and why and how families and oligarchs dominate them. In essence, these elements use political parties to maintain their dominance, using the party’s name, social support base and elite network to control access to electoral politics and power."

This problem is not confined to Pakistan; it resonates strongly in India as well.

Tuesday, January 6, 2009

Extradition of Mumbai suspects using SAARC convention

An informative article from the Daily Times, Pakistan (Next steps after evidence from India) discusses India invoking the "SAARC Regional Convention on Suppression of Terrorism (1987)" for the extradition of Pakistani suspects in the Mumbai attack. Article 3(4) says:
"If a Contracting State which makes extradition conditional on the existence of a treaty receives a request for extradition from another Contracting State with which it has no extradition treaty, the requested State may, at its option, consider this Convention as the basis for extradition in respect of the offences set forth in Article I or agreed to in terms of Article II. Extradition shall be subject to the law of the requested State"

It is important to note that the extradition is optional and not binding on the "Contracting State". It appears that this is just an enabling clause and Pakistan is not obliged to extradite in this case. The Daily Times notes that India did not invoke this convention after the 1999 hijacking of an Indian airliner to Kandahar.

Saturday, January 3, 2009

A brief analysis of Supreme Court judgment on SAR Gilani

The Supreme Court judgment in which Afzal Guru, Shaukat Hussain Guru, his wife Navjot Sandhu, and SAR Gilani were tried is here:
State (N.C.T. Of Delhi) vs Navjot Sandhu@ Afsan Guru on 4 August, 2005

The judgment narrates the events and covers the police investigation, the confessions and finally the verdict. It also discusses in detail the different provisions relating to confessions under POTA, and has actually struck it down. Finally, the court went with a normal criminal confession before a magistrate.

SAR Gilani was defended by Ram Jethmalani. The court rejected the testimony of some witnesses who said that they had seen Shaukat and Gilani together while procuring room and board for the terrorists (who were killed). This was mostly because the witnesses made mistakes while identifying Gilani.

However, one piece of evidence that was irrefutable was the constant phone calls between Gilani, Shaukat and Afzal. This evidence was furnished by AIRTEL and ESSAR after warrants were issued under the Indian Telegraph Act. The court accepted these phone calls as evidence. However, the Supreme Court upheld the High Court's view that phone calls between Shaukat and Gilani alone did not confirm that Gilani knew about the conspiracy. Here is the text from the judgment:

"The High Court after holding that the disclosure statement of Gilani
was not admissible under Section 27 of the Evidence Act and that the
confession of co-accused cannot also be put against him, observed thus:

"We are, therefore, left with only one piece of evidence against
accused S.A.R. Gilani being the record of telephone calls between
him and accused Mohd. Afzal and Shaukat. This circumstance, in
our opinion, do not even remotely, far less definitely and unerringly
point towards the guilt of accused S.A.R. Gilani. We, therefore,
conclude that the prosecution has failed to bring on record
evidence which cumulatively forms a chain, so complete that there
is no escape from the conclusion that in all human probabilities
accused S.A.R. Gilani was involved in the conspiracy.""

The police could only get the call records for the earlier conversations. However, they recorded a call between Gilani and his brother after the incident. Here is the text excerpt translated from Kashmiri:

"Caller: (Brother of Gilani) What have you done in Delhi?
Receiver: (Gilani) It is necessary to do (while laughing) ( Eh che zururi).
Caller: Just maintain calm now.
Receiver: O.K. (while laughing)Where is Bashan?
This portion of the conversation appears almost towards the end of talk.
The defence version of translation is as follows:
Caller: (Brother of Gilani) What has happened?
Receiver: (Gilani) What, in Delhi?
Caller: What has happened in Delhi?
Receiver: Ha! Ha! Ha! (laughing)
Caller: Relax now.
Receiver: Ha! Ha! Ha!, O.K. Where are you in Srinagar?"

The police made another mistake here by recording the call so poorly that the
High Court rejected the first two lines as inaudible. The police need to do a
better job than this. On the other part of the conversation, the Supreme Court said:

"However, we would like to advert to one disturbing feature. Gilani rejoiced and laughed heartily when the Delhi event was raised in the conversation. It raises a serious suspicion that he was approving of the happenings in Delhi. Moreover, he came forward with a false version that the remark was made in the context of domestic quarrel. We can only say that his conduct, which is not only evident from this fact, but also the untruthful pleas raised by him about his contacts with Shaukat and Afzal, give rise to serious suspicion at least about his knowledge of the incident and his tacit approval of it. At the same time, suspicion however strong cannot take the place of legal proof. Though his conduct was not above board, the Court cannot condemn him in the absence of sufficient evidence pointing unmistakably to his guilt."

Finally the judgment:

"In view of the foregoing discussion we affirm the verdict of the High
Court and we uphold the acquittal of S.A.R. Gilani of all charges."

On the whole, I felt that there was clearly not enough evidence (or that the
police made too many mistakes) to implicate Gilani as a conspirator in the
unfortunate events. However, a significant amount of doubt still remains about
his character.