Tuesday, December 23, 2008

Information Technology (Amendment) Bill passed in Lok Sabha

The Information Technology (Amendment) Bill was passed in the Lok Sabha yesterday. PRSindia has done a brief review of the bill.

The main aim of the Information Technology Act of 2000 was to bring legal legitimacy to electronic transactions by introducing rules regarding digital certificates. The current bill goes beyond that to cover electronic contracts, a cyber tribunal, pornography, wiretapping and the exoneration of intermediaries of computer systems.

Section 10A is added so that end user license agreements, web site terms and service agreements are considered almost equivalent to paper contracts.

Section 43A is added so that corporates or firms that handle sensitive data and are negligent in their security practices will be fined. The definition of best security practices and of sensitive data is left for the government to decide in consultation with professional bodies.

There are a lot of details on resolving cyber disputes through the Cyber Appellate Tribunal.

Section 67 prohibits almost everything that can be considered lascivious in nature. Looks like this extends to writing as well.

Section 67A explicitly prohibits publishing of sexually explicit acts.

Section 69(1) is modified to allow the government to get all information from intermediaries and to wiretap in cases that affect national security.

CERT-IN gets the official go-ahead to be the nodal agency for protecting critical information infrastructure in Section 70A.

Section 79(I) exonerates intermediaries from legal liability if they are nice and follow government regulations and demands.

Section 79A provides for an examiner of electronic evidence who can give an expert opinion on any electronic evidence produced in court. Apart from the trivial cases that involve some kind of crypto, the remaining evidence is very difficult to verify. You don't need to be an expert to know this.

Saturday, December 20, 2008

Performance debugging on Linux

Whenever the computer starts running slow, I (like others) am curious about what
has gone wrong. Sometimes it is a temporary performance issue and other times it
becomes periodic in nature. That is when I really want to know what is
going wrong. Of course, I would first like to know the broad failure modes to
even start debugging. Is the hardware too old to run current software? Has a
particular Linux distro like Ubuntu made bad performance choices, and should
I just change the distro? Is there a fundamental problem in the way Linux is
designed, so that I need to move to a better designed OS like BSD, MacOS or
Vista? Is there a known severe performance bug in some middleware (like the X
server) that makes blocking system calls?

However, these questions are often hard to answer because of the complex ways
in which software systems interact with each other. For example, Firefox may
start eating up memory and swap out my editor, so that I cannot even type. Some
may say that there is no application isolation in Linux. Others may say that
Firefox, or the JavaScript interpreter in Firefox, is a memory hog. An
equally likely guess is that the web page I tried to load contains
badly written JavaScript code or bad data. But these are still guesses, and I
would like to know why I am suffering.

An exact answer requires first finding out which application is
misbehaving. After this, more curious people can profile the application
on some dataset to find the bottlenecks and probably find a fix. In
this post I will just talk about the broad tools
available in Linux that help you find the misbehaving application
or set of applications. After this one can use tools like gprof or the
specialized profilers that come with a particular application. Most of
this is pretty trivial and I am not disclosing anything new.

First and foremost, just run the command "top" and check memory usage, CPU
usage and the overall load. You will see something like:

[Screenshot: output of top]

The load average denotes the CPU load on the system (roughly, the number of
processes running or waiting to run) over three time periods: one, five
and fifteen minutes. My current laptop (Dell Inspiron with 13.3" form factor)
has two CPUs (a dual core CPU), so a load below 2 means my CPUs are not
overloaded. If the CPUs are overloaded, you have to check the programs at the
top of the list, which are taking most of the CPU time. The next important thing
to look at in the figure is the memory usage. It shows total memory of 3GB, out
of which roughly 1.6GB is already used. Actually, if you have been running the
system for a while, you will find that the entire 3GB is used. You may ask why
so much memory is used. The answer is that Linux is very aggressive in caching.
It caches any disk block that you read from or write to the disk, which improves
the performance of applications. If there is available memory, why not use it?
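The rule of thumb above (load compared with the number of CPUs) can be sketched as a small shell check. This is only an illustration: cpu_overloaded is a helper name I made up, and the snippet assumes a Linux system where /proc/loadavg and the nproc command are available.

```shell
# Sketch: compare the 1-minute load average with the number of CPUs.
# cpu_overloaded is a hypothetical helper, not a standard command.

# Usage: cpu_overloaded LOAD NCPU
# Succeeds (exit status 0) only when LOAD is greater than NCPU.
cpu_overloaded() {
    awk -v avg="$1" -v cpus="$2" 'BEGIN { exit !(avg > cpus) }'
}

# The first field of /proc/loadavg is the 1-minute load average;
# nproc (GNU coreutils) prints the number of available CPUs.
if [ -r /proc/loadavg ] && command -v nproc >/dev/null 2>&1; then
    read -r load1 _ < /proc/loadavg
    if cpu_overloaded "$load1" "$(nproc)"; then
        echo "load $load1 exceeds $(nproc) CPUs: check the top CPU consumers"
    else
        echo "load $load1 is within $(nproc) CPUs"
    fi
fi
```

On my dual core laptop this would report an overload only once the 1-minute load goes above 2.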

You will also see that the virtual memory of Xorg (the X server), firefox, and
rhythmbox (the music player) is very high. Most of them take more than
half a GB of virtual memory. The first thing that comes to mind is that
these applications are memory hogs. However, I inspected the virtual
memory of firefox by typing "less /proc/26745/smaps" (26745 is the firefox
process id). I found that the heap is just 60MB and the rest of the virtual
memory is used by different libraries mapped as shared objects. You do not need
to worry about these, as Linux keeps only one copy of each shared object in
memory and shares it across all the applications that use it.
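Reading smaps by eye gets tedious, so here is a rough sketch of how the same check could be scripted. The helper names (heap_kb, lib_kb, sample_smaps) and the sample mappings are made up for illustration; the only assumption about the real file is that each mapping starts with an address-range line followed by a "Size:" line in kB.

```shell
# Sketch: total the "Size:" of the heap and of shared libraries in smaps text.
# heap_kb, lib_kb and sample_smaps are hypothetical helpers.

# Sum the Size (kB) of every mapping whose last header field is [heap].
heap_kb() {
    awk '
        /^[0-9a-f]+-[0-9a-f]+ / { wanted = ($NF == "[heap]") }  # mapping header
        wanted && $1 == "Size:" { total += $2 }
        END { print total + 0 }
    '
}

# Sum the Size (kB) of every mapping backed by a shared object (.so file).
lib_kb() {
    awk '
        /^[0-9a-f]+-[0-9a-f]+ / { wanted = ($NF ~ /\.so/) }     # mapping header
        wanted && $1 == "Size:" { total += $2 }
        END { print total + 0 }
    '
}

# Made-up excerpt in the smaps layout, only for trying out the helpers.
sample_smaps() {
cat <<'EOF'
08048000-08050000 r-xp 00000000 08:01 1234       /usr/lib/firefox/firefox
Size:                 32 kB
Rss:                  32 kB
09000000-0c000000 rw-p 00000000 00:00 0          [heap]
Size:              49152 kB
Rss:               40000 kB
b7000000-b7200000 r-xp 00000000 08:01 5678       /usr/lib/libxul.so
Size:               2048 kB
Rss:                1024 kB
EOF
}

sample_smaps | heap_kb   # heap size in kB for the sample
sample_smaps | lib_kb    # shared library size in kB for the sample
```

For a real process you would run something like "heap_kb < /proc/26745/smaps", replacing 26745 with the process id you are interested in.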

Sometimes the machine may not have enough memory to run all the applications
and you will find that everything is slow. This may mean that the applications
are hitting the disk. To look at the disk activity, you should start
monitoring the virtual memory status together with disk I/O. Run "vmstat 1" and
you will see something like:

[Screenshot: output of vmstat 1]

The important things to look at are the swap activity (the si and so columns)
and the I/O activity (the bi and bo columns). If you see heavy read or write
activity on either of these, your applications are I/O bound.
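If you would rather not stare at the scrolling numbers, the swap columns can be picked out with a little awk. This is a sketch under assumptions: swap_samples and sample_vmstat are names I invented, the sample numbers are made up, and the column positions are looked up from the header row because they differ between vmstat versions.

```shell
# Sketch: count how many vmstat samples show swap-in or swap-out activity.
# swap_samples and sample_vmstat are hypothetical helpers.
swap_samples() {
    awk '
        $1 == "r" && $2 == "b" {              # header row: remember positions
            for (i = 1; i <= NF; i++) col[$i] = i
            next
        }
        ("si" in col) && $1 ~ /^[0-9]+$/ {    # data row
            if ($(col["si"]) + $(col["so"]) > 0) busy++
            total++
        }
        END { printf "%d of %d samples show swap activity\n", busy, total }
    '
}

# Made-up vmstat output, only for trying out the helper.
sample_vmstat() {
cat <<'EOF'
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 1  0      0 812340  50212 901244    0    0    12    30  250  600  5  2 92  1
 2  1  10240  20480   1024 102400  120  340   900  1500  800 2000 20 10 30 40
 0  0  10240 300000   2048 204800    0    0     5     8  200  500  3  1 95  1
EOF
}

sample_vmstat | swap_samples
```

On a live system the equivalent would be "vmstat 1 10 | swap_samples". Keep in mind that the first line vmstat prints is an average since boot, so it is the later samples that matter.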

Another tool to look at is latencytop. It requires root privileges to run and
you will see something like:

[Screenshot: output of latencytop]

You can monitor overall system latency as well as the latency of individual
applications. It tracks operations such as an fsync on the disk or a read from
a pipe, and shows the main causes of latency for a particular
application. Thanks to Jon Oberheide for
bringing this tool to my notice.

Finally, if any of your applications is stuck, you may like to use another tool
called strace. First find out
the process id of the application. For example, to find the process id of
firefox run the command "ps aux | grep firefox". Then run "strace -p" followed
by that process id, and you will see the system call, if any, that it is stuck at.
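One small annoyance with "ps aux | grep firefox" is that the grep process itself shows up in the result. The sketch below filters it out and prints only the process ids. pids_of and sample_ps are hypothetical helper names and the sample listing is made up; the helper assumes the usual "ps aux" layout, where the PID is the second column and the command starts at the eleventh.

```shell
# Sketch: extract process ids from "ps aux" output, skipping helper processes.
# pids_of and sample_ps are hypothetical helpers.

# Usage: ps aux | pids_of NAME
pids_of() {
    awk -v name="$1" '
        NR > 1 && index($0, name) && $11 != "grep" && $11 != "awk" { print $2 }
    '
}

# Made-up "ps aux" listing, only for trying out the helper.
sample_ps() {
cat <<'EOF'
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
sushant  26745  3.1  5.2 612340 160212 ?       Sl   10:02   4:11 /usr/lib/firefox/firefox
sushant  26802  0.4  1.0 210000  31000 ?       Sl   10:05   0:20 rhythmbox
sushant  27001  0.0  0.0   3236   804 pts/1    S+   11:30   0:00 grep firefox
EOF
}

sample_ps | pids_of firefox
```

With the real process list you would run "ps aux | pids_of firefox" and then attach with "strace -p 26745" (using whatever id was printed). Attaching to a process usually needs you to be the same user or root. The pgrep command is a ready-made alternative for the first step.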

Friday, October 24, 2008

Ideology: Is it bad, or do you already have one?

Alan Greenspan in a congressional testimony on October 23, 2008 said:

"Remember what an ideology is. It is a conceptual framework with the way people deal with reality. Everyone has one. To exist, you need an ideology. The question is whether it is accurate or not. What I am saying to you is .. Yes I found a flaw which I do not know how significant or permanent it is .. but I am very distressed by it."

I was listening to Greenspan's testimony and this caught my attention. In many discussions with my friends, I have realized that people hate ideologies, and more so when someone talks about them. I agree that ideologies are generalizations of reality and in many cases are over-generalizations. And so we need to fill in the details to understand which ideology is more accurate in which situation.

While the criticism that an ideology lacks detail is fair, the complete rejection of ideologies is quite dangerous. The problem is really the fact that everyone has an ideology. Some people know what ideology they hold and think about it, while for others it seeps into the mind without any sign. As a result, people do not realize that they have an ideology and an opinion on most things. Whether such opinions are fed by a liberal ideology or a conservative ideology may be just a matter of your surroundings. The point is that everyone has an ideology, and to argue that one does not have one is quite difficult. The argument that one should not have an ideology is also an ideology.

I personally do not agree that being ideological is itself wrong. Ideologies are useful at times because they help us plan for the long term.

To conclude, people need to be careful about their own inherent ideologies while arguing for de-ideology.

Sunday, September 7, 2008

Software patents have not withered on the Indian radar

Microsoft, along with an Indian subsidiary, has applied for a patent on file snapshots and a possible version maintenance system. The two things required to get a patent in India are that there should not be any prior art and that the patent application should present a new inventive step. However, we do not evaluate these issues for this particular patent because software patents are disallowed by Indian patent law.

There is a small backdoor created by the Indian Patent Office for granting software patents. In the recent draft of the patent manual (a guide for patent examiners in granting patents), there is a provision for granting software patents. However, these provisions do not have the force of law and may be invalidated when challenged in a court.

Wednesday, August 6, 2008

Privacy on the Indian-Net

Police in Amritsar have instructed cyber cafe owners to start tracking their users (Times of India). Cafe owners are now required to maintain a list of their users along with the times at which they used a particular computer. Each user has to furnish some form of identity, such as a driving licence or a college ID card. If users cannot, the cafe owners have to use CCTV to track them.

I could not work out what the objective of such record keeping is. Is it to track people who do not have a computer at home, or to trace terrorists? Second, who can get access to this data? Can one's parents or friends verify whether you were using a cafe at a particular time?

1. People who have an ID card but not a home computer can definitely be tracked exactly. If the browser cache has not been cleaned up, then the websites they visited can also be traced.

2. If such data is aggregated at a central place, then a simple query can tell exactly which cafes a particular person visited.

3. Can people without any form of ID card use computers at all?


If the objective is to trace terrorists, then the method has to be accurate enough not to blame innocent citizens. However, the simple logging mechanism described seems quite inaccurate because of the following problems:

1. Identity problem because of NAT: Many cafe operators share a single incoming connection among 10 to 20 computers. This is achieved using a Network Address Translation (NAT) device. The outside world will only see one computer. So if one discovers that a particular IP sent a threatening email, then one has to identify which one of the 20 computers was used. Without keeping track of the connection tables in the NAT this is not possible. It is quite possible, though, that narrowing the search to 20 suspects is still good enough for many purposes.

2. Anonymity provided by email services: Email providers like Gmail do not append your IP address to an email that you send using the web interface. Gmail only appends the IP address to the outgoing mail if you send it from a mail client (over SMTP) rather than the web interface. So if you find a threatening mail sent through Gmail, the last locatable IP on the email will be a Gmail server. You cannot locate the cafe that was used unless you force Google to release their logs, if they have collected them.

3. Infected machines: Machines can be infected with viruses, bots, key loggers and spam-sending malware. Such machines can be used for any purpose, including sending emails, without anyone noticing it. So a cafe with infected machines may be sending illegitimate mails that point back to innocent users.

4. Bypassing tracing: A person can enter a cafe and start a process that listens for mails on a TCP port. Then he goes home and redirects mails through the cafe computer. If NATs impair such activity, then he can tunnel out such a connection. Cafe owners won't even know that a person who came in during the day actually sent an email from their computer at night.
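To make point 1 concrete, here is a rough sketch of what tracing through a NAT would involve if the cafe did keep connection records. Everything here is hypothetical: the log format (start time, end time, internal IP, internal port, external port), the helper names nat_lookup and sample_nat_log, and the sample entries are all invented for illustration, and real NAT devices log in their own formats, if at all.

```shell
# Sketch: find which internal machine held a given external port at a given time.
# Hypothetical log format, one line per translated connection:
#   start_epoch end_epoch internal_ip internal_port external_port

# Usage: nat_lookup EXTERNAL_PORT EPOCH_TIME < natlog
nat_lookup() {
    awk -v port="$1" -v t="$2" '$5 == port && $1 <= t && t <= $2 { print $3 }'
}

# Made-up log entries, only for trying out the helper.
sample_nat_log() {
cat <<'EOF'
1218000000 1218000300 192.168.1.11 51000 40001
1218000050 1218000200 192.168.1.14 51020 40002
1218000400 1218000900 192.168.1.17 51044 40002
EOF
}

# Who was behind external port 40002 at time 1218000100?
sample_nat_log | nat_lookup 40002 1218000100
```

Note that the same external port is reused by a different machine later in the sample log, which is why the timestamp matters as much as the port. Without a record of this kind, the register of names and seat timings alone cannot separate the 20 machines behind one IP.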

I do not know if such information can misguide police into detaining innocent people but certainly such information is not completely reliable.

It will be interesting to know if such rules are against the right to liberty or the right to free movement.

Thursday, March 27, 2008

Auto back up your data

Losing your computer data is a bad thing. One good habit is to regularly back up your data. But doing this manually is too cumbersome, so we need automated scripts to back up the data. Here is a small script that I use to back up my own desktop daily. It uses rsync over ssh. You need to have ssh, rsync and an ssh public-private key pair set up for your backup machine. Setting up SSH key authentication is easy and here is one guide: http://www.ece.uci.edu/~chou/ssh-key.html

Add this script to your cron settings. On Linux, copy this file to the /etc/cron.daily/ directory and make it executable.

The script keeps one directory per day of the week on the backup machine, so each run only has to transfer what changed over the last week, which saves network bandwidth. SSH provides the encryption layer so that others cannot snoop on your data while you are backing up. This code is in the public domain. Customize it to your needs and auto back up your home directory from now onwards.

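#!/usr/bin/env bash
# Daily backup of the home directory to a remote machine using rsync over ssh.

RSYNC=rsync
SSH=ssh
KEY=/home/sushant/.ssh/id_rsa       # private key used for password-less login
RUSER=sushant
RHOST=mybackupmachine.com
RPATH="backup/daily/$(date +%u)"    # day of week (1-7): one directory per weekday
LPATH=/home/sushant/

# -a archive mode, -v verbose, -x stay on one filesystem, -z compress in transit.
# Dotfiles are excluded, and files deleted locally are also deleted remotely.
"$RSYNC" -avxz --exclude=".*" --force --delete --delete-excluded --ignore-errors \
    -e "$SSH -i $KEY" "$LPATH" "$RUSER@$RHOST:$RPATH"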
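A backup is only useful if you can get the data back, so here is a small sketch of the reverse direction. It is an illustration, not part of my cron job: restore_cmd is a made-up helper that only prints the rsync command for a given weekday directory instead of running it, and the restore directory in the example is an arbitrary choice. The variables mirror the ones in the script above.

```shell
# Sketch: print (not run) the rsync command that would pull one weekday copy back.
# restore_cmd is a hypothetical helper; the values mirror the backup script.
KEY=/home/sushant/.ssh/id_rsa
RUSER=sushant
RHOST=mybackupmachine.com

# Usage: restore_cmd DAY DEST
#   DAY  is the weekday number used by "date +%u" (1 = Monday ... 7 = Sunday)
#   DEST is the local directory to restore into
restore_cmd() {
    printf 'rsync -avz -e "ssh -i %s" %s@%s:backup/daily/%s/ %s\n' \
        "$KEY" "$RUSER" "$RHOST" "$1" "$2"
}

# Show the command for restoring the Wednesday copy into a scratch directory.
restore_cmd 3 /home/sushant/restore/
```

Printing the command first lets you check the source and destination before anything is copied. The trailing slash after the day number makes rsync copy the contents of that directory rather than the directory itself.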

Friday, February 29, 2008

Is software patentable in India?

A number of software developers assume that computer software is patentable in India, as it is in the US. However, the good news is that software cannot be patented in India. Section 4(k) of THE PATENTS (AMENDMENT) ACT specifically prohibits this.

This section becomes Section 3(k) of the Indian Patent Act. It is interesting that the Government allowed software patents in India using an Ordinance in January 2005. An Ordinance is a way for the Indian Government to promulgate a law without going through the Parliament. However, an Ordinance needs to be ratified by the Parliament within six weeks of its reassembling, failing which the Ordinance ceases to exist.

However, the Communist Party of India (Marxist) (CPI(M)) did not like it and forced the government to withdraw software patents. CPI(M) replied harshly to the ordinance passed by the Government of India in January 2005 that allowed software patents.

CPI(M) rejoiced in winning against the Govt proposal for software patents

Actually, if you search www.cpim.org for "software patent" you will find more relevant articles:

http://www.google.com/search?hl=en&client=firefox-a&rls=com.ubuntu%3Aen-US%3Aofficial&hs=fcS&q=software+patent+site%3Awww.cpim.org&btnG=Search

Three cheers for the communist party!

Friday, January 4, 2008

Starting indiankanoon.org

India prides herself on being the largest democracy in the world. There are three broad pillars of Indian democracy: the legislatures that make laws, the executive that enforces laws and the judiciary that interprets laws. The laws regulate a number of areas like criminal offenses, civil cases, taxation, trade, social welfare, education and labor rights.

Even though laws empower citizens in a large number of ways, a significant fraction of the population is completely ignorant of their rights and privileges. As a result, common people are afraid of going to the police and rarely go to court to seek justice. People continue to live in fear of unknown laws.

A number of attempts have been made to bring the knowledge of law to the common people. The Government of India took active efforts to present all laws along with their amendments at indiacode.nic.in and all court judgments at judis.nic.in.

While it is commendable to make law documents available to common people, it is still quite difficult for them to find the required information. The first problem is that acts are very large and in most scenarios just a few sections of a law are applicable. Finding the most applicable sections in hundreds of pages of law documents is too daunting for common people. Secondly, laws are often vague and one needs to see how they have been interpreted by the courts. Currently, the laws and judgments are maintained separately, and finding the judgments that interpret a certain clause of a law is difficult.

Indian Kanoon was started in order to remove these two structural problems. It does so by breaking law documents into the smallest possible clauses and by integrating laws and statutes with court judgments. A tight integration of court judgments with laws allows automatic determination of the most relevant clauses and court judgments. I hope Indian Kanoon helps you in your search for Indian laws and their interpretations.

The Indian Kanoon main search page is here
The Indian Kanoon forum is here.