Tuesday 10 April 2012

Use md5sum to compare files in Linux

To compare two files in Linux, the first utility we can think of is diff.

Suppose we have two files, /root/abc.txt and /root/cba.txt.

To compare them using diff:
diff /root/abc.txt /root/cba.txt
Besides diff, we can use md5sum to compare the checksums of the two files:
md5sum /root/abc.txt
md5sum /root/cba.txt
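Then compare the two checksums by eye. A smarter way is to pair the checksum of abc.txt with the name of cba.txt and let md5sum do the comparison. The two spaces before the file name match the format md5sum itself writes:
md5sum /root/abc.txt | awk '{print $1 "  /root/cba.txt"}' > /tmp/cksum.txt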
md5sum -c /tmp/cksum.txt
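If the two files are identical, md5sum -c reports /root/cba.txt: OK; otherwise it reports FAILED. md5sum is especially useful when abc.txt and cba.txt are too large to diff comfortably, although it only tells us whether the files differ, not where.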
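
For a one-off comparison, the temporary file can be skipped by comparing the two checksum strings directly. The following is only a minimal sketch, assuming GNU md5sum and a POSIX shell; the function name same_md5 and the demo files are made up for illustration.

```shell
#!/bin/sh
# same_md5 FILE1 FILE2
# Exit 0 if both files have the same MD5 checksum, 1 if they differ,
# 2 if either file cannot be read.
# (same_md5 is a hypothetical helper name, not part of md5sum.)
same_md5() {
    [ -r "$1" ] && [ -r "$2" ] || return 2
    # Feed each file on stdin and keep only the first field (the checksum).
    sum1=$(md5sum < "$1" | awk '{print $1}')
    sum2=$(md5sum < "$2" | awk '{print $1}')
    [ "$sum1" = "$sum2" ]
}

# Demo with throwaway files in a scratch directory.
tmpdir=$(mktemp -d)
printf 'hello\n' > "$tmpdir/abc.txt"
printf 'hello\n' > "$tmpdir/cba.txt"
printf 'world\n' > "$tmpdir/other.txt"

if same_md5 "$tmpdir/abc.txt" "$tmpdir/cba.txt"; then
    echo "abc.txt and cba.txt match"
fi
if ! same_md5 "$tmpdir/abc.txt" "$tmpdir/other.txt"; then
    echo "abc.txt and other.txt differ"
fi

rm -rf "$tmpdir"
```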

md5sum can also be used to validate files transferred to a remote site.

In one of my jobs, I need to transfer big files from server to server. During the copy, a server may crash or we may lose connectivity between servers. Many things can leave the copy incomplete, and we end up with corrupted files on some of the servers. To avoid this situation, we can generate the checksums of all the files to be copied on the source server:
md5sum file1 file2 file3 > cksum.txt
Copy cksum.txt together with the files to the destination server.

At the destination server, verify that all the files have been copied over and have the same checksums as the files on the source server:
md5sum -c cksum.txt
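
The whole round trip can be tried locally. This is an illustration only: two temporary directories stand in for the source and destination servers, cp stands in for whatever copy tool is actually used, and the file contents are invented. It assumes GNU md5sum, whose --quiet option prints only the files that fail.

```shell
#!/bin/sh
# Stand-ins for the two servers (hypothetical; a real transfer
# would use scp, rsync, or similar between machines).
src=$(mktemp -d)
dst=$(mktemp -d)

# Sample files on the "source server".
printf 'first file\n'  > "$src/file1"
printf 'second file\n' > "$src/file2"
printf 'third file\n'  > "$src/file3"

# 1. On the source: record checksums using relative names, so the
#    list is still valid in a different directory on the destination.
(cd "$src" && md5sum file1 file2 file3 > cksum.txt)

# 2. Copy the files together with cksum.txt.
cp "$src/file1" "$src/file2" "$src/file3" "$src/cksum.txt" "$dst/"

# Simulate an interrupted copy by truncating one file at the destination.
printf 'third' > "$dst/file3"

# 3. On the destination: verify. md5sum -c exits non-zero if any
#    listed file is missing or has a different checksum.
if (cd "$dst" && md5sum -c --quiet cksum.txt); then
    result="all files verified"
else
    result="verification failed, copy the files again"
fi
echo "$result"
```

Because file3 was truncated on purpose, the check fails here; copying file3 again from the source makes the same md5sum -c command succeed.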

Websites like apache.org also provide checksums together with their software. We can use these checksums to validate that we have downloaded the correct file completely.
http://www.apache.org/dist/httpd/httpd-2.4.1.tar.gz.md5 shows that the md5 checksum for httpd-2.4.1.tar.gz is:
4366afbea8149ca125af01fd59a2f8a2 *httpd-2.4.1.tar.gz
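
A line in this layout (checksum, a space, an asterisk marking binary mode, then the file name) can be passed straight to md5sum -c, provided the .md5 file sits in the same directory as the download. The sketch below avoids any network access: demo.tar.gz is a made-up stand-in for a real download, and its .md5 file is generated locally rather than fetched from a publisher. It assumes GNU md5sum.

```shell
#!/bin/sh
# Work in a scratch directory; demo.tar.gz is a hypothetical file,
# not a real release.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'pretend this is a tarball\n' > demo.tar.gz

# Stand-in for the publisher's .md5 file. With -b, md5sum writes
# "<checksum> *<name>", the same layout as the httpd line above.
md5sum -b demo.tar.gz > demo.tar.gz.md5

# Verify the "download" against the checksum file.
# --status suppresses all output; only the exit code is used.
if md5sum -c --status demo.tar.gz.md5; then
    verdict="download OK"
else
    verdict="download corrupted or incomplete"
fi
echo "$verdict"
```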

2 comments:

  1. The md5sum command is used to calculate and print the hash of the given file. This is a handy tool if you're looking for a quick way to see whether your files have changed.
    https://codeprozone.com/code/shell/19144/md5sum-windows.html
