Wednesday, April 11, 2012

Syncing files between two folders on different computers (the dirty way)

Recently I submitted my assignment for my Distributed Systems class. I took it too lightly, and so I procrastinated. I started working on it at the eleventh hour, and quickly realized how FUBAR my situation was. I set up a pair of Ubuntu virtual machines on my system and connected them over a virtual LAN. 135 lines later, I apparently had something that bore a slight resemblance to the thing I was required to make.

I compare three snapshots of the file list: the source as of the previous sync, the source now, and the target now. The sync is planned from the differences between them. Files that have been deleted on one side since the last sync are deleted on the other, and everything else is transferred efficiently and securely between the two VMs with rsync over ssh. One necessary condition for this to work is that the system clocks of the two machines show almost the same time, because the scripts decide which copy is newer by comparing modification times. Another problem is that the synchronizations must happen quite frequently. Also, it has not been extended to more than two machines or to sub-folders.
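The three-snapshot comparison described above can be sketched with plain set arithmetic. This is a minimal illustration rather than the assignment code: the function name `plan_sync` and the example file names are made up, and each snapshot is assumed to be just a set of file names.

```python
def plan_sync(prev, server, client):
    """Classify file names from three snapshots (all plain sets of names).

    prev   -- names on the server at the previous sync (assumed input)
    server -- names on the server now
    client -- names on the client now
    """
    return {
        # Known at the last sync, gone from one side: delete on the other.
        "delete_on_client": (prev & client) - server,
        "delete_on_server": (prev & server) - client,
        # Unknown at the last sync: new files that should be copied across.
        "new_on_server": server - prev,
        "new_on_client": client - prev,
    }


plan = plan_sync(
    prev={"a.txt", "b.txt", "c.txt"},
    server={"a.txt", "c.txt", "d.txt"},  # b.txt deleted, d.txt added
    client={"a.txt", "b.txt", "e.txt"},  # c.txt deleted, e.txt added
)
for action in sorted(plan):
    print(action, sorted(plan[action]))
```

Without the previous snapshot, a file present on only one side would be ambiguous: it could be newly added there or deleted on the other side. The third snapshot is what lets the two cases be told apart.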

Luckily, it worked well enough to earn marks.

The following Python script compares the file lists:
from __future__ import print_function  # lets the script run on Python 2.6+ and 3
import sys


def read_listing(path):
    """Map each file name to its [date, time] fields.

    The first line is skipped: it comes from the "total N" header of ls.
    """
    entries = {}
    with open(path) as listing:
        next(listing, None)
        for line in listing:
            k = line.split()
            if len(k) >= 3:
                entries[k[0]] = [k[1], k[2]]
    return entries


def timestamp_key(date_str, time_str):
    """Turn "2012-04-11" and "10:23:45.000000000" into a comparable tuple."""
    year, month, day = [int(x) for x in date_str.split("-")]
    hour, minute, second = time_str.split(":")
    return (year, month, day, int(hour), int(minute), int(second.split(".")[0]))


def report(title, names):
    print(title)
    print("------------------------")
    for name in sorted(names):
        print(name)
    print()


dict_s = read_listing(sys.argv[1])  # server, now
dict_c = read_listing(sys.argv[2])  # client, now
dict_p = read_listing(sys.argv[3])  # server, at the previous sync

set_key_s = set(dict_s)
set_key_c = set(dict_c)
set_key_p = set(dict_p)

# Known at the last sync and still on the client, but gone from the server.
removed = (set_key_c & set_key_p) - set_key_s
added = set_key_s - set_key_p
present_in_both = set_key_s & set_key_c
new_in_client = set_key_c - set_key_p
# Known at the last sync but gone from the client.
removed_c = set_key_p - set_key_c

# For files on both sides, compare full timestamps to see which copy is newer.
newer_on_server = []
older_on_server = []
for name in present_in_both:
    ts = timestamp_key(*dict_s[name])
    tc = timestamp_key(*dict_c[name])
    if ts > tc:
        newer_on_server.append(name)
    elif ts < tc:
        older_on_server.append(name)

report("Files removed from Server since last sync:", removed)
report("Files added to Server since last sync:", added)
report("Newer Files in Server", newer_on_server)
report("Older Files in Server", older_on_server)
report("Files removed from Client since last sync:", removed_c)
report("Files added to Client since last sync", new_in_client)

# Deletion lists read by the driver script: X is applied on the client, Y on the server.
with open("/home/utsav2/Desktop/temp/X", "w") as to_be_removed:
    for name in removed:
        to_be_removed.write(name + "\n")
with open("/home/utsav2/Desktop/temp/Y", "w") as to_be_removed_c:
    for name in removed_c:
        to_be_removed_c.write(name + "\n")
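The script above gets modification times by parsing `ls --full-time` output. As an alternative sketch (not what the assignment used), the same name-to-timestamp map can be built for a local folder with `os.scandir` (Python 3.5+), which also copes with spaces in file names. The function name `snapshot` is hypothetical, and the remote side would still need something like ssh to produce its own listing.

```python
import os
import tempfile


def snapshot(folder):
    """Return {file name: modification time in seconds} for regular files.

    Sub-folders are skipped, matching the limitation noted in the post.
    """
    result = {}
    for entry in os.scandir(folder):
        if entry.is_file():
            result[entry.name] = entry.stat().st_mtime
    return result


# Small demonstration in a throwaway directory.
demo_dir = tempfile.mkdtemp()
with open(os.path.join(demo_dir, "notes with spaces.txt"), "w") as handle:
    handle.write("hello\n")
os.mkdir(os.path.join(demo_dir, "subdir"))

demo = snapshot(demo_dir)
print(sorted(demo))
```

Because the values are plain numbers, deciding which copy is newer becomes a single comparison instead of date and time string parsing.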


The following script drives the main program (it does the comparison, transfer, and deletion):
#!/bin/sh
clear
echo "---------------------------------------------------------------"
figlet -kc "my_Sync"
echo
echo "seamlessly sync folders :)"
echo
echo "Enter Source IP"
read -r sourIP
echo
echo "Connecting to $sourIP..."
# Snapshot both folders as "name date time" lines (file names with spaces are not supported).
ssh "$sourIP" "ls --full-time /home/utsav1/Desktop/Sync/" | awk '{print $9, $6, $7}' > /home/utsav2/Desktop/temp/server.txt
ls --full-time /home/utsav2/Desktop/Sync/ | awk '{print $9, $6, $7}' > /home/utsav2/Desktop/temp/client.txt
python /home/utsav2/Desktop/compare.py /home/utsav2/Desktop/temp/server.txt /home/utsav2/Desktop/temp/client.txt /home/utsav2/Desktop/temp/prev.txt
# X lists files deleted on the server, so delete them here.
while IFS= read -r l; do rm -- "/home/utsav2/Desktop/Sync/$l"; done < /home/utsav2/Desktop/temp/X
# Y lists files deleted here, so delete them on the server.
# -n stops ssh from reading stdin, which would otherwise swallow the rest of the list.
while IFS= read -r l; do ssh -n "$sourIP" "rm -- '/home/utsav1/Desktop/Sync/$l'"; done < /home/utsav2/Desktop/temp/Y
echo
echo "Syncing..."
# -u skips files that are newer on the receiving side, so each direction only copies newer files.
rsync -zaue ssh "$sourIP":/home/utsav1/Desktop/Sync /home/utsav2/Desktop/
rsync -zaue ssh /home/utsav2/Desktop/Sync "$sourIP":/home/utsav1/Desktop/
# Keep this run's server listing as the "previous sync" snapshot for the next run.
cp /home/utsav2/Desktop/temp/server.txt /home/utsav2/Desktop/temp/prev.txt
echo
echo "******Source and Destination Synced*******"
echo
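The driver script's local deletion step (reading a list of names and removing each one from the Sync folder) could also be written in Python. The sketch below is only an illustration under assumptions: the function name `apply_deletions` is hypothetical, and the demonstration uses temporary directories in place of the real `Sync` folder and `temp/X` list.

```python
import os
import tempfile


def apply_deletions(list_path, folder):
    """Delete every file named in list_path (one name per line) from folder.

    Returns the names actually removed. Names containing a path separator
    are skipped so that a bad line cannot reach outside folder.
    """
    removed = []
    with open(list_path) as names:
        for line in names:
            name = line.rstrip("\n")
            if not name or "/" in name or os.sep in name or name in (".", ".."):
                continue
            target = os.path.join(folder, name)
            if os.path.isfile(target):
                os.remove(target)
                removed.append(name)
    return removed


# Demonstration with temporary files standing in for Sync/ and temp/X.
sync_dir = tempfile.mkdtemp()
for name in ("keep.txt", "old report.txt"):
    open(os.path.join(sync_dir, name), "w").close()
list_file = os.path.join(tempfile.mkdtemp(), "X")
with open(list_file, "w") as handle:
    handle.write("old report.txt\nmissing.txt\n../escape.txt\n")

deleted = apply_deletions(list_file, sync_dir)
print(deleted)
print(sorted(os.listdir(sync_dir)))
```

Missing files and suspicious names are skipped quietly here, whereas the shell loop would print an error from `rm` for each of them.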