Sunday, 15 September 2013

Length of Python dictionary created doesn't match length from input file

Hi Python programmers out there!
I'm currently trying to create a dictionary from the following input file:
1776344_at 1779734_at 0.755332745 1.009570769 -0.497209846
1776344_at 1771911_at 0.931592828 0.830039019 2.28101445
1776344_at 1777458_at 0.746306282 0.753624146 3.709120716
... ...
There are a total of 12552 lines in this file. What I want to do is
create a dictionary where the first 2 columns together form the key and
the remaining columns form the value. This I've successfully done, and it
looks something like this:
1770449_s_at;1777263_at:0.825723773;1.188969175;-2.858979578
1772892_at;1772051_at:-0.743866602;-1.303847456;26.41464414
1777227_at;1779218_s_at:0.819554413;0.677758609;4.51390617
But here's THE THING: I ran my Python script on MS-DOS cmd, and the
generated output not only is in a different order from the input file
(i.e. its 1st line is the input's 34th line), the whole file only has 739
lines.
Can someone enlighten me as to what's going on? Is it something to do with
memory? Cos the last time I checked I still had 305GB of disk space.
The script I wrote is as follows:
import sys
import os

input_file = sys.argv[1]
infile = open(input_file, 'r')

model_dict = {}
for line in infile:
    key = ';'.join(line.split('\t')[0:2]).rstrip(os.linesep)
    value = ';'.join(line.split('\t')[2:]).rstrip(os.linesep)
    print 'keys are:',key,'\n','values are:',value
    model_dict[key] = value
print model_dict

outfile = open('model_dict', 'w')
for key,value in model_dict.items():
    print key,value
    outfile.write('%s:%s\n' % (key,value))
outfile.close()
Thank you guys in advance!
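One check that might narrow this down, offered as a sketch rather than a diagnosis: a dictionary keeps only one value per key, so any two lines that share the same first two columns collapse into a single entry, and a plain dict in this version of Python does not remember insertion order. Whether the real file actually contains repeated key pairs has not been verified here. The helper name load_pairs and the three sample lines below are hypothetical, made up for illustration, and not taken from the input file.

```python
from __future__ import print_function  # lets the same code run on Python 2.7 and 3

from collections import Counter, OrderedDict


def load_pairs(lines):
    """Build an ordered dict keyed on the first two tab-separated columns.

    Returns (model_dict, key_counts). key_counts records how many input
    lines produced each key, so repeated keys become visible.
    """
    model_dict = OrderedDict()  # remembers the order in which keys first appear
    key_counts = Counter()
    for line in lines:
        fields = line.rstrip('\r\n').split('\t')
        if len(fields) < 3:  # skip blank or malformed lines
            continue
        key = ';'.join(fields[:2])
        model_dict[key] = ';'.join(fields[2:])  # a repeated key overwrites the old value
        key_counts[key] += 1
    return model_dict, key_counts


# Hypothetical sample (not from the real file): the first key pair appears twice.
sample = [
    'a_at\tb_at\t0.1\t0.2\t0.3\n',
    'a_at\tc_at\t0.4\t0.5\t0.6\n',
    'a_at\tb_at\t0.7\t0.8\t0.9\n',
]
model_dict, key_counts = load_pairs(sample)
repeated = dict((k, n) for k, n in key_counts.items() if n > 1)
print(len(sample), 'lines read,', len(model_dict), 'distinct keys')
print('keys seen more than once:', repeated)
```

Running the same function on the real file (for example load_pairs(open(sys.argv[1]))) would show whether the 12552 lines really contain only 739 distinct key pairs. If they do, the values would need to be stored in a list per key rather than overwritten.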
