Extracting values from text file

Developer https://www.devze.com 2023-04-13 00:15 Source: Internet
I have one text file as follows with 2 columns

44333373    -5.829738285
3007762     -5.468521083
16756295    -5.247183569
46197456    -5.216096421
46884567    -5.195179321
44333390    -5.162411562
44420579    -5.133122186
6439190     -5.028260409
...

I want to extract the values which are greater than -5.162411562. The ideal output should look like this:

Output

44333373    -5.829738285
3007762     -5.468521083
16756295    -5.247183569
46197456    -5.216096421
46884567    -5.195179321

To accomplish this task I wrote a simple Python script:

f1=open("file.txt","r")
n=0
for line in f1.readlines():
     if float(n) > -5.162411562:
        print line

But it just prints all the data in the file. I know it is a very simple task, but I am not able to figure out where I am going wrong. Can anybody help?


Well, you need to actually set n to a value aside from zero. How about:

with open('file.txt') as f1:
    for line in f1:  # readlines is not necessary here
        n = float(line.split()[1])  # line.split()[0] is the first number
        if n > -5.162411562:
            print(line.rstrip('\r\n'))  # rstrip to remove the existing EOL in line


The issue with the code you have presented is that the value of n never changes, so the if statement will always evaluate to True, and therefore every line will be printed:

f1=open("file.txt","r")
n=0  # the `n` is set here
for line in f1.readlines():
     if float(n) > -5.162411562:  # `n` is always `0`, so this is always `True`
        print line

You'll want to update the variable n with the number extracted from the second column of each line.

Furthermore, the if condition needs its comparison operator changed from > (greater than) to < (less than), as the values you show in your output are "less than -5.162411562", not "greater than" it.
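To make the direction of the comparison concrete, here is a small check written in Python 3 syntax (the code in the question uses the Python 2 print statement). The values are copied from the second column of the question's sample; the variable names are only illustrative.

```python
# Values copied from the second column of the sample data in the question.
values = [-5.829738285, -5.468521083, -5.247183569, -5.216096421,
          -5.195179321, -5.162411562, -5.133122186, -5.028260409]
threshold = -5.162411562

# With negative numbers, "further from zero" means "less than".
less_than = [v for v in values if v < threshold]
greater_than = [v for v in values if v > threshold]

print(less_than)     # the five values shown in the desired output
print(greater_than)  # the two values closer to zero than the threshold
```

The threshold itself lands in neither list, because both comparisons are strict.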

Also, it should be noted that the n=0 is not necessarily required.

With those changes, we get the following code:

f1 = open("file.txt","r")
for line in f1.readlines():
  n = line.split()[1]          # get the second column
  if float(n) < -5.162411562:  # changed the direction comparison
     print line.rstrip()       # remove the newline from the line read
                               # from the file to prevent doubling of newlines
                               # from the print statement
f1.close()                     # for completeness, close the file

The resulting output is:

44333373        -5.829738285
3007762         -5.468521083
16756295        -5.247183569
46197456        -5.216096421
46884567        -5.195179321
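For anyone on Python 3, the same idea could be sketched roughly as follows. The function name rows_below and the in-memory sample list are assumptions made for illustration; they are not part of the original script, which reads from file.txt.

```python
def rows_below(lines, threshold):
    """Yield each line whose second whitespace-separated column is < threshold."""
    for line in lines:
        n = float(line.split()[1])      # second column as a number
        if n < threshold:
            yield line.rstrip("\r\n")   # drop the trailing newline

# A few rows from the question, standing in for the contents of file.txt.
sample = [
    "44333373    -5.829738285\n",
    "46884567    -5.195179321\n",
    "44333390    -5.162411562\n",
    "6439190     -5.028260409\n",
]

for row in rows_below(sample, -5.162411562):
    print(row)
```

Because the function only needs an iterable of lines, an open file object can be passed in place of the list.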


Each line contains something like 44333373 -5.829738285. When looping through the lines you need to split the line and take the second element (index 1); you don't need n at all. Then compare. So the code changes to:

f1=open("file.txt","r")
for line in f1.readlines():
     if float(line.split()[1]) > -5.162411562:
        print line

A slight modification here: readlines reads the entire file contents into memory in one go, so a very large file could cause problems. A file object in Python is an iterator over its lines, so you can loop over it directly. How cool is that! Also, open opens a file in read mode by default. So the code further simplifies to:

for line in open('file.txt'):
    if float(line.split()[1]) > -5.162411562:
        print line
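A slightly more defensive variant, in Python 3 syntax, might look like the sketch below. It keeps the line-by-line iteration and the > comparison used in this answer, and additionally assumes the file could contain blank or malformed rows, which the question does not state. The name filter_file is hypothetical, and io.StringIO stands in for a real file so the example is self-contained.

```python
import io

def filter_file(handle, threshold):
    """Lazily yield rows whose second column is greater than threshold."""
    for line in handle:              # one line at a time, never the whole file
        fields = line.split()
        if len(fields) < 2:
            continue                 # skip blank or too-short lines
        try:
            value = float(fields[1])
        except ValueError:
            continue                 # skip rows whose second column is not numeric
        if value > threshold:
            yield line.rstrip("\r\n")

# io.StringIO stands in for an open file. With a real file one would write:
#     with open("file.txt") as f1:
#         for row in filter_file(f1, -5.162411562):
#             print(row)
fake_file = io.StringIO(
    "44420579    -5.133122186\n"
    "\n"
    "46884567    -5.195179321\n"
    "bad row\n"
    "6439190     -5.028260409\n"
)
kept = list(filter_file(fake_file, -5.162411562))
print(kept)
```

Using a with block, as in the comment, also closes the file promptly, which a bare for line in open(...) loop leaves to the interpreter.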

Hope this helps...
