Print 50 sequence characters per line from a Clustal alignment

Developer https://www.devze.com 2022-12-31 01:12 Source: Internet

I have a multiple sequence alignment file in Clustal format. I want to read it and print the sequences in a cleaner, more readable layout.

I am reading the file with Biopython's AlignIO module:

from Bio import AlignIO

alignment = AlignIO.read("opuntia.aln", "clustal")

print("Number of rows: %i" % len(alignment))

for record in alignment:
    print("%s - %s" % (record.id, record.seq))

The output is messy and needs a lot of horizontal scrolling. What I want is to print only 50 characters of each sequence per line, block after block, until the end of the alignment.

I would like the output to look like the alignments shown at http://www.ebi.ac.uk/Tools/clustalw2/.


Br,

I don't have Biopython on this computer, so this is untested, but it should work:

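chunk_size = 50
length = alignment.get_alignment_length()

for i in range(0, length, chunk_size):
    print()  # blank line between blocks
    for record in alignment:
        # name, one slice of up to 50 columns, and the end column of that slice
        print("%s\t%s %i" % (record.name, record.seq[i:i + chunk_size], min(i + chunk_size, length)))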

This uses the same trick as Eli's answer: range() generates the start index of each slice, and the inner loop prints that slice for every record in the alignment.
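For anyone who wants to try the block layout without Biopython installed, here is a minimal sketch that works on plain (name, sequence) pairs. The function name format_blocks and the sample data are made up for illustration and are not part of Biopython; with a real alignment you would presumably pass (record.id, str(record.seq)) pairs. Note that the number at the end of each line is the end column of the block, gaps included.

```python
def format_blocks(records, width=50):
    """Return aligned sequences laid out in blocks of `width` columns.

    `records` is a list of (name, sequence) pairs of equal length.
    Hypothetical helper for illustration, not part of Biopython.
    """
    if not records:
        return ""
    length = len(records[0][1])
    # Pad names to a common width so the sequence columns line up
    pad = max(len(name) for name, _ in records)
    blocks = []
    for start in range(0, length, width):
        end = min(start + width, length)
        lines = [
            "%s  %s %i" % (name.ljust(pad), seq[start:end], end)
            for name, seq in records
        ]
        blocks.append("\n".join(lines))
    # One blank line between blocks
    return "\n\n".join(blocks)


# Made-up sample data: two aligned sequences of 120 columns each
records = [("seq1", "ACGT" * 30), ("seq_two", "TTGA" * 30)]
print(format_blocks(records))
```

Returning a string instead of printing directly makes it easy to write the result to a file or check it in a test.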


Do you require something more complex than simply breaking record.seq into chunks of 50 characters, or am I missing something?

You can do that very easily with Python sequence slicing. seq[N:N+50] returns the 50 elements of the sequence starting at index N (fewer if the sequence ends first):

In [23]: import random

In [24]: seq = ''.join(str(random.randint(1, 4)) for i in range(200))

In [25]: seq
Out[25]: '13313211211434211213343311221443122234343421132111223234141322124442112343143112411321431412322123214232414331224144142222323421121312441313314342434231131212124312344112144434314122312143242221323123'

In [26]: for n in range(0, len(seq), 50):
   ....:     print(seq[n:n+50])
   ....:
13313211211434211213343311221443122234343421132111
22323414132212444211234314311241132143141232212321
42324143312241441422223234211213124413133143424342
31131212124312344112144434314122312143242221323123
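The same slicing can be wrapped in a small generator so it is reusable. This is only a sketch: the name chunks is my own choice, and it should work on any string or other sliceable sequence, gap characters included.

```python
def chunks(seq, size=50):
    """Yield consecutive slices of `seq`, each at most `size` long."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(seq), size):
        # Slicing past the end is safe: the last chunk is simply shorter
        yield seq[start:start + size]


# Made-up sample: 120 characters including gap symbols
seq = "ACGT-" * 24
for piece in chunks(seq):
    print(piece)
```

Plain slicing is used here rather than textwrap.wrap, because textwrap may break lines at hyphens, which would give uneven chunks for gapped sequences.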