Populate arrays in python (numpy)?

Developer https://www.devze.com 2023-03-08 22:50 Source: Internet

Given a file in the format below:

a a 0
a b 1
a c 1
b b 0
b a 1
b c 1
c c 0
c a 1
c b 1

The third column is the distance between the items in the first and second columns. If I read such a file into Python as a nested list, how do I convert it to a symmetric matrix, i.e., one whose entry (i, j) is the distance between item i and item j? I also wish to include the column and row names.

I would preferably like to use numpy to complete this task.

Any suggestions?

Thanks, D.

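import numpy as np
from itertools import count

# 'file.txt' is a placeholder name for the input file
with open('file.txt') as inputfile:
    data = [line.split() for line in inputfile if line.strip()]

# Map each label to an index, in sorted order
rows = dict(zip(sorted(set(line[0] for line in data)), count()))
cols = dict(zip(sorted(set(line[1] for line in data)), count()))
array = np.zeros((len(rows), len(cols)))

# Store each distance at its (row, column) position
for row, col, val in data:
    index = (rows[row], cols[col])
    array[index] = float(val)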

I don't know how to label rows and columns in numpy, so I just made a dict mapping the row label to the row index and another doing the same for the columns. If you need it you can make a reverse map, as below, or you can make rows and cols a bidict.

rows_reverse = dict((v, k) for k, v in rows.items())
cols_reverse = dict((v, k) for k, v in cols.items())
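For a quick check, the dict-based idea can be run end to end on the sample data from the question. This is only a minimal sketch: the `sample` string, the single shared label list for rows and columns, and the helper name `build_matrix` are assumptions made for illustration and are not part of the original answer.

```python
import numpy as np

# Sample data copied from the question; in practice it would come from a file.
sample = """a a 0
a b 1
a c 1
b b 0
b a 1
b c 1
c c 0
c a 1
c b 1"""


def build_matrix(lines):
    """Return (labels, matrix) for 'row col value' lines (hypothetical helper)."""
    data = [line.split() for line in lines if line.strip()]
    # One shared, sorted label list so that rows and columns line up.
    labels = sorted({r for r, _, _ in data} | {c for _, c, _ in data})
    index = {label: i for i, label in enumerate(labels)}
    matrix = np.zeros((len(labels), len(labels)))
    for r, c, val in data:
        matrix[index[r], index[c]] = float(val)
    return labels, matrix


labels, matrix = build_matrix(sample.splitlines())

# Print the matrix with a header row and a label at the start of each row.
print("  " + " ".join(labels))
for label, row in zip(labels, matrix):
    print(label, " ".join(str(int(v)) for v in row))

# A distance matrix built from complete data should equal its transpose.
is_symmetric = np.array_equal(matrix, matrix.T)
```

Using one label list for both axes, rather than separate row and column maps, keeps the matrix square even if some label appears in only one of the two columns.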

A slightly different approach:

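import numpy as np
# Load the "Row Col Value" text file into a structured array
# ('U1' assumes single-character labels, as in the sample data)
ar = np.loadtxt('file.txt', dtype=[('R', 'U1'), ('C', 'U1'), ('V', 'i')])
# Sorted list of every label that appears in either column
names = np.unique(np.concatenate((ar['R'], ar['C']))).tolist()
# Vectorized lookup from a label to its index in names
vf = np.vectorize(lambda x: names.index(x), otypes='i')
# Place each value at its (row, column) position; unlisted pairs stay 0
out = np.zeros((len(names), len(names)), 'i')
out[vf(ar['R']), vf(ar['C'])] = ar['V']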
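As a variation on the same idea, `np.unique` can hand back the index of every label directly through its `return_inverse` argument, which avoids the repeated `names.index` lookups. The sketch below swaps in that technique and reads from an in-memory `io.StringIO` buffer rather than `file.txt`. Both choices, along with the variable names, are assumptions for illustration rather than the answerer's method.

```python
import io
import numpy as np

# Sample data from the question, held in memory instead of a file.
text = "a a 0\na b 1\na c 1\nb b 0\nb a 1\nb c 1\nc c 0\nc a 1\nc b 1\n"

# Read every field as a string; the result has shape (n_lines, 3).
raw = np.loadtxt(io.StringIO(text), dtype=str)
n = raw.shape[0]

# Stack the row labels and column labels into one 1-D array, then ask
# np.unique for the sorted labels and the index of each input label.
both = np.concatenate((raw[:, 0], raw[:, 1]))
names, inverse = np.unique(both, return_inverse=True)
r_idx, c_idx = inverse[:n], inverse[n:]

# Fill a square integer matrix; pairs missing from the input stay 0.
out = np.zeros((len(names), len(names)), dtype=int)
out[r_idx, c_idx] = raw[:, 2].astype(int)

print(names.tolist())
print(out)
```

Here `names` doubles as the row and column labels: `names[i]` is the label for row `i` and for column `i` of `out`.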