I have to create a distance matrix using R. My data is in an Excel file which contains 300 rows and 10 columns. I have to create the distance matrix based on the values of the 9th column. For example:
s s s s s
s 1
s 2 2
s 3 3 4
s 4 4 7 3
s 5 5 8 2 8
How can I create this type of matrix?
The easiest option I know of is to save the Excel sheet containing the data as a CSV file. Make sure that only the first row and the first column of the sheet contain any sample or variable names.
Then read into R using:
dat <- read.csv("path/to/my/file.csv")
and then use dist() on the 9th column to compute the dissimilarity matrix:
dij <- dist(dat[, 9])
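As a minimal sketch of what this produces, the following uses a small made-up vector in place of dat[, 9] (the values and the labels s1 to s5 are assumptions for illustration, not the asker's data). It shows that dist() prints only the lower triangle, as in the question, and that as.matrix() gives the full square matrix if that is needed.

```r
# Hypothetical stand-in for the 9th column of the data
x <- c(1, 3, 6, 10, 15)
names(x) <- paste0("s", seq_along(x))  # labels carried into the result

dij <- dist(x)        # object of class "dist"; prints as a lower triangle
print(dij)

m <- as.matrix(dij)   # full symmetric matrix with a zero diagonal
print(m)
```

If the sample names sit in the first column of the CSV, passing row.names = 1 to read.csv() keeps them as row names of dat; note that dat[, 9] itself is a plain vector, so names would have to be set on it as above for them to appear as labels.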
If you want something other than the Euclidean distance, see the options in ?dist, and if those don't suit, try the daisy() function in the recommended package cluster, the vegdist() function in package vegan, or the proxy package.
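A short sketch of switching the metric through the method argument of dist(), using two made-up points (the matrix pts is an assumption for illustration). The daisy() call is only attempted if the cluster package happens to be installed.

```r
# Two hypothetical points in the plane
pts <- rbind(a = c(0, 0), b = c(3, 4))

d_euc <- dist(pts)                       # default: Euclidean
d_man <- dist(pts, method = "manhattan") # sum of absolute differences
d_max <- dist(pts, method = "maximum")   # largest absolute difference

print(d_euc)
print(d_man)
print(d_max)

# Optional: daisy() from the cluster package, if available
if (requireNamespace("cluster", quietly = TRUE)) {
  print(cluster::daisy(pts, metric = "manhattan"))
}
```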
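If your numbers are in a vector called z, then dist(z) returns a distance matrix of Euclidean values. For a single vector this is just the absolute difference abs(z[i] - z[j]) between each pair of values; with two columns it would be sqrt(dx^2 + dy^2). See help(dist) for more info.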
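To check what dist() does on a single vector, this small sketch compares it with pairwise absolute differences built by hand (the vector z below is made up for illustration).

```r
# Hypothetical one-dimensional data
z <- c(2, 5, 9)

from_dist <- unname(as.matrix(dist(z)))  # drop dimnames for comparison
by_hand   <- abs(outer(z, z, "-"))       # |z[i] - z[j]| for every pair

print(from_dist)
print(isTRUE(all.equal(from_dist, by_hand)))
```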