issue running a program (R) in Python to perform an operation (execute a script)

Developer https://www.devze.com 2023-01-09 12:08 Source: Internet
I'm trying to execute an R script from Python, ideally displaying and saving the results. Using rpy2 has been a bit of a struggle, so I thought I'd just call R directly. I have a feeling that I'll need to use something like "os.system" or "subprocess.call", but I am having difficulty deciphering the module guides.

Here's the R script "MantelScript", which uses a particular stat test to compare two distance matrices at a time (distmatA1 and distmatB1). This works in R, though I haven't yet put in the iterating bits in order to read through and compare a bunch of files in a pairwise fashion (I really need some assistance with this, too btw!):

library(ade4)

M1<-read.table("C:\\pythonscripts\\distmatA1.csv", header = FALSE, sep = ",")
M2<-read.table("C:\\pythonscripts\\distmatB1.csv", header = FALSE, sep = ",")

mantel.rtest(dist(matrix(M1, 14, 14)), dist(matrix(M2, 14, 14)), nrepet = 999)

Here's the relevant bit of my Python script, which reads through some previously formulated lists and pulls out matrices in order to compare them via this Mantel Test (it should pull the first matrix from identityA and sequentially compare it to every matrix in identityB, then repeat with the second matrix from identityA, etc.). I want to save these files and then call on the R program to compare them:

# windownA and windownB are lists containing ascending sequences of integers
# identityA and identityB are lists where each field is a distance matrix.

z = 0
v = 0

import subprocess
import os

for i in windownA:                              

    M1 = identityA[i]                          

    z += 1
    filename = "C:/pythonscripts/distmatA"+str(z)+".csv"
    file = csv.writer(open(filename, 'w'))
    file.writerow(M1)


    for j in windownB:                          

        M2 = identityB[j]                     

        v += 1
        filename2 = "C:/pythonscripts/distmatB"+str(v)+".csv"
        file = csv.writer(open(filename2, 'w'))
        file.writerow(M2)

        ## result = os.system('R CMD BATCH C:/R/library/MantelScript.R') - maybe something like this??

        ## result = subprocess.call(['C:/R/library/MantelScript.txt'])  - or maybe this??

        print result
        print ' '


If your R script only has side effects, that's fine, but if you want to process the results further in Python, you'll still be better off using rpy2.

import rpy2.robjects
# read the whole R script as one string (file() only exists in Python 2)
with open("C:/R/library/MantelScript.R") as f:
    code = f.read()
result = rpy2.robjects.r(code)
# assume that MantelScript creates a variable "X" in the R GlobalEnv workspace
X = rpy2.robjects.globalenv['X']


Stick with this.

process = subprocess.Popen(['R', 'CMD', 'BATCH', 'C:/R/library/MantelScript.R'])
process.wait()

When the wait() function returns, the .R script has finished; the value it returns is the exit code of the R process.
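A minimal sketch of checking that exit code, assuming Python 3.5+ so that subprocess.run is available. The helper name run_and_check is hypothetical, and the demo substitutes the Python interpreter for R so that it runs on a machine without R installed; the real command from above is shown in a comment.

```python
import subprocess
import sys


def run_and_check(cmd):
    """Run cmd, wait for it to finish, return (exit_code, stdout, stderr)."""
    # run() blocks until the child exits, like Popen(...).wait().
    completed = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    return completed.returncode, completed.stdout.decode(), completed.stderr.decode()


if __name__ == "__main__":
    # Stand-in command so the sketch runs without R installed.
    # For the real job:
    #   cmd = ['R', 'CMD', 'BATCH', 'C:/R/library/MantelScript.R']
    cmd = [sys.executable, "-c", "print('done')"]
    code, out, err = run_and_check(cmd)
    if code != 0:
        print("script failed with exit code", code)
        print(err)
    else:
        print("script finished:", out.strip())
```

A non-zero exit code is the cue to stop and look at the error output before trying to read any result file.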

Note that you should write your .R script to produce a file that your Python program can read.

with open( 'the_output_from_mantelscript', 'r' ) as result:
    for line in result:
        print( line )
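If the R script is changed to write its result as plain "name,value" lines, the Python side can turn that file into a dictionary. This is only a sketch under that assumption: the file name mantel_result.csv, the field names and the numbers in the demo are all placeholders, not real Mantel test output, and read_results is a hypothetical helper.

```python
import csv
import os
import tempfile


def read_results(path):
    """Parse 'name,value' rows into a dict mapping name -> float."""
    results = {}
    with open(path, newline="") as handle:
        for row in csv.reader(handle):
            if len(row) != 2:
                continue  # skip blank or malformed lines
            name, value = row
            results[name.strip()] = float(value)
    return results


if __name__ == "__main__":
    # Dummy file standing in for whatever the R script would write.
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "mantel_result.csv")
        with open(path, "w", newline="") as handle:
            handle.write("obs,0.42\npvalue,0.001\n")
        print(read_results(path))
```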

Don't waste a lot of time trying to hook up a pipeline.

Invest time in getting a basic "Python spawns R" process working.

You can add to this later.


In case you're interested in invoking an R subprocess from Python in general:

#!/usr/bin/env python3

from io import StringIO
from subprocess import PIPE, Popen

def rnorm(n):
    rscript = Popen(["Rscript", "-"], stdin=PIPE, stdout=PIPE, stderr=PIPE)
    with StringIO() as s:
        s.write("x <- rnorm({})\n".format(n))
        s.write("cat(x, \"\\n\")\n")
        return rscript.communicate(s.getvalue().encode())

if __name__ == '__main__':
    output, errmsg = rnorm(5)
    print("stdout:")
    print(output.decode('utf-8').strip())
    print("stderr:")
    print(errmsg.decode('utf-8').strip())

Better to do it through Rscript.
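Applied to the original question, the Rscript route could look roughly like the sketch below, which passes the two CSV paths as command-line arguments instead of hard-coding them in the R file. It assumes the R script is changed to read them with commandArgs(trailingOnly = TRUE), which the script in the question does not do yet; mantel_command and run_mantel are hypothetical helper names.

```python
import subprocess


def mantel_command(script, csv_a, csv_b):
    """Build the argument list for: Rscript <script> <csv_a> <csv_b>."""
    return ["Rscript", script, csv_a, csv_b]


def run_mantel(script, csv_a, csv_b):
    """Run the R script on one pair of files and return what it printed."""
    cmd = mantel_command(script, csv_a, csv_b)
    # check_output raises CalledProcessError if Rscript exits with a
    # non-zero status, so failures are not silently ignored.
    return subprocess.check_output(cmd).decode()


if __name__ == "__main__":
    # Only prints the command; call run_mantel(...) once Rscript is on the PATH.
    print(mantel_command("C:/R/library/MantelScript.R",
                         "C:/pythonscripts/distmatA1.csv",
                         "C:/pythonscripts/distmatB1.csv"))
```

Passing the command as a list avoids shell quoting problems with Windows paths.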


Given what you're trying to do, a pure R solution might be neater:

file.pairs <- combn(dir(pattern="*.csv"), 2) # get every pair of csv files in the current dir

The pairs are columns in a 2xN matrix:

file.pairs[,1]
[1] "distmatrix1.csv" "distmatrix2.csv"

You can run a function on these columns by using apply (with option '2', meaning 'act over columns'):

my.func <- function(v) paste(v[1], v[2], sep="::")
apply(file.pairs, 2, my.func)

In this example my.func just glues the two file names together; you could replace this with a function that does the Mantel Test, something like (untested):

my.func <- function(v){
  M1<-read.table(v[1], header = FALSE, sep = ",")
  M2<-read.table(v[2], header = FALSE, sep = ",")
  mantel.rtest(dist(matrix(M1, 14, 14)), dist(matrix(M2, 14, 14)), nrepet = 999)
}
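If the looping stays on the Python side instead, the same pairing can be sketched with itertools. Here all_pairs mirrors combn(..., 2), while cross_pairs matches the nested loops in the question (every A matrix against every B matrix); both names and the file lists in the demo are illustrative assumptions.

```python
from itertools import combinations, product


def all_pairs(names):
    """Every unordered pair of names, like combn(names, 2) in R."""
    return list(combinations(names, 2))


def cross_pairs(names_a, names_b):
    """Every (A, B) combination, as in the question's nested loops."""
    return list(product(names_a, names_b))


if __name__ == "__main__":
    a_files = ["distmatA1.csv", "distmatA2.csv"]
    b_files = ["distmatB1.csv", "distmatB2.csv"]
    for pair in cross_pairs(a_files, b_files):
        print(pair)
```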