R force local scope

Developer https://www.devze.com 2023-03-09 21:02 Source: the web

This is probably not correct terminology, but hopefully I can get my point across.

I frequently end up doing something like:

myVar = 1
f <- function(myvar) { return(myVar); }
# f(2) = 1 now

R happily uses the variable outside of the function's scope, which leaves me scratching my head, wondering how I could possibly be getting the results I am.

Is there any option which says "force me to only use variables which have previously been assigned values in this function's scope"? Perl's use strict does something like this, for example. But I don't know that R has an equivalent of my.


EDIT: Thank you, I am aware that I capitalized them differently. Indeed, the example was created specifically to illustrate this problem!

I want to know if there is a way that R can automatically warn me when I do this.

EDIT 2: Also, if RKWard or another IDE offers this functionality I'd like to know that too.


As far as I know, R does not provide a "use strict" mode. So you are left with two options:

1 - Ensure all your "strict" functions don't have globalenv as environment. You could define a nice wrapper function for this, but the simplest is to call local:

# Use "local" directly to control the function environment
f <- local( function(myvar) { return(myVar); }, as.environment(2))
f(3) # Error in f(3) : object 'myVar' not found

# Create a wrapper function "strict" to do it for you...
strict <- function(f, pos=2) eval(substitute(f), as.environment(pos))
f <- strict( function(myvar) { return(myVar); } )
f(3) # Error in f(3) : object 'myVar' not found

2 - Do a code analysis that warns you of "bad" usage.

Here's a function checkStrict that hopefully does what you want. It uses the excellent codetools package.

# Checks a function for use of global variables
# Returns TRUE if ok, FALSE if globals were found.
checkStrict <- function(f, silent=FALSE) {
    vars <- codetools::findGlobals(f)
    found <- !vapply(vars, exists, logical(1), envir=as.environment(2))
    if (!silent && any(found)) {
        warning("global variables used: ", paste(names(found)[found], collapse=', '))
        return(invisible(FALSE))
    }

    !any(found)
}

And trying it out:

> myVar = 1
> f <- function(myvar) { return(myVar); }
> checkStrict(f)
Warning message:
In checkStrict(f) : global variables used: myVar


checkUsage in the codetools package is helpful, but doesn't get you all the way there. In a clean session where myVar is not defined,

f <- function(myvar) { return(myVar); }
codetools::checkUsage(f)

gives

<anonymous>: no visible binding for global variable ‘myVar’

but once you define myVar, checkUsage is happy.

See ?codetools in the codetools package: it's possible that something there is useful:

> findGlobals(f)
[1] "{"      "myVar"  "return"
> findLocals(f)
character(0)
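
The merged output above mixes function names such as { and return with the variable name. As a small sketch (assuming the codetools package is installed, which is normally the case since it ships with R), findGlobals can also return the two groups separately via its merge argument, which makes the suspicious variable easier to spot:

```r
# Same function as above: the argument is myvar, but the body uses myVar.
f <- function(myvar) { return(myVar) }

# merge = FALSE returns a list with separate 'functions' and 'variables' entries.
globals <- codetools::findGlobals(f, merge = FALSE)

print(globals$variables)  # the free variables, here only myVar
print(globals$functions)  # the functions called in the body
```

Only the variables entry needs to be inspected when hunting for misspelled names.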


You need to fix the typo: myvar != myVar. Then it will all work...

Scope resolution is 'from the inside out' starting from the current one, then the enclosing and so on.
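
A minimal sketch of that lookup order, using hypothetical names (x, outer_fun, inner_fun, no_local) that are not part of the question:

```r
x <- "global"

outer_fun <- function() {
  x <- "enclosing"
  inner_fun <- function() {
    # No local x here, so R looks in the enclosing function first.
    x
  }
  inner_fun()
}

no_local <- function() {
  # No x here and no enclosing function, so the lookup continues outward
  # to the environment where no_local was defined.
  x
}

print(outer_fun())  # "enclosing"
print(no_local())   # "global"
```

This is why a misspelled name inside a function can silently pick up a workspace variable.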

Edit: Now that you have clarified your question, look at the codetools package (which ships with R as a recommended package):

R> library(codetools)
R> f <- function(myVAR) { return(myvar) }
R> checkUsage(f)
<anonymous>: no visible binding for global variable 'myvar'
R> 


Using get(x, inherits=FALSE) will force local scope.

myVar = 1

f2 <- function(myvar) get("myVar", inherits=FALSE)

f3 <- function(myvar) {
    myVar <- myvar
    get("myVar", inherits=FALSE)
}

output:

> f2(8)    
Error in get("myVar", inherits = FALSE) : object 'myVar' not found
> f3(8)
[1] 8


You are of course doing it wrong. Don't expect static code checking tools to find all your mistakes. Check your code with tests. And more tests. Any decent test written to run in a clean environment will spot this kind of mistake. Write tests for your functions, and use them. Look at the glory that is the testthat package on CRAN.
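
To illustrate why a clean environment matters, here is a rough base-R sketch rather than a real testthat test (with testthat the same check would sit inside test_that() and use expect_equal()). The name clean_env is hypothetical; it stands in for a fresh session in which no stray myVar exists:

```r
# Hypothetical clean environment: its parent is baseenv(), so nothing
# from the workspace can leak into the function under test.
clean_env <- new.env(parent = baseenv())

# Define the function from the question inside the clean environment.
evalq(f <- function(myvar) { return(myVar) }, clean_env)

# A minimal check: f should return its argument. With the typo in place
# the call fails, and the error message is captured instead.
outcome <- tryCatch(
  identical(clean_env$f(2), 2),
  error = function(e) conditionMessage(e)
)
print(outcome)
```

With the typo the check reports that myVar was not found; once the body uses myvar, outcome becomes TRUE.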


There is a new package, modules, on CRAN which addresses this common issue (see the package vignette). With modules, the function raises an error instead of silently returning the wrong result.

# without modules
myVar <- 1
f <- function(myvar) { return(myVar) }
f(2)
[1] 1

# with modules
library(modules)
m <- module({
  f <- function(myvar) { return(myVar) }
})
m$f(2)
Error in m$f(2) : object 'myVar' not found

This is the first time I have used it. It seems straightforward, so I might include it in my regular workflow to prevent time-consuming mishaps.


You can dynamically change the environment tree like this:

a <- 1

f <- function(){
    b <- 1
    print(b)
    print(a)
}

environment(f) <- new.env(parent = baseenv())

f()

Inside f, b can be found, while a cannot.

But probably it will do more harm than good.
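
One possible way this can backfire, offered as an illustration and not necessarily what the answer's author had in mind: with baseenv() as the parent, functions from other attached packages such as stats are no longer visible either. The function g below is hypothetical:

```r
g <- function(x) {
  # sd() lives in the stats package, not in base.
  sd(x)
}

# Cut g off from everything except the base package.
environment(g) <- new.env(parent = baseenv())

# The call now fails because sd cannot be found; capture the message.
msg <- tryCatch(g(c(1, 2, 3)), error = function(e) conditionMessage(e))
print(msg)
```

Qualifying the call as stats::sd(x) would still work, because :: is itself a base function.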


You can test to see if the variable is defined locally:

myVar = 1
f <- function(myvar) { 
if( exists('myVar', environment(), inherits = FALSE) ) return( myVar) else cat("myVar was not found locally\n")
}

> f(2)
myVar was not found locally

But I find it very artificial if the only thing you are trying to do is to protect yourself from spelling mistakes.

The exists function searches for the variable name in the particular environment. inherits = FALSE tells it not to look into the enclosing frames.


environment(fun) = parent.env(environment(fun))

will remove the 'workspace' from your search path and leave everything else. This is probably closest to what you want.
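
A small sketch of the effect, assuming the functions are defined at top level (so that their environment is the global environment) and that the stats package is attached, as it is by default. The function g is a hypothetical addition:

```r
myVar <- 1
f <- function(myvar) myVar   # typo: should be myvar
g <- function(x) sd(x)       # uses a function from the attached stats package

# Skip the workspace: start lookups at whatever follows it on the search path.
environment(f) <- parent.env(globalenv())
environment(g) <- parent.env(globalenv())

# f can no longer see the workspace variable myVar, so the typo is exposed.
f_msg <- tryCatch(f(2), error = function(e) conditionMessage(e))
print(f_msg)

# g still works, because attached packages remain reachable.
print(g(c(1, 2, 3)))
```

The same limitation applies to other functions defined in the workspace: they also become invisible to f and g.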


@Tommy gave a very good answer and I used it to create 3 functions that I think are more convenient in practice.

strict

To make a function strict, you just have to call

strict(f,x,y)

instead of

f(x,y)

example:

my_fun1 <- function(a,b,c){a+b+c}
my_fun2 <- function(a,b,c){a+B+c}
B <- 1
my_fun1(1,2,3)        # 6
strict(my_fun1,1,2,3) # 6
my_fun2(1,2,3)        # 5
strict(my_fun2,1,2,3) # Error in (function (a, b, c)  : object 'B' not found

checkStrict1

To get a diagnosis, execute checkStrict1(f) with optional Boolean parameters to show more or less.

checkStrict1("my_fun1") # nothing
checkStrict1("my_fun2") # my_fun2  : B

A more complicated case:

A <- 1 # unambiguous variable defined OUTSIDE AND INSIDE my_fun3
# B unambiguous variable defined only INSIDE my_fun3
C <- 1 # defined OUTSIDE AND INSIDE with ambiguous name (C is also a base function)
D <- 1 # defined only OUTSIDE my_fun3 (D is also a base function)
E <- 1 # unambiguous variable defined only OUTSIDE my_fun3
# G unambiguous variable defined only INSIDE my_fun3
# H is undeclared and doesn't exist at all
# I is undeclared (though I is also base function)
# v defined only INSIDE (v is also a base function)
my_fun3 <- function(a,b,c){
  A<-1;B<-1;C<-1;G<-1
  a+b+A+B+C+D+E+G+H+I+v+ my_fun1(1,2,3)
}
checkStrict1("my_fun3",show_global_functions = TRUE ,show_ambiguous = TRUE , show_inexistent = TRUE)

# my_fun3  : E 
# my_fun3  Ambiguous : D 
# my_fun3  Inexistent : H 
# my_fun3  Global functions : my_fun1

I chose to show only inexistent by default out of the 3 optional additions. You can change it easily in the function definition.

checkStrictAll

Get a diagnostic of all your potentially problematic functions, with the same parameters.

checkStrictAll()
my_fun2         : B 
my_fun3         : E 
my_fun3         Inexistent : H

sources

strict <- function(f1,...){
  function_text <- deparse(f1)
  function_text <- paste(function_text[1],function_text[2],paste(function_text[c(-1,-2,-length(function_text))],collapse=";"),"}",collapse="") 
  strict0 <- function(f1, pos=2) eval(substitute(f1), as.environment(pos))
  f1 <- eval(parse(text=paste0("strict0(",function_text,")")))
  do.call(f1,list(...))
}

checkStrict1 <- function(f_str,exceptions = NULL,n_char = nchar(f_str),show_global_functions = FALSE,show_ambiguous = FALSE, show_inexistent = TRUE){
  functions <-  c(lsf.str(envir=globalenv()))
  f <- try(eval(parse(text=f_str)),silent=TRUE)
  if(inherits(f, "try-error")) {return(NULL)}
  vars <- codetools::findGlobals(f)
  vars <- vars[!vars %in% exceptions]
  global_functions <- vars %in% functions

  in_global_env <- vapply(vars, exists, logical(1), envir=globalenv())
  in_local_env  <- vapply(vars, exists, logical(1), envir=as.environment(2))
  in_global_env_but_not_function <- rep(FALSE,length(vars))
  for (my_mode in c("logical", "integer", "double", "complex", "character", "raw","list", "NULL")){
    in_global_env_but_not_function <- in_global_env_but_not_function | vapply(vars, exists, logical(1), envir=globalenv(),mode = my_mode)
  }
  found     <- in_global_env_but_not_function & !in_local_env
  ambiguous <- in_global_env_but_not_function & in_local_env
  inexistent <- (!in_local_env) & (!in_global_env)
  if(typeof(f)=="closure"){
    if(any(found))           {cat(paste(f_str,paste(rep(" ",n_char-nchar(f_str)),collapse=""),":",                  paste(names(found)[found], collapse=', '),"\n"))}
    if(show_ambiguous        & any(ambiguous))       {cat(paste(f_str,paste(rep(" ",n_char-nchar(f_str)),collapse=""),"Ambiguous :",        paste(names(found)[ambiguous], collapse=', '),"\n"))}
    if(show_inexistent       & any(inexistent))      {cat(paste(f_str,paste(rep(" ",n_char-nchar(f_str)),collapse=""),"Inexistent :",       paste(names(found)[inexistent], collapse=', '),"\n"))}
    if(show_global_functions & any(global_functions)){cat(paste(f_str,paste(rep(" ",n_char-nchar(f_str)),collapse=""),"Global functions :", paste(names(found)[global_functions], collapse=', '),"\n"))}
    return(invisible(FALSE)) 
  } else {return(invisible(TRUE))}
}

checkStrictAll <-  function(exceptions = NULL,show_global_functions = FALSE,show_ambiguous = FALSE, show_inexistent = TRUE){
  functions <-  c(lsf.str(envir=globalenv()))
  n_char <- max(nchar(functions))  
  invisible(sapply(functions,checkStrict1,exceptions,n_char = n_char,show_global_functions,show_ambiguous, show_inexistent))
}


What works for me, based on @c-urchin 's answer, is to define a script which reads all my functions and then excludes the global environment:

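filenames <- Sys.glob('fun/*.R')
for (filename in filenames) {
    # Load the function into the current (calling) environment
    source(filename, local=TRUE)
    # Derive the function name from the file name, e.g. fun/foo.R -> foo
    funname <- sub('^fun/(.*)\\.R$', "\\1", filename)
    # Re-parent the function so that lookups skip the global environment
    eval(parse(text=paste('environment(',funname,') <- parent.env(globalenv())',sep='')))
}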

I assume that

  • all functions and nothing else are contained in the relative directory ./fun and
  • every .R file contains exactly one function with an identical name as the file.

The catch is that if one of my functions calls another one of my functions, then the outer function also has to call this script first, and it is essential to call it with local=TRUE:

source('readfun.R', local=TRUE)

assuming of course that the script file is called readfun.R.
