How to organize big R functions?

Developer https://www.devze.com 2023-03-31 18:42 Source: the web
I'm writing an R function that is becoming quite big. It accepts several choices for one parameter, and I'm organizing it like so:

myfun <- function(y, type = c("aa", "bb", "cc", "dd")) {  # ... through "zz"

   if (type == "aa") {
      # do something
      # - a lot of code here -
      # ....
   }

   if (type == "bb") {
      # do something
      # - a lot of code here -
      # ....
   }

   # ....
}

I have two questions:

  1. Is there a better way that avoids an 'if' statement for every choice of the parameter type?
  2. Would it be better practice to write a subfunction for every "type" choice?

If I write subfunctions, it would look like this:

myfun <- function(y, type = c("aa", "bb", "cc", "dd")) {  # ... through "zz"

   if (type == "aa") result <- sub_fun_aa(y)
   if (type == "bb") result <- sub_fun_bb(y)
   if (type == "cc") result <- sub_fun_cc(y)
   if (type == "dd") result <- sub_fun_dd(y)
   # ....
   result  # return whichever result was computed
}

The subfunctions are of course defined elsewhere (at the top of myfun, or in some other way).

I hope my question is clear. Thanks in advance.

- Additional info -

I'm writing a function that applies different filters to an image (different filter = different "type" parameter). Some filters share code (for example, "aa" and "bb" are two Gaussian filters that differ by only one line of code), while others are completely different.

So I'm forced to use a lot of if statements, e.g.

 if (type == "aa" || type == "bb") {
    # - do something common to aa and bb -

    if (type == "aa") {
       # - do something aa-related -
    }
    if (type == "bb") {
       # - do something bb-related -
    }
 }

 if (type == "cc" || type == "dd") {
    # - do something common to cc and dd -

    if (type == "cc") {
       # - do something cc-related -
    }
    if (type == "dd") {
       # - do something dd-related -
    }
 }

 if (type == "zz") {
    # - do something zz-related -
 }

And so on. Furthermore, there are more if statements inside the "do something" code. I'm looking for the best way to organize my code.


Option 1

One option is to use switch instead of multiple if statements:

myfun <- function(y, type = c("aa", "bb", "cc", "dd")) {  # ... through "zz"
  type <- match.arg(type)  # validate type and reduce it to a single string
  switch(type,
    "aa" = sub_fun_aa(y),
    "bb" = sub_fun_bb(y),
    "cc" = sub_fun_cc(y),
    "dd" = sub_fun_dd(y)
  )
}
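
When several types share code, switch can also group them: an alternative with no value falls through to the next alternative that has one. A minimal sketch, assuming hypothetical helpers gaussian_common() and median_filter() that stand in for the real filter code (neither name comes from the question):

```r
# Hypothetical stand-ins for the real filter code.
gaussian_common <- function(y, sigma) paste0("gaussian(", y, ", sigma=", sigma, ")")
median_filter   <- function(y) paste0("median(", y, ")")

myfun <- function(y, type = c("aa", "bb", "cc")) {
  type <- match.arg(type)  # validates type; picks "aa" when type is omitted
  switch(type,
    aa = ,                 # empty alternative: falls through to "bb"
    bb = gaussian_common(y, sigma = if (type == "aa") 1 else 2),
    cc = median_filter(y)
  )
}

myfun("img", "aa")
myfun("img", "cc")
```

An unknown type such as "zz" is rejected by match.arg with an error, instead of silently returning nothing.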

Option 2

In your edited question you gave far more specific information. Here is a general design pattern that you might want to consider. The key element in this pattern is that there is not a single if in sight. I replace it with match.fun, where the key idea is that the type in your function is itself a function (yes, since R supports functional programming, this is allowed):

sharpening <- function(x){
  paste(x, "General sharpening", sep=" - ")
}

unsharpMask <- function(x){
  y <- sharpening(x)
  #... Some specific stuff here...
  paste(y, "Unsharp mask", sep=" - ")
}

hiPass <- function(x) {
  y <- sharpening(x)
  #... Some specific stuff here...
  paste(y, "Hipass filter", sep=" - ")
}

generalMethod <- function(x, type){
  # type is a function or the name of one, e.g. hiPass or "unsharpMask"
  match.fun(type)(x)
}

And call it like this:

> generalMethod("stuff", "unsharpMask")
[1] "stuff - General sharpening - Unsharp mask"
> hiPass("mystuff")
[1] "mystuff - General sharpening - Hipass filter"
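
Because match.fun accepts either a function or its name, the same dispatcher also works when the caller passes the function object itself. The sketch below restates two of the filters so that it runs on its own, and adds a named list as one possible way to restrict type to known choices. The names filters and applyFilter are illustrative assumptions, not part of the answer above.

```r
sharpening  <- function(x) paste(x, "General sharpening", sep = " - ")
unsharpMask <- function(x) paste(sharpening(x), "Unsharp mask", sep = " - ")
hiPass      <- function(x) paste(sharpening(x), "Hipass filter", sep = " - ")

# match.fun() takes a function or a character name.
generalMethod <- function(x, type) match.fun(type)(x)
generalMethod("stuff", hiPass)     # function object
generalMethod("stuff", "hiPass")   # function name

# Assumed variant: a registry that limits the allowed types.
filters <- list(hiPass = hiPass, unsharpMask = unsharpMask)
applyFilter <- function(x, type = names(filters)) {
  type <- match.arg(type)          # errors on anything not in the registry
  filters[[type]](x)
}
applyFilter("stuff", "unsharpMask")
```

The registry trades some flexibility for safety: match.fun will call any function the caller names, while the list only exposes the filters you chose to register.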


There is hardly ever a reason not to refactor your code into smaller functions. In this case, besides the reorganisation, there is an extra advantage: the educated user of your function(s) can immediately call the subfunction if she knows where she's at.

If these functions have lots of parameters, one solution (to ease maintenance) could be to group them in a list of class "myFunctionParameters", but that depends on your situation.
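
One way to read that suggestion is a small constructor that bundles the settings into a list tagged with the class, so that each subfunction takes a single params argument. In this sketch the constructor newParameters, the fields sigma and radius, and the body of sub_fun_aa are all made up for illustration.

```r
# Constructor: validates the inputs and tags the list with an S3 class.
newParameters <- function(sigma = 1, radius = 3L) {
  stopifnot(is.numeric(sigma), sigma > 0, radius >= 1)
  structure(list(sigma = sigma, radius = as.integer(radius)),
            class = "myFunctionParameters")
}

# A subfunction takes the bundle instead of many separate arguments.
sub_fun_aa <- function(y, params = newParameters()) {
  stopifnot(inherits(params, "myFunctionParameters"))
  y * params$sigma + params$radius   # placeholder arithmetic, not a real filter
}

p <- newParameters(sigma = 2)
sub_fun_aa(c(1, 2, 3), p)
```

Adding a parameter later then means changing the constructor and the subfunctions that use it, not every function signature along the call chain.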

If code is shared between the different sub_fun_xxs, just plug that into another function that you use from within each of the sub_fun_xxs, or (if that's viable) calculate the stuff up front and pass it directly into each sub_fun_xx.
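
Both variants can be sketched with the two Gaussian-style filters from the question. Everything here is a stand-in: gaussian_kernel is a hypothetical shared helper, and the "filters" only compute a weighted sum of a short numeric vector so that the example runs.

```r
# Shared helper (hypothetical): normalized 1-D Gaussian weights.
gaussian_kernel <- function(sigma, half_width = 2) {
  k <- dnorm(-half_width:half_width, sd = sigma)
  k / sum(k)
}

# Variant 1: each subfunction calls the shared helper itself.
sub_fun_aa <- function(y) sum(y * gaussian_kernel(sigma = 1))
sub_fun_bb <- function(y) sum(y * gaussian_kernel(sigma = 2))

# Variant 2: compute the shared part once and pass it in.
sub_fun_weighted <- function(y, kernel) sum(y * kernel)

y <- c(1, 2, 3, 4, 5)
k <- gaussian_kernel(sigma = 1)
sub_fun_aa(y)
sub_fun_weighted(y, k)   # same value as sub_fun_aa(y)
```

Variant 2 is mostly worth it when the shared computation is expensive or when several subfunctions are applied to the same input.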


This is a much more general question about program design. There's no definitive answer, but there's almost certainly a better route than what you're currently doing.

Writing functions that handle the different types is a good route to go down. How effective it will be depends on several things - for example, how many different types are there? Are they at all related, e.g. could some of them be handled by the same function, with slightly different behavior depending on the input?

You should try to think about your code in a modular way. You have one big task to do overall. Can you break it down into a sequence of smaller tasks, and write functions that perform the smaller tasks? Can you generalize any of those tasks in a way that doesn't make the functions (much) more difficult to write, but does give them wider applicability?

If you give some more detail about what your program is supposed to be achieving, we will be able to help you more.


This is more of a general programming question than an R question. As such, you can follow basic guidelines of code quality. There are tools that can generate code quality reports from reading your code and give you guidelines on how to improve it. One such example is Gendarme for .NET code. Here is a typical guideline that would appear in a report for methods that are too long:

AvoidLongMethodsRule
