开发者

How do I process extremely long lists of files in a make recipe?

开发者 https://www.devze.com 2023-03-27 11:40 出处:网络
Because GNU make allows variables to be as large as memory allows, it has no problem building massive dependency lists.However, if you want to actually use these lists of files in a recipe (sequence o

Because GNU make allows variables to be as large as memory allows, it has no problem building massive dependency lists. However, if you want to actually use these lists of files in a recipe (sequence of shell commands for building a target), you run into a problem: the command might exceed the shell's command line length limit, producing an error such as "Argument list too long".

For example, suppose I want to concatenate several files contained in the list $(INPUTS) to produce a file combined.txt. Ordinarily, I could use:

combined.txt: $(INPUTS)
        cat $^ > $@

But if $(INPUT开发者_Python百科S) contains many thousands of files, as it does in my case, the call to cat is too long and fails. Is there a way to get around this problem in general? It's safe to assume that there exists some sequence of commands that have identical behaviour to the one enormous command -- in this case, a series of cat commands, one per input file, that use >> to append to combined.txt would work. But how can make be persuaded to generate those commands?


In looking for the answer, about the best suggestion I could find was to break up the list into a series of smaller lists and process them using shell for loops. But you can't always do that, and even when you can it's a messy hack: for example, it's not obvious how to get the usual make behaviour of stopping as soon as a command fails. Luckily, after much searching and experimentation, it turns out that a general solution does exist.

Subshells and newlines

make recipes invoke a separate subshell for each line in the recipe. This behaviour can be annoying and counterintuitive: for example, a cd command on one line will not affect subsequent commands because they are run in separate subshells. Nevertheless it's actually what we need to get make to perform actions on very long lists of files.

Ordinarily, if you build a "multiline" list of files with a regular variable assignment that uses backslashes to break the statement over multiple lines, make removes all newlines:

# The following two statements are equivalent
FILES := a b c

FILES := \
a \
b \
c

However, using the define directive, it's possible to build variable values that contain newlines. What's more, if you substitute such a variable into a recipe, each line will indeed be run using a separate subshell, so that for example running make test from /home/jbloggs with the makefile below (and assuming no file called test exists) will produce the output /home/jbloggs, because the effect of the cd .. command is lost when its subshell ends:

define CMDS
cd ..
pwd
endef

test:
        $(CMDS)

If we create a variable that contains newlines using define, it can be concatenated with other text as usual, and processed using all the usual make functions. This, combined with the $(foreach) function, allows us to get what we want:

# Just a single newline!  Note 2 blank lines are needed.
define NL


endef

combined.txt: $(INPUTS)
        rm $@
        $(foreach f,$(INPUTS),cat $(f) >> $@$(NL))

We ask $(foreach) to convert each filename into a newline-terminated command, which will be executed in its own subshell. For more complicated needs, you could instead write out the list of filenames to a file with a series of echo commands and then use xargs.

Notes

  • The define directive is described as optionally taking a =, := or += token on the end of the first line to determine which variable flavour is to be created -- but note that that only works on versions of GNU make 3.82 and up! You may well be running the popular version 3.81, as I was, which silently assigns nothing to the variable if you add one of these tokens, leading to much frustration. See here for more.
  • All recipe lines must begin with a literal tab character, not the 8 spaces I have used here.
0

精彩评论

暂无评论...
验证码 换一张
取 消