List all leaf subdirectories in linux

Developer https://www.devze.com 2022-12-08 20:37 Source: web
Is there an easy way to list only directories under a given directory in Linux? To explain better, I can do:

find mydir -type d

which gives:

mydir/src
mydir/src/main
mydir/bin
mydir/bin/classes

What I want instead is:

mydir/src/main
mydir/bin/classes

I can do this in a bash script that loops over the lines and removes previous line if next line contains the path, but I'm wondering if there is a simpler method that does not use bash loops.


If you want only the leaf directories (directories which don't contain any sub-directory), look at this other question. The answer also explains it, but in short it is:

find . -type d -links 2


find . -type d | sort | awk '$0 !~ last "/" {print last} {last=$0} END {print last}'
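A variant of the awk approach that avoids regex matching may be more robust. The sketch below is my own rewrite under a few assumptions, not the answer's original: it appends a slash to each path, sorts in the C locale so that every directory is directly followed by its descendants, and compares literal prefixes with substr. The function name leaf_dirs_awk is hypothetical, names containing newlines are not handled, and starting from / is not supported.

```shell
#!/usr/bin/env bash
# leaf_dirs_awk (hypothetical name): print directories under $1 (default: .)
# that contain no subdirectories, using a literal prefix test instead of a regex.
leaf_dirs_awk() {
    local root=${1:-.}
    [[ $root != / ]] && root=${root%/}   # "mydir/" and "mydir" behave the same

    find "$root" -type d |
        sed 's:$:/:' |       # "a" becomes "a/", so "a-b/" can never look like a child of "a/"
        LC_ALL=C sort |      # byte order: each directory is directly followed by its descendants
        awk '
            # The previous path is a leaf when the current one does not start with it.
            NR > 1 && substr($0, 1, length(last)) != last { print substr(last, 1, length(last) - 1) }
            { last = $0 }
            # The final path in sorted order has nothing after it, so it is always a leaf.
            END { if (NR) print substr(last, 1, length(last) - 1) }
        '
}
```

With the tree from the question, leaf_dirs_awk mydir prints mydir/bin/classes and then mydir/src/main. The NR > 1 guard also avoids the empty first line that the regex version prints when the start directory itself contains no slash.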


If you're looking for something visual, tree -d is nice.

drinks
|-- coke
|   |-- cherry
|   `-- diet
|       |-- caffeine-free
|       `-- cherry
|-- juice
|   `-- orange
|       `-- homestyle
|           `-- quart
`-- pepsi
    |-- clear
    `-- diet


I can't think of anything that will do this without a loop. So, here are some loops:

This displays the leaf directories under the current directory, regardless of their depth:

for dir in $(find -depth -type d); do [[ ! $prev =~ $dir ]] && echo "$dir" ; prev="$dir"; done

This version properly handles directory names containing spaces:

saveIFS=$IFS; IFS=$'\n'; for dir in $(find -depth -type d ); do [[ ! $prev =~ $dir ]] && echo "${dir}" ; prev="$dir"; done; IFS=$saveIFS

Here is a version using Jefromi's suggestion:

find -depth -type d | while read dir;  do [[ ! $prev =~ $dir ]] && echo "${dir}" ; prev="$dir"; done
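For what it's worth, the same previous-line idea can be written without regex matching or word splitting. This is only a sketch under a few assumptions: a find that supports -print0 (GNU and BSD both do), bash for read -d '' and process substitution, and a start directory other than /. The name leaf_dirs_depth is made up for illustration.

```shell
#!/usr/bin/env bash
# leaf_dirs_depth (hypothetical name): print directories under $1 (default: .)
# that contain no subdirectories.
leaf_dirs_depth() {
    local dir prev=
    # NUL-delimited input keeps spaces, globs and even newlines in names intact.
    while IFS= read -r -d '' dir; do
        # With -depth, a directory that has subdirectories is printed right
        # after one of them, so prev then starts with "$dir/". The quoted
        # "$dir" is compared literally; only the trailing /* is a pattern.
        [[ $prev == "$dir"/* ]] || printf '%s\n' "$dir"
        prev=$dir
    done < <(find "${1:-.}" -depth -type d -print0)
}
```

Call it as leaf_dirs_depth mydir. The order of the output follows the order in which find reads each directory, so pipe it through sort if a stable order matters.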


The solution using awk is nice and simple, but it fails if a directory name contains any character that is special in a regex pattern. The same problem affects the =~ and != tests in Bash.

The following seems to work for both BSD and GNU find:

find . -type d | sed 's:$:/:' | sort -r | while read -r dir;do [[ "${dir}" != "${prev:0:${#dir}}" ]] && echo "${dir}" && prev="${dir}";done
  • Change find . to any directory you want to start the search in.
  • The sed command adds a forward slash to each directory returned by find.
  • sort -r sorts the directory list in reverse alphabetical order, which has the benefit of listing the directories furthest away from a root first, which is what we want.
  • This list is then read in line-by-line by the while read loop, where the -r option further protects against treating certain characters differently from others.
  • We then compare the current line against the previous one. A regex (=~) or unquoted pattern test is unreliable here, and an intermediate directory always has a shorter path than the leaf below it, so the test compares the current line to the previously printed line truncated to the length of the current line. If they match, the current line is a non-leaf directory and is discarded; otherwise we print it and store it as the previous line for the next iteration. The strings must be quoted in the test, otherwise some false positives may be produced.
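
To make the last bullet concrete, here is a small sketch contrasting the regex test from the earlier loops with the truncated-prefix test. The helper names nested_by_regex and nested_by_prefix and the sample paths are made up for illustration.

```shell
#!/usr/bin/env bash
# Both helpers answer: "does PREV lie inside DIR?" (hypothetical names).

# Regex test as used in the earlier loops: DIR is interpreted as a pattern.
nested_by_regex() {   # usage: nested_by_regex PREV DIR
    [[ $1 =~ $2 ]]
}

# Literal test from the pipeline above: compare DIR with PREV cut to DIR's length.
nested_by_prefix() {  # usage: nested_by_prefix PREV DIR
    [[ "$2" == "${1:0:${#2}}" ]]
}

prev='./axb/sub/'
dir='./a.b/'          # a leaf that is unrelated to prev

# The "." in ./a.b/ matches the "x" in ./axb/, so the regex test misfires.
nested_by_regex  "$prev" "$dir" && echo "regex:  $dir wrongly treated as a parent"
nested_by_prefix "$prev" "$dir" || echo "prefix: $dir correctly kept as a leaf"
```

Both echo lines are printed: the regex test reports a false parent, while the literal comparison does not.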

Oh, and if you don’t want to use find:

shopt -s nullglob globstar;printf "%s\n" **/ | sort -r | while read -r dir;do [[ "${dir}" != "${prev:0:${#dir}}" ]] && echo "${dir}" && prev="${dir}";done;shopt -u nullglob globstar

UPDATE (2020-06-03): Here’s a script I’ve thrown together that’s hopefully useful. Obviously feel free to improve/adapt/point out glaring problems…

#!/usr/bin/env bash

# leaf: from a given source, output only the directories
#       required ('leaf folders' ) to recreate a full
#       directory structure when passed to mkdir -p 

usage() {
    echo "Usage: ${0##*/} [-f|-g|-h|-m <target>|-s|-S|-v] <source>" 1>&2
}

# Initial variables...
dirMethod=0 # Set default method of directory listing (find -depth)
addSource=0 # Set default output path behaviour

# Command options handling with Bash getopts builtin
while getopts ":fghm:sSv" options; do
    case "${options}" in
        f) # use depth-first find method of directory listing
            dirMethod=0 # set again here if user sets both f and g
            ;;
        g) # Use extended globbing and sort method of directory listing
            dirMethod=1
            ;;
        h) # Help text
            echo "Leaf - generate shortest list of directories to (optionally)"
            echo "       fully recreate a directory structure from a given source"
            echo 
            echo "Options"
            echo "======="
            usage
            echo
            echo "Directory listing method"
            echo "------------------------"
            echo "-f           Use find command with depth-first search [DEFAULT]"
            echo "-g           Use shell globbing method"
            echo
            echo "Output options"
            echo "--------------"
            echo "-m <target>  Create directory structure in <target> directory"
            echo "-v           Verbose output [use with -m option]"
            echo "-s           Output includes source directory"
            echo "-S           Output includes full given path of <source> directory"
            echo
            echo "Other options"
            echo "-------------"
            echo "-h           This help text"
            exit 0 # Exit script cleanly
            ;;
        m) # make directories in given location
            destinationRootDir="${OPTARG}"
            ;;
        s) # Include source directory as root of output paths/tree recreation
            addSource=1
            ;;
        S) # Include full source path as root of output paths/tree recreation
            addSource=2
            ;;
        v) # Verbose output if -m option given
            mdOpt="v"
            ;;
        *) # Unknown option or missing option argument
            usage
            exit 1 # Exit script with an error
            ;;
    esac
done
shift $((OPTIND-1))

# Positional parameters handling - only one (<source>) expected
if (( $# == 1 )); then
    if [[ $1 == "/" ]]; then # Test to see if <source> is the root directory /
        (( dirMethod == 0 )) && sourceDir="${1}" || sourceDir=
            # Set sourceDir to '/' if using find command dir generation or null if bash globbing method
    else
        sourceDir="${1%/}" # Strip trailing /
    fi
else
    usage  # Show usage message and...
    exit 1 # Quit with an error
fi

# Generate full pre-filtered directory list depending on requested method
if (( dirMethod == 0 )); then # find command method
    dirList=$(find "${sourceDir}" -depth -type d 2>/dev/null | sed -e 's:^/::' -e '/^$/ ! s:$:/:')
        # find command with depth-first search should eliminate need to sort directories
        # sed -e 's:^/::' -e '/^$/ ! s:$:/:' - strip leading '/' if present and add '/'
        #                                      to all directories except root
else
    shopt -s nullglob globstar dotglob
    # nullglob - don't return search string if no match
    # globstar - allow ** globbing to descend into subdirectories. '**/' returns directories only
    # dotglob  - return hidden folders (ie. those beginning with '.') 
    dirList=$(printf "%s\n" "${sourceDir}"/**/ | sed -e 's:^/::' | sort -r)
    # sort command required so filtering works correctly
fi

# Determine directory stripping string. ie. if given path/to/source[/] as the
# source directory (src), should the output be just that of the contents of src,
# src and its contents or the path/to/src and contents?
sourceDir="${sourceDir#/}"
case "${addSource}" in
    0) strip="${sourceDir}/";; # Set 'strip' to <source> 
    1) [[ "${sourceDir}" =~ (\/?.+)\/.+$ ]] && strip="${BASH_REMATCH[1]}/" || strip="/"
       # To strip down to <source> only, check to see if matched by regex and only return matched part
       # If not found, behave like -S
       ;;
    2) strip="/";; # Set 'strip' to nothing but a forward slash
esac

# Main loop
# Feed the generated dirList into this while loop which is run line-by-line (ie. directory by directory)
while read -r dir;do
    if [[ "${dir}" != "${prev:0:${#dir}}" ]]; then
        # If current line is not contained within the previous line then that is a valid directory to display/create 
        if [[ -v destinationRootDir ]]; then # If destinationRootDir is set (-m) then create directory in <target>
            mkdir -p${mdOpt} "${destinationRootDir%/}/${dir#$strip}"
            # -p - create intermediate directories if they don't exist. The filtered list means no unnecessary mkdir calls
            # if mdOpt is set, it is 'v', meaning mkdir will output each created directory path to stdout
            # ${dir#$strip} removes the set strip value from the line before it is displayed/created
        else
            echo "${dir#$strip}" # Same as above but no directories created. Displayed only, so -v ignored here
        fi
        prev="${dir}" # Set prev to this line before the loop iterates again and the next line passed to dir
    fi
done <<<"${dirList}" # This is a here string
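
The core of the -m behaviour can be shown in far fewer lines. The following is a simplified sketch rather than a replacement for the script: clone_skeleton is a hypothetical name, it assumes bash plus a find with -mindepth and -print0, it ignores the -s/-S path options, and it does not handle / as the source.

```shell
#!/usr/bin/env bash
# clone_skeleton SRC DST (hypothetical helper): recreate SRC's directory tree,
# without files, under DST by calling mkdir -p on leaf directories only.
clone_skeleton() {
    local src=${1%/} dst=${2%/} dir prev=
    mkdir -p -- "$dst" || return
    while IFS= read -r -d '' dir; do
        # With -depth, a directory that has subdirectories is printed right
        # after one of them, so prev then starts with "$dir/".
        if [[ $prev != "$dir"/* ]]; then
            # Strip the literal "SRC/" prefix and create the leaf under DST;
            # mkdir -p creates all intermediate directories as a side effect.
            mkdir -p -- "$dst/${dir#"$src"/}"
        fi
        prev=$dir
    done < <(find "$src" -mindepth 1 -depth -type d -print0)
}
```

For example, clone_skeleton path/to/src /tmp/copy leaves /tmp/copy with the same directory layout as src and no files.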


I think you could list all the directories, then pipe the output to xargs to count the subdirectories of each one (xargs find SUBDIR -type d | wc -l ... something like that; I cannot test it right now). When a directory has no subdirectories, you've found a leaf.

This is still a loop though.


This is still a loop, since it uses the branch command in sed:

find -depth -type d |sed 'h; :b; $b; N; /^\(.*\)\/.*\n\1$/ { g; bb }; $ {x; b}; P; D'

Based on a script in info sed (uniq work-alike).

Edit: Here is the sed script broken out with comments (copied from info sed and modified):

# copy the pattern space to the hold space
h 

# label for branch (goto) command
:b
# on the last line ($) goto the end of 
# the script (b with no label), print and exit
$b
# append the next line to the pattern space (it now contains line1\nline2)
N
# if the pattern space matches line1 up to a slash, then that slash and whatever
# comes after it, followed by a newline and a copy of the part before the slash;
# in other words, line2 is the same as line1 with its trailing dir(s) removed
# see below for the regex
/^\(.*\)\/.*\n\1$/ {
    # Undo the effect of
    # the N command by copying the hold space back to the pattern space
    g
    # branch to label b (line2 is discarded and line1 stays in the pattern space)
    bb
}
# If the `N' command had added the last line, print and exit
# (if this is the last line then swap the hold space and pattern space
# and goto the end (b without a label))
$ { x; b }

# The lines are different; print the first and go
# back working on the second.
# print up to the first newline of the pattern space
P
# delete up to the first newline in the pattern space, the remainder, if any,
# will become line1, go to the top of the loop
D

Here is what the regex is doing:

  • / - start a pattern
  • ^ - matches the beginning of the line
  • \( - start a capture group (back reference subexpression)
  • .* - zero or more (*) of any character (.)
  • \) - end capture group
  • \/ - a slash (/) (escaped with \)
  • .* - zero or more of any character
  • \n - a newline
  • \1 - a copy of the back reference (which in this case is whatever was between the beginning of the line and the last slash)
  • $ - matches the end of the line
  • / - end the pattern
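
The regex can be tried on its own by feeding sed exactly two lines. This is just an illustrative sketch; is_ancestor_line is a hypothetical helper name, and the sample paths are the ones from the question.

```shell
#!/usr/bin/env bash
# is_ancestor_line LINE1 LINE2 (hypothetical helper): succeed when the regex
# from the sed script matches the two lines joined by a newline, i.e. when
# LINE2 is LINE1 with one or more trailing path components removed.
is_ancestor_line() {
    # N joins the two input lines; p prints the pattern space only on a match,
    # so non-empty output means the regex matched.
    [ -n "$(printf '%s\n%s\n' "$1" "$2" | sed -n 'N; /^\(.*\)\/.*\n\1$/p')" ]
}

is_ancestor_line mydir/src/main mydir/src && echo "mydir/src would be dropped"
is_ancestor_line mydir/src/main mydir/bin || echo "mydir/bin starts a new branch"
```

Both messages are printed. Because the capture group must be followed by a slash, a line such as ab/c is not mistaken for a child of a.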


Try the following one-liner (tested on Linux & OS X):

find . -type d -execdir sh -c 'test -z "$(find "{}" -mindepth 1 -type d)" && echo $PWD/{}' \;
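
One caveat: the one-liner above splices {} into the text of the sh script, so a directory name containing quotes or $( ) could be interpreted as shell code. A hedged variant (my own sketch, not from the answer) passes the names as arguments instead, and -exec ... + starts one sh for many directories. It still runs one inner find per directory. The name leaf_dirs_exec is made up.

```shell
#!/usr/bin/env bash
# leaf_dirs_exec (hypothetical name): print directories under $1 (default: .)
# that have no immediate subdirectory.
leaf_dirs_exec() {
    find "${1:-.}" -type d -exec sh -c '
        # Directory names arrive as positional parameters, never as script text.
        for d do
            # List at most one immediate subdirectory; empty output means leaf.
            if [ -z "$(find "$d" -mindepth 1 -maxdepth 1 -type d | head -n 1)" ]; then
                printf "%s\n" "$d"
            fi
        done
    ' sh {} +
}
```

Usage is leaf_dirs_exec mydir. Unlike the -links 2 shortcut, this does not depend on how the filesystem reports directory link counts.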


On most filesystems (not btrfs), the simple answer is:

find . -type d -links 2

In https://unix.stackexchange.com/questions/497185/how-to-find-only-directories-without-subdirectories there is a solution that works on btrfs, but it's unbearably ugly:

find . -type d \
    \( -exec sh -c 'find "$1" -mindepth 1 -maxdepth 1 -type d -print0 | grep -cz "^" >/dev/null 2>&1' _ {} \; -o -print \)

There's an alternative to find called rawhide (rh) that makes this much easier:

rh 'd && "[ `rh -red %S | wc -l` = 0 ]".sh'

A slightly shorter/faster version is:

rh 'd && "[ -z \"`rh -red %S`\" ]".sh'

The above commands search for directories and then list their sub-directories and only match when there are none (the first by counting the number of lines of output, and the second by checking if there is any output at all per directory).

If you don't need support for btrfs, it's more like find but still shorter:

rh 'd && nlink == 2'

For a version that works on all filesystems as efficiently as possible:

rh 'd && (nlink == 2 || nlink == 1 && "[ -z \"`rh -red %S`\" ]".sh)'

On normal (non-btrfs) filesystems, this will work without the need for any additional processes for each directory, but on btrfs, it will need them. This is probably best if you have a mix of different filesystems including btrfs.

Rawhide (rh) is available from https://raf.org/rawhide or https://github.com/raforg/rawhide. It works at least on Linux, FreeBSD, OpenBSD, NetBSD, Solaris, macOS, and Cygwin.

Disclaimer: I am the current author of rawhide
