The `unary operator expected` error occurs because `[` and `*` (in your `*fastq.gz`) work independently.

`[` is not shell syntax. `[` is a regular command (a builtin in Bash, but still a command) and `]` is its last argument, a mandatory one. Anything in between is an argument too.

The shell expands `/path/to/dir/*fastq.gz` to one or more words before it calls `[`. `[` will see these words plus the mandatory `]` as arguments. Depending on the number of arguments and what they are, `[` expects zero or more of them to be operators (like `-f`).
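To see the expansion happen before the command runs, here is a minimal sketch. The helper `count_args` and the scratch directory are made up for this illustration; the sketch assumes Bash with default options (no `nullglob`, no `failglob`).

```shell
#!/usr/bin/env bash
# Hypothetical helper: report how many arguments it was called with.
count_args () { echo "$#"; }

tmp=$(mktemp -d)               # scratch directory, only for this demo
touch "$tmp/a.fastq.gz" "$tmp/b.fastq.gz"

# The shell expands the pattern first; count_args never sees the "*".
count_args "$tmp"/*fastq.gz    # two matches, prints 2
count_args "$tmp"/*spring      # no match: the pattern is passed as-is, prints 1

rm -r "$tmp"
```

`[` is in the same position as `count_args`: it only ever receives the already expanded words.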
Your `[ /path/to/dir/*fastq.gz ]` will be valid if `/path/to/dir/*fastq.gz` expands to a single argument (note “will be valid” is not equivalent to “will do what you want”). This includes cases where `*` matches nothing; traditionally (and by default in Bash) if there is no match then `/path/to/dir/*fastq.gz` will be processed as-is. It may happen that `/path/to/dir/*fastq.gz` expands to multiple words, none of which looks like an operator `[` understands. The error you got is most likely from a case where the pattern expanded to two words.
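The three cases (no match, one match, two matches) can be reproduced with a short sketch. `try_test` is a hypothetical helper and the directory is a throwaway one; error messages are silenced so only the exit status is shown. Default Bash options are assumed.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run [ DIR/*fastq.gz ] and print its exit status.
try_test () {
    local status=0
    [ "$1"/*fastq.gz ] 2>/dev/null || status=$?
    echo "$status"
}

tmp=$(mktemp -d)
try_test "$tmp"            # no match: one literal word, non-empty string, status 0
touch "$tmp/a.fastq.gz"
try_test "$tmp"            # one match: one word, status 0
touch "$tmp/b.fastq.gz"
try_test "$tmp"            # two matches: syntax error inside [, status 2
rm -r "$tmp"
```

Note that the test succeeds both with zero matches and with one match, so even when it is valid it cannot tell you whether a file exists.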
Later you used `[ "$in"/*spring -f ]`. This is even worse, because you probably wanted something like `[ -f some/path ]`, where `-f` comes before the path to test. Still, `[ -f "$in"/*spring ]` is not a robust fix because `"$in"/*spring` may in general expand to multiple arguments and `[` will not tolerate them. You wrote there is at most one `*spring` file per directory, so in your case code like this may kinda work; it’s still poor code though.
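A small sketch of why `[ -f "$in"/*spring ]` only works by luck. The directory and file names are invented for the demo; the statuses in the comments are what Bash returns.

```shell
#!/usr/bin/env bash
in=$(mktemp -d)                 # stand-in for the directory from the question
touch "$in/x.spring"

# Exactly one match: -f gets a single path and the test succeeds.
[ -f "$in"/*spring ] && echo "one match: test succeeds"

# Operator after the path: two arguments, the first is not an operator.
[ "$in"/*spring -f ] 2>/dev/null || echo "operator after the path: status $?"

touch "$in/y.spring"
# Two matches: -f is followed by two paths, which [ cannot parse.
[ -f "$in"/*spring ] 2>/dev/null || echo "two matches: status $?"

rm -r "$in"
```

In Bash the last two lines report status 2, which signals a usage error rather than "file not found".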
With `[`, do not use wildcards like `*` that may expand to multiple words; this will fail immediately or soon. `[[` is different under the hood but it’s not good for your purpose either.
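One reason `[[` does not help here: Bash performs no pathname expansion on the words between `[[` and `]]`, so the `*` stays literal. A minimal sketch with a throwaway directory:

```shell
#!/usr/bin/env bash
tmp=$(mktemp -d)
touch "$tmp/x.spring"

# No globbing inside [[ ]]: this looks for a file literally named "*spring".
if [[ -f "$tmp"/*spring ]]; then
    echo "found"
else
    echo "not found, although x.spring exists"
fi

rm -r "$tmp"
```

The test never fails with a syntax error, but it also never sees `x.spring`, so it answers a different question than the one you are asking.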
You want to know how many files a pattern like `/path/to/dir/*fastq.gz` matches. The right way to do it is to assign the result of the expansion to an array. Portably there’s only one array: the array of arguments of the shell script (or the shell function); and you need extra code to detect a case of zero matches (that still generates one word: the unexpanded pattern string). Your question is tagged bash, so I will use a named array and a few other non-portable features:
# non-portable code, works in Bash
check_dir () (
    dir="${1-.}"
    dir="${dir%/}/"    # make sure the path ends with a slash
    [ -d "$dir" ] || { echo "Not a directory." >&2; return 1; }
    shopt -s nullglob  # a pattern with no match expands to nothing
    files=( "$dir"*fastq.gz )
    nf="${#files[@]}"
    files=( "$dir"*spring )
    ns="${#files[@]}"
    printf '%s\t%s\t%s\n' "$nf" "$ns" "$dir"
)
Usage: `check_dir path/to/dir` or `check_dir` (the default path is `.`). The function will print the number of `*fastq.gz` files, a tab, the number of `*spring` files, a tab, and finally the examined path (printed with a trailing `/`).
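For comparison, the portable approach mentioned earlier (the positional parameters as the only array, plus an explicit zero-match check) might look roughly like this. It is only a sketch: `count_matches` and its two-argument interface are assumptions made for this example, not part of the functions in this answer.

```shell
#!/bin/sh
# Portable sketch. Usage: count_matches DIR SUFFIX
count_matches () (
    dir=${1%/}/
    suffix=$2
    # Replace the positional parameters with the result of the expansion.
    set -- "$dir"*"$suffix"
    # With no match the unexpanded pattern is the only word,
    # and it does not name an existing file (or a dangling symlink).
    if [ "$#" -eq 1 ] && [ ! -e "$1" ] && [ ! -L "$1" ]; then
        echo 0
    else
        echo "$#"
    fi
)

tmp=$(mktemp -d)
touch "$tmp/a.fastq.gz" "$tmp/b.fastq.gz"
count_matches "$tmp" fastq.gz   # prints 2
count_matches "$tmp" spring     # prints 0
rm -r "$tmp"
```

The extra `-e`/`-L` checks are exactly the "extra code" that `nullglob` makes unnecessary in Bash.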
Now you can analyze a directory tree (the below function requires the above function to be defined):
# non-portable code, works in Bash
check_dirs () (
    dir="${1-.}"
    dir="${dir%/}/"
    [ -d "$dir" ] || { echo "Not a directory." >&2; return 1; }
    shopt -s nullglob globstar  # globstar lets ** match directories recursively
    for d in "$dir"**/; do
        check_dir "$d"
    done
)
Usage: `check_dirs path/to/dir` or `check_dirs` (the default path is `.`).
Notes:
- For a large directory tree `check_dirs` may seem to initially stall. This is because `for d in "$dir"**/` needs to be fully expanded before `check_dir` is ever called and prints anything.
- The functions are deliberately defined as subshells (`check_dir () (` as opposed to `check_dir () {`), so shell options (`shopt`) and all variables are local.
- If you want `check_dir` to count hidden files, you need `dotglob` in this function (i.e. `shopt -s nullglob dotglob`).
- If you want `check_dirs` to descend to hidden directories, you need `dotglob` in this function (i.e. `shopt -s nullglob globstar dotglob`).
- Unless the names of your directories contain newline characters, the output from `check_dir` or `check_dirs` is easily parsable with standard tools. Useful commands: `sort -n`, `grep $'^2\t1\t'`, `cut -f 3-`.

  E.g. to find directories under `./` with exactly one `*fastq.gz` file and exactly zero `*spring` files: `check_dirs | grep $'^1\t0\t' | cut -f 3-`
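The note about subshell bodies can be checked with a small sketch. The functions `in_subshell`, `in_braces` and `report`, and the variable `marker`, are hypothetical names used only here.

```shell
#!/usr/bin/env bash
# Two functions that differ only in ( ) versus { } as the body.
in_subshell () ( shopt -s nullglob; marker=set )
in_braces   () { shopt -s nullglob; marker=set; }

# Print the current state of nullglob and of the variable "marker".
report () {
    if shopt -q nullglob; then echo "nullglob on"; else echo "nullglob off"; fi
    echo "marker=${marker-unset}"
}

shopt -u nullglob; unset marker
in_subshell; report    # nullglob off, marker=unset: nothing leaked
in_braces;   report    # nullglob on, marker=set: the caller's shell was changed

shopt -u nullglob; unset marker   # restore the initial state
```

With the `( )` body, the caller keeps its own glob settings no matter what the function enables internally.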