开发者

How to parse $QUERY_STRING from a bash CGI script?

开发者 https://www.devze.com 2023-01-19 22:34 出处:网络
I have a bash script that is being used in a CGI. The CGI sets the $QUERY_STRING environment variable by reading everything after the ? in the URL.For开发者_如何学运维 example, http://example.com?a=12

I have a bash script that is being used in a CGI. The CGI sets the $QUERY_STRING environment variable by reading everything after the ? in the URL. For开发者_如何学运维 example, http://example.com?a=123&b=456&c=ok sets QUERY_STRING=a=123&b=456&c=ok.

Somewhere I found the following ugliness:

b=$(echo "$QUERY_STRING" | sed -n 's/^.*b=\([^&]*\).*$/\1/p' | sed "s/%20/ /g")

which will set $b to whatever was found in $QUERY_STRING for b. However, my script has grown to have over ten input parameters. Is there an easier way to automatically convert the parameters in $QUERY_STRING into environment variables usable by bash?

Maybe I'll just use a for loop of some sort, but it'd be even better if the script was smart enough to automatically detect each parameter and maybe build an array that looks something like this:

${parm[a]}=123
${parm[b]}=456
${parm[c]}=ok

How could I write code to do that?


Try this:

saveIFS=$IFS
IFS='=&'
parm=($QUERY_STRING)
IFS=$saveIFS

Now you have this:

parm[0]=a
parm[1]=123
parm[2]=b
parm[3]=456
parm[4]=c
parm[5]=ok

In Bash 4, which has associative arrays, you can do this (using the array created above):

declare -A array
for ((i=0; i<${#parm[@]}; i+=2))
do
    array[${parm[i]}]=${parm[i+1]}
done

which will give you this:

array[a]=123
array[b]=456
array[c]=ok

Edit:

To use indirection in Bash 2 and later (using the parm array created above):

for ((i=0; i<${#parm[@]}; i+=2))
do
    declare var_${parm[i]}=${parm[i+1]}
done

Then you will have:

var_a=123
var_b=456
var_c=ok

You can access these directly:

echo $var_a

or indirectly:

for p in a b c
do
    name="var$p"
    echo ${!name}
done

If possible, it's better to avoid indirection since it can make code messy and be a source of bugs.


you can break $QUERY down using IFS. For example, setting it to &

$ QUERY="a=123&b=456&c=ok"
$ echo $QUERY
a=123&b=456&c=ok
$ IFS="&"
$ set -- $QUERY
$ echo $1
a=123
$ echo $2
b=456
$ echo $3
c=ok

$ array=($@)

$ for i in "${array[@]}"; do IFS="=" ; set -- $i; echo $1 $2; done
a 123
b 456
c ok

And you can save to a hash/dictionary in Bash 4+

$ declare -A hash
$ for i in "${array[@]}"; do IFS="=" ; set -- $i; hash[$1]=$2; done
$ echo ${hash["b"]}
456


Please don't use the evil eval junk.

Here's how you can reliably parse the string and get an associative array:

declare -A param   
while IFS='=' read -r -d '&' key value && [[ -n "$key" ]]; do
    param["$key"]=$value
done <<<"${QUERY_STRING}&"

If you don't like the key check, you could do this instead:

declare -A param   
while IFS='=' read -r -d '&' key value; do
    param["$key"]=$value
done <<<"${QUERY_STRING:+"${QUERY_STRING}&"}"

Listing all the keys and values from the array:

for key in "${!param[@]}"; do
    echo "$key: ${param[$key]}"
done


I packaged the sed command up into another script:

$cat getvar.sh

s='s/^.*'${1}'=\([^&]*\).*$/\1/p'
echo $QUERY_STRING | sed -n $s | sed "s/%20/ /g"

and I call it from my main cgi as:

id=`./getvar.sh id`
ds=`./getvar.sh ds`
dt=`./getvar.sh dt`

...etc, etc - you get idea.

works for me even with a very basic busybox appliance (my PVR in this case).


To converts the contents of QUERY_STRING into bash variables use the following command:

eval $(echo ${QUERY_STRING//&/;})

The inner step, echo ${QUERY_STRING//&/;}, substitutes all ampersands with semicolons producing a=123;b=456;c=ok which the eval then evaluates into the current shell.

The result can then be used as bash variables.

echo $a
echo $b
echo $c

The assumptions are:

  • values will never contain '&'
  • values will never contain ';'
  • QUERY_STRING will never contain malicious code


While the accepted answer is probably the most beautiful one, there might be cases where security is super-important, and it needs to be also well-visible from your script.

In such a case, first I wouldn't use bash for the task, but if it should be done on some reason, it might be better to avoid these new array - dictionary features, because you can't be sure, how exactly are they escaped.

In this case, the good old primitive solutions might work:

QS="${QUERY_STRING}"
while [ "${QS}" != "" ]
do
  nameval="${QS%%&*}"
  QS="${QS#$nameval}"
  QS="${QS#&}"
  name="${nameval%%=*}"
  val="${nameval#$name}"
  val="${nameval#=}"

  # and here we have $name and $val as names and values

  # ...

done

This iterates on the name-value pairs of the QUERY_STRING, and there is no way to circumvent it with any tricky escape sequence - the " is a very strong thing in bash, except a single variable name substitution, which is fully controlled by us, nothing can be tricked.

Furthermore, you can inject your own processing code into "# ...". This enables you to allow only your own, well-defined (and, ideally, short) list of the allowed variable names. Needless to say, LD_PRELOAD shouldn't be one of them. ;-)

Furthermore, no variable will be exported, and exclusively QS, nameval, name and val is used.


Following the correct answer, I've done myself some changes to support array variables like in this other question. I added also a decode function of which I can not find the author to give some credit.

Code appears somewhat messy, but it works. Changes and other recommendations would be greatly appreciated.

function cgi_decodevar() {
    [ $# -ne 1 ] && return
    local v t h
    # replace all + with whitespace and append %%
    t="${1//+/ }%%"
    while [ ${#t} -gt 0 -a "${t}" != "%" ]; do
        v="${v}${t%%\%*}" # digest up to the first %
        t="${t#*%}"       # remove digested part
        # decode if there is anything to decode and if not at end of string
        if [ ${#t} -gt 0 -a "${t}" != "%" ]; then
            h=${t:0:2} # save first two chars
            t="${t:2}" # remove these
            v="${v}"`echo -e \\\\x${h}` # convert hex to special char
        fi
    done
    # return decoded string
    echo "${v}"
    return
}

saveIFS=$IFS
IFS='=&'
VARS=($QUERY_STRING)
IFS=$saveIFS

for ((i=0; i<${#VARS[@]}; i+=2))
do
  curr="$(cgi_decodevar ${VARS[i]})"
  next="$(cgi_decodevar ${VARS[i+2]})"
  prev="$(cgi_decodevar ${VARS[i-2]})"
  value="$(cgi_decodevar ${VARS[i+1]})"

  array=${curr%"[]"}

  if  [ "$curr" == "$next" ] && [ "$curr" != "$prev" ] ;then
      j=0
      declare var_${array}[$j]="$value"
  elif [ $i -gt 1 ] && [ "$curr" == "$prev" ]; then
    j=$((j + 1))
    declare var_${array}[$j]="$value"
  else
    declare var_$curr="$value"
  fi
done


I would simply replace the & to ;. It will become to something like:

a=123;b=456;c=ok

So now you need just evaluate and read your vars:

eval `echo "${QUERY_STRING}"|tr '&' ';'`
echo $a
echo $b
echo $c


A nice way to handle CGI query strings is to use Haserl which acts as a wrapper around your Bash cgi script, and offers convenient and secure query string parsing.


To bring this up to date, if you have a recent Bash version then you can achieve this with regular expressions:

q="$QUERY_STRING"
re1='^(\w+=\w+)&?'
re2='^(\w+)=(\w+)$'
declare -A params
while [[ $q =~ $re1 ]]; do
  q=${q##*${BASH_REMATCH[0]}}       
  [[ ${BASH_REMATCH[1]} =~ $re2 ]] && params+=([${BASH_REMATCH[1]}]=${BASH_REMATCH[2]})
done

If you don't want to use associative arrays then just change the penultimate line to do what you want. For each iteration of the loop the parameter is in ${BASH_REMATCH[1]} and its value is in ${BASH_REMATCH[2]}.

Here is the same thing as a function in a short test script that iterates over the array outputs the query string's parameters and their values

#!/bin/bash
QUERY_STRING='foo=hello&bar=there&baz=freddy'

get_query_string() {
  local q="$QUERY_STRING"
  local re1='^(\w+=\w+)&?'
  local re2='^(\w+)=(\w+)$'
  while [[ $q =~ $re1 ]]; do
    q=${q##*${BASH_REMATCH[0]}}
    [[ ${BASH_REMATCH[1]} =~ $re2 ]] && eval "$1+=([${BASH_REMATCH[1]}]=${BASH_REMATCH[2]})"
  done
}

declare -A params
get_query_string params

for k in "${!params[@]}"
do
  v="${params[$k]}"
  echo "$k : $v"
done          

Note the parameters end up in the array in reverse order (it's associative so that shouldn't matter).


why not this

    $ echo "${QUERY_STRING}"
    name=carlo&last=lanza&city=pfungen-CH
    $ saveIFS=$IFS
    $ IFS='&'
    $ eval $QUERY_STRING
    $ IFS=$saveIFS

now you have this

    name = carlo
    last = lanza
    city = pfungen-CH

    $ echo "name is ${name}"
    name is carlo
    $ echo "last is ${last}"
    last is lanza
    $ echo "city is ${city}"
    city is pfungen-CH


@giacecco

To include a hiphen in the regex you could change the two lines as such in answer from @starfry.

Change these two lines:

  local re1='^(\w+=\w+)&?'
  local re2='^(\w+)=(\w+)$'

To these two lines:

  local re1='^(\w+=(\w+|-|)+)&?'
  local re2='^(\w+)=((\w+|-|)+)$'


For all those who couldn't get it working with the posted answers (like me), this guy figured it out.

Can't upvote his post unfortunately...

Let me repost the code here real quick:

#!/bin/sh

if [ "$REQUEST_METHOD" = "POST" ]; then
  if [ "$CONTENT_LENGTH" -gt 0 ]; then
      read -n $CONTENT_LENGTH POST_DATA <&0
  fi
fi

#echo "$POST_DATA" > data.bin
IFS='=&'
set -- $POST_DATA

#2- Value1
#4- Value2
#6- Value3
#8- Value4

echo $2 $4 $6 $8

echo "Content-type: text/html"
echo ""
echo "<html><head><title>Saved</title></head><body>"
echo "Data received: $POST_DATA"
echo "</body></html>"

Hope this is of help for anybody.

Cheers


Actually I liked bolt's answer, so I made a version which works with Busybox as well (ash in Busybox does not support here string). This code will accept key1 and key2 parameters, all others will be ignored.

while IFS= read -r -d '&' KEYVAL && [[ -n "$KEYVAL" ]]; do
case ${KEYVAL%=*} in
        key1) KEY1=${KEYVAL#*=} ;;
        key2) KEY2=${KEYVAL#*=} ;;
    esac
done <<END
$(echo "${QUERY_STRING}&")
END


One can use the bash-cgi.sh, which processes :

  • the query string into the $QUERY_STRING_GET key and value array;

  • the post request data (x-www-form-urlencoded) into the $QUERY_STRING_POST key and value array;

  • the cookies data into the $HTTP_COOKIES key and value array.

Demands bash version 4.0 or higher (to define the key and value arrays above).

All processing is made by bash only (i.e. in an one process) without any external dependencies and additional processes invoking.

It has:

  • the check for max length of data, which can be transferred to it's input, as well as processed as query string and cookies;

  • the redirect() procedure to produce redirect to itself with the extension changed to .html (it is useful for an one page's sites);

  • the http_header_tail() procedure to output the last two strings of the HTTP(S) respond's header;

  • the $REMOTE_ADDR value sanitizer from possible injections;

  • the parser and evaluator of the escaped UTF-8 symbols embedded into the values passed to the $QUERY_STRING_GET, $QUERY_STRING_POST and $HTTP_COOKIES;

  • the sanitizer of the $QUERY_STRING_GET, $QUERY_STRING_POST and $HTTP_COOKIES values against possible SQL injections (the escaping like the mysql_real_escape_string php function does, plus the escaping of @ and $).

It is available here:

https://github.com/VladimirBelousov/fancy_scripts


This works in dash using for in loop

IFS='&'
for f in $query_string; do
   value=${f##*=}
   key=${f%%=*}
    # if you need environment variable -> eval "qs_$key=$value"
done
0

精彩评论

暂无评论...
验证码 换一张
取 消