Sunday, 26 May 2013

bash string manipulation

Bash supports many ways of string manipulation, it includes substring, substring replacement, substring removal, string length.

In how to rename multiple files in one command, one of the ways I shared to rename *.txt to *.sh is

for f in *.txt
do
        mv $f ${f/.txt/.sh}
done

How ${f/.txt/.sh} is doing the magic? Actually this is string replacement in bash.

${f/.txt/.sh} means replacing ".txt" in string $f with ".sh", so for f=abc.txt, ${f/.txt/.sh} will produce abc.sh

[linuxscripter@localhost ~]$ f=abc.txt
[linuxscripter@localhost ~]$ echo $f{f/.txt/.sh}
abc.sh

But the example I gave in rename multiple files in one command was not so correct.
For f=abc.txt.txt, ${f/.txt/.sh} will produce abc.sh.txt instead of abc.txt.sh

[linuxscripter@localhost ~]$ f=abc.txt.txt
[linuxscripter@localhost ~]$ echo $f{f/.txt/.sh}
abc.txt.sh

To correct this, we need to use ${f/%.txt/.sh}

[linuxscripter@localhost ~]$ f=abc.txt.txt
[linuxscripter@localhost ~]$ echo $f{f/%.txt/.sh}
abc.txt.sh

The syntax of string replacement includes:
1. ${var/Pattern/Replacement}   : replacing the first match of "Pattern" with "Replacement"
2. ${var//Pattern/Replacement}  : replacing every match of "Pattern" with "Replacement"
3. ${var/#Pattern/Replacement}  : if $var starts with "Pattern", replace "Pattern" with "Replacement"
4. ${var/%Pattern/Replacement}  : if $var ends with "Pattern", replace "Pattern" with "Replacement"

Some examples will make the syntax clearer.

[linuxscripter@localhost ~]$ f=abc.txt.abc.txt

[linuxscripter@localhost ~]$ ### testing syntax 1
[linuxscripter@localhost ~]$ echo ${f/txt/sh}
abc.sh.abc.txt

[linuxscripter@localhost ~]$ ### testing syntax 2
[linuxscripter@localhost ~]$ echo ${f//txt/sh}
abc.sh.abc.sh

[linuxscripter@localhost ~]$ ### testing syntax 3, [linuxscripter@localhost ~]$ echo ${f/#txt/sh}
abc.txt.abc.txt

[linuxscripter@localhost ~]$ ### testing syntax 3,

[linuxscripter@localhost ~]$ echo ${f/#abc/sh}
sh.txt.abc.txt

[linuxscripter@localhost ~]$ ### testing syntax 4,
[linuxscripter@localhost ~]$ echo ${f/%abc/sh}
abc.txt.abc.txt
[linuxscripter@localhost ~]$ ### testing syntax 4
[linuxscripter@localhost ~]$ echo ${f/%txt/sh}
abc.txt.abc.sh

At first these syntax may not appear very intuitive, I found this method help to understand and memorize them.
Variable is referred by add '$' infront of it
1. '/' is used for replacement, same as what we do in awk and sed
2. Once we know '/' is for replacement, '//' should be too difficult for us.
3. On keyboard, '#' is at the left of '$', so '#' is used to replace the left side of the variable.
4. On keyboard, '%' is at the right of '$', so '%' is used to replace the right side of the variable.

Besides string replacement, of course there are a lot more bash can do.
${#var} returns length of $var
[linuxscripter@localhost ~]$ echo {#f}
15

${var:pos:len} returns substring of $var, starting at position "pos", with length upto "len", if len is empty or len is too big, return substring starting at position "pos" until to the end of the string.
[linuxscripter@localhost ~]$ echo ${f:2:5} ${f:2:100} ${f:2}
c.txt c.txt.abc.txt c.txt.abc.txt

${var#substring} delete shortest match of "substring" from front of $var
[linuxscripter@localhost ~]$ echo ${f#abc}
.txt.abc.txt
[linuxscripter@localhost ~]$ echo ${f#*txt}
.abc.txt

${var##substring} delete longest match of "substring" from front of $var
[linuxscripter@localhost ~]$ echo ${f##*abc}
.txt

${var%substring} delete longest match of "substring" from back of $var
[linuxscripter@localhost ~]$ echo ${f%txt}
abc.txt.abc.
[linuxscripter@localhost ~]$ echo ${f%abc*}
abc.txt.

${var%%substring} delete longest match of "substring" from back of $var
[linuxscripter@localhost ~]$ echo ${f%%txt*}
abc.

http://tldp.org/LDP/abs/html/string-manipulation.html
http://bbs.chinaunix.net/forum.php?mod=viewthread&tid=218853&page=7

No comments:

Post a Comment