2

I would like a straightforward way to extract the part of a string that precedes the first digit, preferably with a regular expression rather than by walking the string character by character. I am using this to get package names from rpm -qa without their versions. For example:

Parsing: perl-Text-ParseWords-3.30-1.fc22.i686
Result: perl-Text-ParseWords
4
  • 2
    Your example cuts at a hyphen (-), not at a digit. Nov 17, 2016 at 17:47
  • There is another - after 30 Nov 17, 2016 at 18:02
  • I think he means that if you cut where the first digit appears, you are left with a trailing - character. If the package name contains no digits, you could use: strtocut="perl-Text-ParseWords-3.30-1.fc22.i686"; echo ${strtocut:0:$(($(expr match "$strtocut" '[A-Za-z-]*')-1))} This uses expr match to find the length of the leading run that contains only the characters A-Z, a-z and "-", then subtracts 1 from that length and extracts that portion of the string. It is not really reliable, though: if the package name contains a digit (or another special character), it will cut too early.
    – IanC
    Nov 17, 2016 at 18:33
  • If it's a Bash variable, "${somevar%%[0-9]*}" - look for answers about Bash parameter expansion.
    – Wildcard
    Nov 17, 2016 at 18:52

2 Answers

5

Preferred Alternative

We could simply modify the rpm query to only output the name.

rpm -qa --queryformat "%{NAME}\n"

Or we can get dirty with a regex

Not exactly "straightforward", but here is a sed expression that should be able to do it.

sed -e 's/\([^.]*\).*/\1/;s/-[0-9]*$//' <<< "perl-Text-ParseWords-3.30-1.fc22.i686"

This should handle everything except a package name that itself contains a period (such names do exist, e.g. java-1.8.0-openjdk); those are truncated at the first period.

Quick breakdown

  • s/\([^.]*\).*/\1/ keeps everything before the first period. So perl-Text-ParseWords-3.30-1.fc22.i686 becomes perl-Text-ParseWords-3

  • s/-[0-9]*$// removes the trailing - and the version digits that follow it. So perl-Text-ParseWords-3 becomes perl-Text-ParseWords.
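
To make the breakdown easier to follow, here is a minimal sketch that runs the two substitutions one at a time and prints the intermediate value. The function name strip_version is made up for illustration, and the version strings in the xmms2-core and java-1.8.0-openjdk inputs are invented examples, not taken from a real system.

```shell
# Run the two sed substitutions separately so the intermediate result is visible.
# strip_version is a hypothetical helper name, not part of rpm or sed.
strip_version() {
    # Step 1: keep only the text before the first period.
    step1=$(printf '%s\n' "$1" | sed -e 's/\([^.]*\).*/\1/')
    # Step 2: drop a trailing hyphen followed by digits.
    step2=$(printf '%s\n' "$step1" | sed -e 's/-[0-9]*$//')
    printf '%s -> %s -> %s\n' "$1" "$step1" "$step2"
}

strip_version 'perl-Text-ParseWords-3.30-1.fc22.i686'
strip_version 'xmms2-core-0.8-1.fc22.i686'                      # digit inside the name
strip_version 'java-1.8.0-openjdk-1.8.0.111-1.b16.fc22.x86_64'  # period inside the name
```

The first two inputs end up as perl-Text-ParseWords and xmms2-core, while the third collapses to java, which is the period-in-the-name limitation mentioned above.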

4
  • 1
    +1 for the rpm command. I think that should be listed first. :)
    – Wildcard
    Nov 17, 2016 at 18:54
  • +1 for rpm -qa --queryformat "%{NAME}\n"; that is the right way to do it. I have a package called xmms2-core, so don't split on the number.
    – hschou
    Nov 17, 2016 at 18:58
  • @hschou the sed expression should still work on xmms2-core: -[0-9]*$ only matches a hyphen followed by digits (not letters) up to the end of the line. Nov 17, 2016 at 19:04
  • I had no idea I could format the rpm query like that. As a matter of fact, that is ideal for my need. I will need the regex too, as I am also parsing the names of RPMs in a directory. Thanks for providing both! Nov 17, 2016 at 19:29
3

Directly in bash:

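a='perl-Text-ParseWords-3.30-1.fc22.i686'
r='(^[^0-9]+)'    # leading run of non-digit characters
[[ $a =~ $r ]]

# ${...%?} drops the last character, i.e. the hyphen before the version
echo "${BASH_REMATCH[1]%?}"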
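
Cutting at the first digit gives the wrong result when the package name itself contains a digit, as the comments point out for xmms2-core. The sketch below compares that cut (written with the %%[0-9]* parameter expansion suggested in the comments) against a different technique that is not from this answer: removing the shortest suffix of the form -VERSION-RELEASE.ARCH. It assumes the input follows the usual name-version-release.arch layout, where neither version nor release contains a hyphen. The function names and the xmms2-core and java version strings are invented for illustration.

```shell
# Cut at the first digit, then drop the hyphen left at the end (if any).
name_by_first_digit() {
    s=${1%%[0-9]*}              # remove the longest suffix that starts with a digit
    printf '%s\n' "${s%-}"
}

# Alternative sketch: strip the last two hyphen-separated fields.
# Assumes name-version-release.arch, optionally followed by .rpm.
name_by_nvra() {
    nvra=${1%.rpm}              # tolerate a file name ending in .rpm
    printf '%s\n' "${nvra%-*-*}"   # shortest suffix matching -*-*
}

for pkg in \
    'perl-Text-ParseWords-3.30-1.fc22.i686' \
    'xmms2-core-0.8-1.fc22.i686' \
    'java-1.8.0-openjdk-1.8.0.111-1.b16.fc22.x86_64.rpm'
do
    printf '%s | %s\n' "$(name_by_first_digit "$pkg")" "$(name_by_nvra "$pkg")"
done
```

For the first input both functions agree on perl-Text-ParseWords; for the other two the first-digit cut stops early (xmms and java), while the suffix-stripping version keeps the whole name. When the packages are installed, the rpm --queryformat approach from the other answer is still the more direct route.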
