
I have a string

$test = 'xyz45sd2-32d34-sd23-456562.abc.com';

The objective is to obtain $1 = 23 and $2 = 45, i.e. an equal number of digits on each side of the '-' that is followed by '.'.

I have tried the following:

$test =~ s/.*(\d+)-(\d+).*//;

But this gives:

$1 matches: 3

$2 matches: 456562

Why do you want (23, 45) and not (23, 456562)? – Akinakes 13 hours ago
Will they always be two digits, or do you mean that it's a variable number of digits before and after the dash? If it's always two digits, you could just use s/.*(\d{2})-(\d{2}).*// – GreatBigBore 13 hours ago
@GreatBigBore - The number of digits is variable, but I want the same number of digits on both sides, e.g. if the left side of '-' has 5 digits but the right side of '-' has 3 digits, then I need to match 3 digits – Ashwath Narayanan 12 hours ago
What do you expect if the right side has fewer digits than the left side? For example, in the string xyz45sd2-32d34-sd23-1d3f5b.abc.com, would you expect 3 and 1? – Samveen 9 hours ago
@Samveen - Yes, because the ending sequence from number 1 has the dot. So I want 3 and 1 – Ashwath Narayanan 42 mins ago

4 Answers

You can try this regex

if($test =~ m/(\S+)-(\S+)-([a-z]*)(\d+)-(\d\d)(\d+).*/)
{
    print $4,"|",$5;
}

I assume that you need only the first 2 digits from 456562.


perl -e '"xyz45sd2-32d34-sd23-456562.abc.com" =~ /(\d{2})-(\d{2})\d*(?=\.)/; print "$1\n$2\n"'


This other entry confirms that a regex cannot count: How to match word where count of characters same

Building upon GreatBigBore's idea, if there's an upper bound on the count, you could try the alternation operator |. This only satisfies the requirement of finding a match; depending on the matched count, the captures land in different groups, and only one alternative places them in $1 and $2: (\d{3})-(\d{3})|(\d{2})-(\d{2})|(\d{1})-(\d{1})

However, if you concatenate the captures as $1$3$5 and $2$4$6, you will effectively get the two strings you were looking for.
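As a rough sketch of the alternation idea, assuming an upper bound of three digits: the helper name match_bins is hypothetical, and the [^-]*\z tail is an assumption added here to pin the match to the last dash, since without it the pattern would already match 2-3 at the first dash of the sample string. Only the defined captures are joined, which avoids warnings about undefined values.

```perl
use strict;
use warnings;

# Hypothetical helper: tries 3, then 2, then 1 digit(s) on each side of the dash.
sub match_bins {
    my ($s) = @_;
    # Assumption: [^-]*\z forces the dash to be the last one in the string.
    return unless $s =~ /(?:(\d{3})-(\d{3})|(\d{2})-(\d{2})|(\d)-(\d))[^-]*\z/;
    # Only one pair of groups took part in the match; the others are undef.
    my @caps = ($1, $2, $3, $4, $5, $6);
    my $lhs = join '', grep { defined } @caps[0, 2, 4];
    my $rhs = join '', grep { defined } @caps[1, 3, 5];
    return ($lhs, $rhs);
}

my ($lhs, $rhs) = match_bins('xyz45sd2-32d34-sd23-456562.abc.com');
print "$lhs\t$rhs\n";    # prints 23 and 45, separated by a tab
```

The pattern grows with the upper bound, so this is only practical when the bound is small.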

Another idea is to work iteratively: repeat the search on the string with an increasing digit count until the match fails: (\d{1})-(\d{1}), (\d{2})-(\d{2}), ...

A binary search comes to mind, making it O(log N), N being the upper limit for the capture length.
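The iterative variant might look like the sketch below. The helper name match_iterative and its $max parameter are hypothetical, and the [^-]*\z tail is again an assumption that ties the match to the last dash. It uses a plain linear scan rather than a binary search: stopping at the first failure is safe because a match with n digits on each side implies a match with n-1 digits.

```perl
use strict;
use warnings;

# Hypothetical helper: grow $n until \d{$n}-\d{$n} no longer matches.
sub match_iterative {
    my ($s, $max) = @_;
    my @best;    # captures from the largest $n that matched so far
    for my $n (1 .. $max) {
        # Assumption: [^-]*\z anchors the pattern on the last dash.
        last unless $s =~ /(\d{$n})-(\d{$n})[^-]*\z/;
        @best = ($1, $2);
    }
    return @best;    # empty list if not even one digit matched on each side
}

print join("\t", match_iterative('xyz45sd2-32d34-sd23-456562.abc.com', 10)), "\n";
# prints 23 and 45, separated by a tab
```

The same monotonic property is what would make a binary search over $n valid.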

share|improve this answer
add comment (requires an account with 50 reputation)

Theoretical answer

Short answer:

What you're looking for is not possible using regular expressions.

Long Answer:

Regular expressions (as their name suggests) are a compact representation of regular languages (Type-3 grammars in the Chomsky hierarchy).

What you're looking for is not possible using regular expressions because you're trying to write an expression that maintains some kind of count (contextual information other than the beginning and end). This kind of behavior cannot be modelled by a DFA (or, in fact, by any finite automaton). A language is regular exactly when there exists a DFA that accepts it. Since this kind of contextual information cannot be modelled in a DFA, you cannot write a regular expression for your problem.

Practical Solution

my ($lhs, $rhs) = $test =~ /^[^-]+-[^-]+-([^-]+)-([^-.]+)\S+/;
# Alternatively, and faster (reuses the variables declared above)
($lhs, $rhs) = (split /-/, $test)[2, 3];

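# The rest is common, no matter how $lhs and $rhs are extracted.
my @left = reverse split //, $lhs;   # characters of $lhs, nearest the dash first
my @right = split //, $rhs;          # characters of $rhs, nearest the dash first

# Count the leading positions that hold a digit on both sides.
my $i;
for($i=0; exists($left[$i]) and exists($right[$i]) and $left[$i] =~ /\d/ and $right[$i] =~ /\d/ ; ++$i){}

# $i is now that count; step back to the index of the last matching pair.
--$i;
$lhs= join "", reverse @left[0..$i];
$rhs= join "", @right[0..$i];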

print $lhs, "\t", $rhs, "\n";
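The same idea can be packaged more compactly. The following is a sketch under stated assumptions, not the answer's own code: the helper name equal_digits is hypothetical, and it assumes the dash of interest is the last one in the string. It captures the digit run on each side of that dash and trims both to the shorter length.

```perl
use strict;
use warnings;
use List::Util qw(min);

# Hypothetical helper, not part of the answer above.
sub equal_digits {
    my ($s) = @_;
    # Digit run ending at the last dash, and digit run starting right after it.
    my ($l, $r) = $s =~ /(\d*)-(\d*)[^-]*\z/ or return;
    my $n = min(length($l), length($r));
    # Keep the last $n digits on the left and the first $n on the right.
    return (substr($l, length($l) - $n), substr($r, 0, $n));
}

print join("\t", equal_digits('xyz45sd2-32d34-sd23-456562.abc.com')), "\n";
# prints 23 and 45, separated by a tab
```

The left side is cut with an explicit offset, length($l) - $n, because substr($l, -$n) would return the whole string when $n is 0.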