Regex question

Regex question

Bugzilla from peter_sliepenbeek@yahoo.com

First some examples:

--

std::string input = "u";
std::string expression = "[^UuMmFf]*";
std::string replacement = "Unknown";
boost::regex regular_expression(expression);
std::string output = boost::regex_replace(input, regular_expression,
                                          replacement, boost::match_default);

Boost: output = "UnknownuUnknown"

$ echo 'u' | sed 's/[^UuMmFf]*/Unknown/g'
UnknownuUnknown

Expected.

--

std::string input = "x";
std::string expression = "[^UuMmFf]*";
std::string replacement = "Unknown";
boost::regex regular_expression(expression);
std::string output = boost::regex_replace(input, regular_expression,
                                          replacement, boost::match_default);

Boost 1.33.1: output = "UnknownUnknown"
Prior: "Unknown"

$ echo 'x' | sed 's/[^UuMmFf]*/Unknown/g'
Unknown

The result generated by Boost 1.33.1 is not expected!

--

std::string input = " ";
std::string expression = "[^UuMmFf]*";
std::string replacement = "Unknown";
boost::regex regular_expression(expression);
std::string output = boost::regex_replace(input, regular_expression,
                                          replacement, boost::match_default);

Boost 1.33.1: output = "UnknownUnknown"
Prior: "Unknown"

$ echo ' ' | sed 's/[^UuMmFf]*/Unknown/g'
Unknown

The result generated by Boost 1.33.1 is not expected!

--

The expression "[^UuMmFf]*" will greedily match the input string. In the
case of "x" or " " (space) the expression will match the whole string! There
should not be a match on an empty string!
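
One way to see which matches a global replace actually visits is to enumerate them with a regex iterator. The following is a minimal sketch, not the code used in the examples above: it assumes std::regex (default ECMAScript grammar) as a stand-in for boost::regex, and list_matches is a hypothetical helper name.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: returns (offset, length) for every match that a
// global search-and-replace would visit, in order.
// Assumption: std::regex is used here instead of boost::regex.
std::vector<std::pair<std::size_t, std::size_t>>
list_matches(const std::string& input, const std::string& expression)
{
    std::vector<std::pair<std::size_t, std::size_t>> spans;
    const std::regex re(expression);
    for (std::sregex_iterator it(input.begin(), input.end(), re), end;
         it != end; ++it) {
        spans.emplace_back(static_cast<std::size_t>(it->position(0)),
                           static_cast<std::size_t>(it->length(0)));
    }
    return spans;
}
```

Under that assumption, list_matches("x", "[^UuMmFf]*") should report the whole string (offset 0, length 1) followed by an empty match at offset 1, and list_matches("u", "[^UuMmFf]*") should report two empty matches, at offsets 0 and 1. The trailing empty match for "x" is the one in dispute here.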

Considering these examples I would mark this as a bug!

Thanks,

P. Sliepenbeek

_______________________________________________
Boost-users mailing list
[hidden email]
http://lists.boost.org/mailman/listinfo.cgi/boost-users

Re: Regex question

John Maddock
>The expression "[^UuMmFf]*" will greedily match the input string. In the
>case of "x" or " " (space) the expression will match the whole string! There
>should not be a match on an empty string!
>
>Considering these examples I would mark this as a bug!

The point here is that Boost-1.33.x aims to be Perl compatible, and wait for
it...

** sed is not the same as perl **

Perl *does* match an additional empty string immediately after a previous
match.

Rightly or wrongly the behaviour is by design.
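
The difference between the two behaviours can be modelled without any regex library. The sketch below hard-codes the pattern "[^UuMmFf]*" and exposes the one rule that differs: whether an empty match is accepted immediately after a previous non-empty match. It is an illustrative model only, not the implementation of Boost, Perl, or sed; in_class, replace_all, and the allow_empty_after_match flag are hypothetical names.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: true for characters matched by the class [^UuMmFf].
static bool in_class(char c)
{
    return std::string("UuMmFf").find(c) == std::string::npos;
}

// Replaces every match of "[^UuMmFf]*" in input with replacement.
// allow_empty_after_match == true  models the Perl-style rule described above.
// allow_empty_after_match == false models the rule seen in the sed runs.
std::string replace_all(const std::string& input,
                        const std::string& replacement,
                        bool allow_empty_after_match)
{
    std::string out;
    std::size_t pos = 0;
    bool prev_match_ended_here = false;
    while (true) {
        // Greedy scan: extend the match as far as the class allows.
        std::size_t end = pos;
        while (end < input.size() && in_class(input[end])) ++end;

        if (end > pos) {
            // Non-empty match: replace it and retry at its end.
            out += replacement;
            pos = end;
            prev_match_ended_here = true;
            continue;
        }
        // Empty match at pos: the two policies differ only here.
        if (allow_empty_after_match || !prev_match_ended_here)
            out += replacement;
        if (pos == input.size()) break;
        out += input[pos++];          // copy the unmatched character
        prev_match_ended_here = false;
    }
    return out;
}
```

With this model, replace_all("x", "Unknown", true) yields "UnknownUnknown" and replace_all("x", "Unknown", false) yields "Unknown", while "u" yields "UnknownuUnknown" under both policies. That matches the outputs reported in the examples.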

John.
