BLOCXX_NAMESPACE::PosixRegEx Class Reference

POSIX Regular Expression wrapper class and utility functions.

#include <PosixRegEx.hpp>

Public Types

typedef regmatch_t match_t
 POSIX RegEx structure for captured substring offset pair.
typedef blocxx::Array< match_t > MatchArray
 Array of captured substring offsets.

Public Member Functions

 PosixRegEx ()
 Create a new PosixRegEx object without compilation.
 PosixRegEx (const String &regex, int cflags=REG_EXTENDED)
 Create a new PosixRegEx object and compile the regular expression.
 PosixRegEx (const PosixRegEx &ref)
 Create a new PosixRegEx as (deep) copy of the specified reference.
 ~PosixRegEx ()
 Destroy this PosixRegEx object.
PosixRegEx & operator= (const PosixRegEx &ref)
 Assign the specified PosixRegEx reference.
bool compile (const String &regex, int cflags=REG_EXTENDED)
 Compile the regular expression contained in the string.
int errorCode ()
 Return the last error code generated by compile or one of the executing methods.
String errorString () const
 Return the error message string for the last error code.
String patternString () const
int compileFlags () const
bool isCompiled () const
bool execute (MatchArray &sub, const String &str, size_t index=0, size_t count=0, int eflags=0)
 Execute regular expression matching against the string.
StringArray capture (const String &str, size_t index=0, size_t count=0, int eflags=0)
 Search in string and return an array of captured substrings.
String replace (const String &str, const String &rep, bool global=false, int eflags=0)
 Replace (substitute) the first or all matching substrings.
StringArray split (const String &str, bool empty=false, int eflags=0)
 Split the specified string into an array of substrings.
StringArray grep (const StringArray &src, int eflags=0)
 Match all strings in the array against the regular expression.
bool match (const String &str, size_t index=0, int eflags=0) const
 Execute regular expression matching against the string.

Private Attributes

bool compiled
int m_flags
int m_ecode
String m_error
String m_rxstr
regex_t m_regex


Detailed Description

POSIX Regular Expression wrapper class and utility functions.

Depends on the availability of a POSIX.2 / SUSv2 conforming implementation of the regcomp(3) and regexec(3) functions.

Consult the regcomp(3), regexec(3) and regex(7) manual pages for details on POSIX regular expression usage.
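
The dependency on regcomp(3) and regexec(3) can be illustrated with a small stand-alone wrapper. This is only a sketch under assumptions: MiniRegex and its members are hypothetical names, it uses std::string instead of blocxx::String, and it is not the blocxx implementation.

```cpp
#include <regex.h>
#include <string>

// Hypothetical minimal wrapper around the POSIX regex functions.
class MiniRegex {
public:
    explicit MiniRegex(const std::string& pattern, int cflags = REG_EXTENDED)
        : m_ecode(regcomp(&m_regex, pattern.c_str(), cflags)),
          m_compiled(m_ecode == 0)
    {}

    // regfree() may only be called on a successfully compiled regex_t.
    ~MiniRegex() { if (m_compiled) regfree(&m_regex); }

    // Copying is disabled here; a regex_t must not be copied bitwise.
    MiniRegex(const MiniRegex&) = delete;
    MiniRegex& operator=(const MiniRegex&) = delete;

    bool isCompiled() const { return m_compiled; }
    int  errorCode() const { return m_ecode; }

    // Report whether the pattern matches anywhere in str (no captures).
    bool match(const std::string& str) const
    {
        return m_compiled &&
               regexec(&m_regex, str.c_str(), 0, nullptr, 0) == 0;
    }

private:
    regex_t m_regex;
    int     m_ecode;
    bool    m_compiled;
};
```

With this sketch, MiniRegex("^[a-z]+=[0-9]+$").match("foo=42") reports a match, while a pattern with an unclosed parenthesis leaves the object uncompiled.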

Definition at line 58 of file PosixRegEx.hpp.


Member Typedef Documentation

typedef regmatch_t BLOCXX_NAMESPACE::PosixRegEx::match_t

POSIX RegEx structure for captured substring offset pair.

The regex match structure contains two member variables:

  • rm_so start offset of the regex match
  • rm_eo end offset of the regex match
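
As a sketch of how such an offset pair is typically consumed, the helper below turns one regmatch_t into the text it covers. The name matchedText is hypothetical, and std::string stands in for blocxx::String.

```cpp
#include <regex.h>
#include <string>

// Return the text covered by one offset pair.
// An unused pair (offsets set to -1) yields an empty string.
std::string matchedText(const std::string& str, const regmatch_t& m)
{
    if (m.rm_so < 0 || m.rm_eo < m.rm_so) {
        return std::string();
    }
    return str.substr(static_cast<std::size_t>(m.rm_so),
                      static_cast<std::size_t>(m.rm_eo - m.rm_so));
}
```

The end offset is exclusive, so the length of the match is rm_eo - rm_so.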

Definition at line 67 of file PosixRegEx.hpp.

typedef blocxx::Array< match_t > BLOCXX_NAMESPACE::PosixRegEx::MatchArray

Array of captured substring offsets.

Definition at line 72 of file PosixRegEx.hpp.


Constructor & Destructor Documentation

BLOCXX_NAMESPACE::PosixRegEx::PosixRegEx (  ) 

Create a new PosixRegEx object without compilation.

Definition at line 121 of file PosixRegEx.cpp.

BLOCXX_NAMESPACE::PosixRegEx::PosixRegEx ( const String & regex,
int  cflags = REG_EXTENDED 
)

Create a new PosixRegEx object and compile the regular expression.

See also compile() method.

Parameters:
regex A regular expression pattern.
cflags Bitwise-or of compilation flags.
Exceptions:
RegExCompileException on compilation failure.

Definition at line 130 of file PosixRegEx.cpp.

References BLOCXX_THROW_ERR, compile(), errorString(), and m_ecode.

BLOCXX_NAMESPACE::PosixRegEx::PosixRegEx ( const PosixRegEx & ref  ) 

Create a new PosixRegEx as (deep) copy of the specified reference.

If the reference is compiled, the new object will be compiled as well.

Parameters:
ref The PosixRegEx object reference to copy.
Exceptions:
RegExCompileException on compilation failure.
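
A deep copy by recompilation, as described for the copy constructor and the assignment operator, might look like the sketch below. CopyableRegex and its members are hypothetical, std::runtime_error stands in for RegExCompileException, and the real class may differ in detail.

```cpp
#include <regex.h>
#include <stdexcept>
#include <string>

// Hypothetical class: copies keep the pattern and flags and recompile them,
// because a compiled regex_t cannot safely be copied member by member.
class CopyableRegex {
public:
    CopyableRegex() : m_flags(0), m_compiled(false) {}

    explicit CopyableRegex(const std::string& rx, int cflags = REG_EXTENDED)
        : m_flags(0), m_compiled(false)
    {
        if (!compile(rx, cflags)) throw std::runtime_error("regcomp failed");
    }

    CopyableRegex(const CopyableRegex& ref)
        : m_rxstr(ref.m_rxstr), m_flags(ref.m_flags), m_compiled(false)
    {
        if (ref.m_compiled && !compile(ref.m_rxstr, ref.m_flags))
            throw std::runtime_error("regcomp failed");
    }

    CopyableRegex& operator=(const CopyableRegex& ref)
    {
        if (this == &ref) return *this;   // self-assignment: nothing to do
        release();
        m_rxstr = ref.m_rxstr;
        m_flags = ref.m_flags;
        if (ref.m_compiled && !compile(ref.m_rxstr, ref.m_flags))
            throw std::runtime_error("regcomp failed");
        return *this;
    }

    ~CopyableRegex() { release(); }

    bool compile(const std::string& rx, int cflags)
    {
        release();                        // drop a previously compiled regex
        m_rxstr = rx;
        m_flags = cflags;
        m_compiled = (regcomp(&m_regex, m_rxstr.c_str(), cflags) == 0);
        return m_compiled;
    }

    bool isCompiled() const { return m_compiled; }
    const std::string& pattern() const { return m_rxstr; }

    bool match(const std::string& str) const
    {
        return m_compiled &&
               regexec(&m_regex, str.c_str(), 0, nullptr, 0) == 0;
    }

private:
    void release()
    {
        if (m_compiled) { regfree(&m_regex); m_compiled = false; }
    }

    std::string m_rxstr;
    int         m_flags;
    bool        m_compiled;
    regex_t     m_regex;
};
```

Each object owns its own regex_t, so destroying a copy never frees the state of the original.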

Definition at line 144 of file PosixRegEx.cpp.

References BLOCXX_THROW_ERR, compile(), compiled, errorString(), m_ecode, m_flags, and m_rxstr.

BLOCXX_NAMESPACE::PosixRegEx::~PosixRegEx (  ) 

Destroy this PosixRegEx object.

Definition at line 159 of file PosixRegEx.cpp.

References compiled, and m_regex.


Member Function Documentation

PosixRegEx & BLOCXX_NAMESPACE::PosixRegEx::operator= ( const PosixRegEx & ref  ) 

Assign the specified PosixRegEx reference.

If the reference is compiled, the current object will be (re)compiled.

Parameters:
ref The PosixRegEx object reference to assign from.
Exceptions:
RegExCompileException on compilation failure.

Definition at line 170 of file PosixRegEx.cpp.

References BLOCXX_THROW_ERR, compile(), compiled, BLOCXX_NAMESPACE::String::erase(), errorString(), m_ecode, m_error, m_flags, m_regex, m_rxstr, and BLOCXX_NAMESPACE::REG_NOERROR.

bool BLOCXX_NAMESPACE::PosixRegEx::compile ( const String & regex,
int  cflags = REG_EXTENDED 
)

Compile the regular expression contained in the string.

Parameters:
regex A regular expression pattern.
cflags Bitwise-or of compilation flags.
Returns:
True on successful compilation, false on failure.
The cflags parameter can be set to a single flag or to a bitwise-or of the following flags. Consult the regcomp(3) manual page for the complete (library-specific) flag list and detailed descriptions.

  • REG_EXTENDED Use Extended Regular Expressions syntax instead of Basic.
  • REG_ICASE Ignore character case in match.
  • REG_NOSUB Report match only, do not capture substrings.
  • REG_NEWLINE Match-any-character operators don't match a newline.
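
The effect of these flags can be tried directly against regcomp(3). The helper below is a hypothetical sketch (matchesWith is not part of blocxx) that compiles a pattern with the given flags, tests one string, and frees the pattern again.

```cpp
#include <regex.h>
#include <string>

// Compile pattern with cflags, test it against text, then release it.
// Returns false both on compilation failure and when nothing matches.
bool matchesWith(const std::string& pattern, int cflags,
                 const std::string& text)
{
    regex_t rx;
    if (regcomp(&rx, pattern.c_str(), cflags) != 0) {
        return false;
    }
    const bool found = (regexec(&rx, text.c_str(), 0, nullptr, 0) == 0);
    regfree(&rx);
    return found;
}
```

For example, matchesWith("foo", REG_EXTENDED, "FOO") finds nothing, while adding REG_ICASE makes it match. Likewise "a.b" matches the string "a\nb" unless REG_NEWLINE is set.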

Definition at line 195 of file PosixRegEx.cpp.

References BLOCXX_NAMESPACE::String::c_str(), compiled, BLOCXX_NAMESPACE::String::erase(), BLOCXX_NAMESPACE::getError(), m_ecode, m_error, m_flags, m_regex, m_rxstr, and BLOCXX_NAMESPACE::REG_NOERROR.

Referenced by operator=(), and PosixRegEx().

int BLOCXX_NAMESPACE::PosixRegEx::errorCode (  ) 

Return the last error code generated by compile or one of the executing methods.

Returns:
0 or the last error code

Definition at line 222 of file PosixRegEx.cpp.

References m_ecode.

String BLOCXX_NAMESPACE::PosixRegEx::errorString (  )  const

Return the error message string for the last error code.

Returns:
The error message or empty string if no expression was compiled.
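
With the POSIX interface, an error code is usually translated into a message with regerror(3). The following sketch shows that step in isolation; compileError is a hypothetical helper, and how PosixRegEx stores the message internally is not shown here.

```cpp
#include <regex.h>
#include <string>
#include <vector>

// Try to compile pattern; return "" on success, otherwise the error text
// reported by the regex library for the failure code.
std::string compileError(const std::string& pattern,
                         int cflags = REG_EXTENDED)
{
    regex_t rx;
    const int ecode = regcomp(&rx, pattern.c_str(), cflags);
    if (ecode == 0) {
        regfree(&rx);
        return std::string();
    }
    // A call with size 0 returns the buffer size needed, including the NUL.
    const std::size_t need = regerror(ecode, &rx, nullptr, 0);
    std::vector<char> buf(need > 0 ? need : 1, '\0');
    regerror(ecode, &rx, buf.data(), buf.size());
    return std::string(buf.data());
}
```

The message text is library specific, so callers should compare error codes rather than strings.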

Definition at line 230 of file PosixRegEx.cpp.

References m_error.

Referenced by capture(), grep(), match(), operator=(), PosixRegEx(), replace(), and split().

String BLOCXX_NAMESPACE::PosixRegEx::patternString (  )  const

Returns:
The regular expression pattern string.

Definition at line 238 of file PosixRegEx.cpp.

References m_rxstr.

int BLOCXX_NAMESPACE::PosixRegEx::compileFlags (  )  const

Returns:
The compilation flags used in compile() method.

Definition at line 246 of file PosixRegEx.cpp.

References m_flags.

bool BLOCXX_NAMESPACE::PosixRegEx::isCompiled (  )  const

Returns:
true, if the current regex object is compiled.

Definition at line 254 of file PosixRegEx.cpp.

References compiled.

bool BLOCXX_NAMESPACE::PosixRegEx::execute ( MatchArray & sub,
const String & str,
size_t  index = 0,
size_t  count = 0,
int  eflags = 0 
)

Execute regular expression matching against the string.

The matching starts at the specified index and returns true on a match or false if no match is found.

Note:
In contrast to the (PCRE) PerlRegEx class, index handling is not provided by the POSIX regex functions. The PosixRegEx class uses a simple str.c_str() + index construct and adjusts the resulting match offsets.
The expected number of substrings to match can be specified in count. If the default value of 0 is used, the count detected by compile is used instead.
Note:
If the specified count is greater than 0 but smaller than the effective number of found matches, the match array will contain offsets only for the requested number of captured substrings. This differs from PerlRegEx, which reports a failure in this case. If the specified count is greater than 0 and greater than the effective number of found matches, the unused offsets at the end are set to -1.
If no match was found, the sub array will be empty and false is returned. If a match is found and the expression was compiled to capture substrings, the sub array will be filled with the captured substring offsets (match_t structures). The first (index 0) offset pair points to the start of the first match and the end of the last match. Unused / optional capturing subpattern offsets will be set to -1.

Consult the regexec(3) and regex(7) manual pages for complete and detailed descriptions.

Parameters:
sub array for substring offsets
str string to match
index match string starting at index
count number of expected substring matches
eflags execution flags described below
Returns:
true on match or false
Exceptions:
RegExCompileException if regex is not compiled.
OutOfBoundsException if the index is greater than the string length.
The eflags parameter can be set to 0, to a single option, or to a bitwise-or of the following options:

  • REG_NOTBOL The circumflex character (^) will not match the beginning of string.
  • REG_NOTEOL The dollar sign ($) will not match the end of string.
Example:
 String      str("foo = bar trala hoho");
 MatchArray  sub;
 if( PosixRegEx("=").execute(sub, str) && !sub.empty())
 {
   //
   // sub[0].rm_so is 4,
   // sub[0].rm_eo is 5
   //
 }
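
The index handling and the REG_NOTBOL flag described above can be sketched directly on top of regexec(3). This is an assumption-laden illustration: executeAt is a hypothetical free function, it always compiles with REG_EXTENDED, and it uses an assertion where the class throws OutOfBoundsException.

```cpp
#include <regex.h>
#include <cassert>
#include <string>
#include <vector>

// Search str starting at index and return offsets relative to the start
// of str. An empty result means no match (or a pattern that did not compile).
std::vector<regmatch_t> executeAt(const std::string& pattern,
                                  const std::string& str,
                                  std::size_t index = 0, int eflags = 0)
{
    assert(index <= str.size());
    std::vector<regmatch_t> sub;
    regex_t rx;
    if (regcomp(&rx, pattern.c_str(), REG_EXTENDED) != 0) {
        return sub;
    }
    // re_nsub counts the parenthesized groups; slot 0 is the whole match.
    sub.resize(rx.re_nsub + 1);
    if (regexec(&rx, str.c_str() + index, sub.size(), sub.data(),
                eflags) != 0) {
        sub.clear();
    } else {
        for (regmatch_t& m : sub) {
            if (m.rm_so >= 0) {           // unused groups stay at -1
                m.rm_so += static_cast<regoff_t>(index);
                m.rm_eo += static_cast<regoff_t>(index);
            }
        }
    }
    regfree(&rx);
    return sub;
}
```

Because the search simply begins at str.c_str() + index, a pattern such as "^bar" matches "foo = bar" at index 6 unless REG_NOTBOL is passed in eflags.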

Definition at line 262 of file PosixRegEx.cpp.

References BLOCXX_THROW, BLOCXX_NAMESPACE::String::c_str(), compiled, BLOCXX_NAMESPACE::String::erase(), BLOCXX_NAMESPACE::AutoPtrVec< X >::get(), BLOCXX_NAMESPACE::getError(), BLOCXX_NAMESPACE::String::length(), m_ecode, m_error, m_flags, m_regex, and BLOCXX_NAMESPACE::REG_NOERROR.

Referenced by capture(), replace(), and split().

StringArray BLOCXX_NAMESPACE::PosixRegEx::capture ( const String & str,
size_t  index = 0,
size_t  count = 0,
int  eflags = 0 
)

Search in string and return an array of captured substrings.

Parameters:
str string to search in
index match string starting at index
count expected substring count
eflags execution flags, see execute() method
Returns:
array of captured substrings
Exceptions:
RegExCompileException if regex is not compiled or the REG_NOSUB compilation flag was used.
RegExExecuteException on execute failures.
OutOfBoundsException if the index is greater than the string length.
Example:
 String      str("foo = bar trala hoho");
 PosixRegEx  reg("^([a-z]+)[ \t]*=[ \t]*(.*)$");
 StringArray out = reg.capture(str);
 //
 // out is { "foo = bar trala hoho",
 //          "foo",
 //          "bar trala hoho"
 //        }
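
For comparison, the same capture can be reproduced with plain regcomp(3) and regexec(3). The function captureSketch is hypothetical and only approximates what capture() is documented to return; it reports failures as an empty vector instead of throwing.

```cpp
#include <regex.h>
#include <string>
#include <vector>

// Return the whole match followed by each captured group.
// Groups that did not take part in the match become empty strings.
std::vector<std::string> captureSketch(const std::string& pattern,
                                       const std::string& str)
{
    std::vector<std::string> out;
    regex_t rx;
    if (regcomp(&rx, pattern.c_str(), REG_EXTENDED) != 0) {
        return out;
    }
    std::vector<regmatch_t> sub(rx.re_nsub + 1);
    if (regexec(&rx, str.c_str(), sub.size(), sub.data(), 0) == 0) {
        for (const regmatch_t& m : sub) {
            if (m.rm_so < 0) {
                out.push_back(std::string());
            } else {
                out.push_back(str.substr(
                    static_cast<std::size_t>(m.rm_so),
                    static_cast<std::size_t>(m.rm_eo - m.rm_so)));
            }
        }
    }
    regfree(&rx);
    return out;
}
```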

Definition at line 325 of file PosixRegEx.cpp.

References BLOCXX_THROW, BLOCXX_THROW_ERR, compiled, errorString(), execute(), m_ecode, match(), BLOCXX_NAMESPACE::Array< T >::push_back(), and BLOCXX_NAMESPACE::String::substring().

blocxx::String BLOCXX_NAMESPACE::PosixRegEx::replace ( const String & str,
const String & rep,
bool  global = false,
int  eflags = 0 
)

Replace (substitute) the first or all matching substrings.

Substrings matching the regular expression are replaced with the string provided in rep, and a new, modified string is returned. If no matches are found, a copy of the str string is returned.

The rep string can contain capturing references "\\1" to "\\9" that will be substituted with the corresponding captured string. Prepending "\\" to the reference disables (skips) the substitution. Note that the notation uses a double backslash followed by a digit character, not just "\1" as in the "\n" escape sequence.

Parameters:
str string that should be matched
rep replacement substring with optional references
global whether to replace all matches (true) or only the first one (false)
eflags execution flags, see execute() method
Returns:
new string with modification(s)
Exceptions:
RegExCompileException if regex is not compiled or the REG_NOSUB compilation flag was used.
RegExExecuteException on execute failures.
OutOfBoundsException if the index is greater than the string length.
Example:
 String      str("//foo/.//bar/hoho");
 PosixRegEx  reg("([/]+(\\.?[/]+)?)");
 String      out = reg.replace(str, "/", true);
 //
 // out is "/foo/bar/hoho"
 //
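
The replacement loop and the "\\1" to "\\9" references can be sketched with the POSIX functions as follows. Both function names are hypothetical, the sketch does not implement the escaping of references, and passing REG_NOTBOL after the first round is a choice of this sketch rather than documented behaviour of replace().

```cpp
#include <regex.h>
#include <string>
#include <vector>

// Expand "\1".."\9" in rep with the groups in sub (offsets relative to base).
std::string expandRefs(const std::string& rep, const char* base,
                       const std::vector<regmatch_t>& sub)
{
    std::string out;
    for (std::size_t i = 0; i < rep.size(); ++i) {
        const bool isRef = rep[i] == '\\' && i + 1 < rep.size() &&
                           rep[i + 1] >= '1' && rep[i + 1] <= '9';
        if (!isRef) { out += rep[i]; continue; }
        const std::size_t group = static_cast<std::size_t>(rep[++i] - '0');
        if (group < sub.size() && sub[group].rm_so >= 0)
            out.append(base + sub[group].rm_so, base + sub[group].rm_eo);
    }
    return out;
}

// Replace the first match, or all matches if global is true.
std::string replaceSketch(const std::string& pattern, const std::string& str,
                          const std::string& rep, bool global = false)
{
    regex_t rx;
    if (regcomp(&rx, pattern.c_str(), REG_EXTENDED) != 0) return str;
    std::vector<regmatch_t> sub(rx.re_nsub + 1);
    std::string out;
    std::size_t pos = 0;
    while (pos <= str.size()) {
        const char* base = str.c_str() + pos;
        // Later rounds do not start at the real beginning of the string.
        const int eflags = (pos > 0) ? REG_NOTBOL : 0;
        if (regexec(&rx, base, sub.size(), sub.data(), eflags) != 0) break;
        const std::size_t so = static_cast<std::size_t>(sub[0].rm_so);
        const std::size_t eo = static_cast<std::size_t>(sub[0].rm_eo);
        out.append(base, so);             // text in front of the match
        out += expandRefs(rep, base, sub);
        pos += eo;
        if (eo == so) {                   // empty match: step over one char
            if (pos < str.size()) out += str[pos];
            ++pos;
        }
        if (!global) break;
    }
    if (pos < str.size()) out.append(str, pos, std::string::npos);
    regfree(&rx);
    return out;
}
```

A call such as replaceSketch("([a-z]+)=([0-9]+)", "x=1, y=22", "\\2=\\1", true) swaps both sides of each assignment.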

Definition at line 370 of file PosixRegEx.cpp.

References BLOCXX_THROW, BLOCXX_THROW_ERR, compiled, BLOCXX_NAMESPACE::String::erase(), errorString(), execute(), BLOCXX_NAMESPACE::String::length(), m_ecode, m_error, match(), BLOCXX_NAMESPACE::REG_NOERROR, BLOCXX_NAMESPACE::substitute_caps(), and BLOCXX_NAMESPACE::String::substring().

StringArray BLOCXX_NAMESPACE::PosixRegEx::split ( const String & str,
bool  empty = false,
int  eflags = 0 
)

Split the specified string into an array of substrings.

The regular expression is used to match the separators.

If the empty flag is true, empty substrings are included in the resulting array.

If no separators were found and the empty flag is true, the array will contain the input string as its only element. If the empty flag is false, an empty array is returned.

Parameters:
str string that should be split
empty whether to capture empty substrings
eflags execution flags, see execute() method
Returns:
array of resulting substrings or empty array on failure
Exceptions:
RegExCompileException if regex is not compiled or the REG_NOSUB compilation flag was used.
RegExExecuteException on execute failures.
OutOfBoundsException if the index is greater than the string length.
Example:
 String      str("1.23, .50 , , 71.00 , 6.00");
 StringArray out1 = PosixRegEx("([ \t]*,[ \t]*)").split(str);
 //
 // out1 is { "1.23", ".50", "71.00", "6.00" }
 //
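
The role of the empty flag can be illustrated with a stand-alone splitter. The function splitSketch is hypothetical; it mirrors the behaviour described above for input without separators, but it is not the blocxx implementation and returns an empty array instead of throwing.

```cpp
#include <regex.h>
#include <string>
#include <vector>

// Split str at every match of pattern. Empty fields are kept only if
// the empty flag is true.
std::vector<std::string> splitSketch(const std::string& pattern,
                                     const std::string& str,
                                     bool empty = false)
{
    std::vector<std::string> out;
    regex_t rx;
    if (regcomp(&rx, pattern.c_str(), REG_EXTENDED) != 0) return out;

    regmatch_t m;
    std::size_t pos = 0;
    bool found = false;
    while (pos < str.size() &&
           regexec(&rx, str.c_str() + pos, 1, &m,
                   pos > 0 ? REG_NOTBOL : 0) == 0) {
        if (m.rm_eo == m.rm_so) break;    // empty separator cannot advance
        found = true;
        const std::string field =
            str.substr(pos, static_cast<std::size_t>(m.rm_so));
        if (empty || !field.empty()) out.push_back(field);
        pos += static_cast<std::size_t>(m.rm_eo);
    }
    regfree(&rx);

    if (!found) {
        // No separator: the whole input is returned only with empty == true.
        if (empty) out.push_back(str);
        return out;
    }
    const std::string tail = str.substr(pos);
    if (empty || !tail.empty()) out.push_back(tail);
    return out;
}
```

With empty set to true, the input from the example above yields five fields, the third one being the empty string between the two adjacent commas.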

Definition at line 422 of file PosixRegEx.cpp.

References BLOCXX_THROW, BLOCXX_THROW_ERR, compiled, BLOCXX_NAMESPACE::String::empty(), BLOCXX_NAMESPACE::String::erase(), errorString(), execute(), BLOCXX_NAMESPACE::String::length(), m_ecode, m_error, match(), BLOCXX_NAMESPACE::Array< T >::push_back(), BLOCXX_NAMESPACE::REG_NOERROR, and BLOCXX_NAMESPACE::String::substring().

StringArray BLOCXX_NAMESPACE::PosixRegEx::grep ( const StringArray & src,
int  eflags = 0 
)

Match all strings in the array against the regular expression.

Returns an array of matching strings.

Parameters:
src list of strings to match
eflags execution flags, see execute() method
Exceptions:
RegExCompileException if regex is not compiled or the REG_NOSUB compilation flag was used.
RegExExecuteException on execute failures.
OutOfBoundsException if the index is greater than the string length.
Example:
 StringArray src;
 src.push_back("\t");
 src.push_back("one");
 src.push_back("");
 src.push_back("two");
 src.push_back("  ");
 StringArray out = PosixRegEx("[^ \t]").grep(src);
 //
 // out is { "one", "two" }
 //
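
Filtering a list in this way needs only one compiled pattern that is reused for every element. The sketch below shows the idea with the POSIX functions; grepSketch is a hypothetical name and std::vector<std::string> stands in for StringArray.

```cpp
#include <regex.h>
#include <string>
#include <vector>

// Return the elements of src that match pattern, in their original order.
std::vector<std::string> grepSketch(const std::string& pattern,
                                    const std::vector<std::string>& src)
{
    std::vector<std::string> out;
    regex_t rx;
    if (regcomp(&rx, pattern.c_str(), REG_EXTENDED) != 0) {
        return out;
    }
    for (const std::string& s : src) {
        // No offsets are requested; only the match status matters here.
        if (regexec(&rx, s.c_str(), 0, nullptr, 0) == 0) {
            out.push_back(s);
        }
    }
    regfree(&rx);
    return out;
}
```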

Definition at line 479 of file PosixRegEx.cpp.

References BLOCXX_NAMESPACE::Array< T >::begin(), BLOCXX_THROW, BLOCXX_THROW_ERR, compiled, BLOCXX_NAMESPACE::Array< T >::empty(), BLOCXX_NAMESPACE::Array< T >::end(), BLOCXX_NAMESPACE::String::erase(), errorString(), BLOCXX_NAMESPACE::getError(), m_ecode, m_error, m_regex, BLOCXX_NAMESPACE::Array< T >::push_back(), and BLOCXX_NAMESPACE::REG_NOERROR.

bool BLOCXX_NAMESPACE::PosixRegEx::match ( const String & str,
size_t  index = 0,
int  eflags = 0 
) const

Execute regular expression matching against the string.

The matching starts at the specified index and returns true on a match or false if no match is found.

See execute() method for description of the index and eflags parameters.

Parameters:
str string to match
index match string starting at index
eflags execution flags, see execute() method
Returns:
true on match or false
Exceptions:
RegExCompileException if regex is not compiled.
RegExExecuteException on execute failures.
OutOfBoundsException if the index is greater than the string length.
Example:
 String      str("foo = bar ");
 if( PosixRegEx("^[a-z]+[ \t]*=[ \t]*.*$").match(str))
 {
   // str matches the pattern
 }

Definition at line 518 of file PosixRegEx.cpp.

References BLOCXX_THROW, BLOCXX_THROW_ERR, BLOCXX_NAMESPACE::String::c_str(), compiled, BLOCXX_NAMESPACE::String::erase(), errorString(), BLOCXX_NAMESPACE::getError(), BLOCXX_NAMESPACE::String::length(), m_ecode, m_error, m_regex, and BLOCXX_NAMESPACE::REG_NOERROR.

Referenced by capture(), replace(), and split().


Member Data Documentation

Definition at line 409 of file PosixRegEx.hpp.

Referenced by compile(), compileFlags(), execute(), operator=(), and PosixRegEx().

int BLOCXX_NAMESPACE::PosixRegEx::m_ecode [mutable, private]

Definition at line 411 of file PosixRegEx.hpp.

Referenced by compile(), errorString(), execute(), grep(), match(), operator=(), replace(), and split().

Definition at line 412 of file PosixRegEx.hpp.

Referenced by compile(), operator=(), patternString(), and PosixRegEx().

Definition at line 413 of file PosixRegEx.hpp.

Referenced by compile(), execute(), grep(), match(), operator=(), and ~PosixRegEx().


The documentation for this class was generated from the following files:

Generated on Wed Feb 25 19:05:08 2009 for blocxx by  doxygen 1.5.6