match.regex module

This module provides functions for compiling a regular expression into a non-deterministic finite automaton (NFA) and checking if that NFA matches a given search string.

Supported Operators

The operators supported by this module are outlined below:

  • . - Concatenation. This operator must be written explicitly: the
    regular expression “h.e.l.l.o” is required to match the string “hello”.
  • | - A vertical bar represents alternation/union.
  • ? - The question mark indicates that the preceding character is
    optional (zero or one occurrences).
  • + - The plus symbol indicates one or more occurrences of the
    preceding character.
  • * - The “Kleene star” indicates zero or more occurrences of the
    preceding character.
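
Because concatenation must be spelled out with “.”, a pattern written in the more common implicit style has to be rewritten before it is passed to this module. The helper below is a minimal sketch of that rewrite. It is not part of match.regex: the name add_explicit_concat is hypothetical, and its handling of grouping parentheses is an assumption, since the operator list above does not mention them.

```python
def add_explicit_concat(pattern: str) -> str:
    """Insert '.' wherever two adjacent tokens are implicitly concatenated.

    Hypothetical helper, not part of match.regex. Assumes the input uses
    only literals, the operators | ? + *, and (as an assumption) grouping
    parentheses, and that it contains no '.' characters yet.
    """
    result = []
    for i, ch in enumerate(pattern):
        result.append(ch)
        if i + 1 < len(pattern):
            nxt = pattern[i + 1]
            # A concatenation point sits after anything except '(' or '|',
            # and before anything except ')', '|' or a postfix operator.
            if ch not in "(|" and nxt not in ")|?+*":
                result.append(".")
    return "".join(result)


print(add_explicit_concat("hello"))   # h.e.l.l.o
print(add_explicit_concat("ab*|c"))   # a.b*|c
```

The postfix operators ?, + and * stay attached to the character they follow, so only the gaps between complete operands receive a “.”.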
exception match.regex.InvalidRegexError(msg='Invalid regular expression', *args)

Bases: ValueError

Raised to indicate that a string is not a valid regular expression and therefore cannot be compiled into an NFA.

match.regex.compile_regex(regex: str) → match.states.Fragment

Compiles a given regular expression (in infix notation) to an NFA Fragment and returns it.

The compilation process first converts regex to postfix notation and then scans through the postfix expression while maintaining a stack of computed NFA fragments. When a literal character is read, a new NFA fragment representing that character is pushed onto the stack. When an operator is read, fragments are popped off the stack (one for a unary operator, two for a binary operator) and a new fragment representing the result of the operation is pushed back onto the stack. At the end, a single NFA Fragment representing the compiled regular expression is returned.
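
The steps described above can be sketched in a few dozen lines. This is a minimal illustration, not this module's actual source: State, Fragment, to_postfix, compile_postfix and full_match are hypothetical stand-ins (the real Fragment lives in match.states), the precedence table and the parenthesis handling are assumptions, and errors are reported with a plain ValueError rather than InvalidRegexError.

```python
from dataclasses import dataclass, field
from typing import Optional

# Assumed precedence: postfix operators bind tightest, then '.', then '|'.
PRECEDENCE = {"*": 3, "+": 3, "?": 3, ".": 2, "|": 1}


def to_postfix(infix: str) -> str:
    """Shunting-yard conversion; grouping parentheses are an assumption."""
    output, ops = [], []
    for ch in infix:
        if ch == "(":
            ops.append(ch)
        elif ch == ")":
            while ops and ops[-1] != "(":
                output.append(ops.pop())
            if not ops:
                raise ValueError("unbalanced parentheses")
            ops.pop()  # discard the matching "("
        elif ch in PRECEDENCE:
            while ops and ops[-1] != "(" and PRECEDENCE[ops[-1]] >= PRECEDENCE[ch]:
                output.append(ops.pop())
            ops.append(ch)
        else:
            output.append(ch)  # literal character
    while ops:
        op = ops.pop()
        if op == "(":
            raise ValueError("unbalanced parentheses")
        output.append(op)
    return "".join(output)


@dataclass(eq=False)  # identity-based hashing, so states can live in sets
class State:
    char: Optional[str] = None  # None: outgoing edges are epsilon moves
    edges: list = field(default_factory=list)


@dataclass
class Fragment:  # stand-in for match.states.Fragment
    start: State
    accept: State


def compile_postfix(postfix: str) -> Fragment:
    """Build an NFA from postfix input using a stack of fragments."""
    stack = []
    try:
        for ch in postfix:
            if ch == ".":  # binary: pop two, chain left into right
                right, left = stack.pop(), stack.pop()
                left.accept.edges.append(right.start)
                stack.append(Fragment(left.start, right.accept))
            elif ch == "|":  # binary: pop two, branch to either one
                right, left = stack.pop(), stack.pop()
                start, accept = State(edges=[left.start, right.start]), State()
                left.accept.edges.append(accept)
                right.accept.edges.append(accept)
                stack.append(Fragment(start, accept))
            elif ch in "?+*":  # unary: pop one
                frag, accept = stack.pop(), State()
                start = State(edges=[frag.start])
                if ch in "?*":  # zero occurrences allowed: bypass edge
                    start.edges.append(accept)
                frag.accept.edges.append(accept)
                if ch in "+*":  # repetition allowed: loop back
                    frag.accept.edges.append(frag.start)
                stack.append(Fragment(start, accept))
            else:  # literal: a two-state fragment consuming ch
                accept = State()
                stack.append(Fragment(State(char=ch, edges=[accept]), accept))
    except IndexError:
        raise ValueError("operator is missing an operand") from None
    if len(stack) != 1:
        raise ValueError("expression does not reduce to a single fragment")
    return stack.pop()


def closure(states):
    """Return the given states plus everything reachable via epsilon edges."""
    seen, todo = set(), list(states)
    while todo:
        state = todo.pop()
        if state not in seen:
            seen.add(state)
            if state.char is None:
                todo.extend(state.edges)
    return seen


def full_match(infix: str, s: str) -> bool:
    """True only if the whole of s is accepted by the compiled NFA."""
    frag = compile_postfix(to_postfix(infix))
    current = closure([frag.start])
    for ch in s:
        current = closure(t for st in current if st.char == ch for t in st.edges)
    return frag.accept in current
```

For example, to_postfix("a.b|c") yields "ab.c|", and full_match("h.e.l.l.o", "hello") is True while full_match("h.e.l.l.o", "hell") is False. Tracking a set of current states avoids backtracking: each input character is processed once, whatever the number of alternatives in the expression.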

Parameters: regex – The regular expression to compile.
Returns: A Fragment that represents the compiled regular expression.
Raises: InvalidRegexError – If regex is determined to be an invalid infix regular expression.
match.regex.match(regex: str, s: str) → bool

Returns True if the regular expression regex matches the entire string s, and False otherwise.

Parameters:
  • regex – The regular expression to match.
  • s – The string to check against the regular expression.
Returns: True if the string s matches the regular expression, and False otherwise.

Raises:
  • InvalidRegexError – If there is an error compiling the regular expression.
  • ValueError – If the regular expression is an empty string.