This regex implementation is backwards-compatible with the standard 're'
module, but offers additional functionality.
The re module's behaviour with zero-width matches changed in Python 3.7,
and this module will follow that behaviour when compiled for Python 3.7.
This module is targeted at CPython. It expects that all codepoints are the
same width, so it won't behave properly with PyPy outside U+0000..U+007F
because PyPy stores strings as UTF-8.
Old vs new behaviour
In order to be compatible with the re module, this module has 2 behaviours:
* **Version 0** behaviour (old behaviour, compatible with the re module):
Please note that the re module's behaviour may change over time, and I'll
endeavour to match that behaviour in version 0.
* Indicated by the VERSION0 or V0 flag, or ``(?V0)`` in the pattern.
* Zero-width matches are not handled correctly in the re module before
Python 3.7. The behaviour in those earlier versions is:
* ``.split`` won't split a string at a zero-width match.
* ``.sub`` will advance by one character after a zero-width match.
* Inline flags apply to the entire pattern, and they can't be turned off.
* Only simple sets are supported.
* Case-insensitive matches in Unicode use simple case-folding by default.
* **Version 1** behaviour (new behaviour, possibly different from the re
* Indicated by the VERSION1 or V1 flag, or ``(?V1)`` in the pattern.
* Zero-width matches are handled correctly.
* Inline flags apply to the end of the group or pattern, and they can be
* Nested sets and set operations are supported.
* Case-insensitive matches in Unicode use full case-folding by default.
If no version is specified, the regex module will default to
Case-insensitive matches in Unicode
The regex module supports both simple and full case-folding for
case-insensitive matches in Unicode. Use of full case-folding can be turned
on using the FULLCASE or F flag, or ``(?f)`` in the pattern. Please note
that this flag affects how the IGNORECASE flag works; the FULLCASE flag
itself does not turn on case-insensitive matching.
In the version 0 behaviour, the flag is off by default.
In the version 1 behaviour, the flag is on by default.
Nested sets and set operations
It's not possible to support both simple sets, as used in the re module,
and nested sets at the same time because of a difference in the meaning of
an unescaped ``"["`` in a set.
For example, the pattern ``[[a-z]--[aeiou]]`` is treated in the version 0
behaviour (simple sets, compatible with the re module) as:
* Set containing "[" and the letters "a" to "z"
* Literal "--"
* Set containing letters "a", "e", "i", "o", "u"
* Literal "]"
but in the version 1 behaviour (nested sets, enhanced behaviour) as:
* Set which is:
* Set containing the letters "a" to "z"
* but excluding:
Configuration Switches (platform-specific settings discarded)
PY38 ON Build using Python 3.8
PY39 OFF Build using Python 3.9
Package Dependencies by Type
Distribution File Information
f341ee2df0999bfdf7a95e448075effe0db212a59387de1a70690e4acb03d4c6 702813 regex-2021.11.10.tar.gz
Ports that require python-regex:py38