Internationalized Domain Names in Applications (IDNA)
=====================================================
This library supports Internationalized Domain Names in Applications
(IDNA) and Unicode IDNA Compatibility Processing (UTS #46).
The latest versions of these standards, as implemented here, provide
more comprehensive language coverage and reduce the risk of
accepting domains with known security vulnerabilities.

This library is a suitable replacement for the ``encodings.idna``
module that comes with the Python standard library, which supports
only an older, superseded IDNA specification from 2003.

Basic functions are simply executed:

.. code-block:: pycon

    >>> import idna
    >>> idna.encode('ドメイン.テスト')
    b'xn--eckwd4c7c.xn--zckzah'
    >>> print(idna.decode('xn--eckwd4c7c.xn--zckzah'))
    ドメイン.テスト

Installation
------------
This package is available for installation from PyPI via the
typical mechanisms, such as:

.. code-block:: bash

    $ python3 -m pip install idna

Usage
-----
For typical usage, the ``encode`` and ``decode`` functions take a
domain name argument and convert it to its ASCII-compatible encoding
(labels in this form are known as A-labels) or to a Unicode string
(labels in this form are known as U-labels), respectively.

.. code-block:: pycon

    >>> import idna
    >>> idna.encode('ドメイン.テスト')
    b'xn--eckwd4c7c.xn--zckzah'
    >>> print(idna.decode('xn--eckwd4c7c.xn--zckzah'))
    ドメイン.テスト

Conversions can be applied on a per-label basis using the ``ulabel`` or
``alabel`` functions if necessary:

.. code-block:: pycon

    >>> idna.alabel('测试')
    b'xn--0zwm56d'

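
To make the shape of an A-label concrete, the following sketch rebuilds
the results above using only the standard library's ``punycode`` codec.
The helper names (``naive_alabel``, ``naive_ulabel``, ``naive_encode``)
are hypothetical and not part of this library, and the sketch performs
none of the IDNA validation that ``idna.alabel`` and ``idna.encode``
apply. It only illustrates the encoding layout.

.. code-block:: python

    # Illustration only: no IDNA validation is performed here.
    ACE_PREFIX = b"xn--"  # prefix that marks a Punycode-encoded label


    def naive_alabel(label: str) -> bytes:
        """Hypothetical helper: convert one label to ASCII, unvalidated."""
        if label.isascii():
            return label.encode("ascii")
        return ACE_PREFIX + label.encode("punycode")


    def naive_ulabel(label: bytes) -> str:
        """Hypothetical helper: reverse naive_alabel() for one label."""
        if label.startswith(ACE_PREFIX):
            return label[len(ACE_PREFIX):].decode("punycode")
        return label.decode("ascii")


    def naive_encode(domain: str) -> bytes:
        """Apply naive_alabel() to each dot-separated label of a domain."""
        return b".".join(naive_alabel(part) for part in domain.split("."))


    print(naive_alabel("测试"))             # b'xn--0zwm56d'
    print(naive_ulabel(b"xn--0zwm56d"))     # 测试
    print(naive_encode("ドメイン.テスト"))  # b'xn--eckwd4c7c.xn--zckzah'

Because these helpers skip validation, they will happily encode labels
that the library would reject, so they should not be used in place of
it.
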
Compatibility Mapping (UTS #46)
+++++++++++++++++++++++++++++++
This library provides support for Unicode IDNA Compatibility
Processing, which normalizes the different ways a user may enter a
domain name before the IDNA conversion operations are performed.
This functionality, known as a mapping, is considered by the
specification to be a local user-interface issue distinct from IDNA
conversion functionality.

For example, “Königsgäßchen” is not a permissible label as *LATIN
CAPITAL LETTER K* is not allowed (nor are capital letters in general).
UTS #46 mapping converts the label to lower case before the IDNA
conversion is applied.

.. code-block:: pycon

    >>> import idna
    >>> idna.encode('Königsgäßchen')
    ...
    idna.core.InvalidCodepoint: Codepoint U+004B at position 1 of 'Königsgäßchen' not allowed
    >>> idna.encode('Königsgäßchen', uts46=True)
    b'xn--knigsgchen-b4a3dun'
    >>> print(idna.decode('xn--knigsgchen-b4a3dun'))
    königsgäßchen

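
For contrast, the standard library's ``idna`` codec (the
``encodings.idna`` module mentioned in the introduction) implements the
2003 specification. Its Nameprep step also lowercases the label, but it
additionally replaces *LATIN SMALL LETTER SHARP S* with ``ss``, so the
resulting A-label differs from the one shown above. The sketch below
uses only the standard library and is meant to show the difference, not
to recommend either behaviour.

.. code-block:: python

    label = "Königsgäßchen"

    # IDNA 2003 conversion via the standard library codec (encodings.idna).
    legacy = label.encode("idna")
    print(legacy)

    # Decoding reveals what Nameprep did to the input: the capital K was
    # lowercased and the sharp s was replaced by "ss".
    print(legacy.decode("idna"))  # königsgässchen
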
Exceptions
----------
Any error encountered during conversion, as defined by the
specification, should raise an exception derived from the
``idna.IDNAError`` base class.

More specific exceptions that may be generated are ``idna.IDNABidiError``
when the error reflects an illegal combination of left-to-right and
right-to-left characters in a label; ``idna.InvalidCodepoint`` when
a specific codepoint is an illegal character in an IDN label (i.e.
INVALID); and ``idna.InvalidCodepointContext`` when the codepoint is