| primary |
This library provide fast Unicode functions such as
- ASCII, UTF-8, UTF-16LE/BE and UTF-32 validation, with and without
error identification,
- transcoding between each of Latin1, UTF-8, UTF-16LE/BE, and UTF-32,
with and without validation, with and without error identification
- From an UTF-8 string, compute the size of the Latin1/UTF-16/UTF-32
equivalent string,
- From an UTF-16LE/BE string, compute the size of the
Latin1/UTF-8/UTF-32 equivalent string,
- From an UTF-32 string, compute the size of the UTF-8 or UTF-16LE
equivalent string,
- UTF-8 and UTF-16LE/BE character counting.
- UTF-16 endianness change (UTF16-LE/BE to UTF-16-BE/LE)
- Base64 encoding and decoding
The functions are accelerated using SIMD instructions (e.g., ARM NEON,
SSE, AVX, AVX-512, etc.). When your strings contain hundreds of
characters, we can often transcode them at speeds exceeding a billion
characters per second. You should expect high speeds not only with
English strings (ASCII) but also Chinese, Japanese, Arabic, and so
forth. We handle the full character range (including, for example,
emojis).
simdutf compiles down to a small library of a few hundred kilobytes.
Our functions are exception-free and non allocating. We have extensive
tests and extensive benchmarks.
|