python-chardet
Port variant v14
Summary Universal character encoding detector (3.14)
Package version 7.4.0.post2
Homepage https://github.com/chardet/chardet
Keywords python
Maintainer Python Automaton
License Not yet specified
Other variants v13
Ravenports Buildsheet | History
Ravensource Port Directory | History
Last modified 30 MAR 2026, 13:06:33 UTC
Port created 30 MAY 2017, 20:17:50 UTC
Subpackage Descriptions
single

# chardet

Universal character encoding detector.

chardet 7 is a ground-up, 0BSD-licensed rewrite of chardet. Same package name, same public API: a drop-in replacement for chardet 5.x/6.x, just much faster and more accurate. Python 3.10+, zero runtime dependencies, works on PyPy.

## Why chardet 7?

- **99.3% accuracy** on 2,517 test files.
- **47x faster** than chardet 6.0.0 and **1.5x faster** than charset-normalizer 3.4.6.
- **Language detection** for every result.
- **MIME type detection** for binary files.
- **0BSD licensed.**

|                        | chardet 7.4.0 (mypyc) | chardet 6.0.0 | [charset-normalizer] 3.4.6 |
| ---------------------- | :-------------------: | :-----------: | :------------------------: |
| Accuracy (2,517 files) | **99.3%**             | 88.2%         | 85.4%                      |
| Speed                  | **551 files/s**       | 12 files/s    | 376 files/s                |
| Language detection     | **95.7%**             | 40.0%         | 59.2%                      |
| Peak memory            | **52.9 MiB**          | 29.5 MiB      | 78.8 MiB                   |
| Streaming detection    | **yes**               | yes           | no                         |
| Encoding era filtering | **yes**               | no            | no                         |
| Encoding filters       | **yes**               | no            | yes                        |
| MIME type detection    | **yes**               | no            | no                         |
| Supported encodings    | 99                    | 84            | 99                         |
| License                | 0BSD                  | LGPL          | MIT                        |

[charset-normalizer]: https://github.com/jawah/charset_normalizer

## Installation

```bash
pip install chardet
```

## Quick Start

```python
import chardet

chardet.detect(b"Python is a great programming language for beginners and experts alike.")
# {'encoding': 'ascii', 'confidence': 1.0, 'language': 'en', 'mime_type': 'text/plain'}

# UTF-8 English with accented characters
chardet.detect("The naïve approach doesn't always work in complex systems.".encode("utf-8"))
# {'encoding': 'utf-8', 'confidence': 0.84, 'language': 'en', 'mime_type': 'text/plain'}

# Japanese EUC-JP
chardet.detect("日本語の文字コード検出テストです。このテキストはEUC-JPでエンコードされています。正しく検出できるか確認します。".encode("euc-jp"))
# {'encoding': 'EUC-JP', 'confidence': 1.0, 'language': 'ja', 'mime_type': 'text/plain'}

# Get all candidate encodings ranked by confidence
text = "Le café est une boisson très populaire en France et dans le monde entier."
results = chardet.detect_all(text.encode("windows-1252"))
for r in results[:4]:
    print(r["encoding"], round(r["confidence"], 2))
# Windows-1252 0.32
# iso8859-15 0.32
# ISO-8859-1 0.32
# MacRoman 0.31
```

### Streaming Detection

For large files or network streams, use `UniversalDetector` to feed data incrementally:

```python
from chardet import UniversalDetector

detector = UniversalDetector()
with open("unknown.txt", "rb") as f:
    for line in f:
        detector.feed(line)
        if detector.done:
            break
```
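The feed loop above can be wrapped in a small helper. The following is a minimal sketch, not part of chardet itself: `detect_stream` is a hypothetical name, and it assumes the detector exposes `feed()`, `done`, `close()`, and `result` as the long-standing `UniversalDetector` interface does. If chardet 7 differs in any of these, adjust accordingly.

```python
from typing import Any, Iterable


def detect_stream(chunks: Iterable[bytes], detector: Any) -> Any:
    """Feed byte chunks to a detector until it reports it is done.

    `detector` is assumed to behave like chardet's UniversalDetector:
    feed(bytes), a boolean `done`, close(), and a `result` attribute.
    """
    for chunk in chunks:
        detector.feed(chunk)
        if detector.done:
            # The detector is confident enough; stop reading early.
            break
    # Finalize so that `result` reflects everything fed so far.
    detector.close()
    return detector.result


if __name__ == "__main__":
    try:
        from chardet import UniversalDetector
    except ImportError:
        print("chardet is not installed; run `pip install chardet` first")
    else:
        sample = "Le café est une boisson très populaire.\n".encode("utf-8") * 50
        # Split into 64-byte chunks to imitate a network stream.
        chunks = [sample[i:i + 64] for i in range(0, len(sample), 64)]
        print(detect_stream(chunks, UniversalDetector()))
```

Passing the detector in as an argument keeps the helper independent of how the bytes arrive: a file opened in binary mode, a socket reader, or a list of chunks all work as the `chunks` iterable.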
Configuration Switches (platform-specific settings discarded)
PY313 OFF Build using Python 3.13
PY314 ON Build using Python 3.14
Package Dependencies by Type
Build (only) python314:dev:std
python-pip:single:v14
autoselect-python:single:std
Build and Runtime python314:primary:std
Download groups
main mirror://PYPIWHL/94/d2/22ac0b5b832bb9d2f29311dcded6c09ad0c32c23e3e53a8033aad5eb8652
Distribution File Information
e0c9c6b5c296c0e5197bc8876fcc04d58a6ddfba18399e598ba353aba28b038e 625322 python-src/chardet-7.4.0.post2-py3-none-any.whl
Ports that require python-chardet:v14
No other ports depend on this one.