Convert a Simplified Chinese article to Taiwan Traditional for a regional site
Batch convert a folder of Markdown files with the opencc-py CLI
Add custom term rules on top of a built-in locale via converter_factory
Localize an HTML page while skipping script, style, and ignored elements
Requires Python 3.11 or newer; no other setup is needed.
opencc-py is a pure-Python library for converting Chinese text between regional variants such as Mainland Simplified, Hong Kong Traditional, Taiwan Traditional, and Japanese new-form characters. The author ported it from an earlier C# implementation and kept the original's dictionaries, locale and preset definitions, longest-match lookup approach, and multi-stage conversion flow. The package targets Python 3.11 or newer and has no runtime dependencies on other packages.
It ships with built-in dictionaries for six locales, identified by short codes: cn for Mainland Simplified, hk for Hong Kong Traditional, tw for Taiwan Traditional, tw2 for the Taiwan everyday-words variant, twp for Taiwan with extra IT terms and personal names, and jp for Japanese characters. There is also a pass-through code, t, that skips dictionary loading for that stage. Three presets are available: full, cn2t for Simplified to Traditional, and t2cn for the reverse.
Install with pip install opencc-py-tw2. The basic usage is to call converter(source_locale, target_locale), which returns an object you can call on a string. For example, converter("cn", "tw2")("a sentence in simplified characters") returns the Taiwan-form output.
Users can also pass their own dictionaries. The custom dictionary string format matches the C# version: each entry is a source followed by a target, entries are separated by a pipe character, and a tab can be used when the source or target contains a space. The README also documents a converter_factory function that chains multiple DictGroup objects in order, which is useful when you want to apply your own rules on top of a built-in locale.
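To make the longest-match lookup and the custom dictionary string more concrete, here is a minimal self-contained sketch in plain Python. It does not import or reproduce opencc-py. The names parse_dict and longest_match_convert are hypothetical, the four dictionary entries are illustrative only, and the assumption that source and target are separated by a space (or by a tab when either side contains a space) is inferred from the description above, not confirmed against the repo.

```python
def parse_dict(spec: str) -> dict[str, str]:
    """Parse 'source target|source target' into a mapping (assumed format)."""
    mapping: dict[str, str] = {}
    for entry in spec.split("|"):
        if not entry:
            continue
        # Prefer a tab separator so that either side may contain spaces.
        sep = "\t" if "\t" in entry else " "
        source, _, target = entry.partition(sep)
        mapping[source] = target
    return mapping


def longest_match_convert(text: str, mapping: dict[str, str]) -> str:
    """Replace the longest dictionary key that starts at each position."""
    if not mapping:
        return text
    max_len = max(len(key) for key in mapping)
    out: list[str] = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink the window.
        for length in range(min(max_len, len(text) - i), 0, -1):
            chunk = text[i:i + length]
            if chunk in mapping:
                out.append(mapping[chunk])
                i += length
                break
        else:
            # No key starts here, so copy one character unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)


# Illustrative entries only; the last one uses a tab separator.
rules = parse_dict("软件 軟體|软 軟|件 件|鼠标\t滑鼠")
result = longest_match_convert("软件和鼠标", rules)
print(result)
```

Trying the longest window first is what lets the two-character entry for 软件 win over the single-character entries for 软 and 件. In a multi-stage flow, a function like this would presumably run once per stage, each with its own dictionary.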
An HTML and XML converter is built on Python's standard xml.etree.ElementTree. It converts text inside elements whose lang attribute matches the requested range, along with meta description and keywords content, image alt attributes, and button input values. It skips script and style tags and any element with the ignore-opencc class. A command-line tool called opencc-py converts a file given source and target locales, and can write to an optional output path or edit the file in place. The project is MIT licensed.
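The skipping behaviour described above can be sketched with the standard library alone. This is a rough illustration under stated assumptions, not the library's implementation. The function convert_tree is a hypothetical name, str.upper stands in for a real converter so the effect is visible, and lang-range matching, meta content, and button input values are left out for brevity.

```python
import xml.etree.ElementTree as ET
from typing import Callable

SKIP_TAGS = {"script", "style"}
IGNORE_CLASS = "ignore-opencc"


def convert_tree(elem: ET.Element, convert: Callable[[str], str]) -> None:
    """Convert text in place, leaving skipped subtrees untouched."""
    classes = (elem.get("class") or "").split()
    skipped = elem.tag in SKIP_TAGS or IGNORE_CLASS in classes
    if skipped:
        return
    if elem.text:
        elem.text = convert(elem.text)
    alt = elem.get("alt")
    if elem.tag == "img" and alt:
        elem.set("alt", convert(alt))
    for child in elem:
        convert_tree(child, convert)
        # A child's tail is text of this element, so convert it
        # even when the child itself was skipped.
        if child.tail:
            child.tail = convert(child.tail)


page = ET.fromstring(
    '<div><p>abc<span class="ignore-opencc">def</span>ghi</p>'
    '<script>var x = "jkl";</script><img alt="mno" /></div>'
)
convert_tree(page, str.upper)  # str.upper is a stand-in converter
output = ET.tostring(page, encoding="unicode")
print(output)
```

The tail handling is the part that is easy to get wrong. In ElementTree, text that follows a child element is stored on that child, so it has to be converted by the parent even if the child is a skipped element.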
Generated 2026-05-22 · Model: sonnet-4-6 · Verify against the repo before relying on details.