This Typst package provides automatic Cantonese segmentation and romanization
(Jyutping (粵拼) and Yale (耶魯)) by wrapping the rust-canto Rust crate as a
WebAssembly plugin. It integrates with the pycantonese-parser package to render
Cantonese text with ruby annotations.
- Automatic Segmentation: Breaks Cantonese sentences into meaningful words using a dictionary-based trie.
- Multiple Romanizations: Supports both Jyutping and Yale (numeric or diacritics).
- High Performance: Powered by a Rust-compiled WASM plugin for fast processing.
- Typst Integration: Provides a quick-render function that handles both segmentation and styling in one go.
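To give a feel for dictionary-based segmentation, here is a minimal sketch written directly in Typst. It is not the plugin's implementation: the `segment` function, the `max-len` parameter, and the toy `lexicon` are hypothetical, the lookup is a plain array search rather than a trie, and greedy longest-match is assumed as the matching strategy.

```typst
// Toy lexicon for illustration only; the real dictionary lives in the WASM plugin.
#let lexicon = ("廣東話", "好", "難學")

// Greedy longest-match segmentation over grapheme clusters (assumed strategy).
#let segment(s, max-len: 4) = {
  let chars = s.clusters()
  let words = ()
  let i = 0
  while i < chars.len() {
    // Try the longest candidate first, shrinking until the lexicon matches
    // or only a single character remains.
    let n = calc.min(max-len, chars.len() - i)
    while n > 1 and not lexicon.contains(chars.slice(i, i + n).join()) {
      n -= 1
    }
    words.push(chars.slice(i, i + n).join())
    i += n
  }
  words
}

// Show the segments separated by a bar.
#segment("廣東話好難學").join(" | ")
```

A trie serves the same purpose as the array lookup here, but finds the longest matching prefix in a single pass instead of testing each candidate length separately.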
To use this package, ensure the rust_canto.wasm file is in your project directory.
#import "@preview/auto-canto:0.2.3": quick-render
// 36pt font
// use Libertinus Serif first (for ruby text)
// before falling back to Noto Serif CJK HK (for Chinese characters)
#set text(36pt, font: ("Libertinus Serif", "Noto Serif CJK HK"))
// 1. Basic rendering (defaults to Jyutping)
#quick-render[都會大學入面3%人識用AB膠]
// 2. Rendering with Yale romanization
#quick-render(romanization: "yale")[
平時會成日睇書
]
// 3. Customizing the underlying parser's style
#let my-text = "廣東話好難學"
#let my-style = (rb-size: 0.7em, rb-color: blue)
#let quicker-renderer = quick-render.with(style: my-style, visual-tones: false)
#quicker-renderer(my-text)
Live demo on YouTube: https://youtu.be/ivUu91eDfvY
This package can render Jyutcitzi above Chinese characters, provided that the
user has imported the se-jyutcitzi Typst package.
To keep the dependencies clean, the user has to pass the jyutcitzi() function
from the se-jyutcitzi package to the jyutcit-ruby() function in this package.
#import "@preview/se-jyutcitzi:0.3.2": *
#import "@preview/auto-canto:0.2.3": *
// #set page(height: auto, width: auto, margin: 1pt)
#set text(24pt, font: "Chiron GoRound TC")
#set par(justify: true)
// Customize Jyutcitzi display
#let default-style = (
rb-color: rgb("#ff0000"), // Annotation text color
rb-size: 0.8em, // Annotation text size
word-sep: 0.2em, // Chinese words separation
char-jp-sep: 0.2em, // vertical space between words and Jyutping above
)
#let mytxt = [
你識唔識講廣東話?就算你識講廣東話都好,都可以遇到啲好𠮩𠹌嘅字,就算係粵語母語者都好,都未必識得寫,最後要用abcd先得,就好似「bibu車」噉。
所以,我呢個package一定幫到你。仲唔快啲下載?
]
#jyutcit-ruby(mytxt, jyutcitzi: jyutcitzi)
quick-render is the primary high-level function. It fetches data from the WASM plugin and forwards it to the parser.
- it: The item containing the Cantonese string to process.
- ..args: Named arguments forwarded to render-word-groups (e.g. romanization, style).
jyutcit-ruby renders Cantonese text with Jyutcitzi annotations above each word.
- Note: Requires the jyutcitzi function from the se-jyutcitzi package, passed as an argument.
- it: The item containing the Cantonese string to process.
- jyutcitzi: Named argument for the Jyutcitzi function.
- style: A dictionary with the following four keys:
  - rb-color: ruby text color
  - rb-size: ruby text size (in em)
  - word-sep: horizontal separation between words (in em)
  - char-jp-sep: vertical separation between ruby text and main text (in em)
Returns the raw segmented data as an array of dictionaries.
- Return format: array of {word: str, jyutping: str, yale: array}.
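As a rough illustration of working with data in this shape, the sketch below lays out a hand-written sample as a table. The `sample` array is hypothetical and is not actual plugin output; in particular, the contents of each `yale` array are an assumption, since the return format only states that the field is an array.

```typst
// Hand-written sample in the documented shape; not real plugin output.
// The element type of the `yale` arrays is an assumption.
#let sample = (
  (word: "廣東話", jyutping: "gwong2 dung1 waa2", yale: ("gwong2", "dung1", "wa2")),
  (word: "好", jyutping: "hou2", yale: ("hou2",)),
)

// One table row per segmented word: the word and its Jyutping.
#table(
  columns: 2,
  [Word], [Jyutping],
  ..sample.map(e => (e.word, e.jyutping)).flatten(),
)
```

Working with the raw array this way can help when the built-in renderers do not fit a layout, for example when building a vocabulary list instead of ruby text.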
Utility functions to convert space-delimited Jyutping strings into Yale format.
- numeric: "gwong2 dung1 waa2" → "gwong2 dung1 wa2".
- diacritics: "gwong2 dung1 waa2" → "gwóngdūngwá".
- lib.typ: The main entry point containing the Typst wrappers.
- rust_canto.wasm: The WebAssembly binary compiled from the rust-canto crate.
- typst.toml: Package metadata and dependencies.
MIT
Contributions are welcome! Please open an issue or submit a pull request.

