utsuho.converters module
Converters for deterministic Japanese text normalization.
- class utsuho.converters.FullToHalfConverter(config: Optional[WidthConverterConfig] = None)
Bases: object
Full-width katakana to half-width katakana converter.
- Parameters:
config (WidthConverterConfig, optional) -- Configuration that controls whether non-katakana characters are also converted.
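To illustrate what a full-width to half-width katakana mapping involves, here is a minimal standard-library sketch. It is not utsuho's implementation and does not use its API; the name `full_to_half_sketch` is hypothetical, and the sketch assumes that voiced kana should expand to a base character followed by a half-width sound mark.

```python
import unicodedata

# Invert NFKC over half-width katakana (U+FF66..U+FF9F) to build a
# full-width -> half-width table. Stand-in only, NOT utsuho's implementation.
_FULL_TO_HALF = {
    unicodedata.normalize("NFKC", chr(cp)): chr(cp)
    for cp in range(0xFF66, 0xFFA0)
}


def full_to_half_sketch(text: str) -> str:
    """Convert full-width katakana to half-width; leave other characters alone."""
    out = []
    for ch in text:
        if "\u30a1" <= ch <= "\u30fc":
            # NFD splits a voiced kana into its base kana plus a combining
            # sound mark; each piece has its own half-width form.
            for part in unicodedata.normalize("NFD", ch):
                out.append(_FULL_TO_HALF.get(part, part))
        else:
            out.append(ch)
    return "".join(out)
```

Because one voiced full-width character becomes two half-width characters, the output can be longer than the input.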
- class utsuho.converters.HalfToFullConverter(config: Optional[WidthConverterConfig] = None)
Bases: object
Half-width katakana to full-width katakana converter.
- Parameters:
config (WidthConverterConfig, optional) -- Configuration that controls whether non-katakana characters are also converted.
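The reverse direction can be approximated with Unicode NFKC normalization restricted to the half-width katakana block. The following is a rough sketch under that assumption, not the library's code; `half_to_full_sketch` is a hypothetical name.

```python
import re
import unicodedata

# Runs of half-width katakana and half-width Japanese punctuation
# (U+FF61..U+FF9F). Stand-in only, NOT utsuho's implementation.
_HALF_KATAKANA_RUN = re.compile("[\uff61-\uff9f]+")


def half_to_full_sketch(text: str) -> str:
    """Widen half-width katakana runs; leave every other character alone."""
    # Normalizing whole runs lets NFKC merge a base kana and a following
    # half-width sound mark into a single precomposed full-width character.
    return _HALF_KATAKANA_RUN.sub(
        lambda m: unicodedata.normalize("NFKC", m.group()), text
    )
```

Applying NFKC to the whole string instead would also narrow full-width ASCII, which is why the sketch limits normalization to the matched runs.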
- class utsuho.converters.WidthConverterConfig(punctuation: bool = True, corner_brucket: bool = True, conjunction_mark: bool = True, length_mark: bool = True, space: bool = True, ascii_symbol: bool = True, ascii_alphabet: bool = True, ascii_digit: bool = True, wave_dash: bool = False)
Bases: object
Configuration for converting non-katakana characters.
- Parameters:
punctuation (bool, default=True) -- Whether to convert punctuation marks.
corner_brucket (bool, default=True) -- Whether to convert corner brackets. The parameter name is spelled "corner_brucket" in the signature.
conjunction_mark (bool, default=True) -- Whether to convert conjunction marks.
length_mark (bool, default=True) -- Whether to convert length marks.
space (bool, default=True) -- Whether to convert spaces.
ascii_symbol (bool, default=True) -- Whether to convert ASCII symbols.
ascii_alphabet (bool, default=True) -- Whether to convert ASCII letters.
ascii_digit (bool, default=True) -- Whether to convert ASCII digits.
wave_dash (bool, default=False) -- Whether to convert full-width wave dash to half-width tilde.
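As an illustration of how per-category flags like these could gate a conversion, here is a small self-contained sketch for the full-width to half-width direction. `WidthConfigSketch` and `narrow_ascii_sketch` are hypothetical stand-ins that mirror only a subset of the documented flags and defaults; how utsuho actually applies its configuration is an assumption here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class WidthConfigSketch:
    """Hypothetical stand-in mirroring a subset of the documented flags."""
    space: bool = True
    ascii_symbol: bool = True
    ascii_alphabet: bool = True
    ascii_digit: bool = True
    wave_dash: bool = False


def narrow_ascii_sketch(text: str, config: Optional[WidthConfigSketch] = None) -> str:
    """Narrow full-width ASCII, space, and wave dash according to the flags."""
    config = config or WidthConfigSketch()
    out = []
    for ch in text:
        cp = ord(ch)
        if ch == "\u3000":  # ideographic (full-width) space
            out.append(" " if config.space else ch)
        elif ch == "\u301c":  # wave dash
            out.append("~" if config.wave_dash else ch)
        elif 0xFF01 <= cp <= 0xFF5E:  # full-width ASCII block
            narrow = chr(cp - 0xFEE0)  # fixed offset to U+0021..U+007E
            if narrow.isdigit():
                enabled = config.ascii_digit
            elif narrow.isalpha():
                enabled = config.ascii_alphabet
            else:
                enabled = config.ascii_symbol
            out.append(narrow if enabled else ch)
        else:
            out.append(ch)
    return "".join(out)
```

With the defaults, full-width letters, digits, symbols, and spaces are narrowed while the wave dash is left alone; passing `WidthConfigSketch(ascii_digit=False)` keeps full-width digits, and `WidthConfigSketch(wave_dash=True)` turns the wave dash into a tilde.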