Function

GLib-2.0 / GLib / str_tokenize_and_fold (Since 2.40)

  • Tokenizes string and performs folding on each token.

    A token is a non-empty sequence of alphanumeric characters in the source string, separated by non-alphanumeric characters. An "alphanumeric" character for this purpose is one that matches GLib.unichar_isalnum or GLib.unichar_ismark.

    Each token is then (Unicode) normalised and case-folded. If ascii_alternates is non-NULL and some of the returned tokens contain non-ASCII characters, ASCII alternatives will be generated.
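    The token definition and folding step above can be approximated in plain TypeScript. This is only a sketch for building intuition, not GLib's implementation: `approxTokenizeAndFold` is a hypothetical helper, the Unicode classes `\p{L}`, `\p{N}` and `\p{M}` are assumed to stand in for GLib.unichar_isalnum and GLib.unichar_ismark, and NFKC normalisation plus `toLowerCase()` is assumed as a rough substitute for GLib's normalisation and case folding (true case folding differs for some characters, for example "ß").

    ```typescript
    // Hypothetical helper: an approximation of the documented behaviour,
    // NOT the GLib implementation.
    function approxTokenizeAndFold(source: string): string[] {
      // A token is a maximal run of letters, numbers and combining marks.
      const tokenPattern = new RegExp("[\\p{L}\\p{N}\\p{M}]+", "gu");
      const runs = source.match(tokenPattern) ?? [];
      // Assumption: NFKC + lower-casing stands in for normalise + case-fold.
      return runs.map((token) => token.normalize("NFKC").toLowerCase());
    }

    const tokens = approxTokenizeAndFold("Hello, Wörld! 42");
    console.log(tokens); // ["hello", "wörld", "42"]
    ```

    Punctuation and whitespace only separate tokens and never appear in the output, so a string with no alphanumeric characters yields an empty list.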

    The number of ASCII alternatives that are generated and the method for generating them are unspecified, but translit_locale (if specified) may improve the transliteration if the language of the source string is known.
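    Because the method is unspecified, the following is only one naive way an ASCII alternative could be derived, shown to illustrate the idea. `naiveAsciiAlternate` is a hypothetical helper, and the approach (decompose, then drop combining marks) is an assumption rather than what GLib does. A locale-aware transliterator might choose differently, for example preferring "oe" over "o" for a German "ö".

    ```typescript
    // Hypothetical helper, not part of GLib: derive an ASCII alternative by
    // decomposing the token (NFD) and removing combining marks.
    function naiveAsciiAlternate(token: string): string | null {
      const stripped = token
        .normalize("NFD")
        .replace(new RegExp("\\p{M}+", "gu"), "");
      const isAscii = /^[\x00-\x7f]*$/.test(stripped);
      // Only report an alternative if it is pure ASCII and differs from the input.
      return isAscii && stripped !== token ? stripped : null;
    }

    console.log(naiveAsciiAlternate("wörld")); // "world"
    console.log(naiveAsciiAlternate("hello")); // null (already ASCII)
    ```

    Tokens that cannot be reduced to ASCII this way, such as CJK text, simply produce no alternative in this sketch.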

    Parameters

    • string: string

      a string to tokenize

    • Optional translit_locale: string

      the language code (like 'de' or 'en_GB') from which string originates

    Returns [string[], string[]]

    the folded tokens, followed by the ASCII alternatives (ascii_alternates)
