sourced.ml.core.algorithms.token_parser
Module Contents
class sourced.ml.core.algorithms.token_parser.TokenStyle
Bases: enum.Enum
Metadata that allows the original identifier to be reconstructed from a list of tokens.
class sourced.ml.core.algorithms.token_parser.TokenParser(stem_threshold=STEM_THRESHOLD, max_token_length=MAX_TOKEN_LENGTH, min_split_length=MIN_SPLIT_LENGTH, single_shot=False, save_token_style=False, attach_upper=True, use_nn=False, nn_model=None)
Common utilities for splitting and stemming tokens.
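To give a rough idea of what "splitting tokens" means for source code identifiers, here is a minimal, self-contained sketch. It does not use or reproduce TokenParser: the helper `split_identifier` and both regular expressions are hypothetical names introduced only for this illustration, and stemming, length limits, and token style metadata are all left out.

```python
import re

# Hypothetical helpers for illustration only; not part of sourced.ml.
_NON_ALPHA = re.compile(r"[^a-zA-Z]+")
# Either a run of capitals not followed by a lowercase letter (an acronym),
# or an optionally capitalized lowercase word.
_CAMEL = re.compile(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+")


def split_identifier(identifier):
    """Split an identifier into lowercase word parts.

    Non-letter characters (underscores, digits, dots) act as separators,
    then each remaining chunk is split on camelCase boundaries.
    """
    parts = []
    for chunk in _NON_ALPHA.split(identifier):
        parts.extend(m.group(0).lower() for m in _CAMEL.finditer(chunk))
    return parts


if __name__ == "__main__":
    print(split_identifier("getHTTPResponse_code2"))
    # → ['get', 'http', 'response', 'code']
```

Lowercasing every part discards the original casing and separators, so the identifier can no longer be rebuilt from the parts alone. That loss is the kind of information the TokenStyle metadata is described as preserving.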