sourced.ml.core.models.id_splitter
Module Contents
-
class sourced.ml.core.models.id_splitter.IdentifierSplitterBiLSTM(**kwargs)
Bases: modelforge.Model
Bidirectional LSTM model. Splits identifiers without requiring a conventional naming pattern. Reference: https://arxiv.org/abs/1805.11651
-
construct(self, model: keras.models.Model, maxlen: int = DEFAULT_MAXLEN, padding: str = DEFAULT_PADDING, mapping: Dict[str, int] = DEFAULT_MAPPING, batch_size: int = DEFAULT_BATCH_SIZE)
Construct the IdentifierSplitterBiLSTM model.
Parameters:
- model – Keras model used for identifier splitting.
- maxlen – Maximum length of input identifiers.
- padding – Side on which identifiers shorter than maxlen are padded. Can be “left” or “right”.
- mapping – Mapping of characters to integers.
- batch_size – Batch size of the input data fed to the model.
Returns: BiLSTM-based source code identifier splitter.
-
prepare_input(self, identifiers: Sequence[str])
Prepare input by converting a sequence of identifiers into the corresponding 2D array of ASCII codes and a list of the lowercase, cleaned identifiers.
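The conversion described above can be illustrated with a minimal sketch. This is not the library's implementation: the function name prepare_input_sketch, the constants SKETCH_MAXLEN and SKETCH_MAPPING, the use of 0 as the padding value, the 1..26 character codes, and the truncation of over-long identifiers are all assumptions made for illustration. The real method may differ on every one of these points.

```python
import string
from typing import Dict, List, Sequence, Tuple

# Hypothetical stand-ins for DEFAULT_MAXLEN and DEFAULT_MAPPING;
# the actual module constants may hold different values.
SKETCH_MAXLEN = 8
SKETCH_MAPPING: Dict[str, int] = {
    char: code for code, char in enumerate(string.ascii_lowercase, start=1)
}


def prepare_input_sketch(
    identifiers: Sequence[str],
    maxlen: int = SKETCH_MAXLEN,
    padding: str = "left",
    mapping: Dict[str, int] = SKETCH_MAPPING,
) -> Tuple[List[List[int]], List[str]]:
    """Return (2D list of character codes, list of cleaned identifiers)."""
    if padding not in ("left", "right"):
        raise ValueError("padding must be 'left' or 'right'")
    rows: List[List[int]] = []
    cleaned: List[str] = []
    for identifier in identifiers:
        # Lowercase, drop characters missing from the mapping,
        # and truncate to maxlen (truncation is an assumption).
        clean = "".join(ch for ch in identifier.lower() if ch in mapping)[:maxlen]
        codes = [mapping[ch] for ch in clean]
        # Pad with zeros (assumed padding value) on the requested side.
        pad = [0] * (maxlen - len(codes))
        rows.append(pad + codes if padding == "left" else codes + pad)
        cleaned.append(clean)
    return rows, cleaned


rows, cleaned = prepare_input_sketch(["FooBar", "a_b"], maxlen=4, padding="left")
```

Every row has exactly maxlen entries, so the result can be fed to a model as a fixed-width batch, and the cleaned identifiers are returned alongside it so that predicted split points can be mapped back to characters.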
-