sourced.ml.core.algorithms
Subpackages
Submodules
sourced.ml.core.algorithms.id_embedding
sourced.ml.core.algorithms.swivel
sourced.ml.core.algorithms.tf_idf
sourced.ml.core.algorithms.token_parser
sourced.ml.core.algorithms.uast_id_distance
sourced.ml.core.algorithms.uast_ids_to_bag
sourced.ml.core.algorithms.uast_inttypes_to_graphlets
sourced.ml.core.algorithms.uast_inttypes_to_nodes
sourced.ml.core.algorithms.uast_struct_to_bag
sourced.ml.core.algorithms.uast_to_bag
sourced.ml.core.algorithms.uast_to_id_sequence
sourced.ml.core.algorithms.uast_to_role_id_pairs
Package Contents
-
class sourced.ml.core.algorithms.UastIds2Bag(token2index=None, token_parser=None)
Bases: sourced.ml.core.algorithms.uast_ids_to_bag.UastTokens2Bag
Converts a UAST to a bag-of-identifiers.
-
XPATH = //*[@roleIdentifier]
-
-
class sourced.ml.core.algorithms.UastRandomWalk2Bag(p_explore_neighborhood=0.79, q_leave_neighborhood=0.82, n_walks=2, n_steps=10, stride=1, seq_len=(2, 3), seed=42)
Bases: sourced.ml.core.algorithms.uast_struct_to_bag.Uast2StructBagBase
-
class sourced.ml.core.algorithms.UastSeq2Bag(stride=1, seq_len=(3, 4), node2index=None)
Bases: sourced.ml.core.algorithms.uast_struct_to_bag.Uast2StructBagBase
Traverses the UAST depth-first (DFS), preserving the order of each node's children.
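The following is a minimal sketch of what an order-preserving DFS followed by windowing with stride and seq_len could look like. It is not the library's implementation: the Node class, the dfs_preorder and seq_bag helpers, and the ">" separator are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Iterator, List, Sequence


@dataclass
class Node:
    # Hypothetical stand-in for a UAST node; this is not the bblfsh.Node API.
    internal_type: str
    children: List["Node"] = field(default_factory=list)


def dfs_preorder(root: Node) -> Iterator[Node]:
    """Yield nodes depth-first, visiting children in their original order."""
    stack = [root]
    while stack:
        node = stack.pop()
        yield node
        # Push children reversed so the first child is popped first.
        stack.extend(reversed(node.children))


def seq_bag(root: Node, stride: int = 1, seq_len: Sequence[int] = (3, 4)) -> Counter:
    """Count windows of node types taken from the DFS sequence (assumed scheme)."""
    types = [node.internal_type for node in dfs_preorder(root)]
    bag = Counter()
    for length in seq_len:
        for start in range(0, len(types) - length + 1, stride):
            # The ">" separator is an assumption made for this sketch.
            bag[">".join(types[start:start + length])] += 1
    return bag


tree = Node("A", [Node("B", [Node("D")]), Node("C")])
bag = seq_bag(tree)
```

For the small tree above, the DFS order is A, B, D, C, so the bag holds two windows of length 3 and one of length 4.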
-
class sourced.ml.core.algorithms.Uast2QuantizedChildren(npartitions: int = 20)
Bases: sourced.ml.core.algorithms.uast_to_bag.Uast2BagThroughSingleScan
Converts a UAST to a bag of children counts.
-
node2key(self, node: bblfsh.Node)
Return the key for a given Node.
Parameters: node – a node in the UAST.
Returns: The string which consists of the internal type of the node and its number of children.
-
quantize(self, frequencies: Iterable[Tuple[str, Iterable[Tuple[int, int]]]])
-
quantize_unwrapped(self, children_freq: Iterable[Tuple[int, int]])
Builds the quantization partition P, a vector of length npartitions whose entries are in strictly ascending order. The quantization of x is defined as:
0 if x <= P[0]
m if P[m-1] < x <= P[m]
n if P[n] <= x
Parameters: children_freq – distribution of the number of children.
Returns: The array with quantization levels.
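The quantization rule above can be sketched with a binary search over the partition. This is only an illustration of how a value is mapped to a level once P is known; it does not show how quantize_unwrapped builds P. The quantize_value helper is a hypothetical name, and n is assumed to be the last index of P.

```python
from bisect import bisect_left
from typing import Sequence


def quantize_value(x: int, partition: Sequence[int]) -> int:
    """Map x to a quantization level, given an ascending partition P."""
    # bisect_left returns the first index m with partition[m] >= x,
    # which is the m that satisfies P[m-1] < x <= P[m] (and 0 if x <= P[0]).
    m = bisect_left(partition, x)
    # Values above the last entry are clamped to the last index n (assumption).
    return min(m, len(partition) - 1)


P = [1, 3, 7]
levels = [quantize_value(x, P) for x in (0, 1, 2, 3, 5, 7, 100)]
```

With P = [1, 3, 7], values up to 1 fall into level 0, values 2 and 3 into level 1, and everything above 3 into level 2.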
-
-
class sourced.ml.core.algorithms.Uast2GraphletBag
Bases: sourced.ml.core.algorithms.uast_ids_to_bag.Uast2BagBase
Converts a UAST to a bag of graphlets. The graphlet of a UAST node is composed of the node itself, its parent and its children. Each node is represented by the internal role string.
-
uast2graphlets(self, uast)
Parameters: uast – The UAST root node.
Generate: The nodes which compose the UAST. The Node class is used to access the nodes of the graphlets.
-
node2key(self, node)
Builds the string that joins the internal types of all the nodes in the node's graphlet, in the following order: parent_node_child1_child2_child3. The children are sorted alphabetically. The str format is required by BagsExtractor.
Parameters: node – a node of the UAST.
Returns: The string key of the node.
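As a rough sketch of the key layout described above, the snippet below joins the parent type, the node type, and the alphabetically sorted child types with underscores. GraphletNode and graphlet_key are hypothetical names, and the empty parent slot for the root node is an assumption; the library may handle the root differently.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class GraphletNode:
    # Hypothetical node type for illustration; not the library's Node class.
    internal_type: str
    parent: Optional["GraphletNode"] = field(default=None, repr=False, compare=False)
    children: List["GraphletNode"] = field(default_factory=list)


def graphlet_key(node: GraphletNode) -> str:
    """Build a parent_node_child1_child2... key with sorted children."""
    # Assumption: a root node contributes an empty parent slot.
    parent_type = node.parent.internal_type if node.parent is not None else ""
    child_types = sorted(child.internal_type for child in node.children)
    return "_".join([parent_type, node.internal_type] + child_types)


block = GraphletNode("Block")
if_node = GraphletNode("If", parent=block)
if_node.children = [
    GraphletNode("Then", parent=if_node),
    GraphletNode("Cond", parent=if_node),
]
block.children = [if_node]
key = graphlet_key(if_node)
```

Sorting the children makes the key independent of the order in which they appear in the tree, so structurally similar graphlets map to the same bag entry.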
-
-
class sourced.ml.core.algorithms.Uast2RoleIdPairs(token2index=None, token_parser=None)
Bases: sourced.ml.core.algorithms.uast_ids_to_bag.UastIds2Bag
Converts a UAST to a list of pairs. Each pair consists of an identifier and a role, where the role is that of the node in which the identifier was found.
__call__ is overridden here and returns a list instead of a bag-of-words (dict).
-
static merge_roles(roles: Iterable[int])
-
static
-
class sourced.ml.core.algorithms.Uast2IdLineDistance
Bases: sourced.ml.core.algorithms.uast_id_distance.Uast2IdDistance
Converts a UAST to a list of identifier pairs and the code line distance between them, where applicable.
__call__ is overridden here and returns a list instead of a bag-of-words (dict).
-
distance(self, point1, point2)
-
-
class sourced.ml.core.algorithms.Uast2IdTreeDistance
Bases: sourced.ml.core.algorithms.uast_id_distance.Uast2IdDistance
Converts a UAST to a list of identifier pairs and the UAST tree distance between them.
__call__ is overridden here and returns a list instead of a bag-of-words (dict).
-
distance(self, point1, point2)
-
static calc_tree_distance(last_common_level, level1, level2)
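The page does not document how the tree distance is computed. One common definition, sketched below purely as an assumption, counts the edges from each node up to the deepest common ancestor. The helper names tree_distance_sketch and last_common_level_of are hypothetical, and the root-to-node paths stand in for whatever bookkeeping the library actually uses.

```python
from typing import Sequence


def tree_distance_sketch(last_common_level: int, level1: int, level2: int) -> int:
    """Edges from node 1 up to the common ancestor plus edges down to node 2."""
    return (level1 - last_common_level) + (level2 - last_common_level)


def last_common_level_of(path1: Sequence[str], path2: Sequence[str]) -> int:
    """Depth of the deepest shared ancestor, with the root at level 0.

    Assumes both paths start at the same root node.
    """
    shared = 0
    for a, b in zip(path1, path2):
        if a != b:
            break
        shared += 1
    return shared - 1


# Root-to-node paths for two identifiers in a toy tree (illustrative data).
path1 = ["root", "a", "b"]
path2 = ["root", "a", "c", "d"]
common = last_common_level_of(path1, path2)
dist = tree_distance_sketch(common, len(path1) - 1, len(path2) - 1)
```

Here the deepest common ancestor is "a" at level 1, the two nodes sit at levels 2 and 3, and the resulting distance is 3 edges.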
-
-
class sourced.ml.core.algorithms.Uast2IdSequence
Bases: sourced.ml.core.algorithms.uast_id_distance.Uast2IdLineDistance
Converts a UAST to a sorted sequence of identifiers. Identifiers are sorted by their position in the code. The order is left unchanged if positions are not present.
__call__ is overridden here and returns a list instead of a bag-of-words (dict).
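The ordering behavior described above might be sketched as follows. This is an illustration under stated assumptions, not the library's code: positions are modeled as hypothetical (line, column) tuples, order_identifiers is an invented helper, and "positions are not present" is interpreted as any identifier lacking a position.

```python
from typing import List, Optional, Tuple

# Hypothetical position representation: (line, column).
Position = Tuple[int, int]


def order_identifiers(ids: List[Tuple[str, Optional[Position]]]) -> List[str]:
    """Sort identifiers by position, or keep the input order if any is missing."""
    if any(pos is None for _, pos in ids):
        # Assumption: without complete position data, the order is untouched.
        return [name for name, _ in ids]
    # sorted() is stable, so identifiers at the same position keep their order.
    return [name for name, _ in sorted(ids, key=lambda item: item[1])]


with_positions = order_identifiers([("b", (2, 1)), ("a", (1, 5)), ("c", (1, 9))])
without_positions = order_identifiers([("b", None), ("a", (1, 5))])
```

In the first call the identifiers come back in source order; in the second the input order is preserved because one position is missing.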
-
static concat(id_sequence: Iterable)
-
static