Utility Functions¶

ddot.utils.NdexGraph_to_nx(G)[source]¶

Converts an NdexGraph object into a NetworkX graph.

Parameters:	G (ndex.networkn.NdexGraph) –
Returns:
Return type:	networkx.classes.DiGraph

ddot.utils.bubble_layout_nx(G, xmin=-750, xmax=750, ymin=-750, ymax=750, verbose=False)[source]¶

Bubble-tree Layout using the Tulip library.

The input graph must be a tree. The layout is scaled so that it fits exactly within a bounding box.

Grivet, S., Auber, D., Domenger, J. P., & Melancon, G. (2006). Bubble tree drawing algorithm. In Computer Vision and Graphics (pp. 633-641). Springer Netherlands.

Parameters:

G (networkx.Graph) – Tree
xmin (float, optional) – Minimum x-coordinate of the bounding box
xmax (float, optional) – Maximum x-coordinate of the bounding box
ymin (float, optional) – Minimum y-coordinate of the bounding box
ymax (float, optional) – Maximum y-coordinate of the bounding box

Returns:	Dictionary mapping nodes to 2D coordinates. pos[node_name] -> (x,y)
Return type:	dict

ddot.utils.color_gradient(ratio, min_col='#FFFFFF', max_col='#D65F5F', output_hex=True)[source]¶: Calculate a proportional mix between two colors.
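As an illustration of what a proportional mix can look like, here is a minimal standard-library sketch. It assumes linear interpolation of the RGB channels, with ratio=0 giving min_col and ratio=1 giving max_col. The name mix_colors is hypothetical, and this is not ddot's implementation, which may blend or round differently.

```python
def mix_colors(ratio, min_col="#FFFFFF", max_col="#D65F5F"):
    """Linearly blend two hex colors (assumed behavior, for illustration only)."""

    def to_rgb(hex_col):
        # "#D65F5F" -> (214, 95, 95)
        hex_col = hex_col.lstrip("#")
        return tuple(int(hex_col[i:i + 2], 16) for i in (0, 2, 4))

    low, high = to_rgb(min_col), to_rgb(max_col)
    # Interpolate each channel independently, then round to an integer.
    mixed = tuple(round(a + ratio * (b - a)) for a, b in zip(low, high))
    return "#{:02X}{:02X}{:02X}".format(*mixed)


lightest = mix_colors(0.0)   # same as min_col
darkest = mix_colors(1.0)    # same as max_col
midpoint = mix_colors(0.5)   # a color between the two
```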

ddot.utils.create_edgeMatrix(X, X_cols, X_rows, verbose=True, G=None, ndex2=True)[source]¶

Converts a NumPy array into an NdexGraph with a special CX aspect called “edge_matrix”. The array is serialized using base64 encoding.

Parameters:

X (np.ndarray) –
X_cols (list) – Column names
X_rows (list) – Row names

Returns:
Return type:	ndex.networkn.NdexGraph

ddot.utils.expand_seed(seed, sim, sim_names, agg='mean', min_sim=-inf, filter_perc=None, seed_perc=None, agg_perc=0.5, expand_size=None, include_seed=True, figure=False, verbose=False)[source]¶

Identify genes that are most similar to a seed set of genes.

A gene is included in the expanded set only if it meets all of the specified criteria. If include_seed is True, then genes that are seeds will be included regardless of the criteria. At the same time, the number of genes returned is still limited by expand_size. One way to get n novel genes returned is therefore to set expand_size = n + |seed| and include_seed = True, and then to remove the seed list from expand.

Parameters:

seed (list) –
sim (np.ndarray) –
sim_names (list of str) –
agg (str or function) – Aggregation method. Possible values are mean, min, max, perc.
min_sim (float) – Minimum similarity to the seed set.
filter_perc (float) – Filter based on a percentile of similarities between all genes and the seed set.
seed_perc (float) – Filter based on a percentile of similarities between seed set to itself.
agg_perc (float) – The <agg_perc> percentile of similarities to the seed set. For example, if a gene has similarities of (0, 0.2, 0.4, 0.6, 0.8) to five seed genes, then the 10% similarity is 0.2
expand_size (int) – Maximum limit on the number of returned genes.
include_seed (bool) – Include the seed genes even if they didn’t meet the criteria.
figure (bool) – Generate a figure showing the average distances within the seed and the average distances between the seed and the background.

Returns:

expand – The list of expanded genes passing all filters.
expand_idx – Indices of the ranking. I.e. expand_idx[0] is the index of the top gene, so you can get the name of the top gene with sim_names[expand_idx[0]] where sim_names is the input parameter.
sim_2_seed – The returned array sim_2_seed is the calculated similarities of the genes to the seed set. So sim_2_seed[0] is the similarity of the gene
fig – The generated figure. Can be saved like this: plt.savefig('foo.pdf')
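The expand_size = n + |seed| recipe described above can be illustrated with a toy stand-in. The sketch below is not ddot's implementation: expand_sketch is a hypothetical helper that only mimics agg='mean' with include_seed=True on a small hand-made similarity matrix, and it ignores every other filter.

```python
def expand_sketch(seed, sim, sim_names, expand_size):
    """Toy stand-in for expand_seed with agg='mean' and include_seed=True."""
    seed_idx = [sim_names.index(g) for g in seed]
    # Mean similarity of every gene to the seed genes.
    mean_sim = [sum(row[j] for j in seed_idx) / len(seed_idx) for row in sim]
    ranked = sorted(range(len(sim_names)), key=lambda i: mean_sim[i], reverse=True)
    # Seeds are always kept; the remaining slots go to the best non-seed genes.
    others = [sim_names[i] for i in ranked if sim_names[i] not in seed]
    return list(seed) + others[: max(expand_size - len(seed), 0)]


sim_names = ["A", "B", "C", "D", "E"]
sim = [
    [1.0, 0.9, 0.2, 0.7, 0.1],
    [0.9, 1.0, 0.3, 0.6, 0.2],
    [0.2, 0.3, 1.0, 0.1, 0.8],
    [0.7, 0.6, 0.1, 1.0, 0.3],
    [0.1, 0.2, 0.8, 0.3, 1.0],
]
seed = ["A", "B"]
n_novel = 2

# Ask for n + |seed| genes, then drop the seeds to keep only novel genes.
expand = expand_sketch(seed, sim, sim_names, expand_size=n_novel + len(seed))
novel = [g for g in expand if g not in seed]
```

With the real function, the same pattern would apply to the returned expand list: request n + |seed| genes and filter out the seed afterwards.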

ddot.utils.gridify(parents, pos, G)[source]¶

Relayout leaf nodes into a grid.

Nodes must be connected and already laid out in “star”-like topologies. In each “star”, a set of nodes are positioned to form the shape of a circle and connect to a common parent node that is positioned at the circle’s center.

This function repositions the nodes in each star into a square grid that inscribes the circle.

Parameters:

parents (list) – For each parent, its children will be arranged in a grid.
pos (dict) – Dictionary that maps names of nodes to their (x,y) coordinates
G (nx.Graph) – Network

Returns:	Modifies <pos> in place
Return type:	None

ddot.utils.ig_edges_to_pandas(G, attr_list=None)[source]¶

Create a pandas.DataFrame of the edge attributes of an igraph.Graph object.

Parameters:

G (igraph.Graph) –
attr_list (list, optional) – Names of edge attributes. Default: all edge attributes

Returns:	DataFrame where the index is a MultiIndex with two levels (u,v) referring to edges and the columns refer to edge attributes.
Return type:	pandas.DataFrame

ddot.utils.ig_nodes_to_pandas(G, attr_list=None)[source]¶

Create a pandas.DataFrame of the node attributes of an igraph.Graph object.

Parameters:

G (igraph.Graph) –
attr_list (list, optional) – Names of node attributes. Default: all node attributes

Returns:	DataFrame where index is the names of nodes and the columns are node attributes.
Return type:	pandas.DataFrame

ddot.utils.ig_unfold_tree_with_attr(g, sources, mode)[source]¶: Call igraph.Graph.unfold_tree while preserving vertex and edge attributes.

ddot.utils.invert_dict(dic, sort=True, keymap={}, valmap={})[source]¶

Inverts a dictionary of the form

key1 : [val1, val2]
key2 : [val1]

to a dictionary of the form

val1 : [key1, key2]
val2 : [key1]

Parameters:	dic (dict) –
Returns:
Return type:	dict
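The inversion can be sketched in a few lines of plain Python. This is an illustration rather than ddot's code: invert_sketch is a hypothetical name, the keymap and valmap arguments are left out, and treating sort=True as "sort each list of keys" is an assumption.

```python
def invert_sketch(dic, sort=True):
    """Map each value to the list of keys whose lists contain it."""
    inverted = {}
    for key, values in dic.items():
        for val in values:
            inverted.setdefault(val, []).append(key)
    if sort:
        # Assumed meaning of sort=True: sort the keys collected for each value.
        inverted = {val: sorted(keys) for val, keys in inverted.items()}
    return inverted


inverted = invert_sketch({"key1": ["val1", "val2"], "key2": ["val1"]})
# inverted == {"val1": ["key1", "key2"], "val2": ["key1"]}
```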

ddot.utils.load_edgeMatrix(ndex_uuid, ndex_server, ndex_user, ndex_pass, ndex=None, json=None, verbose=True)[source]¶

Loads a NumPy array from an NdexGraph with a special CX aspect called “edge_matrix”.

Parameters:

ndex_uuid (str) – NDEx UUID of ontology
ndex_server (str) – URL of NDEx server
ndex_user (str) – NDEx username
ndex_pass (str) – NDEx password
json (module) – JSON module with “loads” function. Default: the simplejson package (must be installed)

Returns:

X (np.ndarray)
X_cols (list) – Column names
X_rows (list) – Row names

ddot.utils.make_index(it)[source]¶: Create a dictionary mapping elements of an iterable to the index position of that element
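A mapping of this kind is commonly built with enumerate. The sketch below shows the idea with a hypothetical make_index_sketch; how ddot handles repeated elements is not stated here, and in this sketch the last position wins.

```python
def make_index_sketch(it):
    """Map each element of an iterable to its position."""
    return {item: position for position, item in enumerate(it)}


index = make_index_sketch(["TP53", "BRCA1", "MYC"])
# index == {"TP53": 0, "BRCA1": 1, "MYC": 2}
```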

ddot.utils.make_seed_ontology(sim, sim_names, expand_kwargs={}, build_kwargs={}, align_kwargs={}, ndex_kwargs={}, node_attr=None, verbose=False, ndex=True)[source]¶

Assembles and analyzes a data-driven ontology to study a process or disease

Parameters:

sim (np.ndarray) – gene-by-gene similarity array
sim_names (array-like) – Names of genes as they appear in the rows and columns of <sim>
expand_kwargs (dict) – Parameters for ddot.expand_seed() to identify an expanded set of genes
build_kwargs (dict) – Parameters for Ontology.build_from_network(…) to build a data-driven ontology.
align_kwargs (dict) – Parameters for Ontology.align() to align against a reference ontology.
ndex_kwargs (dict) – Parameters for Ontology.to_ndex() to upload ontology to NDEx.
node_attr (pd.DataFrame) – A DataFrame of node attributes to assign to the ontology.
ndex (bool) – If True, then upload ontology to NDEx using parameters <ndex_kwargs>

ddot.utils.melt_square(df, columns=['Gene1', 'Gene2'], similarity='similarity', empty_value=0, upper_triangle=True)[source]¶

Melts square dataframe into sparse representation.

Parameters:

df (pandas.DataFrame) – Square-shaped dataframe where df[i,j] is the value of edge (i,j)
columns (iterable) – Column names for nodes in the output dataframe
similarity (string) – Column for edge value in the output dataframe
empty_value – Not yet supported
upper_triangle (bool) – Only use the values in the upper-right triangle (including the diagonal) of the input square dataframe

Returns:	3-column dataframe that provides a sparse representation of the edges. Two of the columns indicate the node name, and the third column indicates the edge value
Return type:	pandas.DataFrame
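To make the reshaping concrete, the following dependency-free sketch melts a small square matrix into (node, node, value) rows. It uses plain lists instead of a pandas.DataFrame, melt_square_sketch is a hypothetical name, and it keeps every cell it visits; whether ddot drops any values is not assumed here.

```python
def melt_square_sketch(names, matrix, upper_triangle=True):
    """Turn a square matrix into (gene1, gene2, value) rows."""
    rows = []
    for i, gene1 in enumerate(names):
        for j, gene2 in enumerate(names):
            # With upper_triangle=True, skip cells below the diagonal.
            if upper_triangle and j < i:
                continue
            rows.append((gene1, gene2, matrix[i][j]))
    return rows


names = ["A", "B", "C"]
matrix = [
    [1.0, 0.5, 0.0],
    [0.5, 1.0, 0.2],
    [0.0, 0.2, 1.0],
]
edges = melt_square_sketch(names, matrix)
```

For a 3-by-3 input, the upper triangle including the diagonal yields six rows; with upper_triangle=False all nine cells are emitted.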

ddot.utils.ndex_to_sim_matrix(ndex_url, ndex_server=None, ndex_user=None, ndex_pass=None, similarity=None, input_fmt='cx_matrix', output_fmt='matrix', subset=None, verbose=True)[source]¶

Read a similarity network from NDEx and return it as either a square np.array (compact representation) or a pandas.DataFrame of the non-zero similarity values (sparse representation)

Parameters:

ndex_url (str) – NDEx URL (or UUID) of ontology
ndex_server (str) – URL of NDEx server
ndex_user (str) – NDEx username
ndex_pass (str) – NDEx password
similarity (str) – Name of the edge attribute that represents the similarity/weight between two nodes. If None, then the edge attribute in the output is named ‘similarity’ and all edges are assumed to have a similarity value of 1.
input_fmt (str) –
output_fmt (str) – If ‘matrix’, return a NumPy array. If ‘sparse’, return a pandas.DataFrame
subset (optional) –

Returns:
Return type:	np.ndarray or pandas.DataFrame

ddot.utils.nx_edges_to_pandas(G, attr_list=None)[source]¶

Create pandas.DataFrame of edge attributes of a NetworkX graph.

Parameters:

G (networkx.Graph) –
attr_list (list, optional) – Names of edge attributes. Default: all edge attributes

Returns:	DataFrame where the index is a MultiIndex with two levels (u,v) referring to edges and the columns refer to edge attributes. For multi(di)graphs, the MultiIndex has three levels of the form (u, v, key).
Return type:	pandas.DataFrame

ddot.utils.nx_nodes_to_pandas(G, attr_list=None)[source]¶

Create pandas.DataFrame of node attributes of a NetworkX graph.

Parameters:

G (networkx.Graph) –
attr_list (list, optional) – Names of node attributes. Default: all node attributes

Returns:	DataFrame where index is the names of nodes and the columns are node attributes.
Return type:	pandas.DataFrame

ddot.utils.nx_to_NdexGraph(G_nx, discard_null=True)[source]¶

Converts a NetworkX graph into an NdexGraph object.

Parameters:	G_nx (networkx.Graph) –
Returns:
Return type:	ndex.networkn.NdexGraph

ddot.utils.parse_ndex_uuid(ndex_url)[source]¶

Extracts the NDEx UUID from a URL

Parameters:	ndex_url (str) – URL for a network stored on NDEx
Returns:	UUID of the network
Return type:	str
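One common way to pull a UUID out of a URL is a regular expression over the standard 8-4-4-4-12 hexadecimal layout. The sketch below takes that approach; it is not necessarily how ddot parses the URL, find_uuid is a hypothetical name, and the example URL and UUID are made up for illustration.

```python
import re

# Standard textual UUID layout: 8-4-4-4-12 hexadecimal digits.
UUID_PATTERN = re.compile(
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)


def find_uuid(url):
    """Return the first UUID-shaped substring of url, or raise ValueError."""
    match = UUID_PATTERN.search(url)
    if match is None:
        raise ValueError("no UUID found in %r" % url)
    return match.group(0)


# Hypothetical URL, used only to exercise the pattern.
example_url = "http://public.ndexbio.org/#/network/01234567-89ab-cdef-0123-456789abcdef"
uuid = find_uuid(example_url)
```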

ddot.utils.pivot_square(df, index, columns, values, fill_value=0)[source]¶

Convert a dataframe into a square compact representation.

Parameters:	df (pandas.DataFrame) – DataFrame in long-format where every row represents one gene pair
Returns:	df – DataFrame with gene-by-gene dimensions
Return type:	pandas.DataFrame
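The long-to-square conversion can be illustrated without pandas. In the sketch below, pivot_square_sketch is a hypothetical helper that takes (gene1, gene2, value) tuples and fills a list-of-lists matrix. It does not mirror values across the diagonal; whether ddot symmetrizes the result is not assumed.

```python
def pivot_square_sketch(rows, fill_value=0):
    """Build a gene-by-gene matrix from (gene1, gene2, value) rows."""
    # Every gene seen in either column gets one row and one column.
    names = sorted({g for g1, g2, _ in rows for g in (g1, g2)})
    pos = {name: i for i, name in enumerate(names)}
    matrix = [[fill_value] * len(names) for _ in names]
    for g1, g2, value in rows:
        matrix[pos[g1]][pos[g2]] = value
    return names, matrix


rows = [("A", "B", 0.5), ("B", "C", 0.2)]
names, matrix = pivot_square_sketch(rows)
# names == ["A", "B", "C"]; pairs that never appear keep fill_value
```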

ddot.utils.set_edge_attributes_from_pandas(G, edge_attr)[source]¶

Modify edge attributes according to a pandas.DataFrame.

Parameters:

G (networkx.Graph) –
edge_attr (pandas.DataFrame) –

ddot.utils.set_node_attributes_from_pandas(G, node_attr)[source]¶

Modify node attributes according to a pandas.DataFrame.

Parameters:

G (networkx.Graph) –
node_attr (pandas.DataFrame) –

ddot.utils.sim_matrix_to_NdexGraph(sim, names, similarity, output_fmt, node_attr=None)[source]¶

Convert similarity matrix into NdexGraph object

Parameters:

sim (np.ndarray) – Square-shaped NumPy array representing similarities
names (list) – Gene names, in the same order as the rows and columns of sim
similarity (str) – Edge attribute name for similarities in the resulting NdexGraph object
output_fmt (str) – Either ‘cx’ (Standard CX format), or ‘cx_matrix’ (custom edgeMatrix aspect)
node_attr (pandas.DataFrame, optional) – Node attributes, as a pandas.DataFrame, to be set in NdexGraph object

Returns:
Return type:	ndex.networkn.NdexGraph

ddot.utils.transform_pos(pos, xmin=-250, xmax=250, ymin=-250, ymax=250)[source]¶

Transforms coordinates to fit a bounding box.

Parameters:

pos (dict) – Dictionary mapping node names to (x,y) coordinates
xmin (float, optional) – Minimum x-coordinate of the bounding box
xmax (float, optional) – Maximum x-coordinate of the bounding box
ymin (float, optional) – Minimum y-coordinate of the bounding box
ymax (float, optional) – Maximum y-coordinate of the bounding box

Returns:	New dictionary with transformed coordinates
Return type:	dict
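A bounding-box fit of this kind can be sketched as a min-max rescale. The version below assumes each axis is stretched independently so that the layout touches all four sides of the box; ddot's actual transform may differ (for example, by preserving the aspect ratio), and fit_to_box is a hypothetical name.

```python
def fit_to_box(pos, xmin=-250, xmax=250, ymin=-250, ymax=250):
    """Rescale (x, y) positions so they span the given bounding box."""
    xs = [x for x, _ in pos.values()]
    ys = [y for _, y in pos.values()]

    def rescale(v, old_min, old_max, new_min, new_max):
        if old_max == old_min:
            # Degenerate axis: put every point at the middle of the box.
            return (new_min + new_max) / 2.0
        return new_min + (v - old_min) * (new_max - new_min) / (old_max - old_min)

    # Build a new dictionary; the input is left unchanged.
    return {
        node: (
            rescale(x, min(xs), max(xs), xmin, xmax),
            rescale(y, min(ys), max(ys), ymin, ymax),
        )
        for node, (x, y) in pos.items()
    }


pos = {"a": (0, 0), "b": (10, 5), "c": (5, 20)}
new_pos = fit_to_box(pos)
# new_pos == {"a": (-250.0, -250.0), "b": (250.0, -125.0), "c": (0.0, 250.0)}
```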

ddot.utils.update_nx_with_alignment(G, alignment, term_descriptions=None, use_node_name=True)[source]¶

Add node attributes to a NetworkX graph.

Parameters:

G – NetworkX object
alignment – pandas.DataFrame where the index is the name of terms, and where there are 3 columns: ‘Term’, ‘Similarity’, ‘FDR’
use_node_name (bool) –
term_descriptions (dict) –

Returns:
Return type:	None
