pbcluster package

Top-level package for PBCluster.

Submodules

pbcluster.cluster module

Cluster module.

class pbcluster.cluster.Cluster(graph, particle_df, box_lengths, cutoff_distance)[source]

Bases: object

Object to store and compute data about an individual particle cluster

Parameters:
  • graph (networkx Graph) – Contains nodes and edges corresponding to particles and bonds, respectively, where a bond implies the particles are within a distance of cutoff_distance from each other.
  • particle_df (dataframe) – Dataframe where index is particle_id, and there are n_dimensions columns labelled x0, x1, … xN
  • box_lengths (ndarray) – Must contain n_dimensions values representing the lengths of each dimension of a rectangular box.
  • cutoff_distance (float) – Maximum distance two particles can be from each other to be considered part of the same cluster
graph

Contains nodes and edges corresponding to particles and bonds, respectively, where a bond implies the particles are within a distance of cutoff_distance from each other.

Type:networkx Graph
particle_df

Dataframe where index is particle_id, and there are n_dimensions columns labelled x0, x1, … xN

Type:dataframe
box_lengths

Must contain n_dimensions values representing the lengths of each dimension of a rectangular box.

Type:ndarray
cutoff_distance

Maximum distance two particles can be from each other to be considered part of the same cluster

Type:float
n_dimensions

Number of dimensions in the system

Type:int
n_particles

Number of particles in the cluster

Type:int
compute_asphericity()[source]

Returns cluster asphericity (see https://en.wikipedia.org/wiki/Gyration_tensor#Shape_descriptors)

Returns:Asphericity, normalized by radius of gyration squared
Return type:float
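
For reference, the shape-descriptor definition at the linked page can be written as follows for a three-dimensional system, where the λ² terms are the eigenvalues of the gyration tensor ordered so that λ_x² ≤ λ_y² ≤ λ_z². The last expression is the normalization by the squared radius of gyration that the return value describes. How the method treats systems with a different number of dimensions is not stated here, so this should be read as a sketch of the 3D case only.

```latex
R_g^2 = \lambda_x^2 + \lambda_y^2 + \lambda_z^2, \qquad
b = \lambda_z^2 - \tfrac{1}{2}\left(\lambda_x^2 + \lambda_y^2\right), \qquad
0 \le \frac{b}{R_g^2} \le 1
```

A value near 0 corresponds to a roughly spherical particle distribution, and a value near 1 to a rod-like one.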
compute_bonds()[source]

Returns a dataframe with 2 columns, where each row holds the pair of particle_id values of two bonded particles

Returns:Shape (n_bonds, 2). Column names particle_id_1 and particle_id_2.
Return type:dataframe
compute_center_of_mass(wrapped=True)[source]

Returns cluster center of mass dictionary

Parameters:wrapped (boolean, optional) – If True, a center of mass that falls outside the box bounds is forced to be in range [0, box_lengths[d]) for each dimension d. If using this to compare to unwrapped particle coordinates, set to False. Defaults to True.
Returns:{“x0”: x0, “x1”: x1, …}
Return type:dict
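
The effect of wrapping can be illustrated with a small stand-alone sketch. The helper name wrap_center_of_mass is hypothetical and not part of pbcluster; it only shows how a coordinate outside the box maps back into [0, box_lengths[d]).

```python
def wrap_center_of_mass(center_of_mass, box_lengths):
    """Map each coordinate into [0, box_lengths[d]) (hypothetical helper)."""
    wrapped = {}
    for d, length in enumerate(box_lengths):
        key = f"x{d}"
        # Python's % returns a non-negative result for a positive modulus,
        # so negative coordinates are shifted up into the box.
        wrapped[key] = center_of_mass[key] % length
    return wrapped


unwrapped = {"x0": -0.5, "x1": 10.5}
wrapped = wrap_center_of_mass(unwrapped, box_lengths=[10.0, 10.0])
print(wrapped)
```

The wrapped and unwrapped values differ by a whole number of box lengths in each dimension, which is why the unwrapped form is the one to compare against unwrapped particle coordinates.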
compute_cluster_properties(properties=['n_particles'])[source]

Compute cluster properties passed in properties variable

Parameters:properties (list or str, optional) – List of cluster properties to compute, or “all” to compute all available properties. Defaults to [“n_particles”].
Returns:property_name → property_value key-value pairs
Return type:dict
compute_coordination_number()[source]

Returns a dataframe of coordination numbers corresponding to each particle in the cluster

Returns:Coordination numbers for particles in the cluster. Index is particle_id and matches particle_df.index
Return type:dataframe
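
As a rough illustration, assuming a particle's coordination number is the number of bonds it takes part in, the count can be sketched from a plain list of bonded pairs. The helper name coordination_numbers is hypothetical, and the result is a dict rather than the dataframe the method returns.

```python
from collections import Counter


def coordination_numbers(particle_ids, bonds):
    """Count bonded neighbors per particle (hypothetical helper)."""
    counts = Counter()
    for particle_id_1, particle_id_2 in bonds:
        # Each bond contributes one neighbor to both of its particles.
        counts[particle_id_1] += 1
        counts[particle_id_2] += 1
    return {particle_id: counts[particle_id] for particle_id in particle_ids}


bonds = [(0, 1), (1, 2), (1, 3)]
result = coordination_numbers([0, 1, 2, 3], bonds)
print(result)
```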
compute_distance_from_com(include_dx=True, include_distance=True)[source]

Returns dataframe of distances from the center of mass for each particle

Parameters:
  • include_dx (bool, optional) – If True, includes dx_from_com_x* columns. Defaults to True.
  • include_distance (bool, optional) – If True, includes distance_from_com column. Defaults to True.
Raises:

ValueError – both include_dx and include_distance are False

Returns:

Index is particle_id (matching the index of particle_df). Columns are distance_from_com (Euclidean distance from the center of mass) and dx_from_com_x* (vector difference), where * is the dimension index 0, 1, … N, matching the x0, x1, … xN columns.

Return type:

dataframe

compute_minimum_node_cuts()[source]

Returns dictionary of minimum node cuts required to break the connection between faces normal to a given direction.

Returns:dimension_str → minimum_node_cuts key-value pairs
Return type:dict
compute_n_particles()[source]

Returns the number of particles in the cluster

Returns:number of particles in the cluster
Return type:int
compute_particle_properties(properties=['coordination_number'])[source]

Compute particle properties passed in properties variable

Parameters:properties (list or str, optional) – List of particle properties to compute, or “all” to compute all available properties. Defaults to [“coordination_number”].
Returns:Shape (n_particles, n_dimensions + n_properties) particle_id as index, x* and particle property columns.
Return type:dataframe
compute_rg()[source]

Returns cluster radius of gyration.

Returns:Cluster radius of gyration
Return type:float
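
A minimal sketch of the quantity, assuming equal particle masses and coordinates that have already been unwrapped so the cluster is contiguous: the radius of gyration is the root-mean-square distance of the particles from their center of mass. The function name radius_of_gyration is hypothetical and this is not the library's implementation.

```python
import math


def radius_of_gyration(positions):
    """Root-mean-square distance from the center of mass (equal masses)."""
    n_particles = len(positions)
    n_dimensions = len(positions[0])
    # Center of mass of equally weighted particles.
    center = [
        sum(p[d] for p in positions) / n_particles for d in range(n_dimensions)
    ]
    # Sum of squared distances from the center of mass.
    squared = sum(
        (p[d] - center[d]) ** 2 for p in positions for d in range(n_dimensions)
    )
    return math.sqrt(squared / n_particles)


rg = radius_of_gyration([[0.0, 0.0], [2.0, 0.0]])
print(rg)
```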
compute_unwrapped_center_of_mass()[source]

Returns unwrapped center of mass, meaning it’s the center of mass of the unwrapped particle coordinates, and isn’t necessarily inside the box coordinates.

Returns:Unwrapped center of mass coordinates, “x*” → number key-value pairs. The values are not restricted to any range, but they are probably within one box period of the box bounds.
Return type:dict

pbcluster.trajectory module

Trajectory module.

class pbcluster.trajectory.Trajectory(trajectory_data, box_lengths, cutoff_distance)[source]

Bases: object

Object to store and compute data about particle clusters in sequential timesteps

Parameters:
  • trajectory_data (dataframe or ndarray) – Dataframe or ndarray containing trajectory data. If dataframe, it must contain columns particle_id and x0, x1, … xN. If there are multiple timesteps, a timestep column must be included. If there are multiple particle types, a particle_type column must be included. If ndarray, its dimensions must be (n_timesteps, n_particles, n_dimensions), and it will be assumed that all particles are of the same type.
  • box_lengths (float or ndarray) – If float, the length of each side of a cubic box. If ndarray, it must contain n_dimensions values representing the lengths of each dimension of a rectangular box.
  • cutoff_distance (float) – Maximum distance two particles can be from each other to be considered part of the same cluster
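
The two accepted layouts of trajectory_data carry the same information. As a sketch of how an array of shape (n_timesteps, n_particles, n_dimensions) relates to the dataframe columns, the hypothetical helper below produces one row per particle per timestep from nested lists. Using the index along the first axis as timestep and the index along the second axis as particle_id is an assumption of this sketch, not documented behavior.

```python
def array_to_rows(trajectory_array):
    """Convert (n_timesteps, n_particles, n_dimensions) data to row dicts."""
    rows = []
    for timestep, frame in enumerate(trajectory_array):
        for particle_id, position in enumerate(frame):
            row = {"timestep": timestep, "particle_id": particle_id}
            # One coordinate column per dimension: x0, x1, ...
            for d, value in enumerate(position):
                row[f"x{d}"] = value
            rows.append(row)
    return rows


trajectory_array = [
    [[0.0, 1.0], [2.0, 3.0]],  # timestep 0
    [[0.5, 1.5], [2.5, 3.5]],  # timestep 1
]
rows = array_to_rows(trajectory_array)
print(rows[3])
```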
trajectory_df

Dataframe containing trajectory data with columns timestep, particle_id, particle_type, x0, x1, … xN

Type:dataframe
n_dimensions

Number of dimensions in the system

Type:int
box_lengths

Length n_dimensions array representing the lengths of each dimension of a rectangular box.

Type:ndarray
cutoff_distance

Maximum distance two particles can be from each other to be considered part of the same cluster

Type:float
cluster_dict

timestep → cluster_list key-value pairs

Type:dict
compute_bond_durations()[source]

Returns dataframe with duration of bond for every distinct bonding event. If particles 3 and 5 bond, then unbond, then bond again, that is 2 distinct bonding events.

Returns:Shape (n_bond_events, 5). Columns particle_id_1, particle_id_2, start, duration, bonded_at_end
Return type:dataframe
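
The notion of a distinct bonding event can be sketched with the standard library. The helper bond_events is hypothetical, and two details are assumptions rather than documented behavior: duration is counted as the number of consecutive recorded timesteps in which the bond exists, and bonded_at_end is True when the bond is still present at the last timestep. Each event is returned as a tuple (particle_id_1, particle_id_2, start, duration, bonded_at_end).

```python
def bond_events(bonds_by_timestep):
    """List distinct bonding events (hypothetical helper).

    bonds_by_timestep maps timestep -> set of (particle_id_1, particle_id_2).
    """
    open_events = {}  # bond -> start timestep of the event in progress
    lengths = {}      # bond -> consecutive timesteps counted so far
    events = []
    for timestep in sorted(bonds_by_timestep):
        current = bonds_by_timestep[timestep]
        # Close every event whose bond is absent at this timestep.
        for bond in [b for b in open_events if b not in current]:
            events.append((*bond, open_events.pop(bond), lengths.pop(bond), False))
        # Start or extend an event for every bond present at this timestep.
        for bond in current:
            open_events.setdefault(bond, timestep)
            lengths[bond] = lengths.get(bond, 0) + 1
    # Events still open after the last timestep are flagged as bonded at end.
    for bond, start in open_events.items():
        events.append((*bond, start, lengths[bond], True))
    return sorted(events)


# Particles 3 and 5 bond, unbond, then bond again: two distinct events.
bonds_by_timestep = {0: {(3, 5)}, 1: set(), 2: {(3, 5)}, 3: {(3, 5)}}
events = bond_events(bonds_by_timestep)
print(events)
```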
compute_bonds()[source]

Returns dataframe of bonds for the whole trajectory.

Returns:Shape (n_bonds, 4). Column names particle_id_1, particle_id_2, timestep, and cluster_id.
Return type:dataframe
compute_cluster_properties(properties=['n_particles'], verbosity=0)[source]

Returns dataframe of properties for each cluster in each timestep.

Parameters:
  • properties (list, optional) – List of properties to compute for each cluster. Defaults to [“n_particles”].
  • verbosity (int, optional) – A value greater than 0 will print some output to show which timesteps have been computed. Defaults to 0.
Returns:

columns of timestep, cluster_id, and columns corresponding to properties in properties list.

Return type:

dataframe

compute_particle_properties(properties=['coordination_number'], verbosity=0)[source]

Returns dataframe of properties for each particle in each timestep.

Parameters:
  • properties (list, optional) – List of properties to compute for each particle. Defaults to [“coordination_number”].
  • verbosity (int, optional) – A value greater than 0 will print some output to show which timesteps have been computed. Defaults to 0.
Returns:

columns of timestep, cluster_id, particle_id, and columns corresponding to properties in properties list.

Return type:

dataframe

pbcluster.utils module

Utils Module.

pbcluster.utils.flatten_dict(input_dict)[source]

Returns flattened dictionary given an input dictionary with maximum depth of 2

Parameters:input_dict (dict) – str → value key-value pairs, where each value can be a number or a dictionary with str → number key-value pairs.
Returns:Flattened dictionary with underscore-separated keys if input_dict contained nesting
Return type:dict
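
A minimal sketch of the described behavior, assuming nested keys are joined as outer_inner. The name flatten_dict_sketch is chosen to make clear that this is an illustration and not the library function itself.

```python
def flatten_dict_sketch(input_dict):
    """Flatten a dictionary nested at most one level deep (illustrative)."""
    flat = {}
    for key, value in input_dict.items():
        if isinstance(value, dict):
            # Join the outer and inner keys with an underscore.
            for inner_key, inner_value in value.items():
                flat[f"{key}_{inner_key}"] = inner_value
        else:
            flat[key] = value
    return flat


nested = {"n_particles": 4, "center_of_mass": {"x0": 1.5, "x1": 2.0}}
flat = flatten_dict_sketch(nested)
print(flat)
```

A flat result of this kind is convenient when each cluster's properties need to become a single row of a table.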
pbcluster.utils.get_graph_from_particle_positions(particle_positions, box_lengths, cutoff_distance, store_positions=False)[source]

Returns a networkx graph of connections between neighboring particles

Parameters:
  • particle_positions (ndarray or dataframe) – Shape (n_particles, n_dimensions). Each of the n_particles rows is a length n_dimensions particle position vector. Positions must be in range [0, box_lengths[d]) for each dimension d.
  • box_lengths (ndarray) – Shape (n_dimensions,) array of box lengths for each box dimension.
  • cutoff_distance (float) – Maximum length between particle pairs to consider them connected
  • store_positions (bool, optional) – If True, store position vector data within each node in the graph. Defaults to False.
Returns:

Graph of connections between all particle pairs with distance below cutoff_distance

Return type:

networkx Graph

pbcluster.utils.get_within_cutoff_graph(distances, cutoff_distance)[source]

Converts pairwise distances matrix into networkx graph of connections between i and j (i ≠ j) where distances[i, j] ≤ cutoff_distance.

Parameters:
  • distances (ndarray or dataframe) – Shape (n_particles, n_particles) symmetric matrix of pairwise euclidean distances.
  • cutoff_distance (float) – Maximum length between particle pairs to consider them connected
Returns:

Graph of connections between all distinct particle pairs with distance at most cutoff_distance

Return type:

networkx Graph
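
Assuming clusters correspond to the connected components of this graph, which the description of cutoff_distance suggests, the grouping can be sketched without networkx using a breadth-first search over the distance matrix. The helper clusters_from_distances is hypothetical and returns lists of particle indices rather than a graph.

```python
from collections import deque


def clusters_from_distances(distances, cutoff_distance):
    """Group particle indices into connected components (hypothetical helper)."""
    unvisited = set(range(len(distances)))
    clusters = []
    while unvisited:
        start = min(unvisited)
        unvisited.remove(start)
        queue = deque([start])
        members = [start]
        while queue:
            i = queue.popleft()
            # Neighbors are not-yet-visited particles within the cutoff.
            neighbors = [
                j for j in unvisited if distances[i][j] <= cutoff_distance
            ]
            for j in neighbors:
                unvisited.remove(j)
                members.append(j)
                queue.append(j)
        clusters.append(sorted(members))
    return clusters


distances = [
    [0.0, 1.0, 5.0],
    [1.0, 0.0, 4.0],
    [5.0, 4.0, 0.0],
]
clusters = clusters_from_distances(distances, cutoff_distance=1.5)
print(clusters)
```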

pbcluster.utils.get_within_cutoff_matrix(distances, cutoff_distance)[source]

Returns matrix of 0s and 1s that can be fed into networkx to initialize a graph

Parameters:
  • distances (ndarray or dataframe) – Shape (n_particles, n_particles) symmetric matrix of pairwise euclidean distances.
  • cutoff_distance (float) – Maximum length between particle pairs to consider them connected
Returns:

Shape (n_particles, n_particles) symmetric binary array

Return type:

ndarray
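
The thresholding step can be sketched on nested lists. The helper name within_cutoff_matrix is hypothetical, and setting the diagonal to 0 so that no particle is connected to itself is an assumption of this sketch.

```python
def within_cutoff_matrix(distances, cutoff_distance):
    """Return a symmetric 0/1 matrix marking pairs within the cutoff."""
    n_particles = len(distances)
    return [
        [
            # 1 for distinct particles at or within the cutoff, otherwise 0.
            1 if i != j and distances[i][j] <= cutoff_distance else 0
            for j in range(n_particles)
        ]
        for i in range(n_particles)
    ]


distances = [
    [0.0, 1.0, 5.0],
    [1.0, 0.0, 4.0],
    [5.0, 4.0, 0.0],
]
matrix = within_cutoff_matrix(distances, cutoff_distance=1.5)
print(matrix)
```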

pbcluster.utils.pairwise_distances(particle_positions, box_lengths)[source]

Returns pairwise distance matrix between row vectors in a single positions matrix

Parameters:
  • particle_positions (ndarray or dataframe) – Shape (n_particles, n_dimensions). Each of the n_particles rows is a length n_dimensions particle position vector. Positions must be in range [0, box_lengths[d]) for each dimension d.
  • box_lengths (ndarray) – Shape (n_dimensions,) array of box lengths for each box dimension.
Returns:

Shape (n_particles, n_particles) symmetric matrix of pairwise euclidean distances.

Return type:

ndarray
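
A common way to compute distances in a periodic box is the minimum-image convention: along each dimension, take the shorter of the direct separation and the separation across the boundary. The sketch below assumes that convention and positions already in [0, box_lengths[d]). The function names are hypothetical, and the library's own implementation may differ.

```python
import math


def minimum_image_distance(position_1, position_2, box_lengths):
    """Euclidean distance between two points in a periodic rectangular box."""
    squared = 0.0
    for x1, x2, length in zip(position_1, position_2, box_lengths):
        delta = abs(x1 - x2)
        # Use the shorter of the direct path and the path across the boundary.
        delta = min(delta, length - delta)
        squared += delta ** 2
    return math.sqrt(squared)


def pairwise_distances_sketch(particle_positions, box_lengths):
    """Build the full (n_particles, n_particles) distance matrix as lists."""
    return [
        [minimum_image_distance(p, q, box_lengths) for q in particle_positions]
        for p in particle_positions
    ]


positions = [[0.5, 0.5], [9.5, 0.5], [5.0, 0.5]]
distances = pairwise_distances_sketch(positions, box_lengths=[10.0, 10.0])
print(distances[0][1], distances[0][2])
```

In this example the first two particles sit on opposite sides of the box but are close neighbors across the periodic boundary, which is the case a plain Euclidean distance would miss.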

pbcluster.utils.pairwise_distances_distinct(particle_positions_1, particle_positions_2, box_lengths)[source]

Returns pairwise distance matrix between 2 distinct sets of particle positions accounting for periodic boundary conditions

Parameters:
  • particle_positions_1 (ndarray or dataframe) – Shape (n_particles_1, n_dimensions). Each of the n_particles_1 rows is a length n_dimensions particle position vector. Positions must be in range [0, box_lengths[d]) for each dimension d.
  • particle_positions_2 (ndarray or dataframe) – Shape (n_particles_2, n_dimensions). Each of the n_particles_2 rows is a length n_dimensions particle position vector. Positions must be in range [0, box_lengths[d]) for each dimension d.
  • box_lengths (ndarray) – Shape (n_dimensions,) array of box lengths for each box dimension.
Raises:

ValueError – Length of last dimension of each argument doesn’t match

Returns:

Shape (n_particles_1, n_particles_2) matrix of pairwise euclidean distances.

Return type:

ndarray