sciSOM.SOM_recall package

Submodules

sciSOM.SOM_recall.recall module

sciSOM.SOM_recall.recall.SOM_cls_recall(array_to_fill, data_in_SOM_fmt, weight_cube, reference_map)

Takes the data, the weight cube and the classification map and assigns each data point a label based on its cluster.

Parameters:
  • array_to_fill (ndarray) – structured array to fill with the classification

  • data_in_SOM_fmt (ndarray) – data to classify in the SOM format

  • weight_cube (ndarray) – SOM weight cube

  • reference_map (ndarray) – reference map for the SOM

Returns:

array_to_fill – structured array with the SOM classification added

Return type:

ndarray
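The lookup described above can be illustrated with a minimal NumPy sketch. It is not the library's implementation: the helper name classify_by_best_matching_unit and the array shapes (data as (n_samples, n_features), weight cube as (xdim, ydim, n_features), reference map as (xdim, ydim)) are assumptions made for illustration, and the Euclidean metric is likewise assumed.

```python
import numpy as np

def classify_by_best_matching_unit(data, weight_cube, reference_map):
    """Hypothetical helper: label each sample with the reference-map value
    of its nearest SOM node (Euclidean distance is an assumption)."""
    xdim, ydim, n_features = weight_cube.shape
    flat_weights = weight_cube.reshape(xdim * ydim, n_features)
    # Squared distance from every sample to every node: (n_samples, n_nodes).
    diff = data[:, np.newaxis, :] - flat_weights[np.newaxis, :, :]
    dist_sq = np.sum(diff ** 2, axis=2)
    # Index of the closest node for each sample.
    best_node = np.argmin(dist_sq, axis=1)
    return reference_map.reshape(xdim * ydim)[best_node]

# Toy 2x2 SOM with two features per node.
weight_cube = np.array([[[0.0, 0.0], [0.0, 1.0]],
                        [[1.0, 0.0], [1.0, 1.0]]])
reference_map = np.array([[0, 1], [2, 3]])
data = np.array([[0.1, 0.9], [0.8, 0.2]])
labels = classify_by_best_matching_unit(data, weight_cube, reference_map)
```

Flattening the cube keeps the distance computation to a single broadcasted subtraction, at the cost of holding an (n_samples, n_nodes, n_features) array in memory, so large datasets would need to be processed in chunks.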

sciSOM.SOM_recall.recall.SOM_location_recall(normalized_data, weight_cube)

Takes the data, the weight cube and the classification map and assigns each data point a label based on its cluster.

Parameters:
  • normalized_data (ndarray)

  • weight_cube (ndarray) – SOM weight cube

Returns:

array_to_fill – structured array with the SOM classification added

Return type:

ndarray

sciSOM.SOM_recall.recall.affine_transform(data, target_min, target_max)

Takes a set of data and applies an affine transform to scale it. The first axis is expected to index the data samples and the second axis the features.

Parameters:
  • data (ndarray) – Input data to apply the affine transform to.

  • target_min (Union[float, ndarray]) – Minimum of the target space

  • target_max (Union[float, ndarray]) – Maximum of the target space

Returns:

normalized_data – Data after the affine transformation

Return type:

ndarray
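As an illustration of this kind of scaling, here is a minimal sketch of a per-feature min-max affine map. The helper name affine_rescale is hypothetical, and taking the source minimum and maximum per feature (column) is an assumption; the library function may derive them differently.

```python
import numpy as np

def affine_rescale(data, target_min, target_max):
    """Hypothetical helper: map each feature (column) linearly so that its
    minimum lands on target_min and its maximum on target_max."""
    data = np.asarray(data, dtype=float)
    data_min = data.min(axis=0)
    data_max = data.max(axis=0)
    # Guard against constant columns to avoid division by zero.
    span = np.where(data_max > data_min, data_max - data_min, 1.0)
    unit = (data - data_min) / span
    return unit * (target_max - target_min) + target_min

# Three samples (first axis) with two features (second axis).
samples = np.array([[0.0, 10.0], [5.0, 20.0], [10.0, 30.0]])
scaled = affine_rescale(samples, 0.0, 1.0)
```

Because target_min and target_max are only used in broadcasted arithmetic, they can be scalars or arrays with one entry per feature.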

sciSOM.SOM_recall.recall.assign_labels(data, ref_img, xdim, ydim, cut_out)

Assigns labels to the data based on the reference image from the SOM

This function takes in the data and the image-based classifications, and returns the unique labels together with the dataset carrying the new classification. Note: this version only takes in S1s and S2s and ignores unclassified samples; another version will be made to deal with the unclassified samples.

Parameters:
  • data (ndarray) – can be either peaks or peak_basics

  • ref_img (ndarray) – will be the image extracted from the SOM classification of each data point

  • xdim (int) – width of the image cube

  • ydim (int) – height of the image cube

  • cut_out (int)

Return type:

tuple[ndarray, ndarray]

Returns:

  • colorp (np.ndarray) – list of unique colors in the image

  • data_new (np.ndarray) – structured array with the new classification

sciSOM.SOM_recall.recall.create_mapping_dict(output_classes, dataset_classes)

Create a mapping dictionary from output classes to dataset classes.

Parameters:
  • output_classes (Union[list, ndarray]) – List of output classes from the neural network.

  • dataset_classes (Union[list, ndarray]) – List of corresponding dataset classes.

Returns:

mapping_dict – A dictionary mapping output classes to dataset classes.

Return type:

Dict

sciSOM.SOM_recall.recall.generate_color_ref_map(color_image, unique_colors)

Generate a reference map from the color image representing the labels of the SOM weight cube.

Parameters:
  • color_image (ndarray) – image made by the remap compressed to the SOM size

  • unique_colors (ndarray) – unique colors found in the image (also represent # of clusters)

Returns:

ref_map – reference map for the SOM

Return type:

ndarray
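One way to picture this conversion is a sketch that replaces every pixel color with the index of that color in the list of unique colors. The helper name color_image_to_index_map, the (height, width, 3) image layout and the use of -1 for unmatched pixels are assumptions for illustration, not the library's behavior.

```python
import numpy as np

def color_image_to_index_map(color_image, unique_colors):
    """Hypothetical helper: turn an RGB image into a 2D map of integer
    cluster indices, one index per unique color."""
    height, width, _ = color_image.shape
    # -1 marks pixels whose color is not in unique_colors (an assumption).
    index_map = np.full((height, width), -1, dtype=int)
    for index, color in enumerate(unique_colors):
        matches = np.all(color_image == color, axis=2)
        index_map[matches] = index
    return index_map

# 2x2 image with two colors: red and blue.
color_image = np.array([[[255, 0, 0], [0, 0, 255]],
                        [[0, 0, 255], [255, 0, 0]]])
# np.unique with axis=0 returns the distinct rows in sorted order.
unique_colors = np.unique(color_image.reshape(-1, 3), axis=0)
index_map = color_image_to_index_map(color_image, unique_colors)
```

In this toy case blue sorts before red, so blue pixels receive index 0 and red pixels index 1.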

sciSOM.SOM_recall.recall.map_output_to_dataset(output_classes, mapping_dict)

Map output classes to dataset classes using the mapping dictionary.

Parameters:
  • output_classes (ndarray) – Array of output classes from the neural network.

  • mapping_dict (ndarray) – Dictionary mapping output classes to dataset classes.

Returns:

mapped_array – Array of dataset classes corresponding to the output classes.

Return type:

ndarray
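The two mapping steps (building the dictionary, then applying it) can be sketched as follows. The helper names build_mapping and apply_mapping are hypothetical stand-ins, and pairing the two class lists element by element is an assumption based on the parameter descriptions.

```python
import numpy as np

def build_mapping(output_classes, dataset_classes):
    """Hypothetical helper: pair each output class with the dataset class
    at the same position."""
    keys = np.asarray(output_classes).tolist()
    values = np.asarray(dataset_classes).tolist()
    return dict(zip(keys, values))

def apply_mapping(output_array, mapping):
    """Hypothetical helper: translate every output class via the mapping."""
    return np.array([mapping[value] for value in np.asarray(output_array).tolist()])

mapping = build_mapping([0, 1, 2], [10, 20, 30])
mapped = apply_mapping(np.array([2, 0, 1, 2]), mapping)
```

Converting to plain Python values with tolist() before building or querying the dictionary avoids surprises from mixing NumPy scalars and Python scalars as keys.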

sciSOM.SOM_recall.recall.normalize_data_recall(peaklet_data, normalization_factor)

Use this function to operate with an already trained SOM. Converts peaklet data into the current best inputs for the SOM: log10(deciles) + log10(area) + AFT. Since we are dealing with logs, anything less than 1 will be set to 1.

Parameters:
  • peaklet_data – straxen datatype peaklets (peaks also work)

  • normalization_factor – numbers needed to normalize the data so recalls work

sciSOM.SOM_recall.recall.recall_populations(dataset, weight_cube, SOM_cls_img, norm_factors)

Recalls data from a SOM weight cube and assigns a population label to each data point.

Master function that lets the user provide a weight cube, a reference image as a np.array, a dataset and a set of normalization factors. In theory, if these four things are provided, this function should return the original data with one added field named “SOM_type”. Here we assume that the data has been preprocessed in the SOM input format.

Parameters:
  • weight_cube (ndarray) – SOM weight cube (3D array)

  • SOM_cls_img (ndarray) – SOM reference image as a numpy array

  • dataset (ndarray) – Data to perform the recall on; should be a structured array

  • norm_factors (ndarray) – A set of numbers (equal in count to the dimensionality of the data) used to normalize the data so we can perform a recall

Returns:

output_data – Data with the SOM classification added as a field

Return type:

ndarray

sciSOM.SOM_recall.recall.select_middle_pixel(img_as_np_array, pxl_per_block=12)

Selects the middle pixel of each cell in the image.

Images resulting from NS have cells of about 12 pixels; we want to reduce the image to 1 pixel per cell, so we take the middle pixel. Since images have their 0 index at the top and np arrays start at the bottom, we have to flip the image across the y-axis.

Parameters:
  • img_as_np_array (ndarray) – Image as a numpy array

  • pxl_per_block (int) – Number of pixels per block in the image, default set to 12

Returns:

SOM_img_clusters – Image with 1 pixel per cell

Return type:

ndarray
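The reduction described above can be sketched with strided slicing. The helper name middle_pixel_per_block is hypothetical, and reading the flip as a top-to-bottom reversal (np.flipud) is an assumption drawn from the remark about where the 0 index sits; the library may orient the result differently.

```python
import numpy as np

def middle_pixel_per_block(image, pxl_per_block=12):
    """Hypothetical helper: keep the middle pixel of every square block,
    then reverse the row order (assumed meaning of the flip)."""
    offset = pxl_per_block // 2
    # Start at the middle of the first block, then step one block at a time.
    reduced = image[offset::pxl_per_block, offset::pxl_per_block]
    return np.flipud(reduced)

# 4x4 image made of 2x2 blocks, so the result has one pixel per block.
image = np.arange(16).reshape(4, 4)
reduced = middle_pixel_per_block(image, pxl_per_block=2)
```

The same slicing works unchanged on color images of shape (height, width, channels), because only the first two axes are indexed.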

sciSOM.SOM_recall.strax_functions module

sciSOM.SOM_recall.strax_functions.compute_quantiles(peaks, n_samples)

Compute waveforms and quantiles for a given number of nodes (attributes).

Parameters:
  • peaks (ndarray) – Peaks data

  • n_samples (int) – Number of nodes or attributes

Returns:

quantiles – Quantiles of the waveform

Return type:

np.ndarray

sciSOM.SOM_recall.strax_functions.compute_wf_attributes(data, sample_length, n_samples)

Compute waveform attributes.

Quantiles: represent the amount of time elapsed for a given fraction of the total waveform area to be observed in n_samples, i.e. if n_samples = 10, then the quantiles are equivalent to deciles.

Parameters:
  • data (np.ndarray) – Waveform data

  • sample_length (np.ndarray) – Length of each sample

  • n_samples (int) – Number of samples

Returns:

quantiles – Quantiles of the waveform

Return type:

np.ndarray
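To make the quantile definition concrete, here is a coarse sketch for a single waveform. The helper name area_quantile_times is hypothetical, the waveform is assumed to be non-negative, and times are rounded up to whole samples with no sub-sample interpolation, which the library function may well perform.

```python
import numpy as np

def area_quantile_times(waveform, sample_length, n_samples=10):
    """Hypothetical helper: time elapsed until each fraction k / n_samples
    of the total waveform area has been observed."""
    cumulative = np.cumsum(waveform)
    fractions = np.arange(1, n_samples + 1) / n_samples
    # First sample at which the running area fraction reaches each target.
    indices = np.searchsorted(cumulative / cumulative[-1], fractions, side="left")
    indices = np.minimum(indices, len(waveform) - 1)
    # A sample with index i ends at time (i + 1) * sample_length.
    return (indices + 1) * sample_length

# Five samples of 10 time units each; most of the area sits in the third one.
waveform = np.array([0.0, 2.0, 6.0, 2.0, 0.0])
quartile_times = area_quantile_times(waveform, sample_length=10, n_samples=4)
```

With n_samples=10 the same function would return decile times, matching the description above.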

sciSOM.SOM_recall.strax_functions.data_to_log_decile_log_area_aft(peaklet_data, normalization_factor)

Takes peaklet-level data and converts it into input vectors for the SOM consisting of: deciles, log10(area), AFT

Converts peaklet data into the current best inputs for the SOM: log10(deciles) + log10(area) + AFT. Since we are dealing with logs, any decile value < 1 will be set to 1 before log10 is taken. The areas can be very negative, so we add the minimum value to all areas to make them positive.

Parameters:
  • peaklet_data (ndarray) – Peaklet level data

  • normalization_factor (ndarray) – Normalization factors for the data

Returns:

deciles_area_aft – Normalized Input vectors for the SOM

Return type:

ndarray
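The clipping, shifting and normalization steps can be sketched on plain arrays. Everything here is an assumption for illustration: the helper name som_input_vectors, passing deciles, area and AFT as separate arrays, a 12-entry factor array (entries 0 to 9 for the deciles, 10 for the area, 11 for the area offset), and treating the offset as a positive number added to the area.

```python
import numpy as np

def som_input_vectors(deciles, area, aft, norm_factors):
    """Hypothetical helper: build [log10(deciles), log10(area), AFT] rows.
    Assumed norm_factors layout: [0:10] decile maxima, [10] area maximum,
    [11] offset added to the area."""
    deciles = np.asarray(deciles, dtype=float)
    area = np.asarray(area, dtype=float)
    # Values below 1 are set to 1 so that log10 never goes negative.
    log_deciles = np.log10(np.clip(deciles, 1.0, None)) / norm_factors[:10]
    # Shift the area by the stored offset, then apply the same floor of 1.
    shifted_area = np.clip(area + norm_factors[11], 1.0, None)
    log_area = np.log10(shifted_area) / norm_factors[10]
    return np.column_stack([log_deciles, log_area, aft])

deciles = np.vstack([np.full(10, 10.0), np.full(10, 0.5)])
area = np.array([95.0, -5.0])
aft = np.array([0.3, 0.7])
norm_factors = np.array([1.0] * 10 + [2.0, 5.0])
vectors = som_input_vectors(deciles, area, aft, norm_factors)
```

Each row of the result is one 12-dimensional SOM input vector; the second row shows how sub-unit deciles and a negative area both collapse to 0 after the floor and the logarithm.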

sciSOM.SOM_recall.strax_functions.data_to_log_decile_log_area_aft_generate(peaklet_data)

Use this function for generating data to train an SOM. Converts peaklet data into the current best inputs for the SOM: log10(deciles) + log10(area) + AFT. Since we are dealing with logs, anything less than 1 will be set to 1.

The norm factors are laid out as follows:
  • 0 -> 9 – max of the log of each decile, which normalizes it to 1

  • 10 – max of the log of the area, which normalizes it to 1

  • 11 – keeps track of the minimum value that is added to all other data

Module contents