DECIMER package
Submodules
DECIMER.decimer module
- DECIMER.decimer.detokenize_output(predicted_array)
This function takes the predicted tokens from the DECIMER model and returns the decoded SMILES string.
- Args:
predicted_array (int): Predicted tokens from DECIMER
- Returns:
(str): SMILES representation of the molecule
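The decoding step described above can be illustrated with a minimal sketch. The vocabulary, the control-token names (`<start>`, `<end>`, `<pad>`) and the function name `detokenize_sketch` are assumptions made for illustration; the real tokenizer is the one loaded by `get_models`, and its internals may differ.

```python
# Hypothetical index-to-token table; not the actual DECIMER vocabulary.
INDEX_TO_TOKEN = {
    0: "<pad>", 1: "<start>", 2: "<end>",
    3: "C", 4: "O", 5: "=", 6: "(", 7: ")",
}


def detokenize_sketch(predicted_array, index_to_token=INDEX_TO_TOKEN):
    """Join predicted token ids into a SMILES string, dropping control tokens."""
    tokens = []
    for idx in predicted_array:
        token = index_to_token[int(idx)]
        if token == "<end>":
            break  # anything after the end token is treated as padding
        if token in ("<start>", "<pad>"):
            continue  # control tokens are not part of the SMILES string
        tokens.append(token)
    return "".join(tokens)


print(detokenize_sketch([1, 3, 3, 6, 5, 4, 7, 4, 2, 0, 0]))  # prints CC(=O)O
```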
- DECIMER.decimer.detokenize_output_add_confidence(predicted_array, confidence_array)
This function takes the predicted array of tokens as well as the confidence values returned by the Transformer Decoder and returns a list of tuples that contain each token of the predicted SMILES string and the confidence value.
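- Args:
predicted_array (tf.Tensor): Transformer Decoder output array (predicted tokens)
confidence_array: Confidence values returned by the Transformer Decoder
- Returns:
list of tuples: each token of the predicted SMILES string paired with its confidence value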
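A rough sketch of how tokens and confidence values might be paired is shown here. The vocabulary, control-token names and the function name `detokenize_with_confidence_sketch` are hypothetical, and plain Python lists stand in for the tensors the real function receives.

```python
# Hypothetical index-to-token table; not the actual DECIMER vocabulary.
INDEX_TO_TOKEN = {0: "<pad>", 1: "<start>", 2: "<end>", 3: "C", 4: "O"}


def detokenize_with_confidence_sketch(predicted_array, confidence_array,
                                      index_to_token=INDEX_TO_TOKEN):
    """Return a list of (token, confidence) tuples, skipping control tokens."""
    pairs = []
    for idx, conf in zip(predicted_array, confidence_array):
        token = index_to_token[int(idx)]
        if token == "<end>":
            break  # stop at the end-of-sequence token
        if token in ("<start>", "<pad>"):
            continue
        pairs.append((token, float(conf)))
    return pairs


pairs = detokenize_with_confidence_sketch(
    [1, 3, 3, 4, 2], [1.0, 0.98, 0.95, 0.90, 0.99]
)
# The SMILES string can be recovered by joining the token part of each tuple.
smiles = "".join(token for token, _ in pairs)
print(pairs, smiles)
```

Keeping the per-token confidence alongside each token makes it possible to flag low-confidence positions in the predicted string.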
- DECIMER.decimer.get_models(model_urls)
Download and load models from the provided URLs.
This function downloads models from the provided URLs to a default location, then loads tokenizers and TensorFlow saved models.
- Args:
model_urls (dict): A dictionary containing model names as keys and their corresponding URLs as values.
- Returns:
- tuple: A tuple containing loaded tokenizer and TensorFlow saved models.
tokenizer (object): Tokenizer for DECIMER model.
DECIMER_V2 (tf.saved_model): TensorFlow saved model for DECIMER.
DECIMER_Hand_drawn (tf.saved_model): TensorFlow saved model for DECIMER HandDrawn.
- DECIMER.decimer.main()
This function takes the path of the image as user input and returns the predicted SMILES as output in the CLI.
- Args:
str: image_path
- Returns:
str: predicted SMILES
- DECIMER.decimer.predict_SMILES(image_path, confidence=False, hand_drawn=False)
Predicts SMILES representation of a molecule depicted in the given image.
- Args:
image_path (str): Path of chemical structure depiction image
confidence (bool): Flag to indicate whether to return confidence values along with SMILES prediction
hand_drawn (bool): Flag to indicate whether the molecule in the image is hand-drawn
- Returns:
str: SMILES representation of the molecule in the input image, optionally with confidence values
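As a usage sketch, the signature above can be wrapped to process several images. The helper `predict_many` and the stand-in `fake_predict` are hypothetical names introduced here so the example runs without DECIMER installed; with the package available, `DECIMER.decimer.predict_SMILES` would be passed as `predict_fn` instead.

```python
def predict_many(image_paths, predict_fn, hand_drawn=False):
    """Map each image path to the SMILES string returned by predict_fn."""
    results = {}
    for path in image_paths:
        # Keyword names follow the documented predict_SMILES signature.
        results[path] = predict_fn(path, confidence=False, hand_drawn=hand_drawn)
    return results


def fake_predict(image_path, confidence=False, hand_drawn=False):
    """Stand-in for predict_SMILES; always returns ethanol."""
    return "CCO"


results = predict_many(["mol_1.png", "mol_2.png"], fake_predict)
print(results)
```

Passing the predictor as an argument keeps the batching logic testable without loading the models.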
DECIMER.config module
- class DECIMER.config.Config
Bases: object
Configuration class.
- initialize_encoder_config(image_embedding_dim, preprocessing_fn, backbone_fn, image_shape, do_permute=False, pretrained_weights=None)
This function initializes the Efficient-Net V2 encoder with user-defined configurations.
- Args:
image_embedding_dim (int): Embedding dimension of the input image
preprocessing_fn (method): Efficient-Net preprocessing function for the input image
backbone_fn (method): Calls Efficient-Net V2 as the backbone for the encoder
image_shape (int): Shape of the input image
do_permute (bool, optional): Defaults to False.
pretrained_weights (keras weights, optional): Whether to use pretrained Efficient-Net weights. Defaults to None.
- initialize_lr_config(warm_steps, n_epochs)
This function sets the configuration used to initialize the learning rate.
- Args:
warm_steps (int): Number of steps over which the learning rate is increased
n_epochs (int): Number of epochs
- initialize_transformer_config(vocab_len, max_len, n_transformer_layers, transformer_d_dff, transformer_n_heads, image_embedding_dim, rate=0.1)
This function initializes the Transformer model as the decoder with user-defined configurations.
- Args:
vocab_len (int): Total number of words in the input vocabulary
max_len (int): Maximum length of the strings found in the training dataset
n_transformer_layers (int): Number of layers in the Transformer model
transformer_d_dff (int): Size of the Transformer feed-forward upward projection
transformer_n_heads (int): Number of attention heads in the Transformer model
image_embedding_dim (int): Number of dimensions into which the image is embedded
rate (float, optional): Dropout rate, i.e. the fraction of the input units to drop. Defaults to 0.1.
- class DECIMER.config.CustomSchedule(d_model, warmup_steps=4000)
Bases: LearningRateSchedule
Custom schedule for learning rate used during training.
- Args:
tf (_type_): keras learning rate schedule
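The source does not spell out the formula behind this schedule. The `d_model` and `warmup_steps` parameters suggest the warmup schedule from the original Transformer paper, so the sketch below assumes that form; the function name `transformer_lr` is hypothetical and this is not confirmed to be what `CustomSchedule` computes.

```python
def transformer_lr(step, d_model, warmup_steps=4000):
    """Assumed schedule: d_model^-0.5 * min(step^-0.5, step * warmup_steps^-1.5)."""
    step = max(step, 1)  # guard against division by zero at step 0
    warmup_branch = step * warmup_steps ** -1.5  # linear increase during warmup
    decay_branch = step ** -0.5                  # inverse square-root decay afterwards
    return d_model ** -0.5 * min(warmup_branch, decay_branch)


for step in (1, 1000, 4000, 16000):
    print(step, transformer_lr(step, d_model=512))
```

Under this assumption the learning rate rises linearly until `warmup_steps` and decays afterwards, peaking where the two branches meet.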
- DECIMER.config.HEIF_to_pillow(image_path)
Converts Apple's HEIF format to a usable Pillow object. Args: image_path (str): path of input image Returns: PIL.Image
- DECIMER.config.PIL_im_to_BytesIO(image)
Converts a Pillow image to an io.BytesIO object. Args: PIL.Image Returns: io.BytesIO object with the image data
- DECIMER.config.central_square_image(image)
This function takes a Pillow Image object and will add white padding so that the image has a square shape with the width/height of the longest side of the original image.
Args: PIL.Image Returns: PIL.Image
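The padding geometry described above can be sketched without any imaging library. The helper name `square_padding` is hypothetical, and centring the original image on the canvas is an assumption suggested by the function name rather than something the source states.

```python
def square_padding(width, height):
    """Return the square canvas side and the (left, top) offset for centring."""
    side = max(width, height)      # canvas takes the longest side of the image
    left = (side - width) // 2     # horizontal white margin on the left
    top = (side - height) // 2     # vertical white margin on the top
    return side, (left, top)


side, offset = square_padding(300, 100)
print(side, offset)  # prints 300 (0, 100)
```

With Pillow, the same numbers could be used by creating a white canvas with `Image.new("RGB", (side, side), "white")` and pasting the original image at `offset`.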
- DECIMER.config.decode_image(image_path)
Loads an image and preprocesses the input image in several steps to get the image ready for DECIMER input.
- Args:
image_path (str): path of input image
- Returns:
Processed image
- DECIMER.config.delete_empty_borders(image)
This function takes a Pillow Image object, converts it to grayscale and deletes white space at the borders.
Args: PIL.Image Returns: PIL.Image
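To illustrate the idea of trimming white borders, the sketch below works on a grayscale image represented as a nested list of pixel values. The function name `crop_white_borders` and the treatment of the value 255 as white are assumptions for this example; the actual function operates on Pillow images and may use a different threshold.

```python
def crop_white_borders(pixels, white=255):
    """Crop a grayscale image (list of rows) to the bounding box of non-white pixels."""
    # Rows and columns that contain at least one pixel darker than white.
    rows = [r for r, row in enumerate(pixels) if any(v < white for v in row)]
    cols = [c for c in range(len(pixels[0])) if any(row[c] < white for row in pixels)]
    if not rows or not cols:
        return pixels  # blank image: nothing to crop
    return [row[cols[0]:cols[-1] + 1] for row in pixels[rows[0]:rows[-1] + 1]]


image = [
    [255, 255, 255, 255],
    [255,   0, 255, 255],
    [255, 255,  10, 255],
    [255, 255, 255, 255],
]
print(crop_white_borders(image))  # prints [[0, 255], [255, 10]]
```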
- DECIMER.config.download_trained_weights(model_url, model_path, verbose=1)
This function downloads the trained models and tokenizers to a default location. After downloading the zipped file, the function unzips it automatically. If the model already exists at the default location, this function will not download it again.
- Args:
model_url (str): URL of the trained model to download.
model_path (str): default path to which the model is downloaded.
- Returns:
path (str): downloaded model.
- DECIMER.config.get_bnw_image(image)
Converts an image to black and white. Args: PIL.Image Returns: PIL.Image
- DECIMER.config.get_resize(image)
This function decides how to resize a given image without losing much information.
Args: PIL.Image Returns: PIL.Image
- DECIMER.config.increase_brightness(image)
This function adjusts the brightness of the given image.
Args: PIL.Image Returns: PIL.Image
- DECIMER.config.increase_contrast(image)
This function increases the contrast of an image input.
Args: PIL.Image Returns: PIL.Image
- DECIMER.config.prepare_models(encoder_config, transformer_config, replica_batch_size, verbose=0)
This function initializes the Encoder and the Transformer with the configurations set by the user. It then returns the Encoder, the Transformer and the optimizer.
- Args:
encoder_config ([type]): Encoder configuration set by the user in the config class.
transformer_config ([type]): Transformer configuration set by the user in the config class.
replica_batch_size ([type]): Per-replica batch size set by the user (during distributed training).
verbose (int, optional): Defaults to 0.
- Returns:
[type]: Optimizer, Encoder model and the Transformer
- DECIMER.config.remove_transparent(image_path)
Removes the transparent layer from a PNG image with an alpha channel Args: image_path (str): path of input image Returns: PIL.Image
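One common way to remove an alpha channel is to composite each pixel over an opaque background. The sketch below assumes a white background and uses a hypothetical helper `flatten_on_white` acting on a single RGBA tuple; the source does not state which background colour or method the function actually uses.

```python
def flatten_on_white(rgba):
    """Composite one 8-bit RGBA pixel over a white background, returning RGB."""
    r, g, b, a = rgba
    # Standard "over" blend with an opaque white (255) background.
    return tuple(round((c * a + 255 * (255 - a)) / 255) for c in (r, g, b))


print(flatten_on_white((0, 0, 0, 0)))       # fully transparent -> (255, 255, 255)
print(flatten_on_white((10, 20, 30, 255)))  # fully opaque -> (10, 20, 30)
```

On real images, Pillow's `Image.alpha_composite` performs this blend for a whole RGBA image at once.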
- DECIMER.config.resize_byratio(image)
This function takes a Pillow Image object and resizes the image to 512 x 512. The LANCZOS resampling method is used for both upscaling and downscaling.
With newer Pillow versions, antialiasing is turned on when using LANCZOS. Args: PIL.Image Returns: PIL.Image