DECIMER package
Submodules
DECIMER.decimer module
- class DECIMER.decimer.KerasCompatibilityUnpickler(file, *, fix_imports=True, encoding='ASCII', errors='strict', buffers=())
Bases:
Unpickler
Custom unpickler to handle Keras 2.x tokenizers in a Keras 3 environment.
This handles the module path changes between Keras 2 and Keras 3:
- keras.preprocessing.text -> tensorflow.keras.preprocessing.text
- find_class(module, name)
Return an object from a specified module.
If necessary, the module will be imported. Subclasses may override this method (e.g. to restrict unpickling of arbitrary classes and functions).
This method is called whenever a class or a function object is needed. Both arguments passed are str objects.
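The module remapping that an overridden `find_class` can perform is sketched below using only the standard library. This is a minimal illustration, not DECIMER's implementation: the class name `RemappingUnpickler`, the `MODULE_REMAP` table, the helper `load_with_remap`, and the `legacy_collections` module name are hypothetical stand-ins, chosen so the example runs without Keras installed.

```python
import collections
import io
import pickle

# Hypothetical remap table: stale module path -> current module path.
# A Keras-oriented version would list "keras.preprocessing.text" here instead.
MODULE_REMAP = {"legacy_collections": "collections"}


class RemappingUnpickler(pickle.Unpickler):
    """Unpickler that rewrites outdated module paths before importing them."""

    def find_class(self, module, name):
        # Swap in the current module path if the pickled one is known to be stale.
        module = MODULE_REMAP.get(module, module)
        return super().find_class(module, name)


def load_with_remap(data: bytes):
    """Unpickle a bytes object through the remapping unpickler (hypothetical helper)."""
    return RemappingUnpickler(io.BytesIO(data)).load()


# Resolve a class through the stale (hypothetical) module path.
resolved = RemappingUnpickler(io.BytesIO(b"")).find_class(
    "legacy_collections", "OrderedDict"
)
print(resolved is collections.OrderedDict)
```

Keeping the remap in a table means further renamed modules can be supported by adding entries rather than more branches in `find_class`.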
- DECIMER.decimer.detokenize_output(predicted_array)
This function takes the predicted tokens from the DECIMER model and returns the decoded SMILES string.
- Return type:
str
- Args:
predicted_array (int): Predicted tokens from DECIMER
- Returns:
(str): SMILES representation of the molecule
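The decoding step described above can be sketched as follows. This is an illustration under stated assumptions, not the library's code: `INDEX_WORD` is a hypothetical vocabulary (the real mapping comes from the loaded tokenizer), `decode_tokens` is a hypothetical helper, and the `<start>`/`<end>` marker tokens are assumed. Plain Python lists stand in for tensors.

```python
from typing import Dict, Sequence

# Hypothetical vocabulary; the real one is provided by the loaded tokenizer.
INDEX_WORD: Dict[int, str] = {
    1: "<start>",
    2: "<end>",
    3: "C",
    4: "O",
    5: "=",
    6: "(",
    7: ")",
}

# Assumed marker tokens that are not part of the SMILES string itself.
MARKERS = ("<start>", "<end>")


def decode_tokens(token_ids: Sequence[int], index_word: Dict[int, str]) -> str:
    """Map token indices to their strings and join them, dropping marker tokens."""
    tokens = [index_word[i] for i in token_ids]
    return "".join(t for t in tokens if t not in MARKERS)


# Token indices for <start> C C ( = O ) O <end>
print(decode_tokens([1, 3, 3, 6, 5, 4, 7, 4, 2], INDEX_WORD))
```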
- DECIMER.decimer.detokenize_output_add_confidence(predicted_array, confidence_array)
This function takes the predicted array of tokens as well as the confidence values returned by the Transformer Decoder and returns a list of tuples that contain each token of the predicted SMILES string and the confidence value.
- Return type:
List[Tuple[str,float]]
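- Args:
predicted_array (tf.Tensor): Transformer Decoder output array (predicted tokens)
confidence_array: Confidence values returned by the Transformer Decoder
- Returns:
List[Tuple[str, float]]: Each token of the predicted SMILES string paired with its confidence value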
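Pairing each predicted token with its confidence value can be sketched as below. The names `INDEX_WORD` and `decode_with_confidence` are hypothetical, plain lists stand in for tensors, and dropping the assumed `<start>`/`<end>` markers is a choice made for this sketch; the library may treat them differently.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical vocabulary; the real one is provided by the loaded tokenizer.
INDEX_WORD: Dict[int, str] = {1: "<start>", 2: "<end>", 3: "C", 4: "O"}

# Assumed marker tokens, dropped from the result in this sketch.
MARKERS = ("<start>", "<end>")


def decode_with_confidence(
    token_ids: Sequence[int],
    confidences: Sequence[float],
    index_word: Dict[int, str],
) -> List[Tuple[str, float]]:
    """Return (token, confidence) pairs for every non-marker token."""
    if len(token_ids) != len(confidences):
        raise ValueError("token_ids and confidences must have the same length")
    pairs = [(index_word[i], float(c)) for i, c in zip(token_ids, confidences)]
    return [(token, conf) for token, conf in pairs if token not in MARKERS]


pairs = decode_with_confidence([1, 3, 3, 4, 2], [1.0, 0.9, 0.8, 0.7, 0.95], INDEX_WORD)
print(pairs)
```

Joining the first element of each pair recovers the SMILES string, so the per-token confidences can be inspected without a second decoding pass.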
- DECIMER.decimer.get_models(model_urls)
Download and load models from the provided URLs.
This function downloads models from the provided URLs to a default location, then loads tokenizers and TensorFlow saved models.
- Args:
model_urls (dict): A dictionary containing model names as keys and their corresponding URLs as values.
- Returns:
tuple: A tuple containing the loaded tokenizer and the TensorFlow saved models.
tokenizer (object): Tokenizer for DECIMER model.
DECIMER_V2 (tf.saved_model): TensorFlow saved model for DECIMER.
DECIMER_Hand_drawn (tf.saved_model): TensorFlow saved model for DECIMER HandDrawn.
- DECIMER.decimer.load_tokenizer(tokenizer_path)
Load tokenizer with Keras 2/3 compatibility.
- Return type:
object
- Args:
tokenizer_path (str): Path to the pickled tokenizer file
- Returns:
tokenizer: Loaded tokenizer object
- DECIMER.decimer.main()
Takes the path of an image as user input and returns the predicted SMILES as output on the command line.
- Args:
str: image_path
- Returns:
str: predicted SMILES
- DECIMER.decimer.predict_SMILES(image_input, confidence=False, hand_drawn=False)
Predicts SMILES representation of a molecule depicted in the given image.
- Return type:
str
- Args:
image_input (str or np.ndarray): Path of chemical structure depiction image or a numpy array representing the image.
confidence (bool): Flag to indicate whether to return confidence values along with SMILES prediction.
hand_drawn (bool): Flag to indicate whether the molecule in the image is hand-drawn.
- Returns:
str: SMILES representation of the molecule in the input image, optionally with confidence values.
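A call to `predict_SMILES` might be wrapped as in the sketch below. The wrapper `run_prediction` and its `predictor` parameter are hypothetical; they exist only so the example can be exercised without the package installed. The import path and keyword arguments follow the signature documented above, and the import is deferred on the assumption that loading the module is expensive.

```python
from typing import Any, Callable, Optional


def run_prediction(
    image_path: str,
    hand_drawn: bool = False,
    predictor: Optional[Callable[..., Any]] = None,
) -> Any:
    """Predict SMILES for an image, using DECIMER unless a predictor is supplied."""
    if predictor is None:
        # Deferred import: assumed to be costly because the models are loaded.
        from DECIMER.decimer import predict_SMILES as predictor
    return predictor(image_path, confidence=False, hand_drawn=hand_drawn)


# Stand-in predictor for a dry run; a real call would omit `predictor`.
def fake_predictor(image_input, confidence=False, hand_drawn=False):
    return "CCO"


print(run_prediction("molecule.png", predictor=fake_predictor))
```

With DECIMER installed, `run_prediction("path/to/image.png")` would call the library directly, and `hand_drawn=True` would request the hand-drawn model.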