Server¶
Models¶
-
class
mammoth.translate.translation_server.
ServerModel
(opts, model_id, preprocess_opt=None, tokenizer_opt=None, postprocess_opt=None, custom_opt=None, load=False, timeout=-1, on_timeout='to_cpu', model_root='./', ct2_model=None, ct2_translator_args=None, ct2_translate_batch_args=None)[source]¶ Bases:
object
Wrap a model with server functionality.
- Parameters
opts (dict) – Options for the Translator
model_id (int) – Model ID
preprocess_opt (list) – Options for the preprocessing step, or None
tokenizer_opt (dict) – Options for the tokenizer or None
postprocess_opt (list) – Options for the postprocessing step, or None
custom_opt (dict) – Custom options, can be used within preprocess or postprocess, default None
load (bool) – whether to load the model during
__init__()
timeout (int) – Seconds before running
do_timeout()
. Negative values mean no timeout.
on_timeout (str) – Options are [“to_cpu”, “unload”]. Sets what to do on timeout (see
do_timeout()
).
model_root (str) – Path to the model directory; it must contain the model and tokenizer files.
-
detokenize
(sequence, side='tgt')[source]¶ Detokenize a single sequence
Same args/returns as
tokenize()
-
do_timeout
()[source]¶ Timeout function that frees GPU memory.
Moves the model to CPU or unloads it, depending on the value of self.on_timeout.
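The two timeout policies can be illustrated with a minimal sketch. It is not mammoth's implementation: the class `TimeoutSketch`, its `device` and `loaded` attributes, and the `reset_timer` helper are hypothetical stand-ins, and the timer handling is only one plausible way to schedule `do_timeout()`.

```python
import threading


class TimeoutSketch:
    """Hypothetical stand-in for a served model; not mammoth's ServerModel."""

    def __init__(self, timeout=-1, on_timeout="to_cpu"):
        if on_timeout not in ("to_cpu", "unload"):
            raise ValueError(f"Invalid on_timeout: {on_timeout!r}")
        self.timeout = timeout
        self.on_timeout = on_timeout
        self.device = "gpu"  # assumption: the model starts on the GPU
        self.loaded = True
        self._timer = None

    def reset_timer(self):
        """Restart the countdown; a negative timeout disables it."""
        if self._timer is not None:
            self._timer.cancel()
            self._timer = None
        if self.timeout < 0:
            return
        self._timer = threading.Timer(self.timeout, self.do_timeout)
        self._timer.daemon = True
        self._timer.start()

    def do_timeout(self):
        """Free GPU memory according to the configured policy."""
        if self.on_timeout == "unload":
            self.loaded = False
            self.device = None
        else:  # "to_cpu": keep the model in memory, off the GPU
            self.device = "cpu"
```

With `on_timeout="to_cpu"` the model stays loaded and only changes device, so the next request avoids a reload from disk; `"unload"` frees more memory at the cost of a full reload.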
-
maybe_convert_align
(src, tgt, align)[source]¶ Convert alignment to match detokenized src/tgt (or not).
- Parameters
src (str) – The tokenized source sequence.
tgt (str) – The tokenized target sequence.
align (str) – The alignment corresponding to the src/tgt pair.
- Returns
The alignment corresponding to the detokenized src/tgt.
- Return type
align (str)
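As a rough illustration of what converting an alignment to detokenized text involves, the sketch below maps subword-level alignment pairs to word-level pairs. It rests on assumptions not stated in this reference: alignments are written as space-separated `src-tgt` index pairs, and tokens use a SentencePiece-style `▁` word-start marker. The helper names are hypothetical, and the library's actual logic may differ.

```python
def subword_to_word_map(tokens, spacer="\u2581"):
    """Map each subword position to the index of the word it belongs to."""
    word_idx = -1
    mapping = []
    for i, tok in enumerate(tokens):
        # A new word starts at the first token or at any spacer-prefixed token.
        if i == 0 or tok.startswith(spacer):
            word_idx += 1
        mapping.append(word_idx)
    return mapping


def convert_align(src, tgt, align):
    """Rewrite subword-level 'i-j' pairs as sorted, de-duplicated word-level pairs."""
    src_map = subword_to_word_map(src.split())
    tgt_map = subword_to_word_map(tgt.split())
    pairs = set()
    for pair in align.split():
        s, t = pair.split("-")
        pairs.add((src_map[int(s)], tgt_map[int(t)]))
    return " ".join(f"{s}-{t}" for s, t in sorted(pairs))


print(convert_align("\u2581Hel lo \u2581world",
                    "\u2581Bon jour \u2581le \u2581monde",
                    "0-0 1-1 2-3"))  # prints "0-0 1-2"
```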
-
maybe_detokenize
(sequence, side='tgt')[source]¶ De-tokenize the sequence (or not)
Same args/returns as
tokenize()
-
maybe_detokenize_with_align
(sequence, src, side='tgt')[source]¶ De-tokenize (or not) the sequence (with alignment).
- Parameters
sequence (str) – The sequence to detokenize, possibly with an alignment separated by ` ||| `.
- Returns
The detokenized sequence. align (str): The alignment corresponding to the detokenized src/tgt,
sorted, or None if there is no alignment in the output.
- Return type
sequence (str)
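The ` ||| ` convention described above can be sketched as follows. This is an illustrative helper only; `split_alignment` is a hypothetical name, and the real method additionally detokenizes the text and converts the alignment.

```python
def split_alignment(sequence, separator=" ||| "):
    """Split 'text ||| align' into (text, align); align is None if absent."""
    if separator in sequence:
        text, align = sequence.split(separator, 1)
        return text, align
    return sequence, None


text, align = split_alignment("\u2581Hello \u2581world ||| 0-0 1-1")
# text is the tokenized output, align is the raw alignment string
```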
-
maybe_tokenize
(sequence, side='src')[source]¶ Tokenize the sequence (or not).
Same args/returns as tokenize().
-
parse_opt
(opts)[source]¶ Parse the option set passed by the user using mammoth.opts
- Parameters
opts (dict) – Options passed by the user
- Returns
full set of options for the Translator
- Return type
opts (argparse.Namespace)
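One way to picture the dict-to-Namespace step is the sketch below, which turns user options into command-line style arguments and lets an `argparse` parser fill in defaults. The parser here is a hypothetical stand-in with two made-up options (`beam_size`, `n_best`); the real option set comes from mammoth.opts, and the library's handling of flags and special cases is not reproduced.

```python
import argparse


def parse_user_opts(opts):
    """Merge a user-supplied dict with parser defaults into a Namespace."""
    parser = argparse.ArgumentParser()
    # Hypothetical options, standing in for those defined in mammoth.opts.
    parser.add_argument("--beam_size", type=int, default=5)
    parser.add_argument("--n_best", type=int, default=1)

    argv = []
    for key, value in opts.items():
        argv += [f"--{key}", str(value)]
    return parser.parse_args(argv)


ns = parse_user_opts({"beam_size": 10})
# ns.beam_size comes from the user dict, ns.n_best from the parser default
```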
-
postprocess
(sequence)[source]¶ Postprocess a single sequence.
- Parameters
sequence (str) – The sequence to postprocess.
- Returns
The postprocessed sequence.
- Return type
sequence (str)
-
preprocess
(sequence)[source]¶ Preprocess a single sequence.
- Parameters
sequence (str) – The sequence to preprocess.
- Returns
The preprocessed sequence.
- Return type
sequence (str)
-
rebuild_seg_packages
(all_preprocessed, results, scores, aligns, n_best)[source]¶ Rebuild proper segment packages based on initial n_seg.
Core Server¶
-
class
mammoth.translate.translation_server.
TranslationServer
[source]¶ Bases:
object
-
clone_model
(model_id, opts, timeout=-1)[source]¶ Clone the model with ID model_id.
Different options may be passed. If opts is None, the clone uses the same set of options.
-
preload_model
(opts, model_id=None, **model_kwargs)[source]¶ Preload the model: update the internal data structure.
The model is actually loaded only if load is set.
-