BPESegmentation
Split words into subword units using a BPE model [Sennrich et al., 2016].
Parameters:
model: a model file with BPE codes
merges: use this many BPE operations (optional; default -1 = use all learned operations)
separator: separator between non-final subword units (optional; default "@@")
vocab: vocabulary file; if provided, reverts any merge operations that produce an OOV (optional; default null)
glossaries: words matching any of the words or regular expressions provided in glossaries are left unsegmented (optional; default null)
dropout: probability of dropping each BPE merge operation (optional; default 0)
See train_bpe for training a model and the subword-nmt documentation for details of the parameters.
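To make the effect of merges and separator concrete, the following is a minimal, self-contained sketch of applying BPE merges to a word. It is a simplified illustration, not the subword-nmt implementation: the function names segment_word and segment_line, the in-memory merge list, and the toy merges are all hypothetical, and vocab, glossaries, and dropout are not modelled.

```python
EOW = "</w>"  # end-of-word marker, as used in BPE code files


def segment_word(word, merge_ops, merges=-1, separator="@@"):
    """Toy BPE segmentation of one word (illustrative sketch only)."""
    if not word:
        return []
    # Keep only the first `merges` operations; -1 means use all of them.
    active = merge_ops if merges < 0 else merge_ops[:merges]
    rank = {pair: i for i, pair in enumerate(active)}
    # Start from single characters; the last one carries the end-of-word marker.
    symbols = list(word[:-1]) + [word[-1] + EOW]
    while len(symbols) > 1:
        pairs = [(symbols[i], symbols[i + 1]) for i in range(len(symbols) - 1)]
        # Pick the adjacent pair that was learned earliest.
        best = min(pairs, key=lambda p: rank.get(p, float("inf")))
        if best not in rank:
            break  # no applicable merge operation is left
        merged, i = [], 0
        while i < len(symbols):
            if i < len(symbols) - 1 and (symbols[i], symbols[i + 1]) == best:
                merged.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                merged.append(symbols[i])
                i += 1
        symbols = merged
    # Drop the end-of-word marker and tag every non-final unit with the separator.
    symbols[-1] = symbols[-1][: -len(EOW)]
    return [s + separator for s in symbols[:-1]] + [symbols[-1]]


def segment_line(line, merge_ops, merges=-1, separator="@@"):
    """Segment every whitespace-separated word in a line."""
    units = []
    for word in line.split():
        units.extend(segment_word(word, merge_ops, merges, separator))
    return " ".join(units)


# Hypothetical merge operations, listed in the order they were learned.
toy_merges = [("l", "o"), ("lo", "w"), ("e", "r" + EOW)]
print(segment_line("lower", toy_merges))            # low@@ er
print(segment_line("lower", toy_merges, merges=1))  # lo@@ w@@ e@@ r
```

With all three toy operations the word is split into two units. Limiting merges to 1 applies only the first operation, so the output is more fragmented.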