gatenlp.offsetmapper module

class gatenlp.offsetmapper.OffsetMapper(text: str)

Bases: object

Calculate the tables for mapping between Python string offsets (Unicode code points) and Java string offsets (UTF-16 code units). NOTE: this currently optimizes for conversion speed at the cost of memory, with one special case: if, after creating the java2python table, all Java and Python offsets turn out to be identical, the tables are discarded and only a flag is set.

:param text: the text as a Python string
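The table idea described above can be illustrated with a minimal sketch. This is not the gatenlp implementation; the names `build_offset_tables` and `is_identity` are hypothetical, and the sketch only assumes that a code point above U+FFFF occupies two UTF-16 code units while every other code point occupies one.

```python
def build_offset_tables(text: str):
    """Return (python2java, java2python) lookup lists for `text`.

    Hypothetical helper for illustration, not part of gatenlp.
    """
    python2java = []
    java2python = []
    for py_off, ch in enumerate(text):
        # The Java offset of this character is the number of code units so far.
        python2java.append(len(java2python))
        # Code points above U+FFFF need a surrogate pair (two code units).
        width = 2 if ord(ch) > 0xFFFF else 1
        java2python.extend([py_off] * width)
    # One extra entry each, so the end-of-text offset can be converted too.
    python2java.append(len(java2python))
    java2python.append(len(text))
    return python2java, java2python


def is_identity(table):
    """True if every offset maps to itself, so no table is needed."""
    return all(index == value for index, value in enumerate(table))


p2j, j2p = build_offset_tables("a\U0001F600b")
```

For text without any characters outside the Basic Multilingual Plane, `is_identity` returns `True` for both tables, which corresponds to the special case where the tables can be dropped in favour of a flag.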

convert_to_java(offsets)

Convert one Python offset or an iterable of Python offsets to the corresponding Java offset(s).

:param offsets: a single offset or an iterable of offsets
:return: the corresponding Java offset or offsets
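To show what a Python-to-Java conversion computes, here is a table-free sketch that relies only on the standard `utf-16-le` codec. The function `to_java_offsets` is a hypothetical name, not a gatenlp function, and the choice to return a list for iterable input is an assumption of this sketch. It re-encodes a prefix of the text for every offset, so it is much slower than a table lookup on long texts.

```python
from numbers import Integral


def to_java_offsets(text, offsets):
    """Map Python offsets into `text` to UTF-16 code unit offsets.

    Hypothetical helper for illustration, not part of gatenlp.
    """
    def one(py_off):
        # Each UTF-16 code unit is two bytes, so the number of code units
        # before py_off is half the encoded length of the prefix.
        return len(text[:py_off].encode("utf-16-le")) // 2

    if isinstance(offsets, Integral):
        return one(offsets)
    return [one(py_off) for py_off in offsets]
```

In `"a\U0001F600b"` the character `b` sits at Python offset 2 but at Java offset 3, because the emoji before it takes two code units.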

convert_to_python(offsets)

Convert one Java offset or an iterable of Java offsets to the corresponding Python offset(s).

:param offsets: a single offset or an iterable of offsets
:return: the corresponding Python offset or offsets
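The reverse direction can be sketched with a binary search over the Java start offset of each character. This trades lookup speed for memory, since it avoids keeping one table entry per code unit. The names `java_starts` and `to_python_offsets` are hypothetical and not part of gatenlp. Mapping a Java offset that falls inside a surrogate pair to the character containing it is an assumption of this sketch.

```python
from bisect import bisect_right
from numbers import Integral


def java_starts(text):
    """Java offset at which each Python character starts, plus the end offset."""
    starts = []
    java_off = 0
    for ch in text:
        starts.append(java_off)
        # Code points above U+FFFF take two UTF-16 code units.
        java_off += 2 if ord(ch) > 0xFFFF else 1
    starts.append(java_off)
    return starts


def to_python_offsets(text, offsets):
    """Map UTF-16 code unit offsets to Python offsets (illustrative only)."""
    starts = java_starts(text)

    def one(java_off):
        # Index of the last character whose start is <= java_off.
        return bisect_right(starts, java_off) - 1

    if isinstance(offsets, Integral):
        return one(offsets)
    return [one(java_off) for java_off in offsets]
```

For `"a\U0001F600b"`, Java offsets 1 and 2 both map to Python offset 1, because both code units belong to the same emoji.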