gatenlp.offsetmapper module
class gatenlp.offsetmapper.OffsetMapper(text: str)
Bases: object
Calculate the tables for mapping Unicode code points to UTF-16 code units.

NOTE: this currently optimizes for conversion speed at the cost of memory, with one special case: if, after creating the java2python table, all offsets turn out to be identical, the tables are discarded and a flag is set instead.

:param text: the text as a Python string
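
To illustrate the idea behind such tables, here is a minimal sketch of how code point offsets and UTF-16 code unit offsets could be related. It is not the gatenlp implementation: the function name `build_offset_tables`, its return shape, and the use of `(None, None)` to stand in for the "all offsets identical" flag are assumptions made for this example only.

```python
def build_offset_tables(text: str):
    """Hypothetical helper, for illustration only (not part of gatenlp).

    Returns (python2java, java2python), where
      python2java[i] is the UTF-16 code unit offset of code point offset i,
      java2python[j] is the code point offset of UTF-16 code unit offset j.
    Both tables include one extra entry for the end-of-text offset.
    Returns (None, None) when no character needs a surrogate pair, since
    the two kinds of offsets are then identical.
    """
    python2java = []
    java2python = []
    java_off = 0
    for py_off, ch in enumerate(text):
        python2java.append(java_off)
        # Code points above U+FFFF take two UTF-16 code units (a surrogate pair).
        width = 2 if ord(ch) > 0xFFFF else 1
        java2python.extend([py_off] * width)
        java_off += width
    # Entries for the offset just past the last character.
    python2java.append(java_off)
    java2python.append(len(text))
    if java_off == len(text):
        # Assumed stand-in for the flag described above: no tables needed.
        return None, None
    return python2java, java2python


if __name__ == "__main__":
    p2j, j2p = build_offset_tables("a\U0001F600b")
    print(p2j)  # [0, 1, 3, 4]
    print(j2p)  # [0, 1, 1, 2, 3]
```

Storing one entry per offset makes each lookup a plain list index, which matches the speed-over-memory trade-off mentioned in the note; a sparser structure would save memory at the cost of a search per conversion.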