gatenlp.gateslave module

Module for interacting with a Java GATE process, running API commands on it and exchanging data with it.

class gatenlp.gateslave.GateSlave(port=25333, start=True, java='java', host='127.0.0.1', gatehome=None, platform=None)[source]

Bases: object

Create an instance of the GateSlave and either start our own Java GATE process for it to use (start=True) or connect to an existing one (start=False).

After the GateSlave instance has been created successfully, it is possible to:

  • Use one of the methods of the instance to perform operations on the Java side or exchange data

  • Use GateSlave.slave to invoke methods from the PythonSlave class on the Java side

  • Use GateSlave.jvm to directly construct objects or call instance or static methods

NOTE: the GATE process must not write anything important or large to stderr, because everything written to stderr is captured and used for communication between the Java and Python processes. At least part of the stderr output may only be passed on after the GATE process has ended.

Example:

gs = GateSlave()
pipeline = gs.slave.loadPipelineFromFile("thePipeline.xgapp")
doc = gs.slave.createDocument("Some document text")
gs.slave.run4doc(pipeline,doc)
pdoc = gs.gdoc2pdoc(doc)
gs.slave.deleteResource(doc)
# process the gatenlp Document pdoc ...
Parameters
  • port – port to use

  • java – path to the java binary to run or the java command to use from the PATH (for start=True)

  • host – host an existing Java GATE process is running on (only relevant for start=False)

  • gatehome – where GATE is installed (only relevant if start=True). If None, expects environment variable GATE_HOME to be set.

  • platform – system platform we run on, one of Windows, Linux (also for macOS) or Java
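The example above handles a single document. As a minimal sketch, the same calls can be wrapped in a generator that processes several texts with one pipeline and always releases the GATE document afterwards. The helper name annotate_texts is hypothetical and not part of gatenlp; it only uses the calls shown in the example (gs.slave.createDocument, gs.slave.run4doc, gs.gdoc2pdoc, gs.slave.deleteResource) and assumes they behave as described there.

```python
def annotate_texts(gs, pipeline, texts):
    """Yield one gatenlp document per input text (hypothetical helper).

    gs is expected to be a GateSlave-like object and pipeline a handle
    returned by gs.slave.loadPipelineFromFile(...).
    """
    for text in texts:
        gdoc = gs.slave.createDocument(text)
        try:
            gs.slave.run4doc(pipeline, gdoc)
            pdoc = gs.gdoc2pdoc(gdoc)
        finally:
            # Release the Java-side document even if processing failed.
            gs.slave.deleteResource(gdoc)
        yield pdoc
```

Converting before yielding means the Java-side document is already released by the time the caller receives the Python document.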

close()[source]

Clean up: if the GATE slave process was started by us, shut it down.
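Because close() is an ordinary method, it is easy to forget when an exception interrupts the work. One possible pattern, sketched here with the standard library's contextlib.closing, guarantees the call. The helper with_gate_slave and its parameters make_slave and work are hypothetical names introduced for illustration only.

```python
from contextlib import closing


def with_gate_slave(make_slave, work):
    """Create a slave, run work(gs), and always call gs.close().

    make_slave is any zero-argument callable returning an object with a
    close() method, for example: lambda: GateSlave()
    """
    with closing(make_slave()) as gs:
        return work(gs)


# Intended use (requires gatenlp and a GATE installation, so not run here):
# pdoc = with_gate_slave(lambda: GateSlave(), lambda gs: gs.load_pdoc("doc.xml"))
```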

del_gdoc(gdoc)[source]

Delete/unload the GATE document from GATE. This is necessary to do for each GATE document that is not used anymore, otherwise the documents will accumulate in the Java process and eat up all memory. NOTE: just removing all references to the GATE document does not delete/unload the document!

Parameters

gdoc – the document to remove

gdoc2pdoc(gdoc)[source]

Convert the GATE document to a python document and return it.

Parameters

gdoc – the handle to a GATE document

Returns

a gatenlp Document instance

load_gdoc(path, mimetype=None)[source]

Let GATE load a document from the given path and return a handle to it.

Parameters
  • path – path to the gate document to load.

  • mimetype – a mimetype to use when loading.

Returns

a handle to the GATE document
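Since every handle returned by load_gdoc has to be released with del_gdoc, the two calls can be paired in a context manager. The following is only a sketch: loaded_gdoc is a hypothetical helper, not part of gatenlp, and it assumes load_gdoc and del_gdoc behave as documented on this page.

```python
from contextlib import contextmanager


@contextmanager
def loaded_gdoc(gs, path, mimetype=None):
    """Load a GATE document and unload it again when the block exits."""
    gdoc = gs.load_gdoc(path, mimetype=mimetype)
    try:
        yield gdoc
    finally:
        # Dropping the Python reference is not enough; unload explicitly.
        gs.del_gdoc(gdoc)


# Intended use with a real GateSlave instance gs (not run here):
# with loaded_gdoc(gs, "doc.xml") as gdoc:
#     pdoc = gs.gdoc2pdoc(gdoc)
```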

load_pdoc(path, mimetype=None)[source]

Load a document from the given path using GATE, convert it to a gatenlp Python document, and return it.

Parameters
  • path – path to load document from

  • mimetype – mime type to use

Returns

gatenlp document

pdoc2gdoc(pdoc)[source]

Convert the Python gatenlp document to a GATE document and return a handle to it.

Parameters

pdoc – python gatenlp Document

Returns

handle to GATE document

save_gdoc(gdoc, path, mimetype=None)[source]

Save GATE document to the given path.

Parameters
  • gdoc – GATE document handle

  • path – destination path

  • mimetype – mimetype to use when saving. Only the following types are allowed: “”/None (GATE XML), application/fastinfoset, and all mimetypes supported by the Format_Bdoc plugin.

show_gui()[source]

Show the GUI for the started GATE process. NOTE: this is more of a hack and may cause sync problems when closing down the GATE slave.

gatenlp.gateslave.classpath_sep(platform=None)[source]

Return the classpath separator character for the current operating system / platform.

Returns

classpath separator character
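For orientation, Java uses ";" to separate classpath entries on Windows and ":" on Unix-like systems. The sketch below shows one way such a lookup could work. It is an assumption for illustration, not the library's actual implementation; the name classpath_sep_sketch is hypothetical, and the fallback to platform.system() when no platform is given is also assumed.

```python
import platform as _platform


def classpath_sep_sketch(platform=None):
    """Return ';' for Windows-like platform names, ':' otherwise."""
    # Assumption: detect the current system when no platform name is given.
    name = platform or _platform.system()
    return ";" if name.lower().startswith("win") else ":"
```

With this sketch, both "Linux" and "Java" map to ":", which may not be right for every Java-hosted setup.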

gatenlp.gateslave.gate_classpath(gatehome, platform=None)[source]

Return the GATE classpath components as a string, with the element separator characters appropriate for the operating system.

Parameters

gatehome – where GATE is installed, either as a cloned git repo or a downloaded installation dir.

Returns

GATE classpath