gatenlp.gateslave module
Module for interacting with a Java GATE process, running API commands on it and exchanging data with it.
-
class gatenlp.gateslave.GateSlave(port=25333, start=True, java='java', host='127.0.0.1', gatehome=None, platform=None)
Bases: object
Create an instance of the GateSlave and either start our own Java GATE process for it to use (start=True) or connect to an existing one (start=False).
After the GateSlave instance has been created successfully, it is possible to:
use one of the methods of the instance to perform operations on the Java side or exchange data
use GateSlave.slave to invoke methods from the PythonSlave class on the Java side
use GateSlave.jvm to directly construct objects or call instance or static methods
NOTE: the GATE process must not output anything important/big to stderr because everything from stderr gets captured and used for communication between the Java and Python processes. At least part of the output to stderr may only be passed on after the GATE process has ended.
Example:
    from gatenlp.gateslave import GateSlave

    gs = GateSlave()
    pipeline = gs.slave.loadPipelineFromFile("thePipeline.xgapp")
    doc = gs.slave.createDocument("Some document text")
    gs.slave.run4doc(pipeline, doc)
    pdoc = gs.gdoc2pdoc(doc)
    # unload the GATE document on the Java side once it has been converted
    gs.slave.deleteResource(doc)
    # process the gatenlp Document pdoc ...
- Parameters
port – port to use
java – path to the java binary to run or the java command to use from the PATH (for start=True)
host – host an existing Java GATE process is running on (only relevant for start=False)
gatehome – where GATE is installed (only relevant if start=True). If None, expects environment variable GATE_HOME to be set.
platform – system platform we run on, one of Windows, Linux (also for macOS) or Java
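The start, host and gatehome parameters select between the two modes described above. The following is a minimal sketch, assuming a hypothetical helper `connect_or_start` that is not part of gatenlp. It receives the GateSlave class as an argument, so the snippet itself does not need a GATE installation to be read or tried out.

```python
def connect_or_start(gateslave_cls, host=None, port=25333, gatehome=None):
    """Hypothetical helper, not part of gatenlp.

    If `host` is given, connect to a Java GATE process that is already
    running there; otherwise start a new process from `gatehome`
    (or from the GATE_HOME environment variable if `gatehome` is None).
    """
    if host is not None:
        # start=False: attach to an existing process, gatehome is not relevant
        return gateslave_cls(port=port, start=False, host=host)
    # start=True: launch our own process, host is not relevant
    return gateslave_cls(port=port, start=True, gatehome=gatehome)
```

With gatenlp installed, this would presumably be called as `connect_or_start(GateSlave, host="127.0.0.1")` to attach to a running process, or as `connect_or_start(GateSlave)` to start a new one.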
-
close()
Clean up: if the GATE slave process was started by us, shut it down.
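Because close() is what shuts down a process we started, it is worth making sure it also runs when an error occurs. One way to do this, sketched here with the standard library's `contextlib.closing` and a hypothetical helper name `run_with_cleanup`:

```python
from contextlib import closing


def run_with_cleanup(gs, work):
    """Hypothetical helper: call work(gs) and always call gs.close() afterwards."""
    # closing() calls gs.close() when the block exits, even on an exception
    with closing(gs):
        return work(gs)
```

Here `gs` is assumed to be a GateSlave instance and `work` any callable that takes it as its only argument.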
-
del_gdoc(gdoc)
Delete/unload the GATE document from GATE. This is necessary to do for each GATE document that is not used anymore, otherwise the documents will accumulate in the Java process and eat up all memory. NOTE: just removing all references to the GATE document does not delete/unload the document!
- Parameters
gdoc – the document to remove
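Since dropping Python references does not unload the document, the load and the delete should be paired explicitly. A minimal sketch, assuming a hypothetical context manager `loaded_gdoc` (not part of gatenlp) built only on the load_gdoc and del_gdoc methods documented here:

```python
from contextlib import contextmanager


@contextmanager
def loaded_gdoc(gs, path, mimetype=None):
    """Hypothetical helper: load a GATE document and always unload it again."""
    gdoc = gs.load_gdoc(path, mimetype)
    try:
        yield gdoc
    finally:
        # runs on normal exit and on exceptions, so the document
        # cannot accumulate in the Java process
        gs.del_gdoc(gdoc)
```

Inside a `with loaded_gdoc(gs, "some/path") as gdoc:` block the handle can be used as usual; it is unloaded as soon as the block ends.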
-
gdoc2pdoc(gdoc)
Convert the GATE document to a Python document and return it.
- Parameters
gdoc – the handle to a GATE document
- Returns
a gatenlp Document instance
-
load_gdoc(path, mimetype=None)
Let GATE load a document from the given path and return a handle to it.
- Parameters
path – path to the GATE document to load.
mimetype – a mimetype to use when loading.
- Returns
a handle to the GATE document
-
load_pdoc(path, mimetype=None)
Load a document from the given path using GATE, convert it, and return it as a gatenlp Python document.
- Parameters
path – path to load document from
mimetype – mime type to use
- Returns
gatenlp document
-
pdoc2gdoc(pdoc)
Convert the Python gatenlp document to a GATE document and return a handle to it.
- Parameters
pdoc – python gatenlp Document
- Returns
handle to GATE document
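pdoc2gdoc and gdoc2pdoc together allow a round trip: send a Python document to GATE, run a pipeline on it, and bring the result back. The sketch below uses a hypothetical helper `run_pipeline_on_pdoc` and assumes that the handle returned by pdoc2gdoc can be passed to `gs.slave.run4doc` in the same way as a document created with `gs.slave.createDocument`.

```python
def run_pipeline_on_pdoc(gs, pipeline, pdoc):
    """Hypothetical helper: annotate a gatenlp document with a GATE pipeline."""
    gdoc = gs.pdoc2gdoc(pdoc)  # Python document -> GATE document handle
    try:
        # assumption: run4doc accepts the handle returned by pdoc2gdoc
        gs.slave.run4doc(pipeline, gdoc)
        return gs.gdoc2pdoc(gdoc)  # GATE document -> new Python document
    finally:
        # always unload the temporary GATE document from the Java process
        gs.del_gdoc(gdoc)
```

The pipeline is passed in rather than loaded inside the function, so one pipeline loaded with `gs.slave.loadPipelineFromFile` can be reused for many documents.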
-
save_gdoc(gdoc, path, mimetype=None)
Save GATE document to the given path.
- Parameters
gdoc – GATE document handle
path – destination path
mimetype – mimetype to use when saving. Only the following types are allowed: “” or None (GATE XML), application/fastinfoset, and all mimetypes supported by the Format_Bdoc plugin.
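Combining load_gdoc and save_gdoc gives a simple format conversion that happens entirely on the Java side. A minimal sketch with a hypothetical helper `convert_with_gate`; the parameter names `src_mimetype` and `dst_mimetype` are chosen for illustration only:

```python
def convert_with_gate(gs, src, dst, src_mimetype=None, dst_mimetype=None):
    """Hypothetical helper: let GATE load a document and save it elsewhere."""
    gdoc = gs.load_gdoc(src, src_mimetype)
    try:
        # dst_mimetype=None saves as GATE XML, per the save_gdoc parameters
        gs.save_gdoc(gdoc, dst, dst_mimetype)
    finally:
        # unload the document even if saving failed
        gs.del_gdoc(gdoc)
```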
-
gatenlp.gateslave.classpath_sep(platform=None)
Return the classpath separator character for the current operating system / platform.
- Returns
classpath separator character
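For orientation: Java separates classpath entries with ";" on Windows and with ":" on Unix-like systems. The following is a sketch of how a function with this signature could behave, not the library's actual implementation. The name `classpath_sep_sketch` and the use of `platform.system()` as the default are assumptions.

```python
import platform as _platform


def classpath_sep_sketch(platform=None):
    """Illustrative only: pick the Java classpath separator for a platform name."""
    if platform is None:
        # assumption: fall back to the name reported by the standard library
        platform = _platform.system()
    # Windows uses ";", everything else is treated as Unix-like and uses ":"
    return ";" if platform.lower() == "windows" else ":"
```

In this sketch any value other than Windows, including Java, falls through to ":"; the real function may handle that case differently.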
-
gatenlp.gateslave.gate_classpath(gatehome, platform=None)
Return the GATE classpath components as a string, with the element separator characters appropriate for the operating system.
- Parameters
gatehome – where GATE is installed, either as a cloned git repo or a downloaded installation directory.
- Returns
GATE classpath