Eidos (indra.sources.eidos)

Eidos is an open-domain machine reading system that uses a cascade of grammars to extract causal events from free text. It is well suited to modeling applications that are not tied to a specific domain such as molecular biology.

To set up reading with Eidos, the Eidos system and its dependencies need to be compiled and packaged as a fat JAR:

git clone https://github.com/clulab/eidos.git
cd eidos
sbt assembly

This creates a JAR file at eidos/target/scala[version]/eidos-[version].jar. Set the EIDOSPATH environment variable to the absolute path of this file, then append EIDOSPATH to the CLASSPATH environment variable (entries are separated by colons).
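The CLASSPATH step above can be sanity-checked from Python before the JVM is started. The following is a minimal sketch: the helper name with_eidos_on_classpath is hypothetical and not part of INDRA, and it only builds the resulting string, leaving the actual export to the shell.

```python
import os


def with_eidos_on_classpath(classpath, eidos_jar, sep=os.pathsep):
    """Return `classpath` with `eidos_jar` appended, avoiding duplicates.

    Hypothetical helper, not part of INDRA. Entries are joined with `sep`,
    which defaults to the platform separator (a colon on Linux and macOS).
    """
    entries = [entry for entry in classpath.split(sep) if entry]
    if eidos_jar not in entries:
        entries.append(eidos_jar)
    return sep.join(entries)


if __name__ == "__main__":
    # EIDOSPATH is expected to hold the absolute path to the fat JAR.
    jar = os.environ.get("EIDOSPATH")
    if jar is None:
        print("EIDOSPATH is not set")
    else:
        print(with_eidos_on_classpath(os.environ.get("CLASSPATH", ""), jar))
```

Run as a script, this prints the CLASSPATH value that would result, which can then be exported in the shell.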

The pyjnius package needs to be set up and operational to use Eidos reading in Python. For more details, see Pyjnius setup instructions in the documentation.
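As a quick first check, one can test whether the jnius module (the import name used by pyjnius) is visible to the current interpreter. This is only a sketch: the function name pyjnius_available is hypothetical, and finding the module does not prove that a JVM can actually be started.

```python
import importlib.util


def pyjnius_available():
    """Return True if the `jnius` module installed by pyjnius can be found.

    Hypothetical helper, not part of INDRA. It checks importability only;
    it does not start a JVM or verify the Java configuration.
    """
    return importlib.util.find_spec("jnius") is not None


if __name__ == "__main__":
    print("pyjnius importable:", pyjnius_available())
```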

For Eidos to provide grounding information to be included in INDRA Statements, the Eidos configuration needs to be adjusted. First, in the Eidos installation, create the directory src/main/resources/org/clulab/wm/eidos/w2v. Then, obtain vectors.txt from the Eidos developers and put it in this directory. Next, set the property “useW2V” to true in src/main/resources/eidos.conf. Finally, rerun sbt compile and sbt assembly.

Eidos API (indra.sources.eidos.eidos_api)

indra.sources.eidos.eidos_api.process_json(json_dict)

Return an EidosJsonProcessor by processing the given Eidos JSON dict.

Parameters: json_dict (dict) – The JSON dict to be processed.
Returns: ep – An EidosJsonProcessor containing the extracted INDRA Statements in ep.statements.
Return type: EidosJsonProcessor

indra.sources.eidos.eidos_api.process_json_file(file_name)

Return an EidosJsonProcessor by processing the given Eidos JSON file.

The output from the Eidos reader is in JSON format. This function is useful if the output is saved as a file and needs to be processed.

Parameters: file_name (str) – The name of the JSON file to be processed.
Returns: ep – An EidosJsonProcessor containing the extracted INDRA Statements in ep.statements.
Return type: EidosJsonProcessor

indra.sources.eidos.eidos_api.process_json_ld(json_dict)

Return an EidosJsonLdProcessor by processing the given Eidos JSON-LD dict.

Parameters: json_dict (dict) – The JSON-LD dict to be processed.
Returns: ep – An EidosJsonLdProcessor containing the extracted INDRA Statements in ep.statements.
Return type: EidosJsonLdProcessor

indra.sources.eidos.eidos_api.process_json_ld_file(file_name)

Return an EidosJsonLdProcessor by processing the given Eidos JSON-LD file.

The output from the Eidos reader is in JSON-LD format. This function is useful if the output is saved as a file and needs to be processed.

Parameters: file_name (str) – The name of the JSON-LD file to be processed.
Returns: ep – An EidosJsonLdProcessor containing the extracted INDRA Statements in ep.statements.
Return type: EidosJsonLdProcessor

indra.sources.eidos.eidos_api.process_json_ld_str(json_str)

Return an EidosJsonLdProcessor by processing the given Eidos JSON-LD string.

The output from the Eidos parser is in JSON-LD format.

Parameters: json_str (str) – The JSON-LD string to be processed.
Returns: ep – An EidosJsonLdProcessor containing the extracted INDRA Statements in ep.statements.
Return type: EidosJsonLdProcessor

indra.sources.eidos.eidos_api.process_json_str(json_str)

Return an EidosJsonProcessor by processing the given Eidos JSON string.

The output from the Eidos parser is in JSON format.

Parameters: json_str (str) – The JSON string to be processed.
Returns: ep – An EidosJsonProcessor containing the extracted INDRA Statements in ep.statements.
Return type: EidosJsonProcessor
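The file and string variants above differ from the dict-based functions only in where the JSON comes from. The sketch below illustrates that relationship under the assumption that they behave like thin wrappers; process_str_with and process_file_with are hypothetical names, and the dict-based processing function is passed in so the code runs without INDRA installed.

```python
import json


def process_str_with(json_str, process_dict):
    """Parse a JSON (or JSON-LD) string and hand the dict to `process_dict`."""
    return process_dict(json.loads(json_str))


def process_file_with(file_name, process_dict):
    """Load a JSON (or JSON-LD) file and hand the dict to `process_dict`."""
    with open(file_name, "r") as fh:
        return process_dict(json.load(fh))


# With INDRA installed, the documented dict-based functions could be passed in:
#   from indra.sources.eidos import eidos_api
#   ep = process_file_with("eidos_output.json", eidos_api.process_json)
#   print(ep.statements)
```

In normal use one would call process_json_file, process_json_str and their JSON-LD counterparts directly; the sketch only shows how the inputs relate.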

indra.sources.eidos.eidos_api.process_text(text, out_format='json', save_json='eidos_output.json')

Return an EidosJsonProcessor or EidosJsonLdProcessor by processing the given text.

This constructs a reader object via Java and extracts mentions from the text. It then serializes the mentions into JSON or JSON-LD, depending on out_format, and processes the result with process_json or process_json_ld, respectively.

Parameters:
  • text (str) – The text to be processed.
  • out_format (str) – The type of Eidos output to read into and process. Can be one of “json” or “json_ld”. Default: “json”
  • save_json (Optional[str]) – The name of a file in which to dump the JSON output of Eidos.
Returns:

ep – An EidosJsonProcessor or EidosJsonLdProcessor containing the extracted INDRA Statements in ep.statements.

Return type:

EidosJsonProcessor or EidosJsonLdProcessor depending on out_format
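The flow described for process_text (read, optionally save, then dispatch on out_format) can be sketched as follows. This is an illustration under stated assumptions, not INDRA's implementation: process_text_sketch is a hypothetical name, and both the reader call and the per-format processing functions are injected so the example runs without a JVM.

```python
import json


def process_text_sketch(text, read_text, processors, out_format="json",
                        save_json="eidos_output.json"):
    """Illustrative stand-in for the flow described for process_text.

    `read_text(text, out_format)` plays the role of EidosReader.process_text,
    and `processors` maps each output format to a dict-based processing
    function. Both are supplied by the caller.
    """
    if out_format not in processors:
        raise ValueError("out_format must be one of %s" % sorted(processors))
    json_dict = read_text(text, out_format)
    if save_json:
        # Dump the raw reader output so it can be reprocessed later.
        with open(save_json, "w") as fh:
            json.dump(json_dict, fh, indent=2)
    return processors[out_format](json_dict)


# With INDRA and Eidos set up, the documented pieces could be wired in:
#   from indra.sources.eidos import eidos_api
#   from indra.sources.eidos.eidos_reader import EidosReader
#   reader = EidosReader()
#   ep = process_text_sketch(
#       "Some input text.", reader.process_text,
#       {"json": eidos_api.process_json, "json_ld": eidos_api.process_json_ld})
```

Passing the collaborators in as arguments keeps the sketch testable; callers of INDRA itself would simply use process_text with the desired out_format.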

Eidos Processor (indra.sources.eidos.processor)

class indra.sources.eidos.processor.EidosJsonLdProcessor(json_dict)

This processor extracts INDRA Statements from Eidos JSON-LD output.

Parameters: json_dict (dict) – A JSON dictionary containing the Eidos extractions in JSON-LD format.

tree

objectpath.Tree – The objectpath Tree object representing the extractions.

statements

list[indra.statements.Statement] – A list of INDRA Statements that were extracted by the processor.

class indra.sources.eidos.processor.EidosJsonProcessor(json_dict)

This processor extracts INDRA Statements from Eidos JSON (not JSON-LD) output.

Parameters: json_dict (dict) – A JSON dictionary containing the Eidos extractions in JSON (not JSON-LD) format.

tree

objectpath.Tree – The objectpath Tree object representing the extractions.

statements

list[indra.statements.Statement] – A list of INDRA Statements that were extracted by the processor.
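Both processors expose their results through the statements list, so downstream code can treat them uniformly. As a small sketch, the hypothetical helper below (not part of INDRA) summarizes a statements list by class name; it relies only on the list being iterable.

```python
from collections import Counter


def count_statement_types(statements):
    """Count the items in a statements list by their class name.

    Hypothetical helper, not part of INDRA. It works on any iterable of
    objects, such as the `statements` attribute of either processor.
    """
    return Counter(type(stmt).__name__ for stmt in statements)


# With a processor `ep` returned by one of the eidos_api functions:
#   print(count_statement_types(ep.statements))
```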

Eidos Reader (indra.sources.eidos.eidos_reader)

class indra.sources.eidos.eidos_reader.EidosReader

Reader object keeping an instance of the Eidos reader as a singleton.

This allows the Eidos reader to be initialized only when the first piece of text is read. Subsequent readings reuse the same instance of the reader and are therefore faster.

eidos_reader

org.clulab.wm.eidos.EidosSystem – A Scala object, an instance of the Eidos reading system. It is instantiated only when first processing text.

process_text(text, format='json')

Return a mentions JSON object given text.

Parameters:
  • text (str) – Text to be processed.
  • format (str) – The format of the output to produce, one of “json” or “json_ld”. Default: “json”
Returns:

json_dict – A JSON object of mentions extracted from text.

Return type:

dict
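The lazy initialization described for EidosReader, where the expensive reader is built on the first call and reused afterwards, can be sketched in plain Python. The class below is a hypothetical stand-in, not INDRA's code: LazyReader and its factory argument are assumptions for illustration, and the factory takes the place of instantiating org.clulab.wm.eidos.EidosSystem through the JVM.

```python
class LazyReader:
    """Sketch of lazy initialization: build the reader on first use only."""

    def __init__(self, factory):
        self._factory = factory    # callable that builds the expensive reader
        self.eidos_reader = None   # stays None until the first text is read

    def process_text(self, text):
        """Build the underlying reader if needed, then apply it to `text`."""
        if self.eidos_reader is None:
            self.eidos_reader = self._factory()
        return self.eidos_reader(text)


if __name__ == "__main__":
    def slow_factory():
        print("initializing reader (happens once)")
        return lambda text: {"text": text}

    reader = LazyReader(slow_factory)
    reader.process_text("first call pays the start-up cost")
    reader.process_text("second call reuses the instance")
```

Only the first process_text call pays the start-up cost, which is the behavior the class description attributes to keeping a single reader instance.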