FAQ: Tools
What tools do I need for LinkML?
Formally, LinkML is a specification for modeling data, and is independent of any set of tools.
However, for practical purposes, you will find the core python toolchain useful, whether you use this as a python library, or a command line tool.
This includes functionality like:
- generators to convert schemas to other modeling languages
- data converters and validators for working with data that conforms to LinkML (including RDF, JSON, and TSV)
The GitHub repo is https://github.com/linkml/linkml
For installation, see the installation guide.
There are other tools in the LinkML ecosystem that you may find useful:
- linkml/schemasheets, for managing your schema as a spreadsheet
- linkml/linkml-model-enrichment, for bootstrapping and enhancing schemas
- linkml/linkml-owl, for generating OWL ontologies using schemas as templates
- linkml/sparqlfun, for templated SPARQL queries
How do I install the LinkML tools?
See the installation guide.
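As a quick sketch, the core toolchain can be installed from PyPI, after which each generator is available as its own command (`personinfo.yaml` below is a hypothetical schema file):

```shell
pip install linkml

# each generator is exposed as its own command, for example:
gen-json-schema personinfo.yaml > personinfo.schema.json
```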
Is there a tool to manage schemas as spreadsheets?
Yes! See linkml/schemasheets, which lets you manage your schema as a spreadsheet.
How do I browse a schema?
For small schemas with limited inheritance, it should be possible to mentally picture the structure just by examining the source YAML. For larger schemas, with deep inheritance, it can help to have some kind of hierarchical browsing tool.
There are a few strategies:
- Use gen-markdown to make Markdown that can be viewed using mkdocs
- Use gen-owl to make an OWL ontology, which can be browsed:
  - using an ontology editing tool like Protege
  - by publishing the ontology with an ontology repository and using a web ontology browser
  - by running the Ontology Lookup Service docker image and browsing using a web browser
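As a sketch of the OWL-based strategy (assuming a hypothetical schema file `personinfo.yaml`):

```shell
# generate an OWL rendering of the schema, then open the output in Protege
gen-owl personinfo.yaml > personinfo.owl.ttl
```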
How can I check my schema is valid?
You can use any of the generator tools distributed as part of LinkML to check for errors in your schema.
Are there tools to create a schema from JSON-Schema/SHACL/SQL DDL/…?
Currently the core LinkML framework can generate schemas in other frameworks from a LinkML schema. The generators are part of the core framework.
We have experimental importers as part of the linkml-model-enrichment project, which can generate a schema from:
- an OWL ontology
- JSON-Schema

Others may be added in the future.
However, these importers are not part of the core framework; they may be incomplete, less well supported, and less well documented. You may still find them useful to kick-start a schema, but you should not rely on them in a production environment.
Are there tools to infer a schema from data?
The linkml-model-enrichment framework can seed a schema from:
- CSV/TSV files
- JSON data
- RDF triples
Note that a number of heuristic measures are applied, and the results are not guaranteed to be correct. You may still find them useful to bootstrap a new schema.
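The flavor of these heuristics can be sketched in a few lines of plain Python. This is an illustration of the idea only, not the linkml-model-enrichment implementation, and all names here are made up:

```python
import json

# map Python value types (from parsed JSON) to guessed range names
TYPE_MAP = {str: "string", int: "integer", float: "float", bool: "boolean"}

def infer_attributes(records):
    """Guess an attribute -> range mapping from a list of JSON objects."""
    attributes = {}
    for record in records:
        for key, value in record.items():
            guessed = TYPE_MAP.get(type(value), "string")
            # heuristic: if two records disagree on a type, fall back to string
            if attributes.setdefault(key, guessed) != guessed:
                attributes[key] = "string"
    return attributes

records = json.loads('[{"name": "Akira", "age": 33}, {"name": "Sam", "age": "unknown"}]')
print(infer_attributes(records))  # → {'name': 'string', 'age': 'string'}
```

As the example shows, a single inconsistent record degrades an attribute's inferred range, which is exactly why the results should be reviewed rather than trusted blindly.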
This framework also has tools to:
- automatically annotate mappings in a schema using the BioPortal annotator service
- automatically assign meaning fields in enums using the BioPortal and OLS annotators
Again, this is a text-mining based approach, and will yield both false positives and false negatives.
How do I programmatically create schemas?
As LinkML schemas are YAML files, you can use any library that writes YAML.
For example, in Python you can write code like this:
```python
import yaml

my_schema_url = "https://example.org/my-schema"  # placeholder URL

schema = {
    "id": my_schema_url,
    "classes": [
        {
            "Person": {
                "description": "any person, living or dead",
                "attributes": {
                    ...
                }
            }
        }
    ]
}
print(yaml.dump(schema))
```
You can also write similar code in most languages.
While this should work fine, the approach has some disadvantages. In particular you get no IDE support and there is no guard against making mistakes in key names or structure until you come to run the code.
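To see the hazard concretely (a hypothetical example, using json for brevity; yaml.dump behaves the same way): a typo in a key name is accepted silently and only surfaces when a downstream tool consumes the output:

```python
import json

# "clases" is a typo for "classes", but Python happily builds
# and serializes the dict anyway -- no error is raised here
schema = {
    "id": "https://example.org/my-schema",  # made-up URL
    "clases": {
        "Person": {"description": "any person, living or dead"}
    },
}
serialized = json.dumps(schema)  # succeeds; the mistake surfaces only downstream
print(serialized)
```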
A better approach for Python developers is to use the Python object model that is generated from the metamodel.
```python
from linkml_runtime.linkml_model.meta import SchemaDefinition, ClassDefinition

s = SchemaDefinition(id=my_schema_id,
                     name="my-schema",  # a name is also required
                     classes=[...])
```
You can also use the SchemaView class; see the developer guide section on manipulating schemas.
How can I check my data is valid?
If you have data in RDF, JSON, or TSV, then you can check its validity using linkml-validate.
See validating data for more details
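For example, assuming a hypothetical schema file `personinfo.yaml` and data file `data.yaml`:

```shell
linkml-validate --schema personinfo.yaml data.yaml
```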
Are there tools for editing my data?
The same LinkML data can be rendered as JSON or RDF, and for schemas that have a relatively flat structure, TSVs can be used. Any editing tool that works with those formats can be used for LinkML. For example, you can turn your schema into OWL and then use Protege to edit instance data, or you can simply edit your data in a TSV.
For “flat” schemas such as those for collecting sample or specimen metadata, the DataHarmonizer accepts LinkML as a schema language.
If you are comfortable using an IDE like PyCharm, and with editing your data as JSON, then you can use your LinkML schema to provide dynamic schema validation and autocompletion while editing; see these slides for a guide.
Are there guides for developing LinkML compliant tools?
See the tool developer guide
Can I generate a website from a LinkML schema?
Yes!
See the markdown generator for details.
If you run:

```shell
gen-markdown -d docs personinfo.yaml
```

it will place all the Markdown documents you need for an mkdocs site in the `docs` directory.
Can I customize the Markdown generation for my schema site?
For some purposes, the generic schema documentation provided by gen-markdown may look too… generic.
You can customize markdown generation using your own templates. This requires a basic understanding of Jinja2 templates.
The protocol is:

1. Copy the Jinja2 templates from docgen into a `templates` folder in your own repo
2. Customize these templates
3. Run `gen-docs --template-directory templates -d docs my_schema.yaml`
4. Run `mkdocs serve` to test locally
5. Iterate until the pages look how you want, then deploy (e.g. with `mkdocs gh-deploy`)
An example repo that uses highly customized templates: GSC MIxS
Can I use my schema to do reasoning over my data?
There are a number of strategies for performing deductive inference:
- Convert your schema to OWL and your data to RDF, and use an OWL reasoner
- Use the (experimental) linkml-datalog framework