Using yq for querying and manipulating schemas
STATUS: DRAFT
yq is a command-line YAML processor, similar to jq.
As all LinkML schemas have a canonical form as YAML, you can use yq to process these.
Note: yq operates at the level of schema yaml document structure, not the meaning of a schema. It has no knowledge of:
imports
inheritance and inference of slots over class hierarchies
inlining as dicts
If you want to do semantics-aware schema processing then we recommend you use SchemaView.
However, for certain kinds of quick and dirty low-level operations, yq provides a fast, flexible, and easy way to query schema yaml files.
You could also use jq, but this requires a (trivial) intermediate step of converting YAML to JSON (and back to YAML, if you are performing write operations). We recommend yq over jq because it works on YAML natively.
This guide is mostly in the form of cookbook examples. If you wish to perform operations that don’t fit a template here, then you will need to consult the (excellent) yq docs, and also have some awareness of how LinkML schemas are rendered as YAML.
top level lookups
Fetch the schema identifier:
$ yq e '.id' personinfo.yaml
https://w3id.org/linkml/examples/personinfo
all class names
$ yq e '.classes | keys' personinfo.yaml
- NamedThing
- Person
- HasAliases
- Organization
- Place
- Address
- Event
- Concept
- DiagnosisConcept
- ProcedureConcept
- Relationship
- FamilialRelationship
- EmploymentEvent
- MedicalEvent
- WithLocation
# TODO: annotate that this is a container/root class
- Container
classes with their is-a parents
$ yq e '.classes | map_values(.is_a)' personinfo.yaml
TODO
setting top level slots
Set the schema name:
$ yq e '.name = "NEW NAME"' personinfo.yaml
id: https://w3id.org/linkml/examples/personinfo
name: NEW NAME
description: |-
Information about people, based on [schema.org](http://schema.org)
...
lookup a class by name
Look up the class Person:
$ yq e '.classes.Person' personinfo.yaml
is_a: NamedThing
description: >-
A person (alive, dead, undead, or fictional).
class_uri: schema:Person
mixins:
- HasAliases
slots:
- primary_email
- birth_date
- age_in_years
- gender
- current_address
- has_employment_history
- has_familial_relationships
- has_medical_history
slot_usage:
primary_email:
pattern: "^\\S+@[\\S+\\.]+\\S+"
Looking up a particular slot:
$ yq e '.classes.Person.is_a' personinfo.yaml
NamedThing
Setting the is_a value of a class:
$ yq e '.classes.Person.is_a="Agent"' personinfo.yaml
Note that in LinkML schemas, classes, slots, etc. are inlined as dicts (mappings keyed by name), meaning you can’t access them by array index.
prefixes
$ yq e '.prefixes' personinfo.yaml
personinfo: https://w3id.org/linkml/examples/personinfo/
linkml: https://w3id.org/linkml/
schema: http://schema.org/
rdfs: http://www.w3.org/2000/01/rdf-schema#
prov: http://www.w3.org/ns/prov#
GSSO: http://purl.obolibrary.org/obo/GSSO_
famrel: https://example.org/FamilialRelations#
# DATA PREFIXES
P: http://example.org/P/
ROR: http://example.org/ror/
CODE: http://example.org/code/
GEO: http://example.org/geoloc/
Gotchas:
This does not include prefixes from imported schemas. Use SchemaView to get these.
just the keys:
$ yq e '.prefixes | keys' personinfo.yaml
- personinfo
- linkml
- schema
- rdfs
- prov
- GSSO
- famrel
# DATA PREFIXES
- P
- ROR
- CODE
- GEO
just the values:
$ yq e '.prefixes | to_entries | .[].value' personinfo.yaml
https://w3id.org/linkml/examples/personinfo/
https://w3id.org/linkml/
http://schema.org/
http://www.w3.org/2000/01/rdf-schema#
http://www.w3.org/ns/prov#
http://purl.obolibrary.org/obo/GSSO_
https://example.org/FamilialRelations#
http://example.org/P/
http://example.org/ror/
http://example.org/code/
http://example.org/geoloc/
Gotchas:
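The key-value form above is a common shorthand, but prefixes can also be stored in expanded form, where each value is a mapping (with a prefix_reference field) rather than a plain string. In that case the query above returns mappings, not URIs.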