Look up records of species, genes, proteins, and cell markers#
Entities and ontologies can be complex: the same entity may have many different identifiers, and references can differ between species.
Here we show Bionty’s Entity model for species, genes, proteins and cell markers. You’ll see how to initialize an Entity model with different identifiers, access the reference table via .df(), and look up an entity record via .lookup().{term}.
import bionty as bt
Species#
To examine the Species ontology, we create the corresponding object and look at the associated pandas DataFrame.
species_bionty = bt.Species()
species_bionty
Reference table#
df = species_bionty.df()
df.head()
Lookup records#
Terms can be searched with auto-complete using a lookup object:
Tip
By default, the name field is used to generate the lookup; you may change the field via:
species.lookup_field = <new field>
Duplicated keys are made unique by appending __0, __1, __2, …
species_bionty_lookup = species_bionty.lookup()
species_bionty_lookup.white_tufted_ear_marmoset
species_bionty_lookup.white_tufted_ear_marmoset.scientific_name
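To make the behaviour of the lookup keys more concrete, here is a minimal sketch of how such an attribute lookup could be built. It is not Bionty's implementation: the helpers `to_identifier` and `build_lookup`, the toy records, and the choice to suffix every member of a duplicated group are all assumptions made for illustration.

```python
import re
from collections import Counter
from types import SimpleNamespace


def to_identifier(name: str) -> str:
    """Turn a free-text name into a valid Python attribute name."""
    key = re.sub(r"\W+", "_", name.strip()).strip("_")
    # Identifiers cannot start with a digit, so prefix those with an underscore.
    return f"_{key}" if key[:1].isdigit() else key


def build_lookup(records, field="name"):
    """Build an attribute-access lookup keyed on `field` (hypothetical helper)."""
    keys = [to_identifier(record[field]) for record in records]
    totals = Counter(keys)  # how often each key occurs overall
    seen = Counter()        # how often each key has been emitted so far
    unique_keys = []
    for key in keys:
        if totals[key] > 1:
            # Duplicated keys get a running suffix: __0, __1, __2, ...
            unique_keys.append(f"{key}__{seen[key]}")
            seen[key] += 1
        else:
            unique_keys.append(key)
    return SimpleNamespace(**dict(zip(unique_keys, records)))


# Toy records; the "id" values are invented for this example.
records = [
    {"name": "white-tufted-ear marmoset", "scientific_name": "callithrix jacchus", "id": "toy-1"},
    {"name": "mouse", "scientific_name": "mus musculus", "id": "toy-2"},
    {"name": "mouse", "scientific_name": "mus musculus", "id": "toy-3"},
]
lookup = build_lookup(records)
print(lookup.white_tufted_ear_marmoset["scientific_name"])
print(lookup.mouse__0["id"], lookup.mouse__1["id"])
```

With these toy records, the hyphens and spaces in the marmoset's name become underscores, and the two records named "mouse" are exposed as `mouse__0` and `mouse__1` rather than overwriting each other.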
To access the information for specific species, for example human, mouse, and pig, we select the corresponding rows with pandas:
df = species_bionty.df()
df.set_index("name", inplace=True)
df.loc[["human", "mouse", "pig"]]
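One detail worth knowing about this kind of selection: `.loc` with a list of labels raises a `KeyError` if any requested name is missing from the index. The sketch below illustrates the same selection on a toy table, without `inplace=True`, and shows `reindex` as a tolerant variant. The toy rows and column names are assumptions and do not come from the Bionty reference table.

```python
import pandas as pd

# Toy stand-in for species_bionty.df(); column names are assumptions.
toy = pd.DataFrame(
    {
        "name": ["human", "mouse", "pig"],
        "scientific_name": ["homo sapiens", "mus musculus", "sus scrofa"],
    }
)

# set_index without inplace=True returns a new frame and leaves `toy` unchanged.
by_name = toy.set_index("name")

# Strict selection: every requested label must exist in the index.
selected = by_name.loc[["human", "pig"]]

# Tolerant selection: missing labels yield rows filled with NaN instead of an error.
tolerant = by_name.reindex(["pig", "zebrafish"])

print(selected)
print(tolerant)
```

Returning a new indexed frame keeps the original table intact, which helps when the same reference table is later queried by a different column.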
Gene#
Next, let’s take a look at genes, which follow the same design choices as Species. The only difference is that the Gene class is initialized with a species parameter, so you only retrieve gene entries of the specified species.
gene_bionty = bt.Gene(species="human")
gene_bionty
df = gene_bionty.df()
df.head()
gene_bionty_lookup = gene_bionty.lookup()
gene_bionty_lookup.TCF7
Convert between identifiers using pandas alone:
df.loc[df["symbol"].isin(["BRCA1", "BRCA2"])]
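The filter above returns full rows. If a direct symbol-to-identifier mapping is more convenient, one option is to index a single column by symbol. This is a minimal sketch on a toy table: the column name `ensembl_gene_id` is an assumption about the reference table's layout, and the three rows are a hand-picked subset for illustration.

```python
import pandas as pd

# Toy stand-in for gene_bionty.df(); the column names are assumptions.
genes = pd.DataFrame(
    {
        "symbol": ["BRCA1", "BRCA2", "TP53"],
        "ensembl_gene_id": ["ENSG00000012048", "ENSG00000139618", "ENSG00000141510"],
    }
)

# Build a Series that maps symbol -> Ensembl ID.
# drop_duplicates keeps the first row per symbol so the index stays unique.
symbol_to_ensembl = (
    genes.drop_duplicates("symbol").set_index("symbol")["ensembl_gene_id"]
)

# Convert a list of symbols to a plain dict of identifiers.
converted = symbol_to_ensembl.reindex(["BRCA1", "BRCA2"]).to_dict()
print(converted)
```

Keeping only the first row per symbol is a simplification; if one symbol maps to several identifiers in the real table, grouping by symbol would preserve all of them.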
The mouse reference is also available from Ensembl:
gene_bionty_mouse = bt.Gene(species="mouse")
df = gene_bionty_mouse.df()
df.head()
Protein#
The protein reference uses the UniProt ID as its standardized identifier.
protein_bionty = bt.Protein(species="human")
protein_bionty
protein_bionty_lookup = protein_bionty.lookup()
protein_bionty_lookup.ABC_transporter_domain_containing_protein
df = protein_bionty.df()
df.head()
Cell marker#
The cell marker ontology works similarly.
cell_marker_bionty = bt.CellMarker(species="human")
cell_marker_bionty
df = cell_marker_bionty.df()
df.head()
cell_marker_bionty_lookup = cell_marker_bionty.lookup()
cell_marker_bionty_lookup.CD45