Commit

Merge pull request #10 from linkml/refactored-docs
Reorganized docs
linkmluser authored May 6, 2024
2 parents 2208819 + 3334401 commit 1f2ebe3
Showing 46 changed files with 1,446 additions and 702 deletions.
2 changes: 1 addition & 1 deletion Makefile
Original file line number Diff line number Diff line change
@@ -33,7 +33,7 @@ app:
# $(RUN) streamlit run $(CODE)/app.py --logger.level=debug

apidoc:
-$(RUN) sphinx-apidoc -f -M -o docs/ src/linkml_store/ && cd docs && $(RUN) make html
+$(RUN) sphinx-apidoc -f -M -o docs/reference/ src/linkml_store/ && cd docs && $(RUN) make html

sphinx-%:
cd docs && $(RUN) make $*
33 changes: 27 additions & 6 deletions docs/about.rst
@@ -6,12 +6,33 @@ About
LinkML-Store is an early effort to provide a unifying storage layer
over multiple different backends, unified via LinkML schemas.

-The default backend is DuckDB, but partial implementations are provided for:
+Quickstart
+----------

-- MongoDB
-- Solr
+See the :ref:`tutorials`

-This framework also allows *composable indexes*. Currently two are supported:
+Data Model
+----------

-- Simple native trigram method
-- LLM text embedding
+* A :class:`.Client` provides a top-level interface over one or more databases.
+* A :class:`.Database` consists of one or more possibly heterogeneous collections.
+* A :class:`.Collection` is a set of objects of a similar type.
+
+Adapters
+--------
+
+The current backends supported are:
+
+- :py:mod:`DuckDB<linkml_store.api.stores.duckdb>`
+- :py:mod:`MongoDB<linkml_store.api.stores.mongodb>`
+- :py:mod:`Solr<linkml_store.api.stores.solr>`
+- :py:mod:`ChromaDB<linkml_store.api.stores.chromadb>` (pre-alpha)
+- :py:mod:`HDF5<linkml_store.api.stores.mdf5>` (pre-alpha)
+
+Indexing
+--------
+
+This framework also allows *composable indexes*. Currently two indexers are supported:
+
+- :py:mod:`SimpleIndexer<linkml_store.index.implementations.simple_indexer>` Simple native trigram method
+- :py:mod:`LLMIndexer<linkml_store.index.implementations.llm_indexer>` LLM text embedding
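
The "simple native trigram method" named in the Indexing section can be pictured with a small standalone sketch. This is not the code of `SimpleIndexer`; the function names (`trigrams`, `cosine`, `rank`), the padding scheme, and the choice of cosine similarity over trigram counts are all assumptions made for illustration.

```python
from collections import Counter
from math import sqrt
from typing import List, Tuple


def trigrams(text: str) -> Counter:
    """Count overlapping 3-character windows in lowercased, space-padded text."""
    padded = f"  {text.lower()} "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two trigram count vectors (0.0 if either is empty)."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def rank(query: str, docs: List[str]) -> List[Tuple[float, str]]:
    """Return (score, document) pairs, best match first."""
    q = trigrams(query)
    return sorted(((cosine(q, trigrams(d)), d) for d in docs), reverse=True)


if __name__ == "__main__":
    for score, doc in rank("germany", ["Germany", "France"]):
        print(f"{score:.2f}  {doc}")
```

Because the vectors are plain character-trigram counts, a sketch like this needs no model or external service, which is presumably the appeal of a trigram indexer next to an LLM embedding one.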
24 changes: 24 additions & 0 deletions docs/conf.py
@@ -13,19 +13,34 @@
copyright = f"{date.today().year}, Author 1 <[email protected]>"
author = "Author 1 <[email protected]>"


# -- General configuration ---------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/configuration.html#general-configuration

# from https://github.com/cthoyt/ontoportal-client/blob/9862e26f8e374c3aef8707e3d5d69526c4d0fcd5/docs/source/conf.py
# If true, the current module name will be prepended to all description
# unit titles (such as .. function::).
add_module_names = False

# A list of prefixes that are ignored when creating the module index. (new in Sphinx 0.6)
modindex_common_prefix = ["linkml_store."]

extensions = [
"sphinx.ext.autosummary",
"sphinx.ext.autodoc",
"sphinx.ext.githubpages",
"sphinx_rtd_theme",
"sphinx_click",
"sphinx.ext.viewcode",
"sphinx_autodoc_typehints",
"sphinx_automodapi.automodapi",
"sphinx_automodapi.smart_resolver",
"myst_parser",
"nbsphinx",
]



# generate autosummary pages
autosummary_generate = True

@@ -54,6 +69,15 @@
html_favicon = 'https://linkml.io/uploads/linkml-logo_color-no-words.png'
html_static_path = ["_static"]

# https://stackoverflow.com/questions/5599254/how-to-use-sphinxs-autodoc-to-document-a-classs-init-self-method
autodoc_default_options = {
'members': True,
'member-order': 'bysource',
'special-members': '__init__',
'undoc-members': True,
'exclude-members': '__weakref__'
}

# The name of an image file (relative to this directory) to place at the top
# of the sidebar.
#
381 changes: 0 additions & 381 deletions docs/examples/MongoDB-Example.ipynb

This file was deleted.

261 changes: 261 additions & 0 deletions docs/how-to/Check-Referential-Integrity.ipynb
@@ -0,0 +1,261 @@
{
"cells": [
{
"cell_type": "markdown",
"source": [
"# How to Check Referential Integrity\n",
"\n",
"This example uses MongoDB"
],
"metadata": {
"collapsed": false
},
"id": "fc4794dd116ed21"
},
{
"cell_type": "code",
"execution_count": 1,
"outputs": [],
"source": [
"from linkml_store import Client\n",
"\n",
"client = Client()"
],
"metadata": {
"collapsed": false,
"ExecuteTime": {
"end_time": "2024-05-04T19:51:09.760981Z",
"start_time": "2024-05-04T19:51:08.378243Z"
}
},
"id": "initial_id"
},
{
"cell_type": "code",
"execution_count": 2,
"outputs": [],
"source": [
"db = client.attach_database(\"mongodb://localhost:27017\", \"test\")\n",
"db.metadata.ensure_referential_integrity = True\n",
"db.set_schema_view(\"../../tests/input/countries/countries.linkml.yaml\")\n",
"countries_coll = db.create_collection(\"Country\", alias=\"countries\", recreate_if_exists=True)\n",
"routes_coll = db.create_collection(\"Route\", alias=\"routes\", recreate_if_exists=True)"
],
"metadata": {
"collapsed": false,
"ExecuteTime": {
"end_time": "2024-05-04T19:51:09.788932Z",
"start_time": "2024-05-04T19:51:09.771112Z"
}
},
"id": "cc164c0acbe4c39d"
},
{
"cell_type": "code",
"execution_count": 3,
"outputs": [],
"source": [
"COUNTRIES = \"../../tests/input/countries/countries.jsonl\"\n",
"ROUTES = \"../../tests/input/countries/routes.csv\""
],
"metadata": {
"collapsed": false,
"ExecuteTime": {
"end_time": "2024-05-04T19:51:09.789681Z",
"start_time": "2024-05-04T19:51:09.786454Z"
}
},
"id": "5286ef4e9dd0f316"
},
{
"cell_type": "code",
"execution_count": 4,
"outputs": [
{
"data": {
"text/plain": "[{'origin': 'DE', 'destination': 'FR', 'method': 'rail'}]"
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from linkml_store.utils.format_utils import load_objects\n",
"\n",
"countries = load_objects(COUNTRIES)\n",
"routes = load_objects(ROUTES)\n",
"routes"
],
"metadata": {
"collapsed": false,
"ExecuteTime": {
"end_time": "2024-05-04T19:51:09.795894Z",
"start_time": "2024-05-04T19:51:09.790413Z"
}
},
"id": "2e21988e4fc13f58"
},
{
"cell_type": "code",
"execution_count": 5,
"outputs": [],
"source": [
"countries_coll.insert(countries)\n",
"routes_coll.insert(routes)"
],
"metadata": {
"collapsed": false,
"ExecuteTime": {
"end_time": "2024-05-04T19:51:09.803272Z",
"start_time": "2024-05-04T19:51:09.798758Z"
}
},
"id": "668e59a8f28e7bfe"
},
{
"cell_type": "code",
"execution_count": 6,
"outputs": [
{
"data": {
"text/plain": "[{'origin': 'DE', 'destination': 'FR', 'method': 'rail'}]"
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"routes_coll.find().rows"
],
"metadata": {
"collapsed": false,
"ExecuteTime": {
"end_time": "2024-05-04T19:51:09.810617Z",
"start_time": "2024-05-04T19:51:09.804004Z"
}
},
"id": "995e63f873ea9353"
},
{
"cell_type": "code",
"execution_count": 7,
"outputs": [],
"source": [
"for result in db.iter_validate_database():\n",
" print(result)"
],
"metadata": {
"collapsed": false,
"ExecuteTime": {
"end_time": "2024-05-04T19:51:09.956191Z",
"start_time": "2024-05-04T19:51:09.809082Z"
}
},
"id": "a8ef16a3fbc6bfe6"
},
{
"cell_type": "markdown",
"source": [
"## Inserting invalid data\n",
"\n",
"We will intentionally insert an invalid row"
],
"metadata": {
"collapsed": false
},
"id": "24fb15bce092c2d1"
},
{
"cell_type": "code",
"execution_count": 8,
"outputs": [],
"source": [
"routes_coll.insert({\"origin\": \"ZZZ\", \"destination\": \"YYY\", \"method\": \"rail\"})"
],
"metadata": {
"collapsed": false,
"ExecuteTime": {
"end_time": "2024-05-04T19:51:09.961815Z",
"start_time": "2024-05-04T19:51:09.956721Z"
}
},
"id": "f712a82be775f413"
},
{
"cell_type": "code",
"execution_count": 9,
"outputs": [
{
"data": {
"text/plain": " origin destination method\n0 DE FR rail\n1 ZZZ YYY rail",
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>origin</th>\n <th>destination</th>\n <th>method</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>0</th>\n <td>DE</td>\n <td>FR</td>\n <td>rail</td>\n </tr>\n <tr>\n <th>1</th>\n <td>ZZZ</td>\n <td>YYY</td>\n <td>rail</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"routes_coll.find().rows_dataframe"
],
"metadata": {
"collapsed": false,
"ExecuteTime": {
"end_time": "2024-05-04T19:51:09.974226Z",
"start_time": "2024-05-04T19:51:09.961675Z"
}
},
"id": "18ffa996e3893b96"
},
{
"cell_type": "code",
"execution_count": 16,
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"type='ReferentialIntegrity' severity=<Severity.ERROR: 'ERROR'> message='Referential integrity error: Country not found' instance='ZZZ' instance_index=None instantiates='Country'\n",
"type='ReferentialIntegrity' severity=<Severity.ERROR: 'ERROR'> message='Referential integrity error: Country not found' instance='YYY' instance_index=None instantiates='Country'\n"
]
}
],
"source": [
"results = list(db.iter_validate_database())\n",
"for result in results:\n",
" print(result)"
],
"metadata": {
"collapsed": false,
"ExecuteTime": {
"end_time": "2024-05-04T19:52:20.044928Z",
"start_time": "2024-05-04T19:52:19.996008Z"
}
},
"id": "c67517aece5d47c5"
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 2
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython2",
"version": "2.7.6"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
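
The check exercised by the notebook above can be sketched without a database. The following is a minimal, self-contained illustration of the idea and not linkml-store's implementation: the function `iter_dangling_references`, the `code` field on countries, and the shape of the reported error are hypothetical. The route rows are the ones shown in the notebook output.

```python
from typing import Dict, Iterable, Iterator, List, Set


def iter_dangling_references(
    rows: Iterable[Dict[str, str]],
    valid_keys: Set[str],
    reference_fields: List[str],
) -> Iterator[Dict[str, object]]:
    """Yield one error record per field value that points at a missing key."""
    for index, row in enumerate(rows):
        for field in reference_fields:
            value = row.get(field)
            if value is not None and value not in valid_keys:
                yield {"row": index, "field": field, "value": value}


# Hypothetical country records; only the identifier matters for the check.
countries = [{"code": "DE"}, {"code": "FR"}]
country_codes = {c["code"] for c in countries}

routes = [
    {"origin": "DE", "destination": "FR", "method": "rail"},
    {"origin": "ZZZ", "destination": "YYY", "method": "rail"},  # intentionally invalid
]

errors = list(iter_dangling_references(routes, country_codes, ["origin", "destination"]))

if __name__ == "__main__":
    for error in errors:
        print(error)
```

As in the notebook, the valid route produces no findings, while the invalid row produces one finding per dangling value (`ZZZ` and `YYY`).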
@@ -3,7 +3,7 @@
{
"cell_type": "markdown",
"source": [
-"# Example of querying a Solr backend on the Command Line\n",
+"# How to query Solr using the Command Line\n",
"\n",
"For this we will use the Golr endpoint: https://golr.geneontology.org/solr\n",
"\n",
@@ -3,7 +3,7 @@
{
"cell_type": "markdown",
"source": [
-"# Example: Monarch-KG notebook\n",
+"# How to query the Monarch-KG\n",
"\n",
"Illustrates use of LinkML-Store over the Monarch-KG database (duckdb serialization)\n",
"\n",
File renamed without changes.