Commit a5a54ef: Merge branch 'dev' into bump-release-v2

dbirman committed Nov 5, 2024
2 parents 415ccd7 + bb33e44
Showing 21 changed files with 465 additions and 51 deletions.
4 changes: 2 additions & 2 deletions docs/source/conf.py
@@ -21,7 +21,7 @@
    rig,
    session,
    subject,
    quality_control
    quality_control,
)

dummy_object = [
@@ -34,7 +34,7 @@
    rig,
    session,
    subject,
    quality_control
    quality_control,
]  # A temporary workaround to bypass "Imported but unused" error

INSTITUTE_NAME = "Allen Institute for Neural Dynamics"
12 changes: 10 additions & 2 deletions docs/source/quality_control.md
@@ -102,6 +102,14 @@ Each metric is associated with a reference figure, image, or video. The QC portal...

By default the QC portal displays dictionaries as tables where the values can be edited. We also support a few special cases to allow a bit more flexibility or to constrain the actions that manual annotators can take. To use these, install the `aind-qcportal-schema` package and set the `value` field to the corresponding pydantic object.
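For example, a dropdown annotation might be set up as in this sketch (the `DropdownMetric` import path and fields are assumptions based on the description above; check `aind-qcportal-schema` for the exact API):

```python
from datetime import datetime, timezone

from aind_data_schema.core.quality_control import QCMetric, QCStatus, Status
from aind_qcportal_schema.metric_value import DropdownMetric  # import path assumed

# The portal renders this metric as a dropdown rather than an editable table
metric = QCMetric(
    name="Video quality",
    value=DropdownMetric(
        value="",  # filled in by the manual annotator
        options=["High", "Low", "Unusable"],
        status=[Status.PASS, Status.PASS, Status.FAIL],  # status applied per option
    ),
    status_history=[
        QCStatus(evaluator="Automated", status=Status.PENDING, timestamp=datetime.now(timezone.utc)),
    ],
)
```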

### Multi-session QC
### Multi-asset QC

[Details coming soon, this is under discussion]
During analysis there are many situations where multiple data assets need to be pulled together, often for comparison. For example, FOVs across imaging sessions or recording sessions from a chronic probe may need to be matched up across days. When a `QCEvaluation` is calculated from multiple assets, it should be tagged with `Stage:MULTI_ASSET`, and each of its `QCMetric` objects needs to track the assets used to generate that metric in its `evaluated_assets` list.
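For example (a minimal sketch; the asset names and value are hypothetical):

```python
from datetime import datetime, timezone

from aind_data_schema.core.quality_control import QCEvaluation, QCMetric, QCStatus, Stage, Status
from aind_data_schema_models.modalities import Modality

# Each multi-asset metric lists the assets it was computed from
fov_match = QCMetric(
    name="FOV 0 match score",
    value=0.92,
    evaluated_assets=[
        "multiplane-ophys_764012_2024-10-01_10-00-00",  # hypothetical asset names
        "multiplane-ophys_764012_2024-10-02_10-00-00",
    ],
    status_history=[
        QCStatus(evaluator="Automated", status=Status.PASS, timestamp=datetime.now(timezone.utc)),
    ],
)

evaluation = QCEvaluation(
    name="Cross-session FOV matching",
    modality=Modality.POPHYS,
    stage=Stage.MULTI_ASSET,  # marks this evaluation as multi-asset
    metrics=[fov_match],
)
```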

**Q: Where do I store multi-asset QC?**

You should follow the preferred/alternate workflows described above. If your multi-asset analysis pipeline generates a new data asset, put the QC there. If your pipeline does not generate an asset, push a copy of each `QCEvaluation` back to **each** individual data asset.

**Q: How do I store data about each of the evaluated assets in a metric?**

Take a look at the `MultiAssetMetric` class in `aind-qcportal-schema`. It allows you to pass a list of values that will be matched up with the `evaluated_assets` names. You can also include options, which will appear as dropdowns or checkboxes.
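A sketch of how that might look (treat the field names as assumptions drawn from the description above, not a verified signature):

```python
from aind_qcportal_schema.metric_value import MultiAssetMetric  # import path assumed

value = MultiAssetMetric(
    values=[0.92, 0.88],  # one entry per name in the metric's evaluated_assets
    options=["Match", "No match"],  # rendered as a dropdown or checkboxes in the portal
)
```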
13 changes: 7 additions & 6 deletions docs/source/session.rst
@@ -59,10 +59,11 @@ the Stimulus Class, but the trial-by-trial stimulus information belongs in the N...

Great question! We began defining specific classes for different stimulus and behavior modalities, but quickly found
that this won't be scalable. You can currently use these classes if they work for you. However, in the long run we
would like this to move into the `script` field. This field uses the Software class, which has a field for stimulus
parameters, where users can define their own dictionary of parameters used in the script to control the stimulus/
behavior. We recommend that you use software to define these and be consistent within your projects. Please reach out
with questions and we can help you with this.
would like this to move into the `script` field. This field uses the Software class, which has a field for
`parameters`. Users should use this to document the parameters used to control the stimulus or behavior. Parameters
should have unambiguous names (e.g. "trial_duration" rather than "duration") and units must be provided as a separate
field (e.g. "trial_duration_unit"). We recommend that you use software to define these and be consistent within your
projects. Please reach out with questions and we can help you with this.
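For example, a `script` entry might look like this sketch (the `Software` import path and all parameter names and values are hypothetical):

```python
from aind_data_schema.components.devices import Software  # import path assumed

script = Software(
    name="visual_gratings_task",  # hypothetical stimulus script
    version="1.2.0",
    parameters={
        "trial_duration": 2.5,
        "trial_duration_unit": "second",  # units carried in a separate field
        "grating_spatial_frequency": 0.04,
        "grating_spatial_frequency_unit": "cycles/degree",
    },
)
```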

**Q: What should I put for the `session_type`?**

@@ -77,8 +77,8 @@ and SLIMS. Until this is fully functional, these files must be created manually.
**Q: How do I know if my mouse platform is "active"?**

There are experiments in which the mouse platform is actively controlled by the stimulus/behavior software - i.e. the
resistance of the wheel is adjusted based on the subjects activity. This is an "active" mouse platform. Most platforms
we use are not active in this way.
resistance of the wheel is adjusted based on the subject's activity. This is an "active" mouse platform. Most platforms
we use are currently not active in this way.

**Q: How do I use the Calibration field?**

3 changes: 3 additions & 0 deletions examples/quality_control.json
@@ -80,6 +80,7 @@
"evaluated_assets": null
}
],
"tags": null,
"notes": "",
"allow_failed_metrics": false
},
@@ -121,6 +122,7 @@
"evaluated_assets": null
}
],
"tags": null,
"notes": "Pass when video_1_num_frames==video_2_num_frames",
"allow_failed_metrics": false
},
@@ -176,6 +178,7 @@
"evaluated_assets": null
}
],
"tags": null,
"notes": null,
"allow_failed_metrics": false
}
2 changes: 1 addition & 1 deletion pyproject.toml
@@ -14,7 +14,7 @@ readme = "README.md"
dynamic = ["version"]

dependencies = [
'aind-data-schema-models>=0.3.2',
'aind-data-schema-models>=0.5.4, <1.0.0',
'dictdiffer',
'pydantic>=2.7',
'inflection',
48 changes: 47 additions & 1 deletion src/aind_data_schema/base.py
@@ -1,5 +1,6 @@
""" generic base class with supporting validators and fields for basic AIND schema """

import json
import re
from pathlib import Path
from typing import Any, Generic, Optional, TypeVar
@@ -14,6 +15,7 @@
    ValidationError,
    ValidatorFunctionWrapHandler,
    create_model,
    model_validator,
)
from pydantic.functional_validators import WrapValidator
from typing_extensions import Annotated
@@ -31,13 +33,57 @@ def _coerce_naive_datetime(v: Any, handler: ValidatorFunctionWrapHandler) -> Awa...
AwareDatetimeWithDefault = Annotated[AwareDatetime, WrapValidator(_coerce_naive_datetime)]


def is_dict_corrupt(input_dict: dict) -> bool:
    """
    Checks that dictionary keys, including nested keys, do not contain
    forbidden characters ("$" and ".").

    Parameters
    ----------
    input_dict : dict

    Returns
    -------
    bool
        True if input_dict is not a dict, or if any keys contain
        forbidden characters. False otherwise.
    """

    def has_corrupt_keys(input) -> bool:
        """Recursively checks nested dictionaries and lists"""
        if isinstance(input, dict):
            for key, value in input.items():
                if "$" in key or "." in key:
                    return True
                elif has_corrupt_keys(value):
                    return True
        elif isinstance(input, list):
            for item in input:
                if has_corrupt_keys(item):
                    return True
        return False

    # Top-level input must be a dictionary
    if not isinstance(input_dict, dict):
        return True
    return has_corrupt_keys(input_dict)
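For example, the checks above behave as follows (illustrative usage, assuming the function is imported from `aind_data_schema.base`):

```python
from aind_data_schema.base import is_dict_corrupt

assert is_dict_corrupt({"a": 1}) is False
assert is_dict_corrupt({"a.b": 1}) is True           # "." in a key
assert is_dict_corrupt({"a": [{"$in": 1}]}) is True  # nested "$" key
assert is_dict_corrupt("not a dict") is True         # non-dict input
```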


class AindGeneric(BaseModel, extra="allow"):
    """Base class for generic types that can be used in AIND schema"""

    # extra="allow" is needed because BaseModel by default drops extra parameters.
    # Alternatively, consider using 'SerializeAsAny' once this issue is resolved
    # https://github.com/pydantic/pydantic/issues/6423
    pass

    @model_validator(mode="after")
    def validate_fieldnames(self):
        """Ensure that field names do not contain forbidden characters"""
        model_dict = json.loads(self.model_dump_json(by_alias=True))
        if is_dict_corrupt(model_dict):
            raise ValueError("Field names cannot contain '.' or '$'")
        return self
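A quick sketch of what this validator enforces:

```python
from pydantic import ValidationError

from aind_data_schema.base import AindGeneric

AindGeneric(trial_duration=2.5)  # extra fields are allowed

try:
    AindGeneric(**{"trial.duration": 2.5})  # "." in a field name
except ValidationError as e:
    print(e)  # Field names cannot contain '.' or '$'
```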


AindGenericType = TypeVar("AindGenericType", bound=AindGeneric)
4 changes: 2 additions & 2 deletions src/aind_data_schema/core/acquisition.py
@@ -4,7 +4,7 @@
from typing import List, Literal, Optional, Union

from aind_data_schema_models.process_names import ProcessName
from pydantic import Field, field_validator
from pydantic import Field, SkipValidation, field_validator

from aind_data_schema.base import AindCoreModel, AindModel, AwareDatetimeWithDefault
from aind_data_schema.components.coordinates import AnatomicalDirection, AxisName, ImageAxis
@@ -45,7 +45,7 @@ class Acquisition(AindCoreModel):

    _DESCRIBED_BY_URL = AindCoreModel._DESCRIBED_BY_BASE_URL.default + "aind_data_schema/core/acquisition.py"
    describedBy: str = Field(default=_DESCRIBED_BY_URL, json_schema_extra={"const": _DESCRIBED_BY_URL})
    schema_version: Literal["1.0.1"] = Field(default="1.0.1")
    schema_version: SkipValidation[Literal["1.0.1"]] = Field(default="1.0.1")
    protocol_id: List[str] = Field(default=[], title="Protocol ID", description="DOI for protocols.io")
    experimenter_full_name: List[str] = Field(
        ...,
4 changes: 2 additions & 2 deletions src/aind_data_schema/core/data_description.py
@@ -15,7 +15,7 @@
from aind_data_schema_models.organizations import Organization
from aind_data_schema_models.pid_names import PIDName
from aind_data_schema_models.platforms import Platform
from pydantic import Field, model_validator
from pydantic import Field, SkipValidation, model_validator

from aind_data_schema.base import AindCoreModel, AindModel, AwareDatetimeWithDefault

@@ -40,7 +40,7 @@ class DataDescription(AindCoreModel):

    _DESCRIBED_BY_URL = AindCoreModel._DESCRIBED_BY_BASE_URL.default + "aind_data_schema/core/data_description.py"
    describedBy: str = Field(default=_DESCRIBED_BY_URL, json_schema_extra={"const": _DESCRIBED_BY_URL})
    schema_version: Literal["1.0.1"] = Field(default="1.0.1")
    schema_version: SkipValidation[Literal["1.0.1"]] = Field(default="1.0.1")
    license: Literal["CC-BY-4.0"] = Field("CC-BY-4.0", title="License")

    platform: Platform.ONE_OF = Field(
4 changes: 2 additions & 2 deletions src/aind_data_schema/core/instrument.py
@@ -4,7 +4,7 @@
from typing import List, Literal, Optional

from aind_data_schema_models.organizations import Organization
from pydantic import Field, ValidationInfo, field_validator
from pydantic import Field, SkipValidation, ValidationInfo, field_validator

from aind_data_schema.base import AindCoreModel, AindModel
from aind_data_schema.components.devices import (
@@ -35,7 +35,7 @@ class Instrument(AindCoreModel):

    _DESCRIBED_BY_URL = AindCoreModel._DESCRIBED_BY_BASE_URL.default + "aind_data_schema/core/instrument.py"
    describedBy: str = Field(default=_DESCRIBED_BY_URL, json_schema_extra={"const": _DESCRIBED_BY_URL})
    schema_version: Literal["1.0.2"] = Field("1.0.2")
    schema_version: SkipValidation[Literal["1.0.1"]] = Field(default="1.0.1")

    instrument_id: Optional[str] = Field(
        default=None,
58 changes: 54 additions & 4 deletions src/aind_data_schema/core/metadata.py
@@ -1,16 +1,26 @@
"""Generic metadata class for Data Asset Records."""

import inspect
import json
import logging
from datetime import datetime
from enum import Enum
from typing import Dict, List, Literal, Optional, get_args
from uuid import UUID, uuid4

from aind_data_schema_models.modalities import ExpectedFiles, FileRequirement
from aind_data_schema_models.platforms import Platform
from pydantic import Field, PrivateAttr, ValidationError, ValidationInfo, field_validator, model_validator

from aind_data_schema.base import AindCoreModel
from pydantic import (
    Field,
    PrivateAttr,
    SkipValidation,
    ValidationError,
    ValidationInfo,
    field_validator,
    model_validator,
)

from aind_data_schema.base import AindCoreModel, is_dict_corrupt
from aind_data_schema.core.acquisition import Acquisition
from aind_data_schema.core.data_description import DataDescription
from aind_data_schema.core.instrument import Instrument
@@ -61,7 +71,7 @@ class Metadata(AindCoreModel):

    _DESCRIBED_BY_URL = AindCoreModel._DESCRIBED_BY_BASE_URL.default + "aind_data_schema/core/metadata.py"
    describedBy: str = Field(default=_DESCRIBED_BY_URL, json_schema_extra={"const": _DESCRIBED_BY_URL})
    schema_version: Literal["1.0.3"] = Field("1.0.3")
    schema_version: SkipValidation[Literal["1.0.2"]] = Field(default="1.0.2")
    id: UUID = Field(
        default_factory=uuid4,
        alias="_id",
@@ -278,3 +288,43 @@ def validate_rig_session_compatibility(self):
        check = RigSessionCompatibility(self.rig, self.session)
        check.run_compatibility_check()
        return self


def create_metadata_json(
    name: str,
    location: str,
    core_jsons: Dict[str, Optional[dict]],
    optional_created: Optional[datetime] = None,
    optional_external_links: Optional[dict] = None,
) -> dict:
    """Creates a Metadata dict from a dictionary of core schema fields."""
    # Extract basic parameters and non-corrupt core schema fields
    params = {
        "name": name,
        "location": location,
    }
    if optional_created is not None:
        params["created"] = optional_created
    if optional_external_links is not None:
        params["external_links"] = optional_external_links
    core_fields = dict()
    for key, value in core_jsons.items():
        if key in CORE_FILES and value is not None:
            if is_dict_corrupt(value):
                logging.warning(f"Provided {key} is corrupt! It will be ignored.")
            else:
                core_fields[key] = value
    # Create Metadata object and convert to JSON
    # If there are any validation errors, still create it
    # but set MetadataStatus as Invalid
    try:
        metadata = Metadata.model_validate({**params, **core_fields})
        metadata_json = json.loads(metadata.model_dump_json(by_alias=True))
    except Exception as e:
        logging.warning(f"Issue with metadata construction! {e.args}")
        metadata = Metadata.model_validate(params)
        metadata_json = json.loads(metadata.model_dump_json(by_alias=True))
        for key, value in core_fields.items():
            metadata_json[key] = value
        metadata_json["metadata_status"] = MetadataStatus.INVALID.value
    return metadata_json
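A minimal usage sketch (the file name, asset name, and bucket are hypothetical):

```python
import json

from aind_data_schema.core.metadata import create_metadata_json

with open("subject.json") as f:  # hypothetical pre-built core schema file
    subject = json.load(f)

metadata = create_metadata_json(
    name="ecephys_123456_2024-11-05_12-00-00",  # hypothetical asset name
    location="s3://example-bucket/ecephys_123456_2024-11-05_12-00-00",
    core_jsons={"subject": subject},
)
print(metadata["metadata_status"])  # "Valid", or "Invalid" if validation failed
```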
4 changes: 2 additions & 2 deletions src/aind_data_schema/core/procedures.py
@@ -20,7 +20,7 @@
    VolumeUnit,
    create_unit_with_value,
)
from pydantic import Field, field_serializer, field_validator, model_validator
from pydantic import Field, SkipValidation, field_serializer, field_validator, model_validator
from pydantic_core.core_schema import ValidationInfo
from typing_extensions import Annotated

@@ -649,7 +649,7 @@ class Procedures(AindCoreModel):
    _DESCRIBED_BY_URL = AindCoreModel._DESCRIBED_BY_BASE_URL.default + "aind_data_schema/core/procedures.py"
    describedBy: str = Field(default=_DESCRIBED_BY_URL, json_schema_extra={"const": _DESCRIBED_BY_URL})

    schema_version: Literal["1.1.1"] = Field(default="1.1.1")
    schema_version: SkipValidation[Literal["1.1.1"]] = Field(default="1.1.1")
    subject_id: str = Field(
        ...,
        description="Unique identifier for the subject. If this is not a Allen LAS ID, indicate this in the Notes.",
4 changes: 2 additions & 2 deletions src/aind_data_schema/core/processing.py
@@ -5,7 +5,7 @@

from aind_data_schema_models.process_names import ProcessName
from aind_data_schema_models.units import MemoryUnit, UnitlessUnit
from pydantic import Field, ValidationInfo, field_validator, model_validator
from pydantic import Field, SkipValidation, ValidationInfo, field_validator, model_validator

from aind_data_schema.base import AindCoreModel, AindGeneric, AindGenericType, AindModel, AwareDatetimeWithDefault
from aind_data_schema.components.tile import Tile
@@ -124,7 +124,7 @@ class Processing(AindCoreModel):

    _DESCRIBED_BY_URL: str = AindCoreModel._DESCRIBED_BY_BASE_URL.default + "aind_data_schema/core/processing.py"
    describedBy: str = Field(default=_DESCRIBED_BY_URL, json_schema_extra={"const": _DESCRIBED_BY_URL})
    schema_version: Literal["1.1.1"] = Field(default="1.1.1")
    schema_version: SkipValidation[Literal["1.1.1"]] = Field(default="1.1.1")

    processing_pipeline: PipelineProcess = Field(
        ..., description="Pipeline used to process data", title="Processing Pipeline"
7 changes: 5 additions & 2 deletions src/aind_data_schema/core/quality_control.py
@@ -4,7 +4,7 @@
from typing import Any, List, Literal, Optional

from aind_data_schema_models.modalities import Modality
from pydantic import BaseModel, Field, field_validator, model_validator
from pydantic import BaseModel, Field, SkipValidation, field_validator, model_validator

from aind_data_schema.base import AindCoreModel, AindModel, AwareDatetimeWithDefault

@@ -78,6 +78,9 @@ class QCEvaluation(AindModel):
    name: str = Field(..., title="Evaluation name")
    description: Optional[str] = Field(default=None, title="Evaluation description")
    metrics: List[QCMetric] = Field(..., title="QC metrics")
    tags: Optional[List[str]] = Field(
        default=None, title="Tags", description="Tags can be used to group QCEvaluation objects into groups"
    )
    notes: Optional[str] = Field(default=None, title="Notes")
    allow_failed_metrics: bool = Field(
        default=False,
@@ -161,7 +164,7 @@ class QualityControl(AindCoreModel):

    _DESCRIBED_BY_URL = AindCoreModel._DESCRIBED_BY_BASE_URL.default + "aind_data_schema/core/quality_control.py"
    describedBy: str = Field(default=_DESCRIBED_BY_URL, json_schema_extra={"const": _DESCRIBED_BY_URL})
    schema_version: Literal["1.1.2"] = Field("1.1.2")
    schema_version: SkipValidation[Literal["1.1.1"]] = Field(default="1.1.1")
    evaluations: List[QCEvaluation] = Field(..., title="Evaluations")
    notes: Optional[str] = Field(default=None, title="Notes")
