Feat 1078 modelschema #1109

saskiad · 2024-10-16T23:01:34Z

dyf

I feel this structure is too specific and nested.

More generally, how do others describe models? We should talk to Laura.

dyf · 2024-10-18T15:19:39Z

src/aind_data_schema/core/model.py

+    name: str = Field(..., title="Name")
+    developer_full_name: Optional[str] = Field(default=None, title="Name of developer")
+    developer_institution: Optional[Organization.ONE_OF] = Field(default=None, title="Institute where developed")
+    modality: Modality.ONE_OF = Field(..., title="Modality")


modality should be a list.

I wonder if we want a more generic version of modality as well, especially for models such as an LLM that don't take experimental data as input (i.e. text, image, video, etc.). Not sure what an alternative name for that would be, though.

If this is meant to include all models, I agree with Tom that there won't always be training data. In 'task-trained' RNNs, the model is trained on a supervised target output. I imagine some models might be 'pre-trained' in this setup and then trained on data at a later stage.
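To make the suggestions in this thread concrete, here is a minimal sketch of a model record that takes a list of input types and treats training data as optional. It uses stdlib dataclasses as a stand-in for the pydantic models in this PR, and `InputType`, `ModelSketch`, `input_types`, and `training_data` are hypothetical names for illustration only, not part of aind-data-schema.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class InputType(Enum):
    """Hypothetical, broader notion of 'modality' (placeholder names)."""

    TEXT = "text"
    IMAGE = "image"
    VIDEO = "video"


@dataclass
class ModelSketch:
    """Illustrative stand-in for the schema class; not the real model."""

    name: str
    # A list, so one model can declare several input types.
    input_types: List[InputType]
    developer_full_name: Optional[str] = None
    # Optional: task-trained models may have no training dataset.
    training_data: Optional[List[str]] = None

    def __post_init__(self) -> None:
        # Require at least one declared input type.
        if not self.input_types:
            raise ValueError("input_types must contain at least one entry")


llm = ModelSketch(
    name="example-llm",
    input_types=[InputType.TEXT, InputType.IMAGE],
)
```

Whether an empty list should be rejected, as it is here, is a design choice the thread does not settle.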


dyf · 2024-10-18T15:33:03Z

src/aind_data_schema/core/model.py

+    std: Decimal = Field(..., title="Standard deviation")
+
+
+class PerformanceScore(AindModel):


Do these only apply to supervised models?

tmchartrand · 2024-10-22T20:57:08Z

src/aind_data_schema/core/model.py

+    value: Any = Field(..., title="Metric value")
+
+
+class ModelEvaluation(AindModel):


It might be clearer to have a distinct ModelTraining as a subclass of ModelEvaluation. This would add a cross_validation_method (instead of validation_folds), along with a list of training_parameters and anything else we want to highlight (perhaps a specific field for data augmentation, if present?).

Actually, now I'm wondering whether these should both subclass DataProcess, with ModelEvaluation just adding the performance field. It's a bit off conceptually for unsupervised methods, but I think we already consider simulation to be within the scope of DataProcess, so this may be no different.
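One possible reading of the two proposals above, combined into a single hierarchy, is sketched below. It uses stdlib dataclasses rather than the pydantic classes in this PR. Every class name ending in `Sketch` is a hypothetical stand-in, and the field types (for example, a dict for `training_parameters`) are assumptions, since the comment does not specify them.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class DataProcessSketch:
    """Stand-in for DataProcess; only the field this sketch needs."""

    name: str


@dataclass
class PerformanceMetricSketch:
    """A single named metric value."""

    name: str
    value: Any


@dataclass
class ModelEvaluationSketch(DataProcessSketch):
    # Adds only the performance field. It may stay empty for
    # unsupervised methods where no score applies.
    performance: List[PerformanceMetricSketch] = field(default_factory=list)


@dataclass
class ModelTrainingSketch(ModelEvaluationSketch):
    # Fields the comment suggests highlighting for training runs.
    cross_validation_method: Optional[str] = None
    training_parameters: Dict[str, Any] = field(default_factory=dict)
    data_augmentation: Optional[str] = None


training = ModelTrainingSketch(
    name="train-v1",
    performance=[PerformanceMetricSketch(name="accuracy", value=0.9)],
    cross_validation_method="5-fold",
    training_parameters={"learning_rate": 0.001},
)
```

Here ModelTrainingSketch inherits from DataProcessSketch through ModelEvaluationSketch, so a training run is still a kind of data process. Making the two classes siblings under DataProcess would be the other reading of the comment.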

tmchartrand · 2024-10-22T21:03:11Z

src/aind_data_schema/core/model.py

+    name: str = Field(..., title="Name")
+    developer_full_name: Optional[str] = Field(default=None, title="Name of developer")
+    developer_institution: Optional[Organization.ONE_OF] = Field(default=None, title="Institute where developed")
+    modality: Modality.ONE_OF = Field(..., title="Modality")


I wonder if we want a more generic version of modality as well, especially for models such as an LLM that don't take experimental data as input (i.e. text, image, video, etc.). Not sure what an alternative name for that would be, though.


tmchartrand · 2024-12-13T21:31:28Z

closing in favor of #1166

saskiad added 2 commits October 15, 2024 21:33

start of model class

3f2e891

added more to model

9d69329

saskiad marked this pull request as draft October 16, 2024 23:01

saskiad added 3 commits October 16, 2024 16:02

fixed literal

90357d5

added cumulative evaluation

6776d47

line length

0507466

dyf reviewed Oct 18, 2024


cleanup based on feedback from team

b2a68e9

tmchartrand marked this pull request as ready for review October 22, 2024 20:42

tmchartrand marked this pull request as draft October 22, 2024 20:42

tmchartrand self-requested a review October 22, 2024 20:43

tmchartrand reviewed Oct 22, 2024


lauradriscoll reviewed Nov 5, 2024


saskiad and others added 5 commits November 6, 2024 16:37

small edits

d38cf13

model validation

a0ef7f3

using dataprocess

029efef

linting

2807250

Merge branch 'dev' into feat-1078-modelschema

17597ab

saskiad requested review from tmchartrand and dyf November 17, 2024 03:56

saskiad added 2 commits November 16, 2024 21:02

remove dataprocess and add test

47cfc63

Merge branch 'feat-1078-modelschema' of https://github.com/AllenNeura…

55651ee

…lDynamics/aind-data-schema into feat-1078-modelschema

saskiad marked this pull request as ready for review November 17, 2024 05:03

saskiad added 2 commits November 16, 2024 21:05

linting

b6d07b2

more lint

3343a96

tmchartrand mentioned this pull request Nov 25, 2024

Model schema (updated to use DataProcess) #1166

Merged

tmchartrand closed this Dec 13, 2024
