guide: Add `--recursive` options to Running Experiments #2846
I'd just close this (and maybe #2730) per #2846 (comment).
I'm not suggesting to only have
### Running pipelines recursively

DVC supports pipelines defined in more than one `dvc.yaml` file. These can
reside in subfolders inside the workspace, and you may want to run all of them
at once. Example project:
Continues #2846 (review)
To summarize: `exp run --recursive` has to do with how your project is organized, so people who have several `dvc.yaml` files may need it. I don't think we need to mention pipelines to explain this, even if the user is in fact using them; they would already know what pipelines are and how/where they codified them.
That said if you want to mention both the cases with multiple dvc.yaml files and the one with multiple pipelines as part of the explanation here, that's OK too. We just shouldn't focus on the concept of pipelines I believe.
Another option, or an additional thing we may want to do, is to create a separate "Organizing Experiments" page (instead of https://dvc.org/doc/user-guide/experiment-management#organization-patterns) and mention `exp run --recursive` as a tool in there. This could be a separate issue or check box somewhere. Up to you @iesahin
This section is about "running experiments" and I believe it involves experiments that use pipelines in directory hierarchies. We can mention `--recursive` in other places as a way to organize the experiments, but here we are discussing how to run experiments if their parts are found in recursive directories.
Pipelines are an inherent part of experiments. Even if we have "experiments without pipelines" at some point, this guide will have to cover "experiments with pipelines." Specifying pipelines, organizing them, or adding/deleting stages are irrelevant to this section, but "running pipelines" should be covered here IMO. If we'd remove all "pipeline related stuff" from this section, about 90% of the material would have to be deleted. We don't have any means to run the experiments other than running the pipelines.
This section explains how to run experiments whose parts are organized in directory hierarchies, not how to organize experiments into directory hierarchies.
> here we are discussing how to run experiments if their parts are found in recursive directories

Right, it makes sense to mention `exp --recursive` here. I just don't think we need a full section that describes the project's organization, much less any more info about data pipelines. It can be a paragraph (maybe even a single sentence) in the existing Running all pipelines section.

> Pipelines are an inherent part of experiments

> at some point, this guide will have to cover "experiments with pipelines."

Disagree. You don't need a pipeline for experiments, you just need a `dvc.yaml` file. It's more of a technicality and in the future this implementation will be made transparent with `exp init` anyway. But we've discussed this a lot already...

> If we'd remove all "pipeline related stuff" from this section, about 90% of the material

Most would stay, just explained from a different perspective, as mentioned in #2846 (comment).
This new comment was meant as a way to wrap it up but we are going in circles in the discussion again... Happy to do a call too.
So far I think it's clear we agree we want to mention `--recursive` in this guide, but IMO it could be a single sentence in an existing section (e.g. https://dvc.org/doc/user-guide/experiment-management/running-experiments#running-the-pipeline), something like:

> "If your project uses more than one `dvc.yaml` file, you can run them all with the `--recursive` option."

(just an extremely short idea, not an exact suggestion)
I also wouldn't mind a whole section, especially if it's quite short, but currently we are introducing more data pipelining information and we have a task to extract all of that so it seems counterproductive.
okay, was it suggested before? :)
@iesahin do you have a strong opinion on keeping the whole section?
(I personally tend to agree: even if we keep the section, I would try to generalize it to cover other possible ways. `--recursive` is only one option; there are many others.)
No strong opinion here, I'll reduce it to one or two sentences.
Great, thanks! I've linked to this discussion from a task in #2911 so we can get back to the question of experiment organization.
I've cut the example to make it shorter. @jorgeorpinel
### Running pipelines recursively

When your pipeline is defined in recursive subfolders, you can selectively run
them using the `--recursive` option.

```dvc
$ dvc exp run --recursive dir/
```

It will run all the pipelines under `dir/`, and its subdirectories.
Sorry I have more feedback. To some extent `--recursive` is similar to `--all-pipelines` (the same but restricted to a certain directory I think), so let's combine the sections? They're very short on their own anyway.
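To make the "same but restricted to a certain directory" comparison concrete, here is a minimal sketch of the scoping idea only. It does not call DVC and is not how DVC implements `--recursive`; the helper name `list_dvc_yaml_dirs` is hypothetical, and it assumes that recursion simply means "every `dvc.yaml` file below the given directory".

```shell
# Hypothetical helper (not part of DVC): print each directory under "$1"
# that directly contains a dvc.yaml file, one per line, sorted.
list_dvc_yaml_dirs() {
  find "$1" -type f -name dvc.yaml -exec dirname {} \; | sort
}
```

Under that assumption, `list_dvc_yaml_dirs dir/` covers only the `dir/` subtree, while `list_dvc_yaml_dirs .` run from the project root covers every `dvc.yaml` in the workspace.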
I still believe that having more titles makes the text easier to follow. This is technical writing, not fiction that you have to read sequentially to understand. People will be coming to this page from all different directions.
"To some extent" all of the options of `dvc exp run` are similar: they select some stage to run. Could we merge all of the sections, then?
Yes, but by making the title too specific, you make it very hard for users to relate it to the problem they have. E.g. in this case people will be looking for "how to run a specific pipeline" vs "running pipelines recursively" (most likely; not a precise example, but it illustrates the point).
E.g. a better general section title could be:
Run pipelines selectively
(you can see it from the way you write about it)
> Could we merge all of the sections, then?

Not all but https://dvc.org/doc/user-guide/experiment-management/running-experiments does have lots of very small sections (and subsections) in general, now that I see it (as context). We could probably consolidate some easily if we wanted (not suggesting that for this PR).
Updated. PTAL.
Co-authored-by: Restyled.io <[email protected]>
…nts.md Co-authored-by: Jorge Orpinel <[email protected]>
…nts.md Co-authored-by: Jorge Orpinel <[email protected]>
Co-authored-by: Restyled.io <[email protected]>
format the running experiments
3c1d144
to
24acd69
Co-authored-by: Restyled.io <[email protected]>
I'll let you merge if you agree @iesahin
Related #2730
`--glob` seems to take some time, so this one only adds `--recursive`.
Also per iterative/dvc#6458