diff --git a/.github/workflows/deploy.yaml b/.github/workflows/deploy.yaml
new file mode 100644
index 0000000..0b4ba4c
--- /dev/null
+++ b/.github/workflows/deploy.yaml
@@ -0,0 +1,33 @@
+name: Deploy MkDocs to GitHub Pages
+
+on:
+  push:
+    branches:
+      - main
+
+jobs:
+  deploy:
+    runs-on: ubuntu-latest
+    steps:
+      - name: Checkout code
+        uses: actions/checkout@v4
+
+      - name: Set up Python
+        uses: actions/setup-python@v5
+        with:
+          python-version: '3.8'
+
+      - name: Install dependencies
+        run: |
+          python -m pip install --upgrade pip
+          pip install -r requirements.txt
+
+      - name: Build MkDocs site
+        run: |
+          mkdocs build
+
+      - name: Deploy to GitHub Pages
+        uses: peaceiris/actions-gh-pages@v4
+        with:
+          github_token: ${{ secrets.GITHUB_TOKEN }}
+          publish_dir: ./site
diff --git a/.gitignore b/.gitignore
index 28407bc..bcc1406 100644
--- a/.gitignore
+++ b/.gitignore
@@ -1,2 +1,6 @@
 # Mac System
 .DS_Store
+
+# Additional
+__pycache__
+
diff --git a/CODE_OF_CONDUCT.md b/docs/CODE_OF_CONDUCT.md
similarity index 90%
rename from CODE_OF_CONDUCT.md
rename to docs/CODE_OF_CONDUCT.md
index 8cc1ccf..2840124 100644
--- a/CODE_OF_CONDUCT.md
+++ b/docs/CODE_OF_CONDUCT.md
@@ -3,6 +3,7 @@
 As members of the Imageomics community, we agree to maintain an environment where every participant feels welcome to be their true self and speak from the heart. To this end, we agree as individuals and as a group to:
+
 - **Listen to understand.** When one person talks, others listen.
 - **Speak to be understood.** We use lay terms and are patient with people who are not experts in our specific field. We are all learning, no matter who we are.
 - Embrace **“Yes and…”** Focus on possibilities instead of obstacles. Be inclusive of other people’s ideas. Honor divergence.
@@ -16,23 +17,23 @@ We abide by these principles in all Imageomics spaces, including but not limited
 Discrimination, bullying or harassment–sexual or otherwise–is misconduct. 
Those found to engage in misconduct will be subject to dismissal from the project and further actions as directed by the guidelines of the employers and the place of incidence. If you believe you have experienced or witnessed misconduct in an Imageomics setting, please take these steps:
+
 1. Document the incident;
 2. Tell someone you trust;
-3. Report the incident to Sam Stevens, Diane, or Tanya.
+3. Report the incident to Diane Boghrat.
 Privacy will be protected to the greatest extent possible.
-# JEDI Framework
+## JEDI Framework
-[Full text](https://docs.google.com/document/d/1zHghf5bCsDsw1n0s_Nxt5wu_GYIbj9to/edit?usp=sharing&ouid=114612552367385014086&rtpof=true&sd=true)
+[Full text](pdfs/Imageomics_Equity_Tool.pdf)
-## VALUES
-### TRANSPARENCY
+### VALUES
+#### TRANSPARENCY
 We ensure our efforts are clear about assumptions, uncertainty, and limits, and provide open sources of information, processes, and discovery.
-### ACCOUNTABILITY
+#### ACCOUNTABILITY
 We are responsible, individually and collectively, for the outcomes we produce and ensure, to the best of our abilities, that the methods outcome matches intended use.
-### INCLUSION & COLLABORATION
+#### INCLUSION & COLLABORATION
 We create and nurture inclusive environments and welcome, value, and affirm all members of our community. We also consider how and for whom solutions are created and promote the diversification of perspectives in the creation process. We actively engage others’ perspectives, recognize everyone’s potential to contribute new ideas, and work together to find creative solutions to complex problems.
-### SAFETY
+#### SAFETY
 We ensure our practices are ethical and unbiased to the best of our ability. We address biases when we discover it and practice good data governance. We work to improve practices and dismantle existing structures that create harm to people or the environment. 
-
diff --git a/docs/index.md b/docs/index.md
new file mode 100644
index 0000000..6eaed3f
--- /dev/null
+++ b/docs/index.md
@@ -0,0 +1,40 @@
+# Welcome to the Imageomics Institute!
+
+This wiki is intended to host internal documentation, making the information needed to get started with and use institute resources readily available to all members. It will evolve continuously with the institute.
+
+## Highlights
+There are many pages of useful information contained in this wiki covering a range of topics from institute hardware, to repositories and archives, to a glossary of _imageomics-related_ terms.
+
+### Just starting a project?
+Check out our guides to get your project off on the right foot!
+
+- [The GitHub Repo Guide](wiki-guide/GitHub-Repo-Guide.md): This page reviews expected and suggested GitHub repository contents, as well as structural considerations.
+
+- [The Hugging Face Repo Guide](wiki-guide/Hugging-Face-Repo-Guide.md): Analogous expected and suggested repository contents for Hugging Face repositories; there are notable differences from GitHub in both content and structure.
+
+- [Metadata Guide](wiki-guide/Metadata-Guide.md): Guide to metadata collection and documentation. This closely follows the [HF Dataset Card Template](wiki-guide/HF_DatasetCard_Template_mkdocs.md) sections.
+
+### Project repo up, what's next?
+Check out our workflow guides for how to interact with your new repo:
+
+- [The GitHub Workflow](wiki-guide/The-GitHub-Workflow.md): This page mainly focuses on branching and the PR process.
+
+- [The Hugging Face Workflow](wiki-guide/The-Hugging-Face-Workflow.md): Analogous workflow directions for Hugging Face; there are notable differences from GitHub in how this process works practically, though the concept is the same.
+
+### Project management or organization got you down?
+Discover new tools to help:
+
+- [Guide to GitHub Projects](wiki-guide/Guide-to-GitHub-Projects.md): This page focuses on GitHub's project management tool, Projects, which integrates issues and pull requests into a unified task board to keep tabs on how your project is progressing. Labels, milestones, and assignee tags provide improved organization, and allow for more focused views.
+
+- [Helpful Tools for your Workflow](wiki-guide/Helpful-Tools-for-your-Workflow.md): Collection of useful tools to facilitate and improve workflows. Comments and recommendations encouraged!
+
+- [Virtual Environments](wiki-guide/Virtual-Environments.md): Summary of `conda` and `pip` environments: how to make, use, and share them.
+
+### Other pages of note
+- [Glossary for Imageomics](wiki-guide/Glossary-for-Imageomics.md): Collection of terms used in imageomics. The goal is to ensure all participating domains are represented, thus facilitating interdisciplinary communication. This is a group effort, please check it out and add terms you think should be there!
+- [Command Line Cheat Sheet](wiki-guide/Command-Line-Cheat-Sheet.md): Collection of useful bash, emacs, and git commands.
+
+
+
+
+!!! question "[Questions, Comments, or Concerns?](https://github.com/Imageomics/Imageomics-guide/issues)"
diff --git a/logos/Imageomics_logo_butterfly.png b/docs/logos/Imageomics_logo_butterfly.png
similarity index 100%
rename from logos/Imageomics_logo_butterfly.png
rename to docs/logos/Imageomics_logo_butterfly.png
diff --git a/logos/Imageomics_logo_fish.png b/docs/logos/Imageomics_logo_fish.png
similarity index 100%
rename from logos/Imageomics_logo_fish.png
rename to docs/logos/Imageomics_logo_fish.png
diff --git a/docs/pdfs/Imageomics_Equity_Tool.pdf b/docs/pdfs/Imageomics_Equity_Tool.pdf
new file mode 100644
index 0000000..755a292
Binary files /dev/null and b/docs/pdfs/Imageomics_Equity_Tool.pdf differ
diff --git a/wiki-guide/Command-Line-Cheat-Sheet.md b/docs/wiki-guide/Command-Line-Cheat-Sheet.md
similarity index 83%
rename from wiki-guide/Command-Line-Cheat-Sheet.md
rename to docs/wiki-guide/Command-Line-Cheat-Sheet.md
index 204ff71..757282b 100644
--- a/wiki-guide/Command-Line-Cheat-Sheet.md
+++ b/docs/wiki-guide/Command-Line-Cheat-Sheet.md
@@ -29,8 +29,8 @@ See also [GitHub's Markdown Guide](https://docs.github.com/en/get-started/writin
 | `git checkout ` | checkout branch |
 | `git branch -d ` | delete branch |
-**Usual Process:**
-After making changes to a file, check the status of your current working branch (with `git status`). Then, you "add" the file, state what is new about the file ("commit the change"), and `push` the file from your local copy of the repo to the remote copy:
+#### Usual Process
+After making changes to a file on a branch, check the status of your current working branch (with `git status`). 
Then, you "add" the file, state what is new about the file ("commit the change"), and `push` the file from your local copy of the repo to the remote copy: ```bash git add @@ -41,10 +41,10 @@ git push ``` -**Note:** If you need to update your branch with changes from `main`, first switch to the branch, then set pull from `main` instead of the current branch, as below. +!!! note Note + If you need to update your branch with changes from the remote `main`, first switch to the branch, then set pull from `main` instead of the current branch, as below. + ```bash + git checkout -```bash -git checkout - -git pull origin main -``` + git pull origin main + ``` diff --git a/wiki-guide/GitHub-Repo-Guide.md b/docs/wiki-guide/GitHub-Repo-Guide.md similarity index 73% rename from wiki-guide/GitHub-Repo-Guide.md rename to docs/wiki-guide/GitHub-Repo-Guide.md index 9058c92..ae227b0 100644 --- a/wiki-guide/GitHub-Repo-Guide.md +++ b/docs/wiki-guide/GitHub-Repo-Guide.md @@ -3,12 +3,16 @@ Just joining or starting a new project and need a repository to store your work? You've come to the right place! Below we have compiled guidance on conventions and best practices for maintaining a shared (or shareable) repository of your work. -# Setting up a New Organization Repository +## Setting up a New Organization Repository -**Note:** We recommend doing development in a public repo, or at least publishing the repo in which development was done at the time of publication/release. However, if you're looking to have a public-facing repo _and_ a private repo for development, please be sure to read our guidance on the [Two Repo Problem](Two-Repo-Problem.md) before proceeding. +!!! note "Note" + We recommend doing development in a public repo, or at least publishing the repo in which development was done at the time of publication/release. 
However, if you're looking to have a public-facing repo _and_ a private repo for development, please be sure to read our guidance on the [Two Repo Problem](Two-Repo-Problem.md) before proceeding. + + ## Standard Files For each repository, include the following files in the root directory as soon as possible; they can (and should) be instantiated when you create a new repository. + * [README.md](#readme) * [LICENSE.md](#license) * [.gitignore](#gitignore) @@ -21,15 +25,15 @@ The README.md file is what everyone will notice first when they open your reposi Once you've created your repo, populate your README (you can do this by clicking on the file "README.md", then clicking the pencil at the top left to edit). Editing your README in the browser allows you to preview the formatting of the file before committing changes. The content of your README may vary based on the purpose or goal of your repo, but there are key elements that should always be included. -* Summary of the repo: - * This could be a simple explanation of what the package or tool developed in your repo is intended to do, - * Or an abstract describing your research. -* Detailed documentation on how to access and use the project software (User Guide). - * Including installation of [dependencies](#dependencies-and-environments). - * If your tool requires input be in a particular format, this would be included in the README. It would also help to include an example file demonstrating the format. -* Information about the sources you've used (links and what they were used for), such as: - * Tools from other repos - * Data for analysis +- Summary of the repo: + - This could be a simple explanation of what the package or tool developed in your repo is intended to do, + - Or an abstract describing your research. +- Detailed documentation on how to access and use the project software (User Guide). + - Including installation of [dependencies](Virtual-Environments.md). 
+ - If your tool requires input be in a particular format, this would be included in the README. It would also help to include an example file demonstrating the format. +- Information about the sources you've used (links and what they were used for), such as: + - Tools from other repos + - Data for analysis For more inspiration on making an awesome README, check out [this list](https://github.com/matiassingers/awesome-readme). @@ -37,7 +41,8 @@ For more inspiration on making an awesome README, check out [this list](https:// #### 1. Select a license. Alongside the appropriate stakeholders, select a license that is [Open Source Initiative](https://opensource.org/licenses) (OSI) compliant. -*Remember, a public repository on GitHub with no license can be viewed and forked by others under GitHub's ToS, but unless the author associates a license, it is unclear what others are allowed to do with it legally. Adding an OSI license can help others feel comfortable contributing!* +!!! note "Remember" + A public repository on GitHub with no license can be viewed and forked by others under GitHub's ToS, but unless the author associates a license, it is unclear what others are allowed to do with it legally. Adding an OSI license can help others feel comfortable contributing! For more information on how to choose a license and why it matters, see [Choose A License](https://choosealicense.com) and [A Quick Guide to Software Licensing for the Scientist-Programmer](https://doi.org/10.1371/journal.pcbi.1002598) by A. Morin, et al. @@ -77,10 +82,11 @@ As with journal publications, we expect to be cited when someone uses our code. You can check your CITATION.cff file prior to upload using this [validator tool](https://www.yamllint.com/). -**Note:** -- Subcategories of `preferred-citation` do not get bullet points, but the first subcategory of `references` must be bulleted (as below). -- This is generally intended as a reference for your code. 
Preferred citation can be used for the paper, though it is better to ask in the `README` that someone cites _both_ and provide the paper reference there (only the `preferred-citation` will show up to be copied from the citation box if it is included). -```yaml +!!! note "Note" + - Subcategories of `preferred-citation` do not get bullet points, but the first subcategory of `references` must be bulleted (as below). + - This is generally intended as a reference for your code. Preferred citation can be used for the paper, though it is better to ask in the `README` that someone cites _both_ and provide the paper reference there (only the `preferred-citation` will show up to be copied from the citation box if it is included). + +```yaml { py linenums="1" } abstract: "" authors: - family-names: @@ -131,26 +137,35 @@ references: ## Additional Considerations ### Formatting and Naming Conventions -* Dates and Times: For interoperability and avoiding ambiguity, [dates and times should be reported](https://dataoneorg.github.io/Education/bestpractices/describe-formats-for) in [ISO 8601 format](https://en.wikipedia.org/wiki/ISO_8601). + +**Dates and Times** + +For interoperability and to avoid ambiguity, [dates and times should be reported](https://dataoneorg.github.io/Education/bestpractices/describe-formats-for) in [ISO 8601 format](https://en.wikipedia.org/wiki/ISO_8601). + - For dates, this means `YYYY-MM-DD` (for ISO 8601 compliance, the dashes are required). - For times, use `THHMMSS` in 24-hour format. - - For example, the moment when there were 60 seconds left before New Year 2000 would be 1999-12-31T235900. 
-* Branches - * Primary branch: `main` - * Other branches: follow the pattern `category/reference/description` - * category: `feature`, `bugfix`, `experiment` - * `feature` is for new functionality - * `bugfix` is for fixing errors - * `experiment` is for more open-ended - * the associated issue (if no issue, put no-ref), formatted as `issue-NN` - * description: brief description, e.g. `solve-world-hunger` - * e.g. `git branch feature/issue-1/general-ai` -* Commits: to combine human- and computer-readability into commit messages, follow the [Conventional Commits specification](https://www.conventionalcommits.org/en/v1.0.0/#summary). + - For example, the moment when there were 60 seconds left before New Year 2000 would be `1999-12-31T235900`. + +**Branches** + + - Primary branch: `main` + - Other branches follow the pattern `category/reference/description`: + - **category**: `feature`, `bugfix`, `experiment` + - `feature` is for new functionality + - `bugfix` is for fixing errors + - `experiment` is for more open-ended work + - the associated issue (if no issue, put `no-ref`), formatted as `issue-NN` + - description: brief description, e.g., `solve-world-hunger` + - Example: `git branch feature/issue-1/general-ai` + +**Commits** + +To combine human- and computer-readability into commit messages, follow the [Conventional Commits specification](https://www.conventionalcommits.org/en/v1.0.0/#summary). ### Workflow Do not conduct routine work in the `main` branch. Only do one thing on a branch at a time. Prune a branch once its purpose is fulfilled and it is merged (i.e., delete it). -For more information on creating, merging, and deleting branches, see the [GitHub Workflow Guide](2.1.-The-GitHub-Workflow.md). +For more information on creating, merging, and deleting branches, see the [GitHub Workflow Guide](The-GitHub-Workflow.md). 
## General Repository Structure In addition to the [standard files](#standard-files) recommended for every repo, you will likely have some code, notebooks, and data. For an easily accessible and readable repo, it is good to organize these files within a clear directory (folder) structure, such as @@ -163,14 +178,10 @@ Project_Directory - data ``` -**Note:** Depending on the size of your data, `data` may only be local on your machine in which case it is good to include instructions to access the data where appropriate. - - +!!! note "Note" + Depending on the size of your data, `data` may only be local on your machine in which case it is good to include instructions to access the data where appropriate. *** - - - # Working on GitHub After the initial creation of a repo on the GitHub website, there are two primary modes of interacting with it. @@ -180,15 +191,19 @@ After the initial creation of a repo on the GitHub website, there are two primar 2. Through the GitHub Desktop App, [GitHub Desktop](https://desktop.github.com/) - GitHub provides documentation to get started on [Mac](https://docs.github.com/en/desktop/installing-and-configuring-github-desktop/overview/getting-started-with-github-desktop?platform=mac) or [Windows](https://docs.github.com/en/desktop/installing-and-configuring-github-desktop/overview/getting-started-with-github-desktop?platform=windows), as well as extensive documentation on use cases we discuss throughout the wiki [here](https://docs.github.com/en/desktop/contributing-and-collaborating-using-github-desktop). + GitHub provides documentation to get started on [Mac](https://docs.github.com/en/desktop/overview/getting-started-with-github-desktop?platform=mac) or [Windows](https://docs.github.com/en/desktop/overview/getting-started-with-github-desktop?platform=windows), as well as extensive documentation on use cases we discuss throughout the wiki [here](https://docs.github.com/en/desktop/contributing-and-collaborating-using-github-desktop). 
- **Note:** The bulk of our step-by-step guides will outline interaction through the command line, but the same principles apply to using GitHub Desktop. +!!! note "Note" + The bulk of our step-by-step guides will outline interaction through the command line, but the same principles apply to using GitHub Desktop. + ## Cloning a Repository Navigate to the main ("<> Code") page of your repository and click the green button at the top right corner (as shown below) and copy the link (for command line) or select "Open with GitHub Desktop". For command line interaction, navigate within the `bash` shell to the directory where you would like to place your local copy of the repo (`cd `), then clone the repo into that folder (`git clone `), this will generate a local copy of the repo on your computer. -![Screenshot 2023-05-16 at 5 22 25 PM](https://github.com/Imageomics/internal-guidelines/assets/38985481/43857a4d-789b-4073-b872-da29c4474916) +![Clone repository interface](images/GH-repo-guide/238778583-43857a4d-789b-4073-b872-da29c4474916.png){ loading=lazy } +/// caption +/// If you would like a specific branch, use `git clone -b `. @@ -196,5 +211,3 @@ If you would like a specific branch, use `git clone -b ` Generally, repositories are organized around an Imageomics Project/Topic/Team, eg., butterflies. These broader topics may contain various projects organized under a GitHub [Team](https://github.com/orgs/Imageomics/teams) focused on that topic. Both [projects](https://github.com/orgs/Imageomics/projects?query=is%3Aopen) and [repositories](https://github.com/orgs/Imageomics/repositories) may be linked to teams, providing an organizational structure upon which to plan and manage tasks while maintaining a clear link/connection to the work being done on those tasks. Note that a project may encapsulate multiple repositories just as a repository may be referenced by multiple projects. Ideally, each task will be linked to an issue in the relevant repository. 
Team members may then be assigned tasks, and asynchronous discussions about the task can be recorded on its issue page in the repository. To accomplish the task, a new branch should be created following the [branch naming conventions](#formatting-and-naming-conventions); do not work directly on the `main` branch. Once the task is completed, a pull request can be opened to merge the changes into the main branch (see the [GitHub Workflow Guide](The-GitHub-Workflow.md) and the [PR Guide](The-GitHub-Pull-Request-(PR)-Guide.md) for more details on this process). Reviewers may be assigned to each pull request to ensure compatibility and that the proposed solution functions as expected/needed; this is an opportunity for more dialogue. - - diff --git a/wiki-guide/Glossary-for-Imageomics.md b/docs/wiki-guide/Glossary-for-Imageomics.md similarity index 79% rename from wiki-guide/Glossary-for-Imageomics.md rename to docs/wiki-guide/Glossary-for-Imageomics.md index 7d37ce9..0e06b68 100644 --- a/wiki-guide/Glossary-for-Imageomics.md +++ b/docs/wiki-guide/Glossary-for-Imageomics.md @@ -8,18 +8,18 @@ Definitions are not meant to be comprehensive. Ideally, they will be tailored to It is meant to be a collaborative effort, so please [contribute](https://github.com/Imageomics/Imageomics-guide/issues) terms you would like defined, definitions you know, or corrections for errors you notice! -# A -### Application Programming Interface (API) +## A +#### Application Programming Interface (API) -### Autoencoder +#### Autoencoder -# B +## B -# C +## C -### CARE Principles for Indigenous Data Governance +#### CARE Principles for Indigenous Data Governance "People and purpose-oriented" to complement [FAIR Principles](#fair-data-principles). **C**ollective Benefit @@ -32,38 +32,39 @@ It is meant to be a collaborative effort, so please [contribute](https://github. For more information, see [CARE Principles for Indigenous Data Governance](https://www.gida-global.org/care). 
-### Contrastive Language-Image Pre-training (CLIP) +#### Contrastive Language-Image Pre-training (CLIP) -# D -### Decoder +## D +#### Decoder -### Dimensionality Reduction +#### Dimensionality Reduction Used in machine learning and data analysis to refer to a set of methods used to reduce the number of variables or features under consideration to a smaller subset with the greatest explanatory power without drastically reducing the accuracy of the model or analysis. The purpose is to exclude irrelevant, redundant, and noisy information, thereby improving computational complexity and model interpretability. That is, it seeks to preserve the "most important" variables or features of the data based on some quantitative metric, such as variance, while removing "less important" variables or features. This is especially helpful when using high-dimensional data such as images or genomes. Dimensionality reduction techniques can be subdivided into two main categories: -* [Feature Extraction](#feature-extraction) -* [Feature Selection](#feature-selection) -### Docker +- [Feature Extraction](#feature-extraction) +- [Feature Selection](#feature-selection) +#### Docker -# E -### Ecology +## E +#### Ecology -### Epoch (in machine learning) +#### Epoch (in machine learning) -### Encoder +#### Encoder -### Experiment (in machine learning) -# F -### FAIR Data Principles +#### Experiment (in machine learning) + +## F +#### FAIR Data Principles **F**indable -- metadata and data easily found by both humans and machines **A**ccessible -- clear indication of how to access data once it is found. @@ -72,60 +73,60 @@ Dimensionality reduction techniques can be subdivided into two main categories: **R**eusable -- clearly described so it is easily used by others. -For more information, see [fair principles](https://www.go-fair.org/fair-principles/). +For more information, see [FAIR principles](https://www.go-fair.org/fair-principles/). 
-### Feature +#### Feature In machine learning and data science, a feature is a single measurable property or characteristic of the phenomenon under observation. With tabular data, a feature is a column in the dataset used by a model to make predictions. In genomics, a feature could be, for example, gene expression levels, the presence (or absence) of certain genetic variants (such as [SNPs](#single-nucleotide-polymorphism-snp), insertions and deletions (indels), and others), or epigenetic markers. -### Feature Extraction +#### Feature Extraction A set of [dimensionality reduction](#dimensionality-reduction) techniques used to map raw data to a smaller set of features. Example techniques include [PCA](#principal-component-analysis-pca), [MDS](#multidimensional-scaling-mds), [t-SNE](#t-distributed-stochastic-neighbor-embedding-t-sne), [autoencoders](#autoencoder), and Fourier or wavelet transforms. The key difference from feature selection is that feature extraction generates a new set of features from the original dataset by projecting or mapping the data into a new feature space rather than selecting from existing features. -### Feature Selection +#### Feature Selection A method to select a subset of relevant features for use in model construction. The key difference from feature extraction is that feature selection does not generate new features but rather identifies the most meaningful existing features in a dataset by excluding redundant or irrelevant features. For example, in genomics, feature selection would involve selecting the most important gene(s) relevant to a certain phenotype among thousands of genes. -### Feature Space +#### Feature Space -# G -### Genome-Wide Association Study (GWAS) +## G +#### Genome-Wide Association Study (GWAS) -# H -### Hyperparameter Tuning -The process of selecting the best hyperparameters for a machine learning model by minimizing the [loss function](#loss-function). 
This can be done through [experiments](#experiments-in-ml) or in some cases, using optimization techniques. Hyperparameters are parameters that are set by the researcher before training and are not learned during the training process. Some examples of common hyperparameters are [learning rate](#learning-rate), number of [epochs](#epoch-in-machine-learning), number of clusters (k) in [k-means clustering](#k-means-clustering), and many others. +## H +#### Hyperparameter Tuning +The process of selecting the best hyperparameters for a machine learning model by minimizing the [loss function](#loss-function). This can be done through [experiments](#experiment-in-machine-learning) or in some cases, using optimization techniques. Hyperparameters are parameters that are set by the researcher before training and are not learned during the training process. Some examples of common hyperparameters are [learning rate](#learning-rate), number of [epochs](#epoch-in-machine-learning), number of clusters (k) in [k-means clustering](#k-means-clustering), and many others. -# I -### Imageomics +## I +#### Imageomics i-'mi-j**ə**-'**ō**-miks A new scientific field in which computational (machine learning) tools built around biological knowledge bases are used by biologists to analyze image data in order to characterize patterns and gain insights into traits and relationships at individual, population and species scales—insights that then get incorporated into the algorithms that run the tools. -# J +## J -# K -### K-Means Clustering +## K +#### K-Means Clustering -# L -### Latent Space +## L +#### Latent Space -### Learning Rate +#### Learning Rate -### Loss Function +#### Loss Function -# M -### Multidimensional Scaling (MDS) +## M +#### Multidimensional Scaling (MDS) -# N -### Nucleotide +## N +#### Nucleotide The fundamental building blocks of DNA and RNA. A nucleotide is composed of a base and a sugar-phosphate backbone. 
Bases for DNA: adenine (A), guanine (G), cytosine (C), and thymine (T). @@ -140,64 +141,64 @@ The bases A, G, and C are the same molecule for DNA and RNA. T and U are incorpo A DNA or RNA molecule consists of a chain of the four relevant nucleotides in a sequence, where the order of A, G, C, and T in the DNA sequence determines the "blueprint" for the organism, and the order and length of A, G, C, and U in an RNA sequence determines the purpose and function of the RNA molecule, which can be a messenger RNA (mRNA) that encodes a protein, a microRNA (miRNA) which are short RNAs that help regulate gene expression by binding to other mRNAs, and many others. -# O -### Ontology +## O +#### Ontology -# P -### Phenotype +## P +#### Phenotype -### Phylogeny +#### Phylogeny -### Pre-training +#### Pre-training -### Principal Component Analysis (PCA) +#### Principal Component Analysis (PCA) -# Q +## Q -# R +## R -# S -### Single Nucleotide Polymorphism (SNP) +## S +#### Single Nucleotide Polymorphism (SNP) A SNP (pronounced "snip") is a variation in the [nucleotide](#nucleotide) present at a single position in a DNA sequence among individuals in a species. For example, a SNP may be the replacement of a cytosine (C) by a thymine (T) at the same location in a stretch of DNA, where C is observed in a subset of individuals and T is observed in the others. -### Snakemake +#### Snakemake -### Subspecies +#### Subspecies -### Supervised Learning +#### Supervised Learning As opposed to [unsupervised learning](#unsupervised-learning), supervised learning methods learn from labeled data. That is, it is trained using input data that is labeled with corresponding outputs, such as the input of an image and the output of a classification. 
-# T -### Taxonomy +## T +#### Taxonomy -### t-Distributed Stochastic Neighbor Embedding (t-SNE) +#### t-Distributed Stochastic Neighbor Embedding (t-SNE) -### Trait +#### Trait -### Transfer Learning +#### Transfer Learning -# U -### Unsupervised Learning +## U +#### Unsupervised Learning As opposed to [supervised learning](#supervised-learning), unsupervised learning detects patterns or structures within the input data without any labels. Clustering and dimensionality reduction techniques are some examples. -# V +## V VLMs (Vision-Language Models) -# W +## W -# X +## X -# Y +## Y -# Z -### Zero-Shot Prediction +## Z +#### Zero-Shot Prediction diff --git a/wiki-guide/Guide-to-GitHub-Projects.md b/docs/wiki-guide/Guide-to-GitHub-Projects.md similarity index 99% rename from wiki-guide/Guide-to-GitHub-Projects.md rename to docs/wiki-guide/Guide-to-GitHub-Projects.md index b3d7005..c7bb394 100644 --- a/wiki-guide/Guide-to-GitHub-Projects.md +++ b/docs/wiki-guide/Guide-to-GitHub-Projects.md @@ -14,4 +14,4 @@ When starting a new project, it can be helpful to have a shared tracker or proje ## Interacting with GitHub Projects To help you get started working with [GitHub Projects](https://docs.github.com/en/issues/planning-and-tracking-with-projects/learning-about-projects/about-projects), we have an [Imageomics General Project Template](https://github.com/orgs/Imageomics/projects/31/views/1) with both a [Taskboard](https://github.com/orgs/Imageomics/projects/31/views/1) and [Table](https://github.com/orgs/Imageomics/projects/31/views/2) view initialized, along with label and milestone displays turned on. -Both of these views will automatically stay updated so that each member of the project can utilize whichever version they find most informative. Issues can be added directly to the project board/table or on the repo (if added on the repo, they must be linked to the project, and have status assigned). 
Milestones must be created on the repo (under the Issues tab, select "Milestones" to create one). \ No newline at end of file +Both of these views will automatically stay updated so that each member of the project can utilize whichever version they find most informative. Issues can be added directly to the project board/table or on the repo (if added on the repo, they must be linked to the project, and have status assigned). Milestones must be created on the repo (under the Issues tab, select "Milestones" to create one). diff --git a/templates/HF_DatasetCard_Template_Imageomics.md b/docs/wiki-guide/HF_DatasetCard_Template_Imageomics.md similarity index 99% rename from templates/HF_DatasetCard_Template_Imageomics.md rename to docs/wiki-guide/HF_DatasetCard_Template_Imageomics.md index e6be519..2ec3a2e 100644 --- a/templates/HF_DatasetCard_Template_Imageomics.md +++ b/docs/wiki-guide/HF_DatasetCard_Template_Imageomics.md @@ -265,4 +265,3 @@ This work was supported by the [Imageomics Institute](https://imageomics.org), w [More Information Needed--optional] - diff --git a/docs/wiki-guide/HF_DatasetCard_Template_mkdocs.md b/docs/wiki-guide/HF_DatasetCard_Template_mkdocs.md new file mode 100644 index 0000000..d0630c7 --- /dev/null +++ b/docs/wiki-guide/HF_DatasetCard_Template_mkdocs.md @@ -0,0 +1,7 @@ +# Dataset Card Template + +Below is the **HF_DatasetCard_Template_Imageomics.md**. You can copy this content and paste it into a new Markdown file to create a new dataset card. 
+ +[Download Template](https://github.com/Imageomics/Imageomics-guide/raw/main/docs/wiki-guide/HF_DatasetCard_Template_Imageomics.md) + +{{ include_file_as_code("docs/wiki-guide/HF_DatasetCard_Template_Imageomics.md") }} diff --git a/templates/HF_ModelCard_Template_Imageomics.md b/docs/wiki-guide/HF_ModelCard_Template_Imageomics.md similarity index 100% rename from templates/HF_ModelCard_Template_Imageomics.md rename to docs/wiki-guide/HF_ModelCard_Template_Imageomics.md diff --git a/docs/wiki-guide/HF_ModelCard_Template_mkdocs.md b/docs/wiki-guide/HF_ModelCard_Template_mkdocs.md new file mode 100644 index 0000000..50d8ff5 --- /dev/null +++ b/docs/wiki-guide/HF_ModelCard_Template_mkdocs.md @@ -0,0 +1,7 @@ +# Model Card Template + +Below is the **HF_ModelCard_Template_Imageomics.md**. You can copy this content and paste it into a new Markdown file to create a new model card. + +[Download Template](https://github.com/Imageomics/Imageomics-guide/raw/main/docs/wiki-guide/HF_ModelCard_Template_Imageomics.md) + +{{ include_file_as_code("docs/wiki-guide/HF_ModelCard_Template_Imageomics.md") }} diff --git a/wiki-guide/Handling-API-Keys.md b/docs/wiki-guide/Handling-API-Keys.md similarity index 70% rename from wiki-guide/Handling-API-Keys.md rename to docs/wiki-guide/Handling-API-Keys.md index 7bf3e77..afc27d8 100644 --- a/wiki-guide/Handling-API-Keys.md +++ b/docs/wiki-guide/Handling-API-Keys.md @@ -1,11 +1,12 @@ # Handling API Keys If you are using a web service with API keys, there are a few things to keep in mind.
The key to key storage is that the process must meet the following requirements: -* Not hard-coded into your code -* Not visible in version-control -* Convenient to use -* Convenient to change if needed -* Unique for different environments + +- Not hard-coded into your code +- Not visible in version-control +- Convenient to use +- Convenient to change if needed +- Unique for different environments ## Key Storage Our recommended way of storing and using API is within `.env` (dotenv) files. @@ -18,16 +19,18 @@ For instance, if your API key for OpenAI is `sk-AaBbCcDdEeFfGgHhIiJjKkLlMmNnOoPp ``` OPENAI_API_KEY=sk-AaBbCcDdEeFfGgHhIiJjKkLlMmNnOoPpQqRrSsTtUuVvWwXxYyZz ``` -* Ensure `.env` is added to your `.gitignore` file. The `.env` should not be published in a remote repository; it should be for your eyes only. -* Store the `.env` file in the root directory for your project. -* Backup the `.env` or key in a secure location. A free personal account with [Bitwarden](https://bitwarden.com/) is an excellent option for this. -* If you notice the key or the `.env` file has been published somewhere public for any length of time, it must be changed immediately. -> Note: the `.env` file can be used for setting arbitrary environment variables used by your code besides API keys. +- Ensure `.env` is added to your `.gitignore` file. The `.env` should not be published in a remote repository; it should be for your eyes only. +- Store the `.env` file in the root directory for your project. +- Backup the `.env` or key in a secure location. A free personal account with [Bitwarden](https://bitwarden.com/) is an excellent option for this. +- If you notice the key or the `.env` file has been published somewhere public for any length of time, it must be changed immediately. + +!!! note "Note" + The `.env` file is a simple text file, so you can use any text editor to create and edit it. ## Key Usage If you are using Python, the `dotenv` package will enable to use this approach. 
First, install with [pip](https://pypi.org/project/python-dotenv/) or [conda](https://anaconda.org/conda-forge/python-dotenv). In your work, the following will get you access to your API key as a Python variable `RESOURCE_API_KEY` (you may name it whatever you like; the Python variable may be different from the environment variable): -```python +```python { py linenums="1" } import os from dotenv import load_dotenv @@ -38,5 +41,6 @@ RESOURCE_API_KEY = os.getenv("RESOURCE_API_KEY") ## Keys for a Shared Resource If you are part of a group with access to the same API: -* Create a unique API key for each application you use and for each environment you work in. -* Avoid sharing API keys with other users or between different applications/scripts. + +- Create a unique API key for each application you use and for each environment you work in. +- Avoid sharing API keys with other users or between different applications/scripts. diff --git a/wiki-guide/Helpful-Tools-for-your-Workflow.md b/docs/wiki-guide/Helpful-Tools-for-your-Workflow.md similarity index 99% rename from wiki-guide/Helpful-Tools-for-your-Workflow.md rename to docs/wiki-guide/Helpful-Tools-for-your-Workflow.md index 6883129..c09a7a7 100644 --- a/wiki-guide/Helpful-Tools-for-your-Workflow.md +++ b/docs/wiki-guide/Helpful-Tools-for-your-Workflow.md @@ -36,4 +36,3 @@ ruff check ``` Ruff can also be set up as part of a pre-commit hook or GitHub Workflow. See their [Usage section](https://github.com/astral-sh/ruff?tab=readme-ov-file#usage) for more information. - diff --git a/wiki-guide/Hugging-Face-Repo-Guide.md b/docs/wiki-guide/Hugging-Face-Repo-Guide.md similarity index 71% rename from wiki-guide/Hugging-Face-Repo-Guide.md rename to docs/wiki-guide/Hugging-Face-Repo-Guide.md index 311fdec..35eacb5 100644 --- a/wiki-guide/Hugging-Face-Repo-Guide.md +++ b/docs/wiki-guide/Hugging-Face-Repo-Guide.md @@ -3,34 +3,34 @@ Need a repository to store your data or model? You've come to the right place! 
Below we have compiled guidance on conventions and best practices for maintaining a shared (or shareable) Hugging Face repository of your work. -# Setting up a New Organization Repository +## Setting up a New Organization Repository -## Standard Files -For each repository, include the following files in the root directory as soon as possible; a license can (and should) be instantiated when you create a new repository, and the standard `.gitattributes` will be generated for you. On the [Imageomics HF](https://github.com/organizations/Imageomics) select `New` and pick which type of repository you need. -* [README.md](#readme) -* [LICENSE.md](#license) -* [.gitignore](#gitignore) -* [.gitattributes](#gitattributes) +### Standard Files +For each repository, include the following files in the root directory as soon as possible; a license can (and should) be instantiated when you create a new repository, and the standard `.gitattributes` will be generated for you. On the [Imageomics HF](https://huggingface.co/imageomics) select `New` and pick which type of repository you need. -More [recommendations](#recommended-files) are discussed below. +- [README.md](#readme) +- [LICENSE.md](#license) +- [.gitignore](#gitignore) +- [.gitattributes](#gitattributes) -### README -The README.md file is generally referred to as either a Dataset or Model Card and is what everyone will notice first when they open your repository on Hugging Face. Choose the appropriate Imageomics-specific HF template ([model](../templates/HF_ModelCard_Template_Imageomics.md?plain=1) or [dataset](../templates/HF_DatasetCard_Template_Imageomics.md?plain=1)) to get started. Be sure to include a brief description and as much information as possible at the beginning. You can update this file as you go, so don't remove the recommended sections prior to completion. 
The templates include descriptions of many fields, Imageomics grant information, citation formatting, and some notes on HF-flavored markdown to get you started. +#### README +The README.md file is generally referred to as either a Dataset or Model Card and is what everyone will notice first when they open your repository on Hugging Face. Choose the appropriate Imageomics-specific HF template ([model](HF_ModelCard_Template_mkdocs.md) or [dataset](HF_DatasetCard_Template_mkdocs.md)) to get started. Be sure to include a brief description and as much information as possible at the beginning. You can update this file as you go, so don't remove the recommended sections prior to completion. The templates include descriptions of many fields, Imageomics grant information, citation formatting, and some notes on HF-flavored markdown to get you started. -Once you've created your repo, populate your README (you can do this online by selecting "Create Dataset/Model Card" and pasting in the appropriate [Imageomics HF template](../templates), then filling in your info). Editing your README in the browser allows you to preview the formatting of the file before committing changes. +Once you've created your repo, populate your README (you can do this online by selecting "Create Dataset/Model Card" and pasting in the appropriate Imageomics HF template, then filling in your info). Editing your README in the browser allows you to preview the formatting of the file before committing changes. -### LICENSE -#### 1. Select a license. +#### LICENSE +##### 1. Select a license. Alongside the appropriate stakeholders, select a license that is [Open Source Initiative](https://opensource.org/licenses) (OSI) compliant. -*Remember, a public repository on Hugging Face with no license can be viewed and accessed by others, but unless the author associates a license, it is unclear what others are allowed to do with it legally. 
Adding an OSI license can help others feel comfortable building off your work!* +!!! note "Remember" + A public repository on Hugging Face with no license can be viewed and accessed by others, but unless the author associates a license, it is unclear what others are allowed to do with it legally. Adding an OSI license can help others feel comfortable building off your work! For more information on how to choose a license and why it matters, see [Choose A License](https://choosealicense.com) and [A Quick Guide to Software Licensing for the Scientist-Programmer](https://doi.org/10.1371/journal.pcbi.1002598) by A. Morin, et al. -#### 2. Add LICENSE.md to the repository. +##### 2. Add LICENSE.md to the repository. Once a license has been chosen (if not initialized with one), add the appropriate license label in the `yaml` portion of the README (the web UI generates a dropdown of recommendations under "Edit dataset/model card"). -### gitignore +#### gitignore As with GitHub, the `.gitignore` file is an important tool for maintaining a clean repository by ensuring that git will not track temp files of any and all your collaborators (no pesky `pycache` or `.DS_Store` files floating around). The same [options for GitHub](https://github.com/github/gitignore) are usable here, and if you or anyone on your team uses a Mac (or if you intend to encourage outside collaboration on this repo), add @@ -40,15 +40,14 @@ The same [options for GitHub](https://github.com/github/gitignore) are usable he ``` at the end of the `.gitignore` file. -### gitattributes +#### gitattributes The `.gitattributes` file determines file patterns to be tracked by [`git LFS`](https://git-lfs.com/) (Git Large File Storage). The preset `gitattributes` file includes many binary file types, but you may need to add particular files if they get too large (eg., a large CSV, but do **NOT** store all CSV files with `git LFS`, just add the particular one or pattern). Pattern-matching can be done using `*`. 
You can either add the file (and appropriate pattern description) to the `.gitattributes` file, or add it in the command line: ``` git lfs track "my-big-list.csv" ``` Then add and commit the `.gitattributes` file as described below. - -# Hugging Face Pull Requests With Local Edits +## Hugging Face Pull Requests With Local Edits Hugging Face also has a pull request (PR) feature, though the process is a bit different from GitHub. As with GitHub, you can interact through the web browser or a command line interface (eg., terminal on Mac). However, instead of the `create new branch` option, there is a `create new pull request` option. It is still preferable to avoid committing everything directly to main. To make further changes to the particular PR created on the browser, one must first clone the repo: @@ -71,5 +70,5 @@ git push origin pr/:refs/pr/ For more information on Hugging Face Pull Requests and Discussions, see their [documentation](https://huggingface.co/docs/hub/repositories-pull-requests-discussions). -# Templates for Model and Dataset Cards +## Templates for Model and Dataset Cards See [here](https://github.com/Imageomics/Imageomics-Guide#hugging-face) for guidelines on using templates for these important pieces of documentation. diff --git a/wiki-guide/Metadata-Guide.md b/docs/wiki-guide/Metadata-Guide.md similarity index 64% rename from wiki-guide/Metadata-Guide.md rename to docs/wiki-guide/Metadata-Guide.md index 872a38c..520a7a8 100644 --- a/wiki-guide/Metadata-Guide.md +++ b/docs/wiki-guide/Metadata-Guide.md @@ -1,16 +1,19 @@ # Metadata Guide -When collecting or compiling new data, there are generally questions one is _trying_ to answer. There are also often questions that will come up later--whether for yourself or others interested in using your data. 
To improve both the _**Findability**_ and _**Reusability**_ of your data (ensuring [FAIR principles](Glossary-for-Imageomics.md#fair-data-principles)) for yourself and others, be sure to note down the following information. +When collecting or compiling new data, there are generally questions one is _trying_ to answer. There are also often questions that will come up later--whether for yourself or others interested in using your data. -**Note:** This is _**NOT**_ an exhaustive list. Be sure to include any other information that may be important to your particular project or field. +To improve both the _**Findability**_ and _**Reusability**_ of your data (ensuring [FAIR principles](Glossary-for-Imageomics.md#fair-data-principles)) for yourself and others, be sure to note down the following information. -## Metadata to record: +!!! note "This is not an exhaustive list." + Be sure to include any other information that may be important to your particular project or field. + +## Checklist for Metadata to Record - [ ] **Description:** Summary of your data, for instance: - What are the contents of the data (images, text, type of animal)? - Is it machine-ready? - Where did it come from (Source)? - [ ] **Data Sources:** Machine-readable sources of the data (links or other files). -- [ ] **License Information:*** This is part of retaining records of a data source (eg., museum images, previous dataset). A record of licenses on the images must be retained to ensure they are respected. If dealing with CC licenses, please see this [OSU Library CC best practices guide](https://library.osu.edu/sites/default/files/2022-10/attributing_cc_license_flyer_2022_ac.pdf). +- [ ] **License Information:** This is part of retaining records of a data source (eg., museum images, previous dataset). A record of licenses on the images must be retained to ensure they are respected. 
If dealing with CC licenses, please see this [OSU Library CC best practices guide](https://library.osu.edu/sites/default/files/2022-10/attributing_cc_license_flyer_2022_ac.pdf). - [ ] **Dataset Structure:** - Organization of the full dataset (eg., file structure). - Feature information: Information available for each image, such as species and subspecies designations, location information, etc. @@ -25,13 +28,16 @@ When collecting or compiling new data, there are generally questions one is _try - [ ] **Related Publication:** Any papers that are based on this dataset. - [ ] **Related Datasets:** Provide links to any related datasets (may include previous/background research). - [ ] **Other References:** Links to any related/background articles. - - [ ] **Keywords/Tags:** Terms one might search to find this dataset, eg., type(s) of animals, type(s) of images, imbalanced (if not even distribution of species/subspecies/etc). - It helps to keep a running list. - [ ] **Notes:** Any other image/data information. -*Datasets **_cannot_** be redistributed without this information. +!!! warning "Remember" + + Datasets **_cannot_** be redistributed without this information. + +!!! tip "Pro tip" ->**Pro-tip:** Copy this markdown into an issue on your GitHub [Repo](GitHub-Repo-Guide.md) or [Project](Guide-to-GitHub-Projects.md) so you can check the boxes as you add each. + Use the eye icon at the top of this page to access the source and copy the markdown for the checklist above into an issue on your GitHub [Repo](GitHub-Repo-Guide.md) or [Project](Guide-to-GitHub-Projects.md) so you can check the boxes as you add each. -[Questions, Comments, Concerns?](https://github.com/Imageomics/Imageomics-guide/issues) +!!! 
question "[Questions, Comments, or Concerns?](https://github.com/Imageomics/Imageomics-guide/issues)" diff --git a/docs/wiki-guide/The-GitHub-Pull-Request-(PR)-Guide.md b/docs/wiki-guide/The-GitHub-Pull-Request-(PR)-Guide.md new file mode 100644 index 0000000..aa32af0 --- /dev/null +++ b/docs/wiki-guide/The-GitHub-Pull-Request-(PR)-Guide.md @@ -0,0 +1,164 @@ +# **GitHub Pull Request (PR) Guide Overview** + +This guide is divided into three essential sections to help you effectively manage pull requests in a collaborative project: + +- [Create a Pull Request](#1-create-a-pull-request): This section explains how to properly prepare and submit a pull request (PR) to ensure that your changes are well-documented, easy to review, and aligned with project goals. +- [Review a Pull Request](#2-review-a-pull-request): Learn the best practices for providing constructive feedback, identifying potential issues, and ensuring code quality during the review process. +- [Respond to a Pull Request Review](#3-respond-to-a-pull-request-review): Understand how to address reviewer feedback, make necessary changes, and ensure your pull request meets the required standards for approval. + +By following these steps, you will contribute to a smooth and efficient workflow, ensuring collaboration and quality in your project. + + +## **1. Create a Pull Request** +Before creating a pull request, first, please follow [2.1. The GitHub Workflow](The-GitHub-Workflow.md) to create and push your Branch. + +### 1.1 Navigate to the Repository's Main Page +On GitHub, go to the main page of the repository where you’ve pushed your branch. + +### 1.2 Select Your Branch +From the "Branch" menu, choose the branch that contains your changes (the one you just pushed). + +### 1.3 Click 'Compare & pull request' +You’ll see a button labeled Compare & pull request. Click this to begin the process of creating a pull request for your changes. 
+ +![GitHub's Compare & pull request button](images/GH-PR-guide/365234535-659b312e-d95d-4bee-a958-4ce23fc4255d.png){ loading=lazy, width="800" } +/// caption +/// + +### 1.4 Add Title and Description +In the pull request form, type a descriptive title for your PR. Provide a detailed description of the changes you've made, why they are important, and any other relevant information. + +![GitHub's pull request title and description interface](images/GH-PR-guide/365234601-90a0bfcf-807d-4983-a643-0678ace542d2.png){ loading=lazy, width="800" } + +### 1.5 Choose Review Type + +- If your pull request is ready for review, click Create Pull Request. +- If you want to create a draft version of the pull request for further work before it's ready for others to review, click the drop-down and select Create Draft Pull Request, then click Draft Pull Request. + +![GitHub's Create pull request button dropdown options](images/GH-PR-guide/365234701-72dd00f2-936e-44df-af79-ab7522a51def.png){ loading=lazy, width="350" } +/// caption +/// + +## **2. Review a Pull Request** + +### 2.1 Navigate to the **Pull requests** tab + +![GitHub's Pull Requests tab](images/GH-PR-guide/369927182-fe1bc8e0-6a9a-48cf-a3b3-e9dc11f9fe13.png){ loading=lazy, width="800" } +/// caption +/// + +### 2.2 Select a Pull Request + +In the list of pull requests, click the pull request that you'd like to review. + +![Example pull request selection list](images/GH-PR-guide/362360024-c03d12bf-78ce-4311-8a5e-badc9ca1ebef.png){ loading=lazy, width="800" } +/// caption +/// + +### 2.3 Review Changes +In the pull request page, click **Files changed** so as to see the changes. 
+ +![The Files Changed button inside the pull request interface](images/GH-PR-guide/362372502-0380ad63-3e22-473f-9eeb-336e43edb81f.png){ loading=lazy, width="600" } +/// caption +/// + +2.3.1 By clicking ![Gear icon](images/GH-PR-guide/362361806-198206f8-6d94-452f-b136-cebec0472e10.png){ loading=lazy, width="20"}, you can choose the unified or split view. + +![Unified vs. split view for file diffs in a pull request](images/GH-PR-guide/362373284-ba099090-8f60-444f-8ae6-c7ab13b7d78f.png){ loading=lazy, width="600" } +/// caption +/// + +### 2.4 Add Comments or Suggestions +When hovering over the lines of code, you can click the blue comment icon to add your review comments. + +![In-line comments button](images/GH-PR-guide/362373606-77b1e2ea-08ab-4fdc-8ef3-22ba678da422.png){ loading=lazy, width="800" } +/// caption +/// + +2.4.1 If you'd like to add a comment on multiple lines, please click the line number of the first line you want to comment on and drag down to select a range of lines. + +### 2.5 Suggest Changes +If you'd like to suggest a specific change to the lines, click ![Suggest changes button](images/GH-PR-guide/362393831-18d9f1f8-9c93-430a-9667-2dbe01248ff0.png){ loading=lazy, width="20"}, and then edit the text within the suggestion block. + +![Suggested changes interface](images/GH-PR-guide/362394354-7bce4591-4ff7-44c7-8377-d250eabd3c2c.png){ loading=lazy, width="600" } +/// caption +/// + +### 2.6 Comment on a File +If you'd like to comment on a file, click ![Comment button](images/GH-PR-guide/362398001-12afff80-ceb8-4c5c-bff0-d95b9c78216a.png){ loading=lazy, width="20"} at the top right of the file, then add your comments. + +![Comment button in context](images/GH-PR-guide/362398683-c497797a-da7c-48be-8db9-dcff89ce0fcf.png){ loading=lazy, width="500" } +/// caption +/// + +### 2.7 Mark Files as Viewed +After you finish reviewing a file, you can mark it as viewed.
+ +![Viewed button](images/GH-PR-guide/362409533-b4297e2f-f74d-4945-a6ab-81da6726e985.png){ loading=lazy, width="600" } +/// caption +/// + +### 2.8 Start or Add to a Review +When you're done, click Start a review. If you have already started a review, please click Add review comment. +!!! note "Notice" + All line comments are pending and only visible to you. You can edit the comments when needed. If you'd like to abandon your review, please go to **Review changes** and click **Abandon review**. + +### 2.9 Review and Summarize Proposed Changes + +Click Review changes, and then type comments to summarize your proposed changes. + +![Review changes button](images/GH-PR-guide/364920995-8ce17b45-8a2a-428e-a254-eab0abebb318.png){ loading=lazy, width="500" } +/// caption +/// + +### 2.10 Select Review Type + +![Review summary interface](images/GH-PR-guide/362403686-2afb0ae6-6cdc-45c6-b91e-b7b9cfc30189.png){ loading=lazy, width="600" } +/// caption +/// + +- Select Comment: Provide general feedback on the changes without explicitly approving or rejecting them. +- Select Approve: Indicate that you’ve reviewed the changes and approve them for merging. A common comment for simple approvals is "LGTM" (Looks Good to Me). +- Select Request changes: Provide feedback indicating that revisions are needed before the changes can be approved. + +### 2.11 Click Submit review +This completes the current review round and publishes your comments and suggestions. Then the PR can either be merged or updated (depending on approval or comments). We generally expect that whoever submits the PR will merge once all feedback has been incorporated or otherwise addressed. + +## **3.
Respond to a Pull Request Review** + +### 3.1 Navigate to the Repository's Main Page +Navigate to your repository and click **Pull requests**. + +![Pull requests tab](images/GH-PR-guide/369927182-fe1bc8e0-6a9a-48cf-a3b3-e9dc11f9fe13.png){ loading=lazy, width="600" } +/// caption +/// + +### 3.2 Incorporate Feedback Changes + +After receiving feedback on your pull request, you can apply the changes in one of two ways: either by committing each change individually or by grouping several changes into a single commit. The method you choose depends on whether you prefer fine-grained control over the commit history or a more streamlined approach. + +#### 3.2.1 Apply a change in its own commit +If you agree with a suggested change, apply it by creating a separate commit for it. This approach helps keep your commit history clear and each change traceable. + +![Commit suggestions button](images/GH-PR-guide/369920474-be5503d3-6cc2-4313-b49d-069a5a806ac4.png){ loading=lazy, width="600" } +/// caption +/// + +#### 3.2.2 Add multiple suggestions to a batch of changes +If you plan to include multiple changes in one commit, you can add suggestions to a batch. Once you've collected all the desired suggestions, click "Commit suggestions" to apply them in one go. + +![Add suggested change to batch button](images/GH-PR-guide/369920952-1b4e0db0-3451-448b-822f-dd1b14679ec6.png){ loading=lazy, width="600" } +/// caption +/// + +### 3.3 Add Commit Message +In the commit message field, enter a brief, descriptive message that clearly explains the changes made to the file(s). + +### 3.4 Click Commit changes +After entering your commit message, click the "Commit changes" button to finalize and save your modifications to the repository. This step ensures that your changes are recorded and can be reviewed or merged into the main codebase.
+ +### 3.5 Re-requesting a Review +If you’ve addressed all the requested changes and your pull request requires further review, re-request a review by notifying the reviewers. This action prompts them to evaluate your updated code and provide feedback or approval. + +### 3.6 Out-of-scope Suggestion +If the suggested change falls outside the scope of your pull request, create a new issue to address the feedback separately. Issues can be created directly from a PR comment. diff --git a/wiki-guide/The-GitHub-Workflow.md b/docs/wiki-guide/The-GitHub-Workflow.md similarity index 71% rename from wiki-guide/The-GitHub-Workflow.md rename to docs/wiki-guide/The-GitHub-Workflow.md index 5343598..01f7881 100644 --- a/wiki-guide/The-GitHub-Workflow.md +++ b/docs/wiki-guide/The-GitHub-Workflow.md @@ -4,15 +4,16 @@ Thank you for contributing! This document outlines guidelines for collaboratively contributing to a repository (repo). This workflow is ideal for when: -* You are a member of the Imageomics Institute and have write access to the repository you're contributing to. -* You have (or expect to have) multiple people contributing to the repository and want to keep contributions organized and all team members up-to-date on progress. -* You are working on a repository individually and want to keep contributions organized and log progress for your future self or others interested in seeing it. + +- You are a member of the Imageomics Institute and have write access to the repository you're contributing to. +- You have (or expect to have) multiple people contributing to the repository and want to keep contributions organized and all team members up-to-date on progress. +- You are working on a repository individually and want to keep contributions organized and log progress for your future self or others interested in seeing it. 
It follows a branch and pull request (PR) based workflow, which provides a controlled way to bring internal contributions together for those with write access to the repository (those without write access will need to fork the repository first before making contributions). Importantly, this workflow suggests that **_contributions are created through PRs_** rather than directly committing to or merging into the `main` branch. -## To contribute as an Imageomics member with write access: +## Contribute as an Imageomics member with write access ### 1. Clone the repo to your machine. ```sh git clone https://github.com/Imageomics/.git @@ -22,7 +23,9 @@ cd ### 2. Create a new branch. For example, if you want to add a feature to your code that simulates human vision, you could name the branch `feature/simulate-vision`. -> _pro-tip_: make a new branch for each PR scoped by the task, feature, or bug fix. +!!! tip "Pro tip" + Make a new branch for each PR scoped by the task, feature, or bug fix. + ```sh git branch feature/simulate-vision git checkout feature/simulate-vision @@ -38,9 +41,9 @@ For example, imagine you created three new files, each simulating a component of ### 4. Stage and commit changes to the new branch. Commit frequently with each commit based on a logical self-contained change using descriptive commit messages. -> _pro-tip_: use imperative phrases beginning with words such as "add", "update", "fix", "refactor", "remove", "improve", ... - -> _pro-tip_: write a multi-line commit message with a short summary on the first line and a longer description if needed using `git commit -m "Short summary" -m "Long description"` +!!! tip "Pro tip" + - Use imperative phrases beginning with words such as "add", "update", "fix", "refactor", "remove", "improve", ... 
+ - Write a multi-line commit message with a short summary on the first line and a longer description if needed using `git commit -m "Short summary" -m "Long description"` ```sh git add retina.py occipital.py visual_cortex.py @@ -50,7 +53,8 @@ git commit -m "Implement the retina, occipital, and visual cortex components of ### 5. Update your local `main` branch. Ensure your local `main` branch is up-to-date with the remote to incorporate any changes other collaborators may have made. -> _pro-tip_: if you're unsure what branch you should have checked out, remember that the branch being merged to or committed to should be the branch that is active. Check with `git branch` and look for `*` next to what's active. +!!! tip "Pro tip" + If you're unsure what branch you should have checked out, remember that the branch being merged to or committed to should be the branch that is active. Check with `git branch` and look for `*` next to what's active. ```sh git checkout main git pull origin main @@ -73,7 +77,7 @@ git push --set-upstream origin feature/simulate-vision # to specify the upstream git push # subsequent pushes for this branch once the remote tracking branch is set ``` -### 8. Make, commit, and push with this branch as needed. +### 8. Make changes, commit, and push with this branch as needed. Repeat steps 3-7 until results are in a state suitable to merge with the project's `main` branch. ### 9. Open a Pull Request. @@ -83,7 +87,8 @@ You can also set the PR to draft status for visibility and discussion of ongoing If you like doing everything from the command line, you can consider using the [GitHub CLI](https://cli.github.com/) for this step. -> _pro-tip_: keep PRs small and manageable for review; the scope should be focused on the task, feature, or bug fix associated with the branch. +!!! tip "Pro tip" + Keep PRs small and manageable for review; the scope should be focused on the task, feature, or bug fix associated with the branch. ### 10. 
Verify the repositories and branches in the PR. **Base Repository:** The original repo you are contributing into. @@ -105,13 +110,14 @@ Click `Create pull request` to submit. ### 13. Clean up branches. After a branch is merged and a PR is closed, delete the branch from the remote and your local repository to keep things tidy. -> _pro-tip_: remember, a branch should exist to create a functional contribution to the repository through a PR, and once the function is merged in, the purpose of the branch is fulfilled. -```sh -git checkout main # switch to the main branch before deleting another branch -git branch -d feature/simulate-vision # delete the local branch that was merged -git push origin --delete feature/simulate-vision # delete the remote branch that was merged -git fetch --prune # optionally, this removes any references to deleted remote branches -``` +!!! tip "Pro tip" + Remember, a branch should exist to create a functional contribution to the repository through a PR, and once the function is merged in, the purpose of the branch is fulfilled. + ```sh + git checkout main # switch to the main branch before deleting another branch + git branch -d feature/simulate-vision # delete the local branch that was merged + git push origin --delete feature/simulate-vision # delete the remote branch that was merged + git fetch --prune # optionally, this removes any references to deleted remote branches + ``` ### 14. Update your local main branch before starting new work. 
```sh @@ -119,5 +125,6 @@ git pull ``` And for a slightly abbreviated visual summary, the same workflow looks like this: -![image](https://user-images.githubusercontent.com/31709066/230167049-6315b056-74d5-4a18-bb60-5bc06a191783.png) -(image credit: [dbt Labs](https://www.getdbt.com/analytics-engineering/transformation/git-workflow/)) \ No newline at end of file + +![GitHub workflow diagram](https://www.getdbt.com/ui/img/guides/analytics-engineering/git-workflow-1.png){ loading=lazy } +(image credit: [dbt Labs](https://www.getdbt.com/analytics-engineering/transformation/git-workflow/)) diff --git a/wiki-guide/The-Hugging-Face-Dataset-Upload-Guide.md b/docs/wiki-guide/The-Hugging-Face-Dataset-Upload-Guide.md similarity index 68% rename from wiki-guide/The-Hugging-Face-Dataset-Upload-Guide.md rename to docs/wiki-guide/The-Hugging-Face-Dataset-Upload-Guide.md index 029f0f6..6db9cf0 100644 --- a/wiki-guide/The-Hugging-Face-Dataset-Upload-Guide.md +++ b/docs/wiki-guide/The-Hugging-Face-Dataset-Upload-Guide.md @@ -1,16 +1,19 @@ # Hugging Face Dataset Guide -## Creating a New Dataset Repository +## Create a New Dataset Repository When creating a new dataset repository, you can make the dataset **Public** (accessible to anyone on the internet) or **Private** (accessible only to members of the organization). - +![New dataset repository interface](images/HF-dataset-upload/346972860-ed0feb0e-529b-4021-b44f-41ac96680bc3.png){ loading=lazy, width=800 } +/// caption +/// -## Uploading a Dataset with the Web Interface. +## Upload a Dataset with the Web Interface In the Files and versions tab of the Dataset card, you can choose to add file in the hugging web interface. 
-![image](https://github.com/ABC-climate/internal-guidelines/assets/30881036/9e6cef9b-18ef-4d4a-84c5-1a3f75ac9336) -## Uploading a Dataset with HfApi -``` +![Dataset repository Add file button](images/HF-dataset-upload/346190430-9e6cef9b-18ef-4d4a-84c5-1a3f75ac9336.png){ loading=lazy } + +## Upload a Dataset with HfApi +``` py linenums="1" from huggingface_hub import login # Login with your personal token (find your tokens at: Settings/Access Tokens) @@ -27,7 +30,7 @@ api.upload_file ( ) ``` -## Uploading a Dataset with Git +## Upload a Dataset with Git ### If the Dataset is Less Than 5GB Navigate to the folder for the repository: ``` @@ -41,26 +44,26 @@ git push ``` ### If the Dataset is Larger Than 5GB -#### Install Git LFS: +#### Install Git LFS Follow instructions at https://git-lfs.com/ -#### Install the Hugging Face CLI: +#### Install the Hugging Face CLI ``` brew install huggingface-cli pip install -U "huggingface_hub[cli]" ``` -#### Enable the repository to upload large files: +#### Enable the repository to upload large files ``` huggingface-cli lfs-enable-largefiles ``` -#### Initialize Git LFS: +#### Initialize Git LFS ``` git lfs install ``` -#### Track large files (e.g., .csv files): +#### Track large files (e.g., .csv files) ``` # Adds a line to .gitattributes, which Git uses to determine files managed by LFS git lfs track "*.csv" @@ -68,15 +71,9 @@ git add .gitattributes git commit -m "Track large files with Git LFS" ``` -#### Add, commit, and push the files: +#### Add, commit, and push the files ``` git add git commit -m 'comments' git push ``` - - - - - - diff --git a/wiki-guide/The-Hugging-Face-Workflow.md b/docs/wiki-guide/The-Hugging-Face-Workflow.md similarity index 77% rename from wiki-guide/The-Hugging-Face-Workflow.md rename to docs/wiki-guide/The-Hugging-Face-Workflow.md index 9e60f34..222d2d9 100644 --- a/wiki-guide/The-Hugging-Face-Workflow.md +++ b/docs/wiki-guide/The-Hugging-Face-Workflow.md @@ -23,29 +23,35 @@ git push origin 
pr/:refs/pr/ For more information on Hugging Face Pull Requests and Discussions, see their [documentation](https://huggingface.co/docs/hub/repositories-pull-requests-discussions). -## To contribute as an Imageomics member with write access: +## Contribute as an Imageomics member with write access The workflow on Hugging Face repositories should closely mirror that of GitHub repos (described in detail in the [Github Workflow](The-GitHub-Workflow.md)). However, Hugging Face repos function a little differently from GitHub’s, so we will review the details relevant to those differences and refer back to the [GitHub directions](The-GitHub-Workflow.md) where necessary. Firstly, when making changes it is still best not to work on the main branch, but instead go through the pull request (PR) process. This process is a bit different on Hugging Face, as this is not their focus. Instead of initializing a new branch, we initialize a new PR. There are two ways of doing this, but both rely on the UI (web interface). + 1. Make your change directly on the UI (upload a file, edit the dataset/model card, etc), BUT select “Open as a pull request to the `main` branch” and write a descriptive commit message of your changes before pressing `commit`. + 2. Navigate to the “Community” tab, and click “New pull request” -| Community Tab | New PR Pop-up | -:---:|:---: -![Screenshot Community Tab](https://github.com/Imageomics/internal-guidelines/assets/38985481/c3493cff-7dbc-4158-802b-d3054ba1bfbe)|![New Pull Request](https://github.com/Imageomics/internal-guidelines/assets/38985481/f7cde0bf-2559-4b81-af58-f8d175cf25c5) | +| Community tab with New pull request button | New Pull Request Interface | +| --- | --- | +| ![New pull request button under Community tab](images/HF-workflow/290567257-c3493cff-7dbc-4158-802b-d3054ba1bfbe.png){ loading=lazy, width=400 } | ![New Pull Request interface](images/HF-workflow/290565108-f7cde0bf-2559-4b81-af58-f8d175cf25c5.png){ loading=lazy, width=400 } | +!!! 
note "Their instructions for “from the website” are out of date." + You actually select “Add file”, choose upload or create file, and you can upload any number of files (that are reasonable to include in a single commit) in a single commit, just select “Open as a pull request to the `main` branch”, as described in #1. -Note that their instructions for “from the website” are out of date. You actually select “Add file”, choose upload or create file, and you can upload any number of files (that are reasonable to include in a single commit) in a single commit, just select “Open as a pull request to the `main` branch”, as described in #1. Now, to continue with the local branch method (if you intend to make multiple commits), give your PR an informative title and select “Create PR branch”. This will take you to a new page with instructions on how to connect locally to the PR and send your updates back to the repo. If the repo is private, you will need to ensure your credentials are set before cloning/fetching. -![Screenshot-Get Started with your PR page](https://github.com/Imageomics/internal-guidelines/assets/38985481/2f2adf5c-0654-410a-8d93-d1172066ad8e) +![Interface for Get started with your pull request](images/HF-workflow/290563763-2f2adf5c-0654-410a-8d93-d1172066ad8e.png){ loading=lazy } +/// caption +/// Once you have made all of your changes, it is time to publish your branch. This is similar to initializing the PR on GitHub, in that you should: + - Provide a detailed description of what your PR does - Tag one or two other people on the project to review it (`@hf-username please review`) From here, reviewers can add their comments and suggestions on your PR (Note: to see the files in the PR, click on the last commit, then select “Browse files”) Now, if there are changes to be made based on reviewer suggestions, files can be edited as usual (pushing to the PR branch). 
Alternatively, if you are working from the web interface and need to add files (or edit files of a supported type), then click on “from: refs/pr/#” just below the title of your PR to view the copy of the repo on the PR branch (like looking at a different branch on GitHub). Files can be added or edited here too. - -![PR Header](https://github.com/Imageomics/internal-guidelines/assets/38985481/ceccdbea-cccf-482a-ab79-cfb04c5c42e8) + +![Pull Request header](images/HF-workflow/290563994-ceccdbea-cccf-482a-ab79-cfb04c5c42e8.png){ loading=lazy } diff --git a/wiki-guide/Two-Repo-Problem.md b/docs/wiki-guide/Two-Repo-Problem.md similarity index 82% rename from wiki-guide/Two-Repo-Problem.md rename to docs/wiki-guide/Two-Repo-Problem.md index 4356c10..9e4e708 100644 --- a/wiki-guide/Two-Repo-Problem.md +++ b/docs/wiki-guide/Two-Repo-Problem.md @@ -2,6 +2,7 @@ When working on a research project often the code is kept private until a paper is published. In conflict with this is the need to have a public repo for following purposes: + - Host a website (via a `ghpages` branch) - Act as a placeholder for when the paper is published - Share code from an earlier paper @@ -13,22 +14,24 @@ Once code changes are complete in the __private git repo__ moving them to the __ For instance, if the __public git repo__ and the __private git repo__ were created separately they will have unrelated histories. **Common challenges when merging:** + - Determining the correct git commands and steps to perform the merge - Cleaning up many small commits into one or a few larger commits - Merge conflicts - Files such as the README that may have diverged and result in merge conflicts - Accidentally losing changes or duplicating changes -# Solutions +## Solutions -## Create private from public repo +### Create private from public repo To ensure related histories, create the public repo and then create a private repo from it. The public repo will be created with a README file ensuring it has a commit. 
The private repo will be created without any extra files so it will have no commits. -### 1. Create Public Repo +#### 1. Create Public Repo First create a public repo with commits. Visit https://github.com/organizations/Imageomics/repositories/new + - Enter the public repo name - Click the checkbox for `Add a README file` - Choose a license @@ -37,29 +40,33 @@ Visit https://github.com/organizations/Imageomics/repositories/new After this step you should see a repo with commits similar to the following: -PublicRepoAfterCreate +![New public repository Initial commit indicator](images/two-repo-problem/340342731-d174b21a-0d2d-480d-a7b5-77c7cacf16af.png){ loading=lazy, width=600 } +/// caption +/// - -### 2. Update Main Branch of Public Repo +#### 2. Update Main Branch of Public Repo Make changes to the [README](GitHub-Repo-Guide.md#readme) and [`.gitignore`](GitHub-Repo-Guide.md#gitignore) in the public repo such that no further changes will be needed until the private repo is merged. After this step you should see a repo with at least 2 commits similar to the following: -PublicAfterREADMEChange - +![New commits indicator](images/two-repo-problem/340343092-84608140-6d1a-4708-8659-bd03e715afb2.png){ loading=lazy, width=600 } +/// caption +/// -### 3. Add Branch Protections to Public Repo +#### 3. Add Branch Protections to Public Repo Once your repository is set up, only changes to the `ghpages` branch are recommended; establish branch protections on both `main` and `ghpages` that require review and approval (see [When to think about branch protections](When-to-think-about-branch-protections.md) for more information). There are two issues at play here: + 1. There is potential to introduce merge conflicts when bringing in the development repo to merge with the `main` branch if it has been changed. Hence, it is important that you avoid making changes to the `main` branch after spin-off. 2. The `ghpages` branch will generate the website for the publication. 
Hence, it is a "published" branch, requiring regular checks with protections like the `main` branch. -### 4. Create Private Repo +#### 4. Create Private Repo First create a private repo __without__ commits. -Visit https://github.com/organizations/Imageomics/repositories/new +Visit https://github.com/organizations/Imageomics/repositories/new + - Enter the private repo name (ex: `-dev`) - __DO NOT__ check `Add a README file` - __DO NOT__ Choose a license @@ -67,14 +74,16 @@ Visit https://github.com/organizations/Imageomics/repositories/new - Click `Create repository` After this step you should see a repo without any commits with a box similar to the following: -PrivateRepoAfterCreate +![New private repo after creation](images/two-repo-problem/340343305-7f0f79f9-956b-4a46-b110-235e2ed4295a.png){ loading=lazy, width=600 } +/// caption +/// -### 5. Push initial changes from public to private +#### 5. Push initial changes from public to private In the following example we will clone the private repo: `johnbradley/research-project-x-private`. And pull commits from the public repo: `johnbradley/research-project-x`. -#### 5a. Clone Private Repo +##### 5a. Clone Private Repo ```console git clone git@github.com:johnbradley/research-project-x-private.git ``` @@ -85,7 +94,7 @@ Cloning into 'research-project-x-private'... warning: You appear to have cloned an empty repository. ``` -#### 5b. Pull Commits to Private Repo +##### 5b. Pull Commits to Private Repo Switch to the private repo directory. ```console cd research-project-x-private @@ -101,26 +110,29 @@ Pull commits from the public repo. git pull upstream main ``` -NOTE: Running git remote -v will confirm where a standard git push (or git pull) will send (or receive) commits from. +!!! note "Note" + Running `git remote -v` will confirm where a standard git push (or git pull) will send (or receive) commits from. -#### 5c. Push Commits to Private Repo on GitHub +##### 5c. 
Push Commits to Private Repo on GitHub ``` git push ``` After the above command you should be able to see commits in the private repo similar to the following: -PrivateAfterMerge +![Private repo status after merge](images/two-repo-problem/340343584-069c445a-487d-432c-8b82-c3867be863ae.png){ loading=lazy, width=600 } +/// caption +/// Now you're ready to work on development in the private repo following the standard [GitHub Workflow](The-GitHub-Workflow.md) with the private repo as your remote. -## Merge Private to Public +### Merge Private to Public Once your changes are done on the private repo (i.e., when you're ready to make your project public) you can push the changes to the public repo. For this example the public repo will be at `johnbradley/research-project-x` and the private will be at `johnbradley/research-project-x-private`. A branch named `v1` will be created on the public repo with changes from the private repo. -### Create a branch on Public with Private commits +#### Create a branch on Public with Private commits Clone the public repo, cd into the directory. ```console git clone git@github.com:johnbradley/research-project-x.git @@ -158,7 +170,7 @@ Push `v1` branch to the public repo. git push --set-upstream origin v1 ``` -### Next Steps +#### Next Steps At this point the main branch of the public repo should match the main branch of the private repo. Additional changes should be made only to the private repo, preferably using a branch. See [Github-Workflow](The-GitHub-Workflow.md) for more details. @@ -166,21 +178,23 @@ When you are ready to release a new version of the code in the private repo foll
-# _What if I already have mismatched repos?_ +## _What if I already have mismatched repos?_ If you find yourself with two repositories that have misaligned histories, please read the following and reach out to the Imageomics Informatics Team so we can help. -## Resolving Mismatched Public/Private Repos +### Resolving Mismatched Public/Private Repos If you already have a public and private repo with unrelated histories resolving this can be challenging. Three approaches to resolve merging disparate public/private repos are documented here. + - Merge - use when the public and private repos contain only unrelated commits. - Reset - use when all public repo commits can be deleted and replaced with private repo commits. - Cherry Pick - use when the same commits exist in both repos with different hashes. -## Merge +### Merge Merge commits from the `main` branch of the private repo into the `main` branch of the public repo. -__NOTE: If the repos have commits in common with different hashes this will result in merge conflicts and duplicated commits.__ +!!! warning "Warning" + If the repos have commits in common with different hashes this will result in merge conflicts and duplicated commits. Merge the main branch of the private repo with the main branch of the public repo. As far as maintaining history this is the safest approach. Often this approach results in merge conflicts. @@ -190,10 +204,11 @@ The allow unrelated histories flag is necessary for this approach: git merge --allow-unrelated-histories ... ``` -## Reset +### Reset Replace all commits on the `main` branch of the public rep with commits from the `main` branch of the private repo. -__NOTE: This will destroy all history in the public repo main branch!__ +!!! danger "Danger" + This will destroy all history in the public repo main branch! This option is only safe to do when releasing the first version of a version on the public repo. 
After setting up the remote for upstream run a command similar to the following: @@ -201,12 +216,14 @@ After setting up the remote for upstream run a command similar to the following: git reset --hard upstream/main ``` -## Cherry Pick +### Cherry Pick This method is used when the same commits exist in both repos with different hashes. This requires finding which commits are in the private repo but not in the public repo. -__NOTE: If the commits you cherry-pick have commits in common with different hashes this will result in merge conflicts and duplicated commits.__ + +!!! warning "Warning" + If the commits you cherry-pick have commits in common with different hashes this will result in merge conflicts and duplicated commits. After fetching your upstream branch you can cherry pick a range of commits to add like so: ``` git cherry-pick .. -``` \ No newline at end of file +``` diff --git a/docs/wiki-guide/Virtual-Environments.md b/docs/wiki-guide/Virtual-Environments.md new file mode 100644 index 0000000..fde33ed --- /dev/null +++ b/docs/wiki-guide/Virtual-Environments.md @@ -0,0 +1,29 @@ +# Managing Dependencies and Environments +Recording dependencies and environment information is crucial for reproducibility and interoperability across different platforms. There are many options for this, and sometimes it is appropriate to use multiple within the same project. + +The goal is to make it as easy as possible for others (including your future self) to run the code. + +## Conda Environments +The following example commands will get you set up with a Conda environment that can be tracked and shared. + +- Install [Miniconda](https://docs.conda.io/en/latest/miniconda.html). +- Create an environment: `conda create --name ` +- Activate the environment: `conda activate ` +- Install packages you need: `conda install -c conda-forge python=3.9 pandas matplotlib` + - `-c conda-forge` specifies the channel to install from. 
([more information](https://docs.conda.io/projects/conda/en/latest/user-guide/concepts/channels.html)) + - You can specify the version of a package or omit this to get the latest available. ([more information](https://conda.io/projects/conda/en/latest/user-guide/tasks/manage-pkgs.html#id2)) +- Once the needed packages are installed, export the environment to a file: +```bash +conda env export --no-builds --from-history | grep -v "prefix" > environment.yml +``` +!!! info "Command breakdown" + - `--no-builds` and `--from-history` flags will cause the environment file to only specify the packages and versions that you manually installed. This may help with cross-platform compatibility by giving conda the flexibility to find compatible sub-dependencies on another system. + - `| grep -v "prefix"` eliminates your system-specific environment storage location (what is called the `prefix`) from being added to the file + - If you want to add the actual package versions that were installed (if you did not specify during installation) to the `environment.yml` file, you can check those and copy-paste them in manually with `conda env list`. + - Don't forget to also add and track this new file with git! + - To install the dependencies somewhere else from this file, use `conda env create -f environment.yml`. + +## Pip Virtual Environment +For virtual environments using `pip` to install packages (Python environments), use `python -m pip freeze` to print a list of packages (and their versions) installed in the environment. `python -m pip freeze > requirements.txt` will populate a `requirements.txt` file with all these packages and versions listed (eg., `pandas==2.0.1`). Note that this will _not_ give only minimum software requirements, but will also print all dependencies. For more information, see the [pip documentation](https://pip.pypa.io/en/stable/cli/pip_freeze/). 
+ +This machine-readable file can then be installed using `pip install -r requirements.txt` when in the appropriate folder. diff --git a/wiki-guide/When-to-think-about-branch-protections.md b/docs/wiki-guide/When-to-think-about-branch-protections.md similarity index 69% rename from wiki-guide/When-to-think-about-branch-protections.md rename to docs/wiki-guide/When-to-think-about-branch-protections.md index f880041..e7b63ad 100644 --- a/wiki-guide/When-to-think-about-branch-protections.md +++ b/docs/wiki-guide/When-to-think-about-branch-protections.md @@ -17,26 +17,39 @@ The example below shows the addition of branch protection rules for `main` that ### Example Branch Protection Rules for `main` -![Screenshot 2023-05-17 at 5 54 34 PM](https://github.com/Imageomics/internal-guidelines/assets/38985481/190b758a-68f7-4cbf-9368-a8df9bef7a08) +![GitHub branch protections in a repo's Settings](images/GH-branch-protections/239086285-190b758a-68f7-4cbf-9368-a8df9bef7a08.png){ loading=lazy } +/// caption +/// ## How to Implement Rulesets (Newer Version of Branch Protections) From your repository, navigate to "Settings" and select "Rules" from the left toolbar. Click on "New ruleset" and select the type you wish to create ("New branch ruleset" is the ruleset equivalent to branch protections). -![select new ruleset in rules setting](https://github.com/user-attachments/assets/02304951-367a-4c03-bf4b-cf0d53960da9) +![Select New ruleset in Rules settings](images/GH-branch-protections/382110241-02304951-367a-4c03-bf4b-cf0d53960da9.png){ loading=lazy } +/// caption +/// -Here we have selected "New branch ruleset", and named it "published-branch", as we will be applying it to our publication branches (i.e., `main` and `ghpages`). Be sure to select "Active" to enable the protections. +Here we have selected "New branch ruleset", and named it "published-branch", as we will be applying it to our publication branches (i.e., `main` and `gh-pages`). 
Be sure to select "Active" to enable the protections. -![Screenshot 2024-10-31 at 4 44 30 PM](https://github.com/user-attachments/assets/691a2831-ddf6-4ed9-bc27-5288f2936577) +![Interface for creating a new ruleset](images/GH-branch-protections/382110848-691a2831-ddf6-4ed9-bc27-5288f2936577.png){ loading=lazy } +/// caption +/// -We choose to apply these to the default branch (`main` or `master`). As with branch protections, it is also possible to set the rules for branches matching a particular pattern (eg., type `*release*` to apply the rules to any branch containing the word `release`). We will do this for `ghpages`. +We choose to apply these to the default branch (`main` or `master`). -

-select add target under target branches for branch ruleset -add target pattern for ghpages under target branches for branch ruleset -

+![Select add target under target branches for branch ruleset](images/GH-branch-protections/382111392-8157d014-0482-4d4f-b695-ab3b7624d5e4.png){ loading=lazy } +/// caption +/// + +As with branch protections, it is also possible to set the rules for branches matching a particular pattern (eg., type `*release*` to apply the rules to any branch containing the word `release`). We will do this for `gh-pages`. + +![Add target pattern for gh-pages under target branches for branch ruleset](images/GH-branch-protections/382111988-20d6499e-fb12-4335-8b8d-76ac6b989528.png){ loading=lazy } +/// caption +/// You can also edit branch rulesets from this page. The example below shows the addition of a branch ruleset that requires a pull request and that it be approved prior to merging. It also will remove approval if other changes are added that require approval. This is equivalent to the branch protection example given above. -![rule selections to require a pull request before merging along with 1 approval and dismiss stale approvals](https://github.com/user-attachments/assets/39fd79d4-ff95-404b-86c4-8ab875cc9a4b) +![Rules checklist](images/GH-branch-protections/382113942-39fd79d4-ff95-404b-86c4-8ab875cc9a4b.png){ loading=lazy } +/// caption +/// diff --git a/images/GH-PR-guide/362360024-c03d12bf-78ce-4311-8a5e-badc9ca1ebef.png b/docs/wiki-guide/images/GH-PR-guide/362360024-c03d12bf-78ce-4311-8a5e-badc9ca1ebef.png similarity index 100% rename from images/GH-PR-guide/362360024-c03d12bf-78ce-4311-8a5e-badc9ca1ebef.png rename to docs/wiki-guide/images/GH-PR-guide/362360024-c03d12bf-78ce-4311-8a5e-badc9ca1ebef.png diff --git a/images/GH-PR-guide/362361806-198206f8-6d94-452f-b136-cebec0472e10.png b/docs/wiki-guide/images/GH-PR-guide/362361806-198206f8-6d94-452f-b136-cebec0472e10.png similarity index 100% rename from images/GH-PR-guide/362361806-198206f8-6d94-452f-b136-cebec0472e10.png rename to docs/wiki-guide/images/GH-PR-guide/362361806-198206f8-6d94-452f-b136-cebec0472e10.png diff 
--git a/images/GH-PR-guide/362371360-b36c2d36-bc75-45c0-9396-9794ed1d2404.png b/docs/wiki-guide/images/GH-PR-guide/362371360-b36c2d36-bc75-45c0-9396-9794ed1d2404.png similarity index 100% rename from images/GH-PR-guide/362371360-b36c2d36-bc75-45c0-9396-9794ed1d2404.png rename to docs/wiki-guide/images/GH-PR-guide/362371360-b36c2d36-bc75-45c0-9396-9794ed1d2404.png diff --git a/images/GH-PR-guide/362372502-0380ad63-3e22-473f-9eeb-336e43edb81f.png b/docs/wiki-guide/images/GH-PR-guide/362372502-0380ad63-3e22-473f-9eeb-336e43edb81f.png similarity index 100% rename from images/GH-PR-guide/362372502-0380ad63-3e22-473f-9eeb-336e43edb81f.png rename to docs/wiki-guide/images/GH-PR-guide/362372502-0380ad63-3e22-473f-9eeb-336e43edb81f.png diff --git a/images/GH-PR-guide/362373284-ba099090-8f60-444f-8ae6-c7ab13b7d78f.png b/docs/wiki-guide/images/GH-PR-guide/362373284-ba099090-8f60-444f-8ae6-c7ab13b7d78f.png similarity index 100% rename from images/GH-PR-guide/362373284-ba099090-8f60-444f-8ae6-c7ab13b7d78f.png rename to docs/wiki-guide/images/GH-PR-guide/362373284-ba099090-8f60-444f-8ae6-c7ab13b7d78f.png diff --git a/images/GH-PR-guide/362373606-77b1e2ea-08ab-4fdc-8ef3-22ba678da422.png b/docs/wiki-guide/images/GH-PR-guide/362373606-77b1e2ea-08ab-4fdc-8ef3-22ba678da422.png similarity index 100% rename from images/GH-PR-guide/362373606-77b1e2ea-08ab-4fdc-8ef3-22ba678da422.png rename to docs/wiki-guide/images/GH-PR-guide/362373606-77b1e2ea-08ab-4fdc-8ef3-22ba678da422.png diff --git a/images/GH-PR-guide/362393831-18d9f1f8-9c93-430a-9667-2dbe01248ff0.png b/docs/wiki-guide/images/GH-PR-guide/362393831-18d9f1f8-9c93-430a-9667-2dbe01248ff0.png similarity index 100% rename from images/GH-PR-guide/362393831-18d9f1f8-9c93-430a-9667-2dbe01248ff0.png rename to docs/wiki-guide/images/GH-PR-guide/362393831-18d9f1f8-9c93-430a-9667-2dbe01248ff0.png diff --git a/images/GH-PR-guide/362394354-7bce4591-4ff7-44c7-8377-d250eabd3c2c.png 
b/docs/wiki-guide/images/GH-PR-guide/362394354-7bce4591-4ff7-44c7-8377-d250eabd3c2c.png similarity index 100% rename from images/GH-PR-guide/362394354-7bce4591-4ff7-44c7-8377-d250eabd3c2c.png rename to docs/wiki-guide/images/GH-PR-guide/362394354-7bce4591-4ff7-44c7-8377-d250eabd3c2c.png diff --git a/images/GH-PR-guide/362398001-12afff80-ceb8-4c5c-bff0-d95b9c78216a.png b/docs/wiki-guide/images/GH-PR-guide/362398001-12afff80-ceb8-4c5c-bff0-d95b9c78216a.png similarity index 100% rename from images/GH-PR-guide/362398001-12afff80-ceb8-4c5c-bff0-d95b9c78216a.png rename to docs/wiki-guide/images/GH-PR-guide/362398001-12afff80-ceb8-4c5c-bff0-d95b9c78216a.png diff --git a/images/GH-PR-guide/362398683-c497797a-da7c-48be-8db9-dcff89ce0fcf.png b/docs/wiki-guide/images/GH-PR-guide/362398683-c497797a-da7c-48be-8db9-dcff89ce0fcf.png similarity index 100% rename from images/GH-PR-guide/362398683-c497797a-da7c-48be-8db9-dcff89ce0fcf.png rename to docs/wiki-guide/images/GH-PR-guide/362398683-c497797a-da7c-48be-8db9-dcff89ce0fcf.png diff --git a/images/GH-PR-guide/362403686-2afb0ae6-6cdc-45c6-b91e-b7b9cfc30189.png b/docs/wiki-guide/images/GH-PR-guide/362403686-2afb0ae6-6cdc-45c6-b91e-b7b9cfc30189.png similarity index 100% rename from images/GH-PR-guide/362403686-2afb0ae6-6cdc-45c6-b91e-b7b9cfc30189.png rename to docs/wiki-guide/images/GH-PR-guide/362403686-2afb0ae6-6cdc-45c6-b91e-b7b9cfc30189.png diff --git a/images/GH-PR-guide/362409533-b4297e2f-f74d-4945-a6ab-81da6726e985.png b/docs/wiki-guide/images/GH-PR-guide/362409533-b4297e2f-f74d-4945-a6ab-81da6726e985.png similarity index 100% rename from images/GH-PR-guide/362409533-b4297e2f-f74d-4945-a6ab-81da6726e985.png rename to docs/wiki-guide/images/GH-PR-guide/362409533-b4297e2f-f74d-4945-a6ab-81da6726e985.png diff --git a/images/GH-PR-guide/364920995-8ce17b45-8a2a-428e-a254-eab0abebb318.png b/docs/wiki-guide/images/GH-PR-guide/364920995-8ce17b45-8a2a-428e-a254-eab0abebb318.png similarity index 100% rename from 
images/GH-PR-guide/364920995-8ce17b45-8a2a-428e-a254-eab0abebb318.png rename to docs/wiki-guide/images/GH-PR-guide/364920995-8ce17b45-8a2a-428e-a254-eab0abebb318.png diff --git a/images/GH-PR-guide/365234535-659b312e-d95d-4bee-a958-4ce23fc4255d.png b/docs/wiki-guide/images/GH-PR-guide/365234535-659b312e-d95d-4bee-a958-4ce23fc4255d.png similarity index 100% rename from images/GH-PR-guide/365234535-659b312e-d95d-4bee-a958-4ce23fc4255d.png rename to docs/wiki-guide/images/GH-PR-guide/365234535-659b312e-d95d-4bee-a958-4ce23fc4255d.png diff --git a/images/GH-PR-guide/365234601-90a0bfcf-807d-4983-a643-0678ace542d2.png b/docs/wiki-guide/images/GH-PR-guide/365234601-90a0bfcf-807d-4983-a643-0678ace542d2.png similarity index 100% rename from images/GH-PR-guide/365234601-90a0bfcf-807d-4983-a643-0678ace542d2.png rename to docs/wiki-guide/images/GH-PR-guide/365234601-90a0bfcf-807d-4983-a643-0678ace542d2.png diff --git a/images/GH-PR-guide/365234701-72dd00f2-936e-44df-af79-ab7522a51def.png b/docs/wiki-guide/images/GH-PR-guide/365234701-72dd00f2-936e-44df-af79-ab7522a51def.png similarity index 100% rename from images/GH-PR-guide/365234701-72dd00f2-936e-44df-af79-ab7522a51def.png rename to docs/wiki-guide/images/GH-PR-guide/365234701-72dd00f2-936e-44df-af79-ab7522a51def.png diff --git a/images/GH-PR-guide/369920474-be5503d3-6cc2-4313-b49d-069a5a806ac4.png b/docs/wiki-guide/images/GH-PR-guide/369920474-be5503d3-6cc2-4313-b49d-069a5a806ac4.png similarity index 100% rename from images/GH-PR-guide/369920474-be5503d3-6cc2-4313-b49d-069a5a806ac4.png rename to docs/wiki-guide/images/GH-PR-guide/369920474-be5503d3-6cc2-4313-b49d-069a5a806ac4.png diff --git a/images/GH-PR-guide/369920952-1b4e0db0-3451-448b-822f-dd1b14679ec6.png b/docs/wiki-guide/images/GH-PR-guide/369920952-1b4e0db0-3451-448b-822f-dd1b14679ec6.png similarity index 100% rename from images/GH-PR-guide/369920952-1b4e0db0-3451-448b-822f-dd1b14679ec6.png rename to 
docs/wiki-guide/images/GH-PR-guide/369920952-1b4e0db0-3451-448b-822f-dd1b14679ec6.png diff --git a/images/GH-PR-guide/369927182-fe1bc8e0-6a9a-48cf-a3b3-e9dc11f9fe13.png b/docs/wiki-guide/images/GH-PR-guide/369927182-fe1bc8e0-6a9a-48cf-a3b3-e9dc11f9fe13.png similarity index 100% rename from images/GH-PR-guide/369927182-fe1bc8e0-6a9a-48cf-a3b3-e9dc11f9fe13.png rename to docs/wiki-guide/images/GH-PR-guide/369927182-fe1bc8e0-6a9a-48cf-a3b3-e9dc11f9fe13.png diff --git a/images/branch-protections/239086285-190b758a-68f7-4cbf-9368-a8df9bef7a08.png b/docs/wiki-guide/images/GH-branch-protections/239086285-190b758a-68f7-4cbf-9368-a8df9bef7a08.png similarity index 100% rename from images/branch-protections/239086285-190b758a-68f7-4cbf-9368-a8df9bef7a08.png rename to docs/wiki-guide/images/GH-branch-protections/239086285-190b758a-68f7-4cbf-9368-a8df9bef7a08.png diff --git a/images/branch-protections/382110241-02304951-367a-4c03-bf4b-cf0d53960da9.png b/docs/wiki-guide/images/GH-branch-protections/382110241-02304951-367a-4c03-bf4b-cf0d53960da9.png similarity index 100% rename from images/branch-protections/382110241-02304951-367a-4c03-bf4b-cf0d53960da9.png rename to docs/wiki-guide/images/GH-branch-protections/382110241-02304951-367a-4c03-bf4b-cf0d53960da9.png diff --git a/images/branch-protections/382110848-691a2831-ddf6-4ed9-bc27-5288f2936577.png b/docs/wiki-guide/images/GH-branch-protections/382110848-691a2831-ddf6-4ed9-bc27-5288f2936577.png similarity index 100% rename from images/branch-protections/382110848-691a2831-ddf6-4ed9-bc27-5288f2936577.png rename to docs/wiki-guide/images/GH-branch-protections/382110848-691a2831-ddf6-4ed9-bc27-5288f2936577.png diff --git a/images/branch-protections/382111392-8157d014-0482-4d4f-b695-ab3b7624d5e4.png b/docs/wiki-guide/images/GH-branch-protections/382111392-8157d014-0482-4d4f-b695-ab3b7624d5e4.png similarity index 100% rename from images/branch-protections/382111392-8157d014-0482-4d4f-b695-ab3b7624d5e4.png rename to 
docs/wiki-guide/images/GH-branch-protections/382111392-8157d014-0482-4d4f-b695-ab3b7624d5e4.png diff --git a/docs/wiki-guide/images/GH-branch-protections/382111988-20d6499e-fb12-4335-8b8d-76ac6b989528.png b/docs/wiki-guide/images/GH-branch-protections/382111988-20d6499e-fb12-4335-8b8d-76ac6b989528.png new file mode 100644 index 0000000..ad169fb Binary files /dev/null and b/docs/wiki-guide/images/GH-branch-protections/382111988-20d6499e-fb12-4335-8b8d-76ac6b989528.png differ diff --git a/images/branch-protections/382113942-39fd79d4-ff95-404b-86c4-8ab875cc9a4b.png b/docs/wiki-guide/images/GH-branch-protections/382113942-39fd79d4-ff95-404b-86c4-8ab875cc9a4b.png similarity index 100% rename from images/branch-protections/382113942-39fd79d4-ff95-404b-86c4-8ab875cc9a4b.png rename to docs/wiki-guide/images/GH-branch-protections/382113942-39fd79d4-ff95-404b-86c4-8ab875cc9a4b.png diff --git a/images/GH-repo-guide/238778583-43857a4d-789b-4073-b872-da29c4474916.png b/docs/wiki-guide/images/GH-repo-guide/238778583-43857a4d-789b-4073-b872-da29c4474916.png similarity index 100% rename from images/GH-repo-guide/238778583-43857a4d-789b-4073-b872-da29c4474916.png rename to docs/wiki-guide/images/GH-repo-guide/238778583-43857a4d-789b-4073-b872-da29c4474916.png diff --git a/images/HF-dataset-guide/346190430-9e6cef9b-18ef-4d4a-84c5-1a3f75ac9336.png b/docs/wiki-guide/images/HF-dataset-guide/346190430-9e6cef9b-18ef-4d4a-84c5-1a3f75ac9336.png similarity index 100% rename from images/HF-dataset-guide/346190430-9e6cef9b-18ef-4d4a-84c5-1a3f75ac9336.png rename to docs/wiki-guide/images/HF-dataset-guide/346190430-9e6cef9b-18ef-4d4a-84c5-1a3f75ac9336.png diff --git a/images/HF-dataset-guide/346972860-ed0feb0e-529b-4021-b44f-41ac96680bc3.png b/docs/wiki-guide/images/HF-dataset-guide/346972860-ed0feb0e-529b-4021-b44f-41ac96680bc3.png similarity index 100% rename from images/HF-dataset-guide/346972860-ed0feb0e-529b-4021-b44f-41ac96680bc3.png rename to 
docs/wiki-guide/images/HF-dataset-guide/346972860-ed0feb0e-529b-4021-b44f-41ac96680bc3.png diff --git a/docs/wiki-guide/images/HF-dataset-upload/346190430-9e6cef9b-18ef-4d4a-84c5-1a3f75ac9336.png b/docs/wiki-guide/images/HF-dataset-upload/346190430-9e6cef9b-18ef-4d4a-84c5-1a3f75ac9336.png new file mode 100644 index 0000000..265e540 Binary files /dev/null and b/docs/wiki-guide/images/HF-dataset-upload/346190430-9e6cef9b-18ef-4d4a-84c5-1a3f75ac9336.png differ diff --git a/docs/wiki-guide/images/HF-dataset-upload/346972860-ed0feb0e-529b-4021-b44f-41ac96680bc3.png b/docs/wiki-guide/images/HF-dataset-upload/346972860-ed0feb0e-529b-4021-b44f-41ac96680bc3.png new file mode 100644 index 0000000..e39a5f1 Binary files /dev/null and b/docs/wiki-guide/images/HF-dataset-upload/346972860-ed0feb0e-529b-4021-b44f-41ac96680bc3.png differ diff --git a/images/HF-workflow/290563763-2f2adf5c-0654-410a-8d93-d1172066ad8e.png b/docs/wiki-guide/images/HF-workflow/290563763-2f2adf5c-0654-410a-8d93-d1172066ad8e.png similarity index 100% rename from images/HF-workflow/290563763-2f2adf5c-0654-410a-8d93-d1172066ad8e.png rename to docs/wiki-guide/images/HF-workflow/290563763-2f2adf5c-0654-410a-8d93-d1172066ad8e.png diff --git a/images/HF-workflow/290563994-ceccdbea-cccf-482a-ab79-cfb04c5c42e8.png b/docs/wiki-guide/images/HF-workflow/290563994-ceccdbea-cccf-482a-ab79-cfb04c5c42e8.png similarity index 100% rename from images/HF-workflow/290563994-ceccdbea-cccf-482a-ab79-cfb04c5c42e8.png rename to docs/wiki-guide/images/HF-workflow/290563994-ceccdbea-cccf-482a-ab79-cfb04c5c42e8.png diff --git a/docs/wiki-guide/images/HF-workflow/290565108-f7cde0bf-2559-4b81-af58-f8d175cf25c5.png b/docs/wiki-guide/images/HF-workflow/290565108-f7cde0bf-2559-4b81-af58-f8d175cf25c5.png new file mode 100644 index 0000000..e0448f1 Binary files /dev/null and b/docs/wiki-guide/images/HF-workflow/290565108-f7cde0bf-2559-4b81-af58-f8d175cf25c5.png differ diff --git 
a/images/HF-workflow/290567257-c3493cff-7dbc-4158-802b-d3054ba1bfbe.png b/docs/wiki-guide/images/HF-workflow/290567257-c3493cff-7dbc-4158-802b-d3054ba1bfbe.png similarity index 100% rename from images/HF-workflow/290567257-c3493cff-7dbc-4158-802b-d3054ba1bfbe.png rename to docs/wiki-guide/images/HF-workflow/290567257-c3493cff-7dbc-4158-802b-d3054ba1bfbe.png diff --git a/images/two-repo-problem/340342731-d174b21a-0d2d-480d-a7b5-77c7cacf16af.png b/docs/wiki-guide/images/two-repo-problem/340342731-d174b21a-0d2d-480d-a7b5-77c7cacf16af.png similarity index 100% rename from images/two-repo-problem/340342731-d174b21a-0d2d-480d-a7b5-77c7cacf16af.png rename to docs/wiki-guide/images/two-repo-problem/340342731-d174b21a-0d2d-480d-a7b5-77c7cacf16af.png diff --git a/images/two-repo-problem/340343092-84608140-6d1a-4708-8659-bd03e715afb2.png b/docs/wiki-guide/images/two-repo-problem/340343092-84608140-6d1a-4708-8659-bd03e715afb2.png similarity index 100% rename from images/two-repo-problem/340343092-84608140-6d1a-4708-8659-bd03e715afb2.png rename to docs/wiki-guide/images/two-repo-problem/340343092-84608140-6d1a-4708-8659-bd03e715afb2.png diff --git a/images/two-repo-problem/340343305-7f0f79f9-956b-4a46-b110-235e2ed4295a.png b/docs/wiki-guide/images/two-repo-problem/340343305-7f0f79f9-956b-4a46-b110-235e2ed4295a.png similarity index 100% rename from images/two-repo-problem/340343305-7f0f79f9-956b-4a46-b110-235e2ed4295a.png rename to docs/wiki-guide/images/two-repo-problem/340343305-7f0f79f9-956b-4a46-b110-235e2ed4295a.png diff --git a/images/two-repo-problem/340343584-069c445a-487d-432c-8b82-c3867be863ae.png b/docs/wiki-guide/images/two-repo-problem/340343584-069c445a-487d-432c-8b82-c3867be863ae.png similarity index 100% rename from images/two-repo-problem/340343584-069c445a-487d-432c-8b82-c3867be863ae.png rename to docs/wiki-guide/images/two-repo-problem/340343584-069c445a-487d-432c-8b82-c3867be863ae.png diff --git 
a/images/technical-infrastructure/382108831-1173cd79-db94-4326-8b6e-dcbdeb8939cd.png b/images/technical-infrastructure/382108831-1173cd79-db94-4326-8b6e-dcbdeb8939cd.png deleted file mode 100644 index 45ba52a..0000000 Binary files a/images/technical-infrastructure/382108831-1173cd79-db94-4326-8b6e-dcbdeb8939cd.png and /dev/null differ diff --git a/main.py b/main.py new file mode 100644 index 0000000..705c79f --- /dev/null +++ b/main.py @@ -0,0 +1,30 @@ +import os + +def define_env(env): + """Define custom macros for MkDocs.""" + + @env.macro + def include_file_as_code(file_path, language="markdown"): + """ + Include the content of a file within a code block. + + Args: + file_path (str): The path to the file to include, relative to the project root. + language (str): The language identifier for syntax highlighting. + + Returns: + str: A Markdown-formatted code block containing the file's content. + """ + full_path = os.path.join(env.project_dir, file_path) + try: + with open(full_path, 'r', encoding='utf-8') as f: + content = f.read() + except FileNotFoundError: + content = f"**Error:** The file `{file_path}` was not found." + + # Escape triple backticks in content to prevent breaking the code block + content = content.replace("```", "```\u200b") + + line_nums_string = "{ py linenums='1' }" + + return f"```{language} {line_nums_string}\n{content}\n```" diff --git a/mkdocs.yaml b/mkdocs.yaml new file mode 100644 index 0000000..c8fcf33 --- /dev/null +++ b/mkdocs.yaml @@ -0,0 +1,88 @@ +site_name: "Imageomics Guide" +site_description: "A guide to collaborative work for Imageomics, including GitHub and Hugging Face workflows." 
+site_author: "Imageomics Institute" +site_url: "https://Imageomics.github.io/Imageomics-guide/" +edit_uri: view/main/docs + +repo_url: https://github.com/Imageomics/Imageomics-guide +edit_uri: blob/main/docs/ + +theme: + name: material + icon: + view: material/eye + logo: logos/Imageomics_logo_butterfly.png + favicon: logos/Imageomics_logo_butterfly.png + palette: + # Palette toggle for automatic mode + - media: "(prefers-color-scheme)" + toggle: + icon: material/brightness-auto + name: Switch to light mode + # Palette toggle for light mode + - media: "(prefers-color-scheme: light)" + scheme: default + toggle: + icon: material/brightness-7 + name: Switch to dark mode + # Palette toggle for dark mode + - media: "(prefers-color-scheme: dark)" + scheme: slate + toggle: + icon: material/brightness-4 + name: Switch to system preference + features: + - content.action.view + - content.code.copy + - navigation.footer + - navigation.tabs + +plugins: + - glightbox + - macros + - search + +markdown_extensions: + - admonition + - attr_list + - md_in_html + - pymdownx.betterem + - pymdownx.blocks.caption + - pymdownx.details + - pymdownx.inlinehilite + - pymdownx.snippets + - pymdownx.superfences + - pymdownx.tasklist + - pymdownx.tilde + - pymdownx.highlight: + anchor_linenums: true + line_spans: __span + pygments_lang_class: true + - toc: + permalink: true + title: 📖 On This Page + +nav: + - Home: index.md + - GitHub Guide: + - "Repo Guide": wiki-guide/GitHub-Repo-Guide.md + - "Workflow": wiki-guide/The-GitHub-Workflow.md + - "Pull Request Guide": wiki-guide/The-GitHub-Pull-Request-(PR)-Guide.md + - "Projects Guide": wiki-guide/Guide-to-GitHub-Projects.md + - "Branch Protections": wiki-guide/When-to-think-about-branch-protections.md + - "Two Repo Problem": wiki-guide/Two-Repo-Problem.md + - Hugging Face Guide: + - "Repo Guide": wiki-guide/Hugging-Face-Repo-Guide.md + - "Workflow": wiki-guide/The-Hugging-Face-Workflow.md + - "Dataset Upload Guide": 
wiki-guide/The-Hugging-Face-Dataset-Upload-Guide.md + - Metadata Guide: wiki-guide/Metadata-Guide.md + - Templates: + - "Dataset Card Template": wiki-guide/HF_DatasetCard_Template_mkdocs.md + - "Model Card Template": wiki-guide/HF_ModelCard_Template_mkdocs.md + - Command Line Cheat Sheet: wiki-guide/Command-Line-Cheat-Sheet.md + - Code of Conduct: CODE_OF_CONDUCT.md + - Other Resources: + - "Glossary for Imageomics": wiki-guide/Glossary-for-Imageomics.md + - "Handling API Keys": wiki-guide/Handling-API-Keys.md + - "Helpful Tools": wiki-guide/Helpful-Tools-for-your-Workflow.md + - "Virtual Environments": wiki-guide/Virtual-Environments.md diff --git a/requirements.txt b/requirements.txt new file mode 100644 index 0000000..2f5e4d0 --- /dev/null +++ b/requirements.txt @@ -0,0 +1,5 @@ +mkdocs +mkdocs-material +mkdocs-material-extensions +mkdocs-macros-plugin +mkdocs-glightbox diff --git a/wiki-guide/Home.md b/wiki-guide/Home.md deleted file mode 100644 index 9142fff..0000000 --- a/wiki-guide/Home.md +++ /dev/null @@ -1,34 +0,0 @@ -# Welcome to the Imageomics Institute! - -This wiki is intended to host internal documentation, making the information needed to get started with and use institute resources readily available to all members. It will evolve continuously with the institute. - -## Highlights -There are many pages of useful information contained in this wiki covering a range of topics from institute hardware, to repositories and archives, to a glossary of _imageomics-related_ terms. - -### Just starting a project? -Check out our guides to get your project off on the right foot! -- [The GitHub Repo Guide](GitHub-Repo-Guide.md): This page reviews expected and suggested GitHub repository contents, as well as structural considerations. -- [The Hugging Face Repo Guide](Hugging-Face-Repo-Guide.md): Analogous expected and suggested repository contents for Hugging Face repositories; there are notable differences from GitHub in both content and structure. 
-- [Metadata Guide](Metadata-Guide.md): Guide to metadata collection and documentation. Follows closely the [HF Dataset Card Template](../templates/HF_DatasetCard_Template_Imageomics.md?plain=1) sections. - -### Project repo up, what's next? -Check out our workflow guides for how to interact with your new repo: -- [The GitHub Workflow](The-GitHub-Workflow.md): This page mainly focuses on branching and the PR process. -- [The Hugging Face Workflow](The-Hugging-Face-Workflow.md): Analogous workflow directions for Hugging Face; there are notable differences from GitHub in how this process works practically, though the concept is the same. - -### Project management or organization got you down? -Discover new tools to help: -- [Guide to GitHub Projects](Guide-to-GitHub-Projects.md): This page focuses on GitHub's project management tool, Projects, which integrates issues and pull requests into a unified task board to keep tabs on how your project is progressing. Labels, milestones, and assignee tags provide improved organization, and allow for more focused views. -- [Helpful Tools for your Workflow](Helpful-Tools-for-your-Workflow.md): Collection of useful tools to facilitate and improve workflows. Comments and recommendations encouraged! -- [Virtual Environments](Virtual-Environments.md): Summary of `conda` and `pip` environments: how to make, use, and share them. - -### Other pages of note: -- [Glossary for Imageomics](Glossary-for-Imageomics.md): Collection of terms used in imageomics. The goal is to ensure all participating domains are represented, thus facilitating interdisciplinary communication. This is a group effort, please check it out and add terms you think should be there! -- [Command Line Cheat Sheet](Command-Line-Cheat-Sheet.md): Collection of useful bash, emacs, and git commands. - - - -
-
- -[Questions, Comments, or Concerns](https://github.com/Imageomics/Imageomics-guide/issues) \ No newline at end of file diff --git a/wiki-guide/Technical-Infrastructure.md b/wiki-guide/Technical-Infrastructure.md deleted file mode 100644 index fce2015..0000000 --- a/wiki-guide/Technical-Infrastructure.md +++ /dev/null @@ -1,15 +0,0 @@ -## Collaboration Infrastructure (Code, Data, Models, and Documents) - -* GitHub - * [Institute Code Repositories](https://github.com/Imageomics) - * Location to store our code (software + tools). -* Hugging Face - * [Imageomics Organization page](https://huggingface.co/imageomics) - * Location to store our datasets and models (and their metadata). - * Hugging Face [Docs](https://huggingface.co/docs) - * [Model Hub](https://huggingface.co/docs/hub/models-the-hub) - * [Datasets](https://huggingface.co/docs/hub/datasets-overview) - -## Collaborative Infrastructure Diagram -![tech_infrastructure_diagram](https://github.com/user-attachments/assets/1173cd79-db94-4326-8b6e-dcbdeb8939cd) - diff --git a/wiki-guide/The-GitHub-Pull-Request-(PR)-Guide.md b/wiki-guide/The-GitHub-Pull-Request-(PR)-Guide.md deleted file mode 100644 index b7b5b28..0000000 --- a/wiki-guide/The-GitHub-Pull-Request-(PR)-Guide.md +++ /dev/null @@ -1,149 +0,0 @@ -# **GitHub Pull Request (PR) Guide Overview** - -This guide is divided into three essential sections to help you effectively manage pull requests in a collaborative project: - -[Creating a Pull Request](#1-creating-a-pull-request): This section explains how to properly prepare and submit a pull request (PR) to ensure that your changes are well-documented, easy to review, and aligned with project goals. - -[Reviewing a Pull Request](#2-reviewing-a-pull-request): Learn the best practices for providing constructive feedback, identifying potential issues, and ensuring code quality during the review process. 
- -[Responding to a Pull Request Review](#3-responding-to-a-pull-request-review): Understand how to address reviewer feedback, make necessary changes, and ensure your pull request meets the required standards for approval. - -By following these steps, you will contribute to a smooth and efficient workflow, ensuring collaboration and quality in your project. - - -# **1. Creating a Pull Request** -> Before creating a pull request, first, please follow [2.1. The GitHub Workflow](The-GitHub-Workflow.md) to create and push your Branch. - -## 1.1 Navigate to the Repository's Main Page -> On GitHub, go to the main page of the repository where you’ve pushed your branch. - -## 1.2 Select Your Branch -> From the "Branch" menu, choose the branch that contains your changes (the one you just pushed). - - -## 1.3 Click 'Compare & pull request': -> You’ll see a button labeled Compare & pull request. Click this to begin the process of creating a pull request for your changes. - -> - - -## 1.4 Add Title and Description: -> In the pull request form, type a descriptive title for your PR. -> Provide a detailed description of the changes you've made, why they are important, and any other relevant information. - -> - - -## 1.5 Choose Review Type: -> * If your pull request is ready for review, click Create Pull Request. -> * If you want to create a draft version of the pull request for further review before it's ready, click the drop-down and select Create Draft Pull Request, then click Draft Pull Request. -> - -# **2. Reviewing a Pull Request** - - ## 2.1 Navigate to the **Pull requests** tab: - -> - - ## 2.2 Select a Pull Request - -In the list of pull requests, please click the pull request that you'd like to review. -> - - - ## 2.3 Review Changes -In the pull request page, please click **Files changed** so as to see the changes. - -> - -> 2.3.1 by clicking , you can choose the unified or split view. 
- -> - - - ## 2.4 Add Comments or Suggestions -When hovering over the lines of code, you can click the blue comment icon to add your review comments. - - -> - -> 2.4.1 If you'd like to comment on multiple lines, please click the line number of the first line you want to comment on and drag down to select a range of lines. - - - ## 2.5 Suggest Changes -If you'd like to suggest a specific change to the lines, please click the suggestion icon, and then edit the text within the suggestion block. - -> - - - ## 2.6 Comment on a File -If you'd like to comment on a file, please click the comment icon at the top right of the file, then add your comments. - -> - - -## 2.7 Mark Files as Viewed -After you finish reviewing a file, you can mark it as viewed. - -> - - -## 2.8 Start or Add to a Review -When you're done, click Start a review. If you have already started a review, please click Add review comment. -> Please note that all line comments are pending and only visible to you. You can edit the comments when needed. If you'd like to abandon your review, please go to **Review changes** and click **Abandon review**. - -## 2.9 Review and Summarize Proposed Changes - -Click Review changes, and then type comments to summarize your proposed changes. - - -> - -## 2.10 Select Review Type - -> - - -> * Select Comment: Provide general feedback on the changes without explicitly approving or rejecting them. -> * Select Approve: Indicate that you’ve reviewed the changes and approve them for merging. -> * Select Request changes: Provide feedback indicating that revisions are needed before the changes can be approved. - -## 2.11 Click Submit review. -This publishes your comments and suggestions and completes the current review round. The PR can then either be merged or updated (depending on approval or comments). We generally expect that whoever submits the PR will merge once all feedback has been incorporated or otherwise addressed. - -# **3. 
Responding to a Pull Request Review** - -## 3.1 Navigate to the Repository's Main Page -Navigate to your repository name, click **Pull requests** -> - -## 3.2 Incorporate Feedback Changes - -After receiving feedback on your pull request, you can apply the changes in one of two ways: either by committing each change individually or by grouping several changes into a single commit. The method you choose depends on whether you prefer fine-grained control over the commit history or a more streamlined approach. - -### 3.2.1 Apply change in its own commit -Apply the suggested change by creating a separate commit for it. This approach helps keep your commit history clear and each change traceable. -> - -### 3.2.2 Add the Suggestion to a Batch of Changes -If you plan to include multiple changes in one commit, you can add suggestions to a batch. Once you've collected all the desired suggestions, click "Commit suggestions" to apply them in one go. - -> - - -## 3.3 Add Commit Message -In the commit message field, enter a brief, descriptive message that clearly explains the changes made to the file(s) - -## 3.4 Click Commit changes -After entering your commit message, click the "Commit changes" button to finalize and save your modifications to the repository. This step ensures that your changes are recorded and can be reviewed or merged into the main codebase. - -## 3.5 Re-requesting a Review -If you’ve addressed all the requested changes and your pull request requires further review, re-request a review by notifying the reviewers. This action prompts them to evaluate your updated code and provide feedback or approval. - -## 3.6 Out-of-scope Suggestion -If the suggested change falls outside the scope of your pull request, create a new issue to address the feedback separately. Issues can be created directly from a PR comment. 
- - - - - diff --git a/wiki-guide/Virtual-Environments.md b/wiki-guide/Virtual-Environments.md deleted file mode 100644 index 92fca48..0000000 --- a/wiki-guide/Virtual-Environments.md +++ /dev/null @@ -1,29 +0,0 @@ -# Managing Dependencies and Environments -Recording dependencies and environment information is crucial for reproducibility and interoperability across different platforms. There are many options for this, and sometimes it is appropriate to use multiple within the same project. - -The goal is to make it as easy as possible for others (including your future self) to run the code. - -## Conda Environments -The following example commands will get you set up with a Conda environment that can be tracked and shared. -* Install [Miniconda](https://docs.conda.io/en/latest/miniconda.html). -* Create an environment: `conda create --name ` -* Activate the environment: `conda activate ` -* Install packages you need: `conda install -c conda-forge python=3.9 pandas matplotlib` - * `-c conda-forge` specifies the channel to install from. ([more information](https://docs.conda.io/projects/conda/en/latest/user-guide/concepts/channels.html)) - * You can specify the version of a package or omit this to get the latest available. ([more information](https://conda.io/projects/conda/en/latest/user-guide/tasks/manage-pkgs.html#id2)) -* Once the needed packages are installed, export the environment to a file: -```bash -conda env export --no-builds --from-history | grep -v "prefix" > environment.yml -``` - * `--no-builds` and `--from-history` flags will cause the environment file to only specify the packages and versions that you manually installed. This may help with cross-platform compatibility by giving conda the flexibility to find compatible sub-dependencies on another system. 
- * `| grep -v "prefix"` keeps your system-specific environment storage location (called the `prefix`) from being added to the file. - * If you want to add the actual package versions that were installed (if you did not specify them during installation) to the `environment.yml` file, you can check them with `conda list` and copy-paste them in manually. - * Don't forget to also add and track this new file with git! -* To install the dependencies somewhere else from this file, use `conda env create -f environment.yml`. - -## Pip Virtual Environment -For virtual environments using `pip` to install packages (Python environments), use `python -m pip freeze` to print a list of packages (and their versions) installed in the environment. `python -m pip freeze > requirements.txt` will populate a `requirements.txt` file with all these packages and versions listed (e.g., `pandas==2.0.1`). Note that this will _not_ list only the packages you installed directly; it also includes all of their dependencies. For more information, see the [pip documentation](https://pip.pypa.io/en/stable/cli/pip_freeze/). - -The packages listed in this machine-readable file can then be installed using `pip install -r requirements.txt` when in the appropriate folder. -
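To make the pinned `name==version` format described in the Pip Virtual Environment section concrete, here is a minimal Python sketch that builds a similar listing from the standard library. It only approximates what `python -m pip freeze` prints and is not pip's implementation: the function name `freeze_like` is hypothetical, and pip applies extra rules (for example around editable installs and its own tooling packages) that are not reproduced here.

```python
from importlib.metadata import distributions


def freeze_like():
    """Return sorted 'name==version' strings for installed distributions.

    Hypothetical helper for illustration; it mimics the output format of
    `pip freeze` but does not replicate pip's filtering rules.
    """
    pins = set()
    for dist in distributions():
        name = dist.metadata.get("Name")
        if name:  # skip distributions whose metadata has no name
            pins.add(f"{name}=={dist.version}")
    # Case-insensitive sort so the listing is stable and easy to read
    return sorted(pins, key=str.lower)


if __name__ == "__main__":
    print("\n".join(freeze_like()))
```

Run inside an activated environment, this lists every installed distribution, including dependencies you never asked for by name. That is the same reason a frozen `requirements.txt` is longer than the set of packages you installed by hand.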