We at Weights & Biases ❤️ open source and welcome contributions from the community!
This guide discusses the development workflow and the internals of the wandb
library.
- Development workflow
- Setting up your development environment
- Code organization
- Building protocol buffers
- Linting the code
- Testing
- Live development
- Library Objectives
- Detailed walk through of a simple program
- Server introspection
- Deprecating features
- Adding URLs
The typical development workflow is:
- Browse the existing Issues on GitHub to see if the feature/bug you are willing to add/fix has already been requested/reported.
  - If not, please create a new issue. This will help the project keep track of feature requests and bug reports and make sure effort is not duplicated.
- If you are a first-time contributor, please go to https://github.com/wandb/wandb and click the "Fork" button in the top-right corner of the page. This will create your personal copy of the repository that you will use for development.
  - Set up SSH authentication with GitHub.
- Clone the forked project to your machine and add the upstream repository that will point to the main wandb project:
  git clone https://github.com/<your-username>/wandb.git
  cd wandb
  git remote add upstream https://github.com/wandb/wandb.git
- Develop your contribution.
  - Make sure your fork is in sync with the main repository:
    git checkout main
    git pull upstream main
  - Create a git branch where you will develop your contribution. Use a sensible name for the branch, for example:
    git checkout -b new-awesome-feature
  - Hack! As you make progress, commit your changes locally, e.g.:
    git add changed-file.py tests/test-changed-file.py
    git commit -m "feat(integrations): Add integration with the `awesomepyml` library"
  - Test and lint your code! Please see below for a detailed discussion.
  - Ensure compliance with Conventional Commits (see below). This is enforced by the CI and will prevent your PR from being merged if not followed.
- Proposed changes are contributed through GitHub Pull Requests.
  - When your contribution is ready and the tests all pass, push your branch to GitHub:
    git push origin new-awesome-feature
  - Once the branch is uploaded, GitHub will print a URL for submitting your contribution as a pull request. Open that URL in your browser, write an informative title and a detailed description for your pull request, and submit it.
  - Please link the relevant issue (either the existing one or the one you created) to your PR. See the right column on the PR page. Alternatively, in the PR description, mention that it "Fixes link-to-the-issue" and GitHub will do the linking automatically.
  - The team will review your contribution and provide feedback. To incorporate changes recommended by the reviewers, commit edits to your branch and push to the branch again (there is no need to re-create the pull request; it will automatically track modifications to your branch), e.g.:
    git add tests/test-changed-file.py
    git commit -m "test(sdk): Add a test case to address reviewer feedback"
    git push origin new-awesome-feature
  - Once your pull request is approved by the reviewers, it will be merged into the main codebase.
At Weights & Biases, we ask that all PR titles conform to the Conventional Commits specification. Conventional Commits is a lightweight convention on top of commit messages.
Structure
The commit message should be structured as follows:
<type>(<scope>): <description>
Only certain types are permitted.
⭐ User-facing notes such as `fix` and `feat` should be written so that a user can clearly understand the changes. If the feature or fix does not directly impact users, consider using a different type. Examples can be found in the section below.

Type | Name | Description | User-facing? |
---|---|---|---|
feat | ✨ Feature | A pull request that adds new functionality that directly impacts users | Yes |
fix | 🐛 Fix | A pull request that fixes a bug | Yes |
docs | 📚 Documentation | Documentation changes only | Maybe |
style | 💎 Style | Changes that do not affect the meaning of the code (e.g. linting or adding type annotations) | No |
refactor | 📦 Code Refactor | A code change that neither fixes a bug nor adds a feature | No |
perf | 🚀 Performance Improvements | A code change that improves performance | No |
test | 🚨 Tests | Adding new or missing tests or correcting existing tests | No |
build | 🛠 Builds | Changes that affect the build system (e.g. protobuf) or external dependencies | Maybe |
ci | ⚙️ Continuous Integrations | Changes to our CI configuration files and scripts | No |
chore | ♻️ Chores | Other changes that don't modify source code files. | No |
revert | 🗑 Reverts | Reverts a previous commit | Maybe |
security | 🔒 Security | Security fix/feature | Maybe |
Which part of the codebase does this change impact? Only certain scopes are permitted.
Scope | Name | Description |
---|---|---|
sdk | Software Development Kit | Generic SDK changes, or when a narrower scope can't be defined |
cli | Command-Line Interface | Generic CLI changes |
public-api | Public API | Public API changes |
integrations | Integrations | Changes related to third-party integrations |
artifacts | Artifacts | Changes related to Artifacts |
media | Media Types | Changes related to Media types |
sweeps | Sweeps | Changes related to Sweeps |
launch | Launch | Changes related to Launch |
Sometimes a change may span multiple scopes. In this case, please choose the scope that would be most relevant to the user.
Write a short description of the change in the imperative mood.
User-facing notes (ones with type `fix` or `feat`) should be written so that a user can understand what has changed.
If the feature or fix does not directly impact users, consider using a different type.
✅ Good Examples
- `feat(media): add support for RDKit Molecules`
  It is clear to the user what the change introduces to our product.
- `fix(sdk): fix a hang caused by keyboard interrupt on Windows`
  This bug fix addressed an issue that caused the SDK to hang when hitting Ctrl-C on Windows.

❌ Bad Examples
- `fix(launch): fix an issue where patch is None`
  It is unclear what is referenced here.
- `feat(sdk): Adds new query to the the internal api getting the state of the run`
  It is unclear what is of importance to the user here or what they would do with that information. A better type would be `chore`, or the title should indicate how it translates into a user-facing feature.
We test the library code against multiple Python versions and use `pyenv` to manage those. Install `pyenv` by running:
curl -L https://github.com/pyenv/pyenv-installer/raw/master/bin/pyenv-installer | bash
To load `pyenv` automatically, add the following lines to your shell's startup script, such as `~/.bashrc` or `~/.zshrc` (and then either restart the shell, run `exec $SHELL`, or `source` the changed script):
export PYENV_ROOT="$HOME/.pyenv"
export PATH="$HOME/.pyenv/bin:$PATH"
eval "$(pyenv init --path)"
eval "$(pyenv virtualenv-init -)"
Then run the following command to set up your environment:
./tools/setup_dev_environment.py
At the first invocation, this tool will set up multiple python environments, which takes some time. You can set up a subset of the target environments to test against, for example:
./tools/setup_dev_environment.py --python-versions 3.7 3.8
The tool will also set up `tox`, which we use for automating development tasks such as code linting and testing.
Note: to switch the default Python version, edit the `.python-version` file in the repository root.
Additional notes for macOS users:
- The `tensorflow-macos` package that is installed on Macs with the Apple M1 chip requires the `h5py` package to be installed, which in turn requires `hdf5` to be installed in the system. You can install `hdf5` and `h5py` into a `pyenv` environment with the following commands using Homebrew:
  $ brew install hdf5
  $ export HDF5_DIR="$(brew --prefix hdf5)"
  $ pip install --no-binary=h5py h5py
- The `soundfile` package requires the `libsndfile` package to be installed in the system. Note that a pre-release version of `soundfile` will be installed. You can install `libsndfile` with the following command using Homebrew:
  $ brew install libsndfile
- The `moviepy` package requires the `ffmpeg` package to be installed in the system. You can install `ffmpeg` with the following command using Homebrew:
  $ brew install ffmpeg
- The `lightgbm` package might require the build packages `cmake` and `libomp` to be installed. You can install `cmake` and `libomp` with the following command using Homebrew:
  $ brew install cmake libomp
The `wandb` package is organized as follows:
wandb/
├── ...
├── apis/ # Public api (still has internal api but this should be moved to wandb/internal)
│ ├── ...
│ ├── internal.py
│ ├── ...
│ └── public.py
├── cli/ # Handlers for command line functionality
├── ...
├── integration/ # Third party integration
│ ├── fastai/
│ ├── gym/
│ ├── keras/
│ ├── lightgbm/
│ ├── metaflow/
│ ├── prodigy/
│ ├── sacred/
│ ├── sagemaker/
│ ├── sb3/
│ ├── tensorboard/
│ ├── tensorflow/
│ ├── torch/
│ ├── xgboost/
│ └── ...
├── ...
├── proto/ # Protocol buffers for inter-process communication and the persistent file store
├── ...
├── sdk/ # User accessed functions [wandb.init()] and objects [WandbRun, WandbConfig, WandbSummary, WandbSettings]
│ ├── backend/ # Support to launch internal process
│ ├── ...
│ ├── interface/ # Interface to backend execution
│ ├── internal/ # Backend threads/processes
│ └── ...
├── ...
├── sweeps/ # Hyperparameter sweep engine (see repo: https://github.com/wandb/sweeps)
└── ...
We use protocol buffers to communicate from the user process to the `wandb` backend process.
If you update any of the `.proto` files in `wandb/proto`, you'll need to run:
make proto
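For a rough sense of what the generated code looks like in use, here is a small Python sketch that builds one of the generated messages (message and field names are taken from `wandb/proto` and may drift between versions, so treat this as illustrative rather than authoritative):
# Illustrative sketch: construct a history record using the generated protobuf
# classes (see wandb/proto/wandb_internal.proto for the authoritative definitions).
from wandb.proto import wandb_internal_pb2 as pb

record = pb.Record()
item = record.history.item.add()  # HistoryRecord with repeated HistoryItem fields
item.key = "loss"
item.value_json = "0.5"
print(record)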
We use `black`, `flake8`, and `mypy` for code formatting and checks (including static type checks).
To reformat the code, run:
tox -e format
To run checks, execute:
tox -e flake8,mypy
We use the `pytest` framework. Tests can be found in `tests/`.
By default, tests are run in parallel with 4 processes. This can be changed by setting the `CI_PYTEST_PARALLEL` environment variable to a different value.
To run specific tests in a specific environment:
tox -e py37 -- tests/test_some_code.py -k substring_of_test
To run all tests in a specific environment:
tox -e py38
If you make changes to `requirements_dev.txt` that are used by tests, you need to recreate the Python environments with:
tox -e py37 --recreate
Sometimes, `pytest` will swallow or shorten important print messages or stack traces sent to stdout and stderr (particularly when they come from background processes). This will manifest as a test failure with no (or shortened) associated output.
In these cases, add the `-vvvv --showlocals` flags to stop pytest from capturing the messages and allow them to be printed to the console, e.g.:
tox -e py37 -- tests/test_some_code.py -k substring_of_test -vvvv --showlocals
If a test fails, you can use the `--pdb -n0` flags to get the pdb debugger attached to the test:
tox -e py37 -- tests/test_some_code.py -k failing_test -vvvv --showlocals --pdb -n0
You can also manually set breakpoints in the test code (`breakpoint()`) to inspect test failures.
Testing `wandb` is tricky for a few reasons:
- `wandb.init` launches a separate process; this adds overhead and makes it difficult to assert logic happening in the backend process.
- The library makes lots of requests to a W&B server as well as other services. We don't want to make requests to an actual server, so we need to mock one out.
- The library has many integrations with 3rd-party libraries and frameworks. We need to assert that we never break compatibility with these libraries as they evolve.
- wandb writes files to the local file system. When we're testing, we need to make sure each test is isolated.
- wandb reads configuration state from global directories such as `~/.netrc` and `~/.config/wandb/settings`; we need to override these in tests.
- The library needs to support jupyter notebook environments as well.
To make our lives easier we've created lots of tooling to help with the above challenges. Most of this tooling comes in the form of Pytest Fixtures. There are detailed descriptions of our fixtures in the section below. What follows is a general overview of writing good tests for wandb.
To test functionality in the user process, `wandb_init_run` is the simplest fixture to start with. It is like calling `wandb.init()`, except we don't actually launch the wandb backend process; instead, a mocked object you can make assertions with is returned. For example:
def test_basic_log(wandb_init_run):
wandb.log({"test": 1})
assert wandb.run._backend.history[0]["test"] == 1
One of the most powerful fixtures is `live_mock_server`. When running tests, we start a Flask server that provides our graphql, filestream, and additional web service endpoints with sane defaults. This allows us to use wandb just like we would in the real world. It also means we can assert that various requests were made. All server logic can be found in `tests/utils/mock_server.py`, and it's straightforward to add additional logic to this server. Here's a basic example of using the `live_mock_server`:
def test_live_log(live_mock_server, test_settings):
run = wandb.init(settings=test_settings)
run.log({"test": 1})
ctx = live_mock_server.get_ctx()
first_stream_hist = utils.first_filestream(ctx)["files"]["wandb-history.jsonl"]
assert json.loads(first_stream_hist["content"][0])["test"] == 1
Notice we also used the `test_settings` fixture. This turns off console logging and ensures the run is automatically finished when the test finishes. Another benefit of this fixture is that it creates a run directory for the test at `tests/logs/NAME_OF_TEST`, which is very useful for debugging because the logs are stored there. In addition to the debug logs, you can find the `live_mock_server` logs at `tests/logs/live_mock_server.log`.
We also have pytest fixtures that are used automatically. These include `local_netrc` and `local_settings`, which ensure we never read those settings files from your own environment.
The final fixture worth noting is `notebook`. It runs a Jupyter notebook kernel and allows you to execute specific cells within the notebook environment:
def test_one_cell(notebook):
with notebook("one_cell.ipynb") as nb:
nb.execute_all()
output = nb.cell_output(0)
assert "lovely-dawn-32" in output[-1]["data"]["text/html"]
The wandb system can be viewed as 3 distinct services:
- The user process, where `wandb.init()` is called
- The internal process, where work is done to format data to be synced to the server
- The backend server which listens to graphql endpoints and populates a database
The interfaces are described here:
Users . Shared . Internal . Mock
Process . Queues . Process . Server
. . .
+----+ . +----+ . +----+ . +----+
| Up | . | Sq | . | Ip | . | Ms |
+----+ . +----+ . +----+ . +----+
| . | . | . |
| ------> | -------> | --------> | 1
| . | . | . |
| . | -------> | --------> | 2
| . | . | . |
| ------> | . | . | 3
| . | . | . |
1. Full codepath from wandb.init() to mock_server
Note: coverage only counts for the User Process and interface code
Example: [wandb_integration_test.py](tests/pytest_tests/system_tests/test_wandb_integration.py)
2. Inject into the Shared Queues to mock_server
Note: coverage only counts for the interface code and internal process code
Example: [test_sender.py](tests/pytest_tests/system_tests/test_sender.py)
3. From wandb.Run object to Shared Queues
Note: coverage counts for User Process
Example: [wandb_run_test.py](tests/pytest_tests/unit_tests/test_wandb_run.py)
Good examples of tests for each level of testing can be found at:
- test_system_metrics_*.py: User process tests
- test_metric_internal.py: Internal process tests
- test_metric_full.py: Full stack tests
Global fixtures are defined in `tests/**/conftest.py`, separated into unit test fixtures, system test fixtures, and shared fixtures.
- `local_netrc` - used automatically for all tests; patches the netrc logic to avoid interacting with your system `.netrc`.
- `local_settings` - used automatically for all tests; patches the global settings path to an isolated directory.
- `test_settings` - returns a `wandb.Settings` object that can be used to initialize runs against the `live_mock_server`. See `tests/wandb_integration_test.py`.
- `runner` - exposes a `click.CliRunner` object which can be used by calling `.isolated_filesystem()`. This also mocks out calls for login, returning a dummy api key.
- `mocked_run` - returns a mocked out run object that replaces the backend interface with a MagicMock so no actual api calls are made (see the example after this list).
- `wandb_init_run` - returns a fully functioning run with a mocked out interface (the result of calling `wandb.init`). No APIs are actually called, but you can access what apis were called via `run._backend.{summary,history,files}`. See `test/utils/mock_backend.py` and `tests/frameworks/test_keras.py`.
- `mock_server` - mocks all calls to the `requests` module with sane defaults. You can customize `tests/utils/mock_server.py` to use context or add api calls.
- `live_mock_server` - we start a live flask server when tests start. `live_mock_server` configures `WANDB_BASE_URL` to point to this server. You can alter or get its context with the `get_ctx` and `set_ctx` methods. See `tests/wandb_integration_test.py`. NOTE: this currently doesn't support concurrent requests, so if we run tests in parallel we need to solve for this.
- `git_repo` - places the test context into an isolated git repository.
- `test_dir` - places the test into `tests/logs/NAME_OF_TEST`, which is useful for looking at debug logs. This is used by `test_settings`.
- `notebook` - gives you a context manager for reading a notebook, providing `execute_cell`. See `tests/utils/notebook_client.py` and `tests/test_notebooks.py`. This uses `live_mock_server` to enable actual api calls in a notebook context.
- `mocked_ipython` - to get credit for codecov you may need to pretend you're in a jupyter notebook when you aren't; this fixture enables that.
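For instance, a minimal unit test built on top of the `mocked_run` fixture might look like the following sketch (the config key and value are arbitrary; the backend is mocked, so no API calls are made):
# Sketch of a unit test using the mocked_run fixture; only in-process behavior
# (here, the config object) is exercised.
def test_config_update(mocked_run):
    mocked_run.config.update({"learning_rate": 0.01})
    assert mocked_run.config["learning_rate"] == 0.01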
We use codecov to ensure we're executing all branches of logic in our tests. Below are some JHR Protips™
- If you want to see the lines not covered, click on the "Diff" tab, then look for any "+" lines that have a red block for the line number.
- If you want more context about the files, go to the "Files" tab; it will highlight diffs, but you have to do even more searching for the lines you might care about.
- If you don't want to use codecov, you can use local coverage (I tend to do this to speed things up a bit: run your tests, then run `tox -e cover`). This will give you the old-school text output of missing lines (but not based on a diff from main).
We currently have 8 categories of test coverage:
- `project`: main coverage numbers; I don't think it can drop by more than a few percent, or you will get a failure
- `patch/tests`: must be 100%. If you are writing code for tests, it needs to be executed; if you are planning for the future, comment out your lines
- `patch/tests-utils`: `tests/conftest.py` and supporting fixtures at `tests/utils/`; no coverage requirements
- `patch/sdk`: anything that matches `wandb/sdk/*.py` (so top-level sdk files). These have lots of ways to test, so it should be high coverage. Currently, the target is ~80% (but it is dynamic)
- `patch/sdk-internal`: should be covered very highly; the target is around 80% (also dynamic)
- `patch/sdk-other`: a "catch all" for other stuff in `wandb/sdk/`; target around 75% (dynamic)
- `patch/apis`: we have no good fixtures for this, so until we do, this will get a waiver
- `patch/other`: everything else. We have lots of stuff that isn't easy to test, so it is in this category; currently the requirement is ~60%
Our CircleCI setup uses `pytest-split` to balance the unit-test load across multiple nodes. In order to do this efficiently, the test timing file (`.test_durations`) needs to be updated every once in a while with:
CI_PYTEST_SPLIT_ARGS="--store-durations" tox -e py37
TODO: overview of how to write and run functional tests with yea and the yea-wandb plugin.
The `yea-wandb` plugin for `yea` uses copies of several components from `tests/utils` (`artifact_emu.py`, `mock_requests.py`, and `mock_server.py`) to provide a test environment for functional tests. Currently, we maintain a copy of those components in `yea-wandb/src/yea_wandb`, so they need to be kept in sync.
If you update one of those files, you need to:
- While working on your contribution:
  - Make a new branch (say, `shiny-new-branch`) in `yea-wandb` and pull in the new versions of the files. Make sure to update the `yea-wandb` version.
  - Point the `wandb/wandb` branch you are working on to this `wandb/yea-wandb` branch. In `tox.ini`, search for `yea-wandb==<version>` and replace the entire line with `https://github.com/wandb/yea-wandb/archive/shiny-new-branch.zip`.
- Once you are happy with your changes:
  - Bump to a new version by first running `make bumpversion-to-dev`, committing, and then running `make bumpversion-from-dev`.
  - Release `yea-wandb` (with `make release`) from your `shiny-new-branch` branch.
  - If you have changes to any of the files (`artifact_emu.py`, `mock_requests.py`, or `mock_server.py`) in your `wandb/yea-wandb` branch, make sure to update these files in `tests/utils` in a `wandb/wandb` branch. We have a GitHub Action that verifies that these files are equal between `wandb/wandb` and `wandb/yea-wandb`. If you have changes in these files and you merge them in the wandb/yea-wandb repo without syncing them to the wandb/wandb repo, all wandb/wandb PRs will fail this GitHub Action.
  - Once your `wandb/wandb` PR and `wandb/yea-wandb` PR are ready to be merged, merge the `wandb/yea-wandb` PR first, make sure that your `wandb/wandb` PR is green, and merge it next.
All of the regression-testing logic lives in the `wandb-testing` repo. The main script for running regression tests is `wandb-testing/regression/regression.py`, and the main configuration file is `wandb-testing/regression/regression-config.yaml`.
git clone [email protected]:wandb/wandb-testing.git
cd wandb-testing/regression && python regression.py tests/main/huggingface/ --dryrun
The above script will print all of the `huggingface-transformers` test configurations.
The expected output should look something like this:
########################################
# huggingface-transformers init py37-pt
########################################
########################################
# huggingface-transformers init py37-pt1.4
########################################
########################################
# huggingface-transformers init py37-ptn
########################################
------------------
Good runs:
Failed runs:
The test names reflect the test configurations: `init` is the configuration specified in the test config file.
Some details include:
- All the tests use `py37`: Python 3.7.
- Each test uses a different version of PyTorch:
  - `pt`: latest PyTorch release
  - `pt1.4`: version 1.4 of PyTorch
  - `ptn`: nightly version of PyTorch
For more details about general usage and how to add new tests see this README.
You can enter any of the tox environments and install a live dev build with:
source .tox/py37/bin/activate
pip install -e .
There's also a tox dev environment using Python 3; more info here.
TODO: there are lots of cool things we could do with this; currently it just puts us in IPython.
tox -e dev
When using editable mode from outside of the wandb directory, specific configuration settings are necessary. Because of the naming overlap between the run directory and the package, editable mode might erroneously pick up the wrong files. For more detailed information, refer to the documentation available at this link. There are two ways to address this:
- During installation, provide the following flags:
  pip install -e . --config-settings editable_mode=strict
  By doing so, editable mode will correctly identify the relevant files.
- Alternatively, you can configure it once using the following command:
  pip config set global.config-settings editable_mode=strict
  Once the configuration is in place, you can run `pip install -e .` without any additional flags, and the strict editable mode will be applied consistently.
All objects and methods that users are intended to interact with are in the `wandb/sdk` directory. Any method on an object that is not prefixed with an underscore is part of the supported interface and should be documented.
The user interface should be typed using Python 3.6+ type annotations. Older versions will use an untyped interface.
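As a purely hypothetical illustration of this convention (the class and methods below are made up and not part of wandb), a user-facing object would look something like:
# Hypothetical example of the convention only: documented, type-annotated public
# methods; underscore-prefixed methods are internal and unsupported.
from typing import List, Optional, Tuple


class ExampleMediaLike:
    def __init__(self) -> None:
        self._files: List[Tuple[str, Optional[str]]] = []

    def add_file(self, local_path: str, name: Optional[str] = None) -> None:
        """Track a local file under an optional name."""
        self._files.append((local_path, name))

    def _flush(self) -> None:
        # Not part of the supported interface.
        self._files.clear()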
`wandb.Settings` is the main settings object that is passed explicitly or implicitly to all `wandb` functions.
The primary design principle is that the behavior of the code can be affected by multiple sources of settings. These sources need to be merged consistently, and the user should be informed when settings are overwritten. Examples of sources of settings:
- Enforced settings from organization, team, user, project
- Settings set by environment variables prefixed with `WANDB_`, e.g. `WANDB_PROJECT=`
- Settings passed to the `wandb.init` function: `wandb.init(project=)`
- Default settings from organization, team, project
- Settings in the global settings file: `~/.config/wandb/settings`
- Settings in the local settings file: `./wandb/settings`
Source priorities are defined in `wandb.sdk.wandb_settings.Source`.
Each individual setting of the Settings object is either a default or priority setting.
In the latter case, reverse priority is used to determine the source of the setting.
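As a small, illustrative example of these priorities (the project names are made up), a value passed to `wandb.init()` comes from a higher-priority source than an environment variable and therefore wins:
# Illustrative sketch: the explicit wandb.init() argument overrides the
# WANDB_PROJECT environment variable because its source has higher priority.
import os

import wandb

os.environ["WANDB_PROJECT"] = "project-from-env"
run = wandb.init(project="project-from-init", mode="offline")
assert run.project == "project-from-init"
run.finish()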
Under the hood in `wandb.Settings`, individual settings are represented as `wandb.sdk.wandb_settings.Property` objects that:
- Encapsulate the logic of how to preprocess and validate values of settings throughout the lifetime of a class instance.
- Allow for runtime modification of settings with hooks, e.g. in the case when a setting depends on another setting.
- Use the `update()` method to update the value of a setting. Source priority logic is enforced when updating values.
- Determine the source priority using the `is_policy` attribute when updating the property value. E.g. if `is_policy` is `True`, the smallest `Source` value takes precedence.
- Have the ability to freeze/unfreeze.

Here's a basic example (for more examples, see `tests/wandb_settings_test.py`):
from wandb.sdk.wandb_settings import Property, Source
def uses_https(x):
if not x.startswith("https"):
raise ValueError("Must use https")
return True
base_url = Property(
name="base_url",
value="https://wandb.com/",
preprocessor=lambda x: x.rstrip("/"),
validator=[lambda x: isinstance(x, str), uses_https],
source=Source.BASE,
)
endpoint = Property(
name="endpoint",
value="site",
validator=lambda x: isinstance(x, str),
hook=lambda x: "/".join([base_url.value, x]),
source=Source.BASE,
)
>>> print(base_url) # note the stripped "/"
'https://wandb.com'
>>> print(endpoint) # note the runtime hook
'https://wandb.com/site'
>>> print(endpoint._value) # raw value
'site'
>>> base_url.update(value="https://wandb.ai/", source=Source.INIT)
>>> print(endpoint) # valid update with a higher priority source
'https://wandb.ai/site'
>>> base_url.update(value="http://wandb.ai/") # invalid value - second validator will raise exception
ValueError: Must use https
>>> base_url.update(value="https://wandb.dev", source=Source.USER)
>>> print(endpoint) # valid value from a lower priority source has no effect
'https://wandb.ai/site'
The `Settings` object:
- The code is supposed to be self-documenting -- see `wandb/sdk/wandb_settings.py` :)
- Uses `Property` objects to represent configurable settings.
- Clearly and compactly defines all individual settings, their default values, preprocessors, validators, and runtime hooks, as well as whether they are treated as policies.
  - To leverage both static and runtime validation, the `validator` attribute is a list of functions (or a single function) that are applied in order. The first function is automatically generated from type annotations of class attributes.
- Provides a mechanism to update settings, specifying the source (which abides by the corresponding Property source logic), via `Settings.update()`. Direct attribute assignment is not allowed.
- Copies Settings objects carefully.
- Implements the Mapping interface.
- Exposes `attribute.value` if `attribute` is a `Property`.
- Has the ability to freeze/unfreeze the object.
- Provides a `Settings.make_static()` method that we can use to replace `StaticSettings`.
- Provides adapted/reworked convenience methods to apply settings originating from different sources.
To add a new setting:
- Add a new type-annotated `Settings` class attribute.
- Add the new field to `wandb/proto/wandb_settings.proto` following the existing pattern.
  - Run `make proto` to re-generate the python stubs.
- If the setting comes with a default value/preprocessor/additional validators/runtime hooks, add them to the template dictionary that the `Settings._default_props` method returns, using the same key name as the corresponding class variable (a sketch of such an entry is shown after this list).
  - For any setting that is only computed (from other settings) and need/should not be set/updated (and so does not require any validation etc.), define a hook (which does not have to depend on the setting's value) and use `"auto_hook": True` in the template dictionary (see e.g. the `wandb_dir` setting).
- Add tests for the new setting to `tests/wandb_settings_test.py`.
- Note that individual settings may depend on other settings through validator methods and runtime hooks, but the resulting directed dependency graph must be acyclic. You should re-generate the topologically-sorted modification order list with `tox -e auto-codegen`; it will also automatically detect cyclic dependencies and throw an exception.
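As referenced in the list above, a template-dictionary entry for a hypothetical new setting might look like this sketch (the name, values, and hooks are made up; mirror the real entries returned by `Settings._default_props` when adding yours):
# Hypothetical template-dictionary entry following the Property(...) pattern shown
# earlier; the real entries live in Settings._default_props.
my_new_setting = dict(
    value="https://example.test/",
    preprocessor=lambda x: str(x).rstrip("/"),
    validator=lambda x: isinstance(x, str),
    # A runtime "hook" and/or "auto_hook": True can be added here for computed settings.
)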
Calls to `wandb.log()` result in the dictionary being serialized into a schema'ed data structure. Any unsupported element should result in an immediate exception.
When changing properties of objects, those objects should serialize the changes into a schema'ed data structure. There should be no need for `.save()` methods on objects.
When running in disabled mode, all objects act as in-memory stores of attribute information, but they do not perform any serialization to sync data.
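As a quick sketch of disabled mode (the config attribute and logged key are arbitrary):
# Sketch: in disabled mode the run acts as an in-memory store; nothing is
# serialized or synced to a server.
import wandb

run = wandb.init(mode="disabled")
run.config.param = 42
run.log({"loss": 0.1})
assert run.config.param == 42
run.finish()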
The walkthrough below traces what happens when a simple program like the following runs:
import wandb
run = wandb.init(config=dict(param1=1))
run.config.param2 = 2
run.log(dict(this=3))
On `import wandb`, minimal code should be run.

What happens during `wandb.init()`:
- User process:
  - Calls internal `wandb.setup()` in case the user has not yet initialized the global wandb state. `wandb.setup()` is similar to `wandb.init()`, but it impacts the entire process or session. This allows multiple `wandb.init()` calls to share some common setup.
  - Sets up notification and request queues for communicating with the internal process.
  - Spawns the internal process used for syncing, passing the queues and the settings object.
  - Creates a Run object (`RunManaged`).
  - Encodes the passed config dictionary into the `RunManaged` object.
  - Sends a synchronous protocol buffer request message (`RunData`) to the internal process.
  - Waits for a response for a configurable amount of time, then populates the run object with the response data.
  - The terminal (`sys.stdout`, `sys.stderr`) is wrapped so that output is sent to the internal process with `RunOutput` messages.
  - Sets a global `Run` object for users who use the `wandb.log()` syntax.
  - `Run.on_start()` is called to display initial information about the run.
  - Returns the `Run` object.
- Internal process:
  - Process initialization.
  - Waits on the notify queue for work.
  - When a `RunData` message is seen, queues this message to be written to disk (`wandb_write`) and sent to the cloud (`wandb_send`).
  - The `wandb_send` thread sends an `upsert_run` graphql http request.
  - The response is populated into a response message.
  - Spins up internal threads which monitor system metrics.
  - Queues the response message to the user process context.

What happens when a `run.config` attribute is set (e.g. `run.config.param2 = 2`):
- User process:
  - A callback on the `Run` object is called with the changed config item.
  - The `Run` object callback generates a `ConfigData` message and asynchronously sends it to the internal process.
- Internal process:
  - When a `ConfigData` message is seen, queues the message to `wandb_write` and `wandb_send`.
  - The `wandb_send` thread sends an `upsert_run` graphql http request.

What happens during `run.log()`:
- User process:
  - The log dictionary is serialized and sent asynchronously as a `HistoryData` message to the internal process.
- Internal process:
  - When a `HistoryData` message is seen, queues the message to `wandb_write` and `wandb_send`.
  - The `wandb_send` thread sends `file_stream` data to the cloud server.

What happens at the end of the program:
- User process:
  - The terminal wrapper is shut down and flushed to the internal process.
  - The exit code of the program is captured and sent synchronously to the internal process as `ExitData`.
  - `Run.on_final()` is called to display final information about the run.
Some features may depend on a minimum version of the W&B backend service, but this library may be communicating with an outdated backend. We use the GraphQL introspection schema to determine which features are supported. See the `*_introspection` methods in `internal_api.py` for examples. Depending on the nature of your feature, you may need to introspect:
- If one or more fields on the root `Query` or `Mutation` types exist: example
- If an input type includes a specific field: example

You should reuse the generic introspection methods if possible, and cache the introspection result.
The entire introspection schema is available. For more info, see the official GraphQL docs.
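For a rough idea of the shape of such a check, the sketch below asks the server which fields exist on the root Mutation type and looks one up by name (the query text and helper function are illustrative; the real implementations are the `*_introspection` methods in `internal_api.py`):
# Illustrative sketch only: introspect the root Mutation type and check whether a
# field is available on the backend we are talking to.
MUTATION_FIELDS_QUERY = """
query IntrospectMutations {
    MutationType: __type(name: "Mutation") {
        fields {
            name
        }
    }
}
"""


def server_supports_mutation(introspection_result: dict, field_name: str) -> bool:
    # introspection_result is the parsed "data" portion of the response to the query above.
    fields = introspection_result["MutationType"]["fields"] or []
    return any(field["name"] == field_name for field in fields)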
Starting with version 1.0.0, `wandb` will be using Semantic Versioning. The major version of the library will be incremented for all backwards-incompatible changes, including dropping support for older Python versions.
Features currently marked as deprecated will be removed in the next major version (1.0.0).
To mark a feature as deprecated (and to be removed in the next major release), please follow these steps:
- Add a new field to the `Deprecated` message definition in `wandb/proto/wandb_telemetry.proto`, which will be used to track usage of the to-be-deprecated feature.
- Rebuild the protocol buffers and re-generate `wandb/proto/wandb_deprecated.py` by running `make proto`.
- Finally, to mark a feature as deprecated, call `wandb.sdk.lib.deprecate` in your code:
from wandb.sdk.lib import deprecate
deprecate.deprecate(
field_name=deprecate.Deprecated.deprecated_field_name, # new_field_name from step 1
warning_message="This feature is deprecated and will be removed in a future release.",
)
All URLs displayed to the user should be added to `wandb/sdk/lib/wburls.py`. This helps ensure that URLs do not lead to broken links.
Once you add the URL to that file, you will need to run:
python tools/generate-tool.py --generate