support metadata files as CLI arg supplier #81

Merged
merged 4 commits into from
Oct 27, 2024
98 changes: 88 additions & 10 deletions README.md
@@ -45,11 +45,11 @@ like [Numpy](https://numpy.org) and [Pillow](https://pypi.org/project/pillow/) f


### 💿 Installation
For users, the easiest way to install MFLUX is to use `uv tool`: If you have [installed `uv`](https://github.com/astral-sh/uv?tab=readme-ov-file#installation), simply:

```sh
uv tool install --upgrade mflux
```

to get the `mflux-generate` and related command line executables. You can skip to the usage guides below.

@@ -80,7 +80,7 @@ pip install -U mflux
```sh
make install
```
3. To run the test suite
```sh
make test
```
@@ -152,6 +152,76 @@ mflux-generate --model dev --prompt "Luxury food photograph" --steps 25 --seed 2

- **`--controlnet-save-canny`** (optional, bool, default: False): If set, saves the Canny edge detection reference image used by ControlNet.

- **`--config-from-metadata`** or **`-C`** (optional, `str`): [EXPERIMENTAL] Path to a prior file saved via `--metadata`, or a compatible handcrafted config file adhering to the expected args schema.
Collaborator Author

@anthonywu anthonywu Oct 19, 2024

Discussion in #80 reminded me that we also export exif data to image files, which opens the possibility of passing an image path here too; an image passed this way could then implicitly act as --init-image or --controlnet-image-path!

Owner

Nice idea, I think that would work!

Collaborator Author

instead of adding Issues noise, I'll mark some todos for upcoming PRs inline here

  • add --config-from-image option where the Image is --init-image while the exif metadata is the run config


<details>
<summary>parameters supported by config files</summary>

#### How configs are used

- all config properties are optional and applied to the image generation if applicable
- invalid or incompatible properties will be ignored

#### Config schema

```json
{
"$schema": "http://json-schema.org/draft-07/schema#",
"type": "object",
"properties": {
"seed": {
"type": ["integer", "null"]
},
"steps": {
"type": ["integer", "null"]
},
"guidance": {
"type": ["number", "null"]
},
"quantize": {
"type": ["null", "string"]
},
"lora_paths": {
"type": ["array", "null"],
"items": {
"type": "string"
}
},
"lora_scales": {
"type": ["array", "null"],
"items": {
"type": "number"
}
},
"prompt": {
"type": ["string", "null"]
}
Comment on lines +173 to +198
Collaborator Author

for now just the basic image gen + lora args; withholding the controlnet implementation because there are plans to dedupe/consolidate the two classes, which would cause changes in this PR

Owner

Great that you also added the schema here! Makes sense - let's start with this and update the schema after that refactoring

Collaborator Author

todo:

  • add support for --config-from-metadata in controlnet workflow

}
}
```

#### Example

```json
{
"model": "dev",
"seed": 42,
"steps": 8,
"guidance": 3.0,
"quantize": 4,
"lora_paths": [
"/some/path1/to/subject.safetensors",
"/some/path2/to/style.safetensors"
],
"lora_scales": [
0.8,
0.4
],
"prompt": "award winning modern art, MOMA"
}
```
</details>
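
As a rough illustration of the "invalid or incompatible properties will be ignored" rule above, here is a minimal sketch of how a config file could be read and filtered. It is not the code in this PR: `SUPPORTED_KEYS` and `load_config` are hypothetical names, and the key set is simply copied from the schema (the `model` key from the example is not in the schema and is left out here).

```python
import json
import pathlib
import tempfile

# Hypothetical key set, copied from the config schema above.
SUPPORTED_KEYS = {"seed", "steps", "guidance", "quantize", "lora_paths", "lora_scales", "prompt"}


def load_config(path):
    """Read a JSON config and keep only the properties listed in the schema."""
    raw = json.loads(pathlib.Path(path).read_text())
    # Unknown properties and null values are dropped instead of raising an error.
    return {k: v for k, v in raw.items() if k in SUPPORTED_KEYS and v is not None}


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        config_path = pathlib.Path(tmp) / "image.json"
        config_path.write_text(json.dumps({"seed": 42, "steps": 8, "guidance": None, "unknown": 1}))
        print(load_config(config_path))
```

How the surviving values are combined with explicitly passed command line arguments is a separate decision that this sketch does not attempt to model.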

Or, with the correct python environment active, create and run a separate script like the following:

```python
@@ -304,7 +374,7 @@ mflux-save \

*Note that when saving a quantized version, you will need the original huggingface weights.*

It is also possible to specify [LoRA](#-lora) adapters when saving the model, e.g.

```sh
mflux-save \
@@ -453,7 +523,7 @@ To report additional formats, examples or other any suggestions related to LoRA
### 🕹️ Controlnet

MFLUX has [Controlnet](https://huggingface.co/docs/diffusers/en/using-diffusers/controlnet) support for an even more fine-grained control
of the image generation. By providing a reference image via `--controlnet-image-path` and a strength parameter via `--controlnet-strength`, you can guide the generation toward the reference image.

```sh
mflux-generate-controlnet \
@@ -474,10 +544,10 @@ mflux-generate-controlnet \
*This example combines the controlnet reference image with the LoRA [Dark Comic Flux](https://civitai.com/models/742916/dark-comic-flux)*.

⚠️ *Note: Controlnet requires an additional one-time download of ~3.58GB of weights from Huggingface. This happens automatically the first time you run the `generate-controlnet` command.
At the moment, the Controlnet used is [InstantX/FLUX.1-dev-Controlnet-Canny](https://huggingface.co/InstantX/FLUX.1-dev-Controlnet-Canny), which was trained for the `dev` model.
It can work well with `schnell`, but performance is not guaranteed.*

⚠️ *Note: The output can be highly sensitive to the controlnet strength and is very much dependent on the reference image.
Too high settings will corrupt the image. A recommended starting point is a value like 0.4; from there, experiment with the strength.*


@@ -492,7 +562,15 @@ with different prompts and LoRA adapters active.
- Negative prompts not supported.
- LoRA weights are only supported for the transformer part of the network.
- Some LoRA adapters do not work.
- Currently, the supported controlnet is the [canny-only version](https://huggingface.co/InstantX/FLUX.1-dev-Controlnet-Canny).

### Workflow Tips

- To hide the model fetching status progress bars, `export HF_HUB_DISABLE_PROGRESS_BARS=1`
- Use config files to save complex job parameters in a file instead of passing many `--args`
- Set up shell aliases for required args, for example:
  - shortcut for dev model: `alias mflux-dev='mflux-generate --model dev'`
  - shortcut for schnell model *and* always save metadata: `alias mflux-schnell='mflux-generate --model schnell --metadata'`

### ✅ TODO

@@ -505,4 +583,4 @@ with different prompts and LoRA adapters active.

### License

This project is licensed under the [MIT License](LICENSE).
7 changes: 4 additions & 3 deletions pyproject.toml
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"

[project]
name = "mflux"
version = "0.3.0"
version = "0.4.0"
description = "A MLX port of FLUX based on the Huggingface Diffusers implementation."
readme = "README.md"
keywords = ["diffusers", "flux", "mlx"]
@@ -37,7 +37,8 @@ classifiers = [

[project.optional-dependencies]
dev = [
"pytest>=8.0.0,<9.0"
"pytest>=8.3.0,<9.0",
"pytest-timer>=1.0,<2.0",
]

[project.urls]
@@ -102,7 +103,7 @@ docstring-code-line-length = "dynamic"
[tool.pytest.ini_options]
testpaths = ["tests"]
python_files = "test_*.py"
addopts = "-v"
addopts = "-v --exitfirst --failed-first --showlocals --tb=long --full-trace"

# https://docs.astral.sh/ruff/settings/#lintisort
[tool.ruff.lint.isort]
8 changes: 8 additions & 0 deletions src/mflux/config/runtime_config.py
@@ -39,6 +39,14 @@ def precision(self) -> mx.Dtype:
def num_train_steps(self) -> int:
return self.model_config.num_train_steps

@property
def init_image_path(self) -> str:
return self.config.init_image_path

@property
def init_image_strength(self) -> float:
return self.config.init_image_strength

@property
def init_time_step(self) -> int:
if self.config.init_image_path is None:
2 changes: 2 additions & 0 deletions src/mflux/flux/flux.py
@@ -139,6 +139,8 @@ def generate_image(
generation_time=time_steps.format_dict["elapsed"],
lora_paths=self.lora_paths,
lora_scales=self.lora_scales,
init_image_path=config.init_image_path,
init_image_strength=config.init_image_strength,
config=config,
)

6 changes: 3 additions & 3 deletions src/mflux/generate.py
@@ -8,9 +8,9 @@
def main():
# fmt: off
parser = CommandLineParser(description="Generate an image based on a prompt.")
parser.add_model_arguments()
parser.add_model_arguments(require_model_arg=False)
parser.add_lora_arguments()
parser.add_image_generator_arguments()
parser.add_image_generator_arguments(supports_metadata_config=True)
parser.add_image_to_image_arguments(required=False)
parser.add_output_arguments()
args = parser.parse_args()
@@ -36,7 +36,7 @@ def main():
width=args.width,
guidance=args.guidance,
init_image_path=args.init_image_path,
init_image_strength=args.init_image_strength
init_image_strength=args.init_image_strength,
),
)

4 changes: 2 additions & 2 deletions src/mflux/generate_controlnet.py
@@ -7,9 +7,9 @@

def main():
parser = CommandLineParser(description="Generate an image based on a prompt and a controlnet reference image.") # fmt: off
parser.add_model_arguments()
parser.add_model_arguments(require_model_arg=True)
parser.add_lora_arguments()
parser.add_image_generator_arguments()
parser.add_image_generator_arguments(supports_metadata_config=False)
parser.add_controlnet_arguments()
parser.add_output_arguments()
args = parser.parse_args()
35 changes: 23 additions & 12 deletions src/mflux/post_processing/generated_image.py
@@ -22,8 +22,10 @@ def __init__(
generation_time: float,
lora_paths: list[str],
lora_scales: list[float],
controlnet_image_path: str | None = None,
controlnet_image_path: str | pathlib.Path | None = None,
Collaborator Author

one of the nice things about argparse is that you can set `type=Path` and the value is wrapped and validated for us at entry; this annotation allows for that future update to `type=Path`
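
For readers unfamiliar with the `type=Path` idea mentioned in this comment, here is a minimal standalone sketch using plain `argparse`. It does not reproduce mflux's `CommandLineParser`; only the flag name is taken from the README. Note that `type=Path` by itself only converts the string to a `pathlib.Path`; checking that the file exists would need a custom `type` callable.

```python
import argparse
from pathlib import Path

parser = argparse.ArgumentParser()
# type=Path converts the raw string into a pathlib.Path at parse time.
parser.add_argument("--controlnet-image-path", type=Path, default=None)

# An explicit argument list keeps the example independent of sys.argv.
args = parser.parse_args(["--controlnet-image-path", "reference.png"])
print(type(args.controlnet_image_path).__name__, args.controlnet_image_path.suffix)
```

With this in place, downstream code can receive either `str` or `pathlib.Path`, which is what the widened type annotation anticipates.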

controlnet_strength: float | None = None,
init_image_path: str | pathlib.Path | None = None,
init_image_strength: float | None = None,
):
self.image = image
self.model_config = model_config
@@ -36,29 +38,38 @@ def __init__(
self.generation_time = generation_time
self.lora_paths = lora_paths
self.lora_scales = lora_scales
self.controlnet_image = controlnet_image_path
self.controlnet_image_path = controlnet_image_path
self.controlnet_strength = controlnet_strength
self.init_image_path = init_image_path
self.init_image_strength = init_image_strength

def save(self, path: t.Union[str, pathlib.Path], export_json_metadata: bool = False) -> None:
from mflux import ImageUtil

ImageUtil.save_image(self.image, path, self._get_metadata(), export_json_metadata)

def _get_metadata(self) -> dict:
"""Generate metadata for reference as well as input data for
command line --config-from-metadata arg in future generations.
"""
return {
# mflux_version is used by future metadata readers
# to determine supportability of metadata-derived workflows
"mflux_version": str(GeneratedImage.get_version()),
"model": str(self.model_config.alias),
"seed": str(self.seed),
"steps": str(self.steps),
"guidance": "None" if self.model_config == ModelConfig.FLUX1_SCHNELL else str(self.guidance),
"precision": f"{self.precision}",
"quantization": "None" if self.quantization is None else f"{self.quantization} bit",
"generation_time": f"{self.generation_time:.2f} seconds",
"lora_paths": ", ".join(self.lora_paths) if self.lora_paths else "None",
"lora_scales": ", ".join([f"{scale:.2f}" for scale in self.lora_scales]) if self.lora_scales else "None",
"seed": self.seed,
"steps": self.steps,
"guidance": self.guidance if self.model_config == ModelConfig.FLUX1_DEV else None, # only the dev model supports guidance
Collaborator Author

I noticed the prior implementation and accepted the logic as is, but want to double check this is true: only the dev model supports guidance

Owner

Yes, only the dev model supports guidance. Right now, we pass the guidance value down to the lower level classes (regardless of model choice), but at some point deeper down, we do a check like the following:

self.guidance_embedder = GuidanceEmbedder() if model_config == ModelConfig.FLUX1_DEV else None

and the guidance value is subsequently only used in case of "dev"

Collaborator Author

blocker/todo for #84
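
The guidance check discussed in this thread can be sketched in isolation as follows. The `ModelConfig` enum below is a hypothetical stand-in, not mflux's real class, and `guidance_for_metadata` is an illustrative helper name. The point it shows is that the model must be compared explicitly, since a bare enum member is always truthy.

```python
from enum import Enum


class ModelConfig(Enum):
    # Hypothetical stand-in for mflux's ModelConfig, for illustration only.
    FLUX1_DEV = "dev"
    FLUX1_SCHNELL = "schnell"


def guidance_for_metadata(model_config, guidance):
    """Return the guidance value only when the model is dev."""
    # Compare explicitly: `if ModelConfig.FLUX1_DEV` alone would always be true.
    return guidance if model_config == ModelConfig.FLUX1_DEV else None


if __name__ == "__main__":
    print(guidance_for_metadata(ModelConfig.FLUX1_DEV, 3.5))
    print(guidance_for_metadata(ModelConfig.FLUX1_SCHNELL, 3.5))
```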

"precision": str(self.precision),
"quantize": self.quantization,
"generation_time_seconds": round(self.generation_time, 2),
"lora_paths": [str(p) for p in self.lora_paths] if self.lora_paths else None,
"lora_scales": [round(scale, 2) for scale in self.lora_scales] if self.lora_scales else None,
"init_image_path": str(self.init_image_path) if self.init_image_path else None,
"init_image_strength": self.init_image_strength if self.init_image_path else None,
"controlnet_image_path": str(self.controlnet_image_path) if self.controlnet_image_path else None,
"controlnet_strength": round(self.controlnet_strength, 2) if self.controlnet_strength else None,
"prompt": self.prompt,
"controlnet_image": "None" if self.controlnet_image is None else self.controlnet_image,
"controlnet_strength": "None" if self.controlnet_strength is None else f"{self.controlnet_strength:.2f}",
}
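
A small sketch of why the switch from stringified values to native types matters for reuse as a config: native values survive a JSON round trip unchanged. The dictionary below uses made-up values in the shape of the new `_get_metadata()` result; it is not output from mflux.

```python
import json

# Hypothetical values shaped like the new metadata dictionary.
metadata = {
    "seed": 42,
    "steps": 8,
    "guidance": None,
    "lora_paths": ["/some/path1/to/subject.safetensors"],
    "lora_scales": [0.8],
}

text = json.dumps(metadata)
restored = json.loads(text)
# Integers stay integers and None maps to JSON null and back,
# so the restored values can be used directly as arguments.
print(restored["steps"] + 1, restored["guidance"])
```

With the previous format, `"steps": "8"` and `"guidance": "None"` would have needed parsing before they could be fed back into a generation run.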

@staticmethod
4 changes: 4 additions & 0 deletions src/mflux/post_processing/image_util.py
@@ -28,6 +28,8 @@ def to_image(
lora_scales: list[float],
config: RuntimeConfig,
controlnet_image_path: str | None = None,
init_image_path: str | None = None,
init_image_strength: float | None = None,
) -> GeneratedImage:
normalized = ImageUtil._denormalize(decoded_latents)
normalized_numpy = ImageUtil._to_numpy(normalized)
@@ -44,6 +46,8 @@ def to_image(
generation_time=generation_time,
lora_paths=lora_paths,
lora_scales=lora_scales,
init_image_path=init_image_path,
init_image_strength=init_image_strength,
controlnet_image_path=controlnet_image_path,
controlnet_strength=config.controlnet_strength if isinstance(config.config, ConfigControlnet) else None,
)
2 changes: 1 addition & 1 deletion src/mflux/save.py
@@ -4,7 +4,7 @@

def main():
parser = CommandLineParser(description="Save a quantized version of Flux.1 to disk.") # fmt: off
parser.add_model_arguments()
parser.add_model_arguments(path_type="save", require_model_arg=True)
parser.add_lora_arguments()
args = parser.parse_args()
