Multi-input pipelines don't construct properly #121

hobu · 2022-07-22T21:45:12Z

@gsakkis Complex pipelines with branches in them don't seem to construct properly with the bindings. Either my usage is incorrect, or the bindings don't support branched pipelines as currently written. Can you take a look?

Expected output


Before => reproject_before \ 
                             => merge
After => reproject_after   /

Produced output

But I end up with

Before => reproject_before \
                            => reproject_after => merge
After  =>                  /

Example code


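import pdal

# Two independent readers, one per input file
before = pdal.Reader.las("before.las")
after = pdal.Reader.las("after.las")

# One reprojection filter per branch
reproject_before = pdal.Filter.reprojection(out_srs="EPSG:26915")
reproject_after = pdal.Filter.reprojection(out_srs="EPSG:26915")

# Intended: each branch is reprojected separately, then both are merged
merge = ((before | reproject_before) | (after | reproject_after)) | pdal.Filter.merge()

print(merge.pipeline)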

Example output

{
  "pipeline":
  [
    {
      "filename": "before.las",
      "tag": "readers_las1",
      "type": "readers.las"
    },
    {
      "inputs":
      [
        "readers_las1"
      ],
      "out_srs": "EPSG:26915",
      "tag": "filters_reprojection1",
      "type": "filters.reprojection"
    },
    {
      "filename": "after.las",
      "tag": "readers_las2",
      "type": "readers.las"
    },
    {
      "inputs":
      [
        "filters_reprojection1",
        "readers_las2"
      ],
      "out_srs": "EPSG:26915",
      "tag": "filters_reprojection2",
      "type": "filters.reprojection"
    },
    {
      "inputs":
      [
        "filters_reprojection2"
      ],
      "tag": "filters_merge1",
      "type": "filters.merge"
    }
  ]
}


gsakkis · 2022-07-25T14:14:46Z

@hobu I took a look; some notes:

1. Currently, all the pdal-python pipe operator does is concatenate the stages of the piped stages/pipelines. That is, (before | reproject_before) | (after | reproject_after) is equivalent to before | reproject_before | after | reproject_after. Do you expect these to have different semantics in general, and if so, how?
2. A side effect of doing plain stage concatenation and nothing else is that pdal-python does not implicitly set the inputs of any stage; it relies on pdal-core to determine the inputs according to the algorithm described here. This is why the inputs of reproject_after are determined as ["filters_reprojection1", "readers_las2"].
3. Regardless of the previous points, a pipeline such as p1 | p2 | pdal.Filter.merge() (where p1 = before | reproject_before and p2 = after | reproject_after) does not (and cannot) infer that the inputs of merge are p1 and p2. The only reasonable semantics of this pipeline is that the input of p2 is p1 and the input of merge is p2.
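To make the points above concrete, here is a minimal pure-Python sketch of the behaviour described. It does not use pdal-python: the Stage and Pipeline classes and infer_inputs are hypothetical stand-ins, and the inference rule (a non-reader stage without explicit inputs consumes every earlier stage that nothing else has consumed yet) is an assumption inferred from the example output in this issue, not a copy of the pdal-core algorithm.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Stage:
    """Toy stand-in for a PDAL stage (not the pdal-python class)."""
    tag: str
    is_reader: bool = False
    inputs: Optional[List[str]] = None  # None means "let the core decide"

    def __or__(self, other):
        return Pipeline([self]) | other


@dataclass
class Pipeline:
    stages: List[Stage] = field(default_factory=list)

    def __or__(self, other):
        # Pure concatenation: any grouping on either side is flattened away.
        extra = other.stages if isinstance(other, Pipeline) else [other]
        return Pipeline(self.stages + extra)


def infer_inputs(pipeline: Pipeline) -> Dict[str, List[str]]:
    """Assumed rule: a non-reader without explicit inputs consumes every
    earlier stage that no other stage has consumed yet."""
    unconsumed: List[str] = []
    resolved: Dict[str, List[str]] = {}
    for stage in pipeline.stages:
        if stage.is_reader:
            inputs = []
        elif stage.inputs is not None:
            inputs = list(stage.inputs)
        else:
            inputs = list(unconsumed)
        resolved[stage.tag] = inputs
        # Whatever this stage consumed is no longer available downstream.
        unconsumed = [t for t in unconsumed if t not in inputs]
        unconsumed.append(stage.tag)
    return resolved


before = Stage("readers_las1", is_reader=True)
after = Stage("readers_las2", is_reader=True)
reproject_before = Stage("filters_reprojection1")
reproject_after = Stage("filters_reprojection2")
merge = Stage("filters_merge1")

grouped = ((before | reproject_before) | (after | reproject_after)) | merge
flat = before | reproject_before | after | reproject_after | merge

for tag, inputs in infer_inputs(grouped).items():
    print(tag, inputs)
```

In this model the grouped and flat pipelines hold the same stage list, and the second reprojection picks up both the first reprojection and the second reader, which matches the inputs reported in the example output.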

I can think of two ways to support multiple inputs:
- Specify them explicitly:
  p1 | p2 | pdal.Filter.merge(inputs=["reproject_before", "reproject_after"])
  This should already work today, but it doesn't (due to (1) and/or (2)).
- Introduce a new syntax to support multiple inputs, for example allow the left hand side to be a list/tuple:
  (p1, p2) | pdal.Filter.merge()
  This needs some more thought and could be addressed as a separate feature.
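As a rough illustration of the second option, the sketch below shows how a tuple on the left-hand side could be supported through Python's reflected operator `__ror__`. Everything here is hypothetical: the Stage and Pipeline classes are toy stand-ins rather than the pdal-python ones, and the choice to pin each branch's internal inputs linearly (each stage reads from the one before it, with the first stage assumed to be a reader) is one possible design, not an agreed one.

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple


def _pin_linear(pipe):
    """Give every stage after the first an explicit input: its predecessor."""
    pinned = [pipe.stages[0]]  # first stage is assumed to be a reader
    for prev, stage in zip(pipe.stages, pipe.stages[1:]):
        if stage.inputs is None:
            stage = replace(stage, inputs=(prev.tag,))
        pinned.append(stage)
    return tuple(pinned)


@dataclass(frozen=True)
class Stage:
    """Toy stage; not the pdal-python class."""
    tag: str
    inputs: Optional[Tuple[str, ...]] = None

    def __or__(self, other):
        return Pipeline((self,)) | other

    def __ror__(self, branches):
        # Reached for `(p1, p2) | stage`: tuples and lists define no `|`
        # with a Stage, so Python falls back to Stage.__ror__.
        if not isinstance(branches, (list, tuple)):
            return NotImplemented
        pipes = [b if isinstance(b, Pipeline) else Pipeline((b,)) for b in branches]
        tails = tuple(p.stages[-1].tag for p in pipes)
        stages = tuple(s for p in pipes for s in _pin_linear(p))
        # The receiving stage reads from the last stage of every branch.
        return Pipeline(stages + (replace(self, inputs=tails),))


@dataclass(frozen=True)
class Pipeline:
    stages: Tuple[Stage, ...]

    def __or__(self, other):
        extra = other.stages if isinstance(other, Pipeline) else (other,)
        return Pipeline(self.stages + extra)


p1 = Stage("readers_las1") | Stage("filters_reprojection1")
p2 = Stage("readers_las2") | Stage("filters_reprojection2")
merged = (p1, p2) | Stage("filters_merge1")

for stage in merged.stages:
    print(stage.tag, stage.inputs)
```

Pinning the inputs inside each branch matters here: setting only the merge stage's inputs would still leave the second reprojection open to the implicit inference described in (2).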

hobu · 2022-07-26T20:58:55Z

The second syntax looks easier to read, but can it actually be made to work as we expect?

I guess I would be happy with the first proposed syntax working even if it is verbose to write.

gsakkis · 2022-07-27T12:29:11Z

Regardless of the syntax for multiple inputs, the first thing to decide is point (1) above: should the pipe operator be (left-)associative or not? In other words, is

[a]    (reader1 | filter1) | (reader2 | filter2)

equivalent to

[b]    (((reader1 | filter1) | reader2) | filter2)

(where [b] can be simplified to reader1 | filter1 | reader2 | filter2)?

If yes, what are the implied inputs of filter2? According to the pdal core algorithm (which pdal-python inherits), they are [filter1, reader2].
If not, do we want the implied inputs of filter2 to be [reader2] in [a] and [filter1, reader2] in [b]?
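For the "if not" case, here is a toy sketch of one way a non-associative pipe operator could behave. It is only an illustration under assumptions: the classes are hypothetical (not pdal-python), the inference rule is the same guess at the core behaviour used to explain the example output (a non-reader without explicit inputs consumes all earlier unconsumed stages), and the design choice is that a pipeline appearing as the right-hand operand is treated as a finished branch whose inputs are resolved before concatenation.

```python
from typing import Dict, List, Optional


class Stage:
    """Toy stage (hypothetical; not the pdal-python class)."""

    def __init__(self, tag: str, is_reader: bool = False,
                 inputs: Optional[List[str]] = None):
        self.tag, self.is_reader, self.inputs = tag, is_reader, inputs

    def __or__(self, other):
        return Pipeline([self]) | other


class Pipeline:
    def __init__(self, stages):
        self.stages = list(stages)

    def __or__(self, other):
        if isinstance(other, Pipeline):
            # A parenthesised right operand is a finished branch: resolve its
            # inputs now so later concatenation cannot change them.
            resolved = infer_inputs(other.stages)
            extra = [Stage(s.tag, s.is_reader, resolved[s.tag]) for s in other.stages]
        else:
            extra = [other]  # single stage: leave its inputs for later inference
        return Pipeline(self.stages + extra)


def infer_inputs(stages) -> Dict[str, List[str]]:
    """Assumed core rule: a non-reader without explicit inputs consumes
    every earlier stage that nothing else has consumed yet."""
    unconsumed: List[str] = []
    resolved: Dict[str, List[str]] = {}
    for stage in stages:
        if stage.is_reader:
            inputs = []
        elif stage.inputs is not None:
            inputs = list(stage.inputs)
        else:
            inputs = list(unconsumed)
        resolved[stage.tag] = inputs
        unconsumed = [t for t in unconsumed if t not in inputs] + [stage.tag]
    return resolved


reader1, reader2 = Stage("reader1", is_reader=True), Stage("reader2", is_reader=True)
filter1, filter2 = Stage("filter1"), Stage("filter2")

a = (reader1 | filter1) | (reader2 | filter2)
b = (((reader1 | filter1) | reader2) | filter2)

print("[a] filter2 inputs:", infer_inputs(a.stages)["filter2"])
print("[b] filter2 inputs:", infer_inputs(b.stages)["filter2"])
```

Under these assumptions [a] and [b] are no longer equivalent: filter2 reads only from reader2 in [a], and from both filter1 and reader2 in [b].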
