[MPPI] Zero velocity used after Optimizer reset #4545
Comments
@BriceRenaudeau I think you can use the controller server's API for changing the speed if that's the behavior you want: https://github.com/ros-navigation/navigation2/blob/main/nav2_controller/src/controller_server.cpp#L241-L243 -- I think that's the more appropriate location. The dynamic parameters for MPPI contain a number of things that require a reset, because the size of the tensors can change, or the dynamics and limits adjust in ways the existing optimal trajectory may unsafely violate. I suppose we could pick and choose which cause a reset, but that would require a little bit of work since all parameter updates currently share the same post-callback that performs the reset. We'd need to either make the post-callback parameter-specific or give the post-update callback the names of the updated parameters to make a judgement about whether we execute the reset or not. This wouldn't be terribly difficult, but I think the …
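For illustration only, a minimal sketch (not the actual MPPI ParametersHandler) of how a set-parameters callback could receive the changed parameter names and flag a reset only for the ones that need it; the class, the flag, and the FollowPath.* parameter list below are assumptions for the example:

```cpp
#include <string>
#include <unordered_set>
#include <vector>

#include "rcl_interfaces/msg/set_parameters_result.hpp"
#include "rclcpp/rclcpp.hpp"

// Sketch: only flag a reset when a parameter that actually requires one changes.
class SelectiveResetHandler
{
public:
  explicit SelectiveResetHandler(const rclcpp::Node::SharedPtr & node)
  : node_(node)
  {
    // Standard rclcpp set-parameters callback, used here as a stand-in for
    // MPPI's post-update callback so the changed names can be inspected.
    callback_handle_ = node_->add_on_set_parameters_callback(
      [this](const std::vector<rclcpp::Parameter> & params) {
        rcl_interfaces::msg::SetParametersResult result;
        result.successful = true;
        for (const auto & p : params) {
          // Only parameters that resize tensors or change the meaning of the
          // stored control sequence should force a full Optimizer::reset().
          if (reset_requiring_params_.count(p.get_name()) > 0) {
            reset_required_ = true;
          }
        }
        return result;
      });
  }

  bool resetRequired() const {return reset_required_;}
  void clearResetFlag() {reset_required_ = false;}

private:
  rclcpp::Node::SharedPtr node_;
  rclcpp::node_interfaces::OnSetParametersCallbackHandle::SharedPtr callback_handle_;
  // Hypothetical list: which parameters really require a reset would need to
  // be decided per parameter in the plugin itself.
  std::unordered_set<std::string> reset_requiring_params_{
    "FollowPath.batch_size", "FollowPath.time_steps", "FollowPath.model_dt"};
  bool reset_required_{false};
};
```

The same name check could equally live in a parameter-specific post-callback; the point is only that the reset decision needs to know which parameters were updated.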
The problem with changing core parameters like … Some of the key parameters like …
As mentioned above, the speed changes that are well managed by the controller server still create behavior changes. To keep the same behavior I need to change these parameters. One solution could be to use the odom velocity after a reset() to avoid recomputing the acceleration slope?
Maybe make the Optimizer::reset() only reset the state when the tensor sizes change:

```cpp
void Optimizer::reset()
{
  // Only reallocate and zero the state if batch_size or time_steps changed
  if (state_.vx.shape(0) != settings_.batch_size ||
    state_.vx.shape(1) != settings_.time_steps)
  {
    state_.reset(settings_.batch_size, settings_.time_steps);
    control_sequence_.reset(settings_.time_steps);
    control_history_[0] = {0.0, 0.0, 0.0};
    control_history_[1] = {0.0, 0.0, 0.0};
    control_history_[2] = {0.0, 0.0, 0.0};
    control_history_[3] = {0.0, 0.0, 0.0};
    costs_ = xt::zeros<float>({settings_.batch_size});
    generated_trajectories_.reset(settings_.batch_size, settings_.time_steps);
  }
  settings_.constraints = settings_.base_constraints;
  noise_generator_.reset(settings_, isHolonomic());
  RCLCPP_INFO(logger_, "Optimizer reset");
}
```
I just don't think that'll work. This isn't something like a DWA technique that has a trivial future-predicted trajectory that can completely shift in the next iteration. Each trajectory sample of MPPI can be unique and is converged into the optimal solution by the previous solution and the cost functions. Initializing it to something like 'go current speed' for all samples will almost certainly cause you to fall into local minima each time you make a parameter change, since the current timestep's velocity is likely not the right optimal solution for timestep …

I don't think it's not worth a try, but I suspect that you'll get behaviors you don't like out of it, or the optimal trajectory after that initial seeding will be frequently in collision with the environment for longer time horizons, triggering a …
Is batch size the only thing you want to change on the fly now that you have the other API for max speed through the controller server?
Changing …
Yes, but only a small horizon (the 4-entry control_history_) of the previous controls is used for the next one, and at 50 Hz it matters less to keep the now-inconsistent previous controls than to start back from 0 (which creates a big discontinuity in the command, cf. graph). Right now it seems too delicate to modify this and it may stay a personal preference.
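As a toy illustration of that small window (assuming a simple average rather than whatever filtering the controller actually applies, and a hypothetical function name), a 4-entry history only smooths the next command:

```cpp
#include <array>

// Toy example: combine the last 4 published commands with the new one.
// The real controller applies its own filtering; this only illustrates
// the size of the window being discussed.
float smoothNextCommand(const std::array<float, 4> & control_history, float new_command)
{
  float sum = new_command;
  for (float past : control_history) {
    sum += past;
  }
  return sum / 5.0f;  // average over history + new command
}
```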
shift control sequence rolls the trajectories so that the …
Sure, but it takes many iterations for the further-away parts of the trajectory to converge to the solution, so that by the time we get to the local time horizon for execution we have a sense of what we want to do and it hasn't fallen into a local minimum. It's also important that the trajectory understands the consequences of the local time horizon's commands on the long-term objective, given the full time horizon. When you first start up from a stop, note how many iterations it takes to get your robot up to full speed with the trajectory samples converged to an optimal solution. It's > 20 iterations to start to converge into the optimal solution space. Now, imagine you seeded that trajectory with an assumption about the velocity the entire trajectory should take from the start. I see 3 possible outcomes: …
Generally speaking, there are odds on each of those 3 situations for any given robot in an application, depending on when/where/in what current state that parameter update is made. One would still cause a …

In thinking about this, the one thing I could see potentially working out is keeping the next ~few iterations' velocities intact (as long as you're not changing model_dt), resetting the others, and keeping the currently executing trajectory alive. That gives the system some time to re-converge from scratch after the parameter update while keeping the robot in motion. However, if you change a parameter that changes the meaning of the velocity trajectory (model_dt), I don't see how we could exactly mimic that with a new value of model_dt. We could try to make it similar?
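A rough sketch of that "keep the next few iterations intact" idea, assuming a 1-D per-axis control sequence and a hypothetical softReset helper (this is not an implemented nav2 feature):

```cpp
#include <cstddef>

#include <xtensor/xbuilder.hpp>
#include <xtensor/xtensor.hpp>
#include <xtensor/xview.hpp>

// Keep the first `keep_steps` converged commands and zero out the tail so
// the optimizer re-converges it while the robot keeps executing the
// near-term plan.
void softReset(xt::xtensor<float, 1> & control_sequence_vx, std::size_t keep_steps)
{
  const std::size_t n = control_sequence_vx.shape(0);
  if (keep_steps >= n) {
    return;  // nothing to clear
  }
  xt::view(control_sequence_vx, xt::range(keep_steps, n)) =
    xt::zeros<float>({n - keep_steps});
}
```

The kept prefix supplies the near-term commands while the zeroed tail re-converges, which is the trade-off described above; it only makes sense as long as model_dt is unchanged, since the entries would otherwise change meaning.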
Bug report
The MPPI controller outputs a near-zero cmd_vel after a dynamic parameter change.
It creates strange behavior where the robot slows down when changing its speed limit.
This is because the algorithm uses a zero-velocity state after the Optimizer reset.
Required Info:
Steps to reproduce issue
While the robot is following a path, dynamically change a controller parameter, e.g. FollowPath.wz_std.
Expected behavior
The robot doesn't slow down.
Actual behavior
The robot does slow down.
Implementation considerations
The optimizer may use the previous state or the current speed of the robot.
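For example, a hedged sketch of the "current speed of the robot" option, with illustrative (non-nav2) names, could seed the whole control sequence from odometry after a reset:

```cpp
#include <xtensor/xtensor.hpp>

// Illustrative stand-in for a per-axis control sequence.
struct ControlSequenceSketch
{
  xt::xtensor<float, 1> vx;
  xt::xtensor<float, 1> wz;
};

// Fill every future command with the currently measured twist, avoiding the
// zero-velocity seed that makes the robot decelerate after a reset.
// Usage (hypothetical): seedFromOdometry(seq, odom.twist.twist.linear.x,
//                                        odom.twist.twist.angular.z);
void seedFromOdometry(ControlSequenceSketch & seq, float odom_vx, float odom_wz)
{
  seq.vx.fill(odom_vx);
  seq.wz.fill(odom_wz);
}
```

Note the caveat raised earlier in the discussion: seeding every timestep with the current speed may push the optimizer into local minima, so this is only one of the options to evaluate.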