Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MPPI] Zero velocity used after Optimizer reset #4545

Open
BriceRenaudeau opened this issue Jul 19, 2024 · 8 comments
Open

[MPPI] Zero velocity used after Optimizer reset #4545

BriceRenaudeau opened this issue Jul 19, 2024 · 8 comments

Comments

@BriceRenaudeau
Copy link
Contributor

Bug report

The MPPI controller outputs a near zero cmd_vel after a dynamic param change.
It creates strange behavior where the robot slows down when changing its speed limit:
mppi_zero_vel_after_param_change2

This is because the algorithm uses a zero velocity state after Optimizer reset.

Required Info:

  • Operating System:
    • Ubuntu 22.04 -->
  • ROS2 Version:
    • Iron
  • Version or commit hash:
    • from source
  • DDS implementation:
    • cyclone

Steps to reproduce issue

  • Send the robot somewhere using MPPI controller
  • While it's still moving change a parameter like FollowPath.wz_std

Expected behavior

The robot doesn't slow down.

Actual behavior

The robot does slow down.

Implementation considerations

The optimizer may use the previous state or the current speed of the robot.

@SteveMacenski
Copy link
Member

@BriceRenaudeau I think you can use the controller server's API for changing the speed if that's the behavior you want: https://github.com/ros-navigation/navigation2/blob/main/nav2_controller/src/controller_server.cpp#L241-L243 -- I think that's more the appropriate location.

The dynamic parameters for MPPI contain a number of things that require a reset because the size of tensors can change or dynamics and limits adjust which the existing optimal trajectory may unsafely violate. I suppose we could pick and choose which cause a reset, but that would require a little bit of work since all parameter updates currently have the same post-callback that performs the reset. We'd need to either make the post-callback parameter-specific or give the post-update callback the name of the parameter updates to make a judgement about whether we execute the reset or not. This wouldn't be terribly difficult, but I think the nav2_controller method is more what you're looking for :-)

@BriceRenaudeau
Copy link
Contributor Author

Thanks for the quick reply.
My issue is not about speed limitation but trajectories configuration like:

  • FollowPath.wz_std
  • FollowPath.model_dt
  • FollowPath.time_steps

I know it's not an easy topic.
My ultimate goal is to get a fixed behavior of MPPI trajectories computation whatever the speed is.
The trajectories horizon is directly related to the max_speed. So reducing the max speed reduces the horizon of the trajectories and the obstacles are avoided much later or worse the MPPI falls in a local minima and no valid trajectories are found.
It's the same issue with the spreading horizon.
mppi_dt_0 05
mppi_dt_0 075

To get this fixed horizon I dynamically change the model_dt. It's not a perfect solution as it doesn't allow shift_control_sequence. But it's better than changing time_steps and having much higher CPU consumption.

@SteveMacenski
Copy link
Member

The problem with changing core parameters like model_dt is that it invalidates the previous optimal trajectory. The controls at each of the indices with a new value for model_dt mean completely different things now and it would be unsafe to start from that old-optimal trajectory which could easily now collide with the environment or otherwise have strange behavior. The problem would naturally require an internal reset() to manage the optimal trajectory being in collision and unable to execute, so its better we just do it from the outset.

Some of the key parameters like model_dt and time_steps I don't think can be changed on the fly without resetting the state back to ground zero. Would it be permissible to stop the robot between changes to the control parameters? Things like max velocities, std I think you could change live on the fly though!

@BriceRenaudeau
Copy link
Contributor Author

As mentioned above, the speed changes that are well-managed by the controller create behavior changes. To keep the same behavior I need to change these parameters.

One solution could be to use the odom velocity after a reset() to avoid recomputing the acceleration slope?

@BriceRenaudeau
Copy link
Contributor Author

BriceRenaudeau commented Jul 23, 2024

Maybe make the Optimizer::reset() only on batch_size or batch_size change. In my case, it worked but I didn't make extensive tests.

void Optimizer::reset()
{
  if (state_.vx.shape(0) != settings_.batch_size || state_.vx.shape(1) != settings_.time_steps) {
    state_.reset(settings_.batch_size, settings_.time_steps);
    control_sequence_.reset(settings_.time_steps);
    control_history_[0] = {0.0, 0.0, 0.0};
    control_history_[1] = {0.0, 0.0, 0.0};
    control_history_[2] = {0.0, 0.0, 0.0};
    control_history_[3] = {0.0, 0.0, 0.0};

    costs_ = xt::zeros<float>({settings_.batch_size});
    generated_trajectories_.reset(settings_.batch_size, settings_.batch_size);
  }

    settings_.constraints = settings_.base_constraints;
    noise_generator_.reset(settings_, isHolonomic());
    RCLCPP_INFO(logger_, "Optimizer reset");
}

@SteveMacenski
Copy link
Member

SteveMacenski commented Jul 23, 2024

One solution could be to use the odom velocity after a reset() to avoid recomputing the acceleration slope?

I just don't think that'll work. This isn't something like a DWA technique that has a trivial future-predicted trajectory that can completely shift in the next iteration. Each trajectory sample of MPPI can be unique and is converged into the optimal solution by the previous solution and the cost functions.

Initializing it to something like a 'go current speed' for all samples will almost certainly cause you to fall into local minima each time you make a parameter change since the current timestep's velocity is likely not the right optimal solution for timestep 10, 20, 50, and so on - you may never actually converge correctly into the optimal solution by doing that until fully resetting the state. A special case where that might work would be if you always change parameters when on straight-aways, but that's pretty finnicky / fragile.

I don't think its not worth a try, but I suspect that you'll get behaviors you don't like out of it or the optimal trajectory after that initial seeding will be frequently in collision with the environment for longer time horizons triggering a reset() anyway since the seeded solution is invalid to execute. Things that I think could create this issue:

  • Seeding an arbitrary velocity across the trajectory --> seeds an initial condition that is not optimization generated + likely in collision anyway for longer time horizons
  • Changing model_dt such that existing velocities in the trajectories no longer mean the same thing --> effectively warps the trajectory
  • Accelerations since clipping them would manifestly change the meaning of a trajectory if the change was non-trivially smaller than the current setting --> effectively warps the trajectory

batch_size is probably fine to change on the fly as long as other parts of the state (i.e. noise sample sizing, cost, etc) are also thoughtfully resized with it. time_steps I'm not 100% sure. I suppose that should be fine to change, but again on careful initialization of values after its resize. When we roll the trajectory forward, we seed the new furthest trajectory point with size - 1th's value, so I suppose we could do that here for however many more are added and let the optimization problem populate it. That feels very natural / doable

Is batch size the only thing you want to change on the fly now that you have the other API for max speed through the controller server?

@BriceRenaudeau
Copy link
Contributor Author

Batch_size and time_steps are the parameters I don't want to change. They required an Xtensor reset of the vector's size.

Changing model_dt forbids to use shift_control_sequence so all the trajectories are resampled every time using the current state. As you said, it must be done carefully because it changes the meaning of the other parameters.

Each trajectory sample of MPPI can be unique and is converged into the optimal solution by the previous solution and the cost functions.

Yes, but only a small horizon (4 control_history_) of the previous controls is used for the next one, and at 50Hz, it affects less to keep the non-consistent previous one than starting back to 0 (as it creates a big discontinuity in the command cf. graph)

Right now it seems too delicate to modify this and it may stay a personal preference.

@SteveMacenski
Copy link
Member

SteveMacenski commented Jul 25, 2024

shift_control_sequence so all the trajectories are resampled every time using the current state

shift control sequence rolls the trajectories so that the i+ith trajectory point is now the ith trajectory after we execute the velocity command to seed the next iteration. That's not resampling.

but only a small horizon (4 control_history_) of the previous controls is used for the next one, and at 50Hz

Sure, but it takes many iterations for further away parts of the trajectory to converge to the solution that by the time we get to the local time horizon to execution we have a sense of what we want to do and its not fallen into a local minima. Its also important that the trajectory understands the consequences in the local time horizon for execution's commands on the long-term objective given the full time horizon. When you first start up from stop, note how many iterations it takes to get your robot up to full speed and the trajectory samples converged to an optimal solution. Its > 20 iterations to start to converge into the optimal solution space.

Now, imagine you seeded that trajectory with an assumption about the velocity the entire trajectory should take from the start. I see 3 possible outcomes:

  • The velocity simply collides at some point with the environment, which would cause a reset() anyway
  • The velocity is not near the actual optimal solution and the controller is unable to find the optimal solution because its stuck in its local minima
  • The velocity happens to be a good guess near the optimal solution and thus works fine as a pre-seed

Generally speaking, there's odds on each of those 3 situations for any given robot in an application, depending when/where/current state that parameter update is made. One would still cause a reset() which is now less understandable, one results in not-great behavior of the robot, and one works out fine.


In thinking about this, the one thing I could see working out potentially is keeping the next ~few iteration's velocities in tact (as long as you're not changing model_dt), reset the others, and keep the current executing trajectory alive. That gives the system some time to re-converge from scratch after the parameter update while keeping the robot in motion. However if you change a parameter that changes the meaning of the velocity's trajectory (model_dt), I don't see how we could exactly mimick that with a new value of model_dt.

We could try to make it similar?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants