Recording primal/dual iterates #791

Open
FSchmidtDIW opened this issue Oct 14, 2024 · 18 comments · May be fixed by #806
Comments

@FSchmidtDIW

Hi Oscar,

In this working paper you show a very nice plot of how the first-stage states converge over the iterations. Is there a quick way to record the current (first-stage) choices at a given iteration?

I was going to do it like this, but it seems a little impractical (I have not run this yet):

out = Dict{Int64,Dict}()

SDDP.train(
    model;
    time_limit = 18 * 3600,
    iteration_limit = 100,
    stopping_rules = [SDDP.FirstStageStoppingRule()],
    cut_type = SDDP.SINGLE_CUT,
    forward_pass = SDDP.RegularizedForwardPass(),
    log_file = string(today()) * "sddp_serial.log",
)
k = 100
while k <= 1000
    out[k] = SDDP.simulate(model, 1, ..., [capacities], ...)
    SDDP.train(
        model;
        time_limit = 18 * 3600,
        add_to_existing_cuts = true,
        iteration_limit = 10,
        stopping_rules = [SDDP.FirstStageStoppingRule()],
        cut_type = SDDP.SINGLE_CUT,
        forward_pass = SDDP.RegularizedForwardPass(),
        log_file = string(today()) * "sddp_serial.log",
    )
    k += 10
end

Thank you,

Felix

[image: plot of first-stage state values by iteration, as in the working paper]

@odow commented Oct 14, 2024

I was going to do it like this but it seems a little impractical

This is, in fact, exactly what @jarandh did 😄

@odow commented Oct 14, 2024

It's one reason that we didn't really report computation times in the paper, because there was so much other stuff going on just so we could make this one pretty picture.

@FSchmidtDIW

Haha :) understood! Do you think it would be worth creating a CapExForwardPass that records iterates when the first stage is deterministic? Otherwise, I will see how the approach above works for me. Would evaluating a decision rule for the first stage be quicker than simulating a (dummy) trajectory?
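For concreteness, the decision-rule variant I have in mind would be something like this untested sketch (I'm assuming SDDP.DecisionRule / SDDP.evaluate are the right tools for this, and :capacity is just a placeholder for my actual first-stage state):

# Untested sketch: query the policy at the (deterministic) first-stage node
# instead of simulating a dummy trajectory. `:capacity` is a placeholder name.
rule = SDDP.DecisionRule(model; node = 1)
solution = SDDP.evaluate(rule; incoming_state = Dict(:capacity => 0.0))
solution.outgoing_state  # the first-stage choice implied by the current cuts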

@odow commented Oct 15, 2024

I just remembered that there is:

forward_pass_callback::Function = (x) -> nothing,

You could try:

trajectories = Any[]
SDDP.train(model; forward_pass_callback = trajectory -> push!(trajectories, trajectory))

@FSchmidtDIW

Works like a charm!

I used the following to obtain a DataFrame of first-stage states by iteration:

using DataFrames

capacities = Any[]
# This depends on the first node being deterministic:
SDDP.train(model; forward_pass_callback = trajectory -> push!(capacities, trajectory.sampled_states[1]))
capacity_df = reduce(vcat, [DataFrame(keys(c) .=> values(c)) for c in capacities])
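For the plotting step, a minimal sketch (assuming Plots.jl; the columns are whatever the first-stage states are called in the model) would be:

using Plots

# One line per first-stage state, plotted against the training iteration.
plt = plot(; xlabel = "Iteration", ylabel = "First-stage state value")
for col in names(capacity_df)
    plot!(plt, capacity_df[!, col]; label = col)
end
plt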

Thanks!

@FSchmidtDIW

Hi, sorry to reopen this. I'm struggling to do this in a parallel version of my code. I've tried using SharedArrays.jl, but a SharedArray cannot hold the Dict elements of the capacities vector. Alternatively, I could store the iterates locally on each worker and combine them later; I'm just not sure how to do that. I know this is more a question about distributed computing in general than about SDDP.jl. Sorry about that :)

@odow commented Nov 23, 2024

What parallel scheme?

I'd advise against using the Asynchronous one. Use the new Threaded instead.

Start Julia with julia -t N where N is the number of threads, then do something like:

capacities = Any[]
my_lock = ReentrantLock()
function callback(trajectory)
    # Guard the shared vector: several threads may finish a forward pass at the same time.
    lock(my_lock) do
        push!(capacities, trajectory.sampled_states[1])
    end
    return
end
SDDP.train(model; forward_pass_callback = callback, parallel_scheme = SDDP.Threaded())

odow reopened this Nov 23, 2024
@FSchmidtDIW

Great, I will try this. Indeed, I've been using the asynchronous mode and it's been working quite well, but I'll make the switch to Threaded then.
Thank you very much!

@odow commented Nov 24, 2024

I don't have an easy way to use forward_pass_callback with Distributed. You'd probably need to use a Channel, but it'd get quite complicated.
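For reference, the rough shape of that Channel idea would be something like the untested sketch below; the tricky part is that the callback closes over a RemoteChannel and has to be shipped to every worker along with the model, which is exactly where it gets complicated:

using Distributed
addprocs(4)
@everywhere using SDDP

# A channel owned by the main process that every worker can put! into.
results = RemoteChannel(() -> Channel{Any}(10_000))

SDDP.train(
    model;
    parallel_scheme = SDDP.Asynchronous(),
    forward_pass_callback = t -> put!(results, t.sampled_states[1]),
)

# Drain the channel on the main process after training.
capacities = Any[]
while isready(results)
    push!(capacities, take!(results))
end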

@odow commented Nov 24, 2024

Note that Threaded works best when the number of nodes is >> the number of threads.

@FSchmidtDIW

Thank you! That is nodes in the policy graph, right? Does that mean that you also parallelise the backward pass?

@odow commented Nov 24, 2024

Yes, nodes in the graph.

The forward and backward pass are conducted asynchronously in parallel across the threads.

The differences are:

  • SDDP.Asynchronous has a complete copy of the graph in each process, and it periodically shares cuts between processes. This requires a lot of memory because there are multiple copies of the problem, and there is a lot of data movement between processes. But it theoretically scales to any number of processes. (Although in practice, you'll find that performance quickly tapers off with more processes.)
  • SDDP.Threaded has a single copy of the model in shared memory. Each thread does forward and backward passes asynchronously and in parallel on the same graph. There is a lock at each node, so that only one thread can be solving subproblems at a node at one time. Therefore, we can have at most as many threads as there are nodes in the graph. But things work better when there are many more nodes than threads.
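As a concrete launch sketch (start Julia with, for example, julia -t 8; the model.nodes check and the iteration limit are only illustrative):

using SDDP

# Effective parallelism is capped by the number of nodes in the graph,
# so warn if the graph is small relative to the thread count.
if length(model.nodes) < Threads.nthreads()
    @warn "Fewer nodes than threads: some threads will spend most of their time waiting."
end

SDDP.train(model; parallel_scheme = SDDP.Threaded(), iteration_limit = 100)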

@FSchmidtDIW

Very cool! Thank you! The first test run on my problem with Threaded looks very promising!

@odow commented Nov 25, 2024

I'll point you to the docstring: https://sddp.dev/stable/apireference/#SDDP.Threaded

It's still somewhat experimental. It should work for most standard use cases, but if you've written any custom plugins, you need to be very careful that they are themselves thread-safe.

But yeah, assuming it works and you're running on a single machine, it is much, much better than before.

@FSchmidtDIW

Thank you! It looks like the Threaded option does not work with a RegularizedForwardPass, right? Since the latter leads to significant performance gains over a vanilla ForwardPass in my case, I'd be interested in fixing this.
Why exactly does it fail with regularization?

@odow commented Nov 25, 2024

Why exactly does it fail with regularization?

I assume we need to make it thread-safe.

@odow commented Nov 25, 2024

The issue is that we modify the first-stage bounds for the forward pass:

old_bounds = Dict{Symbol,Tuple{Float64,Float64}}()
for (k, v) in node.states
    if has_lower_bound(v.out) && has_upper_bound(v.out)
        old_bounds[k] = (l, u) = (lower_bound(v.out), upper_bound(v.out))
        x = get(fp.trial_centre, k, model.initial_root_state[k])
        set_lower_bound(v.out, max(l, x - fp.ρ * (u - l)))
        set_upper_bound(v.out, min(u, x + fp.ρ * (u - l)))
    end
end
pass = forward_pass(model, options, fp.forward_pass)
for (k, (l, u)) in old_bounds
    fp.trial_centre[k] = pass.sampled_states[1][k]
    set_lower_bound(node.states[k].out, l)
    set_upper_bound(node.states[k].out, u)
end

The fix might not be trivial.

@FSchmidtDIW

I see! Well, it seems like the Threaded version without regularization is still much faster than serial with regularization, at least in my case. Thank you!

odow linked a pull request Nov 27, 2024 that will close this issue