Partitioning representative periods #78

greg-neustroev · 2024-11-11T12:57:22Z

Description

Currently, we assume that the input data for clustering is a table with the following header:

			p.u.
profile_name	year	timestep	value

or:

				p.u.
profile_name	year	period	timestep	value

If period is not provided, the data first needs to be split into periods. Currently this is done using TulipaClustering.split_into_periods!(df; period_duration), which splits the data frame into periods of equal length.

At the same time, TulipaEnergyModels supports splitting the year into unequal periods via a partition specification; see, for example, assets-timeframe-partitions.csv.

I think that TulipaClustering should also utilize the partitioning approach, e.g., instead of calling TulipaClustering.split_into_periods!(df; period_duration) we should be able to call TulipaClustering.partition!(df; partition_string) to split the base data into periods.

Example:

Your data frame df is:

			p.u.
profile_name	year	timestep	value
profile_1	2030	1	1
profile_1	2030	2	2
profile_1	2030	3	3
profile_1	2030	4	4
...	...	...	...
profile_1	2030	8760	8760

I can call TulipaClustering.split_into_periods!(df; period_duration=24) which will change the data frame into periods of length 24 each (365 periods in total):

				p.u.
profile_name	year	period	timestep	value
profile_1	2030	1	1	1
profile_1	2030	1	2	2
profile_1	2030	1	3	3
profile_1	2030	1	4	4
...	...	...	...	...
profile_1	2030	365	24	8760

Instead, we might want to partition the first week as a period of length 168, and then have 358 periods of length 24. I would be able to do this by calling TulipaClustering.partition!(df; partition_string="1x168+358x24"). The example above can be done with TulipaClustering.partition!(df; partition_string="365x24").
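
To make the proposed string format concrete, here is a minimal sketch of how a specification such as "1x168+358x24" could be expanded into period lengths and then into period and timestep indices. The helper names parse_partition and period_indices are hypothetical and not part of TulipaClustering. The sketch also assumes that blocks are separated by + and written as <count>x<length>, as in the examples above.

```julia
# Hypothetical helper (not an existing TulipaClustering function):
# expand a partition string such as "1x168+358x24" into a vector of
# period lengths, one entry per period.
function parse_partition(partition_string::AbstractString)
    lengths = Int[]
    for block in split(partition_string, "+")
        parts = split(strip(block), "x")
        length(parts) == 2 ||
            throw(ArgumentError("expected <count>x<length>, got \"$block\""))
        n = parse(Int, parts[1])    # number of periods in this block
        len = parse(Int, parts[2])  # length of each of those periods
        (n > 0 && len > 0) ||
            throw(ArgumentError("count and length must be positive in \"$block\""))
        append!(lengths, fill(len, n))
    end
    return lengths
end

# Hypothetical helper: for consecutive global timesteps 1, 2, ..., sum(lengths),
# return the period index and the timestep index within that period.
function period_indices(lengths::Vector{Int})
    periods = Int[]
    timesteps = Int[]
    for (period, len) in enumerate(lengths)
        append!(periods, fill(period, len))
        append!(timesteps, 1:len)
    end
    return periods, timesteps
end

lengths = parse_partition("1x168+358x24")
periods, timesteps = period_indices(lengths)
```

For "1x168+358x24" this yields 359 periods covering all 8760 timesteps. Attaching periods and timesteps as columns to each profile in df (sorted by timestep) would be the remaining step, and checking that sum(lengths) matches the number of rows per profile would catch malformed specifications.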

Questions regarding this:

Is this string-based partitioning useful and worth implementing?
Should it be implemented in TulipaClustering or elsewhere, since partitioning is also used outside of clustering?
Should the new partitioning method replace the existing split_into_periods!, or coexist with it?
What is a good name for the method and its arguments? I like using partition as a verb, but we also use it as a noun for the string specifying the partitioning structure, so TulipaClustering.partition!(df; partition) could look confusing.

greg-neustroev · 2024-11-11T12:58:10Z

@abelsiqueira @datejada @g-moralesespana @gnawin

What do you guys think?

greg-neustroev · 2024-11-11T13:01:11Z

Another question is how to cluster periods of different lengths, but I think this should be a separate issue.
