Overview
FleetRL is a Reinforcement Learning (RL) environment for EV charging optimization with a special focus on commercial vehicle fleets. Its main function is the modelling of real-world charging processes. FleetRL was developed with a modular approach, keeping in mind that improvements and additions are important to maintain the value of this framework. Emphasis was therefore laid on readability, ease of use and ease of maintenance. For example, the base-derived class architecture was used throughout the framework, allowing for easy interchangeability of modules such as battery degradation.
Emphasis was also laid on customizability: own schedules can be generated, electricity prices can be switched by changing a csv file, episode length, time resolution, and EV charging-specific parameters can be changed in their respective config files.
Physical model
As seen in the graphic below, the physical scope currently includes the EVs, the building (load + PV), a limited grid connection, and electricity prices (both for purchase and feed-in).
The EVs have probabilistic schedules that are generated beforehand. The main objective is thereby to minimize charging cost while respecting the constraints of the schedules, grid connection and SOC requirements. Battery degradation is modelled, both linearly and non-linearly. Electricity prices are taken from the EPEX spot market. PV production data is taken from the MERRA-2 open data set, available here. Building load was taken from the NREL TMY-3 dataset.
Some assumptions are taken to limit the degree of complexity of the optimization problem:
All connections are three-phase, and load imbalance is not considered.
Only active power is considered when calculating transformer loading. Capacitive and inductive properties of electrical equipment are not considered.
Although bidirectional charging is enabled, it only allows for energy arbitrage. Frequency regulation is not implemented in this study.
The companies are modelled as price takers and are assumed not to influence the electricity price with their demand.
Battery degradation is modelled non-linearly, taking into account rainflow cycle-counting and SEI-film formation according to Xu et al..
Code structure
FleetRL is based on the OpenAI / farama foundation gym framework and “basically” implements the step, reset, and init functions. To train RL agents, the stable-baselines3 framework is used due to its high-quality implementations, level of maintenance and documentation. Different agents can be plugged into the environment plug-and-play, only having to change a few lines of code.
To train RL agents on a FleetRL environment, an env object needs to be created, which inherits from gym
and implements the necessary methods and the entire logic of the EV charging problem. The implementation can
be found under FleetRL.fleet_env.fleet_environment
. As can be seen, functions such as EV charging, or calculating
battery degradation are outsourced to separate modules to maintain readability. This way, a submodule can also
be changed without having to touch the main logic of the code.
Leveraging base-derived class architecture
FleetRL makes use of parent classes and sub-classes that implement the methods of the parent class - this
is also known as a base-derived hierarchy. For example, FleetRL uses time_picker
to pick a starting date
for a new episode. This can either be random, or always the same date. A parent class TimePicker
thereby
exists:
class TimePicker:
def choose_time(self, db: pd.Series, freq: str, end_cutoff: int) -> Timestamp:
"""
Parent class for time picker objects
:param db: dataframe from env
:param freq: frequency specification string for pandas
:param end_cutoff: amount of days cut off at the end to allow some buffer. In the eval time picker case,
the end cutoff specifies the size of the validation set.
:return: A chosen time stamp
"""
raise NotImplementedError("This is an abstract class.")
As can be seen, it dictates the inputs and outputs, as well as the methods the class contains -> these
must be matching exactly when implementing a sub-class of TimePicker
. The parent class methods cannot
be called directly, this would raise a NotImplementedError
.
A sub-class is implemented as follows, taking the example of the StaticTimePicker
:
from FleetRL.utils.time_picker.time_picker import TimePicker
class StaticTimePicker(TimePicker):
"""
Picks a static / always the same starting time.
"""
def __init__(self, start_time: str = "01/01/2020 15:00"):
"""
:param start_time: When initialised, start time is specified
"""
self.start_time = start_time
def choose_time(self, db: pd.Series, freq: str, end_cutoff: int) -> Timestamp:
chosen_start_time = pd.to_datetime(self.start_time)
# return start time
return chosen_start_time
As can be seen, StaticTimePicker
inherits from TimePicker
, and implements its methods. In the code,
an object of the type TimePicker
can then be created as follows:
tp: TimePicker = StaticTimePicker()
If a different time picker should be chosen, it can be changed by changing one line of code. In this
case, the RandomTimePicker
is chosen instead:
tp: TimePicker = RandomTimePicker()
Note
This is already automated in FleetRL, and the right time picker module can be specified via a string in the input parameters when creating an env object.
Note
When writing an own sub-class, it must be ensured that all methods are implemented, and that the methods follow the same inputs/outputs as the parent class. Once this is done, the own implementation can be used in FleetRL by changing one line of code, as shown above.