Defining optimization problems
Optimizable objects
Simsopt is able to optimize any python object that has a
get_dofs()
function and a set_dofs()
function. The overall
objective function can be defined using any function, attribute, or
@property of such an object. Optimizable objects can depend on other
optimizable objects, so objects can be part of an optimization even if
they do not directly own a function that is part of the overall
objective function.
Specifying functions that go into the objective function
Suppose we want to solve a least-squares optimization problem in which
an object obj
is optimized. If obj
has a function func()
,
we can use a 3-element tuple or list:
term1 = (obj.func, goal, weight)
to represent a term weight * ((obj.func() - goal) ** 2)
in the
least-squares objective function.
In this example, func()
could return a scalar, or it could return
a 1D numpy array. In the latter case, sum(weight * ((obj.func() -
goal) ** 2))
would be included in the objective function, and
goal
could be either a scalar or a 1D numpy array of the same
length as that returned by func()
.
Similarly, we can define additional terms:
term2 = (obj2.fn, goal2, weight2)
The total least-squares objective function is created using a list or tuple of all the terms that are to be added together:
prob = LeastSquaresProblem.from_tuples((term1, term2))
The sequence can include any mixture of terms defined by scalar functions
and by 1D numpy array-valued functions. The problem can be defined
in an alternative way without defining the intermediate names
term1
, term2
, etc:
prob = LeastSquaresProblem([goal1, goal2, goal3],
[weight1, weight2, weight3],
[obj1.fn1, obj2.fn2, obj3.fn3])
If you prefer, you can specify
sigma = 1 / sqrt(weight)
rather than weight
and use the
LeastSquaresProblem.from_sigma
as:
prob = LeastSquaresProblem.from_sigma([goal1, goal2, goal3],
[sigma1, sigma2, sigma3],
[obj1.fn1, obj2.fn2, obj3.fn3])
Degrees of freedom (“dofs”)
Any optimizable object can define a function get_dofs(self)
that
returns a 1D numpy array of floats. (Simsopt will not allow integer
optimization.) Any optimizable object also can have a function
set_dofs(self, x)
that accepts x
, a 1D numpy array of
floats. Some of the degrees of freedom may be fixed in the
optimization - see below.
It can be convenient to have a descriptive string associated with each
dof, e.g. "phiedge"
or "rc(1,0)"
. To supply these names, add a
names
attribute to your optimizable object which is a list of
strings, where the length of the list matches the size of the
get_dofs()
array. It is not necessary to have a names
list; if
one is not present, simsopt will use as a default ["x[0]", "x[1]",
...]
.
Each dof is owned by a single object. In other words, if a physical
dof is represented by an array element for get_dofs()
and
set_dofs()
of a particular object, there should be no other object
that has that physical degree of freedom an an element in its
get_dofs()
and set_dofs()
arrays. If two objects both depend
on a given physical degree of freedom, that information should be
represented using the depends_on
attribute, described below.
In addition to the “local” dofs owned by each object, an important concept is the set of “global” dofs associated with an optimization problem. The global dofs are the set of all non-fixed dofs of each optimized object and all objects upon which they depend. (Dependencies among objects are discussed in a section below.) The global dofs correspond to the state vector passed to the optimization algorithm.
At each function evaluation during an optimization, simsopt will call
the recompute_bell()
function of each object involved in the
optimization problem, if the local dofs owned
by that object have changed or if the output of an object it depends on
has changed. It is the responsibility of
each object to implement the recompute_bell
method to
update expensive calculations.
Helpful functions
Simsopt can provide several helpful functions to your optimizable
object. For instance, functions can be provided to help with fixing
certain degrees of freedom, as discussed below. There is also the
method index(str)
which gets the index into the dof array
associated with a dof name str
, and the methods get(str)
and
set(str, val)
which get and set a dof associated with the name
str
.
There are two ways to equip your object with these functions. One way
is to inherit from the :obj:simsopt.Optimizable
class. Or, if you do
not want your class to depend on simsopt, you can use the function
:func:simsopt.make_optimizable()
to your objective function.
Fixing degrees of freedom
Sometimes we may want to vary a particular dof in an optimization, and
other times we may want to hold that same dof fixed. One example is
the current in a coil. To enable this flexibility, every dof in
simsopt is considered to be either fixed or not. Only the non-fixed
dofs are included in the optimization. Whether or not a
dof is fixed can be identified by the dofs_free_status
attribute of the
object. The dofs_free_status
attribute is a boolean
numpy array, with each element True or False. You can also query the free/fixed
status of individual status by :meth:is_free
or :meth:is_fixed
methods by using the array index
of the dof or by using the name of the dof as key.
- There are several ways you can manipulate the fixed/free status of the dofs.
You can set all entries to True or False using the
:meth:fix_all
or :meth:unfix_all
methods from :obj:simsopt.Optimizable
.
You can set individual entries using the
string names or arry indices of each dof via the :meth:fix
or
:meth:unfix
methods, e.g. :meth:fix("phiedge")
or :meth:unfix("2")
.
Dependencies
It may happen that one object depends on degrees of freedom owned by a
different object. For instance, suppose we have an object v
which
is an instance of the VMEC
class. Then v
has as an attribute
boundary
which is an instance of the Surface
class, describing
the plasma boundary, and v’s functions depend on dofs owned by the
Surface
. Simsopt detects this kind of dependency automatically, so that
if a function of v
is put into the objective function, the dofs of
boundary
are also included among the global dofs.
To represent this situation, the v
object at the time of initialization
passes the argument depends_on
, which is a list of Optimizable objects, to
the base Optimizable
class. In this specific
example, inside VMEC.__init__
method, call to base class is made as
super().__init__(..., depends_on=[self.boundary], ...
.
Derivatives
Simsopt can manage both derivative-free and derivative-based optimization, automatically detecting whether derivative information is available. For now, if derivatives are not available for all functions going into the objective function, then derivative-free optimization will be used; cases with a mixture of analytic and finite-difference derivatives are left for future work.
To supply derivative information, your object must provide a function,
property, or attribute with the same name as the one supplied to the
objective function, but with a d
added in front. For instance, if
you used obj.func()
to form the objective function, the derivative
of obj.func()
must be a function obj.dfunc()
. Or, if you used
a property obj.prop
to form the objective function, the derivative
of obj.prop
must be a property obj.dprop
. If simsopt detects
that all of these functions/properties/attributes are present, it will
use derivative-based optimization. If one or more derivative
functions is missing, a derivative-free algorithm will be used.
These derivative functions must each return a 1D numpy array,
containing the derivative of the original scalar function with respect
to all local dofs owned both by the object and any objects it depends
on. So if obj
owns 10 dofs, and it depends on an object dep
that owns 5 dofs, obj.dfunc()
should return a 15-element vector.
The 10 dofs owned directly by obj
come first. The order of the
dofs from dependencies is the order specified in the depends_on
list. Your object is responsible for gathering and manipulating
derivative information from objects it depends on in order to form
this combined gradient vector.
The length of the gradient vector returned by your function is independent of whether or not any dofs are fixed. However, if a dof is fixed, the corresponding entry in the gradient vector will not be used, so you could return 0.0 for that entry in the vector rather than actually computing the derivative.