Coil optimization
Here we work through several examples of “stage 2” coil optimization. In this approach, a target plasma shape (i.e. boundary flux surface shape) has already been found in a “stage 1” optimization, and the present task is to optimize the shapes of coils to produce this target field. For this stage-2 problem, no MHD codes are used (example, VMEC or SPEC), so there is no need to have them installed.
We first describe the basic principles of “stage 2” coil optimization and
how they are used in SIMSOPT, namely, how to set up the optimization problem
in the minimal example examples/2_Intermediate/stage_two_optimization.py
.
We then add several advanced features to the problem:
Including more terms in the objective to regularize the coils, which may help with engineering feasibility.
Taking into account coil perturbations via systematic and statistical errors using a stochastic optimization method similar to the example in
examples/2_Intermediate/stage_two_optimization_stochastic.py
.Taking into account the presence of plasma current using the virtual casing principle, similar to the example in
examples/2_Intermediate/stage_two_optimization_finite_beta.py
.Taking into account finite coil width using a multi-filament approach similar to the example in
examples/3_Advanced/stage_two_optimization_finite_build.py
.
Minimal example
A minimal objective function (cost function) for coil optimization is
This example can also be found in
examples/1_Simple/stage_two_optimization_minimal.py
.
The first term in the right-hand-side term is the “quadratic flux”, the area
integral over the target plasma surface of the square of the magnetic
field normal to the surface. If the coils exactly produce a flux
surface of the target shape, this term will vanish.
Next, \(L_j\)
is the length of the \(j\)-th coil.
The number \(L_*\)
is chosen to balance a trade-off: small
values will give smooth coils at the cost of inaccuracies in producing
the target field; large values of \(L_*\) will give a more
accurate match to the target field at the cost of complicated coils.
The weight parameter LENGTH_WEIGHT
provides some additional control over the regularization,
though results should not be very sensitive to it, and a value of 1.0 is fine.
In this first section we consider vacuum fields only, so the magnetic field due to current in the plasma does not need to be subtracted in the quadratic flux term. The configuration considered is the precise QA configuration, which has two field periods and is quasi-axisymmetric.
The stage-2 optimization problem is automatically parallelized in
simsopt using OpenMP and vectorization, but MPI is not used, so the
mpi4py
python package is not needed. This example can be run in a
few seconds on a laptop.
To solve this optimization problem in simsopt, we first import the necessary classes:
from simsopt.geo import SurfaceRZFourier, create_equally_spaced_curves, \
CurveLength, plot, curves_to_vtk
from simsopt.field import Current, coils_via_symmetries, BiotSavart
from simsopt.objectives import SquaredFlux, QuadraticPenalty
The target plasma surface is given in the VMEC input file tests/test_files/input.LandremanPaul2021_QA
.
We load the surface using:
nphi = 32
ntheta = 32
filename = "tests/test_files/input.LandremanPaul2021_QA"
s = SurfaceRZFourier.from_vmec_input(filename, range="half period", nphi=nphi, ntheta=ntheta)
You can adjust the directory in "filename"
as appropriate for your
system. The surface could also be read in from a VMEC wout file using
simsopt.geo.SurfaceRZFourier.from_wout()
. Note that VMEC does
not need to be installed to initialize a surface from a VMEC input or
output file. As surface objects carry a grid of “quadrature points” at
which the position vector is evaluated we may want these points to
cover different ranges of the toroidal angle. For this problem with
stellarator symmetry and field-period symmetry, we need only consider
half of a field period in order to evaluate integrals over the entire
surface. For this reason, the range
parameter of the surface is
set to "half period"
here. Possible options in general are "full
torus"
, "field period"
and "half period"
, but "half
period"
is significantly more efficient than the other options for
this problem.
We next set the initial condition for the coils, which will be equally spaced circles.
We consider here a case with four unique coil shapes, each of which is repeated four times due to
stellarator symmetry and two-field-period symmetry, giving a total of 16 coils.
The four unique coil shapes are called the “base coils”. Each copy of a coil also carries the same current,
but we will allow the unique coil shapes to have different current from each other,
as is allowed on W7-X. For this tutorial we consider the coils to be infinitesimally thin filaments.
In simsopt, such a coil is represented with the simsopt.field.Coil
class,
which is essentially a curve paired with a current, represented using
simsopt.geo.Curve
and simsopt.field.Current
respectively.
The initial conditions are set as follows:
# Number of unique coil shapes:
ncoils = 4
# Major radius for the initial circular coils:
R0 = 1.0
# Minor radius for the initial circular coils:
R1 = 0.5
# Number of Fourier modes describing each Cartesian component of each coil:
order = 5
base_curves = create_equally_spaced_curves(ncoils, s.nfp, stellsym=True, R0=R0, R1=R1, order=order)
base_currents = [Current(1.0) * 1e5 for i in range(ncoils)]
In the last line, the degrees of freedom for the current are set to be of order 1,
but the actual currents are 1e5
larger so the magnetic field will be on the order of 1 Tesla.
One detail of optimizing coils for a vacuum configuration is that the
optimizer can “cheat” by making all the currents go to zero, which
makes the quadratic flux vanish. To close this loophole, we can fix
the current of the first base coil:
base_currents[0].fix_all()
(A Current
object only has one degree of freedom, hence we can use
fix_all()
.) If you wish, you can fix the currents in all the
coils to force them to have the same value. Now the full set of 16
coils can be obtained using stellarator symmetry and field-period
symmetry:
coils = coils_via_symmetries(base_curves, base_currents, s.nfp, True)
To make sure that the coils class has the non-fixed degrees of freedom that
we specified, we can print the dof_names
property:
>>> print(coil[0].dof_names)
['CurveXYZFourier1:xc(0)', 'CurveXYZFourier1:xs(1)', 'CurveXYZFourier1:xc(1)', ...
>>> print(coil[1].dof_names)
['Current2:x0', 'CurveXYZFourier2:xc(0)', 'CurveXYZFourier2:xs(1)', 'CurveXYZFourier2:xc(1)', ...
>>> print(coil[4].dof_names)
['CurveXYZFourier1:xc(0)', 'CurveXYZFourier1:xs(1)', 'CurveXYZFourier1:xc(1)', ...
Notice that the current appears in the list of dofs for coil[1]
but not for coil[0]
, since we fixed the current for
coil[0]
. Also notice that coil[4]
has the same degrees of
freedom (owned by CurveXYZFourier1
) as coil[0]
, because coils
0 and 4 refer to the same base coil shape and current.
There are several ways to view the objects we have created so far. One
approach is the function simsopt.geo.plot()
, which accepts
a list of Coil, Curve, and/or Surface objects:
plot(coils + [s], engine="mayavi", close=True)
Instead of "mayavi"
you can select "matplotlib"
or
"plotly"
as the graphics engine, although matplotlib has problems
with displaying multiple 3D objects in the proper
order. Alternatively, you can export the objects in VTK format and
open them in Paraview:
curves = [c.curve for c in coils]
curves_to_vtk(curves, "curves_init")
s.to_vtk("surf_init")
To evaluate the magnetic field on the target surface, we create a
simsopt.field.BiotSavart
object based on the coils,
and instruct it to evaluate the field on the surface:
bs = BiotSavart(coils)
bs.set_points(s.gamma().reshape((-1, 3)))
(The surface position vector gamma()
returns an array of size
(nphi, ntheta, 3)
, which we reshaped here to
(nphi*ntheta, 3)
as required for the
BiotSavart
object.)
To check the size of the field normal to the target surface
before optimization we can run:
B_dot_n = np.sum(bs.B().reshape((nphi, ntheta, 3)) * s.unitnormal(), axis=2)
print('Initial max|B dot n|:', np.max(np.abs(B_dot_n)))
The result is 0.23 Tesla.
We now define the objective function by stating what are the weights used and the corresponding terms:
# Weight on the curve lengths in the objective function.
LENGTH_WEIGHT = 1.0
# Target value for the sum of the lengths of the base coils:
LENGTH_TARGET = 18.0
# Define the individual terms objective function:
Jf = SquaredFlux(s, bs)
Jls = [CurveLength(c) for c in base_curves]
# Form the total objective function.
objective = Jf + LENGTH_WEIGHT * QuadraticPenalty(sum(Jls), LENGTH_TARGET)
In the last line, we have used the fact that the Optimizable objects
representing the individual terms in the objective can be scaled by a
constant and added. (This feature applies to Optimizable objects that
have a function J()
returning the objective and, if gradients are
used, a function dJ()
returning the gradient.)
You can check the degrees of freedom that will be varied in the
optimization by printing the dof_names
property of the objective:
>>> print(objective.dof_names)
['Current2:x0', 'Current3:x0', 'Current4:x0', 'CurveXYZFourier1:xc(0)', 'CurveXYZFourier1:xs(1)', ...
'CurveXYZFourier1:zc(5)', 'CurveXYZFourier2:xc(0)', 'CurveXYZFourier2:xs(1)', ...
'CurveXYZFourier4:zs(5)', 'CurveXYZFourier4:zc(5)']
As desired, the Fourier amplitudes of all four base coils appear, as do three of the four currents. Next, to interface with scipy’s minimization routines, we write a small function:
def fun(dofs):
objective.x = dofs
return objective.J(), objective.dJ()
Note that when the dJ()
method of the objective is called to
compute the gradient, simsopt automatically applies the chain rule to
assemble the derivatives from the various terms in the objective, and
entries in the gradient corresponding to degrees of freedom that are
fixed (such as the current in the first coil) are automatically
removed. We can now run the optimization using the L-BFGS-B algorithm
from scipy:
res = minimize(fun, objective.x, jac=True, method='L-BFGS-B',
options={'maxiter': 300, 'iprint': 5}, tol=1e-15)
The optimization takes a few seconds, and the output will look like
RUNNING THE L-BFGS-B CODE
* * *
Machine precision = 2.220D-16
N = 135 M = 10
This problem is unconstrained.
At X0 0 variables are exactly at the bounds
At iterate 0 f= 3.26880D-02 |proj g|= 5.14674D-02
At iterate 5 f= 6.61538D-04 |proj g|= 2.13561D-03
At iterate 10 f= 1.13772D-04 |proj g|= 6.27872D-04
...
At iterate 295 f= 6.20353D-07 |proj g|= 9.16379D-06
At iterate 300 f= 6.19516D-07 |proj g|= 1.66448D-05
* * *
Tit = total number of iterations
Tnf = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip = number of BFGS updates skipped
Nact = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F = final function value
* * *
N Tit Tnf Tnint Skip Nact Projg F
135 300 409 2 0 0 1.664D-05 6.195D-07
F = 6.1951581434132075E-007
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT
(You may obtain somewhat different values). You can adjust parameters such as the tolerance and number of iterations. Let us check the final \(\vec{B}\cdot\vec{n}\) on the surface:
B_dot_n = np.sum(bs.B().reshape((nphi, ntheta, 3)) * s.unitnormal(), axis=2)
print('Final max|B dot n|:', np.max(np.abs(B_dot_n)))
The final value is 0.0015 Tesla, reduced two orders of magnitude from the initial state. As with the initial conditions, you can plot the optimized coil shapes directly from simsopt using
plot(coils + [s], engine="mayavi", close=True)
or you can export the objects in VTK format and open them in Paraview. For this latter option, we can also export the final \(\vec{B}\cdot\vec{n}\) on the surface using the following syntax:
curves = [c.curve for c in coils]
curves_to_vtk(curves, "curves_opt")
s.to_vtk("surf_opt", extra_data={"B_N": B_dot_n[:, :, None]})
(Your coils may look somewhat different).
The optimized value of the current in coil j
can be obtained using
coils[j].current.get_value()
. The optimized Fourier coefficients
for coil j
can be obtained from coils[j].curve.x
, where the
meaning of each array element can be seen from
coils[j].curve.dof_names
. The position vector for coil j
in
Cartesian coordinates can be obtained from coils[j].curve.gamma()
.
At the end of the optimization, the results can be saved as follows:
bs.save("biot_savart_opt.json")
This line saves the optimized coil shapes and currents, along with the
BiotSavart
object that evaluates the magnetic
field produced by the coils. The resulting json file can be loaded in
later in a separate script to analyze the results (making Poincare
plots, etc.)
Further coil regularization terms
In the previous example we have already seen several objective function terms available for coil regularization:
CurveLength
: The length of a coilQuadraticPenalty
: Useful for making quantities equal to, greater than, or less than a target value.
However, better coils can be obtained if additional terms are included in the objective function
to control the coil curvature and coil-to-coil separation.
A large number of other terms are available in simsopt.geo
to include in the objective function,
many of which are illustrated in
examples/2_Intermediate/stage_two_optimization.py
. Other available objective terms include
MeanSquaredCurvature
: Reduces the coil curvature evenly along the curve.LpCurveCurvature
: Penalizes values of the coil curvature that exceed a threshold.LpCurveTorsion
: Penalizes values of the coil torsion that exceed a threshold. Coil optimization does not tend to work well when this term is included.CurveCurveDistance
: Useful for ensuring the minimum coil-to-coil distance is at least a specified target value.CurveSurfaceDistance
: Useful for ensuring the minimum coil-to-plasma distance is at least a specified target value.ArclengthVariation
: Ensures the curves are parameterized using (approximately) a uniform-arclength parameter.
You can click on any of the links above in this section to see the precise definitions of these objective terms.
Another useful class for forming objective functions is
simsopt.objectives.Weight
, which is a wrapper for a float.
This object can be used for the weight in front of a term in the
objective function. Although a standard float works for this purpose
as well, the advantage of using Weight
is
that when the value of the weight is changed, the objective function
object is automatically updated with the new value.
Stochastic Optimization
In this example we solve a stochastic version of
the first example here. As before,
the goal is to find coils that generate a specific target
normal field on a given surface. As we are still considering a vacuum
field the target is just zero.
The target equilibrium is the precise QA configuration.
The complete script can be found in examples/2_Intermediate/stage_two_optimization_stochastic.py
.
The objective function is similar to examples/2_Intermediate/stage_two_optimization.py
but with a few modifications:
J = (1/2) Mean(\int |B dot n|^2 ds)
+ LENGTH_WEIGHT * (sum CurveLength)
+ DISTANCE_WEIGHT * MininumDistancePenalty(DISTANCE_THRESHOLD)
+ CURVATURE_WEIGHT * CurvaturePenalty(CURVATURE_THRESHOLD)
+ MSC_WEIGHT * MeanSquaredCurvaturePenalty(MSC_THRESHOLD)
+ ARCLENGTH_WEIGHT * ArclengthVariation
The first term is altered to be given by the mean (expected value) of the flux over a distribution of possible perturbed coils instead of the flux itself. The mean is approximated by a sample average over perturbed coils. The coil perturbations for each coil are the sum of a ‘systematic error’ and a ‘statistical error’. The former satisfies rotational and stellarator symmetry, the latter is independent for each coil.
The last term in the objective, ArclengthVariation
,
tends to be useful to include when doing stochastic optimization. Without this term,
which tends to make the curve parameterization have a uniform arclength,
the optimizer can exploit the curve parameterization rather than actually changing the shape of the curve.
This is possible since the Gaussian process model for curve perturbations is based on the curve parameter rather than arclength.
We now define the objective function by stating what are the weights used and the corresponding terms. Besides the terms in the previous examples, we additionally define:
# Weight for the arclength variation penalty in the objective function:
ARCLENGTH_WEIGHT = 1e-2
# Standard deviation for the coil errors
SIGMA = 1e-3
# Length scale for the coil errors
L = 0.5
# Number of samples to approximate the mean
N_SAMPLES = 16
# Number of samples for out-of-sample evaluation
N_OOS = 256
# Objective function for the arclength variation
Jals = [ArclengthVariation(c) for c in base_curves]
# Objective function for the coils and its perturbations
rg = np.random.Generator(PCG64(seed, inc=0))
sampler = GaussianSampler(curves[0].quadpoints, SIGMA, L, n_derivs=1)
Jfs = []
curves_pert = []
for i in range(N_SAMPLES):
# first add the 'systematic' error. this error is applied to the base curves and hence the various symmetries are applied to it.
base_curves_perturbed = [CurvePerturbed(c, PerturbationSample(sampler, randomgen=rg)) for c in base_curves]
coils = coils_via_symmetries(base_curves_perturbed, base_currents, s.nfp, True)
# now add the 'statistical' error. this error is added to each of the final coils, and independent between all of them.
coils_pert = [Coil(CurvePerturbed(c.curve, PerturbationSample(sampler, randomgen=rg)), c.current) for c in coils]
curves_pert.append([c.curve for c in coils_pert])
bs_pert = BiotSavart(coils_pert)
Jfs.append(SquaredFlux(s, bs_pert))
Jmpi = MPIObjective(Jfs, comm, needs_splitting=True)
# Form the total objective function. To do this, we can exploit the
# fact that Optimizable objects with J() and dJ() functions can be
# multiplied by scalars and added:
JF = Jmpi \
+ LENGTH_WEIGHT * sum(Jls) \
+ DISTANCE_WEIGHT * Jdist \
+ CURVATURE_WEIGHT * sum(Jcs) \
+ MSC_WEIGHT * sum(QuadraticPenalty(J, MSC_THRESHOLD) for J in Jmscs) \
+ ARCLENGTH_WEIGHT * sum(Jals)
As can be seen here, in the stochastic optimization method, we apply two different types of errors. The first one is the systematic error which is applied where random perturbations based on a Gaussian Sampler with a predefined standard deviation are added to the base curves. The second is a statistical error that is added to each of the final coils, and is independent between coils.
Finite Beta Optimization
In this example, we solve a variant of
the first example here
in which there is current inside the plasma, which contributes significantly to
the total magnetic field.Therefore, the
target quantity \(\vec{B}_{external}\cdot \vec{n}\) is no longer zero
and a virtual casing calculation is used to find its value.
The complete script can be found in examples/2_Intermediate/stage_two_finite_beta.py
.
We use an objective function similar to the first example with small modifications:
J = (1/2) \int |(B_{BiotSavart} - B_{External}) dot n|^2 ds
+ LENGTH_WEIGHT * (sum CurveLength)
The first term, while similar to the previous examples, it calculates the external field \(B_{external}\) using a virtual casing principle. The virtual casing calculation is done in the following way:
# Resolution for the virtual casing calculation:
vc_src_nphi = 80
# (For the virtual casing src_ resolution, only nphi needs to be
# specified; the theta resolution is computed automatically to
# minimize anisotropy of the grid.)
# Once the virtual casing calculation has been run once, the results
# can be used for many coil optimizations. Therefore here we check to
# see if the virtual casing output file alreadys exists. If so, load
# the results, otherwise run the virtual casing calculation and save
# the results.
head, tail = os.path.split(vmec_file)
vc_filename = os.path.join(head, tail.replace('wout', 'vcasing'))
print('virtual casing data file:', vc_filename)
if os.path.isfile(vc_filename):
print('Loading saved virtual casing result')
vc = VirtualCasing.load(vc_filename)
else:
# Virtual casing must not have been run yet.
print('Running the virtual casing calculation')
vc = VirtualCasing.from_vmec(vmec_file, src_nphi=vc_src_nphi, trgt_nphi=nphi, trgt_ntheta=ntheta)
We then define the objective function as the squared flux targeting the value of the dot product between \(B_{external}\) and the surface normal vector computed with the results of the virtual casing principle:
# Define the objective function:
Jf = SquaredFlux(s, bs, target=vc.B_external_normal)
Jls = [CurveLength(c) for c in base_curves]
# Form the total objective function. To do this, we can exploit the
# fact that Optimizable objects with J() and dJ() functions can be
# multiplied by scalars and added:
JF = Jf \
+ LENGTH_PENALTY * sum(QuadraticPenalty(Jls[i], Jls[i].J()) for i in range(len(base_curves)))
The example above uses very minimal coil regularization: only the deviation from the initial coil length is penalized; curvature and distance are not targeted here.
Finite Build Optimization
In this final example, we perform a stage 2 optimization with
finite build coils. The script for this case can be found in
examples/3_Advanced/stage_two_optimization_finite_build.py
.
In particular, we use a multifilament approach that follows
"Optimization of finite-build stellarator coils,"
Singh, Luquant, et al. Journal of Plasma Physics 86, 4 (2020)
to approximate a finite build coil in order to have finite thickness.
The target equilibrium is the precise QA configuration. Besides the degrees of freedom listed in first example here, in this case, we have additional degrees of freedom related to the rotation of the coil pack. The objective function is given by:
J = (1/2) \int |(B_{BiotSavart}) dot n|^2 ds
+ LENGTH_PEN * (sum CurveLength)
+ DIST_PEN * CurveCurveDistance
Here, CurveCurveDistance
is used to prevents coils from
becoming too close. The constant
DIST_PEN
is selected to balance this minimum distance penalty
against the other objectives.
To initialize the finite build optimization, we use the definitions below:
# Weight on the curve length penalty in the objective function:
LENGTH_PEN = 1e-2
# Threshhold and weight for the coil-to-coil distance penalty in the objective function:
DIST_MIN = 0.1
DIST_PEN = 10
# Settings for multifilament approximation. In the following
# parameters, note that "normal" and "binormal" refer not to the
# Frenet frame but rather to the "coil centroid frame" defined by
# Singh et al., before rotation.
numfilaments_n = 2 # number of filaments in normal direction
numfilaments_b = 3 # number of filaments in bi-normal direction
gapsize_n = 0.02 # gap between filaments in normal direction
gapsize_b = 0.04 # gap between filaments in bi-normal direction
rot_order = 1 # order of the Fourier expression for the rotation of the filament pack, i.e. maximum Fourier mode number
nfil = numfilaments_n * numfilaments_b
base_curves = create_equally_spaced_curves(ncoils, s.nfp, stellsym=True, R0=R0, R1=R1, order=order)
base_currents = []
for i in range(ncoils):
curr = Current(1.)
# since the target field is zero, one possible solution is just to set all
# currents to 0. to avoid the minimizer finding that solution, we fix one
# of the currents
if i == 0:
curr.fix_all()
base_currents.append(ScaledCurrent(curr, 1e5/nfil))
# use sum here to concatenate lists
base_curves_finite_build = sum([
create_multifilament_grid(c, numfilaments_n, numfilaments_b, gapsize_n, gapsize_b, rotation_order=rot_order) for c in base_curves], [])
base_currents_finite_build = sum([[c]*nfil for c in base_currents], [])
# apply stellarator and rotation symmetries
curves_fb = apply_symmetries_to_curves(base_curves_finite_build, s.nfp, True)
currents_fb = apply_symmetries_to_currents(base_currents_finite_build, s.nfp, True)
# also apply symmetries to the underlying base curves, as we use those in the
# curve-curve distance penalty
curves = apply_symmetries_to_curves(base_curves, s.nfp, True)
coils_fb = [Coil(c, curr) for (c, curr) in zip(curves_fb, currents_fb)]
bs = BiotSavart(coils_fb)
bs.set_points(s.gamma().reshape((-1, 3)))
Finally, the objective function takes the form:
# Define the objective function:
Jf = SquaredFlux(s, bs)
Jls = [CurveLength(c) for c in base_curves]
Jdist = CurveCurveDistance(curves, DIST_MIN)
# Form the total objective function. To do this, we can exploit the
# fact that Optimizable objects with J() and dJ() functions can be
# multiplied by scalars and added:
JF = Jf \
+ LENGTH_PEN * sum(QuadraticPenalty(Jls[i], Jls[i].J()) for i in range(len(base_curves))) \
+ DIST_PEN * Jdist