masterarbeit/arbeit/ma.md
2017-10-27 14:31:55 +02:00

59 KiB
Raw Blame History

fontsize
12pt

\chapter*{How to read this Thesis}

As a guide through the nomenclature used in the formulas we prepend this chapter.

Unless otherwise noted the following holds:

  • lowercase letters x,y,z
    refer to real variables and represent the coordinates of a point in 3D--Space.
  • lowercase letters u,v,w
    refer to real variables between 0 and 1 used as coefficients in a 3D B--Spline grid.
  • other lowercase letters
    refer to other scalar (real) variables.
  • lowercase bold letters (e.g. \vec{x},\vec{y})
    refer to 3D coordinates
  • uppercase BOLD letters (e.g. \vec{D}, \vec{M})
    refer to Matrices

Introduction

\improvement[inline]{Mehr Bilder}

Many modern industrial design processes require advanced optimization methods due to the increased complexity resulting from more and more degrees of freedom as methods refine and/or other methods are used. Examples for this are physical domains like aerodynamics (i.e. drag), fluid dynamics (i.e. throughput of liquid) --- where the complexity increases with the temporal and spatial resolution of the simulation --- or known hard algorithmic problems in informatics (i.e. layouting of circuit boards or stacking of 3D--objects). Moreover these are typically not static environments but requirements shift over time or from case to case.

Evolutionary algorithms cope especially well with these problem domains while addressing all the issues at hand\cite{minai2006complex}. One of the main concerns in these algorithms is the formulation of the problems in terms of a genome and fitness--function. While one can typically use an arbitrary cost--function for the fitness--functions (i.e. amount of drag, amount of space, etc.), the translation of the problem--domain into a simple parametric representation (the genome) can be challenging.

This translation is often necessary as the target of the optimization may have too many degrees of freedom. In the example of an aerodynamic simulation of drag onto an object, those object--designs tend to have a high number of vertices to adhere to various requirements (visual, practical, physical, etc.). A simpler representation of the same object in only a few parameters that manipulate the whole in a sensible matter are desirable, as this often decreases the computation time significantly.

Additionally one can exploit the fact, that drag in this case is especially sensitive to non--smooth surfaces, so that a smooth local manipulation of the surface as a whole is more advantageous than merely random manipulation of the vertices.

The quality of such a low-dimensional representation in biological evolution is strongly tied to the notion of evolvability\cite{wagner1996complex}, as the parametrization of the problem has serious implications on the convergence speed and the quality of the solution\cite{Rothlauf2006}. However, there is no consensus on how evolvability is defined and the meaning varies from context to context\cite{richter2015evolvability}. As a consequence there is need for some criteria we can measure, so that we are able to compare different representations to learn and improve upon these.

One example of such a general representation of an object is to generate random points and represent vertices of an object as distances to these points --- for example via \acf{RBF}. If one (or the algorithm) would move such a point the object will get deformed only locally (due to the \ac{RBF}). As this results in a simple mapping from the parameter-space onto the object one can try out different representations of the same object and evaluate which criteria may be suited to describe this notion of evolvability. This is exactly what Richter et al.\cite{anrichterEvol} have done.

As we transfer the results of Richter et al.\cite{anrichterEvol} from using \acf{RBF} as a representation to manipulate geometric objects to the use of \acf{FFD} we will use the same definition for evolvability the original author used, namely regularity, variability, and improvement potential. We introduce these term in detail in Chapter \ref{sec:intro:rvi}. In the original publication the author could show a correlation between these evolvability--criteria with the quality and convergence speed of such optimization.

We will replicate the same setup on the same objects but use \acf{FFD} instead of \acf{RBF} to create a local deformation near the control points and evaluate if the evolution--criteria still work as a predictor for evolvability of the representation given the different deformation scheme, as suspected in \cite{anrichterEvol}.

First we introduce different topics in isolation in Chapter \ref{sec:back}. We take an abstract look at the definition of \ac{FFD} for a one--dimensional line (in \ref{sec🔙ffd}) and discuss why this is a sensible deformation function (in \ref{sec🔙ffdgood}). Then we establish some background--knowledge of evolutionary algorithms (in \ref{sec🔙evo}) and why this is useful in our domain (in \ref{sec🔙evogood}) followed by the definition of the different evolvability criteria established in \cite{anrichterEvol} (in \ref {sec🔙rvi}).

In Chapter \ref{sec:impl} we take a look at our implementation of \ac{FFD} and the adaptation for 3D--meshes that were used. Next, in Chapter \ref{sec:eval}, we describe the different scenarios we use to evaluate the different evolvability--criteria incorporating all aspects introduced in Chapter \ref{sec:back}. Following that, we evaluate the results in Chapter \ref{sec:res} with further on discussion, summary and outlook in Chapter \ref{sec:dis}.

Background

\label{sec:back}

What is \acf{FFD}?

\label{sec🔙ffd}

First of all we have to establish how a \ac{FFD} works and why this is a good tool for deforming geometric objects (esp. meshes in our case) in the first place. For simplicity we only summarize the 1D--case from \cite{spitzmuller1996bezier} here and go into the extension to the 3D case in chapter \ref{3dffd}.

The main idea of \ac{FFD} is to create a function $s : [0,1[^d \mapsto \mathbb{R}^d$ that spans a certain part of a vector--space and is only linearly parametrized by some special control points p_i and an constant attribution--function a_i(u), so


s(\vec{u}) = \sum_i a_i(\vec{u}) \vec{p_i}

can be thought of a representation of the inside of the convex hull generated by the control points where each point can be accessed by the right u \in [0,1[^d.

\begin{figure}[!ht] \begin{center} \includegraphics[width=0.7\textwidth]{img/B-Splines.png} \end{center} \caption[Example of B-Splines]{Example of a parametrization of a line with corresponding deformation to generate a deformed objet} \label{fig:bspline} \end{figure}

In the 1--dimensional example in figure \ref{fig:bspline}, the control--points are indicated as red dots and the color-gradient should hint at the $u$--values ranging from 0 to 1.

We now define a \acf{FFD} by the following:
Given an arbitrary number of points p_i alongside a line, we map a scalar value \tau_i \in [0,1[ to each point with $\tau_i < \tau_{i+1} \forall i$ according to the position of p_i on said line. Additionally, given a degree of the target polynomial d we define the curve N_{i,d,\tau_i}(u) as follows:

\begin{equation} \label{eqn:ffd1d1} N_{i,0,\tau}(u) = \begin{cases} 1, & u \in [\tau_i, \tau_{i+1}[ \ 0, & \mbox{otherwise} \end{cases} \end{equation} and \begin{equation} \label{eqn:ffd1d2} N_{i,d,\tau}(u) = \frac{u-\tau_i}{\tau_{i+d}} N_{i,d-1,\tau}(u) + \frac{\tau_{i+d+1} - u}{\tau_{i+d+1}-\tau_{i+1}} N_{i+1,d-1,\tau}(u) \end{equation}

If we now multiply every p_i with the corresponding N_{i,d,\tau_i}(u) we get the contribution of each point p_i to the final curve--point parameterized only by u \in [0,1[. As can be seen from \eqref{eqn:ffd1d2} we only access points [p_i..p_{i+d}] for any given $i$^[one more for each recursive step.], which gives us, in combination with choosing p_i and \tau_i in order, only a local interference of d+1 points.

We can even derive this equation straightforward for an arbitrary $N$^[Warning: in the case of d=1 the recursion--formula yields a $0$ denominator, but N is also 0. The right solution for this case is a derivative of $0$]:

\frac{\partial}{\partial u} N_{i,d,r}(u) = \frac{d}{\tau_{i+d} - \tau_i} N_{i,d-1,\tau}(u) - \frac{d}{\tau_{i+d+1} - \tau_{i+1}} N_{i+1,d-1,\tau}(u)

For a B--Spline

s(u) = \sum_{i} N_{i,d,\tau_i}(u) p_i

these derivations yield \frac{\partial^d}{\partial u} s(u) = 0.

Another interesting property of these recursive polynomials is that they are continuous (given d \ge 1) as every p_i gets blended in between \tau_i and \tau_{i+d} and out between \tau_{i+1}, and \tau_{i+d+1} as can bee seen from the two coefficients in every step of the recursion.

This means that all changes are only a local linear combination between the control--point p_i to p_{i+d+1} and consequently this yields to the convex--hull--property of B-Splines --- meaning, that no matter how we choose our coefficients, the resulting points all have to lie inside convex--hull of the control--points.

For a given point v_i we can then calculate the contributions n_{i,j}~:=~N_{j,d,\tau} of each control point p_j to get the projection from the control--point--space into the object--space:


v_i = \sum_j n_{i,j} \cdot p_j = \vec{n}_i^{T} \vec{p}

or written for all points at the same time:


\vec{v} = \vec{N} \vec{p}

where \vec{N} is the n \times m transformation--matrix (later on called deformation matrix) for n object--space--points and m control--points.

\begin{figure}[ht] \begin{center} \includegraphics[width=\textwidth]{img/unity.png} \end{center} \caption[B--spline--basis--function as partition of unity]{From \cite[Figure 2.13]{brunet2010contributions}:\newline \glqq Some interesting properties of the B--splines. On the natural definition domain of the B--spline ([k_0,k_4] on this figure), the B--spline basis functions sum up to one (partition of unity). In this example, we use B--splines of degree 2. The horizontal segment below the abscissa axis represents the domain of influence of the B--splines basis function, i.e. the interval on which they are not null. At a given point, there are at most d+1 non-zero B--spline basis functions (compact support).\grqq \newline Note, that Brunet starts his index at -d opposed to our definition, where we start at 0.} \label{fig:partition_unity} \end{figure}

Furthermore B--splines--basis--functions form a partition of unity for all, but the first and last d control-points\cite{brunet2010contributions}. Therefore we later on use the border-points d+1 times, such that $\sum_j n_{i,j} p_j = p_i$ for these points.

The locality of the influence of each control--point and the partition of unity was beautifully pictured by Brunet, which we included here as figure \ref{fig:partition_unity}.

Why is \ac{FFD} a good deformation function?

\label{sec🔙ffdgood}

The usage of \ac{FFD} as a tool for manipulating follows directly from the properties of the polynomials and the correspondence to the control points. Having only a few control points gives the user a nicer high--level--interface, as she only needs to move these points and the model follows in an intuitive manner. The deformation is smooth as the underlying polygon is smooth as well and affects as many vertices of the model as needed. Moreover the changes are always local so one risks not any change that a user cannot immediately see.

But there are also disadvantages of this approach. The user loses the ability to directly influence vertices and even seemingly simple tasks as creating a plateau can be difficult to achieve\cite[chapter~3.2]{hsu1991dmffd}\cite{hsu1992direct}.

This disadvantages led to the formulation of \acf{DM--FFD}\cite[chapter~3.3]{hsu1991dmffd} in which the user directly interacts with the surface--mesh. All interactions will be applied proportionally to the control--points that make up the parametrization of the interaction--point itself yielding a smooth deformation of the surface at the surface without seemingly arbitrary scattered control--points. Moreover this increases the efficiency of an evolutionary optimization\cite{Menzel2006}, which we will use later on.

\begin{figure}[!ht] \includegraphics[width=\textwidth]{img/hsu_fig7.png} \caption{Figure 7 from \cite{hsu1991dmffd}.} \label{fig:hsu_fig7} \end{figure}

But this approach also has downsides as can be seen in figure \ref{fig:hsu_fig7}, as the tessellation of the invisible grid has a major impact on the deformation itself.

All in all \ac{FFD} and \ac{DM--FFD} are still good ways to deform a high--polygon mesh albeit the downsides.

What is evolutionary optimization?

\label{sec🔙evo}

In this thesis we are using an evolutionary optimization strategy to solve the problem of finding the best parameters for our deformation. This approach, however, is very generic and we introduce it here in a broader sense.

\begin{algorithm} \caption{An outline of evolutionary algorithms} \label{alg:evo} \begin{algorithmic} \STATE t := 0; \STATE initialize P(0) := \{\vec{a}_1(0),\dots,\vec{a}_\mu(0)\} \in I^\mu; \STATE evaluate F(0) : \{\Phi(x) | x \in P(0)\}; \WHILE{c(F(t)) \neq \TRUE} \STATE recombine: P(t) := r(P(t)); \STATE mutate: P''(t) := m(P(t)); \STATE evaluate $F''(t) : {\Phi(x) | x \in P''(t)}$ \STATE select: P(t + 1) := s(P''(t) \cup Q,\Phi); \STATE t := t + 1; \ENDWHILE \end{algorithmic} \end{algorithm}

The general shape of an evolutionary algorithm (adapted from \cite{back1993overview}) is outlined in Algorithm \ref{alg:evo}. Here, P(t) denotes the population of parameters in step t of the algorithm. The population contains \mu individuals a_i from the possible individual--set I that fit the shape of the parameters we are looking for. Typically these are initialized by a random guess or just zero. Further on we need a so--called fitness--function \Phi : I \mapsto M that can take each parameter to a measurable space M (usually M = \mathbb{R}) along a convergence--function $c : I \mapsto \mathbb{B}$ that terminates the optimization.

Biologically speaking the set I corresponds to the set of possible Genotypes while M represents the possible observable Phenotypes. \improvement[inline]{Erklären, was das ist. Quellen!}

The main algorithm just repeats the following steps:

  • Recombine with a recombination--function r : I^{\mu} \mapsto I^{\lambda} to generate \lambda new individuals based on the characteristics of the $\mu$ parents.
    This makes sure that the next guess is close to the old guess.
  • Mutate with a mutation--function m : I^{\lambda} \mapsto I^{\lambda} to introduce new effects that cannot be produced by mere recombination of the parents.
    Typically this just adds minor defects to individual members of the population like adding a random gaussian noise or amplifying/dampening random parts.
  • Selection takes a selection--function s : (I^\lambda \cup I^{\mu + \lambda},\Phi) \mapsto I^\mu that selects from the previously generated I^\lambda children and optionally also the parents (denoted by the set Q in the algorithm) using the fitness--function \Phi. The result of this operation is the next Population of \mu individuals.

All these functions can (and mostly do) have a lot of hidden parameters that can be changed over time.

\improvement[inline]{Genauer: Welche? Wo? Wieso? ...}

Advantages of evolutionary algorithms

\label{sec🔙evogood}

The main advantage of evolutionary algorithms is the ability to find optima of general functions just with the help of a given fitness--function. Components and techniques for evolutionary algorithms are specifically known to help with different problems arising in the domain of optimization\cite{weise2012evolutionary}. An overview of the typical problems are shown in figure \ref{fig:probhard}.

\begin{figure}[!ht] \includegraphics[width=\textwidth]{img/weise_fig3.png} \caption{Fig.~3. taken from \cite{weise2012evolutionary}} \label{fig:probhard} \end{figure}

Most of the advantages stem from the fact that a gradient--based procedure has only one point of observation from where it evaluates the next steps, whereas an evolutionary strategy starts with a population of guessed solutions. Because an evolutionary strategy modifies the solution randomly, keeping the best solutions and purging the worst, it can also target multiple different hypothesis at the same time where the local optima die out in the face of other, better candidates.

\improvement[inline]{Verweis auf MO-CMA etc. Vielleicht auch etwas ausführlicher.}

If an analytic best solution exists and is easily computable (i.e. because the error--function is convex) an evolutionary algorithm is not the right choice. Although both converge to the same solution, the analytic one is usually faster.

But in reality many problems have no analytic solution, because the problem is either not convex or there are so many parameters that an analytic solution (mostly meaning the equivalence to an exhaustive search) is computationally not feasible. Here evolutionary optimization has one more advantage as one can at least get suboptimal solutions fast, which then refine over time and still converge to the same solution.

Criteria for the evolvability of linear deformations

\label{sec:intro:rvi}

As we have established in chapter \ref{sec🔙ffd}, we can describe a deformation by the formula


\vec{V} = \vec{U}\vec{P}

where \vec{V} is a n \times d matrix of vertices, \vec{U} are the (during parametrization) calculated deformation--coefficients and P is a m \times d matrix of control--points that we interact with during deformation.

We can also think of the deformation in terms of differences from the original coordinates


\Delta \vec{V} = \vec{U} \cdot \Delta \vec{P}

which is isomorphic to the former due to the linear correlation in the deformation. One can see in this way, that the way the deformation behaves lies solely in the entries of \vec{U}, which is why the three criteria focus on this.

Variability

In \cite{anrichterEvol} variability is defined as

\mathrm{variability}(\vec{U}) := \frac{\mathrm{rank}(\vec{U})}{n},

whereby \vec{U} is the n \times m deformation--Matrix used to map the $m$ control points onto the n vertices.

Given n = m, an identical number of control--points and vertices, this quotient will be =1 if all control points are independent of each other and the solution is to trivially move every control--point onto a target--point.

In praxis the value of V(\vec{U}) is typically \ll 1, because as there are only few control--points for many vertices, so m \ll n.

This criterion should correlate to the degrees of freedom the given parametrization has. This can be seen from the fact, that \mathrm{rank}(\vec{U}) is limited by \min(m,n) and --- as n is constant --- can never exceed n.

The rank itself is also interesting, as control--points could theoretically be placed on top of each other or be linear dependent in another way --- but will in both cases lower the rank below the number of control--points m and are thus measurable by the variability.

Regularity

Regularity is defined\cite{anrichterEvol} as

\mathrm{regularity}(\vec{U}) := \frac{1}{\kappa(\vec{U})} = \frac{\sigma_{min}}{\sigma_{max}}

where \sigma_{min} and \sigma_{max} are the smallest and greatest right singular value of the deformation--matrix \vec{U}.

As we deform the given Object only based on the parameters as $\vec{p} \mapsto f(\vec{x} + \vec{U}\vec{p})$ this makes sure that $|\vec{Up}| \propto |\vec{p}|$ when \kappa(\vec{U}) \approx 1. The inversion of $\kappa(\vec{U})$ is only performed to map the criterion--range to [0..1], whereas 1 is the optimal value and 0 is the worst value.

On the one hand this criterion should be characteristic for numeric stability\cite[chapter 2.7]{golub2012matrix} and on the other hand for the convergence speed of evolutionary algorithms\cite{anrichterEvol} as it is tied to the notion of locality\cite{weise2012evolutionary,thorhauer2014locality}.

Improvement Potential

In contrast to the general nature of variability and regularity, which are agnostic of the fitness--function at hand, the third criterion should reflect a notion of the potential for optimization, taking a guess into account.

Most of the times some kind of gradient g is available to suggest a direction worth pursuing; either from a previous iteration or by educated guessing. We use this to guess how much change can be achieved in the given direction.

The definition for an improvement potential P is\cite{anrichterEvol}:


\mathrm{potential}(\vec{U}) := 1 - \|(\vec{1} - \vec{UU}^+)\vec{G}\|^2_F

\unsure[inline]{ist das ^2 richtig?} given some approximate n \times d fitness--gradient \vec{G}, normalized to \|\vec{G}\|_F = 1, whereby \|\cdot\|_F denotes the Frobenius--Norm.

Implementation of \acf{FFD}

\label{sec:impl}

The general formulation of B--Splines has two free parameters d and $\tau$ which must be chosen beforehand.

As we usually work with regular grids in our \ac{FFD} we define $\tau$ statically as \tau_i = \nicefrac{i}{n} whereby n is the number of control--points in that direction.

d defines the degree of the B--Spline--Function (the number of times this function is differentiable) and for our purposes we fix d to 3, but give the formulas for the general case so it can be adapted quite freely.

Adaption of \ac{FFD}

\label{sec:ffd:adapt}

As we have established in Chapter \ref{sec🔙ffd} we can define an \ac{FFD}--displacement as \begin{equation} \Delta_x(u) = \sum_i N_{i,d,\tau_i}(u) \Delta_x c_i \end{equation}

Note that we only sum up the $\Delta$--displacements in the control points c_i to get the change in position of the point we are interested in.

In this way every deformed vertex is defined by


\textrm{Deform}(v_x) = v_x + \Delta_x(u)

with u \in [0..1[ being the variable that connects the high--detailed vertex--mesh to the low--detailed control--grid. To actually calculate the new position of the vertex we first have to calculate the $u$--value for each vertex. This is achieved by finding out the parametrization of v in terms of c_i


v_x \overset{!}{=} \sum_i N_{i,d,\tau_i}(u) c_i

so we can minimize the error between those two:


\underset{u}{\argmin}\,Err(u,v_x) = \underset{u}{\argmin}\,2 \cdot \|v_x - \sum_i N_{i,d,\tau_i}(u) c_i\|^2_2

As this error--term is quadratic we just derive by u yielding


\begin{array}{rl}
\frac{\partial}{\partial u} & v_x - \sum_i N_{i,d,\tau_i}(u) c_i \\
= & - \sum_i \left( \frac{d}{\tau_{i+d} - \tau_i} N_{i,d-1,\tau}(u) - \frac{d}{\tau_{i+d+1} - \tau_{i+1}} N_{i+1,d-1,\tau}(u) \right) c_i
\end{array}

and do a gradient--descend to approximate the value of u up to an \epsilon of 0.0001.

For this we use the Gauss--Newton algorithm\cite{gaussNewton} \todo[inline]{rewrite. falsch und wischi-waschi. Least squares?} as the solution to this problem may not be deterministic, because we usually have way more vertices than control points (\#v~\gg~\#c).

Adaption of \ac{FFD} for a 3D--Mesh

\label{3dffd}

This is a straightforward extension of the 1D--method presented in the last chapter. But this time things get a bit more complicated. As we have a 3--dimensional grid we may have a different amount of control--points in each direction.

Given n,m,o control points in $x,y,z$--direction each Point on the curve is defined by

V(u,v,w) = \sum_i \sum_j \sum_k N_{i,d,\tau_i}(u) N_{j,d,\tau_j}(v) N_{k,d,\tau_k}(w) \cdot C_{ijk}.

In this case we have three different B--Splines (one for each dimension) and also 3 variables u,v,w for each vertex we want to approximate.

Given a target vertex \vec{p}^* and an initial guess $\vec{p}=V(u,v,w)$ we define the error--function for the gradient--descent as:

Err(u,v,w,\vec{p}^{*}) = \vec{p}^{*} - V(u,v,w)

And the partial version for just one direction as

Err_x(u,v,w,\vec{p}^{*}) = p^{*}_x - \sum_i \sum_j \sum_k N_{i,d,\tau_i}(u) N_{j,d,\tau_j}(v) N_{k,d,\tau_k}(w) \cdot {c_{ijk}}_x 

To solve this we derive partially, like before:


\begin{array}{rl}
    \displaystyle \frac{\partial Err_x}{\partial u} & p^{*}_x - \displaystyle \sum_i \sum_j \sum_k N_{i,d,\tau_i}(u) N_{j,d,\tau_j}(v) N_{k,d,\tau_k}(w) \cdot {c_{ijk}}_x \\
  = & \displaystyle - \sum_i \sum_j \sum_k N'_{i,d,\tau_i}(u) N_{j,d,\tau_j}(v) N_{k,d,\tau_k}(w) \cdot {c_{ijk}}_x
\end{array}

The other partial derivatives follow the same pattern yielding the Jacobian:


J(Err(u,v,w)) = 
\left(
\begin{array}{ccc}
\frac{\partial Err_x}{\partial u} & \frac{\partial Err_x}{\partial v} & \frac{\partial Err_x}{\partial w} \\
\frac{\partial Err_y}{\partial u} & \frac{\partial Err_y}{\partial v} & \frac{\partial Err_y}{\partial w} \\
\frac{\partial Err_z}{\partial u} & \frac{\partial Err_z}{\partial v} & \frac{\partial Err_z}{\partial w}
\end{array}
\right)

\scriptsize
=
\left(
\begin{array}{ccc}
- \displaystyle \sum_{i,j,k} N'_{i}(u) N_{j}(v) N_{k}(w) \cdot {c_{ijk}}_x &- \displaystyle \sum_{i,j,k} N_{i}(u) N'_{j}(v) N_{k}(w) \cdot {c_{ijk}}_x & - \displaystyle \sum_{i,j,k} N_{i}(u) N_{j}(v) N'_{k}(w) \cdot {c_{ijk}}_x \\
- \displaystyle \sum_{i,j,k} N'_{i}(u) N_{j}(v) N_{k}(w) \cdot {c_{ijk}}_y &- \displaystyle \sum_{i,j,k} N_{i}(u) N'_{j}(v) N_{k}(w) \cdot {c_{ijk}}_y & - \displaystyle \sum_{i,j,k} N_{i}(u) N_{j}(v) N'_{k}(w) \cdot {c_{ijk}}_y \\
- \displaystyle \sum_{i,j,k} N'_{i}(u) N_{j}(v) N_{k}(w) \cdot {c_{ijk}}_z &- \displaystyle \sum_{i,j,k} N_{i}(u) N'_{j}(v) N_{k}(w) \cdot {c_{ijk}}_z & - \displaystyle \sum_{i,j,k} N_{i}(u) N_{j}(v) N'_{k}(w) \cdot {c_{ijk}}_z
\end{array}
\right)

With the Gauss--Newton algorithm we iterate via the formula

J(Err(u,v,w)) \cdot \Delta \left( \begin{array}{c} u \\ v \\ w \end{array} \right) = -Err(u,v,w)

and use Cramers rule for inverting the small Jacobian and solving this system of linear equations.

As there is no strict upper bound of the number of iterations for this algorithm, we just iterate it long enough to be within the given $\epsilon$--error above. This takes --- depending on the shape of the object and the grid --- about 3 to 5 iterations that we observed in practice.

Another issue that we observed in our implementation is, that multiple local optima may exist on self--intersecting grids. We solve this problem by defining self--intersecting grids to be invalid and do not test any of them.

This is not such a big problem as it sounds at first, as self--intersections mean, that control--points being further away from a given vertex have more influence over the deformation than control--points closer to this vertex. Also this contradicts the notion of locality that we want to achieve and deemed beneficial for a good behaviour of the evolutionary algorithm.

Deformation Grid

\label{sec:impl:grid}

As mentioned in chapter \ref{sec🔙evo}, the way of choosing the representation to map the general problem (mesh--fitting/optimization in our case) into a parameter-space is very important for the quality and runtime of evolutionary algorithms\cite{Rothlauf2006}.

Because our control--points are arranged in a grid, we can accurately represent each vertex--point inside the grids volume with proper B--Spline--coefficients between [0,1[ and --- as a consequence --- we have to embed our object into it (or create constant "dummy"-points outside).

The great advantage of B--Splines is the local, direct impact of each control point without having a $1:1$--correlation, and a smooth deformation. While the advantages are great, the issues arise from the problem to decide where to place the control--points and how many to place at all.

\begin{figure}[!tbh] \centering \includegraphics{img/enoughCP.png} \caption[Example of a high resolution control--grid]{A high resolution (10 \times 10) of control--points over a circle. Yellow/green points contribute to the parametrization, red points don't.\newline An Example--point (blue) is solely determined by the position of the green control--points.} \label{fig:enoughCP} \end{figure}

One would normally think, that the more control--points you add, the better the result will be, but this is not the case for our B--Splines. Given any point \vec{p} only the 2 \cdot (d-1) control--points contribute to the parametrization of that point^[Normally these are d-1 to each side, but at the boundaries the number gets increased to the inside to meet the required smoothness]. This means, that a high resolution can have many control-points that are not contributing to any point on the surface and are thus completely irrelevant to the solution.

We illustrate this phenomenon in figure \ref{fig:enoughCP}, where the four red central points are not relevant for the parametrization of the circle. This leads to artefacts in the deformation--matrix \vec{U}, as the columns corresponding to those control--points are 0.

This leads to useless increased complexity, as the parameters corresponding to those points will never have any effect, but a naive algorithm will still try to optimize them yielding numeric artefacts in the best and non--terminating or ill--defined solutions^[One example would be, when parts of an algorithm depend on the inverse of the minimal right singular value leading to a division by 0.] at worst.

One can of course neglect those columns and their corresponding control--points, but this raises the question why they were introduced in the first place. We will address this in a special scenario in \ref{sec:res:3d:var}.

For our tests we chose different uniformly sized grids and added noise onto each control-point^[For the special case of the outer layer we only applied noise away from the object, so the object is still confined in the convex hull of the control--points.] to simulate different starting-conditions.

Scenarios for testing evolvability criteria using \ac{FFD}

\label{sec:eval}

In our experiments we use the same two testing--scenarios, that were also used by \cite{anrichterEvol}. The first scenario deforms a plane into a shape originally defined in \cite{giannelli2012thb}, where we setup control-points in a 2--dimensional manner and merely deform in the height--coordinate to get the resulting shape.

In the second scenario we increase the degrees of freedom significantly by using a 3--dimensional control--grid to deform a sphere into a face, so each control point has three degrees of freedom in contrast to first scenario.

Test Scenario: 1D Function Approximation

In this scenario we used the shape defined by Giannelli et al.\cite{giannelli2012thb}, which is also used by Richter et al.\cite{anrichterEvol} using the same discretization to 150 \times 150 points for a total of n = 22\,500 vertices. The shape is given by the following definition \begin{equation} t(x,y) = \begin{cases} 0.5 \cos(4\pi \cdot q^{0.5}) + 0.5 & q(x,y) < \frac{1}{16},\ 2(y-x) & 0 < y-x < 0.5,\ 1 & 0.5 < y - x \end{cases} \end{equation} with (x,y) \in [0,2] \times [0,1] and q(x,y)=(x-1.5)^2 + (y-0.5)^2, which we have visualized in figure \ref{fig:1dtarget}.

\begin{figure}[ht] \begin{center} \includegraphics[width=0.7\textwidth]{img/1dtarget.png} \end{center} \caption[The 1D--target--shape]{The target--shape for our 1--dimensional optimization--scenario including a wireframe--overlay of the vertices.} \label{fig:1dtarget} \end{figure}

As the starting-plane we used the same shape, but set all $z$--coordinates to 0, yielding a flat plane, which is partially already correct.

Regarding the fitness--function \mathrm{f}(\vec{p}), we use the very simple approach of calculating the squared distances for each corresponding vertex \begin{equation} \mathrm{f}(\vec{p}) = \sum_{i=1}^{n} |(\vec{Up})_i - t_i|_2^2 = |\vec{Up} - \vec{t}|^2 \rightarrow \min \end{equation} where t_i are the respective target--vertices to the parametrized source--vertices^[The parametrization is encoded in \vec{U} and the initial position of the control points. See \ref{sec:ffd:adapt}] with the current deformation--parameters \vec{p} = (p_1,\dots, p_m). We can do this one--to--one--correspondence because we have exactly the same number of source and target-vertices do to our setup of just flattening the object.

This formula is also the least--squares approximation error for which we can compute the analytic solution \vec{p^{*}} = \vec{U^+}\vec{t}, yielding us the correct gradient in which the evolutionary optimizer should move.

Test Scenario: 3D Function Approximation

\label{sec:test:3dfa} Opposed to the 1--dimensional scenario before, the 3--dimensional scenario is much more complex --- not only because we have more degrees of freedom on each control point, but also, because the fitness--function we will use has no known analytic solution and multiple local minima.

\begin{figure}[ht] \begin{center} \includegraphics[width=0.9\textwidth]{img/3dtarget.png} \end{center} \caption[3D source and target meshes]{\newline Left: The sphere we start from with 10,807 vertices\newline Right: The face we want to deform the sphere into with 12,024 vertices.} \label{fig:3dtarget} \end{figure}

First of all we introduce the set up: We have given a triangulated model of a sphere consisting of 10\,807 vertices, that we want to deform into a the target--model of a face with a total of 12\,024 vertices. Both of these Models can be seen in figure \ref{fig:3dtarget}.

Opposed to the 1D--case we cannot map the source and target--vertices in a one--to--one--correspondence, which we especially need for the approximation of the fitting--error. Hence we state that the error of one vertex is the distance to the closest vertex of the other model and sum up the error from the respective source and target.

We therefore define the fitness--function to be:

\begin{equation} \mathrm{f}(\vec{P}) = \frac{1}{n} \underbrace{\sum_{i=1}^n |\vec{c_T(s_i)} - \vec{s_i}|2^2}{\textrm{source-to-target--distance}}

  • \frac{1}{m} \underbrace{\sum_{i=1}^m |\vec{c_S(t_i)} - \vec{t_i}|2^2}{\textrm{target-to-source--distance}}
  • \lambda \cdot \textrm{regularization}(\vec{P}) \label{eq:fit3d} \end{equation}

where \vec{c_T(s_i)} denotes the target--vertex that is corresponding to the source--vertex \vec{s_i} and \vec{c_S(t_i)} denotes the source--vertex that corresponds to the target--vertex \vec{t_i}. Note that the target--vertices are given and fixed by the target--model of the face we want to deform into, whereas the source--vertices vary depending on the chosen parameters \vec{P}, as those get calculated by the previously introduces formula $\vec{S} = \vec{UP}$ with \vec{S} being the $n \times 3$--matrix of source--vertices, \vec{U} the $n \times m$--matrix of calculated coefficients for the \ac{FFD} --- analog to the 1D case --- and finally \vec{P} being the $m \times 3$--matrix of the control--grid defining the whole deformation.

As regularization-term we add a weighted Laplacian of the deformation that has been used before by Aschenbach et al.\cite[Section 3.2]{aschenbach2015} on similar models and was shown to lead to a more precise fit. The Laplacian \begin{equation} \mathrm{regularization}(\vec{P}) = \frac{1}{\sum_i A_i} \sum_{i=1}^n A_i \cdot \left( \sum_{\vec{s_j} \in \mathcal{N}(\vec{s_i})} w_j \cdot |\Delta \vec{s_j} - \Delta \vec{\overline{s}_j}|^2 \right) \label{eq:reg3d} \end{equation} \unsure[inline]{was ist \vec{\overline{s}_j}? Zentrum? eigentlich s_i?} is determined by the cotangent weighted displacement w_j of the to $s_i$ connected vertices \mathcal{N}(s_i) and A_i is the Voronoi--area of the corresponding vertex \vec{s_i}. We leave out the $\vec{R}_i$--term from the original paper as our deformation is merely linear.

This regularization--weight gives us a measure of stiffness for the material that we will influence via the $\lambda$--coefficient to start out with a stiff material that will get more flexible per iteration. \unsure[inline]{Andreas: hast du nen cite, wo gezeigt ist, dass das so sinnvoll ist?}

Evaluation of Scenarios

\label{sec:res}

To compare our results to the ones given by Richter et al.\cite{anrichterEvol}, we also use Spearman's rank correlation coefficient. Opposed to other popular coefficients, like the Pearson correlation coefficient, which measures a linear relationship between variables, the Spearmans's coefficient assesses \glqq how well an arbitrary monotonic function can describe the relationship between two variables, without making any assumptions about the frequency distribution of the variables\grqq\cite{hauke2011comparison}.

As we don't have any prior knowledge if any of the criteria is linear and we are just interested in a monotonic relation between the criteria and their predictive power, the Spearman's coefficient seems to fit out scenario best and was also used before by Richter et al.\cite{anrichterEvol}

For interpretation of these values we follow the same interpretation used in \cite{anrichterEvol}, based on \cite{weir2015spearman}: The coefficient intervals r_S \in [0,0.2[, [0.2,0.4[, [0.4,0.6[, [0.6,0.8[, and [0.8,1] are classified as very weak, weak, moderate, strong and very strong. We interpret p--values smaller than 0.01 as significant and cut off the precision of p--values after four decimal digits (thus often having a p--value of 0 given for p--values < 10^{-4}).

As we are looking for anti--correlation (i.e. our criterion should be maximized indicating a minimal result in --- for example --- the reconstruction--error) instead of correlation we flip the sign of the correlation--coefficient for readability and to have the correlation--coefficients be in the classification--range given above.

For the evolutionary optimization we employ the CMA--ES (covariance matrix adaptation evolution strategy) of the shark3.1 library \cite{shark08}, as this algorithm was used by \cite{anrichterEvol} as well. We leave the parameters at their sensible defaults as further explained in \cite[Appendix~A: Table~1]{hansen2016cma}.

Procedure: 1D Function Approximation

\label{sec:proc:1d}

For our setup we first compute the coefficients of the deformation--matrix and use then the formulas for variability and regularity to get our predictions. Afterwards we solve the problem analytically to get the (normalized) correct gradient that we use as guess for the improvement potential. To check we also consider a distorted gradient \vec{g}_{\textrm{d}}


\vec{g}_{\textrm{d}} = \frac{\vec{g}_{\textrm{c}} + \mathbb{1}}{\|\vec{g}_{\textrm{c}} + \mathbb{1}\|}

where \mathbb{1} is the vector consisting of 1 in every dimension and \vec{g}_\textrm{c} = \vec{p^{*}} - \vec{p} the calculated correct gradient. As we always start with a gradient of \mathbb{0} this shortens to \vec{g}_\textrm{c} = \vec{p^{*}}.

\begin{figure}[ht] \begin{center} \includegraphics[width=\textwidth]{img/example1d_grid.png} \end{center} \caption[Example of a 1D--grid]{\newline Left: A regular $7 \times 4$--grid\newline Right: The same grid after a random distortion to generate a testcase.} \label{fig:example1d_grid} \end{figure}

We then set up a regular 2--dimensional grid around the object with the desired grid resolutions. To generate a testcase we then move the grid--vertices randomly inside the x--y--plane. As self-intersecting grids get tricky to solve with our implemented newtons--method we avoid the generation of such self--intersecting grids for our testcases (see section \ref{3dffd}).

To achieve that we select a uniform distributed number r \in [-0.25,0.25] per dimension and shrink the distance to the neighbours (the smaller neighbour for r < 0, the larger for r > 0) by the factor $r$^[Note: On the Edges this displacement is only applied outwards by flipping the sign of r, if appropriate.]. \improvement[inline]{update!! gaussian, not uniform!!}

An Example of such a testcase can be seen for a $7 \times 4$--grid in figure \ref{fig:example1d_grid}.

Results of 1D Function Approximation

In the case of our 1D--Optimization--problem, we have the luxury of knowing the analytical solution to the given problem--set. We use this to experimentally evaluate the quality criteria we introduced before. As an evolutional optimization is partially a random process, we use the analytical solution as a stopping-criteria. We measure the convergence speed as number of iterations the evolutional algorithm needed to get within 1.05 \times of the optimal solution.

We used different regular grids that we manipulated as explained in Section \ref{sec:proc:1d} with a different number of control points. As our grids have to be the product of two integers, we compared a $5 \times 5$--grid with $25$ control--points to a 4 \times 7 and $7 \times 4$--grid with $28$ control--points. This was done to measure the impact an \glqq improper\grqq \ setup could have and how well this is displayed in the criteria we are examining.

Additionally we also measured the effect of increasing the total resolution of the grid by taking a closer look at 5 \times 5, 7 \times 7 and 10 \times 10 grids.

Variability

\begin{figure}[tbh] \centering \includegraphics[width=0.7\textwidth]{img/evolution1d/variability_boxplot.png} \caption[1D Fitting Errors for various grids]{The squared error for the various grids we examined.\newline Note that 7 \times 4 and 4 \times 7 have the same number of control--points.} \label{fig:1dvar} \end{figure}

Variability should characterize the potential for design space exploration and is defined in terms of the normalized rank of the deformation matrix \vec{U}: V(\vec{U}) := \frac{\textrm{rank}(\vec{U})}{n}, whereby n is the number of vertices. As all our tested matrices had a constant rank (being m = x \cdot y for a $x \times y$ grid), we have merely plotted the errors in the boxplot in figure \ref{fig:1dvar}

It is also noticeable, that although the 7 \times 4 and 4 \times 7 grids have a higher variability, they perform not better than the 5 \times 5 grid. Also the 7 \times 4 and 4 \times 7 grids differ distinctly from each other with a mean$\pm$sigma of 233.09 \pm 12.32 for the former and 286.32 \pm 22.36 for the latter, although they have the same number of control--points. This is an indication of an impact a proper or improper grid--setup can have. We do not draw scientific conclusions from these findings, as more research on non-squared grids seem necessary.

Leaving the issue of the grid--layout aside we focused on grids having the same number of prototypes in every dimension. For the 5 \times 5, 7 \times 7 and 10 \times 10 grids we found a very strong correlation (-r_S = 0.94, p = 0) between the variability and the evolutionary error.

Regularity

\begin{figure}[tbh] \centering \includegraphics[width=\textwidth]{img/evolution1d/55_to_1010_steps.png} \caption[Improvement potential and regularity vs. steps]{\newline Left: Improvement potential against steps until convergence\newline Right: Regularity against steps until convergence\newline Coloured by their grid--resolution, both with a linear fit over the whole dataset.} \label{fig:1dreg} \end{figure}

\begin{table}[b] \centering \begin{tabular}{c|c|c|c|c} 5 \times 5 & 7 \times 4 & 4 \times 7 & 7 \times 7 & $10 \times 10$\ \hline 0.28 (0.0045) & \textcolor{red}{$0.21$} (0.0396) & \textcolor{red}{$0.1$} (0.3019) & \textcolor{red}{$0.01$} (0.9216) & \textcolor{red}{$0.01$} (0.9185) \end{tabular} \caption[Correlation 1D Regularity/Steps]{Spearman's correlation (and p-values) between regularity and convergence speed for the 1D function approximation problem. \newline Note: Not significant results are marked in \textcolor{red}{red}. } \label{tab:1dreg} \end{table}

Regularity should correspond to the convergence speed (measured in iteration--steps of the evolutionary algorithm), and is computed as inverse condition number \kappa(\vec{U}) of the deformation--matrix.

As can be seen from table \ref{tab:1dreg}, we could only show a weak correlation in the case of a 5 \times 5 grid. As we increment the number of control--points the correlation gets worse until it is completely random in a single dataset. Taking all presented datasets into account we even get a strong correlation of - r_S = -0.72, p = 0, that is opposed to our expectations.

To explain this discrepancy we took a closer look at what caused these high number of iterations. In figure \ref{fig:1dreg} we also plotted the improvement-potential against the steps next to the regularity--plot. Our theory is that the very strong correlation (-r_S = -0.82, p=0) between improvement--potential and number of iterations hints that the employed algorithm simply takes longer to converge on a better solution (as seen in figure \ref{fig:1dvar} and \ref{fig:1dimp}) offsetting any gain the regularity--measurement could achieve.

Improvement Potential

\begin{figure}[ht] \centering \includegraphics[width=0.8\textwidth]{img/evolution1d/55_to_1010_improvement-vs-evo-error.png} \caption[Correlation 1D Improvement vs. Error]{Improvement potential plotted against the error yielded by the evolutionary optimization for different grid--resolutions} \label{fig:1dimp} \end{figure}

\improvement[inline]{write something about it..}

  • spearman 1 (p=0)
  • gradient macht keinen unterschied
  • UU^+ scheint sehr kleine EW zu haben, s. regularität
  • trotzdem sehr gutes kriterium - auch ohne Richtung.

Procedure: 3D Function Approximation

\label{sec:proc:3dfa}

As explained in section \ref{sec:test:3dfa} in detail, we do not know the analytical solution to the global optimum. Additionally we have the problem of finding the right correspondences between the original sphere--model and the target--model, as they consist of 10\,807 and 12\,024 vertices respectively, so we cannot make a one--to--one--correspondence between them as we did in the one--dimensional case.

Initially we set up the correspondences \vec{c_T(\dots)} and \vec{c_S(\dots)} to be the respectively closest vertices of the other model. We then calculate the analytical solution given these correspondences via \vec{P^{*}} = \vec{U^+}\vec{T}, and also use the first solution as guessed gradient for the calculation of the improvement--potential, as the optimal solution is not known. We then let the evolutionary algorithm run up within 1.05 times the error of this solution and afterwards recalculate the correspondences $\vec{c_T(\dots)}$ and \vec{c_S(\dots)}.

\begin{figure}[ht] \begin{center} \includegraphics[width=\textwidth]{img/example3d_grid.png} \end{center} \caption[Example of a 3D--grid]{\newline Left: The 3D--setup with a $4\times 4\times 4$--grid.\newline Right: The same grid after added noise to the control--points.} \label{fig:setup3d} \end{figure}

For the next step we then halve the regularization--impact \lambda (starting at 1) of our fitness--function (\ref{eq:fit3d}) and calculate the next incremental solution \vec{P^{*}} = \vec{U^+}\vec{T} with the updated correspondences (again, mapping each vertex to its closest neighbour in the respective other model) to get our next target--error. We repeat this process as long as the target--error keeps decreasing and use the number of these iterations as measure of the convergence speed. As the resulting evolutional error without regularization is in the numeric range of \approx 100, whereas the regularization is numerically \approx 7000 we need at least 10 to $15$ iterations until the regularization--effect wears off.

The grid we use for our experiments is just very coarse due to computational limitations. We are not interested in a good reconstruction, but an estimate if the mentioned evolvability criteria are good.

In figure \ref{fig:setup3d} we show an example setup of the scene with a $4\times 4\times 4$--grid. Identical to the 1--dimensional scenario before, we create a regular grid and move the control-points \improvement{Beschreiben wie} random between their neighbours, but in three instead of two dimensions^[Again, we flip the signs for the edges, if necessary to have the object still in the convex hull.].

\begin{figure}[!htb] \includegraphics[width=\textwidth]{img/3d_grid_resolution.png} \caption[Different resolution of 3D grids]{\newline Left: A 7 \times 4 \times 4 grid suited to better deform into facial features.\newline Right: A 4 \times 4 \times 7 grid that we expect to perform worse.} \label{fig:3dgridres} \end{figure}

As is clearly visible from figure \ref{fig:3dgridres}, the target--model has many vertices in the facial area, at the ears and in the neck--region. Therefore we chose to increase the grid-resolutions for our tests in two different dimensions and see how well the criteria predict a suboptimal placement of these control-points.

Results of 3D Function Approximation

In the 3D--Approximation we tried to evaluate further on the impact of the grid--layout to the overall criteria. As the target--model has many vertices in concentrated in the facial area we start from a 4 \times 4 \times 4 grid and only increase the number of control points in one dimension, yielding a resolution of 7 \times 4 \times 4 and 4 \times 4 \times 7 respectively. We visualized those two grids in figure \ref{fig:3dgridres}.

To evaluate the performance of the evolvability--criteria we also tested a more neutral resolution of 4 \times 4 \times 4, 5 \times 5 \times 5, and 6 \times 6 \times 6 --- similar to the 1D--setup.

\begin{figure}[ht] \centering \includegraphics[width=0.7\textwidth]{img/evolution3d/variability_boxplot.png} \caption[3D Fitting Errors for various grids]{The fitting error for the various grids we examined.\newline Note that the number of control--points is a product of the resolution, so $X \times 4 \times 4$ and 4 \times 4 \times X have the same number of control--points.} \label{fig:3dvar} \end{figure}

Variability

\label{sec:res:3d:var}

\begin{table}[tbh] \centering \begin{tabular}{c|c|c|c} 4 \times 4 \times \mathrm{X} & \mathrm{X} \times 4 \times 4 & \mathrm{Y} \times \mathrm{Y} \times \mathrm{Y} & all \ \hline 0.89 (0) & 0.9 (0) & 0.91 (0) & 0.94 (0) \end{tabular} \caption[Correlation between variability and fitting error for 3D]{Correlation between variability and fitting error for the 3D fitting scenario.\newline Displayed are the negated Spearman coefficients with the corresponding p-values in brackets for three cases of increasing variability ($\mathrm{X} \in [4,5,7], \mathrm{Y} \in [4,5,6]$). \newline Note: Not significant results are marked in \textcolor{red}{red}.} \label{tab:3dvar} \end{table}

Similar to the 1D case all our tested matrices had a constant rank (being m = x \cdot y \cdot z for a x \times y \times z grid), so we again have merely plotted the errors in the boxplot in figure \ref{fig:3dvar}.

As expected the \mathrm{X} \times 4 \times 4 grids performed slightly better than their 4 \times 4 \times \mathrm{X} counterparts with a mean$\pm$sigma of 101.25 \pm 7.45 to 102.89 \pm 6.74 for \mathrm{X} = 5 and 85.37 \pm 7.12 to 89.22 \pm 6.49 for \mathrm{X} = 7.

Interestingly both variants end up closer in terms of fitting error than we anticipated, which shows that the evolutionary algorithm we employed is capable of correcting a purposefully created \glqq bad\grqq \ grid. Also this confirms, that in our cases the number of control--points is more important for quality than their placement, which is captured by the variability via the rank of the deformation--matrix.

\begin{figure}[hbt] \centering \includegraphics[width=0.8\textwidth]{img/evolution3d/variability2_boxplot.png} \caption[Histogram of ranks of high--resolution deformation--matrices]{ Histogram of ranks of various 10 \times 10 \times 10 grids with $1000$ control--points each. } \label{fig:histrank3d} \end{figure}

Overall the correlation between variability and fitness--error were significant and showed a very strong correlation in all our tests. The detailed correlation--coefficients are given in table \ref{tab:3dvar} alongside their p--values.

As introduces in section \ref{sec:impl:grid} and visualized in figure \ref{fig:enoughCP}, we know, that not all control points have to necessarily contribute to the parametrization of our 3D--model. Because we are starting from a sphere, some control-points are too far away from the surface to contribute to the deformation at all.

One can already see in 2D in figure \ref{fig:enoughCP}, that this effect starts with a regular 9 \times 9 grid on a perfect circle. To make sure we observe this, we evaluated the variability for 100 randomly moved $10 \times 10 \times 10$ grids on the sphere we start out with.

As the variability is defined by \frac{\mathrm{rank}(\vec{U})}{n} we can easily recover the rank of the deformation--matrix \vec{U}. The results are shown in the histogram in figure \ref{fig:histrank3d}. Especially in the centre of the sphere and in the corners of our grid we effectively loose control--points for our parametrization.

This of course yields a worse error as when those control--points would be put to use and one should expect a loss in quality evident by a higher reconstruction--error opposed to a grid where they are used. Sadly we could not run a in--depth test on this due to computational limitations.

Nevertheless this hints at the notion, that variability is a good measure for the overall quality of a fit.

Regularity

\begin{table}[tbh] \centering \begin{tabular}{c|c|c|c} & 5 \times 4 \times 4 & 7 \times 4 \times 4 & \mathrm{X} \times 4 \times 4 \ \cline{2-4} & \textcolor{red}{0.15} (0.147) & \textcolor{red}{0.09} (0.37) & 0.46 (0) \B \ \cline{2-4} \multicolumn{4}{c}{} \[-1.4em] \hline 4 \times 4 \times 4 & 4 \times 4 \times 5 & 4 \times 4 \times 7 & 4 \times 4 \times \mathrm{X} \T \ \hline 0.38 (0) & \textcolor{red}{0.17} (0.09) & 0.40 (0) & 0.46 (0) \B \ \hline \multicolumn{4}{c}{} \[-1.4em] \cline{2-4} & 5 \times 5 \times 5 & 6 \times 6 \times 6 & \mathrm{Y} \times \mathrm{Y} \times \mathrm{Y} \T \ \cline{2-4} & \textcolor{red}{-0.18} (0.0775) & \textcolor{red}{-0.13} (0.1715) & -0.25 (0) \B \ \cline{2-4} \multicolumn{4}{c}{} \[-1.4em] \cline{2-4} \multicolumn{3}{c}{} & all: 0.15 (0) \T \end{tabular} \caption[Correlation between regularity and iterations for 3D]{Correlation between regularity and number of iterations for the 3D fitting scenario. Displayed are the negated Spearman coefficients with the corresponding p--values in brackets for various given grids (\mathrm{X} \in [4,5,7], \mathrm{Y} \in [4,5,6]). \newline Note: Not significant results are marked in \textcolor{red}{red}.} \label{tab:3dreg} \end{table}

Opposed to the predictions of variability our test on regularity gave a mixed result --- similar to the 1D--case.

In roughly half of the scenarios we have a significant, but weak to moderate correlation between regularity and number of iterations. On the other hand in the scenarios where we increased the number of control--points, namely 125 for the 5 \times 5 \times 5 grid and 216 for the 6 \times 6 \times 6 grid we found a significant, but weak anti--correlation when taking all three tests into account^[Displayed as $Y \times Y \times Y$], which seem to contradict the findings/trends for the sets with 64, 80, and 112 control--points (first two rows of table \ref{tab:3dreg}).

Taking all results together we only find a very weak, but significant link between regularity and the number of iterations needed for the algorithm to converge.

\begin{figure}[!htb] \centering \includegraphics[width=\textwidth]{img/evolution3d/regularity_montage.png} \caption[Regularity for different 3D--grids]{ Plots of regularity against number of iterations for various scenarios together with a linear fit to indicate trends.} \label{fig:resreg3d} \end{figure}

As can be seen from figure \ref{fig:resreg3d}, we can observe that increasing the number of control--points helps the convergence--speeds. The regularity--criterion first behaves as we would like to, but then switches to behave exactly opposite to our expectations, as can be seen in the first three plots. While the number of control--points increases from red to green to blue and the number of iterations decreases, the regularity seems to increase at first, but then decreases again on higher grid--resolutions.

This can be an artefact of the definition of regularity, as it is defined by the inverse condition--number of the deformation--matrix \vec{U}, being the fraction \frac{\sigma_{\mathrm{min}}}{\sigma_{\mathrm{max}}} between the least and greatest right singular value.

As we observed in the previous section, we cannot guarantee that each control--point has an effect (see figure \ref{fig:histrank3d}) and so a small minimal right singular value occurring on higher grid--resolutions seems likely the problem.

Adding to this we also noted, that in the case of the $10 \times 10 \times 10$--grid the regularity was always 0, as a non--contributing control-point yields a $0$--column in the deformation--matrix, thus letting \sigma_\mathrm{min} = 0. A better definition for regularity (i.e. using the smallest non--zero right singular value) could solve this particular issue, but not fix the trend we noticed above.

Improvement Potential

\begin{table}[tbh] \centering \begin{tabular}{c|c|c|c} & 5 \times 4 \times 4 & 7 \times 4 \times 4 & \mathrm{X} \times 4 \times 4 \ \cline{2-4} & 0.3 (0.0023) & \textcolor{red}{0.23} (0.0233) & 0.89 (0) \B \ \cline{2-4} \multicolumn{4}{c}{} \[-1.4em] \hline 4 \times 4 \times 4 & 4 \times 4 \times 5 & 4 \times 4 \times 7 & 4 \times 4 \times \mathrm{X} \T \ \hline 0.5 (0) & 0.38 (0) & 0.32 (0.0012) & 0.9 (0) \B \ \hline \multicolumn{4}{c}{} \[-1.4em] \cline{2-4} & 5 \times 5 \times 5 & 6 \times 6 \times 6 & \mathrm{Y} \times \mathrm{Y} \times \mathrm{Y} \T \ \cline{2-4} & 0.47 (0) & \textcolor{red}{-0.01} (0.8803) & 0.89 (0) \B \ \cline{2-4} \multicolumn{4}{c}{} \[-1.4em] \cline{2-4} \multicolumn{3}{c}{} & all: 0.95 (0) \T \end{tabular} \caption[Correlation between improvement--potential and fitting--error for 3D]{Correlation between improvement--potential and fitting--error for the 3D fitting scenario. Displayed are the negated Spearman coefficients with the corresponding p--values in brackets for various given grids (\mathrm{X} \in [4,5,7], \mathrm{Y} \in [4,5,6]). \newline Note: Not significant results are marked in \textcolor{red}{red}.} \label{tab:3dimp} \end{table}

\begin{figure}[!htb] \centering \includegraphics[width=\textwidth]{img/evolution3d/improvement_montage.png} \caption[Improvement potential for different 3D--grids]{ Plots of improvement potential against error given by our fitness--function after convergence together with a linear fit of each of the plotted data to indicate trends.} \label{fig:resimp3d} \end{figure}

Schluss

\label{sec:dis}

  • Regularity ist kacke für unser setup. Bessere Vorschläge? EW/EV?

\improvement[inline]{Bibliotheksverzeichnis links anpassen. DOI überschreibt Direktlinks des Autors.\newline Außerdem bricht url über Seitengrenzen den Seitenspiegel.}