Mueller 21/22 Flashcards
(30 cards)
Give a brief outline of the SIMP method
SIMP (Solid Isotropic Material with Penalisation) introduces a material density variable for each element, ranging from 0 (no material) to 1 (solid). It calculates the derivative of the objective function (e.g. weight) w.r.t. each element's density, then reduces the density of the elements with the lowest derivatives.
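A minimal sketch of one such density update, assuming precomputed element densities and sensitivities (the names and the 10% cut fraction are illustrative assumptions, not part of the card):

    import numpy as np

    def simp_update(rho, dF_drho, step=0.05, rho_min=1e-3):
        # Rank elements by the derivative of the objective w.r.t. density
        order = np.argsort(dF_drho)
        # Reduce the density of the tenth of elements with the lowest derivative
        n_cut = len(rho) // 10
        rho = rho.copy()
        rho[order[:n_cut]] -= step
        # Keep densities in the admissible range [rho_min, 1]
        return np.clip(rho, rho_min, 1.0)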
What is an Adjoint Solution?
One which directly expresses the sensitivities of a single cost function w.r.t. many design variables. (Think of the car example, which showed where to push the surface in or out for better aerodynamic performance.)
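As a sketch in standard notation (an assumption, not from the card): for a cost J(u, alpha) constrained by the state equation R(u, alpha) = 0, one adjoint solve

    (dR/du)^t v = (dJ/du)^t

gives every design sensitivity cheaply:

    dJ/dalpha_i = dJ/dalpha_i (explicit part) - v^t (dR/dalpha_i)   for each design variable alpha_i,

i.e. the cost of the gradient is independent of the number of design variables.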
State the Optimality conditions for optimization
If F'(x) = 0 and F''(x) > 0, and these are satisfied at x = x*, then F(x) > F(x*) for all other x near x*.
Then x* is a (local) minimum.
State the steps for the Bisection Method
For an interval [a, b] that contains the min: evaluate F at the midpoint x_m = (a+b)/2 and at the midpoints of [a, x_m] and [x_m, b]; keep the half-interval whose interior point gives the lower F; repeat until the bracket width falls below the tolerance eta. (A sketch follows the next card.)
List the key properties of the Bisection Method
- User-defined interval which must contain the min
- Will find a min, but not necessarily the global one
- Convergence is slow; it depends on the widths of the initial and desired brackets
- N = [log(xb - xa) - log(eta)] / log(2) iterations to shrink the bracket from xb - xa to eta
- Gradient free
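A minimal sketch of the steps and the halving behaviour, assuming a unimodal F on [a, b] (the exact three-point bookkeeping is an assumption about the variant taught):

    def bisection_min(F, a, b, eta=1e-6):
        # Maintain a bracket [a, b] with midpoint m; halve it each iteration
        m = 0.5 * (a + b)
        Fm = F(m)
        while (b - a) > eta:
            l, r = 0.5 * (a + m), 0.5 * (m + b)
            Fl, Fr = F(l), F(r)
            if Fl < Fm:        # min lies in [a, m]
                b, m, Fm = m, l, Fl
            elif Fr < Fm:      # min lies in [m, b]
                a, m, Fm = m, r, Fr
            else:              # min lies in [l, r]
                a, b = l, r
        return m

    # The bracket halves every pass, so the iteration count matches
    # N = [log(xb - xa) - log(eta)] / log(2), e.g.:
    # bisection_min(lambda x: (x - 1.0)**2, 0.0, 2.0)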
Describe the Secant Method and list the key steps
A gradient-based optimization technique which linearly interpolates between a bracket (although extrapolation can be used under certain conditions), using values of x and F'(x).
- Set x1 = a and x2 = b (Bracket)
- Compute F'1 = F'(x1) and F'2 = F'(x2)
- Set k=2
- Use the linear interpolation formula to find xk+1: xk+1 = xk - F'k (xk - xk-1) / (F'k - F'k-1)
- Compute F'k+1 = F'(xk+1)
- Set k=k+1
- Repeat until convergence
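A minimal sketch of the chronological variant, assuming the gradient F' is available as a function dF:

    def secant_min(dF, x1, x2, tol=1e-8, max_iter=50):
        # Drive dF to zero by linear interpolation between the two
        # most recent points (chronological choice of xk)
        d1, d2 = dF(x1), dF(x2)
        for _ in range(max_iter):
            x3 = x2 - d2 * (x2 - x1) / (d2 - d1)   # interpolation formula
            if abs(x3 - x2) < tol:
                return x3
            x1, d1 = x2, d2        # discard the oldest point
            x2, d2 = x3, dF(x3)
        return x2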
State the conditions for extrapolation for the Secant Method
If x1 < x2 and F'1 > F'2 > 0 (or, symmetrically, F'1 < F'2 < 0): both gradients have the same sign and shrink in magnitude towards x2, so the interpolated zero of F' lies just outside the bracket and extrapolation is valid.
State the key properties of the Secant Method
- Needs gradients
- Only needs first derivatives
- May converge to a max
- Faster than Bisection Method
- Flexibility in choosing xk
- Can be generalised to multi-variate problems.
State the three methods of choosing xk in the Secant method and briefly describe them
Chronological - xk becomes xk-1, xk+1 becomes xk. Simple and quick.
Smallest gradient - The two points with the smallest absolute values of F'(x) are used as the bracketing points, as they are, in theory, closest to the min. Faster.
Bracketed - The above may find a max; by choosing two points whose F'(x) have opposite signs as the brackets, we make sure we find a min. Slower, but no risk of finding a max.
List the key differences in origin of the Bisection, Secant, and Newton Methods
Bisection - Direct evaluation of function values at points in the interval, converging on the smallest values
Secant - Uses Linear Interpolation to find the zero of the gradient to find the min
Newton - Uses a Taylor expansion to approximate the zero of the gradient, using first and second derivatives.
What is the process for safeguarding the Bisection Method? Explain.
From the initial point, calculate the negative gradient and march along it in steps s until F starts increasing. This selects an interval containing a min; if the function is unimodal, it contains exactly one min.
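A minimal sketch of this bracketing procedure, assuming a fixed step s (growing the step as you march, common in practice, is omitted):

    def find_bracket(F, dF, x0, s=0.1):
        # Step downhill (against the gradient) until F starts increasing
        d = -1.0 if dF(x0) > 0 else 1.0
        a, b = x0, x0 + d * s
        while F(b) < F(a):
            a, b = b, b + d * s
        # F rose from a to b, so a min lies between the point before a and b
        lo, hi = sorted((a - d * s, b))
        return lo, hi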
What is the process for safeguarding Newton’s Method? Explain.
If F''(x) < 0 then we will tend to a max.
If F''(x) = 0 then we will divide by zero.
If either condition holds, use deltax = -F'(xk) (a steepest-descent step) instead.
To ensure the next step s = alpha * deltax stays within the interval [a, b]:
If deltax < 0, alpha = min{1, (a - xk)/deltax}
If deltax > 0, alpha = min{1, (b - xk)/deltax}
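A minimal sketch combining both safeguards (function and argument names are assumptions):

    def safeguarded_newton(dF, d2F, x, a, b, tol=1e-8, max_iter=50):
        for _ in range(max_iter):
            g, h = dF(x), d2F(x)
            if h <= 0.0:
                dx = -g        # safeguard: curvature would give a max or divide by zero
            else:
                dx = -g / h    # plain Newton step on F'(x) = 0
            if dx == 0.0:
                return x       # gradient is zero: converged
            # Scale the step so x + alpha * dx stays inside [a, b]
            if dx < 0.0:
                alpha = min(1.0, (a - x) / dx)
            else:
                alpha = min(1.0, (b - x) / dx)
            x += alpha * dx
            if abs(alpha * dx) < tol:
                return x
        return x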
What is the process for safeguarding the Secant Method? Explain.
Use a bracketed interval (the two points' gradients have opposite signs). This ensures a local max is not found.
State a Taylor Expansion for two variables
F(x+dx, y+dy) ≈ F + p^t g + 0.5 p^t H p (to second order)
where H is the Hessian matrix
g = gradient vector = [Fx Fy]^t
p = step vector = [dx, dy]^t
State the multivariate optimality conditions
For F(x+dx, y+dy) ≈ F + p^t g + 0.5 p^t H p, the point is a minimum if:
g = 0, so no descent direction with p^t g < 0 exists
H is positive definite, so 0.5 p^t H p > 0 for every step p ≠ 0
State the three Wolfe Conditions and what they mean.
p^t gk <= -eta0 mag(p) mag(gk)
This is stronger than the p^t g < 0 condition, as it ensures the search direction keeps a minimum angle from the contour.
F(xk + s pk) - F(xk) <= eta1 s pk^t gk
This ensures the step size is not too big.
F(xk + s pk) - F(xk) >= (1 - eta2) s pk^t gk
This ensures the step size isn't too small.
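A minimal line-search sketch enforcing the two step-size conditions (the angle condition constrains the choice of pk before the search starts); the eta values are illustrative assumptions:

    import numpy as np

    def wolfe_step(F, x, p, g, eta1=1e-4, eta2=0.9, s=1.0):
        slope = p @ g                  # p^t g, must be < 0 for a descent direction
        lo, hi = 0.0, None
        for _ in range(50):
            dF = F(x + s * p) - F(x)
            if dF > eta1 * s * slope:              # violates condition 2: too big
                hi = s
            elif dF < (1.0 - eta2) * s * slope:    # violates condition 3: too small
                lo = s
            else:
                return s
            s = 0.5 * (lo + hi) if hi is not None else 2.0 * s
        return s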
What is an Augmented Lagrangian?
It is a penalty function which approximates the first-order constrained optimality criteria by estimating the Lagrange multipliers, then adds a small quadratic penalty to correct for the error in that multiplier approximation.
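In standard notation (an assumption, not from the card), for equality constraints c(x) = 0:

    L_A(x, lambda) = F(x) + lambda^t c(x) + 0.5 mu c(x)^t c(x),

with the multiplier estimate updated from the constraint violation, lambda <- lambda + mu c(x), between successive minimisations of L_A.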
How do projected gradient methods work?
By removing the component of the gradient that is perpendicular to the constraint surface; however, they need the gradient of the constraint.
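As a sketch in standard notation (an assumption, not from the card): with A = dc/dx the constraint gradient, the projected step is

    p = -(I - A^t (A A^t)^-1 A) g,

i.e. the gradient g with its component normal to the constraint surface removed, so the step stays (to first order) on the constraint.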
What are interior point methods good for?
Inequality problems, as they can explore the feasible region well.
What are projected gradient methods good for?
Constrained problems with linear constraints.
What are the principles in the derivation of SQP?
Perform a Taylor expansion to approximate the objective function with a quadratic model, then include an approximate second derivative of the constraints within the Hessian. This improves stability and convergence: even though the constraint approximation is still linear, the Hessian includes the constraints' second derivatives, so it can handle non-linear constraints better.
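A sketch of the resulting subproblem in standard notation (an assumption, not from the card): at iterate xk, solve the QP

    min_p  g^t p + 0.5 p^t H p   subject to   A p + c(xk) = 0,

where g is the objective gradient, A = dc/dx, and H approximates the Hessian of the Lagrangian (objective plus constraint curvature); its solution p is the next step.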
When should you use SQP?
When you’ve got non-linear constraints and you already have a design which is close to the minimum. SQP is expensive when far from the min, so if you’re establishing a first design, it might not be the best method.
Describe the finite difference method and when to use it with regard to calculation of derivatives
The finite difference method calculates the function at two locations:
For forward difference, at x and x + del_x.
For central difference, at x + del_x and x - del_x.
It then subtracts the first value from the second and divides by the interval between the two points.
Forward difference is first-order accurate; central difference is second-order accurate. Good for use with ~50 design variables, but the cost scales with the number of design variables, so it is not appropriate for 100+ and can get very expensive. It also suffers from the finite precision of the computer: too small a change in x blows the solution up (cancellation error).
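A minimal sketch of both stencils (the step size h is an illustrative assumption; too small an h triggers the cancellation error mentioned above):

    def forward_diff(F, x, h=1e-6):
        # First-order accurate: error O(h), one extra evaluation per variable
        return (F(x + h) - F(x)) / h

    def central_diff(F, x, h=1e-6):
        # Second-order accurate: error O(h^2), two extra evaluations per variable
        return (F(x + h) - F(x - h)) / (2.0 * h)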
What is the principle behind complex variable derivative calculation?
By implementing an imaginary perturbation, the Taylor expansion eliminates the subtraction that occurs in finite-difference methods. This removes the cancellation error that occurs as delta approaches zero, so we get a more accurate calculation. It's slightly more expensive but has much higher precision.
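A minimal sketch, assuming F accepts complex arguments:

    import cmath

    def complex_step(F, x, h=1e-20):
        # F'(x) ~ Im(F(x + i*h)) / h: no subtraction, so no cancellation
        # error even for a tiny h
        return F(x + 1j * h).imag / h

    # e.g. complex_step(cmath.sin, 0.5) matches cos(0.5) to machine precision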