Introduction to Optimization Techniques


1. In descent methods, the particular choice of search direction does not matter so much.
a. True.
b. False.
answer : b

2. In descent methods, the particular choice of line search does not matter so much.
a. True.
b. False.
answer : a

3. When the gradient descent method is started from a point near the solution, it will converge very quickly.
a. True.
b. False.
answer : b
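
Why this is false: on an ill-conditioned quadratic, gradient descent with exact line search contracts the error by only a factor of roughly $(\kappa-1)/(\kappa+1)$ per iteration, no matter how close to the optimum it starts. A minimal numerical sketch (assuming NumPy; the quadratic, condition number, and starting point are illustrative choices, not from the quiz):

```python
import numpy as np

# f(x) = 0.5*(x1**2 + kappa*x2**2), minimizer x* = 0, condition number kappa
kappa = 1000.0
H = np.diag([1.0, kappa])

# start very close to x*, along the classical worst-case direction (kappa, 1)
x = 1e-3 * np.array([kappa, 1.0])
for k in range(100):
    g = H @ x                      # gradient of the quadratic
    t = (g @ g) / (g @ H @ g)      # exact line search step for a quadratic
    x = x - t * g

# after 100 iterations the error has shrunk only by
# ((kappa-1)/(kappa+1))**100, i.e. a factor of about 0.82
print(np.linalg.norm(x))
```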

4. Newton’s method with step size $h=1$ always works.
a. True.
b. False.
answer : b
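
A standard one-variable counterexample (a common textbook illustration, not part of the quiz): for the strictly convex function
\[
f(x) = \sqrt{1+x^2}, \qquad f'(x) = \frac{x}{\sqrt{1+x^2}}, \qquad f''(x) = \frac{1}{(1+x^2)^{3/2}},
\]
the pure Newton update with $h = 1$ is
\[
x^{+} = x - \frac{f'(x)}{f''(x)} = x - x(1+x^2) = -x^{3},
\]
which diverges from any starting point with $|x^{(0)}| > 1$. A damped step (line search) is needed when far from the solution.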

5. When Newton’s method is started from a point near the solution, it will converge very quickly.
a. True.
b. False.
answer : a
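
This is the classical quadratic convergence phase. Under the standard textbook hypotheses (strong convexity $mI \preceq \nabla^2 f$ and an $L$-Lipschitz Hessian, which the quiz leaves implicit), once $\|\nabla f(x)\|_2$ is small enough the Newton iterates satisfy
\[
\frac{L}{2m^2}\,\|\nabla f(x^{+})\|_2 \;\le\; \left(\frac{L}{2m^2}\,\|\nabla f(x)\|_2\right)^{2},
\]
so the number of correct digits roughly doubles at every iteration.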

6. Using Newton’s method to minimize $f(Ty)$ over $y$, where $x = Ty$ and $T$ is nonsingular, can greatly improve the convergence speed when $T$ is chosen appropriately.
a. True.
b. False.
answer : b
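
The reason is that Newton’s method is affine invariant. Writing $\tilde f(y) = f(Ty)$, the chain rule gives
\[
\nabla \tilde f(y) = T^T \nabla f(Ty), \qquad
\nabla^2 \tilde f(y) = T^T \nabla^2 f(Ty)\, T,
\]
so the Newton step in the $y$ coordinates is
\[
\Delta y_{\mathrm{nt}} = -\left(T^T \nabla^2 f(x)\, T\right)^{-1} T^T \nabla f(x) = T^{-1} \Delta x_{\mathrm{nt}},
\]
and the iterates satisfy $x^{(k)} = T y^{(k)}$: the change of variables leaves the convergence unchanged (unlike the gradient method, which a good $T$ can help dramatically).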

7. If $f$ is self-concordant, its Hessian is Lipschitz continuous.
a. True.
b. False.
answer : b
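
A quick counterexample, using the function from question 11: $f(x) = -\log x$ is self-concordant, but
\[
f''(x) = \frac{1}{x^2} \to \infty \quad \text{as } x \to 0^{+},
\]
so its Hessian is not Lipschitz continuous on the domain $(0, \infty)$.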

8. If the Hessian of $f$ is Lipschitz continuous, then $f$ is self-concordant.
a. True.
b. False.
answer : b
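
One way to see this: self-concordant functions are convex by definition, while Hessian Lipschitz continuity says nothing about convexity. For example, $f(x) = -x^2$ has constant Hessian $f''(x) = -2$ (Lipschitz with constant $0$) but is not convex, hence not self-concordant.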

9. Newton’s method should only be used to minimize self-concordant functions.
a. True.
b. False.
answer : b

10. $f(x) = \exp x$ is self-concordant.
a. True.
b. False.
answer : b
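
Checking the defining inequality $|f'''(x)| \le 2 f''(x)^{3/2}$ directly:
\[
f''(x) = f'''(x) = e^x, \qquad e^x \le 2\,e^{3x/2} \;\Longleftrightarrow\; x \ge -2\log 2,
\]
so the inequality fails for $x < -2\log 2$, and $\exp x$ is not self-concordant on $\mathbf{R}$.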

11. $f(x) = -\log x$ is self-concordant.
a. True.
b. False.
answer : a
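
Here the same inequality holds, with equality, on the whole domain $x > 0$:
\[
f''(x) = \frac{1}{x^2}, \qquad f'''(x) = -\frac{2}{x^3}, \qquad |f'''(x)| = \frac{2}{x^3} = 2\,f''(x)^{3/2}.
\]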

12. Consider the problem of minimizing \[ f(x) = (c^Tx)^4 + \sum_{i=1}^n w_i \exp x_i, \] over $x \in \mathbf{R}^n$, where $w \succ 0$.
Newton’s method would probably require fewer iterations than the gradient method, but each iteration would be much more costly.
a. True.
b. False.
answer : b
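
Why false: the Hessian here is diagonal plus rank one,
\[
\nabla^2 f(x) = 12\,(c^Tx)^2\, cc^T + \mathbf{diag}(w_1 \exp x_1, \ldots, w_n \exp x_n),
\]
so a Newton step can be computed in $O(n)$ time with the Sherman–Morrison formula, the same order as a gradient step. A minimal sketch (assuming NumPy; the function name and the small correctness check are mine):

```python
import numpy as np

def newton_step(x, c, w):
    """Newton step for f(x) = (c^T x)^4 + sum_i w_i*exp(x_i).

    The Hessian is diag(d) + alpha*c*c^T, so -H^{-1} g costs O(n)
    via Sherman-Morrison instead of O(n^3) for a dense solve.
    """
    s = c @ x
    d = w * np.exp(x)              # diagonal part of the Hessian
    g = 4 * s**3 * c + d           # gradient (its exp term equals d)
    alpha = 12 * s**2              # coefficient of the rank-one term
    Dg, Dc = g / d, c / d          # D^{-1} g and D^{-1} c
    v = Dg - (alpha * (c @ Dg) / (1 + alpha * (c @ Dc))) * Dc
    return -v

# sanity check against a dense solve on a tiny instance
rng = np.random.default_rng(0)
n = 5
x, c, w = rng.normal(size=n), rng.normal(size=n), rng.uniform(1, 2, n)
H = 12 * (c @ x)**2 * np.outer(c, c) + np.diag(w * np.exp(x))
g = 4 * (c @ x)**3 * c + w * np.exp(x)
assert np.allclose(newton_step(x, c, w), -np.linalg.solve(H, g))
```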

13. Newton’s method is seldom used in machine learning because
a. common loss functions are not self-concordant
b. Newton’s method does not work well on noisy data
c. machine learning researchers don’t really understand linear algebra
d. it is generally not practical to form or store the Hessian in such problems, due to large problem size
answer : d
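
Rough arithmetic behind answer d, for a hypothetical but modest model with $n = 10^6$ parameters: the Hessian has $n^2 = 10^{12}$ entries, about $8$ TB at 8 bytes each, and factoring it once costs on the order of $n^3 = 10^{18}$ flops.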
