Archive for September, 2019

September 22nd, 2019

In many introductions to numpy, one gets taught about np.ones, np.zeros and np.empty. The accepted wisdom is that np.empty will be faster than np.ones because it doesn’t have to waste time doing all that initialisation. A quick test in a Jupyter notebook shows that this seems to be true!

import numpy as np

N = 200
zero_time = %timeit -o some_zeros = np.zeros((N,N))
empty_time = %timeit -o empty_matrix = np.empty((N,N))
print('np.empty is {0} times faster than np.zeros'.format(zero_time.average/empty_time.average))


8.34 µs ± 202 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
436 ns ± 10.6 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
np.empty is 19.140587654682545 times faster than np.zeros

20 times faster may well be useful in production when using really big matrices. Might even be worth the risk of dealing with uninitialised variables even though they are scary!

However…..on my machine (Windows 10, Microsoft Surface Book 2 with 16Gb RAM), we see the following behaviour with a larger matrix size (1000 x 1000).

import numpy as np

N = 1000
zero_time = %timeit -o some_zeros = np.zeros((N,N))
empty_time = %timeit -o empty_matrix = np.empty((N,N))
print('np.empty is {0} times faster than np.zeros'.format(zero_time.average/empty_time.average))


113 µs ± 2.97 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
112 µs ± 1.01 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
np.empty is 1.0094651980894993 times faster than np.zeros

The speed-up has vanished! A subsequent discussion on twitter suggests that this is probably because the operating system is zeroing all of the memory for security reasons when using numpy.empty on large arrays.

September 2nd, 2019


What is Second Order Cone Programming (SOCP)?

 

Second Order Cone Programming (SOCP) problems are a type of optimisation problem that have applications in many areas of science, finance and engineering. A summary of the type of problems that can make use of SOCP, including things as diverse as designing antenna arrays, finite impulse response (FIR) filters and structural equilibrium problems can be found in the paper ‘Applications of Second Order Cone Programming’ by Lobo et al. There are also a couple of examples of using SOCP for portfolio optimisation in the GitHub repository of the Numerical Algorithms Group (NAG).

A large scale SOCP solver was one of the highlights of the Mark 27 release of the NAG library (See here for a poster about its performance). Those who have used the NAG library for years will expect this solver to have interfaces in Fortran and C and, of course, they are there. In addition to this is the fact that Mark 27 of the NAG Library for Python was released at the same time as the Fortran and C interfaces which reflects the importance of Python in today’s numerical computing landscape.

Here’s a quick demo of how the new SOCP solver works in Python. The code is based on a notebook in NAG’s PythonExamples GitHub repository.

NAG’s handle_solve_socp_ipm function (also known as e04pt) is a solver from the NAG optimization modelling suite for large-scale second-order cone programming (SOCP) problems based on an interior point method (IPM).

$$
\begin{array}{ll}
{\underset{x \in \mathbb{R}^{n}}{minimize}\ } & {c^{T}x} \\
\text{subject to} & {l_{A} \leq Ax \leq u_{A}\text{,}} \\
& {l_{x} \leq x \leq u_{x}\text{,}} \\
& {x \in \mathcal{K}\text{,}} \\
\end{array}
$$

where $\mathcal{K} = \mathcal{K}^{n_{1}} \times \cdots \times \mathcal{K}^{n_{r}} \times \mathbb{R}^{n_{l}}$ is a Cartesian product of quadratic (second-order type) cones and $n_{l}$-dimensional real space, and $n = \sum_{i = 1}^{r}n_{i} + n_{l}$ is the number of decision variables. Here $c$, $x$, $l_x$ and $u_x$ are $n$-dimensional vectors.

$A$ is an $m$ by $n$ sparse matrix, and $l_A$ and $u_A$ and are $m$-dimensional vectors. Note that $x \in \mathcal{K}$ partitions subsets of variables into quadratic cones and each $\mathcal{K}^{n_{i}}$ can be either a quadratic cone or a rotated quadratic cone. These are defined as follows:

  • Quadratic cone:

$$
\mathcal{K}_{q}^{n_{i}} := \left\{ {z = \left\{ {z_{1},z_{2},\ldots,z_{n_{i}}} \right\} \in {\mathbb{R}}^{n_{i}} \quad\quad : \quad\quad z_{1}^{2} \geq \sum\limits_{j = 2}^{n_{i}}z_{j}^{2},\quad\quad\quad z_{1} \geq 0} \right\}\text{.}
$$

  • Rotated quadratic cone:

$$
\mathcal{K}_{r}^{n_{i}} := \left\{ {z = \left\{ {z_{1},z_{2},\ldots,z_{n_{i}}} \right\} \in {\mathbb{R}}^{n_{i}}\quad\quad:\quad \quad\quad 2z_{1}z_{2} \geq \sum\limits_{j = 3}^{n_{i}}z_{j}^{2}, \quad\quad z_{1} \geq 0, \quad\quad z_{2} \geq 0} \right\}\text{.}
$$

For a full explanation of this routine, refer to e04ptc in the NAG Library Manual

Using the NAG SOCP Solver from Python

 

This example, derived from the documentation for the handle_set_group function solves the following SOCP problem

minimize $${10.0x_{1} + 20.0x_{2} + x_{3}}$$

from naginterfaces.base import utils
from naginterfaces.library import opt

# The problem size:
n = 3

# Create the problem handle:
handle = opt.handle_init(nvar=n)

# Set objective function
opt.handle_set_linobj(handle, cvec=[10.0, 20.0, 1.0])

subject to the bounds
$$
\begin{array}{rllll}
{- 2.0} & \leq & x_{1} & \leq & 2.0 \\
{- 2.0} & \leq & x_{2} & \leq & 2.0 \\
\end{array}
$$

# Set box constraints
opt.handle_set_simplebounds(
    handle,
    bl=[-2.0, -2.0, -1.e20],
    bu=[2.0, 2.0, 1.e20]
)

the general linear constraints

\begin{array}{lcrcrcrclcl}
& & {- 0.1x_{1}} & – & {0.1x_{2}} & + & x_{3} & \leq & 1.5 & & \\
1.0 & \leq & {- 0.06x_{1}} & + & x_{2} & + & x_{3} & & & & \\
\end{array}
# Set linear constraints
opt.handle_set_linconstr(
    handle,
    bl=[-1.e20, 1.0],
    bu=[1.5, 1.e20],
    irowb=[1, 1, 1, 2, 2, 2],
    icolb=[1, 2, 3, 1, 2, 3],
    b=[-0.1, -0.1, 1.0, -0.06, 1.0, 1.0]
    );

and the cone constraint

$$\left( {x_{3},x_{1},x_{2}} \right) \in \mathcal{K}_{q}^{3}\text{.}$$

# Set cone constraint
opt.handle_set_group(
    handle,
    gtype='Q',
    group=[ 3,1, 2],
    idgroup=0
);

We set some algorithmic options. For more details on the options available, refer to the routine documentation

# Set some algorithmic options.
for option in [
        'Print Options = NO',
        'Print Level = 1'
]:
    opt.handle_opt_set(handle, option)

# Use an explicit I/O manager for abbreviated iteration output:
iom = utils.FileObjManager(locus_in_output=False)

Finally, we call the solver

# Call SOCP interior point solver
result = opt.handle_solve_socp_ipm(handle, io_manager=iom)
 ------------------------------------------------
  E04PT, Interior point method for SOCP problems
 ------------------------------------------------

 Status: converged, an optimal solution found
 Final primal objective value -1.951817E+01
 Final dual objective value   -1.951817E+01

The optimal solution is

result.x
array([-1.26819151, -0.4084294 ,  1.3323379 ])

and the objective function value is

result.rinfo[0]
-19.51816515094211

Finally, we clean up after ourselves by destroying the handle

# Destroy the handle:
opt.handle_free(handle)

As you can see, the way to use the NAG Library for Python interface follows the mathematics quite closely.
NAG also recently added support for the popular cvxpy modelling language that I’ll discuss another time.

Links

TOP