Written by: Neal Kaw
July 6, 2018
The CRAN Task View on Optimization and Mathematical Programming lists many R packages, however, a large number of them are general purpose solvers that can only handle unconstrained problems or problems with box constraints (bounds on variables). For example, the package optimr falls into this category. These packages are useful for many optimization problems in statistics, data science, and machine learning, but the emphasis of this post will be more specifically on solving constrained optimization problems such as linear programs, which are encountered in operations research and management science. I’ll first provide some background on optimization software, and then I’ll focus on the R packages CVXR and ROI. I’ll assume you have some experience with constrained optimization, or at least linear optimization.
Solving an optimization model generally makes use of up to three software components: a programming language, a solver, and (optionally) a modeling language. Aside from R, common programming languages used to implement optimization models are MATLAB and Python. A solver implements algorithms to identify an optimal solution for a given problem, and solvers differ in the optimization problem classes that they can solve (for example, linear programming, quadratic programming, and so on). Commonly used commercial solvers are Gurobi, CPLEX, and MOSEK, all of which provide free academic versions. There are also a large number of free open-source optimization solvers, such as GLPK, ECOS, and SCS. Each solver will generally have an interface to one or more programming languages, and the interface provides functions that can be used to pass the problem data to the solver, control the solver’s execution, and receive output from the solver. For example, Gurobi currently supports interfaces to C, C++, Java, Python, R, .NET, and MATLAB. In many cases, a modeling language can make a solver easier to use by providing a more natural or concise syntax for interacting with the solver. Examples of modeling languages include YALMIP in MATLAB and CVXPY in Python (both of these modeling languages support all three of the previously mentioned commercial solvers).
My first recommendation for doing optimization in R is the package CVXR, for its ease of use. CVXR is a modeling language which currently supports a limited number of open source (ECOS, ECOS_BB, SCS, lpSolve, GLPK) and commercial (MOSEK and Gurobi) solvers, and is capable of solving convex optimization and mixed-integer optimization problems.
The major advantage of CVXR is that it is, as far as I know, the only R package which provides a modeling language that allows you to enter an optimization problem in a mathematically natural syntax. For example, to formulate a linear program, other packages would require you to conceptualize your decision variables as a single column vector and then write code to construct the left-hand-side constraint matrix, a right-hand-side vector, and a cost vector (this is true even for the R interface of a commercial solver like Gurobi). For problems with many constraints and decision variables with more than one subscript, writing the code to construct the necessary vectors and matrices can be cumbersome. CVXR instead allows you to define multiple vectors or matrices of decision variables, and then write mathematical expressions to separately define the objective function, and the left-hand-side and right-hand-side of each constraint. The official CVXR website provides several tutorial examples.
One possible limitation of CVXR is that it currently doesn’t support many solvers. Under unlucky circumstances you may then find that your problem instance can’t be solved to optimality or can’t be solved efficiently with any of the available solvers, especially if you don’t have access to a commercial solver. An alternative package is ROI, which supports several more solvers. Although ROI is the only other R package comparable to CVXR, ROI is not a modeling language, so it does not allow the user to formulate models in natural mathematical syntax. What ROI does do is provide wrapper functions that allow you to easily plug in different solvers for a given set of problem data, instead of having to tailor the data and function call for each solver that you want to try. For example, to construct a linear program, ROI will require you to input a constraint matrix, cost vector, and right-hand-side vector, but will also allow you to supply an argument to choose among many solvers available in R. ROI also provides the function ROI_available_solvers() to return a list of installed solvers which can potentially solve a problem you have constructed. At publication, ROI supported eighteen different solvers, which can solve problem classes including linear, quadratic, and mixed integer optimization problems. Installation notes can be found on the official ROI website, and a comprehensive guide to the package is in the ROI paper.
If neither CVXR nor ROI works for your problem instance, your remaining option is to directly use the R interface for the solver that you want to use. To choose an appropriate solver for your problem, I suggest consulting the CRAN Task View on Optimization and Mathematical Programming, which is continuously updated; or Section 3 (particularly Tables 1 and 5) of the ROI publication, which I find easier to read but may not be as up-to-date.