Convex Optimization (2004) [pdf] (opens in new tab)

(web.stanford.edu)

117 pointsnewsoul3y ago56 comments

56 comments

47 comments · 11 top-level

Solvency3y ago· 23 in thread

"A mathematical optimization problem, or just optimization problem, has the form minimize f0(x) subject to fi(x) ≤ bi , i = 1, . . . , m. (1.1) Here the vector x = (x1, . . . , xn) is the optimization variable of the problem, the function f0 : R n → R is the objective function, the functions fi : R n → R, i = 1, . . . , m, are the (inequality) constraint functions, and the constants b1, . . . , bm are the limits, or bounds, for the constraints. A vector x ⋆ is called optimal, or a solution of the problem (1.1), if it has the smallest objective value among all vectors that satisfy the constraints: for any z with f1(z) ≤ b1, . . . , fm(z) ≤ bm, we have f0(z) ≥ f0(x ⋆ ). We generally consider families or classes of optimization problems, characterized by particular forms of the objective and constraint functions. As an important example, the optimization problem (1.1) is called a linear program if..."

Boy, and that's just the opening paragraph of the introduction.

Exactly what arcane requisite elite math precursors are necessary to even remotely understand this?

bo10243y ago

It's more some level of what they call "mathematical maturity", or just experience with math. None of this relies on math past the first or second year of college for a typical STEM degree: linear algebra and vector calculus. But most people aren't used to the notation that is taken for granted here, for example, subscripts on functions.

I know it looks scary if you aren't used to it, but it just takes some practice to pick up. This is far from arcane and elite in the math world!

bjornsing3y ago

I still remember the formal prerequisite for one of the more advanced math courses I took at uni: “The mathematical maturity that several years of study in mathematics typically gives” (roughly translated from Swedish). I thought it was kind if pretentious at the time (most other courses listed other courses as prerequisites), but now I think it makes sense.

jcranmer3y ago

This is one of those things where the math is actually pretty simple, but the notation is incredibly opaque (in part because things are generalized to hell) if you haven't been exposed to it before.

In layman's terms: you have a set of n variables, a set of m constraints on those variables (like x + y ≤ 3, or x² + y² ≥ 1), and some function you're trying to minimize. Oh, and everything involves real numbers, no fancy stuff like complex numbers or rationals or p-adic integers or Banach spaces.

The book itself gives you a taste of what you need to know to fully understand the material:

> The only background required of the reader is a good knowledge of advanced calculus and linear algebra. If the reader has seen basic mathematical analysis (e.g., norms, convergence, elementary topology), and basic probability theory, he or she should be able to follow every argument and discussion in the book.

FooBarBizBazz3y ago

> This is one of those things where the math is actually pretty simple, but the notation is incredibly opaque

If you're not used to it, this kind of notation looks like hieroglyphics.

If you are used to it, every vague English-language technical document you see floating around your workplace just reads like a bunch of flailing-arm hand-waving.

1 more reply

twic3y ago

> Oh, and everything involves real numbers, no fancy stuff like complex numbers or rationals or p-adic integers or Banach spaces.

Or integers! You could probably do complex convex optimisation okay, but when it's integers, you need a whole new set of techniques.

dfan3y ago

From the introduction: "The only background required of the reader is a good knowledge of advanced calculus and linear algebra. If the reader has seen basic mathematical analysis (e.g., norms, convergence, elementary topology), and basic probability theory, he or she should be able to follow every argument and discussion in the book."

It's a graduate-level course. If that paragraph is arcane, the book is probably a few courses in your future.

wiz21c3y ago

A someone who followed courses in num op, I can say that yes, indeed, the math concepts are rather basic and not that hard to learn. BUT you have to be at ease with mathematical reasoning because many of the proofs are not that easy to understand, they involve quite a lot of small insights here and there. Nothing really hard, but your brain will definitely have to work to understand these.

So yeah, the necessary concepts to study that course are not much but you have to have a strong habit of thinking in maths to apply them. And that habit comes with a lot of practice...

tomrod3y ago

We used the book in an advanced undergrad course. It's definitely doable once you have some real analysis and matrix analysis under your belt.

IIAOPSW3y ago

All this is is mathematical notation version of a data type system. Its telling you the problem search space is represented as vector<n> x, the optimization problem is 'real f(vector<n> x)', the problem constraint is vector<m> b, and the meaning of b is we are going to test 'for i=0; i<m; i++' { is f(x)<=b[i] }

HelloNurse3y ago

The mathematical entities are straightforward, and the important things to remember in this definitions package are conventions and names: x is the variable vector and its cardinality is n, minimization of a function called f0 is the objective, fi<=bi are the m constraints, and they are numbered starting from 1 to exclude "f0" and avoid the use of a different letter for the objective function.

1 more reply

donkeybeer3y ago

I must clarify, the function to optimize is f0(x), and the constraints are a set of functions fi that should each be <= bi. You have used a single function for the constraints in your pseudocode.

1 more reply

Blammar3y ago

Nicely put!

peterfirefly3y ago

Getting tripped up by this is like seeing Cyrillic letters in an intro book on Russian and believing those are the hard parts of learning Russian.

As others have said: it's actually fairly straightforward and clear. That paragraph states exactly what they mean by "a mathematical optimization problem".

A simpler version that you might have seen in Calculus for Jocks is one where you are supposed to find the optimum of a single-variate function with no further constraints -- just find the largest or smallest value. You would look for places where the tangent is horizontal (by differentiating) + you would look at the "ends" (by looking at limits).

"Here the vector x..." tells us that this cost function is a bit more complicated: you are not looking for a single input value but a vector of many input values.

"subject to fi(x) <= bi" tells us that there are constraints and what language the authors will use to describe those constraints.

It's really all fairly standard and simple. The hard parts come later.

cgio3y ago

It can be a bit intimidating but try to read it not as something you need to understand as in a truth revealed but rather as something you need to understand as in these are the terms we’ll be using to represent what we explain in text. I.e. f0 for objective function, fi for constraints etc. Many otherwise approachable books have this wall of conventions in the beginning + a set of proofs that underpin the key thesis. Try to speed through these sections, picking up as many symbols, findings, as you can , but not everything, and when you reach the actual essence you can go back and revisit. The good thing is that this wall in the beginning becomes very common between books the more you read on a topic. In a sense, if you are a programmer, think of this as the schema representation for the actual message to come.

hinkley3y ago

Once upon a time I mentioned in passing that I subscribed to the proceedings of SIGPLAN. My coworker shot his hand out to stop the conversation.

“You can read those??”

“A little more than half.”

I knew exactly what he meant, and was amused that “half” satisfied his sudden suspicion that I was an alien living among humans.

turtleyacht3y ago

SIGPLAN: Special Interest Group on Programming Languages (online)

A list of them can be found at

https://www.acm.org/special-interest-groups/join

aleph_minus_one3y ago

> Boy, and that's just the opening paragraph of the introduction.

> Exactly what arcane requisite elite math precursors are necessary to even remotely understand this?

You don't have a degree in some STEM area (or even economics)? To me, this rather looks like the kind of basic mathematical notation that you learn and get become perfectly used to in the first two years of your degree course.

donkeybeer3y ago

Its saying a very simple thing with somewhat convoluted language. First of all for full generality they are treating functions of Rn -> R (if the output is also a vector then it becomes a different type of problem, I believe its called multimodal optimisation).

And the f1, f2, ... fm are simply the (inequality) constraints that all candidate solutions must satisfy.

For example, you might want to maximize the volume of something you want to build from sheet metal, then f0 could be the expression for the volume of body and the constraint could be one inequality ie area(x) <= your_maximum_budget_for_sheet_metal etc

ant6n3y ago

> Exactly what arcane requisite elite math precursors are necessary to even remotely understand this?

Probably helps to have a bit of a background in mathematical optimization, if you, uh, pardon the recursion.

quantum_state3y ago

If one reads a few serious introductory math books, starting from set theory, etc., one would get used to this “math” language and find it natural, precise, and effective.

nextos3y ago

Also, really important to refer to Boyd's prequel book, which is much simpler and can be read by anyone here without any prior background: https://web.stanford.edu/~boyd/vmls

It has tons of code and exercises in Julia and Python. Start here, excellent to get a taste of linear algebra and its applications.

yxhuvud3y ago

Huh? This is quite straightforward and is simply establishing what terms are used for describing the problem. The problem formulation itself seems to be the same as in any lower level course on optimization that is taught to regular masters students, so it is expected that anyone reading the paper will already know it.

gigatexal3y ago

I'm probably the most junior math person here. I got to Advanced Algebra and then Calc 1 and realized I peaked in Advanced Algebra. That being said, judging from just the paragraph one might need an understanding of number theory (sets to be exact), vectors, ...

I always tend to get tripped up in the terminology and the symbols.

mvcalder3y ago· 4 in thread

Professors Boyd and Vandenberghe really broke ground with this text. Prior to this, optimization algorithms and methods were very much locked up behind a metaphorical paywall: difficult to access literature with very high barriers to entry, and strictly commercial software offerings. They brought optimization to the masses and should be celebrated for it.

fiforpg3y ago

Come on, prior to this people read Nocedal & Wright, which is still very much a standard text on nonlinear optimization, and there were well-known implementations of nonlinear optimization algorithms written by these people in Fortran. These are most likely hiding in any modern LBFGS library you are looking at, including Scipy etc.

It is rather that more people understand these algorithms now and more people wrote implementations or bindings for popular languages, so you don't need to use Fortran anymore these days, but only conveniently invoke your optimization library of choice. The entire optimization ecosystem has matured; any particular good book certainly has contributed to that, but so did any other particular good book.

blt3y ago

They complement each other well IMO. Nocedal & Wright focus more on algorithmic details and methods that can be applied to nonconvex problems. Boyd & Vandenberghe focus more on convex analysis and showing how some non-obvious problems can be expressed in convex form.

B&V might be more useful as an "extended user's manual" for convex optimization software. I would guess that most readers of N&W are writing their own solvers, or at least want to know what all the tolerances mean in their third-party solver's bewildering list of parameters.

PartiallyTyped3y ago

Can attest, I studied Nocedal & Wright's book last year during my master's, it was my favourite course.

optbuild3y ago

Which book by Nocedal and Wright? Can someone link to it?

1 more reply

smiley14373y ago· 3 in thread

I remember ago Lars Blackmore of SpaceX released a paper on soft landing Falcon 9, that's the first time I'd encountered convex optimization

https://www.semanticscholar.org/paper/Lossless-Convexificati...

It blew my mind that you could convexify non-convex curves into useful-for-optimization convex curves to optimize for so many things simultaneously (physics constraints, control thruster limitations, sensor constraints, g forces, etc) and it's cool that part of the spectacular landings we get from SpaceX relies on it

antman3y ago

Non paywalled by Lars Blackmore http://www.larsblackmore.com/iee_tcst13.pdf

PartiallyTyped3y ago

til. For whatever reason I totally imagined it was some RL based method trained on sims. In my defense, RL is used for control problems as well, but this is so cool! Thank you for sharing.

4gotunameagain3y ago

no serious, safety critical system uses RL (except tesla "autopilot" and we see how that went). Control theory algorithms can be validated to work within the desired envelope and produce a valid solution.

The big advantage of convexifying the problem, is that when it is convex you have a guarantee it can be solved in fixed time, a major requirement for real time systems

1 more reply

vrc3y ago· 2 in thread

I had the pleasure of taking this course with Prof. Boyd when he did a semester at MIT and it really was excellent. With a basic understanding of linear algebra and proofs it opened my eyes to so many techniques and ways to look at problems. It also lowered my fear of tackling more complex coursework because it motivated my interest. The only downside is that I became far too over reliant on the MATLAB package they made to pair with the course, so trying to implement some of the techniques later on from scratch took some doing.

m_c_g3y ago

Author of said MATLAB package (CVX) here. Yours is a really interesting observation!

When the book was created, CVX didn't exist. Instead, Boyd & Vandenberghe wrote separate MATLAB scripts for virtually every figure (to be fair, with a lot of cut-and-paste and convenience functions). The course did hum along pretty well without CVX (or its later and now better-supported Python equivalent CVXPY), for sure.

I think it is fair to say that these software packages made convex optimization far more accessible a topic. Certainly, implementing some of the solution techniques is not straightforward. But far more people can actually spend time using convex optimization in their application domains if they don't have to concern themselves with those implementation complexities.

Incidentally, on the commercial side, Mosek ApS and Gurobi both offer Python-based modeling frameworks for convex optimization that do a great job of making the discipline accessible as well. They don't operate in quite the same way as CVX and CVXPY, but that's not really important: what matters is that people can readily solve their problems, not the specific approach that gets that done.

vrc3y ago

I agree with all your points. CVX made it so that students like me spent more time learning the material and techniques and less time worrying about implementation. Great work!

uptownfunk3y ago· 2 in thread

Does this have any application to SOTA ML?

nextos3y ago

Yes, AFAIK to regularized regression which you would typically use in problems with more variables than instances, such as GWAS.

TrackerFF3y ago

Dunno if support vector machines are still considered "SOTA", but that would be one of the most obvious examples of a convex optimization problem (and probably a good starting point for students trying to tie ML and convex optimization)

onos3y ago· 2 in thread

I’ve heard great things but it’s longer than I could commit to reading. Can anyone recommend a similar but more concise text?

antiviral3y ago

Have you looked at the Python library and tutorial?

https://www.cvxpy.org/

onos3y ago

No, thanks for the link!

yellowcake03y ago

One of my favorite math books, there's also the companion text, https://web.mit.edu/~jadbabai/www/EE605/additional_exercises..., which contains a lot of interesting applications, presumably compiled by Boyd himself and his colleagues over the years.

melling3y ago

The videos for Boyd’s classes are on YouTube.

He’s also got an edX course you can audit for free.

https://www.edx.org/course/convex-optimization?index=product...

cpgxiii3y ago

Convex optimization can be a really amazing tool. We use optimization extensively, both on actually convex problems, and on non-convex-but-practically-solvable problems in robotics.

The math surrounding optimization is great; however, the reality of optimization tools is still very poor. A competitive optimizer is a massive project, and outside of a number of mostly limited/specialized solvers, the effective tools are all proprietary and very expensive (e.g. SNOPT, Gurobi, Mosek, CPLEX, etc). How solveable and stable your problems can be depends on these tools, and effective problem formulation (e.g. what and how many constraints you use, how you compute gradients) is essentially a black art learned through hard experience.

There's a great example of the complexity difference between optimization and other tools in the world of motion planning for robots: we expect that any semi-competent undergrad can implement search- and sampling-based planners (e.g. A* or RRT), implementing a good optimizer for trajectory optimization is a multi-million dollar project.

The world of optimization desperately needs a MuJoCo-DeepMind moment, where a large interested company buys one of the major commercial optimization providers and makes their tools free and open source. This would really be transformative to the field.

crabmusket3y ago

Encountered this in my final year of University and audited half the online course before it went way beyond what I needed to know. Boyd was a great lecturer, and the content was fantastic. Really interesting stuff to know. I was applying it to model-predictive control, which is a super interesting set of algorithms as well.

blt3y ago

My grad school bible!

j / k navigate · click thread line to collapse

56 comments

47 comments · 11 top-level

Solvency3y ago· 23 in thread

Boy, and that's just the opening paragraph of the introduction.

Exactly what arcane requisite elite math precursors are necessary to even remotely understand this?

bo10243y ago

I know it looks scary if you aren't used to it, but it just takes some practice to pick up. This is far from arcane and elite in the math world!

bjornsing3y ago

jcranmer3y ago

This is one of those things where the math is actually pretty simple, but the notation is incredibly opaque (in part because things are generalized to hell) if you haven't been exposed to it before.

The book itself gives you a taste of what you need to know to fully understand the material:

FooBarBizBazz3y ago

> This is one of those things where the math is actually pretty simple, but the notation is incredibly opaque

If you're not used to it, this kind of notation looks like hieroglyphics.

If you are used to it, every vague English-language technical document you see floating around your workplace just reads like a bunch of flailing-arm hand-waving.

1 more reply

twic3y ago

> Oh, and everything involves real numbers, no fancy stuff like complex numbers or rationals or p-adic integers or Banach spaces.

Or integers! You could probably do complex convex optimisation okay, but when it's integers, you need a whole new set of techniques.

dfan3y ago

It's a graduate-level course. If that paragraph is arcane, the book is probably a few courses in your future.

wiz21c3y ago

So yeah, the necessary concepts to study that course are not much but you have to have a strong habit of thinking in maths to apply them. And that habit comes with a lot of practice...

tomrod3y ago

We used the book in an advanced undergrad course. It's definitely doable once you have some real analysis and matrix analysis under your belt.

IIAOPSW3y ago

HelloNurse3y ago

1 more reply

donkeybeer3y ago

I must clarify, the function to optimize is f0(x), and the constraints are a set of functions fi that should each be <= bi. You have used a single function for the constraints in your pseudocode.

1 more reply

Blammar3y ago

Nicely put!

peterfirefly3y ago

Getting tripped up by this is like seeing Cyrillic letters in an intro book on Russian and believing those are the hard parts of learning Russian.

As others have said: it's actually fairly straightforward and clear. That paragraph states exactly what they mean by "a mathematical optimization problem".

"Here the vector x..." tells us that this cost function is a bit more complicated: you are not looking for a single input value but a vector of many input values.

"subject to fi(x) <= bi" tells us that there are constraints and what language the authors will use to describe those constraints.

It's really all fairly standard and simple. The hard parts come later.

cgio3y ago

hinkley3y ago

Once upon a time I mentioned in passing that I subscribed to the proceedings of SIGPLAN. My coworker shot his hand out to stop the conversation.

“You can read those??”

“A little more than half.”

I knew exactly what he meant, and was amused that “half” satisfied his sudden suspicion that I was an alien living among humans.

turtleyacht3y ago

SIGPLAN: Special Interest Group on Programming Languages (online)

A list of them can be found at

https://www.acm.org/special-interest-groups/join

aleph_minus_one3y ago

> Boy, and that's just the opening paragraph of the introduction.

> Exactly what arcane requisite elite math precursors are necessary to even remotely understand this?

donkeybeer3y ago

And the f1, f2, ... fm are simply the (inequality) constraints that all candidate solutions must satisfy.

ant6n3y ago

> Exactly what arcane requisite elite math precursors are necessary to even remotely understand this?

Probably helps to have a bit of a background in mathematical optimization, if you, uh, pardon the recursion.

quantum_state3y ago

If one reads a few serious introductory math books, starting from set theory, etc., one would get used to this “math” language and find it natural, precise, and effective.

nextos3y ago

Also, really important to refer to Boyd's prequel book, which is much simpler and can be read by anyone here without any prior background: https://web.stanford.edu/~boyd/vmls

It has tons of code and exercises in Julia and Python. Start here, excellent to get a taste of linear algebra and its applications.

yxhuvud3y ago

gigatexal3y ago

I always tend to get tripped up in the terminology and the symbols.

mvcalder3y ago· 4 in thread

fiforpg3y ago

blt3y ago

PartiallyTyped3y ago

Can attest, I studied Nocedal & Wright's book last year during my master's, it was my favourite course.

optbuild3y ago

Which book by Nocedal and Wright? Can someone link to it?

1 more reply

smiley14373y ago· 3 in thread

I remember ago Lars Blackmore of SpaceX released a paper on soft landing Falcon 9, that's the first time I'd encountered convex optimization

https://www.semanticscholar.org/paper/Lossless-Convexificati...

antman3y ago

Non paywalled by Lars Blackmore http://www.larsblackmore.com/iee_tcst13.pdf

PartiallyTyped3y ago

til. For whatever reason I totally imagined it was some RL based method trained on sims. In my defense, RL is used for control problems as well, but this is so cool! Thank you for sharing.

4gotunameagain3y ago

The big advantage of convexifying the problem, is that when it is convex you have a guarantee it can be solved in fixed time, a major requirement for real time systems

1 more reply

vrc3y ago· 2 in thread

m_c_g3y ago

Author of said MATLAB package (CVX) here. Yours is a really interesting observation!

vrc3y ago

I agree with all your points. CVX made it so that students like me spent more time learning the material and techniques and less time worrying about implementation. Great work!

uptownfunk3y ago· 2 in thread

Does this have any application to SOTA ML?

nextos3y ago

Yes, AFAIK to regularized regression which you would typically use in problems with more variables than instances, such as GWAS.

TrackerFF3y ago

onos3y ago· 2 in thread

I’ve heard great things but it’s longer than I could commit to reading. Can anyone recommend a similar but more concise text?

antiviral3y ago

Have you looked at the Python library and tutorial?

https://www.cvxpy.org/

onos3y ago

No, thanks for the link!

yellowcake03y ago

melling3y ago

The videos for Boyd’s classes are on YouTube.

He’s also got an edX course you can audit for free.

https://www.edx.org/course/convex-optimization?index=product...

cpgxiii3y ago

Convex optimization can be a really amazing tool. We use optimization extensively, both on actually convex problems, and on non-convex-but-practically-solvable problems in robotics.

crabmusket3y ago

blt3y ago

My grad school bible!

j / k navigate · click thread line to collapse