Relaxation-Based Search, From Everyday Life To Unfamiliar Territory

https://www.lesswrong.com/posts/tXjJDXCp4pZzzeg8Y/relaxation-based-search-from-everyday-life-to-unfamiliar

Why This Matters: High-Dimensional Worlds

Our world is high-dimensional. Even a trip to visit my parents for the holidays involves hundreds of choices: dates, modes of transport, packing list, how early to arrive at the airport, etc. Multiply them all out, and we have an exponentially vast space of possibilities. And that's just a trip to visit my parents. Imagine the possibilities involved in designing a website or a scientific experiment! When searching in spaces this large, we cannot afford to test every possibility. We are forced to rule out vast swaths of the search space without even looking at them. Relaxation-based search is one of the most general algorithms we know of which can do this.

The key is that we can usually express the problem-space using constraints which each depend on only a few dimensions. For instance, if I want to arrive at the airport an hour before the plane takes off, that constraint involves only two dimensions: my arrival time at the airport, and the takeoff time of the flight. It does not directly depend on what time I wake up, whether I pack a toothbrush, my parents' plans, the cost of the plane tickets, and so on.

So, if I can relax the problem to just a few of the most-difficult constraints, hopefully those constraints will involve a relatively small number of variables, and I can search through that smaller space more efficiently. When booking flights, for instance, I mostly just think about flight date, flight time, and cost (and maybe airline, sometimes, as a tiebreaker). I don't worry about all the other dimensions, like packing my bag, getting to the airport, eating, or buying presents for my parents.

Then, holding my solution for the most-taut constraints constant, I can go handle the other constraints. I plan around my flights to decide when to set my alarm, what/when to eat, and how to get to the airport. Each of those, in turn, involves only a few variables at a time, while holding the rest of the solution constant.
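The two-step structure described above (search the taut constraints in a small subspace, then extend the fixed solution to handle the slack constraints) can be sketched in a few lines of Python. All flight data, budgets, and lead times here are hypothetical, chosen only to illustrate the shape of the algorithm:

```python
# Minimal sketch of relaxation-based search, with invented numbers.

# Full problem: many dimensions, but each taut constraint touches few of them.
flights = [
    {"date": 22, "depart": 9,  "cost": 800},
    {"date": 22, "depart": 18, "cost": 450},
    {"date": 23, "depart": 7,  "cost": 300},
]

def solve_relaxed(flights, budget=500):
    """Step 1: search only over the taut constraints (date, time, cost)."""
    candidates = [f for f in flights if f["cost"] <= budget]
    return min(candidates, key=lambda f: f["cost"])  # cheapest feasible flight

def extend_solution(flight):
    """Step 2: hold the flight fixed and handle the slack constraints.
    Each one depends on only a couple of variables."""
    airport_arrival = flight["depart"] - 1  # arrive 1 hour before takeoff
    alarm = airport_arrival - 2             # 2 hours to get ready and travel
    return {"flight": flight, "airport_arrival": airport_arrival, "alarm": alarm}

flight = solve_relaxed(flights)
plan = extend_solution(flight)
```

Note that `solve_relaxed` never looks at alarms or packing at all; those dimensions only enter once the small, hard subproblem is settled.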

Tautness And Slackness

Imagine that I mistakenly think the plane flight is an easy part of my holiday plans. I show up at the airport on December 22nd, find a service desk, and ask what flights are available. This is likely to go poorly.

One crucial element in making relaxation-based search work well is having an accurate idea of what's hard and what's easy. Which constraints are "slack", i.e. we can relax them and probably tweak the plan easily to handle them later? Which constraints are "taut", i.e. we need to handle them up-front, because we probably won't be able to tweak our plan easily to handle them later?

If we mistakenly relax a taut constraint, then when we need to handle it later on, we're likely to have to scrap a large chunk of the plan and redo it, or spend a lot of resources, or just fail altogether. It's like showing up at the airport on December 22nd expecting to easily get on a plane.

So, two kinds of useful information for plan-making: which constraints are taut, and which are slack.

Hardest-First vs Easiest-First

One implication of the relaxation-based-search view is that we should usually tackle problems hardest-part-first. Advice to do the opposite is reasonably common, so it's worth addressing *why* hardest-first is a good idea, and when/how easiest-first might be better.

The main argument: if we just solve an easy part of a problem, then we'll likely have to throw out our solution and start over mostly from scratch when we get to the hard part. If I have a $1000 vacation budget, and flights cost $800, then planning my hotels without accounting for the flight budget will be basically useless; I'll have to redo it all later. As a general rule, it's the hard parts which most restrict the solution-space, so it's the hard parts which most steer the overall form of solutions.

So why might one instead tackle an easy part first? Well, tackling an easy part of a problem might provide some value other than a usable almost-solution for the main problem. It might help build confidence, or acquire generally-useful resources. In particular, tackling an easy part of a problem can unearth information about the problem itself. If I'm entering a domain where I have little experience, with lots of unknown unknowns, tackling an easy problem pushes me to interact with the system and "get my feet wet". It gives my brain its first few data points about the problem. I may find out that I was completely wrong about which parts are hard vs easy!

Main takeaway: if we have a reasonable idea of what's easy vs hard, then hardest-first makes sense; if we solve an easy part first, we'll likely just have to throw away that effort later. But if we have no idea what's hard vs easy, then the main priority is to figure that out, and tackling an easy-looking problem is one way to get our first few data points.
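The budget example can be made concrete in a toy sketch (all dollar amounts invented). Hardest-first commits to the taut constraint, the expensive flight, before looking at anything else, and most hotel options are pruned before any effort is spent on them:

```python
# Hypothetical numbers: $1000 total budget, flights around $800.
budget = 1000
flight_options = [800, 900]        # the taut constraint: flights are expensive
hotel_options  = [150, 250, 400]   # per-stay hotel costs

# Hardest-first: commit to the binding constraint, then filter the rest.
flight = min(flight_options)                  # pick the cheapest flight: 800
remaining = budget - flight                   # only $200 left for hotels
feasible_hotels = [h for h in hotel_options if h <= remaining]

# Easiest-first would have "planned" around all three hotels, then thrown
# most of that work away once the flight cost was known.
```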

Duality: Mapping Tautness and Slackness in Unfamiliar Territory

More generally, how can we efficiently figure out which constraints are taut vs slack in a new domain? How do we map out the problem/solution space? Here's a concrete example: suppose we want to put something into orbit without using a big rocket. We don't know anything at all about putting things into orbit, so to start out, we babble some dumb starting ideas to see where they fail.

Comments

https://www.lesswrong.com/posts/tXjJDXCp4pZzzeg8Y/relaxation-based-search-from-everyday-life-to-unfamiliar?commentId=MdDbqcYjH5zYRchrM

This seems surprisingly related to a thought I’ve had about the inefficiency of markets in consumer goods. When buying things, people usually only have room to optimize on a small number of taut constraints. One of them is going to be price; you get one, maybe two more if you’re lucky. If some product or service has more axes of variation than "what does it cost" and "does it work at all", they will typically get almost negligible optimization pressure applied to them.

For example, the problem of buying an airplane ticket already has a bunch of taut constraints: the right place, the right time, the right price. The wifi on planes is generally garbage; whether or not there is wifi has risen up to the point where it gets a bit of optimization pressure, but how well the wifi works is way below the required bar. There is no incentive to improve it. Similarly with the food on planes.

(To be honest, this applies to employment relationships, too, which might explain some things about how people generally find jobs to suck.)

https://www.lesswrong.com/posts/tXjJDXCp4pZzzeg8Y/relaxation-based-search-from-everyday-life-to-unfamiliar?commentId=F7tHyWvDJhbbRWRLC

This concept is often discussed in the subfield of AI called planning. There are a few notes you hit on that were of particular interest to me / of relevance to the field:

> The key is that we can usually express the problem-space using constraints which each depend on only a few dimensions.

In Reinforcement Learning and Planning, domains which obey this property are often modeled as Factored Markov Decision Processes (MDPs), where there are known dependency relationships between different portions of the state space that can be represented compactly using a Dynamic Bayes Net (DBN). The dynamics of Factored MDPs are easier to learn from an RL perspective, and knowing that an MDP's state space is factored has other desirable properties from a planning perspective.

> I expect getting to the airport to be easy. There are many ways to get there (train, Uber/Lyft, drive & park) all of which I've used before and any of which would be fine. ... I want to arrive at the airport an hour before the plane takes off, that constraint only involves two dimensions: my arrival time at the airport, and the takeoff time of the flight. It does not directly depend on what time I wake up, whether I pack a toothbrush, my parents' plans, cost of the plane tickets, etc, etc.

You are actually touching on what seem to be three kinds of independence relationships. The first is temporal, and has something to do with options having identical goal states. The second regards the underlying independence relationships of the MDP. The third isn't technically an independence relationship, and instead regards the utility of abstraction. In detail:

  • It doesn't matter which option you take (train, Uber/Lyft, drive & park) because they all have the same termination state (at the airport). This shows that we plan primarily using subgoals.

  • Certain factors of the state space (your parents' plans, whether you pack a toothbrush, cost of the plane tickets) are actually independent of each other, i.e. your parents' plans have no real physical consequences in your plan at any time; e.g. you can walk and chew gum at the same time. This shows that we plan with a factored understanding of the state-action space.

  • The time you wake up does indeed matter in your plan, but the exact time does not. For your planning purposes, waking up any time before you must leave your house (including factoring in packing, etc.) is permissible and functionally equivalent in your plan. All possible states of being awake before your out-the-door time collapse to the same abstract state of being awake on-time. This shows that **we plan using abstract states** (a similar, but subtly different point than point 1).
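The abstraction point in the last bullet can be sketched in a few lines of Python; the deadline and the concrete wake-up times are invented for illustration:

```python
# Every wake-up time before the out-the-door deadline collapses to one
# abstract state, "awake on-time"; the planner only needs that one state.
OUT_THE_DOOR = 7  # hypothetical deadline: must leave by 7:00

def abstract_state(wake_time):
    """Map a concrete wake-up time to an abstract planning state."""
    return "awake on-time" if wake_time <= OUT_THE_DOOR else "late"

# Many concrete states, one abstract state.
concrete = [5, 6, 6.5, 7]
assert {abstract_state(t) for t in concrete} == {"awake on-time"}
```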

> More generally, how can we efficiently figure out which constraints are taut vs slack in a new domain? How do we map out the problem/solution space?

We can use the three kinds of independence relationships I mentioned above to answer these questions in the RL/Planning setting:

  • So long as you can learn to consistently reach a specific state, you can use that state as a subgoal for planning and exploration. This principle is used in some existing RL literature (I’m a student in this lab).

  • If you can figure out the underlying representation of the world and discern independence relationships between state variables, you can focus on making plans for subsets of the state space. This idea is used in some planning literature.

  • If you discover a consistent way to get from any set of states A to a single state b, you can treat all states in A as a single abstract state a, so long as b is relevant to the rest of your plan. This abstraction principle allows one to derive a smaller, discrete MDP (much easier to solve) from a bigger, continuous one. This is actually the theme of the literature in point 1, and here is the source text (to be more specific, I am an undergrad working in George’s lab).
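The second bullet, planning over only the relevant subset of a factored state space, can be sketched as a reachability pass over a toy dependency structure (the variable names and dependencies here are invented, standing in for the edges of a DBN):

```python
# Toy factored state space: each variable depends on only a few others.
dependencies = {
    "at_airport":    {"wake_time", "transport"},
    "toothbrush":    {"packing"},
    "parents_plans": set(),  # independent of everything we control
}

def relevant_variables(goal_vars, deps):
    """Collect every variable the goal transitively depends on, so that
    planning can ignore the rest of the state space entirely."""
    frontier, seen = set(goal_vars), set(goal_vars)
    while frontier:
        v = frontier.pop()
        for parent in deps.get(v, set()):
            if parent not in seen:
                seen.add(parent)
                frontier.add(parent)
    return seen

# Planning to be at the airport never needs to consider parents' plans.
needed = relevant_variables({"at_airport"}, dependencies)
```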

https://www.lesswrong.com/posts/tXjJDXCp4pZzzeg8Y/relaxation-based-search-from-everyday-life-to-unfamiliar?commentId=svaiXusw8uZwxvBso

I read about half this post before realizing that this concept is intuitively familiar to me from the process of translating poems/songs. Very often a poem or song will have certain specific bits that are going to be extra important to get exactly right (e.g. the title, or something conceptually load-bearing, or a particularly clever or emotionally impactful line), or unusually hard for some reason (e.g. it's trying to get across a very specific or finicky or culturally-specific concept, or using clever wordplay, or it's self-referential like "a fourth, a fifth, a minor fall, a major lift"). When considering a new translation, it usually makes sense to start brainstorming from those bits, because every line you write will affect what can go near it. So it makes sense to start with the lines where I'll be lucky if I can think of one good thing that scans, and then from there try to fill in the other lines (which hopefully have more degrees of freedom) with things that not only scan but also rhyme with the hard parts. (Also because sometimes you won't think of anything for the hard parts, and then it might not make sense to invest a bunch of time in working on the rest of the thing.)