
What Does it Mean to Solve a Problem?

At the tail end of CS103, we talk about decidable and recognizable (R and RE) languages as a way of talking about problems that are beyond the capacity of computers to solve. I’ve typically framed the discussion of these languages in terms of trying to answer this overarching question:

What problems can be solved by a computer?

Since CS103 stresses mathematical rigor, I usually introduce this question less as a concrete mathematical goal and more as a framework for thinking about computation. In order to rigorously answer this question, we need definitions of “problem,” “computer,” and “solve.” In formal language theory, problems are modeled by languages and computers are modeled by Turing machines. The idea of “solving” a problem can then be formulated in terms of decidability and recognizability, along with a ton of other solution strategies that we don’t explore in the course (e.g. co-recognizability).

To help students see why it’s so important to have a formal definition of what it means to solve a problem, I’ve added this question to the problem set; it connects the earlier material on implications to the later material on computability:

Let L be a language over \Sigma and M be a TM with input alphabet \Sigma. Below are three properties that may hold for M:

  1. M halts on all inputs.
  2. For any string w \in \Sigma^*, if M accepts w, then w \in L.
  3. For any string w \in \Sigma^*, if M rejects w, then w \notin L.

At some level, for a TM to claim to solve a problem, it should have at least some of these properties. Interestingly, though, just having two of these properties doesn’t say much.

  1. Prove that if L is any language over \Sigma, then there is a TM M that satisfies properties (1) and (2).
  2. Prove that if L is any language over \Sigma, then there is a TM M that satisfies properties (1) and (3).
  3. Prove that if L is any language over \Sigma, then there is a TM M that satisfies properties (2) and (3).
  4. Suppose that L is a language over \Sigma for which there is a TM M that satisfies properties (1), (2), and (3). What can you say about L? Prove it.

Many students, upon reading this problem, think that they must have misread something, since it looks like it suggests that all languages are decidable or something along those lines. Working out why exactly this isn’t the case is trickier and forces students to read the properties closely to determine what they mean.

These questions also segue nicely into questions about sound and complete approximations to problems. Many students ask whether, if a language is undecidable, there’s still a way to build a TM that gets the right answer “most of the time,” for some definition of “most of the time,” or that gets an “approximately” correct answer. This question shows that it’s always possible to find a decidable subset or superset of any given language (though not a particularly useful one), and is a great launching point for a further discussion of the topic.
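
To make that last claim concrete, here’s a sketch of where the first three parts are heading (skip it if you’d rather work the problem yourself):

  • A TM that immediately rejects every input halts everywhere and never accepts, so it satisfies properties (1) and (2), with (2) holding vacuously; it decides \emptyset, a decidable subset of L.
  • A TM that immediately accepts every input halts everywhere and never rejects, so it satisfies properties (1) and (3), with (3) holding vacuously; it decides \Sigma^*, a decidable superset of L.
  • A TM that loops on every input never accepts and never rejects, so it satisfies properties (2) and (3) vacuously while saying nothing whatsoever about L.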


A Set of Things Not Implied

The first midterm exam for CS103 focuses heavily on set theory and formal logic, since they form the backbone for so many of the later mathematical structures and concepts that come up in the course (and more broadly in discrete mathematics and theoretical computer science). Each quarter, I try to come up with at least one good question tying the two together.

A while back I came up with the following problem, which has had a lot of staying power and is a great way to test the two concepts:

Let A and B be arbitrary sets and consider the set S defined below:

S = \{ x | \lnot(x \in A \to x \in B) \}

Write an expression for S in terms of A and B using the standard set operators (union, intersection, etc.), but without using set-builder notation. Briefly justify why your answer is correct.

I highly recommend taking a minute to work through this problem if you haven’t worked with propositional logic or set-builder notation in a while.

One of the reasons I love this problem so much is that the set-builder notation for the set just plain doesn’t seem to make sense. What on earth is meant by a set of things where it isn’t implied that if the things are in one set, they’re in another? But if you proceed slowly, expand out the negation of the implication x \in A \to x \in B, then read the overall expression, you can see that it ends up working out really nicely to the set A - B. That in turn might cause you to wonder what the connection between a negated implication and the set A - B is, which might remind you that an implication only fails to hold when its antecedent holds and its consequent doesn’t. That might then cause you to wonder whether there are other connections between set theory and propositional logic, and now you’re starting to get a better sense for how everything more generally relates to one another.
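
For reference, here’s the expansion written out using the standard propositional identities:

    \lnot(x \in A \to x \in B) \equiv \lnot(x \notin A \lor x \in B) \equiv x \in A \land x \notin B

so S = \{ x | x \in A \land x \notin B \}, which is exactly A - B.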

Bin Packing and Health Care

In an earlier post I mentioned that I enjoy reframing classic algorithms questions as real-world problems to motivate student interest. One problem I’ve used as an exercise in recursive backtracking (both on an exam and in a recent assignment) is a version of the bin packing problem framed in terms of scheduling patients in a hospital. The problem is described like this:

You are working at a hospital trying to coordinate times in which non-emergency patients can meet with their doctors. Each doctor has a maximum number of hours each day that she can see patients. For each patient, you are given how much time she will need to spend with her doctor. Given the amount of time that each doctor is free on a given day, along with the amount of time required for each patient, you are interested in determining whether or not it is possible for every patient to be seen.

I guess that since the doctors have a bunch of patients to see and no fixed plan to see them, they’re Doctors Without Orders!

This problem is essentially the bin packing problem with the modification that each bin might have a different total size. Each doctor forms a bin whose capacity is equal to the number of free hours in the doctor’s schedule, and each patient is an object to place into a bin whose size is equal to the number of hours for which she needs to be seen.

The theming on this problem is quite nice – there’s a clear motivation behind why you might want to find the most efficient way to schedule patients across doctors. Since the problem is NP-hard, natural greedy strategies will fail to find optimal solutions on some inputs (which is a great way to introduce students to some concepts from algorithm design). It also generalizes in a lot of fun ways. Imagine you have a bunch of doctors like these and want to plan a full week’s schedule, or otherwise try to get everyone seen as fast as possible. How would you do that? How might you define a measure of “fairness” in terms of workload across the doctors, then try to maximize fairness while getting everyone seen? What if the doctors are specialists in different subfields and each patient needs to be seen by a doctor with a certain set of skills? And on top of it all, the problem has a really nice recursive backtracking solution: fix an unassigned patient, try assigning them to each doctor who has enough time left to see them, and backtrack if none of those assignments works out.
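
To make the recursion concrete, here’s a minimal sketch of the yes/no version in C++. It uses plain STL types and hypothetical names rather than the Stanford library classes and the interface from the actual assignment:

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Backtracking core: hoursFree maps each doctor to her remaining free hours,
// and patients[i] is the number of hours patient i needs with her doctor.
// Returns true if every patient from index onward can be assigned to some doctor.
bool canSchedule(std::unordered_map<std::string, int>& hoursFree,
                 const std::vector<int>& patients,
                 std::size_t index = 0) {
    if (index == patients.size()) return true;       // everyone has been seen

    int needed = patients[index];
    for (auto& entry : hoursFree) {
        int& free = entry.second;
        if (free < needed) continue;                  // this doctor can't fit them

        free -= needed;                               // tentatively assign the patient
        if (canSchedule(hoursFree, patients, index + 1)) return true;
        free += needed;                               // undo and try the next doctor
    }
    return false;                                     // no assignment works from here
}
```

Calling canSchedule(doctorHours, patientHours) answers the feasibility question; recovering the actual assignment is just a matter of recording each tentative choice before recursing.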

Negations, Contrapositives, and Happiness

One of the first skills I teach students in CS103 is how to take the negations of different classes of statements, such as universally-quantified statements, existentially-quantified statements, vacuous truths, and implications. The goal is to help students get comfortable with proofs by contradiction and contrapositive as early as possible.

I accidentally stumbled on this little exercise, which is very effective at forcing students to pause and see how all these concepts interact:

What is the contrapositive of the statement “if someone is happy, then everyone is happy?”

  1. If everyone is happy, then someone is happy.
  2. If someone is not happy, then everyone is not happy.
  3. If everyone is not happy, then someone is not happy.
  4. If everyone is happy, then someone is not happy.
  5. If someone is not happy, then everyone is happy.

What is the negation of the statement “if someone is happy, then everyone is happy”?

  1. If someone is not happy, then everyone is not happy.
  2. If someone is happy, then someone is not happy.
  3. Someone is happy and everyone is not happy.
  4. Someone is happy and someone is not happy.
  5. Someone is not happy and everyone is not happy.

If you haven’t practiced taking contrapositives and negating formulas recently, it’s worth taking a minute to answer these questions for yourself before moving on. (The answers are scattered throughout the rest of this post.)

These problems tend to really throw students when they’re getting started and they’re excellent at launching helpful conversations. There are a number of reasons why.

First, the statement at the core of these questions – “if someone is happy, then everyone is happy” – is hard to wrap your head around in a mathematical sense. A plain English reading of this statement leads most people to conclude that it’s totally false. The fact that one person is happy doesn’t cause other people to be happy. But if you interpret this statement in a mathematical sense – as in something given by the first-order logic statement (\exists x. Happy(x)) \to (\forall x. Happy(x)) – then the meaning is totally different. That implication is a material conditional, and the truth or falsity of the statement depends on the world in which you evaluate it. So for most students, learning to work with this statement at all involves first recognizing that they’re not dealing with implications or conditionals in the familiar sense of the term.

The second challenge with answering these questions is that they require students to have a fairly good understanding of how to negate universally-quantified statements, existentially-quantified statements, and implications. This is an area I’ve found students to be perennially perplexed by when they’re getting started. Let’s take the contrapositive, for example. Most students can recall independently that

  • the negation of a statement of the form \forall x. P(x) is \exists x. \lnot P(x),
  • the negation of a statement of the form \exists x. P(x) is \forall x. \lnot P(x), and
  • the contrapositive of the implication A \to B is \lnot B \to \lnot A,

but they don’t realize how to assemble those pieces. The majority of students I’ve had will correctly realize that they need to negate the antecedent and consequent and swap their roles, but don’t recognize that negating the statements “someone is happy” and “everyone is happy” involves negating an existentially-quantified and a universally-quantified statement, respectively.

Then there’s the fact that the correct answer, in some sense, looks wrong. If you negate the antecedent and consequent and swap their order, you get back (in first-order logic) the statement (\exists x. \lnot Happy(x)) \to (\forall x. \lnot Happy(x)), which renders in English as “if someone is not happy, then everyone is not happy.” Many students working through the answers by process of elimination will rule this one out because it initially looks like the antecedent and consequent were negated but not swapped.
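
One way to defuse that reaction is to write out the contrapositive in two explicit steps: first negate both sides and swap them, then push the negations through the quantifiers:

    \lnot(\forall x. Happy(x)) \to \lnot(\exists x. Happy(x))
    \equiv (\exists x. \lnot Happy(x)) \to (\forall x. \lnot Happy(x))

The antecedent and consequent really did trade places; it’s the quantifier flip that disguises the swap.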

And finally, there’s a big question that comes up about why, if this really is the contrapositive, it means the same thing as the original statement. Why do the statements “if someone is happy, then everyone is happy” and “if someone is not happy, then everyone is not happy” mean the same thing?

The best way I know to answer that question is to move on to the next part and think about the negation of the statement. Seeing when this statement isn’t true sheds a lot of light on when it is true.

Again, negating this statement gives students a lot of trouble. Most students get this one wrong by choosing the option “if someone is happy, then someone is not happy,” or, in first-order logic, (\exists x. Happy(x)) \to (\exists x. \lnot Happy(x)). The issue here is that the negation of an implication of the form A \to B is A \land \lnot B, rather than A \to \lnot B. The correct answer is “someone is happy and someone is not happy.”
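
Written out in first-order logic, the negation goes like this:

    \lnot((\exists x. Happy(x)) \to (\forall x. Happy(x)))
    \equiv (\exists x. Happy(x)) \land \lnot(\forall x. Happy(x))
    \equiv (\exists x. Happy(x)) \land (\exists x. \lnot Happy(x))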

So why is that statement the negation of the original one (or its contrapositive)? I like asking students to mull this one over. If you’re in a room where there’s a happy person and a sad person, then the claim “if someone is happy, then everyone is happy” is false, since the sad person disproves it, and the claim “if someone is not happy, then everyone is not happy” is also false because of the happy person. (Most students don’t find that part too hard to get.) On the other hand, imagine that you have a room where everyone is happy or everyone is sad. If everyone is happy, then the statement “if someone is happy, then everyone is happy” is trivially true (“trivially” in the sense of “it’s an implication with a true consequent” rather than “obviously”), and the statement “if someone is not happy, then everyone is not happy” is vacuously true. There’s a similar argument to be made for the case where everyone is sad.

Because this simple statement touches on so many topics – what implications mean, how to take a contrapositive, how to negate universally-quantified and existentially-quantified statements, negating implications, and vacuous truths – I’ve found that it’s an excellent conversation starter. If I see someone struggling with first-order logic, I’ll often ask this question and have them work through it.

So there you have it – a tiny little implication that leads to a ton of interesting edge cases on the boundaries of logic and language. I doubt I’m the first person to have stumbled on this one, and I hope that it finds a use somewhere outside of CS103!

The Electoral College as Knapsack

One of the challenges in teaching theoretical CS – or at least, the more abstract parts of computer science – is that many of the topics can seem completely divorced from reality. Given that a lot of students take introductory CS courses to build up their programming skills – for interest, because they need them in a research lab, or because they’re looking for a safe and stable career path – it’s just as important to sell the students on the topics as it is to provide a clear and easily understood presentation.

A question I frequently ask myself when I’m teaching a class is whether the material has stakes. Are the motivating examples and questions things that students would consider inherently interesting independent of their technical merits?

About six months ago, I had the pleasure of teaching CS106B, the second course in Stanford’s introductory programming sequence. Topics include recursion, recursive problem-solving, and algorithmic efficiency. This is where students learn to solve problems like “list all the subsets of a set S” or “generate all permutations of a given string.” I have to confess that I’m the sort of person who really gets a kick out of problems like these (and likes playing around with ways of using nonstandard number systems to generate them), but the fact that I think it’s cool to generate all partitions of a set doesn’t really matter if I can’t get the students to find the topic interesting as well.

Over the course of teaching CS106B, I had a ton of fun taking classic algorithms problems (often from Karp’s 21 Problems) and dressing them up in a way that motivated student engagement. One particular example of this that students found interesting (but challenging!) came up on the eve of the 2016 US Presidential Election. NPR had reported that it would be possible to win the presidency with 27% of the popular vote by winning a majority of votes in the 11 states with the highest number of Electoral College votes. Someone tweeted at them that it would be better to instead win a majority of votes in the least-populous states, which would require only 23% of the popular vote. I thought about the setup to this problem: you know the number of votes required to carry each state, along with the number of electoral votes for each state, and you want to find the way to get to 270 electoral votes using as few popular votes as possible. Which states do you choose to do this?

I pretty quickly realized that I’d seen this problem before – this is the 0/1 Knapsack Problem! We can think of each state as an object whose “weight” is the number of popular votes needed to win a majority, and whose “value” is its number of electoral votes. This means that all of the theory built up over the years for the knapsack problem transfers over to this new problem of winning the presidency with the fewest popular votes.

I ended up turning this problem into a programming assignment for CS106B. I asked some of my undergraduate section leaders to pull up historical voting data for different presidential elections, and to my astonishment two of them (Nick Bowman and Eli Echt-Wilson) pulled up historical data going back to 1828 by the end of the day. Using some shape data files for US states that I’d downloaded back in 2011 (unfortunately, I cannot remember the source), I whipped up a little visualizer and asked students to code up the back-end logic to take in a list of states (along with popular votes cast and electoral vote totals) and output the minimum collection of states necessary to win the presidency. I gave some high-level guideposts about how to proceed – start off by writing a brute-force recursive backtracking solution, then add memoization – and provided starter code that hooked their answer into the visualizer.
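
For the curious, here’s a rough sketch of the kind of memoized recursion involved – with hypothetical names and plain STL types, not the actual starter code or interface from the assignment. The goal is to minimize the popular votes spent while accumulating at least the required number of electoral votes:

```cpp
#include <algorithm>
#include <climits>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// One state's worth of data: its electoral votes and the popular votes needed
// to carry it (a bare majority of the votes actually cast there).
struct State {
    std::string name;
    int electoralVotes;
    int popularVotesToWin;
};

// Fewest popular votes needed to collect at least electoralNeeded electoral
// votes using states[index..]. Returns a large sentinel if it's impossible.
// Memoized on (index, electoralNeeded), the only two values the answer depends on.
long long minPopularVotes(const std::vector<State>& states,
                          std::size_t index,
                          int electoralNeeded,
                          std::unordered_map<long long, long long>& memo) {
    if (electoralNeeded <= 0) return 0;                 // already have enough
    if (index == states.size()) return LLONG_MAX / 2;   // ran out of states

    long long key = static_cast<long long>(index) * 1000 + electoralNeeded;
    auto it = memo.find(key);
    if (it != memo.end()) return it->second;

    // Option 1: skip this state. Option 2: win it, paying its popular votes.
    long long best = minPopularVotes(states, index + 1, electoralNeeded, memo);
    long long withState =
        states[index].popularVotesToWin +
        minPopularVotes(states, index + 1,
                        electoralNeeded - states[index].electoralVotes, memo);

    best = std::min(best, withState);
    memo[key] = best;
    return best;
}
```

To recover the actual set of states rather than just the vote count, you’d also record which branch produced the minimum at each step. The electoral-vote threshold is passed in rather than hardcoded, since the size of the Electoral College (and hence the majority needed) has changed over the years.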

The results were quite interesting. It turns out that the NPR article still had it wrong. Based on voting patterns from 2012, and making some simplifying assumptions (that electors never defect, that states that split their electors instead decide to send them as a bloc, that there’s a majority in the Electoral College, that there aren’t any third-party spoilers, etc.), it turns out that you’d only need around 21.1% of the popular vote to win the presidency. Just win the states highlighted in gold:

[Map visualization from the assignment: the minimal winning set of states, highlighted in gold.]

The particular states you’d need to win here seem like a pretty eclectic assortment. It turns out it is a good idea to win some big-population states like California and Texas, though you’d forgo others like New York and Florida. I’m curious what sort of platform you’d need to campaign on in order to win these states. 🙂

It’s interesting to run this program over data from different years to see what other trends emerge. You can run it in 1860 versus 1864 to see that, for some reason, in 1864 a bunch of states from the South didn’t cast Electoral College votes. You can run it over a longer period of time to see the addition of new States to the Union and the change in population centers over time.

One of the advantages of framing the knapsack problem this way is that many students were genuinely curious to explore this problem independently of the underlying CS theory. It motivates students to ask questions about how their government (or the government of their temporary host country) operates, its history, and questions of fairness and electoral power. In the end-of-quarter evaluations, many students reported this problem to be one of their favorites from the quarter.

I think there were some rough edges with the way the question was deployed the first time around – it’s a pretty tricky problem to assign to students who are still getting comfortable with recursion – but with a few touchups I think it could have a lot of staying power.

Cantor’s Theorem and the Universal Set

Over the past six years I’ve had the pleasure of teaching CS103, one of the core CS classes in the Stanford Computer Science department. The course moves fast – it spends five weeks on discrete mathematics; four weeks on automata, languages, and computability theory; and one week on complexity theory. There are no prerequisites for the course, and so the course effectively functions as a combination of introduction to proofwriting and a quick survey of some of the major results from computability and complexity theory.

In the course of teaching CS103 I’ve curated and developed a bunch of discrete math and CS theory exercises that I give out in a number of problem sets, practice exams, and extra practice problems. I wanted to share a few of my favorite problems from that group here, along with a discussion of why I like those problems so much.

To kick things off, I thought I’d share this problem, which originated as an exam question and has since become a mainstay on the problem sets:

What happens if we take absolutely everything and throw it into a set? If we do, we would get a set called the universal set, which we denote 𝒰:

𝒰 = { x | x exists }

Absolutely everything would belong to this set: 1 ∈ 𝒰, ℕ ∈ 𝒰, philosophy ∈ 𝒰, CS103 ∈ 𝒰, etc. In fact, we’d even have 𝒰 ∈ 𝒰, which is strange but not immediately a problem.

Unfortunately, the set 𝒰 doesn’t actually exist, as its existence would break mathematics.

  i. Prove that if A and B are arbitrary sets where A ⊆ B, then |A| ≤ |B|. (Hint: Look at the Guide to Cantor’s theorem. Formally speaking, if you want to prove that |A| ≤ |B|, what do you need to prove? Your answer should involve defining some sort of function between A and B and proving that function has some specific property or properties.)

  ii. Using your result from (i), prove that if 𝒰 exists, then |℘(𝒰)| ≤ |𝒰|.

  iii. Using your result from (ii) and Cantor’s Theorem, prove that 𝒰 does not exist. Feel free to use the following fact in the course of writing up your proof: for any sets A and B, if |A| ≤ |B| and |B| ≤ |A|, then |A| = |B|.

The result you’ve proven shows that there is a collection of objects (namely, the collection of  everything that exists) that cannot be put into a set. When this was discovered at the start of the twentieth century, it caused quite a lot of chaos in the math world and led to a reexamination of logical reasoning itself and a more formal definition of what objects can and cannot be gathered into a set. If you’re curious to learn more about what came out of that, take Math 161 (Set Theory) or Phil 159 (Non-Classical Logic).

This problem has a huge “whoa” factor for students. There is something really cool about proving that the very natural, intuitive concept of a set (a set is just a collection of things) somehow doesn’t quite work. I’ve had a lot of students tell me over the years that this result totally blew their minds.

I also really like the pedagogical value of the question. In the lectures leading up to this problem, I formally define the relations |A| \le |B|, |A| < |B|, and |A| = |B| using injections, surjections, and bijections, then write up a rigorous proof of Cantor’s theorem using those definitions. Part (i) of this problem asks students to prove a result that, in most students’ minds, is probably “obvious,” that if A \subseteq B, then |A| \le |B|. The intent is to get students to engage with the formal definition of this relation and to find an injection f : A \to B. This pushes students to think about what that function might look like, given that there’s basically no information given about what A and B look like.
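
The injection in question turns out to be about as simple as functions get – here’s a quick sketch of part (i), in case you want to check your own answer against it:

    f : A \to B defined by f(x) = x (a legal function from A to B, since A \subseteq B means every x \in A is also in B)
    f(x_1) = f(x_2) immediately gives x_1 = x_2, so f is an injection, and therefore |A| \le |B|.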

Part (ii) of the problem asks people to think about what the power set of the universal set 𝒰 looks like and how it relates back to 𝒰. This hits on the difference between the element-of relation and the subset-of relation. Most students recognize that \wp(\mathcal{U}) \in \mathcal{U}, but it’s a second step to see why \wp(\mathcal{U}) \subseteq \mathcal{U} holds as well. One of the major issues I see students struggle with when they’re just getting started with discrete mathematics is understanding how these two concepts relate to one another, and I’m always looking for more ways to get them to practice this.
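
Written out, the chain of reasoning is short once that subset step is in place:

    every S \in \wp(\mathcal{U}) is a set, and every set is something that exists, so S \in \mathcal{U};
    therefore \wp(\mathcal{U}) \subseteq \mathcal{U}, and by part (i), |\wp(\mathcal{U})| \le |\mathcal{U}|.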

Part (iii) of this problem combines everything together and asks students to think about how everything fits together. The real star here is Cantor’s theorem, which sets up a contradiction between |\wp(\mathcal{U})| \le |\mathcal{U}| and |\mathcal{U}| < |\wp(\mathcal{U})|.

As an interesting technical aside, the “textbook” way you’d prove that the universal set doesn’t exist doesn’t proceed this way. Typically, you’d invoke Russell’s paradox: use the axiom schema of specification to select from 𝒰 the set S = \{ x | x \notin x \}, and then prove that S \in S if and only if S \notin S. There’s a diagonal-esque argument involved in Russell’s paradox (looking at objects that don’t contain themselves), which is still present in the proof route here (embedded in the proof of Cantor’s theorem). In fact, if you trace through the formal proof of Cantor’s theorem using the identity function, which is in a sense what this entire problem is all about, you’ll find that it ends up making precisely the same argument that gives rise to the contradiction in Russell’s paradox.
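
To spell out that last claim a bit: given a function f, Cantor’s proof builds the diagonal set D = \{ x | x \notin f(x) \} and shows that D isn’t in the range of f. Run that construction with the identity function and you get

    D = \{ x | x \notin x \}, with D \in D if and only if D \notin f(D) = D,

which is exactly the set, and exactly the contradiction, at the heart of Russell’s paradox.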

I think this particular problem is one that could be adapted for use in other discrete math courses. The major prerequisites here are Cantor’s theorem and the formal definitions of the cardinality relations, which often come up in some form when talking about countable versus uncountable infinities. If you work through this problem, I think you’ll find that it’s actually not all that tricky and that the sticking points students will hit make for really good teaching moments (expanding out formal definitions, defining functions between sets, element-of versus subset-of, etc.).