Description

(25 pt) You’re helping to run a highperformance computing system capable of processing several terabytes of data per day. For each of n days, you’re presented with a quantity of data; on day i, you’re presented with x_{i} terabytes. For each terabyte you process, you receive a fixed revenue, but any unprocessed data becomes unavailable at the end of the day (i.e., you can’t work on it in any future day).
You can’t always process everything each day because you’re constrained by the capabilities of your computing system, which can only process a fixed number of terabytes in a given day. In fact, it’s running some oneofakind software that, while very sophisticated, is not totally reliable, and so the amount of data you can process goes down with each day that passes since the most recent reboot of the system. On the first day after a reboot, you can process s_{1} terabytes, on the second day after a reboot, you can process s_{2} terabytes,
and so on, up to s_{n}; we assume s_{1} > s_{2} > s_{3} > > s_{n} > 0. (Of course, on day i you can only process up to

_{i} terabytes, regardless of how fast your system is.) To get the system back to peak performance, you can choose to reboot it; but on any day you choose to reboot the system, you can’t process any data at all.
The problem. Given the amounts of available data x_{1}, x_{2}, …, x_{n} for the next n days, and given the profile of your system as expressed by s_{1}, s_{2}, …, s_{n} (and starting from a freshly rebooted system on day 1), choose the days on which you’re going to reboot so as to maximize the total amount of data you process.
Example. Suppose n = 4, and the values of x_{i} and s_{i} are given by the following table.

day1
day2
day3
day4
x
10
1
7
7
s
8
4
2
1
The best solution would be to reboot on day 2 only; this way, you process 8 terabytes on day 1, then 0 on day 2, then 7 on day 3, then 4 on day 4, for a total of 19. (Note that if you didn’t reboot at all, you’d process 8 + 1 + 2 + 1 = 12; and other rebooting strategies give you less than 19 as well.)

Give an example of an instance with the following properties.


There is a “surplus” of data in the sense that x_{i} > s_{1} for every i.



The optimal solution reboots the system at least twice.

In addition to the example, you should say what the optimal solution is. You do not need to provide a proof that it is optimal.



Give an efficient algorithm that takes values for x_{1}, x_{2}, …, x_{n} and s_{1}, s_{2}, …, s_{n} and returns the total number of terabytes processed by an optimal solution.



(25 pt) A palindrome is a string that reads the same from left to right and from right to left. Design an algorithm to find the minimum number of characters required to make a given string to a palindrome if you are allowed to insert characters at any position of the string. For example, for the input “aab” the output should 1 (we’ll add a ’b’ in the beginning so it becomes “baab”).
The algorithm should run in O(n^{2}) time if the input string has length n.

(25 pt) Given an undirected graph G = (V, E), an independent set is a subset I V such that no two nodes in I are adjacent in G. I.e. for any two nodes u, v 2 I, (u, v) 2= E. Finding a maximum cardinality independent set in a graph is a hard problem, but the problem becomes easy when the graph is a tree. Design an algorithm which, given a tree T = (V, E), runs in O(jV j) time and returns a maximum cardinality independent set in T .

(10 pt) Consider the set A = fa_{1}, . . . , a_{n}g and a collection B_{1}, B_{2}, . . . , B_{m} of subsets of A (i.e., B_{i} A for each i). We say that a set H A is a hitting set for the collection B_{1}, B_{2}, . . . , B_{m} if H contains at least one element from each B_{i} —that is, if H \ B_{i} is not empty for each i (so H “hits” all the sets B_{i} ).
We now define the Hitting Set Problem as follows. We are given a set A = fa_{1}, . . . , a_{n}g, a collection B_{1}, B_{2}, . . . , B_{m} of subsets of A, and a number k. We are asked: Is there a hitting set H A for B_{1}, . . . , B_{m} so that the size of H is at most k?
Prove that the vertex cover problem _{p }the hitting set problem.

(15 pt) An undirected graph G = (V, E) is called “kcolorable” if there exists a way to color the nodes with k colors such that no pair of adjacent nodes are assigned the same color. I.e. G is kcolorable iff there exists a kcoloring : V ! f1, . . . , kg, such that for all (u, v) 2 E, (u) 6= (v) (the function is called a proper kcoloring). The “kcolorable problem” is the problem of determining whether an input graph G = (V, E) is kcolorable. Prove that the 3colorable problem _{P} the 4colorable problem.

Homework assignments are due on the exact time indicated. Please submit your homework using the Gradescope system. Email attachments or other electronic delivery methods are not acceptable. To learn how to use Gradescope, you can:
– 1. Watch the oneminute video with complete instructions from here:
https://www.youtube.com/watch?v=wemznvGPfg
– 2. Follow the instructions to generate a PDF scan of the assignments:
http://gradescopestaticassets.s3uswest2.amazonaws.com/help/submitting_ hw_guide.pdf
– 3. Make sure you start each problem on a new page.

We recommend to use L^{A}T_{E}X, L_{Y}X or other word processing software for submitting the homework. This is not a requirement but it helps us to grade the homework and give feedback. For grading, we will take into account both the correctness and the clarity. Your answer are supposed to be in a simple and understandable manner. Sloppy answers are expected to receiver fewer points.

Unless specified, you should justify your algorithm with proof of correctness and time complexity.