PROBLEM LINK:
Div1
Div2
Practice
Setter-Stepan Filippov
Tester-Misha Chorniy
Editorialist-Abhishek Pandey
DIFFICULTY:
MEDIUM
PRE-REQUISITES:
Square Root Decomposition, or Segment Trees, or Data Structures such as Deque
PROBLEM:
Given an array $A[]$ representing height of plants, and another array $B[]$ representing the required height of plants, find minimum time taken to convert array $A[]$ to array $B[]$ using the specified operation. The operation is, choose valid indexes $L$ and $R$ and a target height $H_i$ and cut all plants in the range to height $H_i$. Their initial height must be $\ge H_i$ for this to be valid.
QUICK EXPLANATION:
We can clearly see that the only way of saving time is to make a valid operation in which maximum number of plants need to be cut to same height $B_i$. The problem hence boils down to, making the optimal query by finding such $L$ and $R$ and checking if its valid or not. The case where answer is not possible is trivial to check.
EXPLANATION:
View Content
Hide Content
"Oh no! Its that editorialist again. Guess what Misterrr Editorialist. I solved the question in $O(N\sqrt{N})$ so I dont need to read this editorial :D "
Well, my solution is quite better than $O(N\sqrt{N})$
"Really Mister Editorialist? Guess what? My solution runs in $O(NLogN)$ Ha! Beat that X)"
Ok. We will discuss an $O(N)$ solution then :D
So, as some of you might have perceived, we will be discussing three approaches here. First we will discuss my approach which is $O(N)$, and then we will see Misha's (tester) $O(N\sqrt{N})$ approach and finally discuss how we can convert/optimize it to become $O(NlogN)$ approach of the setter.
1. Editorialist's Solution-
The very first thing we see is if the answer is possible or not. Clearly, no answer is possible if for some plant $i$, its current height $A_i$ is less than the required height $B_i$. If its possible, we proceed further.
First I will try to give the intuition. See what the problem demands. How can we save time? We can save time for every valid query which sets multiple plants to same required height $B_i$. Hence the problem boils down to proper identification of $L$ and $R$ for the query, and checking if such a query can be made or not.
The first problem which we face in brute force is that, to verify the query is possible, we might have to check all plants and their required heights in range of $L$ to $R$. Not to mention getting exact $L$ and $R$ for query might seem problematic to some. Perhaps we can make some observations first to make our life simpler?
- Optimal query will have to set at least one plant to its required height. Hence, its value must be between one of the values $B_i$ such that $L\le i\le R$.
- We further refine our first observation, claiming that the query height must be $max(B_i)$ for $L\le i\le R$. Because if its not so, one of the plants will be cut well below its required height!
Also, imagine if we "have got the required $L$ and $R$" where we got multiple plants which are to be set to same height $B_H$ and are checking if the next element can be included in the range. When will we discard this element at index $i$ from being inside the valid query? We will do so if-
- If the required height of $B_i$ is more than $B_H$.
- If the plant height $A_i$ is less than $B_H$.
With these two principles in mind, lets move towards the solution. I will first describe my algorithm, and then we will discuss what it does, and why is it correct.
- Make a deque/list/appropriate data structure which supports insertion and deletion at both ends.
- For all plants in range $1$ to $N$, do following-
- If my deque is not empty, and the current plant (at index $i$) being considered has a required height $B_i$ which is more than the height at the back of deque, keep popping from back of deque till height at back ($B_j$) satisfies $B_j \ge B_i$ .
- Now check if the height of plant at $i$, (i.e. $A_i$) is less than the required height $B_L$ at the front of deque. If yes, keep popping from front till this condition becomes false.
- After performing step 3 and 4, we can assure that we can make a valid query for all plants currently in the deque, because steps 3. and 4. assure that $B_i$ are stored in descending order, and that the height of final plant $(A_j)$ is sufficient enough to execute query from start to end of deque.
- Now check if required height of the plant ($B_j$) at back of deque is equal to required height of plant in consideration ($B_i$). If yes, then we can successfully make a query from the last element of deque to current element, saving an operation. If we find that $B_i \neq B_j$, then we cannot save an operation and must increment time needed by $1$.
Why is this correct?
Notice that, we maintain the condition that, element is in the deque only if its possible to make a query from plant $B_L$ at start of deque to plant at end of deque/plant being considered right now ($B_R$ or $B_i$). This is possible because $B_i$ gets stored in descending order in deque. Then, we also check before adding a plant if its possible to execute a query from start of deque to it by making sure that the current height of plant being added ($A_i$) is enough to support query for plant at index $L$ (i.e. $A_L$). $L$ and $R$ are, obviously, starting and ending range of query.
Why descending order?
We will discuss this in Setter's approach as well. I will mention my intuition here. Lets take a simple example, we have this configuration of elements in array $B[]$ such that $B[]=\{B_j,B_i,B_j\}$ with $B_j < B_i > B_j$. Can we make a query from $B_j$ at left of $B_i$ to $B_j$ at right of $B_i$ to save time? No, we cannot as it would cut plant $A_i$ to a height less than $B_i$. Hence, once we encounter a $B_i$, we can safely eliminate all previous $B_j$ which are smaller than $B_i$ as we cannot make a query from them across $B_i$. Also, any queries which could had been done before $B_i$ had already been considered when we added the element $B_j$ and every other element thereafter.
Hence, the algorithm is correct.
$Time$ $Complexity-$ $O(N)$
Tester's Aprroach-
The tester follows Square root decomposition.
If you got the intuition (even if only a little) of my solution, then this one is even easier. What Misha did after taking input is, he made two buckets. First bucket $sqa[]$ stores minimum height of plant $A_i$ in the range of bucket, and second bucket $sqb[]$ stores the maximum of the required heights height $B_i$ in range of bucket.
Then, he checks if the configuration is valid or not. While doing so, he also makes a vector of pairs, which stores $< B[i], i> $. He sorts this vector in descending order. Now, he does the following-
- If required height and current height of plants being considered are equal, do nothing. Else, goto 2.
- If this is the first element being considered, or if the required height $B_i$ of this element is not equal to the required height of previous plants, $B_z$ which we were considering, add $1$ to answer and set $B_z=B_i$ and set the index of plant being considered (index $z$) to $i$. Go back to 1. and consider next plant.
- If required height of this plant $B_z$ is equal to required height of previous plant being considered, get the index $y$ to that previous plant in the original array (He had made pairs of $< B[i], i> $ for this purpose). Now check if we can make a valid query from the previous plant's index $z$ to this index $y$. This checking happens in $O(\sqrt{N})$ on basis of the observations about invalid queries I mentioned earlier (i.e. if there is a $B_k$ in between which is more than $B_z$ preventing valid queries across it, or if current height $A_k$ of some plant is less than the required height $B_z$). If the query is possible, do nothing, else add $1$ to answer and set current plant of consideration to plant at index $y$. (i.e. set $B_z=B_y$ and $z=y$.)
$Time$ $Complexity-$ $O(N\sqrt{N})$
Setter's Solution-
Basically, the change is in his' and tester's solution is in the step where we check the possibility of queries. He optimizes the step where we check if the query is valid or not.
What he observed was, that, consider two queries which set height of plants (in some range) to $H_i$ and $H_j$. Now, suppose $H_i$ happens after $H_j$, and $H_i > H_j$.
He observed that, we can swap the order of these two queries, i.e. execute query $i$ before query $j$ without any effect on the final answer. This is because, any plants getting affected by both query $i$ and query $j$ can be ultimately reduced to height $H_j$ because $H_i > H_j$. He uses this to argue that, there always exist a solution where queries are executed in non-increasing order of heights $H_q$. That is, he will execute query which with greatest height $H_q$ first, &etc. This also justifies tester's step of sorting the vector on basis of required heights $B_i$, as each optimal query will make height ($A_i$) of at least one plant equal to its corresponding required height $B_i$.
With this clear, the only difference comes in checking if the query is valid or not. If we closely see what tester did, and what my definitions say about invalid queries, we can reduce it to finding maximum required height $B_i$ in the range of $[L,R]$ and minimum current height $A_i$ in the range $[L,R]$ where $[L,R]$ are the range of query. This can be easily done via appropriate data structure like Segment Tree & similar.
$Time$ $Complexity-$ $O(NLogN)$
SOLUTION:
Setter
Tester
Editorialist
Commented Solution of Editorialist
CHEF VIJJU'S CORNER:
1. This is a very fit data structure problem, and is really beautiful in this way. There are very little problems where you can solve the question elegantly using the apt data structure. The correct code of algorithm with deque was hardly 6-7 lines for me. This question is really appreciable in this regard, and we must appreciate the setter for such an interesting question.
2. As you can see, the setter and tester have $O(N\sqrt{N})$ and $O(NLogN)$ solutions respectively. The best part of being an editorialist is you can claim the best solution for your share :D :p
3. I will like to leave out some of the apt problems on data structures which I found elegant-