PROBLEM LINK:

Author:Avijit Agarwal
Tester and Editorialist:Soumik Sarkar

DIFFICULTY:

MEDIUM

PREREQUISITES:

Prime sieve, Fast fourier transform

PROBLEM:

Let $f(x)=$ the number of pairs $(a, b)$ such that $0 \le a \le b < x$ and $a$ and $b$ are prime and $x = a + b$. Given a number $N$, find the number of pairs $(a, b)$ such that $0 \le a \le b < N$ and $f(N) = f(a) + f(b)$.

EXPLANATION:

Let us assume we can find $f(x)$ in constant time for any given $x$. Then for a query $N$, naively trying all pairs results in a complexity of $\mathcal{O}(N^2)$. Instead we can build a frequency map/array of the values of the sequence $f$ up to index $N-1$. Let's call this frequency map $cnt$, so that $cnt(v)$ is the number of indices $a < N$ with $f(a) = v$. Now we can count how many pairs of values add up to $f(N)$ by iterating over suitable $i$ and adding $cnt(i) \times cnt(f(N)-i)$ to the total. Note that we are looking for pairs where $a \le b$, so we must make sure not to double count any pair. The complexity is $\mathcal{O}(N)$ for one query.
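The counting step can be sketched as below. This is only an illustration: the function name `countPairs` and the convention that `f` is already filled for indices $0..N$ with non-negative values are assumptions, not part of the original solutions.

```cpp
#include <cassert>
#include <vector>

// Counts pairs (a, b) with 0 <= a <= b < N and f[a] + f[b] == f[N].
// Assumes f is precomputed for indices 0..N and holds non-negative values.
long long countPairs(const std::vector<int>& f, int N) {
    assert(N >= 0 && N < static_cast<int>(f.size()));
    const int target = f[N];
    // cnt[v] = number of indices a < N with f[a] == v (values above target cannot help).
    std::vector<long long> cnt(target + 1, 0);
    for (int a = 0; a < N; ++a)
        if (f[a] <= target) ++cnt[f[a]];
    long long ordered = 0;  // ordered pairs (a, b), pairs with a == b included
    for (int v = 0; v <= target; ++v)
        ordered += cnt[v] * cnt[target - v];
    // Pairs with a == b appear once in `ordered`, every other pair twice.
    long long diagonal = (target % 2 == 0) ? cnt[target / 2] : 0;
    return (ordered + diagonal) / 2;
}
```

Adding the diagonal before halving is one way to avoid the double counting mentioned above.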

This approach relies on the availability of all $f(x)$ in constant time. One way to achieve that would be to calculate all primes up to the required range beforehand using a sieve. Let there be $P$ such primes; then we can compute the array $f$ in $\mathcal{O}(P^2)$ time, but that would be too slow.

For an efficient solution, consider the polynomial $P(x) = x^2 + x^3 + x^5 + .... + x^p$ where in each term the exponent of $x$ is a prime and $p$ is the last prime we need to be concerned with.

$$P(x)^2 = ( x^2 + x^3 + x^5 + ....)^2 = x^4 + 2x^5 + x^6 + 2x^7 + 2x^8 + ...$$

Multiplying $x^a$ and $x^b$ gives $x^{a+b}$. Thus in the polynomial above the coefficient of $x^k$ is the number of ways to represent $k$ as a sum of 2 primes. Note that here the pairs $(a, b)$ and $(b, a)$ for $a \ne b$ are counted separately, so the values must be adjusted to suit the given definition of $f(x)$.
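To make the adjustment concrete, here is a small sketch that squares the prime polynomial naively and then converts ordered counts into $f$. The name `primePairCounts` is hypothetical, and the double loop stands in for the FFT product purely for readability.

```cpp
#include <cassert>
#include <vector>

// Returns f[0..limit], where f[k] = number of prime pairs (a, b), a <= b, a + b == k.
std::vector<long long> primePairCounts(int limit) {
    assert(limit >= 0);
    std::vector<bool> isPrime(limit + 1, true);
    isPrime[0] = false;
    if (limit >= 1) isPrime[1] = false;
    for (int i = 2; 1LL * i * i <= limit; ++i)
        if (isPrime[i])
            for (int j = i * i; j <= limit; j += i) isPrime[j] = false;

    // sq[k] = coefficient of x^k in P(x)^2, i.e. the number of ORDERED prime pairs.
    // A real solution would obtain sq with an FFT-based product instead.
    std::vector<long long> sq(limit + 1, 0);
    for (int a = 2; a <= limit; ++a) {
        if (!isPrime[a]) continue;
        for (int b = 2; a + b <= limit; ++b)
            if (isPrime[b]) ++sq[a + b];
    }

    // Each pair with a != b was counted twice, the pair (k/2, k/2) only once.
    std::vector<long long> f(limit + 1, 0);
    for (int k = 0; k <= limit; ++k) {
        long long diagonal = (k % 2 == 0 && isPrime[k / 2]) ? 1 : 0;
        f[k] = (sq[k] + diagonal) / 2;
    }
    return f;
}
```

For example, $10 = 3 + 7 = 5 + 5$, so the ordered count is $3$ and the adjusted value is $(3 + 1) / 2 = 2$.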

Squaring a polynomial of degree $n$ naively takes $\mathcal{O}(n^2)$ time. However there exists the awesome method of Fast Fourier Transform which can be used to efficiently compute the product of two polynomials in $\mathcal{O}(n \log n)$.

The complexity is $\mathcal{O}(N \log N + TN)$.

AUTHOR'S AND TESTER'S SOLUTION:

Author's solution can be found here
Tester's solution can be found here.

PROBLEM LINK:

Practice
Contest

Author and Editorialist:Soumik Sarkar
Tester:Sarthak Manna

DIFFICULTY:

MEDIUM

PREREQUISITES:

Euler tour, Dynamic programming

PROBLEM:

There is a tree with $N$ vertices, and a set of vertices $G$. For every subset $G'$ of $G$, find the maximum number of edges which can be removed while leaving each vertex in $G'$ connected to vertex $1$. Calculate the XOR-sum of all these values.

EXPLANATION:

Consider the tree to be rooted at $1$. Given a set $G$, imagine you have removed all edges that are not necessary for the vertices in $G$ to remain connected to $1$. What will this graph look like?

Of course the graph will remain a connected tree. Also, all the leaves will belong to $G$. Some elements of $G$ may lie in the interior of the tree as well. If asked to find the number of edges in this tree, you would say the answer is one less than the number of vertices in it. However, in a roundabout way I can also claim that counting the edges on an Euler tour of this tree will give me exactly twice the answer.

Let it be that during the Euler tour I start from the root and happen to visit the vertices in $G$ in the order $G_1, G_2, G_3, ... G_K$. I can say the total length of the Euler tour is $dist(1, G_1) + dist(G_1, G_2) + dist(G_2, G_3) + .... + dist(G_K, 1)$. As mentioned before, this is exactly twice the number of edges in the tree, but the values of all these terms are the same as in the original tree. So if we know $dist(G_i, G_j)$ for each pair $(i, j)$ and also the "Euler tour order" of the vertices in $G$, we can calculate the answer in $\mathcal{O}(K)$!

We can get the Euler tour order with one dfs from the root. Then we can run one dfs from each of the $K$ vertices in $G$ to calculate the distance between each pair of vertices in $G$. It is also possible to use faster methods to get the distance matrix, but that is not necessary.

However applying this procedure in this form is too slow as it requires $\mathcal{O}(K)$ time ($K/2$ on average) for each subset of $G$.

To optimize it, we can notice that we can use the answer for one subset to calculate that of another. Let us denote by $f(G)$ the value $dist(1, G_1) + dist(G_1, G_2) + ... + dist(G_{x-1}, G_{x})$ where $G_1$ to $G_x$ are the elements of $G$ in Euler tour order. Then clearly $f(G) = f(G \setminus \{G_x\}) + dist(G_{x-1}, G_{x})$. Now we can apply dynamic programming using bitmasks to denote subsets and calculate the answer for each subset in constant time and XOR them together.

ans = N - 1
for mask in [1..2^K - 1]:
    x = last set bit in mask
    if x is the only set bit in mask:
        dp[mask] = dist(1, G[x])
        ans = ans XOR (N - 1 - dp[mask])
    else:
        y = second last set bit in mask
        dp[mask] = dp[mask \ x] + dist(G[y], G[x])
        ans = ans XOR (N - 1 - (dp[mask] + dist(G[x], 1)) / 2)

Note: The solution requires finding the last (or first) 2 set bits in an integer which can be done quickly in GCC using __builtin_ctz. If you instead find the set bits by looping over each bit position it appears to take $\mathcal{O}(K)$ time for each subset but with a little thought it can be shown that the total operations required over all subsets remains $\mathcal{O}(2^K)$.

Total complexity is $\mathcal{O}(NK + 2^K)$.
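The bitmask DP could look roughly like this in C++. It is a sketch under assumptions: the name `xorOverSubsets`, the inputs `distRoot` and `dist` (distances already computed, with $G$ indexed in Euler tour order) are hypothetical, and `__builtin_clz` is a GCC/Clang builtin used here to find the highest set bit.

```cpp
#include <cassert>
#include <vector>

// distRoot[i] = dist(1, G[i]); dist[i][j] = dist(G[i], G[j]).
// G is assumed to be indexed in Euler tour order (index 0 is visited first).
long long xorOverSubsets(int N, const std::vector<int>& distRoot,
                         const std::vector<std::vector<int>>& dist) {
    const int K = static_cast<int>(distRoot.size());
    assert(K >= 0 && K <= 25);
    std::vector<int> dp(1u << K, 0);
    long long ans = N - 1;  // empty subset: every edge can be removed
    for (unsigned mask = 1; mask < (1u << K); ++mask) {
        int x = 31 - __builtin_clz(mask);      // last vertex of the subset in tour order
        unsigned rest = mask ^ (1u << x);      // the subset without x
        if (rest == 0) {
            dp[mask] = distRoot[x];
        } else {
            int y = 31 - __builtin_clz(rest);  // second last vertex in tour order
            dp[mask] = dp[rest] + dist[y][x];
        }
        int kept = (dp[mask] + distRoot[x]) / 2;  // close the tour, then halve
        ans ^= (N - 1 - kept);
    }
    return ans;
}
```

On the path $1 - 2 - 3$ with $G = \{2, 3\}$ the four subsets allow $2, 1, 0, 0$ removals, whose XOR is $3$.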

ALTERNATE SOLUTION:

Instead of calculating the answer as half of $dist(1, G_1) + dist(G_1, G_2) + ... + dist(G_K, 1)$ one can also claim that the answer is exactly $dist(lca(1, G_1), G_1) + dist(lca(G_1, G_2), G_2) + ... + dist(lca(G_{K-1}, G_K), G_K)$, where $lca(u, v)$ is the lowest common ancestor of $u$ and $v$. The rest of the process can be done based on this as well.

AUTHOR'S AND TESTER'S SOLUTION:

Author's solution can be found here
Tester's solution can be found here.

prob : http://codeforces.com/contest/957/problem/D?mobile=false

I have seen the editorial but didn't understand it. Can someone please explain it in an easier way?

Is there some problem with Codechef? The problems are taking too long to be evaluated.

This C++ code does not produce any error. Can anybody please explain why?

https://ide.geeksforgeeks.org/3Cwr1gGwNv

How is big-Theta computed from the concept of big-O? Please explain with an example.

I am trying to solve the problem Enjoy Sum with Operations from SPOJ; the problem is based on segment tree. Here is my solution. Can anyone help me in getting AC in this problem?

PROBLEM LINK:

Div1
Practice

Setter-Trung Nguyen
Tester-Misha Chorniy
Editorialist-Abhishek Pandey

DIFFICULTY:

HARD

PRE-REQUISITES:

Fast Walsh-Hadamard Transform (some knowledge of concepts like FFT may aid in understanding), Dynamic Programming with Bitmasking

This editorial will assume you have a knowledge of the pre-requisites.

PROBLEM:

You are to count number of "$S$ Semi Palindromic" numbers which are less than ${10}^{K}$. A non-negative number is called $S$ Semi Palindrome if its digits can be re-arranged to give a palindromic number.

QUICK EXPLANATION:

Let $A[i][mask]$ be the matrix of transition where $i$ represents the remainder $\%S$ and $mask$ represents a XOR-sum of digits, $XorSum={2}^{d_0}\oplus{2}^{d_1}\oplus\dots\oplus{2}^{d_k}$. Our answer is $\sum_{i=1}^{K} {A}^{i}$, but it will count leading zeroes! However, we can easily remove the count of leading zeroes if we also store the state $\sum_{i=1}^{K-1} {A}^{i}$. Say matrices $Ans1[][]=\sum_{i=1}^{K} {A}^{i}$ and $Ans2[][]=\sum_{i=1}^{K-1} {A}^{i}$. Our final answer will be $Ans=\sum_{d_i=0}^{9}\left(Ans1[0][{2}^{d_i}]-Ans2[0][({2}^{d_i})\oplus ({2}^{0})]\right)$

EXPLANATION:

First, let's make a transition matrix, which is initialized to hold the answer for $K=1$ (i.e. single digit numbers). Hence, the initial state of our transition matrix is-

for (int i=0; i<10; i++) A[i%s][1<< i]=1;

The meaning of $A[i][mask]$ is the count of all numbers which leave a remainder $i$ when taken $\%S$ and have the given mask. Note that the mask here is calculated by $Mask={2}^{d_0}\oplus{2}^{d_1}\oplus\dots\oplus{2}^{d_k}$. This helps in easy semi-palindrome detection: digits can have values only from $0-9$, and if we take the $xor$ of $\large {2}^{d_i}$ over all digits, then a semi-palindrome must have a mask which is either a power of $2$ or $0$. A power of $2$ represents a single digit occurring an odd number of times (the middle digit) while the rest occur an even number of times, and a $0$ represents that every digit occurs in multiples of $2$, which makes it possible to re-arrange the number to make it a palindrome.
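The mask test itself is tiny. A sketch, with hypothetical function names, that checks only the digit parities exactly as described above (it does not reason about leading zeroes after rearrangement):

```cpp
#include <cassert>

// Parity mask of the decimal digits of n: bit d is set iff digit d
// occurs an odd number of times.
int digitParityMask(long long n) {
    assert(n >= 0);
    int mask = 0;
    do {
        mask ^= 1 << (n % 10);
        n /= 10;
    } while (n > 0);
    return mask;
}

// True when the mask is 0 or a power of two, i.e. at most one digit is unpaired.
bool hasSemiPalindromeMask(long long n) {
    int mask = digitParityMask(n);
    return (mask & (mask - 1)) == 0;
}
```

The expression `mask & (mask - 1)` clears the lowest set bit, so it is zero exactly when at most one bit was set.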

Our $A[][]$ holds answer for $K=1$ currently. To answer the query, we need $\sum_{i=1}^{K} {A}^{i}$ where ${A}^{i}$ will store answer for numbers of length $i$. This process will also, however, count numbers with leading $0s$ which is undesirable. To compensate for that, we will need the state value of $\sum_{i=1}^{K-1} {A}^{i}$.

But how to find these in first place?

We will be using two tricks.

The first one is the Fast Walsh-Hadamard Transform of our transition matrix $A$, so that we can compute things faster.

The second one is how we will calculate the polynomials.

Say we have the following values-

Matrix $T$=$\sum_{i=1}^{P} {A}^{i}$
Matrix $U$=${A}^{P}$

We need to find $\sum_{i=1}^{2P} {A}^{i}$ from it. What to do?

It turns out to be pretty simple!

We can reformulate $({A}^{2P}+{A}^{2P-1}+\dots+{A}^{P+1})+({A}^{P}+\dots+A)$ as-

${A}^{P}({A}^{P}+{A}^{P-1}+\dots+A)+({A}^{P}+{A}^{P-1}+\dots+A)$

Which is equivalent to, in terms of our matrices, $UT+T$.

Now, remember that we need $\sum_{i=1}^{K} {A}^{i}$ AND $\sum_{i=1}^{K-1} {A}^{i}$. Hence, let's find $\sum_{i=1}^{K-1} {A}^{i}$ using the above logarithmic method. The answer for $\sum_{i=1}^{K} {A}^{i}$ can then be easily obtained by multiplying $\sum_{i=1}^{K-1} {A}^{i}$ by $A$ and adding $A$ once more.
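The doubling identity is easier to see with plain numbers in place of matrices. The sketch below (hypothetical name `geometricSum`, scalars modulo $m$ instead of transformed matrices, and a most-significant-bit-first loop that may differ from the setter's loop order) keeps $T=\sum_{i=1}^{P} a^i$ and $U=a^P$ and grows $P$ bit by bit.

```cpp
#include <cassert>
#include <cstdint>

// Returns (a + a^2 + ... + a^K) mod m using O(log K) multiplications.
uint64_t geometricSum(uint64_t a, uint64_t K, uint64_t m) {
    assert(m > 0 && m < (1ULL << 32));  // keeps every product inside 64 bits
    a %= m;
    uint64_t T = 0;      // sum_{i=1}^{P} a^i, with P = 0 initially
    uint64_t U = 1 % m;  // a^P
    for (int bit = 63; bit >= 0; --bit) {
        // Doubling step P -> 2P: the new sum is U*T + T, the new power is U*U.
        T = (U * T + T) % m;
        U = (U * U) % m;
        if ((K >> bit) & 1) {
            // Increment step P -> P + 1: multiply the sum by a and add a once more.
            T = (T * a + a) % m;
            U = (U * a) % m;
        }
    }
    return T;
}
```

For instance, $2+4+8+16+32=62$ comes out of three doubling steps and two increment steps.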

Suppose matrix $B$ holds the value of ${A}^{K}$ (equivalent to matrix $U$ of my example.) , matrix $D$ stores values of answering summation, and matrix $A$ holds the value of polynomial, (equivalent to matrix $T$ of my example).
So, our code should look somewhat like this-

FWT_Transform_Matrix(A); // Fast Walsh-Hadamard transform of A
int lim = k - 1;
while (lim) {
    if (lim & 1) {
        Multiply_Matrices(D, B, s, r); // s passed to calculate %s
        Add_Matrices(D, A, s);
    }
    for (int i = 0; i < s; i++)
        for (int j = 0; j < N; j++)
            Duplicate[i][j] = A[i][j];
    Multiply_Matrices(A, B, s, r); // r is needed in multiplication process. Explained in bonus section.
    Add_Matrices(A, Duplicate, s);
    Multiply_Matrices(B, B, s, r);
    lim >>= 1;
    r <<= 1;
}

We are almost done!

We have the answer of $\sum_{i=1}^{K-1} {A}^{i}$ stored in matrix $D$. Lets make another matrix $F$ to store the answer for $\sum_{i=1}^{K} {A}^{i}$. Any guesses about $F$ till now?

A = Restore_Original(A); // The transition matrix which stored answer for K=1 length.
FWT_Transform_Matrix(A);
Multiply_Matrices(D, A, F, s, r); // Multiply D and A, store in F.
Add_Matrices(F, A, s);

We are just two steps away from the final answer!

Our first task is to reverse the transformation which we did for faster multiplication and computation of required answer. We have $D=\sum_{i=1}^{K-1} FWT({A}^{i})$ and $F=\sum_{i=1}^{K} FWT({A}^{i})$.

Our next step is to calculate the inverse of this FWT transformation. Once we calculate the inverse, our answer will be of form-

ans = F[0][0] - D[0][1]; // Separately for mask 0 (every digit paired)
for (int i = 0; i <= 9; i++)
    ans += F[0][1 << i] - D[0][(1 << i) ^ (1 << 0)]; // Masks with a single unpaired digit i

Recall that $mask$ must be either $0$ or a ${2}^{i}$. Now, why F[0][1<< i]-D[0][(1<< i)^1] ? The exact reasoning will be clear only when you get acquainted with transition matrices, but I can try giving an intuition on why it's right.

We know that matrix $D$ has answer till length $K-1$. One interpretation goes this way. In numbers of length $K-1$ we have some unpaired number (as ${2}^{i}$) and another unpaired $0$. When we added another digit at $MSB$ to get the ans for numbers till length $K$, the $0$ got paired. So naturally, what will that number be? (Please see, this isnt intended to be an "Absolute Mathematical Reasoning" but merely a rough intuition to kind of help get the logic. People are always advised to explore the topic and get the real mathematical reasoning).

The exact definition of functions Multiply_Matrix, Add_Matrix and another way of dealing with number of leading zeroes is discussed in bonus section.

SOLUTION:

Setter
Tester

$Time$ $Complexity-$ $O({2}^{10}* {S}^{2} *LogK * log({2}^{10}) )$

CHEF VIJJU'S CORNER:

1.Add_Matrix function.


2.Multiply_Matrix Function


3.Another way of bypassing leading $0's$


PROBLEM LINK:

Div2
Practice

Setter-Anuj Maheshwari
Tester-Misha Chorniy
Editorialist-Abhishek Pandey

DIFFICULTY:

Simple

PRE-REQUISITES:

Array, Looping, Basic Math and Principles of Counting, Frequency Array.

PROBLEM:

Given an array of length $N$, find number of unordered pairs $(i,j)$ such that their average exists in the array (real division), or in other words, find number of unordered pairs of $(i,j)$ for which we can find an element in array $A_k$ which satisfies-

$2*A_k=A_i+A_j$ where $A_i$ and $A_j$ are the $i$-th and $j$-th elements of the array.

Unordered pair means that the order of listing elements in the pair doesn't matter. It means $(2,1)$ is treated the same as $(1,2)$. In ordered pairs, we treat both of them as different.

QUICK EXPLANATION:

We immediately see the low constraints for value of $A_i$. We can make a frequency array to record frequency of each array element. The only problem is, $A_i$ can be negative as well. This is easily solved by adding a constant $K$ (preferably $1000$) to all array elements. With all elements positive, we iterate over all possible pair of values of array elements, i.e. we check all pairs of $(a,b)$ where $0\le a,b \le 2000$. We check if $a$, $b$ and their average exists in array and update answer accordingly.

EXPLANATION:

This editorial will discuss 2 approaches. First is mine (Editorialist), and second one is Misha's (@mgch), i.e. the tester's. My approach is a bit more difficult than tester's because I use some formulas and observations, while tester's solution is simply basic.

We noticed a lot of Div2 users getting stuck in lots of useless and unnecessary things, hence we will answer how to overcome such things in Chef Vijju's Corner at the end of editorial. :)

1. Editorialist's Approach-

The first thing I did was to add $1000$ to each element so that all elements of array are non-negative and between $[0,2000]$. We can show that if we add any constant to all values of array, then although the average of numbers increases, it has no effect on "existence of average (of 2 array elements) inside array." Try to prove it if you can. (Answer is in tab below).


For now, forget about duplicates. My approach counts even duplicates in the process, and removes them later at the end.

Once we have the frequencies of elements inside the array, I iterate over all possible "pairs of values" allowed, i.e., we know that the values of array elements are now in the range $[0,2000]$, so we will iterate over all pairs of $(a,b)$ where $0 \le a,b \le 2000$ and do the following-

1.If $(a,b)$ exist, and if their average (by real division) exists in the array as well, goto 2, else check next pair.
2.Is $a==b$ $?$ If yes, then add $(freq[a]-1)*freq[b]$ to answer, else add $freq[a]*freq[b]$

To remove duplicates, simply divide the answer by $2$ - because each pair is counted twice. The proof for the 2 formulas I used, along with fact that dividing by $2$ removes duplicates is left to reader as an exercise. (Answer/Hints are discussed in Chef Vijju's corner.)

$Time$ $Complexity-O({2000}^{2}+N)=O(N+K)=O(N)$ (where $K$ is a constant).
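A compact sketch of this approach follows. The function name `countAveragePairs` is made up for illustration, and the shift of $1000$ assumes $|A_i| \le 1000$ as in the problem.

```cpp
#include <cassert>
#include <vector>

// Counts unordered index pairs (i, j), i != j, whose average is present in the array.
long long countAveragePairs(const std::vector<int>& a) {
    const int SHIFT = 1000, MAXV = 2000;
    std::vector<long long> freq(MAXV + 1, 0);  // long long avoids overflow in products
    for (int v : a) {
        assert(v >= -SHIFT && v <= SHIFT);
        ++freq[v + SHIFT];
    }
    long long ordered = 0;  // every pair is counted twice here
    for (int x = 0; x <= MAXV; ++x) {
        if (freq[x] == 0) continue;
        for (int y = 0; y <= MAXV; ++y) {
            if (freq[y] == 0 || (x + y) % 2 != 0) continue;
            if (freq[(x + y) / 2] == 0) continue;
            ordered += (x == y) ? freq[x] * (freq[x] - 1) : freq[x] * freq[y];
        }
    }
    return ordered / 2;  // remove the double counting
}
```

For the array $4, 2, 5, 1, 3, 5$ the valid pairs are $(1,3)$, $(2,4)$, $(5,5)$, and $(1,5)$ and $(3,5)$ twice each, giving $7$.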

2. Tester's Approach (Easy)-

The first step is the same, he also made all elements non-negative by adding 1000. He deals with duplicates immediately.

1.For each possible value in $[0,2000]$, check if it exists in array. If it does, goto 2, else check the next value.
2.Count all pairs formed by this value and its duplicates by adding $ cnt[mid] * (cnt[mid] - 1) / 2$ to the answer. $mid$ here is the value being investigated.
3.If $mid$ is the average of 2 numbers, both numbers are equidistant from $mid$. Hence, count all pairs formed by numbers at a distance $x$ on either side of $mid$, where $x$ is in the range $[1,mid]$ and both $mid+x$ and $mid-x$ lie in the valid range $[0,2000]$. For each such $x$, add $cnt[mid - x] * cnt[mid + x]$ to the final answer.

Time complexity is same as my solution in worst case :).
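The three steps above could be written as the following sketch. The name `countPairsByMidpoint` is hypothetical, and $x$ starts from $1$ because the $x=0$ case is already covered by the duplicates term.

```cpp
#include <cassert>
#include <vector>

// Same answer as the pair-of-values approach, but iterating over the average itself.
long long countPairsByMidpoint(const std::vector<int>& a) {
    const int SHIFT = 1000, MAXV = 2000;
    std::vector<long long> cnt(MAXV + 1, 0);
    for (int v : a) {
        assert(v >= -SHIFT && v <= SHIFT);
        ++cnt[v + SHIFT];
    }
    long long ans = 0;
    for (int mid = 0; mid <= MAXV; ++mid) {
        if (cnt[mid] == 0) continue;              // mid must exist in the array
        ans += cnt[mid] * (cnt[mid] - 1) / 2;     // pairs among duplicates of mid
        for (int x = 1; mid - x >= 0 && mid + x <= MAXV; ++x)
            ans += cnt[mid - x] * cnt[mid + x];   // values equidistant from mid
    }
    return ans;
}
```

No final division is needed here since each unordered pair is visited exactly once, at its own midpoint.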

SOLUTION:

Setter
Tester
Editorialist

CHEF VIJJU'S CORNER:

1.Let's first discuss the mistakes. Most of the participants got the formulas right...but they made a tremendous blunder: they suffered from overflow!! If you are declaring your frequency array as int, then for larger test cases where $N$ is as large as ${10}^{5}$, the frequency of elements can be as large. Hence, when we do $freq[a]*freq[b]$, we must make sure to use proper type casting, else the answer will overflow!! Many contestants stored the answer in an int variable, and that code is doomed to be wrong even before submission!!

One thing I will tell, if you are getting AC for smaller sub tasks, and WA for larger ones, do check for overflow!!. There is a good $60\%$ chance that it is overflow issue giving you WA. We were going through the wrong solutions in detail, and our heart sank a little bit each time we saw a solution killed because of overflow :(
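A minimal illustration of the cast, assuming frequencies are stored as `int` (the helper name `pairProduct` is only for this example):

```cpp
#include <cassert>

// Multiplies two int frequencies in 64-bit arithmetic.
// Writing freqA * freqB directly would be evaluated in int and could overflow.
long long pairProduct(int freqA, int freqB) {
    assert(freqA >= 0 && freqB >= 0);
    return static_cast<long long>(freqA) * freqB;
}
```

With both frequencies near ${10}^{5}$ the product is about ${10}^{10}$, which does not fit in a 32-bit int, so the accumulator holding the answer should be `long long` too.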

2.Some of them mistook $|A_i| \le 1000$ as $0 \le A_i\le 1000$. The difference between the two is that, the first condition allows negative integers in the input. Whenever in such confusion, assume value of $A_i$ to be negative - and ask yourself - Does it satisfy this constraint? If yes, its in the input, else its not.

3.Derivation of $freq[a]*freq[b]$ in tab below.


4.Whenever you see that the max value of array elements is comparable to size of the array, frequency array approach can be used. Its usually asked as a part of question rather than an individual question, like here it was Math (principle of counting) +Frequency array. With some modification, you can also count number of each character in a string using this approach.

5. Frequency Array approach to count number of each element is one of the steps involved in Counting Sort , an $O(N)$ sorting technique.

6. What would we do if this question had constraints on $A_i$ up to $|A_i| \le {10}^{5}?$ It would be a difficult question, but still possible to solve. That problem can be solved using $FFT$. After shifting the values to be non-negative, we create a polynomial $P(x)=\sum_{i=0}^{2*{10}^{5}}(freq[i]*{x}^{i})$. On squaring $P(x)$, the coefficient of ${x}^{i+j}$ collects $freq[i]*freq[j]$ for all values $0 \le i,j \le 2*{10}^{5}$, but we need to handle the case $i=j$ separately. Next, if $i+j$ is even and $freq[(i+j)/2]>0$, that contribution has to be added to the final answer. This outline is just meant to introduce beginners to the fact that there is something like $FFT$ existing in this universe (xD) and to offer an interesting insight into the solution. Time Complexity is $O(|A|Log|A|)$

7. Questions to practice frequency array-

MAXCOUNT - Cakewalk.

(More will be added if I find more, or if the community helps :) )

PROBLEM LINK:

Div1
Div2
Practice

Setter-Misha Chorniy
Tester-Misha Chorniy
Editorialist-Abhishek Pandey

DIFFICULTY:

CHALLENGE

PRE-REQUISITES:

Varying. Challenge problem is usually an application of your skills in competitive programming in general.

PROBLEM:

Given a randomly generated array $A[]$, we have to add some integer $D_i$ (need not be distinct) to each array element $A_i$, where $0\le D_i\le K$. Our goal is to maximize $\frac{1}{M}\sum_{i=1}^{M} B_i$ where $B_i=((A_1+D_1)*(A_2+D_2)*...*(A_N+D_N))\%P_i$

QUICK ANALYSIS:

It seemed that contestants faced problems in getting a good solution. We conclude this because some of the trivial solutions got more points than they should have. The majority of the contestants simply printed back the input, or used $input+rand()\%K$ etc.

ANALYSIS:

The first thing I want to say is that, this editorial is incomplete. And it will remain so, until you guys dont contribute! Its impossible to have any hard and fast approach for the challenge problems, and for the editorial to be at its full potential, I want to request you guys to discuss your approach. Benefit in challenge problem is better gained by discussion, seeing flaws of your approach and analyzing other's strengths- and seeing what worked well, and to what degree. Hence, I would request the community to put forward their approach and intuition behind the question :).

As for editorial, I will try to discuss some of the approaches I saw, in both div1 and div2. I hope you people like it :)

1. Div2-

Not even 10 minutes passed from start of contest on the historical date of $6th$ $April,2018$ when Div2 had got the first few accepted solutions. I had a guess in mind, and I,curious as a doomed cat, decided to see what they did and confirm my intuition. And I dont know if it was fate, or destiny, or perhaps something else, but what I saw was an exact picture of what I had in my mind...

cin>> arr[i]; //Take array input.
cout<< arr[i];//Print it back.

This approach got something $\approx 85-88$ points. It was $88.7$ when I checked last on $14th$.
Further solutions were also on similar lines. Eg-

cin>>arr[i]; cout<< arr[i]+rand()%k;

This one got $85.8$ points then. Sad luck for that guy :/

cin>>arr[i]; cout<< arr[i]+k;

This solution performed better, and got around $88-89$ points on average.

Some of the better solutions at div2 which got $\ge90$ involved-

Choose one of the primes. Let's call it $P$. Either largest, smallest, middle one or randomly any.
Make array ideal according to that prime, i.e. add $D_i$ so that $(A_1*A_2..*A_N)\%P$ is maximum.
Pray that this solution gets better points.

By roughly around 20-25 submissions, people experimented with what prime to take. Most of them settled on the median prime.

A good number of approaches used simulation and storing array and its result. Eg-

cin>>arr[i];
Store arr[i]+rand()%k; //Store in a vector etc.
Compute score for the just stored array.
Repeat above 250-400 times.
Print the configuration with maximum score

Depending on luck and number of simulation, the above approach fetched $88-94.7$ points. I saw quite a few with $94.7$ points.

Some of the top approaches also use the concept of simulating till just about to time-out. The contestants chose a distribution (random, or some other) which they simulated for $\approx3.8-3.95$ seconds, where they sought to see which choice of $D_i$ increases the score for a particular $A_i$. When about to time out, they aborted the process and printed the output they got.

2. Div1

The performance of Div1 was kind of similar to Div2 xD. One of the codes which got $91.1$ points at pretest was-

cin>>A[i]; cout<< A[i]+K/2;

Most of the approaches were common- like simulation till best answer, or take $rand()$ values $300-400$ times. Omitting the common approaches, the approaches of top solutions were quite distinct. (However, most of them are hard to decipher due to 300+ lines of code).

Some crucial optimizations were, however, seen. For example, lets say I got some values of $D_1,D_2...D_N$ and calculated the value of $Score=(A_1+D_1)*(A_2+D_2)*...*(A_N+D_N)\%P_i$. The top solutions preferred to change $D_i$ one by one, and re-calculate $Score$ in $O(Log(A_i+D_i))$ by using inverses as - $NewScore=({A_i+D_{old}})^{-1}*Score*(A_i+D_{new})\%P_i$. This method got $95+$ points on pretest.
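The inverse-based update could look like the sketch below. The names `powMod` and `replaceFactor` are hypothetical, the inverse is taken via Fermat's little theorem (so $P_i$ must be prime), and the old factor must not be divisible by $P_i$, otherwise the score is already $0$ and has to be recomputed from scratch.

```cpp
#include <cassert>
#include <cstdint>

// Computes base^exp mod p by binary exponentiation (p < 2^32 keeps products in 64 bits).
uint64_t powMod(uint64_t base, uint64_t exp, uint64_t p) {
    uint64_t result = 1 % p;
    base %= p;
    while (exp > 0) {
        if (exp & 1) result = result * base % p;
        base = base * base % p;
        exp >>= 1;
    }
    return result;
}

// Given score = product mod p, swaps one factor oldFactor for newFactor
// without recomputing the whole product. Assumes p is prime.
uint64_t replaceFactor(uint64_t score, uint64_t oldFactor, uint64_t newFactor, uint64_t p) {
    assert(p > 1 && p < (1ULL << 32));
    assert(oldFactor % p != 0);  // otherwise no inverse exists
    uint64_t inverse = powMod(oldFactor % p, p - 2, p);
    return score % p * inverse % p * (newFactor % p) % p;
}
```

As a check, $3*4*5 \% 7=4$, and replacing the factor $4$ by $6$ gives $3*6*5 \% 7=6$.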

Some of the good top codes deserve a mention here. These codes are what one can call crisp :)

As usual, I want to invite everybody (yes, everybody, not just the top scorers) to discuss their approaches so that we can have an engaging and learning insights into various intuitions :)

CHEF VIJJU'S CORNER:

1. The very first solution submitted by tester to test this problem was also simply printing back the input xD. After that as many as $16$ more submissions were made.

2.Let's discuss the setter's approach here. Take a random subset of array $P[]$ (several times) and multiply the primes in this subset. Let's call their product $MUL$. Now, we know that $MUL\%P_i=0$ where $P_i$ is a prime in the subset. Our aim is to get the value of $(A_1+D_1)*(A_2+D_2)....*(A_N+D_N)$ closer to (preferably exactly equal to) $MUL-x$. This is because, by applying the $\%$ operation for various $P_i$, we will be left with $MUL\%P_i-x\%P_i=(-x)\%P_i$. We can go greedily and try to factorize $MUL-1,MUL-2,...,MUL-x$ (the exact value of $x$ can be experimented upon). The factorization will help us in manipulation of $D_i$ values.

One thing to check is that, $(A_1+K)*(A_2+K)..*(A_N+K)$ must have a value more than $MUL$ else the above scenario may not be possible. We can check this by using $log$ function, i.e. by changing expression $(A_1+K)*(A_2+K)..*(A_N+K)$
to-
$Log(A_1+K)+Log(A_2+K)+...+Log(A_N+K)$.

Similarly, $MUL$ can be expressed as $(Log(P_1) +Log(P_2)+....+Log(P_N))$

You can find good practice problem here

3.There is also a better random generator in C++ known as Mersenne Twister or mt19937. It is available from C++11 and later. Some of the advantages it has over rand() are that-

mt19937 has much longer period than that of rand. This means that it will take its random sequence much longer to repeat itself again.
It has much better statistical behavior.
Several different random number generator engines can be initiated simultaneously with different seed, compared with the single “global” seed srand() provides
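A short usage sketch with the standard `<random>` header. The helper name `randomOffset` is made up; it draws a $D_i$ uniformly from $[0, K]$ using an engine passed in by the caller.

```cpp
#include <cassert>
#include <random>

// Draws a uniformly distributed integer in [0, K] from the given engine.
int randomOffset(std::mt19937& rng, int K) {
    assert(K >= 0);
    std::uniform_int_distribution<int> dist(0, K);
    return dist(rng);
}
```

Each `std::mt19937` object carries its own state, so two engines constructed with the same seed replay the same sequence independently of each other, which is handy when comparing strategies on identical random inputs.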

PROBLEM LINK:

Div1
Practice

Setter-Utkarsh Saxena
Tester-Misha Chorniy
Editorialist-Abhishek Pandey

DIFFICULTY:

HARD

PRE-REQUISITES:

Generating Functions, Fermat's Little Theorem (for implementation and finding inverses), Prefix Array and our favorite topic, i.e, MATHS (Calculus, Solving quadratic equations &etc.), Finding Modular Inverses ($O(N)$ time).

PROBLEM:

Given the recurrence relation such that-

$dp(1)=K$
If $n>1$, then $dp(n)=A*dp(n-1)+B*\sum_{i=1}^{n-1} dp(i)*dp(n-i)$

We need to calculate $\sum_{i=L}^{R}{dp(i)}^{2}$ modulo $({10}^{9}+7)$ for $Q$ queries where $N,K,A,B$ are provided in input.

QUICK EXPLANATION:

This $O({N}^{2})$ recurrence is of no direct use to us $-$ we must reduce it further. On further simplification, we are able to convert this recurrence into the following-

$n*dp(n)=(2n-3)(A+2KB)*dp[n-1]-{A}^{2}(n-3)*dp[n-2]$

Calculating $dp(n)$ became trivial now by a single $for$ loop. We can then, maintain a prefix-sum of the following $dp[]$ array to answer the queries in $O(1)$ time.

EXPLANATION:

This editorial will stress on solving this problem (and hence, problem JADUGAR as well) by the Generating Function technique. The solution of tester and setter are similar, and editorialist's solution is derived from the same idea. The problem has little implementation and more conceptual derivation, so the editorial will attempt to tackle that. For implementations, refer to the code of Editorialist.

Setter/Tester/Editorialist's Solution-

Notice that one of the methods to solve/find recurrence relations is Generating functions. Going by the standard technique, we make a (ordinary) generating function $f(x)$ such that-

$f(x)=S_1x+S_2{x}^{2}+S_3{x}^{3}+\dots$

which is nothing but $f(x)=\sum_{i=1}^{\infty}S_i{x}^{i}.$

Note that the $S_i$, which is the coefficient of ${x}^{i}$, is nothing but $dp(i)$. Allow me to represent $dp(i)$ from henceforth as $S_i$ for convenience.

Now, we know that-

$S_n{x}^{n}=AS_{n-1}{x}^{n} + B\sum_{i=1}^{n-1} S_i*S_{n-i}{x}^{n}$

Note that it is simply equivalent to multiplying $L.H.S$ and $R.H.S$ of our original relation by ${x}^{n}$. Now, we need to get rid of $\sum_{i=1}^{n-1} S_i*S_{n-i}{x}^{n}$. Thankfully, a property of generating function (as proved here ) comes to our rescue. We see that-

$f(x)*f(x)=({\sum_{i=1}^{\infty}S_i{x}^{i}})^{2} =\sum_{n=2}^{\infty}\sum_{i=1}^{n-1} S_i*S_{n-i}{x}^{n}$ (....from the above property.)

Now, notice that $S_n{x}^{n}=AS_{n-1}{x}^{n} + B\sum_{i=1}^{n-1} S_i*S_{n-i}{x}^{n}$

$\implies \sum_{n=1}^{\infty}S_n{x}^{n}= Kx+ Ax\sum_{n=2}^{\infty}S_{n-1}{x}^{n-1}+B({\sum_{n=1}^{\infty}S_n{x}^{n}})^{2}$.

Note that the $Kx$ comes due to $S_1$.

Now we know that, $f(x)=\sum_{i=1}^{\infty}S_i{x}^{i}$

$\therefore f(x)=Kx+ Ax*f(x)+B({f(x)})^{2}$

Let us denote $f(x)$ as $g$, so we have a quadratic equation-

$g=Kx+Axg+B{g}^{2}$

We can find the Generating function $g$ (i.e. $f(x)$) by solving this quadratic equation. We get the result-

$\large g=\frac{1-Ax\pm\sqrt{{A}^{2}{x}^{2}-(2A+4KB)x+1}}{2B}$

You can take either sign of the $\pm\sqrt{{A}^{2}{x}^{2}-(2A+4KB)x+1}$ term, you will get same recurrence relation. Lets go by the $-$ sign for now.

We need to somehow get past this square root in our equation, because if we can do that, we can easily get the recurrence relation by collecting coefficients of ${x}^{n}$. Lets try playing with calculus!

$Let$ ${Q}^{2}={A}^{2}{x}^{2}-(2A+4KB)x+1$.

We need an equation involving $g$ without the square root. Can you give it a try? The hint is, differentiation! They are given under the two tabs, click them to proceed further if stuck!


From above, we are finally able to get $eq(3)$ which is $(A+2Bg')Q=A+2KB-{A}^{2}x$
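One possible route to $eq(3)$, written out as a sketch. The labels $eq(1)$ and $eq(2)$ are assumed here to be the definition of ${Q}^{2}$ and the expression for $Q$ obtained from the $-$ root; the original numbering may differ.

$$eq(1):\ {Q}^{2}={A}^{2}{x}^{2}-(2A+4KB)x+1$$

$$eq(2):\ g=\frac{1-Ax-Q}{2B} \implies Q=1-Ax-2Bg$$

Differentiating both with respect to $x$ gives

$$2QQ'=2{A}^{2}x-(2A+4KB), \qquad Q'=-A-2Bg'$$

and substituting the second into the first,

$$-(A+2Bg')Q={A}^{2}x-(A+2KB) \implies (A+2Bg')Q=A+2KB-{A}^{2}x$$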

Now, multiply $Q$ on both sides-

$(A+2Bg'){Q}^{2}=Q(A+2KB-{A}^{2}x)$

From previous equations, we already know the value of ${Q}^{2}$ and $Q!!$. Refer to $eq(1)$ and $eq(2)$ for those. After substituting the appropriate values, we get something like-

$(A+2Bg')({A}^{2}{x}^{2}-(2A+4KB)x+1)=(1-Ax-2Bg)(A+2KB-{A}^{2}x)$

Now-

$\because g=\sum_{n=1}^{\infty}S_n{x}^{n}$
$\implies g'=\sum_{n=1}^{\infty}nS_{n}{x}^{n-1}$

Remember that our aim was to find $dp(n)$, i.e. $S_n$. We know that $S_n$ is nothing but the coefficient of ${x}^{n}$ in the above relation. On collecting coefficients of ${x}^{n}$, we get-

$(n+1)S_{n+1}=(2n-1)(A+2KB)S_{n}-{A}^{2}(n-2)S_{n-1}$

$\implies nS_{n}=(2n-3)(A+2KB)S_{n-1}-{A}^{2}(n-3)S_{n-2}$

This, was the solution of JADUGAR.

For solving JADUGAR2, all we need to do is, compute the prefix sum. After calculating all values upto $dp(n)$, we do an additional step-

$dp(i)=dp(i-1)+dp(i)*dp(i)$ (of course, modulo $({10}^{9}+7)$ !!)

If we use one based indexing, then the answer for query $Q$ $L$ $R$ is simply $dp[R]-dp[L-1]$.
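Putting the pieces together, a sketch of the linear recurrence plus the prefix sums might look like this. All function names are hypothetical, inputs are assumed non-negative with $n$ below the modulus, and $dp(1)$, $dp(2)$ are seeded directly from the definition with the simplified recurrence applied from $n=3$ onwards (for $n=2$ its right-hand side is not the right value). The naive $O({N}^{2})$ version is included only for cross-checking.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

const int64_t MOD = 1000000007;

// dp[1..n] via n*dp(n) = (2n-3)(A+2KB)*dp(n-1) - A^2 (n-3)*dp(n-2), for n >= 3.
std::vector<int64_t> dpLinear(int n, int64_t K, int64_t A, int64_t B) {
    assert(n >= 1);
    K %= MOD; A %= MOD; B %= MOD;
    std::vector<int64_t> inv(n + 2, 1);  // inv[i] = modular inverse of i, O(n) total
    for (int i = 2; i <= n; ++i)
        inv[i] = (MOD - (MOD / i) * inv[MOD % i] % MOD) % MOD;
    std::vector<int64_t> dp(n + 1, 0);
    dp[1] = K;
    if (n >= 2) dp[2] = (A * K + B * K % MOD * K) % MOD;  // straight from the definition
    const int64_t c = (A + 2 * K % MOD * B) % MOD;        // A + 2KB
    const int64_t aSquared = A * A % MOD;
    for (int i = 3; i <= n; ++i) {
        int64_t plus = (2LL * i - 3) % MOD * c % MOD * dp[i - 1] % MOD;
        int64_t minus = aSquared * ((i - 3) % MOD) % MOD * dp[i - 2] % MOD;
        dp[i] = (plus - minus + MOD) % MOD * inv[i] % MOD;
    }
    return dp;
}

// Reference implementation of the original O(n^2) recurrence.
std::vector<int64_t> dpNaive(int n, int64_t K, int64_t A, int64_t B) {
    K %= MOD; A %= MOD; B %= MOD;
    std::vector<int64_t> dp(n + 1, 0);
    dp[1] = K;
    for (int i = 2; i <= n; ++i) {
        int64_t conv = 0;
        for (int j = 1; j < i; ++j) conv = (conv + dp[j] * dp[i - j]) % MOD;
        dp[i] = (A * dp[i - 1] + B * conv) % MOD;
    }
    return dp;
}

// pre[i] = dp(1)^2 + ... + dp(i)^2 mod MOD, with pre[0] = 0.
std::vector<int64_t> prefixSquares(const std::vector<int64_t>& dp) {
    std::vector<int64_t> pre(dp.size(), 0);
    for (int i = 1; i < static_cast<int>(dp.size()); ++i)
        pre[i] = (pre[i - 1] + dp[i] * dp[i]) % MOD;
    return pre;
}

// Answer for a query L R (1-based, L <= R).
int64_t rangeQuery(const std::vector<int64_t>& pre, int L, int R) {
    return (pre[R] - pre[L - 1] + MOD) % MOD;
}
```

With $K=A=B=1$ the sequence starts $1, 2, 6, 22$, so the query $L=2$, $R=3$ gives $4+36=40$.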

SOLUTION:

Setter
Tester
Editorialist

$Time$ $Complexity=$ $O(N)$

CHEF VIJJU'S CORNER:

1.Remember to give a read about Generating functions first. At first, few parts of proof will seem really difficult to grasp, but as you get acquainted with the basics involved, you will be able to do well here :). Also, dont hesitate to put request for derivation of any part which you could not work out.

2.You can refer to another good problem on Generating Function from ICPC here

3.During the contest, we saw many interesting solutions for JADUGAR which involved stuff like Gaussian Elimination, Inclusion-Exclusion Principle &etc. I would like to invite all those coders to explain their intuition and approach so that we can get a new perspective for this problem :)

4. One interesting thing I would like to point out is that we can see the costliness of the $\%$ operator in this problem. Look at the editorialist's solution and the setter's solution. The setter's solution is nearly $4x$ faster than the editorialist's, due to much less use of $\%$ operators. Whenever constraints go up to as high as ${10}^{7}$, it's better to use expensive operators like $*,/,\%$ $etc.$ as reluctantly as possible.

5. For $A=0$, this problem becomes very similar to finding the $n'th$ Catalan Number $C_n$. You can read about them here. :)

6. Woe behold! For what I have in my possession! The ancient scrolls of knowledge and manipulation of Generating Functions which were used to construct the solution of this question itself!

View Content

I think thats it for this problem xD. If there is any more reference material which you guys want, do tell me. I will try to fit in the requests whenever possible :D.

PROBLEM LINK:

Div1, Div2
Practice

Author:Vaibhav Gupta
Tester:Misha Chorniy
Editorialist:Vaibhav Gupta

DIFFICULTY:

HARD

PREREQUISITES:

Dynamic Programming, Combinatorics

PROBLEM:

Chef is at coordinate $(0,0)$ in a 2-D grid and has to reach point $(p,q)$. From any point $(x,y)$ he can move only in increasing direction: either UP (to the point $(x, y+1)$) or RIGHT (to the point $(x+1, y)$). Here, UP & RIGHT correspond to a bad & good deed respectively. But he can move UP only if he is strictly below the line $x = y + c$. $M$ points are blocked in the grid (he can't visit a blocked point), and their coordinates are given.
You have to count the number of such paths. Two ways are the same if and only if the path taken is exactly the same in both. Print the required answer modulo $10^9 + 7$.

QUICK EXPLANATION:

The problem can be broken down in 2 parts.
1. Make a function that finds the number of ways to reach point $(x2,y2)$ from point $(x1,y1)$ subject to the constraint that we can't cross the line $x = y + c$.
2. The number of ways of reaching any blocked point $(i,j)$ is independent of any blocked cell having a larger x or y coordinate. If we sort the blocked cells by $(x,y)$ in increasing order, the number of ways to reach the cell at the $ith$ index of the array is independent of any cell after it in the array. So we can find the number of ways of reaching any blocked cell using an $O(M^2)$ DP approach.

The overall complexity is $O(M^2 + p +q)$.

GENERAL SUGGESTION:

This editorial will have some hand exercises. It’s highly suggested that you yourself try to figure out the solution to those before proceeding further.

EXPLANATION:

Part 1:Find number of ways to reach point $Q(x2, y2)$ from point $P(x1, y1)$ if $x2 >= x1$ and $y2 >= y1$ without crossing the line $x = y + c$ and ignoring the blocked cells: Consider the following figure.

Here we want to go from P to Q without crossing the line(black line). We will call it L1. Consider all possible paths from P to Q. There can be 2 types of paths:
1. Paths from P to Q without crossing the line L1. We call such a path a good path. NOTE: A path is good as long as it doesn't cross L1, although it can touch it any number of times.
2. Paths that cross the line L1. We call such a path a bad path.

In the figure the blue path is a good path whereas the orange path is a bad path. Now the sum of all good paths and bad paths gives us the total number of paths from P to Q.

Ex1: Find total number of paths from point P(x1, y1) to point Q(x2, y2) where x2 >= x1 and y2 >= y1.

Let $x2 - x1 = x$ and $y2 - y1 = y$. The total steps needed to be taken is $x + y$. Also, it’s fixed that each path will have exactly $x$ RIGHT steps and exactly $y$ UP steps. Different paths can be generated by permuting the UP and RIGHT steps amongst them. The total number of ways to permute $x + y$ items such that $x$ and $y$ of them are identical respectively is given by: $(x+y)!/(x!y!)$ where $n!$ denotes $n$ factorial.
Now, we have found the total number of paths from P to Q. But our aim was to find the total number of good paths from P to Q. We can first find the total number of bad paths and then subtract it from the total number of paths. This will give us the total number of good paths from P to Q.
For a path B to be bad, it has to cross L1 at least once. If it is on L1 at the point $(a,b)$ just before crossing, the crossing is an UP step that takes it to the point $(a,b+1)$.

Any general point on L1 satisfies $x = y + c$, so any point that a bad path reaches on crossing L1 satisfies $x = y + c - 1$. Let this line be L2 (black dotted line). So we can say that any path B that touches L2 at least once is a bad path.
Let the first point where B touches L2 be $X’$. Now reflect the segment of B from P till $X’$ about the line L2, as shown in the figure in green. Consider a new path B’ formed by the reflected segment of B followed by the unreflected rest of B from $X’$ to Q, i.e. the path from $P’$ to $X’$ + the path from $X’$ to $Q$.

Now there are 2 important observations about the path B’

Starting point of B’ is always point P’, i.e. the reflection of P in L2.
B’ is a path from P’ to Q that always touches the line L2 at least once.

Now, we can easily prove by construction that for every bad path B that touches the line L2 for the first time at $X’$, we can construct a corresponding path B’ from $P’$ to $Q$ by concatenating the reflected segment of B from P to X’ and the segment of B from $X’$ to $Q$.
By a similar argument, for every path B’ from P’ to Q we can construct a bad path B by concatenating the reflection of the segment of B’ from $P’$ to $X’$ and the segment of B’ from $X’$ to $Q$, where $X’$ is the first point at which B’ touches L2. So there is a bijection between the bad paths B and the paths B’ from $P’$ to $Q$.
Now, to find the number of bad paths we just have to calculate the total number of paths from $P’$ to $Q$. So all we are left to do is to find the reflection of P in L2.

 Ex2: Find the reflection of P in L2.

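The formula for the reflection of a point in a line can be found in any high school textbook. Using it, the reflection of $P(x1,y1)$ in L2: $x = y + c - 1$ is $P’(y1 + (c - 1)$, $x1 - (c - 1))$. The number of paths from $P’$ to $Q(x2,y2)$ can be found using the formula from Ex1: such a path takes $x2 - y1 - (c-1)$ RIGHT steps and $y2 - x1 + (c-1)$ UP steps, which is again $x + y$ steps in total. Subtracting this count from the total number of paths from P to Q gives the total number of good paths from P to Q, as was required. Let $F(x1, y1, x2, y2)$ denote this number.
So, with $x = x2 - x1$ and $y = y2 - y1$ as in Ex1, the final expression for the number of paths from P(x1,y1) to Q(x2,y2) without crossing L1: x = y + c is given by
$Ways = (x + y)C(x) - (x + y)C(y2 - x1 + (c-1)) ,$ where a term $nCk$ with $k < 0$ or $k > n$ counts as $0$. This assumes that neither P nor Q is on the wrong side of L1; otherwise there are no good paths at all.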
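To make the reflection argument concrete, here is a small sketch that counts good paths exactly (no modulus), so it is only meant for small coordinates. It uses the reflected point $P’ = (y1 + (c-1), x1 - (c-1))$ derived above: a path from $P’$ to $Q$ needs $y2 - x1 + (c-1)$ UP steps. The names `binomSmall` and `goodPaths` are hypothetical helpers for this illustration, and "good" is taken to mean that every visited point satisfies $x \ge y + c$.

```cpp
#include <cstdint>

// Exact binomial coefficient for small arguments; 0 when k is out of range.
std::uint64_t binomSmall(int n, int k) {
    if (n < 0 || k < 0 || k > n) return 0;
    std::uint64_t r = 1;
    for (int i = 1; i <= k; ++i) {
        // After this step r == C(n - k + i, i), so the division is always exact.
        r = r * static_cast<std::uint64_t>(n - k + i) / static_cast<std::uint64_t>(i);
    }
    return r;
}

// Number of UP/RIGHT paths from (x1, y1) to (x2, y2) on which every visited
// point satisfies x >= y + c (the path may touch the line but never cross it).
std::uint64_t goodPaths(int x1, int y1, int x2, int y2, int c) {
    int dx = x2 - x1, dy = y2 - y1;
    if (dx < 0 || dy < 0) return 0;
    if (x1 < y1 + c || x2 < y2 + c) return 0;    // an endpoint is already on the wrong side
    std::uint64_t total = binomSmall(dx + dy, dx);
    // Bad paths are in bijection with all paths from P' to Q,
    // and such a path has exactly y2 - x1 + (c - 1) UP steps.
    std::uint64_t bad = binomSmall(dx + dy, y2 - x1 + c - 1);
    return total - bad;
}
```

For example, `goodPaths(0, 0, 3, 3, 0)` evaluates $20 - 15 = 5$, the third Catalan number.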

Now, we have the solution to the first part. But wait! there is one more small exercise for you.

Ex3: How to calculate C(N,K) for N, K <= 4 * 10^6?

Sol: We can pre-compute and store all the factorials and inverse-factorials for $1 <= i <= 4e6$. Now to compute C(N,K) all we have to do is 2 elementary operations:

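long long findnCk(int N, int K) {
    if (K < 0 || K > N) return 0;        //no way to choose K items out of N
    long long nCk = fact[N];
    nCk = (nCk * inv[K]) % MOD;          //where inv[K] is modular inverse of fact[K]
    nCk = (nCk * inv[N-K]) % MOD;        //and inv[N-K] is modular inverse of fact[N-K]
    return nCk;
}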

But how to compute factorials and inverse-factorials of very large numbers?
Finding the factorial of every N can be done in O(N) by simply using the factorial of the previous index

fact[i] = (i * fact[i-1]) % MOD;      //where fact[0] = 1

Now, we have to efficiently find the inverse-factorials. We can do this too, using the inverse-factorial of the next index as follows.

inv[i] = (inv[i+1]*(i+1))%MOD;   //where inv[4e6] is modular-inverse of fact[4e6]

So the complexity of this pre-computation is O(N) or O(p + q). These are some relevant links: Modular multiplicative inverse, Fast Exponentiation, A small intuition of the approach.
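Putting the two recurrences together, the pre-computation could look like the sketch below. It assumes the modulus $10^9 + 7$ is prime, so that the inverse of the largest factorial can be obtained with Fermat's little theorem via fast exponentiation; the names `powMod` and `Binomials` are invented for this illustration.

```cpp
#include <cstdint>
#include <vector>

const std::int64_t MOD = 1000000007;

// Fast exponentiation: base^exp mod MOD in O(log exp).
std::int64_t powMod(std::int64_t base, std::int64_t exp) {
    std::int64_t result = 1;
    base %= MOD;
    while (exp > 0) {
        if (exp & 1) result = result * base % MOD;
        base = base * base % MOD;
        exp >>= 1;
    }
    return result;
}

struct Binomials {
    std::vector<std::int64_t> fact, invFact;

    // Builds factorials and inverse factorials for 0..n in O(n + log MOD).
    explicit Binomials(int n) : fact(n + 1), invFact(n + 1) {
        fact[0] = 1;
        for (int i = 1; i <= n; ++i) fact[i] = fact[i - 1] * i % MOD;
        invFact[n] = powMod(fact[n], MOD - 2);            // Fermat: a^(MOD-2) is the inverse of a
        for (int i = n - 1; i >= 0; --i) invFact[i] = invFact[i + 1] * (i + 1) % MOD;
    }

    std::int64_t nCk(int n, int k) const {
        if (k < 0 || k > n) return 0;
        return fact[n] * invFact[k] % MOD * invFact[n - k] % MOD;
    }
};
```

Only one modular exponentiation is needed; every other inverse factorial follows from its neighbour in $O(1)$.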
Now, putting $P(x1,y1) = (0,0)$, $Q(x2,y2) = (n,n)$ and $L1: x = y$, i.e. $c = 0$, in the above formula, we get $2nCn - 2nC(n-1)$, which is the expression for the nth Catalan number (a sequence of natural numbers occurring in various counting problems).

Awesome! This just shows how the catalan numbers can be derived. Now, you are good to go with all catalan related problems. :)

 Ex4: What is the number of full binary trees with n internal vertices?

Part 2:Find number of ways to reach point $Q(x2, y2)$ from point $P(x1, y1)$ without going to blocked cells.

The naive way to incorporate the blocked cells is using Inclusion-Exclusion Principle. Let $(x,y)$ be the starting point. The number of ways to get from $(x,y)$ to $(m,n)$ while avoiding the one restricted point at $(a,b)$ is given by the number of ways to get to $(m,n)$ with no restrictions, minus the number of ways to get to $(m,n)$ that go through $(a,b)$.
$F(x,y,m,n) - F(x,y,a,b) * F(a,b,m,n)$
Generalising this, we can get the total number of ways to reach $(m,n)$ without visiting any blocked point.

 Ex5: How? This exercise is left for the user.

But, such a solution will take $2^m$ time where $m = 10^3$. Hence, not feasible. We need to improve.

It is a subtle observation that the number of ways of reaching any blocked point $(i,j)$ is independent of any blocked point having a larger $x$ or $y$ coordinate. So we can sort the blocked cells by the pair $(x,y)$ in increasing order, so that the number of ways to reach the cell at the $ith$ index of the array is independent of any cell after it in the array. So we have broken the given bigger problem into smaller subproblems.
Nice enough? Uhhh! Wait, we also have an optimal substructure in the problem. Let the point $P_i$ at the $ith$ index be $(x_i, y_i)$. Every path that reaches $P_i$ through blocked cells has a unique first blocked cell $P_j$ on it, so the actual number of ways to reach

$P_i = (ways To Reach P_i Ignoring The Blocked Cells) - (Σ_{j<i} (actual ways To Reach P_j Avoiding The Blocked Cells) * (ways To Reach From P_j To P_i Ignoring The Blocked Cells)).$

Hurrah! Now we have a straightforward $M^2$ dynamic programming solution.

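Let's say our set $S = \{all Blocked Cells\} \cup \{cell(p,q)\}$. Sort S by increasing $x$ coordinate and then by increasing $y$. Initialise $dp[i] = F(x,y,x_i,y_i)$, where $(x,y)$ is the starting point of the path. After the loop below, $dp[i]$ is the number of paths to $S[i]$ that avoid every blocked cell, and the entry for $(p,q)$ is the answer. Pseudo code for it is given below: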


for(i = 0; i < S.size(); ++i)
    for(j = i-1; j >= 0; --j)
        if(S[i].x >= S[j].x && S[i].y >= S[j].y)
             //dp[j] is already final here; keep the result in [0, MOD)
             dp[i] = (dp[i] - (dp[j] * F(S[j].x, S[j].y, S[i].x, S[i].y)) % MOD + MOD) % MOD;

The overall complexity is $O(M^2 + p + q)$
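The two parts can be combined as in the following sketch. It is not the author's or tester's code: `PathCounter` and `countPaths` are names invented here, the start is assumed to be $(0,0)$, and a path is taken to be valid when every visited point satisfies $x \ge y + c$. The function $F$ uses the reflected point $P’ = (y1 + (c-1), x1 - (c-1))$ from Part 1, so the bad paths are counted by choosing $y2 - x1 + (c-1)$ UP steps.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

const std::int64_t MOD = 1000000007;

std::int64_t powMod(std::int64_t b, std::int64_t e) {
    std::int64_t r = 1;
    for (b %= MOD; e > 0; e >>= 1, b = b * b % MOD)
        if (e & 1) r = r * b % MOD;
    return r;
}

struct PathCounter {
    int c;                                    // the line is x = y + c
    std::vector<std::int64_t> fact, inv;      // factorials and inverse factorials

    PathCounter(int maxSteps, int c_) : c(c_), fact(maxSteps + 1), inv(maxSteps + 1) {
        fact[0] = 1;
        for (int i = 1; i <= maxSteps; ++i) fact[i] = fact[i - 1] * i % MOD;
        inv[maxSteps] = powMod(fact[maxSteps], MOD - 2);
        for (int i = maxSteps - 1; i >= 0; --i) inv[i] = inv[i + 1] * (i + 1) % MOD;
    }
    std::int64_t nCk(int n, int k) const {
        if (k < 0 || k > n) return 0;
        return fact[n] * inv[k] % MOD * inv[n - k] % MOD;
    }
    // Paths from (x1, y1) to (x2, y2) that keep x >= y + c, ignoring blocked cells.
    std::int64_t F(int x1, int y1, int x2, int y2) const {
        int dx = x2 - x1, dy = y2 - y1;
        if (dx < 0 || dy < 0 || x1 < y1 + c || x2 < y2 + c) return 0;
        return (nCk(dx + dy, dx) - nCk(dx + dy, y2 - x1 + c - 1) + MOD) % MOD;
    }
};

std::int64_t countPaths(int p, int q, int c, std::vector<std::pair<int, int>> cells) {
    PathCounter pc(p + q, c);
    // Cells outside the rectangle [0,p] x [0,q] can never be visited, so drop them.
    cells.erase(std::remove_if(cells.begin(), cells.end(),
                    [&](const std::pair<int, int>& b) {
                        return b.first < 0 || b.second < 0 || b.first > p || b.second > q;
                    }), cells.end());
    cells.emplace_back(p, q);                 // the target is processed like a blocked cell
    std::sort(cells.begin(), cells.end());    // by x, then by y

    std::vector<std::int64_t> dp(cells.size());
    for (std::size_t i = 0; i < cells.size(); ++i) {
        dp[i] = pc.F(0, 0, cells[i].first, cells[i].second);
        for (std::size_t j = 0; j < i; ++j) {  // F is 0 when cell j cannot precede cell i
            std::int64_t via = dp[j] * pc.F(cells[j].first, cells[j].second,
                                            cells[i].first, cells[i].second) % MOD;
            dp[i] = (dp[i] - via + MOD) % MOD;
        }
    }
    // After filtering, (p, q) is the largest pair, so it is the last entry.
    return dp.back();
}
```

If the target itself is listed as blocked, it appears twice after sorting and the last copy correctly ends up with $0$ ways.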

AUTHOR'S AND TESTER'S SOLUTIONS:

Author's solution can be found here.
Tester's solution can be found here.

PROBLEM LINK:

Div1
Div2
Practice

Setter-Ashesh Vidyut
Tester-Misha Chorniy
Editorialist-Abhishek Pandey

DIFFICULTY:

EASY

PRE-REQUISITES:

Math, Implementation, Dealing with floating points & similar etc.

Several interpretations about domain and category of this problem are possible. One may see it as one of Physics numerical of type Kinematics, others may see it as Vectors in Maths, while some may compare it to Geometry by using 2-D graph and due to the fact that this concept of distance being $>{10}^{6}$ is used in many similar geometry problems.

PROBLEM:

We have to find the minimum time taken by Chef to cross $N$ lanes without getting hit by a car (i.e. distance $Dist$ between Car's front or side and Chef must be such that $Dist> {10}^{6}$). All relevant information like Car's initial position, velocity, direction and length are provided to us.

QUICK EXPLANATION:

The logic and math of this problem is very easy to get, and hence the problem boils down to careful and accurate implementation. We divide the solution into cases, like when Chef will have to wait for the car, when Chef can cross lane without waiting for car etc. We must make sure to use correct data-types, and mind the gap of ${10}^{6}.$

EXPLANATION:

There isn't much to explore in this question with respect to time complexity, as a straight-forward $O(N)$ solution is possible. We will discuss the various approaches used by Editorialist and Tester. Setter's approach is a hybrid between the two, and hence will be left as an exercise to work on (his code is commented for easier understanding :) ).

1. Editorialist's Solution-

My basic idea is inspired from divide and conquer. (Divide the problem into cases which are easy to conquer xD).

The car passes Chef when the back of the car is at $y=0$. So I calculated a few things, such as the position of the back of the car at $t=0$ (i.e. $B_0$) and at the current moment (i.e. $B_t$). I then identified the following cases-

a. If the car is moving in the positive direction and its back is ahead of $y=0$ at $t=0$ (i.e. the car has crossed $y=0$ already), and vice-versa if the car is moving in the negative direction, $OR$ if the car has already crossed the lane by the time Chef reaches lane $i$, then make Chef cross the lane immediately.
b. As an extension to the first case, if the car cannot reach Chef by the time he crosses the lane, defined by the front and rear of the car being at a distance of at least ${10}^{6}$ by the time Chef crosses the lane, then make Chef cross the lane immediately.
c. For $ALL$ other cases, the car will hit Chef if he doesn't wait. Hence, add to ans the time taken by the car to cross $y=0$ and by Chef to cross the lane.

Print upto 3 decimal places and we are done :)

2. Tester's approach-

Misha's solution follows a similar method, but different conditions. The cases he identified were-

a. First check the direction.
b. For the respective direction, calculate where the car's rear will be-
i) When Chef arrives at that lane.
ii) When Chef finishes crossing the lane, had he started immediately.
c. If we observe that the car is on a different side of $y=0$ at i) than it is at ii), it means it will hit Chef if Chef starts to cross immediately. Add the time for the car to cross to the answer.
d. Add time for Chef to cross the lane.

$Time Complexity-$ $O(T*N)$ for both the approaches.

SOLUTION:

Setter
Tester
Editorialist

CHEF VIJJU'S CORNER:

1. This was a testing implementation problem. While the frustration a coder feels when he gets WA is understandable, it reaches to a whole new level when he knows that the logic is easy but its difficult to implement. But we cannot discard the fact that it is a necessary skill. Teaching importance of implementation, especially in a field which overlaps with concept of geometry, was one of our contest admin Misha's big aim. And I see many of you fared well :). To all those who didnt, I will say, better fail and get frustrated here, than fail at somewhere else where it might be more important for you to get that AC. Learn from this experience, as solving this question taught me a few things as well :)

2.I can recall a famous saying that-

$''There$ $is$ $never$ $too$ $much$ $Geometry!!''$
$-$ $Somebody$

So true a saying, lets complement it with a few Geometry, or related, problems :D

Path by a Castle- Past ICPC question. Dont follow geeksforgeeks.com code for point inside a polygon part. That code is wrong.
New Year and Curling- A decent Geometry problem.
Car-pal Tunnel- In case anyone did not have enough of cars :p
Broken Clock - A Trigonometry question, but I believe Trigonometry and Geometry go hand in hand.
Point Inside a Polygon - Name is self descriptive :D

PROBLEM LINK:

Div1
Practice

Setter-Bogdan Ciobanu
Tester-Misha Chorniy
Editorialist-Abhishek Pandey

DIFFICULTY:

MEDIUM-HARD

PRE-REQUISITES:

Segment Tree with Lazy Propagation , Taylor Series , Logarithmic Functions, Basic Calculus

PROBLEM:

There are $N$ food stalls. After eating food from food stall $i,$ there is a $P_i$ probability of you getting food poisoning initially. We have to support 2 queries-

$0$ $L$ $R$ - What are chances of you not getting infected with food poisoning if you eat from all stalls in range $[L,R]$.
$1$ $L$ $R$ $T$ - Each of the food stall in range $[L,R]$ has improved hygiene, leading to chances of food poisoning getting reduced by a factor of $T$.

QUICK EXPLANATION:

What we have to actually find is, $(1-P_L)*(1-P_{L+1})...*(1-P_R)$, which itself is easy, but the update operation makes it complicated. We use the property of $Log$ which says ${e}^{Log_e a}=a$ to express $Ans=(1-P_L)*(1-P_{L+1})...*(1-P_R)={e}^{Log_e [(1-P_L)*(1-P_{L+1})...*(1-P_R)]} ={e}^{Log_e(1-P_L)+Log_e(1-P_{L+1})...+Log_e(1-P_R)} $. Now, each of the $Log_e(1-P_i)$ can be easily calculated by using its Taylor series, which says that $Log_e(1-x)=-(x+\frac{{x}^{2}}{2}+\frac{{x}^{3}}{3}.....\infty)$ . With this, we can easily support update operations using Lazy Propagation.

EXPLANATION:

The setter, tester and editorialist have majorly used the same approach. Hence, that will be discussed in this editorial, along with few common mistakes, setter's note of precision of Taylor's series in the bonus section. :)

Setter/Tester/Editorialist's Solution-

1. What to store in Node?

For our approaches, we decided to store the first $100$ terms of the Taylor Series. Because $0\le P_i \le 0.9$, the $100'th$ term of the series, $\Large \frac{{x}^{100}}{100}$, will be of magnitude at most $\approx 2.66*{10}^{-7}$ and will have an effect of $\approx {10}^{-7}$ on the final answer. Hence, taking the first $100$ terms should do. We didn't experiment to find the exact number, but you need at least $88-90$ terms, as any less will definitely not do.

2. How to handle updates?

For convenience, let me denote $P_i$ by $x$ from henceforth.

Note that we are storing $\Large \sum_{i=1}^{100} \frac{{x}^{i}}{i}$ in the node. For each update, all we have to do is to change $\Large \sum_{i=1}^{100} \frac{{x}^{i}}{i}$ to $\Large \sum_{i=1}^{100} \frac{{(t*x)}^{i}}{i}$. This can be easily done by multiplying $\Large \frac{{x}^{i}}{i}*{t}^{i}$. A single $for$ loop suffices for that, and we can update a node after reaching to it in $100$ operations.

We must make sure to use Lazy propagation here, so that the lower part of the tree, which may not be needed right now, is updated only when needed. If we keep on updating the entire tree every time (i.e. if we don't use lazy propagation) then our update will take $100*NlogN$ (worst-case) operations, which will time out!

3. What is the Parent-Child relationship?

The parent child relationship for this tree is actually simple!! Say Taylor series of left child is $\Large \sum_{i=1}^{100} \frac{{x}^{i}}{i}$ and that of right child is $\Large \sum_{i=1}^{100} \frac{{y}^{i}}{i}$. Then Taylor series of parent is given by $\Large \sum_{i=1}^{100} (\frac{{x}^{i}}{i}+\frac{{y}^{i}}{i})$. Can you think a bit on why is this so? (Hint: $Log_e(ab)=Log_ea+Log_eb$). Proof is in Chef Vijju's corner.

4. How to get the answer-

For getting the answer, we query for the nodes in range $[L,R]$ to get the sum of the Taylor series of each of them. Let's call this $Sum$. Since each stored series equals $-Log_e(1-x)$, note that $-Sum$ represents nothing but $Log_e(1-P_L)+Log_e(1-P_{L+1})...+Log_e(1-P_R)=Log_e [(1-P_L)*(1-P_{L+1})...*(1-P_R)]$. Our answer, hence, is ${e}^{-Sum}$ by using the ${e}^{Log_e a}=a$ property of $Log$.

For this, we used the $expl()$ function of $C++$ because of its high accuracy. You can refer to my commented solution for implementation (I tried my best to explain tester's solution there as well, so do give it a shot :D )
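The four steps above (what a node stores, how an update scales it, how children combine, how the answer is read off) can be sketched without the segment tree machinery. This is only an illustration under assumptions: `Series`, `makeLeaf`, `combine`, `applyFactor` and `survival` are names invented here, plain `double` and `std::exp` are used, and the lazy propagation itself is left out.

```cpp
#include <array>
#include <cmath>

const int TERMS = 100;                        // number of Taylor terms kept per node

// Entry i-1 holds the sum of x^i / i over all stalls covered by the node.
using Series = std::array<double, TERMS>;

// Leaf for one stall with poisoning probability p.
Series makeLeaf(double p) {
    Series s{};
    double power = 1.0;
    for (int i = 1; i <= TERMS; ++i) {
        power *= p;                           // power == p^i
        s[i - 1] = power / i;
    }
    return s;
}

// Parent-child relationship: the parent is the term-by-term sum of its children.
Series combine(const Series& a, const Series& b) {
    Series s{};
    for (int i = 0; i < TERMS; ++i) s[i] = a[i] + b[i];
    return s;
}

// Update: every probability in the node is multiplied by t,
// so the i-th term is multiplied by t^i.
void applyFactor(Series& s, double t) {
    double power = 1.0;
    for (int i = 1; i <= TERMS; ++i) {
        power *= t;
        s[i - 1] *= power;
    }
}

// Query: the sum of all terms approximates -log of the product of (1 - P_i).
double survival(const Series& s) {
    double sum = 0.0;
    for (int i = TERMS - 1; i >= 0; --i) sum += s[i];   // add the small terms first
    return std::exp(-sum);
}
```

Because `applyFactor` acts on each term independently, it gives the same result on a combined node as on its leaves, which is what makes a lazy tag (the pending product of factors) possible.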

And thats it, we are done with one of the hardest problems of this long :) . Were you expecting a $3000$ word editorial here? Sorry :p xD

SOLUTION:

Setter
Tester
Editorialist

$Time$ $Complexity$- $O(K*(N+Q)LogN)$ where $K=100$ (Number of terms of Taylor Series)

CHEF VIJJU'S CORNER

1. Dont open the tab below! Please no! Its empty, really!!

View Content

2.Proof of why $\Large \sum_{i=1}^{\infty} \frac{{x}^{i}}{i}$ $=-Log_e(1-x)$

View Content

3. Reverse proof of how we correlated $Ans=(1-P_L)*(1-P_{L+1})...*(1-P_R)={e}^{\sum_{L}^{R}Log_e(1-P_i)}$ and how we arrived at the Taylor Series is clearly outlined in the quick explanation section. Please refer there for the derivation :).

4.Setter's Note of precision of Taylor's series method-


Let $\delta(x)$ be the signed relative error when representing the real number $x$ as a $float$, i.e. $\delta(x) = \frac{float(x) - x}{x}$

$\implies float(x) = x * (1 + \delta(x))$

In the leaves of the SGT, we will store a list (of terms of Taylor Series) of size $P$, where $P$ is our desired precision for the Taylor series. This list will be of form $[P_i, {P_i} ^ {2}, .., {P_i} ^ P]$.

$\delta(x * y) \approx \delta(x) + \delta(y)$, so ${P_i} ^{K}$ will have $K * \delta(P_i)$ relative error.

This is alright, because $P$ is around $100$.

Afterwards, we're going to go up the tree and start building it.

In every node we store the sum of the sons' lists, i.e. if son_left's list is $[A_1, A_2, .., A_P]$ and son_right's list is $[B_1, B_2, .., B_P]$, then the list in node is $[A_1 + B_1, A_2 + B_2, .., A_P + B_P]$. $\delta(x + y) =$ $\Large \frac{(x * \delta(x) + y * \delta(y))}{(x + y)}$

We can then show inductively that every component of these lists is bound by the same relative error as the leaves.

When we multiply by $Q$ on an interval, the same reasoning applies: its $K'th$ term will have a relative error of $K * \delta(Q)$. Afterwards we push this value on $O(logN)$ nodes.

So the worst case precision relative error is $\delta * (Q * logN + N) * P$, which is about ${10}^{-9}$.

Now, that was the Segment Tree and building / updating, we're left with the query: the analysis for the relative error introduced by the summation of the lists in which the $[L, R]$ interval of the query is broken into is similar to the one for building the tree.

For computing the term $ Q= -x - \frac{{x}^{2}}{2} - \frac{{x}^{3}}{3}...$, divisions won't introduce any additional relative error, because $\delta(x / y) \approx \delta(x) - \delta(y)$ and $1, 2, 3, ..$ are exactly representable as floats.

Computing ${e}^{q}$ using exp, going by the documentation, we know that it's rounded well.

In my opinion, the following reasoning applies:

${e}^{x} * (1 + \delta({e}^{x})) = {e}^{(x + x*\delta(x))}$

$\implies$ (with $\epsilon = x*\delta(x)$) ${e}^{(x + \epsilon)} = {e}^{x} * (1 + \epsilon) + O({\epsilon}^{2} * const)$

5.Proof of Parent-Child Relationship

View Content

6.Common errors-

We are already maintaining the precision using double. Do not use long double unnecessarily, as it takes more memory and a lot more time in computation. Solutions with long double everywhere can still take as long as $5-10secs$ to run under given constraints.
Creating a new node is expensive! Save time by already creating a set of nodes, and using reference to those instead of creating new ones. Carefully see the use of $\&$ operator in setter and tester's code. What we do is, we pre-create a set of $LogN$ nodes, and whenever we feel need to create a node, we use one of the already created ones. This saves time and memory because if we dont do this, each recursion on query will create at least one node, and each node stores a series of length $100$.
Improper and incorrect lazy propagation was seen. In some cases, it was not used at all.

7.Some resources for Segment tree with lazy propagation-

A Challenge which my friend recommended me once.
Refer to explanation and Sample Problems
The GSS series of SPOJ is a good question set for beginners. GSS1, GSS2, GSS3, GSS4

PROBLEM LINKS:

Div2
Practice

Setter-Adlet Zeineken
Tester-Misha Chorniy
Editorialist-????

DIFFICULTY:

CAKEWALK

PRE-REQUISITES:

Array, Looping, Sorting (optional).

PROBLEM:

There are $3$ type of workers, where the first type only translates, second type only writes and third type does both, we are to find minimum cost to write $and$ translate a piece of text. We are given the information of type of worker, and how many coins he will charge for his service.

QUICK EXPLANATION:

We clearly see that the answer will be ${min} (C_3,C_1+C_2)$ where ${C}_{i}$ denotes the cost of cheapest worker of type $i$. With that, we just have to take care of cases where -

There are no workers of type $1$ or of type $2$.
There is no worker of type $3$.

EXPLANATION :

This editorial will describe two approaches, one which is easy and intended one, and another which is followed by me (a bit complex- but its intended to expose you guys to data structures).

Easy Approach #1
This is a fairly simple problem. What we must focus on, is finding the minimum cost of workers of all three types.

There can be multiple ways to do so. One way is to iterate over the array thrice, each time finding the minimum cost of a worker of the required type. If no worker of that type exists, we simply put ${C}_{i}=INF$ where INF can be some large number, more than $2*{10}^{5}$. (If you prefer INTMAX, compute $C_1+C_2$ in a 64-bit type, since the sum of two such values overflows a 32-bit integer.)

With this data, we can simply find answer as $min(C_3,C_1+C_2)$, which was stated above.
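A minimal sketch of the easy approach follows. The input format is not restated in this editorial, so the function simply takes two parallel arrays (an assumption): `type[i]` in $\{1,2,3\}$ and `cost[i]` for worker $i$. The name `minimumCost` is made up for this illustration, and a single pass is used instead of three.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// type[i] is 1 (translator), 2 (author) or 3 (both); cost[i] is that worker's price.
std::int64_t minimumCost(const std::vector<int>& type, const std::vector<int>& cost) {
    const std::int64_t INF = 1000000000000000000LL;   // large, yet INF + INF still fits in 64 bits
    std::int64_t best[4] = {INF, INF, INF, INF};      // best[t] = cheapest worker of type t
    for (std::size_t i = 0; i < type.size(); ++i) {
        best[type[i]] = std::min<std::int64_t>(best[type[i]], cost[i]);
    }
    // Either one worker who does both jobs, or one translator plus one author.
    return std::min(best[3], best[1] + best[2]);
}
```

Choosing INF well below the 64-bit limit means the missing-type cases need no special handling.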

My Approach (Medium)-

My approach intends to introduce you people to data types, and this time it is vectors (in C++) and any equivalent data structure in other languages. In context to editorial, you can think like, vector is an array where you can insert and delete elements from end in $O(1)$ time. (although its much more than that!)

What I did was, I created an array of vectors of size $4$ (just to follow 1-based indexing). In my solution, $worker[i]$ holds the costs of all workers of type $i$, so $worker[i][j]$ is the number of coins the $j'th$ worker of type $i$ takes to do his work. I sorted the vectors of all $3$ types, took care of the cases where workers of a specific type are absent, and simply printed the answer (because after sorting, the first element is the minimum).

Don't worry if this seems complex to you now, but do make sure to understand this at some point of time :).
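For readers who want to see the vector-based idea in code, here is a rough sketch. It is not the editorialist's actual solution; the function name `cheapestWithVectors` and the two parallel input arrays are assumptions made for the illustration.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <vector>

long long cheapestWithVectors(const std::vector<int>& type, const std::vector<int>& cost) {
    const long long INF = 1000000000000000000LL;
    std::array<std::vector<long long>, 4> worker;     // worker[t] = costs of all workers of type t
    for (std::size_t i = 0; i < type.size(); ++i) {
        worker[type[i]].push_back(cost[i]);           // amortised O(1) insertion at the end
    }
    std::array<long long, 4> best;
    best.fill(INF);                                   // INF marks "no worker of this type"
    for (int t = 1; t <= 3; ++t) {
        if (worker[t].empty()) continue;
        std::sort(worker[t].begin(), worker[t].end());
        best[t] = worker[t].front();                  // after sorting, the first element is the minimum
    }
    return std::min(best[3], best[1] + best[2]);
}
```

Sorting is more work than a plain minimum, but it shows how the grouped costs can be reused, e.g. if the cheapest few workers of a type were needed.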

SOLUTION:

Setter
Tester - He essentially did the same as in approach 1 we discussed. His array $F[i]$ is equivalent to our $C_i$.
Editorialist

CHEF VIJJU'S CORNER :D

1.Make it a point to learn vectors. A proper command over data structures is needed to master algorithms. Vectors are very commonly used in Graph Algorithms. You can refer to here for more on vectors :)

2.Any other approaches are welcomed :)

PROBLEM LINK:

Div1
Div2
Practice

Setter-Praveen Dhinwa
Tester-Misha Chorniy
Editorialist-Abhishek Pandey

DIFFICULTY:

EASY

PRE-REQUISITES:

Strings, Basic Math, Simulation.

PROBLEM:

Given a string $S$, whose only characters are $'a'$ and $'b'$, find the number of good prefixes in the string $T$ which is made by concatenating $N$ copies of $S$, i.e.-

$T = \underbrace{S + S + \dots + S}_{N\text{ times}}$

A prefix is a good prefix if and only if number of occurrences of $'a'$ is strictly greater than the number of occurrences of $'b'$.

QUICK EXPLANATION:

We make cases for this problem. If frequency of $'a'$ is equal to frequency of $'b'$ in string, then calculating answer is simple- for we can simply use $Ans=G(S)*N$ where $G(S)$ denotes number of Good Prefixes in $S$.

If the frequency of $'a'$ is more than the frequency of $'b'$, we notice that after at most $1000$ appendations, each new appendation will contribute the maximum possible value only, which is $K$ ($K$=Length of string).

Similarly we can see that if the frequency of $'a'$ is less than the frequency of $'b'$, after $1000$ appendations, each new appendation will contribute $0$ to the answer.

So if frequency of character $'a'$ is not equal to that of $'b'$, we make a string $T$ by appending $S$ to it $1000$ times, and then use mathematical formulas to find contribution of every other appendation.

EXPLANATION:

As usual, we will be discussing Editorialist's and tester's approach. Editorialist's approach is based on same lines as that of setter- and hence, will form main part of editorial. We will discuss better optimized solutions (tester's approach) and common mistakes of contestants at the bonus part of editorial :)

Before that, please first get acquainted with the notations I will be using.

$G(S)$= Number of Good Prefix in string $S$
$freq(a,S)$= Frequency of character 'a' in string $S$.
$freq(b,S)$= Frequency of character 'b' in string $S$.
${S}^{p}$= String formed by concatenation $S$ $p$ times.

1. Editorialist's/Setter's Solution-

If you want, you can scroll down to refer to my solution side by side while reading the editorial.

The first thing to note are the constraints. $1\le N \le {10}^{9}$ while $1\le |S| \le {10}^{3}$.

We start by calculating number of Good Prefixes in $S$ , i.e. $G(S)$. Along with that, we also find $freq(a,S)$ and $freq(b,S)$. Now we make cases-

a. When $freq(a,S) == freq(b,S)$-

In this case, previous appendations of $S$ have no effect on incoming/new appendations of $S$. Contrast this with the other cases: when $freq(a,S) \neq freq(b,S)$, the value of $(freq(a,T)-freq(b,T))$ changes after each appendation of $S$ to $T$, which affects the contribution of any future appendation of $S$ towards $G(T)$. In the most basic words, since $freq(a,S) \neq freq(b,S)$, the number of $'b'$ in $T$ will not increase at the same rate as the number of $'a'$ at each appendation, which will affect the contribution of the next appendation to the final answer.

This is not the case when $freq(a,S) == freq(b,S)$. Hence, each appendation of $S$ will contribute $G(S)$ towards the final answer. There are $N$ appendations of $S$ and we already found $G(S)$ by simple looping earlier. Hence, for this case, $Answer=G(S)*N$

b. When $freq(a,S) < freq(b,S)$-

In this case, because $ freq(a,S) < freq(b,S)$, each new appendation of $S$ to $T$ will decrease $(freq(a,T)-freq(b,T))$. In other words, the number of $'b'$ in $T$ will increase more quickly than the number of $'a'$, which will reduce the contribution of every future appendation of $S$ towards the final answer.

We see that, after every appendation, the contribution of next appendation must reduce by $at$ $least$ $1$ , ($\because freq(a,S) < freq(b,S)$). Remember that length of $S$ is only upto $1000$. This simply means, that after at most $1000$ appendations, the contribution of incoming appendation reduces to $0$. (In fact, $1000$ is a loose upper bound, we can go closer to the real value, but for sake of explaining lets take it $1000$ right now :) )

So, we form the string $T$ by appending string $S$ $min(N,1000)$ times as-

$T = \underbrace{S + S + \dots + S}_{min(N,1000)\text{ times}}$

and our answer is equal to $G(T)$ as any future appendation has $0$ contribution towards final answer.

$Answer=G(T)$

c. When $freq(a,S) > freq(b,S)$-

Clearly, each new appendation of $S$ to $T$ will increase $(freq(a,T)-freq(b,T))$. Thus, the number of $'a'$ in $T$ will increase more quickly than the number of $'b'$, which will increase the contribution of every future appendation of $S$ towards the final answer.

The maximum possible contribution of an appendation to our answer can be $|S|$, i.e. the length of string $S$. Again, as seen above, after $at$ $most$ $1000$ appendations, the next appendation will reach its maximum contribution. Hence, we do same as above, we form $T$ by appending string $S$ $min(N,1000)$ times as-

$T = \underbrace{S + S + \dots + S}_{min(N,1000)\text{ times}}$

Now each future appendation will have maximum contribution, equal to $|S|$. Hence, our answer will be-

$Answer=G(T)+|S|*max(0,N-1000)$
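The three cases can be sketched as follows. This is not the setter's or editorialist's code: the name `countGoodPrefixes` is invented here, the string $T$ is never built (only a running balance is kept), and the sketch simulates $min(N,|S|)$ copies instead of $min(N,1000)$, on the assumption that $|S|$ copies are always enough for the contribution to settle at $0$ or $|S|$.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>

std::int64_t countGoodPrefixes(const std::string& s, std::int64_t n) {
    const std::int64_t len = static_cast<std::int64_t>(s.size());
    std::int64_t diff = 0;                              // freq(a,S) - freq(b,S)
    for (char ch : s) diff += (ch == 'a') ? 1 : -1;

    // Copies that are walked through character by character.
    // With diff == 0 every copy behaves like the first one.
    const std::int64_t simulated = (diff == 0) ? 1 : std::min(n, len);

    std::int64_t balance = 0;                           // (#a - #b) of the current prefix of T
    std::int64_t good = 0;
    for (std::int64_t copy = 0; copy < simulated; ++copy) {
        for (char ch : s) {
            balance += (ch == 'a') ? 1 : -1;
            if (balance > 0) ++good;
        }
    }

    if (diff == 0) return good * n;                     // case a: Answer = G(S) * N
    if (diff > 0) return good + len * (n - simulated);  // case c: later copies contribute |S| each
    return good;                                        // case b: later copies contribute 0
}
```

The double loop costs $O(min(N,|S|)*|S|)$ operations, which is at most about ${10}^{6}$ for the given constraints.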

Now, what is the time complexity? How many of you would agree if I said-

$Time$ $Complexity=$ $O(min(N,1000)*|S|) \equiv O(1000*|S|) \equiv O(|S|)$

Is it correct?
.
.
.
(PS: The above time complexity is a lie :) )

SOLUTION:

Setter
Tester - $O(|S|Log(|S|))$
Tester - $O(|S|)$
Editorialist - $O(???)$

CHEF VIJJU'S CORNER:

1.Most of the contestants did a good job in framing the simulation part correctly, checking for overflows, and coming at right expressions. But they $still$ got a $TLE$. And let me reveal the culprit here! It was-

$T=S+T$

Yes, that's right! This expression should have been $T+=S$. The $+$ operator takes time comparable to $O(|S|+|T|)$ while the $+=$ operator takes time comparable to $O(|S|)$. Treat it like "Creating a new array with characters of $T$ and $S$ $v/s$ Adding $|S|$ characters to back of char-array $T$ ." For reference, have a look at this solution. Replace the $+$ operator with $+=$ and see the time difference!

2. Tester's $O(|S|log(|S|))$ solution is discussed here. When $freq(a,S)> freq(b,S)$, what he does is keep track of $freq(a,S)- freq(b,S)$ at various points of the string, and then sort these values. Now, we keep a variable $curr$ which stores the count of excess $'a'$s we get after each appendation. He uses this to check how many more prefixes become good, using the condition $have[ptr - 1] + cur > 0$. Depending on that, he adds $ans += (int)have.size() - ptr;$ to the answer. In other words, he finds the least value of $freq(a,S)- freq(b,S)$ which $curr$, the number of excess $'a'$s, can afford, and adds to the answer accordingly. The same approach is used for the $freq(a,S)< freq(b,S)$ case, which is left for the readers to work out :).

3. Testers $O(|S|)$ solution is a bit similar. Firstly, Misha eliminated sorting, and replaced it with a cumulative summation'd prefix array. Meaning, he stores the frequency of each value of $freq(a,S)- freq(b,S)$ and does a cumulative sum of that array. Again, $curr$ holds $freq(a,S)- freq(b,S)$. Now, his expression is $ans += cnt[2 * SHIFT - 1] - cnt[SHIFT - cur];$, meaning "Max possible contribution - Contributions which cannot be afforded due to low curr". The solution is same in essence to the first one, but he cleverly got rid of sorting, leading to a much simpler $O(|S|)$ approach.

4. Now that you have seen two optimized approaches, I have a question for you. Have a detailed look at the time complexity I mentioned. No doubt, my solution does, plain and simple, have a complexity of at least $O(min(N,1000)*|S|)$, but don't you find something fishy? There is a very common, and intentional, error involved here :). Note that there is no sense in the $"1000"$ I kept in $O(min(N,1000)*|S|)$. Ask yourself, what is this $1000?$ It is nothing but a stand-in for $|S|$!! Hence, the real complexity would be $O(min(N,|S|)*|S|) \equiv O({|S|}^{2})$. Don't be like this editorialist when computing time complexity xD.

5. The string in this question was made up of only $2$ characters. Replace $'a'$ and $'b'$ with $1$ and $0$ and you have a binary string. Some questions on similar topics-

PROBLEM LINK:

Div1
Div2
Practice

Setter-Stepan Filippov
Tester-Misha Chorniy
Editorialist-Abhishek Pandey

DIFFICULTY:

MEDIUM

PRE-REQUISITES:

Square Root Decomposition, or Segment Trees, or Data Structures such as Deque

PROBLEM:

Given an array $A[]$ representing height of plants, and another array $B[]$ representing the required height of plants, find minimum time taken to convert array $A[]$ to array $B[]$ using the specified operation. The operation is, choose valid indexes $L$ and $R$ and a target height $H_i$ and cut all plants in the range to height $H_i$. Their initial height must be $\ge H_i$ for this to be valid.

QUICK EXPLANATION:

We can clearly see that the only way of saving time is to make a valid operation in which the maximum number of plants are cut to the same height $B_i$. The problem hence boils down to making the optimal query by finding such $L$ and $R$ and checking if it's valid or not. The case where no answer is possible is trivial to check.

EXPLANATION:

So, as some of you might have perceived, we will be discussing three approaches here. First we will discuss my approach, which is $O(N)$, then we will see Misha's (tester's) $O(N\sqrt{N})$ approach, and finally discuss how it can be optimized into the setter's $O(NlogN)$ approach.

1. Editorialist's Solution-

The very first thing we check is whether an answer is possible or not. Clearly, no answer is possible if for some plant $i$, its current height $A_i$ is less than the required height $B_i$. If it's possible, we proceed further.

First I will try to give the intuition. See what the problem demands. How can we save time? We can save time for every valid query which sets multiple plants to same required height $B_i$. Hence the problem boils down to proper identification of $L$ and $R$ for the query, and checking if such a query can be made or not.

The first problem which we face in brute force is that, to verify the query is possible, we might have to check all plants and their required heights in range of $L$ to $R$. Not to mention getting exact $L$ and $R$ for query might seem problematic to some. Perhaps we can make some observations first to make our life simpler?

An optimal query will have to set at least one plant to its required height. Hence, its height must be one of the values $B_i$ such that $L\le i\le R$.
We further refine our first observation, claiming that the query height must be $max(B_i)$ for $L\le i\le R$. Because if it's not so, one of the plants will be cut below its required height!

Also, imagine if we "have got the required $L$ and $R$" where we got multiple plants which are to be set to same height $B_H$ and are checking if the next element can be included in the range. When will we discard this element at index $i$ from being inside the valid query? We will do so if-

If the required height $B_i$ is more than $B_H$.
If the plant height $A_i$ is less than $B_H$.

With these two principles in mind, let's move towards the solution. I will first describe my algorithm, and then we will discuss what it does and why it is correct.

1. Make a deque/list/appropriate data structure which supports insertion and deletion at both ends.
2. For all plants in the range $1$ to $N$, do the following-
3. If the deque is not empty, and the current plant (at index $i$) has a required height $B_i$ which is more than the height at the back of the deque, keep popping from the back of the deque till the height at the back ($B_j$) satisfies $B_j \ge B_i$ (or the deque becomes empty).
4. Now check if the height of the plant at $i$ (i.e. $A_i$) is less than the required height $B_L$ at the front of the deque. If yes, keep popping from the front till this condition becomes false.
5. After performing steps 3 and 4, we can be sure that a valid query can be made for all plants currently in the deque, because steps 3 and 4 ensure that the $B_i$ are stored in descending order, and that the height of the final plant $(A_j)$ is sufficient to execute a query from the start to the end of the deque.
6. Now check if the required height of the plant ($B_j$) at the back of the deque is equal to the required height of the plant in consideration ($B_i$). If yes, then we can make a query from the last element of the deque to the current element, saving an operation. If we find that $B_i \neq B_j$, then we cannot save an operation and must increment the time needed by $1$, adding $B_i$ to the back of the deque.
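The steps above can be sketched in C++ roughly as follows. This is an illustration written for this note, not the editorialist's actual submission: the name `minCuts` is made up, arrays are 0-indexed, and two details are my own assumptions (a plant with $A_i = B_i$ is skipped after the pops because it needs no cut, and $B_i$ is pushed whenever a new operation is counted).

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Returns the minimum number of cuts turning a into b, or -1 if impossible.
long long minCuts(const std::vector<long long>& a,
                  const std::vector<long long>& b) {
    assert(a.size() == b.size());
    std::deque<long long> dq;  // open query heights, descending front to back
    long long ops = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (a[i] < b[i]) return -1;  // a plant cannot grow
        // Step 3: heights below B_i cannot be extended across plant i.
        while (!dq.empty() && dq.back() < b[i]) dq.pop_back();
        // Step 4: heights above A_i cannot be applied to plant i at all.
        while (!dq.empty() && dq.front() > a[i]) dq.pop_front();
        // Assumption: a plant already at its target needs no cut.
        if (a[i] == b[i]) continue;
        // Step 6: reuse an open query of height B_i, or start a new one.
        if (dq.empty() || dq.back() != b[i]) {
            ++ops;
            dq.push_back(b[i]);
        }
    }
    return ops;
}
```

For example, `minCuts({3, 1, 3}, {2, 1, 2})` needs two cuts, because the middle plant of height $1$ blocks a single query at height $2$. Every height is pushed and popped at most once, which is where the $O(N)$ bound comes from.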

Why is this correct?

Notice that we maintain the condition that an element is in the deque only if it's possible to make a query from the plant at the start of the deque ($B_L$) to the plant at the end of the deque/plant being considered right now ($B_R$ or $B_i$). This is possible because the $B_i$ get stored in descending order in the deque. Then, before adding a plant, we also check that it's possible to execute a query from the start of the deque to it, by making sure that the current height of the plant being added ($A_i$) is enough to support the query for the plant at index $L$ (i.e. $A_i \ge B_L$). $L$ and $R$ are, obviously, the starting and ending indices of the query range.

Why descending order?

We will discuss this in the Setter's approach as well. I will mention my intuition here. Let's take a simple example: we have this configuration of elements in array $B[]$ such that $B[]=\{B_j,B_i,B_j\}$ with $B_j < B_i > B_j$. Can we make a query from the $B_j$ at the left of $B_i$ to the $B_j$ at the right of $B_i$ to save time? No, we cannot, as it would cut plant $i$ to a height less than $B_i$. Hence, once we encounter a $B_i$, we can safely eliminate all previous $B_j$ which are smaller than $B_i$, as we cannot make a query from them across $B_i$. Also, any queries which could have been done before $B_i$ had already been considered when we added the element $B_j$ and every other element thereafter.

Hence, the algorithm is correct.

$Time$ $Complexity-$ $O(N)$

Tester's Approach-

The tester follows Square root decomposition.

If you got the intuition (even if only a little) of my solution, then this one is even easier. What Misha did after taking input is, he made two bucket arrays. The first, $sqa[]$, stores the minimum height $A_i$ of the plants in the range of each bucket, and the second, $sqb[]$, stores the maximum of the required heights $B_i$ in the range of each bucket.

Then, he checks if the configuration is valid or not. While doing so, he also makes a vector of pairs, which stores $< B[i], i> $. He sorts this vector in descending order. Now, he does the following-

1. If the required height and current height of the plant being considered are equal, do nothing. Else, go to 2.
2. If this is the first element being considered, or if the required height $B_i$ of this element is not equal to the required height $B_z$ of the previous plant we were considering, add $1$ to the answer, set $B_z=B_i$, and set the index of the plant being considered (index $z$) to $i$. Go back to 1. and consider the next plant.
3. If the required height of this plant is equal to the required height $B_z$ of the previous plant being considered, get the index $y$ of this plant in the original array (he had made pairs of $< B[i], i> $ for this purpose). Now check if we can make a valid query from the previous plant's index $z$ to this index $y$. This check happens in $O(\sqrt{N})$ on the basis of the observations about invalid queries I mentioned earlier (i.e. whether there is a $B_k$ in between which is more than $B_z$, preventing valid queries across it, or whether the current height $A_k$ of some plant is less than the required height $B_z$). If the query is possible, do nothing; else add $1$ to the answer and set the current plant of consideration to the plant at index $y$ (i.e. set $B_z=B_y$ and $z=y$).

$Time$ $Complexity-$ $O(N\sqrt{N})$
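The $O(\sqrt{N})$ validity check can be sketched as below. Only the bucket names $sqa[]$ and $sqb[]$ come from the description above; the struct `SqrtBuckets`, the method `canCut`, and the 0-based inclusive indexing are assumptions made for this illustration, not Misha's code.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <limits>
#include <vector>

struct SqrtBuckets {
    int n;                            // number of plants
    int len;                          // bucket length, about sqrt(n)
    std::vector<long long> a, b;      // current and required heights
    std::vector<long long> sqa, sqb;  // per-bucket min of a, max of b

    SqrtBuckets(const std::vector<long long>& a_,
                const std::vector<long long>& b_)
        : n(static_cast<int>(a_.size())), len(1), a(a_), b(b_) {
        assert(a.size() == b.size());
        len = std::max(1, static_cast<int>(std::sqrt(static_cast<double>(n))));
        const int buckets = (n + len - 1) / len;
        sqa.assign(buckets, std::numeric_limits<long long>::max());
        sqb.assign(buckets, std::numeric_limits<long long>::min());
        for (int i = 0; i < n; ++i) {
            sqa[i / len] = std::min(sqa[i / len], a[i]);
            sqb[i / len] = std::max(sqb[i / len], b[i]);
        }
    }

    // True if cutting plants l..r (0-based, inclusive) to height h is valid:
    // no plant in the range is shorter than h or needs to stay taller than h.
    bool canCut(int l, int r, long long h) const {
        int i = l;
        while (i <= r) {
            if (i % len == 0 && i + len - 1 <= r) {
                // Whole bucket inside the range: use the stored summary.
                if (sqa[i / len] < h || sqb[i / len] > h) return false;
                i += len;
            } else {
                // Partial bucket: check the single element.
                if (a[i] < h || b[i] > h) return false;
                ++i;
            }
        }
        return true;
    }
};
```

Each call touches at most about $\sqrt{N}$ whole buckets plus about $2\sqrt{N}$ single elements, which gives the $O(\sqrt{N})$ per check used above.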

Setter's Solution-

Basically, the difference between his and the tester's solution is in the step where we check the possibility of queries. He optimizes the check of whether a query is valid.

What he observed was this: consider two queries which set the height of plants (in some range) to $H_i$ and $H_j$. Now, suppose query $i$ happens after query $j$, and $H_i > H_j$. He observed that we can swap the order of these two queries, i.e. execute query $i$ before query $j$, without any effect on the final answer. This is because any plant affected by both query $i$ and query $j$ is ultimately reduced to height $H_j$, as $H_i > H_j$. He uses this to argue that there always exists a solution where queries are executed in non-increasing order of heights $H_q$. That is, he will execute the query with the greatest height $H_q$ first, etc. This also justifies the tester's step of sorting the vector on the basis of required heights $B_i$, as each optimal query will make the height ($A_i$) of at least one plant equal to its corresponding required height $B_i$.

With this clear, the only difference comes in checking if the query is valid or not. If we look closely at what the tester did, and what my observations say about invalid queries, the check reduces to finding the maximum required height $B_i$ and the minimum current height $A_i$ in the range $[L,R]$ of the query. This can be easily done via an appropriate data structure like a Segment Tree or similar.

$Time$ $Complexity-$ $O(NLogN)$
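A compact sketch of this variant is given below, under stated assumptions: it is not the setter's code, the names `MaxTree` and `minCutsSorted` are invented here, and the range minimum of $A$ is obtained as the negated maximum of $-A$ so that one tree type is enough.

```cpp
#include <algorithm>
#include <cstddef>
#include <limits>
#include <utility>
#include <vector>

// Iterative segment tree answering range-maximum queries on a fixed array.
struct MaxTree {
    int n;
    std::vector<long long> t;
    explicit MaxTree(const std::vector<long long>& v)
        : n(static_cast<int>(v.size())), t(2 * v.size()) {
        for (int i = 0; i < n; ++i) t[n + i] = v[i];
        for (int i = n - 1; i >= 1; --i) t[i] = std::max(t[2 * i], t[2 * i + 1]);
    }
    // Maximum over [l, r], 0-based inclusive, l <= r.
    long long query(int l, int r) const {
        long long res = std::numeric_limits<long long>::min();
        for (l += n, r += n + 1; l < r; l >>= 1, r >>= 1) {
            if (l & 1) res = std::max(res, t[l++]);
            if (r & 1) res = std::max(res, t[--r]);
        }
        return res;
    }
};

// Minimum number of cuts turning a into b, or -1 if impossible.
long long minCutsSorted(const std::vector<long long>& a,
                        const std::vector<long long>& b) {
    const int n = static_cast<int>(a.size());
    std::vector<long long> negA(n);
    std::vector<std::pair<long long, int>> order;  // (B[i], i) needing a cut
    for (int i = 0; i < n; ++i) {
        if (a[i] < b[i]) return -1;
        negA[i] = -a[i];
        if (a[i] != b[i]) order.push_back({b[i], i});
    }
    MaxTree maxB(b), maxNegA(negA);
    // Heights in descending order, indices ascending within one height.
    std::sort(order.begin(), order.end(),
              [](const std::pair<long long, int>& x,
                 const std::pair<long long, int>& y) {
                  if (x.first != y.first) return x.first > y.first;
                  return x.second < y.second;
              });
    long long ops = 0;
    for (std::size_t k = 0; k < order.size(); ++k) {
        bool join = false;
        if (k > 0 && order[k - 1].first == order[k].first) {
            const int l = order[k - 1].second, r = order[k].second;
            const long long h = order[k].first;
            // Valid if nothing in between must stay taller than h
            // and nothing in between is already shorter than h.
            join = maxB.query(l, r) <= h && -maxNegA.query(l, r) >= h;
        }
        if (!join) ++ops;
    }
    return ops;
}
```

Sorting costs $O(NLogN)$ and every pair of neighbouring equal heights costs two $O(LogN)$ queries, matching the stated bound.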

SOLUTION:

Setter
Tester
Editorialist

CHEF VIJJU'S CORNER:

1. This is a very fitting data structure problem, and is really beautiful in this way. There are very few problems where you can solve the question elegantly using the apt data structure. The correct code of the algorithm with the deque was hardly 6-7 lines for me. This question is really appreciable in this regard, and we must appreciate the setter for such an interesting question.
2. As you can see, the tester and setter have $O(N\sqrt{N})$ and $O(NLogN)$ solutions respectively. The best part of being an editorialist is you can claim the best solution for your share :D :p
3. I would like to leave you with some of the apt problems on data structures which I found elegant-

Alternating Current - A very nice data structure problem.
Soldier and Cards - One word, Deque.
Square Root Decomposition - List of famous problems of this topic :)

I submitted the following code nearly three minutes before the April Long Challenge ended. But my score has not been updated for the same. Here is a link to the submission.

Also, I checked some other submissions. I think it is a common issue. Please check.

PROBLEM LINK:

Div2
Practice

Setter-Varad Kulkarni
Tester-Misha Chorniy
Editorialist-Abhishek Pandey

DIFFICULTY:

SIMPLE

PRE-REQUISITES:

Fast Exponentiation

PROBLEM:

Given $N$, the number of digits in a number $D_1D_2D_3...D_N$, and a weight $W$ defined by $W=\sum_{i=2}^N (D_i - D_{i-1})\,$, we have to find the number of $N$-digit numbers which have weight $W$, modulo ${10}^{9}+7$.

QUICK EXPLANATION:

We can reduce the expression to $W=D_N-D_1$, where $D_N$ is the rightmost digit and $D_1$ is the leftmost digit. We see that valid numbers exist only for $-9\le W \le 8$; for all other values of $W$ the answer is $0$. If $-9\le W \le 8$, we brute force the pairs of digits $(D_1,D_N)$ such that $D_N-D_1=W$. Let the number of such pairs be denoted by $cnt$. Clearly, the answer is $cnt*{10}^{N-2}\% ({10}^{9}+7)$.

EXPLANATION:

" Oh...This question...This question has that $"\sum"$ symbol...It must be..It must be very difficult."

If you left the question because of the above belief, then don't read this editorial. You're gonna regret that decision of yours :(

For this question, since the Editorialist's, Tester's, and Setter's solutions all use a similar approach, that approach will be described in a bit more detail. After discussing the intuition and approach, we will see the tester's idea to further reduce time complexity in the bonus section :).

Editorialist's/Tester's/Setter's Solution-

The very first thing we need to look at is the constraints. We have the number of test cases, $T$, as $1\le T \le {10}^{5}$ and $2 \le N \le {10}^{18}$, which clearly hints that we need to answer each test case in $O(logN)$ or $O(1)$ complexity.

With that in mind, we should first try to simplify the summation. Carefully look at the summation $W=\sum_{i=2}^N (D_i - D_{i-1})\,$ and try to expand it. On expanding it, we get-

$W=\sum_{i=2}^N (D_i - D_{i-1})\,$

$=(D_2-D_1)+(D_3-D_2)+(D_4-D_3)+\dots+(D_N-D_{N-1})$

We see that we can cancel some terms above. Upon cancelling, we are left with-

$W=D_N-D_1$ where $0 \le D_i\le9$.
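
As a quick sanity check of the cancellation, take the number $2573$ (an arbitrary example chosen here, not from the problem statement), so $N=4$, $D_1=2$ and $D_4=3$:

$$W=(5-2)+(7-5)+(3-7)=3+2-4=1=3-2=D_4-D_1$$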

This means $W$ is nothing but the difference of the last and first digits of the number. Since the leading digit satisfies $D_1 \ge 1$, this restricts valid values of $W$ to $-9\le W \le 8$. No answer exists for any value outside this range.

The first conclusion, hence, is that answer for $W$ outside the range $[-9,8]$ is $0$ as no such $N$ digit number exists.

Now, how to find the answer for $-9\le W \le 8$ $?$

We notice that for an answer to exist, we need to adjust only $D_N$ and $D_1$, while any other digit $D_i$ in between can take any value from $[0,9]$. Let's focus on valid pairs of $(D_1,D_N)$ right now.

What we can do is, we can brute-force all possible pairs of $(D_1,D_N)$, and count the number of pairs satisfying $D_N-D_1=W$. Let this be denoted by $cnt$. Or, you can simply find the number of valid pairs by this formula-

if (w >= 0)  // Proof: by simple counting of pairs; D_1 ranges over 1..9-w
    cnt = 9 - abs(w);
else         // here D_1 ranges over |w|..9
    cnt = 10 - abs(w);

Please note that $D_1$ (leftmost digit) cannot be $0$ as leading $0's$ are not allowed.

Now we have the number of valid pairs of $(D_1,D_N)$ in $cnt$. What about the rest of the digits? We see that the answer is independent of all other digits. Thus, all the remaining $N-2$ digits can take any value between $[0,9]$.

So, we have $cnt$ choices for $D_1,D_N$ and ${10}^{N-2}$ choices for rest of the $N-2$ digits. Thus, the answer will be $cnt*{10}^{N-2}\% ({10}^{9}+7)$. We will calculate ${10}^{N-2}\% ({10}^{9}+7)$ using fast exponentiation. (Link to algo given under pre-requisites).

$Time$ $Complexity=$ $O(T*LogN)$
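
Putting the pieces together, a minimal sketch of the whole computation could look like this. The names `fastExpo` and `countNumbers` are illustrative, and the pair count uses the brute-force version rather than the closed formula so that it also returns $0$ for out-of-range $W$.

```cpp
#include <cassert>
#include <cstdint>

const std::uint64_t MOD = 1000000007ULL;

// Computes base^exp mod MOD by repeated squaring in O(log exp).
std::uint64_t fastExpo(std::uint64_t base, std::uint64_t exp) {
    std::uint64_t result = 1;
    base %= MOD;
    while (exp > 0) {
        if (exp & 1) result = result * base % MOD;
        base = base * base % MOD;
        exp >>= 1;
    }
    return result;
}

// Number of n-digit numbers with weight w, modulo MOD.
std::uint64_t countNumbers(std::uint64_t n, int w) {
    assert(n >= 2);
    // Brute force the pairs (D_1, D_N); D_1 starts at 1 (no leading zero).
    std::uint64_t cnt = 0;
    for (int d1 = 1; d1 <= 9; ++d1) {
        for (int dn = 0; dn <= 9; ++dn) {
            if (dn - d1 == w) ++cnt;
        }
    }
    // The n - 2 middle digits are free: 10 choices each.
    return cnt * fastExpo(10, n - 2) % MOD;
}
```

For instance, `countNumbers(2, 3)` gives $6$, coming from the pairs $(1,4),(2,5),\dots,(6,9)$, and any $W$ outside $[-9,8]$ gives $0$ because no pair matches.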

SOLUTION:

Setter
Tester - $O(Log(MOD))$ solution by Misha. :)
Tester - $O(1)$ solution by Misha :D
Editorialist's

CHEF VIJJU'S CORNER:

1. During the contest, this question was flooded with comments, with everyone saying the same thing: "Are the constraints wrong?" or "How can W be >10? Why do the constraints have W up to 300?", etc. While almost EVERYONE derived the formula $W=D_N-D_1$, they all got stuck at the trivial question of why $W$ goes up to $300$ in the problem statement. What's worse, they started feeling that since $|W|\le 300$ in the question, perhaps their derivation and understanding of the question were wrong. I have nothing to say here, except that if you yourself aren't confident in your solution, don't expect anyone else to be. Have confidence in what you do, and at the same time learn to accept when you go wrong. Balance these two things, and that's the secret of life :).

2. I have to mention Misha's immense variety of solutions. If you see the tester's code (the $O(Log(MOD))$ one), he has used something different. He used bpow(10, (n - 2) % (MOD - 1)), while the setter and I simply used count*fastExpo(10,n-2) (fastExpo and bpow are the same). He framed that solution based on Euler's Totient Function and Euler's theorem. Go through the given links and try to derive the expression. The answer is in the tab below to cross-check :).

View Content

3. Don't open the tab below.

Now, Misha did not want to stop at the $O(Log(MOD))$ solution, so he devised an $O(1)$ solution! Basically, that solution uses pre-processing. We know that $MOD <{2}^{30}$. Hence, the highest exponent we deal with (after reducing it modulo $MOD-1$) can be represented within $30$ bits.

He took the binary representation of the exponent and partitioned it into $3$ parts of $10$ bits each. (Remember, we can represent these numbers within $30$ bits.)

Now, he uses his $POW00[],POW10[],POW20[]$ arrays to calculate any power of $10$ with exponent in $[0,MOD]$ in $O(1)$ time. $POW00[]$ is simple: it just stores the powers of $10$ mod ${10}^{9}+7$ for exponents in $[0,1023]$. Imagine it like this: we partitioned the $30$-bit number into $3$ parts of $10$ bits each, and $POW00[]$ deals with the lowest $10$ bits. Multiplying the previous element by $10$ adds $1$ to the exponent of the previous element (which, in terms of binary numbers, is essentially adding $1$ to the binary representation of the exponent). Recurrence formula-

$POW00[i]=10*POW00[i-1];$

He does the same in the $POW10[]$ array. He adds $1$ to the binary representation of the number, but he is careful about one thing. $POW00[]$ dealt with numbers in $[0,{2}^{10}-1]$, where the value of the $LSB$ is $1$. However, in the case of $POW10[]$, the $LSB$ is (in reality) the $11^{th}$ bit of the number, whose value is ${2}^{10}=1024$. Hence, we multiply by ${10}^{1024}$ instead of $10$.

$POW10[i]={10}^{{2}^{10}}*POW10[i-1];$

Similarly, in $POW20[]$, the recurrence relation is $POW20[i]={10}^{{2}^{20}}*POW20[i-1]$. Basically, he is adding $1$ to the $LSB$ of each $10$-bit partition, and adding one there means multiplying by $10$ raised to the power equal to the value of that $LSB$. The power for an exponent is then the product of one entry from each array, indexed by the top, middle, and lowest $10$ bits of the exponent, with every product taken mod ${10}^{9}+7$. His generator code is given below for the curious :).

His generator code for the array values. :)

    #include <cstdio>

    const int MOD = 1000000007;

    int main() {
        // After the loop: mul20 = 10^(2^20) mod MOD. The inner branch fires
        // for the 1024 multiples of 1024 below 2^20, so mul10 = 10^(2^10).
        int mul20 = 1, mul10 = 1;
        for (int i = 0; i < 1 << 20; ++i) {
            mul20 = 10LL * mul20 % MOD;
            if (i % (1 << 10) == 0) {
                mul10 = 10LL * mul10 % MOD;
            }
        }
        // Each table holds step^0 .. step^1023 for its own step.
        printf("int PW20[] = {");
        for (int i = 0, now = 1; i < 1 << 10; ++i) {
            printf("%d%c", now, i == 1023 ? ' ' : ',');
            now = 1LL * now * mul20 % MOD;
        }
        printf("};\n");
        printf("int PW10[] = {");
        for (int i = 0, now = 1; i < 1 << 10; ++i) {
            printf("%d%c", now, i == 1023 ? ' ' : ',');
            now = 1LL * now * mul10 % MOD;
        }
        printf("};\n");
        printf("int PW00[] = {");
        for (int i = 0, now = 1; i < 1 << 10; ++i) {
            printf("%d%c", now, i == 1023 ? ' ' : ',');
            now = 1LL * now * 10 % MOD;
        }
        printf("};\n");
        return 0;
    }
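
For completeness, here is a hedged sketch of how such tables can be used at query time. It is not Misha's code: the struct `PowerTables` and the method `powerOfTen` are hypothetical names, and the tables are built at start-up instead of being pasted in as generated constants. The exponent is first reduced modulo $MOD-1$ (valid because $MOD$ is prime and does not divide $10$), so that it fits in $30$ bits.

```cpp
#include <cstdint>
#include <vector>

const std::uint64_t MOD = 1000000007ULL;

struct PowerTables {
    // pw00[i] = 10^i, pw10[i] = 10^(1024*i), pw20[i] = 10^(2^20*i), mod MOD.
    std::vector<std::uint64_t> pw00, pw10, pw20;

    PowerTables() : pw00(1024), pw10(1024), pw20(1024) {
        fill(pw00, 10);
        const std::uint64_t step10 = pw00[1023] * 10 % MOD;      // 10^(2^10)
        fill(pw10, step10);
        const std::uint64_t step20 = pw10[1023] * step10 % MOD;  // 10^(2^20)
        fill(pw20, step20);
    }

    // table[i] = step^i mod MOD for i in [0, 1023].
    static void fill(std::vector<std::uint64_t>& table, std::uint64_t step) {
        table[0] = 1;
        for (int i = 1; i < 1024; ++i) table[i] = table[i - 1] * step % MOD;
    }

    // 10^e mod MOD with three table lookups and two multiplications.
    std::uint64_t powerOfTen(std::uint64_t e) const {
        e %= MOD - 1;  // Fermat: 10^(MOD-1) is 1 mod MOD, so e < 2^30 now
        return pw20[e >> 20] * pw10[(e >> 10) & 1023] % MOD
               * pw00[e & 1023] % MOD;
    }
};
```

With the tables in place, each test case costs a constant number of operations, so the answer is just the pair count $cnt$ times `powerOfTen(N - 2)`, taken mod ${10}^{9}+7$.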

4. Some of the good questions on fast exponentiation-

Tower 3-coloring - Give Fermat's Little Theorem a read first and you're good to go :)
BROCLK - While also solvable via Matrix Exponentiation, I solved this problem by nested/dual Fast-Exponentiation.
Chef and Segments - Fast Exponentiation +Maths.

https://crp.databread.xyz/

Hi.

I have created a Codechef Rating Predictor with UI and options to filter it by Institution and Country and Username.

Simply go to https://crp.databread.xyz and find the list of running contests to choose from.

Also, you can open it directly from the Codechef Ranking page. Simply replace codechef.com/ with crp.databread.xyz?

For Example : If contest Link is : https://www.codechef.com/rankings/APRIL18A?filterBy=Institution%3DIndian%20Institute%20of%20Technology%20Delhi&order=asc&sortBy=rank

Then Predicted Ratings are present at : https://crp.databread.xyz?rankings/APRIL18A?filterBy=Institution%3DIndian%20Institute%20of%20Technology%20Delhi&order=asc&sortBy=rank

Kindly let me know if there are any bugs.

**Github Link : https://github.com/Shraeyas/Codechef-Rating-Predictor**

Hope you like it.

Thanks.

Happy Coding :)

PROBLEM LINK:

DIFFICULTY:

PRE-REQUISITES: