PROBLEM LINK:
Setter-Anuj Maheshwari
Tester-Misha Chorniy
Editorialist-Abhishek Pandey
DIFFICULTY:
Simple
PRE-REQUISITES:
Array, Looping, Basic Math and Principles of Counting, Frequency Array.
PROBLEM:
Given an array of length $N$, find number of unordered pairs $(i,j)$ such that their average exists in the array (real division), or in other words, find number of unordered pairs of $(i,j)$ for which we can find an element in array $A_k$ which satisfies-
$2*A_k=A_i+A_j$ where $A_i$ and $A_j$ are $i$ $'th$ and $j$ $'th$ element of the array.
Unordered pair means that order of listing elements in pair doesnt matter. It means $(2,1)$ is treated same as $(1,2)$. In ordered pairs, we treat both of them as different.
QUICK EXPLANATION:
We immediately see the low constraints for value of $A_i$. We can make a frequency array to record frequency of each array element. The only problem is, $A_i$ can be negative as well. This is easily solved by adding a constant $K$ (preferably $1000$) to all array elements. With all elements positive, we iterate over all possible pair of values of array elements, i.e. we check all pairs of $(a,b)$ where $0\le a,b \le 2000$. We check if $a$, $b$ and their average exists in array and update answer accordingly.
EXPLANATION:
This editorial will discuss 2 approaches. First is mine (Editorialist), and second one is Misha's (@mgch), i.e. the tester's. My approach is a bit more difficult than tester's because I use some formulas and observations, while tester's solution is simply basic.
We noticed a lot of Div2 users getting stuck in lots of useless and unnecessary things, hence we will answer how to overcome such things in Chef Vijju's Corner at the end of editorial. :)
1. Editorialist's Approach-
The first thing I did was to add $1000$ to each element so that all elements of array are non-negative and between $[0,2000]$. We can show that if we add any constant to all values of array, then although the average of numbers increases, it has no effect on "existence of average (of 2 array elements) inside array." Try to prove it if you can. (Answer is in tab below).
For now, forget about duplicates. My approach counts even duplicates in the process, and removes them later at the end.
Once we have the frequencies of elements inside array, I iterate from all possible "pairs of values" allowed, i.e. , we know that now values of array elements are in range $[0,2000]$, so we will iterate over all pairs of $(a,b)$ where $0 \le a,b \le 2000$ and do the following-
1.If $(a,b)$ exist, and if their average (by real division) exists in the array as well, goto 2, else check next pair.
2.Is $a==b$ $?$ If yes, then add $(freq[a]-1)*freq[b]$ to answer, else add $freq[a]*freq[b]$
To remove duplicates, simply divide the answer by $2$ - because each pair is counted twice. The proof for the 2 formulas I used, along with fact that dividing by $2$ removes duplicates is left to reader as an exercise. (Answer/Hints are discussed in Chef Vijju's corner.)
$Time$ $Complexity-O({2000}^{2}+N)=O(N+K)=O(N)$ (where $K$ is a constant).
Tester's Approach (Easy)-
The first step is the same, he also made all elements non-negative by adding 1000. He deals with duplicates immediately.
1.For each possible value in $[0,2000]$, check if it exists in array. If it does, goto 2, else check the next value.
2.Count all pairs formed by this value and its duplicates by adding $ cnt[mid] * (cnt[mid] - 1) / 2$ to the answer. $mid$ here is the value being investigated.
3.If $mid$ is average of 2 numbers, this means both the numbers are equidistant from $mid$. Hence, count all pairs possible due to presence of equidistant numbers present at a distant $x$ , where $x$ is in range $[0,mid]$ and $mid+x$ and $mid-x$ are in valid range of $[0,2000]$. For each such pair, add $cnt[mid - x] * cnt[mid + x]$ to the final answer.
Time complexity is same as my solution in worst case :).
SOLUTION:
CHEF VIJJU'S CORNER:
1.Lets first discuss the mistakes. Most of the participants got the formulas right...but they did a tremendous blunder. Suffered from Overflow!!. If you are declaring your frequency array as int, then for larger test cases where $N$ is as large as ${10}^{5}$, the frequency of elements can be as large. Hence, when we do $freq[a]*freq[b]$, we must make sure to use proper type casting, else the answer will overflow!! Many contestants stored answer in an int variable, and that code is doomed to be wrong even before submission!!
One thing I will tell, if you are getting AC for smaller sub tasks, and WA for larger ones, do check for overflow!!. There is a good $60\%$ chance that it is overflow issue giving you WA. We were going through the wrong solutions in detail, and our heart sank a little bit each time we saw a solution killed because of overflow :(
2.Some of them mistook $|A_i| \le 1000$ as $0 \le A_i\le 1000$. The difference between the two is that, the first condition allows negative integers in the input. Whenever in such confusion, assume value of $A_i$ to be negative - and ask yourself - Does it satisfy this constraint? If yes, its in the input, else its not.
3.Derivation of $freq[a]*freq[b]$ in tab below.
4.Whenever you see that the max value of array elements is comparable to size of the array, frequency array approach can be used. Its usually asked as a part of question rather than an individual question, like here it was Math (principle of counting) +Frequency array. With some modification, you can also count number of each character in a string using this approach.
5. Frequency Array approach to count number of each element is one of the steps involved in Counting Sort , an $O(N)$ sorting technique.
6. What would we do if this question had constraints on $A_i$ upto $|A_i| \le {10}^{5}?$ It would be a difficult question, but still possible to solve. That problem then, can be solved using $FFT$. We create a polynomial $P(x)$ such that $P(x)=\sum_{0}^{2*{10}^{5}}(freq[i]*{x}^{i})$. On squaring $P(x)$, we will have $freq[i]*freq[j]$ for all $(i,j)$ between $1 \le i,j \le N$, but we need to check the case when $i=j$. The next step to do is, if $i+j$ is even, and $freq[(i+j)/2]>0$ , it has to be added to the final answer. This outline is just meant to introduce beginners that there is something like $FFT$ existing in this universe (xD) and to offer an interesting insight to the solution. Time Complexity is $O(|A|Log|A|)$
7. Questions to practice frequency array-
(More will be added if I find more, or if the community helps :) )