HackerLand National Bank has a simple policy for warning clients about possible fraudulent account activity. If the amount spent by a client on a particular day is greater than or equal to 2× the client's median spending for a trailing number of days, they send the client a notification about potential fraud.
The bank doesn't send the client any notifications until they have at least that trailing number of prior days' transaction data.
Given the number of trailing days and a client's total daily expenditures for a period of days, find and print the number of times the client will receive a notification over all days.
For example, d = 3 and expenditures = [10, 20, 30, 40, 50]. On the first three days, they just collect spending data. At day 4, we have trailing expenditures of [10, 20, 30]. The median is 20 and the day's expenditure is 40. Because 40 ≥ 2 × 20, there will be a notice. The next day, our trailing expenditures are [20, 30, 40] and the expenditures are 50. This is less than 2 × 30, so no notice will be sent. Over the period, there was one notice sent.
Note: The median of a list of numbers can be found by arranging all the numbers from smallest to greatest. If there is an odd number of numbers, the middle one is picked. If there is an even number of numbers, the median is defined to be the average of the two middle values. (Wikipedia)
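As a minimal sketch of the definition in the note above (the function name `median` is only illustrative), the median can be computed directly by sorting:

```python
def median(values):
    """Median by sorting: the middle element, or the mean of the two middle ones."""
    ordered = sorted(values)
    mid = len(ordered) // 2
    if len(ordered) % 2 == 1:
        # Odd count: a single middle element exists.
        return ordered[mid]
    # Even count: average the two middle elements (may be fractional).
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([2, 3, 4, 2, 3]))  # odd-sized list
print(median([1, 2, 3, 4]))     # even-sized list
```

Sorting every window this way is simple but slow for large inputs, which is why a faster approach is worth looking for.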
Function Description
Complete the function activityNotifications in the editor below. It must return an integer representing the number of client notifications.
activityNotifications has the following parameter(s):
- expenditure: an array of integers representing daily expenditures
- d: an integer, the lookback days for median spending
Input Format
The first line contains two space-separated integers n and d, the number of days of transaction data, and the number of trailing days' data used to calculate median spending.
The second line contains n space-separated non-negative integers where each integer i denotes expenditure[i].
Constraints
- 1 ≤ n ≤ 2 × 10^5
- 1 ≤ d ≤ n
- 0 ≤ expenditure[i] ≤ 200
Output Format
Print an integer denoting the total number of times the client receives a notification over a period of n days.
Sample Input 0
9 5
2 3 4 2 3 6 8 4 5
Sample Output 0
2
Explanation 0
We must determine the total number of notifications the client receives over a period of n = 9 days. For the first five days, the customer receives no notifications because the bank has insufficient transaction data: notifications = 0.
On the sixth day, the bank has d = 5 days of prior transaction data, {2, 3, 4, 2, 3}, and median = 3 dollars. The client spends 6 dollars, which triggers a notification because 6 ≥ 2 × 3: notifications = 0 + 1 = 1.
On the seventh day, the bank has d = 5 days of prior transaction data, {3, 4, 2, 3, 6}, and median = 3 dollars. The client spends 8 dollars, which triggers a notification because 8 ≥ 2 × 3: notifications = 1 + 1 = 2.
On the eighth day, the bank has d = 5 days of prior transaction data, {4, 2, 3, 6, 8}, and median = 4 dollars. The client spends 4 dollars, which does not trigger a notification because 4 < 2 × 4: notifications = 2.
On the ninth day, the bank has d = 5 days of prior transaction data, {2, 3, 6, 8, 4}, and a transaction median of 4 dollars. The client spends 5 dollars, which does not trigger a notification because 5 < 2 × 4: notifications = 2.
Sample Input 1
5 4
1 2 3 4 4
Sample Output 1
0
Explanation 1
There are 4 days of data required, so the first day a notice might go out is day 5. Our trailing expenditures are [1, 2, 3, 4] with a median of 2.5. The client spends 4, which is less than 2 × 2.5 = 5, so no notification is sent.
Fraudulent Activity Notifications - Hacker Rank Solution
In this problem, you need to find the running median. There can be several approaches to solving this. Notice that a client can spend at most $200 per day. We can take advantage of this small number.
Finding Median using Counting Sort
Let's see how we can find the median of an array using counting sort. Suppose the maximum number in the array is 7. If you write down the frequency of each number from 0 to 7, you get a frequency table with one count per value.
If the array has an odd number of elements n, the median is the ((n + 1) / 2)-th number in the sorted array. You can loop over the frequency table, keeping a running total of the counts, and stop at the first value where the total reaches that position.
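A small sketch of this lookup, assuming non-negative integer values and a known maximum. The helper names `build_counts` and `kth_smallest` and the sample array are illustrative, not taken from the original post:

```python
def build_counts(values, max_value):
    """Frequency table: counts[v] is how many times v occurs in values."""
    counts = [0] * (max_value + 1)
    for v in values:
        counts[v] += 1
    return counts


def kth_smallest(counts, k):
    """Return the k-th smallest value (1-based) by walking cumulative counts."""
    running = 0
    for value, freq in enumerate(counts):
        running += freq
        if running >= k:
            return value
    raise ValueError("k exceeds the number of elements")


values = [2, 3, 4, 2, 3]               # illustrative array, maximum assumed to be 7
counts = build_counts(values, 7)
middle = (len(values) + 1) // 2        # position of the median for an odd-sized array
print(kth_smallest(counts, middle))
```

The array itself is never sorted; only the table of counts is scanned, so the cost of one lookup depends on the value range rather than on the array length.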
Get back to the original problem
In the original problem, you need to maintain a frequency table for each window of size d in the array. You can do it by keeping track of the start and end points of the window. Suppose the current window starts at index i and ends at index j, where j - i + 1 = d, and assume you already have a frequency table for that window. When you move to the next window, [i + 1, j + 1], you can update the frequency table by decrementing the frequency of the element at index i and incrementing the frequency of the element at index j + 1. Using the frequency table you can find the median, and your problem is solved.
The complexity is O(n × 200): for each of the n days, one pass over a frequency table of at most 201 values.
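The sliding-window idea can be sketched as follows. This is one possible arrangement rather than the setter's exact code. The name `activity_notifications` and the `max_value` parameter are assumptions, and the spend is compared against the sum of the two middle values (twice the median) so that even window sizes need no fractions:

```python
def activity_notifications(expenditure, d, max_value=200):
    """Count days whose spend is at least twice the trailing d-day median."""
    counts = [0] * (max_value + 1)
    for v in expenditure[:d]:
        counts[v] += 1

    def kth(k):
        # k-th smallest value (1-based) in the current window.
        running = 0
        for value, freq in enumerate(counts):
            running += freq
            if running >= k:
                return value

    notices = 0
    for day in range(d, len(expenditure)):
        low = kth((d + 1) // 2)   # lower middle element
        high = kth(d // 2 + 1)    # upper middle element (same as low when d is odd)
        if expenditure[day] >= low + high:
            notices += 1
        # Slide the window: drop the oldest day, add the current one.
        counts[expenditure[day - d]] -= 1
        counts[expenditure[day]] += 1
    return notices


print(activity_notifications([2, 3, 4, 2, 3, 6, 8, 4, 5], 5))  # Sample Input 0
print(activity_notifications([1, 2, 3, 4, 4], 4))              # Sample Input 1
```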
Tricky Part
Note that the median can be a floating point value when the size of the array is even. For example, if the array is [1, 2, 3, 4], the median is 2.5. You will not pass the 2nd sample I/O if you don't handle it properly.
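One way to sidestep floating point entirely is to compare the day's spend against twice the median, which is always an integer: the sum of the two middle elements. A minimal sketch, with the helper name `doubled_median` chosen for illustration:

```python
def doubled_median(window):
    """Return 2 * median as an integer: the sum of the two middle elements."""
    ordered = sorted(window)
    n = len(ordered)
    # For odd n both indices point at the same middle element.
    return ordered[(n - 1) // 2] + ordered[n // 2]


window = [1, 2, 3, 4]                # trailing data from the 2nd sample
spend = 4
print(doubled_median(window))        # twice the median of the window
print(spend >= doubled_median(window))
```

Truncating the median to an integer instead would compare the spend against a smaller threshold and report a notification that should not be sent.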
Another Solution
This can also be solved using two priority queues. This video explains it nicely.
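A rough sketch of the two-heap idea, written independently of the referenced video. It keeps the lower half of the window in a max-heap and the upper half in a min-heap, and removes expired values lazily once they reach the top of a heap. The class and function names, and the lazy-deletion bookkeeping, are assumptions of this sketch:

```python
import heapq
from collections import defaultdict


class SlidingMedian:
    """Two heaps with lazy deletion; small holds the lower half (negated)."""

    def __init__(self):
        self.small, self.large = [], []          # max-heap (negated), min-heap
        self.small_size = self.large_size = 0    # live (not expired) elements
        self.dead_small = defaultdict(int)       # value -> pending deletions
        self.dead_large = defaultdict(int)

    def _prune_small(self):
        while self.small and self.dead_small[-self.small[0]] > 0:
            self.dead_small[-self.small[0]] -= 1
            heapq.heappop(self.small)

    def _prune_large(self):
        while self.large and self.dead_large[self.large[0]] > 0:
            self.dead_large[self.large[0]] -= 1
            heapq.heappop(self.large)

    def _rebalance(self):
        # Keep small_size equal to large_size or exactly one larger.
        if self.small_size > self.large_size + 1:
            heapq.heappush(self.large, -heapq.heappop(self.small))
            self.small_size -= 1
            self.large_size += 1
            self._prune_small()
        elif self.small_size < self.large_size:
            heapq.heappush(self.small, -heapq.heappop(self.large))
            self.small_size += 1
            self.large_size -= 1
            self._prune_large()

    def add(self, value):
        if not self.small or value <= -self.small[0]:
            heapq.heappush(self.small, -value)
            self.small_size += 1
        else:
            heapq.heappush(self.large, value)
            self.large_size += 1
        self._rebalance()

    def remove(self, value):
        # The value must currently be in the window.
        if value <= -self.small[0]:
            self.dead_small[value] += 1
            self.small_size -= 1
            self._prune_small()
        else:
            self.dead_large[value] += 1
            self.large_size -= 1
            self._prune_large()
        self._rebalance()

    def doubled_median(self):
        if self.small_size > self.large_size:
            return 2 * -self.small[0]
        return -self.small[0] + self.large[0]


def activity_notifications_heaps(expenditure, d):
    window = SlidingMedian()
    for value in expenditure[:d]:
        window.add(value)
    notices = 0
    for day in range(d, len(expenditure)):
        if expenditure[day] >= window.doubled_median():
            notices += 1
        window.add(expenditure[day])
        window.remove(expenditure[day - d])
    return notices


print(activity_notifications_heaps([2, 3, 4, 2, 3, 6, 8, 4, 5], 5))
```

Unlike the counting approach, this version does not rely on the expenditures being bounded by 200; each update costs time logarithmic in the heap size.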
Problem Setter's code:
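import sys
#sys.stdin = open("in", "r")
n, d = map(int, input().split())
arr = list(map(int, input().split()))
dic = {}  # value -> frequency within the trailing window

def find(idx):
    # Return the idx-th smallest value (1-based) in the trailing window.
    s = 0
    for i in range(0, 201):  # expenditures range over 0..200 inclusive
        freq = 0
        if i in dic:
            freq = dic[i]
        s = s + freq
        if s >= idx:
            return i

ans = 0
for i in range(0, n):
    val = arr[i]
    if i >= d:
        med = find(d // 2 + d % 2)
        if d % 2 == 0:
            # Even window: med + ret is the sum of the two middle values,
            # i.e. twice the median, so no fractions are needed.
            ret = find(d // 2 + 1)
            if val >= med + ret:
                ans = ans + 1
        else:
            if val >= med * 2:
                ans = ans + 1
    if val not in dic: dic[val] = 0
    dic[val] = dic[val] + 1
    #print(i, dic)
    if i >= d:
        # Drop the day that just left the trailing window.
        prev = arr[i - d]
        dic[prev] = dic[prev] - 1
print(ans)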
Problem Tester's code:
#include <iostream>
#include <cstdio>
#include <algorithm>
#include <cstring>
#include <ctime>
#include <cassert>
#include <cmath>   // floor, ceil
using namespace std;
#define SZ(x) ((int)(x.size()))
#define FOR(i,n) for(int (i)=0;(i)<(n);++(i))
#define FOREACH(i,t) for (auto i=t.begin(); i!=t.end(); i++)
#define REP(i,a,b) for(int (i)=(a);(i)<=(b);++i)
typedef long long ll;
const int INF = 1e9;
const int N = 2e5;
const int V = 200;
int a[N];
int cnt[V+1];   // cnt[v] = occurrences of v in the trailing window
int main()
{
    ios_base::sync_with_stdio(0);
    int n, d;
    cin >> n >> d;
    assert(n >= 1 && n <= N);
    assert(d >= 1 && d <= n);
    FOR(i, n) cin >> a[i];
    FOR(i, n) assert(a[i] >= 0 && a[i] <= V);
    int res = 0;
    FOR(i, d) cnt[a[i]]++;
    REP(i, d, n-1)
    {
        //SOLVE HERE
        int acc = 0;
        int low_median = -1, high_median = -1;
        REP(v, 0, V)
        {
            acc += cnt[v];
            if(low_median == -1 && acc >= int(floor((d+1)/2.0)))
            {
                low_median = v;
            }
            if(high_median == -1 && acc >= int(ceil((d+1)/2.0)))
            {
                high_median = v;
            }
        }
        assert(acc == d);
        // Sum of the two middle values equals twice the median.
        int double_median = low_median + high_median;
        //cout << low_median << " " << high_median << " -> " << median << endl;
        if(a[i] >= double_median)
        {
            res++;
        }
        // Slide the window forward by one day.
        cnt[a[i-d]]--;
        cnt[a[i]]++;
    }
    cout << res << endl;
    return 0;
}