Machine Learning Open House

‘Efficient unique counts in sliding windows’
A very common and effective type of feature that Stripe uses in its machine learning models requires counting unique items over windowed time periods: for example, how many unique credit cards have we seen from a given IP address in the last hour or the last day? This short talk will discuss in detail an interesting data structure we use (sometimes called a “hyperloglog series”) that can provide an estimated unique count for any window into the past (or even a weighted, decaying combination of all windows), no matter how large the counts or long the windows, in constant space and time.

‘Practical challenges of transaction fraud model development’
Stripe is continually improving its system for detecting fraud at the transaction level. However, as…
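
For readers curious how the windowed unique counts in the first talk might work, here is a minimal Python sketch. It is an assumption on our part, not Stripe's implementation: it follows the general "sliding HyperLogLog" idea, where each register keeps a short list of (timestamp, rank) pairs instead of a single maximum. The class name `SlidingHLL`, the parameter `p`, the use of SHA-1 for hashing, and the requirement that timestamps arrive in non-decreasing order are all illustrative choices.

```python
import hashlib
import math


class SlidingHLL:
    """Illustrative sliding-window HyperLogLog (hypothetical, not Stripe's code)."""

    def __init__(self, p=10):
        self.p = p                # number of index bits
        self.m = 1 << p           # number of registers
        # Each register holds (timestamp, rank) pairs, oldest first,
        # with strictly decreasing ranks.
        self.registers = [[] for _ in range(self.m)]

    def _hash(self, item):
        digest = hashlib.sha1(str(item).encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")  # 64-bit hash

    def add(self, item, timestamp):
        """Record an item; timestamps are assumed to be non-decreasing."""
        h = self._hash(item)
        idx = h >> (64 - self.p)
        rest = h & ((1 << (64 - self.p)) - 1)
        # Position of the first 1 bit in the remaining 64 - p bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        entries = self.registers[idx]
        # Older entries with a rank no larger than the new one can never
        # be the maximum of any future window, so drop them.
        while entries and entries[-1][1] <= rank:
            entries.pop()
        entries.append((timestamp, rank))

    def count(self, now, window):
        """Estimate unique items with timestamp >= now - window."""
        cutoff = now - window
        total = 0.0
        zeros = 0
        for entries in self.registers:
            rank = 0
            # The oldest in-window entry carries the largest in-window rank.
            for ts, r in entries:
                if ts >= cutoff:
                    rank = r
                    break
            if rank == 0:
                zeros += 1
            total += 2.0 ** -rank
        alpha = 0.7213 / (1 + 1.079 / self.m)  # standard constant for m >= 128
        estimate = alpha * self.m * self.m / total
        if estimate <= 2.5 * self.m and zeros:
            # Small-range correction (linear counting).
            estimate = self.m * math.log(self.m / zeros)
        return estimate


if __name__ == "__main__":
    hll = SlidingHLL()
    for i in range(500):
        hll.add(f"card-{i}", timestamp=0)
    for i in range(500, 600):
        hll.add(f"card-{i}", timestamp=100)
    print(round(hll.count(now=100, window=50)))   # roughly 100
    print(round(hll.count(now=100, window=200)))  # roughly 600
```

Because ranks within a register are strictly decreasing, each list holds at most 64 - p + 1 entries, so memory stays bounded regardless of how many items arrive or how long the window is. A decaying combination of windows, as mentioned in the abstract, would need additional logic that this sketch does not attempt.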

