Yahoo releases massive machine learning dataset for researchers

Yahoo has introduced its new “Yahoo News Recommendation” dataset, which it says is the largest machine learning dataset ever released publicly. According to Yahoo, the dataset has been released for the academic research community, giving researchers who normally cannot access data at this scale the opportunity to work with it. The dataset contains 110 billion events, which Yahoo says amounts to 13.5TB uncompressed. According to the company, it consists of anonymized user-news item interaction data gathered from roughly 20 million users over a period of a few months early last year. The Yahoo News Feed dataset draws on anonymized data from users who interacted with various Yahoo properties, including Yahoo Movies, Yahoo News, and Yahoo Finance. The…

