Yahoo Releases Largest Machine Learning Data Set for Academia

Yahoo Inc. has released a massive cache of internet data to the academic research community. The company hopes the initiative will encourage innovation and give researchers a window into real-world user behavior. The dataset will be available through Yahoo Labs Webscope, an ongoing company program. According to PC Magazine, the cache contains around 110 billion interaction events from about 20 million Yahoo users, recorded between February and May of 2015. The anonymized dataset is 13.5 TB in size. All of these users interacted with the news feed on the company’s major sites, which include Sports, News, and Finance. The data also contains details about the gender, age range, and generalized geographic data of a subset of…

