Yahoo releases largest ever data set for machine-learning scientists

Yahoo has today released the largest data set ever made available to machine-learning scientists, IDG News Service reports. The data comprises “anonymized user interactions with the news streams on sites like Yahoo News and Yahoo Sports,” the report states, amounting to 110 billion events in total. The release is meant to help validate machine-learning programs. “Data is the life-blood of research in machine learning,” Yahoo said. But data sets of this magnitude are usually available only to researchers at huge companies and rarely to academic researchers, and while fake data can be created, it doesn’t always reflect the messy, real-world nature of actual data. The released data is now available through Yahoo Labs’ Webscope data-sharing program.

