Yahoo releases massive 13.5TB web-browsing data set to researchers

Yahoo’s business may be struggling, but millions of people still visit its site to read the news every day. That gives the company unique insight into browsing and reading habits, and today it released a huge swath of that data. The “Yahoo News Feed dataset” incorporates the anonymized browsing habits of 20 million users between February and May of 2015 across a variety of Yahoo properties, including its home page, main news site, Yahoo Sports, Yahoo Finance, Yahoo Movies and Yahoo Real Estate. All told, the data set is a whopping 13.5TB and covers 110 billion unique interaction “events.” Yahoo calls it the “largest machine learning dataset” ever publicly released, and we’re inclined to believe it: there aren’t very many companies that could accumulate this much browsing data.…

