How to Keep Your Data Lake From Becoming a Data Swamp

A data lake is a massive store of structured and unstructured data at big-data scale. Data from various streams flows into the lake and is available to cross-functional data scientists, who examine it and interpret patterns for predictive analytics and machine learning. On the surface, the idea sounds fantastic and full of possibilities. Many enterprises jumped on the bandwagon, created Hadoop-based repositories, and started filling them with all kinds of data. Whether organizations are getting business value from their data lakes, however, is yet to be determined. In the hope of some future use, many companies are blindly putting all their data into their data lakes without any defined objectives, governance, or traceability.

Data Lakes or Data Swamps?

Without proper metadata and quality assurance of…

