Data Lake Store and Analytics with Tom Kerkhove
How do you stop your data lake from being a data swamp? Carl and Richard talk to Tom Kerkhove about Azure Data Lakes. The conversation digs into the impact the cloud has had on data warehousing: when you have as much compute and storage as you need on demand, does it still make sense to jump through all the hoops that data warehousing requires? Tom talks about Data Lakes storing all data as it arrives from a huge variety of sources and leaving that data in its native format, so that it is available for analysis as needed. Universal SQL (U-SQL) is the query language of Data Lakes. It is more LINQ-like, and it speaks to the power of being able to join anything to anything with the cloud!
Tom Kerkhove is a Senior Software Engineer at Microsoft working on Azure API Management, where he leads the product's future in allowing customers to build API ecosystems in a hybrid and multi-cloud landscape with its self-hosted gateway and Azure Arc. He has been working in the cloud-native space for 5+ years and has been a CNCF Ambassador since 2020. Autoscaling is in his DNA, and he is one of the active maintainers of Kubernetes Event-Driven Autoscaling (KEDA), a CNCF Incubation project that makes application autoscaling on Kubernetes dead simple. KEDA scales workloads at big enterprises such as Zapier, Reddit, FedEx, Alibaba Cloud, and many others.
- React Native https://facebook.github.io/react-native/
- Data Lakes on Azure https://azure.microsoft.com/en-us/solutions/data-lake/
- Martin Fowler on Data Lakes http://martinfowler.com/bliki/DataLake.html
- Azure Data Factory https://azure.microsoft.com/en-us/services/data-factory/
- Microsoft Cosmos Blog Post from 2010 http://blogs.msdn.com/b/seliot/archive/2010/11/05/cosmos-petabytes-perfectly-processed-perfunctorily.aspx
- Azure Data Lake Analytics Preview https://azure.microsoft.com/en-us/services/data-lake-analytics/
- Azure IoT Suite http://www.microsoft.com/en-ca/server-cloud/internet-of-things/azure-iot-suite.aspx
- Global Azure Bootcamp http://global.azurebootcamp.net/
- Revolution Analytics http://www.revolutionanalytics.com/
- Using Data Factories to Control Data Lake Analytics https://azure.microsoft.com/en-us/documentation/articles/data-factory-usql-activity/
- HDInsight https://azure.microsoft.com/en-us/services/hdinsight/
- Cortana Analytics Suite http://www.microsoft.com/en-ca/server-cloud/cortana-analytics-suite/overview.aspx