Just Enough Data Weekly Newsletter 9
SCALING ML MODELS, DATA, ALGORITHMS, & INFRASTRUCTURE
Creators of the Transformer, TensorFlow, Kubernetes, Apache Spark, Tesla Autopilot, Keras, Apache Arrow, MLPerf, and Matroid, along with Turing Award winners, astronauts, and others, will present on running and scaling machine learning algorithms on a variety of computing platforms, such as GPUs, CPUs, FPGAs, TPUs, and chips from the nascent AI chip industry.
Dremio Cloud Live Demo
See Dremio Cloud in action and ask our product experts your questions
Join our live weekly demo where our product experts will showcase Dremio Cloud and answer your most pressing data questions.
Dremio Cloud is a true SaaS SQL lakehouse platform that eliminates the need to install any software or manage infrastructure. You’ll learn how Dremio enables you to run high-performance queries and dashboards directly on cloud data lake storage, without having to copy data into proprietary data warehouses.
Cloud Data Pipelines (RO, online, free)
This month we're hosting a special Cloud Native Bucharest meetup, as part of Big Data Week Bucharest. That's why it's not on Thursday :)
Note that this event is free; you don't need a conference ticket.
What Is Azure Data Lake Storage (ADLS)?
Microsoft Azure Data Lake Storage (ADLS) is a fully managed, elastic, scalable, and secure file system that supports HDFS semantics and works with the Apache Hadoop ecosystem. It provides industry-standard reliability, enterprise-grade security, and unlimited storage suitable for a wide variety of data. It is built for large-scale analytics systems that need substantial computing capacity to process and analyze high volumes of data. Data stored in ADLS can easily be analyzed using Hadoop frameworks like MapReduce and Hive.
The Subsurface Podcast
Testing a pipeline to join 30 billion rows of data quickly.
DoorDash: As an orchestration engine, Apache Airflow let us quickly build pipelines in our data infrastructure. However, as our business grew to 2 billion orders delivered, scalability became an issue. Our solution came from a new Airflow version that let us pair it with Kubernetes, ensuring that our data infrastructure could keep up with our business growth.
Google launches serverless Spark, AI workbench, new data offerings at Cloud Next
While the cloud has been great for data and analytics -- given its limitless storage and compute capacity -- it has also caused a real regression in productivity for data professionals. The reason for this, simply put, is that the major cloud providers have hurled numerous data platforms at the market and left it to customers to pick the right combination of services, then integrate them. Say what you will about the old guard enterprise software behemoths, but they did spare their customers much of the "assembly required" experience that the cloud hyperscalers impose today.
No matter how much you extend your working hours, your work will never be over.
Remember, your personal life is just as important as your professional one.