Just Enough Data Weekly Newsletter 6
Trino Summit
OCTOBER 21ST & 22ND, 2021
Virtually join the community of Trino users and developers on October 21st & 22nd, where we'll share the latest on Trino and learn how some of the most innovative companies are using this technology to power their analytics platforms.
Oracle Cloud Infrastructure Virtual Thought Leadership Event
October 22 at 4 PM IST
Register now for exclusive insights into our big data services trends and innovation journey. We can’t wait to share how our advances can help customers and employees alike unlock data’s fundamental value and learn to extract and use it.
The Docker Everything Bagel™ – Spin Up A Local Data Stack
An important part of developing an open source project like lakeFS is assisting and advising our users. When they run into an issue and feel pain, we want to feel that pain, too. Quite literally.
This means recreating the environment, running the same code, and raising the same error.
In complex, modern data stacks, this is easier said than done. Drawing on experience from the past year, we have developed a setup that helps us in this pursuit. We affectionately refer to it as the Everything Bagel.
Create Powerful Data Pipelines by Mastering Sensors
Configure your Sensors to avoid freezing your Airflow instance
Dive into the ExternalTaskSensor
Smart Sensors, what are they? Should you use them?
Datafold: Why to Conduct Data Quality Post-mortems: Lessons learned from GitLab, Google, and Facebook
Organizations lose an average of $15 million per year due to poor data quality, according to Gartner Research. Poor data quality (inaccuracy, inconsistency, incompleteness, and unreliability) can have serious negative consequences for a data-driven company. Given the cost of these consequences, organizations have a strong incentive to spot, fix, and understand data quality problems so that they don’t recur. That’s the goal of post-mortems.
Feature Engineering at Scale
A reference implementation of the design patterns is provided in the attached notebooks to demonstrate how first-class design patterns can simplify the feature engineering process and improve efficiency across the silos in your organization. The approach can be integrated with the recently launched Databricks Feature Store, the first of its kind co-designed with an MLOps and data platform, and can leverage the storage and MLOps capabilities of Delta Lake and MLflow.
How Istio, Tempo, and Loki speed up debugging for microservices
The architect decided to develop that system based on microservices. Hundreds of them! You, as a developer, think: Why? Why does the architect hate me so much? And then, the main question of the moment: How am I supposed to debug this?
Of course, we all understand the benefits of a microservice architecture. But we also hate the downsides. One of those is the process of debugging or running a postmortem analysis across hundreds of services. It is tedious and frustrating.
Fridays are not meant for production deployments. I repeat: Abort!!