Just Enough Data Weekly Newsletter 5
Webinars
Superset Meetup: Apache Superset 1.3
Apache Superset 1.3 was just released and brings a large suite of improvements. In this Meetup, we will showcase improvements to Dashboard Native Filters, new charts that were added, new supported data sources, and the numerous user experience improvements.
Time
Sep 22, 2021 10:00 AM Pacific Time (US and Canada)
Pulsar Virtual Summit Europe 2021
The Pulsar Summit is the global event for engineers, architects, data scientists, and technical leaders interested in Pulsar and the messaging and streaming ecosystem. It is a unique opportunity to network and learn about Pulsar project updates, ecosystem developments, best practices, and adoption stories.
October 6th, 2021
Future Data Driven
Future Data Driven is a marquee online event focusing on the Microsoft Data Platform as a whole. Our mission is to inform and update attendees on Data & AI, DevOps & DataOps, Power BI & Visualization, Integration & Automation, and cloud infrastructure. We are excited to welcome you to our first-ever Future Data Driven online event, which is free to attend.
WHEN
29 September, 2021
Open Source Data Stack Conference
Building the modern stack with open source data solutions.
SEPT. 28 - 30
Manage Dependencies Between Airflow Deployments, DAGs, and Tasks
If you missed the live session by Astronomer, here is the recording.
Logging, Monitoring, and Observability in Google Cloud Part I
Rishi Singhal
Customer Engineer, Google Cloud
Rishi is a Customer Engineer for Google Cloud Platform (GCP). In this role he works with startups and enterprises to solve the problems and challenges they currently face, and helps them visualise their next journey with the power of GCP. Rishi has more than 16 years of IT experience, with expertise in application development and data processing. He has also done startup consulting in the transport, healthcare, and stock trading domains.
Datafold: Data Quality Management According to Lyft, Shopify, and Thumbtack
Given the complexity of real-world data quality management, it’s helpful to look at actual teams that have managed to establish and scale strong management practices. This article explores the data challenges faced by Shopify, Lyft, and Thumbtack and the systems they built to conquer them.
Dremio: What Is a Data Pipeline?
Moving data between systems requires many steps: from copying data, to moving it from an on-premises location into the cloud, to reformatting it or joining it with other data sources. Each of these steps needs to be done, and usually requires separate software.
A data pipeline is the sum of all these steps, and its job is to ensure that they are applied reliably to all data. These processes should be automated, but most organizations still need at least one or two engineers to maintain the systems, repair failures, and update the pipeline as the needs of the business change.
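As a rough illustration of "the sum of all these steps", the sketch below chains a copy step, a reformatting step, and a join step, and retries a step that fails. It is a minimal sketch in plain Python, not Dremio's implementation; every name in it (copy_records, reformat_keys, make_join_step, run_pipeline, the sample records) is hypothetical.

```python
from typing import Callable, Dict, List

Record = Dict[str, object]
Step = Callable[[List[Record]], List[Record]]


def copy_records(records: List[Record]) -> List[Record]:
    """Copy step: work on copies so the source data is never mutated."""
    return [dict(r) for r in records]


def reformat_keys(records: List[Record]) -> List[Record]:
    """Reformat step: normalize all field names to lowercase."""
    return [{key.lower(): value for key, value in r.items()} for r in records]


def make_join_step(customers: Dict[int, str]) -> Step:
    """Join step: attach a customer name looked up from another source."""
    def join(records: List[Record]) -> List[Record]:
        return [{**r, "customer": customers.get(r["id"])} for r in records]
    return join


def run_pipeline(records: List[Record], steps: List[Step],
                 max_attempts: int = 2) -> List[Record]:
    """Run every step in order, retrying a failing step before giving up."""
    for step in steps:
        for attempt in range(1, max_attempts + 1):
            try:
                records = step(records)
                break
            except Exception:
                if attempt == max_attempts:
                    raise  # the step kept failing; surface the error
    return records


if __name__ == "__main__":
    orders = [{"ID": 1, "Amount": 10}]          # hypothetical source data
    customers = {1: "Ada"}                      # hypothetical second source
    steps = [copy_records, reformat_keys, make_join_step(customers)]
    print(run_pipeline(orders, steps))
```

In practice each step is usually handled by separate software, as the article notes; the single-process loop here only shows how the steps compose.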
Templating in Airflow
Templating is a powerful concept in Airflow for passing dynamic information into task instances at runtime. For example, say you want to print the day of the week every time you run a task.
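The day-of-the-week example can be sketched as follows. Airflow renders Jinja expressions in templated operator fields; to stay runnable without Airflow installed, this sketch swaps in Python's standard-library string.Template to show the same idea: the command text is fixed when the task is defined, and the date is filled in when the task runs. The names COMMAND_TEMPLATE, render_command, and logical_date are assumptions made for illustration, not Airflow API.

```python
from datetime import datetime
from string import Template

# Hypothetical command template: the placeholder is resolved at run time,
# not when the task is defined.
COMMAND_TEMPLATE = Template("echo 'Today is $day_of_week'")


def render_command(logical_date: datetime) -> str:
    """Fill the template with values derived from the run's date."""
    context = {"day_of_week": logical_date.strftime("%A")}
    return COMMAND_TEMPLATE.substitute(context)


if __name__ == "__main__":
    # Each run passes its own date, so the same task definition
    # produces a different command on different days.
    print(render_command(datetime(2021, 9, 22)))
    print(render_command(datetime(2021, 9, 23)))
```

In a real DAG the scheduler supplies the run's date through the template context, so the task code never has to compute it.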
The Modern Data Stack: Open-source Edition
Most open-source products listed below are in fact open-core, i.e. primarily maintained and developed by teams that make money off consulting about, hosting, and offering “Enterprise” features for those technologies. In this post, we are not taking into account the features that are only available through the SaaS/Enterprise versions, thereby comparing only openly available solutions.
The importance of automating production ops
While running database services for AWS, Anurag Gupta, now Founder & CEO of Shoreline.io, learned the importance of going beyond monitoring and incident response by implementing incident automation that resolves common (and uncommon) production issues automatically. In this episode, he talks about the importance of automating production ops, why just using cloud hosting or containers doesn’t fully solve the problem, and how to think about building out an effective SRE team.
Always start small, no matter how big the problem is. Keeping the base clear is the key.