Hey, TechFlixers!
Every cursor movement and user interaction on a web app is an “event” that can hold valuable insight. Modern web products collect and store billions of such events every day. How exactly do they process them? Let’s take Canva as an example and explore.
🔦 Spotlight
📢 How Canva collects 25 billion events per day
The Canva Product Analytics Platform collects and processes over 25 billion events per day to provide insights on user behavior, design views, member activities, and template usage for team administrators.
The platform has a strict schema structure with forward- and backward-compatibility rules to ensure data consistency and prevent breaking changes.
The analytics events are collected through a server endpoint, validated against predefined schemas, and then enriched with additional details before being distributed to various consumers through Amazon Kinesis Data Streams (KDS).
KDS reduced Canva’s streaming costs by 85% compared to Amazon Managed Streaming for Apache Kafka (MSK) and requires far less maintenance.
The platform also falls back to SQS in case of KDS throttling or outages, ensuring no downtime.
Finally, the analytics events are delivered into Snowflake using Snowpipe Streaming and can be consumed by various backend services through KDS or SQS subscriptions.
🚀 Power Up
Understanding how Canva handles a whopping 25 billion events per day is a crash course in modern data analytics infrastructure. Let's break down the key concepts and tools they use.
Event-Driven Architecture
Event-driven architecture is a software design pattern in which the system reacts to events: changes in state that are significant to the business or application. This approach keeps systems highly responsive, scalable, and decoupled.
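As a minimal sketch of the idea (the event name and handler registry below are invented for illustration, not Canva’s actual code), producers publish events without knowing which handlers consume them:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Callable

@dataclass
class Event:
    name: str       # e.g. "design_viewed"
    payload: dict
    occurred_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )

# Handlers subscribe to event names; producers never know who is listening.
_handlers: dict[str, list[Callable[[Event], None]]] = {}

def subscribe(name: str, handler: Callable[[Event], None]) -> None:
    _handlers.setdefault(name, []).append(handler)

def publish(event: Event) -> None:
    for handler in _handlers.get(event.name, []):
        handler(event)

subscribe("design_viewed", lambda e: print("analytics:", e.payload))
publish(Event("design_viewed", {"design_id": "d-123", "user_id": "u-9"}))
```

Because the producer and its consumers only share the event itself, either side can be scaled or replaced without touching the other.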
Schema Structure with Compatibility Rules
Schema Structure defines how data is organized within a database or data processing system. Canva uses a strict schema structure to ensure:
Forward Compatibility: Older versions of the schema should be able to read data written by newer versions.
Backward Compatibility: Newer versions of the schema should be able to read data written by older versions.
These rules prevent data inconsistencies and ensure that updates to the system do not disrupt ongoing operations.
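A toy sketch of both directions (the schemas here are invented for illustration and are far simpler than a real schema registry): readers ignore unknown fields and default missing optional ones, so old and new versions can coexist.

```python
# Hypothetical, simplified event schemas for illustration only.
SCHEMA_V1 = ("design_id", "user_id")
SCHEMA_V2 = ("design_id", "user_id", "template_id")  # v2 adds an optional field

def read_event(event: dict, schema: tuple) -> dict:
    # Ignore unknown fields; default missing optional fields to None.
    return {key: event.get(key) for key in schema}

# Forward compatibility: an old (v1) reader handles data written by v2.
print(read_event({"design_id": "d-1", "user_id": "u-1", "template_id": "t-7"}, SCHEMA_V1))

# Backward compatibility: a new (v2) reader handles data written by v1.
print(read_event({"design_id": "d-1", "user_id": "u-1"}, SCHEMA_V2))
```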
Data Validation
Data Validation involves checking incoming data against predefined schemas to ensure it meets all necessary criteria before processing. This step is crucial for maintaining data integrity and quality.
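One common way to express such a check in Python is JSON Schema; the event shape below is a hypothetical stand-in, not Canva’s actual schema.

```python
from jsonschema import validate, ValidationError  # pip install jsonschema

# Hypothetical schema for a "design_viewed" event.
DESIGN_VIEWED_SCHEMA = {
    "type": "object",
    "properties": {
        "design_id": {"type": "string"},
        "user_id": {"type": "string"},
        "timestamp": {"type": "string"},
    },
    "required": ["design_id", "user_id"],
    "additionalProperties": False,
}

def is_valid(event: dict) -> bool:
    try:
        validate(instance=event, schema=DESIGN_VIEWED_SCHEMA)
        return True
    except ValidationError:
        return False  # a real pipeline would log or dead-letter the event

print(is_valid({"design_id": "d-1", "user_id": "u-9"}))  # True
print(is_valid({"design_id": 42}))                       # False: wrong type, missing field
```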
Data Enrichment
Data Enrichment is the process of enhancing collected data with additional context or details. For example, enriching an event with metadata such as timestamps, user information, or geographical location.
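A minimal sketch of server-side enrichment (the specific fields are illustrative assumptions, not Canva’s actual enrichment set):

```python
from datetime import datetime, timezone

def enrich(event: dict, request_meta: dict) -> dict:
    # Attach server-side context the client didn't (or shouldn't) supply.
    return {
        **event,
        "received_at": datetime.now(timezone.utc).isoformat(),
        "country": request_meta.get("geo_country"),  # e.g. from a GeoIP lookup
        "user_agent": request_meta.get("user_agent"),
    }

raw = {"event": "design_viewed", "design_id": "d-1"}
print(enrich(raw, {"geo_country": "AU", "user_agent": "Mozilla/5.0"}))
```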
Amazon Kinesis Data Streams (KDS)
Amazon Kinesis Data Streams (KDS) is a service for real-time data streaming, enabling scalable and fault-tolerant ingestion of large volumes of data. Key benefits include:
Cost Efficiency: For Canva’s workload, KDS proved 85% cheaper than Amazon Managed Streaming for Apache Kafka (MSK).
Low Maintenance: It requires minimal management effort, freeing up resources for other tasks.
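Producing to a stream is a single API call with boto3; the stream name below is hypothetical, and the snippet assumes AWS credentials are configured.

```python
import json
import boto3  # pip install boto3

kinesis = boto3.client("kinesis", region_name="us-east-1")

event = {"event": "design_viewed", "design_id": "d-1", "user_id": "u-9"}

# Records with the same partition key land on the same shard,
# which preserves per-key ordering.
kinesis.put_record(
    StreamName="analytics-events",  # hypothetical stream name
    Data=json.dumps(event).encode("utf-8"),
    PartitionKey=event["user_id"],
)
```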
SQS Fallback
SQS (Simple Queue Service) Fallback is a backup mechanism. If KDS throttles (rejects or slows writes when traffic exceeds provisioned capacity) or suffers an outage, the system switches to SQS so that no events are lost. This keeps ingestion reliable with zero downtime, as in the sketch below.
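One way to express the pattern (the stream name and queue URL are hypothetical; a production system would also batch, retry, and handle broader outage signals):

```python
import json
import boto3

kinesis = boto3.client("kinesis", region_name="us-east-1")
sqs = boto3.client("sqs", region_name="us-east-1")

# Hypothetical fallback queue URL.
FALLBACK_QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/analytics-fallback"

def send_event(event: dict) -> None:
    data = json.dumps(event)
    try:
        kinesis.put_record(
            StreamName="analytics-events",
            Data=data.encode("utf-8"),
            PartitionKey=event["user_id"],
        )
    except kinesis.exceptions.ProvisionedThroughputExceededException:
        # KDS is throttling: divert to SQS so the event isn't dropped.
        sqs.send_message(QueueUrl=FALLBACK_QUEUE_URL, MessageBody=data)
```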
Key Concepts of SQS
Queue: A temporary storage location for messages.
Message: A unit of data sent between services, typically in the form of a string, JSON, or XML.
Producer: The component or service that sends messages to the queue.
Consumer: The component or service that receives and processes messages from the queue.
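A compact producer/consumer round trip with boto3 (the queue URL is hypothetical):

```python
import boto3

sqs = boto3.client("sqs", region_name="us-east-1")
queue_url = "https://sqs.us-east-1.amazonaws.com/123456789012/demo-queue"  # hypothetical

# Producer: put a message on the queue.
sqs.send_message(QueueUrl=queue_url, MessageBody='{"event": "design_viewed"}')

# Consumer: long-poll for messages, process, then delete them
# (undeleted messages reappear after the visibility timeout).
resp = sqs.receive_message(
    QueueUrl=queue_url, MaxNumberOfMessages=10, WaitTimeSeconds=20
)
for msg in resp.get("Messages", []):
    print("processing:", msg["Body"])
    sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=msg["ReceiptHandle"])
```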
Types of SQS Queues
Standard Queue: Offers maximum throughput, best-effort ordering, and at-least-once delivery (a message may occasionally arrive more than once).
FIFO (First-In-First-Out) Queue: Ensures that messages are processed exactly once, in the exact order they are sent.
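Sending to a FIFO queue differs only in two extra parameters (the queue URL below is hypothetical; FIFO queue names must end in .fifo):

```python
import boto3

sqs = boto3.client("sqs", region_name="us-east-1")
fifo_url = "https://sqs.us-east-1.amazonaws.com/123456789012/events.fifo"  # hypothetical

sqs.send_message(
    QueueUrl=fifo_url,
    MessageBody='{"event": "design_viewed"}',
    MessageGroupId="user-u-9",           # ordering is guaranteed within a group
    MessageDeduplicationId="evt-00042",  # duplicates with the same id are discarded
)
```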
Snowflake and Snowpipe Streaming
Snowflake is a cloud-based data warehousing platform known for its scalability, performance, and ease of use.
Snowpipe Streaming is Snowflake's continuous data ingestion service, which allows real-time data to flow into Snowflake.
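The Snowpipe Streaming client has historically shipped as a Java SDK, so the Python below is only a shape-of-the-flow sketch: StreamingChannel is a made-up stand-in mirroring the open-channel / insert-rows pattern, not a real API.

```python
# Conceptual sketch only; not the real Snowpipe Streaming client.
class StreamingChannel:
    def __init__(self, database: str, schema: str, table: str):
        self.target = f"{database}.{schema}.{table}"

    def insert_rows(self, rows: list[dict]) -> None:
        # A real client buffers rows and commits them to the target table
        # continuously, tracking offsets per channel.
        print(f"streaming {len(rows)} rows into {self.target}")

channel = StreamingChannel("ANALYTICS", "PUBLIC", "EVENTS")
channel.insert_rows([{"event": "design_viewed", "design_id": "d-1"}])
```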
Backend Services and Subscriptions
Backend Services are server-side applications that process data and perform tasks behind the scenes, away from direct user interaction. They consume data through:
KDS Subscriptions: Real-time data feeds.
SQS Subscriptions: Queue-based data feeds, ensuring reliable message delivery even during high loads or outages.
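A bare-bones KDS subscriber with boto3 (the stream name is hypothetical; a real consumer would track every shard, checkpoint its position, or use enhanced fan-out / the Kinesis Client Library):

```python
import json
import boto3

kinesis = boto3.client("kinesis", region_name="us-east-1")
stream = "analytics-events"  # hypothetical

# Read from the first shard only, starting at the latest record.
shard_id = kinesis.describe_stream(StreamName=stream)[
    "StreamDescription"]["Shards"][0]["ShardId"]
iterator = kinesis.get_shard_iterator(
    StreamName=stream, ShardId=shard_id, ShardIteratorType="LATEST"
)["ShardIterator"]

resp = kinesis.get_records(ShardIterator=iterator, Limit=100)
for record in resp["Records"]:
    print("consumed:", json.loads(record["Data"]))
```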
Putting It All Together
By leveraging these technologies and practices, Canva ensures its Product Analytics Platform is robust, cost-effective, and capable of handling massive data volumes with high reliability. This setup allows for real-time insights into user behavior, enabling better decision-making and user experiences.
Key Takeaways
Event-Driven Architecture: Enhances responsiveness and scalability.
Strict Schema Structure: Maintains data consistency and compatibility.
Data Validation and Enrichment: Ensures high-quality, contextual data.
Amazon Kinesis Data Streams: Offers cost-effective, low-maintenance data streaming.
SQS Fallback: Provides a reliable backup mechanism.
Snowflake and Snowpipe Streaming: Facilitate real-time data warehousing and analytics.
By understanding these concepts, you can appreciate how Canva and other tech giants efficiently manage and process vast amounts of data, driving their business intelligence and operational success.
With this foundation, you're well-equipped to delve deeper into the mechanics of event-driven architectures and data analytics platforms. Happy learning!