Hi there Ash,
I believe you’re referring to joining/merging the two Parquet files for Deliveries and Metrics into one table in BigQuery?
The reason you are running into an issue is because each table uses a different primary key (unique identifier for rows in that table). The deliveries table uses delivery_id
, while the Metrics table uses event_id
(as you’ve mentioned!). The reason each delivery could have multiple event ids in the Metrics table is that each message can generate multiple delivery events (delivered, opened, clicked, etc.).
With that said, there isn’t really a “correct” answer to your question. As how you model your data is up to you depending on your use case and company goal to achieve with the merged table.
I hope the above helps to provide some background into what is happening and assist you in making a decision about what you would like to do with the data tables.
- Maddie