Question

Wrong Schema Specification - Data Warehouse Sync (Attributes)

  • 12 December 2022
  • 2 replies

Hey everyone!

 

We’ve turned on the Data Warehouse Sync feature to sync data from Customer.io into Snowflake (documentation here).

 

However, while developing our internal pipelines to handle the data, we found what looks like an error in the documented schema for the Attributes objects: the field “attribute_name” is not marked as “Primary Key”. This can cause mistakes for developers who do not know how the data is structured inside the files: if they assume that “internal_customer_id” alone is the PK, they will aggregate to one row per customer ID instead of one row per customer ID and attribute.
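To illustrate the pitfall, here is a minimal Python sketch. The sample rows are made up, and the "attribute_value" column name is an assumption for illustration; only "internal_customer_id" and "attribute_name" are taken from the schema discussed above.

```python
# Hypothetical sample rows; "attribute_value" is an assumed column name.
rows = [
    {"internal_customer_id": "c1", "attribute_name": "plan", "attribute_value": "pro"},
    {"internal_customer_id": "c1", "attribute_name": "city", "attribute_value": "Lisbon"},
    {"internal_customer_id": "c2", "attribute_name": "plan", "attribute_value": "free"},
]

# Treating internal_customer_id alone as the key: later rows overwrite
# earlier ones, so each customer ends up with a single attribute.
by_customer = {row["internal_customer_id"]: row for row in rows}

# Keying on (internal_customer_id, attribute_name) keeps every attribute.
by_customer_and_attribute = {
    (row["internal_customer_id"], row["attribute_name"]): row for row in rows
}

print(len(by_customer))                # customers only
print(len(by_customer_and_attribute))  # customer/attribute pairs
```

With the customer ID as the only key, the "plan" row for customer c1 is silently replaced by the "city" row.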

 

Does anyone know where I can report this?

 

Thanks!


2 replies


Hey there, 

 

Thanks for pointing this out. You can send these types of questions to win@customer.io so that our technical support team can look into them and make the necessary adjustments.

Good catch!
 
A person can have more than one change to the same attribute, and a person might have changes to multiple attributes, so neither of these fields should be marked as a primary key. To find a person's latest attribute change or value, use the timestamp together with the internal ID and attribute name.
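As a rough sketch of that lookup (not an official implementation), the Python below keeps the newest row for each internal ID and attribute name pair. The "timestamp" and "attribute_value" column names, the helper name `latest_attributes`, and the sample rows are assumptions for illustration; check them against the files you actually receive.

```python
def latest_attributes(rows):
    """Keep the newest row per (internal_customer_id, attribute_name) pair."""
    latest = {}
    for row in rows:
        key = (row["internal_customer_id"], row["attribute_name"])
        current = latest.get(key)
        # Assumes "timestamp" is comparable (e.g. a Unix epoch integer).
        if current is None or row["timestamp"] > current["timestamp"]:
            latest[key] = row
    return latest


# Hypothetical sample data: customer c1 changed "plan" twice.
rows = [
    {"internal_customer_id": "c1", "attribute_name": "plan",
     "attribute_value": "free", "timestamp": 100},
    {"internal_customer_id": "c1", "attribute_name": "plan",
     "attribute_value": "pro", "timestamp": 200},
    {"internal_customer_id": "c1", "attribute_name": "city",
     "attribute_value": "Lisbon", "timestamp": 150},
    {"internal_customer_id": "c2", "attribute_name": "plan",
     "attribute_value": "free", "timestamp": 120},
]

latest = latest_attributes(rows)
print(latest[("c1", "plan")]["attribute_value"])
```

The same idea carries over to a warehouse query: partition by internal ID and attribute name, order by timestamp descending, and keep the first row of each partition.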
 
I'll make sure that we update the docs accordingly.
 
On that note, we're probably using "primary key" and "foreign key" in a misleading way. We really mean "data that is unique within this or another Parquet file"; we're obviously not defining your database's schema. I'll see if we can figure out some better terms to describe these data relationships.
 
Sorry for the mix-up!
