Since dlt generates a schema, tracks schema evolution, contains lineage information, and follows the data vault standard, it can easily provide metadata or lineage info to other tools.
At the same time, dlt is a pipeline building tool first - so if people want to read metadata from somewhere and store it elsewhere, they can.
If you mean taking in metadata the way we integrate with arrow - whether the community wants that or would find it useful remains to be seen. We will not develop plugins just to collect cobwebs, but if there are interested users we will add it to our backlog.
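For a concrete picture, here is a minimal sketch of pulling that metadata out of a pipeline, assuming dlt's documented load info from `pipeline.run()` and the `default_schema.to_pretty_yaml()` export; the pipeline and data are hypothetical:

```python
import dlt

# A hypothetical pipeline loading a few rows into DuckDB.
pipeline = dlt.pipeline(
    pipeline_name="metadata_demo",
    destination="duckdb",
    dataset_name="demo",
)

load_info = pipeline.run(
    [{"id": 1, "name": "alice"}, {"id": 2, "name": "bob"}],
    table_name="users",
)

# load_info carries metadata about the load: package ids, timings, failed jobs.
print(load_info)

# The inferred schema (tables, columns, types, hints) can be exported as YAML
# and handed to catalog or lineage tools.
print(pipeline.default_schema.to_pretty_yaml())
```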
Thanks for the response.
I also noticed a mention of data contracts or Pydantic for keeping your data clean. Would it make sense to embed that as part of a dlt pipeline, or is the recommendation to include it as part of the transformation step?
We have a PR (https://github.com/dlt-hub/dlt/pull/594) that is about to merge and makes schema evolution highly configurable, anywhere between free evolution and hard stopping (see the sketch after this list):
- you will be able to totally freeze the schema and reject bad rows
- or accept data for existing columns but not new columns
- or accept some fields based on rules
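To make that concrete, a hedged sketch of how such contracts could be declared once the PR lands, assuming the `schema_contract` argument and Pydantic-model `columns` support described in dlt's documentation; the resource, modes, and pipeline names are illustrative:

```python
import dlt
from pydantic import BaseModel

# A Pydantic model used as a data contract for the resource's rows.
class User(BaseModel):
    id: int
    name: str

# columns=User derives the table schema from the model; schema_contract
# picks how to treat rows that would change that schema.
@dlt.resource(
    name="users",
    columns=User,
    schema_contract={
        "tables": "evolve",          # new tables are allowed
        "columns": "freeze",         # reject rows that introduce new columns
        "data_type": "discard_row",  # drop rows whose types don't fit
    },
)
def users():
    yield {"id": 1, "name": "alice"}

pipeline = dlt.pipeline(pipeline_name="contracts_demo", destination="duckdb")
pipeline.run(users())
```

Declaring the contract on the resource keeps validation inside the load step, so bad rows are handled before any downstream transformation runs.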