WHT Project

Folder Structure

HelloWhtProject
├── wht_project.yaml
└── models
    └── ml-features.yaml

wht_project.yaml

The project file contains the name of the profile to be used, along with the locations of the models.
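As a rough illustration, a project file for the folder structure above might look like the sketch below. The key names (name, profile, model_folders) and the profile name my_profile are assumptions made for this example, not confirmed schema; check the WHT reference for the exact fields.

```yaml
# Hypothetical sketch of a project file; key names are assumed, not confirmed.
name: HelloWhtProject   # project name, matching the folder shown above
profile: my_profile     # assumed key: name of the profile to be used
model_folders:          # assumed key: locations of the model files
  - models
```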

Models and Features

A model file, written in YAML, describes the sources from which data is fetched and the conditions for creating the output table. A feature is a single value derived by performing a calculation or aggregation on a set of values.

The different kinds of models that can be created are:

  • ID Stitcher - Create a detailed user journey across platforms and devices by stitching together matching data across sources.

  • Feature Tables - Create output tables with computations, such as aggregations and averages, on selected fields.

  • Feature Tables ML Notebook - Bring in ML notebooks from Jupyter to add ML/AI capabilities.

  • Feature Tables External - Fetch data from existing tables in your warehouse that are created by sources like dbt and Airbyte.

Timestamp of WHT Model Inputs

Whenever RudderStack Events or Cloud Extract loads data into a warehouse, it records the date and time of the load in a timestamp column. For example, if you want your models to fetch only the data loaded in the last 24 hours, specify validity_time: 24h in the model YAML file.
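A minimal sketch of where this setting could sit in a model file is shown below. Only validity_time: 24h comes from the text above; the surrounding keys (models, name, model_type, model_spec) and the model name are assumptions for illustration and may differ from the actual schema.

```yaml
# Hypothetical model entry; only validity_time is taken from the text above.
models:
  - name: example_id_stitcher    # hypothetical model name
    model_type: id_stitcher      # assumed key and value
    model_spec:                  # assumed container for model settings
      validity_time: 24h         # fetch only data loaded in the last 24 hours
```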

To fetch all the data from the source tables irrespective of timestamp, pass the timeless parameter to the run command:

wht run -t timeless