Feature Tables¶
The key with which you assign the values for creating a feature table. A list of features can be defined, with each feature having multiple tablevar and/or macro.:
- name: domain_profile
model_type: feature_table_model
model_spec:
validity_time: 24h # 1 day
id_stitching:
model: domain_profile_id_stitcher
features:
- tablefeature:
name: is_mql
ref:
input: salesForceLeadsTable
value: max(case when salesForceLeadsTable.mql__c == 'True' then 1 else 0 end)
description: Whether a domain is mql or not
- tablevar:
name: domainSummary_account_type
ref:
input: domainSummary
value: domainSummary.account_type
- macro:
name: call_conversion
value: type = 'Call' or tasksubtype='Call' or type like '%demo%') and \
(rlike(lower(subject), '.*\\bcall\\b.*') or rlike(lower(subject), '.*\\bdemo\\b.*'
- tablevar:
name: salesCallEventTime
ref:
input: salesforceTasks
value: min(activitydate)
filter: macro(call_conversion)
- feature:
name: sales_conversion_timestamp
value: salesCallEventTime
- tablevar:
name: salesEvents
ref:
input: salesforceTasks
value: json_agg(activitydate, case when (type='Email' or tasksubtype = 'Email') then case when lower(subject) like '%[in]%' then 'sf_inbound_email' \
else 'sf_outbound_email' end when macro(call_conversion) then 'sf_call' else null end as event)
description: Salesforce touches are converted to one of following events - sf_inbound_email, sf_outbound_email, sf_call, null
- tablevar:
name: page_events_var
ref:
input: websitePageVisits
filter: timestamp < sales_conversion_timestamp and timestamp > var('start_date')
value: json_agg(timestamp, case when split_part(trim(context_page_path, '/'), '/', 1) in var('valid_page_paths') then split_part(trim(context_page_path, '/'), '/', 1) \
else var('default_page_path') end as event)
- feature:
name: page_events
value: page_events_var
- tablevar:
name: webhookFormSubmit
ref:
input: webhookSource
filter: variable_1 is null and timestamp < sales_conversion_timestamp and timestamp > var('start_date')
value: min(timestamp)
- feature:
name: formSubmit
value: min(webhookFormSubmit, websiteFormSubmit)
Let’s look at each field to know more about it.
name
Name of the feature table created on the DW. Say if you define this as then the output table will be named something like |
|
type |
string |
model_type
Define the types of models. |
|
type |
string |
model_spec
Give specifications of ID stitching |
||
type: list |
||
properties |
||
|
Defines the ID stitcher to be used. |
|
type |
string |
|
|
Fetch data which was loaded upto this time duration. Say you want to fetch all data that’s been loaded upto last 1 day, so please set this value to 24h. |
|
type |
time |
|
|
The key with which you assign the values for creating a feature table. A list of features can be defined, with each feature having multiple tablevar and/or macro. |
|
properties: |
||
type |
list |
tablevar and tablefeature
Here you define the source data, along with any conditions.tablevars are effectively helper columns which ARE NOT included in the feature table tablefeatures ARE included in the output feature table. |
|||
properties |
|||
|
This is the name in which you’ll store the retrieved data. |
||
type |
string |
||
|
properties:
|
||
type |
list |
||
|
Optional, in case you want to apply any conditions for fetching data (same as where clause in SQL). |
||
type |
string |
||
|
The data that will be stored in the name. |
||
type |
string |
||
|
Optional, a default value for any entities for which the calculated value would otherwise be NULL. |
||
type |
string |
||
|
Optional, a textual description of what it does. |
||
type |
string |
||
|
Optional, for specifying a window over which the value should be calculated.Note: You can have more than one tablevar, for defining a feature. |
||
type |
object |
||
properties |
|||
|
Optional, list of sql expressions to use in ordering the data. |
||
type |
list |
||
|
Not allowed in tablevars / tablefeatures. These are already effectively partitioned by id stitched entity, and can not be partitioned any other way. |
||
type |
list |
inputvar
With syntax identical to tablevar, the difference is that instead of each value being associated a row of the feature table, it’s associated with a row of the specified input. Effectively, while a tablevar can be thought of as adding a helper column to the feature table, an inputvar can be thought of as adding a helper column to the input. |
|||
properties |
|||
|
This is the name in which you’ll store the retrieved data. |
||
type |
string |
||
|
properties:
|
||
type |
list |
||
|
Optional, in case you want to apply any conditions for fetching data (same as where clause in SQL). |
||
type |
string |
||
|
The data that will be stored in the name. |
||
type |
string |
||
|
Optional, a default value for any entities for which the calculated value would otherwise be NULL. |
||
type |
string |
||
|
Optional, a textual description of what it does. |
||
type |
string |
||
|
Optional, for specifying a window over which the value should be calculated. |
||
type |
object |
||
properties |
|||
|
Optional, list of sql expressions to use in ordering the data. |
||
type |
list |
||
|
Optional, list of sql expressions to use in partitioning the data. |
||
type |
list |
macro
An optional section, in case you wish to perform a computation and store in a variable, to be referenced by a tablevar. |
||
properties |
||
|
The name in which you will store the calculated data. |
|
type |
string |
|
|
Regular expression or SQL clause to calculate the value. This section, if defined, proceedes |
|
type |
string |
feature
This is the computed feature. |
|||
properties |
|||
|
Define a name for the feature. |
||
type |
string |
||
|
The value that will be stored in the feature. It can be same as a You can refer multiple tablevar or a |
||
type |
string |
||
|
Optional, in case you want to apply any conditions for fetching data (same as where clause in SQL). |
||
type |
string |
||
|
Optional, a default value for any entities for which the calculated value would otherwise be NULL. |
||
type |
string |
||
|
Optional, a textual description of what it does. |
||
type |
string |
||
|
Optional, for specifying a window over which the value should be calculated. |
||
type |
object |
||
properties |
|||
|
Optional, list of sql expressions to use in ordering the data. |
||
type |
list |
||
|
Optional, list of sql expressions to use in partitioning the data. |
||
type |
list |
Other changes¶
As we are creating a Feature Table, set the value of resource_type to feature_table_model
.
resource_type: feature_table_model
Feature Table Code Examples¶
See Feature table sample.
Feature Tables ML¶
Models from ML Notebooks
You can take ML Notebooks in Jupyter and add ML/AI capabilities to it.
Feature Tables External¶
Integration with DBT
WHT supports DBT models so that customers on that platform can quickly start using the tool. Though WHT also does transformation like DBT, it empowers users who are already more comfortable in that platform.