Troubleshooting
This section describes how to debug and troubleshoot the execution of data models.
General
Even though my DW username and password are correct, I get an ‘Authentication FAILED’ error when executing the run/compile commands. Why?
Snowflake returns this error even when your DW credentials are correct if your user does not have the rights to read and write data. To resolve it, ask your administrator to change your role or grant these privileges.
After my run completed, I tried to view the results on the DW using the name displayed on screen. I typed SELECT * FROM "MY_WAREHOUSE"."MY_SCHEMA"."Material_domain_profile_c0635987_6". I am now getting the error “Object does not exist or not authorized”.
Remove the double quotes and run the query again, that is, SELECT * FROM MY_WAREHOUSE.MY_SCHEMA.Material_domain_profile_c0635987_6. By default, Snowflake treats double-quoted identifiers as case-sensitive, whereas it resolves unquoted identifiers as uppercase.
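The difference between the two queries can be illustrated with a minimal Python sketch. It only approximates Snowflake's default identifier rules (quoted names keep their case, unquoted names are folded to uppercase) and assumes the table was created with an unquoted name, so it is stored in uppercase. The function name resolve_identifier is hypothetical and is not part of any Snowflake or wht API.

```python
def resolve_identifier(identifier: str) -> str:
    """Approximate how Snowflake resolves one identifier part by default."""
    is_quoted = (
        len(identifier) >= 2
        and identifier.startswith('"')
        and identifier.endswith('"')
    )
    if is_quoted:
        # Quoted: the case is preserved exactly as written.
        return identifier[1:-1]
    # Unquoted: the name is folded to uppercase before lookup.
    return identifier.upper()


quoted = resolve_identifier('"Material_domain_profile_c0635987_6"')
unquoted = resolve_identifier("Material_domain_profile_c0635987_6")

print(quoted)    # mixed-case name, which does not match an uppercase table
print(unquoted)  # uppercase name, which matches a table stored in uppercase
```

Under these assumptions, the quoted form looks for a mixed-case object that does not exist, while the unquoted form resolves to the uppercase name.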
When I try to create the Hello World project by executing the command wht init wht-project, I get the message “Error: mkdir HelloWhtProject: file exists”.
That’s because the folder HelloWhtProject already exists. You can rename or remove that folder, or create the project under a different name by passing that name as a parameter.
Command Progress & Lifecycle
I have executed the run command. Is there any way to track the status of its queries?
When you execute the run command, a link to the data warehouse is displayed on your screen. Copy it into the address bar of your browser to open the page where you can see the status of the running queries.
A command I executed is taking too long. Is there any way to kill the processes on the DW?
If commands such as discover or run take too long to execute, other queries may be running simultaneously on the same warehouse. To clear them, open the Queries tab on your DW (Snowflake) and manually kill the long-running processes.
ID Stitcher
My DW has many large connected components. To increase the accuracy of the stitched data, I want to increase the number of iterations. Is that possible?
The default value of the largest diameter, that is, the longest path length in a connected component, is 30. To increase it, define the key max_iterations in the ID Stitcher YAML file and set its value to the maximum diameter of the connected components. Note, however, that with a large number of iterations the algorithm can give incorrect results.
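To see why an iteration limit is tied to the diameter of a connected component, consider the following generic sketch. It is not the ID Stitcher's actual algorithm; it assumes a simple synchronous min-label propagation in which every node adopts the smallest label among itself and its neighbors on each pass. The function propagate_labels and its max_iterations argument are illustrative names only.

```python
from collections import defaultdict


def propagate_labels(edges, max_iterations):
    """Min-label propagation. Returns (labels, converged)."""
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)

    # Every node starts in its own cluster, labeled by itself.
    labels = {node: node for node in neighbors}
    for _ in range(max_iterations):
        updated = {
            node: min([labels[node]] + [labels[n] for n in neighbors[node]])
            for node in neighbors
        }
        if updated == labels:
            return labels, True  # nothing changed, so the labels are stable
        labels = updated
    return labels, False  # stopped at the limit before stabilizing


# A chain 0-1-2-3-4-5 is a single connected component of diameter 5.
chain = [(i, i + 1) for i in range(5)]

labels_short, converged_short = propagate_labels(chain, max_iterations=3)
labels_full, converged_full = propagate_labels(chain, max_iterations=10)

print(converged_short, len(set(labels_short.values())))
print(converged_full, len(set(labels_full.values())))
```

In this sketch, a limit smaller than the diameter stops before the smallest label has reached the far end of the chain, so one component is still split across several labels.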
YAML
Are there any best practices I should follow when writing YAML?
Keep these points in mind; otherwise, you may get an error.
Use spaces instead of tabs.
Always use proper casing: for example, id_stitching and not id_Stitching.
Make sure that the source table you are referring to exists on the DW and that data has been loaded into it.
Make sure that the syntax you have written is correct, as shown in the sample code.
Indentation is meaningful in YAML, so make sure that each line is indented to the same level as in the sample files.
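The tab and indentation points above can be checked before running a project with a small standard-library script. This is a heuristic sketch, not a YAML validator: it assumes a fixed indent width of two spaces, which YAML itself does not require, and the keys in the sample text are made up for illustration.

```python
def find_indentation_problems(yaml_text: str, indent_width: int = 2):
    """Return (line_number, message) pairs for suspicious indentation."""
    problems = []
    for line_number, line in enumerate(yaml_text.splitlines(), start=1):
        stripped = line.lstrip(" \t")
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comment-only lines
        indent = line[: len(line) - len(stripped)]
        if "\t" in indent:
            problems.append((line_number, "tab in indentation"))
        elif len(indent) % indent_width != 0:
            problems.append(
                (line_number, f"indent of {len(indent)} spaces is not a multiple of {indent_width}")
            )
    return problems


# Hypothetical YAML text: line 3 is indented with a tab, line 4 with three spaces.
sample = "parent:\n  child: 1\n\tbroken: 2\n   odd: 3\n"

problems = find_indentation_problems(sample)
for line_number, message in problems:
    print(f"line {line_number}: {message}")
```

To check a real file, read its contents with open(path).read() and pass the text to the function.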
My YAML has many features. How do I debug step by step? How do I run up to a particular feature/macro/tablevar?
The wht run command accepts the parameter --model_args in the format modelName:argType:argName. It lets you run up to a specific feature/tablevar. For example:
./wht run -w samples/attribution --model_args domain_profile:breakpoint:blacklistFlag
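The value passed to --model_args in the command above breaks down into three colon-separated parts. The following sketch only illustrates that format; it is not how wht parses the argument internally, and the names ModelArg and parse_model_arg are hypothetical.

```python
from typing import NamedTuple


class ModelArg(NamedTuple):
    model_name: str
    arg_type: str
    arg_name: str


def parse_model_arg(value: str) -> ModelArg:
    """Split a modelName:argType:argName string into its three parts."""
    parts = value.split(":")
    if len(parts) != 3:
        raise ValueError(f"expected modelName:argType:argName, got {value!r}")
    return ModelArg(*parts)


parsed = parse_model_arg("domain_profile:breakpoint:blacklistFlag")
print(parsed.model_name)  # the model to run
print(parsed.arg_type)    # the kind of argument
print(parsed.arg_name)    # the feature/tablevar to stop at
```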
Control Access
I have two separate roles, one to read from the input tables and one to write to the output tables. How should the roles be defined?
You need to create an additional role defined as a union of the two roles. WHT runs must be able to read the input tables and write results back to a schema in the warehouse. Furthermore, each run is executed using a single role, specified in the matching profile’s section of the site config. For security, it is best to create a new role that has read access to all relevant inputs and write access to the output schema. Alternatively, reuse an existing role that has at least those permissions.
How do I test whether the role in use has sufficient privileges to access the objects in the warehouse and run the project?
Use the wht validate access command to validate the role's access privileges on all the input/output objects. See the validate section of the CLI Reference for more information.