Lineage
The Lineage page is the main interface to Recce. It is where you can quickly determine the zone of impact of any modeling changes.
Lineage Diff
It's from Lineage Diff that you will determine which models to investigate further to validate your changes.
Node Summary
- Models are color coded to indicate added, removed, and modified models.
- The bottom icon indicates whether a row count change or schema change is detected. The row count changed icon is only shown if a row count diff has been executed on this node.
- Click a model to view the Node detail and perform other checks.
Select Models
Click the Select models button to select multiple nodes for further operations. For details, see the [Multi Nodes Selection](#multi-nodes-selection) section.
Filter Nodes
Click the Filter nodes button to view the nodes from different aspects.
- View Mode:
    - Changed Models: Modified nodes, their downstream nodes, and their first-degree parents.
    - All: Show all nodes.
- Package: Filter by dbt package names.
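To make the Changed Models rule concrete, here is a minimal sketch of how such a node set could be computed from a lineage graph. It is not Recce's implementation: the `parent_map` structure, the function name, and the model names are hypothetical, and the sketch assumes that "first-degree parents" means the direct parents of the modified nodes only.

```python
from collections import defaultdict


def changed_models_view(parent_map, modified):
    """Return modified nodes, all their downstream nodes, and their direct parents."""
    # Invert the parent map so we can walk from a node to its children.
    children = defaultdict(set)
    for node, parents in parent_map.items():
        for parent in parents:
            children[parent].add(node)

    visible = set(modified)

    # Walk downstream from every modified node.
    stack = list(modified)
    while stack:
        node = stack.pop()
        for child in children[node]:
            if child not in visible:
                visible.add(child)
                stack.append(child)

    # Add the first-degree parents of the modified nodes (assumption: not of downstream nodes).
    for node in modified:
        visible.update(parent_map.get(node, ()))
    return visible


# Hypothetical lineage: each key lists its direct parents.
parent_map = {
    "stg_orders": ["raw_orders"],
    "customers": ["raw_customers"],
    "orders": ["stg_orders", "customers"],
    "revenue": ["orders"],
}
print(sorted(changed_models_view(parent_map, {"orders"})))
```

With `orders` modified, the view keeps `orders`, its downstream model `revenue`, and its direct parents `stg_orders` and `customers`, while the raw sources two levels up are hidden.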
Node Detail
Schema Diff
Note
Schema Diff requires catalog.json in both environments.
Schema Diff shows added, removed, and renamed columns. Click a model in the Lineage DAG Diff to view the Schema Diff.
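As a rough illustration of what a schema comparison involves, the sketch below compares two column-to-type mappings, such as those that could be read from each environment's catalog.json. The function name and input shape are assumptions, and rename detection is deliberately left out because this page does not describe how Recce identifies renamed columns.

```python
def schema_diff(base_columns, current_columns):
    """Compare two {column_name: data_type} mappings and report added and removed columns."""
    base_names = set(base_columns)
    current_names = set(current_columns)
    return {
        # Columns present only in the current environment, in their current order.
        "added": [c for c in current_columns if c not in base_names],
        # Columns present only in the base environment, in their base order.
        "removed": [c for c in base_columns if c not in current_names],
    }


# Hypothetical column listings for one model in each environment.
base = {"order_id": "integer", "status": "text", "amount": "numeric"}
current = {"order_id": "integer", "status": "text", "net_amount": "numeric", "tax": "numeric"}
print(schema_diff(base, current))
```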
Row Count Diff
Row Count Diff shows the difference in row count between the base and current environments.
Code Diff
- Select the model from the Lineage DAG.
- Click the Diff button in the upper right corner.
Value Diff
Note
Value Diff uses the compare_column_values macro from audit-helper. To use Value Diff, ensure that audit-helper is installed in your project.
Value Diff shows the matched count and percentage for each column in the table. It uses the primary key(s) to uniquely identify records of the model in both environments.
The primary key is automatically inferred from the first column with the unique test. If no primary key is detected, you must specify at least one column as the primary key.
- Added: Newly added PKs.
- Removed: Removed PKs.
- Matched: For a column, the count of matching values among the common PKs.
- Matched %: For a column, the ratio of matched values to common PKs.
You can query all the diff records from the value diff result.
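The four result fields can be illustrated with a small, self-contained sketch. This is not the audit-helper macro or Recce's code: the function name, the row format (lists of dictionaries), and the sample data are assumptions made only to show how Added, Removed, Matched, and Matched % relate to the primary key.

```python
def value_diff(base_rows, current_rows, primary_key, columns):
    """Summarize per-column matches between two row sets keyed by a primary key."""
    base = {row[primary_key]: row for row in base_rows}
    current = {row[primary_key]: row for row in current_rows}
    common = base.keys() & current.keys()

    summary = {
        "added": len(current.keys() - base.keys()),    # PKs only in current
        "removed": len(base.keys() - current.keys()),  # PKs only in base
        "columns": {},
    }
    for column in columns:
        # Count common PKs whose value for this column is identical in both environments.
        matched = sum(1 for pk in common if base[pk].get(column) == current[pk].get(column))
        summary["columns"][column] = {
            "matched": matched,
            "matched_pct": matched / len(common) if common else None,
        }
    return summary


# Hypothetical rows for the same model in the base and current environments.
base_rows = [
    {"id": 1, "status": "paid", "amount": 10},
    {"id": 2, "status": "open", "amount": 20},
    {"id": 3, "status": "open", "amount": 30},
]
current_rows = [
    {"id": 2, "status": "paid", "amount": 20},
    {"id": 3, "status": "open", "amount": 30},
    {"id": 4, "status": "open", "amount": 40},
]
print(value_diff(base_rows, current_rows, "id", ["status", "amount"]))
```

In this sample, PKs 2 and 3 are common, so `amount` matches on both of them while `status` matches on only one.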
Profile Diff
Note
Profile Diff uses the get_profile macro from dbt-profiler. To use Profile Diff, ensure that dbt-profiler is installed in your project.
Profile Diff compares basic statistics (e.g. count, distinct count, min, max, average) for each column between the two environments.
- Select the model from the Lineage DAG.
- Click the Advanced Diffs button
Refer to dbt-profiler for the definitions of the profiling statistics.
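To give a feel for the comparison, the sketch below profiles one column in each environment and pairs the statistics side by side. The statistic definitions here are simplified assumptions (for example, nulls are counted in `count` but ignored elsewhere) and may differ from dbt-profiler's; the function names are hypothetical.

```python
from statistics import mean


def profile_column(values):
    """Compute a few basic statistics for one column's values."""
    non_null = [v for v in values if v is not None]
    return {
        "count": len(values),
        "distinct_count": len(set(non_null)),
        "min": min(non_null) if non_null else None,
        "max": max(non_null) if non_null else None,
        "avg": mean(non_null) if non_null else None,
    }


def profile_diff(base_values, current_values):
    """Pair each statistic as (base, current) so differences are easy to spot."""
    base = profile_column(base_values)
    current = profile_column(current_values)
    return {stat: (base[stat], current[stat]) for stat in base}


# Hypothetical values of an `amount` column in each environment.
base_amounts = [10, 20, 30, None]
current_amounts = [10, 20, 30, 40]
for stat, pair in profile_diff(base_amounts, current_amounts).items():
    print(stat, pair)
```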
Histogram Diff
Histogram Diff compares the distribution of a numeric column in an overlay histogram chart.
- Select the model from the Lineage DAG.
- Click the Advanced Diffs button and select Histogram Diff.
- Select the column to diff and click Execute.
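An overlay histogram is only comparable if both environments share the same bins. The following sketch shows one way to do that by deriving equal-width bin edges from the combined value range. It is an illustration, not Recce's binning logic: the function name, the default of five bins, and the sample values are assumptions, and it expects at least one value overall.

```python
def histogram_diff(base_values, current_values, num_bins=5):
    """Bucket both value sets into the same equal-width bins."""
    combined = list(base_values) + list(current_values)
    low, high = min(combined), max(combined)
    # Fall back to a width of 1 when every value is identical.
    width = (high - low) / num_bins or 1

    def counts(values):
        result = [0] * num_bins
        for v in values:
            # The maximum value falls on the last edge, so clamp it into the last bin.
            index = min(int((v - low) / width), num_bins - 1)
            result[index] += 1
        return result

    edges = [low + i * width for i in range(num_bins + 1)]
    return edges, counts(base_values), counts(current_values)


# Hypothetical numeric column values in each environment.
base_values = [0, 1, 2, 3, 4]
current_values = [4, 6, 8, 10, 10]
edges, base_counts, current_counts = histogram_diff(base_values, current_values)
print(edges)
print(base_counts)
print(current_counts)
```

Plotting `base_counts` and `current_counts` against the same `edges` gives the overlay; a shift in the distribution shows up as counts moving between bins.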
Top-K Diff
Top-K Diff compares the distribution of a categorical column. The top 10 elements are shown by default. This can be expanded to the top 50 elements.
- Select the model from the Lineage DAG.
- Click the Advanced Diffs button and select Top-K Diff.
- Select the column to diff and click Execute.
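The idea behind a top-K comparison can be sketched with a pair of counters. The ranking rule used here (order by count in the current environment, then by base count, then by name) is an assumption for illustration; this page does not state how Recce orders the categories, and the function name is hypothetical.

```python
from collections import Counter


def top_k_diff(base_values, current_values, k=10):
    """Return (category, base_count, current_count) for the k most frequent categories."""
    base_counts = Counter(base_values)
    current_counts = Counter(current_values)
    categories = set(base_counts) | set(current_counts)
    # Assumed ranking: current count first, then base count, then name for stable ties.
    ranked = sorted(
        categories,
        key=lambda c: (-current_counts[c], -base_counts[c], str(c)),
    )
    return [(c, base_counts[c], current_counts[c]) for c in ranked[:k]]


# Hypothetical values of a `status` column in each environment.
base_status = ["paid", "paid", "open", "refunded"]
current_status = ["paid", "open", "open", "open"]
for row in top_k_diff(base_status, current_status, k=2):
    print(row)
```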
Multi Nodes Selection
Select Models
- Click the Select models button.
- Select one or more nodes.
- Alternatively, right-click a node and choose Select parent nodes or Select child nodes.
- Click the action in the multi-select control bar.
Row Count Diff
Row Count Diff shows the difference in row count between the base and current environments.
Value Diff
Screenshot
The diff result includes a Copy to Clipboard button. It copies the result image to the clipboard so that you can paste it into your PR comment.
Note
Firefox does not support copying images to the clipboard, so Recce shows a modal instead. You can download the image or right-click it to copy it.
Add to Checklist
On the Lineage page, you can run different types of checks. There are two reasons to add a check to the checklist:
- Keep the check so that you can rerun it after a code change.
- Record your result and interpretation for review purposes.
To add a check to the checklist:
- Lineage
    - All nodes: Click the Add lineage diff check button to add all lineage
    - Partial nodes: Click the Select models button > select nodes > Click Add lineage check
- Schema
    - Single node: Click a model > Add check > Schema check
    - Multiple nodes: Click the Select models button > select nodes > Click Add schema check
- Row count diff:
    - Click the Select models button
    - Select nodes
    - Click Row count diff
    - Select a model
    - Click Add to checklist
- Other Diffs:
    - Execute the diff
    - Click Add to checklist