Documentation
Validation Rules Guide
Validation rules are the heart of data quality enforcement with Sparvi. They allow you to define specific expectations about your data and verify if those expectations are met.
Understanding Validation Rules
A validation rule in Sparvi consists of:
- Name: A unique identifier for the rule
- Description: A human-readable explanation of what the rule checks
- Query: The SQL query that performs the validation check
- Operator: How to compare the query result with the expected value
- Expected Value: The value that indicates a passing validation
For example, a rule might check if there are any orders with zero or negative amounts by counting rows where amount is less than or equal to zero. The rule passes if the count equals 0, meaning there are no such orders.
Validation Rule Anatomy
Query
The query is a SQL statement that returns a single value (typically a COUNT or other aggregate). This value will be compared against the expected value using the specified operator.
Common validation queries include:
- Checking for null values in required columns
- Validating data formats (emails, dates, etc.)
- Ensuring values fall within reasonable date ranges
- Verifying referential integrity between related tables
Operators
Sparvi supports several comparison operators:
- equals: The query result must exactly match the expected value
- greater_than: The query result must be greater than the expected value
- less_than: The query result must be less than the expected value
- between: The query result must fall between two values (requires an array for expected_value)
- not_equals: The query result must not match the expected value (opposite of equals)
Expected Value
The expected_value field defines what the query result should be for the validation to pass. The type depends on the operator:
- For
equals,greater_than,less_than, andnot_equals: A single value (number, string, or boolean) - For
between: An array of two values[min, max]
Types of Validation Rules
Data Integrity Rules
These rules check basic integrity constraints:
- Primary Key Uniqueness: Ensure all IDs are unique with no duplicates
- Foreign Key Validity: Verify that all foreign key references point to valid records in related tables
- Not Null Constraints: Check that required fields contain values
Data Quality Rules
These rules check for data quality issues:
- Format Validation: Ensure emails, phone numbers, and other formatted fields follow valid patterns
- Range Checks: Verify numeric values fall within reasonable bounds (e.g., ages between 18 and 120)
- Date Validity: Confirm dates are not in the future when they shouldn't be
- Pattern Consistency: Check that text fields match expected patterns
Business Logic Rules
These rules enforce business-specific requirements:
- Conditional Requirements: Ensure premium users have required information like phone numbers
- Transaction Verification: Check that high-value orders or transactions have been properly verified
- Workflow Validation: Verify records progress through business processes correctly
- Relationship Rules: Ensure related entities maintain expected relationships
Distribution Rules
These rules check data distributions:
- Value Distribution: Ensure no single value dominates the data (e.g., no status represents more than 80% of records)
- Balance Checks: Verify data is distributed reasonably across categories
- Ratio Validation: Check that proportions between different values remain within expected ranges
Working with Validation Rules in Sparvi Cloud
Creating Validation Rules
In Sparvi Cloud, you can create validation rules through the intuitive web interface:
- Navigate to Validation Rules: Access the Validation Rules section from the main navigation
- Create New Rule: Click "Create Rule" to open the rule builder
- Define Rule Properties:
- Enter a unique name for your rule
- Write a clear description of what the rule checks
- Write or select a SQL query that performs the validation
- Choose the comparison operator (equals, greater_than, less_than, between, not_equals)
- Set the expected value that indicates a passing validation
- Test Your Rule: Run a test to verify the rule works as expected
- Save and Activate: Save the rule and add it to your validation suite
Generating Default Rules
Sparvi Cloud can automatically generate sensible validation rules based on your table structure and data patterns:
- Primary key uniqueness checks
- Foreign key validity checks
- Null checks for non-nullable columns
- Format validation for emails, phone numbers, etc.
- Range checks for numeric columns
- Date range validations
- Pattern consistency for text columns
Use the "Generate Rules" feature in the Validation Rules section to create a starting set of rules for your tables.
Running Validation Rules
In Sparvi Cloud, validation rules can be configured and run through the web interface:
- Create Rules: Define validation rules in the Validation Rules section
- Schedule Runs: Configure automated schedules for your validation rules
- Monitor Results: View validation results in the Health Dashboards
- Track Issues: Failed validations automatically create tracked issues with business context
Validation Rule Best Practices
Writing Effective Rules
- Use Clear Names: Rule names should indicate what is being checked
- Write Descriptive Explanations: Descriptions should explain the business reasoning
- Keep Queries Simple: Focus each rule on a single validation check
- Validate at the Right Level: Some checks need row-level detail, others need aggregates
- Consider Performance: Avoid complex joins in validation queries for large tables
- Use Parameterized Thresholds: Set realistic thresholds based on your data volume
Organizing Validation Rules
Sparvi Cloud helps you organize your rules effectively:
- Categories and Tags: Group rules by category (data integrity, data quality, business logic)
- Table Association: Link rules to specific tables or schemas
- Search and Filter: Quickly find rules using search and filtering
- Rule Collections: Create collections of related rules for specific data domains
Validation Rule Management
As your validation suite grows in Sparvi Cloud:
- Rule History: Track changes to rules with automatic version history
- Documentation: Add notes and documentation directly to each rule
- Testing: Test rules on sample data before activating them in production
- Organization: Group related rules by domain, table, or business function
- Prioritization: Set severity levels (critical, warning, info) for different validations
Advanced Validation Techniques
Dynamic Expected Values
Sometimes you need to compare against a dynamic baseline rather than a fixed value. Sparvi Cloud supports:
- Time-Based Comparisons: Compare today's metrics against yesterday's or last week's values
- Relative Thresholds: Set validation thresholds based on historical averages or trends
- Adaptive Baselines: Use calculated values that adjust based on recent data patterns
For example, you can create rules that ensure today's order volume doesn't drop more than 50% compared to yesterday's volume.
Conditional Validations
Some validations only apply under certain conditions. In Sparvi Cloud you can create:
- Context-Aware Rules: Validations that apply only to specific subsets of data
- If-Then Logic: Rules that check conditions only when certain criteria are met
- Segmented Validation: Different validation thresholds for different data segments
For example, you might validate that international orders have tax IDs, while domestic orders don't require this field.
Statistical Validations
For data with natural variation, Sparvi Cloud supports statistical approaches:
- Outlier Detection: Identify values beyond a certain number of standard deviations from the mean
- Distribution Checks: Verify data follows expected statistical distributions
- Tolerance Ranges: Set acceptable ranges based on statistical measures rather than fixed values
For example, you can create rules that allow a small number of outliers (e.g., up to 10 values beyond 3 standard deviations) while still flagging when too many extreme values appear.
Validation Rule Monitoring
Sparvi Cloud provides comprehensive monitoring of validation rule performance:
Health Dashboards: Track validation pass rates over time with visual trends
Failure Tracking: See which validations are failing most frequently
Historical Trends: Understand how data quality evolves over time
Alert Configuration: Set up notifications when validation rules start failing
Team Visibility: Share validation results with your team through dashboards and issues
Next Steps
After mastering validation rules:
- Learn about Anomaly Detection to automatically detect data quality issues
- Explore Data Profiling to understand your data patterns
- Check out Sparvi Cloud Overview to learn about all platform features