Testing and Validation

Spartera provides comprehensive testing and validation capabilities to
ensure your assets perform correctly before deployment. This guide covers
the preview/test and full-processing functionality available in the
platform.

Overview of Testing Capabilities

Preview/Test Mode

  • Purpose: Quick validation of asset functionality
  • Data Scope: Tests on 10% of your data
  • Speed: Fast results for rapid iteration
  • Resource Usage: Minimal computational resources required

Process Mode

  • Purpose: Full-scale asset execution
  • Data Scope: Processes 100% of your data
  • Performance: Complete analytical results
  • Resource Usage: Full computational resources utilized
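As a rough sketch of how the two modes might be driven programmatically,
the snippet below assumes a hypothetical REST endpoint and `mode`
parameter for illustration; the real endpoint paths, parameter names, and
response fields may differ, so consult the API reference for the actual
calls.

```python
import requests

BASE_URL = "https://api.spartera.example/v1"          # hypothetical base URL
HEADERS = {"Authorization": "Bearer <your-api-key>"}  # placeholder credential

def run_asset(asset_id: str, mode: str = "preview") -> dict:
    """Run an asset in preview (10% sample) or process (full data) mode.

    The /assets/{id}/run endpoint and the `mode` parameter are assumptions
    made for this sketch, not documented Spartera routes.
    """
    resp = requests.post(
        f"{BASE_URL}/assets/{asset_id}/run",
        headers=HEADERS,
        json={"mode": mode},  # "preview" -> 10% sample, "process" -> full data
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()

# Iterate quickly on a sample first, then run the full dataset once the
# preview output looks right.
preview_result = run_asset("asset-123", mode="preview")
full_result = run_asset("asset-123", mode="process")
```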

Preview/Test Functionality

How Preview Works

The Preview/Test feature provides a quick way to validate your assets:

  1. Data Sampling: Automatically selects a representative 10% sample of
    your data
  2. Query Execution: Runs your analytical logic on the sample
  3. Result Validation: Checks if the asset returns expected outputs
  4. Performance Metrics: Provides execution time and resource usage
    statistics

Sampling Strategy

Statistical Sampling

  • Random Sampling: Ensures representative data distribution
  • Stratified Sampling: Maintains proportional representation across
    segments (see the sampling sketch after this list)
  • Time-Based Sampling: For temporal data, maintains chronological
    distribution
  • Balanced Sampling: Ensures all data categories are represented
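A minimal sketch of these strategies using pandas; the column names
(`segment`, `event_time`) are placeholders, and the platform's built-in
sampler may implement them differently:

```python
import pandas as pd

def sample_10_percent(df: pd.DataFrame, strategy: str = "random") -> pd.DataFrame:
    """Return a ~10% sample using one of the strategies described above.

    `segment` and `event_time` are illustrative placeholder columns.
    """
    if strategy == "random":
        # Simple random sampling for a representative distribution.
        return df.sample(frac=0.10, random_state=42)
    if strategy == "stratified":
        # Keep each segment's share of rows proportional to the full data.
        return df.groupby("segment", group_keys=False).sample(
            frac=0.10, random_state=42
        )
    if strategy == "time_based":
        # Take every 10th row in chronological order to preserve the
        # temporal spread of the data.
        return df.sort_values("event_time").iloc[::10]
    raise ValueError(f"Unknown strategy: {strategy}")
```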

Data Quality Considerations

  • Completeness Checks: Validates sample has sufficient data
  • Distribution Validation: Ensures the sample reflects the full
    dataset's characteristics (see the sketch after this list)
  • Edge Case Coverage: Includes outliers and edge cases in sampling
  • Temporal Coverage: For time-series data, samples across time periods
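To make the distribution check concrete, here is one way to test whether
a numeric column in the sample matches the full dataset, using scipy's
two-sample Kolmogorov-Smirnov test; the significance threshold is a
conventional default rather than a platform requirement:

```python
import pandas as pd
from scipy.stats import ks_2samp

def sample_reflects_population(
    full: pd.Series, sample: pd.Series, alpha: float = 0.05
) -> bool:
    """Two-sample KS test: fail if the sample's distribution deviates
    significantly from the full dataset's. The 0.05 threshold is a
    conventional default, not a platform requirement."""
    stat, p_value = ks_2samp(full.dropna(), sample.dropna())
    return p_value >= alpha

def completeness_ok(sample: pd.DataFrame, min_rows: int = 1000) -> bool:
    """Completeness check: ensure the sample is large enough to be useful."""
    return len(sample) >= min_rows
```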

Test Execution Process

Pre-Execution Validation

  1. Connection Health: Verifies data connection is active
  2. Permission Check: Confirms read access to required data
  3. Schema Validation: Ensures data structure matches expectations
  4. Resource Availability: Checks computational resources are available
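These checks lend themselves to a simple fail-fast harness; the sketch
below wires placeholder probes in place of real connection, permission,
schema, and resource checks:

```python
from typing import Callable

def preflight(checks: dict[str, Callable[[], bool]]) -> list[str]:
    """Run named pre-execution checks and collect the ones that fail."""
    return [name for name, check in checks.items() if not check()]

# Illustrative wiring; each lambda is a placeholder for a real probe.
failures = preflight({
    "connection_health": lambda: True,  # e.g. ping the data connection
    "read_permission":   lambda: True,  # e.g. attempt a LIMIT 1 query
    "schema_matches":    lambda: True,  # e.g. compare columns to an expected list
    "resources_free":    lambda: True,  # e.g. query a scheduler/quota API
})
if failures:
    raise RuntimeError(f"Pre-execution validation failed: {failures}")
```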

During Execution

  1. Query Performance: Monitors query execution time
  2. Resource Utilization: Tracks CPU, memory, and I/O usage
  3. Error Monitoring: Captures and logs any execution errors
  4. Progress Tracking: Provides real-time execution status
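For local development, a rough approximation of this monitoring can be
built from the standard library; the platform itself tracks these metrics
server-side:

```python
import time
import tracemalloc

def run_with_metrics(fn, *args, **kwargs):
    """Wrap an asset run to capture execution time, peak memory, and any
    error, mirroring the monitoring described above (a local
    approximation only)."""
    tracemalloc.start()
    start = time.perf_counter()
    try:
        result, error = fn(*args, **kwargs), None
    except Exception as exc:  # error monitoring: capture, don't swallow
        result, error = None, exc
    elapsed = time.perf_counter() - start
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    print(f"elapsed={elapsed:.2f}s peak_mem={peak / 1e6:.1f}MB error={error}")
    if error:
        raise error
    return result
```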

Post-Execution Analysis

  1. Result Validation: Checks output format and structure
  2. Data Quality: Validates result completeness and accuracy
  3. Performance Metrics: Records execution time and resource usage
  4. Error Reporting: Documents any issues or warnings

Using Preview Results

Validation Checklist

  • Output Format: Confirm results match expected structure
  • Data Types: Verify result data types are correct
  • Value Ranges: Check if values fall within expected ranges
  • Null Handling: Ensure null values are handled appropriately
  • Error Conditions: Test behavior with problematic data
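The checklist above can be automated as assertions that run against every
preview result; the column names, dtypes, and ranges below are
placeholders to adapt to your asset's output contract:

```python
import pandas as pd

def validate_preview_output(df: pd.DataFrame) -> None:
    """Automate the validation checklist. All expected values here are
    illustrative placeholders."""
    expected_columns = ["customer_id", "score"]
    assert list(df.columns) == expected_columns, "unexpected columns"   # output format
    assert df["score"].dtype == "float64", "wrong dtype"                # data types
    assert df["score"].between(0, 1).all(), "score out of range"        # value ranges
    assert df["customer_id"].notna().all(), "null customer ids"         # null handling
```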

Iterative Development

  • Quick Feedback Loop: Make changes and re-test rapidly
  • Performance Optimization: Identify and resolve performance
    bottlenecks
  • Logic Refinement: Adjust analytical algorithms based on results
  • Error Resolution: Fix issues before full deployment

Full Process Functionality

When to Use Process Mode

Use full processing for:

  • Final Validation: Complete testing before production deployment
  • Production Runs: Generating complete analytical results
  • Performance Testing: Evaluating full-scale performance
  • Comprehensive Analysis: When complete dataset analysis is required

Process Execution

Resource Management

  • Compute Allocation: Uses full computational resources
  • Memory Management: Handles large dataset memory requirements
  • I/O Optimization: Optimizes data read/write operations
  • Parallel Processing: Leverages parallel execution where possible
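As an illustration of the parallel-processing point, a full run can fan
independent data partitions out across worker processes; the partitioning
scheme and the placeholder logic in `process_partition` are assumptions
for the sketch:

```python
from concurrent.futures import ProcessPoolExecutor
import pandas as pd

def process_partition(partition: pd.DataFrame) -> pd.DataFrame:
    """Your analytical logic, applied to one data partition."""
    return partition.assign(score=partition["value"] * 0.5)  # placeholder logic

def process_full_dataset(partitions: list[pd.DataFrame]) -> pd.DataFrame:
    """Fan partitions out across worker processes and concatenate results."""
    with ProcessPoolExecutor() as pool:
        results = list(pool.map(process_partition, partitions))
    return pd.concat(results, ignore_index=True)
```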

Monitoring and Observability

  • Real-Time Monitoring: Track execution progress and performance
  • Resource Usage: Monitor CPU, memory, and storage utilization
  • Error Detection: Immediate notification of execution problems
  • Performance Metrics: Detailed performance analytics
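A typical consumer-side pattern is to poll a long-running job until it
completes; the `/jobs/{id}` endpoint, status values, and `progress` field
below are illustrative assumptions, not documented API fields:

```python
import time
import requests

def wait_for_job(base_url: str, headers: dict, job_id: str,
                 poll_seconds: int = 10) -> dict:
    """Poll a long-running process job until it finishes."""
    while True:
        job = requests.get(
            f"{base_url}/jobs/{job_id}", headers=headers, timeout=10
        ).json()
        status = job.get("status")
        print(f"status={status} progress={job.get('progress')}")
        if status in ("succeeded", "failed"):
            return job
        time.sleep(poll_seconds)
```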

Result Handling

  • Complete Results: Full analytical output for entire dataset
  • Result Caching: Stores results for future quick retrieval
  • Export Options: Multiple formats for result consumption
  • Integration APIs: Direct API access to processed results

Testing Best Practices

Development Testing Strategy

Unit Testing

  • Logic Components: Test individual analytical components
  • Data Transformations: Validate each data transformation step
  • Calculations: Verify mathematical and statistical calculations
  • Error Handling: Test error conditions and edge cases
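A couple of pytest-style unit tests make this concrete;
`my_asset.normalize_scores` is a hypothetical transformation standing in
for your own logic:

```python
# test_transformations.py -- run with `pytest`
import pandas as pd
import pytest

from my_asset import normalize_scores  # hypothetical function under test

def test_normalize_scores_scales_to_unit_interval():
    df = pd.DataFrame({"score": [0, 50, 100]})
    out = normalize_scores(df)
    assert out["score"].between(0, 1).all()

def test_normalize_scores_rejects_empty_input():
    with pytest.raises(ValueError):
        normalize_scores(pd.DataFrame({"score": []}))
```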

Integration Testing

  • Data Connections: Test with various data sources
  • End-to-End: Validate complete analytical workflows
  • Performance: Test under various load conditions
  • Compatibility: Ensure compatibility across different environments

User Acceptance Testing

  • Business Logic: Validate business requirements are met
  • Result Accuracy: Confirm analytical results are correct
  • Usability: Test ease of use and integration
  • Performance: Validate response times meet requirements

Performance Testing

Load Testing

  • Data Volume: Test with varying dataset sizes
  • Concurrent Users: Validate multi-user access patterns
  • Resource Scaling: Test resource utilization and scaling
  • Response Times: Measure performance under load
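A bare-bones load test can be built on a thread pool: fire concurrent
calls and report latency percentiles. Here `fn` would wrap a real preview
or process API call:

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor

def timed_call(fn) -> float:
    """Time a single call in seconds."""
    start = time.perf_counter()
    fn()
    return time.perf_counter() - start

def load_test(fn, concurrent_users: int = 20, requests_per_user: int = 5) -> None:
    """Simulate concurrent users hitting an asset and report latencies."""
    with ThreadPoolExecutor(max_workers=concurrent_users) as pool:
        latencies = list(pool.map(
            lambda _: timed_call(fn),
            range(concurrent_users * requests_per_user),
        ))
    print(f"p50={statistics.median(latencies):.2f}s "
          f"p95={statistics.quantiles(latencies, n=20)[18]:.2f}s")
```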

Stress Testing

  • Resource Limits: Test behavior at resource boundaries
  • Data Quality: Test with poor or incomplete data
  • Network Issues: Test behavior during connectivity problems
  • Error Recovery: Validate graceful degradation and recovery

Quality Assurance

Data Quality Testing

  • Accuracy: Verify analytical results are mathematically correct
  • Completeness: Ensure all required data is processed
  • Consistency: Validate consistent results across runs
  • Timeliness: Confirm data freshness requirements are met

Output Validation

  • Format Compliance: Ensure outputs match specified formats
  • Schema Validation: Verify output structure is correct (see the
    sketch after this list)
  • Value Validation: Check result values are within expected ranges
  • Error Messaging: Validate error messages are helpful and accurate
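Output validation is easy to automate with a JSON Schema; the schema
below is illustrative and should be adapted to your asset's actual
output contract:

```python
from jsonschema import validate, ValidationError  # pip install jsonschema

# Illustrative schema; adapt to your asset's real output contract.
OUTPUT_SCHEMA = {
    "type": "object",
    "properties": {
        "rows": {"type": "array"},
        "row_count": {"type": "integer", "minimum": 0},
        "generated_at": {"type": "string"},
    },
    "required": ["rows", "row_count"],
}

def check_output(payload: dict) -> None:
    try:
        validate(instance=payload, schema=OUTPUT_SCHEMA)
    except ValidationError as exc:
        # Error messaging: surface the failing path, not just "invalid".
        raise ValueError(
            f"Output failed schema check at {list(exc.path)}: {exc.message}"
        ) from exc
```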

Automated Testing

Continuous Integration

  • Automated Test Runs: Integrate testing into CI/CD pipelines
  • Regression Testing: Automatically test for functionality regressions
  • Performance Monitoring: Continuous performance validation
  • Quality Gates: Prevent deployment of failing assets
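A quality gate can be as simple as a script that runs a preview and exits
non-zero on failure, so the CI pipeline blocks deployment; the endpoint
and response fields here are assumptions:

```python
#!/usr/bin/env python3
"""CI quality gate: run a preview and exit non-zero on failure.
Endpoint path and response fields are illustrative assumptions."""
import sys
import requests

def main() -> int:
    resp = requests.post(
        "https://api.spartera.example/v1/assets/asset-123/run",  # hypothetical
        headers={"Authorization": "Bearer <ci-api-key>"},
        json={"mode": "preview"},
        timeout=120,
    )
    if resp.status_code != 200 or resp.json().get("status") != "succeeded":
        print(f"Preview failed: {resp.status_code} {resp.text}", file=sys.stderr)
        return 1
    print("Preview passed; asset may be deployed.")
    return 0

if __name__ == "__main__":
    sys.exit(main())
```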

Test Documentation

  • Test Plans: Document comprehensive testing strategies
  • Test Cases: Define specific test scenarios and expectations
  • Results Documentation: Record test results and decisions
  • Issue Tracking: Track and resolve testing issues

Troubleshooting Common Issues

Performance Issues

  • Slow Query Performance: Optimize database queries and indexes
  • Memory Constraints: Adjust memory allocation or data processing
    approach
  • Network Bottlenecks: Optimize data transfer and connection pooling
  • Resource Contention: Balance resource usage across concurrent
    operations

Data Issues

  • Missing Data: Implement robust null and missing-data handling (a
    sketch follows this list)
  • Data Type Mismatches: Ensure proper data type handling and
    conversion
  • Schema Changes: Handle evolving data schemas gracefully
  • Data Quality: Implement data validation and quality checks
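Much of this defensive handling is straightforward in pandas; the column
names and fill rules below are placeholders for whatever policy fits your
data:

```python
import pandas as pd

def clean_input(df: pd.DataFrame) -> pd.DataFrame:
    """Defensive handling for the data issues above; column names and
    fill rules are illustrative placeholders."""
    df = df.copy()
    # Missing data: choose an explicit policy per column rather than
    # letting NaNs propagate silently into results.
    df["revenue"] = pd.to_numeric(df["revenue"], errors="coerce").fillna(0.0)
    # Data type mismatches: coerce and surface rows that failed to parse.
    df["event_time"] = pd.to_datetime(df["event_time"], errors="coerce")
    bad_rows = df["event_time"].isna().sum()
    if bad_rows:
        print(f"warning: {bad_rows} rows had unparseable timestamps")
    # Schema changes: tolerate a missing optional column.
    if "channel" not in df.columns:
        df["channel"] = "unknown"
    return df
```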

Integration Issues

  • API Compatibility: Ensure API responses match consumer expectations
  • Authentication: Validate security and access control functionality
  • Version Compatibility: Test compatibility across different system
    versions
  • Network Connectivity: Test various network conditions and
    configurations

Comprehensive testing and validation ensure that your assets are
reliable, performant, and ready for production use, giving both asset
creators and consumers confidence in the results.