Troubleshooting & Advanced Usage
Comprehensive troubleshooting guide and advanced usage patterns for CrashLens power users
🚨 TROUBLESHOOTING
Common Issues & Solutions
❌ “No log files found” Error
Problem: CrashLens can’t find any log files to analyze.
Solutions:
BASH
# Check if logs exist in expected directories
ls -la .llm_logs/
ls -la logs/
# Generate test data if no logs exist
mkdir -p .llm_logs
crashlens simulate --output .llm_logs/test.jsonl --count 50
# Specify custom log directory
crashlens analyze --log-dir /path/to/your/logs
# Use find to locate JSONL files
find . -name "*.jsonl" -type f
⚠️ “Policy violation not detected” Issue
Problem: Expected policy violations aren’t being caught.
Debugging Steps:
BASH
# Enable debug mode to see detailed policy evaluation
crashlens policy-check --debug logs/
# Test policy syntax
crashlens validate --config crashlens.yml
# Run with verbose output
crashlens policy-check --verbose --detailed-output logs/
# Check if log format matches policy expectations
crashlens analyze --format json logs/ | jq '.entries[0]'
🔧 “Configuration not found” Error
Problem: CrashLens can’t find or parse configuration files.
Resolution:
BASH
# Initialize configuration if missing
crashlens init
# Check configuration syntax
crashlens validate --config crashlens.yml
# Use specific config file
crashlens policy-check --config /path/to/config.yml
# Generate default configuration
crashlens init --template basic --non-interactive
🐌 Performance Issues with Large Log Files
Problem: Analysis is slow or runs out of memory with large datasets.
Optimization Strategies:
BASH
# Use streaming mode for large files
crashlens analyze --stream logs/large-file.jsonl
# Enable parallel processing
crashlens analyze --parallel --workers 4 logs/
# Set memory limits
crashlens analyze --memory-limit 512MB logs/
# Use incremental processing
crashlens analyze --incremental --state-file .crashlens-state.json
# Process in batches
crashlens analyze --batch-size 1000 logs/
# Use time-based filtering to reduce dataset
crashlens analyze --since "2025-08-20T00:00:00Z" logs/
🔑 Authentication & API Key Issues
Problem: API requests fail to authenticate, or CrashLens isn’t picking up your keys.
Solutions:
BASH
# Set environment variables for API keys
export OPENAI_API_KEY="your-api-key"
export ANTHROPIC_API_KEY="your-anthropic-key"
# Use the configuration file for keys (create the directory first)
mkdir -p ~/.crashlens
echo "api_keys:" > ~/.crashlens/config.yml
echo "  openai: your-api-key" >> ~/.crashlens/config.yml
# Test API connectivity
crashlens test-connection --provider openai
# Use different API endpoint
crashlens analyze --api-base https://api.openai.com/v1 logs/
📊 Log Format Compatibility Issues
Problem: CrashLens doesn’t recognize your log format.
Format Solutions:
BASH
# Check supported log formats
crashlens formats --list
# Convert logs to compatible format
crashlens convert --input custom.log --output standard.jsonl --format langfuse
# Use custom parser
crashlens analyze --parser custom-parser.py logs/
# Validate log format
crashlens validate-logs logs/sample.jsonl
# Preview log parsing
crashlens analyze --dry-run --limit 5 logs/
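If you need to see what a well-formed JSONL entry looks like before converting your own logs, a minimal record can be produced and round-tripped by hand. The field names below (`model`, `prompt_tokens`, `cost`, …) are illustrative assumptions — match them to whichever format (e.g. Langfuse) your logs actually use:

```python
import json

# Hypothetical minimal JSONL entry: one JSON object per line.
# Field names are illustrative, not a CrashLens schema guarantee.
entry = {
    "timestamp": "2025-08-20T14:02:11Z",
    "model": "gpt-4",
    "prompt_tokens": 1200,
    "completion_tokens": 350,
    "cost": 0.051,
}

# Write one record per line, then parse it back to confirm valid JSONL
with open("sample.jsonl", "w") as f:
    f.write(json.dumps(entry) + "\n")

with open("sample.jsonl") as f:
    parsed = [json.loads(line) for line in f]

print(parsed[0]["model"])  # gpt-4
```

A file built this way is a useful fixture for `crashlens validate-logs` and the `--dry-run` preview above.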
🔍 Debug Mode & Diagnostics
Enabling Debug Mode
Debug mode provides detailed information about CrashLens operations, policy evaluation, and error conditions.
BASH
# Enable global debug mode
export CRASHLENS_DEBUG=true
crashlens analyze logs/
# Debug specific commands
crashlens --debug policy-check logs/
crashlens --debug --verbose analyze logs/
# Debug policy evaluation
crashlens policy-check --debug --explain logs/
# Debug configuration loading
crashlens --debug init
# Debug log parsing
crashlens --debug analyze --dry-run logs/sample.jsonl
Log Level Configuration
BASH
# Set different log levels
crashlens --log-level debug analyze logs/ # Most verbose
crashlens --log-level info analyze logs/ # Default
crashlens --log-level warning analyze logs/ # Warnings only
crashlens --log-level error analyze logs/ # Errors only
# Save debug logs to file
crashlens --debug analyze logs/ 2> debug.log
# Real-time debug monitoring
tail -f ~/.crashlens/debug.log
Diagnostic Commands
BASH
# System diagnostics
crashlens doctor # Run all diagnostic checks
crashlens doctor --check dependencies # Check Python dependencies
crashlens doctor --check permissions # Check file permissions
crashlens doctor --check configuration # Validate configuration
# Performance diagnostics
crashlens benchmark --dataset-size 1000 # Performance benchmark
crashlens profile analyze logs/ # Profile analysis performance
# Network diagnostics
crashlens test-connection # Test all API connections
crashlens test-connection --provider openai
crashlens ping --endpoint https://api.openai.com/v1
# Environment diagnostics
crashlens env --show # Show environment variables
crashlens version --detailed # Detailed version information
⚡ Performance Optimization
💾 Memory Optimization
BASH
# Set memory limits
--memory-limit 512MB
# Enable disk caching
--use-disk-cache
# Process in smaller chunks
--batch-size 500
# Streaming for large files
--stream
🏃 Speed Optimization
BASH
# Parallel processing
--parallel --workers 4
# Skip unnecessary checks
--fast-mode
# Use incremental analysis
--incremental
# Cache results
--cache-results
Large Dataset Strategies
BASH
# For datasets > 1GB
crashlens analyze \
  --stream \
  --parallel \
  --workers 8 \
  --memory-limit 1GB \
  --batch-size 2000 \
  --use-disk-cache \
  large-logs/
# Time-based chunking for historical data
crashlens analyze --date-range "2025-08-01:2025-08-07" logs/
crashlens analyze --date-range "2025-08-08:2025-08-14" logs/
# Selective analysis by criteria
crashlens analyze --filter "cost > 1.0" logs/ # Only expensive requests
crashlens analyze --models gpt-4,claude-3 logs/ # Specific models only
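When the `--filter` and `--models` flags don’t cover a case, a small pre-filter pass over the JSONL can shrink the dataset before analysis. This is a sketch with illustrative field names (`cost`, `model`, `timestamp`) — adapt them to your log format:

```python
# Hypothetical pre-filter: keep only expensive gpt-4 requests since a cutoff.
def keep(rec):
    return (
        rec.get("cost", 0) > 1.0
        and str(rec.get("model", "")).startswith("gpt-4")
        # ISO-8601 timestamps sort lexicographically, so string compare works
        and rec.get("timestamp", "") >= "2025-08-01"
    )

rows = [
    {"model": "gpt-4", "cost": 2.0, "timestamp": "2025-08-05T10:00:00Z"},
    {"model": "gpt-3.5-turbo", "cost": 0.1, "timestamp": "2025-08-05T10:01:00Z"},
    {"model": "gpt-4", "cost": 0.4, "timestamp": "2025-08-02T09:00:00Z"},
]
filtered = [r for r in rows if keep(r)]
print(len(filtered))  # 1
```

In practice you would read `rows` line by line from the JSONL files and write the survivors to a smaller file for CrashLens to analyze.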
📚 ADVANCED USAGE EXAMPLES
🏢 Enterprise Integration Patterns
Multi-Environment Cost Management
YAML
# Environment-specific configurations
# production.crashlens.yml
policies:
  - enforce: "strict-cost-control"
    rules:
      - daily_budget_limit: 1000
        monthly_budget_limit: 25000
        alert_threshold: 0.8
        block_threshold: 0.95

# staging.crashlens.yml
policies:
  - enforce: "moderate-cost-control"
    rules:
      - daily_budget_limit: 200
        monthly_budget_limit: 5000

# development.crashlens.yml
policies:
  - enforce: "development-guidelines"
    rules:
      - daily_budget_limit: 50
        warn_on_expensive_models: true

# Usage
crashlens policy-check --config production.crashlens.yml logs/
crashlens analyze --config staging.crashlens.yml --env staging logs/
Multi-Team Cost Allocation & Chargeback
BASH
# Team-based cost tracking
crashlens analyze \
  --group-by team,project \
  --include-metadata team_id,project_id \
  --output-format csv \
  --output team-costs-$(date +%Y-%m).csv \
  logs/

# Chargeback report generation
crashlens report \
  --template chargeback \
  --period monthly \
  --breakdown team,department,cost_center \
  --include-budget-variance \
  --output chargeback-$(date +%Y-%m).html

# Budget allocation tracking
crashlens analyze \
  --filter "team_id IN ('team-a', 'team-b')" \
  --budget-allocation team-a:5000,team-b:3000 \
  --alert-on budget-exceeded \
  logs/
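The per-team totals behind a chargeback report can be sanity-checked by hand: summing cost per team straight from the log records reproduces the grouping. The field names and budget figures here are illustrative:

```python
from collections import defaultdict

# Illustrative records; in practice these come from your JSONL log files
records = [
    {"team_id": "team-a", "cost": 2.50},
    {"team_id": "team-a", "cost": 1.25},
    {"team_id": "team-b", "cost": 4.00},
]

# Sum cost per team
totals = defaultdict(float)
for rec in records:
    totals[rec["team_id"]] += rec["cost"]

# Compare spend against a hypothetical per-team monthly budget
budgets = {"team-a": 5000, "team-b": 3000}
for team, spent in sorted(totals.items()):
    pct = 100 * spent / budgets[team]
    print(f"{team}: ${spent:.2f} ({pct:.4f}% of budget)")
```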
Enterprise Alerting & Integration
BASH
# Slack integration with custom webhooks
crashlens policy-check \
  --slack-webhook https://hooks.slack.com/services/YOUR/WEBHOOK/URL \
  --alert-channel "#finops-alerts" \
  --alert-severity high \
  logs/

# Email alerts
crashlens monitor \
  --email-alerts admin@company.com,finops@company.com \
  --email-template enterprise \
  --alert-frequency daily \
  .llm_logs/

# PagerDuty integration
crashlens monitor \
  --pagerduty-key YOUR_PAGERDUTY_INTEGRATION_KEY \
  --alert-on critical-violations,budget-exceeded \
  --escalation-policy high-priority \
  logs/

# JIRA ticket creation for violations
crashlens policy-check \
  --jira-integration \
  --jira-project FINOPS \
  --jira-issue-type Bug \
  --auto-assign finops-team \
  logs/
📊 Advanced Analytics & Reporting
Cost Trend Analysis & Forecasting
BASH
# Trend analysis with forecasting
crashlens analyze \
  --time-series daily \
  --forecast 30 \
  --trend-analysis \
  --include-seasonality \
  --output forecast-report.json \
  logs/

# Cost anomaly detection
crashlens analyze \
  --anomaly-detection \
  --sensitivity medium \
  --baseline-period 30d \
  --alert-on anomalies \
  logs/

# Comparative analysis across time periods
crashlens compare \
  --baseline "2025-07-01:2025-07-31" \
  --current "2025-08-01:2025-08-31" \
  --metrics cost,usage,efficiency \
  --statistical-significance 0.05 \
  logs/
Custom Dashboards & Visualization
BASH
# Generate interactive dashboard
crashlens dashboard \
  --template executive \
  --include-charts cost-trend,model-usage,team-breakdown \
  --refresh-interval 1h \
  --port 8080 \
  --auth-required \
  logs/

# Export data for external visualization tools
crashlens analyze \
  --output-format prometheus \
  --metrics-endpoint /metrics \
  --export-interval 5m \
  logs/

# Grafana integration
crashlens export \
  --format grafana \
  --dashboard-config grafana-dashboard.json \
  --data-source crashlens-metrics \
  logs/

# Power BI integration
crashlens analyze \
  --output-format powerbi \
  --include-relationships \
  --output powerbi-dataset.pbix \
  logs/
Advanced Query & Filtering
BASH
# Complex SQL-like queries
crashlens query \
  --filter "cost > 10 AND model LIKE 'gpt-4%' AND timestamp >= '2025-08-01'" \
  --select "model, AVG(cost) as avg_cost, COUNT(*) as requests" \
  --group-by model \
  --having "AVG(cost) > 5" \
  --order-by avg_cost DESC \
  logs/

# Statistical analysis
crashlens analyze \
  --statistics percentiles,correlation,regression \
  --correlate cost,latency,token_count \
  --percentiles 50,90,95,99 \
  --output stats-report.json \
  logs/

# Pattern mining
crashlens analyze \
  --pattern-mining \
  --min-support 0.1 \
  --find-patterns retry-loops,cost-spikes,efficiency-issues \
  --association-rules \
  logs/
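Percentile output is easy to cross-check with Python’s standard library: `statistics.quantiles` with `n=100` yields the 1st–99th percentile cut points, so the same p50/p90/p95/p99 figures can be reproduced from the raw cost column. The cost values below are illustrative:

```python
import statistics

# Illustrative per-request costs pulled from a JSONL log
costs = [0.02, 0.05, 0.05, 0.10, 0.40, 0.75, 1.20, 2.50, 3.00, 9.80]

# quantiles(..., n=100) returns 99 cut points: cut[p - 1] is the p-th percentile
cut = statistics.quantiles(costs, n=100, method="inclusive")
for p in (50, 90, 95, 99):
    print(f"p{p}: {cut[p - 1]:.3f}")
```

`method="inclusive"` treats the data as the whole population (linear interpolation between observed values), which matches how most analytics tools report percentiles over a complete log file.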
🔧 DevOps & CI/CD Integration
Advanced GitHub Actions Workflows
YAML
# .github/workflows/comprehensive-cost-control.yml
name: Comprehensive LLM Cost Control

on:
  push:
    branches: [main, develop]
  pull_request:
    branches: [main]
  schedule:
    - cron: '0 8 * * *'  # Daily at 8 AM

jobs:
  cost-analysis:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Setup Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.12'

      - name: Install CrashLens
        run: pip install crashlens

      - name: Download logs from production
        env:
          AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
          AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
        run: |
          aws s3 sync s3://your-logs-bucket/llm-logs/ ./logs/

      - name: Run comprehensive analysis
        run: |
          crashlens analyze \
            --parallel \
            --output-format json \
            --include-recommendations \
            --output analysis-${{ github.sha }}.json \
            logs/

      - name: Policy compliance check
        run: |
          crashlens policy-check \
            --config .github/crashlens.yml \
            --fail-on-violation \
            --detailed-output \
            --slack-webhook ${{ secrets.SLACK_WEBHOOK }} \
            logs/

      - name: Generate executive report
        if: github.event_name == 'schedule'
        run: |
          crashlens report \
            --template executive \
            --period daily \
            --include-trends \
            --output daily-report-$(date +%Y-%m-%d).html \
            logs/

      - name: Upload artifacts
        uses: actions/upload-artifact@v3
        with:
          name: cost-analysis-${{ github.sha }}
          path: |
            analysis-${{ github.sha }}.json
            daily-report-*.html
Kubernetes & Container Integration
YAML
# Kubernetes CronJob for cost monitoring
apiVersion: batch/v1
kind: CronJob
metadata:
  name: crashlens-monitor
spec:
  schedule: "0 */6 * * *"  # Every 6 hours
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: crashlens
              image: crashlens/crashlens:latest
              command:
                - /bin/sh
                - -c
                - |
                  crashlens monitor \
                    --log-dir /logs \
                    --alert-on policy-violation,budget-exceeded \
                    --slack-webhook $SLACK_WEBHOOK \
                    --continuous
              env:
                - name: SLACK_WEBHOOK
                  valueFrom:
                    secretKeyRef:
                      name: crashlens-secrets
                      key: slack-webhook
              volumeMounts:
                - name: log-volume
                  mountPath: /logs
          volumes:
            - name: log-volume
              persistentVolumeClaim:
                claimName: llm-logs-pvc
          restartPolicy: OnFailure

# Docker Compose for local development (docker-compose.yml)
version: '3.8'
services:
  crashlens-monitor:
    image: crashlens/crashlens:latest
    command: >
      crashlens watch /logs
      --poll-interval 30
      --alert-on policy-violation
      --webhook http://webhook-service:3000/alerts
    volumes:
      - ./logs:/logs:ro
      - ./crashlens.yml:/app/crashlens.yml:ro
    environment:
      - CRASHLENS_CONFIG=/app/crashlens.yml
    depends_on:
      - webhook-service
Infrastructure as Code Integration
TERRAFORM
# Terraform module for CrashLens monitoring
# modules/crashlens/main.tf
resource "aws_lambda_function" "crashlens_monitor" {
  filename      = "crashlens-lambda.zip"
  function_name = "crashlens-cost-monitor"
  role          = aws_iam_role.crashlens_role.arn
  handler       = "index.handler"
  runtime       = "python3.12"

  environment {
    variables = {
      LOG_BUCKET    = var.log_bucket
      SLACK_WEBHOOK = var.slack_webhook
      POLICY_CONFIG = var.policy_config
    }
  }
}

resource "aws_cloudwatch_event_rule" "crashlens_schedule" {
  name                = "crashlens-daily-check"
  description         = "Daily CrashLens cost analysis"
  schedule_expression = "rate(24 hours)"
}

resource "aws_cloudwatch_event_target" "lambda_target" {
  rule      = aws_cloudwatch_event_rule.crashlens_schedule.name
  target_id = "CrashLensLambdaTarget"
  arn       = aws_lambda_function.crashlens_monitor.arn
}
# Ansible playbook for server deployment
---
- hosts: monitoring_servers
  become: yes
  tasks:
    - name: Install CrashLens
      pip:
        name: crashlens
        state: latest

    - name: Create CrashLens config
      template:
        src: crashlens.yml.j2
        dest: /etc/crashlens/crashlens.yml
        mode: '0644'

    - name: Create systemd service
      template:
        src: crashlens.service.j2
        dest: /etc/systemd/system/crashlens.service
      notify: restart crashlens

    - name: Enable and start CrashLens service
      systemd:
        name: crashlens
        enabled: yes
        state: started

  handlers:
    - name: restart crashlens
      systemd:
        name: crashlens
        state: restarted
        daemon_reload: yes
🔌 API Integration & Automation
REST API Usage
BASH
# Start CrashLens API server
crashlens serve --port 8080 --auth-token your-secret-token
# API endpoints usage
curl -H "Authorization: Bearer your-secret-token" \
  -X POST http://localhost:8080/api/v1/analyze \
  -H "Content-Type: application/json" \
  -d '{
    "log_files": ["logs/app.jsonl"],
    "policies": ["prevent-model-overkill"],
    "options": {
      "include_recommendations": true,
      "output_format": "json"
    }
  }'

# Policy check via API
curl -H "Authorization: Bearer your-secret-token" \
  -X POST http://localhost:8080/api/v1/policy-check \
  -H "Content-Type: application/json" \
  -d '{
    "log_files": ["logs/recent.jsonl"],
    "policy_file": "policies/production.yml",
    "severity": "high"
  }'

# Real-time monitoring webhook
curl -X POST http://localhost:8080/api/v1/webhooks/register \
  -H "Authorization: Bearer your-secret-token" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://your-app.com/webhooks/crashlens",
    "events": ["policy_violation", "cost_spike"],
    "secret": "webhook-secret"
  }'
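On the receiving end of that webhook, the registered `secret` is what lets you authenticate deliveries. A common pattern — an assumption here, so confirm the exact header name and signing scheme CrashLens uses — is an HMAC-SHA256 hex digest of the raw request body:

```python
import hashlib
import hmac

def verify_signature(payload: bytes, secret: str, signature_header: str) -> bool:
    """Compare a hex HMAC-SHA256 digest of the payload against the header value."""
    expected = hmac.new(secret.encode(), payload, hashlib.sha256).hexdigest()
    # compare_digest is a constant-time comparison, avoiding timing attacks
    return hmac.compare_digest(expected, signature_header)

# Simulated delivery: the sender signs the raw body with the shared secret
body = b'{"event": "policy_violation", "cost": 12.5}'
secret = "webhook-secret"
header = hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()

print(verify_signature(body, secret, header))          # True
print(verify_signature(body, "wrong-secret", header))  # False
```

Always verify against the raw bytes of the request body, not a re-serialized copy, since any whitespace difference changes the digest.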
Python SDK Integration
PYTHON
# Python integration example
from crashlens import CrashLens, PolicyConfig
import logging

logger = logging.getLogger(__name__)

# Initialize CrashLens client
client = CrashLens(
    config_file="crashlens.yml",
    log_level=logging.INFO
)

# Programmatic policy checking
async def check_llm_request(request_data):
    """Check LLM request against policies before sending"""
    try:
        result = await client.policy_check_async(
            request_data=request_data,
            policies=["prevent-model-overkill", "budget-enforcement"],
            fail_fast=True
        )
        if result.violations:
            logger.warning(f"Policy violations: {result.violations}")
            return result.suggested_alternatives
        return None  # No violations, proceed with request
    except Exception as e:
        logger.error(f"Policy check failed: {e}")
        return None

# Real-time cost monitoring
# (send_slack_alert, create_incident_ticket and email_report are
# application-defined helpers, not part of the SDK)
def setup_cost_monitoring():
    """Set up real-time cost monitoring"""

    @client.on_cost_threshold(threshold=100, period="daily")
    async def handle_cost_alert(event):
        """Handle cost threshold alerts"""
        await send_slack_alert(
            f"Daily cost threshold exceeded: {event.current_cost}"
        )

    @client.on_policy_violation(severity="high")
    async def handle_violation(violation):
        """Handle policy violations"""
        await create_incident_ticket(violation)

    # Start monitoring
    client.start_monitoring(
        log_sources=["./logs", "s3://company-llm-logs"],
        poll_interval=300  # 5 minutes
    )

# Batch analysis and reporting
async def generate_weekly_report():
    """Generate a comprehensive weekly cost report"""
    analysis = await client.analyze_async(
        log_files=["logs/week-*.jsonl"],
        include_trends=True,
        include_recommendations=True,
        time_range="last-7-days"
    )
    report = await client.generate_report(
        analysis=analysis,
        template="executive",
        format="html",
        include_charts=True
    )
    # Email report to stakeholders
    await email_report(
        recipients=["cto@company.com", "finops@company.com"],
        subject="Weekly LLM Cost Analysis",
        html_content=report.html,
        attachments=[report.data_export]
    )
🤖 Machine Learning & Predictive Analytics
Predictive Cost Modeling
BASH
# Train custom cost prediction models
crashlens ml train \
  --model-type cost-predictor \
  --features token_count,model_type,time_of_day,user_type \
  --target cost \
  --algorithm random-forest \
  --validation-split 0.2 \
  --output-model cost-model.pkl \
  logs/historical-data.jsonl

# Real-time cost prediction
crashlens ml predict \
  --model cost-model.pkl \
  --input-features '{"token_count": 1500, "model_type": "gpt-4", "time_of_day": 14}' \
  --confidence-interval 0.95

# Anomaly detection model
crashlens ml train \
  --model-type anomaly-detector \
  --algorithm isolation-forest \
  --contamination 0.1 \
  --output-model anomaly-model.pkl \
  logs/normal-usage.jsonl

# Auto-scaling prediction
crashlens ml predict \
  --model scaling-model.pkl \
  --forecast-horizon 24h \
  --include-uncertainty \
  --alert-on capacity-exceeded
Intelligent Policy Optimization
BASH
# Auto-optimize policies based on historical data
crashlens ml optimize-policies \
  --current-policies crashlens.yml \
  --historical-data logs/last-90-days/ \
  --objective minimize-cost \
  --constraints maintain-quality \
  --output optimized-policies.yml

# A/B testing for policy effectiveness
crashlens ml ab-test \
  --policy-a current-policies.yml \
  --policy-b optimized-policies.yml \
  --test-data logs/test-set.jsonl \
  --metrics cost,violations,user-satisfaction \
  --duration 7d

# Reinforcement learning for dynamic policies
crashlens ml rl-train \
  --environment production \
  --reward-function cost-efficiency \
  --exploration-strategy epsilon-greedy \
  --episodes 1000 \
  --save-model rl-policy-agent.pkl
Last updated: August 24, 2025