How do I track Errfile to see if something went wrong?

Note: This tutorial shows how to use observer events to automatically monitor SWASH simulation error files and get notified when problems occur.
For a comprehensive guide on observer events with more advanced features and examples, see the full Observer Events tutorial.

The Problem

When running SWASH simulations, errors can occur that might not immediately stop the simulation but indicate problems that need attention. SWASH writes error information to files like Errfile-001, Errfile-002, etc., where the number corresponds to the vCPU that encountered the error. Manually checking these files is tedious, especially for long-running simulations.

The Solution

Use Inductiva's observer events to automatically monitor your SWASH simulation's error files and get email notifications when severe errors are detected.

Quick Setup

Here's how to set up automatic error monitoring for your SWASH simulation:

from inductiva import events

# `task` is the object returned by swash.run(...), as in the complete example.
# Register an observer to monitor any Errfile for severe errors
events.register(
    trigger=events.triggers.ObserverFileRegex(
        task_id=task.id,
        file_path="Errfile-*",  # Wildcard matches Errfile-001, Errfile-002, etc.
        regex=r"Severe error (.+)"),  # Captures text after "Severe error "
    action=events.actions.EmailNotification(
        email_address="your@email.com")
)

How It Works

ObserverFileRegex monitors any file matching the pattern Errfile-* in your task's working directory (e.g., Errfile-001, Errfile-002, etc.)
Wildcard matching uses Linux-style * wildcards to match multiple error files, ensuring you catch errors from any vCPU that encounters problems
Regular expression r"Severe error (.+)" detects lines containing "Severe error " and captures the error message that follows
Email notification is sent immediately when a match is found, including the captured error message
The observer runs in the background, so you don't need to actively monitor the simulation
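
To see what the capture group in the steps above would extract, you can try the same regular expression locally with Python's re module before registering the observer. This is only a sketch: the sample lines are invented for illustration and are not real SWASH output, and applying re.search line by line is an assumption about how the observer evaluates the pattern.

```python
import re

# Same pattern as the one passed to ObserverFileRegex above.
PATTERN = re.compile(r"Severe error (.+)")

# Hypothetical Errfile content, invented for illustration only.
sample_lines = [
    "Warning: time step reduced",
    "Severe error : water depth is negative",
    "severe error in lower case",
]


def captured_messages(lines):
    """Return the text captured by group 1 for each matching line."""
    messages = []
    for line in lines:
        match = PATTERN.search(line)
        if match:
            messages.append(match.group(1))
    return messages


print(captured_messages(sample_lines))
```

Only the second sample line matches. The pattern is case-sensitive and requires a space after "Severe error", so the lower-case line is ignored.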

Wildcard File Matching

The Errfile-* pattern uses Linux-style wildcard matching to monitor multiple error files:

Errfile-* matches any file starting with "Errfile-" followed by any characters
This includes Errfile-001, Errfile-002, Errfile-003, etc.
The number in the filename corresponds to the vCPU number that encountered the error
SWASH creates separate error files for each vCPU to avoid conflicts during parallel execution
Using wildcards ensures you don't miss errors from any vCPU, regardless of which one encounters the problem
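
If you want to check which file names a pattern like Errfile-* would select, Python's fnmatch module gives a reasonable local approximation. This sketch assumes the observer's wildcard semantics agree with fnmatch for this simple pattern, which the platform documentation should confirm. The file names are hypothetical.

```python
from fnmatch import fnmatchcase

PATTERN = "Errfile-*"

# Hypothetical file names in a task's working directory.
candidates = [
    "Errfile-001",
    "Errfile-002",
    "Errfile",      # no hyphen, so the pattern does not match
    "PRINT-001",    # different prefix
    "errfile-003",  # different case
]

# fnmatchcase compares case-sensitively on every platform.
matching = [name for name in candidates if fnmatchcase(name, PATTERN)]
print(matching)
```

Note that a bare Errfile (without the hyphen and number) is not selected by this pattern. If your run can produce such a file, a broader pattern such as Errfile* may be worth considering.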

Customizing the Error Detection

You can modify the regex pattern to detect different types of errors:

# Detect any error (case insensitive)
regex=r"(?i)error"

# Detect specific error types
regex=r"(?i)(fatal|critical|severe) error"

# Detect errors with specific patterns
regex=r"Error \d+: .*"
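
Before choosing one of these patterns, it can help to compare them against a few representative lines. The sketch below does that locally with Python's re module. The sample lines and the pattern labels (any_error, severity, numbered) are made up for illustration and do not come from SWASH.

```python
import re

# The three patterns suggested above, keyed by hypothetical labels.
patterns = {
    "any_error": r"(?i)error",
    "severity": r"(?i)(fatal|critical|severe) error",
    "numbered": r"Error \d+: .*",
}

# Hypothetical log lines, invented for illustration only.
samples = [
    "Severe error : invalid grid",
    "FATAL ERROR in input",
    "Error 12: file not found",
    "Run completed normally",
]


def matching_patterns(line):
    """Return the labels of all patterns that match the given line."""
    return [name for name, pat in patterns.items() if re.search(pat, line)]


for line in samples:
    print(f"{line!r} -> {matching_patterns(line)}")
```

The broad (?i)error pattern matches every line that mentions an error in any case, so it is the most likely to produce notifications for harmless messages. The narrower patterns trade coverage for fewer false alarms.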

What You'll Receive

When a severe error is detected, you'll receive an email that includes the captured error message.

[Image: email notification example]

Complete Example

Here's a complete example of running a SWASH simulation with error monitoring:

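import inductiva
from inductiva import events

swash = inductiva.simulators.SWASH()

# Launch the simulation; the returned task provides the id the observer needs
task = swash.run(...)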

# Set up error monitoring after creating the task
events.register(
    trigger=events.triggers.ObserverFileRegex(
        task_id=task.id,
        file_path="Errfile-*",  # Wildcard matches any Errfile
        regex=r"Severe error (.+)"),
    action=events.actions.EmailNotification(
        email_address="your@email.com")
)

# The observer will automatically monitor for errors during execution

Benefits

Automatic monitoring: No need to manually check error files
Immediate alerts: Get notified as soon as problems occur
Background operation: Monitoring happens without affecting simulation performance
Flexible detection: Customize regex patterns for different error types

This approach ensures you're immediately aware of any issues with your SWASH simulation, allowing you to take corrective action without delay.
