Troubleshooting Tracing

This guide helps you resolve common issues with Sematext Tracing, OpenTelemetry SDKs, and the Sematext Agent.

No Traces Appearing

Check Agent Status

Linux:

sudo systemctl status sematext-agent
sudo journalctl -u sematext-agent -f

Docker:

docker ps | grep sematext-agent
docker logs sematext-agent

Kubernetes:

kubectl get pods -n sematext | grep sematext-agent
kubectl logs -n sematext -l name=sematext-agent

Verify OTLP Configuration

  1. Check ports are open:

    • HTTP: Port 4338
    • gRPC: Port 4337
  2. Test connectivity:

# Test HTTP endpoint
curl -v http://localhost:4338/v1/traces

# Test gRPC endpoint (requires grpcurl)
grpcurl -plaintext localhost:4337 list

  3. Verify environment variables:

echo $OTEL_SERVICE_NAME
echo $OTEL_EXPORTER_OTLP_ENDPOINT
echo $OTEL_EXPORTER_OTLP_PROTOCOL
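If any of these are empty, set them before starting your application. A minimal sketch, assuming the agent runs on the same host with the default HTTP receiver port; the service name and endpoint here are placeholders you should adjust to your setup:

export OTEL_SERVICE_NAME=your-service
export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4338
export OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf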

Service Name Mismatch

The service name in your application must exactly match what's configured in the agent:

Application:

// Your app sets:
const resource = new Resource({
  'service.name': 'frontend'
});

Agent must have:

sudo /opt/spm/spm-monitor/bin/st-agent otel services add \
  --service-names "frontend" \
  --token-group "your-group"

Authentication Errors

Invalid Token

  1. Verify your Tracing App token in Sematext Cloud
  2. Check token configuration in agent:

Linux:

cat /opt/spm/properties/otel.yml 

Docker:

docker exec sematext-agent cat /opt/spm/properties/otel.yml 

Token Not Configured

Ensure token is added to token group:

sudo /opt/spm/spm-monitor/bin/st-agent otel token-groups add \
  --token-group "web-services" \
  --type traces \
  --token "your-traces-token"

Performance Issues

High Memory Usage

  1. Check sampling rate:

    # Development (100% sampling)
    export OTEL_TRACES_SAMPLER=always_on

    # Production (10% sampling)
    export OTEL_TRACES_SAMPLER=traceidratio
    export OTEL_TRACES_SAMPLER_ARG=0.1

  2. Reduce the batch size in your SDK configuration (see the example after this list)

  3. Check agent resource limits (Kubernetes):

    resources:
      limits:
        memory: "512Mi"
      requests:
        memory: "256Mi"
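For step 2, a minimal sketch of shrinking the batch span processor through the standard OpenTelemetry environment variables; the exact values are illustrative, and most SDKs read these at startup:

# Smaller queue and batches reduce the memory held by the exporter
export OTEL_BSP_MAX_QUEUE_SIZE=1024        # SDK default is 2048
export OTEL_BSP_MAX_EXPORT_BATCH_SIZE=256  # SDK default is 512
export OTEL_BSP_SCHEDULE_DELAY=5000        # export interval in milliseconds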

Slow Trace Export

  1. Check network latency to agent
  2. Verify agent isn't overloaded:
    # Check CPU and memory
    top -b -n 1 | grep st-agent
  3. Consider using gRPC instead of HTTP for better performance
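Switching protocols is usually just a matter of exporter settings. A sketch assuming the agent's default gRPC receiver port (4337); note that some SDKs need a separate gRPC exporter package installed (for example opentelemetry-exporter-otlp-proto-grpc in Python):

# Switch the OTLP export from HTTP to gRPC
export OTEL_EXPORTER_OTLP_PROTOCOL=grpc
export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4337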

SDK-Specific Issues

Java

Agent not loading:

# Verify agent file exists and is readable
ls -la opentelemetry-javaagent.jar

# Check Java version compatibility
java -version

Missing traces:

# Enable debug logging
java -javaagent:opentelemetry-javaagent.jar \
  -Dotel.javaagent.debug=true \
  -jar your-app.jar

Python

Import errors:

# Reinstall OpenTelemetry packages
pip install --upgrade opentelemetry-distro[otlp]
opentelemetry-bootstrap -a install

Auto-instrumentation not working:

# Verify installation
opentelemetry-instrument --version

# Check supported libraries
pip list | grep opentelemetry

Node.js

Module not found:

# Reinstall dependencies
npm install @opentelemetry/auto-instrumentations-node
npm install @opentelemetry/exporter-trace-otlp-http

Traces not exported:

// Enable debug logging
const { diag, DiagConsoleLogger, DiagLogLevel } = require('@opentelemetry/api');
diag.setLogger(new DiagConsoleLogger(), DiagLogLevel.DEBUG);

Go

Compilation errors:

# Update dependencies
go get -u go.opentelemetry.io/otel
go get -u go.opentelemetry.io/otel/exporters/otlp/otlptrace
go get -u go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp

No traces sent:

// Add error handling
if err := tracerProvider.Shutdown(context.Background()); err != nil {
    log.Printf("Error shutting down tracer provider: %v", err)
}

.NET

Auto-instrumentation not working:

# Verify profiler is enabled
echo $CORECLR_ENABLE_PROFILING   # Should be 1
echo $CORECLR_PROFILER           # Should be {918728DD-259F-4A6A-AC2B-B85E1B658318}

Missing dependencies:

# Install required packages
dotnet add package OpenTelemetry.Exporter.OpenTelemetryProtocol
dotnet add package OpenTelemetry.Extensions.Hosting

Ruby

Gems not loading:

# Update gems
gem update opentelemetry-sdk
gem update opentelemetry-exporter-otlp

Configuration not applied:

# Verify configuration is loaded
puts OpenTelemetry.tracer_provider.inspect

Agent Configuration Issues

OpenTelemetry Not Enabled

# Enable OpenTelemetry support
sudo /opt/spm/spm-monitor/bin/st-agent otel enable --type traces

# Restart agent
sudo systemctl restart sematext-agent

Service Not Mapped

# List current services
cat /opt/spm/properties/otel.yml | grep -A 5 services

# Add missing service
sudo /opt/spm/spm-monitor/bin/st-agent otel services add \
  --service-names "new-service" \
  --token-group "your-group"

Wrong Port Configuration

# Check current port configuration
cat /opt/spm/properties/otel.yml | grep -A 10 receivers

# Update ports if needed
sudo /opt/spm/spm-monitor/bin/st-agent otel receivers set \
  --type traces --protocol http --port 4338

Docker/Kubernetes Issues

Container Can't Connect to Agent

  1. Check network connectivity:

    # From application container
    ping sematext-agent
    telnet sematext-agent 4338

  2. Verify service discovery (Kubernetes):

    kubectl get svc -n sematext | grep sematext-agent

  3. Check firewall rules and network policies (see the commands below)
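A few commands that help rule out Kubernetes network policies as the culprit; the namespace, pod, and policy names are placeholders, and nc must be available in the application image:

# List network policies that could block egress from the application namespace
kubectl get networkpolicy -n your-app-namespace

# Inspect a specific policy
kubectl describe networkpolicy <policy-name> -n your-app-namespace

# Test connectivity from inside the application pod
kubectl exec -n your-app-namespace <app-pod> -- nc -zv sematext-agent-otlp.sematext 4338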

Environment Variables Not Set

Docker Compose:

environment:
  - OTEL_SERVICE_NAME=your-service
  - OTEL_EXPORTER_OTLP_ENDPOINT=http://sematext-agent:4338

Kubernetes:

env:
  - name: OTEL_SERVICE_NAME
    value: "your-service"
  - name: OTEL_EXPORTER_OTLP_ENDPOINT
    value: "http://sematext-agent-otlp.sematext:4338"

Database Tracing

SQL Query Visibility

OpenTelemetry instrumentation automatically sanitizes SQL queries to protect sensitive data:

  • Parameterized queries (e.g., SELECT * FROM users WHERE id = ?) are shown as-is
  • Non-parameterized queries have literal values replaced with ? placeholders
  • Query parameters are not captured by default for security reasons

Example of what you'll see in traces:

  • Original query: SELECT * FROM users WHERE email = 'john@example.com' AND status = 'active'
  • In traces: SELECT * FROM users WHERE email = ? AND status = ?

Missing Database Operations

If database operations aren't appearing in traces:

  1. Verify database instrumentation is loaded (see the check after this list):

    • Java: Database drivers are auto-instrumented
    • Python: opentelemetry-instrumentation-sqlalchemy, opentelemetry-instrumentation-psycopg2, etc.
    • Node.js: @opentelemetry/instrumentation-mysql, @opentelemetry/instrumentation-pg, etc.
    • Go: Requires manual instrumentation or the otelsql wrapper

  2. Check supported databases:

    • PostgreSQL, MySQL, MariaDB, MongoDB, Redis, Memcached
    • Most JDBC drivers (Java)
    • Most database clients with OpenTelemetry instrumentation libraries

  3. Common issues:

    • ORM queries may need additional instrumentation
    • Connection pooling libraries might need specific instrumentation
    • Native database drivers may not be auto-instrumented
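For step 1, a quick way to confirm the instrumentation packages are actually installed; the package names are examples taken from the list above:

# Python: the relevant instrumentation package must appear in the environment
pip list | grep opentelemetry-instrumentation

# Node.js: confirm the instrumentation is in the dependency tree
npm ls @opentelemetry/instrumentation-pg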

Database Performance Troubleshooting

To identify slow queries in traces:

  1. Filter by operation type:

    • Look for spans with the db.system attribute (postgresql, mysql, etc.)
    • Filter by db.operation (SELECT, INSERT, UPDATE, DELETE)

  2. Analyze query patterns:

    • Check db.statement for query structure
    • Look for N+1 query problems (many similar queries in sequence)
    • Identify missing indexes (slow SELECT operations)

  3. Connection issues:

    • High latency on connection acquisition spans
    • Many short-lived connections (connection pool exhaustion)

Database Span Attributes

Common database attributes you'll see:

  • db.system: Database type (postgresql, mysql, mongodb, redis)
  • db.name: Database/schema name
  • db.statement: SQL query with placeholders
  • db.operation: Operation type
  • db.user: Database user (if not filtered)
  • net.peer.name: Database host
  • net.peer.port: Database port

Common Error Messages

"Connection refused"

  • Agent not running
  • Wrong endpoint URL
  • Firewall blocking connection

"Unauthorized" or "403 Forbidden"

  • Invalid or missing token
  • Token not configured for traces
  • Wrong token type (using logs token instead of traces token)

"Service name not found"

  • Service not configured in agent
  • Typo in service name
  • Case sensitivity issue

"Deadline exceeded"

  • Network timeout
  • Agent overloaded
  • Increase timeout in SDK configuration
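The OTLP exporter timeout can usually be raised through the standard OpenTelemetry environment variable (value in milliseconds; 10000 is the typical SDK default):

# Give the exporter more time before a batch is dropped
export OTEL_EXPORTER_OTLP_TIMEOUT=30000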

Getting Help

If these troubleshooting steps don't resolve your issue:

  1. Check agent logs for detailed error messages
  2. Enable debug logging in your OpenTelemetry SDK
  3. Verify configuration against the SDK documentation
  4. Contact support at support@sematext.com with:

    • Agent version
    • SDK language and version
    • Error messages and logs
    • Configuration details