Telemetry

The Codegen Project CLI collects anonymous usage data to help us understand how the tool is being used and make data-driven improvements.

Privacy First

We take your privacy seriously. Here's what we collect and what we don't:

✅ What We Collect

  • Command usage: Which commands you run (e.g., generate, init)
  • Generator types: Which generators you use (e.g., payloads, channels)
  • Input source types: Whether you use remote URLs, local relative paths, or absolute paths (not the actual paths)
  • Feature usage: Which flags and options you use
  • Error categories: Types of errors that occur (not error messages or stack traces)
  • System information: CLI version, Node.js version, OS platform
  • Execution metrics: Command duration and success rates
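
As an illustration of how an input location can be reduced to a category before anything is reported, here is a minimal sketch. The function name `classifyInputSource` is hypothetical, and the `local_absolute` label is an assumption: the examples on this page only show `local_relative` and `remote_url`. The CLI's actual logic may differ.

```javascript
// Hypothetical helper: maps an input location to a coarse category so the
// location itself never needs to be included in a telemetry event.
function classifyInputSource(inputPath) {
  // Anything with an http(s) scheme counts as remote.
  if (/^https?:\/\//i.test(inputPath)) {
    return 'remote_url';
  }
  // POSIX absolute paths ("/..."), Windows drive paths ("C:\..." or "C:/...")
  // and UNC paths ("\\server\...").
  if (/^(\/|[a-zA-Z]:[\\\/]|\\\\)/.test(inputPath)) {
    return 'local_absolute'; // assumed label, not confirmed by the docs
  }
  // Everything else is treated as relative to the working directory.
  return 'local_relative';
}
```

Only the returned label would be reported; the original string stays on your machine.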

❌ What We DON'T Collect

  • ❌ File paths or file names
  • ❌ Actual URLs or file locations
  • ❌ File contents or schema details
  • ❌ Project names
  • ❌ User names or emails
  • ❌ API keys or credentials
  • ❌ IP addresses (anonymized by analytics provider)
  • ❌ Hostnames
  • ❌ Environment variable values
  • ❌ Git repository information
  • ❌ Custom schema structures

Managing Telemetry

Check Status

View your current telemetry settings:

codegen telemetry status

This shows:

  • Whether telemetry is enabled or disabled
  • Configuration file location
  • What data is collected
  • Environment variable overrides

Disable Telemetry

You can disable telemetry in several ways:

Option 1: Using the CLI command

codegen telemetry disable

Option 2: Environment variable (permanent)

Add to your shell profile (.bashrc, .zshrc, etc.):

export CODEGEN_TELEMETRY_DISABLED=1

Or use the standard DO_NOT_TRACK variable:

export DO_NOT_TRACK=1

Option 3: Environment variable (per-command)

CODEGEN_TELEMETRY_DISABLED=1 codegen generate

Option 4: Project-level configuration

In your codegen.config.js:

export default {
  inputType: 'asyncapi',
  inputPath: './asyncapi.yaml',
  generators: [/* ... */],

  // Disable telemetry for this project
  telemetry: {
    enabled: false
  }
}

Re-enable Telemetry

codegen telemetry enable

First-Run Notice

When you run any command for the first time, you'll see a notice about telemetry:

┌─────────────────────────────────────────────────────────────┐
│                                                             │
│  The Codegen Project CLI collects anonymous usage data      │
│  to help us improve the tool.                               │
│                                                             │
│  To disable: codegen telemetry disable                      │
│  Learn more: https://the-codegen-project.org/docs/telemetry │
│                                                             │
└─────────────────────────────────────────────────────────────┘

This notice is shown only once. Telemetry is opt-out by default, meaning it's enabled unless you explicitly disable it.
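
The show-once behavior can be sketched as a small decision function. This is an illustration only: the helper names are hypothetical, and it assumes the `hasShownTelemetryNotice` flag from the global config file is what records that the notice was displayed. The CI check reflects the behavior described under "CI/CD Environments".

```javascript
// Hypothetical helper: decides whether the first-run notice should be printed.
// `config` mirrors the shape of the global config file; `isCI` is the result
// of whatever CI detection is in use.
function shouldShowTelemetryNotice(config, isCI) {
  if (isCI) {
    return false; // the notice is skipped in CI environments
  }
  return config.hasShownTelemetryNotice !== true; // shown only once
}

// Hypothetical helper: returns a copy of the config with the flag set,
// leaving the original object untouched.
function markNoticeShown(config) {
  return { ...config, hasShownTelemetryNotice: true };
}
```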

Debug Mode

To see what telemetry data is being sent:

CODEGEN_TELEMETRY_DEBUG=1 codegen generate

This logs telemetry events to the console, including:

  • The event being tracked
  • The telemetry configuration state
  • The full payload being sent to the analytics endpoint
  • HTTP response status (success/failure)

Events are still sent to the analytics endpoint in debug mode, but you can see exactly what's being transmitted. The events will also appear in GA4 DebugView when debug mode is enabled.

Custom Tracking Endpoint (for Organizations)

Organizations can point telemetry to their own analytics endpoint using environment variables. These environment variables have the highest priority and will override any configuration from project-level config or global config files:

# Set custom endpoint (highest priority - overrides all other configs)
export CODEGEN_TELEMETRY_ENDPOINT=https://analytics.mycompany.com/telemetry
export CODEGEN_TELEMETRY_ID=custom-tracking-id
export CODEGEN_TELEMETRY_API_SECRET=your-api-secret

Configuration Priority Order (highest to lowest):

  1. Environment variables (highest priority):
    • CODEGEN_TELEMETRY_DISABLED / DO_NOT_TRACK - disable telemetry
    • CODEGEN_TELEMETRY_ENDPOINT - custom analytics endpoint
    • CODEGEN_TELEMETRY_ID - custom tracking ID
    • CODEGEN_TELEMETRY_API_SECRET - custom API secret
  2. Project-level config (from codegen.config.js)
  3. Global config file (~/.the-codegen-project/config.json)
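
To make the precedence concrete, here is a minimal sketch of how the three sources could be merged. It is not the CLI's actual code: `resolveTelemetryConfig` and the `apiSecret` field name are hypothetical, the rule for which environment values count as "set" is an assumption, and so is the idea that project-level config can carry `endpoint` or `trackingId`.

```javascript
// Sketch of the documented precedence:
// environment variables > project-level config > global config file.
function resolveTelemetryConfig(env, projectConfig = {}, globalConfig = {}) {
  const projectTelemetry = projectConfig.telemetry || {};
  const globalTelemetry = globalConfig.telemetry || {};

  // Lower-priority sources are spread first, so later ones win.
  // Telemetry is opt-out, hence the `enabled: true` default.
  const merged = { enabled: true, ...globalTelemetry, ...projectTelemetry };

  // Assumption: any non-empty value other than "0" or "false" counts as set.
  const isSet = (value) =>
    value !== undefined && value !== '' && value !== '0' && value !== 'false';

  if (isSet(env.CODEGEN_TELEMETRY_DISABLED) || isSet(env.DO_NOT_TRACK)) {
    merged.enabled = false;
  }
  if (env.CODEGEN_TELEMETRY_ENDPOINT) merged.endpoint = env.CODEGEN_TELEMETRY_ENDPOINT;
  if (env.CODEGEN_TELEMETRY_ID) merged.trackingId = env.CODEGEN_TELEMETRY_ID;
  // `apiSecret` is a hypothetical field name used only for this sketch.
  if (env.CODEGEN_TELEMETRY_API_SECRET) merged.apiSecret = env.CODEGEN_TELEMETRY_API_SECRET;

  return merged;
}
```

Calling `resolveTelemetryConfig(process.env, projectConfig, globalConfig)` would then yield one flat object in which an environment variable always beats both config files.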

Expected endpoint format (GA4 Measurement Protocol compatible):

POST /telemetry
Content-Type: application/json

{
  "client_id": "anonymous-uuid",
  "events": [{
    "name": "command_executed",
    "params": {
      "command": "generate",
      "flags": "watch",
      "input_source": "local_relative",
      "input_type": "asyncapi",
      "generators": "payloads,parameters",
      "generator_count": 2,
      "duration": 1234,
      "success": true,
      "cli_version": "0.57.0",
      "node_version": "v18.0.0",
      "os": "darwin",
      "ci": false,
      "engagement_time_msec": "1234"
    }
  }]
}
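
If you are implementing a custom endpoint, a small shape check on the incoming body can be a useful starting point. The sketch below only covers the top-level structure shown above (`client_id`, `events`, `name`, `params`). Both function names are hypothetical, and the check is deliberately loose rather than a full GA4 Measurement Protocol validator.

```javascript
// Hypothetical helper: wraps one event in the envelope shown above.
function buildTelemetryPayload(clientId, eventName, params) {
  return {
    client_id: clientId,
    events: [{ name: eventName, params: { ...params } }],
  };
}

// Hypothetical helper for a receiving service: returns true when the body
// has the expected top-level shape. It does not validate individual params.
function isValidTelemetryPayload(body) {
  return (
    body !== null &&
    typeof body === 'object' &&
    typeof body.client_id === 'string' &&
    Array.isArray(body.events) &&
    body.events.every(
      (event) =>
        event !== null &&
        typeof event === 'object' &&
        typeof event.name === 'string' &&
        event.params !== null &&
        typeof event.params === 'object'
    )
  );
}
```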

Configuration File

Telemetry settings are stored in:

~/.the-codegen-project/config.json

Example configuration:

{
  "version": "1.0.0",
  "telemetry": {
    "enabled": true,
    "anonymousId": "550e8400-e29b-41d4-a716-446655440000",
    "endpoint": "https://www.google-analytics.com/mp/collect",
    "trackingId": "G-XXXXXXXXXX"
  },
  "hasShownTelemetryNotice": true,
  "lastUpdated": "2024-12-11T10:30:00Z"
}

Example Telemetry Events

Command Execution

{
  event: 'command_executed',
  command: 'generate',
  flags: 'watch', // Comma-separated if multiple, 'none' if empty
  input_source: 'local_relative', // Not the actual path!
  input_type: 'asyncapi',
  generators: 'payloads,parameters,channels', // Comma-separated list
  generator_count: 3,
  duration: 1234,
  success: true,
  cli_version: '0.57.0',
  node_version: 'v18.0.0',
  os: 'darwin',
  ci: false,
  engagement_time_msec: '1234' // Same as duration for proper engagement tracking
}

Why track generator combinations? This helps us understand:

  • Which generators are commonly used together
  • Popular generator patterns (e.g., "payloads + parameters")
  • If certain generators are always used in isolation
  • Common workflows and use cases
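
The list-style fields in the command_executed example can be derived from plain arrays. The following is a rough sketch under stated assumptions: `summarizeCommand` is a hypothetical name, and only the `'none'` fallback for flags is documented, so an empty generator list is simply joined to an empty string here.

```javascript
// Hypothetical helper: derives the list-style fields of a command_executed
// event from arrays of flag names and generator names.
function summarizeCommand(flags, generators, durationMs) {
  return {
    // Comma-separated if multiple, 'none' if empty (as documented for flags).
    flags: flags.length > 0 ? flags.join(',') : 'none',
    // Comma-separated list; behavior for an empty list is an assumption.
    generators: generators.join(','),
    generator_count: generators.length,
    duration: durationMs,
    // The examples send this as a string with the same value as duration.
    engagement_time_msec: String(durationMs),
  };
}
```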

Generator Usage

{
  event: 'generator_used',
  generator_type: 'payloads',
  input_type: 'asyncapi', // Can be: asyncapi, openapi, jsonschema
  input_source: 'remote_url', // Not the actual URL!
  language: 'typescript',
  options: '{"includeValidation":true,"serializationType":"json"}',
  duration: 500,
  success: true,
  cli_version: '0.57.0',
  node_version: 'v18.0.0',
  os: 'darwin',
  ci: false,
  engagement_time_msec: '500'
}

Why track individual generators? This helps us understand:

  • Which generators are most popular
  • How users configure generators (validation, serialization, etc.)
  • Performance characteristics of each generator
  • Success/failure rates per generator type

Combined with the command_executed event, we get both:

  • Macro view: What generators are used together
  • Micro view: How each generator is configured

Init Command

{
  event: 'init_executed',
  config_type: 'esm',
  input_type: 'asyncapi',
  generators: 'payloads,parameters,channels', // Comma-separated list
  language: 'typescript',
  completed: true,
  cli_version: '0.57.0',
  node_version: 'v18.0.0',
  os: 'darwin',
  ci: false,
  engagement_time_msec: '100' // Minimum engagement time
}

Error Tracking

{
  event: 'error_occurred',
  command: 'generate',
  error_type: 'configuration_error', // Category only, not actual error message
  cli_version: '0.57.0',
  node_version: 'v18.0.0'
}

CI/CD Environments

Telemetry automatically detects CI environments and adjusts behavior:

  • First-run notice is skipped in CI environments
  • Telemetry still runs by default (to track CI usage patterns)
  • You can disable it with environment variables if needed

Detected CI environments:

  • GitHub Actions
  • GitLab CI
  • CircleCI
  • Travis CI
  • Jenkins
  • Bitbucket Pipelines
  • AWS CodeBuild
  • TeamCity
  • Buildkite
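
For reference, CI detection of this kind is commonly done by checking environment variables that each provider sets. The sketch below uses the well-known marker variable for each provider listed above. It is an illustration, not the CLI's actual detection code, and `detectCI` is a hypothetical name.

```javascript
// Marker environment variables set by each CI provider.
// The CLI may check different or additional variables.
const CI_ENV_MARKERS = {
  'GitHub Actions': 'GITHUB_ACTIONS',
  'GitLab CI': 'GITLAB_CI',
  'CircleCI': 'CIRCLECI',
  'Travis CI': 'TRAVIS',
  'Jenkins': 'JENKINS_URL',
  'Bitbucket Pipelines': 'BITBUCKET_BUILD_NUMBER',
  'AWS CodeBuild': 'CODEBUILD_BUILD_ID',
  'TeamCity': 'TEAMCITY_VERSION',
  'Buildkite': 'BUILDKITE',
};

// Hypothetical helper: returns the name of the first matching provider,
// or null when no marker variable is present.
function detectCI(env) {
  for (const [provider, variable] of Object.entries(CI_ENV_MARKERS)) {
    if (env[variable]) {
      return provider;
    }
  }
  return null;
}
```

Passing `process.env` gives the provider for the current process; the `ci` field in the example events would then be `detectCI(process.env) !== null`.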

Privacy & Compliance

GDPR Compliance

Our telemetry implementation is GDPR compliant:

  • Lawful Basis: Legitimate interest (improving software)
  • Transparency: Clear notice on first run
  • User Control: Easy opt-out mechanism
  • Data Minimization: Only collect necessary data
  • Purpose Limitation: Use only for improvement
  • Anonymization: No PII collected
  • Right to Object: Users can disable anytime

Data Retention

We retain telemetry data for 14 months. If you send telemetry to your own endpoint, the retention period is up to you.

Anonymous ID

Each installation generates a random UUID (v4) as an anonymous identifier. This ID:

  • Is NOT tied to your identity
  • Cannot be used to identify you personally
  • Is only used to understand usage patterns
  • Can be reset by deleting the config file
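
A minimal sketch of this behavior, assuming the ID is stored under `telemetry.anonymousId` as in the example configuration file. `getOrCreateAnonymousId` is a hypothetical name. `randomUUID` from Node's built-in `crypto` module produces a version 4 UUID. The snippet uses CommonJS `require`; in an ESM project the equivalent is `import { randomUUID } from 'node:crypto'`.

```javascript
const { randomUUID } = require('node:crypto');

// Hypothetical helper: returns the config unchanged if it already has an
// anonymous ID, otherwise returns a copy with a freshly generated one.
function getOrCreateAnonymousId(config) {
  const telemetry = config.telemetry || {};
  if (typeof telemetry.anonymousId === 'string' && telemetry.anonymousId !== '') {
    return config;
  }
  return {
    ...config,
    telemetry: { ...telemetry, anonymousId: randomUUID() },
  };
}
```

Because the ID lives only in the config file, deleting that file means the next run starts from an empty config and generates a new, unrelated ID.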

How Telemetry Helps

The data we collect helps us:

  1. Prioritize features: Focus on the most-used generators and commands
  2. Improve reliability: Identify and fix common error scenarios
  3. Optimize performance: Understand typical execution times
  4. Support platforms: Know which Node.js versions and OS platforms to support
  5. Guide documentation: Understand which features cause confusion
  6. Understand workflows: Learn whether users prefer remote URLs, relative paths, or absolute paths

Technical Details

Implementation

  • Non-blocking: Telemetry runs asynchronously and never blocks CLI execution
  • Fail-safe: Network errors or timeouts don't affect CLI functionality
  • Fast timeout: Telemetry requests time out after 1 second
  • Error handling: All errors are handled gracefully and silently
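
These properties can be illustrated with a short sketch built on the `fetch` and `AbortSignal.timeout` APIs available in recent Node.js versions. This is not the CLI's actual implementation. `sendTelemetry` is a hypothetical name, and the `fetchImpl` parameter exists only so the function can be exercised without a network.

```javascript
// Hypothetical fail-safe sender: resolves to true on a successful response
// and to false on any failure. It never rejects.
async function sendTelemetry(endpoint, payload, fetchImpl = fetch, timeoutMs = 1000) {
  try {
    const response = await fetchImpl(endpoint, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify(payload),
      // Aborts the request if it takes longer than timeoutMs.
      signal: AbortSignal.timeout(timeoutMs),
    });
    return response.ok;
  } catch {
    // Network errors, timeouts, and serialization errors are swallowed.
    return false;
  }
}
```

A caller that wants non-blocking behavior would start the request without awaiting it, for example `void sendTelemetry(endpoint, payload)`, and carry on with the command.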

Default Analytics Provider

We use Google Analytics 4 Measurement Protocol by default:

  • Free service with powerful analytics
  • Automatic IP anonymization
  • GDPR compliant
  • No additional infrastructure needed

Website Analytics

In addition to CLI telemetry, our documentation website (https://the-codegen-project.org) also uses Google Analytics 4 to understand how users interact with our documentation.

What the Website Tracks

  • Page views: Which documentation pages are viewed
  • Navigation: How users navigate through the documentation
  • Search queries: What users search for in the docs
  • Outbound links: Which external links users click
  • Time on page: How long users spend reading documentation
  • Referral sources: How users found our documentation

Note: The website uses a different Google Analytics property than the CLI telemetry. They are completely separate tracking systems.

What the Website Does NOT Track

  • ❌ Personal information
  • ❌ Form inputs or data
  • ❌ Clipboard contents
  • ❌ Code snippets you copy
  • ❌ IP addresses (anonymized by GA4)

Website Privacy Controls

Standard Browser Controls:

  • Use browser "Do Not Track" settings
  • Install privacy extensions (uBlock Origin, Privacy Badger, etc.)
  • Use browser incognito/private mode
  • Disable JavaScript (documentation still accessible)

Website-Specific Settings:

  • Our website respects the browser's Do Not Track (DNT) header
  • No cookies are set for tracking purposes
  • Google Analytics IP anonymization is enabled
  • No third-party tracking scripts beyond GA4

FAQ

Q: Will telemetry slow down my CLI?

A: No. Telemetry runs asynchronously and doesn't block command execution. Network requests time out after 1 second and fail silently.

Q: Can telemetry errors break my CLI?

A: No. All telemetry functions are designed to never throw errors. Failures are handled internally and don't affect CLI functionality.

Q: Does this work behind a corporate proxy?

A: Yes. Telemetry respects standard HTTP_PROXY and HTTPS_PROXY environment variables. If it fails, it fails silently without affecting the CLI.

Q: Can I see what's being sent?

A: Yes! Use debug mode:

CODEGEN_TELEMETRY_DEBUG=1 codegen generate

Q: Why opt-out instead of opt-in?

A: Opt-out telemetry provides more representative data about how the tool is actually used, which leads to better improvements for all users. However, we respect your choice to opt out at any time.

Q: Is my company's internal tracking supported?

A: Yes! Set CODEGEN_TELEMETRY_ENDPOINT to your internal analytics service. See the "Custom Tracking Endpoint" section above.

Q: Where is the data sent?

A: By default, to Google Analytics 4 (anonymized). You can configure a custom endpoint for organizational tracking.

Q: Can you track me across projects?

A: We use an anonymous UUID that is the same across all your projects (anywhere you use The Codegen Project CLI on the same machine and user account), but it's not tied to any personal information. You can reset it by deleting ~/.the-codegen-project/config.json.

Q: Are CLI telemetry and website analytics linked?

A: No. The CLI uses an anonymous UUID that is never shared with the website. Website analytics use standard Google Analytics browser tracking. There is no way to correlate CLI usage with website visits - they are completely independent systems.

Contact

If you have questions or concerns about telemetry: