Create Apache Spark integration #493

@sorantis

Apache Spark is an open-source data processing engine for large data sets. Like Hadoop, Spark splits large tasks across multiple nodes. However, it tends to run faster than Hadoop, and it caches and processes data in random access memory (RAM) instead of on a file system. This enables Spark to handle use cases that Hadoop cannot.

Spark provides the following types of metrics: Gauge, Counter, Histogram, Meter, and Timer. Gauges and counters are the most common types used in Spark instrumentation, so we will support only Gauge and Counter metrics from the following providers:

  • Driver
  • Executor
  • ApplicationMaster
  • Mesos Cluster
  • Master
  • ApplicationSource
  • Worker
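The scoping rule above (keep only Gauge and Counter metrics, whichever provider they come from) can be sketched as a simple filter. This is a minimal illustration and not the integration's actual code: the `Metric` record, the `select_supported` helper, and the sample metric names are all hypothetical.

```python
from dataclasses import dataclass

# Metric types this integration plans to support, per the description above.
SUPPORTED_TYPES = {"gauge", "counter"}


@dataclass(frozen=True)
class Metric:
    """Hypothetical record for one collected Spark metric."""
    provider: str  # e.g. "driver", "executor", "worker"
    name: str      # metric name (the sample names below are made up)
    type: str      # gauge, counter, histogram, meter, or timer
    value: float


def select_supported(metrics):
    """Return only the metrics whose type is Gauge or Counter."""
    return [m for m in metrics if m.type.lower() in SUPPORTED_TYPES]


# Illustrative input; these names do not come from Spark itself.
sample = [
    Metric("driver", "example.memory.used", "gauge", 512.0),
    Metric("executor", "example.tasks.completed", "counter", 42.0),
    Metric("executor", "example.task.duration", "timer", 1.5),
]

supported = select_supported(sample)
```

With the sample input, the timer entry is dropped and the gauge and counter entries are kept in their original order.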

Each Spark instance can report to zero or more of the nine available sinks. We will support only JmxSink to start with.
For more details regarding statistics provided by Apache Spark, please refer to this page.
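For reference, Spark enables a sink through its `metrics.properties` file, using keys of the form `[instance].sink.[name].class`. The sketch below generates the lines that turn on JmxSink. The `jmx_sink_properties` helper is hypothetical and exists only for illustration; the sink class name and the `*` wildcard (all instances) follow Spark's metrics configuration template.

```python
# Fully qualified class name of Spark's JMX sink.
JMX_SINK_CLASS = "org.apache.spark.metrics.sink.JmxSink"


def jmx_sink_properties(instances=("*",)):
    """Build metrics.properties lines enabling JmxSink.

    `instances` holds Spark metrics instance names such as "master",
    "worker", "driver", or "executor"; "*" applies to all instances.
    """
    lines = [f"{inst}.sink.jmx.class={JMX_SINK_CLASS}" for inst in instances]
    return "\n".join(lines) + "\n"


# Default: enable JmxSink for every instance.
config_text = jmx_sink_properties()
```

The resulting text can be written to `metrics.properties` in Spark's configuration directory, or to a custom file referenced by the `spark.metrics.conf` property.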

Package / Dataset creation or update checklist

This checklist is intended for developers who create or update a package, to make sure packages are consistent.

All Changes

  • Change follows development guidelines
  • Supported versions of the subject being monitored are documented
  • Supported operating systems are documented (if applicable)
  • System tests exist
  • Documentation
  • Fields follow ECS and naming conventions
  • At least a manual test with ES / Kibana / Agent has been performed.
  • The required Kibana version is set to the lowest version used in the manual test.

Dashboards

  • Dashboards exist (if applicable)
  • Screenshots of added / updated dashboards
  • Datastream filters added to visualizations

Log datasets

  • Pipeline tests exist (if applicable)
  • Test log files exist for the grok patterns
  • Generated output for at least 1 log file exists

Metric datasets

This entry is currently recommended. It will be mandatory once we provide better support for it.

  • Sample event (sample_event.json) exists

New Packages

  • Screenshot of the Fleet "Add Integration" Page.
