
Conversation

@zmoog
Contributor

@zmoog zmoog commented Sep 26, 2022

What does this PR do?

Expand Azure Logs integration docs to make it easier for users to set it up.

It addresses issues 2 and 5 from #4169.

Disclaimer: this is an early draft to share the change with the cloud monitoring team members and docs wizards.

Checklist

  • I have reviewed tips for building integrations and this pull request is aligned with them.
  • I have verified that all data streams collect metrics or logs.
  • I have added an entry to my package's changelog.yml file.
  • I have verified that Kibana version constraints are current according to guidelines.

Related issues

Screenshots

[Screenshot: CleanShot 2022-09-30 at 13 14 41@2x]

I expanded this section, adding a list of the required components to set up an Azure Logs integration. The section also adds some details about each component and the sub-elements the user needs to know later in the Setup section.
@zmoog zmoog self-assigned this Sep 26, 2022
@zmoog zmoog added enhancement New feature or request Integration:azure Azure Logs Team:Cloud-Monitoring Label for the Cloud Monitoring team labels Sep 26, 2022
@elasticmachine

elasticmachine commented Sep 26, 2022

🚀 Benchmarks report

Package azure 👍(1) 💚(2) 💔(6)

| Data stream | Previous EPS | New EPS | Diff (%) | Result |
|---|---|---|---|---|
| activitylogs | 1153.4 | 652.32 | -501.08 (-43.44%) | 💔 |
| auditlogs | 2272.73 | 1438.85 | -833.88 (-36.69%) | 💔 |
| identity_protection | 2341.92 | 1828.15 | -513.77 (-21.94%) | 💔 |
| platformlogs | 3076.92 | 2109.7 | -967.22 (-31.43%) | 💔 |
| signinlogs | 1600 | 948.77 | -651.23 (-40.7%) | 💔 |
| springcloudlogs | 3597.12 | 2061.86 | -1535.26 (-42.68%) | 💔 |

To see the full report, comment with `/test benchmark fullreport`.

@elasticmachine

elasticmachine commented Sep 26, 2022

💚 Build Succeeded



Build stats

  • Start Time: 2022-10-03T08:57:42.515+0000

  • Duration: 14 min 39 sec

Test stats 🧪

| Test Results | |
|---|---|
| Failed | 0 |
| Passed | 119 |
| Skipped | 0 |
| Total | 119 |

🤖 GitHub comments


To re-run your PR in the CI, just comment with:

  • /test : Re-trigger the build.

@elasticmachine

elasticmachine commented Sep 26, 2022

🌐 Coverage report

| Name | Metrics % (covered/total) | Diff |
|---|---|---|
| Packages | 100.0% (9/9) | 💚 |
| Files | 85.0% (17/20) | 👎 -12.208 |
| Classes | 85.0% (17/20) | 👎 -12.208 |
| Methods | 82.143% (138/168) | 👎 -7.535 |
| Lines | 85.578% (2504/2926) | 👎 -5.957 |
| Conditionals | 100.0% (0/0) | 💚 |
Highlighting the kind of data that flows between the components is probably helpful.
@colleenmcginnis
Contributor

👋 @zmoog I like the direction you're taking here!

One thing that isn't clear (yet) in this draft is the relationship between this Azure Logs integration and the individual Active Directory, Activity logs, Firewall logs, Platform logs, and Spring Cloud logs integrations. Are they intended to be used together? Or should a user choose either the Azure Logs integration or one or more of the other integrations for collecting logs? In what situations should you choose each approach?

@zmoog
Contributor Author

zmoog commented Sep 27, 2022

> 👋 @zmoog I like the direction you're taking here!

That's good! So I'll move ahead exploring how to tackle the Setup section.

> One thing that isn't clear (yet) in this draft is the relationship between this Azure Logs integration and the individual Active Directory, Activity logs, Firewall logs, Platform logs, and Spring Cloud logs integrations. Are they intended to be used together? Or should a user choose either the Azure Logs integration or one or more of the other integrations for collecting logs? In what situations should you choose each approach?

I haven't touched the documentation of the individual integrations yet. The idea is to replicate the AWS doc approach: general and shared information in the main README.md file, and the specific information and references (fields and sample logs) in the individual .md files.

I moved some of the content from the Requirements section to the Setup section. There were too many details.
@zmoog
Contributor Author

zmoog commented Sep 27, 2022

Hey @colleenmcginnis, I expanded the Setup section by drafting the kind of content users probably need to set up Azure Logs. I also moved some details from the Requirements section to the Setup one. I updated the screenshots.

Let me know what you think about this approach!

The next step is revising the Settings section, working on the individual integration pages, and moving the references from the README.

Refine event hub setup information a little
Logs reference is probably more useful in the individual integration page.
@zmoog
Contributor Author

zmoog commented Sep 28, 2022

@colleenmcginnis, I reduced the scope of this PR a little; I plan a series of smaller, more focused PRs next.

In this PR, I focus on addressing issues 2 and 5 from #4169 (the missing information about event hubs and the ambiguity between event hub namespace and event hub).

I also removed the field details and sample events from the Reference section on the main page, similar to the AWS integration doc.

@zmoog zmoog marked this pull request as ready for review September 28, 2022 21:11
@zmoog zmoog requested a review from a team as a code owner September 28, 2022 21:11
Contributor

@alaudazzi alaudazzi left a comment

Left a few editing suggestions, otherwise LGTM.

@alaudazzi
Contributor

@zmoog
Great PR 🥇 Thank you for reworking these instructions!
As we discussed yesterday, reducing the scope of this PR and addressing the remaining items from #4169 makes total sense.

@zmoog zmoog force-pushed the zmoog/azure-logs-doc-update branch from 08800f7 to aee15cc Compare September 29, 2022 10:29
@zmoog zmoog force-pushed the zmoog/azure-logs-doc-update branch from aee15cc to 0858939 Compare September 29, 2022 10:34
Co-authored-by: Arianna Laudazzi <46651782+alaudazzi@users.noreply.github.com>
#### How many event hubs?

Examples:
Elastic recommends creating one event hub for each Azure service you collect data from. For example, if you plan to collect Azure Active Directory (Azure AD) logs and Activity logs, create two event hubs: one for Azure AD and one for Activity logs.
Contributor

Still, the configuration of an integration allows specifying only one event hub, but the user can enable the processing of multiple types of events. So it seems that, as the Azure integration, we somehow assume an event hub can contain more than one type of events/logs.

Contributor Author

Some integrations are designed to filter incoming logs and drop those that are not on the supported log categories list.

For example, both sign-in and audit logs have a processor in their ingest pipelines that drops logs that don't belong to the supported log categories. So you can enable all the Azure AD logs, and each one will ingest only the right log messages. The price is some inefficiency: all data streams receive the same messages, and dropping a message is not completely free.
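
For illustration, a category-based drop processor looks roughly like the sketch below. This is not the integration's actual pipeline: the pipeline id, the field name, and the category value are placeholders.

```python
# Illustrative sketch only: create an ingest pipeline with a category-based drop
# processor, similar in spirit to what the integration's pipelines do.
# The pipeline id, field, and category below are placeholders, not the real ones.
import requests

pipeline = {
    "description": "Example: drop events outside the supported log categories",
    "processors": [
        {
            "drop": {
                # Painless condition: drop anything whose category is not SignInLogs
                "if": "ctx.azure?.category == null || ctx.azure.category != 'SignInLogs'"
            }
        }
    ],
}

resp = requests.put(
    "http://localhost:9200/_ingest/pipeline/example-drop-unsupported-categories",
    json=pipeline,
    auth=("elastic", "changeme"),  # placeholder credentials
)
resp.raise_for_status()
```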

Other integrations, like the generic Event Hub integration, can ingest any log category, but they index only the common fields. So enabling a generic Event Hub integration alongside others is not recommended and can lead to indexing the same log multiple times, in different data streams, and with different field mappings.

@lucabelluccini, do you think we need to elaborate a little instead of just stating "Elastic recommends creating one event hub"?

Contributor

> For example, both sign-in and audit logs have a processor in their ingest pipelines that drops logs that don't belong to the supported log categories.

If the drop happens in the ingest pipeline on the Elasticsearch side, leaving all the types of logs enabled in the integration settings is a waste of network and ingest pipeline resources on the ingest nodes, as we're basically multiplying the data N times (N = the number of log types/data streams left enabled).

E.g. I collect only the AD logs and route them to one event hub. If I configure the Azure Logs integration with the defaults (all options enabled) as below, the amount of data transferred to ES (only to be thrown away, except for the AD logs) is huge...

[Screenshot: the Azure Logs integration settings with all data stream options enabled]

Contributor Author

Yep. That's why I want to update this document so badly.

I also want to open a different PR to change the default value from enabled to disabled (the new data streams added to Azure AD logs are disabled by default).

The Azure module for Filebeat also recommends one event hub per log type:

> It is recommended to use a separate eventhub for each log type as the field mappings of each log type are different.

The reason is different but still valid.
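
For context, the 1:1 mapping between an Azure service and an event hub is established through that service's diagnostic settings. Below is a rough, assumption-laden sketch with the Azure SDK for Python; every resource name, ID, and category is a placeholder, not something defined by the integration.

```python
# Assumption-laden sketch: route one Azure service's logs to a dedicated event hub by
# creating a diagnostic setting on that resource.
# Requires azure-identity and azure-mgmt-monitor; all names/IDs are placeholders.
from azure.identity import DefaultAzureCredential
from azure.mgmt.monitor import MonitorManagementClient

subscription_id = "00000000-0000-0000-0000-000000000000"  # placeholder
client = MonitorManagementClient(DefaultAzureCredential(), subscription_id)

# The resource whose logs we want to collect (placeholder Key Vault).
resource_uri = (
    f"/subscriptions/{subscription_id}/resourceGroups/rg-logs"
    "/providers/Microsoft.KeyVault/vaults/kv-example"
)
# Authorization rule of the event hub namespace that will receive the logs.
auth_rule_id = (
    f"/subscriptions/{subscription_id}/resourceGroups/rg-logs"
    "/providers/Microsoft.EventHub/namespaces/ns-logs"
    "/authorizationRules/RootManageSharedAccessKey"
)

client.diagnostic_settings.create_or_update(
    resource_uri=resource_uri,
    name="platform-logs-to-eventhub",
    parameters={
        "event_hub_authorization_rule_id": auth_rule_id,
        "event_hub_name": "platformlogs",  # dedicated event hub for this service
        "logs": [{"category": "AuditEvent", "enabled": True}],
    },
)
```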

Contributor Author

We agreed to make the following changes:

  • Add a disclaimer at the top of the README.md document suggesting that users install the "individual" integrations.
  • We "strongly recommend"
This setting can also be used to define your own endpoints, like for hybrid cloud models.
It is not recommended to use the same event hub for multiple integrations.

For high-volume deployments, we recommend one event hub for each data stream.
Contributor

We introduce the concept of Elasticsearch data stream, but the user doesn't see the word data stream when configuring the Azure Logs Integration.
What is the objective of this statement?

Contributor Author

The goal was to introduce the idea that high-volume data streams may require additional work.

A good trade-off for most users is one event hub for all Azure AD logs (made of four data streams now), but if you have a substantial Active Directory deployment, you may consider moving to one event hub for each data stream.

Like all other clients, Elastic Agent should specify a consumer group to access the event hub.

## Logs reference
A Consumer Group is a view (state, position, or offset) of an entire event hub. Consumer groups enable multiple agents to each have a separate view of the event stream, and to read the logs independently at their own pace and with their own offsets.
Contributor

This might be misleading - at least to my novice eyes.

If the same consumer group is used across multiple Elastic Agents with Azure Logs configured identically, it allows them to read the logs concurrently and without duplicates.
Different consumer groups allow having a separate view of the event stream and reading the logs independently at their own pace and with their own offsets.

Contributor Author

We need to go one level deeper to understand what's going on.

The current integration structure forces the same event hub and consumer group to be shared across all enabled data streams.

Every enabled data stream spawns an `azureeventhub` input that connects to the same event hub and uses the same consumer group name:

[Diagram: one `adlogs` event hub feeds three Filebeat `azureeventhub` inputs (signin, audit, activity), all using the `$Default` consumer group; each input stores its consumer group info (state, position, or offset) in its own blob in the storage account container.]

But, since each input stores the consumer group info (state, position, or offset) in a different blob, it's as if each one uses a dedicated consumer group, even though they all use the same consumer group name.

That's why each data stream connected to the same event hub receives a copy of each message.
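
To make the interaction between a consumer group and the checkpoint blobs concrete, here is a minimal sketch using the Event Hubs SDK for Python. The agent input is not implemented with this SDK, and all connection strings and names below are placeholders; the sketch only shows how a consumer reads with a consumer group and persists its position to a blob.

```python
# Minimal sketch (not how the agent input is implemented): read from an event hub with
# the $Default consumer group and store the position (checkpoint) in a blob container.
# Requires azure-eventhub and azure-eventhub-checkpointstoreblob; names are placeholders.
from azure.eventhub import EventHubConsumerClient
from azure.eventhub.extensions.checkpointstoreblob import BlobCheckpointStore

# The checkpoint store keeps the consumer group state (position/offset) in blobs, so
# consumers that share it can split partitions instead of re-reading the same events.
checkpoint_store = BlobCheckpointStore.from_connection_string(
    "<storage-account-connection-string>",
    container_name="logs-checkpoints",
)

client = EventHubConsumerClient.from_connection_string(
    "<event-hub-namespace-connection-string>",
    consumer_group="$Default",
    eventhub_name="adlogs",
    checkpoint_store=checkpoint_store,
)

def on_event(partition_context, event):
    print(f"partition {partition_context.partition_id}: {event.body_as_str()}")
    partition_context.update_checkpoint(event)  # persist the new offset to the blob

with client:
    client.receive(on_event=on_event, starting_position="-1")  # "-1" = from the start
```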

Contributor Author

If you assign the same agent policy to multiple Elastic Agents, they end up using the same blob and meet the goal of sharing the load:

[Diagram: the same `adlogs` event hub feeds two Filebeat instances, each running a `signin` input with the `$Default` consumer group; both inputs store their consumer group info (state, position, or offset) in the same `signin` blob in the storage account container.]

In this example, both Filebeat instances will use the same blob. The blob name follows the pattern `filebeat-signinlogs-{{eventhub}}`.
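
If you want to verify which checkpoint blobs the inputs created (and therefore which data stream/event hub pairs are active), a minimal sketch with the Blob Storage SDK for Python follows; the connection string and container name are placeholders.

```python
# Minimal sketch: list the checkpoint blobs in the storage account container.
# Requires azure-storage-blob; the connection string and container name are placeholders.
from azure.storage.blob import ContainerClient

container = ContainerClient.from_connection_string(
    "<storage-account-connection-string>",
    container_name="filebeat-checkpoints",
)
for blob in container.list_blobs():
    print(blob.name)  # expect one blob per data stream / event hub pair
```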

Contributor Author

@lucabelluccini, what do you think is the best way to describe the role of the consumer group to a user who wants to set up an integration?

Contributor Author

We agreed to make the following changes:

  • add a paragraph that describes that consumer groups allow Elastic Agents to collaborate and scale the throughput
  • underline that having an event hub per log type is a good thing
A Consumer Group is a view (state, position, or offset) of an entire event hub. Consumer groups enable multiple agents to each have a separate view of the event stream, and to read the logs independently at their own pace and with their own offsets.

### Activity logs
In most cases, you can use the default value of `$Default`.
Contributor

Can we mention an example of when this should be configured?

Contributor Author

Yep. I am not an expert here, but I think having a consumer group named after the log type it serves would help. I will update the doc accordingly.

zmoog and others added 5 commits September 29, 2022 17:20
Co-authored-by: Luca Belluccini <luca.belluccini@elastic.co>
Co-authored-by: Luca Belluccini <luca.belluccini@elastic.co>
  • put more emphasis on the "Azure service" concept; we want to make it a first-class citizen of this doc to leverage it when we discuss the recommended 1:1 mapping between service and event hub.
  • recommend installing the individual integrations vs. the collective one
  • clarify the role of consumer group and storage account container as enablers of shared logs processing.
  • minor stuff (add more links and supporting diagrams)
@zmoog zmoog force-pushed the zmoog/azure-logs-doc-update branch from d930a4b to 69153c6 Compare September 30, 2022 10:05
@zmoog
Contributor Author

zmoog commented Sep 30, 2022

Hey @alaudazzi @lucabelluccini, I pushed an update that addresses the topic we discussed earlier today.

Let me know what you think! I'm more than happy to clarify, expand or fix errors.


A Consumer Group is a view (state, position, or offset) of an entire event hub. Consumer groups enable multiple agents to each have a separate view of the event stream, and to read the logs independently at their own pace and with their own offsets.

Consumer groups allow the Elastic Agents assigned to the same agent policy to work together on log processing to increase ingestion throughput, if required.


same as above :)


I also find that mentioning the concept of "horizontal scaling" might increase clarity, but it could be a subjective opinion.

Contributor Author

Yeah, I agree, and I'm also stealing the "horizontal scaling" thing 😇

@zmoog zmoog force-pushed the zmoog/azure-logs-doc-update branch from 1208a48 to 25c75c4 Compare September 30, 2022 11:16
@zmoog zmoog merged commit e4a3193 into elastic:main Oct 3, 2022
@zmoog zmoog deleted the zmoog/azure-logs-doc-update branch October 3, 2022 09:12

Labels

enhancement New feature or request Integration:azure Azure Logs Team:Cloud-Monitoring Label for the Cloud Monitoring team

7 participants