Release ML-URLSpoof-20210805-1 · elastic/detection-rules

model name: urlspoof_20210803_1.0
sha256: 4cbd8d82d382864d28147c5f80ac86108e774319bbe5d2c4c9f3c68d9f86e01e
for details, reference: https://github.com/elastic/detection-rules/tree/main/docs/experimental-machine-learning
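
A downloaded artifact can be checked against the sha256 value above before it is loaded. The snippet below is a minimal sketch, not part of the release. It assumes the checksum covers a single downloaded file; the function name and the path in the usage comment are hypothetical.

```python
import hashlib

# Checksum copied from the release notes above.
EXPECTED_SHA256 = "4cbd8d82d382864d28147c5f80ac86108e774319bbe5d2c4c9f3c68d9f86e01e"


def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_release_checksum(path: str, expected: str = EXPECTED_SHA256) -> bool:
    """Compare a file's digest to the expected value, ignoring hex case."""
    return sha256_of_file(path) == expected.lower()


# Usage (hypothetical path, adjust to wherever the artifact was saved):
# matches_release_checksum("downloads/release_artifact.zip")
```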

changelog

This is the first release package for URL Spoofing. It consists of the following:

Feature Extraction Scripts:
- ml_urlspoof_char_continuity_script.json: Calculate the continuity of different parts of the domain (i.e. the number of consecutive characters before a number is seen)
- ml_urlspoof_domain_entropy_script.json: Calculate the entropy of the URL domain
- ml_urlspoof_keyword_extractor_script.json: Extract keywords of interest from certain features
- ml_urlspoof_ngrams_extractor_script.json: Extract ngrams from certain features
- ml_urlspoof_remove_features_script.json: Remove extra fields created for prediction purposes to avoid cluttering incoming documents - this will NOT remove any of your original fields in your documents
- ml_urlspoof_tld_keyword_extractor_script.json: Extract top-level domain (TLD) keywords of interest from certain features
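
To give a feel for the kind of features listed above, here is a small Python sketch of three of them: domain entropy, character continuity, and n-gram extraction. It is illustrative only and is not the code shipped in the release scripts. The function names, the use of Shannon entropy over characters, and the choice of character-level n-grams are all assumptions made for the example.

```python
import math
from collections import Counter
from typing import List


def shannon_entropy(text: str) -> float:
    """Shannon entropy (bits per character) of a string. Assumed definition."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())


def chars_before_first_digit(text: str) -> int:
    """Count the characters that appear before the first digit.

    Returns the full length when the string contains no digit.
    """
    for index, ch in enumerate(text):
        if ch.isdigit():
            return index
    return len(text)


def char_ngrams(text: str, n: int) -> List[str]:
    """Return all overlapping character n-grams of length n."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    return [text[i:i + n] for i in range(len(text) - n + 1)]


if __name__ == "__main__":
    domain = "paypa1-login"  # hypothetical example domain
    print(shannon_entropy(domain))
    print(chars_before_first_digit(domain))
    print(char_ngrams(domain, 3))
```

A lookalike domain such as the one in the usage example tends to have a short run of letters before its first digit, which is the kind of signal a continuity feature is meant to capture.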
Model:
- ml_urlspoof_model.json: Supervised model to classify URLs as malicious vs benign
Inference Pipeline:
- ml_urlspoof_inference_pipeline.json: Inference pipeline to make predictions on URLs using the URL Spoofing model and threat intelligence enrichments
Training Pipeline:
- ml_urlspoof_features_pipeline.json: Training pipeline used to train the URL Spoofing model - this is primarily for analysts looking for a starting point to train their own model