ML-URLSpoof-20210805-1

Pre-release

@brokensound77 released this 05 Aug 19:04
· 2423 commits to main since this release
17bf3c1

model name: urlspoof_20210803_1.0
sha256: 4cbd8d82d382864d28147c5f80ac86108e774319bbe5d2c4c9f3c68d9f86e01e
for details, see: https://github.com/elastic/detection-rules/tree/main/docs/experimental-machine-learning
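
A downloaded artifact can be checked against the digest above before it is imported. The following is a minimal sketch, not part of the release: it assumes the sha256 refers to a single downloaded file (the notes do not say which one), and the helper names `sha256_of_file` and `matches_release` are hypothetical.

```python
import hashlib

# Digest copied from the release notes above.
EXPECTED_SHA256 = "4cbd8d82d382864d28147c5f80ac86108e774319bbe5d2c4c9f3c68d9f86e01e"


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_release(path, expected=EXPECTED_SHA256):
    """True if the file's digest equals the published one (case-insensitive)."""
    return sha256_of_file(path) == expected.lower()
```

Calling `matches_release` with the local path of the downloaded file returns `False` if the file was corrupted or altered in transit, assuming the digest covers that file.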


changelog

This is the first release package for URL Spoofing. It consists of the following:

  • Feature Extraction Scripts:

    • ml_urlspoof_char_continuity_script.json: Calculate the continuity of different parts of the domain (e.g. the number of consecutive characters before a number appears)
    • ml_urlspoof_domain_entropy_script.json: Calculate the entropy of the URL domain
    • ml_urlspoof_keyword_extractor_script.json: Extract keywords of interest from certain features
    • ml_urlspoof_ngrams_extractor_script.json: Extract ngrams from certain features
    • ml_urlspoof_remove_features_script.json: Remove extra fields created for prediction purposes to avoid cluttering incoming documents - this will NOT remove any of your original fields in your documents
    • ml_urlspoof_tld_keyword_extractor_script.json: Extract keywords of interest related to the top-level domain (TLD) from certain features
  • Model:

    • ml_urlspoof_model.json: Supervised model to classify URLs as malicious or benign
  • Inference Pipeline:

    • ml_urlspoof_inference_pipeline.json: Inference pipeline to make predictions on URLs using the URL Spoofing model and threat intelligence enrichments
  • Training Pipeline:

    • ml_urlspoof_features_pipeline.json: Training pipeline used to train the URL Spoofing model - this is primarily for analysts looking for a starting point to train their own model
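
The feature extraction scripts listed above are shipped as JSON and their contents are not reproduced in these notes. As a rough illustration of the kinds of features they describe, here is a minimal Python sketch of three of them: domain entropy, character continuity, and n-gram extraction. The function names, the choice of Shannon entropy over characters, and the n-gram size of 3 are all assumptions and may differ from what the released scripts actually compute.

```python
import math
from collections import Counter


def shannon_entropy(text):
    """Shannon entropy (bits per character) of a string; 0.0 for empty input."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def leading_non_digit_run(text):
    """Count consecutive characters from the start until the first digit."""
    run = 0
    for ch in text:
        if ch.isdigit():
            break
        run += 1
    return run


def char_ngrams(text, n=3):
    """Return all overlapping character n-grams; empty list if text is shorter than n."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]
```

For a lookalike domain label such as `paypa1`, `leading_non_digit_run` counts the five letters before the digit, and `char_ngrams` yields overlapping substrings that a classifier could use as tokens.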