@shephinphilip commented Jun 5, 2025

This commit introduces an LLM-powered adaptive replay strategy and an auto-documentation feature.

Key changes include:

  1. LLMAdaptiveStrategy (openadapt/strategies/llm_adaptive_strategy.py):

    • A new replay strategy that inherits from BaseReplayStrategy.
    • Intent Abstraction: I use an LLM (via the generate_action_event.j2 prompt) to determine the next action from the recorded actions, the current UI state, and the task description.
    • Semantic Matching & Adaptation: I implemented a UI consistency check (_is_ui_consistent_for_next_original_action) that decides whether to replay a recorded action directly or, if the UI has changed, to use the LLM for adaptation. The check compares window titles, dimensions, and screenshot similarity.
    • Basic Error Recovery: I overrode the run method to include a post-action check using prompt_is_action_complete. If an action doesn't complete as expected, the failure is logged and the new state is handled implicitly in the next cycle.
    • Action history is consistently managed in self.action_events.
  2. Auto-Documentation Script (openadapt/scripts/generate_documentation.py):

    • A new script that takes a recording timestamp.
    • It loads the recording and prepares context (action details, window states, screenshots).
    • It uses the describe_recording.j2 prompt to ask an LLM to generate a human-readable summary of the recording.
    • It prints the generated documentation to the console.
  3. Integration & Prompts:

    • The new strategy is dynamically discovered by the system.
    • It leverages existing prompt infrastructure and LLM adapter configurations.
    • Relevant prompts (generate_action_event.j2, describe_recording.j2, is_action_complete.j2, system.j2) are utilized.
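The replay-or-adapt decision described under Semantic Matching & Adaptation can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `WindowSnapshot`, `pixel_similarity`, `is_ui_consistent`, `choose_action_source`, and the thresholds are all hypothetical names and values, and the screenshot is simplified to a flat list of grayscale pixels.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class WindowSnapshot:
    """Hypothetical stand-in for a recorded or live window state."""

    title: str
    width: int
    height: int
    pixels: Sequence[int]  # flattened grayscale screenshot, values 0-255


def pixel_similarity(a: Sequence[int], b: Sequence[int], tolerance: int = 8) -> float:
    """Return the fraction of positions whose values differ by at most `tolerance`."""
    if len(a) != len(b) or not a:
        return 0.0
    close = sum(1 for x, y in zip(a, b) if abs(x - y) <= tolerance)
    return close / len(a)


def is_ui_consistent(
    recorded: WindowSnapshot,
    current: WindowSnapshot,
    min_similarity: float = 0.9,  # assumed threshold, not taken from the PR
) -> bool:
    """Compare title, dimensions, and screenshot similarity, in that order."""
    if recorded.title != current.title:
        return False
    if (recorded.width, recorded.height) != (current.width, current.height):
        return False
    return pixel_similarity(recorded.pixels, current.pixels) >= min_similarity


def choose_action_source(recorded: WindowSnapshot, current: WindowSnapshot) -> str:
    """Replay the recorded action if the UI still matches, else defer to the LLM."""
    return "replay" if is_ui_consistent(recorded, current) else "llm"
```

The cheap comparisons (title, dimensions) run first so the screenshot comparison is only reached when the window otherwise looks the same.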

Steps I Took:

  • Initial planning and codebase exploration.
  • Created LLMAdaptiveStrategy class structure.
  • Implemented LLM-based intent abstraction in get_next_action_event.
  • Added semantic replay matching logic to intelligently choose between replaying original actions or using LLM for adaptation.
  • Implemented basic error detection in the strategy's run method.
  • Ensured the new strategy integrates with the existing system.
  • Developed the generate_documentation.py script for auto-documentation.
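As a rough sketch of the error-detection step, the loop below executes one action per cycle, runs a post-action check, and logs any action that did not complete, leaving recovery to the next cycle. The function name `run_with_completion_check` and its callable parameters are assumptions made for illustration; in the PR the check is done through prompt_is_action_complete inside the strategy's run method.

```python
import logging
from typing import Callable, List, Optional, Tuple

logger = logging.getLogger(__name__)


def run_with_completion_check(
    get_next_action: Callable[[List[str]], Optional[str]],
    execute: Callable[[str], None],
    is_complete: Callable[[str], bool],
    max_steps: int = 100,  # assumed safety bound, not taken from the PR
) -> Tuple[List[str], List[str]]:
    """Run actions one cycle at a time and record those that failed the check.

    Returns (history, incomplete): every executed action, and the subset
    whose post-action check reported that it did not complete.
    """
    history: List[str] = []
    incomplete: List[str] = []
    for _ in range(max_steps):
        action = get_next_action(history)
        if action is None:  # nothing left to do
            break
        execute(action)
        history.append(action)
        if not is_complete(action):
            # No explicit retry: the next cycle sees the new state and adapts.
            logger.warning("Action did not complete as expected: %s", action)
            incomplete.append(action)
    return history, incomplete
```

Passing the history into `get_next_action` mirrors the idea that each cycle decides the next step from what has already happened rather than from a fixed script.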

This work fulfills the core requirements of the issue: an intelligent replay system that uses LLMs to generalize, abstract, and execute workflows across varying UI states, plus auto-documentation of recordings. Unit tests are planned as the next step.
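The auto-documentation flow (load context, fill a prompt, ask an LLM for a summary) might look roughly like the sketch below. It is an illustration under stated assumptions: `string.Template` stands in for the describe_recording.j2 Jinja2 template, the action dictionaries and their keys are hypothetical, and the LLM call is injected as a plain callable named `complete`.

```python
from string import Template
from typing import Callable, Dict, List

# Stand-in for a prompt template; the wording here is invented for illustration.
PROMPT = Template(
    "Describe the following recording for a human reader.\n"
    "Task: $task\n"
    "Actions:\n$actions\n"
)


def build_prompt(task: str, actions: List[Dict[str, str]]) -> str:
    """Flatten action records into numbered lines and fill the template."""
    lines = [
        f"{i}. {a['name']} in window '{a['window_title']}'"
        for i, a in enumerate(actions, start=1)
    ]
    return PROMPT.substitute(task=task, actions="\n".join(lines))


def generate_documentation(
    task: str,
    actions: List[Dict[str, str]],
    complete: Callable[[str], str],
) -> str:
    """Send the filled prompt to an LLM-like callable and return its text."""
    return complete(build_prompt(task, actions))
```

Injecting `complete` keeps the sketch independent of any particular LLM adapter and makes it easy to test with a stub.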

To run the strategy, use:

python -m openadapt.replay LLMAdaptiveStrategy --timestamp YOUR_RECORDING_TIMESTAMP 

@abrichr (Member) commented Aug 18, 2025

Thank you @shephinphilip! Can you please show some example output, e.g. a video, a screenshot, or log text?
