AIassistant is a native macOS application that brings AI assistance directly to your desktop and integrates into your existing workflow. It uses both Google Gemini and OpenAI's GPT models to provide context-aware assistance across all your applications.
# Key Features
- Instant Access - Double-tap right Shift key to activate from anywhere
- Full Chat Interface - Rich conversations with support for text, images, videos, PDFs, and URLs
- Rewrite in Place - Transform text directly in any application without copy-paste
- Quick Actions - One-click operations like Summarize, Translate, Simplify, and more
- Image Generation - Create and modify images with Gemini's AI capabilities
- Screenshot Capture - Analyze any application window with AI
- Multi-Model Support - Switch between Gemini and OpenAI models dynamically
- Glass UI - Choose from 19 glass morphism variants with adjustable opacity and blur
- Secure - API keys stored securely in macOS Keychain
- Accessibility-First - Full keyboard support and system-wide automation
# Requirements
- macOS 12.0 or later
- API key from either:
  - Google AI Studio (for Gemini models)
  - OpenAI (for GPT models)
  - Or both for maximum flexibility
# Installation
- Clone the repository: `git clone https://github.com/Joaov41/Assist.git`, then `cd Assist`
- Open `Aiassistant.xcodeproj` in Xcode
- Configure code signing:
  - Select the project in the navigator
  - Select the Aiassistant target
  - Under Signing & Capabilities, choose your team
  - Update the bundle identifier if needed
- Build and run (⌘R)
- Grant necessary permissions when prompted:
  - Accessibility (required for Rewrite in Place)
  - Screen Recording (required for Screenshot Capture)
- Launch AIassistant
- Complete the onboarding setup
- Enter your API keys for Gemini and/or OpenAI
- Configure your preferred default model
- You're ready to go!
Keyboard Shortcut (Primary)
- Double-tap the right Shift key to activate the chat popup from anywhere
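
The double-tap trigger can be thought of as a small timing check. The sketch below is an illustration only, not AIassistant's actual code: `DoubleTapDetector` and its 0.35-second window are hypothetical names and values chosen to show the idea.

```swift
import Foundation

/// Detects two presses of the same key within a short interval.
/// The default 0.35 s window is an assumption, not a value taken from AIassistant.
struct DoubleTapDetector {
    let maxInterval: TimeInterval
    private var lastTap: TimeInterval?

    init(maxInterval: TimeInterval = 0.35) {
        self.maxInterval = maxInterval
    }

    /// Registers a key press at `timestamp` (in seconds) and returns true
    /// when that press completes a double tap.
    mutating func registerTap(at timestamp: TimeInterval) -> Bool {
        if let previous = lastTap, timestamp - previous <= maxInterval {
            lastTap = nil // reset so a third press starts a new sequence
            return true
        }
        lastTap = timestamp
        return false
    }
}
```

In a macOS app, timestamps like these could come from an `NSEvent` global monitor for `.flagsChanged` events, filtered to the right Shift key (virtual key code 60). How AIassistant wires this up internally is not documented here.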
Menu Bar
- Click the AIassistant icon in your menu bar for quick access
Drag & Drop
- Drag files, URLs, images, videos, or PDFs directly into the chat interface
- The app automatically detects content type and processes accordingly
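
As a rough picture of what automatic content detection involves, the following sketch classifies a dropped item by URL scheme and file extension. It is a simplified assumption, not the app's real logic: `DroppedContent`, `classify`, and the exact extension lists (especially the video formats) are hypothetical.

```swift
import Foundation

/// Kinds of content the chat interface is described as accepting.
enum DroppedContent: Equatable {
    case webURL, pdf, image, video, text, unsupported
}

/// Classifies a dropped URL. Web links are recognized by scheme,
/// local files by extension. The extension lists are illustrative only.
func classify(_ url: URL) -> DroppedContent {
    if let scheme = url.scheme?.lowercased(), scheme == "http" || scheme == "https" {
        return .webURL
    }
    switch url.pathExtension.lowercased() {
    case "pdf":
        return .pdf
    case "png", "jpg", "jpeg", "gif":
        return .image
    case "mp4", "mov": // assumed examples of "common video formats"
        return .video
    case "txt", "md":
        return .text
    default:
        return .unsupported
    }
}
```

A production implementation would more likely rely on uniform type identifiers than on raw extensions, but the branching structure would be similar.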
The full-featured chat interface supports multiple content types and provides rich, formatted responses.
Text
- Simply type your question or prompt
- Select text and type your question or prompt about the selected text
- Markdown rendering for beautiful formatted responses
URLs
- Drag and drop a URL from your browser address bar into the chat interface
- The app automatically extracts and processes the web content
Text Files
- Drag and drop any text file (.txt, .md, etc.) into the chat interface
- The app reads and processes the file content automatically
PDFs
- Drag and drop a PDF file from Finder into the chat interface
- The app extracts the text content automatically
- Great for analyzing documents, papers, and reports
Images
- Drag and drop image files (.png, .jpg, .gif, etc.) into the chat interface
- Ask questions about the image or request analysis
- Works with screenshots, photos, diagrams, and more
Videos (Gemini models only)
- Drag and drop video files into the chat interface
- The AI can analyze and describe video content
- Supports common video formats
Screenshot Capture
- In the chat interface, select a screenshot option
- Choose an application window to capture
- The screenshot is automatically sent to the AI for analysis
The most powerful feature - transform text in ANY application without copy-paste!
- Select text in any application (Notes, Mail, Messages, web browser, etc.)
- Activate AIassistant (double-tap right Shift)
- Enter your transformation prompt (e.g., "Make this more professional", "Fix grammar", "Translate to Spanish")
- Press Enter or click Submit
- Watch the magic - The AI-generated text automatically replaces your original selection in the source application
- AIassistant uses macOS Accessibility APIs to detect your text selection
- Sends the selected text + your prompt to the AI model
- Receives the AI response
- Automatically pastes the result back into the original application
- Uses keyboard automation to seamlessly replace the selection
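
The steps above can be sketched with standard macOS APIs. This is a minimal illustration under stated assumptions, not AIassistant's implementation: the function names and the prompt layout are hypothetical, and the paste step assumes the result is placed on the general pasteboard and inserted with a synthesized Cmd+V. The AI request itself is omitted.

```swift
import Foundation
#if canImport(AppKit)
import AppKit
import ApplicationServices
#endif

/// Combines the user's instruction with the selected text.
/// The layout of the combined prompt is an assumption for illustration.
func buildRewritePrompt(instruction: String, selection: String) -> String {
    return "\(instruction)\n\n---\n\(selection)"
}

#if canImport(AppKit)
/// Reads the selected text of the focused UI element through the
/// Accessibility API. Returns nil without Accessibility permission
/// or when the focused element exposes no selection.
func currentSelection() -> String? {
    let systemWide = AXUIElementCreateSystemWide()
    var focused: CFTypeRef?
    guard AXUIElementCopyAttributeValue(systemWide,
                                        kAXFocusedUIElementAttribute as CFString,
                                        &focused) == .success,
          let element = focused else { return nil }
    var selected: CFTypeRef?
    guard AXUIElementCopyAttributeValue(element as! AXUIElement,
                                        kAXSelectedTextAttribute as CFString,
                                        &selected) == .success else { return nil }
    return selected as? String
}

/// Puts `text` on the pasteboard and posts Cmd+V so the frontmost
/// application replaces its current selection.
func pasteOverSelection(_ text: String) {
    let pasteboard = NSPasteboard.general
    pasteboard.clearContents()
    pasteboard.setString(text, forType: .string)

    let source = CGEventSource(stateID: .combinedSessionState)
    let vKey: CGKeyCode = 9 // kVK_ANSI_V
    for keyDown in [true, false] {
        let event = CGEvent(keyboardEventSource: source, virtualKey: vKey, keyDown: keyDown)
        event?.flags = .maskCommand
        event?.post(tap: .cghidEventTap)
    }
}
#endif
```

Note that a pasteboard-based approach like this overwrites whatever the user last copied, so an app taking this route may want to save and restore the previous pasteboard contents.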
- Email Writing: "Make this email more professional"
- Grammar Correction: "Fix all grammar and spelling errors"
- Translation: "Translate this to French"
- Tone Adjustment: "Rewrite this in a friendly tone"
- Simplification: "Explain this like I'm 5"
- Expansion: "Add more detail to this paragraph"
Requirements:
- Accessibility permissions must be granted
- Text must be selectable in the target application
Pre-configured AI operations for common tasks. Access them instantly from the Quick Actions menu.
**Summarize**
- Condenses long text, URLs, or PDFs into key points
- Perfect for research papers, articles, and documents
**Key Points**
- Extracts main ideas as a bulleted list
- Great for meeting notes and reports
**Simplify**
- Makes complex text easier to understand
- Ideal for technical documentation or legal text
**Translate to Spanish**
- Quick translation to Spanish
- Can be customized for other languages
**Describe Image**
- AI analysis of image content
- Identifies objects, scenes, text, and context
**Describe Video** (Gemini only)
- Analyzes video content
- Describes scenes, actions, and context
Create your own Quick Actions!
You can define custom actions for your specific workflows:
- Open Settings in AIassistant
- Navigate to the Quick Actions section
- Click "Add Custom Action"
- Configure:
- Name: Display name (e.g., "Code Review")
- Prompt: The AI instruction (e.g., "Review this code and suggest improvements")
- Icon: Choose an emoji or icon
- Save your custom action
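
Conceptually, a custom action is just a name, a prompt, and an icon. The sketch below models that as a `Codable` value that can be saved and reloaded as JSON. The `QuickAction` type, its helper functions, and the JSON storage format are assumptions for illustration; the README does not say how AIassistant persists actions.

```swift
import Foundation

/// A user-defined Quick Action with the three fields from the Settings form.
struct QuickAction: Codable, Equatable {
    var name: String   // display name, e.g. "Code Review"
    var prompt: String // instruction sent to the AI
    var icon: String   // emoji or icon name

    /// Builds the text sent to the model for a given selection.
    func fullPrompt(for selection: String) -> String {
        "\(prompt)\n\n\(selection)"
    }
}

/// Serializes actions to JSON (hypothetical storage format).
func saveActions(_ actions: [QuickAction]) throws -> Data {
    try JSONEncoder().encode(actions)
}

/// Restores actions from JSON produced by `saveActions`.
func loadActions(from data: Data) throws -> [QuickAction] {
    try JSONDecoder().decode([QuickAction].self, from: data)
}
```

Keeping the prompt as plain text makes it easy to test a prompt in chat first and then paste it into an action unchanged.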
Custom Action Examples:
- "Review this code for bugs and optimization"
- "Rewrite in a humorous tone"
- "Create a social media post from this content"
- "Extract action items and create a TODO list"
- "Generate unit tests for this function"
- "Explain this concept to a beginner"
Your custom actions appear alongside built-in actions and can be used with any selected text!
Available with Gemini models only
Create and modify images using AI:
- In the chat interface, describe the image you want
- Use prompts like:
- "Generate an image of a sunset over mountains"
- "Create a logo for a coffee shop"
- "Draw a cartoon character of a friendly robot"
- The AI generates the image and displays it in chat
- You can refine by continuing the conversation
Capture and analyze any application window:
- Click the Screenshot button in the chat interface
- Choose from the list of running applications
- The window is captured automatically
- Ask questions about the screenshot:
- "What's wrong with this design?"
- "Summarize the information in this screenshot"
- "Extract the text from this image"
Use Cases:
- UI/UX feedback and analysis
- Extract text from images (OCR)
- Debug visual issues
- Get design suggestions
- Analyze charts and graphs
AIassistant supports multiple AI models for different use cases:
Gemini 2.5 Pro
- Maximum capability and intelligence
- Best for complex reasoning and analysis
- Supports images, videos, and long contexts
Gemini 2.5 Flash
- Ultra-fast responses
- Great for quick questions and simple tasks
- More cost-effective
Exclusive Features:
- Image generation
- Video content analysis
- Longer context windows
GPT-5
- Latest and most advanced model
- Superior reasoning and understanding
GPT-4o
- Optimized for performance
- Balanced speed and capability
GPT-4o Mini
- Fast and cost-effective
- Great for simple tasks and quick responses
Change models anytime in Settings:
- Open Settings (from menu bar or chat interface)
- Select AI Provider (Gemini or OpenAI)
- Choose your preferred Model
- Model switches immediately for new conversations
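
One way to picture the provider and model selection is as a small catalog keyed by provider, with the Gemini-only features derived from it. This is a hypothetical sketch: `Provider`, `Model`, `catalog`, and `models(for:)` are illustrative names, and the entries use the display names from this README rather than real API model identifiers.

```swift
/// The two AI providers the app can switch between.
enum Provider: String {
    case gemini = "Gemini"
    case openAI = "OpenAI"
}

/// A selectable model. Capability flags follow the README's statement
/// that image generation and video analysis are Gemini-only.
struct Model: Equatable {
    let name: String
    let provider: Provider

    var supportsImageGeneration: Bool { provider == .gemini }
    var supportsVideoAnalysis: Bool { provider == .gemini }
}

/// Display names as listed in this README (not API identifiers).
let catalog: [Model] = [
    Model(name: "Gemini 2.5 Pro", provider: .gemini),
    Model(name: "Gemini 2.5 Flash", provider: .gemini),
    Model(name: "GPT-5", provider: .openAI),
    Model(name: "GPT-4o", provider: .openAI),
    Model(name: "GPT-4o Mini", provider: .openAI),
]

/// Models offered once a provider has been selected in Settings.
func models(for provider: Provider) -> [Model] {
    catalog.filter { $0.provider == provider }
}
```

With a structure like this, the UI could hide image generation and video options whenever the selected model reports that it does not support them.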
Access settings via the menu bar icon or chat interface.
AI Provider Configuration
- Choose between Gemini and OpenAI
- Enter or update API keys
- Select default model
UI Customization
- Choose from 19 glass morphism variants
- Adjust window opacity and blur
- Dark mode (always enabled)
Behavior Settings
- Enable/disable automatic content detection
- Configure keyboard shortcuts
- Adjust response streaming
Quick Actions Management
- View all quick actions
- Create custom actions
- Edit or delete existing actions
- Reorder action list
About
- App version information
- API usage statistics
- Privacy policy and terms
- API Keys: Stored securely in macOS Keychain, never in plaintext
- Local Processing: Text selection and clipboard handling done locally
- No Data Collection: AIassistant doesn't collect or store your data
- Secure Communication: All API calls use HTTPS encryption
- Permissions: Only requests necessary system permissions
- Accessibility: Required for Rewrite in Place
- Screen Recording: Required for Screenshot Capture
Built With:
- Swift - Modern, type-safe programming language
- SwiftUI - Declarative UI framework
- AppKit - Native macOS integration
- Accessibility APIs - System-wide automation
- Markdown Rendering - Beautiful formatted responses
Architecture:
- Native macOS application (not Electron/web-based)
- ~7,420 lines of Swift code
- 30+ source files
- Modular architecture with clean separation of concerns
System Requirements:
- macOS 12.0 or later
- Internet connection for AI API calls
- ~50MB disk space
- Be specific in your prompts ("Make formal" vs "Rewrite")
- Works in any app with selectable text
- Great for iterative refinement - rewrite multiple times
- Use with Quick Actions for common transformations
- Be specific: "Summarize in 3 bullet points" vs "Summarize"
- Provide context: "Translate to casual Spanish" vs "Translate"
- Iterate: Refine results by continuing the conversation
- Use Gemini Pro for complex analysis and long documents
- Use Gemini Flash or GPT-4o Mini for quick questions
- Use GPT-5 or GPT-4o for critical reasoning tasks
- Use Gemini for image generation and video analysis
- Create actions for repetitive tasks
- Use clear, descriptive names
- Test prompts in chat first, then save as actions
- Organize actions by category (Writing, Coding, Translation, etc.)
Contributions are welcome! Feel free to:
- Report bugs or issues
- Suggest new features
- Submit pull requests
- Improve documentation
This project is licensed under the MIT License - see the LICENSE file for details.
- Powered by Google Gemini and OpenAI GPT models
- Built with Apple's native frameworks
- Glass morphism design inspiration from modern UI trends
For issues, questions, or feedback:
- Open an issue on GitHub
- Check existing documentation
- Review closed issues for solutions



