TheScriptRailoth/interview-helper-ai

Voice Interview Assistant - Production Ready

A professional voice recording and AI-powered interview assistance application designed for software development interview preparation.

🚀 Features

  • High-quality voice recording with configurable audio settings
  • Real-time speech transcription using OpenAI Whisper
  • AI-powered interview responses via Ollama or compatible APIs
  • Production-ready architecture with proper error handling and logging
  • Configurable settings via INI file
  • Professional UI with dark theme and responsive design
  • Thread-safe operations with proper resource management
  • Comprehensive logging for debugging and monitoring

📋 Requirements

System Requirements

  • Python 3.8 or higher
  • Audio input device (microphone)
  • 4GB+ RAM (8GB recommended for larger Whisper models)
  • GPU with CUDA support (optional, for faster transcription)

Software Dependencies

  • Ollama or compatible AI service running locally
  • PortAudio (for audio recording)

🛠️ Installation

1. Clone or Download

# Clone the repository, or download the main application file and save it as voice_assistant.py

2. Install Python Dependencies

# Install required packages
pip install -r requirements.txt

# Optional: Install CUDA support for faster transcription
pip install torch torchaudio --index-url https://download.pytorch.org/whl/cu118

3. Install System Dependencies

Windows

# PortAudio is usually included with sounddevice
# If you encounter issues, install Visual C++ Build Tools

macOS

# Install using Homebrew
brew install portaudio

Linux (Ubuntu/Debian)

sudo apt-get update
sudo apt-get install portaudio19-dev python3-pyaudio

4. Setup AI Service (Ollama)

# Install Ollama
curl -fsSL https://ollama.ai/install.sh | sh

# Pull the required model
ollama pull llama3

# Start Ollama service
ollama serve

⚙️ Configuration

The application uses a config.ini file for configuration. On first run, it will create a default configuration file that you can customize.

Key Configuration Options

Audio Settings

  • sample_rate: Audio sample rate (default: 44100)
  • channels: Number of audio channels (default: 1)
  • dtype: Audio data type (default: int16)

Whisper Model

  • model: Whisper model size (tiny, base, small, medium, large)
  • device: Processing device (auto, cpu, cuda)
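One plausible way to interpret the `device` option is sketched below. This README does not show how the application resolves `auto`, so the function name `resolve_device` and the fallback to CPU when PyTorch is missing are assumptions.

```python
def resolve_device(setting: str) -> str:
    """Map the `device` option to a concrete device name (assumed behavior)."""
    setting = setting.strip().lower()
    if setting in ("cpu", "cuda"):
        return setting
    if setting != "auto":
        raise ValueError(f"unsupported device setting: {setting!r}")
    try:
        import torch  # optional dependency, only needed to detect a GPU
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"
```

Calling `resolve_device("auto")` returns `"cuda"` only when PyTorch is installed and reports a usable GPU; otherwise it returns `"cpu"`.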

AI Service

  • api_url: AI service endpoint
  • model: AI model name
  • timeout: Request timeout in seconds
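The three options above map naturally onto a single HTTP request. The sketch below assumes that `api_url` points at Ollama's `/api/generate` endpoint and that a non-streaming response is wanted; the function names `build_request` and `ask` are hypothetical, and the application's real request code may differ.

```python
import json
import urllib.request


def build_request(api_url: str, model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming generate request (assumes an Ollama-style API)."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        api_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(api_url: str, model: str, prompt: str, timeout: float) -> str:
    """Send the prompt and return the generated text."""
    request = build_request(api_url, model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())["response"]


# Example call (requires a running Ollama service):
# ask("http://localhost:11434/api/generate", "llama3",
#     "Explain a hash map in two sentences.", timeout=60)
```

The `timeout` value is passed straight to `urlopen`, so a slow model raises an error instead of blocking the UI indefinitely.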

Logging

  • level: Log level (DEBUG, INFO, WARNING, ERROR, CRITICAL)
  • file: Log file path (optional)
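A minimal sketch of how these options could be loaded with the standard `configparser` module follows. The section names (`audio`, `whisper`, `ai`, `logging`), the function `load_config`, and the default values for the Whisper model, API URL, and timeout are assumptions for illustration; check the generated config.ini for the real layout.

```python
import configparser
from pathlib import Path

# Assumed layout: section names and some defaults are illustrative guesses.
DEFAULTS = {
    "audio": {"sample_rate": "44100", "channels": "1", "dtype": "int16"},
    "whisper": {"model": "base", "device": "auto"},
    "ai": {
        "api_url": "http://localhost:11434/api/generate",
        "model": "llama3",
        "timeout": "60",
    },
    "logging": {"level": "INFO", "file": ""},
}


def load_config(path: str = "config.ini") -> configparser.ConfigParser:
    """Load settings, writing a default file if none exists yet."""
    config = configparser.ConfigParser()
    config.read_dict(DEFAULTS)                 # start from the defaults
    file = Path(path)
    if file.exists():
        config.read(file, encoding="utf-8")    # user values override defaults
    else:
        with file.open("w", encoding="utf-8") as handle:
            config.write(handle)               # first run: create the file
    return config
```

Typed accessors such as `config.getint("audio", "sample_rate")` then return the value from the file, or the default when the key is missing.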

🚀 Usage

Starting the Application

python voice_assistant.py

Basic Operation

  1. Start Recording: Press SPACE or click the microphone button
  2. Stop Recording: Press SPACE again or click the stop button
  3. View Results: The popup window shows transcription and AI response
  4. Toggle Popup: Double-click the microphone button

Keyboard and Mouse Shortcuts

  • SPACE: Start/Stop recording
  • Double-click mic button: Toggle popup window

🏗️ Production Deployment

1. Environment Setup

# Create virtual environment
python -m venv venv
source venv/bin/activate  # Linux/macOS
# or
venv\Scripts\activate     # Windows

# Install dependencies
pip install -r requirements.txt

2. Configuration Management

# Copy and customize configuration
cp config.ini.template config.ini
# Edit config.ini with your settings

3. Service Setup (Linux)

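Create a systemd service file. The application opens a tkinter window, which needs access to a display, so run the service as a user with an active graphical session: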

[Unit]
Description=Voice Interview Assistant
After=network.target

[Service]
Type=simple
User=your_username
WorkingDirectory=/path/to/voice-assistant
ExecStart=/path/to/venv/bin/python voice_assistant.py
Restart=always
RestartSec=10

[Install]
WantedBy=multi-user.target

4. Monitoring and Logging

  • Check logs in the configured log file
  • Monitor system resources (CPU, memory, GPU)
  • Set up log rotation for production environments
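Log rotation can also be handled inside the process. The sketch below uses the standard library's `RotatingFileHandler`; the logger name `voice_assistant`, the 1 MB size limit, and the three backups are assumptions, not values taken from the application.

```python
import logging
from logging.handlers import RotatingFileHandler


def setup_logging(level: str = "INFO", file: str = "") -> logging.Logger:
    """Configure a logger from the `level` and optional `file` settings."""
    logger = logging.getLogger("voice_assistant")  # assumed logger name
    logger.setLevel(level.upper())
    # Drop handlers from earlier calls so messages are not duplicated.
    for old in list(logger.handlers):
        logger.removeHandler(old)
        old.close()
    if file:
        # Rotate at about 1 MB and keep three old files (illustrative values).
        handler = RotatingFileHandler(
            file, maxBytes=1_000_000, backupCount=3, encoding="utf-8"
        )
    else:
        handler = logging.StreamHandler()          # no file: log to stderr
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
    )
    logger.addHandler(handler)
    return logger
```

On Linux servers, an external tool such as logrotate is an equally valid choice; the in-process handler simply avoids an extra moving part.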

🔧 Troubleshooting

Common Issues

Audio Recording Problems

# Check available audio devices
python -c "import sounddevice as sd; print(sd.query_devices())"

# Test microphone (records one second and prints the peak level)
python -c "import sounddevice as sd; print('Recording...'); data = sd.rec(44100, samplerate=44100, channels=1); sd.wait(); print('Done, peak level:', abs(data).max())"

Whisper Model Loading Issues

  • Ensure sufficient RAM is available
  • Try a smaller model (tiny, base) if memory is limited
  • Check CUDA installation for GPU acceleration

AI Service Connection Issues

  • Verify Ollama is running: curl http://localhost:11434/api/tags
  • Check firewall settings
  • Verify model availability: ollama list

UI Issues

  • Ensure tkinter is installed (usually included with Python)
  • Check display settings for popup window positioning
  • Verify window manager compatibility

Performance Optimization

For Better Transcription Speed

  • Use GPU acceleration (CUDA)
  • Choose appropriate Whisper model size
  • Optimize audio settings

For Lower Memory Usage

  • Use smaller Whisper models (tiny, base)
  • Reduce audio buffer sizes
  • Close popup when not needed

📊 Architecture Overview

Key Components

  1. ConfigManager: Handles application configuration
  2. AudioManager: Manages audio recording with thread safety
  3. AIService: Handles Whisper transcription and AI API calls
  4. VoiceInterviewAssistant: Main application controller
  5. UI Components: Professional GUI with responsive design

Thread Safety

  • All audio operations are thread-safe
  • Proper resource cleanup on shutdown
  • Graceful handling of interruptions
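The general pattern behind thread-safe recording can be sketched as follows: the audio callback runs on its own thread, so every access to the shared chunk list goes through one lock. The class `RecordingBuffer` is a hypothetical stand-in and is not the application's `AudioManager`.

```python
import threading
from typing import List


class RecordingBuffer:
    """Collects audio chunks from a callback thread; one lock guards all state."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._chunks: List[bytes] = []
        self._recording = False

    def start(self) -> None:
        with self._lock:
            self._chunks.clear()
            self._recording = True

    def add_chunk(self, chunk: bytes) -> None:
        # Intended to be called from the audio callback thread.
        with self._lock:
            if self._recording:
                self._chunks.append(chunk)

    def stop(self) -> bytes:
        # Returns everything recorded so far and releases the chunk list.
        with self._lock:
            self._recording = False
            data = b"".join(self._chunks)
            self._chunks.clear()
            return data
```

Chunks that arrive after `stop()` are dropped, which keeps a late callback from leaking audio into the next recording.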

Error Handling

  • Comprehensive exception handling
  • Graceful degradation on errors
  • User-friendly error messages
  • Detailed logging for debugging
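One way to combine these points is a small wrapper around each pipeline step, sketched below. The name `safe_call`, its return shape, and the logger name are assumptions for illustration only.

```python
import logging

logger = logging.getLogger("voice_assistant")  # assumed logger name


def safe_call(step, func, *args, fallback=None, **kwargs):
    """Run one step; on failure, log the traceback and return a fallback.

    Returns a (result, error_message) pair; error_message is None on success.
    """
    try:
        return func(*args, **kwargs), None
    except Exception:
        logger.exception("%s failed", step)      # full traceback for debugging
        return fallback, f"{step} failed. See the log for details."
```

For example, `text, error = safe_call("Transcription", transcribe, path, fallback="")` lets the UI show `error` to the user while the application keeps running; `transcribe` here stands for whatever function performs the step.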

🔒 Security Considerations

  • Audio data is processed locally (privacy-first)
  • Temporary files are cleaned up automatically
  • No sensitive data is stored permanently
  • API calls use session management
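Automatic cleanup of temporary audio files can be sketched with a context manager. The helper `temporary_wav` is hypothetical; it assumes the recording is available as raw int16 PCM bytes and that the transcriber wants a file path.

```python
import os
import tempfile
import wave
from contextlib import contextmanager


@contextmanager
def temporary_wav(pcm: bytes, sample_rate: int = 44100, channels: int = 1):
    """Write int16 PCM data to a temporary WAV file and delete it afterwards."""
    fd, path = tempfile.mkstemp(suffix=".wav")
    os.close(fd)                          # reopened below through the wave module
    try:
        with wave.open(path, "wb") as wav:
            wav.setnchannels(channels)
            wav.setsampwidth(2)           # int16 = 2 bytes per sample
            wav.setframerate(sample_rate)
            wav.writeframes(pcm)
        yield path
    finally:
        if os.path.exists(path):
            os.remove(path)               # runs even if transcription raises
```

Inside `with temporary_wav(data) as path:` the file exists and can be passed to the transcriber; once the block exits, normally or through an exception, the file is gone.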

📈 Monitoring and Maintenance

Health Checks

  • Monitor log files for errors
  • Check AI service availability
  • Verify audio device connectivity
  • Monitor system resources
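The AI service check can be automated with a few lines of standard-library code. This sketch assumes the default local Ollama address and uses its `/api/tags` endpoint; the function name `service_is_up` is hypothetical.

```python
import urllib.request


def service_is_up(
    url: str = "http://localhost:11434/api/tags", timeout: float = 3.0
) -> bool:
    """Return True if the endpoint answers with HTTP 200 within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # Covers refused connections, timeouts, and HTTP error statuses,
        # since urllib's URLError and HTTPError derive from OSError.
        return False
```

A scheduled job or a startup check could call `service_is_up()` and log a warning before the user starts recording.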

Updates and Maintenance

  • Regularly update dependencies
  • Monitor Whisper model updates
  • Check Ollama service updates
  • Review and rotate log files

🤝 Contributing

When contributing to the production version:

  1. Follow Python PEP 8 style guidelines
  2. Add comprehensive error handling
  3. Include logging for debugging
  4. Write unit tests for new features
  5. Update configuration documentation
  6. Test on multiple platforms

📄 License

This production-ready version includes enterprise-grade features and should be used according to your organization's software licensing policies.

📞 Support

For production deployment support:

  • Check logs for detailed error information
  • Verify all dependencies are correctly installed
  • Test individual components (audio, transcription, AI service)
  • Monitor system resources during operation

Production Notes: This version includes comprehensive error handling, logging, configuration management, and thread safety suitable for production environments. Always test thoroughly in your specific environment before deployment.
