Skip to content

Nexus CLI is a revolutionary command-line interface that combines the best practices from successful CLI tools like Warp, Cursor, Gemini CLI, and Claude Code to provide an intelligent, context-aware coding experience.

Notifications You must be signed in to change notification settings

Nexus-Browser/Nexus-CLI

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

46 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Nexus CLI - Advanced Code Intelligence

Python 3.8+ PyTorch License Performance

An AI-powered CLI tool that integrates state of the art language model architectures for intelligent code assistance, web search, and conversational AI. Built with proven optimizations from LLMs-from-scratch and nanoGPT.

Nexus-UI

Key Features

Advanced LLM Architecture

  • FlashAttention: 8x memory efficiency for long sequences
  • KV-Cache: 4x faster autoregressive generation
  • torch.compile: 2x inference speedup
  • Mixed Precision: Optimized memory usage
  • Distributed Training: Multi-GPU support

Web Intelligence

  • Real-time web search integration
  • Smart context gathering
  • Information synthesis
  • Cached results for performance

Code Intelligence

  • Advanced code analysis and understanding
  • Multi-language support (Python, JS, Java, C++, etc.)
  • Function and class extraction
  • Complexity estimation
  • Smart suggestions and debugging

Session Management

  • Persistent conversation history
  • Context-aware responses
  • File tracking and analysis
  • Performance monitoring

Quick Start

Installation

  1. Clone and setup:
git clone <repository-url>
cd Nexus-CLI
python setup_nexus.py
  1. Start the CLI:
./start_nexus.sh  # Unix/Mac
# OR
start_nexus.bat   # Windows
# OR  
python nexus.py --interactive

First Steps

  1. Interactive mode:
python nexus.py --interactive
  1. Single query:
python nexus.py "Explain how transformers work"
  1. With file context:
python nexus.py --file mycode.py "Analyze this code for optimization opportunities"

Architecture Overview

Enhanced LLM Core (model/nexus_llm.py)

# Advanced architecture combining best practices
config = NexusConfig(
    block_size=2048,          # Extended context length
    n_layer=12,               # Transformer layers
    n_head=12,                # Attention heads
    n_embd=768,               # Embedding dimension
    use_flash_attention=True, # Memory efficient attention
    use_kv_cache=True,        # Fast generation
)

model = NexusLLM(config)

Key Optimizations

  1. FlashAttention Implementation:

    • Memory-efficient attention computation
    • Linear memory scaling with sequence length
    • 8x faster for long sequences
  2. KV-Cache with Sliding Window:

    • Stores key-value pairs for fast generation
    • Sliding window for infinite context
    • 4x speedup in autoregressive generation
  3. Enhanced Tokenization:

    • Code-aware tokenization
    • Special tokens for different contexts
    • Efficient BPE encoding
  4. Production Training:

    • Gradient accumulation
    • Mixed precision training
    • Distributed data parallel
    • Learning rate scheduling

Configuration

Model Configuration (model_config.json)

{
  "block_size": 2048,
  "vocab_size": 50304,
  "n_layer": 12,
  "n_head": 12,
  "n_embd": 768,
  "dropout": 0.0,
  "bias": false,
  "use_flash_attention": true,
  "use_kv_cache": true,
  "temperature": 0.8,
  "top_k": 200,
  "max_new_tokens": 500
}

Environment Configuration (.env)

# Model settings
NEXUS_MODEL_PATH=model/nexus_model
NEXUS_DEVICE=auto

# API Keys (optional in case model us unavailable)
OPENAI_API_KEY=your_key_here
ANTHROPIC_API_KEY=your_key_here

# Performance
TORCH_COMPILE=true
FLASH_ATTENTION=true
KV_CACHE=true

Training Your Own Model

Data Preparation

# Prepare training data from text file
python train_nexus.py --data your_training_data.txt

Training

# Basic training
python train_nexus.py

# Advanced training with custom config
python train_nexus.py --config training_config.json

# Resume from checkpoint
python train_nexus.py --resume

# Distributed training (multi-GPU)
torchrun --nproc_per_node=4 train_nexus.py

Training Configuration

{
  "batch_size": 8,
  "learning_rate": 6e-4,
  "max_iters": 10000,
  "eval_interval": 200,
  "gradient_accumulation_steps": 4,
  "weight_decay": 0.1,
  "compile": true,
  "flash_attention": true,
  "mixed_precision": true
}

Performance Benchmarks

Speed Improvements

  • FlashAttention: 8x memory efficiency
  • KV-Cache: 4x faster generation
  • torch.compile: 2x inference speedup
  • Mixed Precision: 1.5x training speedup

Memory Usage

  • Base model: ~1.5GB VRAM (768 hidden size)
  • With FlashAttention: 70% memory reduction
  • With gradient checkpointing: 50% training memory reduction

Throughput

  • CPU inference: ~50 tokens/second
  • GPU inference: ~500 tokens/second
  • Multi-GPU training: Linear scaling

Usage Examples

Code Analysis

You: Analyze this Python function for performance issues

Nexus: I'll analyze your code for potential optimizations:

1. **Time Complexity**: The nested loops create O(n²) complexity
2. **Memory Usage**: Unnecessary list comprehensions 
3. **Suggestions**: 
   - Use vectorized operations with NumPy
   - Consider caching repeated calculations
   - Profile with cProfile for bottlenecks

Tip: I can help you implement these optimizations!

Web Intelligence

You: What are the latest developments in transformer architectures?

Nexus: Based on recent research:

**Latest Developments:**
1. **FlashAttention-2**: 2x faster than FlashAttention-1
2. **RoPE Scaling**: Better handling of long sequences
3. **Mixture of Experts**: Sparse transformers for efficiency

**Key Papers:**
- "FlashAttention-2: Faster Attention with Better Parallelism"
- "LongNet: Scaling Transformers to 1,000,000,000 Tokens"

Response time: 1.2s

Interactive Development

You: Help me implement a binary search algorithm

Nexus: I'll help you implement an efficient binary search:

```python
def binary_search(arr, target):
    """
    Optimized binary search with bounds checking
    Time: O(log n), Space: O(1)
    """
    left, right = 0, len(arr) - 1
    
    while left <= right:
        mid = left + (right - left) // 2  # Prevents overflow
        
        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            left = mid + 1
        else:
            right = mid - 1
    
    return -1  # Not found

# Example usage
numbers = [1, 3, 5, 7, 9, 11, 13]
result = binary_search(numbers, 7)  # Returns 3

Key Optimizations:

  • Overflow-safe midpoint calculation
  • Early termination on exact match
  • Clear variable naming for readability

Tip: I can also help you with iterative vs recursive versions!


## Advanced Features

### Custom Model Integration

```python
from model.nexus_llm import NexusLLM, NexusConfig

# Create custom configuration
config = NexusConfig(
    n_layer=24,           # Larger model
    n_head=16,
    n_embd=1024,
    block_size=4096,      # Longer context
    use_flash_attention=True
)

# Initialize model
model = NexusLLM(config)

# Load pretrained weights
model.load_state_dict(torch.load('custom_weights.pt'))

API Integration

from nexus_cli import NexusCLI

# Initialize CLI programmatically
cli = NexusCLI()
cli.initialize_model()

# Process queries
result = await cli.process_query("Explain quantum computing")
print(result['response'])

Performance Monitoring

# Get performance statistics
stats = cli.get_stats()
print(f"Average response time: {stats['average_response_time']:.2f}s")
print(f"Model parameters: {stats['model_params']:,}")
print(f"Cache hit rate: {stats['cache_hits']/stats['total_requests']:.1%}")

Development

Project Structure

Nexus-CLI/
├── model/
│   ├── nexus_llm.py          # Advanced LLM architecture
│   ├── tokenizer.py          # Enhanced tokenization
│   ├── checkpoints/          # Model checkpoints
│   └── nexus_model/          # Pretrained models
├── nexus.py                  # Main CLI interface
├── train_nexus.py           # Training script
├── setup_nexus.py           # Installation script
├── requirements.txt         # Dependencies
├── model_config.json        # Model configuration
└── README.md               # This file

Dependencies

Core Requirements:

  • Python 3.8+
  • PyTorch 2.1+
  • Transformers 4.36+
  • NumPy 1.24+

Optional Optimizations:

  • FlashAttention (CUDA acceleration)
  • Triton (kernel optimizations)
  • BitsAndBytes (quantization)

Testing

# Run installation tests
python setup_nexus.py --skip-deps

# Run unit tests  
pytest tests/

# Performance benchmarks
python benchmarks/run_benchmarks.py

Contributing

  1. Fork the repository
  2. Create feature branch: git checkout -b feature/amazing-feature
  3. Make changes with proper testing
  4. Commit changes: git commit -m 'Add amazing feature'
  5. Push to branch: git push origin feature/amazing-feature
  6. Open Pull Request

Development Guidelines

  • Follow PEP 8 style guidelines
  • Add type hints for all functions
  • Include docstrings with examples
  • Write unit tests for new features
  • Update documentation as needed

Roadmap

Near-term (v2.0)

  • Multi-modal support (images, audio)
  • Plugin system for extensions
  • REST API interface
  • Web dashboard
  • Model quantization support

Long-term (v3.0)

  • Multi-agent conversations
  • Tool use and function calling
  • Real-time collaboration
  • Custom fine-tuning GUI
  • Mobile app integration

Troubleshooting

Common Issues

1. Import errors:

# Reinstall dependencies
pip install -r requirements.txt --force-reinstall

2. CUDA out of memory:

# Reduce batch size or use CPU
export NEXUS_DEVICE=cpu
python nexus.py --interactive

3. Slow performance:

# Enable optimizations
export TORCH_COMPILE=true
export FLASH_ATTENTION=true

Performance Tips

  1. Use GPU: 10x faster inference
  2. Enable torch.compile: 2x speedup
  3. Use FlashAttention: 8x memory efficiency
  4. Batch queries: Better throughput
  5. Cache results: Avoid redundant computation

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

  • Sebastian Raschka - LLMs-from-scratch for educational foundation
  • Andrej Karpathy - nanoGPT for production patterns
  • OpenAI - GPT architecture and training insights
  • HuggingFace - Transformers library and model hub
  • PyTorch Team - Exceptional deep learning framework

Support


Made with ❤️ for the AI community

Nexus CLI - Where Code Meets Intelligence

About

Nexus CLI is a revolutionary command-line interface that combines the best practices from successful CLI tools like Warp, Cursor, Gemini CLI, and Claude Code to provide an intelligent, context-aware coding experience.

Resources

Stars

Watchers

Forks

Packages

No packages published