
AquilaX QnA (Securitron)


Overview

AquilaX QnA, aka Securitron, is a compact, instruction‑tuned transformer optimized for low‑latency CPU inference. It blends general-domain Q&A with precise AquilaX expertise, supporting real‑time streaming and minimal resource usage.


Model Specs

  • Name: AquilaX QnA (Securitron)

  • Architecture: Instruction‑tuned transformer

  • Fine‑Tuning: General conversational + AquilaX domain data

  • Context Window: 8192 tokens

  • Memory: ≥ 4 GB RAM (CPU)

  • Platforms: CPU (quantized) & CUDA GPU

  • API Access: Integrate via the Securitron API.

  • Interactive Demo: Try the chatbot on the AquilaX home page.


Key Features

CPU-Optimized Performance:

  • Quantized for minimal memory usage and fast inference on CPUs.

  • No GPU required for efficient operation.

Dual Knowledge Base:

  • Handles general queries with clarity.

  • Delivers precise answers for AquilaX-specific topics.

Real-Time Streaming:

  • Supports token-by-token response generation for interactive experiences.

Context-Aware Responses:

  • Maintains a limited conversation history for coherent follow-ups.

  • Automatically manages history to optimize memory.

Custom System Prompt:

  • Configured as Securitron, ensuring professional and consistent responses.


Installation

Prerequisites

  • Python 3.8+

  • PyTorch (CPU or GPU version)

  • Transformers library

  • Optional: CUDA for GPU acceleration

Install Dependencies

pip install torch transformers

Download the Model

Load the model and tokenizer directly from Hugging Face:

from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("AquilaX-AI/QnA")
model = AutoModelForCausalLM.from_pretrained("AquilaX-AI/QnA")

Inference Example

The following code demonstrates how to perform inference with the AquilaX QnA model:

from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer
import torch

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("AquilaX-AI/QnA")
model = AutoModelForCausalLM.from_pretrained("AquilaX-AI/QnA")

# System prompt
prompt = "<|im_start|>system\nYou are Securitron, a helpful AI assistant.<|im_end|>"

# Initialize history
history = []

# Set device
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
model.to(device)

while True:
    user_input = input("\nUser Question: ")
    if user_input.lower() == 'break':
        break

    # Format user input
    user = f"<|im_start|>user\n{user_input}<|im_end|>\n<|im_start|>assistant"
    history.append(user)
    history = history[-5:]  # Keep only the 5 most recent messages

    # Build prompt
    full_prompt = prompt + "\n" + "\n".join(history)
    inputs = tokenizer(full_prompt, return_tensors="pt", truncation=True).input_ids.to(device)

    # Stream response
    streamer = TextStreamer(tokenizer, skip_prompt=True)
    response = model.generate(
        input_ids=inputs,
        streamer=streamer,
        max_new_tokens=512,
        use_cache=True,
        pad_token_id=151645,  # <|im_end|> token id
        eos_token_id=151645,
        num_return_sequences=1
    )

    # Update history
    decoded = tokenizer.decode(response[0]).split('<|im_start|>assistant')[-1].split('<|im_end|>')[0].strip()
    history.append(decoded + "<|im_end|>")

Key Notes

  • Device: Automatically uses GPU if available; defaults to CPU.

  • Streaming: TextStreamer enables real-time response display.

  • History: Keeps only the 5 most recent messages to bound memory use.
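
The sliding-window history used above can also be expressed with a bounded deque, so old messages fall off automatically instead of being re-sliced each turn (a minimal sketch; the 5-message window mirrors the example, not a model requirement, and `add_turn` is an illustrative helper, not part of the AquilaX API):

```python
from collections import deque

# Keep only the most recent messages; older ones are dropped automatically.
MAX_MESSAGES = 5
history = deque(maxlen=MAX_MESSAGES)

def add_turn(history, user_text, assistant_text):
    """Record one user/assistant exchange in ChatML form."""
    history.append(f"<|im_start|>user\n{user_text}<|im_end|>")
    history.append(f"<|im_start|>assistant\n{assistant_text}<|im_end|>")

for i in range(4):
    add_turn(history, f"question {i}", f"answer {i}")

# 8 messages were added, but only the last 5 survive.
assert len(history) == 5
```

Joining `history` with newlines then yields the same prompt body as the list-slicing approach in the example.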


Input and Output Format

Input Format

<|im_start|>system
You are Securitron, a helpful AI assistant.
<|im_end|>
<|im_start|>user
{user_question}
<|im_end|>
<|im_start|>assistant

Output Format

<|im_start|>assistant
{generated_response}
<|im_end|>
  • Responses are streamed in real-time with TextStreamer.

  • Cleaned output is plain text for user display.
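
The format above can be assembled and parsed with small string helpers that mirror the decode logic in the inference example (`build_prompt` and `extract_reply` are illustrative names, not part of the AquilaX API):

```python
SYSTEM = "<|im_start|>system\nYou are Securitron, a helpful AI assistant.<|im_end|>"

def build_prompt(question: str) -> str:
    """Wrap a user question in the ChatML input format shown above."""
    return f"{SYSTEM}\n<|im_start|>user\n{question}<|im_end|>\n<|im_start|>assistant"

def extract_reply(decoded: str) -> str:
    """Pull the plain-text assistant reply out of the raw decoded output."""
    return decoded.split("<|im_start|>assistant")[-1].split("<|im_end|>")[0].strip()

# Simulate a raw decoded model output and recover the clean text.
raw = build_prompt("What is AquilaX?") + "\nAquilaX is an AppSec platform.<|im_end|>"
assert extract_reply(raw) == "AquilaX is an AppSec platform."
```
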


Performance Optimization

  • CPU Efficiency: Quantized model ensures low memory usage.

  • History Management: Keep only the most recent messages (the example uses 5) to reduce memory overhead.

  • GPU Support: Enable CUDA for faster inference if available.

  • Response Length: Adjust max_new_tokens for shorter or longer outputs.
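
If you load an unquantized checkpoint yourself, PyTorch's dynamic quantization can shrink the linear layers for CPU inference. The sketch below applies it to a toy module; whether it helps the published AquilaX checkpoint depends on how that checkpoint was exported, so treat this as an assumption to validate:

```python
import torch
import torch.nn as nn

# Toy stand-in for a transformer block; real models quantize the same way.
model = nn.Sequential(nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 8))

# Convert Linear weights to int8; activations remain float (dynamic quantization).
quantized = torch.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

x = torch.randn(1, 64)
with torch.no_grad():
    out = quantized(x)
assert out.shape == (1, 8)  # same interface, smaller weights
```

Benchmark quantized vs. unquantized latency and answer quality on your own prompts before committing to either.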


Deployment Considerations

  • Environment: Use a virtual environment to manage dependencies.

  • Error Handling: Wrap model calls in try/except blocks for robust error management.
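
The error-handling advice can be sketched as a thin wrapper around the generation call, so a failure returns a fallback answer instead of crashing the service (`safe_generate` and `failing_model` are illustrative names; `generate_fn` stands in for your actual model call):

```python
import logging

def safe_generate(generate_fn, prompt, fallback="Sorry, something went wrong."):
    """Call the model and fall back to a canned answer instead of crashing."""
    try:
        return generate_fn(prompt)
    except (RuntimeError, ValueError) as exc:  # e.g. CUDA OOM or bad input
        logging.error("generation failed: %s", exc)
        return fallback

def failing_model(prompt):
    raise RuntimeError("out of memory")

assert safe_generate(str.upper, "hi") == "HI"
assert safe_generate(failing_model, "hi") == "Sorry, something went wrong."
```

In production you would pass the real `model.generate` pipeline as `generate_fn` and tailor the caught exception types to the failures you actually observe.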


Support and Contributions


API Alternative: Use the Securitron API for simpler integration.

Scalability: For production, deploy behind FastAPI or use the Securitron API.

For support or updates, contact the AquilaX team or visit the model's Hugging Face repository: AquilaX-AI/QnA.

Engineering credits: Suriya & Pachaiappan.