
Query

Query databases in plain English

The Query model (also known as Ask) lets developers query databases in plain English. At its core is a fine-tuned FLAN-T5-base model that translates user questions into PostgreSQL queries. The retrieved results are then combined with the original question to generate a clear, conversational response.

Architecture

  • Base Model: FLAN-T5-base (sequence-to-sequence transformer)

  • Fine-Tuning Objective: Natural language → SQL translation

  • Tokenizer: T5 tokenizer with custom prefix handling

Training Pipeline

Dataset Preparation

  • Curated pairs of questions and SQL statements

  • Normalization: lowercase, remove commas, strip trailing punctuation

Data Split

  • 90% training | 10% validation

Input Formatting

  • Prefix: Translate the following text to PGSQL:
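The preparation steps above (normalization, the 90/10 split, and the task prefix) can be sketched as follows. This is a minimal illustration, not the actual AquilaX pipeline: the function names, the set of trailing punctuation characters, the fixed shuffle seed, and the choice to normalize only the question (not the SQL) are all assumptions.

```python
import random

PREFIX = "Translate the following text to PGSQL: "
TRAILING = "?.!;: "  # assumed set of trailing characters to strip


def normalize(question: str) -> str:
    """Lowercase, remove commas, and strip trailing punctuation."""
    return question.lower().replace(",", "").rstrip(TRAILING)


def prepare(pairs, val_fraction=0.1, seed=0):
    """Turn (question, sql) pairs into prefixed examples and split them.

    Returns (train, validation) lists of {"input", "target"} dicts.
    """
    examples = [
        {"input": PREFIX + normalize(question), "target": sql}
        for question, sql in pairs
    ]
    random.Random(seed).shuffle(examples)  # deterministic shuffle
    n_val = max(1, round(len(examples) * val_fraction))
    return examples[n_val:], examples[:n_val]


# Hypothetical toy data: 20 question/SQL pairs.
pairs = [(f"How many orders, in region {i}?",
          f"SELECT COUNT(*) FROM orders WHERE region_id = {i}")
         for i in range(20)]
train, validation = prepare(pairs)
print(len(train), len(validation))
```

Only the question is normalized here because removing commas from the target SQL would change its meaning.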

Hyperparameters

  • Learning rate: 3e-5

  • Batch size: 4, with 4 gradient-accumulation steps (effective batch size of 16 per device)

  • Epochs: 10

  • Weight decay: 0.01

  • Scheduler: Cosine decay with 10% warmup

  • Label smoothing: 0.1
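To make the schedule concrete, the sketch below computes the effective batch size and the learning rate at a given optimizer step. It assumes linear warmup over the first 10% of steps and cosine decay to zero afterwards; the source states only "cosine decay with 10% warmup", so the warmup shape, the final value, and all names here are assumptions.

```python
import math

PEAK_LR = 3e-5
PER_STEP_BATCH = 4
GRAD_ACCUM_STEPS = 4
WARMUP_FRACTION = 0.10

# Gradients from 4 mini-batches of 4 are accumulated before each update.
EFFECTIVE_BATCH = PER_STEP_BATCH * GRAD_ACCUM_STEPS


def lr_at(step: int, total_steps: int) -> float:
    """Learning rate at an optimizer step: linear warmup, then cosine decay."""
    warmup_steps = int(WARMUP_FRACTION * total_steps)
    if step < warmup_steps:
        return PEAK_LR * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


for step in (0, 50, 100, 550, 1000):
    print(step, lr_at(step, total_steps=1000))
```

If training uses the Hugging Face `Trainer`, these settings would typically map to `learning_rate`, `per_device_train_batch_size`, `gradient_accumulation_steps`, `num_train_epochs`, `weight_decay`, `lr_scheduler_type="cosine"`, `warmup_ratio`, and `label_smoothing_factor`; the source does not say which training framework was used.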

Evaluation

  • Metric: SacreBLEU on validation set

Inference Workflow

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
import torch

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("AquilaX-AI/NL-PGSQL")
model = AutoModelForSeq2SeqLM.from_pretrained("AquilaX-AI/NL-PGSQL")

# Prepare and tokenize input
input_text = "Translate the following text to PGSQL: What is the total sales for last month?"
inputs = tokenizer(input_text, return_tensors="pt")

# Generate SQL query
outputs = model.generate(**inputs, max_length=256)
sql_query = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(sql_query)
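
End to end, the workflow consists of five steps. The snippet above covers steps 2 to 5; step 1 is applied to the question beforehand:

  1. Preprocess: Lowercase, remove commas and trailing punctuation.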

  2. Prefix: Add task instruction.

  3. Tokenize: Convert text to token IDs.

  4. Generate: Produce SQL tokens via model.

  5. Decode: Convert tokens back to string.
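The five steps can be wrapped in one helper so that preprocessing is never skipped. This is a sketch under assumptions: `build_prompt` and `translate` are hypothetical names, and the set of trailing characters stripped in step 1 is a guess at what "trailing punctuation" covers.

```python
PREFIX = "Translate the following text to PGSQL: "


def build_prompt(question: str) -> str:
    """Steps 1 and 2: normalize the question, then add the task prefix."""
    # Step 1: lowercase, remove commas, strip trailing punctuation.
    cleaned = question.lower().replace(",", "").rstrip("?.!;: ")
    # Step 2: prepend the task instruction.
    return PREFIX + cleaned


def translate(question: str, tokenizer, model, max_length: int = 256) -> str:
    """Steps 3 to 5: tokenize, generate, and decode."""
    # Step 3: convert the prompt to token IDs.
    inputs = tokenizer(build_prompt(question), return_tensors="pt")
    # Step 4: generate SQL tokens.
    outputs = model.generate(**inputs, max_length=max_length)
    # Step 5: convert tokens back to a string.
    return tokenizer.decode(outputs[0], skip_special_tokens=True)


print(build_prompt("What is the total sales, for last month?"))
```

With the tokenizer and model loaded from `AquilaX-AI/NL-PGSQL`, a call would look like `translate("What is the total sales for last month?", tokenizer, model)`.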

Integration Guide

A: Installation

pip install transformers torch

B: Model & Tokenizer

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
tokenizer = AutoTokenizer.from_pretrained("AquilaX-AI/NL-PGSQL")
model = AutoModelForSeq2SeqLM.from_pretrained("AquilaX-AI/NL-PGSQL")

C: Query Execution

  1. Sanitize generated SQL

  2. Execute securely against your PostgreSQL database
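As one possible reading of "sanitize", the sketch below accepts only a single read-only statement before anything is sent to the database. The function name and the keyword denylist are hypothetical, and the check is deliberately conservative: it also rejects harmless queries that merely contain a forbidden word inside a string literal. It is a first filter, not a substitute for running the query under a PostgreSQL role that has read-only privileges.

```python
import re

# Hypothetical denylist of statement keywords; adjust to your own policy.
FORBIDDEN = re.compile(
    r"\b(insert|update|delete|drop|alter|truncate|create|grant|revoke|copy)\b",
    re.IGNORECASE,
)


def is_read_only_select(sql: str) -> bool:
    """Return True only for a single SELECT (or WITH ... SELECT) statement."""
    statement = sql.strip().rstrip(";").strip()
    # Reject empty input and anything containing a second statement.
    if not statement or ";" in statement:
        return False
    # The statement must start with SELECT or WITH.
    if not re.match(r"(?i)(select|with)\b", statement):
        return False
    # Reject any data-modifying or DDL keyword anywhere in the text.
    return FORBIDDEN.search(statement) is None


print(is_read_only_select("SELECT SUM(amount) FROM sales;"))
print(is_read_only_select("SELECT 1; DROP TABLE sales"))
```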

D: Response Formatting

Post-process database output into readable text
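The source does not specify how results are turned into text, so the following is only a template-based stand-in. `format_answer` and its parameters are hypothetical; `columns` and `rows` are assumed to come from whatever database driver executed the query.

```python
def format_answer(question, columns, rows, max_rows=5):
    """Render query results as a short, readable text block."""
    if not rows:
        return f'No results found for "{question}".'
    lines = [f'Results for "{question}" ({len(rows)} row(s)):']
    for row in rows[:max_rows]:
        # Pair each column name with its value, e.g. "total_sales: 1250.5".
        fields = ", ".join(f"{col}: {val}" for col, val in zip(columns, row))
        lines.append("  - " + fields)
    if len(rows) > max_rows:
        lines.append(f"  ... and {len(rows) - max_rows} more")
    return "\n".join(lines)


print(format_answer("total sales last month", ["total_sales"], [(1250.5,)]))
```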

Dataset

The training dataset comprises custom-curated pairs of natural language questions and SQL queries, designed to align with the target database schema. Its diversity enhances the model's applicability to varied use cases.

Evaluation Metrics

Model performance is assessed with the SacreBLEU metric, which measures n-gram overlap between generated and reference SQL queries. A high score indicates that generated queries closely match the references; it does not by itself guarantee that a query executes or returns the correct rows.

API & Web Interface

AquilaX App: Users can interact with the model directly on the AquilaX platform at https://aquilax.ai/app/home, entering natural language queries and retrieving SQL output without managing any infrastructure.

API Access: The API, available at https://developers.aquilax.ai/api-reference/genai/securitron, supports programmatic integration for automation and scalability.

Considerations

  • Optimized for a specific schema; may require adaptation for others.

  • Implement error handling for unexpected inputs.

Credit to the Engineering team: Suriya & Pachaiappan

Last updated 22 days ago