Our AI Models
RakshaLink uses a multi-layered AI approach to detect fraud with high accuracy while maintaining complete privacy through on-device processing.
Conv1D Neural Network v1.4
Primary detection model for real-time fraud analysis
TinyBERT v0.9
Advanced contextual analysis for complex messages
Pattern Detector v2.2
Rule-based system for known fraud patterns
Performance Metrics
Detection Capabilities
Evaluated on a test set of 5,200 messages (Aug-Sep 2025) covering a mix of Hindi, English, and Hinglish
| Fraud Type | Detection Rate (Recall at Operating Point) | False Positive Rate | Test Set |
|---|---|---|---|
| UPI/Payment Frauds | 94-97% (±2.1%) | 0.8-1.6% | N=1,850 |
| Phishing Links | 92-95% (±2.8%) | 1.2-2.8% | N=1,200 |
| KYC/Identity Scams | 91-94% (±3.1%) | 1.5-3.5% | N=980 |
| Digital Arrest Threats | 96-99% (±1.8%) | 0.5-1.2% | N=420 |
| Job/Investment Scams | 89-92% (±3.5%) | 2.1-4.8% | N=550 |
| Lottery/Prize Frauds | 93-96% | 0.9-2.1% | N=200 |
Language Support
- English (Primary)
- Hindi (Including Hinglish/Romanized)
- Mixed language messages and code-switching
- High-volume WhatsApp patterns in Marathi & Bengali
- Early support for Kannada & Telugu (work-in-progress)
Resource Usage
Tested on devices: Snapdragon 6/7/8 series, Android 11-14, 4-8GB RAM
Memory
Battery
How We Measure
Dataset Summary
Training & Evaluation Data
- Total Messages: 50,000+ labeled samples (anonymized)
- Sources: 70% WhatsApp, 25% SMS, 5% Other
- Language Split: 40% English, 35% Hindi/Hinglish, 25% Mixed/Regional
- Time Window: June 2024 - September 2025
- Balance Strategy: Oversampling rare fraud types, undersampling common legitimate messages
- Validation: Time-based split (80/10/10 train/val/test)
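A time-based 80/10/10 split like the one listed above can be sketched as follows. This is a minimal illustration, not the production pipeline: the `timestamp` field, the toy samples, and the function name `time_based_split` are assumptions made for the example. The point is that messages are ordered chronologically before cutting, so the test set only contains messages newer than anything seen in training.

```python
from datetime import datetime, timedelta


def time_based_split(samples, train_pct=80, val_pct=10):
    """Split chronologically: oldest -> train, middle -> val, newest -> test.

    Percentages are integers so the cut points use exact integer arithmetic.
    """
    ordered = sorted(samples, key=lambda s: s["timestamp"])
    n = len(ordered)
    train_end = n * train_pct // 100
    val_end = train_end + n * val_pct // 100
    return ordered[:train_end], ordered[train_end:val_end], ordered[val_end:]


# Hypothetical toy data: one message per day starting June 2024.
start = datetime(2024, 6, 1)
samples = [
    {"timestamp": start + timedelta(days=i), "text": f"msg {i}", "label": i % 2}
    for i in range(100)
]

train, val, test = time_based_split(samples)
print(len(train), len(val), len(test))
```

Because the cut is by time rather than at random, near-duplicate scam campaigns from the same week are less likely to leak from training into the test set.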
Evaluation Protocol
- Operating Point: Optimized for F2 score (emphasizing recall)
- Cross-validation: 5-fold with stratified sampling
- Adversarial Tests: Unicode tricks (ZWSP, RLO), homoglyphs, compressed URLs
- Confidence Intervals: Bootstrap with 1000 iterations
- Drift Monitoring: Monthly review of false positives/negatives
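To illustrate what "optimized for F2 score" means for the operating point, here is a small sketch that sweeps candidate thresholds over validation scores and keeps the one with the highest F-beta (beta = 2). The scores, labels, and the function names `fbeta` and `pick_operating_point` are hypothetical and chosen only for this example; the actual selection procedure may differ.

```python
def fbeta(precision, recall, beta=2.0):
    """F-beta score; beta=2 treats recall as twice as important as precision."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def pick_operating_point(scores, labels, beta=2.0):
    """Try every observed score as a threshold and return (threshold, best F-beta)."""
    best_threshold, best_score = None, -1.0
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        fn = sum(1 for s, y in zip(scores, labels) if s < t and y == 1)
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        f = fbeta(precision, recall, beta)
        if f > best_score:
            best_threshold, best_score = t, f
    return best_threshold, best_score


# Hypothetical validation scores (model confidence) and labels (1 = fraud).
scores = [0.1, 0.2, 0.4, 0.6, 0.8, 0.9]
labels = [0, 0, 1, 0, 1, 1]

threshold, f2 = pick_operating_point(scores, labels)
print(threshold, f2)
```

In this toy data the sweep settles on the lowest threshold that still catches every fraud message, accepting one false positive in exchange for full recall. That trade is what an F2 objective rewards.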
Safety Principles
🔒 Privacy by Design
Detection runs locally on your device. No chat content, contacts, call logs, or OTPs are uploaded. Optional aggregated statistics (if enabled) contain no message text.
🎯 High Precision Focus
Tuned to reduce false alarms on legitimate bank OTPs, bill reminders, and delivery notifications. We prioritize avoiding incorrect flags over catching every possible threat.
🔄 Rapid Updates
Rules and models are updated offline and shipped via opt-in app updates. There is no real-time learning on your device, which prevents model poisoning.
⚖️ Bias Mitigation
Regular audits for linguistic and regional bias. Balanced training across urban/rural usage patterns and socioeconomic contexts.
Known Limitations
⚠️ Current Limitations
- Unicode Evasion: Advanced tricks (ZWSP, RLO/U+202E, homoglyphs) combined with truncated notifications may evade detection
- Media-Only Scams: Pure image/audio scams (voice notes, stickers) are detected only when their metadata contains text
- New Patterns: New bank brands and newly registered phishing domains have a cold-start detection lag (24-48 hours)
- Ambiguous Messages: Legitimate KYC reminders, mandate notices, e-commerce fees may trigger cautionary flags
- Regional Coverage: Lower accuracy for pure regional languages (Assamese, Odia, etc.) without Romanization
- Device Limits: Cannot protect against device-level malware or screen-overlay attacks
Technical Details
Model Architecture
- Conv1D Network: 3-layer CNN with max pooling, dropout (0.3), batch normalization
- TinyBERT: 4-layer, 312-hidden, 12-heads, 14.5M parameters, distilled from BERT-base
- Tokenizer: WordPiece with 10K vocab, special tokens for Indian financial terms
- Cascade System: Confidence gating at 0.35/0.65 thresholds for efficiency
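One plausible reading of the confidence gating above is sketched here: the fast Conv1D score decides on its own when it is clearly low (below 0.35) or clearly high (above 0.65), and only uncertain messages are escalated to TinyBERT. The routing logic, the final 0.5 cutoff, and the stub models are assumptions for illustration; only the 0.35/0.65 thresholds come from the list above.

```python
LOW_THRESHOLD = 0.35   # below this, the fast model's "legitimate" verdict is accepted
HIGH_THRESHOLD = 0.65  # above this, the fast model's "fraud" verdict is accepted


def cascade_predict(text, fast_model, slow_model,
                    low=LOW_THRESHOLD, high=HIGH_THRESHOLD, final_cutoff=0.5):
    """Run the fast model first; call the slow model only in the uncertain band.

    final_cutoff is a hypothetical decision threshold for the slow model.
    """
    fast_score = fast_model(text)
    if fast_score < low:
        return {"is_fraud": False, "score": fast_score, "stage": "conv1d"}
    if fast_score > high:
        return {"is_fraud": True, "score": fast_score, "stage": "conv1d"}
    slow_score = slow_model(text)
    return {"is_fraud": slow_score >= final_cutoff, "score": slow_score, "stage": "tinybert"}


# Stub models standing in for the real networks (hypothetical scores).
def fake_conv1d(text):
    canned = {"hello": 0.05, "verify kyc now": 0.50, "pay to avoid arrest": 0.95}
    return canned.get(text, 0.5)


def fake_tinybert(text):
    return 0.80


messages = ["hello", "verify kyc now", "pay to avoid arrest"]
results = [cascade_predict(m, fake_conv1d, fake_tinybert) for m in messages]
for message, result in zip(messages, results):
    print(message, result)
```

Under this design the slower model runs only for the ambiguous middle band, which is how a cascade can keep average latency closer to the fast model's.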
Evaluation Metrics
| Metric | Conv1D | TinyBERT | Cascade |
|---|---|---|---|
| Precision | 94.8% (±2.2%) | 96.5% (±1.8%) | 96.2% (±1.9%) |
| Recall | 93.2% (±2.5%) | 94.2% (±2.1%) | 95.1% (±2.0%) |
| F1-Score | 94.0% (±2.3%) | 95.3% (±1.9%) | 95.6% (±1.8%) |
| Latency (P95) | 28ms | 48ms | 35ms |
| PR-AUC | 0.947 | 0.968 | 0.971 |
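As a quick consistency check on the table above, the F1 row can be recomputed from the precision and recall rows using the harmonic mean. This sketch uses only the point estimates from the table; the helper name `f1` is ours.

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall (same units as the inputs)."""
    return 2 * precision * recall / (precision + recall)


# Point estimates (in percent) copied from the table above.
reported = {
    "Conv1D": (94.8, 93.2, 94.0),
    "TinyBERT": (96.5, 94.2, 95.3),
    "Cascade": (96.2, 95.1, 95.6),
}

recomputed = {name: f1(p, r) for name, (p, r, _) in reported.items()}
for name, value in recomputed.items():
    print(f"{name}: recomputed F1 = {value:.2f}%, reported = {reported[name][2]}%")
```

The recomputed values agree with the reported F1 figures to within rounding.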
Red Team Test Results
| Attack Vector | Attack Success Rate (Pre-mitigation) | Attack Success Rate (Current) |
|---|---|---|
| Homoglyph Substitution | 45% | < 5% |
| Hidden Unicode (ZWSP) | 30% | < 2% |
| Adversarial Noise | 15% | < 8% |
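The kind of text normalization that typically reduces homoglyph and hidden-Unicode attacks can be sketched as below. This is an illustrative assumption about the class of mitigation, not a description of the shipped preprocessing: the character sets and the name `normalize_message` are hypothetical, and the homoglyph table is deliberately tiny.

```python
import unicodedata

# Zero-width and bidirectional control characters commonly used to hide text.
INVISIBLE_CHARS = {
    "\u200b",  # ZWSP, zero width space
    "\u200c",  # zero width non-joiner
    "\u200d",  # zero width joiner
    "\u2060",  # word joiner
    "\ufeff",  # zero width no-break space / BOM
    "\u202a", "\u202b", "\u202c", "\u202d",  # bidi embeddings and overrides
    "\u202e",  # RLO, right-to-left override
}

# Small hypothetical map of Cyrillic look-alikes to Latin letters.
HOMOGLYPHS = {
    "\u0430": "a",  # Cyrillic a
    "\u0435": "e",  # Cyrillic ie
    "\u043e": "o",  # Cyrillic o
    "\u0440": "p",  # Cyrillic er
    "\u0441": "c",  # Cyrillic es
    "\u0456": "i",  # Cyrillic i (Ukrainian/Belarusian)
}


def normalize_message(text):
    """Fold compatibility forms, drop invisible characters, map known look-alikes."""
    text = unicodedata.normalize("NFKC", text)
    text = "".join(ch for ch in text if ch not in INVISIBLE_CHARS)
    return "".join(HOMOGLYPHS.get(ch, ch) for ch in text)


spoofed = "p\u0430y\u200bment"   # Cyrillic "a" plus a hidden ZWSP
cleaned = normalize_message(spoofed)
print(cleaned)
```

One caveat with this sketch: U+200C and U+200D are legitimately used in some Indic scripts, and a blanket Cyrillic-to-Latin map would damage genuine Cyrillic text, so a real system would need script-aware rules rather than unconditional stripping.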
User Control
- All AI features can be disabled at any time
- Detection thresholds are adjustable in developer settings (tap the version number 7 times to unlock)
- Model updates are optional with clear changelog
- Explanations show which signals triggered detection
- One-tap feedback for false positives/negatives
Transparency
- Model versions and last update date shown in app
- Detection confidence scores (0-100) for each alert
- Monthly aggregated accuracy reports (opt-in)
- Source patterns shown for rule-based detections
- No hidden data collection or behavioral profiling
Security
🔐 Security Measures
- Models stored in app sandbox with integrity checks
- SHA-256 verification on model load
- Kill-switch for emergency model disable (auto-recovery after 24h)
- No network required for core detection
- Play Integrity API for app attestation (where available)
- Android Keystore for sensitive keys (hardware-backed when supported)
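The SHA-256 check on model load listed above can be illustrated with a short sketch. The function name `load_verified_model`, the file name, and the way the expected digest is supplied are assumptions for the example; the app's actual loader is not shown in this document. The file is read once, and the same bytes are both hashed and returned, so the bytes that get verified are the bytes that get used.

```python
import hashlib
import hmac
import os
import tempfile


def load_verified_model(path, expected_sha256_hex):
    """Return the model bytes only if their SHA-256 digest matches the expected value."""
    with open(path, "rb") as f:
        blob = f.read()
    actual = hashlib.sha256(blob).hexdigest()
    # compare_digest avoids leaking how many leading characters matched.
    if not hmac.compare_digest(actual, expected_sha256_hex.lower()):
        raise ValueError("model integrity check failed")
    return blob


# Demo with a throwaway file standing in for a model artifact (hypothetical).
with tempfile.TemporaryDirectory() as tmp:
    model_path = os.path.join(tmp, "model.bin")
    with open(model_path, "wb") as f:
        f.write(b"fake-model-weights")

    good_digest = hashlib.sha256(b"fake-model-weights").hexdigest()
    loaded = load_verified_model(model_path, good_digest)

    try:
        load_verified_model(model_path, "0" * 64)
        tamper_detected = False
    except ValueError:
        tamper_detected = True

print(len(loaded), tamper_detected)
```

A hash check like this only helps if the expected digest itself comes from a trusted source, for example bundled in a signed app release.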
Feedback & Improvement
Your feedback improves detection while preserving privacy. Here's how you can help:
Report False Positives
Help reduce incorrect flags on legitimate bank messages and OTPs
Report Missed Scams
Share new fraud patterns we haven't seen (text only, no personal info)
Anonymous Metrics
Opt-in telemetry shares detection counts, not message content
📧 Contact Us
For technical inquiries about our AI models: support@rakshalink.in
Report security issues: security@rakshalink.in
Footnotes
1 Device testing methodology: 7-day continuous monitoring using Android Battery Historian and Perfetto. Devices represent 68% of Indian smartphone market share (IDC Q3 2024). Memory measured via Android Studio Profiler during active notification processing. Battery impact calculated as delta from baseline idle consumption.
2 Detection rates represent recall at our chosen operating point, which optimizes the F2 score (weighting recall twice as heavily as precision). Confidence intervals are calculated using the Wilson score method on the holdout test set. False positive rates are measured on a curated corpus of legitimate messages (N=12,000) that includes bank OTPs, delivery notifications, and appointment reminders.
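For readers unfamiliar with the Wilson score method mentioned in footnote 2, the following sketch computes a 95% interval for a proportion. The counts in the example (190 detected out of 200) are hypothetical and are not taken from the evaluation results; `wilson_interval` is our own helper name.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives about 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    z2 = z * z
    denom = 1 + z2 / n
    center = (p + z2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z2 / (4 * n * n)) / denom
    return center - half_width, center + half_width


# Hypothetical example: 190 of 200 fraud messages detected.
low, high = wilson_interval(190, 200)
print(f"recall = {190 / 200:.3f}, 95% CI = ({low:.3f}, {high:.3f})")
```

Unlike the simple normal approximation, the Wilson interval stays inside [0, 1] and behaves sensibly for rates close to 100%, which matters for small categories.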