N-best ASR hypothesis set H = {h₁, h₂, ..., hₙ}, each hypothesis paired with a confidence score
Low-rank matrix injection (r=8)
Genetic algorithm-based prompt tuning
TokenLimit + Deduplication + Filtering
Parameter-efficient tuning that updates only 0.3%–1.2% of model parameters via low-rank decomposition W' = W + BA, with B ∈ ℝ^{d×r} and A ∈ ℝ^{r×k}:
Applied to the query and value projection layers of the attention mechanism, this reduces trainable parameters per weight matrix from d·k to r(d+k).
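The low-rank injection above can be sketched as follows. This is an illustrative NumPy mock-up, not the paper's implementation: the frozen weight `W` keeps the pretrained behavior, while only the factors `B` and `A` would be trained, and the `alpha` scaling and initialization follow common LoRA practice rather than anything stated in the text.

```python
import numpy as np

class LoRALinear:
    """Minimal LoRA sketch: frozen dense weight W (d x k) plus a trainable
    low-rank delta B @ A, so trainable parameters drop from d*k to r*(d+k)."""

    def __init__(self, d, k, r=8, alpha=16, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.standard_normal((d, k)) * 0.02  # frozen pretrained weight
        self.A = rng.standard_normal((r, k)) * 0.01  # trainable low-rank factor
        self.B = np.zeros((d, r))                    # zero init: delta starts at 0
        self.scale = alpha / r

    def __call__(self, x):
        # y = x W^T + scale * x (B A)^T ; at init the LoRA path adds nothing
        return x @ self.W.T + self.scale * (x @ self.A.T @ self.B.T)

    def trainable_params(self):
        return self.A.size + self.B.size

layer = LoRALinear(d=64, k=64, r=8)
print(layer.trainable_params())  # r*(d+k) = 8*128 = 1024, vs dense d*k = 4096
```

With d = k = 64 the LoRA path trains 1024 parameters instead of the dense layer's 4096, matching the r(d+k) vs d·k count; at larger model dimensions this is how the 0.3%–1.2% update fractions arise.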
Genetic algorithm with a population size of 100, run for 50 iterations, using selection, crossover, and mutation operators.
Multi-level error filtering mechanism combining token-length limiting (TokenLimit), deduplication, and confidence-based filtering.
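The three filtering stages can be sketched as a single pass over the hypothesis list. The stage names (TokenLimit, Deduplication, Filtering) come from the text; the specific thresholds (`max_tokens=64`, `min_confidence=0.1`) and the (text, confidence) pair representation are assumptions for illustration.

```python
def filter_hypotheses(hypotheses, max_tokens=64, min_confidence=0.1):
    """hypotheses: list of (text, confidence) pairs, e.g. the N-best set H.
    Thresholds here are illustrative, not the paper's values."""
    seen, kept = set(), []
    for text, conf in hypotheses:
        if len(text.split()) > max_tokens:   # 1) TokenLimit: drop overlong outputs
            continue
        key = text.strip().lower()
        if key in seen:                      # 2) Deduplication: keep first occurrence
            continue
        if conf < min_confidence:            # 3) Filtering: drop low-confidence entries
            continue
        seen.add(key)
        kept.append((text, conf))
    return kept

H = [("the cat sat", 0.9), ("The cat sat", 0.8), ("uh", 0.05), ("the cat sat down", 0.6)]
print(filter_hypotheses(H))  # [('the cat sat', 0.9), ('the cat sat down', 0.6)]
```

Running the stages in this order means cheap length and duplicate checks prune candidates before the confidence test, which matters when the downstream LLM call is the expensive step.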
Figure 1: 3D visualization of evolutionary prompt optimization process showing initial population, selection, crossover, and mutation stages
Figure 2: Comprehensive WER comparison showing performance improvements across different models and configurations
Figure 3: Radar chart showing the relative contribution of each technique component to overall WER reduction
Figure 4: End-to-end process flow of the EvoLoRA framework from audio input to final transcription
Figure 5: Comparison between original dense Transformer layer weights and LoRA low-rank matrix injection structure
Figure 6: 3D visualization comparing optimized models versus baseline models across different model sizes
| Model | Parameters | Zero-shot WER | Optimized WER | Training Time | Parameter Updates |
|---|---|---|---|---|---|
| TinyLlama-1.1B | 1.1B | 285.89% | 7.47% | 2.3 GPU hours | 0.3% |
| Qwen2.5-7B | 7B | 1503.20% | 6.25% | 18.5 GPU hours | 1.2% (LoRA) |
| LLaMA-3.2-3B | 3.2B | 452.71% | 8.12% | 12.7 GPU hours | 100% |
| GPT-3.5 Turbo | 175B | 44.90% | 33.96% | 0.1 hours (API) | 0% |
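The WER values in the table are the standard word error rate, (substitutions + deletions + insertions) / reference length. A minimal computation sketch clarifies why zero-shot values above 100% are possible: insertions are counted against a fixed reference length, so a model that hallucinates extra words can exceed 100%.

```python
def wer(reference, hypothesis):
    """Word error rate via word-level Levenshtein distance (dynamic programming)."""
    r, h = reference.split(), hypothesis.split()
    # d[i][j] = min edits to turn the first i reference words into the first j hypothesis words
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i  # deletions
    for j in range(len(h) + 1):
        d[0][j] = j  # insertions
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            cost = 0 if r[i - 1] == h[j - 1] else 1  # substitution cost
            d[i][j] = min(d[i - 1][j] + 1,       # deletion
                          d[i][j - 1] + 1,       # insertion
                          d[i - 1][j - 1] + cost)
    return d[len(r)][len(h)] / len(r)

print(wer("the cat sat", "the cat sat"))      # 0.0
print(wer("the cat", "the the cat cat sat"))  # 3 insertions / 2 words = 1.5 (> 100%)
```

This is a generic WER implementation, not the paper's evaluation code; in practice a library such as `jiwer` with the paper's exact text normalization would be used.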
Individual contribution of each component to overall WER reduction on Qwen2.5-7B:
Despite these significant performance improvements, our framework still trails the SOTA method Hypo2Trans (2.20% WER) by 4.05 percentage points. Core limitations include:
The framework strikes a practical balance between performance and resource consumption:
This study proposes the EvoLoRA framework for resource-efficient ASR error correction, achieving significant performance improvements through synergistic integration of LoRA, EvoPrompt, and auxiliary techniques. Key contributions include:
The framework provides a cost-effective solution for ASR error correction in edge devices and low-resource environments, demonstrating that careful optimization can achieve competitive performance with significantly reduced computational requirements.