This paper presents the design of a FPGA-based hardware co-processor, based on the SPHINX 3 speech recognition engine from CMU; capable of performing acoustic modeling (AM) for medium sized vocabularies in real-time. By creating an input-driven pipeline for performing the calculations, we were able to maximize the throughput of the system while simultaneously minimizing the number of pipeline stalls. Use advanced placement techniques enabled post place-and-route speeds even greater than those necessary for real-time operation while operating at maximum workload. Further, by using input control vectors all FSMs were removed from the design, greatly increasing the flexibility of the design. These results combined with the ability to reprogram the system for different recognition tasks serve to create a system capable of in a vast array of environments. Synthesis to both Xilinx Virtex 4 and Spartan 3 FPGAs helps to further characterize the flexibility of the architecture.
The different versions of the original document can be found in:
Published on 01/01/2006
Volume 2006, 2006
DOI: 10.1109/ipdps.2006.1639473
Licence: CC BY-NC-SA license
Are you one of the authors of this document?