Academic Journals Database
Disseminating quality controlled scientific knowledge

Implementing High Performance Lexical Analyzer using CELL Broadband Engine Processor


Journal: International Journal of Engineering Science and Technology
ISSN 0975-5462

Volume: 3;
Issue: 9;
Start page: 6907;
Date: 2011;
VIEW PDF   PDF DOWNLOAD PDF   Download PDF Original page

Keywords: Regular Expressions | Multi-core | Cell Processor | Aho-Corasick algorithm | Lexical Analyzer | Parallel Programming | Pattern matching | Compilers.

The lexical analyzer is the first phase of the compiler and commonly the most time consuming. The compilation of large programs is still far from optimized in today’s compilers. With modern processors moving more towards improving parallelization and multithreading, it has become impossible for performance gains in older compilersas technology advances. Any multicore architecture relies on improving parallelism than on improving single core performance. A compiler that is completely parallel and optimized is yet to be developed and would require significant effort to create. On careful analysis we find that the performance of a compiler is majorly affected by the lexical analyzer’s scanning and tokenizing phases. This effort is directed towards the creation of a completelyparallelized lexical analyzer designed to run on the Cell/B.E. processor that utilizes its multicore functionalities to achieve high performance gains in a compiler. Each SPE reads a block of data from the input and tokenizes them independently. To prevent dependence of SPE’s, a scheme for dynamically extending static block-limits isincorporated. Each SPE is given a range which it initially scans and then finalizes its input buffer to a set of complete tokens from the range dynamically. This ensures parallelization of the SPE’s independently and dynamically, with the PPE scheduling load for each SPE. The initially static assignment of the code blocks is made dynamic as soon as one SPE commits. This aids SPE load distribution and balancing. The PPE maintains the output buffer until all SPE’s of a single stage commit and move to the next stage before being written out to the file, to maintain order of execution. The approach can be extended easily to other multicore architectures as well. Tokenization is performed by high-speed string searching, with the keyword dictionary of the language, using Aho-Corasick algorithm.
Why do you need a reservation system?      Affiliate Program