Lex programming language

Lex is a program that generates lexical analyzers ("scanners" or "lexers"). It is commonly used together with the yacc parser generator. Lex is the standard lexical analyzer generator on Unix systems and is included in the POSIX standard. A popular free version of lex is flex, the "fast lexical analyzer". Lex reads an input file that specifies the lexical analyzer and outputs C source code implementing the lexer.

Structure of a lex file

The structure of a lex file is intentionally similar to that of a yacc file; files are divided up into three parts: a definition section, a rules section, and a C code section. Sections are separated by lines that contain only two percent signs: %%

The definition section is the place to define macros using regular expressions and to include header files written in C. C code placed between %{ and %} in this section is copied verbatim into the generated source file.

The rules section is the most important section; it associates patterns with C statements. When lex sees text in its input that matches a given pattern, it executes the associated C code. Patterns are regular expressions, which may use the macros defined in the definition section.
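
To make the pairing of a pattern with an action concrete, the following is a minimal hand-written sketch in plain C of what a single rule for the pattern [0-9]+ amounts to. It is not the code that lex generates (lex builds a table-driven matcher); the names scan_integers, action_fn, record_integer and struct match_log are hypothetical and chosen only for this illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical callback type standing in for a rule's C action.
 * It receives the matched text, much as a lex action reads yytext. */
typedef void (*action_fn)(const char *text, void *user);

/*
 * Scan input for runs of digits (the pattern [0-9]+) and call
 * on_integer once per run. Every other character is skipped.
 * Returns the number of matches found.
 */
int scan_integers(const char *input, action_fn on_integer, void *user)
{
    char text[64];
    int matches = 0;
    size_t i = 0;

    assert(input != NULL && on_integer != NULL);

    while (input[i] != '\0') {
        if (isdigit((unsigned char)input[i])) {
            size_t start = i;
            /* Longest match: keep consuming while digits continue. */
            while (isdigit((unsigned char)input[i]))
                i++;
            size_t len = i - start;
            if (len >= sizeof text)
                len = sizeof text - 1;  /* truncate overly long matches */
            memcpy(text, input + start, len);
            text[len] = '\0';
            on_integer(text, user);     /* run the "action" */
            matches++;
        } else {
            i++;  /* nothing of interest here: ignore the character */
        }
    }
    return matches;
}

/* Illustrative output buffer for the example action below. */
struct match_log {
    char buf[256];
    size_t used;
};

/* Example action: append "Saw an integer: N" and a newline to the log. */
void record_integer(const char *text, void *user)
{
    struct match_log *lg = user;
    size_t room = sizeof lg->buf - lg->used;
    int n = snprintf(lg->buf + lg->used, room, "Saw an integer: %s\n", text);
    if (n > 0)
        lg->used += ((size_t)n < room - 1) ? (size_t)n : room - 1;
}
```

Calling scan_integers with a zero-initialized struct match_log and record_integer as the action collects one line per run of digits. A lex specification expresses the same idea declaratively: the pattern replaces the hand-written loop, and the action body replaces the callback.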

The C code section contains C statements and functions that are copied verbatim to the generated source file. These statements typically contain code called by the rules in the rules section. In large programs it is more convenient to place this code in a separate file and link it in at compile time.
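
As an illustration of the kind of helper that could live either in the C code section or in a separately compiled file, here is a small sketch of a function that converts matched text to a number. The name parse_integer is hypothetical; a rule's action might call it as parse_integer(yytext, &value), assuming the function is declared in the definition section.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical helper for a rule action: convert the matched text to a
 * long using strtol. Returns 1 and stores the result in *value on
 * success. Returns 0 if the text is empty, has trailing non-digit
 * characters, or is out of range for a long.
 *
 * Note: strtol itself also accepts leading whitespace and a sign;
 * text matched by a pattern such as [0-9]+ never contains either.
 */
int parse_integer(const char *text, long *value)
{
    char *end;
    long v;

    assert(text != NULL && value != NULL);

    errno = 0;
    v = strtol(text, &end, 10);
    if (end == text || *end != '\0' || errno == ERANGE)
        return 0;

    *value = v;
    return 1;
}
```

Keeping such helpers in their own source file means they can be compiled and tested without running lex, and the lex input file stays focused on patterns and short actions.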

Example flex file

The following is an example input file for the flex version of lex. It recognizes strings of digits (integers) in the input. Given the input "abc123z.!&*2ghj6", the program will print:

 Saw an integer: 123
 Saw an integer: 2
 Saw an integer: 6

 /* 
  * Example lexical analyzer for flex
  *
  * Picks out strings of digits (integers) from the input.
  */
 
 /*** Definition section ***/
 
 %{
 
 /*
  * Some C code to include the C standard I/O library.
  * Everything inside the %{ %} brackets is inserted
  * verbatim into the generated file.
  */
 #include <stdio.h>
 
 %}
 
 /* Macros;  regular expressions */
 DIGIT       [0-9]
 INTEGER     {DIGIT}+
 
 /* This tells flex to read only one input file */
 %option noyywrap
 
 %%
     /*
      * Rules section 
      *
      * Comments in this section must be indented
      * so lex won't mistake them for regular expressions.
      */
 
 {INTEGER}   {
                 /*
                  * This rule prints integers from the input.
                  * yytext is a string containing the matched text.
                  */
                 printf("Saw an integer: %s\n", yytext); 
             }
 
 .           { /* Ignore all other characters. */ }
 
 %%
 /*** C Code section ***/
 
 /*
  * The main program.
  *
  * Call the lexer. Quit when done.
  */
 int main(void)
 {
     /*
      * yyin is the global stream that lex reads from. Assign to it
      * rather than declaring a new local variable of the same name.
      */
     yyin = stdin;
 
     /* Call the lexer. */
     yylex();
 
     return 0;
 }



All Wikipedia text is available under the terms of the GNU Free Documentation License

 