18 Generating C++ Scanners
**************************

*IMPORTANT*: the present form of the scanning class is _experimental_
and may change considerably between major releases.

   `flex' provides two different ways to generate scanners for use with
C++.  The first way is to simply compile a scanner generated by `flex'
using a C++ compiler instead of a C compiler.  You should not encounter
any compilation errors (Note: Reporting Bugs).  You can then use C++
code in your rule actions instead of C code.  Note that the default
input source for your scanner remains `yyin', and default echoing is
still done to `yyout'.  Both of these remain `FILE *' variables and not
C++ _streams_.

   You can also use `flex' to generate a C++ scanner class, using the
`-+' option (or, equivalently, `%option c++'), which is automatically
specified if the name of the `flex' executable ends in a `+', such as
`flex++'.  When using this option, `flex' defaults to generating the
scanner to the file `lex.yy.cc' instead of `lex.yy.c'.  The generated
scanner includes the header file `FlexLexer.h', which defines the
interface to two C++ classes.

   The first class, `FlexLexer', provides an abstract base class
defining the general scanner class interface.  It provides the
following member functions:

`const char* YYText()'
     returns the text of the most recently matched token, the
     equivalent of `yytext'.

`int YYLeng()'
     returns the length of the most recently matched token, the
     equivalent of `yyleng'.

`int lineno() const'
     returns the current input line number (see `%option yylineno'), or
     `1' if `%option yylineno' was not used.

`void set_debug( int flag )'
     sets the debugging flag for the scanner, equivalent to assigning to
     `yy_flex_debug' (Note: Scanner Options).  Note that you must
     build the scanner using `%option debug' to include debugging
     information in it.

`int debug() const'
     returns the current setting of the debugging flag.

   Also provided are member functions equivalent to
`yy_switch_to_buffer()', `yy_create_buffer()' (though the first
argument is an `istream*' object pointer and not a `FILE*'),
`yy_flush_buffer()', `yy_delete_buffer()', and `yyrestart()' (again,
the first argument is an `istream*' object pointer).

   The second class defined in `FlexLexer.h' is `yyFlexLexer', which is
derived from `FlexLexer'.  It defines the following additional member
functions:

`yyFlexLexer( istream* arg_yyin = 0, ostream* arg_yyout = 0 )'
     constructs a `yyFlexLexer' object using the given streams for input
     and output.  If not specified, the streams default to `cin' and
     `cout', respectively.

`virtual int yylex()'
     performs the same role as `yylex()' does for ordinary `flex'
     scanners: it scans the input stream, consuming tokens, until a
     rule's action returns a value.  If you derive a subclass `S' from
     `yyFlexLexer' and want to access the member functions and variables
     of `S' inside `yylex()', then you need to use `%option
     yyclass="S"' to inform `flex' that you will be using that subclass
     instead of `yyFlexLexer'.  In this case, rather than generating
     `yyFlexLexer::yylex()', `flex' generates `S::yylex()' (and also
     generates a dummy `yyFlexLexer::yylex()' that calls
     `yyFlexLexer::LexerError()' if called).

`virtual void switch_streams(istream* new_in = 0, ostream* new_out = 0)'
     reassigns `yyin' to `new_in' (if non-null) and `yyout' to
     `new_out' (if non-null), deleting the previous input buffer if
     `yyin' is reassigned.

`int yylex( istream* new_in, ostream* new_out = 0 )'
     first switches the input streams via `switch_streams( new_in,
     new_out )' and then returns the value of `yylex()'.
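
   As a minimal sketch of the `yyclass' mechanism described above, the
following scanner counts words in a member variable of a derived
class.  The class name `WordCounter' and its member `words' are
hypothetical, chosen only for illustration.  The sketch assumes that
the generated scanner has already included `FlexLexer.h' before the
`%{ ... %}' block is copied into it, and it has not been checked
against every `flex' release.

```lex
%{
#include <iostream>

// Hypothetical subclass, for illustration only.  Assumes the generated
// scanner has already included FlexLexer.h at this point.
class WordCounter : public yyFlexLexer {
public:
    WordCounter() : words(0) {}
    virtual int yylex();   // flex generates WordCounter::yylex()
    int words;             // reachable from rule actions via yyclass
};
%}

%option c++ noyywrap yyclass="WordCounter"

%%

[A-Za-z]+   ++words;
.|\n        /* ignore everything else */

%%

int main()
{
    WordCounter counter;   // default streams: cin and cout
    counter.yylex();       // returns 0 at end of input
    std::cout << counter.words << '\n';
    return 0;
}
```

   Because the rule actions are compiled as part of
`WordCounter::yylex()', they can refer to `words' directly.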

   In addition, `yyFlexLexer' defines the following protected virtual
functions which you can redefine in derived classes to tailor the
scanner:

`virtual int LexerInput( char* buf, int max_size )'
     reads up to `max_size' characters into `buf' and returns the
     number of characters read.  To indicate end-of-input, return 0
     characters.  Note that `interactive' scanners (see the `-B' and
     `-I' flags in Note: Scanner Options) define the macro
     `YY_INTERACTIVE'.  If you redefine `LexerInput()' and need to take
     different actions depending on whether or not the scanner might be
     scanning an interactive input source, you can test for the
     presence of this name via `#ifdef' statements.

`virtual void LexerOutput( const char* buf, int size )'
     writes out `size' characters from the buffer `buf', which, while
     `NUL'-terminated, may also contain internal `NUL's if the
     scanner's rules can match text with `NUL's in them.

`virtual void LexerError( const char* msg )'
     reports a fatal error message.  The default version of this
     function writes the message to the stream `cerr' and exits.
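
   The contracts of `LexerInput()' and `LexerOutput()' can be sketched
outside of any scanner.  The helpers below are hypothetical stand-ins
(`read_chunk', `write_chunk', and `copy_through' are not part of
`flex'); they only show the logic an override in a class derived from
`yyFlexLexer' might carry when the underlying source and sink are C++
streams.

```cpp
#include <cassert>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for a LexerInput() body: read up to max_size
// characters into buf and return the count; 0 means end-of-input.
int read_chunk(std::istream& in, char* buf, int max_size)
{
    assert(buf != nullptr && max_size >= 0);
    if (max_size == 0 || !in.good())
        return 0;
    in.read(buf, max_size);
    return static_cast<int>(in.gcount());
}

// Hypothetical stand-in for a LexerOutput() body: write exactly size
// characters, so NULs inside the buffer are not taken as terminators.
void write_chunk(std::ostream& out, const char* buf, int size)
{
    assert(buf != nullptr && size >= 0);
    out.write(buf, size);
}

// Copies text through both helpers: repeated reads until one returns
// 0, each followed by a write of exactly the characters just read.
std::string copy_through(const std::string& text, int chunk_size)
{
    assert(chunk_size > 0);
    std::istringstream in(text);
    std::ostringstream out;
    std::string buf(static_cast<std::string::size_type>(chunk_size), '\0');
    int n;
    while ((n = read_chunk(in, &buf[0], chunk_size)) > 0)
        write_chunk(out, buf.data(), n);
    return out.str();
}
```

   Writing with an explicit length, rather than treating `buf' as a C
string, is what keeps matched text containing `NUL's intact.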

   Note that a `yyFlexLexer' object contains its _entire_ scanning
state.  Thus you can use such objects to create reentrant scanners, but
see also Note: Reentrant.  You can instantiate multiple instances of
the same `yyFlexLexer' class, and you can also combine multiple C++
scanner classes together in the same program using the `-P' option
discussed above.

   Finally, note that the `%array' feature is not available to C++
scanner classes; you must use `%pointer' (the default).

   Here is an example of a simple C++ scanner:

             // An example of using the flex C++ scanner class.

         %{
         #include <iostream>
         using namespace std;
         int mylineno = 0;
         %}

         %option noyywrap c++

         string  \"[^\n"]+\"

         ws      [ \t]+

         alpha   [A-Za-z]
         dig     [0-9]
         name    ({alpha}|{dig}|\$)({alpha}|{dig}|[_.\-/$])*
         num1    [-+]?{dig}+\.?([eE][-+]?{dig}+)?
         num2    [-+]?{dig}*\.{dig}+([eE][-+]?{dig}+)?
         number  {num1}|{num2}

         %%

         {ws}    /* skip blanks and tabs */

         "/*"    {
                 int c;

                 while((c = yyinput()) != 0)
                     {
                     if(c == '\n')
                         ++mylineno;

                     else if(c == '*')
                         {
                         if((c = yyinput()) == '/')
                             break;
                         else
                             unput(c);
                         }
                     }
                 }

         {number}  cout << "number " << YYText() << '\n';

         \n        mylineno++;

         {name}    cout << "name " << YYText() << '\n';

         {string}  cout << "string " << YYText() << '\n';

         %%

         int main( int /* argc */, char** /* argv */ )
             {
             FlexLexer* lexer = new yyFlexLexer;
             while(lexer->yylex() != 0)
                 ;
             return 0;
             }
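
   To explore which strings the `number' definition in the example
accepts, the same pattern can be tried with `std::regex'.  This is
only a rough aid: the helper `is_number' is hypothetical, and it tests
a whole string, whereas a `flex' scanner takes the longest match at
the current input position.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: true if s, taken as a whole, matches the
// example's {number} definition, that is {num1}|{num2}.
bool is_number(const std::string& s)
{
    static const std::regex number(
        R"([-+]?[0-9]+\.?([eE][-+]?[0-9]+)?)"       // num1
        "|"
        R"([-+]?[0-9]*\.[0-9]+([eE][-+]?[0-9]+)?)"  // num2
    );
    return std::regex_match(s, number);
}

// Spot checks; each assertion documents one case.
void check_number_pattern()
{
    assert(is_number("42"));      // num1: digits only
    assert(is_number("3."));      // num1: trailing dot allowed
    assert(is_number("-.5e3"));   // num2: no digits before the dot
    assert(is_number("1.5"));     // num2: digits on both sides
    assert(!is_number("1e"));     // exponent needs digits
    assert(!is_number("."));      // a lone dot is not a number
}
```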

   If you want to create multiple (different) lexer classes, use the
`-P' flag (or the `prefix=' option) to rename each `yyFlexLexer' to
some other `xxFlexLexer'.  You can then include `<FlexLexer.h>' in your
other sources once per lexer class, first renaming `yyFlexLexer' as
follows:

         #undef yyFlexLexer
         #define yyFlexLexer xxFlexLexer
         #include <FlexLexer.h>

         #undef yyFlexLexer
         #define yyFlexLexer zzFlexLexer
         #include <FlexLexer.h>

   if, for example, you used `%option prefix="xx"' for one of your
scanners and `%option prefix="zz"' for the other.

