The SimpleCxxLib package


#include "lexicon.h"

class Lexicon

This class is used to represent a lexicon, or word list. The main difference between a lexicon and a dictionary is that a lexicon does not provide any mechanism for storing definitions; the lexicon contains only words, with no associated information. It is therefore similar to a set of strings, but with a more space-efficient internal representation. The Lexicon class supports efficient lookup operations for words and prefixes.

As an example of the use of the Lexicon class, the following program lists all the two-letter words in the lexicon stored in EnglishWords.dat:

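   #include <iostream>
   #include <string>
   #include "lexicon.h"
   using namespace std;

   int main() {
      // Load the word list, then print every two-letter word.
      Lexicon english("EnglishWords.dat");
      for (const string &word: english) {
         if (word.length() == 2) {
            cout << word << endl;
         }
      }
      return 0;
   }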

Constructor

Lexicon()
Lexicon(filename)             Initializes a new lexicon.

Methods

add(word)                     Adds the specified word to the lexicon.
addWordsFromFile(filename)    Reads the file and adds all of its words to the lexicon.
clear()                       Removes all words from the lexicon.
contains(word)                Returns true if word is contained in the lexicon.
containsPrefix(prefix)        Returns true if any words in the lexicon begin with prefix.
isEmpty()                     Returns true if the lexicon contains no words.
mapAll(fn)                    Calls the specified function on each word in the lexicon.
size()                        Returns the number of words contained in the lexicon.

Constructor detail


Lexicon();
Lexicon(string filename);
Initializes a new lexicon. The default constructor creates an empty lexicon. The second form reads in the contents of the lexicon from the specified data file. The data file must be in one of two formats: (1) a space-efficient precompiled binary format or (2) a text file containing one word per line. The Stanford library distribution includes a binary lexicon file named English.dat containing a list of words in English. The standard code pattern to initialize that lexicon looks like this:
   Lexicon english("English.dat");

Usage:

Lexicon lex;
Lexicon lex(filename);
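
The text-file format mentioned above (one word per line) can be illustrated with a small loader. This is only a sketch and not the library's implementation: loadWordList is a hypothetical helper that reads from any std::istream into a std::set<std::string>, and the lowercasing, blank-line skipping, and carriage-return handling are assumptions made for the example.

```cpp
#include <algorithm>
#include <cctype>
#include <istream>
#include <set>
#include <sstream>   // std::istringstream, handy for trying the helper without a file
#include <string>

// Hypothetical helper (not part of the Lexicon class): reads one word per
// line from `in` and returns the lowercased words as a set.
std::set<std::string> loadWordList(std::istream &in) {
    std::set<std::string> words;
    std::string line;
    while (std::getline(in, line)) {
        // Assumption: tolerate Windows line endings and skip blank lines.
        if (!line.empty() && line.back() == '\r') line.pop_back();
        if (line.empty()) continue;
        std::transform(line.begin(), line.end(), line.begin(),
                       [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
        words.insert(line);
    }
    return words;
}
```

The helper accepts a std::ifstream opened on a word-list file just as well as a std::istringstream, so it can be tried on a short in-memory list before pointing it at a real file.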

Method detail


int size() const;
Returns the number of words contained in the lexicon.

Usage:

int n = lex.size();

bool isEmpty() const;
Returns true if the lexicon contains no words.

Usage:

if (lex.isEmpty()) ...

void clear();
Removes all words from the lexicon.

Usage:

lex.clear();

void add(string word);
Adds the specified word to the lexicon.

Usage:

lex.add(word);

void addWordsFromFile(string filename);
Reads the file and adds all of its words to the lexicon.

Usage:

lex.addWordsFromFile(filename);

bool contains(string word) const;
Returns true if word is contained in the lexicon. In the Lexicon class, the case of letters is ignored, so "Zoo" is the same as "ZOO" or "zoo".

Usage:

if (lex.contains(word)) ...
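
One way to picture the case-insensitive behavior of contains is to normalize every word to lowercase both when it is stored and when it is looked up. The sketch below uses a hypothetical stand-in class, TinyLexicon, built on std::set; it is an assumption made for illustration and says nothing about how the real Lexicon stores its words.

```cpp
#include <algorithm>
#include <cctype>
#include <set>
#include <string>

// Hypothetical stand-in for illustration only; the real Lexicon uses a
// different, more space-efficient internal representation.
class TinyLexicon {
public:
    // Store the lowercased form so that "Zoo", "ZOO", and "zoo" coincide.
    void add(const std::string &word) { words.insert(toLower(word)); }

    // Lowercase the query the same way before looking it up.
    bool contains(const std::string &word) const {
        return words.count(toLower(word)) > 0;
    }

    int size() const { return static_cast<int>(words.size()); }

private:
    static std::string toLower(std::string s) {
        std::transform(s.begin(), s.end(), s.begin(),
                       [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
        return s;
    }

    std::set<std::string> words;
};
```

With this stand-in, adding "Zoo" and then "ZOO" leaves a single stored word, and a lookup for "zoo" succeeds.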

bool containsPrefix(string prefix) const;
Returns true if any words in the lexicon begin with prefix. Like contains, this method ignores the case of letters so that "MO" is a prefix of "monkey" or "Monday".

Usage:

if (lex.containsPrefix(prefix)) ...
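
A prefix test of this kind can be sketched on top of a sorted container: in sorted order, every word that begins with a given prefix sits at or after the first word that is not less than the prefix, so checking that one candidate is enough. The helper names below (lowered, anyWordStartsWith) are hypothetical, the words are assumed to be stored in lowercase, and the approach is a stand-in technique rather than the one the Lexicon class uses.

```cpp
#include <algorithm>
#include <cctype>
#include <set>
#include <string>

// Hypothetical helper: returns a lowercase copy of `s`.
std::string lowered(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}

// Hypothetical helper: true if any word in `words` (assumed to be stored
// in lowercase) begins with `prefix`, ignoring the case of `prefix`.
bool anyWordStartsWith(const std::set<std::string> &words,
                       const std::string &prefix) {
    const std::string p = lowered(prefix);
    // First word that is not less than p; if any word starts with p,
    // this one does.
    auto it = words.lower_bound(p);
    return it != words.end() && it->compare(0, p.size(), p) == 0;
}
```

Prefix checks like this are typically used to abandon a search early, for example when building candidate words letter by letter and no stored word starts with the letters chosen so far.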

void mapAll(void (*fn)(string)) const;
void mapAll(void (*fn)(const string &)) const;
void mapAll(FunctorType fn) const;
Calls the specified function on each word in the lexicon.

Usage:

lex.mapAll(fn);
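
The three mapAll overloads accept a plain function or a function object. The sketch below shows the same calling pattern with a hypothetical free function, mapAllWords, applied to a std::set; both it and the CollectShortWords functor are assumptions made for illustration and are not part of the library.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical analogue of mapAll: calls fn once for each word, in the
// container's sorted order. Any callable taking a string is accepted.
template <typename FunctorType>
void mapAllWords(const std::set<std::string> &words, FunctorType fn) {
    for (const std::string &word : words) {
        fn(word);
    }
}

// Hypothetical functor: appends every word of at most maxLength letters
// to the vector it refers to. Because the functor is passed by value,
// results are kept in the referenced vector rather than in the functor.
struct CollectShortWords {
    std::vector<std::string> &out;
    std::size_t maxLength;

    void operator()(const std::string &word) const {
        if (word.length() <= maxLength) out.push_back(word);
    }
};
```

A call such as mapAllWords(words, CollectShortWords{result, 2}) fills result with the words of one or two letters; a lambda that captures by reference works in the same position.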