DictD++

Important: Brevity disclaimer.

Intro

DictD++ is a server that implements the DICT protocol. The protocol specifies a common way for clients to access dictionary information over a network.

More information about the protocol and existing client and server software is available at the dictd.org site.
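To give a feel for what a client sends, here is a minimal sketch of how DEFINE and MATCH command lines could be assembled and how the numeric status code of a reply line could be read, following the command syntax in RFC 2229. The function names (quote_word, define_command, match_command, status_code) are hypothetical and are not taken from DictD++ or libdictclient++; no networking is shown.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: wrap a word in double quotes and backslash-escape
// embedded quotes and backslashes, so that words containing spaces stay a
// single parameter on the command line.
std::string quote_word(const std::string& word) {
    std::string out = "\"";
    for (char c : word) {
        if (c == '"' || c == '\\') out += '\\';
        out += c;
    }
    out += '"';
    return out;
}

// DEFINE <database> <word>, terminated by CRLF.
// The database "*" means all databases, "!" means the first one with a match.
std::string define_command(const std::string& database, const std::string& word) {
    return "DEFINE " + database + " " + quote_word(word) + "\r\n";
}

// MATCH <database> <strategy> <word>, terminated by CRLF.
std::string match_command(const std::string& database,
                          const std::string& strategy,
                          const std::string& word) {
    return "MATCH " + database + " " + strategy + " " + quote_word(word) + "\r\n";
}

// Status replies start with a three-digit code (for example 150, 250, 552).
// Returns that code, or -1 if the line does not start with three digits.
int status_code(const std::string& line) {
    if (line.size() < 3) return -1;
    int code = 0;
    for (int i = 0; i < 3; ++i) {
        unsigned char c = static_cast<unsigned char>(line[i]);
        if (!std::isdigit(c)) return -1;
        code = code * 10 + (c - '0');
    }
    return code;
}
```

With these helpers, match_command("!", "prefix", "new york") yields the line MATCH ! prefix "new york" followed by CRLF, and a client would typically branch on status_code() of the first reply line.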

Although the DICT protocol is quite powerful, in practice more limited (but easier) solutions are typically implemented, most often web-based ones with simple search rules.

Therefore, the area where DictD++ is used is nowadays quite limited. Still, it can be a useful alternative for setting up your own dictionary server if you can find appropriate client software, especially if you require extra functionality (such as extra search strategies) or if you’d like to provide support for already existing dictionaries that DictD++ understands.

Please also note that development of DictD++ stopped quite some time ago (the main development was done in 2002 - 2003, and libdictclient++ was implemented in 2007), and no support is provided. I’d love to work on it further, but I just have no time for that anymore.

Features

  • Portability: DictD++ can be compiled and run on *nix-like systems as well as on Win32. Support for Win32 is native, so no “cygwin” or other emulation packages are used. On Win32 it supports running as a service.

    DictD++ has been tested under FreeBSD 4.4, Linux Mandrake 8.2, Windows 2000 and Windows XP, but it should also run on most other recent systems, provided that they have a C++ compiler that is close to the standard and the necessary support libraries.

  • Full support for more than 200 codepages & encodings (including, of course, UTF-8 and UTF-16) thanks to the ICU library.

  • All algorithms that need collation use ICU collation, which is based on the Unicode Collation Algorithm; this means collation is performed correctly for any locale supported by ICU (more than 230 locales at the moment).

  • DictD-compatible indices as well as dictionaries (including the dz format) are supported. You may need to re-sort indices, though, if the locale you choose differs from the locale the index was created in. A utility to re-sort indices (as well as to produce indices for some specific types of dictionaries) is supplied.

  • Supports the most common strategies: exact, prefix, suffix, substring, regex, Levenshtein (with adjustable edit distances), as well as “top-N” Levenshtein matches. All these strategies work correctly with all of the supported codepages & encodings as well as with all dictionaries.

  • Speed & memory consumption:

    • The server runs in multi-threaded mode (one worker thread per client) to avoid the overhead of creating a new process for each client.

    • Indices are pre-processed to create cache files that allow faster start-up and searching. You can choose between two types of cache files, ‘basic’ and ‘extended’, depending on the available disk space and the desired speed improvement.

    • All dictionary-related files (indices, caches and the dictionaries themselves) are mmapped. This makes access faster compared to regular file access, allows instant start-up and gives very low memory consumption when the server is not in use.

  • Authentication is supported, and flexible access rules are available to restrict access to a particular dictionary or strategy based on user name, group or address.

  • Definitions can be post-processed before they are sent to the client, for example to convert XML to text/html or any other format. The built-in post-processors currently include XSL conversion (to convert XML into anything) and html → text conversion. Several example conversions for XML are supplied. An extension to the OPTION MIME command is introduced to allow the client to request a preferred MIME format.

  • DictD++ is written in C++ (and, I hope, in a standard-conformant way) with heavy STL usage: this should make maintenance easy and eliminate a lot of possible security-related problems like buffer overflows.

    The architecture is highly extensible; all core components are accessed through interfaces, so adding support for a new type of index, dictionary, strategy or transformer is very straightforward.
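As an illustration of the Levenshtein strategies listed above, here is a minimal sketch of an edit-distance match with an adjustable threshold and a “top-N” variant. It is not the DictD++ implementation: it compares raw code points in std::u32string without any ICU collation or normalization, scans the headword list linearly, and the function names (levenshtein, match_within, top_n) are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Two-row dynamic-programming Levenshtein distance over code points:
// the minimum number of insertions, deletions and substitutions that
// turn a into b.
std::size_t levenshtein(const std::u32string& a, const std::u32string& b) {
    std::vector<std::size_t> prev(b.size() + 1), cur(b.size() + 1);
    for (std::size_t j = 0; j <= b.size(); ++j) prev[j] = j;
    for (std::size_t i = 1; i <= a.size(); ++i) {
        cur[0] = i;
        for (std::size_t j = 1; j <= b.size(); ++j) {
            std::size_t subst = prev[j - 1] + (a[i - 1] == b[j - 1] ? 0 : 1);
            cur[j] = std::min({prev[j] + 1, cur[j - 1] + 1, subst});
        }
        std::swap(prev, cur);
    }
    return prev[b.size()];
}

// Threshold strategy: every headword within max_distance edits of the query,
// in the order in which the headwords were given.
std::vector<std::u32string> match_within(const std::u32string& query,
                                         const std::vector<std::u32string>& headwords,
                                         std::size_t max_distance) {
    std::vector<std::u32string> out;
    for (const std::u32string& w : headwords)
        if (levenshtein(query, w) <= max_distance) out.push_back(w);
    return out;
}

// "Top-N" strategy: the n closest headwords, nearest first; ties keep
// their original order because the sort is stable.
std::vector<std::u32string> top_n(const std::u32string& query,
                                  const std::vector<std::u32string>& headwords,
                                  std::size_t n) {
    typedef std::pair<std::size_t, std::u32string> Scored;
    std::vector<Scored> scored;
    for (const std::u32string& w : headwords)
        scored.push_back(Scored(levenshtein(query, w), w));
    std::stable_sort(scored.begin(), scored.end(),
                     [](const Scored& x, const Scored& y) { return x.first < y.first; });
    if (scored.size() > n) scored.erase(scored.begin() + n, scored.end());
    std::vector<std::u32string> out;
    for (const Scored& s : scored) out.push_back(s.second);
    return out;
}
```

The threshold variant suits a fixed edit distance, while the top-N variant always returns up to n candidates regardless of how far away they are. A real server would likely avoid the linear scan over all headwords; that optimization is left out here.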

Compilation

DictD++ uses CMake for building; it should work on Win32 as well as on *nixes.

Dependencies are:

  • Boost library (required): header-based libraries plus the thread library.
  • ICU for Unicode support (required).
  • XercesC for XML support (required).
  • ZLib for DZ dictionary format support (required, but can be commented out, if needed).
  • XalanC for XSL transformations (optional, but highly recommended if you plan to work with transformations).

The package consists of the dictd++ server (which in turn uses libdictd++ for most operations), the dictutil++ utility for preparing indices and dictionaries, and libdictclient++ as an example of a simple client library that uses the DICT protocol.

Download

The latest version is 1.2.5, here are its sources.