FastSig
Important: Brevity disclaimer.
Intro
FastSig is “yet another” implementation of the signals/slots mechanism in C++.
For those who are not familiar with the concept of signals/slots, please have a quick look at the documentation for boost.signals; alternatively, here is a very short code sample:

    #include <iostream>
    #include <string>
    #include <fastsig/fastsig.hpp>

    // Callback function we want to call on some event,
    // passing it a string parameter
    void f(const std::string& str)
    {
        std::cout << "String passed: " << str << std::endl;
    }

    int main()
    {
        // Step 1: construct the signal
        fastsig::signal<void (const std::string&)> sig;

        // Step 2: connect our callback function (AKA slot) to the signal
        sig.connect(&f);

        // Step 3: call all slots connected to this signal so far
        sig("Hello!");
    }

Rationale
Typically, I’m not in favor of implementing things that already exist and are good enough, and there are indeed plenty of good signals/slots implementations for C++. However, I was not able to find an implementation that satisfies all of the following goals:
- The highest possible speed when calling a small number of slots (typically 1-2).
- Small enough to be used in a project without too much overhead, such as adding big libraries - a header-only implementation consisting of 1-2 headers would be ideal.
- Yet generic enough to support all basic concepts of signals/slots.
- Boost-like or BSD-like license - I wanted to be able to include the implementation in a commercial product.
- (A minor one) I do not like pre-instantiated code; I wanted the library to use some kind of macro or template instantiation for signal signatures.
I’ve considered the following possibilities before writing FastSig:
- boost.signals
- libsigc++
- sigslot
- XLObject
Basically, boost.signals and libsigc++ have everything I could dream of; however, the fastest possible slot calls were clearly not their main goal - please see the speed comparison below. Given the size and complexity of these projects, quickly changing them to drastically improve the speed was not an option either.
Sigslot and XLObject, on the other hand, lack some of the features I need, like support for calling global functions.
I’ve also reviewed some other projects, but they were too basic to be considered seriously.
Thus, I’ve decided to do my own implementation that should be close enough to the requirements listed above.
Implementation
The implementation is based on Boost and uses the following Boost libraries / features:
- boost.shared_ptr
- boost.type_traits
- boost.preprocessor
- boost.function (only if slot_type typedef is used)
The syntax is very similar to that of boost.signals or libsigc++.
The maximum number of arguments in a signal signature is set by the macro FASTSIG_MAX_ARGS, which defaults to 8; it can be defined to a different value before fastsig.hpp is included.
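A common way to make such a limit overridable is an #ifndef guard inside the header. The sketch below shows that pattern with a made-up macro, DEMO_MAX_ARGS; it is an assumption about how a default like this is typically supplied, not a copy of what fastsig.hpp does.

```cpp
// DEMO_MAX_ARGS is a hypothetical stand-in for FASTSIG_MAX_ARGS.
// A user who needs a different limit would write this BEFORE the #include:
//     #define DEMO_MAX_ARGS 12

// Inside the header: supply the default only if the user has not defined one.
#ifndef DEMO_MAX_ARGS
#define DEMO_MAX_ARGS 8
#endif

// Expose the effective limit as a compile-time constant.
constexpr int demo_max_args() { return DEMO_MAX_ARGS; }

// No override was given in this translation unit, so the default applies.
static_assert(demo_max_args() == 8, "default limit expected");
```

Because the guard only fires when the macro is still undefined, the override has to appear before the first inclusion of the header in every translation unit that uses it.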
Slot tracking is supported, but in a rather “dirty” way: a pointer to the “trackable” object is passed as an additional parameter to the connect() call:

    #include <fastsig/fastsig.hpp>

    struct A : public fastsig::trackable
    {
        // Note the double parentheses: this declares the call operator
        void operator()(int i)
        {
            // Do something with 'i' here
            // ...
        }
    };

    int main()
    {
        fastsig::signal<void (int)> sig;
        A* a = new A;
        sig.connect(*a /* Slot */, *a /* Trackable object */);
        sig(0);
        delete a;
        // At this point 'a' is detached from 'sig'
    }

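To illustrate the idea behind tracking, here is a minimal standard-library sketch. It assumes one possible mechanism: the tracked object owns a shared "alive" flag that its destructor clears, and the signal skips slots whose flag is cleared. The names trackable_sketch, signal_sketch and receiver are made up for this example, and FastSig itself may implement detaching differently.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for a "trackable" base class: it owns a shared flag
// that is cleared when the object is destroyed.
class trackable_sketch {
public:
    trackable_sketch() : alive_(std::make_shared<bool>(true)) {}
    // A copy is a distinct object, so it gets its own flag.
    trackable_sketch(const trackable_sketch&) : alive_(std::make_shared<bool>(true)) {}
    trackable_sketch& operator=(const trackable_sketch&) { return *this; }
    virtual ~trackable_sketch() { *alive_ = false; }
    std::shared_ptr<bool> alive_flag() const { return alive_; }
private:
    std::shared_ptr<bool> alive_;
};

// Hypothetical signal for slots of type void(int).
class signal_sketch {
public:
    // Store the slot together with the flag of the object it depends on.
    void connect(std::function<void(int)> slot, const trackable_sketch& tracked) {
        slots_.push_back(entry{std::move(slot), tracked.alive_flag()});
    }
    // Call every slot whose tracked object still exists; return how many ran.
    int operator()(int value) const {
        int called = 0;
        for (const entry& e : slots_) {
            if (*e.alive) { e.slot(value); ++called; }
        }
        return called;
    }
private:
    struct entry {
        std::function<void(int)> slot;
        std::shared_ptr<bool> alive;
    };
    std::vector<entry> slots_;
};

struct receiver : trackable_sketch {
    int* sum;
    explicit receiver(int* s) : sum(s) {}
    void on_value(int i) { *sum += i; }
};

int tracking_demo() {
    int sum = 0;
    signal_sketch sig;
    receiver* r = new receiver(&sum);
    sig.connect([r](int i) { r->on_value(i); }, *r);
    int first = sig(5);   // receiver is alive, so the slot runs
    delete r;             // the destructor clears the shared flag
    int second = sig(7);  // the slot is skipped; the dangling pointer is never used
    assert(first == 1 && second == 0);
    return sum;
}
```

In this sketch dead slots are only skipped, never erased; a real implementation would likely also remove them from the list.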
Speed
Here is a speed comparison of several signals/slots implementations.
I’ve used a comparison program posted some time ago on the libsigc++ mailing list, which is capable of measuring boost.signals, Lite (another small signals/slots implementation, apparently written mainly for performance estimation), and libsigc++. I’ve made small adjustments to the testing program to include the FastSig implementation and, additionally, time measurements for direct function calls.
Initially I also tried to measure the speed of “sigslot”, but it turned out to have highly inefficient slot removal (quadratic time in the number of slots), so I had to exclude it.
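To make "quadratic time in the number of slots" concrete, the sketch below counts the work done when n slots are disconnected one at a time. It does not reproduce sigslot's code; it only assumes one plausible cause, namely locating each slot by a linear search, and contrasts it with removal through stored list iterators. All names are made up for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <vector>

// Disconnect n slots one by one, finding each by a linear search.
// Returns the total number of elements inspected.
std::size_t inspections_with_linear_search(std::size_t n) {
    std::vector<int> slots;
    for (std::size_t i = 0; i < n; ++i) slots.push_back(static_cast<int>(i));

    std::size_t inspected = 0;
    // Remove the most recently connected slot first, so every search has to
    // walk the whole remaining vector (the worst case).
    for (std::size_t k = n; k-- > 0;) {
        const int id = static_cast<int>(k);
        for (std::size_t j = 0; j < slots.size(); ++j) {
            ++inspected;
            if (slots[j] == id) {
                slots.erase(slots.begin() + static_cast<std::ptrdiff_t>(j));
                break;
            }
        }
    }
    return inspected;  // n + (n-1) + ... + 1 = n(n+1)/2
}

// Disconnect n slots through iterators remembered at connect time.
// Returns the number of erase operations, each of which is O(1) for std::list.
std::size_t removals_with_handles(std::size_t n) {
    std::list<int> slots;
    std::vector<std::list<int>::iterator> handles;
    for (std::size_t i = 0; i < n; ++i)
        handles.push_back(slots.insert(slots.end(), static_cast<int>(i)));

    std::size_t removals = 0;
    for (std::size_t k = n; k-- > 0;) {
        slots.erase(handles[k]);
        ++removals;
    }
    return removals;
}

void removal_cost_example() {
    assert(inspections_with_linear_search(100) == 5050);  // 100 * 101 / 2
    assert(removals_with_handles(100) == 100);
}
```

With the slot counts used in the tests (up to 500000), the difference between n(n+1)/2 and n steps is large enough to dominate the whole run.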
Tests were run on an Athlon 64 X2 3800+ overclocked to 2.5 GHz; the compiler was MS Visual Studio 2005, with optimizations set for maximum speed.
Boost version - 1.33.1
libsigc++ - 2.0.17
Time is in seconds.
Num Slots | Calls/Slot | Boost | Lite | LibSigC | FastSig | Direct |
---|---|---|---|---|---|---|
1 | 100000000 | 86.2 | 2.3 | 25.2 | 1.2 | 0.5 |
10 | 10000000 | 42.0 | 1.4 | 4.6 | 0.8 | 0.3 |
50 | 2000000 | 46.0 | 1.1 | 5.3 | 0.8 | 0.2 |
100 | 1000000 | 41.1 | 1.2 | 2.9 | 0.9 | 0.2 |
250 | 400000 | 37.2 | 1.3 | 2.6 | 0.9 | 0.2 |
500 | 200000 | 37.4 | 1.5 | 2.7 | 1.5 | 0.2 |
1000 | 100000 | 37.2 | 1.9 | 2.8 | 1.8 | 0.2 |
5000 | 20000 | 52.4 | 8.2 | 10.0 | 14.0 | 0.2 |
10000 | 10000 | 53.6 | 8.4 | 10.3 | 14.5 | 0.2 |
50000 | 2000 | 54.3 | 9.4 | 10.5 | 14.7 | 0.2 |
100000 | 1000 | 52.5 | 8.4 | 10.2 | 14.4 | 0.3 |
500000 | 200 | 52.7 | 8.5 | 10.6 | 14.5 | 0.5 |
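Every row of the table performs the same total number of slot invocations (number of slots × calls per slot = 100,000,000), so the timings in different rows are directly comparable. The sketch below shows how one such row could be timed for plain function pointers stored in a vector. It is not the original test program; time_run, counting_slot and run_result are names made up for this illustration.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// The slot bumps a volatile counter so the compiler cannot drop the calls.
static volatile unsigned long long g_calls = 0;
void counting_slot(int) { g_calls = g_calls + 1; }

struct run_result {
    unsigned long long calls;  // total slot invocations performed
    double seconds;            // wall-clock time of the whole run
};

// Invoke 'num_slots' slots 'calls_per_slot' times each and time the run.
run_result time_run(std::size_t num_slots, std::size_t calls_per_slot) {
    std::vector<void (*)(int)> slots(num_slots, &counting_slot);
    g_calls = 0;

    const auto start = std::chrono::steady_clock::now();
    for (std::size_t c = 0; c < calls_per_slot; ++c)
        for (std::size_t s = 0; s < slots.size(); ++s)
            slots[s](0);
    const auto stop = std::chrono::steady_clock::now();

    assert(g_calls == num_slots * calls_per_slot);
    const std::chrono::duration<double> elapsed = stop - start;
    return run_result{g_calls, elapsed.count()};
}
```

Calling time_run(5000, 20000) would correspond to the row with 5000 slots; replacing the vector of function pointers with a signal object is what a comparison between libraries would measure.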
As the measurements show, boost.signals is the clear outsider in terms of speed. A brief investigation of the generated code showed that its complex iterator over slots performs quite a lot of work and is not inlined by the compiler, which generates serious overhead for each slot invocation.
libsigc++ is better in terms of speed, but still has some problems - most likely due to the creation of a temporary list on the heap for each slot call.
As in the earlier measurement results posted on the libsigc++ list, there is a jump in timings at some point (on my CPU, at 5000 slots), which is presumably caused by the CPU cache becoming too small to fit all the needed data.
Interestingly, FastSig performs worse than Lite and libsigc++ after crossing the cache limit. I do not have a good explanation for that, other than that something in the usage of data structures must be different enough to cause such behavior - probably something related to cache locality?
I’m not really interested in investigating further, however: I would not use FastSig for this number of slots anyway, as that is not what it was designed for.
Things that work
All basic functionality that is expected from signals/slots:
- All entities that support the appropriate signature can be used as slots:
  - Global functions.
  - Function objects.
  - Member functions - via boost::bind or anything similar.
- A “connection” object can be used to query connection status and disconnect a slot from a signal.
- Slot tracking.
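The list above can be sketched with the standard library alone. The example assumes a small stand-in signal (text_signal) and a connection handle with connected() and disconnect() members; these names are made up for illustration and are not taken from FastSig, and std::bind is used here in place of boost::bind.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical connection handle: shares an "active" flag with the signal.
class connection_sketch {
public:
    explicit connection_sketch(std::shared_ptr<bool> active) : active_(std::move(active)) {}
    bool connected() const { return active_ && *active_; }
    void disconnect() { if (active_) *active_ = false; }
private:
    std::shared_ptr<bool> active_;
};

// Hypothetical signal for slots of type void(const std::string&).
class text_signal {
public:
    typedef std::function<void(const std::string&)> slot_type;

    connection_sketch connect(slot_type slot) {
        std::shared_ptr<bool> active = std::make_shared<bool>(true);
        slots_.push_back(std::make_pair(std::move(slot), active));
        return connection_sketch(active);
    }
    // Call every slot that has not been disconnected, in connection order.
    void operator()(const std::string& text) const {
        for (std::size_t i = 0; i < slots_.size(); ++i)
            if (*slots_[i].second) slots_[i].first(text);
    }
private:
    std::vector<std::pair<slot_type, std::shared_ptr<bool> > > slots_;
};

static std::string g_log;

// 1. Global function
void global_slot(const std::string& s) { g_log += "g:" + s + ";"; }

// 2. Function object
struct functor_slot {
    void operator()(const std::string& s) const { g_log += "f:" + s + ";"; }
};

// 3. Member function, bound to an instance with std::bind
struct widget {
    void on_text(const std::string& s) { g_log += "m:" + s + ";"; }
};

std::string slot_kinds_demo() {
    g_log.clear();
    text_signal sig;
    widget w;
    sig.connect(&global_slot);
    connection_sketch c = sig.connect(functor_slot());
    sig.connect(std::bind(&widget::on_text, &w, std::placeholders::_1));

    sig("a");               // all three slots run
    assert(c.connected());
    c.disconnect();         // drop only the function object
    assert(!c.connected());
    sig("b");               // the remaining two slots run
    return g_log;
}
```

All three kinds of callable fit the same slot type because each can be invoked with a single const std::string& argument.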
Things that don’t
This is the list of features that, in my opinion, could be more or less reasonably expected from FastSig, but are not currently provided:
- Return value accumulators - currently I do not see any way to support them in the “traditional” way, as is done in boost.signals or libsigc++, without a serious loss of performance. At some point I may consider implementing this feature using another interface that does not require keeping a temporary copy of the results or writing complex iterators, but currently I just do not need this feature, so I’m not going to work on it.
- Disabling connections - while this would not make the implementation much slower, I do not see good enough reasons / scenarios to justify adding this feature.
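One possible shape for the "another interface" mentioned for return value accumulators is to hand each slot's result to a caller-supplied callback as soon as it is produced. The sketch below is only a guess at such an interface, written with the standard library; int_signal_sketch and emit are made-up names and nothing here exists in FastSig.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical signal whose slots take an int and return an int.
class int_signal_sketch {
public:
    void connect(std::function<int(int)> slot) { slots_.push_back(std::move(slot)); }

    // Call every slot and pass each return value straight to 'consume'.
    // No container of results is built and no custom iterator is needed.
    template <typename Consumer>
    void emit(int arg, Consumer consume) const {
        for (std::size_t i = 0; i < slots_.size(); ++i)
            consume(slots_[i](arg));
    }
private:
    std::vector<std::function<int(int)> > slots_;
};

int double_it(int x) { return 2 * x; }
int square_it(int x) { return x * x; }

int max_result_demo() {
    int_signal_sketch sig;
    sig.connect(&double_it);
    sig.connect(&square_it);

    // Accumulate the maximum of all results.
    int best = 0;
    sig.emit(5, [&best](int r) { if (r > best) best = r; });

    // The same interface can accumulate a sum instead.
    int sum = 0;
    sig.emit(5, [&sum](int r) { sum += r; });
    assert(sum == 35);  // 10 + 25

    return best;
}
```

The accumulation policy lives entirely in the callback, so the emit loop stays as simple as the one for slots without return values.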
Download
The latest version of FastSig is 0.2, here are its sources.
Bugs
Yes, here they are. While I do not promise any kind of support, please contact me if you find something particularly ugly / wrong with the implementation - I might consider fixing it.
Future directions
I am not currently planning further development of “fastsig”, as it has all the features I need. However, this may change in the future if I need to extend its functionality for some of my projects. Most likely, this would be fixes to the current implementation (which I bet has some flaws under some particular usage scenarios), support for return value accumulators and, possibly, support for multi-threaded usage.