Because number of string is not big (~100), you can use next algo:
- Calculate max length of word you have. Let it be N.
- Create
int checks[N];
array of checksum.
- Let's checksum will be sum of all characters in searching phrase. So, you can calculate such checksum for each word from your list (that is known at compile time), and create
std::map<int, std::vector<std::wstring>>
, where int
is checksum of string, and vector should contain all your strings with that checksum.
Create array of such maps for each length (up to N), it can be done at compile time also.
- Now move over big string by pointer. When pointer points to X character, you should add value of X char to all
checks
integers, and for each of them (numbers from 1 to N) remove value of (X-K) character, where K is number of integer in checks
array. So, you will always have correct checksum for all length stored in checks
array.
After that search on map does there exists strings with such pair (length & checksum), and if exists - compare it.
It should give false-positive result (when checksum & length is equal, but phrase is not) very rare.
So, let's say R is length of big string. Then looping over it will take O(R).
Each step you will perform N operations with "+" small number (char value), N operations with "-" small number (char value), that is very fast. Each step you will have to search for counter in checks
array, and that is O(1), because it's one memory block.
Also each step you will have to find map in map's array, that will also be O(1), because it's also is one memory block.
And inside map you will have to search for string with correct checksum for log(F), where F is size of map, and it will usually contain no more then 2-3 strings, so we can in general pretend that it is also O(1).
Also you can check, and if there is no strings with same checksum (that should happens with high chance with just 100 words), you can discard map at all, storing pairs instead of map.
So, finally that should give O(R), with quite small O.
This way of calculating checksum
can be changed, but it's quite simple and completely fast, with really rare false-positive reactions.
std::list
with an array (normal array,std::array
orstd::vector
). It may improve performance a bit. Also, why do you haveL
prefix only on one of the literals? – HolyBlackCat 5 hours agostd::search
and found out it's not fast enough? – David Haim 4 hours agotolower
on each string. or put a specifi comparator if its ascii only it shouldn't take a lot of time anyway. my point is to try to milk the standard library as much as possible, it is very fast – David Haim 4 hours ago