“Pawn Shop” – work done as a need to improve the key/value type of data structure used in computer vision and image search.
PS is a data structure similar with a hash table implemented in different frameworks and libraries. One of the problems that we had when trying to use the existing data structures was time required to insert the key and the values in the data structures. Another issue is the memory required by these data structures.
I have been tested the Intel TBB with the parallel implementation of hash table, still there are some problems with the performance when it comes to insert the keys and values in the data structure.
How it works?
The Pawn Shop storage system consists from a large memory field, when PS tries to insert a key with a specific value it will pick all bits from the value set to 1 and it will set bits in the field based on some hashing function taking in consideration the key when computing the positions in the field.
For e.g. key = 345 and value=45667, the binary representation of 45667 is 1011001001100011. To compute the pos values in the field I use this function:
unsigned int pos = (( key ^ (j + 10) * 0xFF2343 )) % szpsfield;
To mark the bit in the field:
inline void markps_field(int pos)
int byte_ps = pos / 0x7;
int bit_pos = pos - (byte_ps * 0x7);
_ps_field[byte_ps] |= 1 << (bit_pos);
The functions that take keys with values can be executed in parallel on x86 cores or even on SIMD cores. When the code targets x86 cores the insert time can be reduce up to 9x on a machines with 4 cores. On a SIMD chip the time to insert the same number of keys can be reduce up to 31x.
When it comes to memory footprint of the PS data structure we get a memory reduction of 3x, the map from, STL for 15 million values with unsigned int as a key and unsigned int as a value, needs about 491 MB. For the same number of keys and values the PS takes 80 MB.
Some disadvantages of the PS:
- the keys and values cannot be removed from data structure
- is not generic as a map from STL
- when to many keys are inserted in the field memory corruption can happen