Testing


Regression

The library contains a single comprehensive regression test. For a given container type in this library, the test creates an object of that type and an object of the corresponding standard type (e.g., std::set). It then applies a random sequence of method calls with random arguments (e.g., inserts, erases, and so forth) to both objects. After each operation, the test checks the method's return value; optionally, it also compares this library's object with the standard object and performs other consistency checks on this library's object (e.g., order preservation or node invariants, where applicable).
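The comparison loop described above can be sketched as follows. This is only an illustration, not the library's actual test harness: the function name run_differential_test, the mix of operations, and the key range are assumptions, and a red-black `__gnu_pbds::tree` set stands in for "a container type in this library".

```cpp
#include <cstddef>
#include <functional>
#include <random>
#include <set>
#include <ext/pb_ds/assoc_container.hpp>
#include <ext/pb_ds/tree_policy.hpp>

// Hypothetical helper: applies `steps` random operations to a pb_ds set and
// to std::set, and returns false at the first observable disagreement.
bool run_differential_test(unsigned seed, int steps)
{
    typedef __gnu_pbds::tree<int, __gnu_pbds::null_type, std::less<int>,
                             __gnu_pbds::rb_tree_tag> pbds_set;
    pbds_set tested;
    std::set<int> reference;

    std::mt19937 rng(seed);
    std::uniform_int_distribution<int> key_dist(0, 63);  // small range, so keys repeat
    std::uniform_int_distribution<int> op_dist(0, 2);

    for (int i = 0; i < steps; ++i) {
        const int key = key_dist(rng);
        switch (op_dist(rng)) {
        case 0: {  // insert: compare the "was inserted" flags
            const bool a = tested.insert(key).second;
            const bool b = reference.insert(key).second;
            if (a != b) return false;
            break;
        }
        case 1: {  // erase by key: compare whether an element was removed
            const bool a = tested.erase(key);
            const bool b = reference.erase(key) != 0;
            if (a != b) return false;
            break;
        }
        default: {  // find: compare presence of the key
            const bool a = tested.find(key) != tested.end();
            const bool b = reference.find(key) != reference.end();
            if (a != b) return false;
            break;
        }
        }

        // Consistency checks: same size, same contents in the same order.
        if (tested.size() != reference.size()) return false;
        auto it = tested.begin();
        for (auto r = reference.begin(); r != reference.end(); ++r, ++it)
            if (it == tested.end() || *it != *r) return false;
    }
    return true;
}
```

Calling run_differential_test with several seeds exercises different operation sequences; a false result points at the first step where the two containers diverge.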

Additionally, the test checks exception safety and resource leaks as an integral part of each run. A special allocator type, written for the purpose of the test, randomly throws an exception when an allocation is performed and tracks all allocations and deallocations. The exceptions thrown at allocations simulate memory-allocation failures; the tracking mechanism checks for memory-related bugs (e.g., resource leaks and multiple deallocations). Both this library's containers and the containers' value types are configured to use this allocator.
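A minimal sketch of such an allocator is shown below. It is not the allocator used by the test: the names alloc_stats, throwing_allocator, and run_with_failure_after are hypothetical, a countdown replaces the random throwing so that the sketch is deterministic, and std::vector stands in for the containers under test.

```cpp
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical bookkeeping shared by all instances of the allocator.
struct alloc_stats {
    static long& outstanding() { static long n = 0; return n; }   // allocations not yet freed
    static long& countdown()   { static long n = -1; return n; }  // negative: never throw
};

// Minimal C++11 allocator that tracks allocations and throws std::bad_alloc
// once `countdown` successful allocations have been handed out.
template <typename T>
struct throwing_allocator {
    typedef T value_type;
    throwing_allocator() {}
    template <typename U> throwing_allocator(const throwing_allocator<U>&) {}

    T* allocate(std::size_t n) {
        if (alloc_stats::countdown() == 0) throw std::bad_alloc();  // simulated failure
        if (alloc_stats::countdown() > 0) --alloc_stats::countdown();
        ++alloc_stats::outstanding();
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t) {
        --alloc_stats::outstanding();
        ::operator delete(p);
    }
};
template <typename T, typename U>
bool operator==(const throwing_allocator<T>&, const throwing_allocator<U>&) { return true; }
template <typename T, typename U>
bool operator!=(const throwing_allocator<T>&, const throwing_allocator<U>&) { return false; }

struct failure_result {
    bool threw;        // did an allocation failure propagate?
    long outstanding;  // allocations still live after cleanup (0 means no leak)
};

// Fills a vector, letting `successes` allocations succeed before failing.
failure_result run_with_failure_after(long successes)
{
    alloc_stats::outstanding() = 0;
    alloc_stats::countdown() = successes;
    failure_result result = { false, 0 };
    try {
        std::vector<int, throwing_allocator<int> > v;
        for (int i = 0; i < 100; ++i) v.push_back(i);
    } catch (const std::bad_alloc&) {
        result.threw = true;
    }
    alloc_stats::countdown() = -1;  // disable failures again
    result.outstanding = alloc_stats::outstanding();
    return result;
}
```

A non-zero outstanding count after the container has been destroyed would indicate a leak on the failure path.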

For granularity, the test is split into several source files, each checking only some containers.

For more details, consult the files in testsuite/ext/pb_ds/regression.

Performance

Hash-Based

Text `find`

Description

This test inserts a number of values with keys from an arbitrary text ([biblio.wickland96thirty]) into a container, then performs a series of finds using find. It measures the average time for find as a function of the number of values inserted.

It uses the test file: performance/ext/pb_ds/text_find_timing_test.cc

And uses the data file: thirty_years_among_the_dead_preproc.txt

The test checks the effect of different range-hashing functions, trigger policies, and cache-hashing policies.
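The measurement described above can be sketched roughly as follows. The names find_timing and time_finds are hypothetical, std::unordered_map stands in for the containers under test, and the real test file may organize its timing differently.

```cpp
#include <chrono>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

struct find_timing {
    std::size_t hits;        // number of successful finds (sanity check)
    double avg_nanoseconds;  // average wall-clock time per find
};

// Hypothetical helper: inserts every key, then looks each key up `rounds`
// times and reports the average time of a single find.
find_timing time_finds(const std::vector<std::string>& keys, int rounds)
{
    std::unordered_map<std::string, int> table;
    for (std::size_t i = 0; i < keys.size(); ++i)
        table.insert(std::make_pair(keys[i], static_cast<int>(i)));

    std::size_t hits = 0;
    std::size_t finds = 0;
    const auto start = std::chrono::steady_clock::now();
    for (int r = 0; r < rounds; ++r) {
        for (const std::string& k : keys) {
            hits += (table.find(k) != table.end()) ? 1 : 0;
            ++finds;
        }
    }
    const auto stop = std::chrono::steady_clock::now();

    const double total_ns =
        std::chrono::duration<double, std::nano>(stop - start).count();
    find_timing result = { hits, finds ? total_ns / finds : 0.0 };
    return result;
}
```

Repeating the call for growing key sets gives the average find time as a function of the number of values inserted.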

Results

The graphic below shows the results of the text find timing test, using find, for the native and collision-chaining hash types.

The abbreviated names in the legend of the graphic above are instantiated with the types in the following table.

Name/Instantiating Type	Parameter	Details	Parameter	Details
n_hash_map_ncah
`std::tr1::unordered_map`	`cache_hash_code`	`false`
cc_hash_mod_prime_1div1_nsth_map
`cc_hash_table`	`Comb_Hash_Fn`	`direct_mod_range_hashing`
	`Resize_Policy`	`hash_standard_resize_policy`	`Size_Policy`	`hash_prime_size_policy`
	`Resize_Policy`	`hash_standard_resize_policy`	`Trigger_Policy`	`hash_load_check_resize_trigger` with α_min = 1/8 and α_max = 1/1
cc_hash_mask_exp_1div2_sth_map
`cc_hash_table`	`Comb_Hash_Fn`	`direct_mask_range_hashing`
	`Resize_Policy`	`hash_standard_resize_policy`	`Size_Policy`	`hash_exponential_size_policy`
	`Resize_Policy`	`hash_standard_resize_policy`	`Trigger_Policy`	`hash_load_check_resize_trigger` with α_min = 1/8 and α_max = 1/2
cc_hash_mask_exp_1div1_nsth_map
`cc_hash_table`	`Comb_Hash_Fn`	`direct_mask_range_hashing`
	`Resize_Policy`	`hash_standard_resize_policy`	`Size_Policy`	`hash_exponential_size_policy`
	`Resize_Policy`	`hash_standard_resize_policy`	`Trigger_Policy`	`hash_load_check_resize_trigger` with α_min = 1/8 and α_max = 1/1
cc_hash_mask_exp_1div2_nsth_map
`cc_hash_table`	`Comb_Hash_Fn`	`direct_mask_range_hashing`
	`Resize_Policy`	`hash_standard_resize_policy`	`Size_Policy`	`hash_exponential_size_policy`
	`Resize_Policy`	`hash_standard_resize_policy`	`Trigger_Policy`	`hash_load_check_resize_trigger` with α_min = 1/8 and α_max = 1/2
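As an illustration of how a row of the table maps to code, the following sketch spells out a type shaped like cc_hash_mask_exp_1div2_nsth_map. The typedef names and the helper mask_exp_map_roundtrip are hypothetical, and passing α_min and α_max as constructor arguments of the trigger policy is an assumption about how to reproduce the configuration, not necessarily how the test does it.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <ext/pb_ds/assoc_container.hpp>
#include <ext/pb_ds/hash_policy.hpp>

// Size_Policy: table sizes grow exponentially (powers of two by default).
typedef __gnu_pbds::hash_exponential_size_policy<> size_policy_t;
// Trigger_Policy: resize when the load factor leaves [alpha_min, alpha_max].
typedef __gnu_pbds::hash_load_check_resize_trigger<> trigger_policy_t;
// Resize_Policy: combines the two policies above.
typedef __gnu_pbds::hash_standard_resize_policy<size_policy_t, trigger_policy_t>
    resize_policy_t;

// Collision-chaining table with mask-based range hashing and no stored hash.
typedef __gnu_pbds::cc_hash_table<
    std::string, int,
    std::hash<std::string>,
    std::equal_to<std::string>,
    __gnu_pbds::direct_mask_range_hashing<>,
    resize_policy_t,
    false> mask_exp_map_t;

// Hypothetical helper: builds the table with alpha_min = 1/8 and
// alpha_max = 1/2, then checks that insert and find behave as expected.
bool mask_exp_map_roundtrip()
{
    mask_exp_map_t table(std::hash<std::string>(),
                         std::equal_to<std::string>(),
                         __gnu_pbds::direct_mask_range_hashing<>(),
                         resize_policy_t(size_policy_t(),
                                         trigger_policy_t(0.125f, 0.5f)));
    table["spirit"] = 1;
    table["medium"] = 2;
    return table.size() == 2
        && table.find("spirit") != table.end()
        && table.find("absent") == table.end();
}
```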

Observations

In this setting, the range-hashing scheme affects performance more than the other policies. As the results show, containers using mod-based range-hashing (including the native hash-based container, which is currently hard-wired to this scheme) perform worse than those using mask-based range-hashing. A modulo-based range-hashing scheme's main benefit is that it takes all hash-value bits into account. Standard string hash functions, however, are designed to produce hash values that are already nearly uniform ([biblio.knuth98sorting]), so this benefit matters little here, while the modulo operation still costs more than a mask.
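The difference between the two range-hashing schemes can be illustrated with a short sketch. The function names are hypothetical and the code is independent of the library's own direct_mod_range_hashing and direct_mask_range_hashing classes.

```cpp
#include <cstddef>

// Mod-based: maps a hash value into [0, size) by taking the remainder.
// Every bit of `hash` influences the result; `size` is typically a prime.
std::size_t mod_range_hash(std::size_t hash, std::size_t size)
{
    return hash % size;
}

// Mask-based: maps a hash value into [0, size) by keeping the low bits.
// `size` must be a power of two. A bitwise AND is cheaper than a division,
// but the high bits of `hash` are ignored.
std::size_t mask_range_hash(std::size_t hash, std::size_t size)
{
    return hash & (size - 1);
}
```

Two hash values that differ only in their high bits, such as 0x4 and 0x10004, land in the same bucket of a 16-bucket mask-based table but in different buckets of a 13-bucket mod-based table. When the hash function already spreads keys well, such cases are rare and the cheaper mask wins.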

Trigger policies, i.e., the load-check constants α_min and α_max, affect performance to a lesser extent.
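The role of the two load-check constants can be sketched as below. The enum and the function check_load are hypothetical, the bucket count is assumed to be non-zero, and the library's hash_load_check_resize_trigger may differ in details such as how boundary values are treated.

```cpp
#include <cstddef>

enum resize_action { keep_size, grow, shrink };

// Hypothetical load check: compares the load factor (entries per bucket)
// against the two constants and says whether the table should be resized.
// Assumes buckets > 0.
resize_action check_load(std::size_t entries, std::size_t buckets,
                         double alpha_min, double alpha_max)
{
    const double load =
        static_cast<double>(entries) / static_cast<double>(buckets);
    if (load > alpha_max) return grow;    // too crowded: longer chains
    if (load < alpha_min) return shrink;  // too sparse: wasted memory
    return keep_size;
}
```

With 9 entries in 16 buckets, α_max = 1/2 asks for a larger table while α_max = 1/1 keeps the current size, which is the only difference between the 1div2 and 1div1 configurations in the table.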

Perhaps surprisingly, storing the hash value alongside each entry affects performance only marginally, at least in this library's implementation. (Unfortunately, it was not possible to run the tests with std::tr1::unordered_map's cache_hash_code = true, as it appeared to malfunction.)
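To make concrete what storing the hash value alongside each entry means, here is a small sketch of a chain lookup that uses the stored value to skip key comparisons. The struct cached_entry and the function find_in_chain are hypothetical and do not reflect the library's internal layout.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical chain entry that keeps the key's hash next to the key.
struct cached_entry {
    std::string key;
    int value;
    std::size_t hash;  // computed once, at insertion
};

// Searches one collision chain for `key`. Entries whose stored hash differs
// from `key_hash` are skipped without touching the key; `key_compares`
// reports how many full key comparisons were actually performed.
const cached_entry* find_in_chain(const std::vector<cached_entry>& chain,
                                  const std::string& key,
                                  std::size_t key_hash,
                                  std::size_t& key_compares)
{
    key_compares = 0;
    for (std::size_t i = 0; i < chain.size(); ++i) {
        if (chain[i].hash != key_hash)
            continue;  // cheap integer test rejects this entry
        ++key_compares;
        if (chain[i].key == key)
            return &chain[i];
    }
    return 0;
}
```

The saving grows with chain length and key-comparison cost; with short chains, as in a table kept at a low load factor, there is little to save, which is consistent with the marginal effect observed.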

Name/Instantiating Type	Parameter	Details
n_map
`std::map`
splay_tree_map
`tree`	`Tag`	`splay_tree_tag`
`tree`	`Node_Update`	`null_node_update`
rb_tree_map
`tree`	`Tag`	`rb_tree_tag`
`tree`	`Node_Update`	`null_node_update`

Name/Instantiating Type	Parameter	Details
n_map
`std::map`
ov_tree_map
`tree`	`Tag`	`ov_tree_tag`
`tree`	`Node_Update`	`null_node_update`

Name/Instantiating Type	Parameter	Details
n_map
`std::map`
pat_trie_map
`tree`	`Tag`	`pat_trie_tag`
`tree`	`Node_Update`	`null_node_update`

Name/Instantiating Type	Parameter	Details
n_set
`std::set`
splay_tree_set
`tree`	`Tag`	`splay_tree_tag`
`tree`	`Node_Update`	`null_node_update`
rb_tree_set
`tree`	`Tag`	`rb_tree_tag`
`tree`	`Node_Update`	`null_node_update`
ov_tree_set
`tree`	`Tag`	`ov_tree_tag`
`tree`	`Node_Update`	`null_node_update`
pat_trie_map
`tree`	`Tag`	`pat_trie_tag`
`tree`	`Node_Update`	`null_node_update`

Name/Instantiating Type	Parameter	Details
n_set
`std::set`
splay_tree_ost_set
`tree`	`Tag`	`splay_tree_tag`
`tree`	`Node_Update`	`tree_order_statistics_node_update`
rb_tree_ost_set
`tree`	`Tag`	`rb_tree_tag`
`tree`	`Node_Update`	`tree_order_statistics_node_update`
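For reference, a type shaped like rb_tree_ost_set can be written and used as in the sketch below. The typedef name ost_set and the helpers kth_smallest and rank_of are hypothetical; find_by_order and order_of_key are the operations that tree_order_statistics_node_update adds to the container.

```cpp
#include <cstddef>
#include <functional>
#include <vector>
#include <ext/pb_ds/assoc_container.hpp>
#include <ext/pb_ds/tree_policy.hpp>

// Red-black tree set whose nodes also maintain subtree sizes.
typedef __gnu_pbds::tree<int, __gnu_pbds::null_type, std::less<int>,
                         __gnu_pbds::rb_tree_tag,
                         __gnu_pbds::tree_order_statistics_node_update> ost_set;

// Returns the k-th smallest distinct value (0-based), or -1 if k is out of range.
int kth_smallest(const std::vector<int>& values, std::size_t k)
{
    ost_set s;
    for (std::size_t i = 0; i < values.size(); ++i)
        s.insert(values[i]);
    ost_set::iterator it = s.find_by_order(k);
    return it == s.end() ? -1 : *it;
}

// Returns how many distinct values are strictly smaller than `key`.
std::size_t rank_of(const std::vector<int>& values, int key)
{
    ost_set s;
    for (std::size_t i = 0; i < values.size(); ++i)
        s.insert(values[i]);
    return s.order_of_key(key);
}
```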

Name/Instantiating Type	Parameter	Details	Parameter	Details	Parameter	Details
n_mmap
`std::multimap`
rb_tree_mmap_lu_mtf_set
`tree`	`Tag`	`rb_tree_tag`
	`Node_Update`	`null_node_update`
	`Mapped`	`list_update`	`Update_Policy`	`lu_move_to_front_policy`
rb_tree_mmap_cc_hash_mask_exp_1div2_nsth_set
`tree`	`Tag`	`rb_tree_tag`
	`Node_Update`	`null_node_update`
	`Mapped`	`cc_hash_table`	`Comb_Hash_Fn`	`direct_mask_range_hashing`
			`Resize_Policy`	`hash_standard_resize_policy`	`Size_Policy`	`hash_exponential_size_policy`
			`Resize_Policy`	`hash_standard_resize_policy`	`Trigger_Policy`	`hash_load_check_resize_trigger` with α_min = 1/8 and α_max = 1/2
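The `Mapped` rows above describe a multimap built as a primary container whose mapped type is itself a container. A rough sketch of that shape follows, assuming default policies for the secondary hash table rather than the exact ones listed; the typedef names and the helper count_values are hypothetical.

```cpp
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>
#include <ext/pb_ds/assoc_container.hpp>
#include <ext/pb_ds/tree_policy.hpp>

// Secondary container: a hash-based set of the values stored under one key.
typedef __gnu_pbds::cc_hash_table<int, __gnu_pbds::null_type> secondary_set;

// Primary container: a red-black tree mapping each key to its set of values.
typedef __gnu_pbds::tree<int, secondary_set, std::less<int>,
                         __gnu_pbds::rb_tree_tag> tree_mmap;

// Hypothetical helper: inserts all (key, value) pairs and returns the number
// of distinct values stored under `key`.
std::size_t count_values(const std::vector<std::pair<int, int> >& pairs, int key)
{
    tree_mmap m;
    for (std::size_t i = 0; i < pairs.size(); ++i)
        m[pairs[i].first].insert(pairs[i].second);  // creates the set on first use

    auto it = m.find(key);
    return it == m.end() ? 0 : it->second.size();
}
```

Unlike std::multimap, this layout stores each key once and keeps only distinct values per key.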

Name/Instantiating Type	Parameter	Details	Parameter	Details	Parameter	Details
n_hash_mmap
`std::tr1::unordered_multimap`
rb_tree_mmap_lu_mtf_set
`cc_hash_table`	`Comb_Hash_Fn`	`direct_mask_range_hashing`
	`Resize_Policy`	`hash_standard_resize_policy`	`Size_Policy`	`hash_exponential_size_policy`
	`Resize_Policy`	`hash_standard_resize_policy`	`Trigger_Policy`	`hash_load_check_resize_trigger` with α_min = 1/8 and α_max = 1/2
	`Mapped`	`list_update`	`Update_Policy`	`lu_move_to_front_policy`
rb_tree_mmap_cc_hash_mask_exp_1div2_nsth_set
`cc_hash_table`	`Comb_Hash_Fn`	`direct_mask_range_hashing`
	`Resize_Policy`	`hash_standard_resize_policy`	`Size_Policy`	`hash_exponential_size_policy`
	`Resize_Policy`	`hash_standard_resize_policy`	`Trigger_Policy`	`hash_load_check_resize_trigger` with α_min = 1/8 and α_max = 1/2
	`Mapped`	`cc_hash_table`	`Comb_Hash_Fn`	`direct_mask_range_hashing`
			`Resize_Policy`	`hash_standard_resize_policy`	`Size_Policy`	`hash_exponential_size_policy`
			`Resize_Policy`	`hash_standard_resize_policy`	`Trigger_Policy`	`hash_load_check_resize_trigger` with α_min = 1/8 and α_max = 1/2

Name/Instantiating Type	Parameter	Details
n_pq_vector
`std::priority_queue`	`Sequence`	`std::vector`
n_pq_deque
`std::priority_queue`	`Sequence`	`std::deque`
binary_heap
`priority_queue`	`Tag`	`binary_heap_tag`
binomial_heap
`priority_queue`	`Tag`	`binomial_heap_tag`
rc_binomial_heap
`priority_queue`	`Tag`	`rc_binomial_heap_tag`
thin_heap
`priority_queue`	`Tag`	`thin_heap_tag`
pairing_heap
`priority_queue`	`Tag`	`pairing_heap_tag`

Name/Instantiating Type	Parameter	Details
n_pq_vector
`std::priority_queue`	`Sequence`	`std::vector`
n_pq_deque
`std::priority_queue`	`Sequence`	`std::deque`
pairing_heap
`priority_queue`	`Tag`	`pairing_heap_tag`

Container	`push`	`pop`	`modify`	`erase`	`join`
`std::priority_queue`	Θ(n) worst; Θ(log(n)) amortized	Θ(log(n)) worst	Θ(n log(n)) worst [std note 1]	Θ(n log(n)) [std note 2]	Θ(n log(n)) [std note 1]
`priority_queue` <`Tag` = `pairing_heap_tag`>	O(1)	Θ(n) worst; Θ(log(n)) amortized	Θ(n) worst; Θ(log(n)) amortized	Θ(n) worst; Θ(log(n)) amortized	O(1)
`priority_queue` <`Tag` = `binary_heap_tag`>	Θ(n) worst; Θ(log(n)) amortized	Θ(n) worst; Θ(log(n)) amortized	Θ(n)	Θ(n)	Θ(n)
`priority_queue` <`Tag` = `binomial_heap_tag`>	Θ(log(n)) worst; O(1) amortized	Θ(log(n))	Θ(log(n))	Θ(log(n))	Θ(log(n))
`priority_queue` <`Tag` = `rc_binomial_heap_tag`>	O(1)	Θ(log(n))	Θ(log(n))	Θ(log(n))	Θ(log(n))
`priority_queue` <`Tag` = `thin_heap_tag`>	O(1)	Θ(n) worst; Θ(log(n)) amortized	Θ(log(n)) worst; O(1) amortized, or Θ(log(n)) amortized [thin_heap_note]	Θ(n) worst; Θ(log(n)) amortized	Θ(n)
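The five operations compared in the table can be exercised as in the following sketch, which uses the pairing-heap variant. The typedef heap_t and the function demo_heap_ops are hypothetical names chosen for illustration.

```cpp
#include <functional>
#include <ext/pb_ds/priority_queue.hpp>

// Max-heap of ints backed by a pairing heap.
typedef __gnu_pbds::priority_queue<int, std::less<int>,
                                   __gnu_pbds::pairing_heap_tag> heap_t;

// Runs push, modify, erase, and join, then returns the largest element of
// the merged heap, or -1 if the heaps are not in the expected state.
int demo_heap_ops()
{
    heap_t a;
    heap_t b;

    a.push(3);
    heap_t::point_iterator seven = a.push(7);  // push returns a handle to the element
    a.push(5);
    a.modify(seven, 1);                        // 7 becomes 1; a holds {1, 3, 5}

    b.push(10);
    heap_t::point_iterator twenty = b.push(20);
    b.erase(twenty);                           // b holds {10}

    a.join(b);                                 // a holds {1, 3, 5, 10}; b is emptied

    if (!b.empty() || a.size() != 4)
        return -1;
    return a.top();
}
```

std::priority_queue offers no handle-based modify or erase and no join, which is why those columns carry notes for it in the table.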