Basic Data Structures Basic Data Structures A key part of most problems is being able to identify a basic structure that can be used for the problem Sometimes, the key to solving a problem is just knowing the right structure to use! These are some basic data structures ones you should already be familiar with. We will, later, talk about other structures that you might be less familiar with. We will skip more general graph structures for now Basic Data TYPES bool char short (short int)

int long (long int) long long (long long int) Unsigned versions of ints float double long double string (provided via library) Data structures (and the C++ implementation) Static arrays int a[10]; Dynamic arrays vector

Linked list list Pair pair Tuple tuple stack Queue queue Double-Ended Queue

deque Priority Queue priority_queue Set, Unordered Set set Unordered_set Map, Unordered Map map unordered_map Stack Array vs. Vector vs. List

Many (most?) problems require storing data in such a format Array is generally faster, quicker to write, if you know size List does not support sort like array/vector does List is able to insert and erase more efficiently than vector and especially array Matters if doing a lot of insertion/deletion Arrays: sort and search Sort: dont bother writing your own sort in most cases Unless the sort comparison is tricky or the sort needs to be optimized somehow Sort (in C++): for a vector: sort(a.begin(), a.end()) for a static array of size n: sort(a, a+n)

Search (in C++) for an UNSORTED array: gives pointer to the array element, not the index for a vector: find(a.begin(), a.end(), x) for a static array: find(a, a+n, x) Arrays: sort and search (continued) Search (in C++) for a SORTED array (implements binary search): gives pointer to the array element, not the index for a vector: lower_bound(a.begin(), a.end(), x) for a static array: lower_bound(a, a+n, x) Note: lower_bound gives iterator for first element not less than x, so can also check it to see if an element exists. Note: can also use upper_bound (gives pointer to first element greater than x), but not for checking existence. Note: binary_search just returns true/false, not an index

Variations on sort Sorting with a different comparison Define a comparison process: operator, function, or functor: Return a bool Take two of same element use const reference Return true if first parameter is LESS THAN (not <=) second Useful for other comparison-based algorithms See slide, below To sort stably (keep equal elements in order) use stable_sort for a vector: stable_sort(a.begin(), a.end(), cmp) for a static array of size n: stable_sort(a, a+n, cmp) Pairs and Tuples

Pair: simple way of pairing 2 elements pair p; p = make_pair(a,b); p.first, p.second used to access elements Tuple: simple way of grouping 3+ elements (C++11) tuple t; t = make_tuple(a,b,,n); get<0>(t), get<1>(t), etc. used to access elements Ugh! Stacks / Queues / Deque LIFO / FIFO / both Stacks:

Simulate recursion/function calls Good for matching: match parentheses, for instance Good for processing in reverse order when you dont have a simple loop push(), top(), pop(), size(), empty() Queues: Good for processing things in order when you dont have a simple loop push(), front(), pop(), size(), empty() Deque: Good if you need both LIFO/FIFO push_back(), push_front(), back(), front(), pop_back(), pop_front(), size(), empty(), insert(), []

When we get to graphs: Stacks needed for DFS, Queues for BFS Priority Queue Implemented using a heap, underneath Top is always the GREATEST element (unless different compare used) When want priority without having to enter/sort Good when new stuff will come in over time, rather than all read at beginning Many greedy algorithms can use this To prioritize the next thing to do Graphs: use for shortest path, MST Basic operations: empty(), push(), top(), pop() Sets / Unordered Sets

Keep track of distinct elements Set is still ordered, unordered_set is unordered Set generally implemented as a binary search tree underneath Unordered Set generally implemented as a hash table underneath Other set representations are possible, sometimes better. C++ algorithms: set_union set_intersection set_difference set_symmetric_difference

Any sorted ranges can use these algorithms, not just sets! A simple set representation For small sets of items (n<=32,64,128), labeled with index 0-(n-1). Represent as an integer/long long, where bit indicates in set or not To set bit: (1 << n) where n is how many spaces to shift over (i.e. the element number) Combine (or union) with bitwise or: | Set of {5, 17, 8} : int x = (1 << 5) | (1 << 17) | (1 << 8) Intersection with bitwise and (&) All elements: (1<

Useful in some dynamic programming Implemented as binary search tree (sorted on key) or hash table Generally Unordered is better unless large keys (hashing can take longer than compare) or need an ordering Can use bracket operator with key side effect: if the key does not exist, it is created; use find if you want to avoid map m; m["this"] = 5; Defining a comparison operator bool operator < (const TYPE& lhs, const TYPE& rhs) { // operator // return true if lhs is less than rhs, false otherwise // defines < operator: this is used as basis for most comparision-based algorithms

} bool cmp (const TYPE& lhs, const TYPE& rhs) { // function // return true if lhs is less than rhs, false otherwise // defines function: this can be passed to sort, priority_queue, etc. for ordering } class mycomp() { // functor public: bool operator() (const TYPE& lhs, const TYPE& rhs) { //return true if lhs is "less than" rhs, false otherwise }

}; Can be especially useful for priority queues, or places where different comparisons are needed Can specify this when defining priority queue, sort, lower_bound, set, map, etc. Use operator for default behavior, function in constructor, functor in template Augmenting Data Structures Can often get more functionality by augmenting a data structure and writing your own code. Example: augment a binary search tree so each element includes the size of the subtree increase size on path when adding, decrease size on path when deleting. Can quickly count how many items are less than a given element Add up opposite subtrees on path to element (plus sometimes the node itself)

Can quickly find the i-th element in the tree Follow path by comparing sizes of child trees