In the previous post, we have seen how can we insert and retrieve keys for trie data structure. In this post, we will discuss how to delete keys from trie.
Background: A trie is a data structure used for efficient retrieval of data associated with keys. If key is of length n, then using trie, worst case time complexity for searching the record associated with this key is O(n). Insertion and deletion of (key, record) pair also takes O(n) time in worst case.
You can see in the below picture as to how a trie data structure looks like for key, value pairs ("abc",1),("xy",2),("xyz",5),("abb",9)("xyzb",8), ("word",5).
You might want to visit previous post for more details about trie data structure,constructing a trie, insertion and search algorithms, its comparison with other data structures.
Algorithm requirements for deleting key 'k':
1. If key 'k' is not present in trie, then we should not modify trie in any way.
2. If key 'k' is not a prefix nor a suffix of any other key and nodes of key 'k' are not part of any other key then all the nodes starting from root node(excluding root node) to leaf node of key 'k' should be deleted. For example, in the above trie if we were asked to delete key - "word", then nodes 'w','o','r','d' should be deleted.
3. If key 'k' is a prefix of some other key, then leaf node corresponding to key 'k' should be marked as 'not a leaf node'. No node should be deleted in this case. For example, in the above trie if we have to delete key - "xyz", then without deleting any node we have to simply mark node 'z' as 'not a leaf node' and change its value to "NON_VALUE"
4. If key 'k' is a suffix of some other key 'k1', then all nodes of key 'k' which are not part of key 'k1' should be deleted.
For example, in the above trie if we were to delete key - "xyzb", then we should only delete node "b" of key "xyzb" since other nodes of this key are also part of key "xyz".
5. If key 'k' is not a prefix nor a suffix of any other key but some nodes of key 'k' are shared with some other key 'k1', then nodes of key 'k' which are not common to any other key should be deleted and shared nodes should be kept intact. For example, in the above trie if we have to delete key "abc" which shares node 'a', node 'b' with key "abb", then the algorithm should delete only node 'c' of key "abc" and should not delete node 'a' and node 'b'.
We will now go through the algorithm for deleting a record associated with the given key using an example.