DSA
Table of contents
- Search
- Sorting
- Data structures
- Mathematical concepts
- Concepts and problems list
- Divide and Conquer
- Arrays problems
- Monotonic queue
- Binary search
- Prefix sum
- Dutch National Flag Algorithm
- Kadane’s algorithm
- Floyd’s tortoise and hare algorithm for cycle detection
- Two pointers
- Greedy
- Segment tree and Binary Search Tree variants
- Backtracking
- Diff array technique
- Dynamic Programming (DP)
- Using original indexes array
Search
- Binary search
- Linear search
Sorting
Insertion sort
- Time complexity O(n2)
Bubble sort
- Time complexity O(n2)
Merge sort
- Time complexity is always \(O(nlog(n))\), for all three cases worst, average and best cases. So, when you don’t know about the input use merge sort.
- When using arrays
- we need additional heap space of O(n) to keep the merged array.
- A disadvantage is that it is not an in-place sort.
- When using linked list
- no need for the additional heap space. We only need stack space to keep recursion information. So memory required is only
O(log(n)). - This is an in-place sort, because we only need to modify next pointers of the nodes.
- no need for the additional heap space. We only need stack space to keep recursion information. So memory required is only
- Code for merge sort
-
Code for merge sort with linked list
In this code, the author used recursion intelligently in the merge function.
Quick sort
- Time complexity
- Best: O(n*log(n))
- Average: O(n*log(n))
- Worst: O(n2), when we choose the partition in such a way that the partition ends up at the beginning or at ending of the array.
- Quick sort is not stable.
- This is in-place and is faster when good partition strategies are used.
Data structures
Binary Heap (min-heap)
- Use heap when you quickly want to find the smallest (or biggest) element in a collection.
- This is always a complete binary tree. So it can be represented easily in an array.
- Children of node at i will be at 2i-2 and 2i+2. Parent of node i will be at (i-1)/2.
- In a min-heap, a parent node will always be less than or equal to their children.
Insertion
- First insert it at then end of the heap.
- See if it still follows the heap property, i.e compare it’s value with the parent and swap if needed.
- If you had to swap, then repeat the previous step till you don’t need to swap or reached the root.
- Time complexity: O(log n) = O(height)
Building
- Initialize and keep on inserting all the nodes.
- If you are inserting a total of n nodes, we might conclude that the whole operation takes
O(n * log n)time. But remember that initially the size of tree is very smaller thanO(log n). So, on careful calculation the building process takesO(n)time.
Deletion
- You can only delete the root of a heap.
- Replace the heap root with the last element.
- Then start the heapify process from the root in a top-down manner.
- Time complexity:
O(log n)=O(height) - If you want to delete an intermediate element (as available in Java’s PriotityQueue), we first need to find the element by checking through each element, which takes O(n) operations and
then delete it in the same manner as we delete the root. So deletion of an arbitrary element takes
O(n)complexity.
Binary Search Tree
- All the elements smaller than the root are on the left sub-tree and all higher are on the right sub-tree. This is valid for all underlying sub-trees too.
- In-order traversal will print all values in a sorted manner. This concept can be used to find the kth smallest element.
Insertion
- Similar to search.
- Go till you encounter a leaf node and attach to left/right depending on whether the new value is smaller/bigger than the leaf value.
- Time complexity:
O(log n)=O(height)for an average case. In the worst case, it can beO(n)as the tree can look like a straight line.
Deletion
- If node does not have any children, you can simply mark it as null.
- If the node has only one child, replace the existing node with its child.
- If it has both the children, find the immediate successor (or predecessor) and replace the current node with that.
Trie
- Used for searching as in a dictionary.
Graph vs Tree
- A tree is an undirected graph which is
- connected: Every node must be able to reach all other nodes
- no cycle: A tree should not have a cycle
- The validation can be done using DFS as mentioned in this Algo Monster Medium article.
Mathematical concepts
- Modular division and inverse
- Moyer-Moore Majority voting algorithm
- Recursive sum of digits Leetcode:
// Modulo of 9 or the base digit - 1 // A number abcd is same as a+b+c+d(= s) in the modulo 9 world. // The number s again is same as sum of digits in `s` // When you repeat this process until there is only 1 digit remaining, that is the answer. public int addDigits(int num) { if (num == 0) return 0; if (num % 9 == 0) return 9; return num % 9; }
Concepts and problems list
Divide and Conquer
- A very good problem: Longest Substring with At Least K Repeating Characters
- The problem can also be using Sliding Window
O(n)approach. - Divide and conquer solution is
O(n * log_n) - D&C solution
In this code, we should use int[] instead of map to store the counts of characters to significantly reduce the time needed for execution.
- The problem can also be using Sliding Window
Arrays problems
| Problem | Solution and details |
| Maximum sum of subarray | Use Kadane’s algorithm. See the section. |
Monotonic queue
| Problem | Solution & Details |
|---|---|
| Shortest Unsorted Continuous Subarray | |
| Sliding Window Maximum problem | Solution |
| Remove duplicate numbers Leetcode | Solution |
| Largest Rectangle in a histogram | Solution. We find the next and previous smaller elements(indices pr and nx) from the current element. The maximum area involving the current element is h[i] * (pr - nx - 1). The boundaries will be the elements just after pr and just before nx. Hence the width pr - nx - 1. Instead of using two stacks for finding both, you can use a single stack. Store the elements in a non-decreasing order. If the current element is shorter than the peek. That means peek encountered next smaller element. What is its previous smaller element? Of course, the next top of the stack (which you obtain by popping the current top). If the height is same, then you should keep popping. Now you have found the two boundaries of the stack top. Right is i and the left is stack.peek() after popping the top (and same height ones). Now calculate the area. If there are elements still left in the stack, that means the right boundary is n. Process rest of the stack accordingly. |
| Maximum score including k-th element | Same as the above solution. Only just ensure the prev min and next min indices range include k. Also check my greedy solution. |
| Remove k digits Leetcode | |
| Most competitive subsequence 🔗 | My solution 🔗 |
| Create maximum number from two integer arrays 🔗 | A complex problem involving the above problem. First take all possible splits i.e, if you take 2 elements from the first array, then you must take k-2 from the second. Use above problem to find the most competitive subsequence of given length from each array. Merge the solutions from two arrays greedily (start by comparing the first elements of each result array). Find the maximum from all the possible solutions. |
Binary search
- You have to find not the exact match, but the least absolute distance to the elements in the array. heaters problem
Binary search over solution space
Prefix sum
Use this when sums are involved and when the order of the elements to compute the sum matters.
- Find the longest subarray where sum of elements will be equals to K. Find all prefix sums. Then store them in a map with sums as keys. Now you take each sum \(s\) and see if there exists another sum whose value is \(K-s\).
Dutch National Flag Algorithm
Sort the array containing only 0s, 1s and 2s. Take three variables called low, mid and high.
- [0, low) and [low, mid) are 0s and 1s respectively. [mid, high) is the unsorted subarray that we will process later, [high, n) are 2s.
- Process the subarray like this
while mid <= high: if nums[mid] == 0: nums[low], nums[mid] = nums[mid], nums[low] low += 1 mid += 1 elif nums[mid] == 1: mid += 1 else: nums[mid], nums[high] = nums[high], nums[mid] high -= 1
Kadane’s algorithm
To find the maximum subarray sum, start computing the \(maxSumSubArrayEndingAtK\) for all elements of the array. The value at the current element is current element +
- Previous \(maxSumSubArrayEndingAtK\) if the previous value is positive.
- 0 if the previous value is negative.
Floyd’s tortoise and hare algorithm for cycle detection
– todo –
Two pointers
Sometimes, you may have to sort the data when using this. Also this can be extended to three pointers and more.
| Problem | Solution & Details |
|---|---|
| Three sum problem | Solution |
| Next Permutation problem | Solution |
| Longest substring without repeating characters 🔗 | My solution 🔗 |
| Minimum window substring Problem 🔗 | Solution link 🔗 |
| Minimum size subarray sum 🔗 | Solution link 🔗 - Can be solved using cumulative sum method too. But it takes O(n^2) time. |
| Longest Repeating Character Replacement | YT video |
| Longest substring with at least k repeating characters | Explanation 1 - Find where you can break and apply divide and conquer. Explanation 2 - The trick is to identify the logic based on what you can move the left and right pointers. So we identify a new problem where we want to use only T unique characters. We increase the T from 1 to 26 (or the maximum number of unique chars in the string). Find the max of all the solutions of various T’s. |
| Maximum score including k-th element | My greedy solution Start from index k. Move to the left or right, depending on which is a bigger element. |
Greedy
| Problem | Solution & Details |
|---|---|
| Jump Game | Recursive (very fast) and Greedy solutions |
| Jump Game 2 | Solution - Read the comments in the code. |
| Wildcard Matching | Greedy solution |
| Maximum score including k-th element | My greedy solution Start from index k. Move to the left or right, depending on which is a bigger element. |
Segment tree and Binary Search Tree variants
| Problem | Solution & Details |
|---|---|
| My Calendar II | My published solution. Used treemap and merged ranges intelligently to reduce the tree size. |
Backtracking
| Problem | Solution & Details |
|---|---|
| Word search II | Solution. Maintain a Trie to make the word search faster. Instead of searching for each word, you backtrack the entire board and note down whenever you find a word of interest. |
| Word Ladder II | Solution. The solution used both BFS (to find all the parents till end word; see BFS section) and DFS/backtracking to find the paths. |
Diff array technique
Dynamic Programming (DP)
See the Dynamic Programming page for concepts and problems.
Using original indexes array
Bit manipulation
Some use cases:
- Use XOR to find duplicates, swap numbers etc.
- Efficient way to save whether a test case is covered or not. For example if there are a max of 32 elements in array and you want to track whether element is used, just set that bit to 1 in an integer.
- You can count the number of 1’s and 0’s at each bit index and find the odd one. For example,
- given a problem to find the number that comes only once while the rest of the numbers repeat thrice, you can by looking at each bit which has not occurred thrice.
- Find the only duplicate number which might have repeated more than once in an array of integers 1 to
n+1. Just find the expected set count at each bit(1 to 32). If the count at this particular bit is more than what is expected for 1 ton+1(all repeating only once), then that bit must be in the duplicate number. After you process all the bits you get the full number. This problem can also be solved by Floyd’s tortoise and hare algorithm.
- Buckets concept: For the same problem you maintain buckets of ones, twos and threes while processing each element in the array. Using simple bit operations you can figure out if the number given is repeated once, twice or thrice (until now) and get the final answer after all the elements are processed. See Striver’s video solution.
Hashing
Sorting
- Group anagrams - solution
- My Calendar
- Sometimes numbers can be converted to strings and sorted. Here a custom comparator is used to figure which type of concatenation is better (a+b or b+a) - Largest Number Leetcode
- A variation of merge sort - Count of smaller numbers after self
Recursion
Binary Tree
Heap - Priority Queue
- Leetcode IPO
- Asked in an interview Maximum jobs in non-consecutive towns
- Median of data stream - Use two heaps, maxHeap for the first half of smaller elements, and min heap for the second half of elements. Keep balancing the heaps, so that they are always of equal size or only one element greater than the other. Median is root of the max heap or the average of the roots of two heaps.
Sliding window
No or constant extra space
See if you can manipulate the existing array/DS itself. For example, you can change the sign of the element etc., to remember whether visited or not.