feat: Implement segment tree #79
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,89 @@ | ||
# Segment Tree | ||
|
||
## Background | ||
Segment Trees are primarily used to solve problems that require answers to queries on intervals of an array | ||
with the possibility of modifying the array elements. | ||
These queries could be finding the sum, minimum, or maximum in a subarray, or similar aggregated results. | ||
|
||
![Segment Tree](../../../../../docs/assets/images/SegmentTree.png) | ||
|
||
### Structure | ||
(Note: See below for a brief description of the array-based implementation of a segment tree) | ||
|
||
A Segment Tree for an array of size *n* is a binary tree that stores information about segments of the array. | ||
Each node in the tree represents an interval of the array, with the root representing the entire array. | ||
The structure satisfies the following properties: | ||
1. Leaf Nodes: Each leaf node represents a single element of the array. | ||
2. Internal Nodes: Each internal node stores the sum of the values of its two children, | ||
so it holds the aggregate for the whole segment that its children jointly cover. | ||
3. Height: The height of the Segment Tree is O(log *n*), making queries and updates efficient. | ||
|
||
## Complexity Analysis | ||
**Time**: O(log(n)) for query and update operations; | ||
construction takes O(n), since each of the roughly 2n nodes is built once. | ||
|
||
**Space**: O(n). For an array-based implementation, allocating an array of size 4n is sufficient (explained later). | ||
|
||
where n is the number of elements in the array. | ||
|
||
## Operations | ||
### Construction | ||
The construction of a Segment Tree starts with the root node representing the entire array and | ||
recursively dividing the array into two halves until each segment is reduced to a single element. | ||
This process is a divide-and-conquer strategy: | ||
1. Base Case: If the current segment of the array is reduced to a single element, create a leaf node. | ||
2. Recursive Case: Otherwise, split the array segment into two halves, construct the left and right children, | ||
and then merge their results to build the parent node. | ||
|
||
This takes O(n). The tree has n leaf nodes and n - 1 internal nodes (2n - 1 nodes in total), and each node is built once in O(1) time. | ||
|
||
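The base case and recursive case above can be sketched as follows. This is a minimal illustration rather than the PR's code: the class `BuildSketch`, the counter `nodesCreated`, and the method names are hypothetical, and the sketch assumes the 0-indexed array layout described in the Array-based Segment Tree section. It also counts the nodes it creates, which shows that the work grows linearly with the array size.

```java
// Hypothetical sketch; names are illustrative and not taken from the PR.
class BuildSketch {
    // Number of nodes created by the most recent buildTree call.
    static int nodesCreated;

    // Builds a sum segment tree for nums (assumed non-empty) in a 0-indexed array.
    static int[] buildTree(int[] nums) {
        nodesCreated = 0;
        int[] tree = new int[4 * nums.length]; // 4n slots is a safe upper bound
        build(nums, tree, 0, 0, nums.length - 1);
        return tree;
    }

    private static void build(int[] nums, int[] tree, int idx, int start, int end) {
        nodesCreated++;
        if (start == end) {
            tree[idx] = nums[start]; // base case: a leaf holds a single element
            return;
        }
        int mid = start + (end - start) / 2;
        build(nums, tree, 2 * idx + 1, start, mid);   // left half
        build(nums, tree, 2 * idx + 2, mid + 1, end); // right half
        tree[idx] = tree[2 * idx + 1] + tree[2 * idx + 2]; // merge the children
    }
}
```

For `{1, 2, 3, 4, 5}` the root `tree[0]` holds 15 and `nodesCreated` is 9, which is 2n - 1 for n = 5.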
### Querying | ||
To query an interval, say to find the sum of elements in the inclusive interval [L, R], | ||
the tree is traversed starting from the root: | ||
1. If the current node's segment is completely within [L, R], its value is part of the answer: | ||
the sum stored at the node is added as a whole, and its children are not visited. | ||
Review comment: Does 'its value is part of the answer' mean that the current node's children nodes may be the answer? |
||
2. If the current node's segment is completely outside [L, R], it is ignored. | ||
3. If the current node's segment partially overlaps with [L, R], the query is recursively applied to its children. | ||
|
||
Although case 3 may recurse into both children, at most a constant number of nodes (about four) are visited on each level of the tree, giving a time complexity of O(logn). | ||
|
||
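The three cases can be sketched as below. This is an illustrative variant, not the PR's implementation: the class `QuerySketch` and its method names are hypothetical, the array layout is the 0-indexed one from the Array-based Segment Tree section, and the query range is passed down unchanged with an explicit "completely outside" check (the PR's code clamps the range at each step instead).

```java
// Hypothetical sketch; names are illustrative and not taken from the PR.
class QuerySketch {
    private final int[] tree;
    private final int n;

    QuerySketch(int[] nums) { // nums is assumed to be non-empty
        n = nums.length;
        tree = new int[4 * n];
        build(nums, 0, 0, n - 1);
    }

    private void build(int[] nums, int idx, int start, int end) {
        if (start == end) {
            tree[idx] = nums[start];
            return;
        }
        int mid = start + (end - start) / 2;
        build(nums, 2 * idx + 1, start, mid);
        build(nums, 2 * idx + 2, mid + 1, end);
        tree[idx] = tree[2 * idx + 1] + tree[2 * idx + 2];
    }

    // Sum of nums[left..right], both ends inclusive.
    int query(int left, int right) {
        return query(0, 0, n - 1, left, right);
    }

    private int query(int idx, int start, int end, int left, int right) {
        if (right < start || end < left) {
            return 0; // case 2: completely outside, contributes nothing
        }
        if (left <= start && end <= right) {
            return tree[idx]; // case 1: completely inside, use the stored sum as a whole
        }
        int mid = start + (end - start) / 2; // case 3: partial overlap, ask both children
        return query(2 * idx + 1, start, mid, left, right)
                + query(2 * idx + 2, mid + 1, end, left, right);
    }
}
```

In case 1 the recursion stops at the node, so its children are never visited; parts of the query that fall outside a node's segment are picked up by other branches of the recursion.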
### Updating | ||
Updating an element involves changing the value of a leaf node and then propagating this change up to the root | ||
to ensure the tree reflects the updated array. | ||
This is done by traversing the path from the leaf node to the root | ||
and updating each node along this path (update parent to the sum of its children). | ||
|
||
This can be done in O(logn), since the path from a leaf to the root contains O(logn) nodes. | ||
|
||
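A point update can be sketched as below. The class `UpdateSketch` and the method `total` are hypothetical names used only for illustration. The sketch assumes a sum tree in the 0-indexed array layout, starts from an all-zero array so that no separate build step is needed, and throws on an out-of-range index, which is a design choice of this sketch rather than the PR's behaviour.

```java
// Hypothetical sketch; names are illustrative and not taken from the PR.
class UpdateSketch {
    private final int[] tree;
    private final int n;

    // Represents an all-zero array of length n (n >= 1 assumed); every node sum starts at 0.
    UpdateSketch(int n) {
        this.n = n;
        this.tree = new int[4 * n];
    }

    // Sets element idx to val and repairs the sums on the path back to the root.
    void update(int idx, int val) {
        if (idx < 0 || idx >= n) {
            throw new IndexOutOfBoundsException("idx out of range: " + idx);
        }
        update(0, 0, n - 1, idx, val);
    }

    private void update(int node, int start, int end, int idx, int val) {
        if (start == end) {
            tree[node] = val; // reached the leaf for idx
            return;
        }
        int mid = start + (end - start) / 2;
        if (idx <= mid) {
            update(2 * node + 1, start, mid, idx, val);
        } else {
            update(2 * node + 2, mid + 1, end, idx, val);
        }
        // Recompute this node on the way back up; only nodes on the path change.
        tree[node] = tree[2 * node + 1] + tree[2 * node + 2];
    }

    // Sum of the whole array, read from the root.
    int total() {
        return tree[0];
    }
}
```

Setting elements 0 to 10 to the values 0 to 10 gives a total of 55; a further `update(5, 55)` changes the total to 105.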
## Array-based Segment Tree | ||
The array-based implementation of a Segment Tree is an efficient way to represent the tree in memory, especially | ||
since a Segment Tree is a balanced binary tree (every level except possibly the last is full). | ||
This method utilizes a simple array where each element of the array corresponds to a node in the tree, | ||
including both leaves and internal nodes. | ||
|
||
### Why 4n space | ||
Review comment: Actually, could there perhaps be an explanation here as to why the total number of nodes will always be 4n? |
||
A perfect binary tree with 2^ceil(log2(*n*)) leaves has 2*2^ceil(log2(*n*)) - 1 nodes, and an array of that size is | ||
always large enough to hold a Segment Tree for an array of size *n*. | ||
We use 2^(ceil(log2(*n*))) because *n* might not be a perfect power of 2, | ||
**so we round the number of leaves up to the next power of 2**. | ||
Every level of the Segment Tree except possibly the last is fully filled; the last level may contain gaps, | ||
so the array is sized for the perfect tree rather than for the exact number of nodes. | ||
|
||
**BUT**, 2^(ceil(log2(*n*))) is cumbersome to compute. To ensure we have sufficient space, we can just consider 2*n | ||
because 2*n >= 2^(ceil(log2(*n*))). | ||
These 2n slots can be thought of as an upper bound on the number of 'leaf' nodes. To account for the | ||
internal nodes, we use the property that a perfect binary tree with k leaves has k - 1 internal nodes | ||
(recall: the sum of 2^i for i from 0 to h-1 is 2^h - 1). So we create an array of size 2n * 2 = 4n to | ||
guarantee we can house the entire segment tree. | ||
|
||
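The bound can be checked numerically with a small sketch. The class `SizeSketch` and its methods are hypothetical helpers written for this illustration; `maxIndexUsed` assumes the recursive midpoint split and the 0-indexed child formulas described in this document.

```java
// Hypothetical sketch; names are illustrative and not taken from the PR.
class SizeSketch {
    // Smallest power of two that is >= n (n >= 1 assumed).
    static int nextPowerOfTwo(int n) {
        int p = 1;
        while (p < n) {
            p *= 2;
        }
        return p;
    }

    // Number of nodes in a perfect binary tree with nextPowerOfTwo(n) leaves.
    static int perfectTreeSize(int n) {
        return 2 * nextPowerOfTwo(n) - 1;
    }

    // Largest 0-indexed array position touched when the segment [start, end] is stored at idx.
    static int maxIndexUsed(int idx, int start, int end) {
        if (start == end) {
            return idx;
        }
        int mid = start + (end - start) / 2;
        return Math.max(maxIndexUsed(2 * idx + 1, start, mid),
                maxIndexUsed(2 * idx + 2, mid + 1, end));
    }
}
```

For n = 6 the tree has only 2n - 1 = 11 nodes, yet `maxIndexUsed(0, 0, 5)` is 12 because the last level has gaps. An array of exactly 2n - 1 slots would therefore be too small for this layout, while `perfectTreeSize(6)` = 15 and 4n = 24 are both sufficient.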
### Obtain index representing child nodes | ||
Suppose the parent node is captured at index *i* of the array (1-indexed). | ||
Review comment: Could there also be an image that could encapsulate the indexing as well? For example, representing the array as a binary tree like in the above image, but with the indices shown as well. |
||
**1-indexed**: <br> | ||
Left Child: *i* x 2 <br> | ||
Right Child: *i* x 2 + 1 <br> | ||
|
||
The 1-indexed calculation is intuitive. So, when dealing with 0-indexed representation (as in our implementation), | ||
one option is to convert 0-indexed to 1-indexed representation, do the above calculations, and revert. <br> | ||
(Note: Now, we assume parent node is captured at index *i* (0-indexed)) | ||
|
||
**0-indexed**: <br> | ||
Left Child: (*i* + 1) x 2 - 1 = *i* x 2 + 1 <br> | ||
Right Child: (*i* + 1) x 2 + 1 - 1 = *i* x 2 + 2 <br> |
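
The index arithmetic can be captured in a few helper functions. The class `IndexSketch` is a hypothetical name, and the `parent` helper is an addition for illustration that does not appear in the PR; the child formulas are the 0-indexed ones given above.

```java
// Hypothetical sketch; names are illustrative and not taken from the PR.
class IndexSketch {
    // 0-indexed position of the left child of node i.
    static int leftChild(int i) {
        return 2 * i + 1;
    }

    // 0-indexed position of the right child of node i.
    static int rightChild(int i) {
        return 2 * i + 2;
    }

    // 0-indexed position of the parent of node i (i >= 1); the root at index 0 has no parent.
    static int parent(int i) {
        return (i - 1) / 2;
    }
}
```

For the root, `leftChild(0)` is 1 and `rightChild(0)` is 2; the node at index 2 has its children at indices 5 and 6, and `parent` maps both of them back to 2.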
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,115 @@ | ||
package dataStructures.segmentTree; | ||
|
||
/** | ||
* Implementation of a Segment Tree. Uses SegmentTreeNode as a helper node class. | ||
*/ | ||
public class SegmentTree { | ||
private SegmentTreeNode root; | ||
private int[] array; | ||
|
||
/** | ||
* Helper node class. Used internally. | ||
*/ | ||
private class SegmentTreeNode { | ||
private SegmentTreeNode leftChild; // left child | ||
private SegmentTreeNode rightChild; // right child | ||
private int start; // start idx of range captured | ||
private int end; // end idx of range captured | ||
private int sum; // sum of all elements between start and end index inclusive | ||
|
||
/** | ||
* Constructor | ||
* @param leftChild | ||
* @param rightChild | ||
* @param start | ||
* @param end | ||
* @param sum | ||
*/ | ||
public SegmentTreeNode(SegmentTreeNode leftChild, SegmentTreeNode rightChild, int start, int end, int sum) { | ||
this.leftChild = leftChild; | ||
this.rightChild = rightChild; | ||
this.start = start; | ||
this.end = end; | ||
this.sum = sum; | ||
} | ||
} | ||
|
||
/** | ||
* Constructor. | ||
* @param nums | ||
*/ | ||
public SegmentTree(int[] nums) { | ||
root = buildTree(nums, 0, nums.length - 1); | ||
array = nums; | ||
} | ||
|
||
private SegmentTreeNode buildTree(int[] nums, int start, int end) { | ||
if (start == end) { | ||
return new SegmentTreeNode(null, null, start, end, nums[start]); | ||
} | ||
int mid = start + (end - start) / 2; | ||
SegmentTreeNode left = buildTree(nums, start, mid); | ||
SegmentTreeNode right = buildTree(nums, mid + 1, end); | ||
return new SegmentTreeNode(left, right, start, end, left.sum + right.sum); | ||
} | ||
|
||
/** | ||
* Queries the sum of all values in the specified range. | ||
* @param leftEnd | ||
* @param rightEnd | ||
* @return the sum. | ||
*/ | ||
public int query(int leftEnd, int rightEnd) { | ||
return query(root, leftEnd, rightEnd); | ||
} | ||
|
||
private int query(SegmentTreeNode node, int leftEnd, int rightEnd) { | ||
// Case: the node's range [start, end] lies entirely inside the query range [leftEnd, rightEnd], | ||
// so simply capture the sum at this node! | ||
Review comment: How do you capture the sums that are not part of the interval of the current node? |
||
if (leftEnd <= node.start && node.end <= rightEnd) { | ||
return node.sum; | ||
} | ||
int rangeSum = 0; | ||
int mid = node.start + (node.end - node.start) / 2; | ||
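// Consider the 3 possible kinds of range queries relative to mid: | ||
// poss 1: the query lies entirely within [start, mid] | ||
// poss 2: the query straddles mid, overlapping both halves | ||
// poss 3: the query lies entirely within [mid + 1, end] | ||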
if (leftEnd <= mid) { | ||
rangeSum += query(node.leftChild, leftEnd, Math.min(rightEnd, mid)); // poss1 or poss2 | ||
} | ||
if (mid + 1 <= rightEnd) { | ||
rangeSum += query(node.rightChild, Math.max(leftEnd, mid + 1), rightEnd); // poss2 or poss3 | ||
} | ||
return rangeSum; | ||
} | ||
|
||
/** | ||
* Updates the segment tree based on updates to the array at the specified index with the specified value. | ||
* @param idx | ||
* @param val | ||
*/ | ||
public void update(int idx, int val) { | ||
if (idx < 0 || idx >= array.length) { // ignore out-of-range indices | ||
return; | ||
} | ||
array[idx] = val; | ||
update(root, idx, val); | ||
} | ||
|
||
private void update(SegmentTreeNode node, int idx, int val) { | ||
if (node.start == node.end && node.start == idx) { | ||
node.sum = val; // node is holding a single value; now updated | ||
return; | ||
} | ||
int mid = node.start + (node.end - node.start) / 2; | ||
if (idx <= mid) { | ||
update(node.leftChild, idx, val); | ||
} else { | ||
update(node.rightChild, idx, val); | ||
} | ||
node.sum = node.leftChild.sum + node.rightChild.sum; // propagate updates up | ||
} | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,105 @@ | ||
package dataStructures.segmentTree.arrayRepresentation; | ||
|
||
/** | ||
* Array-based implementation of a Segment Tree. | ||
*/ | ||
public class SegmentTree { | ||
private int[] tree; | ||
private int[] array; | ||
|
||
/** | ||
* Constructor. | ||
* @param nums | ||
*/ | ||
public SegmentTree(int[] nums) { | ||
tree = new int[4 * nums.length]; // Need to account for up to 4n nodes. | ||
array = nums; | ||
buildTree(nums, 0, nums.length - 1, 0); | ||
} | ||
|
||
/** | ||
* Builds the tree from the given array of numbers. | ||
* Unlike before, where we capture child nodes in the helper node class, here we capture the position of child nodes | ||
* in the array-representation of the tree with an additional variable. | ||
* @param nums | ||
* @param start | ||
* @param end | ||
* @param idx tells us which index of the tree array we are at. | ||
*/ | ||
private void buildTree(int[] nums, int start, int end, int idx) { | ||
// recall, each node is a position in the array | ||
// explicitly track which position in the array to fill with idx variable | ||
if (start == end) { | ||
tree[idx] = nums[start]; | ||
return; | ||
} | ||
int mid = start + (end - start) / 2; | ||
int idxLeftChild = (idx + 1) * 2 - 1; // convert from 0-based to 1-based, do computation, then revert | ||
buildTree(nums, start, mid, idxLeftChild); | ||
int idxRightChild = (idx + 1) * 2 + 1 - 1; // convert from 0-based to 1-based, do computation, then revert | ||
buildTree(nums, mid + 1, end, idxRightChild); | ||
tree[idx] = tree[idxLeftChild] + tree[idxRightChild]; | ||
} | ||
|
||
/** | ||
* Queries the sum of all values in the specified range. | ||
* @param leftEnd | ||
* @param rightEnd | ||
* @return the sum. | ||
*/ | ||
public int query(int leftEnd, int rightEnd) { | ||
return query(0, 0, array.length - 1, leftEnd, rightEnd); | ||
} | ||
|
||
private int query(int nodeIdx, int startRange, int endRange, int leftEnd, int rightEnd) { | ||
// Case: the node's range [startRange, endRange] lies entirely inside the query range [leftEnd, rightEnd], | ||
// so simply capture the sum at this node! | ||
if (leftEnd <= startRange && endRange <= rightEnd) { | ||
return tree[nodeIdx]; | ||
} | ||
int rangeSum = 0; | ||
int mid = startRange + (endRange - startRange) / 2; | ||
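// Consider the 3 possible kinds of range queries relative to mid: | ||
// poss 1: the query lies entirely within [startRange, mid] | ||
// poss 2: the query straddles mid, overlapping both halves | ||
// poss 3: the query lies entirely within [mid + 1, endRange] | ||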
if (leftEnd <= mid) { | ||
int idxLeftChild = (nodeIdx + 1) * 2 - 1; | ||
rangeSum += query(idxLeftChild, startRange, mid, leftEnd, Math.min(rightEnd, mid)); | ||
} | ||
if (mid + 1 <= rightEnd) { | ||
int idxRightChild = (nodeIdx + 1) * 2 + 1 - 1; | ||
rangeSum += query(idxRightChild, mid + 1, endRange, Math.max(leftEnd, mid + 1), rightEnd); | ||
} | ||
return rangeSum; | ||
} | ||
|
||
/** | ||
* Updates the segment tree based on updates to the array at the specified index with the specified value. | ||
* @param idx | ||
* @param val | ||
*/ | ||
public void update(int idx, int val) { | ||
if (idx < 0 || idx >= array.length) { // ignore out-of-range indices | ||
return; | ||
} | ||
array[idx] = val; | ||
update(0, 0, array.length - 1, idx, val); | ||
} | ||
|
||
private void update(int nodeIdx, int startRange, int endRange, int idx, int val) { | ||
if (startRange == endRange) { | ||
tree[nodeIdx] = val; | ||
return; | ||
} | ||
int mid = startRange + (endRange - startRange) / 2; | ||
if (idx <= mid) { | ||
update(nodeIdx * 2 + 1, startRange, mid, idx, val); | ||
} else { | ||
update(nodeIdx * 2 + 2, mid + 1, endRange, idx, val); | ||
} | ||
tree[nodeIdx] = tree[nodeIdx * 2 + 1] + tree[nodeIdx * 2 + 2]; | ||
} | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,39 @@ | ||
package dataStructures.segmentTree; | ||
import static org.junit.Assert.assertEquals; | ||
|
||
import org.junit.Test; | ||
|
||
public class SegmentTreeTest { | ||
@Test | ||
public void construct_shouldConstructSegmentTree() { | ||
int[] arr1 = new int[] {7, 77, 37, 67, 33, 73, 13, 2, 7, 17, 87, 53}; | ||
SegmentTree tree1 = new SegmentTree(arr1); | ||
assertEquals(arr1[1] + arr1[2] + arr1[3], tree1.query(1, 3)); | ||
assertEquals(arr1[4] + arr1[5] + arr1[6] + arr1[7], tree1.query(4, 7)); | ||
int sum1 = 0; | ||
for (int i = 0; i < arr1.length; i++) { | ||
sum1 += arr1[i]; | ||
} | ||
assertEquals(sum1, tree1.query(0, arr1.length - 1)); | ||
|
||
|
||
int[] arr2 = new int[] {7, -77, 37, 67, -33, 0, 73, -13, 2, -7, 17, 0, -87, 53, 0}; // some negatives and 0s | ||
SegmentTree tree2 = new SegmentTree(arr2); | ||
assertEquals(arr2[1] + arr2[2] + arr2[3], tree2.query(1, 3)); | ||
assertEquals(arr2[4] + arr2[5] + arr2[6] + arr2[7], tree2.query(4, 7)); | ||
int sum2 = 0; | ||
for (int i = 0; i < arr2.length; i++) { | ||
sum2 += arr2[i]; | ||
} | ||
assertEquals(sum2, tree2.query(0, arr2.length - 1)); | ||
} | ||
|
||
@Test | ||
public void update_shouldUpdateSegmentTree() { | ||
int[] arr = new int[] {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10}; | ||
SegmentTree tree = new SegmentTree(arr); | ||
assertEquals(55, tree.query(0, 10)); | ||
tree.update(5, 55); | ||
assertEquals(105, tree.query(0, 10)); | ||
} | ||
} |
Review comment: Will there be a short explanation of what the segment tree in the image actually represents?
Review comment: Okay, never mind, I just saw that it was explained below.