A rope (or cord) is a data structure composed of smaller strings that is used for efficiently storing and manipulating a very big piece of text. Text editors may make use of this data structure so that insertions and deletions performed by the user can be done efficiently.
It is a binary tree whose leafs contain strings. Each node is given a weight value equal to the sum of the weights of its left subtree or the length of its string, if it is a leaf node.
Finding a character by its index in a rope is an O(lg n)
operation, as it
consists of a top-down search on a binary tree.
Concatenating two ropes together is also a O(lg n)
operation.
Given two ropes a
and b
, concatenating them together is as simple as
creating a new root node and assigning a
to its left child and b
to its
right child. However, the root node must descend into the left child to compute
its own weight, making it O(lg n)
.
Splitting a rope into two is a O(lg n)
operation as it relies on indexing,
which is sublinear.
Insertion is also O(lg n)
as it can be done with either one split and two
concatenations or a single concatenation (if prepending or appending the other
rope).
Deletion is yet another O(lg n)
operation. Deletion can be done with two
splits and one concatenation.
In order to present the data to an user or to save it to a file, it will be
necessary to convert the whole tree, or at least part of it, to a string. This
is O(m + lg n)
where m
is the number of nodes needed to visit, usually
directly proportional to the size of the string we are trying to obtain. To
convert a rope into a linear string, one needs to find the starting node in
O(lg n)
and perform an in-order traversal until the last needed node is found
in O(m)
.
For big text blocks, ropes have faster insertion and deletion than monolithic arrays and don’t require large contiguous memory spaces to exist. The biggest disadvantage of ropes over arrays is its sheer complexity, which makes the risk of bugs in your code much higher.