2016-08-30
In this post, I describe an alternative to the Narayana Pandita’s algorithm, an algorithm used to generate
permutations of a set of elements which I have already written about.
Comparison with the alternative
This is one of the most efficient ways to generate the permutations of a list.
It is more efficient than the Narayana Pandita’s algorithm if both are properly implemented.
This algorithm does not generate permutations in lexicographic order, which is
required in some scenarios.
This algorithm does not require any comparison between elements of the list it
is sorting, making it more appealing for sequences without defined order or
whose elements are difficult to compare.
Python implementations
Based on implementations found on the Internet, I wrote two PEP 8 compliant
Python 3 versions of Heap’s algorithm. The first one, which is also the most
elegant one in my opinion, is a recursive implementation. As one may assume, the
cost of the several levels of recursion make it substantially more expensive
than it needs to be.
Recursive implementation
def swap(elements, i, j):
elements[i], elements[j] = elements[j], elements[i]
def generate_permutations(elements, n):
if n == 0:
yield elements
else:
for i in range(n - 1):
# Generate permutations with the last element fixed.
yield from generate_permutations(elements, n - 1)
# Swap the last element.
if i % 2 == 0:
swap(elements, i, n - 1)
else:
swap(elements, 0, n - 1)
# Generate the last permutations after the final swap.
yield from generate_permutations(elements, n - 1)
def permutations(elements):
yield from generate_permutations(elements, len(elements))
This code uses the yield from
syntax, added in Python 3.3. You can read
more about it here.
The algorithm is not trivially understood. Essentially, what is happening is a
locking of the rightmost element and the recursive permutation of all other
elements, then an intelligently chosen swap involving the rightmost element and
the repetition of the process until all elements have been in the rightmost
position.
Non-recursive implementation
def swap(elements, i, j):
elements[i], elements[j] = elements[j], elements[i]
def generate_permutations(elements, n):
# As by Robert Sedgewick in Permutation Generation Methods
c = [0] * n
yield elements
i = 0
while i < n:
if c[i] < i:
if i % 2 == 0:
swap(elements, 0, i)
else:
swap(elements, c[i], i)
yield elements
c[i] += 1
i = 0
else:
c[i] = 0
i += 1
def permutations(elements):
return generate_permutations(elements, len(elements))
Benchmarking
I have benchmarked the algorithm for lexicographic permutations and both
implementations above using a sequence of 10 integers and a sequence of 10
strings of 100 characters which only differed by the last character. Note that
although an input of length 10 may seem small for an unaware reader, there are
10! = 3,628,800 results generated even for such small input.
The benchmarking results follow.
Lexicographic with 10 integers took 10.99 seconds.
Recursive Heap's with 10 integers took 14.10 seconds.
Non-recursive Heap's with 10 integers took 4.29 seconds.
Lexicographic with 10 strings took 11.66 seconds.
Recursive Heap's with 10 strings took 14.43 seconds.
Non-recursive Heap's with 10 strings took 4.31 seconds.
As you can see, the cost of recursion makes the more elegant implementation of
Heap’s algorithm even slower than the lexicographic permutation generator.
However, the non-recursive implementation is substantially more efficient. It is
also visible from these numbers that the performance hit from the use of big
strings (which are more expensive to compare than small integers) was bigger on
the lexicographic permutation generator than on the generators that do not
compare elements.
2016-08-24
On this post, I describe an efficient way to balance chemical equations using
linear algebra.
Procedure
From any chemical equation, we can derive a vector equation that describes the
number of atoms of each element present in the reaction. Row reducing this
vector matrix leads to the general solution, from which we can easily derive a
solution with integer coefficients.
Example
Find the smallest integer coefficients that balance the following chemical
equation.
MnS
- As2Cr10O35
- H2SO4
→ HMnO4
- AsH3
- CrS3O12
- H2O
Solution
From the chemical equation above, we can derive the following matrix. Notice
that the first row is for Mn, the second is for S, and so on. However, how you
choose to order does not matter, as long as all columns respect the order.
1 0 0 -1 0 0 0
1 0 1 0 0 -3 0
0 2 0 0 -1 0 0
0 10 0 0 0 -1 0
0 35 4 -4 0 -12 -1
0 0 2 -1 -3 0 -2
The last four columns all have negative atom counts because they have been
subtracted from both sides of the original vector equation.
After row reducing the matrix with
Octave, we get
1.00000 0.00000 0.00000 0.00000 0.00000 0.00000 -0.04893
0.00000 1.00000 0.00000 0.00000 0.00000 0.00000 -0.03976
0.00000 0.00000 1.00000 0.00000 0.00000 0.00000 -1.14373
0.00000 0.00000 0.00000 1.00000 0.00000 0.00000 -0.04893
0.00000 0.00000 0.00000 0.00000 1.00000 0.00000 -0.07951
0.00000 0.00000 0.00000 0.00000 0.00000 1.00000 -0.39755
However, this is not of much help as we want to end up with integer
coefficients. Therefore, we change the number display format to rational by
using format rat.
This makes the last column six rationals with the same denominator.
-16/327
-13/327
-374/327
-16/327
-26/327
-130/327
After multiplying the last column by 327 and changing its sign, we get the
coefficients for the balanced equation.
Note that the free variable is the coefficient of H2O, which is 327
as we are interested on the smallest integer that makes all other coefficients
integers.
Wrapping up, the balanced equation has coefficients 16, 13, 374, 16, 26, 130,
327, in this order. Not something easy to find by trial and error.
2016-08-08
This post describes how to generate the lexicographic permutations of a
sequence. The lexicographic order is a generalization of the way the
alphabetical order of words is based on the alphabetical order of their
component letters. This generalization consists primarily in defining a total
order over the sequences of elements of a finite totally ordered set. You may
understand that this is a way to establish ordering between sequences based on
how their elements compare.
Algorithm description
There exist several ways to generate all permutations of a given sequence. The
simple algorithm which I will discuss here is based on finding the next
permutation in lexicographic ordering, if it exists, or reversing the last
permutation to get back to the minimal permutation. This method goes back to
Narayana Pandita in 14th
century India.
The algorithm is quite straightforward and may be memorized:
Find the biggest i such that a[i] < a[i + 1];
Find the biggest j greater than i such that a[j] > a[i];
Swap a[i] and a[j];
Reverse the elements from a[i + 1] to the last element.
If the first step fails (because such index does not exist) the current
permutation is the last one.
Given any permutation of a list, this generates the next one. It should be
noted, however, that when given the greatest lexicographic permutation, this
algorithm returns this same permutation, so it should be checked to ensure that
if the permutation at hand is the last one, we reverse the sequence to get back
to the first permutation.
This algorithm is simple to implement correctly, computationally efficient, and
it only generates each distinct permutation once, which is convenient when there
are many repeated elements.
Python implementation
Below is an in-place Python 3 implementation of the described algorithm. If you
do not want mutability, wrap the algorithm calls with a function that copies the
list. However, take into consideration that for very large lists this may use a
significant amount of memory if you are interested in several permutations.
def swap(elements, i, j):
elements[i], elements[j] = elements[j], elements[i]
def reverse(elements, i, j):
for offset in range((j - i + 1) // 2):
swap(elements, i + offset, j - offset)
def next_permutation(elements):
last_index = len(elements) - 1
if last_index < 1:
return
i = last_index - 1
while i >= 0 and not elements[i] < elements[i + 1]:
i -= 1
# If there is no greater permutation, return to the first one.
if i < 0:
reverse(elements, 0, last_index)
else:
j = last_index
while j > i + 1 and not elements[j] > elements[i]:
j -= 1
swap(elements, i, j)
reverse(elements, i + 1, last_index)
Documentation was omitted for the sake of brevity. One could also consider this
an example of code as
documentation.
Algorithm output
Generating eight permutations of the string ‘abc’ in ascending lexicographic
order produces the following sequence:
abc, acb, bac, bca, cab, cba, abc, acb, ...
Notice that after the sixth permutation we get back to the first one, as there
are only six distinct permutations of the string ‘abc’.
Algorithm complexity
This algorithm uses constant additional space (as it is in-place) and has linear
time complexity in the worst case but is amortized to constant time.
Using the C++ standard library
If you can use the C++ STL, you have access to std::next_permutation.
Example
std::string text = "abbc";
do {
cout << text << '\n';
} while (std::next_permutation(text.begin(), text.end()));
abbc
abcb
acbb
babc
bacb
bbac
bbca
bcab
bcba
cabb
cbab
cbba
As per the documentation, the return value of the function is true if the new
permutation is lexicographically greater than the old, or false if the last
permutation was reached and the range was reset to the first permutation.
2016-07-18
This is my solution to Project Euler problem number 206. Although a brute force
solution can be easily derived, this approach makes use of mathematical analysis
to find the answer in less time. The problem statement follows.
Find the unique positive integer whose square has the form
1_2_3_4_5_6_7_8_9_0
, where each “_
” is a single digit.
It can be shown that the answer lies in the range R.
R = [floor(sqrt(1020304...00)), ceiling(sqrt(192939..90))] = [1010101010, 1389026624]
R has 378,925,614 integers, which is a lot to test quickly. Fortunately, one
can reduce it by noticing that the square of the integer is divisible by 10,
which implies that the integer too is divisible by 10. This leaves us with the
possible solution set P.
P = {1010101010, 1010101020, ..., 1389026620}
The largest possible number is 19293949596979899990, which requires 65 bits to
be represented as an unsigned integer.
However, because we know that the answer is a multiple of 10, we also know that
the square of the answer is a multiple of 100. And, therefore, that it ends with
zeros. This allows for another optimization: divide all numbers in P by 10 and
find the one whose square is of the form 1_2_3_4_5_6_7_8_9
. These numbers fit
into 64-bit words, so arithmetic with them will not be specially expensive.
P has 37,892,561 elements, but we do not need to check all of them. If
x2 ends in 9, x ends in 3 or 7, which reduces by 80% the number of
values we have to test.
Using Clang with the O3 flag, my C++ implementation of the algorithm finishes after 120 ms, which
is a good enough solution for this problem.
2016-06-18
I decided to implement a perfect AI (using
Minimax) for
Tic-tac-toe. I had not decided though whether I’d use JavaScript or Racket,
two languages I’ve been using quite a bit recently and about which I am quite
interested. As the post title gives off, I’ve decided to write a functional
programming solution using Racket. It is open-source and available on GitHub.
See the repository. There is
also a fairly decent GUI-based game that allows you to play against the AI.
When writing this I decided to use
vectors instead of
lists, expecting it would
give me better performance. However, it is quite slow. Even though the current
implementation is good enough for real-time gameplay, it could be much more
efficient for such a simple decision tree as the one used in Tic-tac-toe. Maybe
using lists would have been a better choice as some parts of the application now
have to convert vectors to lists. Additionally, slicing and merging vectors seem
to generate a lot of work for the garbage collector. Finally, as the biggest
vector holds only nine elements, the constant-time random access vectors provide
is likely not making too much of a difference.
Being an automation enthusiast myself. This project also got its amount of
continuous integration and delivery. Every push and pull request is built and
tested by Travis CI. Here you can find the build history of the
project. Travis also
prepares and deploys a Linux-only standalone distribution to GitHub releases on
tags. See the releases
page. In the future I
may make Windows and Mac OS X standalone distributions available. You should be
able to play the standalone game on Linux even without having Racket installed.
There is also alpha-beta
pruning left to be
implemented, which should also significantly improve performance. Even though
this is still a work in progress, the game works and the AI is indeed perfect.
It also runs in any operating system as long as it has a Racket
6 distribution installed.