
A New Fast Computation of a Permanent

Xuewei Niu (1), Shenghui Su (1, 4), Jianghua Zheng (2), and Shuwang Lü (3)


(1) College of Computers, Nanjing Univ. of Aeronautics & Astronautics, Nanjing 211106, PRC
(2) School of Network Securities, Information Engineering University, Zhengzhou 450001, PRC
(3) Laboratory of Information Security, Univ. of Chinese Academy of Sciences, Beijing 100039, PRC
(4) Public Security Innovation Center, Nanjing Univ. of Science and Technology, Nanjing 210094, PRC

Abstract: This paper proposes a general algorithm, called Store-zechin, for quickly computing the permanent of an
arbitrary square matrix. Its key ideas are storage, reuse, and recursion: in the recursive process, sub-terms that
have already been calculated are not recalculated but are replaced directly with the stored results. The new
algorithm makes full use of computer memory and stored data to speed up the computation of a permanent. The
analysis shows that computing the permanent of an $n \times n$ matrix requires $(2^{n-1} - 1)n$ multiplications and
$2^{n-1}(n - 2) + 1$ additions with Store-zechin, compared with $(2^n - 1)n + 1$ multiplications and
$(2^n - n)(n + 1) - 2$ additions with the Ryser algorithm, and $2^{n-1}n + n + 2$ multiplications and
$2^{n-1}(n + 1) + n^2 - n - 1$ additions with the R-N-W algorithm. Store-zechin therefore needs fewer operations
than the latter two algorithms and has a better application prospect.

Keywords: Matrix, Permanent, Recursive algorithm, Linked list, Time complexity

1. Introduction
In 1812, Cauchy treated the determinant as a special type of alternating symmetric function. To
distinguish ordinary symmetric functions from these, he called them “fonctions symétriques
permanentes” [1]. Cauchy also introduced a subclass of the symmetric functions that T. Muir later
named permanents [2]. Computing the permanent of a matrix is known to be more difficult than
computing its determinant. The difficulty of computing a permanent is closely tied to the difficulty
of the boson sampling problem. In recent years, with the advance of quantum computing technologies,
the permanent has often been used as a yardstick for quantum supremacy, by which people judge
whether quantum computers are worth researching and developing. It has therefore received more and
more attention.

2. Definition and Computation of Permanent of a Square Matrix

2.1 Basic Definition and Properties


The permanent of a square matrix is a number that is defined in a way similar to the determinant. Let
$A$ be an $n \times n$ matrix. The permanent of $A$ is defined as

$$\mathrm{Per}(A) = \sum_{\sigma \in S_n} \prod_{i=1}^{n} a_{i,\sigma(i)}, \tag{1}$$

where $S_n$ is the symmetric group over the set {1, 2, ..., n}, and $\sigma$ is an element of $S_n$, namely a
permutation of the numbers 1, 2, ..., n [3], while the definition of a determinant is

$$\mathrm{Det}(A) = \sum_{\sigma \in S_n} \mathrm{sgn}(\sigma) \prod_{i=1}^{n} a_{i,\sigma(i)}, \tag{2}$$

where $\mathrm{sgn}(\sigma)$ is the parity sign of the permutation $\sigma$ [4]. The only difference between the
determinant and the permanent is this sign, so the two share some properties [5][6], such as
1) Per(I) = 1, where I is the identity matrix of order n (normalization);
2) Per($A^T$) = Per(A), where $A^T$ is the transpose of A (transpose invariance);
3) Per(A) becomes k·Per(A) when any row or column of A is multiplied by a scalar k.

2.2 Computation Methods

At present, the well-known methods for calculating a permanent are the Naive algorithm, the Ryser
algorithm, and the R-N-W algorithm.
The Naive algorithm is based on formula (1). It computes the permanent directly, and its
complexity is $O(n \cdot n!)$.
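As an illustration of formula (1), the following is a minimal sketch in C that sums the products over all n! permutations. The names per_naive, naive_rec and MAX_N are hypothetical, and the use of long long entries is an assumption made for the example; the sketch shares partial products between permutations, so it does not reproduce the operation counts discussed later.

```c
#include <assert.h>

#define MAX_N 10   /* hypothetical upper bound on the matrix order */

/* Extend a partial permutation: row `row` may pick any column not used yet. */
static long long naive_rec(int n, long long a[MAX_N][MAX_N], int row,
                           int used[MAX_N], long long prod)
{
    if (row == n)
        return prod;                 /* one complete permutation sigma */
    long long sum = 0;
    for (int col = 0; col < n; col++) {
        if (used[col])
            continue;
        used[col] = 1;
        sum += naive_rec(n, a, row + 1, used, prod * a[row][col]);
        used[col] = 0;
    }
    return sum;
}

/* Permanent by direct enumeration of all permutations, formula (1). */
long long per_naive(int n, long long a[MAX_N][MAX_N])
{
    assert(n >= 1 && n <= MAX_N);
    int used[MAX_N] = {0};
    return naive_rec(n, a, 0, used, 1);
}
```

For the 2 × 2 matrix with rows (1, 2) and (3, 4), the two permutations contribute 1·4 and 2·3.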
The Ryser algorithm is a more efficient method [7]. It was proposed by H. Ryser in 1963 and
uses the inclusion-exclusion principle to calculate the permanent. It is defined as

$$\mathrm{Per}(A) = \sum_{k=0}^{n-1} (-1)^k T_k, \tag{3}$$

where $T_k$ is the sum of the values of $P(A_k)$ over all possible $A_k$, $A_k$ is a matrix obtained from $A$ by
removing $k$ columns, and $P(A_k)$ is the product of the row sums of $A_k$. From formula (3), it can be
deduced that the complexity of the Ryser algorithm is $O(n^2 2^{n-1})$.
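Formula (3) can be sketched in C as follows. Each nonempty bitmask selects the columns that remain in $A_k$, so k is n minus the number of set bits. The names per_ryser and MAX_N are hypothetical, long long entries are an assumption, and the row sums are recomputed from scratch for every subset, which is the simplest form and not an optimized one.

```c
#include <assert.h>

#define MAX_N 20   /* hypothetical upper bound on the matrix order */

/* Permanent by Ryser's inclusion-exclusion formula (3). */
long long per_ryser(int n, long long a[MAX_N][MAX_N])
{
    assert(n >= 1 && n <= MAX_N);
    long long total = 0;
    /* `keep` is the set of columns left in A_k; the empty set contributes 0. */
    for (unsigned long keep = 1; keep < (1UL << n); keep++) {
        int removed = n;                 /* k = number of removed columns */
        for (int j = 0; j < n; j++)
            if (keep & (1UL << j))
                removed--;
        long long prod = 1;              /* P(A_k): product of the row sums */
        for (int i = 0; i < n; i++) {
            long long row_sum = 0;
            for (int j = 0; j < n; j++)
                if (keep & (1UL << j))
                    row_sum += a[i][j];
            prod *= row_sum;
        }
        total += (removed % 2 == 0) ? prod : -prod;   /* sign (-1)^k */
    }
    return total;
}
```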
The R-N-W algorithm was developed shortly after the Ryser algorithm [8]. Nijenhuis and Wilf
improved the Ryser algorithm and reduced the complexity to $O(n 2^{n-1})$. This
algorithm can be described as

$$\mathrm{Per}(A) = (-1)^{n-1} \cdot 2 \sum_{S} (-1)^{|S|} \prod_{i=1}^{n} \Bigl\{ x_i + \sum_{j \in S} a_{i,j} \Bigr\}, \tag{4a}$$

$$x_i = a_{i,n} - \frac{1}{2} \sum_{j=1}^{n} a_{i,j} \quad (i = 1, \ldots, n), \tag{4b}$$

where $S$ runs over the subsets of {1, 2, …, n-1}. For each subset $S \subseteq$ {1, 2, …, n-1}, we have to
calculate

$$f(S) = \prod_{i=1}^{n} \lambda_i(S), \tag{5}$$

where

$$\lambda_i(S) = x_i + \sum_{j \in S} a_{i,j} \quad (i = 1, \ldots, n). \tag{6}$$

Suppose that the current subset $S$ differs from its predecessor $S'$ by a single element $j$. Then

$$\lambda_i(S) = \lambda_i(S') \pm a_{i,j} \quad (i = 1, \ldots, n). \tag{7}$$
Thus, instead of requiring $n(|S| + 1)$ operations to compute $\lambda_1, \ldots, \lambda_n$ by (6), we can get them in
just $n$ operations by (7). The key to moving from (6) to (7) is to enumerate the subsets in Gray code
order, so that each subset differs from its predecessor by exactly one element.
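The Gray code update can be sketched in C as follows. The sketch evaluates (4a) and (4b) and applies (7) once per subset: the column that flips between consecutive Gray codes is the lowest set bit of the step counter. The names per_nw and MAX_N are hypothetical, and double entries are an assumption made here because of the factor 1/2 in (4b).

```c
#include <assert.h>

#define MAX_N 20   /* hypothetical upper bound on the matrix order */

/* Permanent by formulas (4a)/(4b), walking the subsets S in Gray code order. */
double per_nw(int n, double a[MAX_N][MAX_N])
{
    assert(n >= 1 && n <= MAX_N);
    double lambda[MAX_N];
    double total = 1.0;
    for (int i = 0; i < n; i++) {        /* (4b): lambda_i(empty set) = x_i */
        double row_sum = 0.0;
        for (int j = 0; j < n; j++)
            row_sum += a[i][j];
        lambda[i] = a[i][n - 1] - row_sum / 2.0;
        total *= lambda[i];              /* term for S = empty set, sign + */
    }
    unsigned long gray = 0;              /* current subset S as a bitmask */
    int sign = 1;
    for (unsigned long k = 1; k < (1UL << (n - 1)); k++) {
        int j = 0;
        while (!((k >> j) & 1UL))        /* lowest set bit of k: column that flips */
            j++;
        gray ^= 1UL << j;
        /* +1 if column j has just joined S, -1 if it has just left S */
        double delta = (gray & (1UL << j)) ? 1.0 : -1.0;
        double prod = 1.0;
        for (int i = 0; i < n; i++) {
            lambda[i] += delta * a[i][j];    /* update (7): n operations */
            prod *= lambda[i];
        }
        sign = -sign;                    /* |S| changed by one */
        total += sign * prod;
    }
    return (((n - 1) % 2 == 0) ? 2.0 : -2.0) * total;   /* (-1)^(n-1) * 2 */
}
```

For n = 3 the subsets are visited in the order {}, {1}, {1, 2}, {2}, each differing from its predecessor by one column.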
In addition, for the permanents of some special square matrices, 0-1 square matrices for
example, there are several fast computing methods [9][10].

3. Design of the General Store-zechin Algorithm

3.1 Thought of the Algorithm


Store-zechin is an algorithm designed by us; the idea behind it seems to have been overlooked by some pure
mathematicians. Computer memory and stored data can be reused repeatedly to
speed up the computation of a permanent. The key idea of the Store-zechin algorithm is to calculate the
permanent recursively and to replace any item that has been calculated before with its stored result. For
example, if n = 4 and

$$A = \begin{pmatrix}
a_{1,1} & a_{1,2} & a_{1,3} & a_{1,4} \\
a_{2,1} & a_{2,2} & a_{2,3} & a_{2,4} \\
a_{3,1} & a_{3,2} & a_{3,3} & a_{3,4} \\
a_{4,1} & a_{4,2} & a_{4,3} & a_{4,4}
\end{pmatrix},$$

then, according to the Store-zechin algorithm, we know that

$$\begin{aligned}
\mathrm{Per}(A) &= a_{4,1}\mathrm{Per}(A_{4;1}) + a_{4,2}\mathrm{Per}(A_{4;2}) + a_{4,3}\mathrm{Per}(A_{4;3}) + a_{4,4}\mathrm{Per}(A_{4;4}) \\
&= a_{4,1}\bigl(a_{3,2}\mathrm{Per}(A_{3,4;1,2}) + a_{3,3}\mathrm{Per}(A_{3,4;1,3}) + a_{3,4}\mathrm{Per}(A_{3,4;1,4})\bigr) \\
&\quad + a_{4,2}\bigl(a_{3,1}\mathrm{Per}(A_{3,4;1,2}) + a_{3,3}\mathrm{Per}(A_{3,4;2,3}) + a_{3,4}\mathrm{Per}(A_{3,4;2,4})\bigr) \\
&\quad + a_{4,3}\bigl(a_{3,1}\mathrm{Per}(A_{3,4;1,3}) + a_{3,2}\mathrm{Per}(A_{3,4;2,3}) + a_{3,4}\mathrm{Per}(A_{3,4;3,4})\bigr) \\
&\quad + a_{4,4}\bigl(a_{3,1}\mathrm{Per}(A_{3,4;1,4}) + a_{3,2}\mathrm{Per}(A_{3,4;2,4}) + a_{3,3}\mathrm{Per}(A_{3,4;3,4})\bigr),
\end{aligned} \tag{8}$$
where $A_{i;j}$ denotes the matrix obtained from $A$ by removing the i-th row and the j-th column, and
$A_{3,4;1,2}$ denotes the matrix obtained by removing rows 3, 4 and columns 1, 2. According to (8), each of
Per($A_{3,4;1,2}$), Per($A_{3,4;1,3}$), Per($A_{3,4;1,4}$), Per($A_{3,4;2,3}$), Per($A_{3,4;2,4}$), Per($A_{3,4;3,4}$) appears twice.
So the second calculation of each of these items is replaced by its first result.
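The saving in this example can be counted directly. The count below is a worked illustration added for clarity; the general statement follows from the fact that a sub-permanent is determined by the set of removed columns once the last m rows are removed.

```latex
\[
\underbrace{4 \cdot 3}_{\text{references to } 2 \times 2 \text{ sub-permanents in (8)}} = 12,
\qquad
\underbrace{\binom{4}{2}}_{\text{distinct pairs of removed columns}} = 6 .
\]
\[
\text{With the last } m \text{ rows of an } n \times n \text{ matrix removed:}\quad
n(n-1)\cdots(n-m+1) \ \text{references versus} \ \binom{n}{m} \ \text{distinct sub-permanents.}
\]
```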

3.2 Data Structure of the Algorithm


In order to store the calculation results of the recursive process, we build a global linked list. Before
calculating each recursive item, we check whether it has already been calculated. If yes, the stored
result is returned. Otherwise, the permanent of this item is calculated and stored in the linked list.
We first need to create two structures, HeadNode and BodyNode. BodyNode contains three variables,
Array, value and pbNext. Array is a one-dimensional integer array that stores the columns that
need to be removed. value is an integer that holds the permanent of the square matrix obtained by
removing those columns and the corresponding rows. The number m of removed columns can be read
from Array, and the removed rows are always the last m rows of the original matrix, so we only
need to record the removed columns. pbNext is a pointer
to the next BodyNode node. The structure of BodyNode is shown in Figure 1.

Fig.1. The structure of BodyNode (fields, left to right: int *Array, int value, pbNext)


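And the definition of BodyNode in C is
typedef struct bodynode
{
    int *Array;               /* columns that have been removed */
    int value;                /* permanent of the remaining sub-matrix */
    struct bodynode *pbNext;  /* next BodyNode in the same chain */
} BodyNode, *pBodyNode;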

HeadNode also contains three variables, size, phNext and pbody. size is an integer that gives the
number of BodyNode nodes linked after the node. phNext is a pointer to the next
HeadNode node. pbody is a pointer to the first of these BodyNode nodes. The structure of
HeadNode is shown in Figure 2.

Fig.2. The structure of HeadNode (fields, left to right: int size, phNext, pbody)


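And the definition of HeadNode in C is
typedef struct headnode
{
    int size;                 /* number of BodyNodes linked to this node */
    pBodyNode pbody;          /* first BodyNode of this chain */
    struct headnode *phNext;  /* next HeadNode */
} HeadNode, *pHeadNode;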

The whole linked list is built from the above two structures, as shown in Figure 3. For
convenience, we specify that a BodyNode that records one removed column is linked only to the first
HeadNode, a BodyNode that records two removed columns is linked only to the second HeadNode, and so
on.
Fig.3. The structure of linked list (a chain of HeadNodes, each holding size and pointing to its own
chain of BodyNodes, each holding int *Array and value)
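The lookup and insertion on this list can be sketched in C as follows. The two structures are repeated so that the block is self-contained; the helper names list_create, head_at, list_find and list_store are hypothetical. The sketch also assumes that the removed columns are passed in ascending order, so that the same set of columns always produces the same Array; the text above does not fix this detail.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct bodynode {
    int *Array;                 /* removed columns, ascending (assumption) */
    int value;                  /* stored permanent of the sub-matrix */
    struct bodynode *pbNext;
} BodyNode, *pBodyNode;

typedef struct headnode {
    int size;                   /* number of BodyNodes in this chain */
    pBodyNode pbody;
    struct headnode *phNext;
} HeadNode, *pHeadNode;

/* Build `levels` empty HeadNodes; the m-th one collects entries with m removed columns. */
pHeadNode list_create(int levels)
{
    pHeadNode first = NULL;
    for (int m = 0; m < levels; m++) {
        pHeadNode h = malloc(sizeof *h);
        if (h == NULL)
            abort();
        h->size = 0;
        h->pbody = NULL;
        h->phNext = first;
        first = h;
    }
    return first;
}

/* Return the m-th HeadNode (m is 1-based), or NULL if the list is too short. */
static pHeadNode head_at(pHeadNode head, int m)
{
    while (head != NULL && --m > 0)
        head = head->phNext;
    return head;
}

/* Return 1 and write *out if an entry for these m columns exists, else 0. */
int list_find(pHeadNode head, const int *cols, int m, int *out)
{
    assert(m >= 1);
    pHeadNode h = head_at(head, m);
    if (h == NULL)
        return 0;
    for (pBodyNode b = h->pbody; b != NULL; b = b->pbNext)
        if (memcmp(b->Array, cols, (size_t)m * sizeof *cols) == 0) {
            *out = b->value;
            return 1;
        }
    return 0;
}

/* Prepend a new entry to the chain of the m-th HeadNode. */
void list_store(pHeadNode head, const int *cols, int m, int value)
{
    assert(m >= 1);
    pHeadNode h = head_at(head, m);
    assert(h != NULL);
    pBodyNode b = malloc(sizeof *b);
    if (b == NULL)
        abort();
    b->Array = malloc((size_t)m * sizeof *cols);
    if (b->Array == NULL)
        abort();
    memcpy(b->Array, cols, (size_t)m * sizeof *cols);
    b->value = value;
    b->pbNext = h->pbody;
    h->pbody = b;
    h->size++;
}
```

Grouping the entries by the number of removed columns keeps each search limited to the chain of one HeadNode.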


In general, namely when A is a square matrix of order n, we obtain the
following formula.

$$\begin{aligned}
\mathrm{Per}(A) &= a_{n,1}\mathrm{Per}(A_{n;1}) + a_{n,2}\mathrm{Per}(A_{n;2}) + \cdots + a_{n,n}\mathrm{Per}(A_{n;n}) \\
&= a_{n,1}\bigl(a_{n-1,2}\mathrm{Per}(A_{n,n-1;1,2}) + a_{n-1,3}\mathrm{Per}(A_{n,n-1;1,3}) + \cdots + a_{n-1,n}\mathrm{Per}(A_{n,n-1;1,n})\bigr) \\
&\quad + a_{n,2}\bigl(a_{n-1,1}\mathrm{Per}(A_{n,n-1;1,2}) + a_{n-1,3}\mathrm{Per}(A_{n,n-1;2,3}) + \cdots + a_{n-1,n}\mathrm{Per}(A_{n,n-1;2,n})\bigr) \\
&\quad + \cdots \\
&\quad + a_{n,n}\bigl(a_{n-1,1}\mathrm{Per}(A_{n,n-1;1,n}) + a_{n-1,2}\mathrm{Per}(A_{n,n-1;2,n}) + \cdots + a_{n-1,n-1}\mathrm{Per}(A_{n,n-1;n-1,n})\bigr).
\end{aligned} \tag{9}$$

The termination condition of the recursion is

$$\mathrm{Per}(A) = a_{1,1}a_{2,2} + a_{2,1}a_{1,2}, \quad n = 2. \tag{10}$$
Formulas (9) and (10), together with the rule that only sub-items that have not yet been calculated
are calculated, constitute the Store-zechin algorithm for calculating a permanent.

3.3 Description of the Algorithm


Based on the key idea and the data structure, we can describe the general Store-zechin algorithm
in detail.
Calling statement: Store-zechin(pHead, A, n, del_index, exist_index, del_order);
pHead: the pointer which points to the linked list;
A: the matrix that needs to be calculated;
n: the order of A;
del_index: the array of the columns that need to be removed;
exist_index: the array of the columns that still exist after the removal operation;
del_order: the number of columns that need to be removed.
Algorithm steps:
S1: Search the linked list pointed to by pHead for a BodyNode whose Array is the same as del_index.
    S1.1: If it exists, return the value of the node.
    S1.2: If it doesn't exist, go to S2.
S2: Let sum ← 0.
    S2.1: If n = 2, let sum ← a1,1·a2,2 + a1,2·a2,1 (ai,j is the number at the i-th row and j-th column of A).
          Create a new BodyNode node, assign del_index and sum to its Array and value respectively,
          link the BodyNode to the linked list, and go to S9.
    S2.2: If n > 2, let i ← 1 and go to S3.
S3: Let exist_i ← exist_index(i),
    append exist_i to the end of del_index,
    and let del_order ← del_order + 1.
S4: Let temp_exist_index ← exist_index, and delete the i-th number of temp_exist_index.
S5: Let coe ← an,i, let temp_A be the matrix obtained from A by removing the last row and the i-th column,
    and let sum ← sum + coe * Store-zechin(pHead, temp_A, n-1, del_index, temp_exist_index, del_order).
S6: Delete the last number of del_index,
    and let del_order ← del_order – 1.
S7: Let i ← i + 1.
    S7.1: If i > n, go to S8.
    S7.2: If i <= n, go to S3.
S8: If del_order ≠ 0, create a new BodyNode node, assign del_index and sum to its Array and
    value respectively, and link it to the global linked list.
S9: Return sum.
In fact, some global variables must be initialized before the algorithm starts. The initialization steps
are as follows.
S1: Create an empty linked list, and let pHead point to it.
S2: Let del_index ← array1, where array1 is an empty array;
    let exist_index ← array2, where array2 is an array whose numbers are 1, 2, 3, …, n;
    and let del_order ← 0.
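The steps above can be sketched compactly in C. This is a simplified variant under stated assumptions, not the implementation described in the paper: the set of removed columns is encoded as a bitmask and the stored results live in a flat table indexed by that bitmask, which replaces the linked list of HeadNode and BodyNode. The names SzState, sz_rec, per_store_zechin and MAX_N are hypothetical, and long long entries are an assumption.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_N 20   /* hypothetical upper bound on the matrix order */

typedef struct {
    int n;
    long long (*a)[MAX_N];   /* the input matrix */
    long long *memo;         /* memo[removed] = permanent with these columns removed */
    char *known;             /* known[removed] != 0 once memo[removed] is valid */
} SzState;

/* `removed` is the bitmask of removed columns; rows order..n-1 are already removed. */
static long long sz_rec(SzState *s, unsigned long removed, int order)
{
    if (s->known[removed])
        return s->memo[removed];                 /* S1: reuse the stored result */
    long long sum = 0;
    if (order <= 2) {
        int c[2] = {0, 0}, k = 0;                /* the remaining column indexes */
        for (int j = 0; j < s->n; j++)
            if (!(removed & (1UL << j)))
                c[k++] = j;
        sum = (order == 1)
            ? s->a[0][c[0]]
            : s->a[0][c[0]] * s->a[1][c[1]] + s->a[0][c[1]] * s->a[1][c[0]];  /* (10) */
    } else {
        for (int j = 0; j < s->n; j++) {         /* expand along the last remaining row, (9) */
            if (removed & (1UL << j))
                continue;
            sum += s->a[order - 1][j] * sz_rec(s, removed | (1UL << j), order - 1);
        }
    }
    s->known[removed] = 1;                       /* store the result for later reuse */
    s->memo[removed] = sum;
    return sum;
}

long long per_store_zechin(int n, long long a[MAX_N][MAX_N])
{
    assert(n >= 1 && n <= MAX_N);
    size_t slots = (size_t)1 << n;
    SzState s = { n, a, calloc(slots, sizeof(long long)), calloc(slots, 1) };
    if (s.memo == NULL || s.known == NULL)
        abort();
    long long result = sz_rec(&s, 0UL, n);
    free(s.memo);
    free(s.known);
    return result;
}
```

The bitmask key makes the lookup independent of the order in which the columns were removed; with the linked list, the same effect requires comparing del_index as a set.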

4. Analysis of Time Complexity of the New Algorithm


Since the Store-zechin algorithm is recursive, the numbers of multiplication and addition
operations of each sub-item can be derived from those of the lower-order sub-items.

4.1 Multiplication Operations


According to the derivation process of the Store-zechin algorithm, it can be found that the number of
multiplication operations required in each sub-item of the algorithm satisfies the following condition.

$$\begin{array}{l|rrrrrr}
n = 1 & 0 \\
n = 2 & 0 & 0 \\
n = 3 & 2 & 2 & 2 \\
n = 4 & 9 & 7 & 5 & 3 \\
n = 5 & 28 & 19 & 12 & 7 & 4 \\
n = 6 & 75 & 47 & 28 & 16 & 9 & 5 \\
\vdots & \vdots
\end{array} \tag{11}$$
Namely, when n = i, the first sub-item from right to left requires i – 1 multiplication operations
(0 for i = 1, 2), and the j-th (j > 1) sub-item from right to left satisfies the following relationship:
(the number of multiplication operations of the (j - 1)-th sub-item from right to left when n = i) +
(the number of multiplication operations of the (j - 1)-th sub-item from right to left when n = i - 1) =
(the number of multiplication operations of the j-th sub-item from right to left when n = i).
In fact, the numbers of multiplication steps we need can all be derived from the sequence
0, 0, 2, 3, 4, 5, ..., n - 1, as shown in the following array.
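$$A_1 = \begin{array}{rrrrrr|c}
6 & 5 & 4 & 3 & 2 & 1 & i/j \\ \hline
 & & & & & 0 & 1 \\
 & & & & 0 & 0 & 2 \\
 & & & 2 & 2 & 2 & 3 \\
 & & 9 & 7 & 5 & 3 & 4 \\
 & 28 & 19 & 12 & 7 & 4 & 5 \\
75 & 47 & 28 & 16 & 9 & 5 & 6 \\
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots
\end{array}.$$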
In $A_1$, the numbers of multiplication operations of all sub-items can be obtained by working
from the rightmost column to the left and following the rule $a_{i,j} = a_{i,j-1} + a_{i-1,j-1}$. However, the
second item of the sequence 0, 0, 2, 3, 4, 5, ..., n - 1 is 0, which breaks the general pattern and is
inconvenient to handle. So we consider the sequence 0, 1, 2, 3, 4, 5, ..., n - 1 instead and follow the
same process as for $A_1$, which gives $A_2$.
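$$A_2 = \begin{array}{rrrrrr|c}
6 & 5 & 4 & 3 & 2 & 1 & i/j \\ \hline
 & & & & & 0 & 1 \\
 & & & & 1 & 1 & 2 \\
 & & & 4 & 3 & 2 & 3 \\
 & & 12 & 8 & 5 & 3 & 4 \\
 & 32 & 20 & 12 & 7 & 4 & 5 \\
80 & 48 & 28 & 16 & 9 & 5 & 6 \\
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots
\end{array}.$$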
By comparing $A_1$ and $A_2$, we find that when i > 1, $a_{i,i-1}$ in $A_2$ is 1 larger than $a_{i,i-1}$ in $A_1$,
$a_{i,i}$ in $A_2$ is i - 1 larger than $a_{i,i}$ in $A_1$, and the other values in the two matrices are equal.
Hence the sum of the n-th row of $A_1$, denoted $\mathrm{sum}_n(A_1)$, can be obtained by first calculating the
sum of the n-th row of $A_2$, denoted $\mathrm{sum}_n(A_2)$. For n > 1 they satisfy the following relationship

$$\mathrm{sum}_n(A_2) = \mathrm{sum}_n(A_1) + (n - 1 + 1). \tag{12}$$
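The construction of $A_1$ and $A_2$ and relation (12) can be checked with a short C sketch. It fills each row from its rightmost entry using the rule $a_{i,j} = a_{i,j-1} + a_{i-1,j-1}$, with j counted from the right. The names build_triangle, row_sum and MAX_ROWS are hypothetical, and index 0 of every array is left unused so that the indexes match the text.

```c
#include <assert.h>

#define MAX_ROWS 32   /* hypothetical upper bound on the number of rows */

/* Fill rows 1..rows of a triangle: t[i][1] = first[i] is the rightmost entry,
   then t[i][j] = t[i][j-1] + t[i-1][j-1] for j = 2..i (j counted from the right). */
void build_triangle(int rows, const long long first[],
                    long long t[MAX_ROWS + 1][MAX_ROWS + 1])
{
    assert(rows >= 1 && rows <= MAX_ROWS);
    for (int i = 1; i <= rows; i++) {
        t[i][1] = first[i];
        for (int j = 2; j <= i; j++)
            t[i][j] = t[i][j - 1] + t[i - 1][j - 1];
    }
}

/* Sum of the i-th row of a triangle built by build_triangle. */
long long row_sum(int i, long long t[MAX_ROWS + 1][MAX_ROWS + 1])
{
    long long s = 0;
    for (int j = 1; j <= i; j++)
        s += t[i][j];
    return s;
}
```

Calling build_triangle with the starting values 0, 0, 2, 3, 4, 5 reproduces the rows of $A_1$, and with 0, 1, 2, 3, 4, 5 the rows of $A_2$.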
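For $A_2$, the entries can also be expressed in terms of n:

$$A_2 = \begin{array}{rrrrrrr|c}
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\
\cdots & \cdots & \cdots & \cdots & \cdots & \cdots & n-6 & n-5 \\
\cdots & \cdots & \cdots & \cdots & \cdots & 2n-11 & n-5 & n-4 \\
\cdots & \cdots & \cdots & \cdots & 4n-20 & 2n-9 & n-4 & n-3 \\
\cdots & \cdots & \cdots & 8n-36 & 4n-16 & 2n-7 & n-3 & n-2 \\
\cdots & \cdots & 16n-64 & 8n-28 & 4n-12 & 2n-5 & n-2 & n-1 \\
\cdots & 32n-112 & 16n-48 & 8n-20 & 4n-8 & 2n-3 & n-1 & n \\ \hline
\cdots & 6 & 5 & 4 & 3 & 2 & 1 & i/j
\end{array}.$$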
Then the i-th item (from the right) of the n-th row of $A_2$ can be expressed as
$2^{i-1} n - (2^{i-1} + (i-1) 2^{i-2})$, and $\mathrm{sum}_n(A_2)$ can be expressed as

$$\sum_{i=1}^{n} \Bigl[ 2^{i-1} n - \bigl(2^{i-1} + (i-1) 2^{i-2}\bigr) \Bigr]. \tag{13}$$

Now according to relation (12), we can derive $\mathrm{sum}_n(A_1)$ as

$$-((n-1)+1) + \sum_{i=1}^{n} \Bigl[ 2^{i-1} n - \bigl(2^{i-1} + (i-1) 2^{i-2}\bigr) \Bigr], \tag{14}$$

namely

$$-n + \sum_{i=1}^{n} \Bigl[ 2^{i-1} n - \bigl(2^{i-1} + (i-1) 2^{i-2}\bigr) \Bigr]. \tag{15}$$
Formula (15) gives the number of multiplication operations required inside the recursive items, but
it is not yet the total for the Store-zechin algorithm. Looking back at formula (9), we can see that at
the top level of the recursion each sub-item is also multiplied by its preceding coefficient, which
contributes n further multiplication operations.
In summary, the number of multiplication operations needed to calculate the permanent of a
square matrix by Store-zechin in the general case is

$$\sum_{i=1}^{n} \Bigl[ 2^{i-1} n - \bigl(2^{i-1} + (i-1) 2^{i-2}\bigr) \Bigr]. \tag{16}$$

Evaluating the sum in (16) gives

$$n(2^{n-1} - 1). \tag{17}$$
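The step from (16) to (17) can be written out as follows. The derivation is added here for completeness and uses the standard identities $\sum_{i=1}^{n} 2^{i-1} = 2^n - 1$ and $\sum_{i=1}^{n} (i-1) 2^{i-2} = (n-2) 2^{n-1} + 1$.

```latex
\begin{align*}
\sum_{i=1}^{n}\Bigl[2^{i-1}n-\bigl(2^{i-1}+(i-1)2^{i-2}\bigr)\Bigr]
 &= (n-1)\sum_{i=1}^{n}2^{i-1}-\sum_{i=1}^{n}(i-1)2^{i-2}\\
 &= (n-1)(2^{n}-1)-\bigl((n-2)2^{n-1}+1\bigr)\\
 &= 2^{n-1}\bigl(2(n-1)-(n-2)\bigr)-n\\
 &= n\,2^{n-1}-n \;=\; n\,(2^{n-1}-1).
\end{align*}
```

For n = 3 this gives 3·(4 − 1) = 9, which matches the Store-zechin entry for n = 3 in Table 2.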

4.2 Addition Operations


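Similar to the multiplication operations, the number of addition operations of each sub-item in the
Store-zechin algorithm also satisfies a certain rule

$$\begin{array}{l|rrrrrr}
n = 1 & 0 \\
n = 2 & 0 & 0 \\
n = 3 & 1 & 1 & 1 \\
n = 4 & 5 & 4 & 3 & 2 \\
n = 5 & 17 & 12 & 8 & 5 & 3 \\
n = 6 & 49 & 32 & 20 & 12 & 7 & 4 \\
\vdots & \vdots
\end{array} \tag{18}$$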
The numbers of addition operations required for all sub-items can likewise be listed from the sequence
0, 0, 1, 2, 3, 4, ..., n - 2:
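$$A_3 = \begin{array}{rrrrrr|c}
6 & 5 & 4 & 3 & 2 & 1 & i/j \\ \hline
 & & & & & 0 & 1 \\
 & & & & 0 & 0 & 2 \\
 & & & 1 & 1 & 1 & 3 \\
 & & 5 & 4 & 3 & 2 & 4 \\
 & 17 & 12 & 8 & 5 & 3 & 5 \\
49 & 32 & 20 & 12 & 7 & 4 & 6 \\
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots
\end{array}.$$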
The process of getting $A_3$ is similar to that of getting $A_1$. The first item of the sequence
0, 0, 1, 2, 3, 4, ..., n - 2 does not satisfy the general expression in n, which hinders generalizing the
derivation. So we consider the sequence -1, 0, 1, 2, 3, 4, ..., n - 2 and obtain $A_4$ by the same
calculation as for $A_3$.

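$$A_4 = \begin{array}{rrrrrr|c}
6 & 5 & 4 & 3 & 2 & 1 & i/j \\ \hline
 & & & & & -1 & 1 \\
 & & & & -1 & 0 & 2 \\
 & & & 0 & 1 & 1 & 3 \\
 & & 4 & 4 & 3 & 2 & 4 \\
 & 16 & 12 & 8 & 5 & 3 & 5 \\
48 & 32 & 20 & 12 & 7 & 4 & 6 \\
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots
\end{array}.$$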
By comparing $A_3$ and $A_4$, we can conclude that $a_{i,i}$ in $A_3$ is 1 larger than $a_{i,i}$ in $A_4$ (i = 1,
2, ..., n), and the other values in the two matrices are equal. Hence the sum of the n-th row of $A_3$,
denoted $\mathrm{sum}_n(A_3)$, can be obtained by first calculating the sum of the n-th row of $A_4$, denoted
$\mathrm{sum}_n(A_4)$. They satisfy the following relationship

$$\mathrm{sum}_n(A_3) = \mathrm{sum}_n(A_4) + 1. \tag{19}$$
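For $A_4$, the entries can also be expressed in terms of n:

$$A_4 = \begin{array}{rrrrrrr|c}
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\
\cdots & \cdots & \cdots & \cdots & \cdots & \cdots & n-7 & n-5 \\
\cdots & \cdots & \cdots & \cdots & \cdots & 2n-13 & n-6 & n-4 \\
\cdots & \cdots & \cdots & \cdots & 4n-24 & 2n-11 & n-5 & n-3 \\
\cdots & \cdots & \cdots & 8n-44 & 4n-20 & 2n-9 & n-4 & n-2 \\
\cdots & \cdots & 16n-80 & 8n-36 & 4n-16 & 2n-7 & n-3 & n-1 \\
\cdots & 32n-144 & 16n-64 & 8n-28 & 4n-12 & 2n-5 & n-2 & n \\ \hline
\cdots & 6 & 5 & 4 & 3 & 2 & 1 & i/j
\end{array}.$$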
Then the i-th item (from the right) of the n-th row of $A_4$ can be expressed as
$2^{i-1} n - (2^i + (i-1) 2^{i-2})$, and $\mathrm{sum}_n(A_4)$ can be expressed as

$$\sum_{i=1}^{n} \Bigl[ 2^{i-1} n - \bigl((i-1) 2^{i-2} + 2^i\bigr) \Bigr]. \tag{20}$$

Now according to relation (19), we can derive $\mathrm{sum}_n(A_3)$ as

$$\Bigl( \sum_{i=1}^{n} \Bigl[ 2^{i-1} n - \bigl((i-1) 2^{i-2} + 2^i\bigr) \Bigr] \Bigr) + 1. \tag{21}$$
However, formula (21) only gives the sum of the addition operations inside the sub-items. The total
must also include the additions between the sub-items at the top level of the recursion; see
formula (9) for details. There are n sub-items, so n – 1 further addition operations are needed. The
number of addition operations needed to calculate the permanent of a square matrix by Store-zechin in
the general case is therefore

$$\Bigl( \sum_{i=1}^{n} \Bigl[ 2^{i-1} n - \bigl((i-1) 2^{i-2} + 2^i\bigr) \Bigr] \Bigr) + 1 + (n - 1). \tag{22}$$
Evaluating the sum in (22) gives

$$2^{n-1}(n - 2) + 1. \tag{23}$$
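The step from (22) to (23) can be written out in the same way as for the multiplications. The derivation is added here for completeness and uses the standard identities $\sum_{i=1}^{n} 2^{i-1} = 2^n - 1$, $\sum_{i=1}^{n} 2^{i} = 2^{n+1} - 2$ and $\sum_{i=1}^{n} (i-1) 2^{i-2} = (n-2) 2^{n-1} + 1$.

```latex
\begin{align*}
\Bigl(\sum_{i=1}^{n}\Bigl[2^{i-1}n-\bigl((i-1)2^{i-2}+2^{i}\bigr)\Bigr]\Bigr)+1+(n-1)
 &= n(2^{n}-1)-\bigl((n-2)2^{n-1}+1\bigr)-(2^{n+1}-2)+n\\
 &= n\,2^{n}-(n-2)2^{n-1}-2^{n+1}+1\\
 &= 2^{n-1}\bigl(2n-(n-2)-4\bigr)+1\\
 &= 2^{n-1}(n-2)+1.
\end{align*}
```

For n = 3 this gives 4·1 + 1 = 5, which matches the Store-zechin entry for n = 3 in Table 1.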

5. Comparison of Complexities between New Algorithm and Existing Algorithms


As mentioned above, the well-known algorithms for calculating the permanent are the Naive
algorithm, the Ryser algorithm and the R-N-W algorithm. Here, the numbers of addition operations,
multiplication operations and total bit operations (assuming the maximum integer allowed is $2^{64}$) are
used as the criteria for comparing the Store-zechin algorithm with these algorithms.

Firstly, we count the relevant data of each algorithm for n = 3, 4, …, 10; the results are shown
in Tables 1-3.
Table 1: Comparison of the Addition Operations of Four Algorithms
Algorithm      n=3   n=4   n=5   n=6   n=7    n=8     n=9      n=10
Naive          5     23    119   719   5039   40319   362879   3628799
Ryser          18    58    160   404   966    2230    5028     11152
R-N-W          21    51    115   253   553    1207    2631     5721
Store-zechin   5     17    49    129   321    769     1793     4097

Table 2: Comparison of the Multiplication Operations of Four Algorithms
Algorithm      n=3   n=4   n=5   n=6    n=7     n=8      n=9       n=10
Naive          18    96    600   4320   35280   322560   3265920   36288000
Ryser          22    61    156   379    890     2041     4600      10231
R-N-W          17    38    87    200    457     1034     2315      5132
Store-zechin   9     28    75    186    441     1016     2295      5110

Table 3: Comparison of the Total Bit Operations of Four Algorithms
Algorithm      n=3     n=4      n=5       n=6        n=7         n=8         n=9          n=10
Naive          74048   394688   2465216   17740736   144829376   1.3238e+9   1.3400e+10   1.4887e+11
Ryser          91264   253568   649216    1578240    3707264     8502656     19163392     42619904
R-N-W          70976   158912   363712    835392     1907264     4312512     9650624      21386816
Store-zechin   37184   115776   310336    770112     1826880     4210752     9515072      21192768

From the comparison in Tables 1-3, we can see that, when n > 5, the addition operations, the
multiplication operations and the total bit operations all satisfy Naive > Ryser > R-N-W >
Store-zechin. Besides, the difference between them increases as n increases. This shows that the
Store-zechin algorithm can complete the calculation of a permanent of order five or higher with
fewer operations.
To support the above statement for general n, the closed-form expressions for the addition
operations, multiplication operations, and total bit operations of the four algorithms are compared
next. The results are shown in Table 4.

Table 4: Comparison of Computational Complexity of Four Algorithms
Algorithm      Addition                         Multiplication        Total Bit
Naive          $n! - 1$                         $n \cdot n!$          $(4096 n + 64) \cdot n! - 64$
Ryser          $(n+1)(2^n - n) - 2$             $n(2^n - 1) + 1$      $4160 n 2^n + 64 \cdot 2^n - 64 n^2 - 4160 n + 3968$
R-N-W          $(n+1) 2^{n-1} + n^2 - n - 1$    $n 2^{n-1} + n + 2$   $4160 n 2^{n-1} + 64 \cdot 2^{n-1} + 64 n^2 + 4032 n + 8128$
Store-zechin   $n 2^{n-1} - 2^n + 1$            $n 2^{n-1} - n$       $4160 n 2^{n-1} - 128 \cdot 2^{n-1} - 4096 n + 64$

As can be seen from the comparison in the table, all three indicators show that the Naive algorithm
has the largest expression, so its computational complexity is the highest, and the Ryser algorithm
ranks second. Although the R-N-W algorithm has the same highest-order term as Store-zechin, it has
larger lower-order terms, so Store-zechin has the lower computational complexity.
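The closed forms in Table 4 can be evaluated with a short C sketch and compared with Tables 1-3. The names OpCount, ops_naive, ops_ryser, ops_rnw, ops_store_zechin, bit_ops and MAX_ORDER are hypothetical. The weights in bit_ops (64 bit operations per addition and 4096 per multiplication) are an inference from the tables, for example 64·5 + 4096·18 = 74048 for the Naive algorithm at n = 3; they are not stated explicitly in the text.

```c
#include <assert.h>

#define MAX_ORDER 16   /* hypothetical bound: keeps every value within long long */

typedef struct {
    long long add;   /* addition operations */
    long long mul;   /* multiplication operations */
} OpCount;

static long long factorial(int n)
{
    long long f = 1;
    for (int k = 2; k <= n; k++)
        f *= k;
    return f;
}

static long long pow2(int e) { return 1LL << e; }

/* Closed forms taken from the Addition and Multiplication columns of Table 4. */
OpCount ops_naive(int n)
{
    assert(n >= 1 && n <= MAX_ORDER);
    return (OpCount){ factorial(n) - 1, n * factorial(n) };
}

OpCount ops_ryser(int n)
{
    assert(n >= 1 && n <= MAX_ORDER);
    return (OpCount){ (n + 1) * (pow2(n) - n) - 2, n * (pow2(n) - 1) + 1 };
}

OpCount ops_rnw(int n)
{
    assert(n >= 1 && n <= MAX_ORDER);
    return (OpCount){ (n + 1) * pow2(n - 1) + (long long)n * n - n - 1,
                      n * pow2(n - 1) + n + 2 };
}

OpCount ops_store_zechin(int n)
{
    assert(n >= 1 && n <= MAX_ORDER);
    return (OpCount){ n * pow2(n - 1) - pow2(n) + 1, n * pow2(n - 1) - n };
}

/* Total bit operations, with weights inferred from Tables 1-3 (an assumption). */
long long bit_ops(OpCount c)
{
    return 64 * c.add + 4096 * c.mul;
}
```

Evaluating these functions for n = 3, …, 10 gives one way to regenerate the rows of Tables 1-3 from Table 4.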

6. Conclusion
Although the idea behind the Store-zechin algorithm has been neglected by mathematicians, the
algorithm can fully utilize the storage capability of the computer, and as the order of the matrix
increases, it calculates the permanent more efficiently. Through theoretical analysis, we
also confirm that Store-zechin has lower computational complexity than the Naive algorithm
and the Ryser algorithm. The R-N-W algorithm has the same highest-order term as Store-zechin but
larger lower-order terms. As the order of the matrix increases, the number of operations saved by
the Store-zechin algorithm grows. Moreover, the Store-zechin algorithm is designed around the storage
characteristics of computers, so it is well suited to them. Therefore, in some
performance tests, the Store-zechin algorithm can more fully reflect some of the features of the device,
and it has a good application prospect.

Acknowledgment
This work is supported by MOST with Project 2007CB311100 and 2009AA01Z441.
…………

References
[1] A. L. Cauchy. Mémoire sur les fonctions qui ne peuvent obtenir que deux valeurs égales et de signes
contraires par suite des transpositions opérées entre les variables qu'elles renferment. Journal de l'École
Polytechnique, 1815, v10: 91-169.
[2] T. Muir. On a Class of Permanent Symmetric Functions. Proceedings of the Royal Society of Edinburgh,
1882, v11(1): 409-418.
[3] S. Su and J. Zheng. The Multiphoton Boson Sampling Machine Doesn't Beat Early Classical Computers for
Five-boson Sampling. Cornell University (https://arxiv.org/ftp/arxiv/papers/1802/1802.02779.pdf), Feb 2018.
[4] N. D. Rugy-Altherre. Determinant versus Permanent: Salvation via Generalization. Lecture Notes in
Computer Science (The Nature of Computation - Logic, Algorithms, Applications), 2013, v7921: 87-96.
[5] H. Minc. Permanents - Encyclopedia of Mathematics and its Applications 6. Boston: Addison-Wesley, 1978.
[6] R. A. Brualdi. Combinatorial Matrix Classes - Encyclopedia of Mathematics and Its Applications 108.
Cambridge (UK): Cambridge University Press, 2006.
[7] H. Ryser. Combinatorial Mathematics. Mathematical Association of America, 1963.
[8] A. Nijenhuis and H. S. Wilf. Combinatorial Algorithms: For Computers and Calculators (2nd Edition).
Cambridge: Academic Press, 1978.
[9] E. Bax and J. Franklin. A Permanent Formula With Many Zero-Valued Terms. Information Processing Letters,
1997, v63(1): 33-39.
[10] E. Bax and J. Franklin. A Permanent Algorithm with exp[Ω(n^{1/3}/2 ln n)] Expected Speedup for 0-1 Matrices.
Algorithmica, 2002, v32: 157-162.
[11] D. G. Glynn. The Permanent of a Square Matrix. Cambridge: Academic Press, 2010.
[12] L. G. Valiant. The Complexity of Computing the Permanent. Theoretical Computer Science, 1979, v8(2):
189-201.
[13] A. Barvinok. Computing the Permanent of (Some) Complex Matrices. Foundations of Computational
Mathematics, 2016, v16(2): 329-342.
