A New Way To Solve Linear Equations (over finite fields)

colanderman · on Aug 10, 2012

This is very neat, in that it uses random combinations of vectors to maintain constant entropy over each iteration. Unfortunately this method does not improve on the O(n^2*m) asymptotic run-time of Gaussian elimination, nor does it address numerical stability issues (since it applies only to finite fields). Still neat though.

bilbo0s · on Aug 10, 2012

OK Cool.

I thought I was just being dense until I came back to HN from the article and read this comment.

andrewcooke · on Aug 10, 2012

...over finite fields (http://en.wikipedia.org/wiki/Finite_field_arithmetic)

which is cool for many things, but it's probably not what you expected from the title.

tl;dr: filter a collection of random guesses one equation at a time. new trick is expanding the set of guesses as you work through, so you don't need to start with 2^n (if modulo 2) for example.

ColinWright · on Aug 10, 2012

I've edited the title - thanks.

jps359 · on Aug 10, 2012

As I understand it, the conjugate gradient method works over the reals.

stephencanon · on Aug 10, 2012

Conjugate Gradient works over the reals if the matrix is symmetric positive definite.

jmount · on Aug 10, 2012

Which can be arranged. Instead of solving for x in A x = b solve for x in t(A) A x = t(A) b (t() being transpose, and you may not need to explicitly form t(A) A).

stephencanon · on Aug 10, 2012

... which works, but the condition number of AtA is the square of the condition number of A, so convergence may be very slow, or the results may be numerically unusable.

You can often work around that with a carefully chosen pre-conditioner, but you're rapidly going down a rabbit hole that takes you away from the extreme simplicity of conjugate-gradient.

jmount · on Aug 10, 2012

True. But this is all a rabbit hole. I think the upper comment was trying to point out that real linear algebra already has work from a trial vector methods of solution. So there is some precedent for non-factorization based solutions. But even that detracts that this really does seem to be a new method for finite fields.

beagle3 · on Aug 10, 2012

I wonder if there's an underlying "random attempts" theory that can tie these successes (that have intrigued me for a long time):

1. Shannon's random code book construction (for achieving the noisy channel rate limit)

2. Johnson-Lindenstrauss lemma http://en.wikipedia.org/wiki/Johnson%E2%80%93Lindenstrauss_l..., the result that underlies most if not all of the compressive sensing field (and all of the recent superfast large scale SVD, PCA algorithms)

And of course this recent result might also be related somehow.

eru · on Aug 10, 2012

There's also the (bigger?) question whether randomness helps at all, e.g. whether P is weaker than P + a source of random bits. In practice, randomization does help, but like with P vs NP as far as I know nobody has proven anything definite.

pmb · on Aug 10, 2012

The complexity classes that involve randomness have names like BPP, ZPP, and RP. BPP is usually the one people mean when they talk about practical problems that randomness might help solve. Although the second level of PH contains both NP and BPP, which both contain P, it is still not known if any of those containments are proper, and the relationship between BPP and NP is still open. Even more interesting is the relationship between BPP and its quantum version BQP. The Complexity Zoo has more for the interested reader... http://qwiki.stanford.edu/index.php/Complexity_Zoo

eru · on Aug 10, 2012

Thanks for giving a less hand-wavy account.

pmb · on Aug 10, 2012

This stuff has just been in the air the past few years - Network coding ( http://www.eecs.harvard.edu/~michaelm/CS222/contentdist.pdf ) does the inverse operation to increase efficiency in Bittorrent-like distribution schemes. It is quite cool that randomness can help in both the creation and solving of these problems.

rm999 · on Aug 10, 2012

I never took abstract algebra, can someone explain what is going on? I read about half the article before realizing I have no idea what they mean.

acomar · on Aug 10, 2012

A field is a structure that behaves very much like the reals (under the usual operations). That is, it's a set of elements with an addition operation with an identity called 0 (special elements are named after their numeric equivalents, but that doesn't mean they're always the same thing), a multiplication with an identity called 1, and the two interact "properly". That is, they distribute and the additive identity is the multiplicative zero. Further, inverses exist for all elements, with respect to both addition and multiplication (except 0).

So very reals-like. Finite fields are then fields with a finite number of elements. Can you think of any examples?

The easiest way to read the OP is to read the field F as if it were the reals, except with a finite number of elements.

Let me know if that helps.

sadga · on Aug 10, 2012

Nit: A closer analogy is rationals, more so than reals (as reals have other unrelated properties like uncountability)

That is, given your simple rings (the integers) and finite rings (modulo n), you get fields by introducing just division (integers -> rationals; mod-n -> mod-p for prime p).

dgreensp · on Aug 10, 2012

"F_p" means the field of integers modulo p. Fields have addition and multiplication. If p is prime, the integers mod p have nice properties, like every one has an inverse.

For example, F_7 is the finite field of 0,1,2,3,4,5,6. You can add and multiply these (mod 7): 6 + 2 == 1, 2 * 5 == 3. The elements 0 and 1 work as you'd expect to satisfy the usual identities of arithmetic. Additionally, every element except 0 has a multiplicative inverse. 1 * 1 == 1, 2 * 4 == 1, 3 * 5 == 1, and 6 * 6 == 1. To "divide" by 2, you just multiply by 4. Etc.

ColinWright · on Aug 10, 2012

An answer to that depends a great deal on what you already know. If you know what linear equations are, and you know what a finite field is, then is seems pretty clear. Are you trying to "read like math" or are you trying to "read like a novel"?

If you're interested then read it again, closely, trying to understand every sentence, then feel free to post here or email me with the first sentence you don't understand,

rm999 · on Aug 10, 2012

I have a good applied math background, including linear algebra and gaussian elimination. First part I don't totally understand is: Prasad’s work is for solving linear systems over finite fields, specifically F_p, with p prime.

ColinWright · on Aug 10, 2012

Re-reading your question, since you are already familiar with linear algebra, the thing you need to know about is finite fields. But read the whole thing, it's in a logical progression.

OK ... so a linear equation is one of the form:

    ax + by + ... + dz = n

where the a, b, d and n are constants and the x, y, z, etc are the variables. You may be familiar with these when the constants are real numbers, but they can be from any field (more about that in a minute).

An example of a linear equation is

    3x + 5y - 2z = 1

So a system of linear equations is when you have lots of these. In large systems you may have literally millions of variables.

So what's this about a "field"?

Well, in general the constants don't need to be real numbers. There are properties that we want, such as addition, multiplication, subtraction and division, but there are systems that have these sorts of things that are not the real numbers. These are called fields.

For example, the rational numbers have all the properties we might want. We can add, subtract, divide and multiply rationals, and things work the way we would want.

But another example is the integers modulo 5. The numbers 0, 1, 2, 3 and 4 have all the properties we want. We can divide by 2, for example, but multiplying by 3, because 2x3=6, which is 1 (modulo 5).

So we can have a field (a system with all the operations we want) that only has a finite number of elements. There are others, but the most common ones are to take the integers modulo some prime number. The proof that we get a field is not har, and is actually interesting, but some people find it tedious.

So Prasad's work is for solving systems of linear equations where the constants are not real numbers, but come from a finite field.

Finally, one of the most common finite fields in the area of computing is the field that has just two elements - 0 and 1.

The other answers in this sub-thread are pretty good. Again, if you want to understand you will need to "read like math", but once you've done so, come and ask again if you have any questions.

rm999 · on Aug 10, 2012

Oh, I see, that makes a lot of sense - thanks!

ColinWright · on Aug 10, 2012

You're welcome.

csense · on Aug 11, 2012

You've probably used Gauss-Jordan elimination over the reals. What properties of real numbers make Gauss-Jordan elimination work?

The real numbers have addition and multiplication operations which each obey commutative, associative, identity, and inverse laws; and multiplication is distributes over addition. These properties are all that's required for Gauss-Jordan elimination to work.

(We don't need other properties of real numbers, such as properties involving convergence of sequences, which you may have studied in calculus or real analysis class.)

A set and binary operators + and * which have all of these properties, are called a "field" [1]. The real numbers, rational numbers, and complex numbers are probably your most familiar examples of fields.

When talking about a field like "the field of rational numbers, Q", you're actually referring to a data structure (Q, +, *) which includes definitions of the addition and multiplication operators.

Since most fields have "standard" addition and multiplication operators, often mathematicians drop them from discussion. Kind of like, when you're discussing a program with a colleague, you say, "Add the offset to the base," your colleague doesn't ask, "Add them as integers or floating-point numbers? What should I do if they overflow?" The answers to these questions are almost always clear from context, and also almost always near-universally standardized, at least within a particular programming language.

Anyway, as other posters have noted, the integers modulo p, where p is prime, are actually a field. It's obvious that Z_p has most of the properties of a field, but multiplicative inverse is interesting. The proof of existence of multiplicative inverses mod p -- and a fast O(log(p)) implementation of finding them -- is the Euclidean algorithm, one of the oldest algorithms known to mankind. [2]

[1] http://en.wikipedia.org/wiki/Field_%28mathematics%29

[2] http://en.wikipedia.org/wiki/Euclidean_Algorithm

eru · on Aug 10, 2012

Do you know all the words in that sentence, and just can't seem how they work together to create meaning, or do you have problems with any specific word?

jastr · on Aug 10, 2012

Here goes:

You think that Microsoft's stock price is directly related to how much it rained in Seattle. So you look at one day and see it rained 10 inches and the stock price was $20. On the next day, it rained 6 inches and the stock price fell to $12. What's the relationship between rain and stock price? Obviously 2x. How would your figure that out mathematically?

(Inches of rain) * ( relationship ) = ( stock price )

So we must find a relationship ( we'll call it r) that works for both of these equations (10)(r) = 20 (6)(r) = 12

Obvs r=2. That was easy. What if you think this is too simple and the stock price is actually related to both rain fall and the temperature.

We need a new equation (Inches of rain)(relationship 1) + (temperature)(relationship 2) = ( stock price )

Relationship 1 and relationship 2 are different values, because maybe rain has a much bigger impact on the stock price than temperature. So we need to solve for r1 and r2. (Try making up numbers for rain, temperature, and price on 3 days, and you'll find this gets ugly to solve.)

Then you decide again this is too simple, and the stock price replies on rain, temperature, the moons distance from the earth, the amount of red cars you see on the way to work, etc. Now we have a lot of big equations, that we'd like to solve. Once we solve them we can determine the relationships each factor has on the stock price and tomorrow we can look at all the inputs (temperature, rainfall, etc) and predict the stock price. Matrices are a way of representing all these equations at once.

Matrices are just a data structure, a way of representing equations. Just like computer data structures, matrices have certain functions that you can perform on them, like adding or multiplying them with other matrices.

There are lots of applications that use really really big matrices with thousands of inputs and we'd like to solve these matrices ( ie. find the relationships ).

There are a lot of ways of solving this, like Gauss-Jordan Elimination (which you probably know how to do even if you don't know that you know how). The problem is that most of these algorithms are very slow, because they perform lots of computations manipulating the matrices in order to solve them.

(I have to get back to work so the rest is real short.) In short, this new algorithm is just a really clever guess and check, sort of like genetic algorithms. The algorithm guesses possible values for each relationship. If a set of relationships (aka a vector) fits some of the equations (but not all), then the algorithm generates more relationships similar to it. This goes on until the algorithm finds a set of values for relationships that work in each equation.

azakai · on Aug 10, 2012

> He starts with a random set of vectors {V_{0}}. Then he checks which ones satisfy the first equation. If the vectors are random, then about half of them will satisfy the equation.

I don't follow this. Shouldn't it be 1/n, not 1/2?

ColinWright · on Aug 10, 2012

They're using the specific example of GF(2) as the field. Any given assignment of values to variables then gives a value of 0 or 1, so there's a 50:50 chance of getting it right.

If you're working over F_p then about 1/p of the randomly chosen vectors will satisfy the equation.

azakai · on Aug 10, 2012

Thanks, I didn't realize it was a specific example there.

bascule · on Aug 11, 2012

Does this have any applications to elliptic curve cryptography?

ColinWright · on Aug 11, 2012

I suspect not. This technique works by assigning values to variables at random, lots of times, and then keeping the ones that happen to make the first equation true. You then somehow breed them into another large set, and winnow with the second equation.

However, the number of elements in the fields involved there are so large that you need a huge number of equations to have even the smallest chance of any of them happening to be true "by chance", so I can't see how it could work as stated, but it's certainly an interesting question.

propercoil · on Aug 10, 2012

And no one mentioned a simple matrix?