I've been interested in the problem of finding a better prime number recognizer for years. I realize this is a huge area of academic research and study - my interest in this is really just for fun. Here was my first attempt at a possible solution, in C (below).
My question is, can you suggest an improvement (without citing some other reference on the net, I'm looking for actual C code)? What I'm trying to get from this is a better understanding of determining performance complexity of a solution like this.
Am I right in concluding that the complexity of this solution i开发者_高级运维s O(n^2)?
#include <stdio.h>
#include <math.h>
/* isprime */
/* Test if each number in the list from stdin is prime. */
/* Output will only print the prime numbers in the list. */
int main(int argc, char* argv[]) {
int returnValue = 0;
int i;
int ceiling;
int input = 0;
int factorFound = 0;
while (scanf("%d", &input) != EOF) {
ceiling = (int)sqrt(input);
if (input == 1) {
factorFound = 1;
}
for (i = 2; i <= ceiling; i++) {
if (input % i == 0) {
factorFound = 1;
}
}
if (factorFound == 0) {
printf("%d\n", input);
}
factorFound = 0;
}
return returnValue;
}
for (i = 2; i <= ceiling; i++) {
if (input % i == 0) {
factorFound = 1;
break;
}
}
This is the first improvement to make and still stay within the bounds of "same" algorithm. It doesn't require any math at all to see this one.
Beyond that, once you see that input
is not divisible by 2, there is no need to check for 4, 6, 8, etc. If any even number divided into input
, then surely 2 would have because it divides all even numbers.
If you want to step outside of the algorithm a little bit, you could use a loop like the one that Sheldon L. Cooper provides in his answer. (This is just easier than having him correct my code from the comments though his efforts are much appreciated)
this takes advantage of the fact that every prime other than 2 and 3 is of the form n*6 + 1
or n*6 - 1
for some positive integer n
. To see this, just note that if m = n*6 + 2
or m = n*6 + 4
, then n
is divisible by 2. if m = n*6 + 3
then it is divisible by 3.
In fact, we can take this further. If p1, p2, .., pk
are the first k
primes, then all of the integers that are coprime to their product mark out 'slots' that all remaining primes must fit into.
to see this, just let k#
be the product of all primes up to pk
. then if gcd(k#, m) = g
, g
divides n*k# + m
and so this sum is trivially composite if g != 1
. so if you wanted to iterate in terms of 5# = 30
, then your coprime integers are 1, 7, 11, 13, 17, 19, 23 and 29.
technically, I didn't prove my last claim. It's not much more difficult
if g = gcd(k#, m)
, then for any integer, n
, g
divides n*k# + m
because it divides k#
so it must also divide n*k#
. But it divides m
as well so it must divide the sum. Above I only proved it for n = 1
. my bad.
Also, I should note that none of this is changing the fundamental complexity of the algoritm, it will still be O(n^1/2). All it is doing is drastically reducing the coefficient that gets used to calculate actual expected run times.
The time complexity for each primality test in your algorithm is O(sqrt(n))
.
You can always use the fact that all primes except 2 and 3 are of the form: 6*k+1
or 6*k-1
. For example:
int is_prime(int n) {
if (n <= 1) return 0;
if (n == 2 || n == 3) return 1;
if (n % 2 == 0 || n % 3 == 0) return 0;
int k;
for (k = 6; (k-1)*(k-1) <= n; k += 6) {
if (n % (k-1) == 0 || n % (k+1) == 0) return 0;
}
return 1;
}
This optimization does not improve the asymptotic complexity.
EDIT
Given that in your code you are testing numbers repeatedly, you might want to pre-calculate a list of primes. There are only 4792 primes less than or equal to the square root of INT_MAX (assuming 32 bit ints).
Furthermore, if the input numbers are relatively small you can try calculating a sieve.
Here's a combination of both ideas:
#define UPPER_BOUND 46340 /* floor(sqrt(INT_MAX)) */
#define PRIME_COUNT 4792 /* number of primes <= UPPER_BOUND */
int prime[PRIME_COUNT];
int is_prime_aux[UPPER_BOUND];
void fill_primes() {
int p, m, c = 0;
for (p = 2; p < UPPER_BOUND; p++)
is_prime_aux[p] = 1;
for (p = 2; p < UPPER_BOUND; p++) {
if (is_prime_aux[p]) {
prime[c++] = p;
for (m = p*p; m < UPPER_BOUND; m += p)
is_prime_aux[m] = 0;
}
}
}
int is_prime(int n) {
if (n <= 1) return 0;
if (n < UPPER_BOUND) return is_prime_aux[n];
int i;
for (i = 0; i < PRIME_COUNT && prime[i]*prime[i] <= n; i++)
if (n % prime[i] == 0)
return 0;
return 1;
}
Call fill_primes
at the beginning of your program, before starting to process queries. It runs pretty fast.
Your code there only has complexity O(sqrt(n)lg(n)). If you assume basic mathematical operations are O(1) (true until you start using bignums), then it's just O(sqrt(n)).
Note that primality testing can be performed in faster-than-O(sqrt(n)lg(n)) time. This site has a number of implementations of the AKS primality test, which has been proven to operate in O((log n)^12) time.
There are also some very, very fast probalistic tests - while fast, they sometimes give an incorrect result. For example, the Fermat primality test:
Given a number
p
we want to test for primality, pick a random numbera
, and test whethera^(p-1) mod p = 1
. If false,p
is definitely not prime. If true,p
is probably prime. By repeating the test with different random values ofa
, the probability of a false positive can be reduced.
Note that this specific test has some flaws to it - see the Wikipedia page for details, and other probabilistic primality tests you can use.
If you want to stick with the current approach, there are a number of minor improvements which can still be made - as others have pointed out, after 2
, all further primes are odd, so you can skip two potential factors at a time in the loop. You can also break out immediately when you find a factor. However, this doesn't change the asymptotic worst-case behavior of your algorithm, which remains at O(sqrt(n)lg(n)) - it just changes the best case (to O(lg(n))), and reduces the constant factor by roughly one-half.
A simple improvement would be to change the for loop to break out when it finds a factor:
for (i = 2; i <= ceiling && !factorFound; i++) {
if (input % i == 0) {
factorFound = 1;
Another possibility would be to increment the counter by 2 (after checking 2 itself).
can you suggest an improvement
Here you go ... not for the algorithm, but for the program itself :)
- If you aren't going to use
argc
andargv
, get rid of them - What if I input "fortytwo"? Compare scanf()
== 1
, not!= EOF
- No need to cast the value of
sqrt()
returnValue
is not needed, you can return a constant:return 0;
- Instead of having all of the functionality inside the
main()
function, separate your program in as many functions as you can think of.
You can make small cuts to your algorithm without adding too much code complexity. For example, you can skip the even numbers on your verification, and stop the search as soon as you find a factor.
if (input < 2 || (input != 2 && input % 2 == 0))
factorFound = 1;
if (input > 3)
for (i = 3; i <= ceiling && !factorFound; i += 2)
if (input % i == 0)
factorFound = 1;
Regarding the complexity, if n
is your input number, wouldn't the complexity be O(sqrt(n)), as you are doing roughly at most sqrt(n) divisions and comparisons?
Even numbers(except 2) cannot be prime numbers. So, once we know that the number is not even, we can just check if odd numbers are it's factors.
for (i = 3; i <= ceiling; i += 2) {
if (input % i == 0) {
factorFound = 1;
break;
}
}
The time complexity of your program is O(n*m^0.5)
. With n
the number of primes in the input. And m
the size of the biggest prime in the input, or MAX_INT
if you prefer. So complexity could also be written as O(n)
with n the number of primes to check.
With Big-O, n
is (usually) the size of the input, in your case that would be the number of primes to check. If I make this list twice as big (for example duplicating it), it would take (+-) exactly twice as long, thus O(n)
.
Here's my algorithm, Complexity remains O(n^0.5)
but i managed to remove some expensive operations in the code...
The algorithm's slowest part is the modulus
operation, i've managed to eliminate sqrt
or doing i * i <= n
This way i save precious cycles...its based on the fact that sum of odd numbers is always a perfect square.
Since we are iterating over odd numbers
anyway, why not exploit it? :)
int isPrime(int n)
{
int squares = 1;
int odd = 3;
if( ((n & 1) == 0) || (n < 9)) return (n == 2) || ((n > 1) && (n & 1));
else
{
for( ;squares <= n; odd += 2)
{
if( n % odd == 0)
return 0;
squares+=odd;
}
return 1;
}
}
#include <stdio.h>
#include <math.h>
int IsPrime (int n) {
int i, sqrtN;
if (n < 2) { return 0; } /* 1, 0, and negatives are nonprime */
if (n == 2) { return 2; }
if ((n % 2) == 0) { return 0; } /* Check for even numbers */
sqrtN = sqrt((double)n)+1; /* We don't need to search all the way up to n */
for (i = 3; i < sqrtN; i += 2) {
if (n % i == 0) { return 0; } /* Stop, because we found a factor! */
}
return n;
}
int main()
{
int n;
printf("Enter a positive integer: ");
scanf("%d",&n);
if(IsPrime(n))
printf("%d is a prime number.",n);
else
printf("%d is not a prime number.",n);
return 0;
}
There is no way to improve the algorithm. There may be tiny ways to improve your code, but not the base speed (and complexity) of the algorithm.
EDIT: Of course, since he doesn't need to know all the factors, just whether or not it is a prime number. Great spot.
精彩评论