
Heap sort is an example of an in-place sorting algorithm that runs in constant space. Just because this implementation uses some kind of fixed-size buffer doesn't mean it has an upper limit on the amount of data it can sort. It's possible that it does, but knowing for sure probably requires a deeper understanding of the algorithm.


Maybe I'm missing some way of implementing heapsort without array indices, but all the implementations I've seen use O(log n) space. For example, the pseudocode on Wikipedia (https://en.wikipedia.org/wiki/Heapsort#Pseudocode) at a quick glance uses five temporary variables: 'count', 'end', 'start', 'child', and 'root'. How much temporary space is this? Well, proportional to the log of the size of the array, at a minimum. If you're sorting 256-element arrays, these can each be 8-bit integers, and you need 5 bytes of temporary space. If you're sorting a 2^32 element array, these need to each be at least 32-bit integers, and you need 20 bytes of temporary space. If you have a 2^64 element array, they need to be 64-bit integers, and you need 40 bytes of temporary space. If you have a 2^128 element array, they need to be 128-bit integers, and you need 80 bytes of temporary space. Therefore, temporary space needed grows with the log of the array size. Which is very slow growth, but still growth!

The two things you can do are: 1) pick a fixed-size int and cap the maximum array size you can handle; or 2) use a bigint type, and the size of your bigints will (asymptotically) grow with log(n).
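To make option 1 concrete, here's a rough siftDown in Java (my own sketch, not the Wikipedia pseudocode verbatim). The int-typed 'root' and 'child' locals are exactly the temporary space being counted, and their fixed 32-bit width is what caps the maximum array size:

    // Sketch only: restore the max-heap property for the subtree rooted at
    // 'start', considering elements up to index 'end' (inclusive).
    static void siftDown(int[] a, int start, int end) {
        int root = start;                      // 32-bit index: caps n at 2^31 - 1
        while (2L * root + 1 <= end) {         // long math so the guard itself can't overflow
            int child = 2 * root + 1;          // left child; fits in an int because the guard passed
            if (child + 1 <= end && a[child] < a[child + 1]) {
                child++;                       // right child is larger
            }
            if (a[root] >= a[child]) {
                return;                        // heap property already holds
            }
            int tmp = a[root]; a[root] = a[child]; a[child] = tmp;
            root = child;                      // continue sifting down
        }
    }

Swap those int indices for longs (or a bigint) and you're in option 2 territory: the temporary space grows with log(n) again.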


But this is a really irrelevant point, as you can't even represent an array of size n without log n space (for the start pointer and the size/end pointer). So, every algorithm involving arrays needs at least log n space, if nothing else for the input arguments.


It's irrelevant in practice, mostly because log(n) is very small for any value of n you could realistically encounter, so one could consider it "close enough to constant". Close enough in the sense that a reasonably small constant (like "64") bounds any value of it you could possibly encounter in a sorting algorithm. But that's true of many things in big-O analysis.

If your temporary space were used by indices represented as unary-encoded integers (which take O(n) space), it'd suddenly be harder to ignore the indices, because O(n) is not "very small" for many commonly encountered sizes of n. So I don't think taking the temporary space used by indexing into account is conceptually wrong; it just happens not to matter here because log(n) factors often don't matter.
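To put a number on the "bounded by a small constant like 64" point, here's a throwaway Java snippet (illustrative only, names are mine) that prints how many bits an index variable actually needs for a given n:

    public class IndexBits {
        // Bits needed to represent indices 0 .. n-1, i.e. ceil(log2(n)).
        static int bitsNeeded(long n) {
            return 64 - Long.numberOfLeadingZeros(n - 1);
        }
        public static void main(String[] args) {
            for (long n : new long[]{256L, 1L << 20, 1L << 32, Long.MAX_VALUE}) {
                System.out.println(n + " elements -> " + bitsNeeded(n) + " index bits");
            }
        }
    }

Even Long.MAX_VALUE elements only needs 63 bits per index; a unary encoding of the same index could need up to n symbols.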


In practice, though, it makes sense in this case to assume that integers representing array indices take constant space and that arithmetic on them takes constant time. The computers we run these algorithms on use base 2, and the size of integer they can operate on in a single instruction has scaled nicely with the amount of storage they have access to.


I understand your point now; I thought you were mainly referring to the stack space quicksort uses.

Edit: A comment above claims that without this bound, quicksort isn't even log n space.


Granted, about twenty years passed before anyone had an array of 2^30 or more elements in Java and noticed this bug in binary search:

http://googleresearch.blogspot.ca/2006/06/extra-extra-read-a...
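For anyone who hasn't read it: the culprit is the midpoint computation in the classic int-indexed binary search. A sketch of the pattern (not the exact JDK source) with the standard fixes:

    static int binarySearch(int[] a, int key) {
        int low = 0;
        int high = a.length - 1;
        while (low <= high) {
            // Buggy version: int mid = (low + high) / 2;
            // The sum overflows once low + high exceeds Integer.MAX_VALUE,
            // which can happen for arrays of 2^30 or more elements.
            int mid = low + (high - low) / 2;   // safe: the difference can't overflow
            // int mid = (low + high) >>> 1;    // alternative fix: unsigned shift
            if (a[mid] < key) {
                low = mid + 1;
            } else if (a[mid] > key) {
                high = mid - 1;
            } else {
                return mid;                     // found
            }
        }
        return -(low + 1);                      // not found; same convention as java.util.Arrays
    }

Both fixes keep the indices as 32-bit ints; they just avoid computing an intermediate value that doesn't fit.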

If people are using abstract data types and/or built-ins, in practice, presumably they can use a fixed-width int type for O(1) operations and just update the data type every 20 years or so.

On that note, has anyone actually created a 2^128 element array yet? I suspect such an array would be too large to represent in the memory of all the world's computers at the moment.



