Hacker News

  > [Despite the existence of "enumerate", "xrange(maxint)"] can still
  > be useful when you want to include an index along with several
  > other lists, however, e.g. zip(list_1, list_2, indices)
I would just use

  for idx, (elt1, elt2) in enumerate(zip(list_1, list_2))
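A runnable sketch of that idiom (the list contents here are made up for illustration):

```python
# enumerate(zip(...)) pairs a running index with elements drawn
# in lockstep from both lists; zip stops at the shorter one.
list_1 = ["a", "b", "c"]
list_2 = [10, 20, 30]

for idx, (elt1, elt2) in enumerate(zip(list_1, list_2)):
    print(idx, elt1, elt2)  # 0 a 10, then 1 b 20, then 2 c 30
```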
The following is not necessarily good advice, and he should take his own earlier advice and validate his claims with some profiling.

  > [map is ] much faster, since the loop takes place entirely in
  > the C API and never has to bind loop variables to Python
  > objects.

  > If you find yourself making the same list comprehension
  > repeatedly, make utility functions and use map and/or filter...  
Note that the following "map" snippet is considerably slower than the corresponding list-comprehension snippet. Function calls are expensive in python. Also, his claim that you save time with a "map" by avoiding the loop variable is obviously bogus: you still have to bind the variables in the signature of the function you're mapping, unless the signature is empty.

  met% python -m timeit "[x**3 for x in xrange(10000)]"
  1000 loops, best of 3: 1.27 msec per loop
  met% python -m timeit "map(lambda x: x**3, xrange(10000))"
  1000 loops, best of 3: 1.88 msec per loop


The map and filter functions were almost depreciated in Python 3.

http://www.artima.com/weblogs/viewpost.jsp?thread=98196


FTFY: depreciated --> deprecated


> I would just use

> for idx, (elt1, elt2) in enumerate(zip(list_1, list_2))

or use itertools.count:

    for idx, elt1, elt2 in zip(count(), list_1, list_2)
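Spelled out with the import it needs (again with illustrative data):

```python
from itertools import count

# count() yields 0, 1, 2, ... indefinitely; zip() stops at the
# shortest iterable, so each element gets paired with its index.
list_1 = ["a", "b", "c"]
list_2 = [10, 20, 30]

triples = [(idx, elt1, elt2)
           for idx, elt1, elt2 in zip(count(), list_1, list_2)]
```

The upside over enumerate(zip(...)) is that you skip one level of tuple unpacking in the loop header.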


I'm getting entirely different results with Python 3.2:

    % python -m timeit "[ x**3 for x in range(10000) ]"
    100 loops, best of 3: 7.85 msec per loop
    % python -m timeit "map(lambda x: x**3, range(10000))"
    1000000 loops, best of 3: 1.26 usec per loop
By these benchmarks map is thousands of times faster than the corresponding list comprehension.


When times are that short you should question whether Python is actually doing anything, or just promising to do it later (remember that in Python 3 both range and map are lazy: they return iterables that do no work until consumed).

  $ python3.1 -m timeit "map(lambda x: x**3, range(10000))"
  1000000 loops, best of 3: 1.01 usec per loop
Seems fishy. Adding up the elements forces Python to actually do the cubing on all of them:

  $ python3.1 -m timeit "sum(map(lambda x: x**3, range(10000)))"
  100 loops, best of 3: 13 msec per loop
  $ python3.1 -m timeit "sum([ x**3 for x in range(10000) ])"
  100 loops, best of 3: 10.6 msec per loop
map really is slower, at least for me in python3.1.
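A quick way to see the laziness directly (Python 3):

```python
# Building the map object does essentially no work:
cubes = map(lambda x: x ** 3, range(10000))
print(type(cubes))      # <class 'map'>

# The cubing only happens once something consumes the iterator:
total = sum(cubes)

# ...and the iterator is now exhausted, so a second pass sees nothing:
leftover = sum(cubes)   # 0
```

This is why timing a bare `map(...)` expression in Python 3 measures only object construction, not the arithmetic.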


  $ python3.2 -m timeit "sum(map(lambda x: x**3, range(10000)))"
  100 loops, best of 3: 7.61 msec per loop

  $ python3.2 -m timeit "sum([ x**3 for x in range(10000) ])"
  100 loops, best of 3: 6.27 msec per loop

  $ python2.7 -m timeit "sum(map(lambda x: x**3, range(10000)))"
  100 loops, best of 3: 2.81 msec per loop

  $ python2.7 -m timeit "sum([ x**3 for x in range(10000) ])"
  100 loops, best of 3: 2.28 msec per loop
I wasn't expecting Python 3 to be that much slower.


[ldng: put some whitespace before code on hacker news: it indents it and the asterisks don't disappear.]

I infer that ldng has a 64 bit machine. On my 32 bit machine, Python 3.1 is faster than 2.6 for these examples. On a 64 bit machine I get similar results to ldng's, with Python3 being slower. If I wrap long() around the x in the example, Python2 becomes as slow as Python3.

Note that taking the cube of lots of big integers is not typical for many people: it generates very large integers that have to be in Python2's special long type on a 32 bit machine. On a 64 bit machine they stay as normal ints in Python2, which are much faster. Python3 has a single automagic int type, which seems to internally convert to the arbitrary precision type sooner than it has to on 64 bit machines(?).

Examples more typical of my use would wrap float() around the x, or change the example to add up 3x instead of x^3. These examples are all faster in Python3 for me. Faster still is to use numpy (which is now supported in Python3).
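For comparison, here is the same computation vectorised (this assumes NumPy is installed; the variable names are mine):

```python
import numpy as np

# One C-level loop over a contiguous int64 array, instead of
# 10000 Python-level multiplications. 9999**3 and the total sum
# of cubes both fit comfortably in int64.
x = np.arange(10000, dtype=np.int64)
total = int((x ** 3).sum())
```

The trade-off is that int64 silently overflows where Python ints would promote to arbitrary precision, so this only matches the pure-Python result while the values fit in 64 bits.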

Summary: the people who would be affected by this regression have both a 64 bit machine, and do a lot of exact integer arithmetic on integers that can be represented in 64 bits, but not 32.


map() returns an iterator in Py3k.

    > python3.2 -m timeit "( x**3 for x in range(10000))"         
    100000 loops, best of 3: 2.67 usec per loop

    > python3.2 -m timeit "[ x**3 for x in range(10000)]"      
    100 loops, best of 3: 9.53 msec per loop

    > python3.2 -m timeit "list(map(lambda x: x**3, range(10000)))"
    100 loops, best of 3: 11.8 msec per loop

    > python3.2 -m timeit "list( x**3 for x in range(10000))"      
    100 loops, best of 3: 10.7 msec per loop
And here's the Python 2.5 naïve/in-memory version recreated in Python 3.2:

    > python3.2 -m timeit "list(map(lambda x: x**3, list(range(10000))))" 
    100 loops, best of 3: 12.1 msec per loop
And Python 2.7 is still the fastest:

    > python -m timeit "map(lambda x: x**3, range(10000))"
    100 loops, best of 3: 4.33 msec per loop

    > python -m timeit "[x**3 for x in xrange(10000)]"                                             
    100 loops, best of 3: 2.32 msec per loop
    
    > python -m timeit "[x**3 for x in range(10000)]" 
    100 loops, best of 3: 2.66 msec per loop
On PyPy (1.5) it's pretty much the same, which goes to show why this is not “Python idiom”, but rather CPython's implementation detail:

    > pypy -m timeit "[x**3 for x in range(10000)]"
    100 loops, best of 3: 2.73 msec per loop

    > pypy -m timeit "[x**3 for x in xrange(10000)]"
    100 loops, best of 3: 2.62 msec per loop

    > pypy -m timeit "map(lambda x: x**3, xrange(10000))"
    100 loops, best of 3: 2.51 msec per loop
    
    > pypy -m timeit "map(lambda x: x**3, range(10000))" 
    100 loops, best of 3: 2.87 msec per loop
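The usec-scale results earlier in the thread were measuring only the cost of constructing the lazy object; forcing everything with list() is what makes the comparisons above apples-to-apples:

```python
gen = (x ** 3 for x in range(10))          # generator expression: lazy
mapped = map(lambda x: x ** 3, range(10))  # Python 3 map: also lazy
lc = [x ** 3 for x in range(10)]           # list comprehension: eager

# Nothing has been cubed yet by gen or mapped; list() forces them,
# and all three produce the same values.
assert list(gen) == list(mapped) == lc
```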


Did you try itertools.imap instead? Vanilla map gives us a list; imap gives us a generator. (Also, I'd try a generator expression as well to see how it compares to a list comprehension).

Also, note that the article is from January 2007; I'd wager that Python has evolved and had a wealth of optimisations since then, especially regarding "common idioms" such as vanilla string concatenation and list comprehensions.


  > ...the article is from January 2007; I'd wager that Python has
  > evolved and had a wealth of optimisations since then...
The observations I made regarding list-comprehensions vs map have been true since list comprehensions were first introduced.

  > Did you try itertools.imap instead? Vanilla map gives us a list;
  > imap gives us a generator.
Don't see why that would make any difference (except to confuse the benchmark, as the python3 examples in this thread show.)


The rule of thumb I've read is that map is appropriate and (generally) faster when you do not need to create a lambda.
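The usual illustration of that rule of thumb, sketched with the timeit module (absolute numbers will vary by machine; the point is only the comparison):

```python
import timeit

# map with an existing callable: no Python-level frame per element.
builtin_map = timeit.timeit("list(map(str, range(1000)))", number=100)

# map with a lambda: one Python function call per element.
lambda_map = timeit.timeit("list(map(lambda x: str(x), range(1000)))",
                           number=100)

# List comprehension: inline bytecode loop, one str() call per element.
listcomp = timeit.timeit("[str(x) for x in range(1000)]", number=100)
```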


That's not strictly correct. It really comes down to whether you're calling a native python function or a C extension. If it's a native python function call, it's going to be relatively slow.


In CPython, LCs are simpler in bytecode than map/filter, which makes them faster. That's all, really.
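You can see this with the dis module: the comprehension's loop body is inline bytecode, while with map every element goes through a full Python function call (function and variable names here are just for illustration):

```python
import dis

# The cube is computed inline in the comprehension's bytecode loop:
dis.dis(compile("[x ** 3 for x in data]", "<lc>", "eval"))

# With map, each element instead pays the frame-setup cost of
# calling this function:
cube = lambda x: x ** 3
dis.dis(cube)
```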



