Hello there,
The last couple of days I've been working on an experimental rewrite of
the GIL. Since the work has been turning out rather successful (or, at
least, not totally useless and crashing!) I thought I'd announce it
here.
First I want to stress this is not about removing the GIL. There still
is a Global Interpreter Lock which serializes access to most parts of
the interpreter. These protected parts haven't changed either, so Python
doesn't become really better at extracting computational parallelism out
of several cores.
Goals
The new GIL does away with this by ditching _Py_Ticker entirely and
instead using a fixed interval (by default 5 milliseconds, but settable)
after which we ask the main thread to release the GIL and let another
thread be scheduled.
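As an illustration of the "settable" interval: in CPython releases that include this work (3.2 and later), the interval can be read and changed from Python code through sys.getswitchinterval() and sys.setswitchinterval(). Those names come from released CPython, not from this message, and the value is expressed in seconds. A minimal sketch:

```python
import sys

# Read the current switch interval (in seconds).
original_interval = sys.getswitchinterval()

# Ask for more frequent switching, e.g. every millisecond.
sys.setswitchinterval(0.001)
shortened_interval = sys.getswitchinterval()

# Restore the previous value so the rest of the program is unaffected.
sys.setswitchinterval(original_interval)

print("original: %.4f s, shortened: %.4f s" % (original_interval, shortened_interval))
```

A shorter interval trades throughput for responsiveness, since every forced switch has a cost.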
2) GIL overhead and efficiency in contended situations. Apparently, some
OSes (OS X mainly) have problems with lock performance when the lock is
already taken: the system calls are heavy. This is the "Dave Beazley
effect", where he took a very trivial loop, therefore made of very short
opcodes and therefore releasing the GIL very often (probably 100000
times a second), and ran it in one or two threads on an OS with poor
lock performance (OS X). He saw a 50% increase in runtime when using
two threads rather than one, in what is admittedly a pathological case.
Even on better platforms such as Linux, eliminating the overhead of many
GIL acquires and releases (since the new GIL is released on a fixed time
basis rather than on an opcode counting basis) yields slightly better
performance (read: a smaller performance degradation :-)) when there are
several pure Python computation threads running.
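The kind of test described in point 2 can be sketched as follows. This is loosely modelled on the description above (a trivial pure Python loop run once per "thread", first sequentially and then in real threads); the function names and the loop size are assumptions chosen for illustration, not the original benchmark.

```python
import threading
import time


def countdown(n):
    # A trivial loop made of very short opcodes.
    while n > 0:
        n -= 1


def time_sequential(n, runs=2):
    """Run the loop `runs` times in a row in the calling thread."""
    start = time.perf_counter()
    for _ in range(runs):
        countdown(n)
    return time.perf_counter() - start


def time_threaded(n, runs=2):
    """Run the loop in `runs` threads competing for the GIL."""
    threads = [threading.Thread(target=countdown, args=(n,)) for _ in range(runs)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    n = 1000000
    print("sequential: %.3f s" % time_sequential(n))
    print("threaded:   %.3f s" % time_threaded(n))
```

With a GIL, the threaded figure cannot be better than the sequential one for this workload; the interesting quantity is how much worse it gets.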
3) Thread switching latency. The traditional scheme merely releases the
GIL for a couple of CPU cycles, and reacquires it immediately.
Unfortunately, this doesn't mean the OS will automatically switch to
another, GIL-awaiting thread. In many situations, the same thread will
continue running. This, with the opcode counting scheme, is the reason
why some people have been complaining about latency problems when an I/O
thread competes with a computational thread (the I/O thread wouldn't be
scheduled right away when e.g. a packet arrives; or rather, it would be
scheduled by the OS, but unscheduled immediately when trying to acquire
the GIL, and it would be scheduled again only much later).
The new GIL improves on this by combining two mechanisms:
- forced thread switching, which means that when the switching interval
is terminated (mentioned in 1) and the GIL is released, we will force
any of the threads waiting on the GIL to be scheduled instead of the
formerly GIL-holding thread. Which thread exactly is an OS decision,
however: the goal here is not to have our own scheduler (this could be
discussed but I wanted the design to remain simple :-) After all,
man-years of work have been invested in scheduling algorithms by kernel
programming teams).
- priority requests, which is an option for a thread requesting the GIL
to be scheduled as soon as possible, and forcibly (rather than any other
threads). This is meant to be used by GIL-releasing methods such as
read() on files and sockets. The scheme, again, is very simple: when a
priority request is done by a thread, the GIL is released as soon as
possible by the thread holding it (including in the eval loop), and then
the thread making the priority request is forcibly scheduled (by making
all other GIL-awaiting threads wait in the meantime).
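The time-based release and the forced switch described above can be illustrated with a toy lock written in Python. This is only a sketch of the idea, not the C code from the branch: the class ToyGIL, its method names and the worker function are hypothetical, and priority requests are left out entirely.

```python
import threading


class ToyGIL:
    """Toy lock with a time-based drop request and forced switching."""

    def __init__(self, interval=0.005):
        self.interval = interval          # switching interval, in seconds
        self.drop_request = False         # set by a waiter whose interval expired
        self.switch_count = 0             # number of times the lock changed hands
        self._locked = False
        self._cond = threading.Condition()

    def acquire(self):
        with self._cond:
            while self._locked:
                # Wait at most one interval, then ask the holder to let go.
                signalled = self._cond.wait(self.interval)
                if not signalled and self._locked:
                    self.drop_request = True
            self._locked = True
            self.drop_request = False
            self.switch_count += 1
            self._cond.notify_all()       # wake a former holder waiting for the switch

    def release(self):
        with self._cond:
            self._locked = False
            self._cond.notify_all()

    def yield_if_requested(self):
        """Called by the holder at regular points (the 'eval loop' check)."""
        if not self.drop_request:
            return
        with self._cond:
            seen = self.switch_count
            self._locked = False
            self._cond.notify_all()
            # Forced switch: do not compete for the lock again until
            # some other thread has actually taken it.
            while self.switch_count == seen:
                self._cond.wait()
        self.acquire()


def worker(gil, counter, n_iterations):
    gil.acquire()
    try:
        for _ in range(n_iterations):
            counter[0] += 1               # work done while holding the toy lock
            gil.yield_if_requested()
    finally:
        gil.release()


def run_workers(n_threads=2, n_iterations=20000):
    gil = ToyGIL(interval=0.001)
    counter = [0]
    threads = [threading.Thread(target=worker, args=(gil, counter, n_iterations))
               for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter[0], gil.switch_count


if __name__ == "__main__":
    total, switches = run_workers()
    print("total=%d, lock handovers=%d" % (total, switches))
```

As in the description, which waiter gets the lock after a drop is left to the OS; the sketch only guarantees that the former holder does not take it back immediately.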
Implementation
NB : this is a branch of py3k. There should be no real difficulty
porting it back to trunk, provided someone wants to do the job.
Platforms
beginning of Python/ceval_gil.h.
The reason I couldn't use the existing thread support
(Python/thread_*.h) is that these abstractions are too poor. Mainly,
they don't provide:
- events, conditions or an equivalent thereof
- the ability to acquire a resource with a timeout
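For readers unfamiliar with the two missing features, here is what they look like at the Python level, using the standard threading module (the timeout argument to Lock.acquire() exists in Python 3.2 and later). This only illustrates the concepts; the patch itself needs the C-level equivalents on each platform.

```python
import threading

# Acquiring a resource with a timeout: the lock is already held,
# so the second acquire gives up after 10 ms instead of blocking forever.
lock = threading.Lock()
lock.acquire()
got_lock = lock.acquire(timeout=0.01)
lock.release()

# Waiting on an event with a timeout: nobody sets the event,
# so wait() returns False once the timeout expires.
event = threading.Event()
was_signalled = event.wait(timeout=0.01)

print("got_lock=%s, was_signalled=%s" % (got_lock, was_signalled))
```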
Measurements
GIL (regular expression matching)
- C: one mostly C workload where the implementation does release the GIL
(bz2 compression)
In the ccbench directory you will find benchmark results, under Linux,
for two different systems I have here. The new GIL shows roughly similar
but slightly better throughput results than the old one. And it is much
better in the latency tests, especially in workload B (going down from
almost a second of average latency with the old GIL, to a couple of
milliseconds with the new GIL). This is the combined result of using a
time-based scheme (rather than opcode-based) and of forced thread
switching (rather than relying on the OS to actually switch threads when
we speculatively release the GIL).
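A rough way to observe this kind of latency is to let one thread sleep for a short time while other threads keep the CPU busy with pure Python code, and to measure how late the sleeping thread wakes up. The sketch below does that; it is not how ccbench measures latency, and the function names and parameters are assumptions made for illustration.

```python
import statistics
import threading
import time


def spin(stop_event):
    # Pure Python busy loop that holds the GIL most of the time.
    while not stop_event.is_set():
        total = 0
        for i in range(1000):
            total += i


def measure_wakeup_latency(n_cpu_threads, samples=20, nap=0.002):
    """Return (mean, std dev) of the extra wake-up delay, in seconds."""
    stop_event = threading.Event()
    threads = [threading.Thread(target=spin, args=(stop_event,))
               for _ in range(n_cpu_threads)]
    for t in threads:
        t.start()
    delays = []
    try:
        for _ in range(samples):
            start = time.perf_counter()
            time.sleep(nap)               # releases the GIL, then must get it back
            delays.append(time.perf_counter() - start - nap)
    finally:
        stop_event.set()
        for t in threads:
            t.join()
    return statistics.mean(delays), statistics.pstdev(delays)


if __name__ == "__main__":
    for n in range(3):
        mean, dev = measure_wakeup_latency(n)
        print("CPU threads=%d: %.1f ms. (std dev: %.1f ms.)"
              % (n, mean * 1000, dev * 1000))
```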
As a sidenote, I might mention that single-threaded performance is not
degraded at all. It is, actually, theoretically a bit better because the
old ticker check in the eval loop becomes simpler; however, this goes
mostly unnoticed.
Now what remains to be done?
Having other people test it would be fine. Even better if you have an
actual multi-threaded py3k application. But ccbench results for other
OSes would be nice too :-)
(I get good results under the Windows XP VM but I feel that a VM is not
an ideal setup for a concurrency benchmark)
Of course, studying and reviewing the code is welcome. As for
integrating it into the mainline py3k branch, I guess we have to answer
these questions:
- is the approach interesting? (we could decide that it's just not worth
it, and that a good GIL can only be a dead (removed) GIL)
- is the patch good, mature and debugged enough?
- how do we deal with the unsupported platforms (POSIX and Windows
support should cover most bases, but the fate of OS/2 support depends on
Andrew)?
Regards
Antoine.
Re: Python-Dev - Reworking the GIL by Brett Cannon on
2009-10-25T22:00:06+00:00
>
> Having other people test it would be fine. Even better if you have an
> actual multi-threaded py3k application. But ccbench results for other
> OSes would be nice too :-)
> (I get good results under the Windows XP VM but I feel that a VM is not
> an ideal setup for a concurrency benchmark)
>
> Of course, studying and reviewing the code is welcome. As for
> integrating it into the mainline py3k branch, I guess we have to answer
> these questions:
> - is the approach interesting? (we could decide that it's just not worth
> it, and that a good GIL can only be a dead (removed) GIL)
>
I think it's worth it. Removal of the GIL is a totally open-ended problem
with no solution in sight. This, on the other hand, is a performance benefit
now. I say move forward with this. If it happens to be short-lived because
someone actually figures out how to remove the GIL then great, but is that
really going to happen between now and Python 3.2? I doubt it.
> - is the patch good, mature and debugged enough?
> - how do we deal with the unsupported platforms (POSIX and Windows
> support should cover most bases, but the fate of OS/2 support depends on
> Andrew)?
>
>
It's up to Andrew to get the support in. While I have faith he will, this is
why we have been scaling back the support for alternative OSs for a while
and will continue to do so. I suspect the day Andrew stops keeping up will
be the day we push to have OS/2 be externally maintained.
-Brett
Re: Python-Dev - Reworking the GIL by Terry Reedy on
2009-10-26T05:09:05+00:00
Antoine Pitrou wrote:
> Hello there,
>
> The last couple of days I've been working on an experimental rewrite of
> the GIL. Since the work has been turning out rather successful (or, at
> least, not totally useless and crashing!) I thought I'd announce it
> here.
I am curious as to whether the entire mechanism is or can be turned off
when not needed
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/bull%40pubbs.net
Re: Python-Dev - Reworking the GIL by Antoine Pitrou on
2009-10-26T10:19:36+00:00
Terry Reedy <tjreedy <at> udel.edu> writes:
>
> I am curious as to whether the entire mechanism is or can be turned off
> when not needed
Note that "no thread is waiting on the GIL" can mean one of two things:
- either there is only one Python thread
- or the other Python threads are doing things with the GIL released (zlib/bz2
compression, waiting on I/O, sleep()ing, etc.)
So, yes, it automatically "turns itself off".
Regards
Antoine.
Re: Python-Dev - Reworking the GIL by Andrew MacIntyre on
2009-10-26T12:47:49+00:00
Brett Cannon wrote:
> It's up to Andrew to get the support in. While I have faith he will,
> this is why we have been scaling back the support for alternative OSs
> for a while and will continue to do so. I suspect the day Andrew stops
> keeping up will be the day we push to have OS/2 be externally maintained.
Notwithstanding my desire to keep OS/2 supported in the Python tree,
keeping up has been more difficult of late:
- OS/2 is unquestionably a "legacy" environment, with system APIs
different in flavour and semantics from the current mainstream (though
surprisingly capable in many ways despite its age).
- The EMX runtime my OS/2 port currently relies on to abstract the
system API to a Posix-ish API is itself a legacy package, essentially
unmaintained for some years :-( This has been a source of increasing
pain as Python has moved with the mainstream... with regard to Unicode
support and threads in conjunction with multi-processing, in particular.
Real Life hasn't been favourably disposed either...
I have refrained from applying the extensive patches required to make
the port feature complete for 2.6 and later while I investigate an
alternate Posix emulating runtime (derived from FreeBSD's C library,
and which is used by Mozilla on OS/2), which would allow me to dispense
with most of these patches. But it has an issue or two of its own...
The cost in effort has been compounded by effectively having to try and
maintain two ports - 2.x and 3.x. And the 3.x port has suffered more
as its demands are higher.
So while I asked to keep the OS/2 thread support alive, if a decision
were to be taken to remove OS/2 support from the Python 3.x sources I
could live with that. A completed migration to Mercurial might well
make future port maintenance easier for me.
Regards,
Andrew.
Re: Python-Dev - Reworking the GIL by Sturla Molden on
2009-10-26T13:45:13+00:00
Antoine Pitrou skrev:
> - priority requests, which is an option for a thread requesting the GIL
> to be scheduled as soon as possible, and forcibly (rather than any other
> threads).
So Python threads become preemptive rather than cooperative? That would
be great. :-)
time.sleep should generate a priority request to re-acquire the GIL; and
so should all other blocking standard library functions with a time-out.
S.M.
Re: Python-Dev - Reworking the GIL by Kristján Valur Jónsson on
2009-10-26T14:09:55+00:00
Re: Python-Dev - Reworking the GIL by ssteinerX@gmail.com on
2009-10-26T14:26:52+00:00
On Oct 26, 2009, at 10:09 AM, Kristján Valur Jónsson wrote:
>
> I don't agree. You have to be very careful with priority.
> time.sleep() does not promise to wake up in any timely manner, and
> neither do the timeout functions. Rather, the timeout is a way to
> prevent infinite wait.
>
> In my experience (from stackless python) using priority wakeup for
> IO can result in very erratic scheduling when there is much IO going
> on, every IO trumping another. You should stick to round robin
> except for very special and carefully analysed cases.
All the IO tasks can also go in their own round robin so that CPU time
is correctly shared among all waiting IO tasks.
IOW, to make sure that all IO tasks get a fair share *in relation to
all other IO tasks*.
Tasks can be put into the IO round robin when they "pull the IO alarm"
so to speak, so there's no need to decide before-hand which task goes
in which round robin pool.
I'm not familiar with this particular code in Python, but I've used
this in other systems for years to make sure that IO tasks don't
starve the rest of the system and that the most "insistent" IO task
doesn't starve all the others.
S
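One possible reading of the two-pool idea in this message, as a small sketch: tasks normally rotate in an ordinary round robin, and a task whose I/O completes is moved to a separate I/O round robin that is served first. The class and method names are hypothetical, and this is not code from the GIL branch or from the systems the author mentions.

```python
from collections import deque


class TwoPoolScheduler:
    """Toy scheduler with an ordinary pool and an I/O pool."""

    def __init__(self, tasks):
        self.cpu_pool = deque(tasks)   # ordinary round robin
        self.io_pool = deque()         # tasks that "pulled the IO alarm"

    def io_alarm(self, task):
        """Move a task to the back of the I/O pool when its I/O completes."""
        if task in self.cpu_pool:
            self.cpu_pool.remove(task)
            self.io_pool.append(task)

    def next_task(self):
        """Serve the I/O pool first, in arrival order, then the ordinary pool."""
        if self.io_pool:
            task = self.io_pool.popleft()
        else:
            task = self.cpu_pool.popleft()
        # After its turn, a task goes to the back of the ordinary pool.
        self.cpu_pool.append(task)
        return task


scheduler = TwoPoolScheduler(["a", "b", "c"])
scheduler.io_alarm("c")
scheduler.io_alarm("b")
order = [scheduler.next_task() for _ in range(4)]
print(order)
```

Because I/O tasks queue behind one another in their own pool, the most insistent one cannot jump ahead of the others. A real design would also need a rule that keeps the I/O pool from starving the ordinary pool.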
Re: Python-Dev - Reworking the GIL by Sturla Molden on
2009-10-26T15:47:27+00:00
Antoine Pitrou skrev:
> - priority requests, which is an option for a thread requesting the GIL
> to be scheduled as soon as possible, and forcibly (rather than any other
> threads). T
Should a priority request for the GIL take a priority number?
- If two threads make priority requests for the GIL, the one with the
higher priority should get the GIL first.
- If a thread with a low priority makes a priority request for the GIL,
it should not be allowed to "preempt" (take the GIL away from) a
higher-priority thread, in which case the priority request would be
ignored.
Related issue: Should Python threads have priorities? They are after all
real OS threads.
S.M.
Re: Python-Dev - Reworking the GIL by Antoine Pitrou on
2009-10-26T15:59:19+00:00
Sturla Molden <sturla <at> molden.no> writes:
>
> Antoine Pitrou skrev:
> > - priority requests, which is an option for a thread requesting the GIL
> > to be scheduled as soon as possible, and forcibly (rather than any other
> > threads). T
> Should a priority request for the GIL take a priority number?
Er, I prefer to keep things simple. If you have lots of I/O you should probably
use an event loop rather than separate threads.
> Related issue: Should Python threads have priorities? They are after all
> real OS threads.
Well, precisely they are OS threads, and the OS already assigns them (static or
dynamic) priorities. No need to replicate this.
(to answer another notion expressed in another message, there's no "round-robin"
scheduling either)
Regards
Antoine.
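The event-loop suggestion made earlier in this message can be sketched with the standard selectors module (available in Python 3.4 and later): a single thread waits on many sockets at once and services whichever ones are ready. The helper name and the use of socket pairs are assumptions made to keep the example self-contained.

```python
import selectors
import socket


def echo_upper_once(connections):
    """Serve each connection once from a single thread: read, reply in upper case."""
    selector = selectors.DefaultSelector()
    for conn in connections:
        conn.setblocking(False)
        selector.register(conn, selectors.EVENT_READ)
    pending = len(connections)
    while pending:
        ready = selector.select(timeout=1.0)
        if not ready:
            break                          # nothing arrived in time; give up
        for key, _events in ready:
            conn = key.fileobj
            data = conn.recv(1024)
            conn.sendall(data.upper())
            selector.unregister(conn)
            pending -= 1
    selector.close()


# Three connected socket pairs stand in for three network clients.
pairs = [socket.socketpair() for _ in range(3)]
for index, (_ours, theirs) in enumerate(pairs):
    theirs.sendall(b"ping %d" % index)

echo_upper_once([ours for ours, _theirs in pairs])
replies = [theirs.recv(1024) for _ours, theirs in pairs]

for ours, theirs in pairs:
    ours.close()
    theirs.close()

print(replies)
```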
Re: Python-Dev - Reworking the GIL by Daniel Stutzbach on
2009-10-26T16:19:09+00:00
> use an event loop rather than separate threads.
>
On Windows, using a single-threaded event loop is sometimes
impossible. WaitForMultipleObjects(), which is the Windows equivalent to
select() or poll(), can handle a maximum of only 64 objects.
Do we really need priority requests at all? They seem counter to your
desire for simplicity and allowing the operating system's scheduler to do
its work.
That said, if a thread's time budget is merely paused during I/O rather than
reset, then a thread making frequent (but short) I/O requests cannot starve
the system.
Re: Python-Dev - Reworking the GIL by Antoine Pitrou on
2009-10-26T18:11:18+00:00
Daniel Stutzbach <daniel <at> stutzbachenterprises.com> writes:
>
> Do we really need priority requests at all? They seem counter to your
> desire for simplicity and allowing the operating system's scheduler to do
> its work.
No, they can be disabled (removed) if we prefer. With priority requests
disabled, latency results become less excellent but still quite good.
Running ccbench on a dual core machine gives the following latency results,
first with, then without priority requests.
--- Latency --- (with prio requests)
Background CPU task: Pi calculation (Python)
CPU threads=0: 0 ms. (std dev: 0 ms.)
CPU threads=1: 0 ms. (std dev: 2 ms.)
CPU threads=2: 0 ms. (std dev: 2 ms.)
CPU threads=3: 0 ms. (std dev: 2 ms.)
CPU threads=4: 0 ms. (std dev: 2 ms.)
Background CPU task: regular expression (C)
CPU threads=0: 0 ms. (std dev: 0 ms.)
CPU threads=1: 3 ms. (std dev: 2 ms.)
CPU threads=2: 3 ms. (std dev: 2 ms.)
CPU threads=3: 3 ms. (std dev: 2 ms.)
CPU threads=4: 4 ms. (std dev: 3 ms.)
Background CPU task: bz2 compression (C)
CPU threads=0: 0 ms. (std dev: 2 ms.)
CPU threads=1: 0 ms. (std dev: 2 ms.)
CPU threads=2: 0 ms. (std dev: 0 ms.)
CPU threads=3: 0 ms. (std dev: 2 ms.)
CPU threads=4: 0 ms. (std dev: 1 ms.)
--- Latency --- (without prio requests)
Background CPU task: Pi calculation (Python)
CPU threads=0: 0 ms. (std dev: 2 ms.)
CPU threads=1: 5 ms. (std dev: 0 ms.)
CPU threads=2: 3 ms. (std dev: 3 ms.)
CPU threads=3: 9 ms. (std dev: 7 ms.)
CPU threads=4: 22 ms. (std dev: 23 ms.)
Background CPU task: regular expression (C)
CPU threads=0: 0 ms. (std dev: 1 ms.)
CPU threads=1: 8 ms. (std dev: 2 ms.)
CPU threads=2: 5 ms. (std dev: 4 ms.)
CPU threads=3: 21 ms. (std dev: 32 ms.)
CPU threads=4: 19 ms. (std dev: 26 ms.)
Background CPU task: bz2 compression (C)
CPU threads=0: 0 ms. (std dev: 1 ms.)
CPU threads=1: 0 ms. (std dev: 2 ms.)
CPU threads=2: 0 ms. (std dev: 0 ms.)
CPU threads=3: 0 ms. (std dev: 0 ms.)
CPU threads=4: 0 ms. (std dev: 0 ms.)
Re: Python-Dev - Reworking the GIL by Collin Winter on
2009-10-26T20:09:39+00:00
On Sun, Oct 25, 2009 at 1:22 PM, Antoine Pitrou <solipsis@pitrou.net> wrote:
> Having other people test it would be fine. Even better if you have an
> actual multi-threaded py3k application. But ccbench results for other
> OSes would be nice too :-)
My results for a 2.4 GHz Intel Core 2 Duo MacBook Pro (OS X 10.5.8):
Control (py3k @ r75723)
threads=1: 281 iterations/s.
threads=2: 282 ( 100 %)
threads=3: 282 ( 100 %)
threads=4: 282 ( 100 %)
bz2 compression (C)
threads=1: 379 iterations/s.
threads=2: 735 ( 193 %)
threads=3: 733 ( 193 %)
threads=4: 724 ( 190 %)
CPU threads=0: 0 ms. (std dev: 0 ms.)
CPU threads=1: 975 ms. (std dev: 577 ms.)
CPU threads=2: 1035 ms. (std dev: 571 ms.)
CPU threads=3: 1098 ms. (std dev: 556 ms.)
CPU threads=4: 1195 ms. (std dev: 557 ms.)
Background CPU task: bz2 compression (C)
CPU threads=0: 0 ms. (std dev: 0 ms.)
CPU threads=1: 0 ms. (std dev: 2 ms.)
CPU threads=2: 4 ms. (std dev: 5 ms.)
CPU threads=3: 0 ms. (std dev: 0 ms.)
CPU threads=4: 1 ms. (std dev: 4 ms.)
Experiment (newgil branch @ r75723)
threads=1: 298 iterations/s.
threads=2: 296 ( 99 %)
threads=3: 288 ( 96 %)
threads=4: 287 ( 96 %)
bz2 compression (C)
threads=1: 378 iterations/s.
threads=2: 720 ( 190 %)
threads=3: 724 ( 191 %)
threads=4: 718 ( 189 %)
CPU threads=0: 0 ms. (std dev: 0 ms.)
CPU threads=1: 1 ms. (std dev: 0 ms.)
CPU threads=2: 2 ms. (std dev: 1 ms.)
CPU threads=3: 2 ms. (std dev: 2 ms.)
CPU threads=4: 2 ms. (std dev: 1 ms.)
Background CPU task: bz2 compression (C)
CPU threads=0: 0 ms. (std dev: 0 ms.)
CPU threads=1: 0 ms. (std dev: 0 ms.)
CPU threads=2: 2 ms. (std dev: 3 ms.)
CPU threads=3: 0 ms. (std dev: 1 ms.)
CPU threads=4: 0 ms. (std dev: 0 ms.)
I also ran this through Unladen Swallow's threading microbenchmark,
which is a straight copy of what David Beazley was experimenting with
(simply iterating over 1000000 ints in pure Python) [1].
"iterative-count" is doing the loops one after the other,
"threaded-count" is doing the loops in parallel using threads.
The results below are benchmarking py3k as the control, newgil as the
experiment. When it says "x% faster", that is a measure of newgil's
performance over py3k's.
With two threads:
iterative-count:
Min: 0.336573 -> 0.387782: 13.21% slower # I've run this
configuration multiple times and gotten the same slowdown.
Avg: 0.338473 -> 0.418559: 19.13% slower
Significant (t=-38.434785, a=0.95)
threaded-count:
Min: 0.529859 -> 0.397134: 33.42% faster
Avg: 0.581786 -> 0.429933: 35.32% faster
Significant (t=70.100445, a=0.95)
With four threads:
iterative-count:
Min: 0.766617 -> 0.734354: 4.39% faster
Avg: 0.771954 -> 0.751374: 2.74% faster
Significant (t=22.164103, a=0.95)
Stddev: 0.00262 -> 0.00891: 70.53% larger
threaded-count:
Min: 1.175750 -> 0.829181: 41.80% faster
Avg: 1.224157 -> 0.867506: 41.11% faster
Significant (t=161.715477, a=0.95)
Stddev: 0.01900 -> 0.01120: 69.65% smaller
With eight threads:
iterative-count:
Min: 1.527794 -> 1.447421: 5.55% faster
Avg: 1.536911 -> 1.479940: 3.85% faster
Significant (t=35.559595, a=0.95)
Stddev: 0.00394 -> 0.01553: 74.61% larger
threaded-count:
Min: 2.424553 -> 1.677180: 44.56% faster
Avg: 2.484922 -> 1.723093: 44.21% faster
Significant (t=184.766131, a=0.95)
Stddev: 0.02874 -> 0.02956: 2.78% larger
I'd be interested in multithreaded benchmarks with less-homogenous workloads.
Collin Winter
[1] - http://code.google.com/p/unladen-swallow/source/browse/tests/performance/bm-threading.py
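The "faster" and "slower" percentages in the results above are consistent with the arithmetic below. This is inferred from the quoted numbers, not taken from the benchmark script, and the function name is hypothetical.

```python
def describe_change(old_seconds, new_seconds):
    """Format a timing change the way the quoted Min/Avg lines appear to."""
    if new_seconds < old_seconds:
        # Speed-up: how much more work per unit of time the new build does.
        percent = (old_seconds / new_seconds - 1) * 100
        return "%.2f%% faster" % percent
    # Slowdown: how much less work per unit of time the new build does.
    percent = (1 - old_seconds / new_seconds) * 100
    return "%.2f%% slower" % percent


# Two-thread "threaded-count" minimum from the results above.
print(describe_change(0.529859, 0.397134))
# Two-thread "iterative-count" minimum from the results above.
print(describe_change(0.336573, 0.387782))
```

Note that the two directions are not symmetric: a run that takes twice as long is "50% slower" by this measure, while a run that takes half as long is "100% faster".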
Re: Python-Dev - Reworking the GIL by Antoine Pitrou on
2009-10-26T21:45:25+00:00
Collin Winter <collinw <at> gmail.com> writes:
>
> My results for a 2.4 GHz Intel Core 2 Duo MacBook Pro (OS X 10.5.8):
Thanks!
[the Dave Beazley benchmark]
> The results below are benchmarking py3k as the control, newgil as the
> experiment. When it says "x% faster", that is a measure of newgil's
> performance over py3k's.
>
> With two threads:
>
> iterative-count:
> Min: 0.336573 -> 0.387782: 13.21% slower # I've run this
> configuration multiple times and gotten the same slowdown.
> Avg: 0.338473 -> 0.418559: 19.13% slower
Those numbers are not very in line with the other "iterative-count" results.
Since iterative-count just runs the loop N times in a row, results should be
proportional to the number N ("number of threads").
Besides, there's no reason for single-threaded performance to be degraded since
the fast path of the eval loop actually got a bit streamlined (there is no
volatile ticker to decrement).
> I'd be interested in multithreaded benchmarks with less-homogenous workloads.
So would I.
Regards
Antoine.
Re: Python-Dev - Reworking the GIL by Collin Winter on
2009-10-26T21:50:23+00:00
On Mon, Oct 26, 2009 at 2:43 PM, Antoine Pitrou <solipsis@pitrou.net> wrote:
> Collin Winter <collinw <at> gmail.com> writes:
> [the Dave Beazley benchmark]
>> The results below are benchmarking py3k as the control, newgil as the
>> experiment. When it says "x% faster", that is a measure of newgil's
>> performance over py3k's.
>>
>> With two threads:
>>
>> iterative-count:
>> Min: 0.336573 -> 0.387782: 13.21% slower  # I've run this
>> configuration multiple times and gotten the same slowdown.
>> Avg: 0.338473 -> 0.418559: 19.13% slower
>
> Those numbers are not very in line with the other "iterative-count" results.
> Since iterative-count just runs the loop N times in a row, results should be
> proportional to the number N ("number of threads").
>
> Besides, there's no reason for single-threaded performance to be degraded since
> the fast path of the eval loop actually got a bit streamlined (there is no
> volatile ticker to decrement).
I agree those numbers are out of line with the others and make no
sense. I've run it with two threads several times and the results are
consistent on this machine. I'm digging into it a bit more.
Collin
Re: Python-Dev - Reworking the GIL by on
2009-10-26T22:46:58+00:00
On 04:18 pm, daniel@stutzbachenterprises.com wrote:
>On Mon, Oct 26, 2009 at 10:58 AM, Antoine Pitrou
><solipsis@pitrou.net>wrote:
>>Er, I prefer to keep things simple. If you have lots of I/O you should
>>probably
>>use an event loop rather than separate threads.
>
>On Windows, using a single-threaded event loop is sometimes
>impossible. WaitForMultipleObjects(), which is the Windows equivalent
>to
>select() or poll(), can handle a maximum of only 64 objects.
This is only partially accurate. For one thing, WaitForMultipleObjects
calls are nestable. For another thing, Windows also has I/O completion
ports which are not limited to 64 event sources. The situation is
actually better than on a lot of POSIXes.
>Do we really need priority requests at all? They seem counter to your
>desire for simplicity and allowing the operating system's scheduler to
>do
>its work.
Despite what I said above, however, I would also take a default position
against adding any kind of more advanced scheduling system here. It
would, perhaps, make sense to expose the APIs for controlling the
platform scheduler, though.
Jean-Paul
Re: Python-Dev - Reworking the GIL by Antoine Pitrou on
2009-10-28T11:42:25+00:00
Kristján Valur Jónsson <kristjan <at> ccpgames.com> writes:
>
> In my experience (from stackless python) using priority wakeup for IO can
> result in very erratic scheduling when there is much IO going on, every IO
> trumping another.
I whipped up a trivial multithreaded HTTP server using
socketserver.ThreadingMixIn and wsgiref, and used apachebench against it with a
reasonable concurrency level (10 requests at once). Enabling/disabling priority
requests doesn't seem to make a difference.
Regards
Antoine.
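A server of the kind described in this message can be put together in a few lines. The sketch below is an assumption about what such a test server might look like, not the code that was actually benchmarked; the class names, the application and the helper function are made up for the example.

```python
import http.client
import socketserver
import threading
from wsgiref.simple_server import WSGIRequestHandler, WSGIServer, make_server


class ThreadingWSGIServer(socketserver.ThreadingMixIn, WSGIServer):
    """WSGI server that handles each request in its own thread."""
    daemon_threads = True


class QuietHandler(WSGIRequestHandler):
    def log_message(self, format, *args):
        pass  # no per-request log lines while benchmarking


def hello_app(environ, start_response):
    body = b"hello\n"
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]


def serve_and_fetch():
    """Start the server on a free local port, fetch one page, shut it down."""
    server = make_server("127.0.0.1", 0, hello_app,
                         server_class=ThreadingWSGIServer,
                         handler_class=QuietHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", server.server_port, timeout=5)
        conn.request("GET", "/")
        response = conn.getresponse()
        result = (response.status, response.read())
        conn.close()
        return result
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(serve_and_fetch())
```

For a load test, the server would be left running on a fixed port and driven by an external tool such as apachebench instead of the single built-in request.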
Re: Python-Dev - Reworking the GIL by Antoine Pitrou on
2009-11-01T10:34:59+00:00
Hello again,
Brett Cannon <brett <at> python.org> writes:
>
> I think it's worth it. Removal of the GIL is a totally open-ended problem
> with no solution in sight. This, on the other hand, is a performance benefit
> now. I say move forward with this. If it happens to be short-lived because
> someone actually figures out how to remove the GIL then great, but is that
> really going to happen between now and Python 3.2? I doubt it.
Based on this whole discussion, I think I am going to merge the new GIL work
into the py3k branch, with priority requests disabled.
If you think this is premature or uncalled for, or if you just want to review
the changes before making a judgement, please voice up :)
Regards
Antoine.
Re: Python-Dev - Reworking the GIL by Brett Cannon on
2009-11-01T20:20:05+00:00
On Sun, Nov 1, 2009 at 03:33, Antoine Pitrou <solipsis@pitrou.net> wrote:
>
> Hello again,
>
> Brett Cannon <brett <at> python.org> writes:
>>
>> I think it's worth it. Removal of the GIL is a totally open-ended problem
>> with no solution in sight. This, on the other hand, is a performance benefit
>> now. I say move forward with this. If it happens to be short-lived because
>> someone actually figures out how to remove the GIL then great, but is that
>> really going to happen between now and Python 3.2? I doubt it.
>
> Based on this whole discussion, I think I am going to merge the new GIL work
> into the py3k branch, with priority requests disabled.
This will be a nice Py3K carrot!
>
> If you think this is premature or uncalled for, or if you just want to review
> the changes before making a judgement, please voice up :)
I know I personally trust you to not mess it up, Antoine, but that
might also come from mental exhaustion and laziness. =)
-Brett
Re: Python-Dev - Reworking the GIL by Christian Heimes on
2009-11-01T21:15:22+00:00
Antoine Pitrou wrote:
> Based on this whole discussion, I think I am going to merge the new GIL work
> into the py3k branch, with priority requests disabled.
>
> If you think this is premature or uncalled for, or if you just want to review
> the changes before making a judgement, please voice up :)
+1 from me. I trust you like Brett does.
How much work would it cost to make your patch optional at compile time?
For what it's worth we could compare your work on different machines and
on different platforms before it gets enabled by default. Can you
imagine scenarios where your implementation might be slower than the
current GIL implementation?
Re: Python-Dev - Reworking the GIL by Antoine Pitrou on
2009-11-01T21:29:11+00:00
Christian Heimes <lists <at> cheimes.de> writes:
>
> +1 from me. I trust you like Brett does.
>
> How much work would it cost to make your patch optional at compile time?
Quite a bit, because it changes the logic for processing asynchronous pending
calls (signals) and asynchronous exceptions in the eval loop. The #defines would
get quite convoluted, I think; I'd prefer not to do that.
> For what it's worth we could compare your work on different machines and
> on different platforms before it gets enabled by default. Can you
> imagine scenarios where your implementation might be slower than the
> current GIL implementation?
I don't really think so. The GIL is taken and released much more predictably
than it was before. The thing that might be worth checking is a workload with
many threads (say 50 or 100). Does anyone have that?
Regards
Antoine.
Re: Python-Dev - Reworking the GIL by Christian Heimes on
2009-11-01T21:44:07+00:00
Antoine Pitrou wrote:
> Christian Heimes <lists <at> cheimes.de> writes:
>> +1 from me. I trust you like Brett does.
>>
>> How much work would it cost to make your patch optional at compile time?
>
> Quite a bit, because it changes the logic for processing asynchronous pending
> calls (signals) and asynchronous exceptions in the eval loop. The #defines would
> get quite convoluted, I think; I'd prefer not to do that.
Based on the new piece of information I totally agree.
> I don't really think so. The GIL is taken and released much more predictably
> than it was before. The thing that might be worth checking is a workload with
> many threads (say 50 or 100). Does anyone have that?
I don't have an application that works on Python 3 and uses that many
threads, sorry.
Christian
Re: Python-Dev - Reworking the GIL by Gregory P. Smith on
2009-11-02T05:39:47+00:00
On Sun, Nov 1, 2009 at 3:33 AM, Antoine Pitrou <solipsis@pitrou.net> wrote:
>
> Hello again,
>
> Brett Cannon <brett <at> python.org> writes:
>>
>> I think it's worth it. Removal of the GIL is a totally open-ended problem
>> with no solution in sight. This, on the other hand, is a performance benefit
>> now. I say move forward with this. If it happens to be short-lived because
>> someone actually figures out how to remove the GIL then great, but is that
>> really going to happen between now and Python 3.2? I doubt it.
>
> Based on this whole discussion, I think I am going to merge the new GIL work
> into the py3k branch, with priority requests disabled.
>
> If you think this is premature or uncalled for, or if you just want to review
> the changes before making a judgement, please voice up :)
+1 Good idea. That's the best way to make sure this work gets
anywhere. It can be iterated on from there if anyone has objections.
Re: Python-Dev - Reworking the GIL by "Martin v. Löwis" on
2009-11-02T08:08:59+00:00
> The new GIL does away with this by ditching _Py_Ticker entirely and
> instead using a fixed interval (by default 5 milliseconds, but settable)
> after which we ask the main thread to release the GIL and let another
> thread be scheduled.
I've looked at this part of the implementation, and have a few comments.
a) why is gil_interval a double? Make it an int, counting in
microseconds.
b) notice that, on Windows, minimum wait resolution may be as large as
15ms (e.g. on XP, depending on the hardware). Not sure what this
means for WaitForMultipleObjects; most likely, if you ask for a 5ms
wait, it waits until the next clock tick. It would be bad if, on
some systems, a wait of 5ms would mean that it immediately returns.
c) is the gil_drop_request handling thread-safe? If your answer is
"yes", you should add a comment as to what the assumptions are of
this code (ISTM that multiple threads may simultaneously attempt
to set the drop request, overlapping with the holding thread actually
dropping the GIL). There is also the question whether it is
thread-safe to write into a "volatile int"; I keep forgetting the
answer.
Regards,
Martin
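Points a) and b) can be illustrated together with a small sketch. It assumes the interval is stored as an integer number of microseconds and has to be handed to a wait call that takes milliseconds; the helper name and the round-up policy are hypothetical and are not taken from the patch under discussion.

```c
#include <assert.h>

/* Hypothetical helper: convert a switching interval stored as an integer
 * number of microseconds into a millisecond timeout, rounding up so that
 * a non-zero interval never turns into a zero-length wait. */
unsigned long interval_us_to_wait_ms(unsigned long interval_us)
{
    /* Keep the +999 below from overflowing a 32-bit unsigned long. */
    assert(interval_us <= 4000000000UL);
    return (interval_us + 999UL) / 1000UL;
}
```

With the 5 ms default, interval_us_to_wait_ms(5000) gives 5; an interval of 1 microsecond still gives a 1 ms wait rather than 0.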
Re: Python-Dev - Reworking the GIL by Sturla Molden on
2009-11-02T09:28:53+00:00
Martin v. Löwis wrote:
> b) notice that, on Windows, minimum wait resolution may be as large as
> 15ms (e.g. on XP, depending on the hardware). Not sure what this
> means for WaitForMultipleObjects; most likely, if you ask for a 5ms
> wait, it waits until the next clock tick. It would be bad if, on
> some systems, a wait of 5ms would mean that it immediately returns.
>
Which is why one should use multimedia timers with QPC on Windows.
To get a wait function with much better resolution than Windows'
default, do this:
1. Set a high resolution with timeBeginPeriod.
2. Loop using a time-out of 0 for WaitForMultipleObjects and put a
Sleep(0) in the loop so as not to burn the CPU. Call QPC to get a precise
timing, and break the loop when the requested time-out has been reached.
3. When you are done, call timeEndPeriod to turn the multimedia timer off.
This is how you create usleep() in Windows as well: just loop on QPC and
Sleep(0) after setting timeBeginPeriod(1).
Re: Python-Dev - Reworking the GIL by "Martin v. Löwis" on
2009-11-02T09:44:59+00:00
Sturla Molden wrote:
> Martin v. Löwis wrote:
>> b) notice that, on Windows, minimum wait resolution may be as large as
>> 15ms (e.g. on XP, depending on the hardware). Not sure what this
>> means for WaitForMultipleObjects; most likely, if you ask for a 5ms
>> wait, it waits until the next clock tick. It would be bad if, on
>> some systems, a wait of 5ms would mean that it immediately returns.
>>
> Which is why one should use multimedia timers with QPC on Windows.
Maybe you should study the code under discussion before making such
a proposal.
Regards,
Martin
Re: Python-Dev - Reworking the GIL by Antoine Pitrou on
2009-11-02T10:03:00+00:00
Martin v. Löwis <martin <at> v.loewis.de> writes:
>
> I've looked at this part of the implementation, and have a few comments.
> a) why is gil_interval a double? Make it an int, counting in
>    microseconds.
Ok, I'll do that.
> b) notice that, on Windows, minimum wait resolution may be as large as
>    15ms (e.g. on XP, depending on the hardware). Not sure what this
>    means for WaitForMultipleObjects; most likely, if you ask for a 5ms
>    wait, it waits until the next clock tick. It would be bad if, on
>    some systems, a wait of 5ms would mean that it immediately returns.
I'll let someone else give an authoritative answer. I did the Windows version
mainly by reading online MSDN docs, and testing it worked fine in an XP VM.
Anyway, here is what the online docs have to say :
« The system clock "ticks" at a constant rate. If the time-out interval is less
than the resolution of the system clock, the wait may time out in less than the
specified length of time. If the time-out interval is greater than one tick but
less than two, the wait can be anywhere between one and two ticks, and so on. »
So it seems that the timeout always happens on a Windows tick boundary, which
means that it can immediately return, but only very rarely so and never more
than once in a row...
> c) is the gil_drop_request handling thread-safe? If your answer is
>    "yes", you should add a comment as to what the assumptions are of
>    this code (ISTM that multiple threads may simultaneously attempt
>    to set the drop request, overlapping with the holding thread actually
>    dropping the GIL).
The gil_drop_request is volatile mainly out of precaution, but it's probably not
necessary. It is only written (set/cleared) when holding the gil_mutex; however,
it can be read while not holding that mutex. Exceptionally reading the "wrong"
value would not have any adverse consequences, it would just decrease the
switching quality at the said instant.
Regards
Antoine.
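The tick-boundary behaviour under discussion (a wait shorter than one clock tick may return early, a wait between one and two ticks lasts between one and two ticks) can be turned into a small model. This is only a sketch of that documented rule: the struct and function names are hypothetical, and what happens when the timeout is an exact multiple of the tick is an assumption of the model rather than something the documentation states.

```c
#include <assert.h>

/* Hypothetical result type: the shortest and longest time a wait may last. */
struct wait_bounds {
    unsigned min_ms;
    unsigned max_ms;
};

/* Simplified model: timeouts expire on tick boundaries, so a request of
 * timeout_ms lasts at least the whole ticks it contains and at most one
 * tick more than that. */
struct wait_bounds tick_wait_bounds(unsigned timeout_ms, unsigned tick_ms)
{
    struct wait_bounds b;
    unsigned whole_ticks;

    assert(tick_ms > 0);
    whole_ticks = timeout_ms / tick_ms;
    b.min_ms = whole_ticks * tick_ms;
    b.max_ms = (whole_ticks + 1) * tick_ms;
    return b;
}
```

For a 5 ms request on a 15 ms tick the model gives a wait between 0 and 15 ms, which is the "can immediately return" case; a 20 ms request gives between 15 and 30 ms.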
Re: Python-Dev - Reworking the GIL by "Martin v. Löwis" on
2009-11-02T10:45:36+00:00
>> c) is the gil_drop_request handling thread-safe? If your answer is
>> "yes", you should add a comment as to what the assumptions are of
>> this code (ISTM that multiple threads may simultaneously attempt
>> to set the drop request, overlapping with the holding thread actually
>> dropping the GIL).
>
> The gil_drop_request is volatile mainly out of precaution, but it's probably not
> necessary. It is only written (set/cleared) when holding the gil_mutex; however,
> it can be read while not holding that mutex. Exceptionally reading the "wrong"
> value would not have any adverse consequences, it would just decrease the
> switching quality at the said instant.
I think it then definitely needs to be volatile - otherwise, the
compiler may choose to cache it in a register (assuming enough registers
are available), instead of re-reading it from memory each time (which is
exactly what volatile then forces it to do).
Even if it is read from memory, I still wonder what might happen on
systems that require explicit memory barriers to synchronize across
CPUs. What if CPU 1 keeps reading a 0 value out of its cache, even
though CPU 2 has written a 1 value a long time ago?
IIUC, any (most?) pthread calls would cause synchronization in that
case, which is why applications also use locks for reading:
http://www.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap04.html#tag_04_10
Of course, on x86, you won't see any issues, because it's cache-coherent
anyway.
Regards,
Martin
Re: Python-Dev - Reworking the GIL by Antoine Pitrou on
2009-11-02T11:16:01+00:00
Martin v. Löwis <martin <at> v.loewis.de> writes:
>
[gil_drop_request]
> Even if it is read from memory, I still wonder what might happen on
> systems that require explicit memory barriers to synchronize across
> CPUs. What if CPU 1 keeps reading a 0 value out of its cache, even
> though CPU 1 has written an 1 value a long time ago?
>
> IIUC, any (most?) pthread calls would cause synchronization in that
> case, which is why applications that also use locks for reading:
>
> http://www.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap04.html#tag_04_10
>
> Of course, on x86, you won't see any issues, because it's cache-coherent
> anyway.
I think there are two things here:
- all machines Python runs on should AFAIK be cache-coherent: CPUs synchronize
their views of memory in a rather timely fashion.
- memory ordering: writes made by a CPU can be seen in a different order by
another CPU (i.e. CPU 1 writes A before B, but CPU 2 sees B written before A). I
don't see how this can apply to gil_drop_request, since it's a single variable
(and, moreover, only a single bit of it is significant).
(there's an explanation of memory ordering issues here:
http://www.linuxjournal.com/article/8211)
As a side note, I remember Jeffrey Yasskin trying to specify an ordering model
for Python code
(see http://code.google.com/p/unladen-swallow/wiki/MemoryModel).
Regards
Antoine.
Re: Python-Dev - Reworking the GIL by "Martin v. Löwis" on
2009-11-02T11:53:03+00:00
> - all machines Python runs on should AFAIK be cache-coherent: CPUs synchronize
> their views of memory in a rather timely fashion.
Ok. I thought that Itanium was an example where this assumption is
actually violated (as many web pages claim such a restriction), however,
it seems that on Itanium, caches are indeed synchronized using MESI.
So claims about a lack of cache consistency on Itanium, and the need for
barrier instructions, seem to be caused by the Itanium feature that
allows the processor to fetch memory out-of-order, i.e. an earlier read
may see a later memory state. This is apparently used to satisfy reads
as soon as the cache line is read (so that the cache line can be
discarded earlier). With respect to your requirement ("rather timely
fashion"), this still seems to be fine.
Still, this all needs to be documented in the code :-)
Regards,
Martin
Re: Python-Dev - Reworking the GIL by Sturla Molden on
2009-11-02T13:07:59+00:00
Martin v. Löwis wrote:
> Maybe you should study the code under discussion before making such
> a proposal.
>
I did, and it does nothing of what I suggested. I am sure I can make the
Windows GIL in ceval_gil.h and the mutex in thread_nt.h a lot more
precise and efficient.
This is the kind of code I was talking about, from ceval_gil.h:
    r = WaitForMultipleObjects(2, objects, TRUE, milliseconds);
I would turn on the multimedia timer (it is not on by default), and replace
this call with a loop, approximately like this:
    for (;;) {
        r = WaitForMultipleObjects(2, objects, TRUE, 0);
        /* blah blah blah */
        QueryPerformanceCounter(&cnt);
        if (cnt > timeout) break;
        Sleep(0);
    }
And the timeout "milliseconds" would now be computed from querying the
performance counter, instead of unreliably by the Windows NT kernel.
Sturla
Re: Python-Dev - Reworking the GIL by Antoine Pitrou on
2009-11-02T13:20:43+00:00
Sturla Molden <sturla <at> molden.no> writes:
>
> And the timeout "milliseconds" would now be computed from querying the
> performance
> counter, instead of unreliably by the Windows NT kernel.
Could you actually test your proposal under Windows and report what kind of
concrete benefits it brings?
Thank you
Antoine.
Re: Python-Dev - Reworking the GIL by Sturla Molden on
2009-11-02T13:23:59+00:00
Sturla Molden wrote:
>
> I would turn on multimedia timer (it is not on by default), and
> replace this
> call with a loop, approximately like this:
>
> for (;;) {
>     r = WaitForMultipleObjects(2, objects, TRUE, 0);
>     /* blah blah blah */
>     QueryPerformanceCounter(&cnt);
>     if (cnt > timeout) break;
>     Sleep(0);
> }
And just so you don't ask: There should not just be a Sleep(0) in the
loop, but a sleep that gets shorter and shorter until a lower threshold
is reached, where it skips to Sleep(0). That way we avoid hammering on
WaitForMultipleObjects and QueryPerformanceCounter more than we need.
And for all that to work better than just giving a timeout to
WaitForMultipleObjects, we need the multimedia timer turned on.
Sturla
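The shrinking-sleep idea can be sketched independently of the Windows API as a function that picks the next sleep length from the time left until the deadline. The halving policy, the 2 ms threshold, and the function name are assumptions made for illustration; the message does not specify any of them.

```c
#include <assert.h>

/* Hypothetical threshold below which the loop only yields its time slice
 * (the Sleep(0) case) instead of sleeping for a measurable time. */
#define YIELD_THRESHOLD_US 2000L

/* Return how long to sleep next, in microseconds, given the time left
 * until the deadline. A result of 0 means "just yield". */
long next_sleep_us(long remaining_us)
{
    assert(remaining_us >= 0);
    if (remaining_us < YIELD_THRESHOLD_US)
        return 0;                /* close to the deadline: yield only */
    return remaining_us / 2;     /* otherwise sleep half of what is left */
}
```

A caller would re-read its clock after each sleep, recompute the remaining time, and stop once the deadline has passed, so the sleeps get shorter as the deadline approaches.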
Re: Python-Dev - Reworking the GIL by Sturla Molden on
2009-11-02T13:42:00+00:00
Sturla Molden wrote:
>
> And just so you don't ask: There should not just be a Sleep(0) in the
> loop, but a sleep that gets shorter and shorter until a lower
> threshold is reached, where it skips to Sleep(0). That way we avoid
> hammering on WaitForMultipleObjects and QueryPerformanceCounter more
> than we need. And for all that to work better than just giving a
> timeout to WaitForMultipleObjects, we need the multimedia timer turned
> on.
The important thing about the multimedia timer is that the granularity of
Sleep() and WaitForMultipleObjects() by default is "10 ms or at most 20
ms". But if we call
    timeBeginPeriod(1);
the MM timer is on and granularity becomes 1 ms or at most 2 ms. But we
can get even more precise than that by hammering on Sleep(0) for
timeouts less than 2 ms. We can get typical granularity on the order of
10 µs, with the occasional 100 µs now and then. I know this because I
was using Windows 2000 to generate TTL signals on the LPT port some
years ago, and watched them on the oscilloscope.
~ 15 ms granularity is the Windows default. But that is brain dead.
By the way Antoine, if you think granularity of 1-2 ms is sufficient,
i.e. no need for µs precision, then just calling timeBeginPeriod(1) will
be sufficient.
Sturla
Re: Python-Dev - Reworking the GIL by Antoine Pitrou on
2009-11-02T13:54:04+00:00
Sturla Molden <sturla <at> molden.no> writes:
>
> By the way Antoine, if you think granularity of 1-2 ms is sufficient,
It certainly is.
But once again, I'm no Windows developer and I don't have a native Windows host
to test on; therefore someone else (you?) has to try.
Also, the MSDN doc (*) says timeBeginPeriod() can have a detrimental effect on
system performance; I don't know how much of it is true.
(*) http://msdn.microsoft.com/en-us/library/dd757624(VS.85).aspx
Re: Python-Dev - Reworking the GIL by Sturla Molden on
2009-11-02T14:09:04+00:00
Antoine Pitrou wrote:
> It certainly is.
> But once again, I'm no Windows developer and I don't have a native Windows host
> to test on; therefore someone else (you?) has to try.
>
I'd love to try, but I don't have VC++ to build Python, I use GCC on
Windows.
Anyway, the first thing to try then is to call
timeBeginPeriod(1);
once on startup, and leave the rest of the code as it is. If 2-4 ms is
sufficient we can use timeBeginPeriod(2), etc. Microsoft is claiming
Windows performs better with high granularity, which is why it is 10 ms
by default.
Sturla
Re: Python-Dev - Reworking the GIL by John Arbash Meinel on
2009-11-02T14:27:43+00:00
Sturla Molden wrote:
> Antoine Pitrou wrote:
>> It certainly is.
>> But once again, I'm no Windows developer and I don't have a native
>> Windows host
>> to test on; therefore someone else (you?) has to try.
>>
> I'd love to try, but I don't have VC++ to build Python, I use GCC on
> Windows.
>
> Anyway, the first thing to try then is to call
>
> timeBeginPeriod(1);
>
> once on startup, and leave the rest of the code as it is. If 2-4 ms is
> sufficient we can use timeBeginPeriod(2), etc. Microsoft is claiming
> Windows performs better with high granularity, which is why it is 10 ms
> by default.
>
>
> Sturla
That page claims:
Windows uses the lowest value (that is, highest resolution) requested
by any process.
I would posit that the chance of having some random process on your
machine request a high-speed timer is high enough that the overhead for
Python doing the same is probably low.
John
=:->
Re: Python-Dev - Reworking the GIL by Antoine Pitrou on
2009-11-02T14:34:31+00:00
Sturla Molden <sturla <at> molden.no> writes:
>
> I'd love to try, but I don't have VC++ to build Python, I use GCC on
> Windows.
You can use Visual Studio Express, which is free (gratis).
Re: Python-Dev - Reworking the GIL by "Martin v. Löwis" on
2009-11-02T16:27:47+00:00
> I did, and it does nothing of what I suggested. I am sure I can make the
> Windows GIL in ceval_gil.h and the mutex in thread_nt.h a lot more precise and
> efficient.
Hmm. I'm skeptical that your code makes it more accurate, and I
completely fail to see that it makes it more efficient (by what
measurement of efficiency?)
Also, why would making it more accurate make it better? IIUC,
accuracy is completely irrelevant here, though efficiency
(low overhead) does matter.
> This is the kind of code I was talking about, from ceval_gil.h:
>
> r = WaitForMultipleObjects(2, objects, TRUE, milliseconds);
>
> I would turn on multimedia timer (it is not on by default), and replace
> this
> call with a loop, approximately like this:
>
> for (;;) {
>     r = WaitForMultipleObjects(2, objects, TRUE, 0);
>     /* blah blah blah */
>     QueryPerformanceCounter(&cnt);
>     if (cnt > timeout) break;
>     Sleep(0);
> }
>
> And the timeout "milliseconds" would now be computed from querying the
> performance counter, instead of unreliably by the Windows NT kernel.
Hmm. This creates a busy wait loop; if you add larger sleep values,
then it loses accuracy.
Why not just call timeBeginPeriod, and then rely on the higher clock
rate for WaitForMultipleObjects?
Regards,
Martin
Re: Python-Dev - Reworking the GIL by Daniel Stutzbach on
2009-11-02T16:44:32+00:00
> Hmm. This creates a busy wait loop; if you add larger sleep values,
> then it loses accuracy.
>
I thought that at first, too, but then I checked the documentation for
Sleep(0):
"A value of zero causes the thread to relinquish the remainder of its time
slice to any other thread of equal priority that is ready to run."
(this is not to say that I think the solution with Sleep is worthwhile,
though...)
Re: Python-Dev - Reworking the GIL by Jeffrey Yasskin on
2009-11-02T16:56:19+00:00
On Mon, Nov 2, 2009 at 4:15 AM, Antoine Pitrou <solipsis@pitrou.net> wrote:
> Martin v. Löwis <martin <at> v.loewis.de> writes:
>>
> [gil_drop_request]
>> Even if it is read from memory, I still wonder what might happen on
>> systems that require explicit memory barriers to synchronize across
>> CPUs. What if CPU 1 keeps reading a 0 value out of its cache, even
>> though CPU 1 has written an 1 value a long time ago?
>>
>> IIUC, any (most?) pthread calls would cause synchronization in that
>> case, which is why applications that also use locks for reading:
>>
>> http://www.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap04.html#tag_04_10
>>
>> Of course, on x86, you won't see any issues, because it's cache-coherent
>> anyway.
>
> I think there are two things here:
> - all machines Python runs on should AFAIK be cache-coherent: CPUs synchronize
> their views of memory in a rather timely fashion.
> - memory ordering: writes made by a CPU can be seen in a different order by
> another CPU (i.e. CPU 1 writes A before B, but CPU 2 sees B written before A). I
> don't see how this can apply to gil_drop_request, since it's a single variable
> (and, moreover, only a single bit of it is significant).
>
> (there's an explanation of memory ordering issues here:
> http://www.linuxjournal.com/article/8211)
>
> As a side note, I remember Jeffrey Yasskin trying to specify an ordering model
> for Python code
> (see http://code.google.com/p/unladen-swallow/wiki/MemoryModel).
Note that that memory model was only for Python code; the C code
implementing it is subject to the C memory model, which is weaker (and
not even fully defined until the next C standard comes out).
To be really safe, we ought to have a couple primitives implementing
"atomic" rather than just "volatile" instructions, but until then a
signal that's just saying "do something" rather than "here's some data
you should look at" should be ok as a volatile int.
I'd like to look at the patch in detail, but I can't guarantee that
I'll get to it in a timely manner. I'd say check it in and let more
threading experts look at it in the tree. We've got some time before a
release for people to fix problems and make further improvements. +1
to Martin's request for detailed documentation though. :)
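On the volatile-versus-atomic question in this subthread: the C standard that followed this discussion (C11) added <stdatomic.h>, so a "do something" flag can be written with atomic primitives instead of a volatile int. The sketch below is not the code from the patch; the variable and function names are hypothetical, and the choice of relaxed ordering assumes the flag carries no data that other threads must see along with it.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for the drop-request flag discussed in the thread.
 * Objects with static storage duration start out zero-initialized. */
static atomic_int drop_request;

/* Ask the thread holding the lock to give it up at its next check. */
void request_drop(void)
{
    atomic_store_explicit(&drop_request, 1, memory_order_relaxed);
}

/* Called by the holder once it has honoured the request. */
void clear_drop_request(void)
{
    atomic_store_explicit(&drop_request, 0, memory_order_relaxed);
}

/* Polled by the holder; may be called without holding any mutex. */
int drop_requested(void)
{
    return atomic_load_explicit(&drop_request, memory_order_relaxed);
}
```

If the flag were ever used to publish data ("here's some data you should look at"), release and acquire ordering would be the safer choice.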
Re: Python-Dev - Reworking the GIL by Sturla Molden on
2009-11-02T17:14:02+00:00
Martin v. Löwis wrote:
>> I did, and it does nothing of what I suggested. I am sure I can make the
>> Windows GIL in ceval_gil.h and the mutex in thread_nt.h a lot more precise and
>> efficient.
>
> Hmm. I'm skeptical that your code makes it more accurate, and I
> completely fail to see that it makes it more efficient (by what
> measurement of efficiency?)
>
> Also, why would making it more accurate make it better? IIUC,
> accuracy is completely irrelevant here, though efficiency
> (low overhead) does matter.
>
>> This is the kind of code I was talking about, from ceval_gil.h:
>>
>> r = WaitForMultipleObjects(2, objects, TRUE, milliseconds);
>>
>> I would turn on multimedia timer (it is not on by default), and replace
>> this
>> call with a loop, approximately like this:
>>
>> for (;;) {
>>     r = WaitForMultipleObjects(2, objects, TRUE, 0);
>>     /* blah blah blah */
>>     QueryPerformanceCounter(&cnt);
>>     if (cnt > timeout) break;
>>     Sleep(0);
>> }
>>
>> And the timeout "milliseconds" would now be computed from querying the
>> performance counter, instead of unreliably by the Windows NT kernel.
>
> Hmm. This creates a busy wait loop; if you add larger sleep values,
> then it loses accuracy.
>
Actually a usleep looks like this, and the call to the wait function
must go into the for loop. But no, it's not a busy sleep.
    static int inited = 0;
    static double diff;
    if (!inited) {
        timeBeginPeriod(1);
        QueryPerformanceFrequency((LARGE_INTEGER*)&hz);
        dhz = (double)hz;
        inited = 1;
    }
    QueryPerformanceCounter((LARGE_INTEGER*)&cnt);
    end = cnt + ( [...]
            else
                Sleep(0);
        }
    }
> Why not just call timeBeginPeriod, and then rely on the higher clock
> rate for WaitForMultipleObjects?
>
That is what I suggested when Antoine said 1-2 ms was enough.
Sturla
Re: Python-Dev - Reworking the GIL by Baptiste Lepilleur on
2009-11-07T09:52:26+00:00
The benchmark results can be obtained from:
http://gaiacrtn.free.fr/py/benchmark-newgil/benchmark-newgil.tar.bz2
and viewed from:
http://gaiacrtn.free.fr/py/benchmark-newgil/
I ran the benchmark on two platforms:
- Solaris X86, 16 cores: some Python extensions are likely missing (see
config.log)
- Windows XP SP3, 4 cores: all Python extensions but TCL (I didn't bother
checking why it failed as it is not used in the benchmark). It is a release
build.
The results look promising, but I'll let you draw your own conclusions (some
latency results seem a bit strange from my understanding).
Side-note: PCBuild requires nasmw.exe but it no longer exists in the latest
version. I had to rename nasm.exe to nasmw.exe. It would be nice to add this to
the readme to avoid confusion...
Baptiste.
2009/11/1 Antoine Pitrou <solipsis@pitrou.net>
>
> Hello again,
>
> Brett Cannon <brett <at> python.org> writes:
> >
> > I think it's worth it. Removal of the GIL is a totally open-ended problem
> > with no solution in sight. This, on the other hand, is a performance benefit
> > now. I say move forward with this. If it happens to be short-lived because
> > someone actually figures out how to remove the GIL then great, but is that
> > really going to happen between now and Python 3.2? I doubt it.
>
> Based on this whole discussion, I think I am going to merge the new GIL work
> into the py3k branch, with priority requests disabled.
>
Re: Python-Dev - Reworking the GIL by Antoine Pitrou on
2009-11-07T14:09:16+00:00
Hello,
> Solaris X86, 16 cores: some python extension are likely missing (see
> config.log)
> Windows XP SP3, 4 cores: all python extensions but TCL (I didn't bother
> checking why it failed as it is not used in the benchmark). It is a release
> build.
>
> The results look promising but I let you share your conclusion (some latency
> results seem a bit strange from my understanding).
Thank you! The latency results don't look that strange to me.
If you're surprised that py3k shows better latency than newgil on the "pi
calculation" workload, it's easy to understand why: py3k speculatively releases
the GIL so often on that workload (every 100 opcodes) that the latencies are
indeed very good, but if you look at the corresponding throughput numbers they
are dismal (your Solaris box shows it falling to less than 20% with two threads
running compared to the baseline number for single-thread execution, and on your
Windows box the number is hardly better with 45%).
So, to sum it up, the way the current GIL manages to have good latencies is by
issuing an unreasonable number of system calls on a contended lock, and
potentially killing throughput performance (this depends on the OS too, because
numbers under Linux are not so catastrophic).
The new GIL, on the other hand, is much more balanced in that it achieves rather
predictable latencies (especially when you don't overcommit the OS by issuing
more computational threads than you have CPU cores) while preserving throughput
performance.
> Side-note: PCBuild requires nasmw.exe but it no longer exists in the latest
> version. I had to rename nasm.exe to nasmw.exe. Would be nice to add this to
> the readme to avoid confusion...
You should file a bug on http://bugs.python.org
Regards
Antoine.
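The throughput argument comes down to simple arithmetic: with a fixed switching interval, the number of forced GIL hand-offs per second is bounded by one second divided by that interval. The helper below is a hypothetical illustration of that bound, not code from the patch.

```c
#include <assert.h>

/* Upper bound on forced GIL hand-offs per second for a given switching
 * interval expressed in microseconds. */
unsigned long max_switches_per_second(unsigned long interval_us)
{
    assert(interval_us > 0);
    return 1000000UL / interval_us;
}
```

With the 5 ms default from the original announcement this gives at most 200 forced hand-offs per second, against the "probably 100000 times a second" release rate that the announcement attributes to the opcode-counting scheme on a trivial loop.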
Re: Python-Dev - Reworking the GIL by Guido van Rossum on
2009-11-07T15:08:20+00:00
Antoine,
How close are you to merging this into the Py3k branch? It looks like
a solid piece of work, that can only get better in the period between
now and the release of 3.2. But I don't want to rush you, and I only
have had a brief look at your code. (I whipped up a small Dave Beazley
example and was impressed by the performance of your code compared to
the original py3k branch.) [...]
I also expect that priority requests aren't important; it somehow
seems strange that if multiple threads are all doing I/O each new
thread whose I/O completes would get to preempt whoever else is active
immediately. (Also the choice of *not* making a priority request when
a read returns no bytes seems strange -- if I read the code
correctly.)
Anyway, thanks for this work!
Re: Python-Dev - Reworking the GIL by Antoine Pitrou on
2009-11-07T16:02:57+00:00
Hello Guido,
> How close are you to merging this into the Py3k branch? It looks like
> a solid piece of work, that can only get better in the period between
> now and the release of 3.2. But I don't want to rush you, and I only
> have had a brief look at your code.
The code is ready. Priority requests are already disabled, I'm just
wondering whether to remove them from the code, or leave them there in
case someone thinks they're useful. I suppose removing them is ok.
> My only suggestion so far: the description could use more explicit
> documentation on the various variables and macros and how they
> combine.
Is it before or after
http://mail.python.org/pipermail/python-checkins/2009-November/087482.html ?
Regards
Antoine.
Re: Python-Dev - Reworking the GIL by Guido van Rossum on
2009-11-07T18:33:39+00:00
On Sat, Nov 7, 2009 at 9:01 AM, Antoine Pitrou <solipsis@pitrou.net> wrote:
>
> Hello Guido,
>
>> How close are you to merging this into the Py3k branch? It looks like
>> a solid piece of work, that can only get better in the period between
>> now and the release of 3.2. But I don't want to rush you, and I only
>> have had a brief look at your code.
>
> The code is ready. Priority requests are already disabled, I'm just
> wondering whether to remove them from the code, or leave them there in
> case someone thinks they're useful. I suppose removing them is ok.
I would remove them.
>> My only suggestion so far: the description could use more explicit
>> documentation on the various variables and macros and how they
>> combine.
>
> Is it before or after
> http://mail.python.org/pipermail/python-checkins/2009-November/087482.html ?
After. While that is already really helpful, not all the code is
easily linked back to paragraphs in that comment block, and some
variables are not mentioned by name in the block.
Re: Python-Dev - Reworking the GIL by Baptiste Lepilleur on
2009-11-07T18:39:10+00:00
> So, to sum it up, the way the current GIL manages to have good latencies is
> by issuing an unreasonable number of system calls on a contended lock, and
> potentially killing throughput performance (this depends on the OS too,
> because numbers under Linux are not so catastrophic).
>
Ah, I remember reading this in the analysis that was published now!
I made another benchmark using one of my scripts that I ported to python 3 (and
simplified a bit). I only test the total execution time. Tests done on
Windows XP SP3. Processor is an Intel Core 2 Quad Q9300 (4 cores).
You can get the script from:
http://gaiacrtn.free.fr/py/benchmark-kwd-newgil/purekeyword-py26-3k.py
Script + test doc (940KB):
http://gaiacrtn.free.fr/py/benchmark-kwd-newgil/benchmark-kwd-newgil.tar.bz2
The threaded loop is:
for match in self.punctuation_pattern.finditer( document ):
    word = document[last_start_index:match.start()]
    if len(word) > 1 and len(word) < MAX_KEYWORD_LENGTH:
        words[word] = words.get(word, 0) + 1
    last_start_index = match.end()
    if word:
        word_count += 1
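For anyone who wants to try the loop without downloading the script, here is a self-contained sketch of it. The pattern `\W+` and the value of `MAX_KEYWORD_LENGTH` are assumptions chosen for illustration (the real ones are in the linked script), and the function name `count_keywords` is hypothetical.

```python
import re
from collections import Counter

# Both values are assumptions; the real ones live in the linked script.
MAX_KEYWORD_LENGTH = 50
PUNCTUATION_PATTERN = re.compile(r"\W+")


def count_keywords(document):
    """Split document on punctuation runs and count the words in between."""
    words = Counter()
    word_count = 0
    last_start_index = 0
    for match in PUNCTUATION_PATTERN.finditer(document):
        # The word is whatever sits between the previous separator and this one.
        word = document[last_start_index:match.start()]
        if 1 < len(word) < MAX_KEYWORD_LENGTH:
            words[word] += 1
        last_start_index = match.end()
        if word:
            word_count += 1
    return words, word_count


if __name__ == "__main__":
    print(count_keywords("the GIL, the lock. a b"))
```

As in the loop above, any text after the last separator is not counted.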
Here are the results:
-j0 (main thread only)
2.5.2 : 17.991s, 17.947s, 17.780s
2.6.2 : 19.071s, 19.023s, 19.054s
3.1.1 : 46.384s, 46.321s, 46.425s
newgil: 47.483s, 47.605s, 47.512s
-j4 (4 consumer threads, main thread producing/waiting)
2.5.2 : 31.105s, 30.568s
2.6.2 : 31.550s, 30.599s
3.1.1 : 85.114s, 85.185s
newgil: 48.428s, 49.217s
It shows that, on my platform for this specific benchmark:
- newgil manages to leverage a significant amount of parallelism (1.7)
where python 3.1 does not (3.1 is 80% slower)
- newgil has a small impact on non multi-threaded execution (~1-2%) [may
be worth investigating]
- 3.1 is more than 2 times slower than python 2.6 on this benchmark
- 2.6 is "only" 65% slower when run with multiple threads compared to the
80% slower of 3.1.
Newgil is a vast improvement as it manages to leverage the short time the
GIL is released by finditer [if I understood correctly, in 3.x regex releases
the GIL].
What worries me is the single-threaded performance degradation between 2.6
and 3.1 on this test. Could the added GIL release/acquire on each finditer
call explain this?
Re: Python-Dev - Reworking the GIL by Antoine Pitrou on
2009-11-07T18:55:59+00:00
Hello again,
> It shows that, on my platform for this specific benchmark:
> * newgil manages to leverage a significant amount of parallelism
> (1.7) where python 3.1 does not (3.1 is 80% slower)
I think you are mistaken:
-j0 (main thread only)
newgil: 47.483s, 47.605s, 47.512s
-j4 (4 consumer threads, main thread producing/waiting)
newgil: 48.428s, 49.217s
The runtimes are actually the same, so newgil doesn't leverage anything.
However, it doesn't degrade performance like 2.x/3.1 does :-)
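The comparison can be checked with a little arithmetic on the posted timings. This is just a sketch: the numbers are copied from the benchmark results in this thread, and the function name `threaded_slowdown` is made up for illustration.

```python
from statistics import mean

# Wall-clock times in seconds, copied from the benchmark results in this thread.
TIMES = {
    "2.5.2": {"j0": [17.991, 17.947, 17.780], "j4": [31.105, 30.568]},
    "2.6.2": {"j0": [19.071, 19.023, 19.054], "j4": [31.550, 30.599]},
    "3.1.1": {"j0": [46.384, 46.321, 46.425], "j4": [85.114, 85.185]},
    "newgil": {"j0": [47.483, 47.605, 47.512], "j4": [48.428, 49.217]},
}


def threaded_slowdown(times):
    """Return mean(-j4) / mean(-j0) per interpreter; 1.0 means no degradation."""
    return {
        name: mean(runs["j4"]) / mean(runs["j0"])
        for name, runs in times.items()
    }


if __name__ == "__main__":
    for name, ratio in threaded_slowdown(TIMES).items():
        print(f"{name:>6}: {ratio:.2f}x")
```

With these numbers the ratio comes out near 1.03 for newgil, against roughly 1.6 to 1.8 for the other interpreters. The 1.7 figure quoted above corresponds to the 3.1 -j4 time divided by the newgil -j4 time.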
> * newgil has a small impact on non multi-threaded execution
> (~1-2%) [may be worth investigating]
It goes from very slightly slower to very slightly faster and it is
likely to be caused by variations in generated output from the compiler.
> * 3.1 is more than 2 times slower than python 2.6 on this
> benchmark
That's the most worrying outcome I'd say. Are you sure the benchmark
really does the same thing? Under 2.6, you should add re.UNICODE to the
regular expression flags so as to match the 3.x semantics.
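To illustrate what the flag changes: on Python 3, str patterns use Unicode matching by default, and `re.ASCII` switches to the narrower matching that is roughly what 2.x does when `re.UNICODE` is not given. A small sketch for Python 3 only; the sample string and the helper name `split_words` are made up.

```python
import re

SAMPLE = "h\u00e9llo w\u00f6rld"  # "héllo wörld", made-up sample text


def split_words(text, flags=0):
    """Return the runs of word characters found in text under the given flags."""
    return re.findall(r"\w+", text, flags)


if __name__ == "__main__":
    print(split_words(SAMPLE))            # Unicode-aware word characters
    print(split_words(SAMPLE, re.ASCII))  # only [a-zA-Z0-9_] count as word characters
```

A pattern that spells out explicit character ranges instead of using `\w` and friends is not affected by this flag in the same way.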
> [if I understood correctly, in 3.x regex releases the GIL].
Unless I've missed something it doesn't, no.
This could be a separate opportunity for optimization, if someone wants
to take a look at it.
Regards
Antoine.
Re: Python-Dev - Reworking the GIL by Baptiste Lepilleur on
2009-11-08T13:09:02+00:00
>
> > It shows that, on my platform for this specific benchmark:
> > * newgil manages to leverage a significant amount of parallelism
> > (1.7) where python 3.1 does not (3.1 is 80% slower)
>
> I think you are mistaken:
>
> -j0 (main thread only)
> newgil: 47.483s, 47.605s, 47.512s
> -j4 (4 consumer threads, main thread producing/waiting)
> newgil: 48.428s, 49.217s
>
> The runtimes are actually the same, so newgil doesn't leverage anything.
> However, it doesn't degrade performance like 2.x/3.1 does :-)
>
Oops, I was comparing to the 3.1 -j4 times, which makes no sense. One would
think I wanted to see that result since I thought the GIL was released :/. This
greatly reduces the interest of this benchmark...
> > * 3.1 is more than 2 times slower than python 2.6 on this
> > benchmark
>
> That's the most worrying outcome I'd say. Are you sure the benchmark
> really does the same thing? Under 2.6, you should add re.UNICODE to the
> regular expression flags so as to match the 3.x semantics.
>
I've tried, but there is no change in the result (the regexp does not use \w &
co but specifies a lot of unicode ranges). All strings are already of unicode
type in 2.6.
> > [if I understood correctly, in 3.x regex releases the GIL].
>
> Unless I've missed something it doesn't, no.
>
Hmmm, I was confusing it with other modules (bzip2 & hashlib?). Looking back at
the results of your benchmark it's obvious. Is there a place where the list
of functions releasing the GIL is available? I did not see anything in the
bz2.compress documentation.
Re: Python-Dev - Reworking the GIL by Antoine Pitrou on
2009-11-08T13:20:32+00:00
Baptiste Lepilleur <baptiste.lepilleur <at> gmail.com> writes:
>
> I've tried, but there is no change in the result (the regexp does not use \w &
> co but specifies a lot of unicode ranges). All strings are already of unicode
> type in 2.6.
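No they aren't. You should add "from [...]
> Is there a place where the list of functions releasing the GIL is
> available? I did not see anything in bz2.compress documentation.
No, there isn't. You'd have to test, or read the source code.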
But bz2 and zlib, for example, do release the GIL.
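One rough way to "test" is to time the same call run several times in a row and then run concurrently in threads. This is only a sketch: the helper names `time_serial` and `time_threaded` and the payload are made up, and timings on a loaded or single-core machine will not tell you much.

```python
import threading
import time
import zlib


def time_serial(func, repeats):
    """Time `repeats` back-to-back calls of func() in the current thread."""
    start = time.perf_counter()
    for _ in range(repeats):
        func()
    return time.perf_counter() - start


def time_threaded(func, repeats):
    """Time `repeats` calls of func(), each in its own thread, run concurrently."""
    threads = [threading.Thread(target=func) for _ in range(repeats)]
    start = time.perf_counter()
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    payload = b"some fairly repetitive payload " * 200_000

    def work():
        zlib.compress(payload, 9)

    print("serial  :", time_serial(work, 4))
    print("threaded:", time_threaded(work, 4))
```

If the threaded time is clearly below the serial time on a multi-core machine, the call probably releases the GIL while it works; roughly equal times suggest it does not.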
Regards
Antoine.
Re: Python-Dev - Reworking the GIL by Antoine Pitrou on
2009-11-10T20:35:55+00:00
Hello again,
I've now removed priority requests, tried to improve the internal doc a
bit, and merged the changes into py3k.
Afterwards, the new Windows 7 buildbot has hung in test_multiprocessing,
but I don't know whether it's related.
Regards
Antoine.
Guido van Rossum <guido <at> python.org> writes:
>
> I would remove them.
>
> >> My only suggestion so far: the description could use more explicit
> >> documentation on the various variables and macros and how they
> >> combine.
> >
> > Is it before or after
> > http://mail.python.org/pipermail/python-checkins/2009-November/087482.html ?
>
> After. While that is already really helpful, not all the code is
> easily linked back to paragraphs in that comment block, and some
> variables are not mentioned by name in the block.
>
Re: Python-Dev - Reworking the GIL by Stefan Ring on
2009-11-23T07:56:40+00:00
Hello,

I built something very similar for my company last year, and it’s been running
flawlessly in production at a few customer sites since, with avg. CPU usage ~50%
around the clock. I even posted about it on the Python mailing list [1] where
there was almost no resonance at that time. I never posted code, though --
nobody seemed to be too interested.

I am well aware that your current work is a lot more far-reaching than what I’ve
done, which is basically just a FIFO scheduler. I even added scheduling
priorities later which don’t work too great because the amount of time used for
a "tick" can vary by several orders of magnitude, as you know.

Thought you might be interested.

Regards
Stefan

[1] http://mail.python.org/pipermail/python-dev/2008-March/077814.html
[2] http://www.bestinclass.dk/index.php/2009/10/python-vs-clojure-evolving/
[3] www.dabeaz.com/python/GIL.pdf

PS On a slightly different note, I came across some Python bashing [2] yesterday
and somehow from there to David Beazley’s presentation about the GIL [3]. While
I don’t mind the bashing, the observations about the GIL seem quite unfair to me
because David’s measurements have been made on Mac OS X with its horribly slow
pthreads functions. I was not able to measure any slowdown on Linux.
Re: Python-Dev - Reworking the GIL by Nick Coghlan on
2009-11-23T09:32:10+00:00
Stefan Ring wrote:
> [2] http://www.bestinclass.dk/index.php/2009/10/python-vs-clojure-evolving/
> [3] www.dabeaz.com/python/GIL.pdf
>
> PS On a slightly different note, I came across some Python bashing [2] yesterday
> and somehow from there to David Beazley’s presentation about the GIL [3]. While
> I don’t mind the bashing, the observations about the GIL seem quite unfair to me
> because David’s measurements have been made on Mac OS X with its horribly slow
> pthreads functions. I was not able to measure any slowdown on Linux.

We care about Mac OS X though, so even if the contention wasn't as bad
on a different OS, the Mac downsides matter.

With the GIL updates in place, it would be interesting to see that
analysis redone for 2.7/3.2 though.

Regards,
Nick.

P.S. As far as interest in the idea goes, the GIL is one of those areas
where it takes a fairly rare combination of interest, expertise and
established credibility to propose a change and get assent to it. You'll
notice that even Antoine had to resort to the "if nobody objects soon,
I'm checking this in" tactic to garner any responses. It's an area where
even those with relevant expertise still have to put aside a fair chunk
of time in order to properly review a proposed change :)

--
Nick Coghlan   |   ncoghlan@gmail.com   |   Brisbane, Australia
Re: Python-Dev - Reworking the GIL by Stefan Ring on
2009-11-23T10:29:49+00:00
mi