A Dark Day...

Zeljko Vrba
Posted: Thu Sep 04, 2008 5:55 am    Post subject: Re: AMD working on scaleable hardware-based atomic transacti

On 2008-09-04, David Schwartz <davids@webmaster.com> wrote:
Quote:

confident is correct. It takes serious expertise and reviews from
multiple people to make sure something doesn't slip by. With this, I


Could verification tools such as SPIN be of help there?

Dmitriy V'jukov
Posted: Thu Sep 04, 2008 7:17 am    Post subject: Re: AMD working on scaleable hardware-based atomic transacti

On 4 Sep, 09:55, Zeljko Vrba <zvrba.nos...@ieee-sb1.cc.fer.hr> wrote:
Quote:
On 2008-09-04, David Schwartz <dav...@webmaster.com> wrote:

confident is correct. It takes serious expertise and reviews from
multiple people to make sure something doesn't slip by. With this, I

Could verification tools such as SPIN be of help there?


SPIN is poorly suited to verifying synchronization algorithms. It
doesn't support relaxed memory models, it doesn't support dynamic
memory allocation, it doesn't support OS blocking primitives, and so on.
And Promela is too far removed from real programming languages, so it's
possible that your Promela model is correct while the code you
actually use in production is still incorrect.

It is better to use Relacy for verifying synchronization algorithms. It
was created for verifying real-world algorithms, against real-world
memory models, written in real-world programming languages, using
real-world OS primitives.
http://groups.google.ru/group/relacy
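
To illustrate what this kind of verification does in principle, here is a minimal sketch that enumerates every interleaving of two threads and checks an invariant. It is not the API of Relacy or SPIN; the names `interleavings`, `run` and `check` are hypothetical, and the toy assumes sequential consistency, which is exactly the simplification the post warns about for relaxed memory models.

```python
from itertools import combinations

def interleavings(n_a, n_b):
    """Yield every schedule of two threads as a tuple of thread ids (0 or 1)."""
    total = n_a + n_b
    for slots in combinations(range(total), n_a):
        chosen = set(slots)
        yield tuple(0 if i in chosen else 1 for i in range(total))

def run(schedule):
    """Run two threads that each do a non-atomic increment: a load, then a store."""
    shared = 0
    local = [0, 0]   # one private register per thread
    pc = [0, 0]      # one program counter per thread
    for tid in schedule:
        if pc[tid] == 0:
            local[tid] = shared        # step 0: load the shared value
        else:
            shared = local[tid] + 1    # step 1: store the incremented value
        pc[tid] += 1
    return shared

def check():
    """Return the schedules that break the invariant 'final value is 2'."""
    return [s for s in interleavings(2, 2) if run(s) != 2]

if __name__ == "__main__":
    for bad in check():
        print("lost update under schedule", bad)
```

A real tool explores far larger state spaces and also models reordering, allocation and blocking; the sketch only shows the core idea of exhaustively trying schedules against an invariant.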

Dmitriy V'jukov

Nick Maclaren
Posted: Thu Sep 04, 2008 1:48 pm    Post subject: Re: AMD working on scaleable hardware-based atomic transacti

In article <86c0c11f-785f-4456-9ee5-bfa5d4d8993a@56g2000hsm.googlegroups.com>,
"Dmitriy V'jukov" <dvyukov@gmail.com> writes:
|> On 1 Sep, 04:09, "Chris M. Thomasson" <n...@spam.invalid> wrote:
|> > Here ya go:
|> >
|> > http://www.amd64.org/fileadmin/user_upload/pub/epham08-asf-eval.pdf
|>
|> Cool!

Let's consider language architectures and how they interact with this
feature.

In several language standards, I have argued vigorously that features
like volatile (C, C++ and Fortran) should be an attribute of an
object's actual definition (i.e. might control where it is placed)
and not merely its declaration (i.e. how it is used). The reasons
for arguing that should be obvious in the context of this design.
I have lost, almost uniformly.

In all of those languages, you can add the volatile attribute to
objects not declared as volatile, with only a few restrictions on
parallel access as volatile and non-volatile. They are enough
restrictions to allow an implementation to forbid such access (sic)
in C++ and Fortran, with some loss of optimisation, but it's still a
bit iffy.

Now, my understanding is that any non-protected access to any protected
memory location between an ACQUIRE and COMMIT (whether read or write)
leads to undefined behaviour. Is that so? If not, what happens?

I hope this gets established, because this is an area where the
language standards need to stop using the criterion "We can't see
why it won't work, so let's allow it" and go back to the old one of
"We can't see why it will work, so let's forbid it".


Regards,
Nick Maclaren.

David Schwartz
Posted: Thu Sep 04, 2008 9:23 pm    Post subject: Re: AMD working on scaleable hardware-based atomic transacti

On Sep 4, 1:56 pm, Michael Hohmuth <Michael.Hohm...@amd.com> wrote:

Quote:
In the incarnation of ASF that we evaluated in the paper, there was no
interrupt-deferral mechanism.  All interrupts occurring in the
critical section did indeed abort it.

I don't really have a good gut sense of which way is better. I can see
arguments for a benefit from a small amount of deferral, but then the
chance of that actually doing anything is pretty small. If your
'critical region' is big enough that interrupts have a measurable
performance impact, you're doing something wrong.


Quote:
I'm sorry our description is confusing, and I promise to fix it should
we ever rewrite the paper. ;-)  Feel free to point out what you find
confusing either here or in private email.

Section 3.4 was very confusing, particularly the first two bullet
points and the paragraph that begins "The LLB allows". For example, in
the first bullet point, what makes a cache line "protected"? Is it the
ones that are locked? Or is it any cache line dirtied in the critical
section? Maybe I just didn't invest enough time trying to understand
it all, but reading the paper, I hit those points and hit an
understanding wall.

Quote:
In the ASF implementation we simulated for our paper, the buffer
actually holds the backup copies of the protected memory locations
(which are written back to the memory hierarchy in case of an abort).
The simulated buffer's capacity has been exactly 8 cache lines.

Okay, so you actually pass writes during the critical section to the
L1 cache, but don't allow the cache to be written back to memory (or
the L2 cache). The buffer is only used in the case of an abort.

Quote:
[...] Perhaps the biggest advantage will be easing the tradeoff
between correctness and performance. Right now, for example, it's
easy to create an obviously-correct implementation of a
reader/writer lock under x86 Linux. It's also not too hard to create
a heavily-optimized implementation of a reader/writer lock. It is,
however, an unholy bitch to create a heavily-optimized reader/writer
lock that one can be confident is correct. It takes serious
expertise and reviews from multiple people to make sure something
doesn't slip by. With this, I could do it in half an hour, and be
quite confident it had no subtle bugs.

Nicely said -- yes, one major ASF use case definitely is removing
complexity from highly concurrent data-structure implementations.
Additionally, the simulations we've published in the paper indicate
that removing this complexity can yield a substantial performance
benefit as well.

If this does make it into products and becomes sufficiently widely-
implemented that you can rely on it being present, there are huge
advantages. For example, suppose I could benefit from a
synchronization primitive I don't have. Maybe I'd love a reader/writer
lock that's fully recursive. Maybe I'd like a reader/writer/
readpromote lock. Maybe I'd like a lock with queued grants, but a 'go
to the head of the line' lock function. Right now, I've either got to
live without this synchronization primitive, write it myself with high
risk, or write an unoptimized version myself. Most likely, I'll choose
the first and implement my code in a less-natural way.

With this, I can code the synchronization primitive I want myself,
with a nearly-optimal version that I can still easily inspect for
validity. This could easily result in more natural code flows, since I
have the synch primitive that best fits my logic. This could mean more
reliable and easier to debug code.
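
As a rough illustration of the kind of custom primitive described above, here is a minimal sketch of a reader/writer lock with read-to-write promotion, where every state transition is one short multi-word update. The thread does not give an ASF programming interface, so a `threading.Lock` stands in for an ACQUIRE..COMMIT region; the class `PromotableRWLock` and its method names are hypothetical.

```python
import threading

class PromotableRWLock:
    """Non-blocking reader/writer lock with promotion (illustrative only)."""

    def __init__(self):
        # Assumption: this mutex stands in for a hardware atomic region.
        self._atomic = threading.Lock()
        self.readers = 0
        self.writer = False

    def try_read(self):
        with self._atomic:
            if self.writer:
                return False
            self.readers += 1
            return True

    def try_write(self):
        with self._atomic:
            if self.writer or self.readers:
                return False
            self.writer = True
            return True

    def try_promote(self):
        """Upgrade the caller's read lock if it is the only reader."""
        with self._atomic:
            if self.writer or self.readers != 1:
                return False
            self.readers = 0
            self.writer = True
            return True

    def release_read(self):
        with self._atomic:
            self.readers -= 1

    def release_write(self):
        with self._atomic:
            self.writer = False
```

Each method touches two words of state inside one atomic region, which is what makes it easy to inspect; whether a hardware version would perform well is a separate question that this sketch does not address.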

Obviously, if I built my own such locks, even out of optimized
synchronization primitives, they wouldn't be nearly as nice as ones
built in assembly with ASF.

DS

David Schwartz
Posted: Thu Sep 04, 2008 10:09 pm    Post subject: Re: AMD working on scaleable hardware-based atomic transacti

On Sep 4, 2:59 pm, Michael Hohmuth <Michael.Hohm...@amd.com> wrote:

Quote:
but don't allow the cache to be written back to memory (or the L2
cache).

In fact, we do!  The nice property of having the backup copy is that
we don't have to wire the speculatively written values anywhere.  In
case of an abort we just overwrite them with the old values before
anyone else can see them.

(That's what the paper's "The LLB allows..." paragraph is trying to
say.)

Ahh, clever. This really optimizes the non-abort case, which is
exactly what you want.

Quote:
The buffer is only used in the case of an abort.

Right.

Okay, I'm convinced now. Nice work.

DS

Michael Hohmuth
Posted: Fri Sep 05, 2008 1:56 am    Post subject: Re: AMD working on scaleable hardware-based atomic transacti

David Schwartz <davids@webmaster.com> writes:
Quote:
On Sep 3, 4:43 pm, "Chris M. Thomasson" <n...@spam.invalid> wrote:

Apparently, interrupts are deferred by the OS. I also believe that
this deferment is adjustable by mutating a so-called watch-dog
counter.

Up to a point configurable by the OS. That sounds pretty nice to me.

Disclaimer:
I work for AMD and am one of the coauthors of the paper that spawned
this thread. AMD has not announced support for ASF (Advanced
Synchronization Facility) in any future product. So please don't
get too excited. :-)

In the incarnation of ASF that we evaluated in the paper, there was no
interrupt-deferral mechanism. All interrupts occurring in the
critical section did indeed abort it.

Quote:
[ David Schwartz: ]
Well, I've read it more carefully, and it seems to sort of say
that they 'undo' all writes if there's an abort with a special
buffer. Their description seems kind of confusing to me. If they
hold all writes in a special buffer, how big is it?

I'm sorry our description is confusing, and I promise to fix it should
we ever rewrite the paper. ;-) Feel free to point out what you find
confusing either here or in private email.

Quote:
I believe the buffer is big enough to hold at least 7-8 words. If
your transactions need more than that, then ASF is not the right
tool for the job...

In the ASF implementation we simulated for our paper, the buffer
actually holds the backup copies of the protected memory locations
(which are written back to the memory hierarchy in case of an abort).
The simulated buffer's capacity has been exactly 8 cache lines.

Quote:
[...] Perhaps the biggest advantage will be easing the tradeoff
between correctness and performance. Right now, for example, it's
easy to create an obviously-correct implementation of a
reader/writer lock under x86 Linux. It's also not too hard to create
a heavily-optimized implementation of a reader/writer lock. It is,
however, an unholy bitch to create a heavily-optimized reader/writer
lock that one can be confident is correct. It takes serious
expertise and reviews from multiple people to make sure something
doesn't slip by. With this, I could do it in half an hour, and be
quite confident it had no subtle bugs.

Nicely said -- yes, one major ASF use case definitely is removing
complexity from highly concurrent data-structure implementations.
Additionally, the simulations we've published in the paper indicate
that removing this complexity can yield a substantial performance
benefit as well.

Michael
--
Michael Hohmuth, AMD Operating System Research Center, Dresden, Germany
michael.hohmuth@amd.com, www.amd64.org

Michael Hohmuth
Posted: Fri Sep 05, 2008 2:59 am    Post subject: Re: AMD working on scaleable hardware-based atomic transacti

David Schwartz <davids@webmaster.com> writes:

Quote:
I'm sorry our description is confusing, and I promise to fix it
should we ever rewrite the paper. ;-)  Feel free to point out what
you find confusing either here or in private email.

Section 3.4 was very confusing, particularly the first two bullet
points and the paragraph that begins "The LLB allows". For example,
in the first bullet point, what makes a cache line "protected"? Is
it the ones that are locked? Or is it any cache line dirtied in the
critical section?

The former.

Quote:
Maybe I just didn't invest enough time trying to understand it all,
but reading the paper, I hit those points and hit an understanding
wall.

I agree that this is confusing.

Quote:
In the ASF implementation we simulated for our paper, the buffer
actually holds the backup copies of the protected memory locations
(which are written back to the memory hierarchy in case of an
abort). The simulated buffer's capacity has been exactly 8 cache
lines.

Okay, so you actually pass writes during the critical section to the
L1 cache,

Yes, we do.

Quote:
but don't allow the cache to be written back to memory (or the L2
cache).

In fact, we do! The nice property of having the backup copy is that
we don't have to wire the speculatively written values anywhere. In
case of an abort we just overwrite them with the old values before
anyone else can see them.

(That's what the paper's "The LLB allows..." paragraph is trying to
say.)
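
The backup-copy scheme described above can be sketched as a simple undo log: speculative writes go straight to memory, the buffer keeps only the old values, and an abort copies them back. This is a single-threaded model under stated assumptions, not the simulated hardware: the names `SpeculativeRegion` and `CapacityAbort` are hypothetical, treating buffer overflow as an abort is an assumption, and the sketch does not model hiding the speculative values from other CPUs.

```python
LLB_CAPACITY = 8  # cache lines; the capacity the post reports for the simulated buffer

class CapacityAbort(Exception):
    """Raised when a region tries to protect more lines than the buffer holds."""

class SpeculativeRegion:
    def __init__(self, memory):
        self.memory = memory   # dict: line address -> value, stands in for the cache hierarchy
        self.backup = {}       # line address -> value before the region first wrote it

    def write(self, line, value):
        if line not in self.backup:
            if len(self.backup) == LLB_CAPACITY:
                self.abort()   # assumption: overflowing the buffer aborts the region
                raise CapacityAbort(line)
            self.backup[line] = self.memory[line]   # save the old value once
        self.memory[line] = value                   # write in place, no staging area

    def commit(self):
        # Nothing to copy on the non-abort path: just drop the backups.
        self.backup.clear()

    def abort(self):
        # Overwrite speculative values with the saved old ones.
        self.memory.update(self.backup)
        self.backup.clear()
```

Note how `commit` does no copying at all, which matches the point that the design optimizes the non-abort case.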

Quote:
The buffer is only used in the case of an abort.

Right.

Thanks for your other comments!

Michael
--
Michael Hohmuth, AMD Operating System Research Center, Dresden, Germany
michael.hohmuth@amd.com, www.amd64.org

MitchAlsup
Posted: Sat Sep 06, 2008 4:46 pm    Post subject: Re: AMD working on scaleable hardware-based atomic transacti

On Sep 4, 3:48 am, n...@cus.cam.ac.uk (Nick Maclaren) wrote:
Quote:
Let's consider language architectures and how they interact with this
feature.

In several language standards, I have argued vigorously that features
like volatile (C, C++ and Fortran) should be an attribute of an
object's actual definition (i.e. might control where it is placed)
and not merely its declaration (i.e. how it is used).  The reasons
for arguing that should be obvious in the context of this design.
I have lost, almost uniformly.

In all of those languages, you can add the volatile attribute to
objects not declared as volatile, with only a few restrictions on
parallel access as volatile and non-volatile.  They are enough
restrictions to allow an implementation to forbid such access (sic)
in C++ and Fortran, with some loss of optimisation, but it's still a
bit iffy.

Now, my understanding is that any non-protected access to any protected
memory location between an ACQUIRE and COMMIT (whether read or write)
leads to undefined behaviour.  Is that so?  If not, what happens?

In order for ASF to give the illusion of atomicity, there can be no
visibility to the protected cache lines between ACQUIRE and COMMIT.
Visibility remains to the unprotected cache lines. In ASF you specify
which cache lines are protected and these lines are treated in a
special way; everything else remains 'normal' cache-coherent memory
and retains its visibility. In effect, the HW is attaching
volatile-like semantics and then removing volatile-like semantics on an
instruction-by-instruction basis.

So, when you take the error exit at ACQUIRE you cannot be sure whether
or not you have executed any of the instructions between ACQUIRE and
COMMIT. At the compiler level, each instruction between
ACQUIRE and COMMIT has an implicit back edge to ACQUIRE (as if each
instruction had a conditional branch as part of that instruction).
Other interested parties will not see any intermediate state in
protected lines, but any other memory modified while in the code
between ACQUIRE and COMMIT may be seen (or not).
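
The control flow described above can be modelled as follows: an abort may be injected before any step, protected state is rolled back to its value at ACQUIRE, and the caller retries. This is a toy model, not ASF itself; `attempt`, `run_until_commit` and the `abort_before` parameter are hypothetical, and the sketch models only the case where writes to unprotected memory persist after an abort.

```python
class Abort(Exception):
    """Models the hardware transferring control to the error exit at ACQUIRE."""

def attempt(protected, steps, abort_before=None):
    """Run `steps` once; return True on COMMIT, False on the error exit."""
    snapshot = dict(protected)               # state of protected lines at ACQUIRE
    try:
        for index, step in enumerate(steps):
            if index == abort_before:        # the implicit back edge on each instruction
                raise Abort
            step()
    except Abort:
        protected.clear()
        protected.update(snapshot)           # protected lines show no intermediate state
        return False
    return True                              # COMMIT

def run_until_commit(protected, steps, abort_plan):
    """Retry until commit. `abort_plan` lists one injected abort point per failed attempt."""
    attempts = 0
    for abort_before in list(abort_plan) + [None]:
        attempts += 1
        if attempt(protected, steps, abort_before):
            return attempts
    raise RuntimeError("no attempt committed")
```

Because the abort can land before any step, code at the error exit cannot assume how many of the steps ran, which is the point made about the implicit back edge.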

And, in fact, at least while I was there, that is how you debug an ASF
event. You define a buffer in local memory (say your stack) and while
making progress over the atomic event, you lob various pieces of data
into this buffer, and if you end up taking the error exit from
ACQUIRE, the data that did arrive in this buffer tells you how
far through the atomic event you got before trouble happened. So at
least you have a chance of understanding what went on. Thus software
can help you understand what went on.
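
A minimal sketch of that debugging technique, assuming a model in which writes to the local trace buffer survive the abort: the function `transfer`, its `fail_after` parameter and the account example are hypothetical and exist only to show the trace outliving the rollback.

```python
def transfer(accounts, trace, fail_after=None):
    """Move 10 units from 'a' to 'b'.

    `accounts` models protected cache lines; `trace` is a local,
    unprotected buffer. `fail_after` injects an abort after that stage.
    Returns True on commit, False on the error exit.
    """
    snapshot = dict(accounts)      # protected state at ACQUIRE
    trace.clear()

    def debit():
        accounts["a"] -= 10

    def credit():
        accounts["b"] += 10

    stages = [("debit a", debit), ("credit b", credit)]
    for n, (name, action) in enumerate(stages):
        action()
        trace.append(name)         # unprotected write: not rolled back in this model
        if fail_after == n:
            accounts.clear()
            accounts.update(snapshot)   # rollback of the protected lines
            return False
    return True
```

After a failed call, `accounts` is unchanged but `trace` still lists the stages that ran, which is the information the post says you use to work out how far the atomic event got.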

This leaves the door open for software to define this buffer in shared
memory (bad idea) or to pass a pointer to it through shared memory
(even worse), which gives another application visibility into this buffer
while there is an atomic event lobbing data into the buffer. As a HW
person, I revert to the "No programming language can prevent bad
programming practices" line of reasoning. SW can prevent this by
allocating these buffers away from shared memory; therefore, it's not a
HW problem. Atomicity is maintained on the protected cache lines;
nothing else is guaranteed.

Notice that you simply cannot single step through an atomic event and
have any illusion of atomicity*. This is why debug exceptions are
suppressed during the atomic event. Basically, you can be single
stepping up to the first LOCKed load, and then at the next step you will
find yourself either at the first instruction following COMMIT or at
the JNZ following the ACQUIRE with a failure code in the specified
register. Thus, this gives the illusion of atomicity, even to the
single stepping program.

Thus writes to unprotected memory may be undone (as will likely happen
in optimistic mode) or may persist (as needs to happen in
deterministic mode). So, what happens is not 'undefined', but it is
defined in a manner that current programming languages do not have
semantics to express.

Quote:
I hope this gets established, because this is an area where the
language standards need to stop using the criterion "We can't see
why it won't work, so let's allow it" and go back to the old one of
"We can't see why it will work, so let's forbid it".

Regards,
Nick Maclaren.

Mitch

(*) We discussed a debugger that when it encounters an ASF event,
would find all other applications that share memory and put them in a
stopped state; and then you can single step through an atomic event
and retain the illusion of atomicity (at least from CPUs). And when
the ASF event was complete, the other programs would be put back in a
running state.

Nick Maclaren
Posted: Sat Sep 06, 2008 11:29 pm    Post subject: Re: AMD working on scaleable hardware-based atomic transacti

In article <0a4356fa-1f4d-43af-a79f-78f3402b8a10@m36g2000hse.googlegroups.com>,
MitchAlsup <MitchAlsup@aol.com> writes:
|>
|> In order for ASF to give the illusion of atomicity, there can be no
|> visibility to the protected cache lines between ACQUIRE and COMMIT.
|> Visibility remains to the unprotected cache lines. In ASF you specify
|> which cache lines are protected and these lines are treated in a
|> special way; everything else remains 'normal' cache-coherent memory
|> and retains its visibility. In effect, the HW is attaching
|> volatile-like semantics and then removing volatile-like semantics on an
|> instruction-by-instruction basis.
|>
|> So, when you take the error exit at ACQUIRE you cannot be sure whether
|> or not you have executed any of the instructions between ACQUIRE and
|> COMMIT. At the compiler level, each instruction between
|> ACQUIRE and COMMIT has an implicit back edge to ACQUIRE (as if each
|> instruction had a conditional branch as part of that instruction).
|> Other interested parties will not see any intermediate state in
|> protected lines, but any other memory modified while in the code
|> between ACQUIRE and COMMIT may be seen (or not).

Thanks very much. That clarifies the model considerably.

My belief is that a language should forbid such accesses, and a
compiler should warn very strongly if they are being done. Your
point about debugging is well-taken, but that is probably the only
justifiable use of non-protected accesses in a protected section.
But that is not a hardware matter.

|> (*) We discussed a debugger that when it encounters an ASF event,
|> would find all other applications that share memory and put them in a
|> stopped state; and then you can single step through an atomic event
|> and retain the illusion of atomicity (at least from CPUs). And when
|> the ASF event was complete, the other programs would be put back in a
|> running state.

The lack of this is a generic nightmare with all forms of parallel
execution, and is why I rarely bothered with interactive debuggers.
The distortion to the sequencing usually caused the problems to change
enough that they were little help.


Regards,
Nick Maclaren.

MitchAlsup
Posted: Sun Sep 07, 2008 3:45 am    Post subject: Re: AMD working on scaleable hardware-based atomic transacti

On Sep 6, 1:29 pm, n...@cus.cam.ac.uk (Nick Maclaren) wrote:
Quote:
My belief is that a language should forbid such accesses, and a
compiler should warn very strongly if they are being done.  Your
point about debugging is well-taken, but that is probably the only
justifiable use of non-protected accesses in a protected section.
But that is not a hardware matter.

My belief is that once you have seen any interference (with respect
to the addresses you are using to access a CDS), you should have
to go back to some 'sane' position and start searching thence. There
is every likelihood that the structure has changed from the one you
looked at just a few instructions ago.

So, in this case, the registers and memory from within the atomic
event should not be considered useable to the application--if it has
the correct CDS access model.
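
One way to picture "go back to a sane position" is a search that throws away its cursor and restarts from the head as soon as it notices interference. The sketch below uses a version counter to detect a concurrent writer; that counter, the `on_step` hook and the names `VersionedList` and `find` are assumptions made for illustration and are not how ASF signals interference.

```python
class Node:
    def __init__(self, key, next=None):
        self.key = key
        self.next = next

class VersionedList:
    def __init__(self, keys):
        self.head = None
        for key in reversed(keys):
            self.head = Node(key, self.head)
        self.version = 0              # bumped by every writer

    def insert_front(self, key):
        self.head = Node(key, self.head)
        self.version += 1

def find(lst, key, on_step=None):
    """Return (found, restarts). `on_step` lets a caller inject a concurrent writer."""
    restarts = 0
    while True:
        seen = lst.version            # the 'sane' starting point
        node = lst.head
        interfered = False
        while node is not None:
            if on_step is not None:
                on_step()
            if lst.version != seen:   # structure changed under us
                interfered = True     # discard `node`; it may be stale
                break
            if node.key == key:
                return True, restarts
            node = node.next
        if not interfered:
            return False, restarts
        restarts += 1
```

Nothing learned during the interrupted pass is reused: the cursor is dropped and the walk begins again from the head.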

Mitch

Nick Maclaren
Posted: Sun Sep 07, 2008 2:52 pm    Post subject: Re: AMD working on scaleable hardware-based atomic transacti

In article <c82c9ba2-8b21-4d94-ab73-9d5c55da5c59@x41g2000hsb.googlegroups.com>,
MitchAlsup <MitchAlsup@aol.com> writes:
|>
|> > My belief is that a language should forbid such accesses, and a
|> > compiler should warn very strongly if they are being done.  Your
|> > point about debugging is well-taken, but that is probably the only
|> > justifiable use of non-protected accesses in a protected section.
|> > But that is not a hardware matter.
|>
|> My belief is that once you have seen any interference (with respect
|> to the addresses you are using to access a CDS), you should have
|> to go back to some 'sane' position and start searching thence. There
|> is every likelihood that the structure has changed from the one you
|> looked at just a few instructions ago.
|>
|> So, in this case, the registers and memory from within the atomic
|> event should not be considered useable to the application--if it has
|> the correct CDS access model.

Indeed. Our two statements are more a difference of whose job it is
to ensure that is done! I am a great believer in languages and
compilers doing such 'enforcement', because it is something that is
best done systematically and can be done by following simple rules.


Regards,
Nick Maclaren.