一首诗 Guest
|
Posted: Sat Nov 08, 2008 4:27 pm Post subject: Asynchronous Disk IO on linux |
|
|
Hi all,
I just read "The C10K problem" and I found this line:
An important bottleneck in this method is that read() or
sendfile() from disk blocks if the page is not in core at the moment;
setting nonblocking mode on a disk file handle has no effect.
So does that mean:
1. using select or epoll on disk IO is useless?
2. There is no non-blocking disk IO at all? |
|
| |
|
David Schwartz Guest
|
Posted: Sat Nov 08, 2008 4:36 pm Post subject: Re: Asynchronous Disk IO on linux |
|
|
On Nov 8, 8:27 am, 一首诗 <newpt...@gmail.com> wrote:
| Quote: | Hi all,
I just read "The C10K problem" and I found this line:
An important bottleneck in this method is that read() or
sendfile() from disk blocks if the page is not in core at the moment;
setting nonblocking mode on a disk file handle has no effect.
|
Right.
| Quote: | So does that mean:
1. using select or epoll on disk IO is useless?
|
That's correct. The descriptor will always be ready for both reading
and writing.
| Quote: | 2. There is no non-blocking disk IO at all?
|
Not anything like non-blocking network I/O. Read the man page for
'io_setup' and 'aio_read'.
I personally recommend using threads for this purpose.
DS |
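The "always ready" behaviour described above can be checked with a minimal sketch, assuming Linux and a writable /tmp. The function name and the scratch-file path are made up for illustration. It sets O_NONBLOCK on a regular file and polls it with a zero timeout; on Linux the descriptor should come back both readable and writable immediately.

```c
#include <assert.h>
#include <fcntl.h>
#include <poll.h>
#include <stdlib.h>
#include <unistd.h>

/* Poll a freshly created regular file with a zero timeout.
 * Returns the revents mask reported by poll(), or -1 on error.
 * The name and the scratch path are illustrative only. */
int poll_regular_file(void)
{
    char path[] = "/tmp/pollfileXXXXXX";
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                       /* file vanishes once fd is closed */

    /* O_NONBLOCK is accepted on a regular file but has no effect. */
    int flags = fcntl(fd, F_GETFL);
    if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0) {
        close(fd);
        return -1;
    }

    struct pollfd pfd = { .fd = fd, .events = POLLIN | POLLOUT };
    int n = poll(&pfd, 1, 0);           /* timeout 0: never wait */
    assert(n <= 1);                     /* only one descriptor was polled */
    close(fd);
    return n == 1 ? pfd.revents : -1;
}
```

Calling poll_regular_file() and testing the result against POLLIN and POLLOUT shows both bits set even though the file is empty, which is why readiness notification says nothing about whether a later read() will have to wait for the disk.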
|
| |
|
Guest
|
Posted: Mon Nov 10, 2008 4:20 pm Post subject: Re: Asynchronous Disk IO on linux |
|
|
Wouldn't it be better if there were an option to open() like what is
proposed in http://lkml.org/lkml/2005/3/17/139 ?
I personally would like to avoid all of the concurrency baggage that
comes with adding thread usage to my process.
Does anyone know if this functionality is currently being discussed or
prototyped within the kernel dev community?
David |
|
|
| |
|
David Schwartz Guest
|
Posted: Mon Nov 10, 2008 7:18 pm Post subject: Re: Asynchronous Disk IO on linux |
|
|
On Nov 10, 8:20 am, ben...@xdal.org wrote:
| Quote: | Wouldn't it be better if there were an option to open() like what is
proposed in http://lkml.org/lkml/2005/3/17/139 ?
|
I don't think so, since the semantics are not sensible.
| Quote: | I personally would like to avoid all of the concurrency baggage that
comes with adding thread usage to my process.
|
If you don't want concurrency, what are we talking about? If you want
concurrency, how is the concurrency a bad thing?
| Quote: | Does anyone know if this functionality is currently being discussed or
prototyped within the kernel dev community?
|
Nobody has yet come up with sensible semantics. The reason sockets
have sensible semantics is that it's very clear what it means for a
socket to be writable or readable. There is no such obvious meaning
for a file. What does it mean for a file to be readable?
Linux does have threads and does have aio_read/aio_write.
DS |
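For readers who have not used the POSIX aio calls mentioned above, here is a minimal sketch, assuming glibc on Linux (older glibc versions need -lrt at link time). The helper names aio_read_wait and aio_read_demo and the scratch path are invented for this example. It queues one read with aio_read(), waits with aio_suspend(), and collects the result with aio_return().

```c
#include <aio.h>
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Submit one asynchronous read, wait for it, and return the byte count
 * (or -1 with errno set). Illustrative helper, not a library function. */
ssize_t aio_read_wait(int fd, void *buf, size_t len, off_t off)
{
    assert(buf != NULL && len > 0);

    struct aiocb cb;
    memset(&cb, 0, sizeof cb);
    cb.aio_fildes = fd;
    cb.aio_buf = buf;
    cb.aio_nbytes = len;
    cb.aio_offset = off;

    if (aio_read(&cb) != 0)             /* queues the request and returns */
        return -1;

    /* buf and cb must stay alive until the request has completed.
     * An event-driven program would do other work here instead. */
    const struct aiocb *const pending[1] = { &cb };
    while (aio_error(&cb) == EINPROGRESS)
        aio_suspend(pending, 1, NULL);

    int err = aio_error(&cb);
    ssize_t n = aio_return(&cb);        /* collect the result exactly once */
    if (err != 0) {
        errno = err;
        return -1;
    }
    return n;
}

/* Write five bytes to a scratch file and read them back asynchronously. */
ssize_t aio_read_demo(void)
{
    char path[] = "/tmp/aiodemoXXXXXX";
    char buf[16];
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);
    if (write(fd, "hello", 5) != 5) {
        close(fd);
        return -1;
    }
    ssize_t n = aio_read_wait(fd, buf, sizeof buf, 0);
    close(fd);
    return (n == 5 && memcmp(buf, "hello", 5) == 0) ? n : -1;
}
```

Note that the request carries its own offset and buffer, so there is no ambiguity about what "ready" means: completion refers to exactly this read.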
|
| |
|
Guest
|
Posted: Mon Nov 10, 2008 9:28 pm Post subject: Re: Asynchronous Disk IO on linux |
|
|
Here are the semantics I want (pseudo code):
fd=open(file, non-blocking);
n = read(fd, buf, 1000);
at this point, the kernel will check "do I have 1000 bytes available
to read? if yes, return them. If no, return the number I have and set
errno to EWOULDBLOCK, then start the process of paging in that data
from the disk. When it is available, signal READ on the fd for
epoll()/poll()/select().
And if I do:
n = write(fd, buf, x);
it does something similar.
re: aio, I consider that solution messy, especially when mixing it
with non-blocking socket handling. Since the operations aren't atomic
(i.e. when I "post" a read, I have to "cancel" it before I can release
my buffer, etc.), that means I have to do a LOT more management of
structures after I want to get rid of them. This is likely what I'm
going to have to use, but it reminds me too much of OVERLAPPED
madness in the Windows API.
re: concurrency, obviously I want to have "concurrency"... my point
was that introducing threads introduces many concurrency issues that
one does not have in a completely epoll() based processing model. I
don't want to start worrying about TLS, resource locks, inter-thread
communication, etc.
David |
|
|
| |
|
David Schwartz Guest
|
Posted: Mon Nov 10, 2008 9:58 pm Post subject: Re: Asynchronous Disk IO on linux |
|
|
On Nov 10, 1:28 pm, ben...@xdal.org wrote:
| Quote: | Here are the semantics I want (pseudo code):
fd=open(file, non-blocking);
n = read(fd, buf, 1000);
at this point, the kernel will check "do I have 1000 bytes available
to read? if yes, return them. If no, return the number I have and set
errno to EWOULDBLOCK, then start the process of paging in that data
from the disk. When it is available, signal READ on the fd for
epoll()/poll()/select().
|
When what is available? You started out explaining the semantics and
then stopped. When "it" is available? What's "it"? All 1,000 bytes?
The next byte?
What happens if a seek intervenes before the 'select'? What if you
'select' without calling 'read'?
| Quote: | And if I do:
n = write(fd, buf, x);
it does something similar.
|
Which would be? Nobody knows what the semantics for these operations
should be.
| Quote: | re: aio, I consider that solution messy, especially when mixing it
with non-blocking socket handling. Since the operations aren't atomic
(i.e. when I "post" a read, I have to "cancel" it before I can release
my buffer, etc.), that means I have to do a LOT more management of
structures after I want to get rid of them. This is likely what I'm
going to have to use, but it reminds me too much of OVERLAPPED
madness in the Windows API.
|
But that is the right way. That solves all the semantic problems and
shows why normal non-blocking semantics don't work.
| Quote: | re: concurrency, obviously I want to have "concurrency"... my point
was that introducing threads introduces many concurrency issues that
one does not have in a completely epoll() based processing model. I
don't want to start worrying about TLS, resource locks, inter-thread
communication, etc.
|
Then don't do those things. You want a solution that works by magic.
DS |
|
| |
|
Guest
|
Posted: Mon Nov 10, 2008 10:22 pm Post subject: Re: Asynchronous Disk IO on linux |
|
|
I'm going to think about this more and perhaps respond with a more
thought out proposal. I don't want magic. I want clear, clean and
atomic operations.
David |
|
|
| |
|
David Schwartz Guest
|
Posted: Mon Nov 10, 2008 11:04 pm Post subject: Re: Asynchronous Disk IO on linux |
|
|
On Nov 10, 2:22 pm, ben...@xdal.org wrote:
| Quote: | I'm going to think about this more and perhaps respond with a more
thought out proposal. I don't want magic. I want clear, clean and
atomic operations.
|
From what I can tell, Windows OVERLAPPED operations or POSIX aio
operations are the correct semantics for non-blocking operations on
files. Normal non-blocking semantics just don't work, because
there's no single notion of "readability" or "writability" that makes
sense.
There are many clean, simple ways you can solve this problem. You can
queue bite-sized write operations to a group of worker threads. You
can use asynchronous read.
You complain about the overhead, but then you ask for precisely that
same overhead. Any non-blocking write is going to require that the data
be stored until it can be committed, but that's precisely what you
claim is too much overhead. There is no difference in overhead based on
who does it, and if you do it, you get to control it.
It sounds like your complaint has nothing whatsoever to do with kernel
or OS mechanisms. You don't want a new 'open' mode or new write
semantics. You just want somebody to write an asynchronous file I/O
library for you.
I find that really, really odd. Since it's so simple.
Sit down and write what you want. Use either aio or threads. Stop
complaining and code. ;)
DS |
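One way the thread-based approach could look, as a rough sketch and not the only possible design: a worker thread performs the blocking pread() and then writes one byte to a pipe, so the main event loop can treat the pipe like any other pollable descriptor. The struct read_job type, the function names and the scratch path are invented here; Linux with pthreads is assumed (link with -pthread on older toolchains).

```c
#include <assert.h>
#include <poll.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* One read request handed to a worker thread (illustrative type). */
struct read_job {
    int fd;            /* file to read from */
    void *buf;         /* destination, owned by the caller */
    size_t len;
    off_t off;
    ssize_t result;    /* filled in by the worker */
    int notify_fd;     /* write end of a pipe watched by the event loop */
};

static void *read_worker(void *arg)
{
    struct read_job *job = arg;
    assert(job != NULL);
    /* Only this thread blocks if the page is not in core. */
    job->result = pread(job->fd, job->buf, job->len, job->off);
    char done = 1;
    if (write(job->notify_fd, &done, 1) != 1)   /* wake the event loop */
        job->result = -1;
    return NULL;
}

/* Returns 5 if the five bytes written to a scratch file were read back
 * by the worker, or -1 on any failure. */
int threaded_read_demo(void)
{
    char path[] = "/tmp/thrdemoXXXXXX";
    char buf[16];
    int pipefd[2];
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);
    if (write(fd, "hello", 5) != 5 || pipe(pipefd) != 0) {
        close(fd);
        return -1;
    }

    struct read_job job = { .fd = fd, .buf = buf, .len = sizeof buf,
                            .off = 0, .result = -1, .notify_fd = pipefd[1] };
    pthread_t tid;
    if (pthread_create(&tid, NULL, read_worker, &job) != 0)
        return -1;

    /* Stand-in for the real event loop: sockets would share this poll set. */
    struct pollfd pfd = { .fd = pipefd[0], .events = POLLIN };
    int ready = poll(&pfd, 1, -1);
    char done;
    ssize_t got = (ready == 1) ? read(pipefd[0], &done, 1) : -1;

    pthread_join(tid, NULL);            /* job and buf must outlive the worker */
    close(pipefd[0]);
    close(pipefd[1]);
    close(fd);
    return (got == 1 && job.result == 5 &&
            memcmp(buf, "hello", 5) == 0) ? 5 : -1;
}
```

The only state shared between the threads is the job structure, and ownership passes back to the main thread when the byte arrives on the pipe, so no locks are needed in this particular arrangement.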
|
| |
|
John Reiser Guest
|
Posted: Tue Nov 11, 2008 1:34 am Post subject: Re: Asynchronous Disk IO on linux |
|
|
benoit@xdal.org wrote:
| Quote: | Wouldn't it be better if there were an option to open() like what is
proposed in http://lkml.org/lkml/2005/3/17/139 ?
|
There are obvious typos (decimal instead of octal) in the proposed patch
for asm-parisc and asm-alpha:
+#define O_ATOMICREAD 10000000 /* non-blocking file i/o */
This may indicate that little or no testing has been done, which can
cast doubt on the rest of the proposal.
| Quote: | I personally would like to avoid all of the concurrency baggage that
comes with adding thread usage to my process.
|
It looks like the implementation just returns -EWOULDBLOCK whenever
an operation would block. If so, then you have not avoided
"all of the concurrency baggage", because spinning/time_out/re-try
is concurrency baggage. Also, it is not obvious that the implementation
guarantees forward progress. What prevents -EWOULDBLOCK forever
if user code always retries the same operation?
-- |
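To make the decimal-versus-octal point concrete, here is a small sketch. It assumes the intended value was octal 010000000, which the patch does not state explicitly; the helper names are made up. A new open() flag is normally expected to occupy a single bit, and only the octal spelling does.

```c
#include <assert.h>

/* True if exactly one bit is set, as a new open(2) flag normally would be. */
int is_single_bit_flag(unsigned long v)
{
    return v != 0 && (v & (v - 1)) == 0;
}

/* Number of bits set in v. */
int count_bits(unsigned long v)
{
    int n = 0;
    while (v != 0) {
        v &= v - 1;     /* clear the lowest set bit */
        n++;
    }
    return n;
}
```

With these helpers, 010000000 (octal, 0x200000) is a single bit, while 10000000 (decimal, 0x989680) has eight bits set and would therefore collide with other flag bits.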
|
| |
|
Rainer Weikusat Guest
|
Posted: Tue Nov 11, 2008 4:18 pm Post subject: Re: Asynchronous Disk IO on linux |
|
|
benoit@xdal.org writes:
| Quote: | fd=open(file, non-blocking);
n = read(fd, buf, 1000);
at this point, the kernel will check "do I have 1000 bytes available
to read? if yes, return them. If no, return the number I have and set
errno to EWOULDBLOCK, then start the process of paging in that data
from the disk.
|
Well, what you want is an interface to request that the kernel read a
certain number of bytes from a particular descriptor, starting at a
particular offset, into memory and inform the application when that
has been done. This implies that it would be possible to implement a
'data availability' polling function via pread. But there still
wouldn't be a way to signal availability of this data without
explicitly communicating the desired I/O parameters to the
kernel. Which means aio. The kernel can be told to post a signal to a
process upon completion of an aio event, and this notification may
include an arbitrary, user-supplied integer or pointer value
(according to SUS). Since Linux 2.6.22, signals can be received via a
file descriptor (signalfd).
So, what's your problem? Random-access I/O is more complicated than
'stream'-I/O because arbitrary 'positions' are involved. |
|
| |
|
mman Guest
|
Posted: Thu Nov 13, 2008 4:35 pm Post subject: Re: Asynchronous Disk IO on linux |
|
|
On Nov 10, 11:28 pm, ben...@xdal.org wrote:
| Quote: | Here are the semantics I want (pseudo code):
fd=open(file, non-blocking);
n = read(fd, buf, 1000);
at this point, the kernel will check "do I have 1000 bytes available
to read? if yes, return them. If no, return the number I have and set
errno to EWOULDBLOCK, then start the process of paging in that data
from the disk. When it is available, signal READ on the fd for
epoll()/poll()/select().
And if I do:
n = write(fd, buf, x);
it does something similar.
re: aio, I consider that solution messy, especially when mixing it
with non-blocking socket handling. Since the operations aren't atomic
(i.e. when I "post" a read, I have to "cancel" it before I can release
my buffer, etc.), that means I have to do a LOT more management of
structures after I want to get rid of them. This is likely what I'm
going to have to use, but it reminds me too much of OVERLAPPED
madness in the Windows API.
|
I agree with you. "aio" does not mix well with non-blocking socket
handling (or, let me add, with single-process, event-driven designs).
| Quote: |
re: concurrency, obviously I want to have "concurrency"... my point
was that introducing threads introduces many concurrency issues that
one does not have in a completely epoll() based processing model. I
don't want to start worrying about TLS, resource locks, inter-thread
communication, etc.
|
Would you consider separating the file I/O parts of your program,
putting them into another program, and letting this new program
communicate with your current program over UNIX-domain sockets and
shared memory?
The UNIX-domain socket would be used to instruct the new program to
read data from files and put it into a shared memory segment (and
likewise for writes). Then, when the operation has completed or
failed, your current program is informed through the established
connection over the UNIX-domain socket.
The shared memory minimizes the data copies between the two programs,
and you can integrate it easily with non-blocking socket handling.
Michael. |
|
| |
|
|