The ceph_pagelist is a simple list of whole pages, strung together via
their lru list_head. It facilitates encoding to a "buffer" of unknown
size. Allow its use in place of the ceph_msg page vector.
This will be used to fix the huge buffer preallocation woes of MDS
reconnection.
Signed-off-by: Sage Weil <sage@newdream.net>
When we issue an OSD read, we specify a vector of pages that the data is to
be read into. The request may be sent multiple times, to multiple OSDs, if
the osdmap changes, which means we can get more than one reply.
Only read data into the page vector if the reply is coming from the
OSD we last sent the request to. Keep track of which connection is using
the vector by taking a reference. If another connection was already
using the vector before and a new reply comes in on the right connection,
revoke the pages from the other connection.
Signed-off-by: Sage Weil <sage@newdream.net>
Use a single mutex (previously out_mutex) to protect both read and write
activity from concurrent ceph_con_* calls. Drop the mutex when doing
callbacks to avoid nested locking (the callback may need to call something
like ceph_con_close).
Signed-off-by: Sage Weil <sage@newdream.net>
When we open a monitor session, we send an initial AUTH message listing
the auth protocols we support, our entity name, and (possibly) a previously
assigned global_id. The monitor chooses a protocol and responds with an
initial message.
Initially implement AUTH_NONE, a dummy protocol that provides no security,
but works within the new framework. It generates 'authorizers' that are
used when connecting to (mds, osd) services that simply state our entity
name and global_id.
This is a wire protocol change.
Signed-off-by: Sage Weil <sage@newdream.net>
We want to ceph_con_close when we're done with the connection, before
the ref count reaches 0. Once it does, do not call ceph_con_shutdown,
as that takes the con mutex and may sleep, and besides that is
unnecessary.
Signed-off-by: Sage Weil <sage@newdream.net>
We need to make sure we only swab the address during the banner once. So
break process_banner out of process_connect, and clean up the surrounding
code so that these are distinct phases of the handshake.
Signed-off-by: Sage Weil <sage@newdream.net>
We exchange struct ceph_entity_addr over the wire and store it on disk.
The sockaddr_storage.ss_family field, however, is host endianness. So,
fix ss_family endianness to big endian when sending/receiving over the
wire.
Signed-off-by: Sage Weil <sage@newdream.net>
A generic message passing library is used to communicate with all
other components in the Ceph file system. The messenger library
provides ordered, reliable delivery of messages between two nodes in
the system.
This implementation is based on TCP.
Signed-off-by: Sage Weil <sage@newdream.net>