kernel_optimize_test

Author	SHA1	Message	Date
Alex Elder	daba5fdb4c	rbd: rename snap_exists field A Boolean field "snap_exists" in an rbd mapping is used to indicate whether a mapped snapshot has been removed from an image's snapshot context, to stop sending requests for that snapshot as soon as we know it's gone. Generalize the interpretation of this field so it applies to non-snapshot (i.e. "head") mappings. That is, define its value to be false until the mapping has been set, and then define it to be true for both snapshot mappings or head mappings. Rename the field "exists" to reflect the broader interpretation. The rbd_mapping structure is on its way out, so move the field back into the rbd_device structure. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2012-10-30 08:34:29 -05:00
Alex Elder	971f839a76	rbd: move snap info out of rbd_mapping struct Moving the snap_id and snap_name fields into the separate rbd_mapping structure was misguided. (And in time, perhaps we'll do away with that structure altogether...) Move these fields back into struct rbd_device. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2012-10-30 08:34:29 -05:00
Alex Elder	86992098e7	rbd: make pool_id a 64 bit value If a format 2 image has a parent, its pool id will be specified using a 64-bit value. Change the pool id we save for an image to match that. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2012-10-30 08:34:29 -05:00
Alex Elder	41f38c2b2f	rbd: remove snapshots on error in rbd_add() If rbd_dev_snaps_update() has ever been called for an rbd device structure there could be snapshot structures on its snaps list. In rbd_add(), this function is called but a subsequent error path neglected to clean up any of these snapshots. Add a call to rbd_remove_all_snaps() in the appropriate spot to remedy this. Change a couple of error labels to be a little clearer while there. Drop the leading underscores from the function name; there's nothing special about that function that they might signify. As suggested in review, the leading underscores in __rbd_remove_snap_dev() have been removed as well. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2012-10-30 08:34:28 -05:00
Alex Elder	f7760dad28	rbd: simplify rbd_rq_fn() When processing a request, rbd_rq_fn() makes clones of the bio's in the request's bio chain and submits the results to osd's to be satisfied. If a request bio straddles the boundary between objects backing the rbd image, it must be represented by two cloned bio's, one for the first part (at the end of one object) and one for the second (at the beginning of the next object). This has been handled by a function bio_chain_clone(), which includes an interface only a mother could love, and which has been found to have other problems. This patch defines two new fairly generic bio functions (one which replaces bio_chain_clone()) to help out the situation, and then revises rbd_rq_fn() to make use of them. First, bio_clone_range() clones a portion of a single bio, starting at a given offset within the bio and including only as many bytes as requested. As a convenience, a request to clone the entire bio is passed directly to bio_clone(). Second, bio_chain_clone_range() performs a similar function, producing a chain of cloned bio's covering a sub-range of the source chain. No bio_pair structures are used, and if successful the result will represent exactly the specified range. Using bio_chain_clone_range() makes rbd_rq_fn() a little easier to understand, because it avoids the need to pass very much state information between consecutive calls. By avoiding the need to track a bio_pair structure, it also eliminates the problem described here: http://tracker.newdream.net/issues/2933 Note that a block request (and therefore the complete length of a bio chain processed in rbd_rq_fn()) is an unsigned int, while the result of rbd_segment_length() is u64. This change makes this range truncation explicit, and trips a bug if the segment boundary is too far off. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2012-10-30 08:34:28 -05:00
Sage Weil	0ed7285e00	libceph: fix osdmap decode error paths Ensure that we set the err value correctly so that we do not pass a 0 value to ERR_PTR and confuse the calling code. (In particular, osd_client.c handle_map() will BUG(!newmap)). Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Alex Elder <elder@inktank.com>	2012-10-30 08:21:05 -05:00
Alex Elder	069a4b5690	rbd: kill rbd_device->rbd_opts The rbd_device structure has an embedded rbd_options structure. Such a structure is needed to work with the generic ceph argument parsing code, but there's no need to keep it around once argument parsing is done. Use a local variable to hold the rbd options used in parsing in rbd_get_client(), and just transfer its content (it's just a read_only flag) into the field in the rbd_mapping sub-structure that requires that information. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com> Reviewed-by: Dan Mick <dan.mick@inktank.com>	2012-10-26 17:18:08 -05:00
Alex Elder	e5cfeed281	rbd: simplify rbd_merge_bvec() The aim of this patch is to make what's going on in rbd_merge_bvec() a bit more obvious than it was before. This was an issue when a recent btrfs bug led us to question whether the merge function was working correctly. Use "obj" rather than "chunk" to indicate that the units whose boundaries we care about are what we call (rados) "objects". Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Dan Mick <dan.mick@inktank.com>	2012-10-26 17:18:08 -05:00
Alex Elder	d4b125e9eb	rbd: increase maximum snapshot name length Change RBD_MAX_SNAP_NAME_LEN to be based on NAME_MAX. That is a practical limit for the length of a snapshot name (based on the presence of a directory using the name under /sys/bus/rbd to represent the snapshot). The /sys entry is created by prefixing it with "snap_"; define that prefix symbolically, and take its length into account in defining the snapshot name length limit. Enforce the limit in rbd_add_parse_args(). Also delete a dout() call in that function that was not meant to be committed. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Dan Mick <dan.mick@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2012-10-26 17:18:08 -05:00
Alex Elder	db2388b6ee	rbd: verify rbd image order value This adds a verification that an rbd image's object order is within the upper and lower bounds supported by this implementation. It must be at least 9 (SECTOR_SHIFT), because the Linux bio system assumes that minimum granularity. It also must be less than 32 (at the moment anyway) because there exist spots in the code that store the size of a "segment" (object backing an rbd image) in a signed int variable, which can be 32 bits including the sign. We should be able to relax this limit once we've verified the code uses 64-bit types where needed. Note that the CLI tool already limits the order to the range 12-25. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2012-10-26 17:18:08 -05:00
Alex Elder	4634246db8	rbd: consolidate rbd_do_op() calls The two calls to rbd_do_op() from rbd_rq_fn() differ only in the value passed for the snapshot id and the snapshot context. For reads the snapshot always comes from the mapping, and for writes the snapshot id is always CEPH_NOSNAP. The snapshot context is always null for reads. For writes, the snapshot context always comes from the rbd header, but it is acquired under protection of header semaphore and could change thereafter, so we can't simply use what's available inside rbd_do_op(). Eliminate the snapid parameter from rbd_do_op(), and set it based on the I/O direction inside that function instead. Always pass the snapshot context acquired in the caller, but reset it to a null pointer inside rbd_do_op() if the operation is a read. As a result, there is no difference in the read and write calls to rbd_do_op() made in rbd_rq_fn(), so just call it unconditionally. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2012-10-26 17:18:08 -05:00
Alex Elder	ff2e4bb5b3	rbd: drop rbd_do_op() opcode and flags The only callers of rbd_do_op() are in rbd_rq_fn(), where one call is used for writes and the other is used for reads. The request passed to rbd_do_op() already encodes the I/O direction, and that information can be used inside the function to set the opcode and flags value (rather than passing them in as arguments). So get rid of the opcode and flags arguments to rbd_do_op(). Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2012-10-26 17:18:08 -05:00
Alex Elder	13f4042c05	rbd: kill rbd_req_{read,write}() Both rbd_req_read() and rbd_req_write() are simple wrapper routines for rbd_do_op(), and each is only called once. Replace each wrapper call with a direct call to rbd_do_op(), and get rid of the wrapper functions. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2012-10-26 17:18:08 -05:00
Alex Elder	be466c1cc3	rbd: fix read-only option name The name of the "read-only" mapping option was inadvertently changed in this commit: `f84344f3` rbd: separate mapping info in rbd_dev Revert that hunk to return it to what it should be. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Dan Mick <dan.mick@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2012-10-26 17:18:08 -05:00
Alex Elder	a0ea3a40fd	rbd: zero return code in rbd_dev_image_id() When rbd_dev_probe() calls rbd_dev_image_id() it expects to get a 0 return code if successful, but it is getting a positive value. The reason is that rbd_dev_image_id() returns the value it gets from rbd_req_sync_exec(), which returns the number of bytes read in as a result of the request. (This ultimately comes from ceph_copy_from_page_vector() in rbd_req_sync_op()). Force the return value to 0 when successful in rbd_dev_image_id(). Do the same in rbd_dev_v2_object_prefix(). Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com> Reviewed-by: Dan Mick <dan.mick@inktank.com>	2012-10-26 17:18:08 -05:00
Alex Elder	b213e0b1a6	rbd: fix bug in rbd_dev_id_put() In rbd_dev_id_put(), there's a loop that's intended to determine the maximum device id in use. But it isn't doing that at all; the effect of how it's written is to simply use the just-put id number, which ignores the whole purpose of this function. Fix the bug. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2012-10-26 17:18:08 -05:00
David Zafman	b000056a5a	ceph: Fix NULL ptr crash in strlen() set_request_path_attr() checks for NULL ptr before calling strlen() This fixes http://tracker.newdream.net/issues/3404 Signed-off-by: David Zafman <david.zafman@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>	2012-10-26 16:35:07 -05:00
Sage Weil	7246240c7c	libceph: avoid NULL kref_put from NULL alloc_msg return The ceph_on_in_msg_alloc() method calls the ->alloc_msg() helper which may return NULL. It also drops con->mutex while it allocates a message, which means that the connection state may change (e.g., get closed). If that happens, we clean up and bail out. Avoid calling ceph_msg_put() on a NULL return value and triggering a crash. This was observed when an ->alloc_msg() call races with a timeout that resends a zillion messages and resets the connection, and ->alloc_msg() returns NULL (because the request was resent to another target). Fixes http://tracker.newdream.net/issues/3342 Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Alex Elder <elder@inktank.com>	2012-10-26 16:35:04 -05:00
David Zafman	0f9831a893	ceph: fix dentry reference leak in encode_fh() Call to d_find_alias() needs a corresponding dput() This fixes http://tracker.newdream.net/issues/3271 Signed-off-by: David Zafman <david.zafman@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>	2012-10-26 16:34:53 -05:00
Alex Elder	35152979e6	rbd: activate v2 image support Now that v2 image support is fully implemented, have rbd_dev_v2_probe() return 0 to indicate a successful probe. (Note that an image that implements layering will fail the probe early because of the feature check.) Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2012-10-10 07:44:01 -07:00
Alex Elder	d889140c4a	rbd: implement feature checks Version 2 images have two sets of feature bit fields. The first indicates features possibly used by the image. The second indicates features that the client must support in order to use the image. When an image (or snapshot) is first examined, we need to make sure that the local implementation supports the image's required features. If not, fail the probe for the image. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2012-10-10 07:43:51 -07:00
Alex Elder	117973fb4c	rbd: define rbd_dev_v2_refresh() Define a new function rbd_dev_v2_refresh() to update/refresh the snapshot context for a format version 2 rbd image. This function will update anything that is not fixed for the life of an rbd image--at the moment this is mainly the snapshot context and (for a base mapping) the size. Update rbd_refresh_header() so it selects which function to use based on the image format. Rename __rbd_refresh_header() to be rbd_dev_v1_refresh() to be consistent with the naming of its version 2 counterpart. Similarly rename rbd_refresh_header() to be rbd_dev_refresh(). Unrelated--we use rbd_image_format_valid() here. Delete the other use of it, which was primarily put in place to ensure that function was referenced at the time it was defined. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2012-10-10 07:43:39 -07:00
Alex Elder	9478554ae5	rbd: define rbd_update_mapping_size() Encapsulate the code that handles updating the size of a mapping after an rbd image has been refreshed. This is done in anticipation of the next patch, which will make this common code for format 1 and 2 images. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2012-10-10 07:43:28 -07:00
Alex Elder	802c6d967f	rbd: define common queue_con_delay() This patch defines a single function, queue_con_delay() to call queue_delayed_work() for a connection. It basically generalizes what was previously queue_con() by adding the delay argument. queue_con() is now a simple helper that passes 0 for its delay. queue_con_delay() returns 0 if it queued work or an errno if it did not for some reason. If con_work() finds the BACKOFF flag set for a connection, it now calls queue_con_delay() to handle arranging to start again after a delay. Note about connection reference counts: con_work() only ever gets called as a work item function. At the time that work is scheduled, a reference to the connection is acquired, and the corresponding con_work() call is then responsible for dropping that reference before it returns. Previously, the backoff handling inside con_work() silently handed off its reference to delayed work it scheduled. Now that queue_con_delay() is used, a new reference is acquired for the newly-scheduled work, and the original reference is dropped by the con->ops->put() call at the end of the function. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>	2012-10-09 22:00:44 -07:00
Alex Elder	8618e30bc1	rbd: let con_work() handle backoff Both ceph_fault() and con_work() include handling for imposing a delay before doing further processing on a faulted connection. The latter is used only if ceph_fault() is unable to. Instead, just let con_work() always be responsible for implementing the delay. After setting up the delay value, set the BACKOFF flag on the connection unconditionally and call queue_con() to ensure con_work() will get called to handle it. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>	2012-10-09 22:00:21 -07:00
Alex Elder	588377d619	rbd: reset BACKOFF if unable to re-queue If ceph_fault() is unable to queue work after a delay, it sets the BACKOFF connection flag so con_work() will attempt to do so. In con_work(), when BACKOFF is set, if queue_delayed_work() doesn't result in newly-queued work, it simply ignores this condition and proceeds as if no backoff delay were desired. There are two problems with this--one of which is a bug. The first problem is simply that the intended behavior is to back off, and if we aren't able to queue the work item to run after a delay we're not doing that. The only reason queue_delayed_work() won't queue work is if the provided work item is already queued. In the messenger, this means that con_work() is already scheduled to be run again. So if we simply set the BACKOFF flag again when this occurs, we know the next con_work() call will again attempt to hold off activity on the connection until after the delay. The second problem--the bug--is a leak of a reference count. If queue_delayed_work() returns 0 in con_work(), con->ops->put() drops the connection reference held on entry to con_work(). However, processing is (was) allowed to continue, and at the end of the function a second con->ops->put() is called. This patch fixes both problems. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>	2012-10-09 21:59:52 -07:00
Alex Elder	6285bc2312	ceph: avoid 32-bit page index overflow A pgoff_t is defined (by default) to have type (unsigned long). On architectures such as i686 that's a 32-bit type. The ceph address space code was attempting to produce 64 bit offsets by shifting a page's index by PAGE_CACHE_SHIFT, but the result was not what was desired because the shift occurred before the result got promoted to 64 bits. Fix this by converting all uses of page->index used in this way to use the page_offset() macro, which ensures the 64-bit result has the intended value. This fixes http://tracker.newdream.net/issues/3112 Reported-by: Mohamed Pakkeer <pakkeer.mohideen@realimage.com> Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>	2012-10-03 10:51:18 -05:00
Sage Weil	457712a0bc	ceph: return EIO on invalid layout on GET_DATALOC ioctl If the user calls GET_DATALOC on a file with an invalid (e.g., zeroed) layout, return EIO to userland. Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Alex Elder <elder@inktank.com>	2012-10-03 10:51:17 -05:00
Sage Weil	6cae3717cd	rbd: BUG on invalid layout This shouldn't actually be possible because the layout struct is constructed from the RBD header and validated then. [elder@inktank.com: converted BUG() call to equivalent rbd_assert()] Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Alex Elder <elder@inktank.com>	2012-10-01 17:20:00 -05:00
Sage Weil	6816282dab	ceph: propagate layout error on osd request creation If we are creating an osd request and get an invalid layout, return an EINVAL to the caller. We switch up the return to have an error code instead of NULL implying -ENOMEM. Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Alex Elder <elder@inktank.com>	2012-10-01 17:20:00 -05:00
Sage Weil	d63b77f4c5	libceph: check for invalid mapping If we encounter an invalid (e.g., zeroed) mapping, return an error and avoid a divide by zero. Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Alex Elder <elder@inktank.com>	2012-10-01 17:20:00 -05:00
Wei Yongjun	b905a7f8b7	ceph: convert to use le32_add_cpu() Convert cpu_to_le32(le32_to_cpu(E1) + E2) to use le32_add_cpu(). dpatch engine is used to auto generate this patch. (https://github.com/weiyj/dpatch) Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: Sage Weil <sage@inktank.com>	2012-10-01 14:30:54 -05:00
Yan, Zheng	3e8f43a089	ceph: Fix oops when handling mdsmap that decreases max_mds When i >= newmap->m_max_mds, ceph_mdsmap_get_addr(newmap, i) returns NULL. Passing NULL to memcmp() triggers an oops. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Signed-off-by: Sage Weil <sage@inktank.com>	2012-10-01 14:30:54 -05:00
Alex Elder	6e14b1a6c3	rbd: update remaining header fields for v2 There are three fields that are not yet updated for format 2 rbd image headers: the version of the header object; the encryption type; and the compression type. There is no interface defined for fetching the latter two, so just initialize them explicitly to 0 for now. Change rbd_dev_v2_snap_context() so the caller can be supplied the version for the header object. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2012-10-01 14:30:54 -05:00
Alex Elder	b8b1e2db52	rbd: get snapshot name for a v2 image Define rbd_dev_v2_snap_name() to fetch the name for a particular snapshot in a format 2 rbd image. Define rbd_dev_v2_snap_info() to be a wrapper for getting the name, size, and features for a particular snapshot, using an interface that matches the equivalent function for version 1 images. Define rbd_dev_snap_info() wrapper function and use it to call the appropriate function for getting the snapshot name, size, and features, dependent on the rbd image format. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2012-10-01 14:30:54 -05:00
Alex Elder	35d489f946	rbd: get the snapshot context for a v2 image Fetch the snapshot context for an rbd format 2 image by calling the "get_snapcontext" method on its header object. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2012-10-01 14:30:53 -05:00
Alex Elder	b1b5402aa9	rbd: get image features for a v2 image The features values for an rbd format 2 image are fetched from the server using a "get_features" method. The same method is used for getting the features for a snapshot, so structure this addition with a generic helper routine that can get this information for either. The server will provide two 64-bit feature masks, one representing the features potentially in use for this image (or its snapshot), and one representing features that must be supported by the client in order to work with the image. For the time being, neither of these is really used so we keep things simple and just record the first feature vector. Once we start using these feature masks, what we record and what we expose to the user will most likely change. Signed-off-by: Alex Elder <elder@inktank.com>	2012-10-01 14:30:53 -05:00
Alex Elder	1e1301998e	rbd: get the object prefix for a v2 rbd image The object prefix of an rbd format 2 image is fetched from the server using a "get_object_prefix" method. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2012-10-01 14:30:53 -05:00
Alex Elder	9d475de5d1	rbd: add code to get the size of a v2 rbd image The size of an rbd format 2 image is fetched from the server using a "get_size" method. The same method is used for getting the size of a snapshot, so structure this addition with a generic helper routine that can get this information for either. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2012-10-01 14:30:53 -05:00
Alex Elder	a30b71b999	rbd: lay out header probe infrastructure This defines a new function rbd_dev_probe() as a top-level function for populating detailed information about an rbd device. It first checks for the existence of a format 2 rbd image id object. If it exists, the image is assumed to be a format 2 rbd image, and another function rbd_dev_v2() is called to finish populating header data for that image. If it does not exist, it is assumed to be an old (format 1) rbd image, and calls a similar function rbd_dev_v1() to populate its header information. A new field, rbd_dev->format, is defined to record which version of the rbd image format the device represents. For a valid mapped rbd device it will have one of two values, 1 or 2. So far, the format 2 images are not really supported; this is laying out the infrastructure for fleshing out that support. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2012-10-01 14:30:53 -05:00
Alex Elder	cd892126c6	rbd: encapsulate code that gets snapshot info Create a function that encapsulates looking up the name, size and features related to a given snapshot, which is indicated by its index in an rbd device's snapshot context array of snapshot ids. This interface will be used to hide differences between the format 1 and format 2 images. At the moment this (looking up the name anyway) is slightly less efficient than what's done currently, but we may be able to optimize this a bit later on by caching the last lookup if it proves to be a problem. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2012-10-01 14:30:53 -05:00
Alex Elder	34b131849f	rbd: add an rbd features field Record the features values for each rbd image and each of its snapshots. This is really something that only becomes meaningful for version 2 images, so this is just putting in place code that will form common infrastructure. It may be useful to expand the sysfs entries--and therefore the information we maintain--for the image and for each snapshot. But I'm going to hold off doing that until we start making active use of the feature bits. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2012-10-01 14:30:53 -05:00
Alex Elder	c8d184250d	rbd: don't use index in __rbd_add_snap_dev() Pass the snapshot id and snapshot size rather than an index to __rbd_add_snap_dev() to specify values for a new snapshot. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2012-10-01 14:30:53 -05:00
Alex Elder	02cdb02cea	rbd: kill create_snap sysfs entry Josh proposed the following change, and I don't think I could explain it any better than he did: From: Josh Durgin <josh.durgin@inktank.com> Date: Tue, 24 Jul 2012 14:22:11 -0700 To: ceph-devel <ceph-devel@vger.kernel.org> Message-ID: <500F1203.9050605@inktank.com> Right now the kernel still has one piece of rbd management duplicated from the rbd command line tool: snapshot creation. There's nothing special about snapshot creation that makes it advantageous to do from the kernel, so I'd like to remove the create_snap sysfs interface. That is, /sys/bus/rbd/devices/<id>/create_snap would be removed. Does anyone rely on the sysfs interface for creating rbd snapshots? If so, how hard would it be to replace with: rbd snap create pool/image@snap Is there any benefit to the sysfs interface that I'm missing? Josh This patch implements this proposal, removing the code that implements the "create_snap" sysfs interface for rbd images. As a result, quite a lot of other supporting code goes away. Suggested-by: Josh Durgin <josh.durgin@inktank.com> Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2012-10-01 14:30:53 -05:00
Alex Elder	589d30e0b3	rbd: define rbd_dev_image_id() New format 2 rbd images are permanently identified by a unique image id. Each rbd image also has a name, but the name can be changed. A format 2 rbd image will have an object--whose name is based on the image name--which maps an image's name to its image id. Create a new function rbd_dev_image_id() that checks for the existence of the image id object, and if it's found, records the image id in the rbd_device structure. Create a new rbd device attribute (/sys/bus/rbd/<num>/image_id) that makes this information available. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2012-10-01 14:30:53 -05:00
Alex Elder	3bb59ad515	rbd: define some new format constants Define constant symbols related to the rbd format 2 object names. This begins to bring this version of the "rbd_types.h" header more in line with the current user-space version of that file. Complete reconciliation of differences will be done at some point later, as a separate task. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2012-10-01 14:30:53 -05:00
Alex Elder	f8d4de6e1c	rbd: support data returned from OSD methods An OSD object method call can be made using rbd_req_sync_exec(). Until now this has only been used for creating a new RBD snapshot, and that has only required sending data out, not receiving anything back from the OSD. We will now need to get data back from an OSD on a method call, so add parameters to rbd_req_sync_exec() that allow a buffer into which returned data should be placed to be specified, along with its size. Previously, rbd_req_sync_exec() passed a null pointer and zero size to rbd_req_sync_op(); change this so the new inbound buffer information is provided instead. Rename the "buf" and "len" parameters in rbd_req_sync_op() to make it more obvious they are describing inbound data. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2012-10-01 14:30:53 -05:00
Alex Elder	3cb4a687c7	rbd: pass flags to rbd_req_sync_exec() In order to allow both read requests and write requests to be initiated using rbd_req_sync_exec(), add an OSD flags value which can be passed down to rbd_req_sync_op(). Rename the "data" and "len" parameters to be more clear that they represent data that is outbound. At this point, this function is still only used (and only works) for write requests. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2012-10-01 14:30:52 -05:00
Alex Elder	3ee4001e0c	rbd: set up watch before announcing disk We're ready to handle header object (refresh) events at the point we call rbd_bus_add_dev(). Set up the watch request on the rbd image header just after that, and after we've registered the devices for the snapshots for the initial snapshot context. Do this before announcing the disk as available for use. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2012-10-01 14:30:52 -05:00
Alex Elder	12f029448c	rbd: set initial capacity in rbd_init_disk() Move the setting of the initial capacity for an rbd image mapping into rbd_init_disk(). Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2012-10-01 14:30:52 -05:00


323358 Commits