Commit Graph

159 Commits

6a44cc417d RBD plugin: librados connect: increase timeout when in worker
The default timeout in PVE/RADOS.pm is 5 seconds, but this is not
always enough for external clusters under load. Workers can and should
take their time to not fail here too quickly.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2022-09-13 09:57:19 +02:00
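
Roughly, the change amounts to picking a larger timeout when running inside a worker. A minimal Perl sketch, assuming PVE::RADOS->new() accepts a 'timeout' option (the 5 second default mentioned above) and that PVE::RPCEnvironment offers an is_worker() check; the 60 second value is only an example:

    use PVE::RADOS;             # librados binding used by the plugin
    use PVE::RPCEnvironment;    # assumed to provide the is_worker() check

    sub librados_connect_sketch {
        my ($scfg, $storeid, $options) = @_;

        $options //= {};
        # workers can and should wait longer than the 5 second default;
        # 60 is only an example value here
        $options->{timeout} = 60
            if !defined($options->{timeout}) && PVE::RPCEnvironment->is_worker();

        # the remaining connection parameters would be derived from $scfg as before
        return PVE::RADOS->new(%$options);
    }
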
2be327abf6 RBD plugin: librados connect: pass along options
In preparation to increase the timeout for workers. Both existing
callers of librados_connect() don't currently use the parameter.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2022-09-13 09:57:19 +02:00
e8e477112f RBD plugin: path: conditionalize get_rbd_dev_path() call
The return value of get_rbd_dev_path() is only used when $scfg->{krbd}
evaluates to true and the function shouldn't have any side effects
that are needed later, so the call can be avoided otherwise.

This also saves a RADOS connection and command for configurations with
external clusters and krbd disabled.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2022-09-13 09:55:56 +02:00
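
The gist, as a simplified sketch building on the get_rbd_dev_path() helper named above (the non-krbd return value is shortened here):

    # sketch: hand out a device path only when krbd is enabled
    sub rbd_volume_path_sketch {
        my ($scfg, $storeid, $pool, $image) = @_;

        # only resolve the kernel device path when krbd is actually used; this
        # skips a RADOS connection on external clusters with krbd disabled
        return get_rbd_dev_path($scfg, $storeid, $image) if $scfg->{krbd};

        # simplified librbd-style path for the non-krbd case (the real code
        # also adds monitor, auth and conf options)
        return "rbd:$pool/$image";
    }
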
a1528ffe8f rbd: get_rbd_dev_path: return /dev/rbd path only if cluster matches
The changes in cfe46e2d4a did not catch all situations.
In the case of a guest having 2 disk images with the same name on a pool
with the same name, but in two different Ceph clusters, we still had
issues when starting it. The first disk got mapped as expected. The
second disk did not get mapped, because we returned the old $path
"/dev/rbd/<pool>/<image>", which already existed from the first disk.

In the case that only the "old" /dev/rbd path exists and we do not have
the /dev/rbd-pve/<cluster>/... path available, we now check if the
cluster fsid used by that rbd device matches the one we expect. If it
does, then we are in the situation that the image has been mapped before
the new rbd-pve udev rule was introduced. If it does not, then we have
the situation of an ambiguous mapping in /dev/rbd and return the
$pve_path.

Signed-off-by: Aaron Lauterer <a.lauterer@proxmox.com>
2022-06-14 10:50:43 +02:00
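
The resulting decision logic, sketched; mapped_device_fsid() stands in as a hypothetical helper that looks up which cluster a mapped /dev/rbd device belongs to:

    sub resolve_rbd_dev_path_sketch {
        my ($cluster_fsid, $pve_path, $old_path) = @_;

        # the new, cluster-unique /dev/rbd-pve/... path wins whenever it exists
        return $pve_path if -e $pve_path;

        if (-e $old_path && mapped_device_fsid($old_path) eq $cluster_fsid) {
            # image was mapped before the rbd-pve udev rule existed
            return $old_path;
        }

        # ambiguous mapping (or not mapped yet): hand out the cluster-specific path
        return $pve_path;
    }
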
e4671f734b rbd: fix #4060 show data-pool usage when configured
When a data-pool is configured, use it for status infos. The 'data-pool'
config option is used to mark the erasure coded pool while the 'pool'
will be the replicated pool holding meta data such as the omap.

This means the 'pool' will only use a small amount of space, and people
are interested in how much they can store in the erasure coded pool anyway.

Therefore this patch reorders the assignment of the used pool name by
availability of the scfg parameters: data-pool -> pool -> fallback 'rbd'

Signed-off-by: Aaron Lauterer <a.lauterer@proxmox.com>
2022-05-20 09:43:20 +02:00
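
In other words, the pool used for status queries is now picked in this order (sketch):

    sub rbd_status_pool_sketch {
        my ($scfg) = @_;

        # prefer the erasure coded data pool, then the configured pool, then the default
        return $scfg->{'data-pool'} // $scfg->{pool} // 'rbd';
    }
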
ae93163343 rbd: warn if no stats for a pool could be gathered
happens in case of a mistyped poolname, and the new message should be
more helpful than:
`Use of uninitialized value $free in addition (+) at \
/usr/share/perl5/PVE/Storage/RBDPlugin.pm line 64`

Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
2022-05-13 14:08:13 +02:00
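
A rough sketch of the intent, with the stats lookup structure assumed for illustration:

    # $stats_by_pool: hypothetical hash mapping pool name => stats from 'ceph df'
    sub pool_stats_or_warn_sketch {
        my ($stats_by_pool, $pool) = @_;

        my $d = $stats_by_pool->{$pool};
        if (!defined($d)) {
            # a mistyped pool name now yields a readable warning instead of an
            # "uninitialized value" noise line from the usage calculation
            warn "could not get usage stats for pool '$pool'\n";
            return;
        }
        return $d;
    }
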
0c317c6c0f rbd: add fallback default poolname 'rbd' to status
the fallback to a default pool name of 'rbd' was introduced in:
1440604a4b
and worked for the status command, because it used the `rados_cmd`
sub.

This fallback was lost with the changes in:
41aacc6cde

leading to confusing errors:
`Use of uninitialized value in string eq at \
/usr/share/perl5/PVE/Storage/RBDPlugin.pm line 633`
(e.g. in the journal from pvestatd)

Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
2022-05-13 14:08:13 +02:00
78eac5baf2 pbs: namespace support
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2022-05-12 11:56:48 +02:00
78638b3dea rbd: get path: allow fake override of fsid in scfg for some regression tests
to avoid calls into RADOS connect that trigger 'RPCEnv not initialized'
breakage in regression tests, and which wouldn't really work otherwise
either

in the future the RBD $scfg could actually support this (or a similarly
named) property, to save it on storage addition and then avoid frequent
mon commands

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-04-28 18:17:58 +02:00
9594717848 rbd: unmap volume after rename
When krbd is used, subsequent removal after an operation
involving a rename could fail with
> librbd::image::PreRemoveRequest: 0x559b7506a470 \
> check_image_watchers: image has watchers - not removing
because the old mapping was still present.

For both operations with a rename, the owning guest should be offline,
but even if it weren't, unmap simply fails when the volume is in-use.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2022-04-28 13:44:21 +02:00
cc682faafc rbd: drop get_kernel_device_path
it only redirected to get_rbd_dev_path with the same signature, and both
are private subs.

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2022-04-27 13:03:16 +02:00
647a667e10 rbd: reduce number of stats in likely path
the new udev rule is expected to be in place and active; switching the
checks around means 1 instead of 2 stat()s in this rather hot code path.

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2022-04-27 13:01:42 +02:00
cfe46e2d4a rbd: fix #3969: add rbd dev paths with cluster info
By adding our own customized rbd udev rules and ceph-rbdnamer we can
create device paths that include the cluster fsid and avoid any
ambiguity if the same pool and namespace combination is used in
different clusters we connect to.

Additionally to the '/dev/rbd/<pool>/...' paths we now have
'/dev/rbd-pve/<cluster fsid>/<pool>/...' paths.

The other half of the patch makes use of the new device paths in the RBD
plugin.

The new 'get_rbd_dev_path' method returns the full device path. In case
the image has been mapped before the rbd-pve udev rule was installed, it
returns the old path.

The cluster fsid is read from the 'ceph.conf' file in the case of a
hyperconverged setup. In the case of an external Ceph cluster we need to
fetch it via a rados api call.

Co-authored-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Signed-off-by: Aaron Lauterer <a.lauterer@proxmox.com>
2022-04-27 12:57:22 +02:00
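
The two path schemes side by side, as a sketch (namespace component omitted for brevity):

    sub rbd_dev_paths_sketch {
        my ($fsid, $pool, $image) = @_;

        my $old_path = "/dev/rbd/$pool/$image";            # ambiguous across clusters
        my $pve_path = "/dev/rbd-pve/$fsid/$pool/$image";  # unique per cluster fsid

        return -e $pve_path ? $pve_path : $old_path;       # prefer the unambiguous path
    }
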
c915afca7e rbd: followup code style cleanups
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-02-04 18:04:31 +01:00
ef2afce74a fix #1816: rbd: add support for erasure coded ec pools
The first step is to allocate rbd images correctly.

The metadata objects still need to be stored in a replicated pool, but
by providing the --data-pool parameter on image creation, we can place
the data objects on the erasure coded (EC) pool.

Signed-off-by: Aaron Lauterer <a.lauterer@proxmox.com>
2022-02-04 17:51:48 +01:00
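
A sketch of how the extra flag might be wired into the image allocation; the command construction is simplified and the variable names are made up:

    sub build_rbd_create_cmd_sketch {
        my ($scfg, $pool, $name, $size_mb) = @_;

        # a plain number for --size is interpreted as MiB by rbd
        my $cmd = ['/usr/bin/rbd', 'create', '--size', $size_mb, "$pool/$name"];

        # data objects go to the erasure coded pool, metadata stays in the
        # replicated 'pool'
        push @$cmd, '--data-pool', $scfg->{'data-pool'} if $scfg->{'data-pool'};

        return $cmd;
    }
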
95dfa44ca1 add disk rename feature
Functionality has been added for the following storage types:

* directory ones, based on the default implementation:
    * directory
    * NFS
    * CIFS
    * gluster
* ZFS
* (thin) LVM
* Ceph

A new feature `rename` has been introduced to mark which storage
plugin supports the feature.

Version API and AGE have been bumped.

Signed-off-by: Aaron Lauterer <a.lauterer@proxmox.com>

the intention of this feature is to support the following use-cases:
- reassign a volume from one owning guest to another (which usually
  entails a rename, since the owning vmid is encoded in the volume name)
- rename a volume (e.g., to use a more meaningful name instead of the
  auto-assigned ...-disk-123)

only the former is implemented at the caller side in
qemu-server/pve-container for now, but since the lower-level feature is
basically the same for both, we can take advantage of the storage plugin
API bump now to get the building block for this future feature in place
already.

adapted the ApiChangelog change to fix conflicts and added more detail above

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2021-11-09 17:02:29 +01:00
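
For the RBD backend the rename itself maps onto the existing 'rbd rename' command; a loose sketch with hypothetical helpers rbd_image_exists() and run_rbd_command():

    sub rename_rbd_volume_sketch {
        my ($scfg, $storeid, $source_image, $target_image) = @_;

        die "target volume '$target_image' already exists\n"
            if rbd_image_exists($scfg, $storeid, $target_image);

        # 'rbd rename' works within a pool, which is all the reassign use case needs
        run_rbd_command($scfg, $storeid, ['rename', $source_image, $target_image]);

        return "$scfg->{pool}/$target_image";
    }
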
85043c0193 rbd plugin: free image: use actual command in error message
For linked clones, the base name was included, which is confusing.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2021-10-27 17:03:14 +02:00
22b68016f7 Ceph: add keyring parameter for external clusters
By adding the keyring for RBD storage or the secret for CephFS ones, it
is possible to add an external Ceph cluster with only one API call.

Previously the keyring / secret file needed to be placed in
/etc/pve/priv/ceph/$storeID.{keyring,secret} manually.

Signed-off-by: Aaron Lauterer <a.lauterer@proxmox.com>
2021-08-26 18:15:30 +02:00
3c93115570 rbd: fix typo in error message
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2021-04-20 18:21:31 +02:00
c27fe64810 rbd: make volume param for get_rbd_path to allow further use
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2021-04-09 14:19:48 +02:00
ed7ea5a352 rbd: list images: early return to avoid indentation
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2021-04-09 13:51:15 +02:00
a3cad0b50d rbd: list images: sort by keys when pushing on result array
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2021-04-09 13:49:56 +02:00
6d0d0a977d rbd: indentation and whitespace cleanups
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2021-04-09 13:48:27 +02:00
22265bd990 rbd: get kernel device sub returns a path, not a name
also transform to private sub instead of local variable closure.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2021-04-09 13:45:21 +02:00
72bbd8a6f7 rbd: consistent closure call style
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2021-04-09 13:43:33 +02:00
92a7826f88 rbd: build cmd: allow "falsy" namespace value also here
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2021-04-09 13:43:33 +02:00
4cf696f66e rbd: use private sub for get_rbd_path
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2021-04-09 13:43:33 +02:00
e9bc993138 rbd: fix #3286 add namespace support
This patch introduces support for Ceph's RBD namespaces.

A new storage config parameter 'namespace' defines the namespace to be
used for the RBD storage.

The namespace must already exist in the Ceph cluster as it is not
automatically created.

The main intention is to use this for external Ceph clusters. With
namespaces, each PVE cluster can get its own namespace and will not
conflict with other PVE clusters.

Signed-off-by: Aaron Lauterer <a.lauterer@proxmox.com>
2021-04-09 12:56:21 +02:00
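
With a namespace configured, the image spec handed to rbd simply gains a middle component; a sketch of the path construction:

    sub get_rbd_path_sketch {
        my ($scfg, $volume) = @_;

        my $path = $scfg->{pool} // 'rbd';
        # rbd accepts <pool>/<namespace>/<image> specs; the namespace itself
        # must already exist, it is not created automatically
        $path .= "/$scfg->{namespace}" if defined($scfg->{namespace});
        $path .= "/$volume" if defined($volume);

        return $path;
    }
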
aeb007cb76 rbd: centralize rbd path concatenation
The <pool>/<image> paths are needed in quite a lot of places. Having one
single place where they are created helps to reduce duplicate code and
makes it easier to introduce new features.

The 'add_pool_to_disk' sub was already doing that but the name was not
really fitting. This commit renames it to the more general
'get_rbd_path' and changes the second parameter to the more widely used
$volume instead of $disk.

Furthermore, all occurrences where "$pool/$volume" was concatenated
have been replaced with a call to get_rbd_path.

Plus some minor code style cleanups for long function calls that were
touched.

Signed-off-by: Aaron Lauterer <a.lauterer@proxmox.com>
2021-04-09 12:56:21 +02:00
f3ccd0ef3d plugin: hooks: add explicit returns
to avoid returning something unexpected. Finish what
afeda18256 already started for all the other
plugins. At least for ZFS's on_add_hook this is necessary (adding a ZFS storage
currently fails as reported here [0]), but it cannot hurt
in the other places either as the only hooks we expect to return something
currently are PBS's on_add_hook and on_update_hook.

[0]: https://forum.proxmox.com/threads/gui-add-zfs-storage-verification-failed-400-config-type-check-object-failed.79734/

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2020-11-27 10:45:42 +01:00
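
The fix itself is a bare return at the end of each hook, e.g.:

    sub on_add_hook {
        my ($class, $storeid, $scfg, %param) = @_;

        # ... existing setup work ...

        return;   # explicit, so the value of the last statement cannot leak to the caller
    }
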
2c036838ed add check for fsfreeze before snapshot
In order to take a snapshot of a container volume, which can be mounted
read-only with RBD, the volume needs to be frozen (fsfreeze (8)) before taking
the snapshot.

This commit adds helpers to determine if the FIFREEZE ioctl needs to be called
for the volume.

Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
2020-11-10 18:58:45 +01:00
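
The helper boils down to a per-plugin flag; a sketch, with the method name chosen for illustration only:

    # illustrative method name; the base plugin default would return 0
    sub volume_snapshot_needs_fsfreeze {
        my ($class, $scfg, $volname) = @_;

        # RBD-backed container volumes can be mounted read-only, so the caller
        # should issue the FIFREEZE ioctl on the mountpoint before snapshotting
        return 1;
    }
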
d4c31eff96 d/control: bump ceph dependency to 12.2
A newer version than Luminous is shipped with Buster, and our
Ceph repos are on Nautilus (14.2) in PVE 6.

This allows dropping a check for really old Ceph versions (< 10, so
Infernalis and older).
Infernalis and older).

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2020-04-25 11:27:38 +02:00
e54c3e3347 Fix #2705: cephfs: mount fails with bad option
dmesg: libceph: bad option at 'conf=/etc/pve/ceph.conf'

After the upgrade to PVE 6 with Ceph Luminous, the mount.ceph helper
doesn't understand the conf= option yet, so the CephFS mount with the
kernel client fails. After upgrading to Ceph Nautilus the option exists
in the mount.ceph helper.

Signed-off-by: Alwin Antreich <a.antreich@proxmox.com>
2020-04-25 11:15:23 +02:00
a44c0147bc Use a common interface for find_free_diskname
We can use 'list_images' to get the desired volume IDs in
'find_free_diskname' for most plugins. For the two LVM plugins, 'list_images'
potentially skips untagged volumes, so we keep the custom version. For the
RBD plugin, 'list_images' is much more costly than the custom version, so we
keep the custom version.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2019-12-12 12:52:43 +01:00
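
Conceptually, the shared implementation scans the existing image names and picks the first unused index; a simplified sketch (not the exact helper):

    sub find_free_diskname_sketch {
        my ($disk_list, $vmid, $fmt) = @_;    # $disk_list e.g. from list_images()

        # directory-style storages carry the format as a suffix, block storages do not
        my $suffix = ($fmt eq 'qcow2' || $fmt eq 'vmdk') ? ".$fmt" : '';

        my %used;
        $used{$_} = 1 for @$disk_list;

        for my $i (0 .. 99) {
            my $name = "vm-$vmid-disk-$i$suffix";
            return $name if !$used{$name};
        }
        die "unable to allocate a new image name for VM $vmid\n";
    }
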
a573f66a2e rbd: unprotect all snapshots on image removal
we need to unprotect more snapshots than just the base one, since we
allow linked clones of regular VM snapshots. Unprotection will only work
if no linked clones exist anymore.

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2019-12-11 10:55:04 +01:00
518f3908bf rbd: update features of image when mapping snapshot
in order to fix the features even if the image itself never gets mapped.

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2019-11-29 13:41:33 +01:00
b4227e776f rbd: don't attempt to update features of snapshots
it does not work:

disable RBD image features this kernel RBD drivers is not compatible with: fast-diff,object-map,deep-flatten
clone failed: could not disable krbd-incompatible image features 'fast-diff,object-map,deep-flatten' for rbd image: vm-123123123-disk-0@test: rbd: snapshot name specified for a command that doesn't use it

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2019-11-29 11:09:52 +01:00
0ef8fb9d4d RBD: disable and enable features depending on kernel version
Modern kernels, like 5.3, support all those features ('fast-diff',
'object-map', 'deep-flatten'), so we do not want to disable them
there. 5.0 already supports exclusive-locks, so there is no need to
disable exclusive locking there.

Further, we also want to profit from newly available features, so let's
enable those which can be enabled "live" (i.e., after image creation)
if they're available.

While we could also parse the kernel information directly from:
/sys/module/libceph/parameters/supported_features
there's not much advantage to that: features cannot be disabled via
KConfig, and they're also very dependent on the kernel version booted.
So for us it's enough to check that one.

This only affects container and VMs backed by a storage with KRBD
explicitly enabled. But as the enabling and disabling happens
transparently, it has no effect on the running guest.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2019-11-23 15:44:19 +01:00
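
The upshot, sketched as a lookup of which features to keep disabled (via 'rbd feature disable') for the running kernel; the version boundaries are simplified to the ones named above:

    sub krbd_incompatible_features_sketch {
        my ($major, $minor) = @_;    # running kernel version, e.g. (5, 3)

        # 5.3+ handles object-map, fast-diff and deep-flatten;
        # 5.0+ handles exclusive-lock
        return [] if $major > 5 || ($major == 5 && $minor >= 3);
        return ['object-map', 'fast-diff', 'deep-flatten'] if $major == 5;
        return ['exclusive-lock', 'object-map', 'fast-diff', 'deep-flatten'];
    }
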
5102900d50 rbd: group and sort module usage
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2019-11-22 18:15:25 +01:00
a4e603c657 RBD: fix ceph version detection
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2019-11-22 16:02:40 +01:00
e79ab52c14 Fix #2346: rbd storage shows wrong %-usage
The patch uses the value from the field 'stored' if it is available.

In Ceph 14.2.2 the storage calculation changed to a per-pool basis. This
introduced an additional field 'stored' that holds the amount of data
that has been written to the pool, while the field 'used' now holds the
size after replication.

The new calculation will be used only if all OSDs are running with the
on-disk format introduced by Ceph 14.2.2.

Signed-off-by: Alwin Antreich <a.antreich@proxmox.com>
2019-09-03 10:19:45 +02:00
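
A sketch of the fallback, using the per-pool stats fields reported by 'ceph df':

    sub pool_used_bytes_sketch {
        my ($pool_stats) = @_;

        # 'stored' is the client-visible amount of data (Ceph 14.2.2+); older
        # clusters only report 'bytes_used', which includes replication overhead
        return $pool_stats->{stored} // $pool_stats->{bytes_used};
    }
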
4050fcc16b move Storage/CephTools to CephConfig
it is not a storage plugin, and it makes more sense to have it
top-level, but there we cannot name it CephTools because of the
existing one in pve-manager

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
2018-12-20 09:26:11 +01:00
40d698932e implement map_volume and unmap_volume
This allows requesting a mapped device/path explicitly, regardless of
the storage option, e.g. the krbd option in the RBD plugin.

Bump of the storage ABI => 2

Co-authored-by: Alwin Antreich <a.antreich@proxmox.com>
Signed-off-by: Dietmar Maurer <dietmar@proxmox.com>
2018-11-09 17:25:51 +01:00
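
For RBD the two new methods essentially wrap 'rbd map' and 'rbd unmap'; a loose sketch with run_rbd_command() as a hypothetical helper:

    sub map_volume_sketch {
        my ($scfg, $storeid, $image, $snapname) = @_;

        $image .= '@' . $snapname if defined($snapname);
        run_rbd_command($scfg, $storeid, ['map', $image]);
        return "/dev/rbd/$scfg->{pool}/$image";    # kernel device path handed out
    }

    sub unmap_volume_sketch {
        my ($scfg, $storeid, $image, $snapname) = @_;

        $image .= '@' . $snapname if defined($snapname);
        run_rbd_command($scfg, $storeid, ['unmap', $image]);
        return 1;
    }
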
dd9e97ed14 find_free_diskname: fixup regex match operator
Co-developed-by: Stoiko Ivanov <s.ivanov@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2018-09-19 11:21:37 +02:00
0057171085 Fix #1925: untaint rbd JSON output
Reviewed-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Tested-by: Stoiko Ivanov <s.ivanov@proxmox.com>
Reviewed-by: Stoiko Ivanov <s.ivanov@proxmox.com>
Signed-off-by: Dietmar Maurer <dietmar@proxmox.com>
2018-09-19 11:21:37 +02:00
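
Untainting in Perl is done by passing data through a regex capture; a minimal sketch of what that looks like for the rbd JSON output:

    use JSON;    # provides decode_json()

    sub decode_rbd_json_sketch {
        my ($raw) = @_;    # e.g. stdout of an 'rbd ... --format json' call

        # under taint mode (-T), output read from external commands is tainted;
        # capturing it through a regex marks it as trusted before decoding
        my ($untainted) = $raw =~ m/^(.*)$/s;
        return decode_json($untainted);
    }
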
e5b2206f8a rbd: krbd_feature_disable was not disabling features
$features is actually an array reference, so use it as one.
This bug broke creation and migration of disks on RBD storages.

Signed-off-by: Alwin Antreich <a.antreich@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2018-09-12 14:56:36 +02:00
c4a29df483 refactor finding next diskname for all plugins
Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
2018-09-10 12:21:10 +02:00
aa14def420 Addition to fix #1895, skip image if no owner
Non-conforming image names are no longer ignored by the new rbd_ls
implementation; this patch restores the old behaviour.

This fix is a temporary workaround and should be removed, once the new
image name parser is ready.

Signed-off-by: Alwin Antreich <a.antreich@proxmox.com>
2018-09-07 13:51:18 +02:00
1be93fe2fd rdb: followup cleanups
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2018-09-06 15:19:16 +02:00
c093e93b28 rbd: remove unused size conversion function
since the json output gives the sizes in bytes, we do not
need to convert anymore

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
2018-09-06 15:18:23 +02:00