A version newer than Luminous is shipped with buster, and our
ceph repos are on Nautilus (14.2) in PVE 6.
This allows dropping a check for really old ceph versions (< 10, so
Infernalis and older).
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
dmesg: libceph: bad option at 'conf=/etc/pve/ceph.conf'
After the upgrade to PVE 6 with Ceph Luminous, the mount.ceph helper
doesn't understand the conf= option yet, so the CephFS mount with the
kernel client fails. After upgrading to Ceph Nautilus the option exists
in the mount.ceph helper.
Signed-off-by: Alwin Antreich <a.antreich@proxmox.com>
We can use 'list_images' to get the desired volume IDs in
'find_free_diskname' for most plugins. For the two LVM plugins, 'list_images'
potentially skips untagged volumes, so we keep the custom version. For the
RBD plugin, 'list_images' is much more costly than the custom version, so we
keep the custom version.
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
we need to unprotect more snapshots than just the base one, since we
allow linked clones of regular VM snapshots. unprotection will only work
if no linked clones exist anymore.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
it does not work:
disable RBD image features this kernel RBD drivers is not compatible with: fast-diff,object-map,deep-flatten
clone failed: could not disable krbd-incompatible image features 'fast-diff,object-map,deep-flatten' for rbd image: vm-123123123-disk-0@test: rbd: snapshot name specified for a command that doesn't use it
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Modern kernels, like 5.3, support all those features ('fast-diff',
'object-map', 'deep-flatten'), so we do not want to disable them
there. 5.0 already supports exclusive-locks, so no need to disable
exclusive locking there.
Further, we also want to profit from new features available, so let's
enable those which can be enabled "live" (i.e., after image creation)
if they're available.
While we could also parse the kernel information directly from:
/sys/module/libceph/parameters/supported_features
there's not much advantage to that: features cannot be disabled with
KConfig, and they're also very dependent on the kernel version booted.
So for us it's enough to check that one.
This only affects containers and VMs backed by a storage with KRBD
explicitly enabled. But as the enabling and disabling happens
transparently, it has no effect on the running guest.
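
The version check described above could be sketched roughly as follows.
This is only an illustration, not the plugin's code: the function name,
the table, and the exact minimum versions (5.0 for exclusive-lock, 5.3
for the other three features) are assumptions derived from the text.

```python
import re
from typing import List

# Assumption: minimum kernel (major, minor) per feature, read off the
# commit message above; the real cut-off versions may differ.
MIN_KERNEL_FOR_FEATURE = {
    "exclusive-lock": (5, 0),
    "fast-diff": (5, 3),
    "object-map": (5, 3),
    "deep-flatten": (5, 3),
}


def features_to_disable(kernel_release: str) -> List[str]:
    """Return the image features the running kernel is too old for.

    kernel_release is a string such as '5.3.10-1-pve' (as printed by
    'uname -r'); only the major and minor numbers are looked at.
    """
    match = re.match(r"(\d+)\.(\d+)", kernel_release)
    if match is None:
        raise ValueError(f"cannot parse kernel release: {kernel_release!r}")
    running = (int(match.group(1)), int(match.group(2)))
    return sorted(
        feature
        for feature, minimum in MIN_KERNEL_FOR_FEATURE.items()
        if running < minimum
    )
```

A caller would pass the release of the booted kernel and disable only
the returned features before mapping the image.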
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
The patch uses the value from the field 'stored' if it is available.
In Ceph 14.2.2 the storage calculation changed to a per pool basis. This
introduced an additional field 'stored' that holds the amount of data
that has been written to the pool, while the field 'used' now holds the
data after replication for the pool.
The new calculation will be used only if all OSDs are running with the
on-disk format introduced by Ceph 14.2.2.
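
The fallback described above amounts to the following sketch. The
function name is hypothetical, and the dictionary keys 'stored' and
'used' simply mirror the field names used in this message; the keys in
Ceph's actual JSON output may be spelled differently.

```python
def pool_used_bytes(stats: dict) -> int:
    """Pick the usage value for a pool from its statistics.

    Prefers 'stored' (data written to the pool, before replication)
    and falls back to 'used' when the field is absent, as on clusters
    that do not report it yet.
    """
    stored = stats.get("stored")
    if stored is not None:
        return stored
    return stats["used"]
```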
Signed-off-by: Alwin Antreich <a.antreich@proxmox.com>
it is not a storage plugin, and it makes more sense to have it
top-level, but there we cannot name it CephTools because of the
existing ones in pve-manager
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
This allows requesting a mapped device/path explicitly, regardless of
the storage option, e.g. the krbd option in the RBDPlugin.
Bump of the storage ABI => 2
Co-authored-by: Alwin Antreich <a.antreich@proxmox.com>
Signed-off-by: Dietmar Maurer <dietmar@proxmox.com>
$features is actually an array reference, so use it as one.
This broke creation and migration of disks on rbd storages
Signed-off-by: Alwin Antreich <a.antreich@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Non-conforming image names are no longer ignored by the new rbd_ls
implementation; this patch restores the old behaviour.
This fix is a temporary workaround and should be removed once the new
image name parser is ready.
Signed-off-by: Alwin Antreich <a.antreich@proxmox.com>
since ceph changed the plain output format for 12.2.8
we have to change the code anyway, and while we're at it,
we can change it to the (hopefully) more robust json output
Co-authored-by: Alwin Antreich <a.antreich@proxmox.com>
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Some methods for connecting to a ceph cluster are the same for RBD and
CephFS; these are merged into the helper modules.
Signed-off-by: Alwin Antreich <a.antreich@proxmox.com>
with the recent refactoring, external clusters were not handled
correctly with librbd if a pveceph or storage specific ceph config
exists.
change the behaviour to include the pveceph config file only for pveceph
managed clusters, and a storage specific one only for external ones.
set mon_host correctly using the values from storage.cfg for external
librbd clusters.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
The path method of the RBDPlugin got a list with comma separated monhosts,
but qemu needs the list separated by semicolons.
Signed-off-by: Alwin Antreich <a.antreich@proxmox.com>
The methods build_cmd and path use similar code to generate the ceph
command line or qemu config parameters. They now use the private method
ceph_connect_option for parameter generation.
Signed-off-by: Alwin Antreich <a.antreich@proxmox.com>
- get storage utilization per pool
- use librados2-perl for pool status
- add librados2-perl as build-depends and depends in debian/control
Signed-off-by: Alwin Antreich <a.antreich@proxmox.com>
This prevents an error when disabling features of an rbd image whose
flags are already disabled. The error aborted the CT/VM cloning halfway
through, leaving behind an rbd image without a vmid.conf for it.
Signed-off-by: Alwin Antreich <a.antreich@proxmox.com>
Takes an operation, an optional requested bandwidth
limit override, and a list of storages involved in the
operation and lowers the requested bandwidth against global
and storage-specific limits unless the user has permissions
to change those.
This means:
* Global limits apply to all users without Sys.Modify on /
(as they can change datacenter.cfg options via the API).
* Storage specific limits apply to users without
Datastore.Allocate access on /storage/X for any involved
storage X.
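
A simplified model of these rules, for illustration only. The function
and parameter names are hypothetical, None stands for "no limit", and
the behaviour without a requested override (falling back to the lowest
configured limit) is an assumption that the text above does not spell
out.

```python
from typing import Dict, Iterable, Optional, Set


def effective_bwlimit(
    requested: Optional[int],
    global_limit: Optional[int],
    storage_limits: Dict[str, int],
    storages: Iterable[str],
    may_change_global: bool,
    may_allocate: Set[str],
) -> Optional[int]:
    """Return the bandwidth limit to use, or None for unlimited."""
    storages = list(storages)
    if requested is None:
        # Assumption: without an override, every configured limit applies.
        caps = [storage_limits[s] for s in storages if s in storage_limits]
        if global_limit is not None:
            caps.append(global_limit)
        return min(caps) if caps else None

    caps = [requested]
    # The global limit binds users who cannot change it themselves.
    if global_limit is not None and not may_change_global:
        caps.append(global_limit)
    # A storage limit binds users without allocate rights on that storage.
    for sid in storages:
        if sid in storage_limits and sid not in may_allocate:
            caps.append(storage_limits[sid])
    return min(caps)
```

With a global limit of 500 and a limit of 200 on storage 'a', a request
for 1000 is lowered to 200 for an unprivileged user and left at 1000 for
a user holding both permissions.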
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
preserve the old behaviour of selecting auth_supported based on the
existence of the keyring, but limit it to external clusters.
this allows switching 'auth XXX required' in the pveceph-managed
ceph.conf while still automatically copying the keyring when adding a
storage.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
add /etc/pve/ceph.conf to commands / option strings instead
of the monitor list provided via the 'monhost' option.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
to allow differentiating between user-created external RBD storage
entries (WITH monhost), and those created and managed by pveceph
(without).
making monhost non-fixed allows easily opting into the managed behaviour via
'pvesm set STORAGE -delete monhost', but is also helpful for external clusters
(i.e., after adding or removing a monitor you need to update the monhost
parameter).
adapt description accordingly.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Ceph changed its version output.
full output of 'ceph --version':
Luminous 'ceph version 12.1.0 (262617c9f16c55e863693258061c5b25dea5b086) luminous (dev)'
Jewel 'ceph version 10.2.7 (50e863e0f4bc8f4b9e31156de690d765af245185)'
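
Both formats start the same way, so a parser that only anchors on the
leading version triple handles either one. A minimal sketch (the
function name and regular expression are illustrative, not taken from
the plugin):

```python
import re
from typing import Optional, Tuple

# Matches only the leading 'ceph version X.Y.Z'; the hash and the
# optional release name that follow are ignored.
VERSION_RE = re.compile(r"^ceph version (\d+)\.(\d+)\.(\d+)")


def parse_ceph_version(output: str) -> Optional[Tuple[int, ...]]:
    """Return (major, minor, patch) or None if the output is unknown."""
    match = VERSION_RE.match(output.strip())
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())
```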
since we allow vm-ID-whatever when allocating images, we
should also include those when listing them.
note: '@' is reserved for snapshots in ceph, so it is safe to
skip lines including an '@' in the image name.
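
The filtering described above could look like this sketch. It assumes
the input is one bare image name per line; the function name and the
pattern (including the 'base-' prefix) are illustrative.

```python
import re
from typing import Dict, Iterable

# Accept any 'vm-ID-whatever' or 'base-ID-whatever' name.
IMAGE_RE = re.compile(r"^(?:vm|base)-(\d+)-\S+$")


def guest_images(lines: Iterable[str]) -> Dict[str, int]:
    """Map image names to their owning guest ID.

    Lines containing '@' are snapshot entries and are skipped, as are
    names that do not follow the vm-ID-... / base-ID-... scheme.
    """
    images = {}
    for line in lines:
        name = line.strip()
        if not name or "@" in name:
            continue
        match = IMAGE_RE.match(name)
        if match:
            images[name] = int(match.group(1))
    return images
```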
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
with more than a few images, 'rbd ls -l' gets rather slow
compared to a simple 'rbd ls'. since we only need to check
existing image names for finding a free one, the latter is
sufficient.
example with ~400 rbd images:
$ time rbd ls -p ceph-vm > /dev/null
real 0m0.027s
user 0m0.012s
sys 0m0.008s
$ time rbd ls -l -p ceph-vm > /dev/null
real 0m5.250s
user 0m1.632s
sys 0m0.584s
a linked clone of two disks on the same setup accordingly
also shows a massive speedup (first run before, second run after
this change):
$ time qm clone 1000 10000 -snap test
create linked clone of drive scsi0 (ceph-vm:vm-1000-disk-2)
clone vm-1000-disk-2: vm-1000-disk-2 snapname test to
vm-10000-disk-1
create linked clone of drive scsi1 (ceph-vm:vm-1000-disk-1)
clone vm-1000-disk-1: vm-1000-disk-1 snapname test to
vm-10000-disk-2
real 0m11.157s
user 0m3.752s
sys 0m1.308s
$ time qm clone 1000 10000 -snap test
create linked clone of drive scsi1 (ceph-vm:vm-1000-disk-1)
clone vm-1000-disk-1: vm-1000-disk-1 snapname test to
vm-10000-disk-1
create linked clone of drive scsi0 (ceph-vm:vm-1000-disk-2)
clone vm-1000-disk-2: vm-1000-disk-2 snapname test to
vm-10000-disk-2
real 0m0.872s
user 0m0.652s
sys 0m0.096s
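
Since only the names matter, the search for a free one reduces to a set
lookup over the plain 'rbd ls' output. A sketch under assumptions: the
function name mirrors the one used elsewhere in this log, but the
signature, the start index of 1, and the upper bound are illustrative.

```python
from typing import Iterable


def find_free_diskname(existing: Iterable[str], vmid: int, max_disks: int = 100) -> str:
    """Return the first unused 'vm-<vmid>-disk-<n>' name.

    existing is the list of image names in the pool, e.g. the lines
    printed by a plain 'rbd ls'.
    """
    taken = set(existing)
    for index in range(1, max_disks + 1):
        candidate = f"vm-{vmid}-disk-{index}"
        if candidate not in taken:
            return candidate
    raise RuntimeError(f"no free disk name found for guest {vmid}")
```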
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
With krbd we resize the volume and tell QemuServer to notify the running
QEMU with a zero $size by returning undef.
Signed-off-by: Dmitry Petuhov <mityapetuhov@gmail.com>
without this, having an efidisk on a ceph storage
prevents creating another disk on the same
ceph storage, because it will not be detected
and we try to allocate one with the same name
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
since the rbd images themselves are named differently than
the volumes in our config files, we need to recreate this
information from the parent relation in the ceph metadata,
otherwise list_images() might return wrong volume names/IDs
since list_images is used by PVE::Storage::vdisk_free() to
check for children still referencing a base image, because
of the wrong volume id RBDPlugin->parse_volname() does not
detect the base image of linked clones and the check fails.
this is thankfully mitigated by the protected status of the
base snapshot, but creates a rather confusing error message.
scenario (VM 701 is a linked clone of template VM 700):
$ qm config 700 | grep virtio0:
virtio0: ceph_qemu:base-700-disk-1,size=2G
$ qm config 701 | grep virtio0:
virtio0: ceph_qemu:base-700-disk-1/vm-701-disk-1,size=2G
before (pvesm list reports wrong volume ID, check fails):
$ pvesm list ceph_qemu
ceph_qemu:base-700-disk-1 raw 2147483648 700
ceph_qemu:vm-701-disk-1 raw 2147483648 701
$ pvesm free ceph_qemu:base-700-disk-1
snap_unprotect: can't unprotect; at least 1 child(ren) in pool rbd
rbd unprotect base-700-disk-1 snap '__base__' error: snap_unprotect: can't unprotect; at least 1 child(ren) in pool rbd
after (correct volume ID, check works as intended):
$ pvesm list ceph_qemu
ceph_qemu:base-700-disk-1 raw 2147483648 700
ceph_qemu:base-700-disk-1/vm-701-disk-1 raw 2147483648 701
$ pvesm free ceph_qemu:base-700-disk-1
base volume 'base-700-disk-1' is still in use (use by 'base-700-disk-1/vm-701-disk-1')
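
The mapping from ceph metadata back to volume names shown in the
scenario can be sketched as below. The function name is hypothetical,
and it assumes the parent image name has already been extracted from
the ceph metadata (without pool or snapshot suffix).

```python
from typing import Optional


def volume_name(image: str, parent: Optional[str] = None) -> str:
    """Rebuild the volume name used in guest configs.

    A linked clone is referenced as '<base image>/<image>', while a
    standalone image keeps its own name.
    """
    if parent and parent.startswith("base-"):
        return f"{parent}/{image}"
    return image
```

For the scenario above, volume_name('vm-701-disk-1', 'base-700-disk-1')
yields 'base-700-disk-1/vm-701-disk-1', matching the volume ID in the
config of VM 701.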