CIFS/SMB supports directly mounting subdirectories, so it makes sense to
also allow the --subdir parameter for these storages. The subdir
parameter was moved from CephFSPlugin.pm to Plugin.pm, because it isn't
specific to CephFS anymore.
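An illustrative storage.cfg entry (values made up):

    cifs: backup-share
        server 192.0.2.10
        share backups
        subdir /dumps
        content backup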
Signed-off-by: Leo Nunner <l.nunner@proxmox.com>
by simply doing a ping with the expected port as a fallback when the
rpcinfo command fails.
The timeout was chosen to be 2 seconds, because that's what the
existing callers of tcp_ping() in the iSCSI and GlusterFS plugins use.
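A minimal sketch of the fallback (run_rpcinfo_check is a hypothetical
name; tcp_ping is the helper the iSCSI and GlusterFS plugins already
use):

    # try rpcinfo first; fall back to a plain TCP ping on the expected port
    my $ok = eval { run_rpcinfo_check($server) };
    $ok ||= PVE::Network::tcp_ping($server, $port, 2); # 2 second timeout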
Alternatively, the existing check could be replaced, but that would
1. Dumb down the check.
2. Risk breakage in some corner case that's yet to be discovered.
3. Still require using rpcinfo (or dumbing the check down even
further) in case port=0; from 'man 5 nfs' about the NFSv4 'port'
option:
> If the specified port value is 0, then the NFS client uses the NFS
> service port number advertised by the server's rpcbind service.
Reported in the community forum:
https://forum.proxmox.com/threads/118466/post-524449
https://forum.proxmox.com/threads/120774/
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
The only caller where $running can even be truthy is QemuServer.pm's
qemu_volume_snapshot_delete(). But there, a check whether snapshots
should be done with QEMU is already made, and the storage function is only
called if snapshots should not be done with QEMU (like for TPM drives
which are not attached directly). So rely on the caller and do not
return early to fix removing snapshots in such cases.
Even if a stray call ends up here (can happen already by changing the
krbd setting while a VM is running to flip the above-mentioned check
and the early return check removed by this patch), it might not even
be problematic. At least a quick test worked fine:
1. take snapshot via a monitor command in QEMU
2. remove snapshot via the storage layer
3. create a new file in the VM
4. take a snapshot with the same name via monitor command in QEMU
5. roll back to the snapshot
6. check that the file in the VM is as expected
Using the storage layer to take the snapshots and QEMU to remove the
snapshot also worked when doing the same test. Even if it were
problematic, the check in qemu-server should rather be improved
instead.
(Trying to issue a snapshot mon command for a krbd-mapped image fails
with an error on the other hand, but that is also not too bad and not
relevant to the storage code. Again, it would rather be necessary to
improve the check in qemu-server).
The fact that the pve-container code didn't even pass $running is the
reason removing snapshots worked for containers on a storage with krbd
disabled (the pve-container code calls map_volume() explicitly, so
containers can work regardless of the krbd setting in the storage
configuration; see commit 841fba6 ("call map_volume before using
volumes.") in pve-container).
For volume_snapshot(), the early return with $running was already
removed (or rather the relevant logic moved to QemuServer.pm) in 2015
by commit f5640e7 ("remove running from Storage and check it in
QemuServer"), even before krbd support was added in RBDPlugin.pm.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Reviewed-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
To be correct in all cases, it's still necessary to filter by "pool"
in zfs_parse_zvol_list(), because $scfg->{pool} could be e.g.
'foo/vm-123-disk-0' which looks like an image name and would pass the
other "should skip"-checks in zfs_parse_zvol_list().
No change in the result of zfs_list_zvol() is intended.
Suggested-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
The 'pool' property in the result of zfs_parse_zvol_list() was not
used for anything else.
No functional change is intended.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Avoids the need for the fallback for the (only existing) caller.
Note that the old my $list = (); is a rather opaque way of
assigning undef.
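For illustration:

    my $list = ();     # the empty list collapses to undef in scalar assignment
    my $list = undef;  # same effect, stated explicitly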
Suggested-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Rename the "dirs" parameter to "content-dirs". Switch from a "vtype:/dir"
format to "vtype=/dir", and remove the misleading error message talking
about "absolute" paths. One might expect these to be absolute over the
whole system, while in reality they are relative to the mountpoint of
the storage.
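An illustrative entry with the new name and format (values made up):

    dir: local-dir
        path /mnt/dir
        content iso,backup
        content-dirs iso=/iso-images,backup=/dumps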
Signed-off-by: Leo Nunner <l.nunner@proxmox.com>
Allowing overrides for the default directory locations seems to
integrate rather well into the existing system. Custom locations
are specified using the "dirs" parameter as a comma-separated list
of "vtype:/location" values.
For now, the option has been enabled for the Directory, CIFS and NFS
backends.
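For example (values made up):

    nfs: nfs-store
        server 192.0.2.20
        export /export/storage
        content iso,vztmpl
        dirs iso:/custom-iso,vztmpl:/custom-templates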
Signed-off-by: Leo Nunner <l.nunner@proxmox.com>
There is no caller using $cache and the same $storeid multiple times,
so there is no need to keep the cache.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
The plugin for remote ZFS storages currently also uses the same
list_images() as the plugin for local ZFS storages. There is only
one cache which does not remember the target host where the
information originated.
This is problematic for rescan in qemu-server:
1. Via list_images() and zfs_list_zvol(), ZFSPlugin.pm's zfs_request()
is executed for a remote ZFS.
2. $cache->{zfs} is set to the result of scanning the images there.
3. Another list_images() for a local ZFS storage happens and uses the
cache with the wrong information.
The only two operations using the cache currently are
1. Disk rescan in qemu-server which is affected by the issue. It is
   done as part of (or as) a long-running operation.
2. prepare_local_job for pvesr, but the non-local ZFS plugin doesn't
support replication, so it cannot trigger there. The cache only
helps if there is a replicated guest with volumes on different
ZFS storages, but otherwise it will be less efficient than no
cache or querying every storage by itself.
Fix the issue by making the cache $storeid-specific, which effectively
makes the cache superfluous, because there is no caller using the same
$storeid multiple times. As argued above, this should not really hurt
the callers described above much and actually improves performance for
all other callers.
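A minimal sketch of the $storeid-specific shape (assumed, not the
literal patch):

    # keyed by storage, so remote and local results cannot mix
    $cache->{zfs}->{$storeid} //= $class->zfs_list_zvol($scfg);
    my $zfs_list = $cache->{zfs}->{$storeid};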
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
If both type and vmid are defined, we don't need to list the current
snapshots; we can simply derive the single backup group from them and
let the PBS client handle the rest.
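Sketch of the shortcut (PBS group naming, e.g. 'vm/100' or 'ct/200'):

    my $group;
    # with both pieces known, the backup group is already determined
    $group = "$type/$vmid" if defined($type) && defined($vmid);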
This should be a not-so-small speedup for most setups using PBS
backups with pruning configured on the PVE side, as vzdump calls this
separately for every vmid in backup jobs that include multiple guests.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
The default timeout in PVE/RADOS.pm is 5 seconds, but this is not
always enough for external clusters under load. Workers can and should
take their time to not fail here too quickly.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
In preparation to increase the timeout for workers. Neither of the
two existing callers of librados_connect() currently uses the
parameter.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
The return value of get_rbd_dev_path() is only used when $scfg->{krbd}
evaluates to true and the function shouldn't have any side effects
that are needed later, so the call can be avoided otherwise.
This also saves a RADOS connection and command with configurations for
external clusters with krbd disabled.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
When switching this from calling the external binary to
using the perl api client, the timeout got reduced to 7
seconds, which is definitely insufficient for larger stores.
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
while the resulting backups are encrypted, they would not be restorable
using the master key (only) if the original PVE system is lost.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
if the key file doesn't exist (anymore), but the storage.cfg references
one, die on commands that should use encryption instead of falling back
to plain-text operations.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Tested-by: Stoiko Ivanov <s.ivanov@proxmox.com>
Before af07f67 ("pbs: use vmid parameter in list_snapshots") the
namespace was set via do_raw_client_command, but now it needs to be
set explicitly here.
Fixes: af07f67 ("pbs: use vmid parameter in list_snapshots")
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
Particularly for operations such as pruning backups after a
scheduled backup, we do not want to list the entire
store.
(pbs_api_connect is moved up unmodified)
Note that the 'snapshots' CLI command only takes a full
group, but the API does allow specifying a backup-id without
a backup-type!
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
by refactoring it into a helper and using that.
With this, we can omit 'update_volume_notes' in the subclasses.
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
and refactored usages for .log and .notes with them.
In some places in the test case code I had to introduce new variables
to shorten the line length so as not to exceed the 100-column limit.
Signed-off-by: Daniel Tschlatscher <d.tschlatscher@proxmox.com>
Reviewed-by: Fabian Ebner <f.ebner@proxmox.com>
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
The changes in cfe46e2d4a did not catch
all situations.
In the case of a guest having 2 disk images with the same name on a pool
with the same name but in two different ceph clusters we still had
issues when starting it. The first disk got mapped as expected. The
second disk did not get mapped, because we returned the old $path
"/dev/rbd/<pool>/<image>", as it already existed from the first
disk.
In the case that only the "old" /dev/rbd path exists and we do not have
the /dev/rbd-pve/<cluster>/... path available, we now check if the
cluster fsid used by that rbd device matches the one we expect. If it
does, then we are in the situation that the image has been mapped before
the new rbd-pve udev rule was introduced. If it does not, then we have
the situation of an ambiguous mapping in /dev/rbd and return the
$pve_path.
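In rough pseudo-Perl, the resulting decision looks like this
(get_device_fsid() is a hypothetical helper reading the fsid used by
the mapped device):

    if (-e $path && !-e $pve_path) {
        # mapped before the rbd-pve udev rule existed -> old path is valid
        return $path if get_device_fsid($path) eq $cluster_fsid;
    }
    return $pve_path; # ambiguous /dev/rbd mapping: use the fsid-qualified path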
Signed-off-by: Aaron Lauterer <a.lauterer@proxmox.com>
When a data-pool is configured, use it for status infos. The 'data-pool'
config option is used to mark the erasure coded pool while the 'pool'
will be the replicated pool holding metadata such as the omap.
This means the 'pool' will only use a small amount of space, and
people are interested in how much they can store in the erasure coded
pool anyway.
Therefore this patch reorders the assignment of the used pool name by
availability of the scfg parameters: data-pool -> pool -> fallback 'rbd'
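In code, the described ordering boils down to something like:

    my $pool = $scfg->{'data-pool'} // $scfg->{pool} // 'rbd';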
Signed-off-by: Aaron Lauterer <a.lauterer@proxmox.com>
happens in case of a mistyped poolname, and the new message should be
more helpful than:
`Use of uninitialized value $free in addition (+) at \
/usr/share/perl5/PVE/Storage/RBDPlugin.pm line 64`
Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
the fallback to a default pool name of 'rbd' was introduced in:
1440604a4b
and worked for the status command, because it used the `rados_cmd`
sub.
This fallback was lost with the changes in:
41aacc6cde
leading to confusing errors:
`Use of uninitialized value in string eq at \
/usr/share/perl5/PVE/Storage/RBDPlugin.pm line 633`
(e.g. in the journal from pvestatd)
Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
to avoid calls into RADOS connect that trigger 'RPCEnv not
initialized' breakage in regression tests, but wouldn't really work
otherwise either.
In the future, the RBD $scfg could actually support this (or a
similarly named) property, to save it on storage addition and then
avoid frequent mon commands.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
When krbd is used, subsequent removal after an operation
involving a rename could fail with
> librbd::image::PreRemoveRequest: 0x559b7506a470 \
> check_image_watchers: image has watchers - not removing
because the old mapping was still present.
For both operations with a rename, the owning guest should be offline,
but even if it weren't, unmap simply fails when the volume is in-use.
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
it only redirected to get_rbd_dev_path with the same signature, and
both are private subs.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
the new udev rule is expected to be in place and active, switching the
checks around means 1 instead of 2 stat()s in this rather hot code path.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
By adding our own customized rbd udev rules and ceph-rbdnamer we can
create device paths that include the cluster fsid and avoid any
ambiguity if the same pool and namespace combination is used in
different clusters we connect to.
In addition to the '/dev/rbd/<pool>/...' paths, we now have
'/dev/rbd-pve/<cluster fsid>/<pool>/...' paths.
The other half of the patch makes use of the new device paths in the RBD
plugin.
The new 'get_rbd_dev_path' method returns the full device path. In
case the image has been mapped before the rbd-pve udev rule was
installed, it returns the old path.
The cluster fsid is read from the 'ceph.conf' file in the case of a
hyperconverged setup. In the case of an external Ceph cluster we need to
fetch it via a rados api call.
Co-authored-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Signed-off-by: Aaron Lauterer <a.lauterer@proxmox.com>
When writing into the file, explicitly utf8 encode it, and then try
to utf8 decode it on read.
If the notes are not valid utf8, we assume they were iso-8859 encoded
and return them as is.
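A minimal sketch of the read path using the core Encode module (not
the literal patch):

    use Encode qw(decode);
    # strict decode: dies on invalid utf8 instead of inserting replacements
    my $notes = eval { decode('UTF-8', $raw, Encode::FB_CROAK) };
    return $notes // $raw; # fall back to the raw (assumed iso-8859) bytes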
Technically this is a breaking change, since there are iso-8859
comments that would successfully decode as utf8, for example: the
byte sequence "C2 A9" would be "Â©" in iso, but would decode to "©".
From what I can tell though, this is rather unlikely to happen for
"real world" notes, because the first byte would be in the range of
C0-F7 (which are mostly language-dependent characters like "Â") and
the following bytes would have to be in the range of 80-BF, which are
only special characters like "£" (or undefined).
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
With the 30s we got for sync API calls, 10s still leaves enough room
for answering and other stuff.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Since most zfs operations can take a while (under certain conditions),
increase the minimum timeout for zfs_request in workers to 5 minutes.
We cannot increase the timeouts in synchronous API calls, since they
are hard-limited to 30 seconds, but in workers we do not have such
limits.
The existing default timeout does not change (60 minutes in a worker,
5 seconds otherwise), but all zfs_requests with a set timeout
(< 5 minutes) will use the increased 5 minutes in a worker.
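Sketched, the clamping behaves like this ($is_worker stands in for
the actual worker detection):

    my $default = $is_worker ? 60*60 : 5; # defaults stay unchanged
    $timeout //= $default;
    # enforce the new 5 minute minimum, but only in workers
    $timeout = 5*60 if $is_worker && $timeout < 5*60;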
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
The ability to mark backups as protected broke the implicit
assumption in vzdump that remove=1 and the current number of backups
being at the limit (i.e. the sum of all keep options) will result in
a backup being removed.
Introduce a new storage property 'max-protected-backups' to limit the
number of protected backups per guest. Use 5 as a default value, as it
should cover most use cases, while still not having too big of a
potential overhead in many scenarios.
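For example (hypothetical storage.cfg entry overriding the default):

    pbs: backup-store
        server 192.0.2.30
        datastore tank
        content backup
        max-protected-backups 10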
For external plugins that do not return the backup subtype in
list_volumes, all protected backups with the same ID will count
towards the limit.
An alternative would be to count the protected backups when pruning.
While that would avoid the need for a new property, it would break the
current semantics of protected backups being ignored for pruning. It
also would be less flexible, e.g. for PBS, it can make sense to have
both keep-all=1 and a limit for the number of protected snapshots on
the PVE side.
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
Otherwise, there is no storage-agnostic way to filter by backup group.
Call it subtype, to not confuse it with content type, and to be able
to re-use it for other content types than backup, if the need ever
arises.
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
Previously, the transport format (which currently is always 'zfs')
was passed in, resulting in subvol-disks not being renamed correctly.
Fixes: a97d3ee ("Introduce allow_rename parameter for pvesm import and storage_migrate")
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
`qemu-img info --output=json` returns the size and used values as integers in
the JSON format, but the regex match converts them to strings.
As we know they only contain digits, we can simply cast them back to integers
after the regex.
The API requires them to be integers.
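i.e. something like (variable names assumed):

    # the regex captures yield strings; force them back to integers
    $size = int($size);
    $used = int($used) if defined($used);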
Signed-off-by: Mira Limbeck <m.limbeck@proxmox.com>
Reviewed-by: Fabian Ebner <f.ebner@proxmox.com>
The first step is to allocate rbd images correctly.
The metadata objects still need to be stored in a replicated pool, but
by providing the --data-pool parameter on image creation, we can place
the data objects on the erasure coded (EC) pool.
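For example (pool names are placeholders):

    # metadata objects go to the replicated pool, data objects to the EC pool
    rbd create --pool rbd_meta --data-pool rbd_ec --size 10G vm-100-disk-0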
Signed-off-by: Aaron Lauterer <a.lauterer@proxmox.com>
Some versions of ZFS do not automatically display the child snapshots
when '-t snapshot' is used, but require '-r' to be present
additionally[1]. And in general, it's cleaner to specify the flag
explicitly.
Because of that, commit ac5c1af led to a regression[0] in the context
of ZFS over iSCSI with zfs_get_sorted_snapshot_list. Fix it by adding
the -r flag again.
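For example:

    # without -r, some ZFS versions list no child snapshots here
    zfs list -t snapshot -r tank/vm-100-disk-0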
The volume_snapshot_info function is currently only used in the
context of replication and that requires a local ZFS pool, but it
would be affected by the same issue if it is ever used in the context
of ZFS over iSCSI, so also add -r there.
[0]: https://forum.proxmox.com/threads/102683/
[1]: https://forum.proxmox.com/threads/102683/post-442577
Fixes: 8c20d8a ("plugin: add volume_snapshot_info function")
Fixes: ac5c1af ("zfspool: add zfs_get_sorted_snapshot_list helper")
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>