Commit Graph

43 Commits

79f4a7bfd3 fix #2266: Diskmanage: get correct osd id
The osdid can consist of multiple digits, cope with that and add more
regression tests for this

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
2019-07-08 19:04:08 +02:00
35ccd3d004 fix missing osd info for osd 0
0 is falsy, we have to check for definedness
also adapt the tests so we test for this

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
2019-07-03 08:59:47 +02:00
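
The falsy-zero pitfall described above is not specific to Perl. A minimal Python sketch of the same mistake and its fix follows; the function names and the "N/A" placeholder are illustrative assumptions, not taken from the Diskmanage code.

```python
from typing import Optional


def osd_label_buggy(osd_id: Optional[int]) -> str:
    # Truthiness check: OSD 0 is treated the same as "no OSD at all".
    return f"osd.{osd_id}" if osd_id else "N/A"


def osd_label(osd_id: Optional[int]) -> str:
    # Definedness check: only a missing id (None) falls back to the placeholder.
    return f"osd.{osd_id}" if osd_id is not None else "N/A"
```

With these definitions, only the second variant reports OSD 0 correctly.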
9250ddfe3a Diskmanage: correctly add wearout value of 0
if wearout is 0 we showed 'N/A' instead of 100%
(wearout is really the 'life left' value)

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
2019-06-04 13:14:45 +02:00
c5fa45a99f Diskmanage: fix incorrect variable usage
sysdir is a string, we wanted sysdata

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
2019-06-04 13:14:45 +02:00
ea928fd41a Diskmanage: extract nvme wearout from smartctl text
extract the info from the line:
Percentage Used: XX%

also adapt the tests

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
2019-06-04 13:14:45 +02:00
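
A rough Python sketch of extracting the value from the `Percentage Used: XX%` line. The function name is hypothetical, and converting the value to "life left" by subtracting it from 100 is an assumption based on the remark in this log that wearout is really the 'life left' value.

```python
import re
from typing import Optional

# Matches the smartctl text line quoted in the commit message.
PERCENTAGE_USED_RE = re.compile(r"^Percentage Used:\s*(\d+)%", re.MULTILINE)


def nvme_wearout(smartctl_text: str) -> Optional[int]:
    """Return the remaining life in percent, or None if the line is absent."""
    match = PERCENTAGE_USED_RE.search(smartctl_text)
    if match is None:
        return None
    # Assumption: wearout is reported as 'life left', i.e. 100 minus the used percentage.
    return 100 - int(match.group(1))
```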
1dc3038d40 Diskmanage: add append_partition sub
we will use this for adding a partition to a disk when using a device
for ceph osd db/wal which already has partitions on it

first we search for the highest partition number, then add the partition
and search for the resulting device (we cannot assume to simply
append the number, e.g. from /dev/nvme0n1 we get /dev/nvme0n1pX)

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
2019-06-04 12:51:44 +02:00
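
The first step, finding the highest existing partition number, might look roughly like this in Python. The sketch assumes the candidate names are already available as a list (for example the entries of the disk's directory under /sys/block); the function name is hypothetical. It accepts both the `sda1` and the `nvme0n1p1` naming styles, which is the reason the commit does not simply append a number.

```python
import re
from typing import Iterable


def highest_partition_number(disk: str, entries: Iterable[str]) -> int:
    """Return the highest partition number found for `disk`, or 0 if there is none."""
    # Partitions are named <disk><N> (sda1) or <disk>p<N> (nvme0n1p1).
    pattern = re.compile(rf"^{re.escape(disk)}p?(\d+)$")
    numbers = [int(m.group(1)) for entry in entries if (m := pattern.match(entry))]
    return max(numbers, default=0)
```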
52a064afcd Diskmanage: allow get_disks to take multiple disks
we now expect the first parameter to be either a string with a single
disk, or an array ref with a list of disks

this way we can get the info of multiple disks simultaneously while
not iterating over all disks

this will be used to get the info for osd/db/wal disk

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
2019-06-04 12:51:44 +02:00
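
The "single string or list of disks" parameter can be normalized up front so the rest of the function only deals with one shape. A small Python sketch of that idea, with a hypothetical helper name:

```python
from typing import Iterable, Optional, Set, Union


def normalize_disk_filter(disks: Union[None, str, Iterable[str]]) -> Optional[Set[str]]:
    """Turn the caller's argument into a set of disk names, or None for 'all disks'."""
    if disks is None:
        return None
    if isinstance(disks, str):
        # A single disk given as a plain string.
        return {disks}
    # Any other iterable is treated as a list of disks.
    return set(disks)
```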
0180fa427f followup: get_disks: use own variable for frequent access
Less reading, and giving the variable its own name should help to grasp
more quickly what it should contain

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2019-05-31 12:04:42 +02:00
248f43f58b followup: get_ceph_volume_infos: code and comment cleanup
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2019-05-31 12:04:13 +02:00
19dcd1adcb Diskmanage: detect osds/journals/etc. created with ceph-volume
ceph-volume creates osds/journal/etc. on LVM instead of partitions,
so to detect them, we have to parse the lv_tags of the LVs and
match them with the underlying device

also add tests for this detection

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
2019-05-31 11:41:33 +02:00
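
A minimal Python sketch of splitting an lv_tags string into key/value pairs. It assumes the tags arrive as one comma-separated `key=value` string; the tag names used in the usage example (`ceph.osd_id`, `ceph.type`) are assumptions for illustration.

```python
from typing import Dict


def parse_lv_tags(lv_tags: str) -> Dict[str, str]:
    """Split a comma-separated 'key=value' tag string into a dict."""
    tags: Dict[str, str] = {}
    for item in lv_tags.split(","):
        key, sep, value = item.partition("=")
        if sep:  # skip entries that are not key=value pairs
            tags[key] = value
    return tags
```

For example, `parse_lv_tags("ceph.osd_id=12,ceph.type=block")` keeps the full multi-digit id as the string `"12"`.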
501562d4b7 code cleanup
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2019-04-08 17:48:59 +02:00
4526dffa53 Diskmanage: don't run zpool if not present
Since zfsutils are not a hard dependency of our stack it is possible to not have
`zpool` available.

Checking for existence of `zpool` before calling it suppresses spurious warnings
in the logs (e.g. when creating Ceph OSDs or accessing the 'Disk' Tab in the
GUI).

Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
2019-04-08 17:47:07 +02:00
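
The same guard, sketched in Python: look the binary up first and return an empty result if it is missing. The function name and the use of `zpool list -H -o name` are illustrative assumptions; the actual check in the module may differ.

```python
import shutil
import subprocess
from typing import List


def list_zpool_names(binary: str = "zpool") -> List[str]:
    """Return the pool names, or an empty list if the binary is not installed."""
    if shutil.which(binary) is None:
        # No ZFS utilities present: nothing to query, and nothing to warn about.
        return []
    result = subprocess.run(
        [binary, "list", "-H", "-o", "name"],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return []
    return [line for line in result.stdout.splitlines() if line]
```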
eebcdb1119 fix tests when one has iscsi devices
the test would read the real device and if one is an iscsi device
it would fail, move the test code to a sub and mock it in the tests

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
2019-03-07 11:08:43 +01:00
f2e5018e70 diskmanage: fix device encoding handling
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2019-02-04 11:39:02 +01:00
c3442aa554 Fix #2020: use /sys to map nvmeXnY to nvmeX
`nvmeX` device nodes are apparently allocated independently
from their namespace block devices `nvmeXnY` and therefore
they are not strictly related by name. For instance:
  $ readlink /sys/block/nvme0n1/device
  ../../nvme1
  $ readlink /sys/block/nvme1n1/device
  ../../nvme0

Here /dev/nvme0n1 is the first namespace of /dev/nvme1 while
/dev/nvme1n1 is the first namespace of /dev/nvme0.

Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2018-12-10 14:54:11 +01:00
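
A short Python sketch of the lookup shown in the `readlink` example: resolve the `device` link of the namespace and take its last path component. The function name and the `sys_block` parameter (added so the lookup can be pointed at a test directory) are assumptions.

```python
import os


def nvme_controller_for(namespace: str, sys_block: str = "/sys/block") -> str:
    """Map a namespace block device (nvmeXnY) to its controller (nvmeZ) via sysfs."""
    link = os.readlink(os.path.join(sys_block, namespace, "device"))
    # The link target ends in the controller name, e.g. '../../nvme1'.
    return os.path.basename(link)
```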
0370861cfd diskmanage: rename check_unused to assert_disk_unused
to have a clear method name for this. check_XYZ suggests also that we
return true if the check was OK, but we don't.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2018-10-03 14:51:38 +02:00
76c1e57be7 refactor disk/storage checks for Disk API
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
2018-10-03 14:30:22 +02:00
e39e8ee213 refactor diskmanagement lock_file calls
so that we only have one place where we reference the lockfile
and the timeout

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
2018-08-08 12:01:02 +02:00
fc7c0e0504 (finally) fix #1099: return disklist correctly
in get_disks, when called with a parameter 'cciss/cXdY', we replaced
the '/' with '!' so that we can properly poll the information
about it from /sys/block/

but we have to replace the '!' with '/' again in our result list,
because the caller does not know anything about it and fails, because
the original dev is not in the list

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
2017-11-14 13:23:52 +01:00
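
The round trip between the caller's name and the /sys/block name can be sketched in Python as two small helpers (the names are hypothetical). The point of the fix is that results are translated back before being returned.

```python
def dev_to_sysfs_name(dev: str) -> str:
    """'cciss/c0d0' -> 'cciss!c0d0', the form used under /sys/block."""
    return dev.replace("/", "!")


def sysfs_name_to_dev(name: str) -> str:
    """'cciss!c0d0' -> 'cciss/c0d0', the form the caller passed in."""
    return name.replace("!", "/")
```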
e2bd817ccc add info about bluestore to disklist
this patch adds information about bluestore/db/wal to the disklist,
and we set the journal count only when we have at least one journal on
the disk

also adapt the regression tests

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
2017-08-07 14:52:10 +02:00
1590fc13bf fix #1099 (again): map cciss/c0d0 to cciss!c0d0
there was still a point where we got the wrong string
on createosd we get the devpath (/dev/cciss/c0d0)

but need the info from get_disks, which looks in /sys/block
where it needs to be cciss!c0d0

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
2017-01-05 15:54:11 +01:00
0d28307d2f Add function get_blockdev.
This function will return you the block device of a given partition path.
2016-12-22 12:14:34 +01:00
3196c387d2 Add new function part_num
With this function you get the partnum of a dev.
2016-12-22 12:12:47 +01:00
7b32e2a7cd add crucial smart attribute for wear leveling
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
2016-11-30 17:20:09 +01:00
bd54091c2a catch '---' in threshold output of sandisk ssds
sandisk ssds have a default threshold of '---' on nearly all fields,
which prevents our parsing

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
2016-11-30 17:19:43 +01:00
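
A rough Python sketch of an attribute-line parser that tolerates `---` in the threshold column. The ten-column row layout is an assumption modelled on typical smartctl attribute output, and the function name is hypothetical.

```python
import re
from typing import Any, Dict, Optional

ATTRIBUTE_RE = re.compile(
    r"^\s*(?P<id>\d+)\s+(?P<name>\S+)\s+(?P<flag>\S+)\s+"
    r"(?P<value>\d+)\s+(?P<worst>\d+)\s+(?P<threshold>\d+|---)\s+"
    r"(?P<type>\S+)\s+(?P<updated>\S+)\s+(?P<failed>\S+)\s+(?P<raw>.+?)\s*$"
)


def parse_attribute(line: str) -> Optional[Dict[str, Any]]:
    """Parse one attribute row; return None for lines that are not attribute rows."""
    match = ATTRIBUTE_RE.match(line)
    if match is None:
        return None
    attr: Dict[str, Any] = match.groupdict()
    for key in ("id", "value", "worst"):
        attr[key] = int(attr[key])
    # '---' means the drive reports no threshold for this attribute.
    attr["threshold"] = None if attr["threshold"] == "---" else int(attr["threshold"])
    return attr
```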
571b6f2638 add default rotational value
because if the file does not exist,
we get a Perl error for comparing an uninitialized
value

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
2016-10-28 10:20:28 +02:00
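
The idea of falling back to a default when the file is missing, sketched in Python. The function name and the default of -1 (meaning "unknown") are assumptions; the commit message does not state which default was chosen.

```python
def read_rotational(path: str, default: int = -1) -> int:
    """Read a sysfs 'rotational' flag, returning `default` if it cannot be read."""
    try:
        with open(path) as fh:
            return int(fh.readline().strip())
    except (OSError, ValueError):
        # Missing file or unexpected content: fall back instead of failing later.
        return default
```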
865bdbd917 use model from udevadm
we want this, because the model in /sys/block/<device>/device/model
is limited to 16 characters

and since the model is not always in the udevadm output (nvme),
also read the model from the model file as fallback

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
2016-10-28 10:19:40 +02:00
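
A small Python sketch of the "udevadm first, model file as fallback" order. It assumes the udevadm output contains property lines of the form `E: ID_MODEL=...`; the function and parameter names are hypothetical.

```python
from typing import Optional

PREFIX = "E: ID_MODEL="


def disk_model(udev_output: str, sysfs_model: Optional[str] = None) -> Optional[str]:
    """Prefer the model from udevadm output; fall back to the sysfs model file content."""
    for line in udev_output.splitlines():
        if line.startswith(PREFIX):
            return line[len(PREFIX):].strip()
    # Not every device has ID_MODEL in its udev properties, so use the fallback.
    return sysfs_model.strip() if sysfs_model else None
```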
10a48db55c make dir_is_empty a proper sub
this allows us later to mock the sub,
which we need for testing

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
2016-10-28 10:19:40 +02:00
461a9fd8da move directory test into get_sysdir_info
because it logically belongs there, also
this makes the testing easier

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
2016-10-28 10:19:40 +02:00
532e89e7be use /sys/block/ path for udev instead of name
since we iterate over the entries in /sys/block
it makes sense to use this path

this should fix #1099
because udevadm does not take
 -n cciss!c0d0 (because it only looks in dev for this)
but takes
 -p /sys/block/cciss!c0d0

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
2016-10-28 10:19:40 +02:00
d57fb43d9b disks: fix warnings
2016-10-07 09:28:06 +02:00
38ddd4ce8e correctly parse cciss devices in get_disks
cciss maps the devices to
/sys/block/cciss!cXdY

and not to

/sys/block/cXdY

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
2016-10-06 07:50:05 +02:00
6965a6701e fix #1135: refactor wear level parsing
refactored the wear level parsing into its
own function, where we can now define a
vendor <-> attribute id
mapping

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
2016-10-06 07:49:30 +02:00
dd902da78e merge get_smart_data/health
instead of parsing the output of smart in two places,
give get_smart_data a flag if we only want health

this fixes a bug (not on the bugtracker), where
an ssd with disabled smart had an empty string as health
in the gui

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
2016-10-06 07:48:57 +02:00
c9bd3d22b2 fix #1123: modify NVME device path for SMART support
seems like we need to drop the namespace part before calling
smartctl.

note that NVME devices require smartmontools >= 6.5
2016-10-04 10:35:30 +02:00
dc1311cbe8 fix #1120: non-ATA disks with unstructured smart data
simple parser as fallback for non-ATA disks, just collect
all lines in the "SMART DATA" section.
2016-10-04 10:34:22 +02:00
5db2d529ed fix #1120: SCSI SMART health parsing
SCSI disks report their health as "SMART Health Status: OK"
2016-10-04 10:34:06 +02:00
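
A Python sketch of a health parser that accepts both wordings. The SCSI line is the one quoted in the commit message; the ATA wording in the pattern is an assumption based on common smartctl output, and the function name is hypothetical.

```python
import re
from typing import Optional

HEALTH_RE = re.compile(
    r"^SMART (?:overall-health self-assessment test result|Health Status):\s*(\S+)",
    re.MULTILINE,
)


def parse_health(smartctl_text: str) -> Optional[str]:
    """Return the reported health string (e.g. 'PASSED' or 'OK'), or None if absent."""
    match = HEALTH_RE.search(smartctl_text)
    return match.group(1) if match else None
```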
7a98a62dbc disks: allow get_disks without smart checks
the smart checks are only needed for the API call(s) that
list all disks and their status, but get_disks is also used
in disk usage checks and in the Ceph code, where the smart
status is completely irrelevant.

drop the implicit skipping of smart checks if $disk is set,
since we have an explicit parameter for this now.
2016-10-04 10:33:49 +02:00
1c99955364 disks: parse smart attributes using RE
2016-09-29 08:42:06 +02:00
0c486b09df disks: use smartctl -H -A
to only list SMART health and attributes, instead of
"smartctl -a", which prints "all SMART information"
2016-09-29 08:41:31 +02:00
acd3d91649 move SMART error handling into get_disks
because we never ever want to die in get_disks because of a
single disk, but the nodes/xyz/disks/smart API path is
allowed to fail if a disk device is unsupported by smartctl
or something else goes wrong.
2016-09-29 08:40:19 +02:00
9018a4e639 do not automatically die on smartctl exit code > 0
since smartctl uses the return value to encode
disk health status (such as failure in the past)
we cannot die there, but have to parse the returncode

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
2016-09-08 16:52:33 +02:00
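
A Python sketch of treating the smartctl exit status as a bitmask instead of a plain success/failure flag. The bit meanings follow the smartctl manual page; which bits count as fatal (`fatal_mask`) and the function name are assumptions, not the module's actual choice.

```python
from typing import List

# Health-related bits of the smartctl exit status (bit number -> meaning).
STATUS_BITS = {
    3: "disk failing",
    4: "prefail attributes at or below threshold",
    5: "attributes at or below threshold in the past",
    6: "error log contains errors",
    7: "self-test log contains errors",
}


def interpret_smartctl_exit(returncode: int, fatal_mask: int = 0b011) -> List[str]:
    """Raise only on invocation errors; report health bits as a list of findings."""
    if returncode & fatal_mask:
        # Bits 0 and 1: command line did not parse / device could not be opened.
        raise RuntimeError(f"smartctl failed with exit code {returncode}")
    return [text for bit, text in STATUS_BITS.items() if returncode & (1 << bit)]
```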
cbba9b5b9c add Diskmanage Utilities
this adds the functions for listing the disks (mostly copied from
the ceph code), checking if a disk is a valid blockdevice, if it
is used/in a zfs pool/as an lvm pv, and an init function (just to add a gpt header;
this is important if one wants to use a fresh disk for ceph journals)

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
2016-09-05 11:31:19 +02:00