.. SPDX-License-Identifier: GPL-2.0

======================
Generic vcpu interface
======================

The virtual cpu "device" also accepts the ioctls KVM_SET_DEVICE_ATTR,
KVM_GET_DEVICE_ATTR, and KVM_HAS_DEVICE_ATTR. The interface uses the same struct
kvm_device_attr as other devices, but targets VCPU-wide settings and controls.

The groups and attributes per virtual cpu, if any, are architecture specific.

1. GROUP: KVM_ARM_VCPU_PMU_V3_CTRL
==================================

:Architectures: ARM64

1.1. ATTRIBUTE: KVM_ARM_VCPU_PMU_V3_IRQ
---------------------------------------

:Parameters: in kvm_device_attr.addr the address for PMU overflow interrupt is a
             pointer to an int

Returns:

    =======  ========================================================
    -EBUSY   The PMU overflow interrupt is already set
    -EFAULT  Error reading interrupt number
    -ENXIO   PMUv3 not supported or the overflow interrupt not set
             when attempting to get it
    -ENODEV  KVM_ARM_VCPU_PMU_V3 feature missing from VCPU
    -EINVAL  Invalid PMU overflow interrupt number supplied or
             trying to set the IRQ number without using an in-kernel
             irqchip.
    =======  ========================================================

A value describing the PMUv3 (Performance Monitor Unit v3) overflow interrupt
number for this vcpu. This interrupt could be a PPI or SPI, but the interrupt
type must be the same for each vcpu. As a PPI, the interrupt number is the same
for all vcpus, while as an SPI it must be a separate number per vcpu.

For GICv5-based guests, the architected PPI (23) must be used, and must be
communicated as the full GICv5-style Interrupt ID, i.e., 0x20000017. This ioctl
can be omitted altogether for a GICv5-based guest.

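
As an illustration of the PPI/SPI rule for non-GICv5 guests, a VMM could
sanity-check its planned per-vcpu overflow interrupt numbers before issuing
KVM_SET_DEVICE_ATTR. The following is a minimal userspace sketch and not KVM's
implementation: the helper names are hypothetical, and the INTID ranges
(PPI 16-31, SPI 32-1019) are assumed from the pre-GICv5 GIC architecture.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed pre-GICv5 INTID ranges: PPIs are 16..31, SPIs are 32..1019. */
static bool is_ppi(int irq) { return irq >= 16 && irq < 32; }
static bool is_spi(int irq) { return irq >= 32 && irq < 1020; }

/*
 * Hypothetical helper: check a planned PMU overflow interrupt assignment,
 * one entry per vcpu, against the rule described in this section:
 *   - every entry must be a PPI or an SPI,
 *   - all entries must be of the same type,
 *   - PPIs must use the same number on every vcpu,
 *   - SPIs must use a distinct number on every vcpu.
 */
static bool pmu_irqs_consistent(const int *irqs, size_t nr_vcpus)
{
    assert(irqs != NULL || nr_vcpus == 0);

    for (size_t i = 0; i < nr_vcpus; i++) {
        if (!is_ppi(irqs[i]) && !is_spi(irqs[i]))
            return false;

        for (size_t j = 0; j < i; j++) {
            if (is_ppi(irqs[i])) {
                /* Any differing earlier entry is another PPI or an SPI. */
                if (irqs[j] != irqs[i])
                    return false;
            } else {
                /* SPI: reject duplicates and mixing with PPIs. */
                if (irqs[j] == irqs[i] || is_ppi(irqs[j]))
                    return false;
            }
        }
    }
    return true;
}
```

With this sketch, an assignment of PPI 23 to every vcpu passes, as does a set
of distinct SPIs such as 40 and 41, while mixing PPI 23 with SPI 40 is rejected.
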

1.2 ATTRIBUTE: KVM_ARM_VCPU_PMU_V3_INIT
---------------------------------------

:Parameters: no additional parameter in kvm_device_attr.addr

Returns:

    =======  ======================================================
    -EEXIST  Interrupt number already used
    -ENODEV  PMUv3 not supported or GIC not initialized
    -ENXIO   PMUv3 not supported, missing VCPU feature or interrupt
             number not set (non-GICv5 guests, only)
    -EBUSY   PMUv3 already initialized
    =======  ======================================================

Request the initialization of the PMUv3. If using the PMUv3 with an in-kernel
virtual GIC implementation, this must be done after initializing the in-kernel
irqchip.

1.3 ATTRIBUTE: KVM_ARM_VCPU_PMU_V3_FILTER
-----------------------------------------

:Parameters: in kvm_device_attr.addr the address for a PMU event filter is a
             pointer to a struct kvm_pmu_event_filter

:Returns:

    =======  ======================================================
    -ENODEV  PMUv3 not supported or GIC not initialized
    -ENXIO   PMUv3 not properly configured or in-kernel irqchip not
             configured as required prior to calling this attribute
    -EBUSY   PMUv3 already initialized or a VCPU has already run
    -EINVAL  Invalid filter range
    =======  ======================================================

Request the installation of a PMU event filter described as follows::

    struct kvm_pmu_event_filter {
        __u16 base_event;
        __u16 nevents;

    #define KVM_PMU_EVENT_ALLOW 0
    #define KVM_PMU_EVENT_DENY  1

        __u8  action;
        __u8  pad[3];
    };

A filter range is defined as the range [@base_event, @base_event + @nevents),
together with an @action (KVM_PMU_EVENT_ALLOW or KVM_PMU_EVENT_DENY). The
first registered range defines the global policy (global ALLOW if the first
@action is DENY, global DENY if the first @action is ALLOW).
Multiple ranges can be programmed, and must fit within the event space defined
by the PMU architecture (10 bits on ARMv8.0, 16 bits from ARMv8.1 onwards).

Note: "Cancelling" a filter by registering the opposite action for the same
range doesn't change the default action. For example, installing an ALLOW
filter for event range [0:10) as the first filter and then applying a DENY
action for the same range will leave the whole range as disabled.

Restrictions: Event 0 (SW_INCR) is never filtered, as it doesn't count a
hardware event. Filtering event 0x1E (CHAIN) has no effect either, as it
isn't strictly speaking an event. Filtering the cycle counter is possible
using event 0x11 (CPU_CYCLES).

1.4 ATTRIBUTE: KVM_ARM_VCPU_PMU_V3_SET_PMU
------------------------------------------

:Parameters: in kvm_device_attr.addr the address to an int representing the PMU
             identifier.

:Returns:

    =======  ====================================================
    -EBUSY   PMUv3 already initialized, a VCPU has already run or
             an event filter has already been set
    -EFAULT  Error accessing the PMU identifier
    -ENXIO   PMU not found
    -ENODEV  PMUv3 not supported or GIC not initialized
    -ENOMEM  Could not allocate memory
    =======  ====================================================

Request that the VCPU uses the specified hardware PMU when creating guest events
for the purpose of PMU emulation. The PMU identifier can be read from the "type"
file for the desired PMU instance under /sys/devices (or, equivalently,
/sys/bus/event_source). This attribute is particularly useful on heterogeneous
systems where there are at least two CPU PMUs on the system. The PMU that is set
for one VCPU will be used by all the other VCPUs. It isn't possible to set a PMU
if a PMU event filter is already present.

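
Reading the PMU identifier from sysfs could look like the sketch below. The
helper name read_pmu_type() is hypothetical, and the example assumes the "type"
file holds a single decimal integer; the resulting int is what userspace would
point kvm_device_attr.addr at for this attribute.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: read the PMU identifier from a sysfs "type" file.
 * Assumes the file contains one decimal integer.
 * Returns 0 on success and stores the identifier in *pmu_id,
 * or -1 if the file cannot be opened or parsed.
 */
static int read_pmu_type(const char *type_path, int *pmu_id)
{
    FILE *f;
    int parsed;

    assert(type_path != NULL && pmu_id != NULL);

    f = fopen(type_path, "r");
    if (!f)
        return -1;

    /* fscanf() returns the number of items converted; we expect one. */
    parsed = fscanf(f, "%d", pmu_id);
    fclose(f);

    return parsed == 1 ? 0 : -1;
}
```

A caller would typically pass a path of the form
/sys/bus/event_source/devices/<pmu>/type, where <pmu> is the platform-specific
name of the desired PMU instance.
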

Note that KVM will not make any attempts to run the VCPU on the physical CPUs
associated with the PMU specified by this attribute. This is entirely left to
userspace. However, attempting to run the VCPU on a physical CPU not supported
by the PMU will fail and KVM_RUN will return with
exit_reason = KVM_EXIT_FAIL_ENTRY and populate the fail_entry struct by setting
the hardware_entry_failure_reason field to KVM_EXIT_FAIL_ENTRY_CPU_UNSUPPORTED
and the cpu field to the processor id.

1.5 ATTRIBUTE: KVM_ARM_VCPU_PMU_V3_SET_NR_COUNTERS
--------------------------------------------------

:Parameters: in kvm_device_attr.addr the address to an unsigned int
             representing the maximum value taken by PMCR_EL0.N

:Returns:

    =======  ====================================================
    -EBUSY   PMUv3 already initialized, a VCPU has already run or
             an event filter has already been set
    -EFAULT  Error accessing the value pointed to by addr
    -ENODEV  PMUv3 not supported or GIC not initialized
    -EINVAL  No PMUv3 explicitly selected, or value of N out of
             range
    =======  ====================================================

Set the number of implemented event counters in the virtual PMU. This
mandates that a PMU has explicitly been selected via
KVM_ARM_VCPU_PMU_V3_SET_PMU, and will fail when no PMU has been
explicitly selected, or the number of counters is out of range for the
selected PMU. Selecting a new PMU cancels the effect of setting this
attribute.

2. GROUP: KVM_ARM_VCPU_TIMER_CTRL
=================================

:Architectures: ARM64

2.1. ATTRIBUTES: KVM_ARM_VCPU_TIMER_IRQ_{VTIMER,PTIMER,HVTIMER,HPTIMER}
-----------------------------------------------------------------------

:Parameters: in kvm_device_attr.addr the address for the timer interrupt is a
             pointer to an int

Returns:

    =======  =================================
    -EINVAL  Invalid timer interrupt number
    -EBUSY   One or more VCPUs have already run
    =======  =================================

A value describing the architected timer interrupt number when connected to an
in-kernel virtual GIC. These must be a PPI (16 <= intid < 32). Setting the
attribute overrides the default values (see below).

==============================  ==========================================
KVM_ARM_VCPU_TIMER_IRQ_VTIMER   The EL1 virtual timer intid (default: 27)
KVM_ARM_VCPU_TIMER_IRQ_PTIMER   The EL1 physical timer intid (default: 30)
KVM_ARM_VCPU_TIMER_IRQ_HVTIMER  The EL2 virtual timer intid (default: 28)
KVM_ARM_VCPU_TIMER_IRQ_HPTIMER  The EL2 physical timer intid (default: 26)
==============================  ==========================================

Setting the same PPI for different timers will prevent the VCPUs from running.
Setting the interrupt number on a VCPU configures all VCPUs created at that
time to use the number provided for a given timer, overwriting any previously
configured values on other VCPUs. Userspace should configure the interrupt
numbers on at least one VCPU after creating all VCPUs and before running any
VCPUs.

.. _kvm_arm_vcpu_pvtime_ctrl:

3. GROUP: KVM_ARM_VCPU_PVTIME_CTRL
==================================

:Architectures: ARM64

3.1 ATTRIBUTE: KVM_ARM_VCPU_PVTIME_IPA
--------------------------------------

:Parameters: 64-bit base address

Returns:

    =======  ======================================
    -ENXIO   Stolen time not implemented
    -EEXIST  Base address already set for this VCPU
    -EINVAL  Base address not 64 byte aligned
    =======  ======================================

Specifies the base address of the stolen time structure for this VCPU. The
base address must be 64 byte aligned and exist within a valid guest memory
region. See Documentation/virt/kvm/arm/pvtime.rst for more information
including the layout of the stolen time structure.

4. GROUP: KVM_VCPU_TSC_CTRL
===========================

:Architectures: x86

4.1 ATTRIBUTE: KVM_VCPU_TSC_OFFSET
----------------------------------

:Parameters: 64-bit unsigned TSC offset

Returns:

    =======  ======================================
    -EFAULT  Error reading/writing the provided
             parameter address.
    -ENXIO   Attribute not supported
    =======  ======================================

Specifies the guest's TSC offset relative to the host's TSC. The guest's
TSC is then derived by the following equation::

    guest_tsc = host_tsc + KVM_VCPU_TSC_OFFSET

This attribute is useful to adjust the guest's TSC on live migration,
so that the TSC counts the time during which the VM was paused. The
following describes a possible algorithm to use for this purpose.

From the source VMM process:

1. Invoke the KVM_GET_CLOCK ioctl to record the host TSC (tsc_src),
   kvmclock nanoseconds (guest_src), and host CLOCK_REALTIME nanoseconds
   (host_src).

2. Read the KVM_VCPU_TSC_OFFSET attribute for every vCPU to record the
   guest TSC offset (ofs_src[i]).

3. Invoke the KVM_GET_TSC_KHZ ioctl to record the frequency of the
   guest's TSC (freq).

From the destination VMM process:

4. Invoke the KVM_SET_CLOCK ioctl, providing the source nanoseconds from
   kvmclock (guest_src) and CLOCK_REALTIME (host_src) in their respective
   fields. Ensure that the KVM_CLOCK_REALTIME flag is set in the provided
   structure.

   KVM will advance the VM's kvmclock to account for elapsed time since
   recording the clock values. Note that this will cause problems in
   the guest (e.g., timeouts) unless CLOCK_REALTIME is synchronized
   between the source and destination, and a reasonably short time passes
   between the source pausing the VMs and the destination executing
   steps 4-7.

5. Invoke the KVM_GET_CLOCK ioctl to record the host TSC (tsc_dest) and
   kvmclock nanoseconds (guest_dest).

6. Adjust the guest TSC offsets for every vCPU to account for (1) time
   elapsed since recording state and (2) difference in TSCs between the
   source and destination machine::

       ofs_dst[i] = ofs_src[i] -
                    (guest_src - guest_dest) * freq +
                    (tsc_src - tsc_dest)

   ("ofs[i] + tsc - guest * freq" is the guest TSC value corresponding to
   a time of 0 in kvmclock. The above formula ensures that it is the
   same on the destination as it was on the source).

7. Write the KVM_VCPU_TSC_OFFSET attribute for every vCPU with the
   respective value derived in the previous step.
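
The arithmetic of step 6 can be sketched in C as follows. This is an
illustration under stated assumptions rather than a reference implementation:
the function name is hypothetical, kvmclock values are taken to be in
nanoseconds and freq in kHz (as reported by KVM_GET_TSC_KHZ), so the product is
divided by 1,000,000 to obtain TSC cycles, and the intermediate product is
assumed to fit in 64 bits.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: compute the destination TSC offset for one vCPU.
 *
 * Assumptions (not stated by the formula itself):
 *   - guest_src_ns and guest_dest_ns are kvmclock values in nanoseconds,
 *   - freq_khz is the guest TSC frequency in kHz,
 *   - ns * kHz / 1000000 yields TSC cycles,
 *   - the time delta is small enough that delta_ns * freq_khz does not
 *     overflow a signed 64-bit integer.
 */
static uint64_t tsc_offset_for_dest(uint64_t ofs_src,
                                    uint64_t guest_src_ns,
                                    uint64_t guest_dest_ns,
                                    uint64_t tsc_src,
                                    uint64_t tsc_dest,
                                    uint64_t freq_khz)
{
    int64_t delta_ns;
    int64_t delta_cycles;

    assert(freq_khz != 0);

    /* (guest_src - guest_dest), usually negative since time has passed. */
    delta_ns = (int64_t)(guest_src_ns - guest_dest_ns);

    /* Convert the kvmclock delta from nanoseconds to TSC cycles. */
    delta_cycles = delta_ns * (int64_t)freq_khz / 1000000;

    /* Offsets are 64-bit values; unsigned arithmetic wraps modulo 2^64. */
    return ofs_src - (uint64_t)delta_cycles + (tsc_src - tsc_dest);
}
```

For example, with a 2 GHz guest TSC (freq_khz = 2000000), 2 ms of kvmclock time
elapsed between source and destination, and a host TSC difference of 7,000,000
cycles, a source offset of 1000 becomes 11,001,000 on the destination.
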