Commit 221bb8a4 authored by Linus Torvalds's avatar Linus Torvalds
Browse files

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM updates from Paolo Bonzini:

 - ARM: GICv3 ITS emulation and various fixes.  Removal of the
   old VGIC implementation.

 - s390: support for trapping software breakpoints, nested
   virtualization (vSIE), the STHYI opcode, initial extensions
   for CPU model support.

 - MIPS: support for MIPS64 hosts (32-bit guests only) and lots
   of cleanups, preliminary to this and the upcoming support for
   hardware virtualization extensions.

 - x86: support for execute-only mappings in nested EPT; reduced
   vmexit latency for TSC deadline timer (by about 30%) on Intel
   hosts; support for more than 255 vCPUs.

 - PPC: bugfixes.

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (302 commits)
  KVM: PPC: Introduce KVM_CAP_PPC_HTM
  MIPS: Select HAVE_KVM for MIPS64_R{2,6}
  MIPS: KVM: Reset CP0_PageMask during host TLB flush
  MIPS: KVM: Fix ptr->int cast via KVM_GUEST_KSEGX()
  MIPS: KVM: Sign extend MFC0/RDHWR results
  MIPS: KVM: Fix 64-bit big endian dynamic translation
  MIPS: KVM: Fail if ebase doesn't fit in CP0_EBase
  MIPS: KVM: Use 64-bit CP0_EBase when appropriate
  MIPS: KVM: Set CP0_Status.KX on MIPS64
  MIPS: KVM: Make entry code MIPS64 friendly
  MIPS: KVM: Use kmap instead of CKSEG0ADDR()
  MIPS: KVM: Use virt_to_phys() to get commpage PFN
  MIPS: Fix definition of KSEGX() for 64-bit
  KVM: VMX: Add VMCS to CPU's loaded VMCSs before VMPTRLD
  kvm: x86: nVMX: maintain internal copy of current VMCS
  KVM: PPC: Book3S HV: Save/restore TM state in H_CEDE
  KVM: PPC: Book3S HV: Pull out TM state save/restore into separate procedures
  KVM: arm64: vgic-its: Simplify MAPI error handling
  KVM: arm64: vgic-its: Make vgic_its_cmd_handle_mapi similar to other handlers
  KVM: arm64: vgic-its: Turn device_id validation into generic ID validation
  ...
parents f7b32e4c 23528bb2
......@@ -1482,6 +1482,11 @@ struct kvm_irq_routing_msi {
__u32 pad;
};
On x86, address_hi is ignored unless the KVM_X2APIC_API_USE_32BIT_IDS
feature of KVM_CAP_X2APIC_API capability is enabled. If it is enabled,
address_hi bits 31-8 provide bits 31-8 of the destination id. Bits 7-0 of
address_hi must be zero.
struct kvm_irq_routing_s390_adapter {
__u64 ind_addr;
__u64 summary_addr;
......@@ -1583,6 +1588,17 @@ struct kvm_lapic_state {
Reads the Local APIC registers and copies them into the input argument. The
data format and layout are the same as documented in the architecture manual.
If KVM_X2APIC_API_USE_32BIT_IDS feature of KVM_CAP_X2APIC_API is
enabled, then the format of APIC_ID register depends on the APIC mode
(reported by MSR_IA32_APICBASE) of its VCPU. x2APIC stores APIC ID in
the APIC_ID register (bytes 32-35). xAPIC only allows an 8-bit APIC ID
which is stored in bits 31-24 of the APIC register, or equivalently in
byte 35 of struct kvm_lapic_state's regs field. KVM_GET_LAPIC must then
be called after MSR_IA32_APICBASE has been set with KVM_SET_MSR.
If KVM_X2APIC_API_USE_32BIT_IDS feature is disabled, struct kvm_lapic_state
always uses xAPIC format.
4.58 KVM_SET_LAPIC
......@@ -1600,6 +1616,10 @@ struct kvm_lapic_state {
Copies the input argument into the Local APIC registers. The data format
and layout are the same as documented in the architecture manual.
The format of the APIC ID register (bytes 32-35 of struct kvm_lapic_state's
regs field) depends on the state of the KVM_CAP_X2APIC_API capability.
See the note in KVM_GET_LAPIC.
4.59 KVM_IOEVENTFD
......@@ -2032,6 +2052,12 @@ registers, find a list below:
MIPS | KVM_REG_MIPS_CP0_CONFIG5 | 32
MIPS | KVM_REG_MIPS_CP0_CONFIG7 | 32
MIPS | KVM_REG_MIPS_CP0_ERROREPC | 64
MIPS | KVM_REG_MIPS_CP0_KSCRATCH1 | 64
MIPS | KVM_REG_MIPS_CP0_KSCRATCH2 | 64
MIPS | KVM_REG_MIPS_CP0_KSCRATCH3 | 64
MIPS | KVM_REG_MIPS_CP0_KSCRATCH4 | 64
MIPS | KVM_REG_MIPS_CP0_KSCRATCH5 | 64
MIPS | KVM_REG_MIPS_CP0_KSCRATCH6 | 64
MIPS | KVM_REG_MIPS_COUNT_CTL | 64
MIPS | KVM_REG_MIPS_COUNT_RESUME | 64
MIPS | KVM_REG_MIPS_COUNT_HZ | 64
......@@ -2156,7 +2182,7 @@ after pausing the vcpu, but before it is resumed.
4.71 KVM_SIGNAL_MSI
Capability: KVM_CAP_SIGNAL_MSI
Architectures: x86
Architectures: x86 arm64
Type: vm ioctl
Parameters: struct kvm_msi (in)
Returns: >0 on delivery, 0 if guest blocked the MSI, and -1 on error
......@@ -2169,10 +2195,22 @@ struct kvm_msi {
__u32 address_hi;
__u32 data;
__u32 flags;
__u8 pad[16];
__u32 devid;
__u8 pad[12];
};
No flags are defined so far. The corresponding field must be 0.
flags: KVM_MSI_VALID_DEVID: devid contains a valid value
devid: If KVM_MSI_VALID_DEVID is set, contains a unique device identifier
for the device that wrote the MSI message.
For PCI, this is usually a BFD identifier in the lower 16 bits.
The per-VM KVM_CAP_MSI_DEVID capability advertises the need to provide
the device ID. If this capability is not set, userland cannot rely on
the kernel to allow the KVM_MSI_VALID_DEVID flag being set.
On x86, address_hi is ignored unless the KVM_CAP_X2APIC_API capability is
enabled. If it is enabled, address_hi bits 31-8 provide bits 31-8 of the
destination id. Bits 7-0 of address_hi must be zero.
4.71 KVM_CREATE_PIT2
......@@ -2520,6 +2558,7 @@ Parameters: struct kvm_device_attr
Returns: 0 on success, -1 on error
Errors:
ENXIO: The group or attribute is unknown/unsupported for this device
or hardware support is missing.
EPERM: The attribute cannot (currently) be accessed this way
(e.g. read-only attribute, or attribute that only makes
sense when the device is in a different state)
......@@ -2547,6 +2586,7 @@ Parameters: struct kvm_device_attr
Returns: 0 on success, -1 on error
Errors:
ENXIO: The group or attribute is unknown/unsupported for this device
or hardware support is missing.
Tests whether a device supports a particular attribute. A successful
return indicates the attribute is implemented. It does not necessarily
......@@ -3803,6 +3843,42 @@ Allows use of runtime-instrumentation introduced with zEC12 processor.
Will return -EINVAL if the machine does not support runtime-instrumentation.
Will return -EBUSY if a VCPU has already been created.
7.7 KVM_CAP_X2APIC_API
Architectures: x86
Parameters: args[0] - features that should be enabled
Returns: 0 on success, -EINVAL when args[0] contains invalid features
Valid feature flags in args[0] are
#define KVM_X2APIC_API_USE_32BIT_IDS (1ULL << 0)
#define KVM_X2APIC_API_DISABLE_BROADCAST_QUIRK (1ULL << 1)
Enabling KVM_X2APIC_API_USE_32BIT_IDS changes the behavior of
KVM_SET_GSI_ROUTING, KVM_SIGNAL_MSI, KVM_SET_LAPIC, and KVM_GET_LAPIC,
allowing the use of 32-bit APIC IDs. See KVM_CAP_X2APIC_API in their
respective sections.
KVM_X2APIC_API_DISABLE_BROADCAST_QUIRK must be enabled for x2APIC to work
in logical mode or with more than 255 VCPUs. Otherwise, KVM treats 0xff
as a broadcast even in x2APIC mode in order to support physical x2APIC
without interrupt remapping. This is undesirable in logical mode,
where 0xff represents CPUs 0-7 in cluster 0.
7.8 KVM_CAP_S390_USER_INSTR0
Architectures: s390
Parameters: none
With this capability enabled, all illegal instructions 0x0000 (2 bytes) will
be intercepted and forwarded to user space. User space can use this
mechanism e.g. to realize 2-byte software breakpoints. The kernel will
not inject an operating exception for these instructions, user space has
to take care of that.
This capability can be enabled dynamically even if VCPUs were already
created and are running.
8. Other capabilities.
----------------------
......
......@@ -4,16 +4,22 @@ ARM Virtual Generic Interrupt Controller (VGIC)
Device types supported:
KVM_DEV_TYPE_ARM_VGIC_V2 ARM Generic Interrupt Controller v2.0
KVM_DEV_TYPE_ARM_VGIC_V3 ARM Generic Interrupt Controller v3.0
KVM_DEV_TYPE_ARM_VGIC_ITS ARM Interrupt Translation Service Controller
Only one VGIC instance may be instantiated through either this API or the
legacy KVM_CREATE_IRQCHIP api. The created VGIC will act as the VM interrupt
controller, requiring emulated user-space devices to inject interrupts to the
VGIC instead of directly to CPUs.
Only one VGIC instance of the V2/V3 types above may be instantiated through
either this API or the legacy KVM_CREATE_IRQCHIP api. The created VGIC will
act as the VM interrupt controller, requiring emulated user-space devices to
inject interrupts to the VGIC instead of directly to CPUs.
Creating a guest GICv3 device requires a host GICv3 as well.
GICv3 implementations with hardware compatibility support allow a guest GICv2
as well.
Creating a virtual ITS controller requires a host GICv3 (but does not depend
on having physical ITS controllers).
There can be multiple ITS controllers per guest, each of them has to have
a separate, non-overlapping MMIO region.
Groups:
KVM_DEV_ARM_VGIC_GRP_ADDR
Attributes:
......@@ -39,6 +45,13 @@ Groups:
Only valid for KVM_DEV_TYPE_ARM_VGIC_V3.
This address needs to be 64K aligned.
KVM_VGIC_V3_ADDR_TYPE_ITS (rw, 64-bit)
Base address in the guest physical address space of the GICv3 ITS
control register frame. The ITS allows MSI(-X) interrupts to be
injected into guests. This extension is optional. If the kernel
does not support the ITS, the call returns -ENODEV.
Only valid for KVM_DEV_TYPE_ARM_VGIC_ITS.
This address needs to be 64K aligned and the region covers 128K.
KVM_DEV_ARM_VGIC_GRP_DIST_REGS
Attributes:
......@@ -109,8 +122,8 @@ Groups:
KVM_DEV_ARM_VGIC_GRP_CTRL
Attributes:
KVM_DEV_ARM_VGIC_CTRL_INIT
request the initialization of the VGIC, no additional parameter in
kvm_device_attr.addr.
request the initialization of the VGIC or ITS, no additional parameter
in kvm_device_attr.addr.
Errors:
-ENXIO: VGIC not properly configured as required prior to calling
this attribute
......
......@@ -20,7 +20,8 @@ Enables Collaborative Memory Management Assist (CMMA) for the virtual machine.
1.2. ATTRIBUTE: KVM_S390_VM_MEM_CLR_CMMA
Parameters: none
Returns: 0
Returns: -EINVAL if CMMA was not enabled
0 otherwise
Clear the CMMA status for all guest pages, so any pages the guest marked
as unused are again used any may not be reclaimed by the host.
......@@ -85,6 +86,90 @@ Returns: -EBUSY in case 1 or more vcpus are already activated (only in write
-ENOMEM if not enough memory is available to process the ioctl
0 in case of success
2.3. ATTRIBUTE: KVM_S390_VM_CPU_MACHINE_FEAT (r/o)
Allows user space to retrieve available cpu features. A feature is available if
provided by the hardware and supported by kvm. In theory, cpu features could
even be completely emulated by kvm.
struct kvm_s390_vm_cpu_feat {
__u64 feat[16]; # Bitmap (1 = feature available), MSB 0 bit numbering
};
Parameters: address of a buffer to load the feature list from.
Returns: -EFAULT if the given address is not accessible from kernel space.
0 in case of success.
2.4. ATTRIBUTE: KVM_S390_VM_CPU_PROCESSOR_FEAT (r/w)
Allows user space to retrieve or change enabled cpu features for all VCPUs of a
VM. Features that are not available cannot be enabled.
See 2.3. for a description of the parameter struct.
Parameters: address of a buffer to store/load the feature list from.
Returns: -EFAULT if the given address is not accessible from kernel space.
-EINVAL if a cpu feature that is not available is to be enabled.
-EBUSY if at least one VCPU has already been defined.
0 in case of success.
2.5. ATTRIBUTE: KVM_S390_VM_CPU_MACHINE_SUBFUNC (r/o)
Allows user space to retrieve available cpu subfunctions without any filtering
done by a set IBC. These subfunctions are indicated to the guest VCPU via
query or "test bit" subfunctions and used e.g. by cpacf functions, plo and ptff.
A subfunction block is only valid if KVM_S390_VM_CPU_MACHINE contains the
STFL(E) bit introducing the affected instruction. If the affected instruction
indicates subfunctions via a "query subfunction", the response block is
contained in the returned struct. If the affected instruction
indicates subfunctions via a "test bit" mechanism, the subfunction codes are
contained in the returned struct in MSB 0 bit numbering.
struct kvm_s390_vm_cpu_subfunc {
u8 plo[32]; # always valid (ESA/390 feature)
u8 ptff[16]; # valid with TOD-clock steering
u8 kmac[16]; # valid with Message-Security-Assist
u8 kmc[16]; # valid with Message-Security-Assist
u8 km[16]; # valid with Message-Security-Assist
u8 kimd[16]; # valid with Message-Security-Assist
u8 klmd[16]; # valid with Message-Security-Assist
u8 pckmo[16]; # valid with Message-Security-Assist-Extension 3
u8 kmctr[16]; # valid with Message-Security-Assist-Extension 4
u8 kmf[16]; # valid with Message-Security-Assist-Extension 4
u8 kmo[16]; # valid with Message-Security-Assist-Extension 4
u8 pcc[16]; # valid with Message-Security-Assist-Extension 4
u8 ppno[16]; # valid with Message-Security-Assist-Extension 5
u8 reserved[1824]; # reserved for future instructions
};
Parameters: address of a buffer to load the subfunction blocks from.
Returns: -EFAULT if the given address is not accessible from kernel space.
0 in case of success.
2.6. ATTRIBUTE: KVM_S390_VM_CPU_PROCESSOR_SUBFUNC (r/w)
Allows user space to retrieve or change cpu subfunctions to be indicated for
all VCPUs of a VM. This attribute will only be available if kernel and
hardware support are in place.
The kernel uses the configured subfunction blocks for indication to
the guest. A subfunction block will only be used if the associated STFL(E) bit
has not been disabled by user space (so the instruction to be queried is
actually available for the guest).
As long as no data has been written, a read will fail. The IBC will be used
to determine available subfunctions in this case, this will guarantee backward
compatibility.
See 2.5. for a description of the parameter struct.
Parameters: address of a buffer to store/load the subfunction blocks from.
Returns: -EFAULT if the given address is not accessible from kernel space.
-EINVAL when reading, if there was no write yet.
-EBUSY if at least one VCPU has already been defined.
0 in case of success.
3. GROUP: KVM_S390_VM_TOD
Architectures: s390
......
......@@ -89,7 +89,7 @@ In mmu_spte_clear_track_bits():
old_spte = *spte;
/* 'if' condition is satisfied. */
if (old_spte.Accssed == 1 &&
if (old_spte.Accessed == 1 &&
old_spte.W == 0)
spte = 0ull;
on fast page fault path:
......@@ -102,7 +102,7 @@ In mmu_spte_clear_track_bits():
old_spte = xchg(spte, 0ull)
if (old_spte.Accssed == 1)
if (old_spte.Accessed == 1)
kvm_set_pfn_accessed(spte.pfn);
if (old_spte.Dirty == 1)
kvm_set_pfn_dirty(spte.pfn);
......
......@@ -66,6 +66,8 @@ extern void __kvm_tlb_flush_vmid(struct kvm *kvm);
extern int __kvm_vcpu_run(struct kvm_vcpu *vcpu);
extern void __init_stage2_translation(void);
extern void __kvm_hyp_reset(unsigned long);
#endif
#endif /* __ARM_KVM_ASM_H__ */
......@@ -241,8 +241,7 @@ int kvm_arm_coproc_set_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *);
int handle_exit(struct kvm_vcpu *vcpu, struct kvm_run *run,
int exception_index);
static inline void __cpu_init_hyp_mode(phys_addr_t boot_pgd_ptr,
phys_addr_t pgd_ptr,
static inline void __cpu_init_hyp_mode(phys_addr_t pgd_ptr,
unsigned long hyp_stack_ptr,
unsigned long vector_ptr)
{
......@@ -251,18 +250,13 @@ static inline void __cpu_init_hyp_mode(phys_addr_t boot_pgd_ptr,
* code. The init code doesn't need to preserve these
* registers as r0-r3 are already callee saved according to
* the AAPCS.
* Note that we slightly misuse the prototype by casing the
* Note that we slightly misuse the prototype by casting the
* stack pointer to a void *.
*
* We don't have enough registers to perform the full init in
* one go. Install the boot PGD first, and then install the
* runtime PGD, stack pointer and vectors. The PGDs are always
* passed as the third argument, in order to be passed into
* r2-r3 to the init code (yes, this is compliant with the
* PCS!).
*/
kvm_call_hyp(NULL, 0, boot_pgd_ptr);
* The PGDs are always passed as the third argument, in order
* to be passed into r2-r3 to the init code (yes, this is
* compliant with the PCS!).
*/
kvm_call_hyp((void*)hyp_stack_ptr, vector_ptr, pgd_ptr);
}
......@@ -272,16 +266,13 @@ static inline void __cpu_init_stage2(void)
kvm_call_hyp(__init_stage2_translation);
}
static inline void __cpu_reset_hyp_mode(phys_addr_t boot_pgd_ptr,
static inline void __cpu_reset_hyp_mode(unsigned long vector_ptr,
phys_addr_t phys_idmap_start)
{
/*
* TODO
* kvm_call_reset(boot_pgd_ptr, phys_idmap_start);
*/
kvm_call_hyp((void *)virt_to_idmap(__kvm_hyp_reset), vector_ptr);
}
static inline int kvm_arch_dev_ioctl_check_extension(long ext)
static inline int kvm_arch_dev_ioctl_check_extension(struct kvm *kvm, long ext)
{
return 0;
}
......
......@@ -25,9 +25,6 @@
#define __hyp_text __section(.hyp.text) notrace
#define kern_hyp_va(v) (v)
#define hyp_kern_va(v) (v)
#define __ACCESS_CP15(CRn, Op1, CRm, Op2) \
"mrc", "mcr", __stringify(p15, Op1, %0, CRn, CRm, Op2), u32
#define __ACCESS_CP15_64(Op1, CRm) \
......
......@@ -26,16 +26,7 @@
* We directly use the kernel VA for the HYP, as we can directly share
* the mapping (HTTBR "covers" TTBR1).
*/
#define HYP_PAGE_OFFSET_MASK UL(~0)
#define HYP_PAGE_OFFSET PAGE_OFFSET
#define KERN_TO_HYP(kva) (kva)
/*
* Our virtual mapping for the boot-time MMU-enable code. Must be
* shared across all the page-tables. Conveniently, we use the vectors
* page, where no kernel data will ever be shared with HYP.
*/
#define TRAMPOLINE_VA UL(CONFIG_VECTORS_BASE)
#define kern_hyp_va(kva) (kva)
/*
* KVM_MMU_CACHE_MIN_PAGES is the number of stage2 page table translation levels.
......@@ -49,9 +40,8 @@
#include <asm/pgalloc.h>
#include <asm/stage2_pgtable.h>
int create_hyp_mappings(void *from, void *to);
int create_hyp_mappings(void *from, void *to, pgprot_t prot);
int create_hyp_io_mappings(void *from, void *to, phys_addr_t);
void free_boot_hyp_pgd(void);
void free_hyp_pgds(void);
void stage2_unmap_vm(struct kvm *kvm);
......@@ -65,7 +55,6 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu, struct kvm_run *run);
void kvm_mmu_free_memory_caches(struct kvm_vcpu *vcpu);
phys_addr_t kvm_mmu_get_httbr(void);
phys_addr_t kvm_mmu_get_boot_httbr(void);
phys_addr_t kvm_get_idmap_vector(void);
phys_addr_t kvm_get_idmap_start(void);
int kvm_mmu_init(void);
......
......@@ -97,7 +97,9 @@ extern pgprot_t pgprot_s2_device;
#define PAGE_READONLY_EXEC _MOD_PROT(pgprot_user, L_PTE_USER | L_PTE_RDONLY)
#define PAGE_KERNEL _MOD_PROT(pgprot_kernel, L_PTE_XN)
#define PAGE_KERNEL_EXEC pgprot_kernel
#define PAGE_HYP _MOD_PROT(pgprot_kernel, L_PTE_HYP)
#define PAGE_HYP _MOD_PROT(pgprot_kernel, L_PTE_HYP | L_PTE_XN)
#define PAGE_HYP_EXEC _MOD_PROT(pgprot_kernel, L_PTE_HYP | L_PTE_RDONLY)
#define PAGE_HYP_RO _MOD_PROT(pgprot_kernel, L_PTE_HYP | L_PTE_RDONLY | L_PTE_XN)
#define PAGE_HYP_DEVICE _MOD_PROT(pgprot_hyp_device, L_PTE_HYP)
#define PAGE_S2 _MOD_PROT(pgprot_s2, L_PTE_S2_RDONLY)
#define PAGE_S2_DEVICE _MOD_PROT(pgprot_s2_device, L_PTE_S2_RDONLY)
......
......@@ -80,6 +80,10 @@ static inline bool is_kernel_in_hyp_mode(void)
return false;
}
/* The section containing the hypervisor idmap text */
extern char __hyp_idmap_text_start[];
extern char __hyp_idmap_text_end[];
/* The section containing the hypervisor text */
extern char __hyp_text_start[];
extern char __hyp_text_end[];
......
......@@ -46,13 +46,6 @@ config KVM_ARM_HOST
---help---
Provides host support for ARM processors.
config KVM_NEW_VGIC
bool "New VGIC implementation"
depends on KVM
default y
---help---
uses the new VGIC implementation
source drivers/vhost/Kconfig
endif # VIRTUALIZATION
......@@ -22,7 +22,6 @@ obj-y += kvm-arm.o init.o interrupts.o
obj-y += arm.o handle_exit.o guest.o mmu.o emulate.o reset.o
obj-y += coproc.o coproc_a15.o coproc_a7.o mmio.o psci.o perf.o
ifeq ($(CONFIG_KVM_NEW_VGIC),y)
obj-y += $(KVM)/arm/vgic/vgic.o
obj-y += $(KVM)/arm/vgic/vgic-init.o
obj-y += $(KVM)/arm/vgic/vgic-irqfd.o
......@@ -30,9 +29,4 @@ obj-y += $(KVM)/arm/vgic/vgic-v2.o
obj-y += $(KVM)/arm/vgic/vgic-mmio.o
obj-y += $(KVM)/arm/vgic/vgic-mmio-v2.o
obj-y += $(KVM)/arm/vgic/vgic-kvm-device.o
else
obj-y += $(KVM)/arm/vgic.o
obj-y += $(KVM)/arm/vgic-v2.o
obj-y += $(KVM)/arm/vgic-v2-emul.o
endif
obj-y += $(KVM)/arm/arch_timer.o
......@@ -20,6 +20,7 @@
#include <linux/errno.h>
#include <linux/err.h>
#include <linux/kvm_host.h>
#include <linux/list.h>
#include <linux/module.h>
#include <linux/vmalloc.h>
#include <linux/fs.h>
......@@ -122,7 +123,7 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
if (ret)
goto out_fail_alloc;
ret = create_hyp_mappings(kvm, kvm + 1);
ret = create_hyp_mappings(kvm, kvm + 1, PAGE_HYP);
if (ret)
goto out_free_stage2_pgd;
......@@ -201,7 +202,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
r = KVM_MAX_VCPUS;
break;
default:
r = kvm_arch_dev_ioctl_check_extension(ext);
r = kvm_arch_dev_ioctl_check_extension(kvm, ext);
break;
}
return r;
......@@ -239,7 +240,7 @@ struct kvm_vcpu *kvm_arch_vcpu_create(struct kvm *kvm, unsigned int id)
if (err)
goto free_vcpu;
err = create_hyp_mappings(vcpu, vcpu + 1);
err = create_hyp_mappings(vcpu, vcpu + 1, PAGE_HYP);
if (err)
goto vcpu_uninit;
......@@ -377,7 +378,7 @@ void force_vm_exit(const cpumask_t *mask)
/**
* need_new_vmid_gen - check that the VMID is still valid
* @kvm: The VM's VMID to checkt
* @kvm: The VM's VMID to check
*
* return true if there is a new generation of VMIDs being used
*
......@@ -616,7 +617,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
* Enter the guest
*/
trace_kvm_entry(*vcpu_pc(vcpu));
__kvm_guest_enter();
guest_enter_irqoff();
vcpu->mode = IN_GUEST_MODE;
ret = kvm_call_hyp(__kvm_vcpu_run, vcpu);
......@@ -642,14 +643,14 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
local_irq_enable();
/*
* We do local_irq_enable() before calling kvm_guest_exit() so
* We do local_irq_enable() before calling guest_exit() so
* that if a timer interrupt hits while running the guest we
* account that tick as being spent in the guest. We enable
* preemption after calling kvm_guest_exit() so that if we get
* preemption after calling guest_exit() so that if we get
* preempted we make sure ticks after that is not counted as
* guest time.
*/
kvm_guest_exit();
guest_exit();
trace_kvm_exit(ret, kvm_vcpu_trap_get_class(vcpu), *vcpu_pc(vcpu));
/*
......@@ -1039,7 +1040,6 @@ long kvm_arch_vm_ioctl(struct file *filp,
static void cpu_init_hyp_mode(void *dummy)
{
phys_addr_t boot_pgd_ptr;
phys_addr_t pgd_ptr;
unsigned long hyp_stack_ptr;
unsigned long stack_page;
......@@ -1048,13 +1048,12 @@ static void cpu_init_hyp_mode(void *dummy)
/* Switch from the HYP stub to our own HYP init vector */
__hyp_set_vectors(kvm_get_idmap_vector());
boot_pgd_ptr = kvm_mmu_get_boot_httbr();
pgd_ptr = kvm_mmu_get_httbr();
stack_page = __this_cpu_read(kvm_arm_hyp_stack_page);
hyp_stack_ptr = stack_page + PAGE_SIZE;
vector_ptr = (unsigned long)kvm_ksym_ref(__kvm_hyp_vector);
__cpu_init_hyp_mode(boot_pgd_ptr, pgd_ptr, hyp_stack_ptr, vector_ptr);
__cpu_init_hyp_mode(pgd_ptr, hyp_stack_ptr, vector_ptr);
__cpu_init_stage2();
kvm_arm_init_debug();
......@@ -1076,15 +1075,9 @@ static void cpu_hyp_reinit(void)
static void cpu_hyp_reset(void)
{
phys_addr_t boot_pgd_ptr;
phys_addr_t phys_idmap_start;
if (!is_kernel_in_hyp_mode()) {
boot_pgd_ptr = kvm_mmu_get_boot_httbr();
phys_idmap_start = kvm_get_idmap_start();
__cpu_reset_hyp_mode(boot_pgd_ptr, phys_idmap_start);
}
if (!is_kernel_in_hyp_mode())
__cpu_reset_hyp_mode(hyp_default_vectors,
kvm_get_idmap_start());
}
static void _kvm_arch_hardware_enable(void *discard)
......@@ -1294,14 +1287,14 @@ static int init_hyp_mode(void)
* Map the Hyp-code called directly from the host
*/
err = create_hyp_mappings(kvm_ksym_ref(__hyp_text_start),
kvm_ksym_ref(__hyp_text_end));
kvm_ksym_ref(__hyp_text_end), PAGE_HYP_EXEC);
if (err) {
kvm_err("Cannot map world-switch code\n");
goto out_err;
}
err = create_hyp_mappings(kvm_ksym_ref(__start_rodata),
kvm_ksym_ref(__end_rodata));
kvm_ksym_ref(__end_rodata), PAGE_HYP_RO);
if (err) {
kvm_err("Cannot map rodata section\n");
goto out_err;
......@@ -1312,7 +1305,8 @@ static int init_hyp_mode(void)
*/
for_each_possible_cpu(cpu) {
char *stack_page = (char *)per_cpu(kvm_arm_hyp_stack_page, cpu);
err = create_hyp_mappings(stack_page, stack_page + PAGE_SIZE);
err = create_hyp_mappings(stack_page, stack_page + PAGE_SIZE,
PAGE_HYP);
if (err) {
kvm_err("Cannot map hyp stack\n");
......@@ -1324,7 +1318,7 @@ static int init_hyp_mode(void)
kvm_cpu_context_t *cpu_ctxt;
cpu_ctxt = per_cpu_ptr(kvm_host_cpu_state, cpu);
err = create_hyp_mappings(cpu_ctxt, cpu_ctxt + 1);
err = create_hyp_mappings(cpu_ctxt, cpu_ctxt + 1, PAGE_HYP);
if (err) {
kvm_err("Cannot map host CPU state: %d\n", err);
......@@ -1332,10 +1326,6 @@ static int init_hyp_mode(void)
}
}
#ifndef CONFIG_HOTPLUG_CPU
free_boot_hyp_pgd();
#endif
/* set size of VMID supported by CPU */
kvm_vmid_bits = kvm_get_vmid_bits();
kvm_info("%d-bit VMID\n", kvm_vmid_bits);
......
......@@ -210,7 +210,7 @@ bool kvm_condition_valid(struct kvm_vcpu *vcpu)
* @vcpu: The VCPU pointer
*
* When exceptions occur while instructions are executed in Thumb IF-THEN
* blocks, the ITSTATE field of the CPSR is not advanved (updated), so we have
* blocks, the ITSTATE field of the CPSR is not advanced (updated), so we have
* to do this little bit of work manually. The fields map like this:
*
* IT[7:0] -> CPSR[26:25],CPSR[15:10]
......
......@@ -182,7 +182,7 @@ unsigned long kvm_arm_num_regs(struct kvm_vcpu *vcpu)
/**
* kvm_arm_copy_reg_indices - get indices of all registers.
*
* We do core registers right here, then we apppend coproc regs.
* We do core registers right here, then we append coproc regs.
*/
int kvm_arm_copy_reg_indices(struct kvm_vcpu *vcpu, u64 __user *uindices)
{
......
......@@ -32,23 +32,13 @@
* r2,r3 = Hypervisor pgd pointer
*
* The init scenario is:
* - We jump in HYP with four parameters: boot HYP pgd, runtime HYP pgd,
* runtime stack, runtime vectors
* - Enable the MMU with the boot pgd
* - Jump to a target into the trampoline page (remember, this is the same
* physical page!)
* - Now switch to the runtime pgd (same VA, and still the same physical
* page!)
* - We jump in HYP with 3 parameters: runtime HYP pgd, runtime stack,
* runtime vectors
* - Invalidate TLBs
* - Set stack and vectors
* - Setup the page tables
* - Enable the MMU
* - Profit! (or eret, if you only care about the code).
*
* As we only have four registers available to pass parameters (and we
* need six), we split the init in two phases:
* - Phase 1: r0 = 0, r1 = 0, r2,r3 contain the boot PGD.
* Provides the basic HYP init, and enable the MMU.
* - Phase 2: r0 = ToS, r1 = vectors, r2,r3 contain the runtime PGD.
* Switches to the runtime PGD, set stack and vectors.
*/
.text
......@@ -68,8 +58,11 @@ __kvm_hyp_init:
W(b) .
__do_hyp_init:
cmp r0, #0 @ We have a SP?
bne phase2 @ Yes, second stage init
@ Set stack pointer
mov sp, r0
@ Set HVBAR to point to the HYP vectors
mcr p15, 4, r1, c12, c0, 0 @ HVBAR
@ Set the HTTBR to point to the hypervisor PGD pointer passed
mcrr p15, 4, rr_lo_hi(r2, r3), c2
......@@ -114,34 +107,25 @@ __do_hyp_init:
THUMB( ldr r2, =(HSCTLR_M | HSCTLR_A | HSCTLR_TE) )
orr r1, r1, r2
orr r0, r0, r1
isb
mcr p15, 4, r0, c1, c0, 0 @ HSCR
isb
@ End of init phase-1
eret
phase2:
@ Set stack pointer
mov sp, r0
@ Set HVBAR to point to the HYP vectors
mcr p15, 4, r1, c12, c0, 0 @ HVBAR
@ Jump to the trampoline page
ldr r0, =TRAMPOLINE_VA
adr r1, target
bfi r0, r1, #0, #PAGE_SHIFT
ret r0
@ r0 : stub vectors address
ENTRY(__kvm_hyp_reset)
/* We're now in idmap, disable MMU */
mrc p15, 4, r1, c1, c0, 0 @ HSCTLR
ldr r2, =(HSCTLR_M | HSCTLR_A | HSCTLR_C | HSCTLR_I)
bic r1, r1, r2
mcr p15, 4, r1, c1, c0, 0 @ HSCTLR
target: @ We're now in the trampoline code, switch page tables
mcrr p15, 4, rr_lo_hi(r2, r3), c2
/* Install stub vectors */
mcr p15, 4, r0, c12, c0, 0 @ HVBAR
isb
@ Invalidate the old TLBs
mcr p15, 4, r0, c8, c7, 0 @ TLBIALLH
dsb ish
eret
ENDPROC(__kvm_hyp_reset)
.ltorg
......
......@@ -32,8 +32,6 @@
#include "trace.h"
extern char __hyp_idmap_text_start[], __hyp_idmap_text_end[];
static pgd_t *boot_hyp_pgd;
static pgd_t *hyp_pgd;
static pgd_t *merged_hyp_pgd;
......@@ -483,28 +481,6 @@ static void unmap_hyp_range(pgd_t *pgdp, phys_addr_t start, u64 size)
} while (pgd++, addr = next, addr != end);
}
/**
* free_boot_hyp_pgd - free HYP boot page tables
*
* Free the HYP boot page tables. The bounce page is also freed.
*/
void free_boot_hyp_pgd(void)
{
mutex_lock(&kvm_hyp_pgd_mutex);
if (boot_hyp_pgd) {
unmap_hyp_range(boot_hyp_pgd, hyp_idmap_start, PAGE_SIZE);
unmap_hyp_range(boot_hyp_pgd, TRAMPOLINE_VA, PAGE_SIZE);
free_pages((unsigned long)boot_hyp_pgd, hyp_pgd_order);
boot_hyp_pgd = NULL;
}
if (hyp_pgd)
unmap_hyp_range(hyp_pgd, TRAMPOLINE_VA, PAGE_SIZE);
mutex_unlock(&kvm_hyp_pgd_mutex);
}
/**
* free_hyp_pgds - free Hyp-mode page tables
*
......@@ -519,15 +495,20 @@ void free_hyp_pgds(void)
{
unsigned long addr;
free_boot_hyp_pgd();
mutex_lock(&kvm_hyp_pgd_mutex);
if (boot_hyp_pgd) {
unmap_hyp_range(boot_hyp_pgd, hyp_idmap_start, PAGE_SIZE);
free_pages((unsigned long)boot_hyp_pgd, hyp_pgd_order);
boot_hyp_pgd = NULL;
}
if (hyp_pgd) {
unmap_hyp_range(hyp_pgd, hyp_idmap_start, PAGE_SIZE);
for (addr = PAGE_OFFSET; virt_addr_valid(addr); addr += PGDIR_SIZE)
unmap_hyp_range(hyp_pgd, KERN_TO_HYP(addr), PGDIR_SIZE);
unmap_hyp_range(hyp_pgd, kern_hyp_va(addr), PGDIR_SIZE);
for (addr = VMALLOC_START; is_vmalloc_addr((void*)addr); addr += PGDIR_SIZE)
unmap_hyp_range(hyp_pgd, KERN_TO_HYP(addr), PGDIR_SIZE);
unmap_hyp_range(hyp_pgd, kern_hyp_va(addr), PGDIR_SIZE);
free_pages((unsigned long)hyp_pgd, hyp_pgd_order);
hyp_pgd = NULL;
......@@ -679,17 +660,18 @@ static phys_addr_t kvm_kaddr_to_phys(void *kaddr)
* create_hyp_mappings - duplicate a kernel virtual address range in Hyp mode
* @from: The virtual kernel start address of the range
* @to: The virtual kernel end address of the range (exclusive)
* @prot: The protection to be applied to this range
*
* The same virtual address as the kernel virtual address is also used
* in Hyp-mode mapping (modulo HYP_PAGE_OFFSET) to the same underlying
* physical pages.
*/
int create_hyp_mappings(void *from, void *to)
int create_hyp_mappings(void *from, void *to, pgprot_t prot)
{
phys_addr_t phys_addr;
unsigned long virt_addr;
unsigned long start = KERN_TO_HYP((unsigned long)from);
unsigned long end = KERN_TO_HYP((unsigned long)to);
unsigned long start = kern_hyp_va((unsigned long)from);
unsigned long end = kern_hyp_va((unsigned long)to);
if (is_kernel_in_hyp_mode())
return 0;
......@@ -704,7 +686,7 @@ int create_hyp_mappings(void *from, void *to)
err = __create_hyp_mappings(hyp_pgd, virt_addr,
virt_addr + PAGE_SIZE,
__phys_to_pfn(phys_addr),
PAGE_HYP);
prot);
if (err)
return err;
}
......@@ -723,8 +705,8 @@ int create_hyp_mappings(void *from, void *to)
*/
int create_hyp_io_mappings(void *from, void *to, phys_addr_t phys_addr)
{
unsigned long start = KERN_TO_HYP((unsigned long)from);
unsigned long end = KERN_TO_HYP((unsigned long)to);
unsigned long start = kern_hyp_va((unsigned long)from);
unsigned long end = kern_hyp_va((unsigned long)to);
if (is_kernel_in_hyp_mode())
return 0;
......@@ -1687,14 +1669,6 @@ phys_addr_t kvm_mmu_get_httbr(void)
return virt_to_phys(hyp_pgd);
}
phys_addr_t kvm_mmu_get_boot_httbr(void)
{
if (__kvm_cpu_uses_extended_idmap())
return virt_to_phys(merged_hyp_pgd);
else
return virt_to_phys(boot_hyp_pgd);
}
phys_addr_t kvm_get_idmap_vector(void)
{
return hyp_idmap_vector;
......@@ -1705,6 +1679,22 @@ phys_addr_t kvm_get_idmap_start(void)
return hyp_idmap_start;
}
static int kvm_map_idmap_text(pgd_t *pgd)
{
int err;
/* Create the idmap in the boot page tables */
err = __create_hyp_mappings(pgd,
hyp_idmap_start, hyp_idmap_end,
__phys_to_pfn(hyp_idmap_start),
PAGE_HYP_EXEC);
if (err)
kvm_err("Failed to idmap %lx-%lx\n",
hyp_idmap_start, hyp_idmap_end);
return err;
}
int kvm_mmu_init(void)
{
int err;
......@@ -1719,28 +1709,41 @@ int kvm_mmu_init(void)
*/
BUG_ON((hyp_idmap_start ^ (hyp_idmap_end - 1)) & PAGE_MASK);
hyp_pgd = (pgd_t *)__get_free_pages(GFP_KERNEL | __GFP_ZERO, hyp_pgd_order);
boot_hyp_pgd = (pgd_t *)__get_free_pages(GFP_KERNEL | __GFP_ZERO, hyp_pgd_order);
kvm_info("IDMAP page: %lx\n", hyp_idmap_start);
kvm_info("HYP VA range: %lx:%lx\n",
kern_hyp_va(PAGE_OFFSET), kern_hyp_va(~0UL));
if (!hyp_pgd || !boot_hyp_pgd) {
kvm_err("Hyp mode PGD not allocated\n");
err = -ENOMEM;
if (hyp_idmap_start >= kern_hyp_va(PAGE_OFFSET) &&
hyp_idmap_start < kern_hyp_va(~0UL)) {
/*
* The idmap page is intersecting with the VA space,
* it is not safe to continue further.
*/
kvm_err("IDMAP intersecting with HYP VA, unable to continue\n");
err = -EINVAL;
goto out;
}
/* Create the idmap in the boot page tables */
err = __create_hyp_mappings(boot_hyp_pgd,
hyp_idmap_start, hyp_idmap_end,
__phys_to_pfn(hyp_idmap_start),
PAGE_HYP);
if (err) {
kvm_err("Failed to idmap %lx-%lx\n",
hyp_idmap_start, hyp_idmap_end);
hyp_pgd = (pgd_t *)__get_free_pages(GFP_KERNEL | __GFP_ZERO, hyp_pgd_order);
if (!hyp_pgd) {
kvm_err("Hyp mode PGD not allocated\n");
err = -ENOMEM;
goto out;
}
if (__kvm_cpu_uses_extended_idmap()) {
boot_hyp_pgd = (pgd_t *)__get_free_pages(GFP_KERNEL | __GFP_ZERO,
hyp_pgd_order);
if (!boot_hyp_pgd) {
kvm_err("Hyp boot PGD not allocated\n");
err = -ENOMEM;
goto out;
}
err = kvm_map_idmap_text(boot_hyp_pgd);
if (err)
goto out;
merged_hyp_pgd = (pgd_t *)__get_free_page(GFP_KERNEL | __GFP_ZERO);
if (!merged_hyp_pgd) {
kvm_err("Failed to allocate extra HYP pgd\n");
......@@ -1748,29 +1751,10 @@ int kvm_mmu_init(void)
}
__kvm_extend_hypmap(boot_hyp_pgd, hyp_pgd, merged_hyp_pgd,
hyp_idmap_start);
return 0;
}
/* Map the very same page at the trampoline VA */
err = __create_hyp_mappings(boot_hyp_pgd,
TRAMPOLINE_VA, TRAMPOLINE_VA + PAGE_SIZE,
__phys_to_pfn(hyp_idmap_start),
PAGE_HYP);
if (err) {
kvm_err("Failed to map trampoline @%lx into boot HYP pgd\n",
TRAMPOLINE_VA);
goto out;
}
/* Map the same page again into the runtime page tables */
err = __create_hyp_mappings(hyp_pgd,
TRAMPOLINE_VA, TRAMPOLINE_VA + PAGE_SIZE,
__phys_to_pfn(hyp_idmap_start),
PAGE_HYP);
if (err) {
kvm_err("Failed to map trampoline @%lx into runtime HYP pgd\n",
TRAMPOLINE_VA);
goto out;
} else {
err = kvm_map_idmap_text(hyp_pgd);
if (err)
goto out;
}
return 0;
......
......@@ -52,7 +52,7 @@ static const struct kvm_irq_level cortexa_vtimer_irq = {
* @vcpu: The VCPU pointer
*
* This function finds the right table above and sets the registers on the
* virtual CPU struct to their architectually defined reset values.
* virtual CPU struct to their architecturally defined reset values.
*/
int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
{
......
......@@ -36,8 +36,9 @@
#define ARM64_HAS_VIRT_HOST_EXTN 11
#define ARM64_WORKAROUND_CAVIUM_27456 12
#define ARM64_HAS_32BIT_EL0 13
#define ARM64_HYP_OFFSET_LOW 14
#define ARM64_NCAPS 14
#define ARM64_NCAPS 15
#ifndef __ASSEMBLY__
......
......@@ -178,7 +178,7 @@
/* Hyp System Trap Register */
#define HSTR_EL2_T(x) (1 << x)
/* Hyp Coproccessor Trap Register Shifts */
/* Hyp Coprocessor Trap Register Shifts */
#define CPTR_EL2_TFP_SHIFT 10
/* Hyp Coprocessor Trap Register */
......
Markdown is supported
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment