- 08 May, 2022 1 commit
-
-
Timothy Pearson authored
Add arch-specific makefiles and configs needed to build for ppc64. Also add a minimal head.S that is a simple infinite loop. head.o can be built with $ make XEN_TARGET_ARCH=ppc64 KBUILD_DEFCONFIG=tiny64_defconfig arch/powerpc/ppc64/head.o Signed-off-by: Timothy Pearson <tpearson@raptorengineering.com>
-
- 05 May, 2022 4 commits
-
-
Jan Beulich authored
Support for this construct was added in 2.22 only. Avoid the need to introduce logic to probe for linker script capabilities by (ab)using the probe for a command line option having appeared at about the same time. Note that this remains x86-specific because Arm is unaffected, by requiring GNU ld 2.24 or newer. Fixes: 4b7fd815 ("x86: fold sections in final binaries") Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
-
Juergen Gross authored
When firing special watches (e.g. "@releaseDomain"), they will be regarded to be valid children of the "/" node. So a domain having registered a watch for "/" and having the privilege to receive the special watches will receive those special watch events for the registered "/" watch. Fix that by calling the related fire_watches() with the "exact" parameter set to true, causing a mismatch for the "/" node. Reported-by: Raphael Ning <raphning@amazon.com> Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Raphael Ning <raphning@amazon.com> Reviewed-by: Julien Grall <jgrall@amazon.com>
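The effect of the "exact" parameter can be sketched as follows. This is a simplified model, not xenstored's code: `watch_matches()` is a hypothetical helper, and the rule that "/" is the parent of every name is an assumption taken from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper (not part of xenstored): decide whether a watch
 * registered on `watch_path` fires for an event on `event_path`.
 */
static bool watch_matches(const char *watch_path, const char *event_path,
                          bool exact)
{
    size_t len = strlen(watch_path);

    /* Exact mode: only an identical name matches. */
    if (exact)
        return strcmp(watch_path, event_path) == 0;

    /*
     * Assumed simplification: in non-exact mode the root node is treated
     * as the parent of every name, including special ones such as
     * "@releaseDomain". This is the behaviour the commit avoids.
     */
    if (strcmp(watch_path, "/") == 0)
        return true;

    /* Otherwise match the node itself or anything below it. */
    if (strncmp(watch_path, event_path, len) != 0)
        return false;
    return event_path[len] == '\0' || event_path[len] == '/';
}
```

With `exact` set, a watch on "/" no longer matches "@releaseDomain", while a watch registered on "@releaseDomain" itself still does.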
-
Bertrand Marquis authored
SMCC_WORKAROUND_3 handles both Spectre v2 and Spectre BHB. So when a guest asks whether we support workaround 1, answer yes if we apply workaround 3 on exception entry, as it handles it. This will allow guests not supporting Spectre BHB but impacted by Spectre v2 to still handle it correctly. The modified behaviour is coherent with what the Linux kernel does in KVM for guests. While there, use ARM_SMCCC_SUCCESS instead of 0 for the return code value for workaround detection to be coherent with Workaround 2 handling. Signed-off-by: Bertrand Marquis <bertrand.marquis@arm.com> Acked-by: Julien Grall <jgrall@amazon.com>
-
Julien Grall authored
As part of XSA-385, SUPPORT.MD gained a statement regarding the amount of physical memory supported. However, booting Xen on an Arm platform with that amount of memory would result in breakage because the frametable area is too small. The wiki [1] (as of April 2022) claims we were able to support up to 5 TiB on Arm64 and 16 GiB on Arm32. However, this is not the case because the struct page_info has always been bigger than expected (56 bytes for 64-bit and 32 bytes for 32-bit). I don't have any HW with such an amount of memory. So rather than modifying the code, take the opportunity to use the limit that should work on Arm (2 TiB for 64-bit and 12 GiB for 32-bit). Signed-off-by: Julien Grall <jgrall@amazon.com> Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com> #arm part
-
- 04 May, 2022 1 commit
-
-
Jens Wiklander authored
This commit fixes a case overlooked in [1]. There are two kinds of shared memory buffers used by OP-TEE: 1. Normal payload buffer 2. Internal command structure buffers The internal command structure buffers are represented with a shadow copy internally in Xen since this buffer can contain physical addresses that may need to be translated between real physical address and guest physical address without leaking information to the guest. [1] fixes the problem when releasing the normal payload buffers. The internal command structure buffers must be released in the same way. Failure to follow this order opens a window where the guest has freed the shared memory but Xen is still tracking the buffer. During this window the guest may happen to recycle this particular shared memory in some other thread and try to use it. Xen will block this which will lead to spurious failures to register a new shared memory block. Fix this by freeing the internal command structure buffers first before informing the guest that the buffer can be freed. [1] 5b13eb1d ("optee: immediately free buffers that are released by OP-TEE") Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org> Reviewed-by: Volodymyr Babchuk <volodymyr_babchuk@epam.com> [stefano: minor code style fix] Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
-
- 02 May, 2022 5 commits
-
-
Roger Pau Monné authored
LLVM LD doesn't strip the quotes from the section names, and so the resulting binary ends up with section names like: [ 1] ".text" PROGBITS ffff82d040200000 00008000 000000000018cbc1 0000000000000000 AX 0 0 4096 This confuses some tools (like gdb) and prevents proper parsing of the binary. The issue has already been reported and is being fixed in LLD. In order to work around this issue and keep the GNU ld support, define different DECL_SECTION macros depending on the ld implementation in use. Drop the quotes from the definitions of the debug sections in DECL_DEBUG{2}, as those quotes are not required for GNU ld either. Fixes: 62549205 ('x86: quote section names when defining them in linker script') Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
-
Roger Pau Monné authored
Detect GNU and LLVM ld implementations. This is required for further patches that will introduce diverging behaviour depending on the linker implementation in use. Note that LLVM ld returns "compatible with GNU linkers" as part of the version string, so be on the safe side and use '^' to only match at the start of the line in case LLVM ever decides to change the text to use "compatible with GNU ld" instead. Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Michal Orzel <michal.orzel@arm.com> Acked-by: Julien Grall <jgrall@amazon.com>
-
Elliott Mitchell authored
This matches the output directory option used by `git format-patch`. I suspect I'm not the only one who finds matching `git format-patch` more intuitive than -d for directory. Signed-off-by: Elliott Mitchell <ehem+xen@m5p.com> Reviewed-by: Juergen Gross <jgross@suse.com>
-
Roger Pau Monné authored
Windows Server 2019 Essentials will unconditionally attempt to read the P5_MC_ADDR MSR at boot and throw a BSOD if a #GP is injected. Fix this by mapping MSR_P5_MC_{ADDR,TYPE} to MSR_IA32_MCi_{ADDR,STATUS}, which the Intel SDM section "Mapping of the Pentium Processor Machine-Check Errors to the Machine-Check Architecture" reports is also done by hardware. Reported-by: Steffen Einsle <einsle@phptrix.de> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
-
Jan Beulich authored
While it is okay for IOMMU page tables to be set up for guests starting in PoD mode, actual device assignment may only occur once all PoD entries have been removed from the P2M. So far this was enforced only for boot-time assignment, and only in the tool stack. Also use the new function to replace p2m_pod_entry_count(): Its unlocked access to p2m->pod.entry_count wasn't really okay (irrespective of the result being stale by the time the caller gets to see it). Nor was the use of that function in line with the immediately preceding comment: A PoD guest isn't just one with a non-zero entry count, but also one with a non-empty cache (e.g. prior to actually launching the guest). To allow the tool stack to see a consistent snapshot of PoD state, move the tail of XENMEM_{get,set}_pod_target handling into a function, adding proper locking there. In libxl take the liberty to use the new local variable r also for a pre-existing call into libxc. Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
-
- 29 Apr, 2022 1 commit
-
-
Julien Grall authored
This reverts commit fa6dc087 as there is more fallout on Arm.
-
- 28 Apr, 2022 8 commits
-
-
Stefano Stabellini authored
Add Rahul as ARM SMMU maintainer. Create a new explicit entry for "ARM SMMU", also with Julien, who is the original contributor of the code and continues to maintain it. Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com> Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com> Acked-by: Rahul Singh <rahul.singh@arm.com> Acked-by: Julien Grall <julien@xen.org>
-
Tamas K Lengyel authored
Allow specifying distinct parts of the fork VM to be reset. This is useful when a fuzzing operation involves mapping in only a handful of pages that are known ahead of time. Throwing these pages away just to be re-copied immediately is expensive, thus allowing partial resets to be specified can speed things up. Also allow resetting to be initiated from vm_event responses as an optimization. Signed-off-by: Tamas K Lengyel <tamas.lengyel@intel.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
-
Jan Beulich authored
At their use sites the numeric suffixes are at least odd to read, first and foremost for PCI_DEVFN2() where the suffix doesn't even match the number of arguments. Make use of count_args() such that a single flavor each suffices (leaving aside helper macros, which aren't supposed to be used from the outside). In parse_ppr_log_entry() take the opportunity and drop two local variables and convert an assignment to an initializer. In VT-d code fold a number of bus+devfn comparison pairs into a single BDF comparison. No change to generated code for the vast majority of the adjustments. Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Paul Durrant <paul@xen.org>
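The argument-counting approach described above can be sketched as follows. This is only an illustration: every `DEMO_` name is hypothetical, and the bit layout (8-bit bus, 5-bit device, 3-bit function) is the standard PCI encoding rather than something copied from Xen's headers.

```c
#include <assert.h>

/* Count 1 to 3 macro arguments (hypothetical stand-in for count_args()). */
#define DEMO_COUNT_ARGS_(_1, _2, _3, n, ...) n
#define DEMO_COUNT_ARGS(...) DEMO_COUNT_ARGS_(__VA_ARGS__, 3, 2, 1, 0)

/* Two-level paste so the count is expanded before concatenation. */
#define DEMO_CAT_(a, b) a ## b
#define DEMO_CAT(a, b) DEMO_CAT_(a, b)

/* Helper flavors, not meant to be used directly by callers. */
#define DEMO_PCI_DEVFN_2(d, f)  ((((d) & 0x1f) << 3) | ((f) & 0x07))
#define DEMO_PCI_BDF_2(b, df)   ((((b) & 0xff) << 8) | ((df) & 0xff))
#define DEMO_PCI_BDF_3(b, d, f) DEMO_PCI_BDF_2(b, DEMO_PCI_DEVFN_2(d, f))

/* Single public flavor: dispatches on the number of arguments. */
#define DEMO_PCI_BDF(...) \
    DEMO_CAT(DEMO_PCI_BDF_, DEMO_COUNT_ARGS(__VA_ARGS__))(__VA_ARGS__)
```

Callers can then write `DEMO_PCI_BDF(bus, devfn)` or `DEMO_PCI_BDF(bus, dev, fn)` without a numeric suffix at the use site.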
-
Jan Beulich authored
There's no good reason to use these when we already have a pci_sbdf_t type object available. This extends to the use of PCI_BUS() in pci_ecam_map_bus() as well. No change to generated code (with gcc11 at least, and I have to admit that I didn't expect compilers to necessarily be able to spot the optimization potential on the original code). Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Paul Durrant <paul@xen.org> Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>
-
Jan Beulich authored
The reference "to shadow the resident processes" is applicable to domains (potentially) running in shadow mode only. Adjust the calculations accordingly. This, however, requires further parameters. Since the original function is deprecated anyway, and since it can't be changed (for being part of a stable ABI), introduce a new (internal only) function, with the deprecated one simply becoming a wrapper. In dom0_paging_pages() also take the opportunity and stop open-coding DIV_ROUND_UP(). Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
-
Artem Bityutskiy authored
Add Sapphire Rapids Xeon support. Up until very recently, the C1 and C1E C-states were independent, but this has changed in some new chips, including Sapphire Rapids Xeon (SPR). In these chips the C1 and C1E states cannot be enabled at the same time. The "C1E promotion" bit in 'MSR_IA32_POWER_CTL' also has its semantics changed a bit. Here are the C1, C1E, and "C1E promotion" bit rules on Xeons before SPR. 1. If C1E promotion bit is disabled. a. C1 requests end up with C1 C-state. b. C1E requests end up with C1E C-state. 2. If C1E promotion bit is enabled. a. C1 requests end up with C1E C-state. b. C1E requests end up with C1E C-state. Here are the C1, C1E, and "C1E promotion" bit rules on Sapphire Rapids Xeon. 1. If C1E promotion bit is disabled. a. C1 requests end up with C1 C-state. b. C1E requests end up with C1 C-state. 2. If C1E promotion bit is enabled. a. C1 requests end up with C1E C-state. b. C1E requests end up with C1E C-state. Before SPR Xeon, the 'intel_idle' driver was disabling C1E promotion and was exposing C1 and C1E as independent C-states. But on SPR, C1 and C1E cannot be enabled at the same time. This patch adds both C1 and C1E states. However, C1E is marked with the "CPUIDLE_FLAG_UNUSABLE" flag, which means that it won't be registered by default. The C1E promotion bit will be cleared, which means that by default only C1 and C6 will be registered on SPR. The next patch will add an option for enabling C1E and disabling C1 on SPR. Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Origin: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 9edf3c0ffef0 Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Roger Pau Monné <roger.pau@citrix.com>
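The two rule sets above can be condensed into one small function. This is an illustrative model only: `resolve_cstate()` and the enum are hypothetical names and do not exist in the driver.

```c
#include <assert.h>
#include <stdbool.h>

enum cstate { CSTATE_C1, CSTATE_C1E };

/*
 * Hypothetical model of the rules listed in the commit message:
 * which C-state a request ends up in, given the platform and the
 * "C1E promotion" bit.
 */
static enum cstate resolve_cstate(bool is_spr, bool c1e_promotion,
                                  enum cstate requested)
{
    /* Promotion enabled: every request ends up in C1E on all Xeons. */
    if (c1e_promotion)
        return CSTATE_C1E;

    /* Promotion disabled on SPR: even C1E requests end up in C1. */
    if (is_spr)
        return CSTATE_C1;

    /* Promotion disabled before SPR: C1 and C1E are independent. */
    return requested;
}
```

The only case that differs between the two platforms is a C1E request with promotion disabled, which is why C1 and C1E cannot both be usable on SPR.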
-
Jan Beulich authored
This brings us (back) closer to the original Linux source. While touching mwait_idle_state_table_update() also drop a stray leading blank. Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Roger Pau Monné <roger.pau@citrix.com>
-
Juergen Gross authored
For the initialization of a ring page by the frontend two macros are available in ring.h: SHARED_RING_INIT() and FRONT_RING_INIT(). All known users use always both of them in direct sequence. Add another macro XEN_FRONT_RING_INIT() combining the two macros. Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
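A combined macro of this kind might look roughly like the sketch below. The structures and the `DEMO_` macros are simplified, hypothetical stand-ins written for this example; the real definitions in ring.h differ in detail (for instance, the entry count is derived from the page size there rather than passed in).

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the shared ring page and the frontend view. */
struct demo_sring {
    unsigned int req_prod, req_event;
    unsigned int rsp_prod, rsp_event;
};

struct demo_front_ring {
    unsigned int req_prod_pvt, rsp_cons, nr_ents;
    struct demo_sring *sring;
};

/* Reset the indexes that live in the shared page. */
#define DEMO_SHARED_RING_INIT(s) do {            \
    (s)->req_prod = (s)->rsp_prod = 0;           \
    (s)->req_event = (s)->rsp_event = 1;         \
} while (0)

/* Reset the frontend's private view and attach it to the shared page. */
#define DEMO_FRONT_RING_INIT(r, s, nr) do {      \
    (r)->req_prod_pvt = 0;                       \
    (r)->rsp_cons = 0;                           \
    (r)->nr_ents = (nr);                         \
    (r)->sring = (s);                            \
} while (0)

/* Combined form: both steps in the order all known users apply them. */
#define DEMO_XEN_FRONT_RING_INIT(r, s, nr) do {  \
    DEMO_SHARED_RING_INIT(s);                    \
    DEMO_FRONT_RING_INIT(r, s, nr);              \
} while (0)

/* Example caller: one call instead of two back-to-back macros. */
static void demo_setup(struct demo_front_ring *front,
                       struct demo_sring *shared)
{
    DEMO_XEN_FRONT_RING_INIT(front, shared, 32);
}
```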
-
- 27 Apr, 2022 13 commits
-
-
Michal Orzel authored
Function exynos4210_uart_init_preirq defines and sets a variable divisor but does not make use of it. Remove the definition and comment out the assignment as this function already has some TODOs. Signed-off-by: Michal Orzel <michal.orzel@arm.com> Acked-by: Julien Grall <jgrall@amazon.com> Acked-by: Stefano Stabellini <sstabellini@kernel.org>
-
Michal Orzel authored
Function omap5_init_time defines and sets the variable den but does not make use of it. Remove this variable. Signed-off-by: Michal Orzel <michal.orzel@arm.com> Reviewed-by: Julien Grall <jgrall@amazon.com>
-
Michal Orzel authored
Currently function xgene_check_pirq_eoi assigns the return value of dt_device_get_address to a variable res but does not make use of it. Fix it by making use of res in the condition checking the result of a call to dt_device_get_address instead of checking the address stored in dbase. Signed-off-by: Michal Orzel <michal.orzel@arm.com> Reviewed-by: Julien Grall <jgrall@amazon.com>
-
Michal Orzel authored
Function schedule_cpu_add defines and sets a variable old_unit but does not make use of it. Remove this variable. Signed-off-by: Michal Orzel <michal.orzel@arm.com> Reviewed-by: Juergen Gross <jgross@suse.com> Acked-by: Dario Faggioli <dfaggioli@suse.com>
-
Michal Orzel authored
Function arm_smmu_init_context_bank defines and sets a variable gr0_base but does not make use of it. Remove this variable. Signed-off-by: Michal Orzel <michal.orzel@arm.com> Acked-by: Julien Grall <jgrall@amazon.com>
-
Michal Orzel authored
Function efi_start defines and sets a variable size but does not make use of it. Remove this variable. Signed-off-by: Michal Orzel <michal.orzel@arm.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
-
Michal Orzel authored
Function device_tree_node_compatible defines and sets a variable mlen but does not make use of it. Remove this variable. Signed-off-by: Michal Orzel <michal.orzel@arm.com> Reviewed-by: Julien Grall <jgrall@amazon.com>
-
Ayan Kumar Halder authored
When the data abort is caused by cache maintenance for an address, there are three scenarios: 1. Address belongs to a non-emulated region - For this, Xen should set the corresponding bit in the translation table entry to valid and return to the guest to retry the instruction. This can happen sometimes as Xen needs to set the translation table entry to invalid (e.g. for a 'Break-Before-Make' sequence). Xen returns to the guest to retry the instruction. 2. Address belongs to an emulated region - Xen should ignore the instruction (i.e. increment the PC) and return to the guest. 3. Address is invalid - Xen should forward the data abort to the guest. Signed-off-by: Ayan Kumar Halder <ayankuma@xilinx.com> [julien: Don't initialize p.size to 1 << info->dabt.size] Reviewed-by: Julien Grall <jgrall@amazon.com>
-
David Vrabel authored
Heap pages can only be safely allocated and freed with interrupts enabled as they may require a TLB flush which may send IPIs (on x86). Normally spinlock debugging would catch calls from the incorrect context, but not from stop_machine_run() action functions as these are called with spin lock debugging disabled. Enhance the assertions in alloc_xenheap_pages() and alloc_domheap_pages() to check interrupts are enabled. For consistency the same asserts are used when freeing heap pages. As an exception, when only 1 PCPU is online, allocations are permitted with interrupts disabled as any TLB flushes would be local only. This is necessary during early boot. Signed-off-by: David Vrabel <dvrabel@amazon.co.uk> Reviewed-by: Jan Beulich <jbeulich@suse.com>
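The condition being asserted can be expressed as a small predicate. This is a sketch under assumptions: `heap_alloc_context_ok()` is a hypothetical name, and the interrupt state and online CPU count are passed in as plain parameters instead of being read from the hypervisor.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical predicate for the rule described above: heap pages may be
 * allocated or freed if interrupts are enabled, or if at most one physical
 * CPU is online (any TLB flush is then local and needs no IPI).
 */
static bool heap_alloc_context_ok(bool irqs_enabled, unsigned int online_cpus)
{
    return irqs_enabled || online_cpus <= 1;
}
```

An allocator following this rule would assert the predicate on entry to its allocation and free paths.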
-
Julien Grall authored
Commit 88a037e2 "page_alloc: assert IRQs are enabled in heap alloc/free" extended the checks in the buddy allocator to catch any use of the helpers from context with interrupts disabled. Unfortunately, the rule is not followed in the alternative code and this will result in a crash at boot with debug enabled: (XEN) Xen call trace: (XEN) [<0022a510>] alloc_xenheap_pages+0x120/0x150 (PC) (XEN) [<00000000>] 00000000 (LR) (XEN) [<002736ac>] arch/arm/mm.c#xen_pt_update+0x144/0x6e4 (XEN) [<002740d4>] map_pages_to_xen+0x10/0x20 (XEN) [<00236864>] __vmap+0x400/0x4a4 (XEN) [<0026aee8>] arch/arm/alternative.c#__apply_alternatives_multi_stop+0x144/0x1ec (XEN) [<0022fe40>] stop_machine_run+0x23c/0x300 (XEN) [<002c40c4>] apply_alternatives_all+0x34/0x5c (XEN) [<002ce3e8>] start_xen+0xcb8/0x1024 (XEN) [<00200068>] arch/arm/arm32/head.o#primary_switched+0xc/0x1c The interrupts will be disabled by the state machine in stop_machine_run(), which is why the ASSERT is hit. For now the patch extending the checks has been reverted, but it would be good to re-introduce it (allocation with interrupts disabled is not desirable). So move the re-mapping of Xen to the caller of stop_machine_run(). Signed-off-by: Julien Grall <jgrall@amazon.com> Cc: David Vrabel <dvrabel@amazon.co.uk> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
-
Jan Beulich authored
Just like for "install", make dealing with xen.efi on the EFI partition dependent upon mount point and vendor directory being known. Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>
-
Jason Andryuk authored
PCI device assignment to an HVM with stubdom is potentially racy. First the PCI device is assigned to the stubdom via the PV PCI protocol. Then QEMU is sent a QMP command to attach the PCI device to QEMU running within the stubdom. However, the sysfs entries within the stubdom may not have appeared by the time QEMU receives the device_add command resulting in errors like: libxl_qmp.c:1838:qmp_ev_parse_error_messages:Domain 10:Could not open '/sys/bus/pci/devices/0000:00:1f.3/config': No such file or directory This patch retries the device assignment up to 10 times with a 1 second delay between. That roughly matches the overall hotplug timeout for pci_add_timeout. pci_add_timeout's initialization is moved to do_pci_add since retries call into pci_add_qmp_device_add again. The qmp_ev_parse_error_messages error is still printed since it happens at a lower level than the pci code controlling the retries. With that, the "Retrying PCI add %d" message is also printed at ERROR level to clarify what is happening. Signed-off-by: Jason Andryuk <jandryuk@gmail.com> Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
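The retry policy can be sketched as a bounded loop. This is not libxl's code: libxl is event-driven, whereas this sketch is synchronous, all names are hypothetical, and "up to 10 times" is read here as one initial attempt followed by at most 10 retries.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_MAX_RETRIES 10

typedef bool (*attempt_fn)(void *ctx);
typedef void (*delay_fn)(unsigned int seconds);

/*
 * Hypothetical retry loop: returns the number of retries that were needed,
 * or -1 if the device could not be added after DEMO_MAX_RETRIES retries.
 */
static int retry_device_add(attempt_fn attempt, delay_fn delay, void *ctx)
{
    for (int tries = 0; ; tries++) {
        if (attempt(ctx))
            return tries;
        if (tries == DEMO_MAX_RETRIES)
            return -1;
        delay(1); /* wait one second before the next attempt */
    }
}

/* Demo device whose sysfs entry "appears" after a number of attempts. */
struct demo_device {
    int appears_after;
    int attempts;
};

static bool demo_attempt(void *ctx)
{
    struct demo_device *dev = ctx;

    return dev->attempts++ >= dev->appears_after;
}

/* Delay stub so the example does not actually sleep. */
static void demo_no_delay(unsigned int seconds)
{
    (void)seconds;
}
```

Passing the delay as a callback keeps the policy testable without real sleeps; a caller that wants real pauses would supply a function that sleeps for the given number of seconds.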
-
Tamas K Lengyel authored
During VM forking and resetting a failed vmentry has been observed due to the guest non-register state going out-of-sync with the guest register state. For example, a VM fork reset right after a STI instruction can trigger the failed entry. This is due to the guest non-register state not being saved from the parent VM, thus the reset operation only copies the register state. Fix this by adding a new pair of hvm functions to get/set the guest non-register state so that the overall vCPU state remains in sync. Signed-off-by: Tamas K Lengyel <tamas.lengyel@intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Acked-by: Jan Beulich <jbeulich@suse.com>
-
- 26 Apr, 2022 5 commits
-
-
Jan Beulich authored
This reverts commit 88a037e2, as it breaks booting on Arm.
-
David Vrabel authored
Heap pages can only be safely allocated and freed with interrupts enabled as they may require a TLB flush which may send IPIs (on x86). Normally spinlock debugging would catch calls from the incorrect context, but not from stop_machine_run() action functions as these are called with spin lock debugging disabled. Enhance the assertions in alloc_xenheap_pages() and alloc_domheap_pages() to check interrupts are enabled. For consistency the same asserts are used when freeing heap pages. As an exception, when only 1 PCPU is online, allocations are permitted with interrupts disabled as any TLB flushes would be local only. This is necessary during early boot. Signed-off-by: David Vrabel <dvrabel@amazon.co.uk> Reviewed-by: Jan Beulich <jbeulich@suse.com>
-
Daniel P. Smith authored
This is a quick code style cleanup patch for xsm/flask. The files flask_op.c and hooks.c are Xen specific, thus full code style rules were applied. The remaining files are from Linux and therefore only trailing whitespace was removed from those files. Signed-off-by: Daniel P. Smith <dpsmith@apertussolutions.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jason Andryuk <jandryuk@gmail.com>
-
Jan Beulich authored
Besides the reporter's issue of hitting a NULL deref when !CONFIG_GDBSX, XEN_DOMCTL_test_assign_device can legitimately end up having NULL passed here, when the domctl was passed DOMID_INVALID. Fixes: 71e617a6 ("use is_iommu_enabled() where appropriate...") Reported-by: Cheyenne Wills <cheyenne.wills@gmail.com> Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Paul Durrant <paul@xen.org> Reviewed-by: Juergen Gross <jgross@suse.com>
-
Juergen Gross authored
Today iommu_do_domctl() is being called from arch_do_domctl() in the "default:" case of a switch statement. This has led already to crashes due to unvalidated parameters. Fix that by moving the call of iommu_do_domctl() to the main switch statement of do_domctl(). Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> # Arm
-
- 22 Apr, 2022 2 commits
-
-
Juergen Gross authored
Setting errno to a negative value makes no sense. Fixes: e78e8b9b ("libxl: Add interface for querying hypervisor about PCI topology") Signed-off-by: Juergen Gross <jgross@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
-
Juergen Gross authored
Setting errno to a negative error value makes no sense. Fixes: cb99a640 ("libxc: arm: allow passing a device tree blob to the guest") Signed-off-by: Juergen Gross <jgross@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
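The convention behind these two fixes can be illustrated as follows. `demo_to_libc_style()` is a hypothetical helper for this example and is not part of libxl or libxc; it assumes the inner function reports failure as a negative error code.

```c
#include <assert.h>
#include <errno.h>

/*
 * Hypothetical helper: convert a kernel-style return value (negative error
 * code on failure) into the libc convention (return -1, with errno set to
 * the positive error code).
 */
static int demo_to_libc_style(int rc)
{
    if (rc >= 0)
        return rc;

    errno = -rc; /* errno values are positive, so negate the code */
    return -1;
}
```

Callers that compare errno against constants such as EINVAL, or pass it to strerror(), only work when the stored value is positive.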
-