- 11 Apr, 2014 5 commits
-
-
Daniel De Graaf authored
Most of these functions actually act on the hardware domain, so change their names to reflect this. Command line parameters and variables based on those parameters are excluded since those changes would be user-visible, as are any public headers. Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov> Reviewed-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org>
-
Daniel De Graaf authored
This should not change any functionality other than renaming the global variable. In a few cases (primarily the domain building code), a local variable or argument named dom0 was created and used instead of the global hardware_domain to clarify that the domain being used in this case is actually domain 0. Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov> Reviewed-by: Jan Beulich <jbeulich@suse.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Christoph Egger <chegger@amazon.de> Acked-by: Keir Fraser <keir@xen.org>
-
Daniel De Graaf authored
When the hardware domain is made distinct from dom0, it becomes possible to shut down and destroy domain 0 while leaving the hypervisor running. If this happens, prevent this domain ID from being considered for allocation to a new guest. Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov> Acked-by: Keir Fraser <keir@xen.org>
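The effect can be pictured with a minimal sketch. It is not the Xen code: `alloc_domid()`, `free_domid()`, `MAX_DOMID` and the `in_use` table are hypothetical names chosen for illustration. The only point is that ID 0 is never handed out, even when no domain currently holds it.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_DOMID     16      /* hypothetical small limit for the example */
#define INVALID_DOMID (-1)

static bool in_use[MAX_DOMID];

/* Return the lowest free ID, never considering ID 0 even if it is free. */
int alloc_domid(void)
{
    for (int id = 1; id < MAX_DOMID; id++) {   /* start at 1: ID 0 is reserved */
        if (!in_use[id]) {
            in_use[id] = true;
            return id;
        }
    }
    return INVALID_DOMID;
}

/* Mark an ID as no longer held by any domain. */
void free_domid(int id)
{
    assert(id >= 0 && id < MAX_DOMID);
    in_use[id] = false;
}
```

Even after `free_domid(0)`, the next `alloc_domid()` call in this sketch returns 1.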
-
Daniel De Graaf authored
When the hardware domain is split from domain 0, the initialization code for the hardware domain cannot be in the __init section, since the actual domain creation happens after these sections have been discarded. Create a __hwdom_init section designator to annotate these functions, and control it using the XSM configuration option for now (since XSM is required to take advantage of the security benefits of disaggregation). Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov> Reviewed-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org>
-
Daniel De Graaf authored
Instead of checking is_privileged to determine if a domain should control the hardware, check that the domain_id is equal to zero (which is currently the only domain for which is_privileged is true). This allows other places where domain_id is checked for zero to be replaced with is_hardware_domain.
The distinction between is_hardware_domain, is_control_domain, and domain 0 is based on the following disaggregation model:
Domain 0 bootstraps the system. It may remain to perform requested builds of domains that need a minimal trust chain (e.g. vTPM domains). Other than being built by the hypervisor, nothing is special about this domain, although it may be useful to have is_control_domain() return true depending on the toolstack it uses to build other domains.
The hardware domain manages devices for PCI pass-through to driver domains or can act as a driver domain itself, depending on the desired degree of disaggregation. It is also the domain managing devices that do not support pass-through: PCI configuration space access, parsing the hardware ACPI tables and system power or machine check events. This is the only domain where is_hardware_domain() is true. is_control_domain() may return false for this domain.
The control domain manages other domains, controls guest launch and shutdown, and manages resource constraints; is_control_domain() returns true. The functionality guarded by is_control_domain may in the future be adapted to use explicit hypercalls, eliminating the special treatment of this domain. It may be reasonable to have multiple control domains on a multi-tenant system.
Guest domains and other service or driver domains are all treated identically by the hypervisor; the security policy may further constrain administrative actions on or communication between these domains.
Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov> Reviewed-by: Jan Beulich <jbeulich@suse.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Keir Fraser <keir@xen.org>
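A minimal sketch of the two predicates as this message describes them, assuming a heavily simplified `struct domain` (the real structure has many more fields, and the real helpers may be macros rather than functions):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint16_t domid_t;

/* Simplified stand-in for the hypervisor's domain structure. */
struct domain {
    domid_t domain_id;
    bool    is_privileged;
};

/* Hardware control: at this point in the series, the domain with ID zero. */
static inline bool is_hardware_domain(const struct domain *d)
{
    assert(d);
    return d->domain_id == 0;
}

/* Domain management: still keyed off the privilege flag. */
static inline bool is_control_domain(const struct domain *d)
{
    assert(d);
    return d->is_privileged;
}
```

With the checks separated like this, a privileged domain with a non-zero ID counts as a control domain without being the hardware domain.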
-
- 10 Apr, 2014 17 commits
-
-
Konrad Rzeszutek Wilk authored
Which of course has a different model number and sports two serial outputs. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Keir Fraser <keir@xen.org>
-
Jan Beulich authored
Make it so this is easier to extend, and move the parsing code/data into .init.* sections. Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Aravind Gopalakrishnan <aravind.gopalakrishnan@amd.com>
-
Jan Beulich authored
Newer AMD CPUs also allow masking CPUID leaf 6 ECX and CPUID leaf 7 sub-leaf 0 EAX and EBX. Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Aravind Gopalakrishnan <aravind.gopalakrishnan@amd.com>
-
Jan Beulich authored
  - make sure UC- is only used for PAT purposes (MTRRs and hence EPT don't have this type)
  - add order input to "get", and properly handle conflict case (forcing an EPT page split)
  - properly detect (and refuse) overlaps during "set"
  - properly use RCU constructs
  - support deleting ranges through a special type input to "set"
  - set ignore-PAT flag in epte_get_entry_emt() when "get" succeeds
  - set "get" output to ~0 (invalid) rather than 0 (UC) on error (the caller shouldn't be looking at it anyway)
  - move struct hvm_mem_pinned_cacheattr_range from header to C file (used only there)
Note that the code (before and after this change) implies the GFN ranges passed to the hypercall to be inclusive, which is in contrast to the sole current user in qemu (all variants). It is not clear to me at which layer (qemu, libxc, hypervisor) this would best be fixed. Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Tim Deegan <tim@xen.org> Acked-by: Kevin Tian <kevin.tian@intel.com>
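To make the inclusive-range remark concrete, here is a minimal overlap check for ranges whose `start` and `end` are both inclusive. It is a sketch only; `struct gfn_range` and `ranges_overlap()` are hypothetical names, not the hypervisor's.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* A guest frame number range; both bounds are inclusive. */
struct gfn_range {
    uint64_t start;
    uint64_t end;
};

/* Two inclusive ranges overlap when each one starts no later than the other ends. */
bool ranges_overlap(const struct gfn_range *a, const struct gfn_range *b)
{
    assert(a->start <= a->end && b->start <= b->end);
    return a->start <= b->end && b->start <= a->end;
}
```

Under this convention a caller that passes an exclusive end (for example 0x200 for a range meant to cover 0x100 to 0x1ff) would be seen as overlapping a neighbour that starts at 0x200, which is the kind of mismatch the note above points at.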
-
Jan Beulich authored
This capability solely makes a statement on cache coherency guarantees by the IOMMU. It specifically does not imply any further guarantees implied by certain memory types (cacheability, ordering). Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Kevin Tian <kevin.tian@intel.com>
-
Jan Beulich authored
... between constituent pages. To indicate such, the page order is being passed down to the vMTRR routines, with a negative return value (possible only on order-non-zero pages) indicating such collisions. Some code redundancy reduction is being done to ept_set_entry() along the way, allowing the new handling to be centralized to a single place there. In order to keep ept_set_entry() fast and simple, the actual splitting is being deferred to the EPT_MISCONFIG VM exit handler. Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Tim Deegan <tim@xen.org> Acked-by: Kevin Tian <kevin.tian@intel.com>
-
Jan Beulich authored
The main goal here is to drop the bogus dependency of epte_get_entry_emt() on d->arch.hvm_domain.params[HVM_PARAM_IDENT_PT]. Any change to state influencing epte_get_entry_emt()'s decision needs to result in re-calculation. Do this by using the EPT_MISCONFIG VM exit, storing an invalid memory type into EPT's emt field (leaving the IOMMU, which doesn't care about memory types, unaffected). This is being done in a hierarchical manner to keep execution time down: Initially only the top level directory gets invalidated this way. Upon access, the involved intermediate page table levels get cleared back to zero, and the leaf entry gets its field properly set. For 4k leaves all other entries in the same directory also get processed to amortize the cost of the extra VM exit (which halved the number of these VM exits in my testing). This restoring can result in spurious EPT_MISCONFIG VM exits (since two vCPU-s may access addresses involving identical page table structures). Rather than simply returning in such cases (and risking that such a VM exit results from a real mis-configuration, which would then result in an endless loop rather than killing the VM), a per-vCPU flag is being introduced indicating when such a spurious VM exit might validly happen - if another one occurs right after VM re-entry, the flag would generally end up being clear, causing the VM to be killed as before on such VM exits. Note that putting a reserved memory type value in the EPT structures isn't formally sanctioned by the specification. Intel isn't willing to adjust the specification to make this or a similar use of the EPT_MISCONFIG VM exit formally possible, but they have indicated that us using this is low risk wrt forward compatibility. Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Tim Deegan <tim@xen.org> Acked-by: Kevin Tian <kevin.tian@intel.com>
-
Liu Jinsong authored
... since Jinsong switched to Alibaba Corp. Signed-off-by: Liu Jinsong <jinsong.liu@alibaba-inc.com>
-
Liu Jinsong authored
... since Jinsong switched to Alibaba Corp. Signed-off-by: Liu Jinsong <jinsong.liu@alibaba-inc.com>
-
Ian Campbell authored
This now uses the same decision tree as libxc (which is much easier to test). The main change is to explicitly handle the placement at 128MB or end of RAM as two cases, rather than combining with MIN. The effect is the same but the code is clearer. Secondly the attempt to place the modules right after the kernel is removed, since it is redundant with the case where placing them at the end of RAM ends up abutting the kernel. Also round the kernel size up to a 2MB boundary. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Julien Grall <julien.grall@linaro.org>
-
Ian Campbell authored
The placement algorithm should be effectively the same and using different variable names makes my head hurt when I try to compare. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Julien Grall <julien.grall@linaro.org>
-
Ian Campbell authored
314c9815 "tools: implement initial ramdisk support for ARM." broke starting guests with <= 128 MB ram by placing the boot modules (dtb and initrd) immediately after the kernel in this case, running the risk of them being overwritten. Instead place the modules at the end of RAM, as the hypervisor does for dom0. The hypervisor also falls back to placing things before the kernel as a last resort before failing, so add that here too. Tested with the Debian installer initrd and guests of 96MB, 128MB, 256MB and 1GB. All work, also tested with 64MB but the installer doesn't run with so little RAM (but our placement of the initrd is correct). Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Julien Grall <julien.grall@linaro.org> Cc: Ian Jackson <ian.jackson@eu.citrix.com>
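The placement order described here (128MB mark, then end of RAM, then before the kernel as a last resort) might be sketched as follows. This is an illustration under assumptions, not the libxc or hypervisor code: `place_modules()` and its parameters are hypothetical, module alignment is ignored, and the 2MB rounding of the kernel size is borrowed from the related hypervisor change.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MB(x)         ((uint64_t)(x) << 20)
#define ROUNDUP(x, a) (((x) + (a) - 1) & ~((uint64_t)(a) - 1))

/* Pick a base address for the boot modules; returns false if nothing fits. */
bool place_modules(uint64_t ram_base, uint64_t ram_size,
                   uint64_t kern_base, uint64_t kern_size,
                   uint64_t mod_size, uint64_t *mod_base)
{
    const uint64_t ram_end   = ram_base + ram_size;
    const uint64_t ram_128mb = ram_base + MB(128);
    const uint64_t kern_end  = kern_base + ROUNDUP(kern_size, MB(2));

    assert(kern_base >= ram_base && kern_end <= ram_end);

    /* Case 1: RAM reaches far enough past 128MB and the kernel ends below it. */
    if (ram_end >= ram_128mb + mod_size && kern_end < ram_128mb) {
        *mod_base = ram_128mb;
        return true;
    }

    /* Case 2: end of RAM, provided the modules stay clear of the kernel. */
    if (mod_size <= ram_size && ram_end - mod_size >= kern_end) {
        *mod_base = ram_end - mod_size;
        return true;
    }

    /* Last resort: the gap between the start of RAM and the kernel. */
    if (kern_base - ram_base >= mod_size) {
        *mod_base = kern_base - mod_size;
        return true;
    }

    return false;
}
```

Handling the 128MB and end-of-RAM cases as separate branches, rather than taking a minimum of the two, keeps each condition readable on its own.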
-
Ian Campbell authored
According to http://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html __builtin_expect has the prototype: long __builtin_expect (long exp, long c) If sizeof(exp) > sizeof(long) then this will effectively mask off the top bits of exp, meaning that the if in "if (unlikely(x))" will see the masked version, which might be false when true was expected. likely has the same issue. This is most likely to affect x86_32 and arm32 builds. x86_32 is not present on 4.3 onwards and a quick grep of current staging shows that all the existing arm32 uses of both likely and unlikely already pass a boolean. I noticed this with an as yet unposted patch which did not have this property. Also the definition of likely might not have had the expected effect for cases where a true value > 1 might be passed. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Cc: Keir Fraser <keir@xen.org> Cc: Tim Deegan <tim@xen.org>
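One common way to avoid the problem is to normalise the argument to 0 or 1 with `!!` before it reaches `__builtin_expect`. The sketch below (GCC or Clang only) shows that form; whether it matches the committed macros exactly is an assumption. `truncated_to_32_bits()` is a hypothetical helper that imitates what a 32-bit `long` parameter would do to a 64-bit argument, so the effect can be seen on any platform.

```c
#include <stdint.h>

/* Collapse any non-zero value to 1 before it is converted to long. */
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Imitate passing a 64-bit value through a 32-bit parameter: the top bits are lost. */
static inline int truncated_to_32_bits(uint64_t x)
{
    uint32_t low = (uint32_t)x;   /* well-defined modulo-2^32 conversion */
    return low != 0;
}

/* Example user: tests a flag that lives above bit 31. */
int high_flag_set(uint64_t flags)
{
    return unlikely(flags & (1ULL << 40)) ? 1 : 0;
}
```

A value such as `1ULL << 40` is non-zero, yet its low 32 bits are all zero, so a truncating path would treat it as false while the `!!` form keeps it true. The same normalisation makes `likely(2)` compare equal to the expected value 1.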
-
Ian Campbell authored
We haven't shipped a XenoLinux kernel for more releases than I can remember. We held onto these because osstest was using them but this is no longer the case. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
-
Ian Campbell authored
AFAICT this hasn't actually been built since 8311d176 "docs: Remove outdated LaTex documentation". Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
-
Ian Campbell authored
From 3f2142f0b7a0d600fa8d2d06b5eacf0d52aa5bca Mon Sep 17 00:00:00 2001 From: Ian Campbell <ian.campbell@citrix.com> Date: Fri, 4 Apr 2014 15:00:12 +0100 Subject: [PATCH v2] tools/hotplug: Remove network-* These are a xend-ism. Since Xen 4.1 the recommended way to configure networking has been to use the distro facilities (e.g. http://wiki.xen.org/wiki/HostConfiguration/Networking ) Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Roger Pau Monné <roger.pau@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
-
Ian Campbell authored
These were added by 7dbfc2f8 "docs: Honour --{en, dis}able-xend when building docs" between v1 and the (eventually committed) v2 of 9e8672f1 "tools: remove xend and associated python modules" and were missed when rebasing for v2. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Tested-by: Olaf Hering <olaf@aepfle.de> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
-
- 09 Apr, 2014 18 commits
-
-
Wei Liu authored
... otherwise JSON array elements are not freed and memory is leaked. Signed-off-by: Wei Liu <wei.liu2@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
-
-
Jan Beulich authored
Reported-by: Aravind Gopalakrishnan <aravind.gopalakrishnan@amd.com> Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
-
Bob Liu authored
During Xen testing, the failure below was triggered if dedup=0.
(XEN) Assertion '!preempt_count()' failed at preempt.c:37
(XEN) ----[ Xen-4.5-unstable x86_64 debug=y Not tainted ]----
(XEN) CPU: 51
(XEN) RIP: e008:[<ffff82d08011bfef>] ASSERT_NOT_IN_ATOMIC+0x22/0x53
(XEN) RFLAGS: 0000000000010286 CONTEXT: hypervisor
(XEN) rax: ffff82d080318d20 rbx: ffff8300681ea000 rcx: 0000000000000001
(XEN) rdx: 00000033bca03300 rsi: ffff8308110da000 rdi: ffff82d080286690
(XEN) rbp: ffff83043cd0ff08 rsp: ffff83043cd0ff08 r8: ffff8307d2beecb0
(XEN) r9: 000000000000000d r10: 00000000deadbeef r11: 0000000000000202
(XEN) r12: 0000000000000000 r13: 0000000000000000 r14: 0000000000000005
(XEN) r15: 0000000000000001 cr0: 0000000080050033 cr4: 00000000001526f0
(XEN) cr3: 000000005246d000 cr2: ffff880106123418
(XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: e010 cs: e008
(XEN) Xen stack trace from rsp=ffff83043cd0ff08:
(XEN) 00007cfbc32f00c7 ffff82d0802258f0 ffff880106123418 ffffea0006156e80
(XEN) ffff8800d0ab5368 00007faff4c83000 ffff8801bdea33e8 0000000000000002
(XEN) 0000000000000202 00000000deadbeef 0000000000000000 00000000000c3565
(XEN) fffffffffffffff4 ffffffff810014ca ffffffff81de1000 000000000000c356
(XEN) 00000000deadbeef 0001010000000000 ffffffff810014ca 000000000000e033
(XEN) 0000000000000202 ffff8801bdea3360 000000000000e02b 000000000000beef
(XEN) 000000000000beef 000000000000beef 000000000000beef 0000000000000033
(XEN) ffff8300681ea000 00000033bca03300 0000000000000000
(XEN) Xen call trace:
(XEN) [<ffff82d08011bfef>] ASSERT_NOT_IN_ATOMIC+0x22/0x53
(XEN) [<ffff82d0802258f0>] test_all_events+0x6/0x30
The root cause is a wrong 'write_unlock(&pcd_tree_rwlocks[firstbyte])' in function tmem_try_to_evict_pgp(). Nobody will lock &pcd_tree_rwlocks if dedup=0, but the write_unlock() will be executed anyway.
This was introduced by commit 38c433d0 ("tmem: add page deduplication with optional compression or trailing-zero-elimination") Signed-off-by: Bob Liu <bob.liu@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
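The shape of the bug and of a fix can be sketched with a stand-in lock that only counts holders. Everything here is hypothetical (`struct fake_rwlock`, `evict_one()`, the `dedup_enabled` flag); the point is simply that the unlock must be guarded by the same condition as the lock.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for a write lock: just counts writers so imbalance is detectable. */
struct fake_rwlock {
    int writers;
};

static void fake_write_lock(struct fake_rwlock *l)
{
    assert(l->writers == 0);
    l->writers++;
}

static void fake_write_unlock(struct fake_rwlock *l)
{
    assert(l->writers == 1);   /* unlocking a lock nobody holds trips this */
    l->writers--;
}

/* Returns the lock depth after the call; 0 means lock and unlock were balanced. */
int evict_one(struct fake_rwlock *tree_lock, bool dedup_enabled)
{
    if (dedup_enabled)
        fake_write_lock(tree_lock);

    /* ... eviction work would go here ... */

    if (dedup_enabled)         /* mirror the lock condition exactly */
        fake_write_unlock(tree_lock);

    return tree_lock->writers;
}
```

Dropping the second `if` reproduces the reported pattern in this toy model: with dedup disabled, the unlock runs on a lock that was never taken.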
-
Bob Liu authored
Parameter "destroy" in functions client_flush() and pool_flush() is unneeded because it was always set to 1. Signed-off-by: Bob Liu <bob.liu@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-
Bob Liu authored
Reorganize the code to make it more readable. Check the return value of shared_pool_join() and drop an unneeded call to it. Refuse to create a shared & persistent pool at an earlier point. Note that one might be tempted to delay the creation of the pool even further in the code. That however would break the behavior of the code - that is if we ended up creating a shared pool and the 'uuid_lo == -1L && uuid_hi == -1L' logic stands we still need to create a pool - just not shared type. Signed-off-by: Bob Liu <bob.liu@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-
Bob Liu authored
Make function tmemc_shared_pool_auth() more readable. Note that the previous check for free being set the first time '(free == -1)' in the loop is now removed. That is OK because when we set free the first time ('free = i;') we follow it immediately with a break to get out of the loop. Signed-off-by: Bob Liu <bob.liu@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-
Bob Liu authored
Parameters "selective" and "no_rebalance" are meaningless in the obj destroy path; this patch removes them. No place uses no_rebalance=1. In the obj_destroy path we always call it with no_rebalance=0. Note that this will now free it only if: obj->last_client == cli_id Which is OK - even if we allocate a non-shared pool we set by default the obj->last_client to TMEM_CLI_ID_NULL so even if the pool is never used, the pool_flush will take care of removing those. Signed-off-by: Bob Liu <bob.liu@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-
Bob Liu authored
tmemc_set_var() calls tmemc_set_var_one() but ignores its return value; this patch fixes this issue. Also rename tmemc_set_var_one() to __tmemc_set_var(). Signed-off-by: Bob Liu <bob.liu@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-
Bob Liu authored
No need to maintain a global pool list; nobody uses it. Signed-off-by: Bob Liu <bob.liu@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-
Bob Liu authored
Function client_freeze() only sets client->frozen = freeze; the caller can do this work directly. Signed-off-by: Bob Liu <bob.liu@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-
Bob Liu authored
There are several functions related to freeing a pgp, but their relationships are not clear enough to follow. This patch cleans this up by removing pgp_delist() and pgp_free_from_inv_list(). The call trace is simple now: pgp_delist_free() > pgp_free() > __pgp_free() Signed-off-by: Bob Liu <bob.liu@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-
Bob Liu authored
The only difference made by the "from_delete" parameter in pgp_free() is a one-line ASSERT(); this patch moves it to the caller to make the code cleaner. Signed-off-by: Bob Liu <bob.liu@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-
Bob Liu authored
The parameter "eph_lock" is only needed for function tmem_evict(). Embed the delist code into tmem_evict() directly so as to drop the eph_lock parameter. With this change, the eph list lock can also be released a bit earlier. Signed-off-by: Bob Liu <bob.liu@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> [v2: A fix for an assertion of 'client->eph_count >= 0' was rolled in]
-
Bob Liu authored
There is a potential bug in the obj allocate path. When parallel callers allocate an obj and insert it into pool->obj_rb_root, an unexpected obj might be returned (both callers use the same oid).
Caller A:                           Caller B:
obj_find(oidp) == NULL              obj_find(oidp) == NULL
write_lock(&pool->pool_rwlock)
obj_new():
    objA = tmem_malloc()
    obj_rb_insert(objA)
write_unlock()
                                    write_lock(&pool->pool_rwlock)
                                    obj_new():
                                        objB = tmem_malloc()
                                        obj_rb_insert(objB)
                                    write_unlock()
Continue write data to objA
But in future obj_find(), objB will always be returned. The root cause is that the allocate path didn't check the return value of obj_rb_insert(). This patch fixes it and replaces obj_new() with the better name obj_alloc(). Signed-off-by: Bob Liu <bob.liu@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
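A minimal sketch of the "check the insert result" idea, using a small fixed array in place of the red-black tree and leaving the pool lock out entirely. The function names echo the message, but the bodies, `struct pool`, `obj_insert()` and `POOL_SLOTS` are hypothetical and only illustrate the pattern.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define POOL_SLOTS 8          /* hypothetical capacity for the example */

struct obj  { uint64_t oid; };
struct pool { struct obj *slots[POOL_SLOTS]; };

/* Look up an object by ID; NULL if absent. */
struct obj *obj_find(struct pool *p, uint64_t oid)
{
    for (int i = 0; i < POOL_SLOTS; i++)
        if (p->slots[i] && p->slots[i]->oid == oid)
            return p->slots[i];
    return NULL;
}

/* Insert; fails if the ID is already present or the pool is full.
 * In real code the caller would hold the pool write lock here. */
static bool obj_insert(struct pool *p, struct obj *o)
{
    assert(p && o);
    if (obj_find(p, o->oid))
        return false;
    for (int i = 0; i < POOL_SLOTS; i++) {
        if (!p->slots[i]) {
            p->slots[i] = o;
            return true;
        }
    }
    return false;
}

/* Allocate and insert; on a failed insert, free the new object and report it. */
struct obj *obj_alloc(struct pool *p, uint64_t oid)
{
    struct obj *o = malloc(sizeof(*o));
    if (!o)
        return NULL;
    o->oid = oid;
    if (!obj_insert(p, o)) {  /* another caller got there first */
        free(o);
        return NULL;
    }
    return o;
}
```

A caller that gets NULL back can retry its lookup and will then find the object the other caller inserted, instead of continuing to write into an object that is not reachable from the pool.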
-
Roger Pau Monné authored
This is done so PVH guests can use PHYSDEVOP_pirq_eoi_gmfn_v{1/2}. Update users of these fields to reflect that they have been moved and are now also available to other kinds of guests. Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Move auto_unmask ahead of the other two fields, to reduce padding. Signed-off-by: Jan Beulich <jbeulich@suse.com>
-
Don Slutz authored
This adds a set of trace events that track the setup of various emulated devices related to timers in domU. This set is hpet, pit (i8253, i8254), rtc (MC146818), apic (lapic), and pic (i8259). The pmtimer is not traced since it does not have a changeable rate. Signed-off-by: Don Slutz <dslutz@verizon.com> Reviewed-by: Tim Deegan <tim@xen.org> Acked-by: George Dunlap <george.dunlap@eu.citrix.com>
-
Don Slutz authored
This is per CODING_STYLE. Signed-off-by: Don Slutz <dslutz@verizon.com>
-