Commit 8644d2a4 authored by Greg KH's avatar Greg KH Committed by Greg Kroah-Hartman
Browse files
parents 1cde8a16 99f95e52
......@@ -44,9 +44,9 @@ running, the suggested command should tell you.
Again, keep in mind that this list assumes you are already
functionally running a Linux 2.4 kernel. Also, not all tools are
necessary on all systems; obviously, if you don't have any PCMCIA (PC
Card) hardware, for example, you probably needn't concern yourself
with pcmcia-cs.
necessary on all systems; obviously, if you don't have any ISDN
hardware, for example, you probably needn't concern yourself with
isdn4k-utils.
o Gnu C 2.95.3 # gcc --version
o Gnu make 3.79.1 # make --version
......@@ -57,6 +57,7 @@ o e2fsprogs 1.29 # tune2fs
o jfsutils 1.1.3 # fsck.jfs -V
o reiserfsprogs 3.6.3 # reiserfsck -V 2>&1|grep reiserfsprogs
o xfsprogs 2.6.0 # xfs_db -V
o pcmciautils 001
o pcmcia-cs 3.1.21 # cardmgr -V
o quota-tools 3.09 # quota -V
o PPP 2.4.0 # pppd --version
......@@ -186,13 +187,20 @@ architecture independent and any version from 2.0.0 onward should
work correctly with this version of the XFS kernel code (2.6.0 or
later is recommended, due to some significant improvements).
PCMCIAutils
-----------
PCMCIAutils replaces pcmcia-cs (see below). It properly sets up
PCMCIA sockets at system startup and loads the appropriate modules
for 16-bit PCMCIA devices if the kernel is modularized and the hotplug
subsystem is used.
Pcmcia-cs
---------
PCMCIA (PC Card) support is now partially implemented in the main
kernel source. Pay attention when you recompile your kernel ;-).
Also, be sure to upgrade to the latest pcmcia-cs release.
kernel source. The "pcmciautils" package (see above) replaces pcmcia-cs
for newest kernels.
Quota-tools
-----------
......@@ -349,9 +357,13 @@ Xfsprogs
--------
o <ftp://oss.sgi.com/projects/xfs/download/>
Pcmciautils
-----------
o <ftp://ftp.kernel.org/pub/linux/utils/kernel/pcmcia/>
Pcmcia-cs
---------
o <ftp://pcmcia-cs.sourceforge.net/pub/pcmcia-cs/pcmcia-cs-3.1.21.tar.gz>
o <http://pcmcia-cs.sourceforge.net/>
Quota-tools
----------
......
Block io priorities
===================
Intro
-----
With the introduction of cfq v3 (aka cfq-ts or time sliced cfq), basic io
priorities is supported for reads on files. This enables users to io nice
processes or process groups, similar to what has been possible to cpu
scheduling for ages. This document mainly details the current possibilites
with cfq, other io schedulers do not support io priorities so far.
Scheduling classes
------------------
CFQ implements three generic scheduling classes that determine how io is
served for a process.
IOPRIO_CLASS_RT: This is the realtime io class. This scheduling class is given
higher priority than any other in the system, processes from this class are
given first access to the disk every time. Thus it needs to be used with some
care, one io RT process can starve the entire system. Within the RT class,
there are 8 levels of class data that determine exactly how much time this
process needs the disk for on each service. In the future this might change
to be more directly mappable to performance, by passing in a wanted data
rate instead.
IOPRIO_CLASS_BE: This is the best-effort scheduling class, which is the default
for any process that hasn't set a specific io priority. The class data
determines how much io bandwidth the process will get, it's directly mappable
to the cpu nice levels just more coarsely implemented. 0 is the highest
BE prio level, 7 is the lowest. The mapping between cpu nice level and io
nice level is determined as: io_nice = (cpu_nice + 20) / 5.
IOPRIO_CLASS_IDLE: This is the idle scheduling class, processes running at this
level only get io time when no one else needs the disk. The idle class has no
class data, since it doesn't really apply here.
Tools
-----
See below for a sample ionice tool. Usage:
# ionice -c<class> -n<level> -p<pid>
If pid isn't given, the current process is assumed. IO priority settings
are inherited on fork, so you can use ionice to start the process at a given
level:
# ionice -c2 -n0 /bin/ls
will run ls at the best-effort scheduling class at the highest priority.
For a running process, you can give the pid instead:
# ionice -c1 -n2 -p100
will change pid 100 to run at the realtime scheduling class, at priority 2.
---> snip ionice.c tool <---
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <getopt.h>
#include <unistd.h>
#include <sys/ptrace.h>
#include <asm/unistd.h>
extern int sys_ioprio_set(int, int, int);
extern int sys_ioprio_get(int, int);
#if defined(__i386__)
#define __NR_ioprio_set 289
#define __NR_ioprio_get 290
#elif defined(__ppc__)
#define __NR_ioprio_set 273
#define __NR_ioprio_get 274
#elif defined(__x86_64__)
#define __NR_ioprio_set 251
#define __NR_ioprio_get 252
#elif defined(__ia64__)
#define __NR_ioprio_set 1274
#define __NR_ioprio_get 1275
#else
#error "Unsupported arch"
#endif
_syscall3(int, ioprio_set, int, which, int, who, int, ioprio);
_syscall2(int, ioprio_get, int, which, int, who);
enum {
IOPRIO_CLASS_NONE,
IOPRIO_CLASS_RT,
IOPRIO_CLASS_BE,
IOPRIO_CLASS_IDLE,
};
enum {
IOPRIO_WHO_PROCESS = 1,
IOPRIO_WHO_PGRP,
IOPRIO_WHO_USER,
};
#define IOPRIO_CLASS_SHIFT 13
const char *to_prio[] = { "none", "realtime", "best-effort", "idle", };
int main(int argc, char *argv[])
{
int ioprio = 4, set = 0, ioprio_class = IOPRIO_CLASS_BE;
int c, pid = 0;
while ((c = getopt(argc, argv, "+n:c:p:")) != EOF) {
switch (c) {
case 'n':
ioprio = strtol(optarg, NULL, 10);
set = 1;
break;
case 'c':
ioprio_class = strtol(optarg, NULL, 10);
set = 1;
break;
case 'p':
pid = strtol(optarg, NULL, 10);
break;
}
}
switch (ioprio_class) {
case IOPRIO_CLASS_NONE:
ioprio_class = IOPRIO_CLASS_BE;
break;
case IOPRIO_CLASS_RT:
case IOPRIO_CLASS_BE:
break;
case IOPRIO_CLASS_IDLE:
ioprio = 7;
break;
default:
printf("bad prio class %d\n", ioprio_class);
return 1;
}
if (!set) {
if (!pid && argv[optind])
pid = strtol(argv[optind], NULL, 10);
ioprio = ioprio_get(IOPRIO_WHO_PROCESS, pid);
printf("pid=%d, %d\n", pid, ioprio);
if (ioprio == -1)
perror("ioprio_get");
else {
ioprio_class = ioprio >> IOPRIO_CLASS_SHIFT;
ioprio = ioprio & 0xff;
printf("%s: prio %d\n", to_prio[ioprio_class], ioprio);
}
} else {
if (ioprio_set(IOPRIO_WHO_PROCESS, pid, ioprio | ioprio_class << IOPRIO_CLASS_SHIFT) == -1) {
perror("ioprio_set");
return 1;
}
if (argv[optind])
execvp(argv[optind], &argv[optind]);
}
return 0;
}
---> snip ionice.c tool <---
March 11 2005, Jens Axboe <axboe@suse.de>
......@@ -17,6 +17,7 @@ This driver is known to work with the following cards:
* SA P600
* SA P800
* SA E400
* SA E300
If nodes are not already created in the /dev/cciss directory, run as root:
......
......@@ -1119,7 +1119,7 @@ running once the system is up.
See Documentation/ramdisk.txt.
psmouse.proto= [HW,MOUSE] Highest PS2 mouse protocol extension to
probe for (bare|imps|exps).
probe for (bare|imps|exps|lifebook|any).
psmouse.rate= [HW,MOUSE] Set desired mouse report rate, in reports
per second.
psmouse.resetafter=
......
Matching of PCMCIA devices to drivers is done using one or more of the
following criteria:
- manufactor ID
- card ID
- product ID strings _and_ hashes of these strings
- function ID
- device function (actual and pseudo)
You should use the helpers in include/pcmcia/device_id.h for generating the
struct pcmcia_device_id[] entries which match devices to drivers.
If you want to match product ID strings, you also need to pass the crc32
hashes of the string to the macro, e.g. if you want to match the product ID
string 1, you need to use
PCMCIA_DEVICE_PROD_ID1("some_string", 0x(hash_of_some_string)),
If the hash is incorrect, the kernel will inform you about this in "dmesg"
upon module initialization, and tell you of the correct hash.
You can determine the hash of the product ID strings by running
"pcmcia-modalias %n.%m" [%n being replaced with the socket number and %m being
replaced with the device function] from pcmciautils. It generates a string
in the following form:
pcmcia:m0149cC1ABf06pfn00fn00pa725B842DpbF1EFEE84pc0877B627pd00000000
The hex value after "pa" is the hash of product ID string 1, after "pb" for
string 2 and so on.
Alternatively, you can use this small tool to determine the crc32 hash.
simply pass the string you want to evaluate as argument to this program,
e.g.
$ ./crc32hash "Dual Speed"
-------------------------------------------------------------------------
/* crc32hash.c - derived from linux/lib/crc32.c, GNU GPL v2 */
#include <string.h>
#include <stdio.h>
#include <ctype.h>
#include <stdlib.h>
unsigned int crc32(unsigned char const *p, unsigned int len)
{
int i;
unsigned int crc = 0;
while (len--) {
crc ^= *p++;
for (i = 0; i < 8; i++)
crc = (crc >> 1) ^ ((crc & 1) ? 0xedb88320 : 0);
}
return crc;
}
int main(int argc, char **argv) {
unsigned int result;
if (argc != 2) {
printf("no string passed as argument\n");
return -1;
}
result = crc32(argv[1], strlen(argv[1]));
printf("0x%x\n", result);
return 0;
}
This file details changes in 2.6 which affect PCMCIA card driver authors:
* in-kernel device<->driver matching
PCMCIA devices and their correct drivers can now be matched in
kernelspace. See 'devicetable.txt' for details.
* Device model integration (as of 2.6.11)
A struct pcmcia_device is registered with the device model core,
and can be used (e.g. for SET_NETDEV_DEV) by using
handle_to_dev(client_handle_t * handle).
* Convert internal I/O port addresses to unsigned long (as of 2.6.11)
ioaddr_t should be replaced by kio_addr_t in PCMCIA card drivers.
* irq_mask and irq_list parameters (as of 2.6.11)
The irq_mask and irq_list parameters should no longer be used in
PCMCIA card drivers. Instead, it is the job of the PCMCIA core to
determine which IRQ should be used. Therefore, link->irq.IRQInfo2
is ignored.
* client->PendingEvents is gone (as of 2.6.11)
client->PendingEvents is no longer available.
* client->Attributes are gone (as of 2.6.11)
client->Attributes is unused, therefore it is removed from all
PCMCIA card drivers
* core functions no longer available (as of 2.6.11)
The following functions have been removed from the kernel source
because they are unused by all in-kernel drivers, and no external
driver was reported to rely on them:
pcmcia_get_first_region()
pcmcia_get_next_region()
pcmcia_modify_window()
pcmcia_set_event_mask()
pcmcia_get_first_window()
pcmcia_get_next_window()
* device list iteration upon module removal (as of 2.6.10)
It is no longer necessary to iterate on the driver's internal
client list and call the ->detach() function upon module removal.
* Resource management. (as of 2.6.8)
Although the PCMCIA subsystem will allocate resources for cards,
it no longer marks these resources busy. This means that driver
authors are now responsible for claiming your resources as per
other drivers in Linux. You should use request_region() to mark
your IO regions in-use, and request_mem_region() to mark your
memory regions in-use. The name argument should be a pointer to
your driver name. Eg, for pcnet_cs, name should point to the
string "pcnet_cs".
......@@ -1149,7 +1149,7 @@ S: Maintained
INFINIBAND SUBSYSTEM
P: Roland Dreier
M: roland@topspin.com
M: rolandd@cisco.com
P: Sean Hefty
M: mshefty@ichips.intel.com
P: Hal Rosenstock
......
......@@ -32,6 +32,7 @@
#include <asm/leds.h>
#include <asm/processor.h>
#include <asm/uaccess.h>
#include <asm/mach/time.h>
extern const char *processor_modes[];
extern void setup_mm_for_reboot(char mode);
......@@ -85,8 +86,10 @@ EXPORT_SYMBOL(pm_power_off);
void default_idle(void)
{
local_irq_disable();
if (!need_resched() && !hlt_counter)
if (!need_resched() && !hlt_counter) {
timer_dyn_reprogram();
arch_idle();
}
local_irq_enable();
}
......
......@@ -424,15 +424,19 @@ static int timer_dyn_tick_disable(void)
return ret;
}
/*
* Reprogram the system timer for at least the calculated time interval.
* This function should be called from the idle thread with IRQs disabled,
* immediately before sleeping.
*/
void timer_dyn_reprogram(void)
{
struct dyn_tick_timer *dyn_tick = system_timer->dyn_tick;
unsigned long flags;
write_seqlock_irqsave(&xtime_lock, flags);
write_seqlock(&xtime_lock);
if (dyn_tick->state & DYN_TICK_ENABLED)
dyn_tick->reprogram(next_timer_interrupt() - jiffies);
write_sequnlock_irqrestore(&xtime_lock, flags);
write_sequnlock(&xtime_lock);
}
static ssize_t timer_show_dyn_tick(struct sys_device *dev, char *buf)
......
zreladdr-y := 0xf0008000
......@@ -288,8 +288,8 @@ static void usb_release(struct device *dev)
static struct resource udc_resources[] = {
/* order is significant! */
{ /* registers */
.start = IO_ADDRESS(UDC_BASE),
.end = IO_ADDRESS(UDC_BASE + 0xff),
.start = UDC_BASE,
.end = UDC_BASE + 0xff,
.flags = IORESOURCE_MEM,
}, { /* general IRQ */
.start = IH2_BASE + 20,
......@@ -355,8 +355,8 @@ static struct platform_device ohci_device = {
static struct resource otg_resources[] = {
/* order is significant! */
{
.start = IO_ADDRESS(OTG_BASE),
.end = IO_ADDRESS(OTG_BASE + 0xff),
.start = OTG_BASE,
.end = OTG_BASE + 0xff,
.flags = IORESOURCE_MEM,
}, {
.start = IH2_BASE + 8,
......
......@@ -522,6 +522,69 @@ static inline void free_area(unsigned long addr, unsigned long end, char *s)
printk(KERN_INFO "Freeing %s memory: %dK\n", s, size);
}
static inline void
free_memmap(int node, unsigned long start_pfn, unsigned long end_pfn)
{
struct page *start_pg, *end_pg;
unsigned long pg, pgend;
/*
* Convert start_pfn/end_pfn to a struct page pointer.
*/
start_pg = pfn_to_page(start_pfn);
end_pg = pfn_to_page(end_pfn);
/*
* Convert to physical addresses, and
* round start upwards and end downwards.
*/
pg = PAGE_ALIGN(__pa(start_pg));
pgend = __pa(end_pg) & PAGE_MASK;
/*
* If there are free pages between these,
* free the section of the memmap array.
*/
if (pg < pgend)
free_bootmem_node(NODE_DATA(node), pg, pgend - pg);
}
/*
* The mem_map array can get very big. Free the unused area of the memory map.
*/
static void __init free_unused_memmap_node(int node, struct meminfo *mi)
{
unsigned long bank_start, prev_bank_end = 0;
unsigned int i;
/*
* [FIXME] This relies on each bank being in address order. This
* may not be the case, especially if the user has provided the
* information on the command line.
*/
for (i = 0; i < mi->nr_banks; i++) {
if (mi->bank[i].size == 0 || mi->bank[i].node != node)
continue;
bank_start = mi->bank[i].start >> PAGE_SHIFT;
if (bank_start < prev_bank_end) {
printk(KERN_ERR "MEM: unordered memory banks. "
"Not freeing memmap.\n");
break;
}
/*
* If we had a previous bank, and there is a space
* between the current bank and the previous, free it.
*/
if (prev_bank_end && prev_bank_end != bank_start)
free_memmap(node, prev_bank_end, bank_start);
prev_bank_end = (mi->bank[i].start +
mi->bank[i].size) >> PAGE_SHIFT;
}
}
/*
* mem_init() marks the free areas in the mem_map and tells us how much
* memory is free. This is done after various parts of the system have
......@@ -540,16 +603,12 @@ void __init mem_init(void)
max_mapnr = virt_to_page(high_memory) - mem_map;
#endif
/*
* We may have non-contiguous memory.
*/
if (meminfo.nr_banks != 1)
create_memmap_holes(&meminfo);
/* this will put all unused low memory onto the freelists */
for_each_online_node(node) {
pg_data_t *pgdat = NODE_DATA(node);
free_unused_memmap_node(node, &meminfo);
if (pgdat->node_spanned_pages != 0)
totalram_pages += free_all_bootmem_node(pgdat);
}
......
......@@ -169,7 +169,14 @@ pgd_t *get_pgd_slow(struct mm_struct *mm)
memzero(new_pgd, FIRST_KERNEL_PGD_NR * sizeof(pgd_t));
/*
* Copy over the kernel and IO PGD entries
*/
init_pgd = pgd_offset_k(0);
memcpy(new_pgd + FIRST_KERNEL_PGD_NR, init_pgd + FIRST_KERNEL_PGD_NR,
(PTRS_PER_PGD - FIRST_KERNEL_PGD_NR) * sizeof(pgd_t));
clean_dcache_area(new_pgd, PTRS_PER_PGD * sizeof(pgd_t));
if (!vectors_high()) {
/*
......@@ -198,14 +205,6 @@ pgd_t *get_pgd_slow(struct mm_struct *mm)
spin_unlock(&mm->page_table_lock);
}
/*
* Copy over the kernel and IO PGD entries
*/
memcpy(new_pgd + FIRST_KERNEL_PGD_NR, init_pgd + FIRST_KERNEL_PGD_NR,
(PTRS_PER_PGD - FIRST_KERNEL_PGD_NR) * sizeof(pgd_t));
clean_dcache_area(new_pgd, PTRS_PER_PGD * sizeof(pgd_t));
return new_pgd;
no_pte:
......@@ -698,75 +697,3 @@ void __init iotable_init(struct map_desc *io_desc, int nr)
for (i = 0; i < nr; i++)
create_mapping(io_desc + i);
}
static inline void
free_memmap(int node, unsigned long start_pfn, unsigned long end_pfn)
{
struct page *start_pg, *end_pg;
unsigned long pg, pgend;
/*
* Convert start_pfn/end_pfn to a struct page pointer.
*/
start_pg = pfn_to_page(start_pfn);
end_pg = pfn_to_page(end_pfn);
/*
* Convert to physical addresses, and
* round start upwards and end downwards.
*/
pg = PAGE_ALIGN(__pa(start_pg));
pgend = __pa(end_pg) & PAGE_MASK;
/*
* If there are free pages between these,
* free the section of the memmap array.
*/
if (pg < pgend)
free_bootmem_node(NODE_DATA(node), pg, pgend - pg);
}
static inline void free_unused_memmap_node(int node, struct meminfo *mi)
{
unsigned long bank_start, prev_bank_end = 0;
unsigned int i;
/*
* [FIXME] This relies on each bank being in address order. This
* may not be the case, especially if the user has provided the
* information on the command line.
*/
for (i = 0; i < mi->nr_banks; i++) {
if (mi->bank[i].size == 0 || mi->bank[i].node != node)
continue;
bank_start = mi->bank[i].start >> PAGE_SHIFT;
if (bank_start < prev_bank_end) {
printk(KERN_ERR "MEM: unordered memory banks. "
"Not freeing memmap.\n");
break;
}
/*
* If we had a previous bank, and there is a space
* between the current bank and the previous, free it.
*/
if (prev_bank_end && prev_bank_end != bank_start)
free_memmap(node, prev_bank_end, bank_start);
prev_bank_end = PAGE_ALIGN(mi->bank[i].start +
mi->bank[i].size) >> PAGE_SHIFT;
}
}
/*
* The mem_map array can get very big. Free
* the unused area of the memory map.
*/
void __init create_memmap_holes(struct meminfo *mi)
{
int node;
for_each_online_node(node)
free_unused_memmap_node(node, mi);
}
......@@ -6,7 +6,7 @@
# To add an entry into this database, please see Documentation/arm/README,
# or contact rmk@arm.linux.org.uk
#
# Last update: Thu Mar 24 14:34:50 2005
# Last update: Thu Jun 23 20:19:33 2005
#
# machine_is_xxx CONFIG_xxxx MACH_TYPE_xxx number
#
......@@ -243,7 +243,7 @@ yoho ARCH_YOHO YOHO 231
jasper ARCH_JASPER JASPER 232
dsc25 ARCH_DSC25 DSC25 233
omap_innovator MACH_OMAP_INNOVATOR OMAP_INNOVATOR 234
ramses ARCH_RAMSES RAMSES 235
mnci ARCH_RAMSES RAMSES 235
s28x ARCH_S28X S28X 236
mport3 ARCH_MPORT3 MPORT3 237
pxa_eagle250 ARCH_PXA_EAGLE250 PXA_EAGLE250 238
......@@ -323,7 +323,7 @@ nimbra29x ARCH_NIMBRA29X NIMBRA29X 311
nimbra210 ARCH_NIMBRA210 NIMBRA210 312
hhp_d95xx ARCH_HHP_D95XX HHP_D95XX 313
labarm ARCH_LABARM LABARM 314
m825xx ARCH_M825XX M825XX 315
comcerto ARCH_M825XX M825XX 315
m7100 SA1100_M7100 M7100 316
nipc2 ARCH_NIPC2 NIPC2 317
fu7202 ARCH_FU7202 FU7202 318
......@@ -724,3 +724,66 @@ lpc22xx MACH_LPC22XX LPC22XX 715
omap_comet3 MACH_COMET3 COMET3 716
omap_comet4 MACH_COMET4 COMET4 717
csb625 MACH_CSB625 CSB625 718
fortunet2 MACH_FORTUNET2 FORTUNET2 719
s5h2200 MACH_S5H2200 S5H2200 720
optorm920 MACH_OPTORM920 OPTORM920 721
adsbitsyxb MACH_ADSBITSYXB ADSBITSYXB 722
adssphere MACH_ADSSPHERE ADSSPHERE 723
adsportal MACH_ADSPORTAL ADSPORTAL 724
ln2410sbc MACH_LN2410SBC LN2410SBC 725
cb3rufc MACH_CB3RUFC CB3RUFC 726
mp2usb MACH_MP2USB MP2USB 727
ntnp425c MACH_NTNP425C NTNP425C 728
colibri MACH_COLIBRI COLIBRI 729
pcm7220 MACH_PCM7220 PCM7220 730
gateway7001 MACH_GATEWAY7001 GATEWAY7001 731
pcm027 MACH_PCM027 PCM027 732
cmpxa MACH_CMPXA CMPXA 733
anubis MACH_ANUBIS ANUBIS 734
ite8152 MACH_ITE8152 ITE8152 735
lpc3xxx MACH_LPC3XXX LPC3XXX 736
puppeteer MACH_PUPPETEER PUPPETEER 737
vt001 MACH_MACH_VADATECH MACH_VADATECH 738
e570 MACH_E570 E570 739
x50 MACH_X50 X50 740
recon MACH_RECON RECON 741
xboardgp8 MACH_XBOARDGP8 XBOARDGP8 742
fpic2 MACH_FPIC2 FPIC2 743
akita MACH_AKITA AKITA 744
a81 MACH_A81 A81 745
svm_sc25x MACH_SVM_SC25X SVM_SC25X 746
vt020 MACH_VADATECH020 VADATECH020 747
tli MACH_TLI TLI 748
edb9315lc MACH_EDB9315LC EDB9315LC 749
passec MACH_PASSEC PASSEC 750
ds_tiger MACH_DS_TIGER DS_TIGER 751
e310 MACH_E310 E310 752
e330 MACH_E330 E330 753
rt3000 MACH_RT3000 RT3000 754
nokia770 MACH_NOKIA770 NOKIA770 755
pnx0106 MACH_PNX0106 PNX0106 756
hx21xx MACH_HX21XX HX21XX 757
faraday MACH_FARADAY FARADAY 758
sbc9312 MACH_SBC9312 SBC9312 759
batman MACH_BATMAN BATMAN 760
jpd201 MACH_JPD201 JPD201 761
mipsa MACH_MIPSA MIPSA 762
kacom MACH_KACOM KACOM 763
swarcocpu MACH_SWARCOCPU SWARCOCPU 764
swarcodsl MACH_SWARCODSL SWARCODSL 765
blueangel MACH_BLUEANGEL BLUEANGEL 766
hairygrama MACH_HAIRYGRAMA HAIRYGRAMA 767
banff MACH_BANFF BANFF 768
carmeva MACH_CARMEVA CARMEVA 769
sam255 MACH_SAM255 SAM255 770
ppm10 MACH_PPM10 PPM10 771
edb9315a MACH_EDB9315A EDB9315A 772
sunset MACH_SUNSET SUNSET 773
stargate2 MACH_STARGATE2 STARGATE2 774
intelmote2 MACH_INTELMOTE2 INTELMOTE2 775
trizeps4 MACH_TRIZEPS4 TRIZEPS4 776
mainstone2 MACH_MAINSTONE2 MAINSTONE2 777
ez_ixp42x MACH_EZ_IXP42X EZ_IXP42X 778
tapwave_zodiac MACH_TAPWAVE_ZODIAC TAPWAVE_ZODIAC 779
universalmeter MACH_UNIVERSALMETER UNIVERSALMETER 780
hicoarm9 MACH_HICOARM9 HICOARM9 781
......@@ -127,48 +127,23 @@ static inline void prepare_singlestep(struct kprobe *p, struct pt_regs *regs)
regs->eip = (unsigned long)&p->ainsn.insn;
}
struct task_struct *arch_get_kprobe_task(void *ptr)
{
return ((struct thread_info *) (((unsigned long) ptr) &
(~(THREAD_SIZE -1))))->task;
}
void arch_prepare_kretprobe(struct kretprobe *rp, struct pt_regs *regs)
{
unsigned long *sara = (unsigned long *)&regs->esp;
struct kretprobe_instance *ri;
static void *orig_ret_addr;
struct kretprobe_instance *ri;
if ((ri = get_free_rp_inst(rp)) != NULL) {
ri->rp = rp;
ri->task = current;
ri->ret_addr = (kprobe_opcode_t *) *sara;
/*
* Save the return address when the return probe hits
* the first time, and use it to populate the (krprobe
* instance)->ret_addr for subsequent return probes at
* the same addrress since stack address would have
* the kretprobe_trampoline by then.
*/
if (((void*) *sara) != kretprobe_trampoline)
orig_ret_addr = (void*) *sara;
if ((ri = get_free_rp_inst(rp)) != NULL) {
ri->rp = rp;
ri->stack_addr = sara;
ri->ret_addr = orig_ret_addr;
add_rp_inst(ri);
/* Replace the return addr with trampoline addr */
*sara = (unsigned long) &kretprobe_trampoline;
} else {
rp->nmissed++;
}
}
void arch_kprobe_flush_task(struct task_struct *tk)
{
struct kretprobe_instance *ri;
while ((ri = get_rp_inst_tsk(tk)) != NULL) {
*((unsigned long *)(ri->stack_addr)) =
(unsigned long) ri->ret_addr;
recycle_rp_inst(ri);
}
add_rp_inst(ri);
} else {
rp->nmissed++;
}
}
/*
......@@ -286,36 +261,59 @@ static int kprobe_handler(struct pt_regs *regs)
*/
int trampoline_probe_handler(struct kprobe *p, struct pt_regs *regs)
{
struct task_struct *tsk;
struct kretprobe_instance *ri;
struct hlist_head *head;
struct hlist_node *node;
unsigned long *sara = ((unsigned long *) &regs->esp) - 1;
tsk = arch_get_kprobe_task(sara);
head = kretprobe_inst_table_head(tsk);
hlist_for_each_entry(ri, node, head, hlist) {
if (ri->stack_addr == sara && ri->rp) {
if (ri->rp->handler)
ri->rp->handler(ri, regs);
}
}
return 0;
}
struct kretprobe_instance *ri = NULL;
struct hlist_head *head;
struct hlist_node *node, *tmp;
unsigned long orig_ret_address = 0;
unsigned long trampoline_address =(unsigned long)&kretprobe_trampoline;
void trampoline_post_handler(struct kprobe *p, struct pt_regs *regs,
unsigned long flags)
{
struct kretprobe_instance *ri;
/* RA already popped */
unsigned long *sara = ((unsigned long *)&regs->esp) - 1;
head = kretprobe_inst_table_head(current);
while ((ri = get_rp_inst(sara))) {
regs->eip = (unsigned long)ri->ret_addr;
/*
* It is possible to have multiple instances associated with a given
* task either because an multiple functions in the call path
* have a return probe installed on them, and/or more then one return
* return probe was registered for a target function.
*
* We can handle this because:
* - instances are always inserted at the head of the list
* - when multiple return probes are registered for the same
* function, the first instance's ret_addr will point to the
* real return address, and all the rest will point to
* kretprobe_trampoline
*/
hlist_for_each_entry_safe(ri, node, tmp, head, hlist) {
if (ri->task != current)
/* another task is sharing our hash bucket */
continue;
if (ri->rp && ri->rp->handler)
ri->rp->handler(ri, regs);
orig_ret_address = (unsigned long)ri->ret_addr;
recycle_rp_inst(ri);
if (orig_ret_address != trampoline_address)
/*
* This is the real return address. Any other
* instances associated with this task are for
* other calls deeper on the call stack
*/
break;
}
regs->eflags &= ~TF_MASK;
BUG_ON(!orig_ret_address || (orig_ret_address == trampoline_address));
regs->eip = orig_ret_address;
unlock_kprobes();
preempt_enable_no_resched();
/*
* By returning a non-zero value, we are telling
* kprobe_handler() that we have handled unlocking
* and re-enabling preemption.
*/
return 1;
}
/*
......@@ -403,8 +401,7 @@ static inline int post_kprobe_handler(struct pt_regs *regs)
current_kprobe->post_handler(current_kprobe, regs, 0);
}
if (current_kprobe->post_handler != trampoline_post_handler)
resume_execution(current_kprobe, regs);
resume_execution(current_kprobe, regs);
regs->eflags |= kprobe_saved_eflags;
/*Restore back the original saved kprobes variables and continue. */
......@@ -534,3 +531,13 @@ int longjmp_break_handler(struct kprobe *p, struct pt_regs *regs)
}
return 0;
}
static struct kprobe trampoline_p = {
.addr = (kprobe_opcode_t *) &kretprobe_trampoline,
.pre_handler = trampoline_probe_handler
};
int __init arch_init(void)
{
return register_kprobe(&trampoline_p);
}
......@@ -616,6 +616,33 @@ handle_io_bitmap(struct thread_struct *next, struct tss_struct *tss)
tss->io_bitmap_base = INVALID_IO_BITMAP_OFFSET_LAZY;
}
/*
* This function selects if the context switch from prev to next
* has to tweak the TSC disable bit in the cr4.
*/
static inline void disable_tsc(struct task_struct *prev_p,
struct task_struct *next_p)
{
struct thread_info *prev, *next;
/*
* gcc should eliminate the ->thread_info dereference if
* has_secure_computing returns 0 at compile time (SECCOMP=n).
*/
prev = prev_p->thread_info;
next = next_p->thread_info;
if (has_secure_computing(prev) || has_secure_computing(next)) {
/* slow path here */
if (has_secure_computing(prev) &&
!has_secure_computing(next)) {
write_cr4(read_cr4() & ~X86_CR4_TSD);
} else if (!has_secure_computing(prev) &&
has_secure_computing(next))
write_cr4(read_cr4() | X86_CR4_TSD);
}
}
/*
* switch_to(x,yn) should switch tasks from x to y.
*
......@@ -695,6 +722,8 @@ struct task_struct fastcall * __switch_to(struct task_struct *prev_p, struct tas
if (unlikely(prev->io_bitmap_ptr || next->io_bitmap_ptr))
handle_io_bitmap(next, tss);
disable_tsc(prev_p, next_p);
return prev_p;
}
......
......@@ -289,3 +289,5 @@ ENTRY(sys_call_table)
.long sys_add_key
.long sys_request_key
.long sys_keyctl
.long sys_ioprio_set
.long sys_ioprio_get /* 290 */
......@@ -1577,8 +1577,8 @@ sys_call_table:
data8 sys_add_key
data8 sys_request_key
data8 sys_keyctl
data8 sys_ni_syscall
data8 sys_ni_syscall // 1275
data8 sys_ioprio_set
data8 sys_ioprio_get // 1275
data8 sys_set_zone_reclaim
data8 sys_ni_syscall
data8 sys_ni_syscall
......
......@@ -34,6 +34,7 @@
#include <asm/pgtable.h>
#include <asm/kdebug.h>
#include <asm/sections.h>
extern void jprobe_inst_return(void);
......@@ -263,13 +264,33 @@ static inline void get_kprobe_inst(bundle_t *bundle, uint slot,
}
}
/* Returns non-zero if the addr is in the Interrupt Vector Table */
static inline int in_ivt_functions(unsigned long addr)
{
return (addr >= (unsigned long)__start_ivt_text
&& addr < (unsigned long)__end_ivt_text);
}
static int valid_kprobe_addr(int template, int slot, unsigned long addr)
{
if ((slot > 2) || ((bundle_encoding[template][1] == L) && slot > 1)) {
printk(KERN_WARNING "Attempting to insert unaligned kprobe at 0x%lx\n",
addr);
printk(KERN_WARNING "Attempting to insert unaligned kprobe "
"at 0x%lx\n", addr);
return -EINVAL;
}
if (in_ivt_functions(addr)) {
printk(KERN_WARNING "Kprobes can't be inserted inside "
"IVT functions at 0x%lx\n", addr);
return -EINVAL;
}
if (slot == 1 && bundle_encoding[template][1] != L) {
printk(KERN_WARNING "Inserting kprobes on slot #1 "
"is not supported\n");
return -EINVAL;
}
return 0;
}
......@@ -290,6 +311,94 @@ static inline void set_current_kprobe(struct kprobe *p)
current_kprobe = p;
}
static void kretprobe_trampoline(void)
{
}
/*
* At this point the target function has been tricked into
* returning into our trampoline. Lookup the associated instance
* and then:
* - call the handler function
* - cleanup by marking the instance as unused
* - long jump back to the original return address
*/
int trampoline_probe_handler(struct kprobe *p, struct pt_regs *regs)
{
struct kretprobe_instance *ri = NULL;
struct hlist_head *head;
struct hlist_node *node, *tmp;
unsigned long orig_ret_address = 0;
unsigned long trampoline_address =
((struct fnptr *)kretprobe_trampoline)->ip;
head = kretprobe_inst_table_head(current);
/*
* It is possible to have multiple instances associated with a given
* task either because an multiple functions in the call path
* have a return probe installed on them, and/or more then one return
* return probe was registered for a target function.
*
* We can handle this because:
* - instances are always inserted at the head of the list
* - when multiple return probes are registered for the same
* function, the first instance's ret_addr will point to the
* real return address, and all the rest will point to
* kretprobe_trampoline
*/
hlist_for_each_entry_safe(ri, node, tmp, head, hlist) {
if (ri->task != current)
/* another task is sharing our hash bucket */
continue;
if (ri->rp && ri->rp->handler)
ri->rp->handler(ri, regs);
orig_ret_address = (unsigned long)ri->ret_addr;
recycle_rp_inst(ri);
if (orig_ret_address != trampoline_address)
/*
* This is the real return address. Any other
* instances associated with this task are for
* other calls deeper on the call stack
*/
break;
}
BUG_ON(!orig_ret_address || (orig_ret_address == trampoline_address));
regs->cr_iip = orig_ret_address;
unlock_kprobes();
preempt_enable_no_resched();
/*
* By returning a non-zero value, we are telling
* kprobe_handler() that we have handled unlocking
* and re-enabling preemption.
*/
return 1;
}
void arch_prepare_kretprobe(struct kretprobe *rp, struct pt_regs *regs)
{
struct kretprobe_instance *ri;
if ((ri = get_free_rp_inst(rp)) != NULL) {
ri->rp = rp;
ri->task = current;
ri->ret_addr = (kprobe_opcode_t *)regs->b0;
/* Replace the return addr with trampoline addr */
regs->b0 = ((struct fnptr *)kretprobe_trampoline)->ip;
add_rp_inst(ri);
} else {
rp->nmissed++;
}
}
int arch_prepare_kprobe(struct kprobe *p)
{
unsigned long addr = (unsigned long) p->addr;
......@@ -492,8 +601,8 @@ static int pre_kprobes_handler(struct die_args *args)
if (p->pre_handler && p->pre_handler(p, regs))
/*
* Our pre-handler is specifically requesting that we just
* do a return. This is handling the case where the
* pre-handler is really our special jprobe pre-handler.
* do a return. This is used for both the jprobe pre-handler
* and the kretprobe trampoline
*/
return 1;
......@@ -599,3 +708,14 @@ int longjmp_break_handler(struct kprobe *p, struct pt_regs *regs)
*regs = jprobe_saved_regs;
return 1;
}
static struct kprobe trampoline_p = {
.pre_handler = trampoline_probe_handler
};
int __init arch_init(void)
{
trampoline_p.addr =
(kprobe_opcode_t *)((struct fnptr *)kretprobe_trampoline)->ip;
return register_kprobe(&trampoline_p);
}
......@@ -27,6 +27,7 @@
#include <linux/efi.h>
#include <linux/interrupt.h>
#include <linux/delay.h>
#include <linux/kprobes.h>
#include <asm/cpu.h>
#include <asm/delay.h>
......@@ -707,6 +708,13 @@ kernel_thread_helper (int (*fn)(void *), void *arg)
void
flush_thread (void)
{
/*
* Remove function-return probe instances associated with this task
* and put them back on the free list. Do not insert an exit probe for
* this function, it will be disabled by kprobe_flush_task if you do.
*/
kprobe_flush_task(current);
/* drop floating-point and debug-register state if it exists: */
current->thread.flags &= ~(IA64_THREAD_FPH_VALID | IA64_THREAD_DBG_VALID);
ia64_drop_fpu(current);
......@@ -721,6 +729,14 @@ flush_thread (void)
void
exit_thread (void)
{
/*
* Remove function-return probe instances associated with this task
* and put them back on the free list. Do not insert an exit probe for
* this function, it will be disabled by kprobe_flush_task if you do.
*/
kprobe_flush_task(current);
ia64_drop_fpu(current);
#ifdef CONFIG_PERFMON
/* if needed, stop monitoring and flush state to perfmon context */
......
Markdown is supported
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment