310508 |
24-Dec-2016 |
avg |
define Maxmem for ia64, the only platform that didn't have it
This is a direct commit to stable/10 as the platform was removed in the newer branches.
Maxmem is required for compiling fwohci(4) on ia64 since commit r310081, MFC of r277511. It was easier to add Maxmem than to make a special case for ia64 in fwohci.
Reported by: jhb, gjb Discussed with: kib, jhb |
292348 |
16-Dec-2015 |
ken |
MFC r291716, r291724, r291741, r291742
In addition to those revisions, add this change to a file that is not in head:
sys/ia64/include/bus.h: Guard kernel-only parts of the ia64 machine/bus.h header with #ifdef _KERNEL.
This allows userland programs to include <machine/bus.h> to get the definition of bus_addr_t and bus_size_t.
------------------------------------------------------------------------ r291716 | ken | 2015-12-03 15:54:55 -0500 (Thu, 03 Dec 2015) | 257 lines
Add asynchronous command support to the pass(4) driver, and the new camdd(8) utility.
CCBs may be queued to the driver via the new CAMIOQUEUE ioctl, and completed CCBs may be retrieved via the CAMIOGET ioctl. User processes can use poll(2) or kevent(2) to get notification when I/O has completed.
While the existing CAMIOCOMMAND blocking ioctl interface only supports user virtual data pointers in a CCB (generally only one per CCB), the new CAMIOQUEUE ioctl supports user virtual and physical address pointers, as well as user virtual and physical scatter/gather lists. This allows user applications to have more flexibility in their data handling operations.
Kernel memory for data transferred via the queued interface is allocated from the zone allocator in MAXPHYS sized chunks, and user data is copied in and out. This is likely faster than the vmapbuf()/vunmapbuf() method used by the CAMIOCOMMAND ioctl in configurations with many processors (there are more TLB shootdowns caused by the mapping/unmapping operation) but may not be as fast as running with unmapped I/O.
The new memory handling model for user requests also allows applications to send CCBs with request sizes that are larger than MAXPHYS. The pass(4) driver now limits queued requests to the I/O size listed by the SIM driver in the maxio field in the Path Inquiry (XPT_PATH_INQ) CCB.
There are some things things would be good to add:
1. Come up with a way to do unmapped I/O on multiple buffers. Currently the unmapped I/O interface operates on a struct bio, which includes only one address and length. It would be nice to be able to send an unmapped scatter/gather list down to busdma. This would allow eliminating the copy we currently do for data.
2. Add an ioctl to list currently outstanding CCBs in the various queues.
3. Add an ioctl to cancel a request, or use the XPT_ABORT CCB to do that.
4. Test physical address support. Virtual pointers and scatter gather lists have been tested, but I have not yet tested physical addresses or scatter/gather lists.
5. Investigate multiple queue support. At the moment there is one queue of commands per pass(4) device. If multiple processes open the device, they will submit I/O into the same queue and get events for the same completions. This is probably the right model for most applications, but it is something that could be changed later on.
Also, add a new utility, camdd(8) that uses the asynchronous pass(4) driver interface.
This utility is intended to be a basic data transfer/copy utility, a simple benchmark utility, and an example of how to use the asynchronous pass(4) interface.
It can copy data to and from pass(4) devices using any target queue depth, starting offset and blocksize for the input and ouptut devices. It currently only supports SCSI devices, but could be easily extended to support ATA devices.
It can also copy data to and from regular files, block devices, tape devices, pipes, stdin, and stdout. It does not support queueing multiple commands to any of those targets, since it uses the standard read(2)/write(2)/writev(2)/readv(2) system calls.
The I/O is done by two threads, one for the reader and one for the writer. The reader thread sends completed read requests to the writer thread in strictly sequential order, even if they complete out of order. That could be modified later on for random I/O patterns or slightly out of order I/O.
camdd(8) uses kqueue(2)/kevent(2) to get I/O completion events from the pass(4) driver and also to send request notifications internally.
For pass(4) devcies, camdd(8) uses a single buffer (CAM_DATA_VADDR) per CAM CCB on the reading side, and a scatter/gather list (CAM_DATA_SG) on the writing side. In addition to testing both interfaces, this makes any potential reblocking of I/O easier. No data is copied between the reader and the writer, but rather the reader's buffers are split into multiple I/O requests or combined into a single I/O request depending on the input and output blocksize.
For the file I/O path, camdd(8) also uses a single buffer (read(2), write(2), pread(2) or pwrite(2)) on reads, and a scatter/gather list (readv(2), writev(2), preadv(2), pwritev(2)) on writes.
Things that would be nice to do for camdd(8) eventually:
1. Add support for I/O pattern generation. Patterns like all zeros, all ones, LBA-based patterns, random patterns, etc. Right Now you can always use /dev/zero, /dev/random, etc.
2. Add support for a "sink" mode, so we do only reads with no writes. Right now, you can use /dev/null.
3. Add support for automatic queue depth probing, so that we can figure out the right queue depth on the input and output side for maximum throughput. At the moment it defaults to 6.
4. Add support for SATA device passthrough I/O.
5. Add support for random LBAs and/or lengths on the input and output sides.
6. Track average per-I/O latency and busy time. The busy time and latency could also feed in to the automatic queue depth determination.
sys/cam/scsi/scsi_pass.h: Define two new ioctls, CAMIOQUEUE and CAMIOGET, that queue and fetch asynchronous CAM CCBs respectively.
Although these ioctls do not have a declared argument, they both take a union ccb pointer. If we declare a size here, the ioctl code in sys/kern/sys_generic.c will malloc and free a buffer for either the CCB or the CCB pointer (depending on how it is declared). Since we have to keep a copy of the CCB (which is fairly large) anyway, having the ioctl malloc and free a CCB for each call is wasteful.
sys/cam/scsi/scsi_pass.c: Add asynchronous CCB support.
Add two new ioctls, CAMIOQUEUE and CAMIOGET.
CAMIOQUEUE adds a CCB to the incoming queue. The CCB is executed immediately (and moved to the active queue) if it is an immediate CCB, but otherwise it will be executed in passstart() when a CCB is available from the transport layer.
When CCBs are completed (because they are immediate or passdone() if they are queued), they are put on the done queue.
If we get the final close on the device before all pending I/O is complete, all active I/O is moved to the abandoned queue and we increment the peripheral reference count so that the peripheral driver instance doesn't go away before all pending I/O is done.
The new passcreatezone() function is called on the first call to the CAMIOQUEUE ioctl on a given device to allocate the UMA zones for I/O requests and S/G list buffers. This may be good to move off to a taskqueue at some point. The new passmemsetup() function allocates memory and scatter/gather lists to hold the user's data, and copies in any data that needs to be written. For virtual pointers (CAM_DATA_VADDR), the kernel buffer is malloced from the new pass(4) driver malloc bucket. For virtual scatter/gather lists (CAM_DATA_SG), buffers are allocated from a new per-pass(9) UMA zone in MAXPHYS-sized chunks. Physical pointers are passed in unchanged. We have support for up to 16 scatter/gather segments (for the user and kernel S/G lists) in the default struct pass_io_req, so requests with longer S/G lists require an extra kernel malloc.
The new passcopysglist() function copies a user scatter/gather list to a kernel scatter/gather list. The number of elements in each list may be different, but (obviously) the amount of data stored has to be identical.
The new passmemdone() function copies data out for the CAM_DATA_VADDR and CAM_DATA_SG cases.
The new passiocleanup() function restores data pointers in user CCBs and frees memory.
Add new functions to support kqueue(2)/kevent(2):
passreadfilt() tells kevent whether or not the done queue is empty.
passkqfilter() adds a knote to our list.
passreadfiltdetach() removes a knote from our list.
Add a new function, passpoll(), for poll(2)/select(2) to use.
Add devstat(9) support for the queued CCB path.
sys/cam/ata/ata_da.c: Add support for the BIO_VLIST bio type.
sys/cam/cam_ccb.h: Add a new enumeration for the xflags field in the CCB header. (This doesn't change the CCB header, just adds an enumeration to use.)
sys/cam/cam_xpt.c: Add a new function, xpt_setup_ccb_flags(), that allows specifying CCB flags.
sys/cam/cam_xpt.h: Add a prototype for xpt_setup_ccb_flags().
sys/cam/scsi/scsi_da.c: Add support for BIO_VLIST.
sys/dev/md/md.c: Add BIO_VLIST support to md(4).
sys/geom/geom_disk.c: Add BIO_VLIST support to the GEOM disk class. Re-factor the I/O size limiting code in g_disk_start() a bit.
sys/kern/subr_bus_dma.c: Change _bus_dmamap_load_vlist() to take a starting offset and length.
Add a new function, _bus_dmamap_load_pages(), that will load a list of physical pages starting at an offset.
Update _bus_dmamap_load_bio() to allow loading BIO_VLIST bios. Allow unmapped I/O to start at an offset.
sys/kern/subr_uio.c: Add two new functions, physcopyin_vlist() and physcopyout_vlist().
sys/pc98/include/bus.h: Guard kernel-only parts of the pc98 machine/bus.h header with #ifdef _KERNEL.
This allows userland programs to include <machine/bus.h> to get the definition of bus_addr_t and bus_size_t.
sys/sys/bio.h: Add a new bio flag, BIO_VLIST.
sys/sys/uio.h: Add prototypes for physcopyin_vlist() and physcopyout_vlist().
share/man/man4/pass.4: Document the CAMIOQUEUE and CAMIOGET ioctls.
usr.sbin/Makefile: Add camdd.
usr.sbin/camdd/Makefile: Add a makefile for camdd(8).
usr.sbin/camdd/camdd.8: Man page for camdd(8).
usr.sbin/camdd/camdd.c: The new camdd(8) utility.
Sponsored by: Spectra Logic
------------------------------------------------------------------------ r291724 | ken | 2015-12-03 17:07:01 -0500 (Thu, 03 Dec 2015) | 6 lines
Fix typos in the camdd(8) usage() function output caused by an error in my diff filter script.
Sponsored by: Spectra Logic
------------------------------------------------------------------------ r291741 | ken | 2015-12-03 22:38:35 -0500 (Thu, 03 Dec 2015) | 10 lines
Fix g_disk_vlist_limit() to work properly with deletes.
Add a new bp argument to g_disk_maxsegs(), and add a new function, g_disk_maxsize() tha will properly determine the maximum I/O size for a delete or non-delete bio.
Submitted by: will Sponsored by: Spectra Logic
------------------------------------------------------------------------ ------------------------------------------------------------------------ r291742 | ken | 2015-12-03 22:44:12 -0500 (Thu, 03 Dec 2015) | 5 lines
Fix a style issue in g_disk_limit().
Noticed by: bdrewery
------------------------------------------------------------------------
Sponsored by: Spectra Logic |
288287 |
27-Sep-2015 |
kib |
MFC r288000: Add support for weak symbols to the kernel linkers. |
287945 |
17-Sep-2015 |
rstone |
MFC r280957
Fix integer truncation bug in malloc(9)
A couple of internal functions used by malloc(9) and uma truncated a size_t down to an int. This could cause any number of issues (e.g. indefinite sleeps, memory corruption) if any kernel subsystem tried to allocate 2GB or more through malloc. zfs would attempt such an allocation when run on a system with 2TB or more of RAM. |
283910 |
02-Jun-2015 |
jhb |
MFC 281266: Move the 32-bit compatible procfs types from freebsd32.h to <sys/procfs.h> and export them to userland. - Define __HAVE_REG32 on platforms that define a reg32 structure and check for this in <sys/procfs.h> to control when to export prstatus32, etc. - Add prstatus32_t and prpsinfo32_t typedefs for the 32-bit structures. libbfd looks for these types, and having them fixes 'gcore' in gdb of a 32-bit process on a 64-bit platform. - Use the structure definitions from <sys/procfs.h> in gcore's elf32 core dump code instead of duplicating the definitions. |
282506 |
05-May-2015 |
hselasky |
MFC r282120: The add_bounce_page() function can be called when loading physical pages which pass a NULL virtual address. If the BUS_DMA_KEEP_PG_OFFSET flag is set, use the physical address to compute the page offset instead. The physical address should always be valid when adding bounce pages and should contain the same page offset like the virtual address.
Submitted by: Svatopluk Kraus <onwahe@gmail.com> Reviewed by: jhb@ |
278412 |
08-Feb-2015 |
peter |
Repair ia64 build after r278347 - remove const from set_mcontext |
274648 |
18-Nov-2014 |
kib |
Merge the fueword(9) and casueword(9). In particular,
MFC r273783: Add fueword(9) and casueword(9) functions. MFC note: ia64 is handled like arm, with NO_FUEWORD define.
MFC r273784: Replace some calls to fuword() by fueword() with proper error checking.
MFC r273785: Convert kern_umtx.c to use fueword() and casueword(). MFC note: the sys__umtx_lock and sys__umtx_unlock syscalls are not converted, they are removed from HEAD, and not used. The do_sem2*() family is not yet merged to stable/10, corresponding chunk will be merged after do_sem2* are committed.
MFC r273788 (by jkim): Actually install casuword(9) to fix build.
MFC r273911: Add type qualifier volatile to the base (userspace) address argument of fuword(9) and suword(9). |
271998 |
22-Sep-2014 |
marcel |
Make sure all memory updates are made visible before we let go of the thread in cpu_switch(). It's otherwise possible that on another CPU the thread continues from stale context data.
Note that this is prominent on newer CPUs, like the Montecito, that really take advantage of the weak memory ordering. First generation Itanium 2 is not that aggressive and does not need this.
This is a direct commit to stable/10.
Approved by: re@ (gjb) |
271239 |
07-Sep-2014 |
marcel |
Fix previous commit: unbreak build of libkvm by including sys/systm.h only when _KERNEL is defined.
Approved by: re@ (implicit) |
271211 |
06-Sep-2014 |
marcel |
Fix the PCPU access macros. It was found that the PCPU pointer, when held in register r13, is used outside the bounds of critical_enter() and critical_exit() by virtue of optimizations performed by the compiler. The net effect being that address computations of fields in the PCPU structure could be relative to the PCPU structure of the CPU on which the address computation was performed and not related to the CPU that executes the actual load or store operation. The typical failure mode being that the per-CPU cache of UMA got corrupted due to accesses from other CPUs.
Adding more volatile decorating to the register expression does not help. The thinking being that volatile is assumed to work on memory references and not register references. Thus, the fix is to perform the address computation using a volatile inline assembly statement.
Additionally, since the reference is fundamentally non-atomic on ia64 by virtue of have a distinct address computation followed by the actual load or store operation, it is required to wrap the entire PCPU access in a critical section.
With PCPU_GET and friends requiring curthread now that they're in a critical section, low-level use of these macros in functions like cpu_switch() is not possible anymore. Consequently, a second order set of changes is needed to avoid using PCPU_GET and friends where curthread is either not set yet, or in the process of being changed. In those cases, explicit dereferencing of pcpup is needed. In those cases it is also possible to do that.
This is a direct commit to stable/10.
Approved by: re@ (marius) |
270920 |
01-Sep-2014 |
kib |
Fix a leak of the wired pages when unwiring of the PROT_NONE-mapped wired region. Rework the handling of unwire to do the it in batch, both at pmap and object level.
All commits below are by alc.
MFC r268327: Introduce pmap_unwire().
MFC r268591: Implement pmap_unwire() for powerpc.
MFC r268776: Implement pmap_unwire() for arm.
MFC r268806: pmap_unwire(9) man page.
MFC r269134: When unwiring a region of an address space, do not assume that the underlying physical pages are mapped by the pmap. This fixes a leak of the wired pages on the unwiring of the region mapped with no access allowed.
MFC r269339: In the implementation of the new function pmap_unwire(), the call to MOEA64_PVO_TO_PTE() must be performed before any changes are made to the PVO. Otherwise, MOEA64_PVO_TO_PTE() will panic.
MFC r269365: Correct a long-standing problem in moea{,64}_pvo_enter() that was revealed by the combination of r268591 and r269134: When we attempt to add the wired attribute to an existing mapping, moea{,64}_pvo_enter() do nothing. (They only set the wired attribute on newly created mappings.)
MFC r269433: Handle wiring failures in vm_map_wire() with the new functions pmap_unwire() and vm_object_unwire(). Retire vm_fault_{un,}wire(), since they are no longer used.
MFC r269438: Rewrite a loop in vm_map_wire() so that gcc doesn't think that the variable "rv" is uninitialized.
MFC r269485: Retire pmap_change_wiring().
Reviewed by: alc |
270835 |
30-Aug-2014 |
alc |
Update an assertion to reflect the changes made in r270439. This is a direct commit to stable/10 because ia64 is no longer supported by HEAD.
Reported by: marcel Sponsored by: EMC / Isilon Storage Division |
270573 |
25-Aug-2014 |
marcel |
Make sure the psr field in the trapframe (which holds the value of cr.ipsr) is properly synthesized for the EPC syscall. Properly synthesized in this case means that the bank number (BN bitfield) is set to 1. This is needed because the move-from-PSR instruction does copy all bits! In this case the BN bitfield was not copied.
While normally this is not a problem, because when we leave the kernel via the EPC syscall path again, we don't actually care about the BN bitfield. We restore PSR with a move-to-PSR instruction, which also doesn't cover the BN bitfield.
There is however a scenario where we enter the kernel via the EPC syscall path and leave the kernel via the exception/interrupt path. That path uses the RFI (Return-From-Interrupt) instruction and it restores all bits. What happens in that case is that we don't properly switch to register bank 1 and any exception/interrupt that happens while running in bank 0 clobbers the process' (or kernel's) banked registers. This is because the CPU switches to bank 0 on an exception/interrupt so that there are 16 general registers available for constructing a trapframe and saving the context. Consequently: normal code should always use register bank 1.
This bug has been present since 2003 (11 years) and has been the cause for many "unexplained" kernel panics. It says something about how often we hit this problem on the one hand and how tricky it was to find it.
Many thanks to: clusteradm@ for enabling me to track this down! |
270439 |
24-Aug-2014 |
kib |
Merge the changes to pmap_enter(9) for sleep-less operation (requested by flag). The ia64 pmap.c changes are direct commit, since ia64 is removed on head.
MFC r269368 (by alc): Retire PVO_EXECUTABLE.
MFC r269728: Change pmap_enter(9) interface to take flags parameter and superpage mapping size (currently unused).
MFC r269759 (by alc): Update the text of a KASSERT() to reflect the changes in r269728.
MFC r269822 (by alc): Change {_,}pmap_allocpte() so that they look for the flag PMAP_ENTER_NOSLEEP instead of M_NOWAIT/M_WAITOK when deciding whether to sleep on page table page allocation.
MFC r270151 (by alc): Replace KASSERT that no PV list locks are held with a conditional unlock.
Reviewed by: alc Approved by: re (gjb) Sponsored by: The FreeBSD Foundation |
270296 |
21-Aug-2014 |
emaste |
MFC r263815, r263872:
Move ia64 efi.h to sys in preparation for amd64 UEFI support
Prototypes specific to ia64 have been left in this file for now, under __ia64__, rather than moving them to a new header under sys/ia64. I anticipate that (some of) the corresponding functions will be shared by the amd64, arm64, i386, and ia64 architectures, and we can adjust this as EFI support on other than ia64 continues to develop.
Fix missed efi.h header change in r263815
Sponsored by: The FreeBSD Foundation |
268813 |
17-Jul-2014 |
imp |
MFC r263749,267146:
>r267146 | imp | 2014-06-05 22:08:55 -0600 (Thu, 05 Jun 2014) | 4 lines >Restore comments accidentally removed.
>r263749 | imp | 2014-03-25 16:08:31 -0600 (Tue, 25 Mar 2014) | 18 lines >Rather than require a makeoptions DEBUG to get debug correct, >add it in kern.mk, but only if we're using clang. While this >option is supported by both clang and gcc, in the future there >may be changes to clang which change the defaults that require >a tweak to build our kernel such that other tools in our tree >will work. Set a good example by forcing -gdwarf-2 only for >clang builds, and only if the user hasn't specified another >dwarf level already. Update UPDATING to reflect the changed >state of affairs. This also keeps us from having to update >all the ARM kernels to add this, and also keeps us from >in the future having to update all the MIPS kernels and is >one less place the user will have to know to do something >special for clang and one less thing developers will need >to do when moving an architecture to clang. |
268201 |
03-Jul-2014 |
marcel |
MFC r263380 & r268185: Add KTR events for the PMAP interface functions. |
268200 |
02-Jul-2014 |
marcel |
MFC r263323: Fix and improve exception tracing. |
268199 |
02-Jul-2014 |
marcel |
MFC r263254: Move the implementation of kdb_cpu_trap() from <machine/kdb.h> to machdep.c. |
268198 |
02-Jul-2014 |
marcel |
MFC r263253: Don't use the ITC as the faulting address for external interrupts. |
268195 |
02-Jul-2014 |
marcel |
MFC r263248 & r263257: In intr_event_handle() we already save and set td_intr_frame, so don't do it also in ia64_handle_intr(). |
268194 |
02-Jul-2014 |
marcel |
MFC r262726: When reading physical memory, make sure to access it using the right memory attributes. |
268192 |
02-Jul-2014 |
marcel |
MFC r259959 & r260009: Add prototypical support for minidumps. |
268189 |
02-Jul-2014 |
marcel |
MFC r257484: Change PAL_PTCE_INFO related variables. |
268184 |
02-Jul-2014 |
marcel |
MFC r257477: Purge the translation cache of APs before we unleash them. |
268181 |
02-Jul-2014 |
marcel |
MFC 257475: Respect the kern.smp.disabled tunable. |
266331 |
17-May-2014 |
ian |
MFC 263301
In kernel config files, it is supposed to be 'options<space><tab>' not 'options<tab><tab>', per long standing (but recently not so strictly enforced) convention. |
266312 |
17-May-2014 |
ian |
MFC 263036, 263059: delete advertising clause in licenses, renumber. |
266204 |
16-May-2014 |
ian |
MFC r257854 (discussed with alc@)
As of r257209, all architectures have defined VM_KMEM_SIZE_SCALE. In other words, every architecture is now auto-sizing the kmem arena. This revision changes kmeminit() so that the definition of VM_KMEM_SIZE_SCALE becomes mandatory and the definition of VM_KMEM_SIZE becomes optional.
Replace or eliminate all existing definitions of VM_KMEM_SIZE. With auto-sizing enabled, VM_KMEM_SIZE effectively became an alternate spelling for VM_KMEM_SIZE_MIN on most architectures. Use VM_KMEM_SIZE_MIN for clarity. |
265606 |
07-May-2014 |
scottl |
Merge r264984
Retire smp_active. It was racey and caused demonstrated problems with the cpufreq code. Replace its use with smp_started. There's at least one userland tool that still looks at the kern.smp.active sysctl, so preserve it but point it to smp_started as well.
Obtained from: Netflix, Inc. |
265388 |
05-May-2014 |
ken |
MFC the mpr(4) driver for LSI's 12Gb SAS cards.
This includes r265236, r265237, r265241 and r265261:
------------------------------------------------------------------------ r265236 | ken | 2014-05-02 14:25:09 -0600 (Fri, 02 May 2014) | 51 lines
Bring in the mpr(4) driver for LSI's MPT3 12Gb SAS controllers.
This is derived from the mps(4) driver, but it supports only the 12Gb IT and IR hardware including the SAS 3004, SAS 3008 and SAS 3108.
Some notes about this driver: o The 12Gb hardware can do "FastPath" I/O, and that capability is included in this driver.
o WarpDrive functionality has been removed, since it isn't supported in the 12Gb driver interface.
o The Scatter/Gather list handling code is significantly different between the 6Gb and 12Gb hardware. The 12Gb boards support IEEE Scatter/Gather lists.
Thanks to LSI for developing and testing this driver for FreeBSD.
share/man/man4/mpr.4: mpr(4) man page.
sys/dev/mpr/*: mpr(4) driver files.
sys/modules/Makefile, sys/modules/mpr/Makefile: Add a module Makefile for the mpr(4) driver.
sys/conf/files: Add the mpr(4) driver.
sys/amd64/conf/GENERIC, sys/i386/conf/GENERIC, sys/mips/conf/OCTEON1, sys/sparc64/conf/GENERIC: Add the mpr(4) driver to all config files that currently have the mps(4) driver.
sys/ia64/conf/GENERIC: Add the mps(4) and mpr(4) drivers to the ia64 GENERIC config file.
sys/i386/conf/XEN: Exclude the mpr module from building here.
Submitted by: Steve McConnell <Stephen.McConnell@lsi.com> Tested by: Chris Reeves <chrisr@spectralogic.com> Sponsored by: LSI, Spectra Logic Relnotes: LSI 12Gb SAS driver mpr(4) added
------------------------------------------------------------------------ ------------------------------------------------------------------------ r265237 | ken | 2014-05-02 14:36:20 -0600 (Fri, 02 May 2014) | 8 lines
Add the mpr(4) man page to the man4 Makefile.
This should have been included in r265236.
Submitted by: Steve McConnell <Stephen.McConnell@lsi.com> MFC after: 3 days Sponsored by: LSI, Spectra Logic
------------------------------------------------------------------------ ------------------------------------------------------------------------ r265241 | brueffer | 2014-05-02 15:14:28 -0600 (Fri, 02 May 2014) | 2 lines
Use our standard SYNOPSIS wording; perform some cleanup while here.
------------------------------------------------------------------------ ------------------------------------------------------------------------ r265261 | brueffer | 2014-05-03 05:15:28 -0600 (Sat, 03 May 2014) | 2 lines
Add a missing colon.
------------------------------------------------------------------------
Submitted by: Steve McConnell <Stephen.McConnell@lsi.com> Tested by: Chris Reeves <chrisr@spectralogic.com> Sponsored by: LSI, Spectra Logic Relnotes: LSI 12Gb SAS driver mpr(4) added |
264496 |
15-Apr-2014 |
tijl |
MFC r263998:
Rename __wchar_t so it no longer conflicts with __wchar_t from clang 3.4 -fms-extensions. |
262004 |
16-Feb-2014 |
marcel |
MFC r260175: Implement atomic_swap_<type>. |
262002 |
16-Feb-2014 |
marcel |
MFC r260914: In pmap_set_pte(), make sure to enforce ordering by inserting a memory fence. |
262001 |
16-Feb-2014 |
marcel |
MFC r260666: In the nested TLB fault handler, for a direct-mapped address, make sure to clear the lower 12 bits. |
261996 |
16-Feb-2014 |
marcel |
MFC r259244: Allow pmap_remove_pages() to be called for physical maps not associated with the current thread. |
261986 |
16-Feb-2014 |
marcel |
MFC r257910: Don't enable interrupts before we call sched_throw(). |
261985 |
16-Feb-2014 |
marcel |
MFC r257487: Use LOG2_ID_PAGE_SIZE again for the identity mapping in regions 6 & 7. |
259510 |
17-Dec-2013 |
kib |
MFC r257228: Add bus_dmamap_load_ma() function to load map with the array of vm_pages. |
256283 |
10-Oct-2013 |
gjb |
- Remove debugging from GENERIC* kernel configurations - Enable MALLOC_PRODUCTION - Default dumpdev=NO - Remove UPDATING entry regarding debugging features - Bump __FreeBSD_version to 1000500
Approved by: re (implicit) Sponsored by: The FreeBSD Foundation |
256281 |
10-Oct-2013 |
gjb |
Copy head (r256279) to stable/10 as part of the 10.0-RELEASE cycle.
Approved by: re (implicit) Sponsored by: The FreeBSD Foundation
|
255724 |
20-Sep-2013 |
alc |
The pmap function pmap_clear_reference() is no longer used. Remove it.
pmap_clear_reference() has had exactly one caller in the kernel for several years, more precisely, since FreeBSD 8. Now, that call no longer exists.
Approved by: re (kib) Sponsored by: EMC / Isilon Storage Division
|
255426 |
09-Sep-2013 |
jhb |
Add a mmap flag (MAP_32BIT) on 64-bit platforms to request that a mapping use an address in the first 2GB of the process's address space. This flag should have the same semantics as the same flag on Linux.
To facilitate this, add a new parameter to vm_map_find() that specifies an optional maximum virtual address. While here, fix several callers of vm_map_find() to use a VMFS_* constant for the findspace argument instead of TRUE and FALSE.
Reviewed by: alc Approved by: re (kib)
|
255289 |
06-Sep-2013 |
glebius |
On those machines, where sf_bufs do not represent any real object, make sf_buf_alloc()/sf_buf_free() inlines, to save two calls to an absolutely empty functions.
Reviewed by: alc, kib, scottl Sponsored by: Nginx, Inc. Sponsored by: Netflix
|
255028 |
29-Aug-2013 |
alc |
Significantly reduce the cost, i.e., run time, of calls to madvise(..., MADV_DONTNEED) and madvise(..., MADV_FREE). Specifically, introduce a new pmap function, pmap_advise(), that operates on a range of virtual addresses within the specified pmap, allowing for a more efficient implementation of MADV_DONTNEED and MADV_FREE. Previously, the implementation of MADV_DONTNEED and MADV_FREE relied on per-page pmap operations, such as pmap_clear_reference(). Intuitively, the problem with this implementation is that the pmap-level locks are acquired and released and the page table traversed repeatedly, once for each resident page in the range that was specified to madvise(2). A more subtle flaw with the previous implementation is that pmap_clear_reference() would clear the reference bit on all mappings to the specified page, not just the mapping in the range specified to madvise(2).
Since our malloc(3) makes heavy use of madvise(2), this change can have a measureable impact. For example, the system time for completing a parallel "buildworld" on a 6-core amd64 machine was reduced by about 1.5% to 2.0%.
Note: This change only contains pmap_advise() implementations for a subset of our supported architectures. I will commit implementations for the remaining architectures after further testing. For now, a stub function is sufficient because of the advisory nature of pmap_advise().
Discussed with: jeff, jhb, kib Tested by: pho (i386), marcel (ia64) Sponsored by: EMC / Isilon Storage Division
|
254667 |
22-Aug-2013 |
kib |
Revert r254501. Instead, reuse the type stability of the struct pmap which is the part of struct vmspace, allocated from UMA_ZONE_NOFREE zone. Initialize the pmap lock in the vmspace zone init function, and remove pmap lock initialization and destruction from pmap_pinit() and pmap_release().
Suggested and reviewed by: alc (previous version) Tested by: pho Sponsored by: The FreeBSD Foundation
|
254480 |
18-Aug-2013 |
pjd |
Add process descriptors support to the GENERIC kernel. It is already being used by the tools in base systems and with sandboxing more and more tools the usage should only increase.
Submitted by: Mariusz Zaborski <oshogbo@FreeBSD.org> Sponsored by: Google Summer of Code 2013 MFC after: 1 month
|
254300 |
13-Aug-2013 |
jkim |
Tidy up global locks for ACPICA. There is no functional change.
|
254138 |
09-Aug-2013 |
attilio |
The soft and hard busy mechanism rely on the vm object lock to work. Unify the 2 concept into a real, minimal, sxlock where the shared acquisition represent the soft busy and the exclusive acquisition represent the hard busy. The old VPO_WANTED mechanism becames the hard-path for this new lock and it becomes per-page rather than per-object. The vm_object lock becames an interlock for this functionality: it can be held in both read or write mode. However, if the vm_object lock is held in read mode while acquiring or releasing the busy state, the thread owner cannot make any assumption on the busy state unless it is also busying it.
Also: - Add a new flag to directly shared busy pages while vm_page_alloc and vm_page_grab are being executed. This will be very helpful once these functions happen under a read object lock. - Move the swapping sleep into its own per-object flag
The KPI is heavilly changed this is why the version is bumped. It is very likely that some VM ports users will need to change their own code.
Sponsored by: EMC / Isilon storage division Discussed with: alc Reviewed by: jeff, kib Tested by: gavin, bapt (older version) Tested by: pho, scottl
|
254133 |
09-Aug-2013 |
avg |
follow up to r254051
- update powerpc/GENERIC64 as well, suggested by mdf - update comments so that they make sense after the change, suggested by jhb
X-MFC after: never (change specific to head)
|
254051 |
07-Aug-2013 |
avg |
enable KDB_TRACE in GENERICs
KDB_TRACE is not an alternative to DDB/etc, they are complementary. So I do not see any reason to not enable KDB_TRACE by default.
X-MFC after: never (change specific to head)
|
254025 |
07-Aug-2013 |
jeff |
Replace kernel virtual address space allocation with vmem. This provides transparent layering and better fragmentation.
- Normalize functions that allocate memory to use kmem_* - Those that allocate address space are named kva_* - Those that operate on maps are named kmap_* - Implement recursive allocation handling for kmem_arena in vmem.
Reviewed by: alc Tested by: pho Sponsored by: EMC / Isilon Storage Division
|
253845 |
31-Jul-2013 |
obrien |
Back out r253779 & r253786.
|
253779 |
29-Jul-2013 |
obrien |
Decouple yarrow from random(4) device.
* Make Yarrow an optional kernel component -- enabled by "YARROW_RNG" option. The files sha2.c, hash.c, randomdev_soft.c and yarrow.c comprise yarrow.
* random(4) device doesn't really depend on rijndael-*. Yarrow, however, does.
* Add random_adaptors.[ch] which is basically a store of random_adaptor's. random_adaptor is basically an adapter that plugs in to random(4). random_adaptor can only be plugged in to random(4) very early in bootup. Unplugging random_adaptor from random(4) is not supported, and is probably a bad idea anyway, due to potential loss of entropy pools. We currently have 3 random_adaptors: + yarrow + rdrand (ivy.c) + nehemeiah
* Remove platform dependent logic from probe.c, and move it into corresponding registration routines of each random_adaptor provider. probe.c doesn't do anything other than picking a specific random_adaptor from a list of registered ones.
* If the kernel doesn't have any random_adaptor adapters present then the creation of /dev/random is postponed until next random_adaptor is kldload'ed.
* Fix randomdev_soft.c to refer to its own random_adaptor, instead of a system wide one.
Submitted by: arthurmesh@gmail.com, obrien Obtained from: Juniper Networks Reviewed by: obrien
|
253750 |
28-Jul-2013 |
avg |
Revert r253748,253749
This WIP should not have been committed yet.
Pointyhat to: avg
|
253748 |
28-Jul-2013 |
avg |
put contents of cpu.h under _KERNEL
no userland-serviceable parts inside
MFC after: 20 days
|
253560 |
23-Jul-2013 |
marcel |
In pci_cfgregread() and pci_cfgregwrite(), multiplex the domain and bus number into the bus argument. The bus number occupies the least significant 8 bits. The PCI domain occupies the most significant 24 bits.
On the Altix 350, the PCI domain is a required parameter, but changing the prototype of the pci_cfgreg*() functions to include a separate domain argument has wide-spread consequences across the supported architectures. We'd be changing a known interface.
Multiplexing is an acceptable kluge to give us what we need with manageable impact. Note that the PCI bus number fits in 8 bits, so the multiplexing of the domain is a backward compatible change.
|
253559 |
23-Jul-2013 |
marcel |
In ia64_mca_init(), don't limit the allocation of the info block to fall within the first 256MB of memory. The origin/reason for that limitation is not known, but it's not believed to be required for proper initialization. What is known is that the Altix 350 does not have physical memory at that address (by virtue of the address space bits).
Keep the boundary at 256MB so that the info block will be covered by a single direct-mapped translation.
While here, change the flags to M_NOWAIT to eliminate confusion. It does not change the behaviour of contigmalloc(). What is does is makes the flags argument explicitly say what the actual behaviour is.
|
253558 |
23-Jul-2013 |
marcel |
In pmap_mapdev(), if the physical memory range is not covered by an EFI memory descriptor, don't return NULL as the virtual address, return the direct-mapped uncacheable virtual address for it. At first, this was needed only for the Altix 350, but now even some high-end HP machines have devices mapped to physical addresses that aren't covered by the EFI memory map.
|
252434 |
01-Jul-2013 |
kib |
Fix issues with zeroing and fetching the counters, on x86 and ppc64. Issues were noted by Bruce Evans and are present on all architectures.
On i386, a counter fetch should use atomic read of 64bit value, otherwise carry from the increment on other CPU could be lost for the given fetch, making error of 2^32. If 64bit read (cmpxchg8b) is not available on the machine, it cannot be SMP and it is enough to disable preemption around read to avoid the split read.
On x86 the counter increment is not atomic on purpose, which makes it possible for the store of the incremented result to override just zeroed per-cpu slot. The effect would be a counter going off by arbitrary value after zeroing. Perform the counter zeroing on the same processor which does the increments, making the operations mutually exclusive. On i386, same as for the fetching, if the cmpxchg8b is not available, machine is not SMP and we disable preemption for zeroing.
PowerPC64 is treated the same as amd64.
For other architectures, the changes made to allow the compilation to succeed, without fixing the issues with zeroing or fetching. It should be possible to handle them by using the 64bit loads and stores atomic WRT preemption (assuming the architectures also converted from using critical sections to proper asm). If architecture does not provide the facility, using global (spin) mutex would be non-optimal but working solution.
Noted by: bde Sponsored by: The FreeBSD Foundation
|
252280 |
27-Jun-2013 |
jkim |
Move definitions required by userland applications out of acpica_machdep.h.
|
250963 |
24-May-2013 |
achim |
Driver 'aacraid' added. Supports Adaptec by PMC RAID controller families Series 6, 7, 8 and upcoming products. Older Adaptec RAID controller families are supported by the 'aac' driver.
Approved by: scottl (mentor)
|
250884 |
21-May-2013 |
attilio |
o Relax locking assertions for vm_page_find_least() o Relax locking assertions for pmap_enter_object() and add them also to architectures that currently don't have any o Introduce VM_OBJECT_LOCK_DOWNGRADE() which is basically a downgrade operation on the per-object rwlock o Use all the mechanisms above to make vm_map_pmap_enter() to work mostl of the times only with readlocks.
Sponsored by: EMC / Isilon storage division Reviewed by: alc
|
250544 |
12-May-2013 |
peter |
Tidy up some CVS workarounds.
|
250338 |
07-May-2013 |
attilio |
Rename VM_NDOMAIN into MAXMEMDOM and move it into machine/param.h in order to match the MAXCPU concept. The change should also be useful for consolidation and consistency.
Sponsored by: EMC / Isilon storage division Obtained from: jeff Reviewed by: alc
|
249410 |
12-Apr-2013 |
trasz |
Remove ctl(4) from GENERIC. Also remove 'options CTL_DISABLE' and kern.cam.ctl.disable tunable; those were introduced as a workaround to make it possible to boot GENERIC on low memory machines.
With ctl(4) being built as a module and automatically loaded by ctladm(8), this makes CTL work out of the box.
Reviewed by: ken Sponsored by: FreeBSD Foundation
|
249268 |
08-Apr-2013 |
glebius |
Merge from projects/counters: counter(9).
Introduce counter(9) API, that implements fast and raceless counters, provided (but not limited to) for gathering of statistical data.
See http://lists.freebsd.org/pipermail/freebsd-arch/2013-April/014204.html for more details.
In collaboration with: kib Reviewed by: luigi Tested by: ae, ray Sponsored by: Nginx, Inc.
|
249265 |
08-Apr-2013 |
glebius |
Merge from projects/counters:
Pad struct pcpu so that its size is denominator of PAGE_SIZE. This is done to reduce memory waste in UMA_PCPU_ZONE zones.
Sponsored by: Nginx, Inc.
|
249083 |
04-Apr-2013 |
mav |
Remove all legacy ATA code parts, not used since options ATA_CAM enabled in most kernels before FreeBSD 9.0. Remove such modules and respective kernel options: atadisk, ataraid, atapicd, atapifd, atapist, atapicam. Remove the atacontrol utility and some man pages. Remove useless now options ATA_CAM.
No objections: current@, stable@ MFC after: never
|
248508 |
19-Mar-2013 |
kib |
Implement the concept of the unmapped VMIO buffers, i.e. buffers which do not map the b_pages pages into buffer_map KVA. The use of the unmapped buffers eliminate the need to perform TLB shootdown for mapping on the buffer creation and reuse, greatly reducing the amount of IPIs for shootdown on big-SMP machines and eliminating up to 25-30% of the system time on i/o intensive workloads.
The unmapped buffer should be explicitely requested by the GB_UNMAPPED flag by the consumer. For unmapped buffer, no KVA reservation is performed at all. The consumer might request unmapped buffer which does have a KVA reserve, to manually map it without recursing into buffer cache and blocking, with the GB_KVAALLOC flag.
When the mapped buffer is requested and unmapped buffer already exists, the cache performs an upgrade, possibly reusing the KVA reservation.
Unmapped buffer is translated into unmapped bio in g_vfs_strategy(). Unmapped bio carry a pointer to the vm_page_t array, offset and length instead of the data pointer. The provider which processes the bio should explicitely specify a readiness to accept unmapped bio, otherwise g_down geom thread performs the transient upgrade of the bio request by mapping the pages into the new bio_transient_map KVA submap.
The bio_transient_map submap claims up to 10% of the buffer map, and the total buffer_map + bio_transient_map KVA usage stays the same. Still, it could be manually tuned by kern.bio_transient_maxcnt tunable, in the units of the transient mappings. Eventually, the bio_transient_map could be removed after all geom classes and drivers can accept unmapped i/o requests.
Unmapped support can be turned off by the vfs.unmapped_buf_allowed tunable, disabling which makes the buffer (or cluster) creation requests to ignore GB_UNMAPPED and GB_KVAALLOC flags. Unmapped buffers are only enabled by default on the architectures where pmap_copy_page() was implemented and tested.
In the rework, filesystem metadata is not the subject to maxbufspace limit anymore. Since the metadata buffers are always mapped, the buffers still have to fit into the buffer map, which provides a reasonable (but practically unreachable) upper bound on it. The non-metadata buffer allocations, both mapped and unmapped, is accounted against maxbufspace, as before. Effectively, this means that the maxbufspace is forced on mapped and unmapped buffers separately. The pre-patch bufspace limiting code did not worked, because buffer_map fragmentation does not allow the limit to be reached.
By Jeff Roberson request, the getnewbuf() function was split into smaller single-purpose functions.
Sponsored by: The FreeBSD Foundation Discussed with: jeff (previous version) Tested by: pho, scottl (previous version), jhb, bf MFC after: 2 weeks
|
248280 |
14-Mar-2013 |
kib |
Add pmap function pmap_copy_pages(), which copies the content of the pages around, taking array of vm_page_t both for source and destination. Starting offsets and total transfer size are specified.
The function implements optimal algorithm for copying using the platform-specific optimizations. For instance, on the architectures were the direct map is available, no transient mappings are created, for i386 the per-cpu ephemeral page frame is used. The code was typically borrowed from the pmap_copy_page() for the same architecture.
Only i386/amd64, powerpc aim and arm/arm-v6 implementations were tested at the time of commit. High-level code, not committed yet to the tree, ensures that the use of the function is only allowed after explicit enablement.
For sparc64, the existing code has known issues and a stab is added instead, to allow the kernel linking.
Sponsored by: The FreeBSD Foundation Tested by: pho (i386, amd64), scottl (amd64), ian (arm and arm-v6) MFC after: 2 weeks
|
248084 |
09-Mar-2013 |
attilio |
Switch the vm_object mutex to be a rwlock. This will enable in the future further optimizations where the vm_object lock will be held in read mode most of the time the page cache resident pool of pages are accessed for reading purposes.
The change is mostly mechanical but few notes are reported: * The KPI changes as follow: - VM_OBJECT_LOCK() -> VM_OBJECT_WLOCK() - VM_OBJECT_TRYLOCK() -> VM_OBJECT_TRYWLOCK() - VM_OBJECT_UNLOCK() -> VM_OBJECT_WUNLOCK() - VM_OBJECT_LOCK_ASSERT(MA_OWNED) -> VM_OBJECT_ASSERT_WLOCKED() (in order to avoid visibility of implementation details) - The read-mode operations are added: VM_OBJECT_RLOCK(), VM_OBJECT_TRYRLOCK(), VM_OBJECT_RUNLOCK(), VM_OBJECT_ASSERT_RLOCKED(), VM_OBJECT_ASSERT_LOCKED() * The vm/vm_pager.h namespace pollution avoidance (forcing requiring sys/mutex.h in consumers directly to cater its inlining functions using VM_OBJECT_LOCK()) imposes that all the vm/vm_pager.h consumers now must include also sys/rwlock.h. * zfs requires a quite convoluted fix to include FreeBSD rwlocks into the compat layer because the name clash between FreeBSD and solaris versions must be avoided. At this purpose zfs redefines the vm_object locking functions directly, isolating the FreeBSD components in specific compat stubs.
The KPI results heavilly broken by this commit. Thirdy part ports must be updated accordingly (I can think off-hand of VirtualBox, for example).
Sponsored by: EMC / Isilon storage division Reviewed by: jeff Reviewed by: pjd (ZFS specific review) Discussed with: alc Tested by: pho
|
247463 |
28-Feb-2013 |
mav |
MFcalloutng: Switch eventtimers(9) from using struct bintime to sbintime_t. Even before this not a single driver really supported full dynamic range of struct bintime even in theory, not speaking about practical inexpediency. This change legitimates the status quo and cleans up the code.
|
247454 |
28-Feb-2013 |
davide |
MFcalloutng: When CPU becomes idle, cpu_idleclock() calculates time to the next timer event in order to reprogram hw timer. Return that time in sbintime_t to the caller and pass it to acpi_cpu_idle(), where it can be used as one more factor (quite precise) to extimate furter sleep time and choose optimal sleep state. This is a preparatory change for further callout improvements will be committed in the next days.
The commmit is not targeted for MFC.
|
247251 |
25-Feb-2013 |
marcel |
kernacc() expects all KVAs to be covered in the kernel map. With the introduction of the PBVM, this stopped being the case. Redefine the VM parameters so that the PBVM is included in the kernel map. In particular this introduces VM_INIT_KERNEL_ADDRESS to point to the base of region 5 now that VM_MIN_KERNEL_ADDRESS points to the base of region 4 to include the PBVM. While here define KERNBASE to the actual link address of the kernel as is intended.
PR: 169926
|
247197 |
23-Feb-2013 |
marcel |
Enable PREEMPTION by default now that PR 147501 has been fixed.
|
246890 |
17-Feb-2013 |
marcel |
Close a race relating to setting the PCPU pointer (r13). Register r13 points to the TLS in user space and points to the PCPU structure in the kernel. The race is the result of having the exception handler on the one hand and the RPC system call entry on the other. The EPC syscall path is non-atomic in that interrupts are enabled while the two stacks are switched. The register stack is switched last as that is the stack used to determine whether we're going back to user space by the exception handler. If we go back to user space, we restore r13, otherwise we leave r13 alone. The EPC syscall path however set r13 to the PCPU structure *before* switching the register stack, which means that there was a window in which the exception handler would restore r13 when it was already pointing to the PCPU structure. This is fatal when the exception happened on CPU x, but left from the exception on anotehr CPU. In that case r13 would point to the PCPU of the CPU the thread was running on. This immediately results in getting the wrong value for curthread. The fix is to make sure we assign r13 *after* we set ar.bspstore to point to the kernel register stack for the thread.
|
246882 |
16-Feb-2013 |
marcel |
Return EFAULT when the address is not a kernel virtual address.
|
246715 |
12-Feb-2013 |
marcel |
Eliminate the PC_CURTHREAD symbol and load the current thread's thread structure pointer atomically from r13 (the pcpu pointer) for the current CPU/core. Add a CTASSERT in machdep.c to make sure that pc_curthread is in fact the first field in struct pcpu.
The only non-atomic operations left were those related to process- space operations, such as casuword, subyte, suword16, fubyte, fuword16, copyin, copyout and their variations.
The casuword function has been re-structured more complete than the others. This way we have an example of a better bundling without introducing a lot of risk when we get it wrong. The other functions can be rebundled in separate commits and with the appropriate testing.
|
246714 |
12-Feb-2013 |
marcel |
Eliminate padding by moving 'narg' next to 'code'. Both are 32-bit entities in the syscall_args structure that otherwise has 64-bit only fields.
|
246713 |
12-Feb-2013 |
kib |
Reform the busdma API so that new types may be added without modifying every architecture's busdma_machdep.c. It is done by unifying the bus_dmamap_load_buffer() routines so that they may be called from MI code. The MD busdma is then given a chance to do any final processing in the complete() callback.
The cam changes unify the bus_dmamap_load* handling in cam drivers.
The arm and mips implementations are updated to track virtual addresses for sync(). Previously this was done in a type specific way. Now it is done in a generic way by recording the list of virtuals in the map.
Submitted by: jeff (sponsored by EMC/Isilon) Reviewed by: kan (previous version), scottl, mjacob (isp(4), no objections for target mode changes) Discussed with: ian (arm changes) Tested by: marius (sparc64), mips (jmallet), isci(4) on x86 (jharris), amd64 (Fabian Keil <freebsd-listen@fabiankeil.de>)
|
246712 |
12-Feb-2013 |
marcel |
Now that we actually use more memory descriptors, make sure to dump them as well.
|
245044 |
04-Jan-2013 |
hrs |
Remove firewire devices missed in r244992.
|
245003 |
03-Jan-2013 |
kib |
Enable the UFS quotas for big-iron GENERIC kernels.
Discussed with: mckusick MFC after: 2 weeks
|
244992 |
03-Jan-2013 |
des |
As discussed on -current last October, remove the firewire drivers from GENERIC.
|
243040 |
14-Nov-2012 |
kib |
Flip the semantic of M_NOWAIT to only require the allocation to not sleep, and perform the page allocations with VM_ALLOC_SYSTEM class. Previously, the allocation was also allowed to completely drain the reserve of the free pages, being translated to VM_ALLOC_INTERRUPT request class for vm_page_alloc() and similar functions.
Allow the caller of malloc* to request the 'deep drain' semantic by providing M_USE_RESERVE flag, now translated to VM_ALLOC_INTERRUPT class. Previously, it resulted in less aggressive VM_ALLOC_SYSTEM allocation class.
Centralize the translation of the M_* malloc(9) flags in the single inline function malloc2vm_flags().
Discussion started by: "Sears, Steven" <Steven.Sears@netapp.com> Reviewed by: alc, mdf (previous version) Tested by: pho (previous version) MFC after: 2 weeks
|
242534 |
03-Nov-2012 |
attilio |
Rework the known rwlock to benefit about staying on their own cache line in order to avoid manual frobbing but using struct rwlock_padalign.
Reviewed by: alc, jimharris
|
242218 |
28-Oct-2012 |
kib |
Fix compilation on ia64 when page size is configured for 16KB.
Reviewed by: alc, marcel
|
242121 |
26-Oct-2012 |
alc |
Port the new PV entry allocator from amd64/i386. This allocator has two advantages. First, PV entries are roughly half the size. Second, this allocator doesn't access the paging queues, and thus it allows for the removal of the page queues lock from this pmap.
Replace all uses of the page queues lock by a R/W lock that is private to this pmap.
Tested by: marcel
|
241020 |
28-Sep-2012 |
alc |
Eliminate a stale comment. It describes another use case for the pmap in Mach that doesn't exist in FreeBSD.
|
240244 |
08-Sep-2012 |
attilio |
userret() already checks for td_locks when INVARIANTS is enabled, so there is no need to check if Giant is acquired after it.
Reviewed by: kib MFC after: 1 week
|
239379 |
18-Aug-2012 |
marcel |
Use pmap_kextract(x) rather than pmap_extract(kernel_pmap, x). The former knows about all the special mappings, like PBVM. The kernel text and data are in the PBVM.
|
239376 |
18-Aug-2012 |
marcel |
Remove support for SKI: HP's Itanium simulator. It's pretty much not used, serves very little value given that FreeBSD runs on real H/W for a long time. Note that SKI is open-source (see http://ski.sourceforge.net), so if there's interest and value again, then this code can be revived.
Discussed with: jhb
|
239331 |
16-Aug-2012 |
jhb |
Add locking for sscdisk(4) and mark it MPSAFE. Since this driver just makes calls out to the emulator, the locking is fairly simple. A global mutex protects the list of ssc disks, and each ssc disk has a mutex to protect it's bioq.
Approved by: marcel
|
239065 |
05-Aug-2012 |
kib |
After the PHYS_TO_VM_PAGE() function was de-inlined, the main reason to pull vm_param.h was removed. Other big dependency of vm_page.h on vm_param.h are PA_LOCK* definitions, which are only needed for in-kernel code, because modules use KBI-safe functions to lock the pages.
Stop including vm_param.h into vm_page.h. Include vm_param.h explicitely for the kernel code which needs it.
Suggested and reviewed by: alc MFC after: 2 weeks
|
238257 |
08-Jul-2012 |
marcel |
Move PCPU initialization to a new function called cpu_pcpu_setup(). This makes it easier to add additional CPU or platform information to the per-CPU structure without duplicated code.
|
238256 |
08-Jul-2012 |
marcel |
Unleash the APs at SI_SUB_KICK_SCHEDULER so that we have them all up and running to service interrupts. This is especially important when the firmware has bound interrupts to CPUs, like for the SGI Altix 350. We wake up APs at SI_SUB_CPU time and they sit and spin until we unleash them, so there's nothing fundamentally different from a MD perspective.
|
238190 |
07-Jul-2012 |
marcel |
Implement ia64_physmem_alloc() and use it consistently to get memory before VM has been initialized. This includes: 1. Replacing pmap_steal_memory(), 2. Replace the handcrafted logic to allocate a naturally aligned VHPT, 3. Properly allocate the DPCPU for the BSP.
Ad 3: Appending the DPCPU to kernend worked as long as we wouldn't cross into the next PBVM page. If we were to cross into the next page, then there wouldn't be a PTE entry on the page table for it and we would end up with a MCA following a page fault. As such, this commit fixes MCAs occasionally seen.
|
238184 |
07-Jul-2012 |
marcel |
Hide the creation of phys_avail behind an API to make it easier to do it correctly. We now iterate the EFI memory descriptors once and collect all the information in a single pass. This includes: 1. The I/O port base address, 2. The PAL memory region. Have the physmem API track this. 3. Memory descriptors of memory we can't use, like bad memory, runtime services code & data, etc. Have the physmem API track these. 4. memory descriptors of memory we can use or re-use, such as free memory, boot time services code & data, loader code & data, etc. These are added by the physmem API.
Since the PBVM page table and pages are in memory described as loader data, inform the physmem API of chunks that need to be delated from the available physical memory.
While here, remove Maxmem and replace it with the better named paddr_max. Maxmem was defined as physmem, which is generally wrong. Now, paddr_max is properly defined as the largesty physical address.
The upshot of all this is that: 1. We properly determine realmem. 2. We maximize physmem by re-using memory where possible. 3. We remove complexity from ia64_init() in machdep.c. 4. Remove confusion about realmem, physmem & Maxmem.
The new ia64_physmem_alloc() is to replace pmap_steal_memory() in pmap.c, as well as replace the handcrafted allocation of the VHPT for the BSP in pmap_bootstrap() in pmap.c. This is step 2 and addresses the manipulation of phys_avail after it is being created.
|
237517 |
24-Jun-2012 |
andrew |
Make the wchar_t type machine dependent.
This is required for ARM EABI. Section 7.1.1 of the Procedure Call for the ARM Architecture (AAPCS) defines wchar_t as either an unsigned int or an unsigned short with the former preferred.
Because of this requirement we need to move the definition of __wchar_t to a machine dependent header. It also cleans up the macros defining the limits of wchar_t by defining __WCHAR_MIN and __WCHAR_MAX in the same machine dependent header then using them to define WCHAR_MIN and WCHAR_MAX respectively.
Discussed with: bde
|
237433 |
22-Jun-2012 |
kib |
Implement mechanism to export some kernel timekeeping data to usermode, using shared page. The structures and functions have vdso prefix, to indicate the intended location of the code in some future.
The versioned per-algorithm data is exported in the format of struct vdso_timehands, which mostly repeats the content of in-kernel struct timehands. Usermode reading of the structure can be lockless. Compatibility export for 32bit processes on 64bit host is also provided. Kernel also provides usermode with indication about currently used timecounter, so that libc can fall back to syscall if configured timecounter is unknown to usermode code.
The shared data updates are initiated both from the tc_windup(), where a fast task is queued to do the update, and from sysctl handlers which change timecounter. A manual override switch kern.timecounter.fast_gettime allows to turn off the mechanism.
Only x86 architectures export the real algorithm data, and there, only for tsc timecounter. HPET counters page could be exported as well, but I prefer to not further glue the kernel and libc ABI there until proper vdso-based solution is developed.
Minimal stubs neccessary for non-x86 architectures to still compile are provided.
Discussed with: bde Reviewed by: jhb Tested by: flo MFC after: 1 month
|
237430 |
22-Jun-2012 |
kib |
Reserve AT_TIMEKEEP auxv entry for providing usermode the pointer to timekeeping information.
MFC after: 1 week
|
237168 |
16-Jun-2012 |
alc |
The page flag PGA_WRITEABLE is set and cleared exclusively by the pmap layer, but it is read directly by the MI VM layer. This change introduces pmap_page_is_write_mapped() in order to completely encapsulate all direct access to PGA_WRITEABLE in the pmap layer.
Aesthetics aside, I am making this change because amd64 will likely begin using an alternative method to track write mappings, and having pmap_page_is_write_mapped() in place allows me to make such a change without further modification to the MI VM layer.
As an added bonus, tidy up some nearby comments concerning page flags.
Reviewed by: kib MFC after: 6 weeks
|
236409 |
01-Jun-2012 |
jkim |
Improve style(9) in the previous commit.
|
236403 |
01-Jun-2012 |
iwasaki |
Call AcpiLeaveSleepStatePrep() in interrupt disabled context (described in ACPICA source code).
- Move intr_disable() and intr_restore() from acpi_wakeup.c to acpi.c and call AcpiLeaveSleepStatePrep() in interrupt disabled context. - Add acpi_wakeup_machdep() to execute wakeup MD procedures and call it twice in interrupt disabled/enabled context (ia64 version is just dummy). - Rename wakeup_cpus variable in acpi_sleep_machdep() to suspcpus in order to be shared by acpi_sleep_machdep() and acpi_wakeup_machdep(). - Move identity mapping related code to acpi_install_wakeup_handler() (i386 version) for preparation of x86/acpica/acpi_wakeup.c (MFC candidate).
Reviewed by: jkim@ MFC after: 2 days
|
236375 |
01-Jun-2012 |
alc |
pmap_alloc_vhpt() doesn't need the pages that it allocates to be mapped into the kernel map, so vm_page_alloc_contig() can be used in place of contigmalloc().
Reviewed by: marcel
|
235941 |
24-May-2012 |
bz |
MFp4 bz_ipv6_fast:
in_cksum.h required ip.h to be included for struct ip. To be able to use some general checksum functions like in_addword() in a non-IPv4 context, limit the (also exported to user space) IPv4 specific functions to the times, when the ip.h header is present and IPVERSION is defined (to 4).
We should consider more general checksum (updating) functions to also allow easier incremental checksum updates in the L3/4 stack and firewalls, as well as ponder further requirements by certain NIC drivers needing slightly different pseudo values in offloading cases. Thinking in terms of a better "library".
Sponsored by: The FreeBSD Foundation Sponsored by: iXsystems
Reviewed by: gnn (as part of the whole) MFC After: 3 days
|
235041 |
04-May-2012 |
marcel |
Don't assume we have legacy PICs (i.e. 8259A in cascade) at the legacy I/O port addresses. Even if we do, this is hardly the place to mask interrupts. It's not clear that this was at all needed. The code came with CVS revision 1.2 of nexus.c when interrupt support was first added. What is known is that ia64 has always been designed around the IOSAPIC, and that doing I/O like this prevents Altix from booting.
|
234785 |
29-Apr-2012 |
dim |
Add a convenience macro for the returns_twice attribute, and apply it to the prototypes of the appropriate functions (getcontext, savectx, setjmp, sigsetjmp and vfork).
MFC after: 2 weeks
|
233271 |
21-Mar-2012 |
ed |
Remove pty(4) from our kernel configurations.
As of FreeBSD 8, this driver should not be used. Applications that use posix_openpt(2) and openpty(3) use the pts(4) that is built into the kernel unconditionally. If it turns out high profile depend on the pty(4) module anyway, I'd rather get those fixed. So please report any issues to me.
The pty(4) module is still available as a kernel module of course, so a simple `kldload pty' can be used to run old-style pseudo-terminals.
|
233207 |
19-Mar-2012 |
tijl |
Copy i386 specialreg.h to x86 and merge with amd64 specialreg.h. Replace amd64/i386/pc98 specialreg.h with stubs.
|
233204 |
19-Mar-2012 |
tijl |
Copy i386 psl.h to x86 and replace amd64/i386/pc98 psl.h with stubs.
|
233203 |
19-Mar-2012 |
tijl |
Move userland bits (and some common kernel bits) from amd64 and i386 segments.h to a new x86 segments.h.
Add __packed attribute to some structs (just to be sure). Also make it clear that i386 GDT and LDT entries are used in ia64 code.
|
233125 |
18-Mar-2012 |
tijl |
Eliminate ia32_reg.h by moving its contents to x86 and ia64 reg.h.
Reviewed by: kib
|
232619 |
06-Mar-2012 |
attilio |
Disable the option VFS_ALLOW_NONMPSAFE by default on all the supported platforms. This will make every attempt to mount a non-mpsafe filesystem to the kernel forbidden, unless it is expressely compiled with VFS_ALLOW_NONMPSAFE option.
This patch is part of the effort of killing non-MPSAFE filesystems from the tree.
No MFC is expected for this patch.
|
232356 |
01-Mar-2012 |
jhb |
- Change contigmalloc() to use the vm_paddr_t type instead of an unsigned long for specifying a boundary constraint. - Change bus_dma tags to use bus_addr_t instead of bus_size_t for boundary constraints.
These allow boundary constraints to be fully expressed for cases where sizeof(bus_addr_t) != sizeof(bus_size_t). Specifically, it allows a driver to properly specify a 4GB boundary in a PAE kernel.
Note that this cannot be safely MFC'd without a lot of compat shims due to KBI changes, so I do not intend to merge it.
Reviewed by: scottl
|
232250 |
28-Feb-2012 |
gavin |
Correct capitalization of "Hz" in user-visible text (manpages, printf(), etc).
MFC after: 3 days
|
231177 |
08-Feb-2012 |
marcel |
Rev. 228360 moved the call to cpu_set_upcall() to happen before td_proc gets initialized in td (=newtd). Use td0 instead.
MFC after: 3 days
|
230475 |
23-Jan-2012 |
das |
Add C11 macros describing subnormal numbers to float.h.
Reviewed by: bde
|
230366 |
20-Jan-2012 |
das |
Add parentheses where required. Without them, `sizeof LDBL_MAX' is a syntax error and shouldn't be, while `1 FLT_ROUNDS' isn't a syntax error and should be. Thanks to bde for the examples.
|
229997 |
12-Jan-2012 |
ken |
Add the CAM Target Layer (CTL).
CTL is a disk and processor device emulation subsystem originally written for Copan Systems under Linux starting in 2003. It has been shipping in Copan (now SGI) products since 2005.
It was ported to FreeBSD in 2008, and thanks to an agreement between SGI (who acquired Copan's assets in 2010) and Spectra Logic in 2010, CTL is available under a BSD-style license. The intent behind the agreement was that Spectra would work to get CTL into the FreeBSD tree.
Some CTL features:
- Disk and processor device emulation. - Tagged queueing - SCSI task attribute support (ordered, head of queue, simple tags) - SCSI implicit command ordering support. (e.g. if a read follows a mode select, the read will be blocked until the mode select completes.) - Full task management support (abort, LUN reset, target reset, etc.) - Support for multiple ports - Support for multiple simultaneous initiators - Support for multiple simultaneous backing stores - Persistent reservation support - Mode sense/select support - Error injection support - High Availability support (1) - All I/O handled in-kernel, no userland context switch overhead.
(1) HA Support is just an API stub, and needs much more to be fully functional.
ctl.c: The core of CTL. Command handlers and processing, character driver, and HA support are here.
ctl.h: Basic function declarations and data structures.
ctl_backend.c, ctl_backend.h: The basic CTL backend API.
ctl_backend_block.c, ctl_backend_block.h: The block and file backend. This allows for using a disk or a file as the backing store for a LUN. Multiple threads are started to do I/O to the backing device, primarily because the VFS API requires that to get any concurrency.
ctl_backend_ramdisk.c: A "fake" ramdisk backend. It only allocates a small amount of memory to act as a source and sink for reads and writes from an initiator. Therefore it cannot be used for any real data, but it can be used to test for throughput. It can also be used to test initiators' support for extremely large LUNs.
ctl_cmd_table.c: This is a table with all 256 possible SCSI opcodes, and command handler functions defined for supported opcodes.
ctl_debug.h: Debugging support.
ctl_error.c, ctl_error.h: CTL-specific wrappers around the CAM sense building functions.
ctl_frontend.c, ctl_frontend.h: These files define the basic CTL frontend port API.
ctl_frontend_cam_sim.c: This is a CTL frontend port that is also a CAM SIM. This frontend allows for using CTL without any target-capable hardware. So any LUNs you create in CTL are visible in CAM via this port.
ctl_frontend_internal.c, ctl_frontend_internal.h: This is a frontend port written for Copan to do some system-specific tasks that required sending commands into CTL from inside the kernel. This isn't entirely relevant to FreeBSD in general, but can perhaps be repurposed.
ctl_ha.h: This is a stubbed-out High Availability API. Much more is needed for full HA support. See the comments in the header and the description of what is needed in the README.ctl.txt file for more details.
ctl_io.h: This defines most of the core CTL I/O structures. union ctl_io is conceptually very similar to CAM's union ccb.
ctl_ioctl.h: This defines all ioctls available through the CTL character device, and the data structures needed for those ioctls.
ctl_mem_pool.c, ctl_mem_pool.h: Generic memory pool implementation used by the internal frontend.
ctl_private.h: Private data structres (e.g. CTL softc) and function prototypes. This also includes the SCSI vendor and product names used by CTL.
ctl_scsi_all.c, ctl_scsi_all.h: CTL wrappers around CAM sense printing functions.
ctl_ser_table.c: Command serialization table. This defines what happens when one type of command is followed by another type of command.
ctl_util.c, ctl_util.h: CTL utility functions, primarily designed to be used from userland. See ctladm for the primary consumer of these functions. These include CDB building functions.
scsi_ctl.c: CAM target peripheral driver and CTL frontend port. This is the path into CTL for commands from target-capable hardware/SIMs.
README.ctl.txt: CTL code features, roadmap, to-do list.
usr.sbin/Makefile: Add ctladm.
ctladm/Makefile, ctladm/ctladm.8, ctladm/ctladm.c, ctladm/ctladm.h, ctladm/util.c: ctladm(8) is the CTL management utility. It fills a role similar to camcontrol(8). It allow configuring LUNs, issuing commands, injecting errors and various other control functions.
usr.bin/Makefile: Add ctlstat.
ctlstat/Makefile ctlstat/ctlstat.8, ctlstat/ctlstat.c: ctlstat(8) fills a role similar to iostat(8). It reports I/O statistics for CTL.
sys/conf/files: Add CTL files.
sys/conf/NOTES: Add device ctl.
sys/cam/scsi_all.h: To conform to more recent specs, the inquiry CDB length field is now 2 bytes long.
Add several mode page definitions for CTL.
sys/cam/scsi_all.c: Handle the new 2 byte inquiry length.
sys/dev/ciss/ciss.c, sys/dev/ata/atapi-cam.c, sys/cam/scsi/scsi_targ_bh.c, scsi_target/scsi_cmds.c, mlxcontrol/interface.c: Update for 2 byte inquiry length field.
scsi_da.h: Add versions of the format and rigid disk pages that are in a more reasonable format for CTL.
amd64/conf/GENERIC, i386/conf/GENERIC, ia64/conf/GENERIC, sparc64/conf/GENERIC: Add device ctl.
i386/conf/PAE: The CTL frontend SIM at least does not compile cleanly on PAE.
Sponsored by: Copan Systems, SGI and Spectra Logic MFC after: 1 month
|
229605 |
05-Jan-2012 |
adrian |
Flip on IEEE80211_SUPPORT_MESH and AH_SUPPORT_AR5416, the wlan and ath modules respectively assume this is set.
Pointy hat to: adrian
|
228973 |
29-Dec-2011 |
rwatson |
Add "options CAPABILITY_MODE" and "options CAPABILITIES" to GENERIC kernel configurations for various architectures in FreeBSD 10.x. This allows basic Capsicum functionality to be used in the default FreeBSD configuration on non-embedded architectures; process descriptors are not yet enabled by default.
MFC after: 3 months Sponsored by: Google, Inc
|
228631 |
17-Dec-2011 |
avg |
kern cons: introduce infrastructure for console grabbing by kernel
At the moment grab and ungrab methods of all console drivers are no-ops.
Current intended meaning of the calls is that the kernel takes control of console input. In the future the semantics may be extended to mean that the calling thread takes full ownership of the console (e.g. console output from other threads could be suspended).
Inspired by: bde MFC after: 2 months
|
228522 |
15-Dec-2011 |
alc |
Eliminate vestiges of page coloring.
|
228469 |
13-Dec-2011 |
ed |
Replace __signed by signed.
The signed keyword is an integral part of the C syntax. There's no need to use __signed.
|
227333 |
08-Nov-2011 |
attilio |
Introduce the option VFS_ALLOW_NONMPSAFE and turn it on by default on all the architectures. The option allows to mount non-MPSAFE filesystem. Without it, the kernel will refuse to mount a non-MPSAFE filesytem.
This patch is part of the effort of killing non-MPSAFE filesystems from the tree.
No MFC is expected for this patch.
Tested by: gianni Reviewed by: kib
|
227309 |
07-Nov-2011 |
ed |
Mark all SYSCTL_NODEs static that have no corresponding SYSCTL_DECLs.
The SYSCTL_NODE macro defines a list that stores all child-elements of that node. If there's no SYSCTL_DECL macro anywhere else, there's no reason why it shouldn't be static.
|
227293 |
07-Nov-2011 |
ed |
Mark MALLOC_DEFINEs static that have no corresponding MALLOC_DECLAREs.
This means that their use is restricted to a single C file.
|
226835 |
27-Oct-2011 |
kensmith |
Adjust the debugger options slightly. This should help me do the right thing when changing the debugging options as part of head becoming a new stable branch. It may also help people who for one reason or another want to run head but don't want it slowed down by the debugging support.
Reviewed by: kib
|
226818 |
26-Oct-2011 |
kensmith |
Move the debugging support to its own section. This matches what is in the other architectures' GENERIC and makes removing it at the point we're creating a new stable branch a bit easier.
Discussed with: marcel
|
226607 |
21-Oct-2011 |
das |
People porting FreeBSD to new architectures ought not have to implement a deprecated FPU control interface in addition to the standard one. To make this clearer, further deprecate ieeefp.h by not declaring the function prototypes except on architectures that implement them already.
Currently i386 and amd64 implement the ieeefp.h interface for compatibility, and for fp[gs]etprec(), which doesn't exist on most other hardware. Powerpc, sparc64, and ia64 partially implement it and probably shouldn't, and other architectures don't implement it at all.
|
226547 |
19-Oct-2011 |
kensmith |
Add a warning about why sbp(4) is commented out so that curious folks are forewarned they might wind up with a hole in their foot if they decide to give it a try.
Suggested by: dougb
|
226510 |
18-Oct-2011 |
kensmith |
Comment out the sbp(4) driver for architectures that support it.
As part of the 8.0-RELEASE cycle this was done in stable/8 (r199112) but was left alone in head so people could work on fixing an issue that caused boot failure on some motherboards. Apparently nobody has worked on it and we are getting reports of boot failure with the 9.0 test builds. So this time I'll comment out the driver in head (still hoping someone will work on it) and MFC to stable/9.
Submitted by: Alberto Villa <avilla at FreeBSD dot org>
|
226112 |
07-Oct-2011 |
kib |
Remove unused define.
MFC after: 1 month
|
225841 |
28-Sep-2011 |
kib |
Remove locking of the vm page queues from several pmaps, which only protected the dirty mask updates. The dirty mask updates are handled by atomics after the r225840.
Submitted by: alc Tested by: flo (sparc64) MFC after: 2 weeks
|
225617 |
16-Sep-2011 |
kmacy |
In order to maximize the re-usability of kernel code in user space this patch modifies makesyscalls.sh to prefix all of the non-compatibility calls (e.g. not linux_, freebsd32_) with sys_ and updates the kernel entry points and all places in the code that use them. It also fixes an additional name space collision between the kernel function psignal and the libc function of the same name by renaming the kernel psignal kern_psignal(). By introducing this change now we will ease future MFCs that change syscalls.
Reviewed by: rwatson Approved by: re (bz)
|
225474 |
11-Sep-2011 |
kib |
Inline the syscallenter() and syscallret(). This reduces the time measured by the syscall entry speed microbenchmarks by ~10% on amd64.
Submitted by: jhb Approved by: re (bz) MFC after: 2 weeks
|
225418 |
06-Sep-2011 |
kib |
Split the vm_page flags PG_WRITEABLE and PG_REFERENCED into atomic flags field. Updates to the atomic flags are performed using the atomic ops on the containing word, do not require any vm lock to be held, and are non-blocking. The vm_page_aflag_set(9) and vm_page_aflag_clear(9) functions are provided to modify afalgs.
Document the changes to flags field to only require the page lock.
Introduce vm_page_reference(9) function to provide a stable KPI and KBI for filesystems like tmpfs and zfs which need to mark a page as referenced.
Reviewed by: alc, attilio Tested by: marius, flo (sparc64); andreast (powerpc, powerpc64) Approved by: re (bz)
|
224746 |
09-Aug-2011 |
kib |
- Move the PG_UNMANAGED flag from m->flags to m->oflags, renaming the flag to VPO_UNMANAGED (and also making the flag protected by the vm object lock, instead of vm page queue lock). - Mark the fake pages with both PG_FICTITIOUS (as it is now) and VPO_UNMANAGED. As a consequence, pmap code now can use use just VPO_UNMANAGED to decide whether the page is unmanaged.
Reviewed by: alc Tested by: pho (x86, previous version), marius (sparc64), marcel (arm, ia64, powerpc), ray (mips) Sponsored by: The FreeBSD Foundation Approved by: re (bz)
|
224668 |
06-Aug-2011 |
marcel |
Fix kernel core dumps now that the kernel is using PBVM. The basic problem to solve is that we don't have a fixed mapping from kernel text to physical address so that libkvm can bootstrap itself. We solve this by passing the physical address of the bootinfo structure to the consumer as the entry point of the core file. This way, libkvm can extract the PBVM page table information and locate the kernel in the core file. We also need to dump memory chunks of type loader data, because those hold the kernel and the PBVM page table (among other things).
Approved by: re (blanket)
|
224663 |
05-Aug-2011 |
marcel |
Follow-up commit: refactor pmap_kextract() to make it easier to catch and debug issues like the one fixed in the previous commit: Replace all return statements with goto statements so that we end up at a single place with a value for the physical address. Print a message for all unknown KVA addresses.
Approved by: re (blanket)
|
224662 |
05-Aug-2011 |
marcel |
Remove stray semicolon in pmap_kextract() that turned the conditional "return (0)" into an unconditional one and as such broke PBVM address queries -- such as during kernel core dumps.
Approved by: re (blanket)
|
224217 |
19-Jul-2011 |
attilio |
Bump MAXCPU for amd64, ia64 and XLP mips appropriately. From now on, default values for FreeBSD will be 64 maxiumum supported CPUs on amd64 and ia64 and 128 for XLP. All the other architectures seem already capped appropriately (with the exception of sparc64 which needs further support on jalapeno flavour).
Bump __FreeBSD_version in order to reflect KBI/KPI brekage introduced during the infrastructure cleanup for supporting MAXCPU > 32. This covers cpumask_t retiral too.
The switch is considered completed at the present time, so for whatever bug you may experience that is reconducible to that area, please report immediately.
Requested by: marcel, jchandra Tested by: pluknet, sbruno Approved by: re (kib)
|
224216 |
19-Jul-2011 |
attilio |
On 64 bit architectures size_t is 8 bytes, thus it should use an 8 bytes storage. Fix the sintrcnt/sintrnames specification.
No MFC is previewed for this patch.
Reported, reviewed and tested by: marcel Approved by: re (kib)
|
224207 |
19-Jul-2011 |
attilio |
Add the possibility to specify from kernel configs MAXCPU value. This patch is going to help in cases like mips flavours where you want a more granular support on MAXCPU.
No MFC is previewed for this patch.
Tested by: pluknet Approved by: re (kib)
|
224187 |
18-Jul-2011 |
attilio |
- Remove the eintrcnt/eintrnames usage and introduce the concept of sintrcnt/sintrnames which are symbols containing the size of the 2 tables. - For amd64/i386 remove the storage of intr* stuff from assembly files. This area can be widely improved by applying the same to other architectures and likely finding an unified approach among them and move the whole code to be MI. More work in this area is expected to happen fairly soon.
No MFC is previewed for this patch.
Tested by: pluknet Reviewed by: jhb Approved by: re (kib)
|
224185 |
18-Jul-2011 |
jhb |
Enable NEW_PCIB by default on ia64.
Approved by: re (kib), marcel
|
224184 |
18-Jul-2011 |
jhb |
Implement bus_adjust_resource() for the ia64 nexus driver.
Reviewed by: marcel Approved by: re (kib)
|
224116 |
16-Jul-2011 |
marcel |
Don't assume pmap_mapdev() gets called only for memory mapped I/O addresses (i.e. uncacheable). ACPI in particular uses pmap_mapdev() rather excessively (number of calls) just to get a valid KVA. To that end, have pmap_mapdev(): 1. cache the last result so that we don't waste time for multiple consecutive invocations with the same PA/SZ. 2. find the memory descriptor that covers the PA and return NULL if none was found or when the PA is for a common DRAM address. 3. Use either a region 6 or region 7 KVA, in accordance with the memory attribute.
|
224114 |
16-Jul-2011 |
marcel |
Don't send EOI to the CPU before we handled the interrupt. This could potentially trigger multiple pending interrupts for level-sensitive interrupts. However, the event timer interrupt does need EOI before being handled to avoid missing clock events.
These conflicting requirements are handled by having the XIV handler inform the dispatch code whether or not it send EOI to the CPU. If not, the dispatch code will do it. This allows handlers to send EOI before doing potentially long-running activities, while still have a sensible default behaviour.
|
224112 |
16-Jul-2011 |
marcel |
Add a few more helper functions for working with memory descriptors: o efi_md_find() - returns the md that covers the given address o efi_md_last() - returns the last md in the list o efi_md_prev() - returns the md that preceeds the given md.
|
223873 |
08-Jul-2011 |
marcel |
Implement basic support for memory attributes. At this time we only distinguish between UC and WB memory so that we can map the page to either a region 6 address (for UC) or a region 7 address (for WB).
This change is only now possible, because previously we would map regions 6 and 7 with 256MB translations and on top of that had the kernel mapped in region 7 using a wired translation. The introduction of the PBVM moved the kernel into its own region and freed up region 7 and allowed us to revert to standard page-sized translations.
This commit inroduces pmap_page_to_va() that respects the attribute.
|
223763 |
04-Jul-2011 |
marcel |
Disable PREEMPTION for now. See also PR ia64/147501.
|
223758 |
04-Jul-2011 |
attilio |
With retirement of cpumask_t and usage of cpuset_t for representing a mask of CPUs, pc_other_cpus and pc_cpumask become highly inefficient.
Remove them and replace their usage with custom pc_cpuid magic (as, atm, pc_cpumask can be easilly represented by (1 << pc_cpuid) and pc_other_cpus by (all_cpus & ~(1 << pc_cpuid))).
This change is not targeted for MFC because of struct pcpu members removal and dependency by cpumask_t retirement.
MD review by: marcel, marius, alc Tested by: pluknet MD testing by: marcel, marius, gonzo, andreast
|
223732 |
02-Jul-2011 |
alc |
When iterating over a paging queue, explicitly check for PG_MARKER, instead of relying on zeroed memory being interpreted as an empty PV list.
Reviewed by: kib
|
223700 |
30-Jun-2011 |
marcel |
Change the management of nested faults by switching to physical addressing while reading or writing the trap frame. It's not possible to guarantee that the one translation cache entry that we depend on is not going to get purged by the CPU. We already know that global shootdowns (ptc.g and/or ptc.ga) can (and will) cause multiple TC entries to get purged and we initialize tried to handle that by serializing kernel entry with these operations. However, we need to serialize kernel exit as well.
But even if we can serialize, it appears that CPU threads within a core can affect each other's TC entries beyond the global shootdown. This would mean serializing any and all translatation cache updates with the threads in a core with the kernel entry and exit of any thread in that core. This is just too painful and complicated.
Since we already properly coded for the 2 nested faults that we can get, all we need to do is use those to obtain the physical address of the trap frame, switch to physical mode and in that way eliminate any further faults. The trap frame is already aligned to 1KB boundaries to make sure we don't cross the page boundary, this is safe to do.
We still need to serialize ptc.g or ptc.ga across CPUs because the platform can only have 1 such operation outstanding at the same time. We can now use a regular (spin) lock for this.
Also, it has been observed that we can get a nested TLB faults for region 7 virtual addresses. This was unexpected. For now, we enhance the nested TLB fault handler to deal with those as well, but it needs to be understood.
|
223677 |
29-Jun-2011 |
alc |
Add a new option, OBJPR_NOTMAPPED, to vm_object_page_remove(). Passing this option to vm_object_page_remove() asserts that the specified range of pages is not mapped, or more precisely that none of these pages have any managed mappings. Thus, vm_object_page_remove() need not call pmap_remove_all() on the pages.
This change not only saves time by eliminating pointless calls to pmap_remove_all(), but it also eliminates an inconsistency in the use of pmap_remove_all() versus related functions, like pmap_remove_write(). It eliminates harmless but pointless calls to pmap_remove_all() that were being performed on PG_UNMANAGED pages.
Update all of the existing assertions on pmap_remove_all() to reflect this change.
Reviewed by: kib
|
223544 |
25-Jun-2011 |
marcel |
Oops. The sec field of struct bintime is *not* a 32-bit type. It's time_t, which is 64 bits on ia64.
|
223542 |
25-Jun-2011 |
marcel |
Define the minimum fractional period in terms of hz. We know hz is a magnitude smaller than itc_freq. A minimum period of 10*hz is sufficient precision. As a side-effect, the number of clocks per second, when the machine is idle, dropped by more than 50%. Be anal and define the maximum period to be at least 4G seconds. With a 64-bit counter and an ITC frequency that's expected to be always less than 4Ghz, it takes longer than that to wrap around.
|
223529 |
25-Jun-2011 |
marcel |
Replace the original copyright notice with my own. Everything in this file is written by me and has no bearing on the initial or original version.
|
223528 |
25-Jun-2011 |
marcel |
Update copyright.
|
223526 |
25-Jun-2011 |
marcel |
Switch to the event timers infrastructure. This includes: o Setting td_intr_frame to the XIVs trap frame because it's referenced by the ET event handler. o Signal EOI to the CPU before calling the registered XIV handlers. This prevents lost ITC interrupts, which cause starvation in one-shot mode. o Adding support for IPI_HARDCLOCK with corresponding per-CPU counters. o Have the APs call cpu_initclocks() so as to limited the scattering of clock related initialization. cpu_initclocks() calls the <self>_bsp() or <self>_ap() version accordingly. o Uncomment the ET clock handling in cpu_idle(). o Update the DDB 'show pcpu' output for the new MD fields. o Entirely rewritten ia64_ih_clock(). Note that we don't create as many clock XIVs as we have CPUs, as is done on PowerPC. It doesn't scale. We can only have 240 XIVs and we can have more CPUs than that. There's a single intrcnt index for the cumulative clock ticks and we keep per CPU counts in the PCPU stats structure. o Register the ITC by hooking SI_SUB_CONFIGURE (2nd order).
Open issues: o Clock interrupts can still be lost. Some tweaking is still necessary.
Thanks to: mav@ for his support, feedback and explanations.
ET stats while committing: eris% sysctl machdep.cpu | grep nclks
machdep.cpu.0.nclks: 24007 machdep.cpu.1.nclks: 22895 machdep.cpu.2.nclks: 13523 machdep.cpu.3.nclks: 9342 machdep.cpu.4.nclks: 9103 machdep.cpu.5.nclks: 9298 machdep.cpu.6.nclks: 10039 machdep.cpu.7.nclks: 9479 eris% vmstat -i | grep clock clock 108599 50
|
223478 |
23-Jun-2011 |
marcel |
Unblock the outgoing thread after we performed pmap_switch() to switch the region registers. pmap_switch() returns the pmap for which the region register are currently programmed, which needs to be re-programmed on the CPU the ougoing thread gets switched in. This change does not noticibly change anything or fix known bugs, but does give me a warm fuzzy feeling by being more correct.
|
223365 |
21-Jun-2011 |
alc |
Use a non-standard page size that is supported.
|
223171 |
17-Jun-2011 |
marcel |
Improve on style(9)
|
223170 |
17-Jun-2011 |
marcel |
Properly serialize the global shootdown with the instruction stream of the local processor. Also explicitly invalidate the ALAT. This is done on the other CPUs in the coherence domain by virtue of the ptc.ga instruction, but does not apply to the local CPU.
|
222971 |
11-Jun-2011 |
marcel |
Add the model number for the Montvale processor (marketed as Itanium 2 9100). At this time we're missing just one: Tukwila (Itanium 2 9300).
|
222813 |
07-Jun-2011 |
attilio |
etire the cpumask_t type and replace it with cpuset_t usage.
This is intended to fix the bug where cpu mask objects are capped to 32. MAXCPU, then, can now arbitrarely bumped to whatever value. Anyway, as long as several structures in the kernel are statically allocated and sized as MAXCPU, it is suggested to keep it as low as possible for the time being.
Technical notes on this commit itself: - More functions to handle with cpuset_t objects are introduced. The most notable are cpusetobj_ffs() (which calculates a ffs(3) for a cpuset_t object), cpusetobj_strprint() (which prepares a string representing a cpuset_t object) and cpusetobj_strscan() (which creates a valid cpuset_t starting from a string representation). - pc_cpumask and pc_other_cpus are target to be removed soon. With the moving from cpumask_t to cpuset_t they are now inefficient and not really useful. Anyway, for the time being, please note that access to pcpu datas is protected by sched_pin() in order to avoid migrating the CPU while reading more than one (possible) word - Please note that size of cpuset_t objects may differ between kernel and userland. While this is not directly related to the patch itself, it is good to understand that concept and possibly use the patch as a reference on how to deal with cpuset_t objects in userland, when accessing kernland members. - KTR_CPUMASK is changed and now is represented through a string, to be set as the example reported in NOTES.
Please additively note that no MAXCPU is bumped in this patch, but private testing has been done until to MAXCPU=128 on a real 8x8x2(htt) machine (amd64).
Please note that the FreeBSD version is not yet bumped because of the upcoming pcpu changes. However, note that this patch is not targeted for MFC.
People to thank for the time spent on this patch: - sbruno, pluknet and Nicholas Esborn (nick AT desert DOT net) tested several revision of the patches and really helped in improving stability of this work. - marius fixed several bugs in the sparc64 implementation and reviewed patches related to ktr. - jeff and jhb discussed the basic approach followed. - kib and marcel made targeted review on some specific part of the patch. - marius, art, nwhitehorn and andreast reviewed MD specific part of the patch. - marius, andreast, gonzo, nwhitehorn and jceel tested MD specific implementations of the patch. - Other people have made contributions on other patches that have been already committed and have been listed separately.
Companies that should be mentioned for having participated at several degrees: - Yahoo! for having offered the machines used for testing on big count of CPUs. - The FreeBSD Foundation for having sponsored my devsummit attendance, which has been instrumental. - Sandvine for having offered offices and infrastructure during development.
(I really hope I didn't forget anyone, if it happened I apologize in advance).
|
222800 |
07-Jun-2011 |
marcel |
Call set_cputicker() to have the time counter use the ITC register. Note that the ITC frequency is fixed.
|
222769 |
06-Jun-2011 |
marcel |
Improve cpu_idle(): o cpu_idle_hook is expected to be called with interrupts disabled and re-enables interrupts on return. o sync with x86: don't idle when the CPU has runnable tasks o have callers of ia64_call_pal_static() disable interrupts and re-enable interrupts. o add, but compile-out, support for idle mode. This will be enabled at some later time, after proper testing.
|
222531 |
31-May-2011 |
nwhitehorn |
On multi-core, multi-threaded PPC systems, it is important that the threads be brought up in the order they are enumerated in the device tree (in particular, that thread 0 on each core be brought up first). The SLIST through which we loop to start the CPUs has all of its entries added with SLIST_INSERT_HEAD(), which means it is in reverse order of enumeration and so AP startup would always fail in such situations (causing a machine check or RTAS failure). Fix this by changing the SLIST into an STAILQ, and inserting new CPUs at the end.
Reviewed by: jhb
|
221894 |
14-May-2011 |
marcel |
Prefer switching the memory stack from user to kernel *before* switching the register stack. While the ordering doesn't matter, it creates an invariant not previously there: the memory stack pointer will always be larger than the register stack pointer. With this invariant in place, it's easier to add instrumentation code that detects a stack overflow because in such a scenario the memory stack pointer and register stack pointers have crossed each other.
Aside: basic kernel operation needs about half the stack size (~16K) at most. We have plenty of head room on the kernel stack...
|
221893 |
14-May-2011 |
marcel |
Sharpening the saw: o Clobber the register that holds the restart token immediately after crossing the restart point. This prevents false positives (i.e. a nested exception that we don't know can happen and that is being treated as one we know by virtue of a lingering restart token). o Now that the bootstrap kernel stack is free, switch onto it and call trap() for nested traps that we don't know about. In trap we panic() so that we can analyze the condition.
|
221890 |
14-May-2011 |
marcel |
Be pedantic: mark the pcpu pointer (= register r13) itself as volatile.
|
221889 |
14-May-2011 |
marcel |
Turn ia64_srlz() and ia64_srlz_i() into defines so that the code is still correct when inlining is disabled.
|
221855 |
13-May-2011 |
mdf |
Move the ZERO_REGION_SIZE to a machine-dependent file, as on many architectures (i386, for example) the virtual memory space may be constrained enough that 2MB is a large chunk. Use 64K for arches other than amd64 and ia64, with special handling for sparc64 due to differing hardware.
Also commit the comment changes to kmem_init_zero_region() that I missed due to not saving the file. (Darn the unfamiliar development environment).
Arch maintainers, please feel free to adjust ZERO_REGION_SIZE as you see fit.
Requested by: alc MFC after: 1 week MFC with: r221853
|
221606 |
07-May-2011 |
marcel |
In pmap_kextract(), return the physical address for PBVM virtual addresses as well (incl. the PBVM page table).
|
221526 |
06-May-2011 |
jhb |
Retire isa_setup_intr() and isa_teardown_intr() and use the generic bus versions instead. They were never needed as bus_generic_intr() and bus_teardown_intr() had been changed to pass the original child device up in 42734, but the ISA bus was not converted to new-bus until 45720.
|
221334 |
02-May-2011 |
marcel |
Don't use the whole region 5 for KVA, because the CPU may not implement all of the 61 bits available within the region for virtual addressing. Since there's no good way for us to map out the gap in the virtual address space, limit KVA to the architectural minimum implemented address bits. This still gives us 1 petabyte of KVA, so no worries.
|
221271 |
30-Apr-2011 |
marcel |
Stop linking against a direct-mapped virtual address and instead use the PBVM. This eliminates the implied hardcoding of the physical address at which the kernel needs to be loaded. Using the PBVM makes it possible to load the kernel irrespective of the physical memory organization and allows us to replicate kernel text on NUMA machines.
While here, reduce the direct-mapped page size to the kernel's page size so that we can support memory attributes better.
|
221218 |
29-Apr-2011 |
jhb |
Change rman_manage_region() to actually honor the rm_start and rm_end constraints on the rman and reject attempts to manage a region that is out of range. - Fix various places that set rm_end incorrectly (to ~0 or ~0u instead of ~0ul). - To preserve existing behavior, change rman_init() to set rm_start and rm_end to allow managing the full range (0 to ~0ul) if they are not set by the caller when rman_init() is called.
|
221173 |
28-Apr-2011 |
attilio |
Add the watchdogs patting during the (shutdown time) disk syncing and disk dumping. With the option SW_WATCHDOG on, these operations are doomed to let watchdog fire, fi they take too long.
I implemented the stubs this way because I really want wdog_kern_* KPI to not be dependant by SW_WATCHDOG being on (and really, the option only enables watchdog activation in hardclock) and also avoid to call them when not necessary (avoiding not-volountary watchdog activations).
Sponsored by: Sandvine Incorporated Discussed with: emaste, des MFC after: 2 weeks
|
221124 |
27-Apr-2011 |
rmacklem |
This patch changes head so that the default NFS client is now the new NFS client (which I guess is no longer experimental). The fstype "newnfs" is now "nfs" and the regular/old NFS client is now fstype "oldnfs". Although mounts via fstype "nfs" will usually work without userland changes, an updated mount_nfs(8) binary is needed for kernels built with "options NFSCL" but not "options NFSCLIENT". Updated mount_nfs(8) and mount(8) binaries are needed to do mounts for fstype "oldnfs". The GENERIC kernel configs have been changed to use options NFSCL and NFSD (the new client and server) instead of NFSCLIENT and NFSSERVER. For kernels being used on diskless NFS root systems, "options NFSCL" must be in the kernel config. Discussed on freebsd-fs@.
|
221035 |
25-Apr-2011 |
marcel |
Remove prototypes of non-existent functions.
|
220982 |
24-Apr-2011 |
mav |
Switch the GENERIC kernels for all architectures to the new CAM-based ATA stack. It means that all legacy ATA drivers are disabled and replaced by respective CAM drivers. If you are using ATA device names in /etc/fstab or other places, make sure to update them respectively (adX -> adaY, acdX -> cdY, afdX -> daY, astX -> saY, where 'Y's are the sequential numbers for each type in order of detection, unless configured otherwise with tunables, see cam(4)).
ataraid(4) functionality is now supported by the RAID GEOM class. To use it you can load geom_raid kernel module and use graid(8) tool for management. Instead of /dev/arX device names, use /dev/raid/rX.
|
220313 |
04-Apr-2011 |
marcel |
Use the new arch_loadaddr I/F to align ELF objects to PBVM page boundaries. For good measure, align all other objects to cache lines boundaries.
Use the new arch_loadseg I/F to keep track of kernel text and data so that we can wire as much of it as is possible. It is the responsibility of the kernel to link critical (read IVT related) code and data at the front of the respective segment so that it's covered by TRs before the kernel has a chance to add more translations.
Use a better way of determining whether we're loading a legacy kernel or not. We can't check for the presence of the PBVM page table, because we may have unloaded that kernel and loaded an older (legacy) kernel after that. Simply use the latest load address for it.
|
220238 |
01-Apr-2011 |
kib |
Add support for executing the FreeBSD 1/i386 a.out binaries on amd64.
In particular: - implement compat shims for old stat(2) variants and ogetdirentries(2); - implement delivery of signals with ancient stack frame layout and corresponding sigreturn(2); - implement old getpagesize(2); - provide a user-mode trampoline and LDT call gate for lcall $7,$0; - port a.out image activator and connect it to the build as a module on amd64.
The changes are hidden under COMPAT_43.
MFC after: 1 month
|
220042 |
26-Mar-2011 |
alc |
Eliminate an unused definition.
Reviewed by: marcel
|
219841 |
21-Mar-2011 |
marcel |
Fix switching to physical mode as part of calling into EFI runtime services or PAL procedures. The new implementation is based on specific functions that are known to be called in certain scenarios only. This in particular fixes the PAL call to obtain information about translation registers. In general, the new implementation does not bank on virtual addresses being direct-mapped and will work when the kernel uses PBVM.
When new scenarios need to be supported, new functions are added if the existing functions cannot be changed to handle the new scenario. If a single generic implementation is possible, it will become clear in due time.
While here, change bootinfo to a pointer type in anticipation of future development.
|
219808 |
21-Mar-2011 |
marcel |
Change region 4 to be part of the kernel. This serves 2 purposes: 1. The PBVM is in region 4, so if we want to make use of it, we need region 4 freed up. 2. Region 4 and above cannot be represented by an off_t by virtue of that type being signed. This is problematic for truss(1), ktrace(1) and other such programs.
|
219775 |
19-Mar-2011 |
bz |
For now remove options FLOWTABLE from the remaining GENERIC kernel configurations and make it opt-in for those who want it. LINT will still build it.
While it may be a perfect win in some scenarios, it still troubles users (see PRs) in general cases. In addition we are still allocating resources even if disabled by sysctl and still leak arp/nd6 entries in case of interface destruction.
Discussed with: qingli (2010-11-24, just never executed) Discussed with: juli (OCTEON1) PR: kern/148018, kern/155604, kern/144917, kern/146792 MFC after: 2 weeks
|
219758 |
18-Mar-2011 |
marcel |
o Move the IVT and supporting functions to the front of the text segment so that it's always mapped by the loader. o Change the alternate fault handlers to account for PBVM. Since currently the region is handled by the VHPT, no alternate faults will be generated for it.
|
219756 |
18-Mar-2011 |
marcel |
Remove inclusion of unneeded bootinfo.h header.
|
219741 |
18-Mar-2011 |
marcel |
Use VM_MAXUSER_ADDRESS rather than VM_MAX_ADDRESS when we talk about the bounds of user space. Redefine VM_MAX_ADDRESS as ~0UL, even though it's not used anywhere in the source tree.
|
219691 |
16-Mar-2011 |
marcel |
MFaltix: Add support for Pre-Boot Virtual Memory (PBVM) to the loader.
PBVM allows us to link the kernel at a fixed virtual address without having to make any assumptions about the physical memory layout. On the SGI Altix 350 for example, there's no usuable physical memory below 192GB. Also, the PBVM allows us to control better where we're going to physically load the kernel and its modules so that we can make sure we load the kernel in memory that's close to the BSP.
The PBVM is managed by a simple page table. The minimum size of the page table is 4KB (EFI page size) and the maximum is currently set to 1MB. A page in the PBVM is 64KB, as that's the maximum alignment one can specify in a linker script. The bottom line is that PBVM is between 64KB and 8GB in size.
The loader maps the PBVM page table at a fixed virtual address and using a single translations. The PBVM itself is also mapped using a single translation for a maximum of 32MB.
While here, increase the heap in the EFI loader from 512KB to 2MB and set the stage for supporting relocatable modules.
|
219523 |
11-Mar-2011 |
mdf |
Mostly revert r219468, as I had misremembered the C standard regarding the size of an extern array.
Keep one change from strncpy to strlcpy.
|
219468 |
10-Mar-2011 |
mdf |
Use MAXPATHLEN rather than the size of an extern array when copying the kernel name. Also consistenly use strlcpy().
Suggested by: Warner Losh
|
219405 |
08-Mar-2011 |
dchagin |
Extend struct sysvec with new method sv_schedtail, which is used for an explicit process at fork trampoline path instead of eventhadler(schedtail) invocation for each child process.
Remove eventhandler(schedtail) code and change linux ABI to use newly added sysvec method.
While here replace explicit comparing of module sysentvec structure with the newly created process sysentvec to detect the linux ABI.
Discussed with: kib
MFC after: 2 Week
|
218773 |
17-Feb-2011 |
alc |
Remove pmap fields that are either unused or not fully implemented.
Discussed with: kib
|
218382 |
06-Feb-2011 |
marcel |
Comment-out FLOWTABLE. It causes a kernel panic due to a misaligned memory access related to an IPv6 route update.
PR: kern/148018
|
218195 |
02-Feb-2011 |
mdf |
Put the general logic for being a CPU hog into a new function should_yield(). Use this in various places. Encapsulate the common case of check-and-yield into a new function maybe_yield().
Change several checks for a magic number of iterations to use should_yield() instead.
MFC after: 1 week
|
217688 |
21-Jan-2011 |
pluknet |
Make MSGBUF_SIZE kernel option a loader tunable kern.msgbufsize.
Submitted by: perryh pluto.rain.com (previous version) Reviewed by: jhb Approved by: kib (mentor) Tested by: universe
|
217519 |
17-Jan-2011 |
jkim |
Remove empty dev_mem_md_init() stubs.
|
217515 |
17-Jan-2011 |
jkim |
Add reader/writer lock around mem_range_attr_get() and mem_range_attr_set(). Compile sys/dev/mem/memutil.c for all supported platforms and remove now unnecessary dev_mem_md_init(). Consistently define mem_range_softc from mem.c for all platforms. Add missing #include guards for machine/memdev.h and sys/memrange.h. Clean up some nearby style(9) nits.
MFC after: 1 month
|
217265 |
11-Jan-2011 |
jhb |
Remove unneeded includes of <sys/linker_set.h>. Other headers that use it internally contain nested includes.
Reviewed by: bde
|
217192 |
09-Jan-2011 |
kib |
Move repeated MAXSLP definition from machine/vmparam.h to sys/vmmeter.h. Update the outdated comments describing MAXSLP and the process selection algorithm for swap out.
Comments wording and reviewed by: alc
|
217180 |
09-Jan-2011 |
das |
The highest-precision floating point type on ia64 has 64 bits of precision, so DECIMAL_DIG should be 21, as on i386/amd64.
|
217147 |
08-Jan-2011 |
tijl |
On mixed 32/64 bit architectures (mips, powerpc) use __LP64__ rather than architecture macros (__mips_n64, __powerpc64__) when 64 bit types (and corresponding macros) are different from 32 bit. [1]
Correct the type of INT64_MIN, INT64_MAX and UINT64_MAX.
Define (U)INTMAX_C as an alias for (U)INT64_C matching the type definition for (u)intmax_t. Do this on all architectures for consistency.
Suggested by: bde [1] Approved by: kib (mentor)
|
217145 |
08-Jan-2011 |
tijl |
Fix types of some values in machine/_limits.h.
On some architectures UCHAR_MAX and USHRT_MAX had type unsigned int. However, lacking integer suffixes for types smaller than int, their type should correspond to that of an object of type unsigned char (or short) when used in an expression with objects of type int. In that case unsigned char (short) are promoted to int (i.e. signed) so the type of UCHAR_MAX and USHRT_MAX should also be int.
Where MIN/MAX constants implicitly have the correct type the suffix has been removed.
While here, correct some comments.
Reviewed by: bde Approved by: kib (mentor)
|
217097 |
07-Jan-2011 |
kib |
Add AT_STACKPROT elf aux vector. Will be used to inform rtld about the initial stack protection set by the kernel image activator.
|
216143 |
03-Dec-2010 |
brucec |
Revert r216134. This checkin broke platforms where bus_space are macros: they need to be a single statement, and do { } while (0) doesn't work in this situation so revert until a solution can be devised.
|
216134 |
02-Dec-2010 |
brucec |
Disallow passing in a count of zero bytes to the bus_space(9) functions.
Passing a count of zero on i386 and amd64 for [I386|AMD64]_BUS_SPACE_MEM causes a crash/hang since the 'loop' instruction decrements the counter before checking if it's zero.
PR: kern/80980 Discussed with: jhb
|
216094 |
01-Dec-2010 |
alc |
phys_avail[] is correctly defined as an array of vm_paddr_t's in machdep.c. Use that same type, and not vm_offset_t, in this include file.
|
215054 |
09-Nov-2010 |
jhb |
- Remove <machine/mutex.h>. Most of the headers were empty, and the contents of the ones that were not empty were stale and unused. - Now that <machine/mutex.h> no longer exists, there is no need to allow it to override various helper macros in <sys/mutex.h>. - Rename various helper macros for low-level operations on mutexes to live in the _mtx_* or __mtx_* namespaces. While here, change the names to more closely match the real API functions they are backing. - Drop support for including <sys/mutex.h> in assembly source files.
Suggested by: bde (1, 2)
|
215052 |
09-Nov-2010 |
jhb |
Remove unused includes of <sys/mutex.h> and <machine/mutex.h>.
|
215023 |
09-Nov-2010 |
jkim |
Reduce diff between platforms and fix style(9) bugs.
|
214835 |
05-Nov-2010 |
jhb |
Adjust the order of operations in spinlock_enter() and spinlock_exit() to work properly with single-stepping in a kernel debugger. Specifically, these routines have always disabled interrupts before increasing the nesting count and restored the prior state of interrupts after decreasing the nesting count to avoid problems with a nested interrupt not disabling interrupts when acquiring a spin lock. However, trap interrupts for single-stepping can still occur even when interrupts are disabled. Now the saved state of interrupts is not saved in the thread until after interrupts have been disabled and the nesting count has been increased. Similarly, the saved state from the thread cannot be read once the nesting count has been decreased to zero. To fix this, use temporary variables to store interrupt state and shuffle it between the thread's MD area and the appropriate registers.
In cooperation with: bde MFC after: 1 month
|
213282 |
29-Sep-2010 |
neel |
Fix bogus error message from bus_dmamem_alloc() about incorrect alignment.
The check for alignment should be made against the physical address and not the virtual address that maps it.
Sponsored by: NetApp Submitted by: Will McGovern (will at netapp dot com) Reviewed by: mjacob, jhb
|
213157 |
25-Sep-2010 |
imp |
Remove clauses 3 and 4, per changes to NetBSD versions of these files.
|
212413 |
10-Sep-2010 |
avg |
bus_add_child: change type of order parameter to u_int
This reflects actual type used to store and compare child device orders. Change is mostly done via a Coccinelle (soon to be devel/coccinelle) semantic patch. Verified by LINT+modules kernel builds.
Followup to: r212213 MFC after: 10 days
|
211515 |
19-Aug-2010 |
jhb |
Remove unused KTRACE includes.
|
211412 |
17-Aug-2010 |
kib |
Supply some useful information to the started image using ELF aux vectors. In particular, provide pagesize and pagesizes array, the canary value for SSP use, number of host CPUs and osreldate.
Tested by: marius (sparc64) MFC after: 1 month
|
211006 |
07-Aug-2010 |
kib |
Prefer struct sysentvec sv_psstrings to hardcoding FREEBSD32_PS_STRINGS in the compat32 code. Use sv_usrstack instead of FREEBSD32_USRSTACK as well.
MFC after: 1 week
|
210939 |
06-Aug-2010 |
jhb |
Add a new ipi_cpu() function to the MI IPI API that can be used to send an IPI to a specific CPU by its cpuid. Replace calls to ipi_selected() that constructed a mask for a single CPU with calls to ipi_cpu() instead. This will matter more in the future when we transition from cpumask_t to cpuset_t for CPU masks in which case building a CPU mask is more expensive.
Submitted by: peter, sbruno Reviewed by: rookie Obtained from: Yahoo! (x86) MFC after: 1 month
|
210623 |
29-Jul-2010 |
jhb |
Mark the __curthread() functions as __pure2 and remove the volatile keyword from the inline assembly. This allows the compiler to cache invocations of curthread since it's value does not change within a thread context.
Submitted by: zec (i386) MFC after: 1 week
|
210564 |
28-Jul-2010 |
mdf |
Add MALLOC_DEBUG_MAXZONES debug malloc(9) option to use multiple uma zones for each malloc bucket size. The purpose is to isolate different malloc types into hash classes, so that any buffer overruns or use-after-free will usually only affect memory from malloc types in that hash class. This is purely a debugging tool; by varying the hash function and tracking which hash class was corrupted, the intersection of the hash classes from each instance will point to a single malloc type that is being misused. At this point inspection or memguard(9) can be used to catch the offending code.
Add MALLOC_DEBUG_MAXZONES=8 to -current GENERIC configuration files. The suggestion to have this on by default came from Kostik Belousov on -arch.
This code is based on work by Ron Steinke at Isilon Systems.
Reviewed by: -arch (mostly silence) Reviewed by: zml Approved by: zml (mentor)
|
210550 |
27-Jul-2010 |
jhb |
Very rough first cut at NUMA support for the physical page allocator. For now it uses a very dumb first-touch allocation policy. This will change in the future. - Each architecture indicates the maximum number of supported memory domains via a new VM_NDOMAIN parameter in <machine/vmparam.h>. - Each cpu now has a PCPU_GET(domain) member to indicate the memory domain a CPU belongs to. Domain values are dense and numbered from 0. - When a platform supports multiple domains, the default freelist (VM_FREELIST_DEFAULT) is split up into N freelists, one for each domain. The MD code is required to populate an array of mem_affinity structures. Each entry in the array defines a range of memory (start and end) and a domain for the range. Multiple entries may be present for a single domain. The list is terminated by an entry where all fields are zero. This array of structures is used to split up phys_avail[] regions that fall in VM_FREELIST_DEFAULT into per-domain freelists. - Each memory domain has a separate lookup-array of freelists that is used when fulfulling a physical memory allocation. Right now the per-domain freelists are listed in a round-robin order for each domain. In the future a table such as the ACPI SLIT table may be used to order the per-domain lookup lists based on the penalty for each memory domain relative to a specific domain. The lookup lists may be examined via a new vm.phys.lookup_lists sysctl. - The first-touch policy is implemented by using PCPU_GET(domain) to pick a lookup list when allocating memory.
Reviewed by: alc
|
210369 |
22-Jul-2010 |
kib |
When compat32 binary asks for the value of hw.machine_arch, report the name of 32bit sibling architecture instead of the host one. Do the same for hw.machine on amd64.
Add a safety belt debug.adaptive_machine_arch sysctl, to turn the substitution off.
Reviewed by: jhb, nwhitehorn MFC after: 2 weeks
|
209779 |
07-Jul-2010 |
marcel |
Add acpi_find_table() -- a convenience function for looking up an ACPI table given the signature.
|
209775 |
07-Jul-2010 |
marcel |
Remove pointless BOOTP conditional.
|
209749 |
06-Jul-2010 |
marcel |
Provide more examples for error injection.
|
209671 |
03-Jul-2010 |
marcel |
Allocate and setup an interrupt vector for corrected machine checks. For now, just print when we get the interrupt, but eventually we need to collect the details and provide a more useful report.
|
209618 |
01-Jul-2010 |
marcel |
When compiling with profiling, we define PROF for userspace and GPROF for the kernel.
|
209617 |
30-Jun-2010 |
marcel |
While functions are ideally aligned to a 32-byte boundary, don't assume this to be the case.
|
209613 |
30-Jun-2010 |
jhb |
Move prototypes for kern_sigtimedwait() and kern_sigprocmask() to <sys/syscallsubr.h> where all other kern_<syscall> prototypes live.
|
209085 |
12-Jun-2010 |
marcel |
The ptc.g operation for the Mckinley and Madison processors has the side-effect of purging more than the requested translation. While this is not a problem in general, it invalidates the assumption made during constructing the trapframe on entry into the kernel in SMP configurations. The assumption is that only the first store to the stack will possibly cause a TLB miss. Since the ptc.g purges the translation caches of all CPUs in the coherency domain, a ptc.g executed on one CPU can cause a purge on another CPU that is currently running the critical code that saves the state to the trapframe. This can cause an unexpected TLB miss and with interrupt collection disabled this means an unexpected data nested TLB fault.
A data nested TLB fault will not save any context, nor provide a way for software to determine what caused the TLB miss nor where it occured. Careful construction of the kernel entry and exit code allows us to handle a TLB miss in precisely orchastrated points and thereby avoiding the need to wire the kernel stack, but the unexpected TLB miss caused by the ptc.g instructution resulted in an unrecoverable condition and resulting in machine checks.
The solution to this problem is to synchronize the kernel entry on all CPUs with the use of the ptc.g instruction on a single CPU by implementing a bare-bones readers-writer lock that allows N readers (= N CPUs entering the kernel) and 1 writer (= execution of the ptc.g instruction on some CPU). This solution wins over a rendez-vous approach by not interrupting CPUs with an IPI.
This problem has not been observed on the Montecito.
PR: ia64/147772 MFC after: 6 days
|
209048 |
11-Jun-2010 |
alc |
Relax one of the new assertions in pmap_enter() a little. Specifically, allow pmap_enter() to be performed on an unmanaged page that doesn't have VPO_BUSY set. Having VPO_BUSY set really only matters for managed pages. (See, for example, pmap_remove_write().)
|
209026 |
11-Jun-2010 |
marcel |
Bump MAX_BPAGES from 256 to 1024. It seems that a few drivers, bge(4) in particular, do not handle deferred DMA map load operations at all. Any error, and especially EINPROGRESS, is treated as a hard error and typically abort the current operation. The fact that the busdma code queues the load operation for when resources (i.e. bounce buffers in this particular case) are available makes this especially problematic. Bounce buffering, unlike what the PR synopsis would suggest, works fine.
While on the subject, properly implement swi_vm().
PR: 147502 MFC after: 1 week
|
208990 |
10-Jun-2010 |
alc |
Reduce the scope of the page queues lock and the number of PG_REFERENCED changes in vm_pageout_object_deactivate_pages(). Simplify this function's inner loop using TAILQ_FOREACH(), and shorten some of its overly long lines. Update a stale comment.
Assert that PG_REFERENCED may be cleared only if the object containing the page is locked. Add a comment documenting this.
Assert that a caller to vm_page_requeue() holds the page queues lock, and assert that the page is on a page queue.
Push down the page queues lock into pmap_ts_referenced() and pmap_page_exists_quick(). (As of now, there are no longer any pmap functions that expect to be called with the page queues lock held.)
Neither pmap_ts_referenced() nor pmap_page_exists_quick() should ever be passed an unmanaged page. Assert this rather than returning "0" and "FALSE" respectively.
ARM:
Simplify pmap_page_exists_quick() by switching to TAILQ_FOREACH().
Push down the page queues lock inside of pmap_clearbit(), simplifying pmap_clear_modify(), pmap_clear_reference(), and pmap_remove_write(). Additionally, this allows for avoiding the acquisition of the page queues lock in some cases.
PowerPC/AIM:
moea*_page_exits_quick() and moea*_page_wired_mappings() will never be called before pmap initialization is complete. Therefore, the check for moea_initialized can be eliminated.
Push down the page queues lock inside of moea*_clear_bit(), simplifying moea*_clear_modify() and moea*_clear_reference().
The last parameter to moea*_clear_bit() is never used. Eliminate it.
PowerPC/BookE:
Simplify mmu_booke_page_exists_quick()'s control flow.
Reviewed by: kib@
|
208659 |
30-May-2010 |
alc |
Simplify the inner loop of get_pv_entry(): While iterating over the page's pv list, there is no point in checking whether or not the pv list is empty, wait instead until the loop completes.
|
208646 |
29-May-2010 |
alc |
Don't set PG_WRITEABLE in pmap_enter() unless the page is managed.
|
208574 |
26-May-2010 |
alc |
Push down page queues lock acquisition in pmap_enter_object() and pmap_is_referenced(). Eliminate the corresponding page queues lock acquisitions from vm_map_pmap_enter() and mincore(), respectively. In mincore(), this allows some additional cases to complete without ever acquiring the page queues lock.
Assert that the page is managed in pmap_is_referenced().
On powerpc/aim, push down the page queues lock acquisition from moea*_is_modified() and moea*_is_referenced() into moea*_query_bit(). Again, this will allow some additional cases to complete without ever acquiring the page queues lock.
Reorder a few statements in vm_page_dontneed() so that a race can't lead to an old reference persisting. This scenario is described in detail by a comment.
Correct a spelling error in vm_page_dontneed().
Assert that the object is locked in vm_page_clear_dirty(), and restrict the page queues lock assertion to just those cases in which the page is currently writeable.
Add object locking to vnode_pager_generic_putpages(). This was the one and only place where vm_page_clear_dirty() was being called without the object being locked.
Eliminate an unnecessary vm_page_lock() around vnode_pager_setsize()'s call to vm_page_clear_dirty().
Change vnode_pager_generic_putpages() to the modern-style of function definition. Also, change the name of one of the parameters to follow virtual memory system naming conventions.
Reviewed by: kib
|
208514 |
24-May-2010 |
kib |
Change ia64' struct syscall_args definition so that args is a pointer to the arguments array instead of array itself. ia64 syscall arguments are readily available in the frame, point args to it, do not do unnecessary bcopy. Still reserve the array in syscall_args for ia32 emulation.
Suggested and reviewed by: marcel MFC after: 1 month
|
208504 |
24-May-2010 |
alc |
Roughly half of a typical pmap_mincore() implementation is machine- independent code. Move this code into mincore(), and eliminate the page queues lock from pmap_mincore().
Push down the page queues lock into pmap_clear_modify(), pmap_clear_reference(), and pmap_is_modified(). Assert that these functions are never passed an unmanaged page.
Eliminate an inaccurate comment from powerpc/powerpc/mmu_if.m: Contrary to what the comment says, pmap_mincore() is not simply an optimization. Without a complete pmap_mincore() implementation, mincore() cannot return either MINCORE_MODIFIED or MINCORE_REFERENCED because only the pmap can provide this information.
Eliminate the page queues lock from vfs_setdirty_locked_object(), vm_pageout_clean(), vm_object_page_collect_flush(), and vm_object_page_clean(). Generally speaking, these are all accesses to the page's dirty field, which are synchronized by the containing vm object's lock.
Reduce the scope of the page queues lock in vm_object_madvise() and vm_page_dontneed().
Reviewed by: kib (an earlier version)
|
208453 |
23-May-2010 |
kib |
Reorganize syscall entry and leave handling.
Extend struct sysvec with three new elements: sv_fetch_syscall_args - the method to fetch syscall arguments from usermode into struct syscall_args. The structure is machine-depended (this might be reconsidered after all architectures are converted). sv_set_syscall_retval - the method to set a return value for usermode from the syscall. It is a generalization of cpu_set_syscall_retval(9) to allow ABIs to override the way to set a return value. sv_syscallnames - the table of syscall names.
Use sv_set_syscall_retval in kern_sigsuspend() instead of hardcoding the call to cpu_set_syscall_retval().
The new functions syscallenter(9) and syscallret(9) are provided that use sv_*syscall* pointers and contain the common repeated code from the syscall() implementations for the architecture-specific syscall trap handlers.
Syscallenter() fetches arguments, calls syscall implementation from ABI sysent table, and set up return frame. The end of syscall bookkeeping is done by syscallret().
Take advantage of single place for MI syscall handling code and implement ptrace_lwpinfo pl_flags PL_FLAG_SCE, PL_FLAG_SCX and PL_FLAG_EXEC. The SCE and SCX flags notify the debugger that the thread is stopped at syscall entry or return point respectively. The EXEC flag augments SCX and notifies debugger that the process address space was changed by one of exec(2)-family syscalls.
The i386, amd64, sparc64, sun4v, powerpc and ia64 syscall()s are changed to use syscallenter()/syscallret(). MIPS and arm are not converted and use the mostly unchanged syscall() implementation.
Reviewed by: jhb, marcel, marius, nwhitehorn, stas Tested by: marcel (ia64), marius (sparc64), nwhitehorn (powerpc), stas (mips) MFC after: 1 month
|
208392 |
21-May-2010 |
jhb |
- Adjust the whitespace for the lines that output fields in 'show pcpu' in DDB so that all the fields line up. - Print out the tid of the per-CPU idlethread instead of the pid since the idle process is now shared across all idle threads.
MFC after: 1 month
|
208283 |
19-May-2010 |
marcel |
Switch to C99 exact-width types.
|
208175 |
16-May-2010 |
alc |
On entry to pmap_enter(), assert that the page is busy. While I'm here, make the style of assertion used by pmap_enter() consistent across all architectures.
On entry to pmap_remove_write(), assert that the page is neither unmanaged nor fictitious, since we cannot remove write access to either kind of page.
With the push down of the page queues lock, pmap_remove_write() cannot condition its behavior on the state of the PG_WRITEABLE flag if the page is busy. Assert that the object containing the page is locked. This allows us to know that the page will neither become busy nor will PG_WRITEABLE be set on it while pmap_remove_write() is running.
Correct a long-standing bug in vm_page_cowsetup(). We cannot possibly do copy-on-write-based zero-copy transmit on unmanaged or fictitious pages, so don't even try. Previously, the call to pmap_remove_write() would have failed silently.
|
207796 |
08-May-2010 |
alc |
Push down the page queues into vm_page_cache(), vm_page_try_to_cache(), and vm_page_try_to_free(). Consequently, push down the page queues lock into pmap_enter_quick(), pmap_page_wired_mapped(), pmap_remove_all(), and pmap_remove_write().
Push down the page queues lock into Xen's pmap_page_is_mapped(). (I overlooked the Xen pmap in r207702.)
Switch to a per-processor counter for the total number of pages cached.
|
207410 |
30-Apr-2010 |
kmacy |
On Alan's advice, rather than do a wholesale conversion on a single architecture from page queue lock to a hashed array of page locks (based on a patch by Jeff Roberson), I've implemented page lock support in the MI code and have only moved vm_page's hold_count out from under page queue mutex to page lock. This changes pmap_extract_and_hold on all pmaps.
Supported by: Bitgravity Inc.
Discussed with: alc, jeffr, and kib
|
207373 |
29-Apr-2010 |
alc |
MFamd64/i386 r207205 Clearing a page table entry's accessed bit and setting the page's PG_REFERENCED flag in pmap_protect() can't really be justified, so don't do it. Moreover, on ia64, don't set the page's dirty field unless pmap_protect() is removing write access.
|
207329 |
28-Apr-2010 |
attilio |
- Extract the IODEV_PIO interface from ia64 and make it MI. In the end, it does help fixing /dev/io usage from multithreaded processes. - On i386 and amd64 the old behaviour is kept but multithreaded processes must use the new interface in order to work well. - Support for the other architectures is greatly improved, where necessary, by the necessity to define very small things now.
Manpage update will happen shortly.
Sponsored by: Sandvine Incorporated PR: threads/116181 Reviewed by: emaste, marcel MFC after: 3 weeks
|
207269 |
27-Apr-2010 |
kib |
Style: use #define<TAB> instead of #define<SPACE>.
Noted by: bde, pluknet gmail com MFC after: 11 days
|
207155 |
24-Apr-2010 |
alc |
Resurrect pmap_is_referenced() and use it in mincore(). Essentially, pmap_ts_referenced() is not always appropriate for checking whether or not pages have been referenced because it clears any reference bits that it encounters. For example, in mincore(), clearing the reference bits has two negative consequences. First, it throws off the activity count calculations performed by the page daemon. Specifically, a page on which mincore() has called pmap_ts_referenced() looks less active to the page daemon than it should. Consequently, the page could be deactivated prematurely by the page daemon. Arguably, this problem could be fixed by having mincore() duplicate the activity count calculation on the page. However, there is a second problem for which that is not a solution. In order to clear a reference on a 4KB page, it may be necessary to demote a 2/4MB page mapping. Thus, a mincore() by one process can have the side effect of demoting a superpage mapping within another process!
|
207152 |
24-Apr-2010 |
kib |
Move the constants specifying the size of struct kinfo_proc into machine-specific header files. Add KINFO_PROC32_SIZE for struct kinfo_proc32 for architectures providing COMPAT_FREEBSD32. Add CTASSERT for the size of struct kinfo_proc32.
Submitted by: pluknet Reviewed by: imp, jhb, nwhitehorn MFC after: 2 weeks
|
207077 |
22-Apr-2010 |
thompsa |
Change USB_DEBUG to #ifdef and allow it to be turned off. Previously this had the illusion of a tunable setting but was always turned on regardless.
MFC after: 1 week
|
206570 |
13-Apr-2010 |
marcel |
Populate the sysctl tree with any MCA records we collected. The sequence number is used as the name of a sysctl node, under which we add the MCA records using the CPU id as the leaf name.
Add the hw.mca.inject sysctl to provide a way to inject MC errors and trigger machine checks.
PR: ia64/113102
|
206558 |
13-Apr-2010 |
marcel |
Change the (generic) argument to ia64_store_mca_state() from the cpuid to the struct pcpu of the CPU. We casting between pointer types only then.
|
206556 |
13-Apr-2010 |
marcel |
o s/u_int64_t/uint64_t/g o style(9) fixes.
|
206541 |
13-Apr-2010 |
marcel |
Sync up to SDM 2.2.
|
205727 |
27-Mar-2010 |
marcel |
Bring up-to-date: o Switch to ITANIUM2 has the cpu. This has absolutely no effect on the code, but makes for a better example. o Drop COMPAT_FREEBSD6. We're tier 2, so you're supposed to run 8-stable or newer. o Add PREEMPTION. It works now. o Remove HWPMC_HOOKS. We don't have support for hwpmc yet.
o Add a bunch of new devices: atapist, hptiop, amr, ips, twa, igb, ixgbe, ae, age, alc, ale, bce, bfe, et, jme, msk, nge, sk, ste, stge, tx, vge, axe, rue, udav, fwip, and all USB serial. o Remove "legacy" devices: le, vx, dc, pcn, rl, sis.
Make sure to the module list is a superset of what goes into GENERIC.
|
205726 |
27-Mar-2010 |
marcel |
Implement interrupt to CPU binding. Assign interrupts to CPUs in a round-robin fashion, starting with the highest priority interrupt on the highest-numbered CPU and cycling downwards.
|
205723 |
27-Mar-2010 |
marcel |
Remove nx_pcibus from the nexus resource. Nexus is not involved with PCI busses. Remove nexus_read_ivar() and nexus_write_ivar() to give default behaviour. Remove <machine/nexusvar.h> as well, because there's nothing in it that's being used.
|
205713 |
26-Mar-2010 |
marcel |
Rename disable_intr() to ia64_disable_intr() and rename enable_intr() to ia64_enable_intr(). This reduces confusion with intr_disable() and intr_restore().
Have configure_final() call ia64_finalize_intr() instead of enable_intr() in preparation of adding support for binding interrupts to all CPUs.
|
205665 |
26-Mar-2010 |
marcel |
Only use the interval timer for clock interrupts on the BSP and have the BSP use IPIs to trigger clock interrupts on the APs. This allows us to run on hardware configurations for which the ITC has non-uniform frequencies across CPUs.
While here, change the clock XIV to type IPI so as to protect the interrupt delivery against CPU re-balancing once that's implemented.
|
205660 |
26-Mar-2010 |
nwhitehorn |
Fix the ia64 build.
Pointy hat to: me
|
205642 |
25-Mar-2010 |
nwhitehorn |
Change the arguments of exec_setregs() so that it receives a pointer to the image_params struct instead of several members of that struct individually. This makes it easier to expand its arguments in the future without touching all platforms.
Reviewed by: jhb
|
205454 |
22-Mar-2010 |
marcel |
o Remove the pmap argument to pmap_invalidate_all() as it's not used other than in a potentially dangerous KASSERT. o Hand-inline pmap_remove_page() as it's only called from 1 place and the abstraction that pmap_remove_page() provides is not enough to warrant the obfuscation. Eliminate the dangerous KASSERT in the process. o In pmap_remove_pte(), remove the KASSERT for pmap being the current one as it's not safe in the face of CPU migration.
|
205435 |
22-Mar-2010 |
marcel |
Drop the pmap argument to pmap_invalidate_page(). It's not used other than in a KASSERT. The KASSERT is broken in that it's done outside the critical section and as such isn't protected against CPU migration. Improve pmap_invalidate_page() as follows: o calculate vhpt_ofs inside the critical region for exactly the same reason. o calculate the tag outside the FOREACH loop, as it's loop-invariant. This is more efficient. o Replace the test and set with an atomic cmpset operation because we are changing other CPU's VHPT tables and this avoids invalidating after the entry got modified. Not necessarily a problem, but better safe than sorry.
|
205434 |
22-Mar-2010 |
marcel |
With preemption, the high FP registers may get enabled by cpu_switch() before we grab the mutex. Don't assert that they must be disabled at that point. We pretty much bypass all logic in that case anyway and leave immediately, so there's no harm.
|
205433 |
22-Mar-2010 |
marcel |
Fix interrupt handling by extending the critical region so that preemption doesn't happen until after all pending interrupt have been services. While here again, simplify the EOI handling by doing it after we call the XIV-specific handlers, rather than in each of them. The original thought was that we may want to do an EOI first and the actual IPI handling next, but that's mostly a micro-optimization.
|
205432 |
22-Mar-2010 |
marcel |
Disable interrupts when calling into SAL for PCI configuration cycles. This serves 2 purposes: 1. It prevents preemption and CPU migration while running SAL code. 2. It reduces the chance of stack overflows: we're supposed to enter SAL with at least 16KB of either memory- or register stack space, which we can't do without switching to a different stack.
|
205431 |
22-Mar-2010 |
marcel |
Define curthread as an inline function that loads the thread pointer directly from r13, the pcpu pointer. This guarantees correct behaviour when the thread migrates to a different CPU.
|
205429 |
21-Mar-2010 |
marcel |
Print MD fields in the pcpu to aid debugging.
|
205428 |
21-Mar-2010 |
marcel |
Don't include <machine/_regset.h> when _MACHINE_REGSET_H_ in defined. This is not for multiple inclusion purposes, because _regset.h already handles this, but to enable inclusion of the MD header by cross-tools on non-ia64 installations. The cross-tool can include _regset.h itself before including MD headers that depend on it.
|
205357 |
20-Mar-2010 |
marcel |
Don't check for boot_verbose in the environment. The loader does that already and sets RB_VERBOSE. The loader has always done it.
|
205234 |
17-Mar-2010 |
marcel |
Revamp the interrupt code based on the previous commit: o Introduce XIV, eXternal Interrupt Vector, to differentiate from the interrupts vectors that are offsets in the IVT (Interrupt Vector Table). There's a vector for external interrupts, which are based on the XIVs.
o Keep track of allocated and reserved XIVs so that we can assign XIVs without hardcoding anything. When XIVs are allocated, an interrupt handler and a class is specified for the XIV. Classes are: 1. architecture-defined: XIV 15 is returned when no external interrupt are pending, 2. platform-defined: SAL reports which XIV is used to wakeup an AP (typically 0xFF, but it's 0x12 for the Altix 350). 3. inter-processor interrupts: allocated for SMP support and non-redirectable. 4. device interrupts (i.e. IRQs): allocated when devices are discovered and are redirectable.
o Rewrite the central interrupt handler to call the per-XIV interrupt handler and rename it to ia64_handle_intr(). Move the per-XIV handler implementation to the file where we have the XIV allocation/reservation. Clock interrupt handling is moved to clock.c. IPI handling is moved to mp_machdep.c.
o Drop support for the Intel 8259A because it was broken. When XIV 0 is received, the CPU should initiate an INTA cycle to obtain the interrupt vector of the 8259-based interrupt. In these cases the interrupt controller we should be talking to WRT to masking on signalling EOI is the 8259 and not the I/O SAPIC. This requires adriver for the Intel 8259A which isn't available for ia64. Thus stop pretending to support ExtINTs and instead panic() so that if we come across hardware that has an Intel 8259A, so have something real to work with.
o With XIVs for IPIs dynamically allocatedi and also based on priority, define the IPI_* symbols as variables rather than constants. The variable holds the XIV allocated for the IPI.
o IPI_STOP_HARD delivers a NMI if possible. Otherwise the XIV assigned to IPI_STOP is delivered.
|
205172 |
15-Mar-2010 |
marcel |
Have cpu_throw() loop on blocked_lock as well. This bug has existed a long time and has gone unnoticed just as long, because I kept using sched_4bsd (due to sched_ule not working with preemption), but GENERIC had sched_ule by default -- including SMP.
While here, remove unused inclusion of <machine/clock.h>, remove totally bogus inclusion of <i386/include/specialreg.h>.
|
205116 |
13-Mar-2010 |
ed |
Remove COMPAT_43TTY from stock kernel configuration files.
COMPAT_43TTY enables the sgtty interface. Even though its exposure has only been removed in FreeBSD 8.0, it wasn't used by anything in the base system in FreeBSD 5.x (possibly even 4.x?). On those releases, if your ports/packages are less than two years old, they will prefer termios over sgtty.
|
205015 |
11-Mar-2010 |
nwhitehorn |
Accidentally committed test code. Remove it.
Big pointy hat: me
|
205014 |
11-Mar-2010 |
nwhitehorn |
Provide groundwork for 32-bit binary compatibility on non-x86 platforms, for upcoming 64-bit PowerPC and MIPS support. This renames the COMPAT_IA32 option to COMPAT_FREEBSD32, removes some IA32-specific code from MI parts of the kernel and enhances the freebsd32 compatibility code to support big-endian platforms.
Reviewed by: kib, jhb
|
204905 |
09-Mar-2010 |
marcel |
Remove inclusion of <i386/include/psl.h> While here move inclusion of <sys/lock.h> in a better place.
|
204904 |
09-Mar-2010 |
marcel |
Remove support for SYS_RES_DRQ.
|
204646 |
03-Mar-2010 |
joel |
The NetBSD Foundation has granted permission to remove clause 3 and 4 from the software.
Obtained from: NetBSD
|
204425 |
27-Feb-2010 |
marcel |
Interrupt related cleanups: o Assign vectors based on priority, because vectors have implied priority in hardware. o Use unordered memory accesses to the I/O sapic and use the acceptance form of the mf instruction. o Remove the sapicreg.h and sapicvar.h headers. All definitions in sapicreg.h are private to sapic.c and all definitions in sapicvar.h are either private or interface functions. Move the interface functions to intr.h. o Hide the definition of struct sapic.
|
204184 |
22-Feb-2010 |
marcel |
Prefer I-units and M-units for nop instructions. This works around McKinley flaws. It also avoids using the F-unit in the kernel for no reason.
|
204183 |
21-Feb-2010 |
marcel |
Normalize nop instructions: Only use 0 for the immediate operand.
|
204182 |
21-Feb-2010 |
marcel |
Remove pm_active from struct pmap as it serves no purpose.
MFC after: 1 week
|
203938 |
15-Feb-2010 |
attilio |
Adjust style (following the already existing rules) for the newly introduced option DEADLKRES.
Reported by: danfe, julian, avg
|
203884 |
14-Feb-2010 |
marcel |
Some code cleanups: o s/u_int32_t/uint32_t/g o Add multiple-inclusion protection. o Break long lines.
|
203883 |
14-Feb-2010 |
marcel |
Some code churn: o Eliminate IA64_PHYS_TO_RR6 and change all places where the macro is used by calling either bus_space_map() or pmap_mapdev(). o Implement bus_space_map() in terms of pmap_mapdev() and implement bus_space_unmap() in terms of pmap_unmapdev(). o Have ia64_pib hold the uncached virtual address of the processor interrupt block throughout the kernel's life and access the elements of the PIB through this structure pointer.
This is a non-functional change with the exception of using ia64_ld1() and ia64_st8() to write to the PIB. We were still using assignments, for which the compiler generates semaphore reads -- which cause undefined behaviour for uncacheable memory. Note also that the memory barriers in ipi_send() are critical for proper functioning.
With all the mapping of uncached memory done by pmap_mapdev(), we can keep track of the translations and wire them in the CPU. This then eliminates the need to reserve a whole region for uncached I/O and it eliminates translation traps for device I/O accesses.
|
203758 |
10-Feb-2010 |
attilio |
Add the options DEADLKRES (introducing the deadlock resolver thread) in the 'debugging' section of any HEAD kernel and enable for the mainstream ones, excluding the embedded architectures. It may, of course, enabled on a case-by-case basis.
Sponsored by: Sandvine Incorporated Requested by: emaste Discussed with: kib
|
203572 |
06-Feb-2010 |
marcel |
Fix single-stepping when the kernel was entered through the EPC syscall path. When the taken branch leaves the kernel and enters the process, we still need to execute the instruction at that address. Don't raise SIGTRAP when we branch into the process, but enable single-stepping instead.
|
203106 |
28-Jan-2010 |
marcel |
In pci_cfgregread() and pci_cfgregwrite(), validate the arguments and check that the alignment matches the width of the read or write.
|
203054 |
27-Jan-2010 |
marcel |
In cpu_switch(), use an atomic operation to set the td_lock of the old thread to the mutex that's passed.
Pointed out by: attilio, jhb
|
202904 |
23-Jan-2010 |
marcel |
Remove cpu_boot() and call efi_reset_system() directly from cpu_reset().
|
202273 |
14-Jan-2010 |
marcel |
Add ioctl requests to /dev/io on ia64 for reading and writing EFI variables. The primary reason for this is that it allows sysinstall(8) to add a boot menu item for the newly installed FreeBSD image.
|
202272 |
14-Jan-2010 |
marcel |
Fix previous commitr:. efi_var_set() was copied from efi_var_get(), but wasn't actually changed.
|
202271 |
14-Jan-2010 |
marcel |
Add wrappers for the RT Variable Services. While here, translate the EFI status into a standard errno value and change efi_set_time() to return a standard error.
MFC after: 1 week
|
202097 |
11-Jan-2010 |
marcel |
Use io(4) for I/O port access on ia64, rather than through sysarch(2). I/O port access is implemented on Itanium by reading and writing to a special region in memory. To hide details and avoid misaligned memory accesses, a process did I/O port reads and writes by making a MD system call. There's one fatal problem with this approach: unprivileged access was not being prevented. /dev/io serves that purpose on amd64/i386, so employ it on ia64 as well. Use an ioctl for doing the actual I/O and remove the sysarch(2) interface.
Backward compatibility is not being considered. The sysarch(2) approach was added to support X11, but support for FreeBSD/ia64 was never fully implemented in X11. Thus, nothing gets broken that didn't need more work to begin with.
MFC after: 1 week
|
202019 |
10-Jan-2010 |
imp |
Add INCLUDE_CONFIG_FILE in GENERIC on all non-embedded platforms.
# This is the resolution of removing it from DEFAULTS...
MFC after: 5 days
|
201813 |
08-Jan-2010 |
bz |
In sys/<arch>/conf/Makefile set TARGET to <arch>. That allows sys/conf/makeLINT.mk to only do certain things for certain architectures.
Note that neither arm nor mips have the Makefile there, thus essentially not (yet) supporting LINT. This would enable them do add special treatment to sys/conf/makeLINT.mk as well chosing one of the many configurations as LINT.
This is a hack of doing this and keeping it in a separate commit will allow us to more easily identify and back it out.
Discussed on/with: arch, jhb (as part of the LINT-VIMAGE thread) MFC after: 1 month
|
201534 |
04-Jan-2010 |
imp |
Revert 200594. This file isn't intended for these sorts of things.
|
201443 |
03-Jan-2010 |
brooks |
Add vlan(4) to all GENERIC kernels.
MFC after: 1 week
|
201373 |
02-Jan-2010 |
marcel |
Change BUS_SPACE_MAXADDR from 2^32-1 to 2^64-1. 2^32-1 is representative for its origin, more than for its accuracy.
MFC after: 1 week
|
201269 |
30-Dec-2009 |
marcel |
Revamp bus_space access functions: o Optimize for memory mapped I/O by making all I/O port acceses function calls and marking the test for the IA64_BUS_SPACE_IO tag with __predict_false(). Implement the I/O port access functions in a new file, called bus_machdep.c. o Change the bus_space_handle_t for memory mapped I/O to the virtual address rather than the physical address. This eliminates the PA->VA translation for every I/O access. The handle for I/O port access is still the port number. o Move inb(), outb(), inw(), outw(), inl(), outl(), and their string variants from cpufunc.h and define them in bus.h. On ia64 these are not CPU functions at all. In bus.h they are merely aliases for the new I/O port access functions defined in bus_machdep.h. o Handle the ACPI resource bug in nexus_set_resource(). There we can do it once so that we don't have to worry about it whenever we need to write to an I/O port that is really a memory mapped address.
The upshot of this change is that the KBI is better defined and that I/O port access always involves a function call, allowing us to change the actual implementation without breaking the KBI. For memory mapped I/O the virtual address is abstracted, so that we can change the VA->PA mapping in the kernel without causing an KBI breakage. The exception at this time is for bus_space_map() and bus_space_unmap().
MFC after: 1 week.
|
201223 |
29-Dec-2009 |
rnoland |
Update d_mmap() to accept vm_ooffset_t and vm_memattr_t.
This replaces d_mmap() with the d_mmap2() implementation and also changes the type of offset to vm_ooffset_t.
Purge d_mmap2().
All driver modules will need to be rebuilt since D_VERSION is also bumped.
Reviewed by: jhb@ MFC after: Not in this lifetime...
|
201145 |
28-Dec-2009 |
antoine |
(S)LIST_HEAD_INITIALIZER takes a (S)LIST_HEAD as an argument. Fix some wrong usages. Note: this does not affect generated binaries as this argument is not used.
PR: 137213 Submitted by: Eygene Ryabinkin (initial version) MFC after: 1 month
|
201032 |
26-Dec-2009 |
marcel |
Use unordered memory loads and stores for the in* and out* family of functions.
|
200889 |
23-Dec-2009 |
marcel |
Export the bus, cpu and itc frequencies under the hw.freq sysctl node. The frequencies are in MHz (i.e. a value of 1000 represents 1GHz). The frequencies are rounded to the nearest whole MHz.
While here, rename and re-type bus_frequency, processor_frequency and itc_frequency to bus_freq, cpu_freq and itc_freq and make them static. As unsigned integers, the hw.freq.cpu sysctl can more easily be made generic (across all architectures) making porting easier.
MFC after: 3 days
|
200888 |
23-Dec-2009 |
marcel |
Add a bit definition for invalid timestamp in the record header.
|
200594 |
16-Dec-2009 |
dougb |
Add INCLUDE_CONFIG_FILE, and a note in comments about how to also include the comments with CONFIGARGS
|
200240 |
08-Dec-2009 |
marcel |
In exception_save, write-back ar.rnat after switching the backing- store. Writing to ar.bspstore is defined to leave ar.rnat undefined.
PR: ia64/120315 MFC after: 3 days
|
200207 |
07-Dec-2009 |
marcel |
Define struct pcpu_md as the only MD field of struct pcpu (pc_acpi_id excluded, as it's used by MI code) and mode the sysctl variables from pcpu_stats to pcpu_md. Adjust all references accordingly.
While nearby, change the PCPU sysctl tree so that they match the CPU device sysctl tree -- they are now children of a static node called "machdep.cpu" and are named only with their cpu ID.
|
200200 |
07-Dec-2009 |
marcel |
Allocate the VHPT for each CPU in cpu_mp_start(), rather than allocating MAXCPU VHPTs up-front. This allows us to max-out MAXCPU without memory waste -- MAXCPU is now 32 for SMP kernels.
This change also eliminates the VHPT scaling based in the total memory in the system. It's the workload that determines the best size of the VHPT. The workload can be affected by the amount of memory, but not necessarily. For example, there's no performance difference between VHPT sizes of 256KB, 512KB and 1MB when building the LINT kernel. This was observed with a system that has 8GB of memory. By default the kernel will allocate a 1MB VHPT. The user can tune the system with the "machdep.vhpt.log2size" tunable.
|
200051 |
03-Dec-2009 |
marcel |
Make sure bus space accesses use unorder memory loads and stores. Memory accesses are posted in program order by virtue of the uncacheable memory attribute. Since GCC, by default, adds acquire and release semantics to volatile memory loads and stores, we need to use inline assembly to guarantee it. With inline assembly, we don't need volatile pointers anymore.
Itanium does not support semaphore instructions to uncacheable memory.
|
199941 |
29-Nov-2009 |
marcel |
Move the sysctl related fields to the end of the structure and make them conditional upon _KERNEL. libkvm includes <sys/pcpu.h> and <sys/sysctl.h> does not expose the structure definitions to userland.
|
199893 |
28-Nov-2009 |
marcel |
Eliminate teh use of MAXCPU in static arrays of interrupt counters by adding statistics counters to the PCPU structure. Export the counters through sysctl by giving each PCPU structure its own sysctl context.
While here, fix cnt.v_intr by not just having it count clock interrupts, but every interrupt and add more counters for each interrupt source.
|
199868 |
27-Nov-2009 |
alc |
Simplify the invocation of vm_fault(). Specifically, eliminate the flag VM_FAULT_DIRTY. The information provided by this flag can be trivially inferred by vm_fault().
Discussed with: kib
|
199727 |
24-Nov-2009 |
marcel |
Improve upon revision 196196 by removing the newly added comment in the wrong place and instead add a KASSERT in the right place.
|
199719 |
23-Nov-2009 |
marcel |
Revert previous commit. The problem was not related to overrunning the kernel stack at all. The new USB stack simply caused a change in timing that triggered a firmware bug more often. The addition of PRINTF_BUFR_SIZE apparently triggered the same firmware bug even more reliably.
But even with KSTACK_PAGES=5, one instance of the firmware bug remained: booting with a CD inserted. This problem was run into by accident after installing Debian and having to boot FreeBSD to fixup the GPT partitioning (Thanks... not). After bumping KSTACK_PAGES to 5, it was pretty unbelievable that the stack was still being too small.
After updating the firmware we could boot with a CD inserted and KSTACK_PAGES could be lowered back to 4 pages without problems.
Note: It is believed to be a timing related firmware bug, because the machine check information showed access to the serial console on one CPU and access to the EHCI HCD on the other CPU. Since both are devices on the management unit and thus virtualized in some way, any execution trace that does not include concurrent access to the BMC from both CPUs is fine.
Note also that it's not understood exactly how increasing the kernel stack avoided hitting the firmware bug. A change in page faults does change timing, but it's not known if that's what's happening here.
In any case: the problem is being monitored. Reverting back to 4 pages for the kernel stack is preferred, because it makes it easier to switch to 16K pages (double the page size) without wasting too much memory by not being able to half the number of pages...
|
199574 |
20-Nov-2009 |
marcel |
No need to include opt_kstack_pages.h, because KSTACK_PAGES is already defined through genassym.c
|
199566 |
20-Nov-2009 |
marcel |
Add a seatbelt to the Nested TLB Fault handler to give us a chance to panic when we have an unexpected TLB fault while interrupt collection is disabled. Use a token rather than the actual address of the restart point to avoid the need for the movl instruction. The token is arbitrary. For the drummers: it's based on a single paradiddle.
|
199502 |
19-Nov-2009 |
marcel |
opt_* headers are included using the quoted form.
|
199135 |
10-Nov-2009 |
kib |
Extract the code that records syscall results in the frame into MD function cpu_set_syscall_retval().
Suggested by: marcel Reviewed by: marcel, davidxu PowerPC, ARM, ia64 changes: marcel Sparc64 tested and reviewed by: marius, also sunv reviewed MIPS tested by: gonzo MFC after: 1 month
|
198733 |
31-Oct-2009 |
marcel |
Reimplement the lazy FP context switching: o Move all code into a single file for easier maintenance. o Use a single global lock to avoid having to handle either multiple locks or race conditions. o Make sure to disable the high FP registers after saving or dropping them. o use msleep() to wait for the other CPU to save the high FP registers.
This change fixes the high FP inconsistency panics.
A single global lock typically serializes too much, which may be noticable when a lot of threads use the high FP registers, but in that case it's probably better to switch the high FP context synchronuously. Put differently: cpu_switch() should switch the high FP registers if the incoming and outgoing threads both use the high FP registers.
|
198507 |
27-Oct-2009 |
kib |
In r197963, a race with thread being selected for signal delivery while in kernel mode, and later changing signal mask to block the signal, was fixed for sigprocmask(2) and ptread_exit(3). The same race exists for sigreturn(2), setcontext(2) and swapcontext(2) syscalls.
Use kern_sigprocmask() instead of direct manipulation of td_sigmask to reschedule newly blocked signals, closing the race.
Reviewed by: davidxu Tested by: pho MFC after: 1 month
|
198452 |
24-Oct-2009 |
marcel |
Add PRINTF_BUFR_SIZE=128, since we have SMP by default. While here, fix tabulation.
|
198451 |
24-Oct-2009 |
marcel |
A 32KB kernel stack is not quite enough. The new USB stack is a bit more stack hungry as compared to the old one that my RX2660 gets a machine check and spontaneously reboots at the time the USB DVD drive is found and attached to CAM as a mass storage device. This doesn't happen always, but definitely varies per kernel build. Likewise when using a 128-byte printf buffer. The additional 128 bytes that printf needs seems to be enough to have the memory stack and register stack collide and causing a machine check.
Thus: Bump KSTACK_PAGES from 4 to 5.
|
198341 |
21-Oct-2009 |
marcel |
o Introduce vm_sync_icache() for making the I-cache coherent with the memory or D-cache, depending on the semantics of the platform. vm_sync_icache() is basically a wrapper around pmap_sync_icache(), that translates the vm_map_t argumument to pmap_t. o Introduce pmap_sync_icache() to all PMAP implementation. For powerpc it replaces the pmap_page_executable() function, added to solve the I-cache problem in uiomove_fromphys(). o In proc_rwmem() call vm_sync_icache() when writing to a page that has execute permissions. This assures that when breakpoints are written, the I-cache will be coherent and the process will actually hit the breakpoint. o This also fixes the Book-E PMAP implementation that was missing necessary locking while trying to deal with the I-cache coherency in pmap_enter() (read: mmu_booke_enter_locked).
The key property of this change is that the I-cache is made coherent *after* writes have been done. Doing it in the PMAP layer when adding or changing a mapping means that the I-cache is made coherent *before* any writes happen. The difference is key when the I-cache prefetches.
|
198338 |
21-Oct-2009 |
marcel |
o Align function on a 32-byte boundary so that the core's front-end can deliver 2 bundles per cycle to the back-end. o Mark syscall stubs with a special unwind ABI tag so that unwind libraries know how to unwind.
|
197933 |
10-Oct-2009 |
kib |
Define architectural load bases for PIE binaries. Addresses were selected by looking at the bases used for non-relocatable executables by gnu ld(1), and adjusting it slightly.
Discussed with: bz Reviewed by: kan Tested by: bz (i386, amd64), bsam (linux) MFC after: some time
|
197729 |
03-Oct-2009 |
bz |
Make sure that the primary native brandinfo always gets added first and the native ia32 compat as middle (before other things). o(ld)brandinfo as well as third party like linux, kfreebsd, etc. stays on SI_ORDER_ANY coming last.
The reason for this is only to make sure that even in case we would overflow the MAX_BRANDS sized array, the native FreeBSD brandinfo would still be there and the system would be operational.
Reviewed by: kib MFC after: 1 month
|
197316 |
18-Sep-2009 |
alc |
Add a new sysctl for reporting all of the supported page sizes.
Reviewed by: jhb MFC after: 3 weeks
|
196994 |
08-Sep-2009 |
phk |
Get rid of the _NO_NAMESPACE_POLLUTION kludge by creating an architecture specific include file containing the _ALIGN* stuff which <sys/socket.h> needs.
|
196268 |
16-Aug-2009 |
marcel |
Decouple ACPI CPU Ids from FreeBSD's cpuid. The ACPI Ids can be sparse, which causes a kernel assert.
Approved by: re (kensmith)
|
196196 |
13-Aug-2009 |
attilio |
* Completely Remove the option STOP_NMI from the kernel. This option has proven to have a good effect when entering KDB by using a NMI, but it completely violates all the good rules about interrupts disabled while holding a spinlock in other occasions. This can be the cause of deadlocks on events where a normal IPI_STOP is expected. * Adds an new IPI called IPI_STOP_HARD on all the supported architectures. This IPI is responsible for sending a stop message among CPUs using a privileged channel when disponible. In other cases it just does match a normal IPI_STOP. Right now the IPI_STOP_HARD functionality uses a NMI on ia32 and amd64 architectures, while on the other has a normal IPI_STOP effect. It is responsibility of maintainers to eventually implement an hard stop when necessary and possible. * Use the new IPI facility in order to implement a new userend SMP kernel function called stop_cpus_hard(). That is specular to stop_cpu() but it does use the privileged channel for the stopping facility. * Let KDB use the newly introduced function stop_cpus_hard() and leave stop_cpus() for all the other cases * Disable interrupts on CPU0 when starting the process of APs suspension. * Style cleanup and comments adding
This patch should fix the reboot/shutdown deadlocks many users are constantly reporting on mailing lists.
Please don't forget to update your config file with the STOP_NMI option removal
Reviewed by: jhb Tested by: pho, bz, rink Approved by: re (kib)
|
195840 |
24-Jul-2009 |
jhb |
Add a new type of VM object: OBJT_SG. An OBJT_SG object is very similar to a device pager (OBJT_DEVICE) object in that it uses fictitious pages to provide aliases to other memory addresses. The primary difference is that it uses an sglist(9) to determine the physical addresses for a given offset into the object instead of invoking the d_mmap() method in a device driver.
Reviewed by: alc Approved by: re (kensmith) MFC after: 2 weeks
|
195649 |
12-Jul-2009 |
alc |
Add support to the virtual memory system for configuring machine- dependent memory attributes:
Rename vm_cache_mode_t to vm_memattr_t. The new name reflects the fact that there are machine-dependent memory attributes that have nothing to do with controlling the cache's behavior.
Introduce vm_object_set_memattr() for setting the default memory attributes that will be given to an object's pages.
Introduce and use pmap_page_{get,set}_memattr() for getting and setting a page's machine-dependent memory attributes. Add full support for these functions on amd64 and i386 and stubs for them on the other architectures. The function pmap_page_set_memattr() is also responsible for any other machine-dependent aspects of changing a page's memory attributes, such as flushing the cache or updating the direct map. The uses include kmem_alloc_contig(), vm_page_alloc(), and the device pager:
kmem_alloc_contig() can now be used to allocate kernel memory with non-default memory attributes on amd64 and i386.
vm_page_alloc() and the device pager will set the memory attributes for the real or fictitious page according to the object's default memory attributes.
Update the various pmap functions on amd64 and i386 that map pages to incorporate each page's memory attributes in the mapping.
Notes: (1) Inherent to this design are safety features that prevent the specification of inconsistent memory attributes by different mappings on amd64 and i386. In addition, the device pager provides a warning when a device driver creates a fictitious page with memory attributes that are inconsistent with the real page that the fictitious page is an alias for. (2) Storing the machine-dependent memory attributes for amd64 and i386 as a dedicated "int" in "struct md_page" represents a compromise between space efficiency and the ease of MFCing these changes to RELENG_7.
In collaboration with: jhb
Approved by: re (kib)
|
195625 |
11-Jul-2009 |
marcel |
On exec(2), when loading the ELF image, pmap_enter_object() is called to prefault pages. This is an obvious place for making sure the I-cache is coherent. It was missing though. As such, execution over NFS and ZFS file systems was failing. NFS was fixed the wrong way (by flushing the D-cache as part of the NFS code) in a previous commit. ZFS problems were encountered after that and indicated that something else was wrong...
Approved by: re (kib)
|
195376 |
05-Jul-2009 |
sam |
Cleanup ALIGNED_POINTER: o add to platforms where it was missing (arm, i386, powerpc, sparc64, sun4v) o define as "1" on amd64 and i386 where there is no restriction o make the type returned consistent with ALIGN o remove _ALIGNED_POINTER o make associated comments consistent
Reviewed by: bde, imp, marcel Approved by: re (kensmith)
|
195295 |
02-Jul-2009 |
ed |
Enable POSIX semaphores on all non-embedded architectures by default.
More applications (including Firefox) seem to depend on this nowadays, so not having this enabled by default is a bad idea.
Proposed by: miwi Patch by: Florian Smeets <flo kasimir com> Approved by: re (kib)
|
195060 |
26-Jun-2009 |
alc |
Correct the #endif comment.
Noticed by: jmallett Approved by: re (kib)
|
195033 |
26-Jun-2009 |
alc |
This change is the next step in implementing the cache control functionality required by video card drivers. Specifically, this change introduces vm_cache_mode_t with an appropriate VM_CACHE_DEFAULT definition on all architectures. In addition, this changes adds a vm_cache_mode_t parameter to kmem_alloc_contig() and vm_phys_alloc_contig(). These will be the interfaces for allocating mapped kernel memory and physical memory, respectively, with non-default cache modes.
In collaboration with: jhb
|
194784 |
23-Jun-2009 |
jeff |
Implement a facility for dynamic per-cpu variables. - Modules and kernel code alike may use DPCPU_DEFINE(), DPCPU_GET(), DPCPU_SET(), etc. akin to the statically defined PCPU_*. Requires only one extra instruction more than PCPU_* and is virtually the same as __thread for builtin and much faster for shared objects. DPCPU variables can be initialized when defined. - Modules are supported by relocating the module's per-cpu linker set over space reserved in the kernel. Modules may fail to load if there is insufficient space available. - Track space available for modules with a one-off extent allocator. Free may block for memory to allocate space for an extent.
Reviewed by: jhb, rwatson, kan, sam, grehan, marius, marcel, stas
|
194524 |
20-Jun-2009 |
marcel |
Drop the high FP state of an exiting thread in cpu_thread_exit() and not in cpu_exit(). The latter is called after td_md.md_highfp_mtx has been destroyed, which results in a race condition when another thread wants to use the high FP registers on the CPU that still has the high FP registers in question.
|
193530 |
05-Jun-2009 |
jkim |
Import ACPICA 20090521.
|
193334 |
02-Jun-2009 |
rwatson |
Remove MAC kernel config files and add "options MAC" to GENERIC, with the goal of shipping 8.0 with MAC support in the default kernel. No policies will be compiled in or enabled by default, but it will now be possible to load them at boot or runtime without a kernel recompile.
While the framework is not believed to impose measurable overhead when no policies are loaded (a result of optimization over the past few months in HEAD), we'll continue to benchmark and optimize as the release approaches. Please keep an eye out for performance or functionality regressions that could be a result of this change.
Approved by: re (kensmith) Obtained from: TrustedBSD Project
|
193066 |
29-May-2009 |
jamie |
Place hostnames and similar information fully under the prison system. The system hostname is now stored in prison0, and the global variable "hostname" has been removed, as has the hostname_mtx mutex. Jails may have their own host information, or they may inherit it from the parent/system. The proper way to read the hostname is via getcredhostname(), which will copy either the hostname associated with the passed cred, or the system hostname if you pass NULL. The system hostname can still be accessed directly (and without locking) at prison0.pr_host, but that should be avoided where possible.
The "similar information" referred to is domainname, hostid, and hostuuid, which have also become prison parameters and had their associated global variables removed.
Approved by: bz (mentor)
|
193018 |
29-May-2009 |
ed |
Last minute TTY API change: remove mutex argument from tty_alloc().
I don't want people to override the mutex when allocating a TTY. It has to be there, to keep drivers like syscons happy. So I'm creating a tty_alloc_mutex() which can be used in those cases. tty_alloc_mutex() should eventually be removed.
The advantage of this approach, is that we can just remove a function, without breaking the regular API in the future.
|
192918 |
27-May-2009 |
rink |
ia64: Move MCA information retrieval to a per-CPU kthread
Once AP's are launched, their MCA state information is stored and later obtainable using a sysctl. Since the size of the MCA state information is unknown, it will be malloc'ed as needed. However, when 'ia64_ap_startup' runs, it's not yet safe to call malloc and this may cause 'panic: blockable sleep lock (sleep mutex) 8192 @ /usr/src/sys/vm/uma_core.c'. This commit avoids this issue by scheduling a separate kthread to obtain this information, which immediately terminates afterwards.
|
192324 |
18-May-2009 |
marcel |
Rename ia64_invalidate_icache() to ia64_sync_icache(). We're not invalidating anything.
|
192323 |
18-May-2009 |
marcel |
Add cpu_flush_dcache() for use after non-DMA based I/O so that a possible future I-cache coherency operation can succeed. On ARM for example the L1 cache can be (is) virtually mapped, which means that any I/O that uses temporary mappings will not see the I-cache made coherent. On ia64 a similar behaviour has been observed. By flushing the D-cache, execution of binaries backed by md(4) and/or NFS work reliably. For Book-E (powerpc), execution over NFS exhibits SIGILL once in a while as well, though cpu_flush_dcache() hasn't been implemented yet.
Doing an explicit D-cache flush as part of the non-DMA based I/O read operation eliminates the need to do it as part of the I-cache coherency operation itself and as such avoids pessimizing the DMA-based I/O read operations for which D-cache are already flushed/invalidated. It also allows future optimizations whereby the bcopy() followed by the D-cache flush can be integrated in a single operation, which could be implemented using on-chips DMA engines, by-passing the D-cache altogether.
|
191954 |
10-May-2009 |
kuriyama |
- Use "device\t" and "options \t" for consistency.
|
191449 |
24-Apr-2009 |
marcel |
Remove isa_irq_pending(). It's not used.
|
191309 |
20-Apr-2009 |
rwatson |
Don't conditionally define CACHE_LINE_SHIFT, as we anticipate sizing a fair number of static data structures, making this an unlikely option to try to change without also changing source code. [1]
Change default cache line size on ia64, sparc64, and sun4v to 128 bytes, as this was what rtld-elf was already using on those platforms. [2]
Suggested by: bde [1], jhb [2] MFC after: 2 weeks
|
191278 |
19-Apr-2009 |
rwatson |
Add description and cautionary note regarding CACHE_LINE_SIZE.
MFC after: 2 weeks Suggested by: alc
|
191276 |
19-Apr-2009 |
rwatson |
For each architecture, define CACHE_LINE_SHIFT and a derived CACHE_LINE_SIZE constant. These constants are intended to over-estimate the cache line size, and be used at compile-time when a run-time tuning alternative isn't appropriate or available.
Defaults for all architectures are 64 bytes, except powerpc where it is 128 bytes (used on G5 systems).
MFC after: 2 weeks Discussed on: arch@
|
191201 |
17-Apr-2009 |
jhb |
Restore bus DMA bounce pages to an offset of 0 when they are released by a tag that has BUS_DMA_KEEP_PG_OFFSET set. Otherwise the page could be reused with a non-zero offset by a tag that doesn't have BUS_DMA_KEEP_PG_OFFSET leading to data corruption.
Sleuthing by: avg Reviewed by: scottl
|
191011 |
13-Apr-2009 |
kib |
The bus_dmamap_load_uio(9) shall use pmap of the thread recorded in the uio_td to extract pages from, instead of unconditionally use kernel pmap.
Submitted by: Jason Harmening <jason.harmening gmail com> (amd64 version) PR: amd64/133592 Reviewed by: scottl (original patch), jhb MFC after: 2 weeks
|
190708 |
05-Apr-2009 |
dchagin |
Fix KBI breakage by r190520 which affects older linux.ko binaries:
1) Move the new field (brand_note) to the end of the Brandinfo structure. 2) Add a new flag BI_BRAND_NOTE that indicates that the brand_note pointer is valid. 3) Use the brand_note field if the flag BI_BRAND_NOTE is set and as old modules won't have the flag set, so the new field brand_note would be ignored.
Suggested by: jhb Reviewed by: jhb Approved by: kib (mentor) MFC after: 6 days
|
190631 |
01-Apr-2009 |
kib |
Add trivial implementation for the freebsd32_sysarch on ia64. Fix comapt32 and LINT build on ia64.
Discussed with: jhb
|
189926 |
17-Mar-2009 |
kib |
Add AT_EXECPATH ELF auxinfo entry type. The value's a_ptr is a pointer to the full path of the image that is being executed. Increase AT_COUNT.
Remove no longer true comment about types used in Linux ELF binaries, listed types contain FreeBSD-specific entries.
Reviewed by: kan
|
189771 |
13-Mar-2009 |
dchagin |
Implement new way of branding ELF binaries by looking to a ".note.ABI-tag" section.
The search order of a brand is changed, now first of all the ".note.ABI-tag" is looked through.
Move code which fetch osreldate for ELF binary to check_note() handler.
PR: 118473 Approved by: kib (mentor)
|
188944 |
23-Feb-2009 |
thompsa |
Change over the usb kernel options to the new stack (retaining existing naming). The old usb stack can be compiled in my prefixing the name with 'o'.
|
188665 |
15-Feb-2009 |
thompsa |
Add uslcom to the build too.
Reminded by: Michael Butler
|
188660 |
15-Feb-2009 |
thompsa |
Switch over GENERIC kernels to USB2 by default.
Tested by: make universe
|
188453 |
10-Feb-2009 |
marcel |
Mark the BSP as being awake. This supresses the message that not all usable CPUs could be woken up...
|
188350 |
08-Feb-2009 |
imp |
When bouncing pages, allow a new option to preserve the intra-page offset. This is needed for the ehci hardware buffer rings that assume this behavior.
This is an interim solution, and a more general one is being worked on. This solution doesn't break anything that doesn't ask for it directly. The mbuf and uio variants with this flag likely don't work and haven't been tested.
Universe builds with these changes. I don't have a huge-memory machine to test these changes with, but will be happy to work with folks that do and hps if this changes turns out not to be sufficient.
Submitted by: alfred@ from Hans Peter Selasky's original
|
188274 |
07-Feb-2009 |
wkoszek |
Don't forget to create opt_agp.h on ia64, which also uses agp(4).
|
188119 |
04-Feb-2009 |
jhb |
Tweak the ia64 machine check handling code to not register new sysctl nodes while holding a spin mutex. Instead, it now shoves the machine check records onto a queue that is later drained to add sysctl nodes for each record. While a routine to drain the queue is present, it is not currently called.
Reviewed by: marcel
|
187381 |
18-Jan-2009 |
alc |
Correct an error in revision 1.170 of this file. When get_pv_entry() is forced to reclaim pv entries, the one pv entry that it returns should not be freed.
|
186212 |
17-Dec-2008 |
imp |
AT_DEBUG and AT_BRK were OBE like 10 years ago, so retire them.
Reviewed by: peter
|
185567 |
02-Dec-2008 |
ed |
Remove "[KEEP THIS!]" from COMPAT_43TTY. It's not really that important.
Sgtty is a programming interface that has been replaced by termios over the years. In June we already removed <sgtty.h>, which exposes the ioctl()'s that are implemented by this interface. The importance of this flag is overrated right now.
|
185169 |
22-Nov-2008 |
kib |
Add sv_flags field to struct sysentvec with intention to provide description of the ABI of the currently executing image. Change some places to test the flags instead of explicit comparing with address of known sysentvec structures to determine ABI features.
Discussed with: dchagin, imp, jhb, peter
|
185163 |
22-Nov-2008 |
marcel |
Define mb(), rmb() and wmb() for real.
|
185162 |
22-Nov-2008 |
kmacy |
- bump __FreeBSD version to reflect added buf_ring, memory barriers, and ifnet functions
- add memory barriers to <machine/atomic.h> - update drivers to only conditionally define their own
- add lockless producer / consumer ring buffer - remove ring buffer implementation from cxgb and update its callers
- add if_transmit(struct ifnet *ifp, struct mbuf *m) to ifnet to allow drivers to efficiently manage multiple hardware queues (i.e. not serialize all packets through one ifq) - expose if_qflush to allow drivers to flush any driver managed queues
This work was supported by Bitgravity Inc. and Chelsio Inc.
|
184205 |
23-Oct-2008 |
des |
Retire the MALLOC and FREE macros. They are an abomination unto style(9).
MFC after: 3 months
|
184062 |
19-Oct-2008 |
marcel |
Atomically increment the number of awoken APs as all APs will be unleashed here.
Pointed out by: christian.kandeler@hob.de
|
183527 |
01-Oct-2008 |
peter |
Collect N identical (or near identical) mkdumpheader() implementations into one, as threatened in the comment. Textdump magic can be passed in.
|
183439 |
28-Sep-2008 |
marius |
Remove ipi_all() and ipi_self() as the former hasn't been used at all to date and the latter also is only used in ia64 and powerpc code which no longer serves a real purpose after bring-up and just can be removed as well. Note that architectures like sun4u also provide no means of implementing IPI'ing a CPU itself natively in the first place.
Suggested by: jhb Reviewed by: arch, grehan, jhb
|
183397 |
27-Sep-2008 |
ed |
Replace all calls to minor() with dev2unit().
After I removed all the unit2minor()/minor2unit() calls from the kernel yesterday, I realised calling minor() everywhere is quite confusing. Character devices now only have the ability to store a unit number, not a minor number. Remove the confusion by using dev2unit() everywhere.
This commit could also be considered as a bug fix. A lot of drivers call minor(), while they should actually be calling dev2unit(). In -CURRENT this isn't a problem, but it turns out we never had any problem reports related to that issue in the past. I suspect not many people connect more than 256 pieces of the same hardware.
Reviewed by: kib
|
183322 |
24-Sep-2008 |
kib |
Change the static struct sysentvec and struct Elf_Brandinfo initializers to the C99 style. At least, it is easier to read sysent definitions that way, and search for the actual instances of sigcode etc.
Explicitely initialize sysentvec.sv_maxssiz that was missed in most sysvecs.
No objection from: jhb MFC after: 1 month
|
183299 |
23-Sep-2008 |
obrien |
The kernel implemented 'memcmp' is an alias for 'bcmp'. However, memcmp and bcmp are not the same thing. 'man bcmp' states that the return is "non-zero" if the two byte strings are not identical. Where as, 'man memcmp' states that the return is the "difference between the first two differing bytes (treated as unsigned char values" if the two byte strings are not identical.
So provide a proper memcmp(9), but it is a C implementation not a tuned assembly implementation. Therefore bcmp(9) should be preferred over memcmp(9).
|
181905 |
20-Aug-2008 |
ed |
Integrate the new MPSAFE TTY layer to the FreeBSD operating system.
The last half year I've been working on a replacement TTY layer for the FreeBSD kernel. The new TTY layer was designed to improve the following:
- Improved driver model:
The old TTY layer has a driver model that is not abstract enough to make it friendly to use. A good example is the output path, where the device drivers directly access the output buffers. This means that an in-kernel PPP implementation must always convert network buffers into TTY buffers.
If a PPP implementation would be built on top of the new TTY layer (still needs a hooks layer, though), it would allow the PPP implementation to directly hand the data to the TTY driver.
- Improved hotplugging:
With the old TTY layer, it isn't entirely safe to destroy TTY's from the system. This implementation has a two-step destructing design, where the driver first abandons the TTY. After all threads have left the TTY, the TTY layer calls a routine in the driver, which can be used to free resources (unit numbers, etc).
The pts(4) driver also implements this feature, which means posix_openpt() will now return PTY's that are created on the fly.
- Improved performance:
One of the major improvements is the per-TTY mutex, which is expected to improve scalability when compared to the old Giant locking. Another change is the unbuffered copying to userspace, which is both used on TTY device nodes and PTY masters.
Upgrading should be quite straightforward. Unlike previous versions, existing kernel configuration files do not need to be changed, except when they reference device drivers that are listed in UPDATING.
Obtained from: //depot/projects/mpsafetty/... Approved by: philip (ex-mentor) Discussed: on the lists, at BSDCan, at the DevSummit Sponsored by: Snow B.V., the Netherlands dcons(4) fixed by: kan
|
181875 |
19-Aug-2008 |
jhb |
Export 'struct pcpu' to userland w/o requiring _KERNEL. A few ports already define _KERNEL to get to this and I'm about to add hooks to libkvm to access per-CPU data.
MFC after: 1 week
|
181803 |
17-Aug-2008 |
bz |
Commit step 1 of the vimage project, (network stack) virtualization work done by Marko Zec (zec@).
This is the first in a series of commits over the course of the next few weeks.
Mark all uses of global variables to be virtualized with a V_ prefix. Use macros to map them back to their global names for now, so this is a NOP change only.
We hope to have caught at least 85-90% of what is needed so we do not invalidate a lot of outstanding patches again.
Obtained from: //depot/projects/vimage-commit2/... Reviewed by: brooks, des, ed, mav, julian, jamie, kris, rwatson, zec, ... (various people I forgot, different versions) md5 (with a bit of help) Sponsored by: NLnet Foundation, The FreeBSD Foundation X-MFC after: never V_Commit_Message_Reviewed_By: more people than the patch
|
180533 |
15-Jul-2008 |
alc |
Update bus_dmamem_alloc()'s first call to malloc() such that M_WAITOK is specified when appropriate.
Reviewed by: scottl
|
180359 |
07-Jul-2008 |
delphij |
Add HWPMC_HOOKS to GENERIC kernels, this makes hwpmc.ko work out of the box.
|
180354 |
07-Jul-2008 |
marcel |
Add inline function ia64_fc_i() to abstract inline assembly. Use the new inline function in ia64_invalidate_icache(). While there, add proper synchronization so that we know the fc.i instructions have taken effect when we return.
|
180209 |
03-Jul-2008 |
peter |
Exclude .cvsignore files from $FreeBSD$ checking
|
179990 |
25-Jun-2008 |
ed |
Remove the unused major/minor numbers from iodev and memdev.
Now that st_rdev is being automatically generated by the kernel, there is no need to define static major/minor numbers for the iodev and memdev. We still need the minor numbers for the memdev, however, to distinguish between /dev/mem and /dev/kmem.
Approved by: philip (mentor)
|
179382 |
28-May-2008 |
marcel |
Work-around a compiler optimization bug, that broke libthr. Massive inlining resulted in constant propagation to the extend that cmpval was known to the compiler to be URWLOCK_WRITE_OWNER (= 0x80000000U). Unfortunately, instead of zero-extending the unsigned constant, it was sign-extended. As such, the cmpxchg instruction was comparing 0x0000000080000000LU to 0xffffffff80000000LU and obviously didn't perform the exchange. But, since the value returned by cmpxhg equalled cmpval (when zero- extended), the _thr_rtld_lock_release() function thought the exchange did happen and as such returned as if having released the lock. This was not the case. Subsequent locking requests found rw_state non-zero and the thread in question entered the kernel and block indefinitely.
The work-around is to zero-extend by casting to uint64_t.
|
179256 |
23-May-2008 |
marcel |
Account for IPI_PREEMPT. We don't want to call sched_preempt() with interrupts disabled or with td_intr_nesting_level non-zero.
|
179229 |
23-May-2008 |
alc |
The VM system no longer uses setPQL2(). Remove it and its helpers.
|
179190 |
22-May-2008 |
marcel |
Create the bucket mutexes with MTX_NOWITNESS. There's now a hard limit of 512 pending mutexes in the witness code and we can easily have 1 million bucket mutexes initialized before witness is up and running. Bumping the limit from 512 to 1M is not really an option here...
|
179173 |
21-May-2008 |
marcel |
We can call ia64_flush_dirty() when the corresponding process is locked or not. As such, use PROC_LOCKED() to determine which case it is and lock the process when not.
|
179081 |
18-May-2008 |
alc |
Retire pmap_addr_hint(). It is no longer used.
|
178893 |
09-May-2008 |
alc |
Add a stub for pmap_align_superpage() on machines that don't (yet) implement pmap-level support for superpages.
|
178494 |
25-Apr-2008 |
marcel |
Unbreak previous commit. While here, refactor the code a bit.
|
178471 |
25-Apr-2008 |
jeff |
- Add an integer argument to idle to indicate how likely we are to wake from idle over the next tick. - Add a new MD routine, cpu_wake_idle() to wakeup idle threads who are suspended in cpu specific states. This function can fail and cause the scheduler to fall back to another mechanism (ipi). - Implement support for mwait in cpu_idle() on i386/amd64 machines that support it. mwait is a higher performance way to synchronize cpus as compared to hlt & ipis. - Allow selecting the idle routine by name via sysctl machdep.idle. This replaces machdep.cpu_idle_hlt. Only idle routines supported by the current machine are permitted.
Sponsored by: Nokia
|
178429 |
22-Apr-2008 |
phk |
Now that all platforms use genclock, shuffle things around slightly for better structure.
Much of this is related to <sys/clock.h>, which should really have been called <sys/calendar.h>, but unless and until we need the name, the repocopy can wait.
In general the kernel does not know about minutes, hours, days, timezones, daylight savings time, leap-years and such. All that is theoretically a matter for userland only.
Parts of kernel code does however care: badly designed filesystems store timestamps in local time and RTC chips almost universally track time in a YY-MM-DD HH:MM:SS format, and sometimes in local timezone instead of UTC. For this we have <sys/clock.h>
<sys/time.h> on the other hand, deals with time_t, timeval, timespec and so on. These know only seconds and fractions thereof.
Move inittodr() and resettodr() prototypes to <sys/time.h>. Retain the names as it is one of the few surviving PDP/VAX references.
Move startrtclock() to <machine/clock.h> on relevant platforms, it is a MD call between machdep.c/clock.c. Remove references to it elsewhere.
Remove a lot of unnecessary <sys/clock.h> includes.
Move the machdep.disable_rtc_set sysctl to subr_rtc.c where it belongs. XXX: should be kern.disable_rtc_set really, it's not MD.
|
178372 |
21-Apr-2008 |
phk |
Make genclock standard on all platforms.
Thanks to: grehan & marcel for platform support on ia64 and ppc.
|
178309 |
19-Apr-2008 |
marcel |
Sanitize the malloc types: M_PMAP is not used in pmap.c, so don't define it there. Don't use M_PMAP in mp_machdep.c; define M_SMP instead.
|
178296 |
18-Apr-2008 |
marcel |
Remove cruft we got from Alpha, which was probably inherited from NetBSD. I.e. make it more like a FreeBSD header.
|
178222 |
15-Apr-2008 |
marcel |
Use genclock for RTC handling. This eliminates the MD versions for inittodr() and resettodr(). Have nexus double as the clock device, because it's the firmware that provides RTC services. We could create a special (pseudo-) device for it, but that wasn't superior enough to actually do it. Maybe later...
Requested by: phk
|
178215 |
15-Apr-2008 |
marcel |
Support and switch to the ULE scheduler: o Implement IPI_PREEMPT, o Set td_lock for the thread being switched out, o For ULE & SMP, loop while td_lock points to blocked_lock for the thread being switched in, o Enable ULE by default in GENERIC and SKI,
|
178206 |
14-Apr-2008 |
marcel |
Revision 1.9 changes the delivery mode from the magic constant 0 (i.e. fixed delivery) to SAPIC_DELMODE_LOWPRI. While the commit log doesn't mention the change in behaviour, it is believed to be deliberate. In the last 5.5 years this hasn't been a problem. Nor do I think did it make any difference, but who knows. However, I do know that it break SMP support for Montecito-based machines. Switch back to fixed-CPU delivery so that SMP works again. This gives me some time to look more closely at the problem, as well as make sure the I-cache validation as it's implemented currently is sufficient in SMP configurations...
|
178131 |
11-Apr-2008 |
jeff |
- Pass the irq and not the vector to intr_event_create().
Reviewed by: marcel
|
178092 |
11-Apr-2008 |
jeff |
- Add the interrupt vector number to intr_event_create so MI code can lookup hard interrupt events by number. Ignore the irq# for soft intrs. - Add support to cpuset for binding hardware interrupts. This has the side effect of binding any ithread associated with the hard interrupt. As per restrictions imposed by MD code we can only bind interrupts to a single cpu presently. Interrupts can be 'unbound' by binding them to all cpus.
Reviewed by: jhb Sponsored by: Nokia
|
178028 |
09-Apr-2008 |
marcel |
Unbreak after removal of SI_SUB_MOUNT_ROOT.
|
177940 |
05-Apr-2008 |
jhb |
Add a MI intr_event_handle() routine for the non-INTR_FILTER case. This allows all the INTR_FILTER #ifdef's to be removed from the MD interrupt code. - Rename the intr_event 'eoi', 'disable', and 'enable' hooks to 'post_filter', 'pre_ithread', and 'post_ithread' to be less x86-centric. Also, add a comment describe what the MI code expects them to do. - On amd64, i386, and powerpc this is effectively a NOP. - On arm, don't bother masking the interrupt unless the ithread is scheduled in the non-INTR_FILTER case to match what INTR_FILTER did. Also, don't bother unmasking the interrupt in the post_filter case if we never masked it. The INTR_FILTER case had been doing this by having arm_unmask_irq for the post_filter (formerly 'eoi') hook. - On ia64, stray interrupts are now masked for the non-INTR_FILTER case. They were already masked in the INTR_FILTER case. - On sparc64, use the a NULL pre_ithread hook and use intr_enable_eoi() for both the 'post_filter' and 'post_ithread' hooks to match what the non-INTR_FILTER code did. - On sun4v, retire the ithread wrapper hack by using an appropriate 'post_ithread' hook instead (it's what 'post_ithread'/'enable' was designed to do even in 5.x).
Glanced at by: piso Reviewed by: marius Requested by: marius [1], [5] Tested on: amd64, i386, arm, sparc64
|
177769 |
30-Mar-2008 |
marcel |
Better implement I-cache invalidation. The previous implementation was a kluge. This implementation matches the behaviour on powerpc and sparc64. While on the subject, make sure to invalidate the I-cache after loading a kernel module.
MFC after: 2 weeks
|
177662 |
27-Mar-2008 |
dfr |
Add kernel module support for nfslockd and krpc. Use the module system to detect (or load) kernel NLM support in rpc.lockd. Remove the '-k' option to rpc.lockd and make kernel NLM the default. A user can still force the use of the old user NLM by building a kernel without NFSLOCKD and/or removing the nfslockd.ko module.
|
177661 |
27-Mar-2008 |
jb |
When building a kernel module, define MAXCPU the same as SMP so that modules work with and without SMP.
|
177642 |
26-Mar-2008 |
phk |
The "free-lance" timer in the i8254 is only used for the speaker these days, so de-generalize the acquire_timer/release_timer api to just deal with speakers.
The new (optional) MD functions are: timer_spkr_acquire() timer_spkr_release() and timer_spkr_setfreq()
the last of which configures the timer to generate a tone of a given frequency, in Hz instead of 1/1193182th of seconds.
Drop entirely timer2 on pc98, it is not used anywhere at all.
Move sysbeep() to kern/tty_cons.c and use the timer_spkr*() if they exist, and do nothing otherwise.
Remove prototypes and empty acquire-/release-timer() and sysbeep() functions from the non-beeping archs.
This eliminate the need for the speaker driver to know about i8254frequency at all. In theory this makes the speaker driver MI, contingent on the timer_spkr_*() functions existing but the driver does not know this yet and still attaches to the ISA bus.
Syscons is more tricky, in one function, sc_tone(), it knows the hz and things are just fine.
In the other function, sc_bell() it seems to get the period from the KDMKTONE ioctl in terms if 1/1193182th second, so we hardcode the 1193182 and leave it at that. It's probably not important.
Change a few other sysbeep() uses which obviously knew that the argument was in terms of i8254 frequency, and leave alone those that look like people thought sysbeep() took frequency in hertz.
This eliminates the knowledge of i8254_freq from all but the actual clock.c code and the prof_machdep.c on amd64 and i386, where I think it would be smart to ask for help from the timecounters anyway [TBD].
|
177325 |
17-Mar-2008 |
jhb |
Simplify the interrupt code a bit: - Always include the ie_disable and ie_eoi methods in 'struct intr_event' and collapse down to one intr_event_create() routine. The disable and eoi hooks simply aren't used currently in the !INTR_FILTER case. - Expand 'disab' to 'disable' in a few places. - Use function casts for arm and i386:intr_eoi_src() instead of wrapper routines since to trim one extra indirection.
Compiled on: {arm,amd64,i386,ia64,ppc,sparc64} x {FILTER, !FILTER} Tested on: {amd64,i386} x {FILTER, !FILTER}
|
177276 |
16-Mar-2008 |
pjd |
Implement atomic_fetchadd_long() for all architectures and document it.
Reviewed by: attilio, jhb, jeff, kris (as a part of the uidinfo_waitfree.patch)
|
177253 |
16-Mar-2008 |
rwatson |
In keeping with style(9)'s recommendations on macros, use a ';' after each SYSINIT() macro invocation. This makes a number of lightweight C parsers much happier with the FreeBSD kernel source, including cflow's prcc and lxr.
MFC after: 1 month Discussed with: imp, rink
|
177215 |
15-Mar-2008 |
imp |
BUS_DMA_ISA is left over from Alpha, and is not used in the tree at all. The reference in ia64 code is due to cutNpaste in its history and can safely be removed.
Revired by: cognet, raj, marcel, jhb and maybe one other whom I'm forgetting
|
177181 |
14-Mar-2008 |
jhb |
Add preliminary support for binding interrupts to CPUs: - Add a new intr_event method ie_assign_cpu() that is invoked when the MI code wishes to bind an interrupt source to an individual CPU. The MD code may reject the binding with an error. If an assign_cpu function is not provided, then the kernel assumes the platform does not support binding interrupts to CPUs and fails all requests to do so. - Bind ithreads to CPUs on their next execution loop once an interrupt event is bound to a CPU. Only shared ithreads are bound. We currently leave private ithreads for drivers using filters + ithreads in the INTR_FILTER case unbound. - A new intr_event_bind() routine is used to bind an interrupt event to a CPU. - Implement binding on amd64 and i386 by way of the existing pic_assign_cpu PIC method. - For x86, provide a 'intr_bind(IRQ, cpu)' wrapper routine that looks up an interrupt source and binds its interrupt event to the specified CPU. MI code can currently (ab)use this by doing:
intr_bind(rman_get_start(irq_res), cpu);
however, I plan to add a truly MI interface (probably a bus_bind_intr(9)) where the implementation in the x86 nexus(4) driver would end up calling intr_bind() internally.
Requested by: kmacy, gallatin, jeff Tested on: {amd64, i386} x {regular, INTR_FILTER}
|
177157 |
13-Mar-2008 |
jhb |
Rework how the nexus(4) device works on x86 to better handle the idea of different "platforms" on x86 machines. The existing code already handles having two platforms: ACPI and legacy. However, the existing approach was rather hardcoded and difficult to extend. These changes take the approach that each x86 hardware platform should provide its own nexus(4) driver (it can inherit most of its behavior from the default legacy nexus(4) driver) which is responsible for probing for the platform and performing appropriate platform-specific setup during attach (such as adding a platform-specific bus device). This does mean changing the x86 platform busses to no longer use an identify routine for probing, but to move that logic into their matching nexus(4) driver instead. - Make the default nexus(4) driver in nexus.c on i386 and amd64 handle the legacy platform. It's probe routine now returns BUS_PROBE_GENERIC so it can be overriden. - Expose a nexus_init_resources() routine which initializes the various resource managers so that subclassed nexus(4) drivers can invoke it from their attach routine. - The legacy nexus(4) driver explicitly adds a legacy0 device in its attach routine. - The ACPI driver no longer contains an new-bus identify method. Instead it exposes a public function (acpi_identify()) which is a probe routine that the MD nexus(4) drivers can use to probe for ACPI. All of the probe logic in acpi_probe() is now moved into acpi_identify() and acpi_probe() is just a stub. - On i386 and amd64, an ACPI-specific nexus(4) driver checks for ACPI via acpi_identify() and claims the nexus0 device if the probe succeeds. It then explicitly adds an acpi0 device in its attach routine. - The legacy(4) driver no longer knows anything about the acpi0 device. - On ia64 if acpi_identify() fails you basically end up with no devices. This matches the previous behavior where the old acpi_identify() would fail to add an acpi0 device again leaving you with no devices.
Discussed with: imp Silence on: arch@
|
177126 |
12-Mar-2008 |
jeff |
- Fix build breakage; there was a reference to a removed syscall in a KASSERT(). Attempt to cleanup the comment to reflect reality.
|
177091 |
12-Mar-2008 |
jeff |
Remove kernel support for M:N threading.
While the KSE project was quite successful in bringing threading to FreeBSD, the M:N approach taken by the kse library was never developed to its full potential. Backwards compatibility will be provided via libmap.conf for dynamically linked binaries and static binaries will be broken.
|
176734 |
02-Mar-2008 |
jeff |
- Remove the old smp cpu topology specification with a new, more flexible tree structure that encodes the level of cache sharing and other properties. - Provide several convenience functions for creating one and two level cpu trees as well as a default flat topology. The system now always has some topology. - On i386 and amd64 create a seperate level in the hierarchy for HTT and multi-core cpus. This will allow the scheduler to intelligently load balance non-uniform cores. Presently we don't detect what level of the cache hierarchy is shared at each level in the topology. - Add a mechanism for testing common topologies that have more information than the MD code is able to provide via the kern.smp.topology tunable. This should be considered a debugging tool only and not a stable api.
Sponsored by: Nokia
|
176346 |
16-Feb-2008 |
marcel |
Re-sort options. While here: o remove COMPAT_FREEBSD5 o add INVARIANTS o add WITNESS
|
176286 |
14-Feb-2008 |
marcel |
On Montecito processors, the instruction cache is in fact not coherent with the data caches. Implement a quick fix to allow us to boot on Montecito, while I'm working on a better fix in the mean time.
Commit made on Montecito-based Itanium...
|
175959 |
04-Feb-2008 |
marcel |
Allocate a stack for thread0 and switch to it before calling mi_startup(). This frees up kstack for static PAL/SAL calls and double-fault handling.
|
175768 |
28-Jan-2008 |
ru |
Add a wrapper function that bound checks writes to the dump device.
|
175147 |
07-Jan-2008 |
jhb |
Add COMPAT_FREEBSD7 and enable it in configs that have COMPAT_FREEBSD6.
|
175067 |
03-Jan-2008 |
alc |
Add an access type parameter to pmap_enter(). It will be used to implement superpage promotion.
Correct a style error in kmem_malloc(): pmap_enter()'s last parameter is a Boolean.
|
175066 |
03-Jan-2008 |
imp |
Use correct function name in panic message
|
175065 |
03-Jan-2008 |
imp |
Fix obsolete comment. pmap_remove_all is the function we're in.
|
174938 |
27-Dec-2007 |
alc |
Add configuration knobs for the superpage reservation system. Initially, the reservation will only be enabled on amd64.
|
174898 |
25-Dec-2007 |
rwatson |
Add a new 'why' argument to kdb_enter(), and a set of constants to use for that argument. This will allow DDB to detect the broad category of reason why the debugger has been entered, which it can use for the purposes of deciding which DDB script to run.
Assign approximate why values to all current consumers of the kdb_enter() interface.
|
174405 |
07-Dec-2007 |
jkoshy |
Add stubs to unbreak LINT.
|
174326 |
06-Dec-2007 |
marcel |
Add a BSD disklabel backend to g_part: o Disklabels can have between 8 and 20 partitions (inclusive). o No device special file is created for the raw partition. o Switch ia64 to use this backend. o No support for boot code yet.
|
174195 |
02-Dec-2007 |
rwatson |
Break out stack(9) from ddb(4):
- Introduce per-architecture stack_machdep.c to hold stack_save(9). - Introduce per-architecture machine/stack.h to capture any common definitions required between db_trace.c and stack_machdep.c. - Add new kernel option "options STACK"; we will build in stack(9) if it is defined, or also if "options DDB" is defined to provide compatibility with existing users of stack(9).
Add new stack_save_td(9) function, which allows the capture of a stacktrace of another thread rather than the current thread, which the existing stack_save(9) was limited to. It requires that the thread be neither swapped out nor running, which is the responsibility of the consumer to enforce.
Update stack(9) man page.
Build tested: amd64, arm, i386, ia64, powerpc, sparc64, sun4v Runtime tested: amd64 (rwatson), arm (cognet), i386 (rwatson)
|
173988 |
27-Nov-2007 |
jhb |
Remove the 'needbounce' variable from the _bus_dmamap_load_buffer() routine. It is not needed as the existing tests for segment coalescing already handle bounced addresses and it prevents legal segment coalescing in certain edge cases.
MFC after: 1 week Reviewed by: scottl
|
173970 |
27-Nov-2007 |
jasone |
Define atomic_readandclear_ptr.
|
173799 |
21-Nov-2007 |
scottl |
Extend critical section coverage in the low-level interrupt handlers to include the ithread scheduling step. Without this, a preemption might occur in between the interrupt getting masked and the ithread getting scheduled. Since the interrupt handler runs in the context of curthread, the scheudler might see it as having a such a low priority on a busy system that it doesn't get to run for a _long_ time, leaving the interrupt stranded in a disabled state. The only way that the preemption can happen is by a fast/filter handler triggering a schduling event earlier in the handler, so this problem can only happen for cases where an interrupt is being shared by both a fast/filter handler and an ithread handler. Unfortunately, it seems to be common for this sharing to happen with network and USB devices, for example. This fixes many of the mysterious TCP session timeouts and NIC watchdogs that were being reported. Many thanks to Sam Lefler for getting to the bottom of this problem.
Reviewed by: jhb, jeff, silby
|
173708 |
17-Nov-2007 |
alc |
Prevent the leakage of wired pages in the following circumstances: First, a file is mmap(2)ed and then mlock(2)ed. Later, it is truncated. Under "normal" circumstances, i.e., when the file is not mlock(2)ed, the pages beyond the EOF are unmapped and freed. However, when the file is mlock(2)ed, the pages beyond the EOF are unmapped but not freed because they have a non-zero wire count. This can be a mistake. Specifically, it is a mistake if the sole reason why the pages are wired is because of wired, managed mappings. Previously, unmapping the pages destroys these wired, managed mappings, but does not reduce the pages' wire count. Consequently, when the file is unmapped, the pages are not unwired because the wired mapping has been destroyed. Moreover, when the vm object is finally destroyed, the pages are leaked because they are still wired. The fix is to reduce the pages' wired count by the number of wired, managed mappings destroyed. To do this, I introduce a new pmap function pmap_page_wired_mappings() that returns the number of managed mappings to the given physical page that are wired, and I use this function in vm_object_page_remove().
Reviewed by: tegge MFC after: 6 weeks
|
173615 |
14-Nov-2007 |
marcel |
o Rename cpu_thread_setup() to cpu_thread_alloc() to better communicate that it relates to (is called by) thread_alloc() o Add cpu_thread_free() which is called from thread_free() to counter-act cpu_thread_alloc().
i386: Have cpu_thread_free() call cpu_thread_clean() to preserve behaviour. ia64: Have cpu_thread_free() call mtx_destroy() for the mutex initialized in cpu_thread_alloc().
PR: ia64/118024
|
173600 |
14-Nov-2007 |
julian |
generally we are interested in what thread did something as opposed to what process. Since threads by default have teh name of the process unless over-written with more useful information, just print the thread name instead.
|
173361 |
05-Nov-2007 |
kib |
Fix for the panic("vm_thread_new: kstack allocation failed") and silent NULL pointer dereference in the i386 and sparc64 pmap_pinit() when the kmem_alloc_nofault() failed to allocate address space. Both functions now return error instead of panicing or dereferencing NULL.
As consequence, vmspace_exec() and vmspace_unshare() returns the errno int. struct vmspace arg was added to vm_forkproc() to avoid dealing with failed allocation when most of the fork1() job is already done.
The kernel stack for the thread is now set up in the thread_alloc(), that itself may return NULL. Also, allocation of the first process thread is performed in the fork1() to properly deal with stack allocation failure. proc_linkup() is separated into proc_linkup() called from fork1(), and proc_linkup0(), that is used to set up the kernel process (was known as swapper).
In collaboration with: Peter Holm Reviewed by: jhb
|
172692 |
16-Oct-2007 |
marcel |
Set PTE_ACCESSED in the PTE and before inserting it in the VHPT. This avoids back-to-back faults for all TLB misses. This can be improved further in the future by also setting PTE_DIRTY for TLB misses for write accesses.
MFC after: 1 week
|
172691 |
16-Oct-2007 |
marcel |
The flushrs instruction must be the first in an instruction group. GNU as(1) already made sure of that, but it's better to actually have the code right.
MFC after: 1 week
|
172690 |
16-Oct-2007 |
marcel |
Print instruction stops to improve analysis of dependency violations.
MFC after: 1 week
|
172689 |
16-Oct-2007 |
marcel |
Fix disassembly of the invala, itc, itr and hint instructions by fixing the opcode ordering.
MFC after: 1 week
|
172332 |
26-Sep-2007 |
brueffer |
Use the correct expanded name for SCTP.
PR: 116496 Submitted by: koitsu Reviewed by: rrs Approved by: re (kensmith)
|
172317 |
25-Sep-2007 |
alc |
Change the management of cached pages (PQ_CACHE) in two fundamental ways:
(1) Cached pages are no longer kept in the object's resident page splay tree and memq. Instead, they are kept in a separate per-object splay tree of cached pages. However, access to this new per-object splay tree is synchronized by the _free_ page queues lock, not to be confused with the heavily contended page queues lock. Consequently, a cached page can be reclaimed by vm_page_alloc(9) without acquiring the object's lock or the page queues lock.
This solves a problem independently reported by tegge@ and Isilon. Specifically, they observed the page daemon consuming a great deal of CPU time because of pages bouncing back and forth between the cache queue (PQ_CACHE) and the inactive queue (PQ_INACTIVE). The source of this problem turned out to be a deadlock avoidance strategy employed when selecting a cached page to reclaim in vm_page_select_cache(). However, the root cause was really that reclaiming a cached page required the acquisition of an object lock while the page queues lock was already held. Thus, this change addresses the problem at its root, by eliminating the need to acquire the object's lock.
Moreover, keeping cached pages in the object's primary splay tree and memq was, in effect, optimizing for the uncommon case. Cached pages are reclaimed far, far more often than they are reactivated. Instead, this change makes reclamation cheaper, especially in terms of synchronization overhead, and reactivation more expensive, because reactivated pages will have to be reentered into the object's primary splay tree and memq.
(2) Cached pages are now stored alongside free pages in the physical memory allocator's buddy queues, increasing the likelihood that large allocations of contiguous physical memory (i.e., superpages) will succeed.
Finally, as a result of this change long-standing restrictions on when and where a cached page can be reclaimed and returned by vm_page_alloc(9) are eliminated. Specifically, calls to vm_page_alloc(9) specifying VM_ALLOC_INTERRUPT can now reclaim and return a formerly cached page. Consequently, a call to malloc(9) specifying M_NOWAIT is less likely to fail.
Discussed with: many over the course of the summer, including jeff@, Justin Husted @ Isilon, peter@, tegge@ Tested by: an earlier version by kris@ Approved by: re (kensmith)
|
172189 |
15-Sep-2007 |
alc |
It has been observed on the mailing lists that the different categories of pages don't sum to anywhere near the total number of pages on amd64. This is for the most part because uma_small_alloc() pages have never been counted as wired pages, like their kmem_malloc() brethren. They should be. This changes fixes that.
It is no longer necessary for the page queues lock to be held to free pages allocated by uma_small_alloc(). I removed the acquisition and release of the page queues lock from uma_small_free() on amd64 and ia64 weeks ago. This patch updates the other architectures that have uma_small_alloc() and uma_small_free().
Approved by: re (kensmith)
|
171740 |
06-Aug-2007 |
marcel |
Clear pending interrupts before we enable external interrupts. Recently the AP in my Merced box seems to have grown a habit of getting unexpected interrupts, such as redundant wake-ups and legacy interrupts that require an INTA cycle.
While here, replace DELAY(0) with cpu_spinwait() so that it's clear what we're doing as well as enable the code to take advantage of cpu_spinwait() when it gets implemented.
Approved by: re (blanket)
|
171739 |
06-Aug-2007 |
marcel |
Keep interrupts disabled while handling external interrupts. There's no advantage in allowing nested external interrupts. In fact, it leads to a potential stack overrun.
While here, put the interrupt vector in the trapframe, so as to compensate for the 36 cycle latency of reading cr.ivr.
Further simplify assembly code by dealing with ASTs from C.
Approved by: re (blanket)
|
171735 |
05-Aug-2007 |
marcel |
In ia64_set_rr(), don't perform data serialization. This allows us to do the data serializations once after writing multiple region registers, as is done in pmap_switch(). All existing calls to ia64_set_rr() are followed with calls to ia64_srlz_d().
Approved by: re (blanket)
|
171722 |
04-Aug-2007 |
marcel |
Replace "__asm __volatile()" by equivalent support functions from ia64_cpu.h. This improves readability and consistency and aids in auditing the code. Add instruction-serialization after writing to cr.pta.
Delay enabling interrupts until after we setup the clocks and after we program the task priority register.
Approved by: re (blanket)
|
171721 |
04-Aug-2007 |
marcel |
Replace "__asm __volatile()" by equivalent support functions from ia64_cpu.h. This improves readability and consistency and aids in auditing the code. Add data-serialization after writing to the region registers and add instruction-serialization after writing to cr.pta.
Approved by: re (blanket)
|
171720 |
04-Aug-2007 |
marcel |
Replace "__asm __volatile()" by equivalent support functions from ia64_cpu.h. This improves readability and consistency and aids in auditing the code. Add data-serialization after writing to cr.tpr.
Approved by: re (blanket)
|
171719 |
04-Aug-2007 |
marcel |
Add required data-serialization after writing to cr.itm and cr.itv.
Approved by: re (blanket)
|
171718 |
04-Aug-2007 |
marcel |
Add ia64_srlz_d() and ia64_srlz_i() functions to aid in serialization.
Approved by: re (blanket)
|
171666 |
30-Jul-2007 |
marcel |
o Switch to physical addressing before dereferencing the VHPT bucket pointer. The virtual mapping may not be present in the translation cache. This will result in a nested TLB fault at a place we don't handle (and don't want to handle). o Make sure there's a stop after the rfi instruction, otherwise its behaviour is undefined. o Make sure we switch back to virtual addressing before doing a rfi. Behaviour is undefined otherwise.
Approved by: re (blanket)
|
171665 |
30-Jul-2007 |
marcel |
Add option EXCEPTION_TRACING, which enables KTR-like functionality for processor interruptions. This is especially useful to track unexpected nested TLB faults.
Approved by: re (blanket)
|
171664 |
30-Jul-2007 |
marcel |
Rework the interrupt code and add support for interrupt filtering (INTR_FILTER). This includes: o Save a pointer to the sapic structure and IRQ for every vector, so that we can quickly EOI, mask and unmask the interrupt. o Add locking to the sapic code now that we can reprogram a sapic on multiple CPUs at the same time. o Use u_int for the vector and IRQ. We only have 256 vectors, so using a 64-bit type for it is rather excessive. o Properly handle concurrent registration of a handler for the same vector.
Since vectors have a corresponding priority, we should not map IRQs to vectors in a linear fashion, but rather pick a vector that has a priority in line with the interrupt type. This is left for later. The vector/IRQ interchange has been untangled as much as possible to make this easier.
Approved by: re (blacket)
|
171663 |
30-Jul-2007 |
marcel |
Explicitly map the VHPT on all processors. Previously we were merely lucky that the VHPT was mapped as a side-effect of mapping the kernel, but when there's enough physical memory, this may not at all be the case.
Approved by: re (blanket)
|
171662 |
30-Jul-2007 |
marcel |
Add casts to some of the more commonly used pointer-type atomic operations. We really should be able to make those inline functions, but this would break its use for sx_locks.
Approved by: re (blanket)
|
171553 |
23-Jul-2007 |
dwmalone |
If clock_ct_to_ts fails to convert time time from the real time clock, print a one line error message. Add some comments on not being able to trust the day of week field (I'll act on these comments in a follow up commit).
Approved by: re MFC after: 3 weeks
|
171463 |
16-Jul-2007 |
marcel |
Restore the value of ar.rnat after the assignment to ar.bspstore. The SDM states that writing to ar.bspstore invalidates the ar.rnat register as a side-effect. This was interpreted as "bits in the ar.rnat register that correspond to registers whose value is on the stack are undefined'. Since we keep the kernel stack NaT- aligned with the user stack (i.e. the lower 9 bits of the backing store pointer remain unchanged when we switch to the kernel stack) bits that need preserving would be preserved.
That interpretation is questionable. So, now, the interpretation is more absolute: ar.rnat is undefined after writing to ar.bspstore. As such, we write the saved value of ar.rnat back to ar.rnat after writing to ar.bspstore.
Discussed with: christian.kandeler@hob.de Approved by: re (kensmith)
|
171312 |
09-Jul-2007 |
marcel |
dma_tag is a static structure. Testing for it being a NULL pointer doesn't make sense. Rewrite to what was intended.
Correctly warned about by: GCC Approved by: re (bmah)
|
170731 |
14-Jun-2007 |
delphij |
Enable SCTP by default for GENERIC kernels in order to give it more exposure. The current state of SCTP implementation is considered to be ready for 32-bit platforms, but still need some work/testing on 64-bit platforms.
Approved by: re (kensmith) Discussed with: rrs
|
170652 |
13-Jun-2007 |
marcel |
Enable GEOM_PART_MBR by default. On ia64 this replaces GEOM_MBR.
|
170519 |
10-Jun-2007 |
alc |
Add the machine-specific definitions for configuring the new physical memory allocator.
Set the size of phys_avail[] using one of these definitions.
Approved by: re
|
170507 |
10-Jun-2007 |
marcel |
Work around a firmware bug in the HP rx2660, where in ACPI an I/O port is really a memory mapped I/O address. The bug is in the GAS that describes the address and in particular the SpaceId field. The field should not say the address is an I/O port when it clearly is not.
With an additional check for the IA64_BUS_SPACE_IO case in the bus access functions, and the fact that I/O ports pretty much not used in general on ia64, make the calculation of the I/O port address a function. This avoids inlining the work-around into every driver, and also helps reduce overall code bloat.
|
170474 |
09-Jun-2007 |
marcel |
Synchronize the instruction cache after writing to memory. This is needed for breakpoints to work.
|
170473 |
09-Jun-2007 |
marcel |
Add kdb_cpu_sync_icache(), intended to synchronize instruction caches with data caches after writing to memory. This typically is required to make breakpoints work on ia64 and powerpc. For those architectures the function is implemented.
|
170444 |
09-Jun-2007 |
marcel |
Physical memory regions can be larger than INT_MAX. Change size1 from an int to a long to avoid printing negative byte and page counts.
|
170440 |
08-Jun-2007 |
rwatson |
Enable AUDIT by default in the GENERIC kernel, allowing security event auditing to be turned on without a kernel recompile, just an rc.conf option.
Approved by: re (kensmith) Obtained from: TrustedBSD Project
|
170403 |
07-Jun-2007 |
marcel |
Remove remaining references to pc_curtid missed in previous commit.
|
170402 |
07-Jun-2007 |
marcel |
Eliminate pmap_install(), which was used to wrap pmap_switch() and grab sched_lock. This would serialize calls to pmap_switch from cpu_switch(). With the introduction of thread_lock, this is not possible anymore, because thread_lock is not a single lock. It varies. Secondly and most importantly, it's not needed at all. The only requirement for pmap_switch() is that it's not preempted while in the middle of updating the CPU and PCPU. In other words, it's a critical region. No locking required.
|
170390 |
07-Jun-2007 |
davidxu |
Fix compiling error.
|
170359 |
06-Jun-2007 |
marcel |
Include <sys/sched.h> for sched_throw().
|
170307 |
05-Jun-2007 |
jeff |
Commit 14/14 of sched_lock decomposition. - Use thread_lock() rather than sched_lock for per-thread scheduling sychronization. - Use the per-process spinlock rather than the sched_lock for per-process scheduling synchronization.
Tested by: kris, current@ Tested on: i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc. Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)
|
170306 |
04-Jun-2007 |
jeff |
Commit 13/14 of sched_lock decomposition. - Add a new parameter to cpu_switch() that is used to release the lock on the outgoing thread and properly acquire the lock on the incoming thread. This parameter is not required for schedulers that don't do per-cpu locking and architectures which do not support it may continue to use the 4BSD scheduler. This feature is presently not supported on ia64
Tested by: kris, current@ Tested on: i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc. Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)
|
170305 |
04-Jun-2007 |
jeff |
- Change comments and asserts to reflect the removal of the global scheduler lock.
Tested by: kris, current@ Tested on: i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc. Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)
|
170303 |
04-Jun-2007 |
jeff |
Commit 10/14 of sched_lock decomposition. - Use sched_throw() rather than replicating the same cpu_throw() code for each architecture. This also allows the scheduler to use any locking it may want to. - Use the thread_lock() rather than sched_lock when preempting. - The scheduler lock is not required to synchronize release_aps.
Tested by: kris, current@ Tested on: i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc. Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)
|
170291 |
04-Jun-2007 |
attilio |
Rework the PCPU_* (MD) interface: - Rename PCPU_LAZY_INC into PCPU_INC - Add the PCPU_ADD interface which just does an add on the pcpu member given a specific value.
Note that for most architectures PCPU_INC and PCPU_ADD are not safe. This is a point that needs some discussions/work in the next days.
Reviewed by: alc, bde Approved by: jeff (mentor)
|
170170 |
31-May-2007 |
attilio |
Revert VMCNT_* operations introduction. Probabilly, a general approach is not the better solution here, so we should solve the sched_lock protection problems separately.
Requested by: alc Approved by: jeff (mentor)
|
170162 |
31-May-2007 |
piso |
In some particular cases (like in pccard and pccbb), the real device handler is wrapped in a couple of functions - a filter wrapper and an ithread wrapper. In this case (and just in this case), the filter wrapper could ask the system to schedule the ithread and mask the interrupt source if the wrapped handler is composed of just an ithread handler: modify the "old" interrupt code to make it support this situation, while the "new" interrupt code is already ok.
Discussed with: jhb
|
170086 |
29-May-2007 |
yongari |
Honor maxsegsz of less than a page size in a DMA tag. Previously it used to return PAGE_SIZE without respect to restrictions of a DMA tag. This affected all of the busdma load functions that use _bus_dmamap_loader_buffer() as their back-end.
Reviewed by: scottl
|
170033 |
27-May-2007 |
alc |
Eliminate an unused definition.
|
170026 |
27-May-2007 |
marcel |
Have the processor defer all faults and exceptions for control speculative loads. This at least makes control speculative loads work. In the future we should analyze which faults/exceptions we want to handle rather than defer to avoid having to call the recovery code when it's not strictly necessary.
|
169846 |
22-May-2007 |
kan |
Allow FreeBSD's native ELF image activators to execute shared libraries the same way it was enabled for Linux binares in linuxulator.
This allows binaries built with -pie. Many ports auto-detect -fPIE support in GCC 4.2 and build binaries FreeBSD was unable to run.
|
169814 |
21-May-2007 |
marcel |
When speculation fails (as determined by the chk instruction) the processor is to jump to recovery code. This branching behaviour may not be implemented by the processor and a Speculative Operation fault is raised. The OS is responsible to emulate the branch. Implement this, because GCC 4.2 uses advanced loads regularly.
|
169773 |
19-May-2007 |
marcel |
Fix GCC warning: va = va += PAGE_SIZE contains pointless operation va = va. Fix white space in nearby lines.
|
169760 |
19-May-2007 |
marcel |
Add a level of indirection to the kernel PTE table. The old scheme allowed for 1024 PTE pages, each containing 256 PTEs. This yielded 2GB of KVA. This is not enough to boot a kernel on a 16GB box and in general too low for a 64-bit machine. By adding a level of indirection we now have 1024 2nd-level directory pages, each capable of supporting 2GB of KVA. This brings the grand total to 2TB of KVA.
|
169757 |
19-May-2007 |
marcel |
Account for the fact that contigmalloc(9) can return a NULL pointer. Fix the flags argument: M_WAITOK is not a valid flag. Its presence leaves the indication that contigmalloc(9) will not return a NULL pointer.
The use of contigmalloc(9) in this place is probably not a good idea given the constraints. It's probably better to lift the constraints and instead add a permanent mapping to the ITR. It's possible that the first 256MB of memory is exhausted when we get here.
This fixes a kernel panic on a 16GB rx3600.
|
169667 |
18-May-2007 |
jeff |
- define and use VMCNT_{GET,SET,ADD,SUB,PTR} macros for manipulating vmcnts. This can be used to abstract away pcpu details but also changes to use atomics for all counters now. This means sched lock is no longer responsible for protecting counts in the switch routines.
Contributed by: Attilio Rao <attilio@FreeBSD.org>
|
169291 |
05-May-2007 |
alc |
Define every architecture as either VM_PHYSSEG_DENSE or VM_PHYSSEG_SPARSE depending on whether the physical address space is densely or sparsely populated with memory. The effect of this definition is to determine which of two implementations of vm_page_array and PHYS_TO_VM_PAGE() is used. The legacy implementation is obtained by defining VM_PHYSSEG_DENSE, and a new implementation that trades off time for space is obtained by defining VM_PHYSSEG_SPARSE. For now, all architectures except for ia64 and sparc64 define VM_PHYSSEG_DENSE. Defining VM_PHYSSEG_SPARSE on ia64 allows the entirety of my Itanium 2's memory to be used. Previously, only the first 1 GB could be used. Defining VM_PHYSSEG_SPARSE on sparc64 allows USIIIi-based systems to boot without crashing.
This change is a combination of Nathan Whitehorn's patch and my own work in perforce.
Discussed with: kmacy, marius, Nathan Whitehorn PR: 112194
|
168920 |
21-Apr-2007 |
sepotvin |
Add support for specifying a minimal size for vm.kmem_size in the loader via vm.kmem_size_min. Useful when using ZFS to make sure that vm.kmem size will be at least 256mb (for example) without forcing a particular value via vm.kmem_size.
Approved by: njl (mentor) Reviewed by: alc
|
168603 |
10-Apr-2007 |
pjd |
Remove trailing '.' for consistency!
|
168594 |
10-Apr-2007 |
pjd |
Add UFS_GJOURNAL options to the GENERIC kernel.
Approved by: re (kensmith)
|
167814 |
22-Mar-2007 |
jkim |
Catch up with ACPI-CA 20070320 import.
|
167767 |
21-Mar-2007 |
jhb |
Change the amd64, i386, and ia64 nexus drivers to setup bus space tags and handles when activating a resource via bus_activate_resource() rather than doing some of the work in bus_alloc_resource() and some of it in bus_activate_resource().
One note is that when using isa_alloc_resourcev() on PC-98, drivers now need to just use bus_release_resource() without explicitly calling bus_deactivate_resource() first. nyan@ has already fixed all of the PC-98 drivers.
|
167429 |
11-Mar-2007 |
alc |
Push down the implementation of PCPU_LAZY_INC() into the machine-dependent header file. Reimplement PCPU_LAZY_INC() on amd64 and i386 making it atomic with respect to interrupts.
Reviewed by: bde, jhb
|
167352 |
09-Mar-2007 |
mohans |
Over NFS, an open() call could result in multiple over-the-wire GETATTRs being generated - one from lookup()/namei() and the other from nfs_open() (for cto consistency). This change eliminates the GETATTR in nfs_open() if an otw GETATTR was done from the namei() path. Instead of extending the vop interface, we timestamp each attr load, and use this to detect whether a GETATTR was done from namei() for this syscall. Introduces a thread-local variable that counts the syscalls made by the thread and uses <pid, tid, thread syscalls> as the attrload timestamp. Thanks to jhb@ and peter@ for a discussion on thread state that could be used as the timestamp with minimal overhead.
|
167277 |
06-Mar-2007 |
scottl |
Don't increment total_bounced when doing no-op dmamap_sync ops.
|
166945 |
24-Feb-2007 |
piso |
Updated ia64 isa support with the new bus_setup_intr() syntax.
Approved by: re (implicit?)
|
166901 |
23-Feb-2007 |
piso |
o break newbus api: add a new argument of type driver_filter_t to bus_setup_intr()
o add an int return code to all fast handlers
o retire INTR_FAST/IH_FAST
For more info: http://docs.freebsd.org/cgi/getmsg.cgi?fetch=465712+0+current/freebsd-current
Reviewed by: many Approved by: re@
|
166860 |
21-Feb-2007 |
alc |
Change pmap_protect() so that execute access can be removed without simultaneously removing write access.
|
166810 |
18-Feb-2007 |
alc |
Eliminate some acquisitions and releases of the page queues lock that are no longer necessary.
|
166631 |
11-Feb-2007 |
marcel |
Now that the free page queue mutex is a sleep mutex, we cannot call vm_page_alloc() from within a critical section in pmap_growkernel(). Since the need for a critical section may never have existed in the first place, simply get rid of it.
Discussed with: alc@
|
166604 |
09-Feb-2007 |
brooks |
Include GEOM_LABEL in GENERIC. It's very useful and not well publicized enough.
Approved by: pjd
|
166551 |
07-Feb-2007 |
marcel |
Evolve the ctlreq interface added to geom_gpt into a generic partitioning class that supports multiple schemes. Current schemes supported are APM (Apple Partition Map) and GPT. Change all GEOM_APPLE anf GEOM_GPT options into GEOM_PART_APM and GEOM_PART_GPT (resp).
The ctlreq interface supports verbs to create and destroy partitioning schemes on a disk; to add, delete and modify partitions; and to commit or undo changes made.
|
165967 |
12-Jan-2007 |
imp |
Remove 3rd clause, renumber, ok per email
|
165369 |
20-Dec-2006 |
davidxu |
Add a lwpid field into per-cpu structure, the lwpid represents current running thread's id on each cpu. This allow us to add in-kernel adaptive spin for user level mutex. While spinning in user space is possible, without correct thread running state exported from kernel, it hardly can be implemented efficiently without wasting cpu cycles, however exporting thread running state unlikely will be implemented soon as it has to design and stablize interfaces. This implementation is transparent to user space, it can be disabled dynamically. With this change, mutex ping-pong program's performance is improved massively on SMP machine. performance of mysql super-smack select benchmark is increased about 7% on Intel dual dual-core2 Xeon machine, it indicates on systems which have bunch of cpus and system-call overhead is low (athlon64, opteron, and core-2 are known to be fast), the adaptive spin does help performance.
Added sysctls: kern.threads.umtx_dflt_spins if the sysctl value is non-zero, a zero umutex.m_spincount will cause the sysctl value to be used a spin cycle count. kern.threads.umtx_max_spins the sysctl sets upper limit of spin cycle count.
Tested on: Athlon64 X2 3800+, Dual Xeon 5130
|
164936 |
06-Dec-2006 |
julian |
Threading cleanup.. part 2 of several.
Make part of John Birrell's KSE patch permanent.. Specifically, remove: Any reference of the ksegrp structure. This feature was never fully utilised and made things overly complicated. All code in the scheduler that tried to make threaded programs fair to unthreaded programs. Libpthread processes will already do this to some extent and libthr processes already disable it.
Also: Since this makes such a big change to the scheduler(s), take the opportunity to rename some structures and elements that had to be moved anyhow. This makes the code a lot more readable.
The ULE scheduler compiles again but I have no idea if it works.
The 4bsd scheduler still reqires a little cleaning and some functions that now do ALMOST nothing will go away, but I thought I'd do that as a separate commit.
Tested by David Xu, and Dan Eischen using libthr and libpthread.
|
164395 |
18-Nov-2006 |
marcel |
Since printf also has at least one critical section, we need to initialize pc_curthread. While here, rename early_pcpu to pcpu0 to be conistent (compare thread0 and proc0).
|
164392 |
18-Nov-2006 |
marcel |
Now that printf() needs the PCPU, set it up before we call printf(). Change the pc_pcb field from a pointer to struct pcb to struct pcb so that sizeof(struct pcb) includes the PCB we use for IPI_STOP. Statically declare early_pcb so that we don't have to allocate the PCB for thread0. This way we can setup the PCPU before cninit() and thus before we use printf().
|
164391 |
18-Nov-2006 |
marcel |
Revert previous commit. PC_CONS_BUFR is not used nor needed by assembly.
|
164250 |
13-Nov-2006 |
ru |
Fix a comment.
|
164229 |
12-Nov-2006 |
alc |
Make pmap_enter() responsible for setting PG_WRITEABLE instead of its caller. (As a beneficial side-effect, a high-contention acquisition of the page queues lock in vm_fault() is eliminated.)
|
164049 |
06-Nov-2006 |
rwatson |
Add missing includes of priv.h.
|
164033 |
06-Nov-2006 |
rwatson |
Sweep kernel replacing suser(9) calls with priv(9) calls, assigning specific privilege names to a broad range of privileges. These may require some future tweaking.
Sponsored by: nCircle Network Security, Inc. Obtained from: TrustedBSD Project Discussed on: arch@ Reviewed (at least in part) by: mlaier, jmg, pjd, bde, ceri, Alex Lyashkov <umka at sevcity dot net>, Skip Ford <skip dot ford at verizon dot net>, Antoine Brodin <antoine dot brodin at laposte dot net>
|
163987 |
04-Nov-2006 |
jb |
Remove the KDTRACE option again because of the complaints about having it as a default.
For the record, the KDTRACE option caused _no_ additional source files to be compiled in; certainly no CDDL source files. All it did was to allow existing BSD licensed kernel files to include one or more CDDL header files.
By removing this from DEFAULTS, the onus is on a kernel builder to add the option to the kernel config, possibly by including GENERIC and customising from there. It means that DTrace won't be a feature available in FreeBSD by default, which is the way I intended it to be.
Without this option, you can't load the dtrace module (which contains the dtrace device and the DTrace framework). This is equivalent to requiring an option in a kernel config before you can load the linux emulation module, for example.
I think it is a mistake to have DTrace ported to FreeBSD, but not to have it available to everyone, all the time. The only exception to this is the companies which distribute systems with FreeBSD embedded. Those companies will customise their systems anyway. The KDTRACE option was intended for them, and only them.
|
163972 |
04-Nov-2006 |
jb |
Build in kernel support for loading DTrace modules by default. This adds the hooks that DTrace modules register with, and adds a few functions which have the dtrace_ prefix to allow the DTrace FBT (function boundary trace) provider to avoid tracing because they are called from the DTtrace probe context.
Unlike other forms of tracing and debug, DTrace support in the kernel incurs negligible run-time cost.
I think the only reason why anyone wouldn't want to have kernel support enabled for DTrace would be due to the license (CDDL) under which DTrace is released.
|
163928 |
03-Nov-2006 |
marcel |
Make sure kern_envp is never NULL. If we don't get a pointer to the environment from the loader, use the static environment.
|
163858 |
01-Nov-2006 |
jb |
Add a cnputs() function to write a string to the console with a lock to prevent interspersed strings written from different CPUs at the same time.
To avoid putting a buffer on the stack or having to malloc one, space is incorporated in the per-cpu structure. The buffer size if 128 bytes; chosen because it's the next power of 2 size up from 80 characters.
String writes to the console are buffered up the end of the line or until the buffer fills. Then the buffer is flushed to all console devices.
Existing low level console output via cnputc() is unaffected by this change. ithread calls to log() are also unaffected to avoid blocking those threads.
A minor change to the behaviour in a panic situation is that console output will still be buffered, but won't be written to a tty as before. This should prevent interspersed panic output as a number of CPUs panic before we end up single threaded running ddb.
Reviewed by: scottl, jhb MFC after: 2 weeks
|
163711 |
26-Oct-2006 |
jb |
Remove the KSE option now that it's in DEFAULTS on these arches/machines.
The 'nooption' kernel config entry has to be used to turn KSE off now. This isn't my preferred way of dealing with this, but I'll defer to scottl's experience with the io/mem kernel option change and the grief experienced over that.
Submitted by: scottl@
|
163710 |
26-Oct-2006 |
jb |
Add 'options KSE' to the kernel config DEFAULTS on all arches/machines except sun4v.
This change makes the transition from a default to an option more transparent and is an attempt to head off all the compliants that are likely from people who don't read UPDATING, based on experience with the io/mem change.
Submitted by: scottl@
|
163709 |
26-Oct-2006 |
jb |
Make KSE a kernel option, turned on by default in all GENERIC kernel configs except sun4v (which doesn't process signals properly with KSE).
Reviewed by: davidxu@
|
163630 |
23-Oct-2006 |
ru |
Move "device splash" back to MI NOTES and "files", it's MI.
|
163619 |
23-Oct-2006 |
marcel |
o Eliminate nexus_print_resources(). Use resource_list_print_type() instead. o Eliminate nexus_print_all_resources(). Inline the function body in nexus_print_child().
|
163603 |
22-Oct-2006 |
alc |
Eliminate unnecessary PG_BUSY tests.
|
163535 |
20-Oct-2006 |
des |
Move more MD devices and options out of MI NOTES.
|
163531 |
20-Oct-2006 |
des |
The VGA_DEBUG option only exists on {amd64,i386,ia64}. Also remove 'device io' from amd64 NOTES; DEFAULTS takes care of it.
|
163492 |
19-Oct-2006 |
marcel |
Fix previous revision: o day and mday are the same. No need to subtract 1 from mday. o Set dow to -1 as clock_ct_to_ts() checks this field and returns EINVAL on any day of the week but Sunday.
|
163449 |
17-Oct-2006 |
davidxu |
o Add keyword volatile for user mutex owner field. o Fix type consistent problem by using type long for old umtx and wait channel. o Rename casuptr to casuword.
|
163386 |
15-Oct-2006 |
hrs |
Add a newline to the printf().
Spotted by: Peter Carah <pete@altadena.net> MFC after: 3 days
|
163058 |
06-Oct-2006 |
marcel |
Include freebsd32_signal.h now that signal-related definitions are moved there.
Found by: ia64 tinderbox.
|
163041 |
05-Oct-2006 |
simon |
- Remove SCHED_ULE from GENERIC to better avoid foot-shooting by unsuspecting users. - Add a comment in NOTES about experimental status of SCHED_ULE. - Make warning about experimental status in sched_ule(4) a bit stronger.
Suggested and reviewed by: dougb Discussed on: developers MFC after: 3 days
|
163016 |
04-Oct-2006 |
jb |
PR: Submitted by: Reviewed by: Approved by: Obtained from: MFC after: Security: Move the relocation definitions to the common elf header so that DTrace can use them on one architecture targeted to a different one.
Add the additional ELF types defines in Sun's "Linker and Libraries" manual.
|
162966 |
02-Oct-2006 |
phk |
Use calendrical calculations from subr_clock.c instead of home-rolled.
|
162958 |
02-Oct-2006 |
phk |
Second part of a little cleanup in the calendar/timezone/RTC handling.
Split subr_clock.c in two parts (by repo-copy): subr_clock.c contains generic RTC and calendaric stuff. etc. subr_rtc.c contains the newbus'ified RTC interface.
Centralize the machdep.{adjkerntz,disable_rtc_set,wall_cmos_clock} sysctls and associated variables into subr_clock.c. They are not machine dependent and we have generic code that relies on being present so they are not even optional.
|
162954 |
02-Oct-2006 |
phk |
First part of a little cleanup in the calendar/timezone/RTC handling.
Move relevant variables to <sys/clock.h> and fix #includes as necessary.
Use libkern's much more time- & spamce-efficient BCD routines.
|
162658 |
26-Sep-2006 |
ru |
Added COMPAT_FREEBSD6 option.
|
162487 |
21-Sep-2006 |
kan |
Use __builtin_va_start instead of __builtin_stdarg_start. GCC4 obsoletes the former and __builtin_va_start was present in all GCC version 3.1 and later.
|
162361 |
16-Sep-2006 |
rwatson |
Add audit hooks for ppc, ia64 system call paths.
Reviewed by: marcel (ia64) Obtained from: TrustedBSD Project MFC after: 3 days
|
161675 |
28-Aug-2006 |
davidxu |
Implement casuword32, compare and set user integer, thank Marcel Moolenarr who wrote the IA64 version of casuword32.
|
161628 |
25-Aug-2006 |
alc |
Eliminate unused definitions. (They came from NetBSD.)
Discussed with: cognet, grehan, marcel
|
161223 |
11-Aug-2006 |
jhb |
First pass at allowing memory to be mapped using cache modes other than WB (write-back) on x86 via control bits in PTEs and PDEs (including making use of the PAT MSR). Changes include: - A new pmap_mapdev_attr() function for amd64 and i386 which takes an additional parameter (relative to pmap_mapdev()) specifying the cache mode for this mapping. Note that on amd64 only WB mappings are done with the direct map, all other modes result in a private mapping. - pmap_mapdev() on i386 and amd64 now defaults to using UC (uncached) mappings rather than WB. Previously we relied on the BIOS setting up MTRR's to enforce memio regions being treated as UC. This might make hw.cbb_start_memory unnecessary in some cases now for example. - A new pmap_mapbios()/pmap_unmapbios() API has been added to allow places that used pmap_mapdev() to map non-device memory (such as ACPI tables) to do so using WB as before. - A new pmap_change_attr() function for amd64 and i386 that changes the caching mode for a range of KVA.
Reviewed by: alc
|
160889 |
01-Aug-2006 |
alc |
Complete the transition from pmap_page_protect() to pmap_remove_write(). Originally, I had adopted sparc64's name, pmap_clear_write(), for the function that is now pmap_remove_write(). However, this function is more like pmap_remove_all() than like pmap_clear_modify() or pmap_clear_reference(), hence, the name change.
The higher-level rationale behind this change is described in src/sys/amd64/amd64/pmap.c revision 1.567. The short version is that I'm trying to clean up and fix our support for execute access.
Reviewed by: marcel@ (ia64)
|
160813 |
29-Jul-2006 |
marcel |
Remove sio(4) and related options from MI files to amd64, i386 and pc98 MD files. Remove nodevice and nooption lines specific to sio(4) from ia64, powerpc and sparc64 NOTES. There were no such lines for arm yet. sio(4) is usable on less than half the platforms, not counting a future mips platform. Its presence in MI files is therefore increasingly becoming a burden.
|
160801 |
28-Jul-2006 |
jhb |
Retire SYF_ARGMASK and remove both SYF_MPSAFE and SYF_ARGMASK. sy_narg is now back to just being an argument count.
|
160798 |
28-Jul-2006 |
jhb |
Now that all system calls are MPSAFE, retire the SYF_MPSAFE flag used to mark system calls as being MPSAFE: - Stop conditionally acquiring Giant around system call invocations. - Remove all of the 'M' prefixes from the master system call files. - Remove support for the 'M' prefix from the script that generates the syscall-related files from the master system call files. - Don't explicitly set SYF_MPSAFE when registering nfssvc.
|
160773 |
27-Jul-2006 |
jhb |
Unify the checking for lock misbehavior in the various syscall() implementations and adjust some of the checks while I'm here: - Add a new check to make sure we don't return from a syscall in a critical section. - Add a new explicit check before userret() to make sure we don't return with any locks held. The advantage here is that we can include the syscall number and name in syscall() whereas that info is not available in userret(). - Drop the mtx_assert()'s of sched_lock and Giant. They are replaced by the more general checks just added.
MFC after: 2 weeks
|
160770 |
27-Jul-2006 |
jhb |
Add KTR_SYSC tracing to the syscall() implementations that didn't have it yet.
MFC after: 1 week
|
160764 |
27-Jul-2006 |
jhb |
Add missing ptrace(2) system-call stops to various syscall() implementations.
MFC after: 1 week
|
160444 |
17-Jul-2006 |
marcel |
Move default GEOM classes from files.ia64, where they were marked standard, to the DEFAULTS file.
|
160312 |
12-Jul-2006 |
jhb |
Simplify the pager support in DDB. Allowing different db commands to install custom pager functions didn't actually happen in practice (they all just used the simple pager and passed in a local quit pointer). So, just hardcode the simple pager as the only pager and make it set a global db_pager_quit flag that db commands can check when the user hits 'q' (or a suitable variant) at the pager prompt. Also, now that it's easy to do so, enable paging by default for all ddb commands. Any command that wishes to honor the quit flag can do so by checking db_pager_quit. Note that the pager can also be effectively disabled by setting $lines to 0.
Other fixes: - 'show idt' on i386 and pc98 now actually checks the quit flag and terminates early. - 'show intr' now actually checks the quit flag and terminates early.
|
160210 |
09-Jul-2006 |
mjacob |
Make the firmware assist driver resident in preparation for isp using it.
|
160105 |
05-Jul-2006 |
bde |
Fixed FP_R*. fp{get_set}round() apparently never worked on ia64, since the alpha values were used and are quite different.
Fixed some style bugs by copying from the i386 version where it is better.
|
160040 |
29-Jun-2006 |
marcel |
Partial support for branch long emulation. This only emulates the branch long jump and not the branch long call. Support for that is forthcoming.
|
159971 |
27-Jun-2006 |
alc |
Make several changes to pmap_enter_quick_locked():
1. Make the caller responsible for performing pmap_install(). This reduces the number of times that pmap_install() is performed by pmap_enter_object() from twice per page to twice overall.
2. Don't block if pmap_find_pte() is unable to allocate a PTE. If it did block, then it might wind up mapping a cache page. Specifically, if pmap_enter_quick_locked() slept when called from pmap_enter_object(), the page daemon could change an active or inactive page into a cache page just before it was to be mapped.
3. Bail out of pmap_enter_quick_locked() if pv entries aren't plentiful. In other words, don't force the allocation of a pv entry if they aren't readily available.
Reviewed by: marcel@
|
159964 |
26-Jun-2006 |
babkin |
Backed out the change by request from rwatson.
PR: kern/14584
|
159927 |
25-Jun-2006 |
babkin |
The common UID/GID space implementation. It has been discussed on -arch in 1999, and there are changes to the sysctl names compared to PR, according to that discussion. The description is in sys/conf/NOTES. Lines in the GENERIC files are added in commented-out form. I'll attach the test script I've used to PR.
PR: kern/14584 Submitted by: babkin
|
159916 |
24-Jun-2006 |
marcel |
Update to SDM 2.2: o Add tf (test feature) instruction, o Add vmsw (VM switch) instruction.
While here, update copyright.
MFC after: 1 week
|
159909 |
24-Jun-2006 |
marcel |
Sync up with SDM 2.1: o Add nop/hint formats F16, I18, M48 and X5, o Add format M47 for ptc.e, o Add hint instruction, o Fix decoding of cmp8xchg16.
|
159850 |
22-Jun-2006 |
marcel |
Identify the cual-core Montecito.
MFC after: 3 days
|
159651 |
15-Jun-2006 |
netchild |
Remove COMPAT_43 from GENERIC (and other kernel configs). For amd64 there's an explicit comment that it's needed for the linuxolator. This is not the case anymore. For all other architectures there was only a "KEEP THIS". I'm (and other people too) running a COMPAT_43-less kernel since it's not necessary anymore for the linuxolator. Roman is running such a kernel for a for longer time. No problems so far. And I doubt other (newer than ia32 or alpha) architectures really depend on it.
This may result in a small performance increase for some workloads.
If the removal of COMPAT_43 results in a not working program, please recompile it and all dependencies and try again before reporting a problem.
The only place where COMPAT_43 is needed (as in: does not compile without it) is in the (outdated/not usable since too old) svr4 code.
Note: this does not remove the COMPAT_43TTY option.
Nagging by: rdivacky
|
159627 |
15-Jun-2006 |
ups |
Remove mpte optimization from pmap_enter_quick(). There is a race with the current locking scheme and removing it should have no measurable performance impact. This fixes page faults leading to panics in pmap_enter_quick_locked() on amd64/i386.
Reviewed by: alc,jhb,peter,ps
|
159537 |
12-Jun-2006 |
imp |
Add the ability to subset the devices that UART pulls in. This allows the arm to compile without all the extras that don't appear, at least not in the flavors of ARM I deal with. This helps us save about 100k.
If I've botched the available devices on a platform, please let me know and I'll correct ASAP.
|
159303 |
05-Jun-2006 |
alc |
Introduce the function pmap_enter_object(). It maps a sequence of resident pages from the same object. Use it in vm_map_pmap_enter() to reduce the locking overhead of premapping objects.
Reviewed by: tegge@
|
159159 |
02-Jun-2006 |
imp |
EISA bus ia64 systems don't exist in reality. I'm told they may exist in theory, but that it was OK to remove from NOTES.
OK'd by: marcel
|
159148 |
01-Jun-2006 |
alc |
Correct a syntax error in the previous revision.
|
159130 |
01-Jun-2006 |
silby |
After much discussion with mjacob and scottl, change bus_dmamem_alloc so that it just warns the user with a printf when it misaligns a piece of memory that was requested through a busdma tag.
Some drivers (such as mpt, and probably others) were asking for alignments that could not be satisfied, but as far as driver operation was concerned, that did not matter. In the theory that other drivers will fall into this same category, we agreed that panicing or making the allocation fail will cause more hardship than is necessary. The printf should be sufficient motivation to get the driver glitch fixed.
|
159093 |
31-May-2006 |
mjacob |
Since it's to all intents and purposes identical code to amd64 && i386, match the recent changes to bus_dmamem_alloc here.
|
158984 |
27-May-2006 |
marcel |
Unbreak after previous commit. While here, improve function naming consistency by s/ssc/ssc_/g.
|
158964 |
26-May-2006 |
phk |
Update to new console api.
|
158711 |
17-May-2006 |
marius |
Add le(4). I could actually only test it on alpha, i386 and sparc64 but given that this includes the more problematic platforms I see no reason why it shouldn't also work on amd64 and ia64.
|
158651 |
16-May-2006 |
phk |
Since DELAY() was moved, most <machine/clock.h> #includes have been unnecessary.
|
158459 |
11-May-2006 |
marcel |
Fix braino in previous commit: Don't redefine OID_AUTO to something not equal to -1, or at all for that matter.
|
158450 |
11-May-2006 |
phk |
Remove more straggling CPU_ macro references
|
158445 |
11-May-2006 |
phk |
Clean out sysctl machdep.* related defines.
The cmos clock related stuff should really be in MI code.
|
158124 |
28-Apr-2006 |
marcel |
Rewrite of puc(4). Significant changes are: o Properly use rman(9) to manage resources. This eliminates the need to puc-specific hacks to rman. It also allows devinfo(8) to be used to find out the specific assignment of resources to serial/parallel ports. o Compress the PCI device "database" by optimizing for the common case and to use a procedural interface to handle the exceptions. The procedural interface also generalizes the need to setup the hardware (program chipsets, program clock frequencies). o Eliminate the need for PUC_FASTINTR. Serdev devices are fast by default and non-serdev devices are handled by the bus. o Use the serdev I/F to collect interrupt status and to handle interrupts across ports in priority order. o Sync the PCI device configuration to include devices found in NetBSD and not yet merged to FreeBSD. o Add support for Quatech 2, 4 and 8 port UARTs. o Add support for a couple dozen Timedia serial cards as found in Linux.
|
157941 |
21-Apr-2006 |
marcel |
In nexus_teardown_intr(), actually remove the handler.
MFC after: 1 day
|
157894 |
20-Apr-2006 |
imp |
Set the rid of the resource obtained from rman_reserve_resource.
|
157680 |
12-Apr-2006 |
alc |
Retire pmap_track_modified(). We no longer need it because we do not create managed mappings within the clean submap. To prevent regressions, add assertions blocking the creation of managed mappings within the clean submap.
Reviewed by: tegge
|
157449 |
03-Apr-2006 |
marcel |
Improve handling of IPI_STOP: o use atomic operations to fiddle with stopped_cpus and started_cpus. o disable interrupts while we're waiting to be started. o remove logic relating to cpustop_restartfunc as it's not used.
|
157448 |
03-Apr-2006 |
marcel |
Eliminate HAVE_STOPPEDPCBS. On ia64 the PCPU holds a pointer to the PCB in which the context of stopped CPUs is stored. To access this PCB from KDB, we introduce a new define, called KDB_STOPPEDPCB. The definition, when present, lives in <machine/kdb.h> and abstracts where MD code saves the context. Define KDB_STOPPEDPCB on i386, amd64, alpha and sparc64 in accordance to previous code.
|
157443 |
03-Apr-2006 |
peter |
Remove the unused sva and eva arguments from pmap_remove_pages().
|
155922 |
22-Feb-2006 |
jhb |
Close some races between procfs/ptrace and exit(2): - Reorder the events in exit(2) slightly so that we trigger the S_EXIT stop event earlier. After we have signalled that, we set P_WEXIT and then wait for any processes with a hold on the vmspace via PHOLD to release it. PHOLD now KASSERT()'s that P_WEXIT is clear when it is invoked, and PRELE now does a wakeup if P_WEXIT is set and p_lock drops to zero. - Change proc_rwmem() to require that the processing read from has its vmspace held via PHOLD by the caller and get rid of all the junk to screw around with the vmspace reference count as we no longer need it. - In ptrace() and pseudofs(), treat a process with P_WEXIT set as if it doesn't exist. - Only do one PHOLD in kern_ptrace() now, and do it earlier so it covers FIX_SSTEP() (since on alpha at least this can end up calling proc_rwmem() to clear an earlier single-step simualted via a breakpoint). We only do one to avoid races. Also, by making the EINVAL error for unknown requests be part of the default: case in the switch, the various switch cases can now just break out to return which removes a _lot_ of duplicated PRELE and proc unlocks, etc. Also, it fixes at least one bug where a LWP ptrace command could return EINVAL with the proc lock still held. - Changed the locking for ptrace_single_step(), ptrace_set_pc(), and ptrace_clear_single_step() to always be called with the proc lock held (it was a mixed bag previously). Alpha and arm have to drop the lock while the mess around with breakpoints, but other archs avoid extra lock release/acquires in ptrace(). I did have to fix a couple of other consumers in kern_kse and a few other places to hold the proc lock and PHOLD.
Tested by: ps (1 mostly, but some bits of 2-4 as well) MFC after: 1 week
|
155680 |
14-Feb-2006 |
jhb |
Fix the hw.realmem sysctl. The global realmem variable is a count of pages, not a count of bytes. The sysctl handler for hw.realmem already uses ctob() to convert realmem from pages to bytes. Thus, on archs that were storing a byte count in the realmem variable, hw.realmem was inflated.
Reported by: Valerio daelli valerio dot daelli at gmail dot com (alpha) MFC after: 3 days
|
155553 |
11-Feb-2006 |
marcel |
Correct the spinlock nesting of the idle thread of the APs before we save the MCA state of the AP. Saving the MCA state of the AP requires us to allocate memory, which uses sleep locks. Now that we correct the spinlock nesting of the AP without having schedlock, avoid calling spinlock_exit(). Instead call critical_exit() and manually clear the MD spinlock count.
MFC after: 3 days
|
155455 |
08-Feb-2006 |
phk |
Simplify system time accounting for profiling.
Rename struct thread's td_sticks to td_pticks, we will need the other name for more appropriately named use shortly. Reduce it from uint64_t to u_int.
Clear td_pticks whenever we enter the kernel instead of recording its value as reference for userret(). Use the absolute value of td->pticks in userret() and eliminate third argument.
|
155444 |
07-Feb-2006 |
phk |
Modify the way we account for CPU time spent (step 1)
Keep track of time spent by the cpu in various contexts in units of "cputicks" and scale to real-world microsec^H^H^H^H^H^H^H^Hclock_t only when somebody wants to inspect the numbers.
For now "cputicks" are still derived from the current timecounter and therefore things should by definition remain sensible also on SMP machines. (The main reason for this first milestone commit is to verify that hypothesis.)
On slower machines, the avoided multiplications to normalize timestams at every context switch, comes out as a 5-7% better score on the unixbench/context1 microbenchmark. On more modern hardware no change in performance is seen.
|
155410 |
07-Feb-2006 |
marcel |
Allocate memory for the MCA state information with M_NOWAIT. We can get a MCA event at any moment and it may not be safe to sleep.
MFC after: 3 days
|
155232 |
02-Feb-2006 |
marcel |
Remove devices acpi & mem, as they are in defaults already.
|
154958 |
28-Jan-2006 |
marcel |
s/DT_IA64_PLT_RESERVE/DT_IA_64_PLT_RESERVE/
|
154496 |
18-Jan-2006 |
marcel |
o Add missing relocations. o Minor white-space fixups.
|
154491 |
17-Jan-2006 |
marcel |
s/R_IA64_/R_IA_64_/g as per the ia64 psABI.
|
154170 |
10-Jan-2006 |
phk |
Move the old BSD4.3 tty compatibility from (!BURN_BRIDGES && COMPAT_43) to COMPAT_43TTY.
Add COMPAT_43TTY to NOTES and */conf/GENERIC
Compile tty_compat.c only under the new option.
Spit out #warning "Old BSD tty API used, please upgrade." if ioctl_compat.h gets #included from userland.
|
154128 |
09-Jan-2006 |
imp |
By popular demand, move __HAVE_ACPI and __PCI_REROUTE_INTERRUPT into param.h. Per request, I've placed these just after the _NO_NAMESPACE_POLLUTION ifndef. I've not renamed anything yet, but may since we don't need the __.
Submitted by: bde, jhb, scottl, many others.
|
154017 |
04-Jan-2006 |
phk |
Use ttyalloc() instead of ttymalloc()
|
153955 |
01-Jan-2006 |
imp |
Define __HAVE_ACPI and/or __PCI_REROUTE_INTERRUPT, as appropriate for each platform. These will be used in the pci code in preference to the complicated #ifdefs we have there now.
|
153940 |
31-Dec-2005 |
netchild |
MI changes: - provide an interface (macros) to the page coloring part of the VM system, this allows to try different coloring algorithms without the need to touch every file [1] - make the page queue tuning values readable: sysctl vm.stats.pagequeue - autotuning of the page coloring values based upon the cache size instead of options in the kernel config (disabling of the page coloring as a kernel option is still possible)
MD changes: - detection of the cache size: only IA32 and AMD64 (untested) contains cache size detection code, every other arch just comes with a dummy function (this results in the use of default values like it was the case without the autotuning of the page coloring) - print some more info on Intel CPU's (like we do on AMD and Transmeta CPU's)
Note to AMD owners (IA32 and AMD64): please run "sysctl vm.stats.pagequeue" and report if the cache* values are zero (= bug in the cache detection code) or not.
Based upon work by: Chad David <davidc@acns.ab.ca> [1] Reviewed by: alc, arch (in 2004) Discussed with: alc, Chad David, arch (in 2004)
|
153741 |
26-Dec-2005 |
sobomax |
Remove kern.elf32.can_exec_dyn sysctl. Instead extend Brandinfo structure with flags bitfield and set BI_CAN_EXEC_DYN flag for all brands that usually allow executing elf dynamic binaries (aka shared libraries). When it is requested to execute ET_DYN elf image check if this flag is on after we know the elf brand allowing execution if so.
PR: kern/87615 Submitted by: Marcin Koziej <creep@desk.pl>
|
153666 |
22-Dec-2005 |
jhb |
Tweak how the MD code calls the fooclock() methods some. Instead of passing a pointer to an opaque clockframe structure and requiring the MD code to supply CLKF_FOO() macros to extract needed values out of the opaque structure, just pass the needed values directly. In practice this means passing the pair (usermode, pc) to hardclock() and profclock() and passing the boolean (usermode) to hardclock_cpu() and hardclock_process(). Other details: - Axe clockframe and CLKF_FOO() macros on all architectures. Basically, all the archs were taking a trapframe and converting it into a clockframe one way or another. Now they can just extract the PC and usermode values directly out of the trapframe and pass it to fooclock(). - Renamed hardclock_process() to hardclock_cpu() as the latter is more accurate. - On Alpha, we now run profclock() at hz (profhz == hz) rather than at the slower stathz. - On Alpha, for the TurboLaser machines that don't have an 8254 timecounter, call hardclock() directly. This removes an extra conditional check from every clock interrupt on Alpha on the BSP. There is probably room for even further pruning here by changing Alpha to use the simplified timecounter we use on x86 with the lapic timer since we don't get interrupts from the 8254 on Alpha anyway. - On x86, clkintr() shouldn't ever be called now unless using_lapic_timer is false, so add a KASSERT() to that affect and remove a condition to slightly optimize the non-lapic case. - Change prototypeof arm_handler_execute() so that it's first arg is a trapframe pointer rather than a void pointer for clarity. - Use KCOUNT macro in profclock() to lookup the kernel profiling bucket.
Tested on: alpha, amd64, arm, i386, ia64, sparc64 Reviewed by: bde (mostly)
|
153504 |
18-Dec-2005 |
marcel |
Make our ELF64 type definitions match standards. In particular this means: o Remove Elf64_Quarter, o Redefine Elf64_Half to be 16-bit, o Redefine Elf64_Word to be 32-bit, o Add Elf64_Xword and Elf64_Sxword for 64-bit entities, o Use Elf_Size in MI code to abstract the difference between Elf32_Word and Elf64_Word. o Add Elf_Ssize as the signed counterpart of Elf_Size.
MFC after: 2 weeks
|
153179 |
06-Dec-2005 |
jhb |
- Cleanup whitespace and extra ()s in vtophys() macros. - Move vtophys() macros next to vtopte() where vtopte() exists to match comments above vtopte(). - Remove references to the alternate address space in the comment above vtopte(). amd64 never had the alternate address space, and i386 lost it prior to PAE support being added. - s/entires/entries/ in comments.
Reviewed by: alc
|
153168 |
06-Dec-2005 |
ru |
Drop _MACHINE_ARCH and _MACHINE defines (not to be confused with MACHINE_ARCH and MACHINE). Their purpose was to be able to test in cpp(1), but cpp(1) only understands integer type expressions. Using such unsupported expressions introduced a number of subtle bugs, which were discovered by compiling with -Wundef.
|
153165 |
06-Dec-2005 |
ru |
Fix -Wundef warnings from compiling GENERIC and LINT kernels of all architectures.
|
152865 |
27-Nov-2005 |
ru |
- Allow duplicate "machine" directives with the same arguments. - Move existing "machine" directives to DEFAULTS.
|
152662 |
21-Nov-2005 |
jhb |
Don't enable PUC_FASTINTR by default in the source. Instead, enable it via the DEFAULTS kernel configs. This allows folks to turn it that option off in the kernel configs if desired without having to hack the source. This is especially useful since PUC_FASTINTR hangs the kernel boot on my ultra60 which has two uart(4) devices hung off of a puc(4) device.
I did not enable PUC_FASTINTR by default on powerpc since powerpc does not currently allow sharing of INTR_FAST with non-INTR_FAST like the other archs.
|
152661 |
21-Nov-2005 |
jhb |
Create DEFAULTS files for alpha, ia64, powerpc, and sparc64 and move 'device mem' over from GENERIC to DEFAULTS to be consistent with i386 and amd64. Additionally, on ia64 enable ACPI by default since ia64 requires acpi.
|
152630 |
20-Nov-2005 |
alc |
Eliminate pmap_init2(). It's no longer used.
|
152359 |
13-Nov-2005 |
alc |
In get_pv_entry() use PMAP_LOCK() instead of PMAP_TRYLOCK() when deadlock cannot possibly occur.
|
152224 |
09-Nov-2005 |
alc |
Reimplement the reclamation of PV entries. Specifically, perform reclamation synchronously from get_pv_entry() instead of asynchronously as part of the page daemon. Additionally, limit the reclamation to inactive pages unless allocation from the PV entry zone or reclamation from the inactive queue fails. Previously, reclamation destroyed mappings to both inactive and active pages. get_pv_entry() still, however, wakes up the page daemon when reclamation occurs. The reason being that the page daemon may move some pages from the active queue to the inactive queue, making some new pages available to future reclamations.
Print the "reclaiming PV entries" message at most once per minute, but don't stop printing it after the fifth time. This way, we do not give the impression that the problem has gone away.
Reviewed by: tegge
|
152042 |
04-Nov-2005 |
alc |
Begin and end the initialization of pvzone in pmap_init(). Previously, pvzone's initialization was split between pmap_init() and pmap_init2(). This split initialization was the underlying cause of some UMA panics during initialization. Specifically, if the UMA boot pages was exhausted before the pvzone was fully initialized, then UMA, through no fault of its own, would use an inappropriate back-end allocator leading to a panic. (Previously, as a workaround, we have increased the UMA boot pages.) Fortunately, there is no longer any reason that pvzone's initialization cannot be completed in pmap_init().
Eliminate a check for whether pv_entry_high_water has been initialized or not from get_pv_entry(). Since pvzone's initialization is completed in pmap_init(), this check is no longer needed.
Use cnt.v_page_count, the actual count of available physical pages, instead of vm_page_array_size to compute the maximum number of pv entries.
Introduce the vm.pmap.pv_entries tunable on alpha and ia64.
Eliminate some unnecessary white space.
Discussed with: tegge (item #1) Tested by: marcel (ia64)
|
152002 |
03-Nov-2005 |
alc |
Remove the remaining spl*() calls. Add some assertions. Eliminate some excessive white space.
|
151897 |
31-Oct-2005 |
rwatson |
Normalize a significant number of kernel malloc type names:
- Prefer '_' to ' ', as it results in more easily parsed results in memory monitoring tools such as vmstat.
- Remove punctuation that is incompatible with using memory type names as file names, such as '/' characters.
- Disambiguate some collisions by adding subsystem prefixes to some memory types.
- Generally prefer lower case to upper case.
- If the same type is defined in multiple architecture directories, attempt to use the same name in additional cases.
Not all instances were caught in this change, so more work is required to finish this conversion. Similar changes are required for UMA zone names.
|
151885 |
30-Oct-2005 |
marcel |
Remove a stray return statement in the interrupt dispatch function that caused a premature exit after calling a fast interrupt handler and bypassing a much needed critical_exit() and the scheduling of the interrupt thread for non-fast handlers. In short: unbreak :-)
|
151658 |
25-Oct-2005 |
jhb |
Reorganize the interrupt handling code a bit to make a few things cleaner and increase flexibility to allow various different approaches to be tried in the future. - Split struct ithd up into two pieces. struct intr_event holds the list of interrupt handlers associated with interrupt sources. struct intr_thread contains the data relative to an interrupt thread. Currently we still provide a 1:1 relationship of events to threads with the exception that events only have an associated thread if there is at least one threaded interrupt handler attached to the event. This means that on x86 we no longer have 4 bazillion interrupt threads with no handlers. It also means that interrupt events with only INTR_FAST handlers no longer have an associated thread either. - Renamed struct intrhand to struct intr_handler to follow the struct intr_foo naming convention. This did require renaming the powerpc MD struct intr_handler to struct ppc_intr_handler. - INTR_FAST no longer implies INTR_EXCL on all architectures except for powerpc. This means that multiple INTR_FAST handlers can attach to the same interrupt and that INTR_FAST and non-INTR_FAST handlers can attach to the same interrupt. Sharing INTR_FAST handlers may not always be desirable, but having sio(4) and uhci(4) fight over an IRQ isn't fun either. Drivers can always still use INTR_EXCL to ask for an interrupt exclusively. The way this sharing works is that when an interrupt comes in, all the INTR_FAST handlers are executed first, and if any threaded handlers exist, the interrupt thread is scheduled afterwards. This type of layout also makes it possible to investigate using interrupt filters ala OS X where the filter determines whether or not its companion threaded handler should run. - Aside from the INTR_FAST changes above, the impact on MD interrupt code is mostly just 's/ithread/intr_event/'. - A new MI ddb command 'show intrs' walks the list of interrupt events dumping their state. It also has a '/v' verbose switch which dumps info about all of the handlers attached to each event. - We currently don't destroy an interrupt thread when the last threaded handler is removed because it would suck for things like ppbus(8)'s braindead behavior. The code is present, though, it is just under #if 0 for now. - Move the code to actually execute the threaded handlers for an interrrupt event into a separate function so that ithread_loop() becomes more readable. Previously this code was all in the middle of ithread_loop() and indented halfway across the screen. - Made struct intr_thread private to kern_intr.c and replaced td_ithd with a thread private flag TDP_ITHREAD. - In statclock, check curthread against idlethread directly rather than curthread's proc against idlethread's proc. (Not really related to intr changes)
Tested on: alpha, amd64, i386, sparc64 Tested on: arm, ia64 (older version of patch by cognet and marcel)
|
151543 |
21-Oct-2005 |
ade |
Specifically panic() in the case where pmap_insert_entry() fails to get a new pv under high system load where the available pv entries have been exhausted before the pagedaemon has a chance to wake up to reclaim some.
Prior to this, the NULL pointer dereference ended up causing secondary panics with rather less than useful resulting tracebacks.
Reviewed by: alc, jhb MFC after: 1 week
|
151388 |
16-Oct-2005 |
phk |
Make ttyconsolemode() call ttsetwater() so that drivers don't have to.
|
151316 |
14-Oct-2005 |
davidxu |
1. Change prototype of trapsignal and sendsig to use ksiginfo_t *, most changes in MD code are trivial, before this change, trapsignal and sendsig use discrete parameters, now they uses member fields of ksiginfo_t structure. For sendsig, this change allows us to pass POSIX realtime signal value to user code.
2. Remove cpu_thread_siginfo, it is no longer needed because we now always generate ksiginfo_t data and feed it to libpthread.
3. Add p_sigqueue to proc structure to hold shared signals which were blocked by all threads in the proc.
4. Add td_sigqueue to thread structure to hold all signals delivered to thread.
5. i386 and amd64 now return POSIX standard si_code, other arches will be fixed.
6. In this sigqueue implementation, pending signal set is kept as before, an extra siginfo list holds additional siginfo_t data for signals. kernel code uses psignal() still behavior as before, it won't be failed even under memory pressure, only exception is when deleting a signal, we should call sigqueue_delete to remove signal from sigqueue but not SIGDELSET. Current there is no kernel code will deliver a signal with additional data, so kernel should be as stable as before, a ksiginfo can carry more information, for example, allow signal to be delivered but throw away siginfo data if memory is not enough. SIGKILL and SIGSTOP have fast path in sigqueue_add, because they can not be caught or masked. The sigqueue() syscall allows user code to queue a signal to target process, if resource is unavailable, EAGAIN will be returned as specification said. Just before thread exits, signal queue memory will be freed by sigqueue_flush. Current, all signals are allowed to be queued, not only realtime signals.
Earlier patch reviewed by: jhb, deischen Tested on: i386, amd64
|
151006 |
06-Oct-2005 |
phk |
Eliminate need for __RMAN_RESOURCE_VISIBLE
Reviewed by: marcel@
|
150663 |
28-Sep-2005 |
rwatson |
Back out alpha/alpha/trap.c:1.124, osf1_ioctl.c:1.14, osf1_misc.c:1.57, osf1_signal.c:1.41, amd64/amd64/trap.c:1.291, linux_socket.c:1.60, svr4_fcntl.c:1.36, svr4_ioctl.c:1.23, svr4_ipc.c:1.18, svr4_misc.c:1.81, svr4_signal.c:1.34, svr4_stat.c:1.21, svr4_stream.c:1.55, svr4_termios.c:1.13, svr4_ttold.c:1.15, svr4_util.h:1.10, ext2_alloc.c:1.43, i386/i386/trap.c:1.279, vm86.c:1.58, unaligned.c:1.12, imgact_elf.c:1.164, ffs_alloc.c:1.133:
Now that Giant is acquired in uprintf() and tprintf(), the caller no longer leads to acquire Giant unless it also holds another mutex that would generate a lock order reversal when calling into these functions. Specifically not backed out is the acquisition of Giant in nfs_socket.c and rpcclnt.c, where local mutexes are held and would otherwise violate the lock order with Giant.
This aligns this code more with the eventual locking of ttys.
Suggested by: bde
|
150631 |
27-Sep-2005 |
peter |
Implement 32 bit getcontext/setcontext/swapcontext on amd64. I've added stubs for ia64 to keep it compiling. These are used by 32 bit apps such as gdb.
|
150627 |
27-Sep-2005 |
jhb |
Add a new atomic_fetchadd() primitive that atomically adds a value to a variable and returns the previous value of the variable.
Tested on: i386, alpha, sparc64, arm (cognet) Reviewed by: arch@ Submitted by: cognet (arm) MFC after: 1 week
|
150335 |
19-Sep-2005 |
rwatson |
Add GIANT_REQUIRED and WITNESS sleep warnings to uprintf() and tprintf(), as they both interact with the tty code (!MPSAFE) and may sleep if the tty buffer is full (per comment).
Modify all consumers of uprintf() and tprintf() to hold Giant around calls into these functions. In most cases, this means adding an acquisition of Giant immediately around the function. In some cases (nfs_timer()), it means acquiring Giant higher up in the callout.
With these changes, UFS no longer panics on SMP when either blocks are exhausted or inodes are exhausted under load due to races in the tty code when running without Giant.
NB: Some reduction in calls to uprintf() in the svr4 code is probably desirable.
NB: In the case of nfs_timer(), calling uprintf() while holding a mutex, or even in a callout at all, is a bad idea, and will generate warnings and potential upset. This needs to be fixed, but was a problem before this change.
NB: uprintf()/tprintf() sleeping is generally a bad ideas, as is having non-MPSAFE tty code.
MFC after: 1 week
|
150270 |
18-Sep-2005 |
csjp |
Introduce a kernel config for the Mandatory Access Control framework. This kernel config briefly describes some of the major MAC policies available on FreeBSD. The hope is that this will raise the awareness about MAC and get more people interested.
Discussed with: scottl
|
150008 |
11-Sep-2005 |
alc |
Eliminate unused definitions.
|
150003 |
11-Sep-2005 |
obrien |
Canonize the include of acpi.h.
|
149926 |
10-Sep-2005 |
marcel |
Merge db_interface.c and db_trace.c into db_machdep.c.
|
149925 |
10-Sep-2005 |
marcel |
Move the prototypes of db_md_set_watchpoint(), db_md_clr_watchpoint() and db_md_list_watchpoints() to ddb/ddb.h.
|
149924 |
10-Sep-2005 |
marcel |
Move the ia32_sigcode structure from ia32_sigtramp.c to ia32_signal.c. It's a bit excessive to have it in a file of its own.
|
149923 |
10-Sep-2005 |
marcel |
Remove redundant $FreeBSD$
|
149915 |
09-Sep-2005 |
marcel |
Change the High FP lock from a sleep lock to a spin lock. We can take the lock from interrupt context, which causes an implicit lock order reversal. We've been using the lock carefully enough that making it a spin lock should not be harmful.
|
149807 |
05-Sep-2005 |
marcel |
Milestone: enable SMP by default.
|
149806 |
05-Sep-2005 |
marcel |
o In pmap_remove_pte: always invalidate the page. Previously the page was not invalidated if the PTE was not actually being removed. In an UP kernel this didn't cause problems, because the new mapping would preempt the old one. In an SMP kernel this could lead to the use of stale translations when processes move between CPUs at the "right" moment. This fixes the last of the obvious SMP problems and it should be safe to enable SMP by default now. o In pmap_remove_pte: minor code refactoring to avoid duplication. o Test all PTE pointers against NULL. Don't use implicit boolean tests.
|
149777 |
03-Sep-2005 |
marcel |
o s/vhpt_size/pmap_vhpt_log2size/g o s/vhpt_base/pmap_vhpt_base/g o s/vhpt_bucket/pmap_vhpt_bucket/g o Declare the above in <machine/pmap.h> o Move the vm.stats.vhpt.* sysctls to machdep.vhpt.* o Create a tunable machdep.vhpt.log2size, with corresponding sysctl. The tunable allows the user to specify the VHPT size from the loader. o Don't keep track of the number of PTEs in the VHPT. Calculate the population when necessary by iterating the buckets and summing up the length of the buckets. o Don't perform the tpa instruction with a bucket lock held. The instruction can (theoretically) fault and locking is not needed.
|
149770 |
03-Sep-2005 |
marcel |
Fix collision chain termination checks. The result of IA64_PHYS_TO_RR7 is never 0, so one cannot test for a NULL pointer after a physical address is translated into a virtual pointer with said macro. Instead, keep the physical address around and test it against 0. Note that this obviously implies that a PTE can never be allocated at physical address 0. This isn't exactly guaranteed, but hasn't been a problem so far. We test the physical address against 0 for as long as the ia64 port exists...
|
149768 |
03-Sep-2005 |
alc |
Pass a value of type vm_prot_t to pmap_enter_quick() so that it determine whether the mapping should permit execute access.
|
149337 |
20-Aug-2005 |
stefanf |
Move MINSIGSTKSZ from <machine/signal.h> to <machine/_limits.h> and rename it to __MINSIGSTKSZ. Define MINSIGSTKSZ in <sys/signal.h>.
This is done in order to use MINSIGSTKSZ for the macro PTHREAD_STACK_MIN in <pthread.h> (soon <limits.h>) without having to include the whole <sys/signal.h> header.
Discussed with: bde
|
149062 |
14-Aug-2005 |
marcel |
Remove the execute permission for stacks.
|
149037 |
13-Aug-2005 |
marcel |
o s/pmap_lpte_/pmap_/g o Remove pmap_is_referenced(). It was already compiled-out.
|
149036 |
13-Aug-2005 |
marcel |
Fix the problem with the IPI for the lazy context switching of the high FP registers. It was not that the IPI got lost due to the perceived unreliability of the IPI delivery, but rather that the IPI was not assigned a vector (ugh). Sending a 0 vector to a CPU results in a stray external interrupt. Add a KASSERT to ipi_send() to catch this. The initialization of the IPIs could be better, but it's not at all sure what the future of the code is. Avoid wasting a lot of time on something that is going to be rewritten anyway.
|
148807 |
06-Aug-2005 |
marcel |
Improve SMP support: o Allocate a VHPT per CPU. The VHPT is a hash table that the CPU uses to look up translations it can't find in the TLB. As such, the VHPT serves as a level 1 cache (the TLB being a level 0 cache) and best results are obtained when it's not shared between CPUs. The collision chain (i.e. the hash bucket) is shared between CPUs, as all buckets together constitute our collection of PTEs. To achieve this, the collision chain does not point to the first PTE in the list anymore, but to a hash bucket head structure. The head structure contains the pointer to the first PTE in the list, as well as a mutex to lock the bucket. Thus, each bucket is locked independently of each other. With at least 1024 buckets in the VHPT, this provides for sufficiently finei-grained locking to make the ssolution scalable to large SMP machines. o Add synchronisation to the lazy FP context switching. We do this with a seperate per-thread lock. On SMP machines the lazy high FP context switching without synchronisation caused inconsistent state, which resulted in a panic. Since the use of the high FP registers is not common, it's possible that races exist. The ia64 package build has proven to be a good stress test, so this will get plenty of exercise in the near future. o Don't use the local ID of the processor we want to send the IPI to as the argument to ipi_send(). use the struct pcpu pointer instead. The reason for this is that IPI delivery is unreliable. It has been observed that sending an IPI to a CPU causes it to receive a stray external interrupt. As such, we need a way to make the delivery reliable. The intended solution is to queue requests in the target CPU's per-CPU structure and use a single IPI to inform the CPU that there's a new entry in the queue. If that IPI gets lost, the CPU can check it's queue at any convenient time (such as for each clock interrupt). This also allows us to send requests to a CPU without interrupting it, if such would be beneficial.
With these changes SMP is almost working. There are still some random process crashes and the machine can hang due to having the IPI lost that deals with the high FP context switch.
The overhead of introducing the hash bucket head structure results in a performance degradation of about 1% for UP (extra pointer indirection). This is surprisingly small and is offset by gaining reasonably/good scalable SMP support.
|
148805 |
06-Aug-2005 |
marcel |
Reduce the default MAXCPU from 16 to 4. This is in preparation of allocating a VHPT per CPU. Since we don't yet know how many CPUs are actually in the system at the time we need to allocate the VHPTs, we allocate for MAXCPU processors. This can result in a lot of wasted space for 2-way machines. So, for now, limit MAXCPU to something smaller until we have something more dynamic.
|
148804 |
06-Aug-2005 |
marcel |
For ia64_ptc_{e,g,ga,l}(), use instruction serialization. We typically don't know what the TLB described and need to assume that it affects the fetching of instructions.
|
148666 |
03-Aug-2005 |
jeff |
- Add support for saving stack traces and displaying them via printf(9) and KTR.
Contributed by: Antoine Brodin <antoine.brodin@laposte.net> Concept code from: Neal Fachan <neal@isilon.com>
|
148067 |
15-Jul-2005 |
jhb |
Convert the atomic_ptr() operations over to operating on uintptr_t variables rather than void * variables. This makes it easier and simpler to get asm constraints and volatile keywords correct.
MFC after: 3 days Tested on: i386, alpha, sparc64 Compiled on: ia64, powerpc, amd64 Kernel toolchain busted on: arm
|
147991 |
14-Jul-2005 |
kensmith |
Add recently invented COMPAT_FREEBSD5 option.
MFC after: 3 days
|
147889 |
10-Jul-2005 |
davidxu |
Validate if the value written into {FS,GS}.base is a canonical address, writting non-canonical address can cause kernel a panic, by restricting base values to 0..VM_MAXUSER_ADDRESS, ensuring only canonical values get written to the registers.
Reviewed by: peter, Josepha Koshy < joseph.koshy at gmail dot com > Approved by: re (scottl)
|
147773 |
05-Jul-2005 |
marcel |
Enhance ia64_flush_dirty() to handle the case in which td != curthread. This case is triggered with ptrace(2) and the PT_SETREGS function. Change the return type of the function to int so that errors can be passed on to the caller.
Approved by: re (scottl)
|
147745 |
02-Jul-2005 |
marcel |
Implement functions calls from within DDB on ia64. On ia64 a function pointer doesn't point to the first instruction of that function, but rather to a descriptor. The descriptor has the address of the first instruction, as well as the value of the global pointer. The symbol table doesn't know anything about descriptors, so if you lookup the name of a function you get the address of the first instruction. The cast from the address, which is the result of the symbol lookup, to a function pointer as is done in db_fncall is therefore invalid. Abstract this detail behind the DB_CALL macro. By default DB_CALL is defined as db_fncall_generic, which yields the old behaviour. On ia64 the macro is defined as db_fncall_ia64, in which a descriptor is constructed to yield a valid function pointer.
While here, introduce DB_MAXARGS. DB_MAXARGS replaces the existing (local) MAXARGS. The DB_MAXARGS macro can be defined by platforms to create a convenient maximum. By default this will be the legacy 10. On ia64 we define this macro to be 8, for 8 is the maximum number of arguments that can be passed in registers. This avoids having to implement spilling of arguments on the memory stack.
Approved by: re (dwhite)
|
147740 |
02-Jul-2005 |
marcel |
Fix a buglet that was present in the ia64 code and that got inherited by amd64 and i386: For buffered writes we collect data and write it out a ${DEV_BSIZE}-sized block at a time. The fragsz variable is used to keep track of how much data we have collected in the buffer so far and it's reset to zero immediately after writing a block to the dump device. When the last, possibly partially filled buffer is flushed, we didn't reset fragsz to 0 and as such would stop reflecting reality. Since we currently only need to do buffered writes once, this isn't a problem. However, when kernel dumps are made by hand (say by callling doadump from within DDB), the improperly cleared state from the first call to dumpsys causes the next call to dumpsys to create an invalid code file. This change resets fragsz after flushing the partially filled buffer so that it fixes the two problems at once.
Approved by: re (scottl)
|
147692 |
30-Jun-2005 |
peter |
Jumbo-commit to enhance 32 bit application support on 64 bit kernels. This is good enough to be able to run a RELENG_4 gdb binary against a RELENG_4 application, along with various other tools (eg: 4.x gcore). We use this at work.
ia32_reg.[ch]: handle the 32 bit register file format, used by ptrace, procfs and core dumps. procfs_*regs.c: vary the format of proc/XXX/*regs depending on the client and target application. procfs_map.c: Don't print a 64 bit value to 32 bit consumers, or their sscanf fails. They expect an unsigned long. imgact_elf.c: produce a valid 32 bit coredump for 32 bit apps. sys_process.c: handle 32 bit consumers debugging 32 bit targets. Note that 64 bit consumers can still debug 32 bit targets.
IA64 has got stubs for ia32_reg.c.
Known limitations: a 5.x/6.x gdb uses get/setcontext(), which isn't implemented in the 32/64 wrapper yet. We also make a tiny patch to gdb pacify it over conflicting formats of ld-elf.so.1.
Approved by: re
|
147640 |
27-Jun-2005 |
marcel |
Handle B-unit break instructions. The break.b is unique in that the immediate is not saved by the architecture. Any of the break.{mifx} instructions have their immediate saved in cr.iim on interruption. Consequently, when we handle the break interrupt, we end up with a break value of 0 when it was a break.b. The immediate is important because it distinguishes between different uses of the break and which are defined by the runtime specification. The bottomline is that when the GNU debugger replaces a B-unit instruction with a break instruction in the inferior, we would not send the process a SIGTRAP when we encounter it, because the value is not one we recognize as a debugger breakpoint.
This change adds logic to decode the bundle in which the break instruction lives whenever the break value is 0. The assumption being that it's a break.b and we fetch the immediate directly out of the instruction. If the break instruction was not a break.b, but any of break.{mifx} with an immediate of 0, we would be doing unnecessary work. But since a break 0 is invalid, this is not a problem and it will still result in a SIGILL being sent to the process.
Approved by: re (scottl)
|
147639 |
27-Jun-2005 |
marcel |
Replace the existing copyright notice with my own. Over the years I've changed this file so much that it's equivalent to a rewrite, and I'm not talking about any of the cosmetic changes of course.
Approved by: re (scottl)
|
147638 |
27-Jun-2005 |
marcel |
Cosmetic: s/u_int64_t/uint64_t/g
Approved by: re (scottl)
|
147504 |
20-Jun-2005 |
obrien |
Add .cvsignore files just like in sys/<arch>/compiled, this keeps CVS from questing kernel config files not in CVS.
Approved by: re(kensmith)
|
147322 |
12-Jun-2005 |
marcel |
Define IPI_PREEMPT. Update a nearby comment while I'm here.
|
147217 |
10-Jun-2005 |
alc |
Introduce a procedure, pmap_page_init(), that initializes the vm_page's machine-dependent fields. Use this function in vm_pageq_add_new_page() so that the vm_page's machine-dependent and machine-independent fields are initialized at the same time.
Remove code from pmap_init() for initializing the vm_page's machine-dependent fields.
Remove stale comments from pmap_init().
Eliminate the Boolean variable pmap_initialized from the alpha, amd64, i386, and ia64 pmap implementations. Its use is no longer required because of the above changes and earlier changes that result in physical memory that is being mapped at initialization time being mapped without pv entries.
Tested by: cognet, kensmith, marcel
|
147191 |
09-Jun-2005 |
jkoshy |
MFP4:
- Implement sampling modes and logging support in hwpmc(4).
- Separate MI and MD parts of hwpmc(4) and allow sharing of PMC implementations across different architectures. Add support for P4 (EMT64) style PMCs to the amd64 code.
- New pmcstat(8) options: -E (exit time counts) -W (counts every context switch), -R (print log file).
- pmc(3) API changes, improve our ability to keep ABI compatibility in the future. Add more 'alias' names for commonly used events.
- bug fixes & documentation.
|
146794 |
29-May-2005 |
marcel |
Create nexus in configure_first() instead of in configure(). This makes sure that sysinit tasks that run after configure_first(), but before configure() have a nexus to hang devices off.
|
146791 |
29-May-2005 |
marcel |
Call cninit_finish() in configure_final().
|
146734 |
29-May-2005 |
nyan |
Remove bus_{mem,p}io.h and related code for a micro-optimization on i386 and amd64. The optimization is a trivial on recent machines.
Reviewed by: -arch (imp, marcel, dfr)
|
146214 |
14-May-2005 |
nyan |
- Move bus dependent defines to {isa,cbus}_dmareg.h. - Use isa/isareg.h rather than <arch>/isa/isa.h.
Tested on: i386, pc98
|
146037 |
10-May-2005 |
marcel |
Don't define _MACHINE_BUS_MEMIO_H_ nor _MACHINE_BUS_PIO_H_.
|
145433 |
23-Apr-2005 |
davidxu |
Change cpu_set_kse_upcall to more generic style, so we can reuse it in other codes. Add cpu_set_user_tls, use it to tweak user register and setup user TLS. I ever wanted to merge it into cpu_set_kse_upcall, but since cpu_set_kse_upcall is also used by M:N threads which may not need this feature, so I wrote a separated cpu_set_user_tls.
|
145389 |
22-Apr-2005 |
marcel |
Sanity the RTC code: o Remove the clock interface. Not only does it conflict with the MI version when device genclock is added to the kernel, it was also not possible to have more than 1 clock device. This of course would have been a problem if we actually had more than 1 clock device. In short: we don't need a clock interface and if we do eventually, we should be using the MI one. o Rewrite inittodr() and resettodr() to take into account that: 1) We use the EFI interface directly. 2) time_t is 64-bit and we do need to make sure we can determine leap years from year 2100 and on. Add a nice explanation of where leap years come from and why. 3) This rewrite happened in 2005 so any date prior to 1/1/2005 (either M/D/Y or D/M/Y) is bogus. Reprogram the EFI clock with 1/1/2005 in that case. 4) The EFI clock has a high probability of being correct, so only (further) correct the EFI clock when the file system time is larger. That should never happen in a time-synchronised world. Complain when EFI lost 2 days or more.
Replace the copyright notice now that I (pretty much) rewrote all of this file.
|
145332 |
20-Apr-2005 |
marcel |
Add empty header (except of the multiple-inclusion protection) to get hwpmc(4) to compile on this platform.
|
145253 |
18-Apr-2005 |
imp |
Break out the definition of bus_space_{tag,handle}_t and a few other types into _bus.h to help with name space polution from including all of bus.h. In a few days, I'll commit changes to the MI code to take advantage of thse sepration (after I've made sure that these changes don't break anything in the main tree, I've tested in my trees, but you never know...).
Suggested by: bde (in 2002 or 2003 I think) Reviewed in principle by: jhb
|
145173 |
16-Apr-2005 |
marcel |
Add a kpte command to DDB. It dumps the PTE of a KVA. This helps to analyze faults and TLB/VHPT inconsistencies.
|
145137 |
16-Apr-2005 |
marcel |
Return better "error" values for UWX_BOTTOM and UWX_ABI_FRAME in unw_step(). Both errors denote the end of a stack trace (i.e. no prior frame), but are otherwise not error conditions. Have db_trace() return 0 when the trace ends due to one of these return codes as they are really normal termination conditions.
This change especially improves the output of the "show thread" command in DDB when there are threads in fork_trampoline() and previously db_trace() would return an error, causing the show command to emit '***'.
|
145092 |
15-Apr-2005 |
marcel |
Initialize curthread before we save the APs MCA state. Saving the MCA state requires a spin lock, which requires a valid curthread. This change allows SMP kernels to boot into multi-user again.
While here, update the copyright notice and use __FBSDID for the revision string.
|
144971 |
12-Apr-2005 |
jhb |
Use PCPU_LAZY_INC() for cnt.v_{intr,trap,syscalls} rather than atomic operations in some places and simple non-per CPU math in others.
|
144962 |
12-Apr-2005 |
marcel |
Dot the i's: 1 Move the debug.clock_adjust_* sysctls to debug.clock.adjust_* to make it easier to get only the clock statistics. 2 Make the sysctls read-only [suggested by Marius]. 3 When determining the new clock adjustment, we checked for an error either larger than 12.5% or smaller than 12.5%. We left out an error of exactly 12.5%. For errors larger than 12.5% we adjust the clock reload value in such a way that the next clock interrupt would be early (as in premature). For errors less than 12.5% we stopped the adjustment. The current algorithm doesn't benefit from excluding an error of exactly 12.5%. Change the code to stop adjusting the clock if the error is *not* larger than 12.5% [suggested by Marius].
Discussed with: marius@
|
144637 |
04-Apr-2005 |
jhb |
Divorce critical sections from spinlocks. Critical sections as denoted by critical_enter() and critical_exit() are now solely a mechanism for deferring kernel preemptions. They no longer have any affect on interrupts. This means that standalone critical sections are now very cheap as they are simply unlocked integer increments and decrements for the common case.
Spin mutexes now use a separate KPI implemented in MD code: spinlock_enter() and spinlock_exit(). This KPI is responsible for providing whatever MD guarantees are needed to ensure that a thread holding a spin lock won't be preempted by any other code that will try to lock the same lock. For now all archs continue to block interrupts in a "spinlock section" as they did formerly in all critical sections. Note that I've also taken this opportunity to push a few things into MD code rather than MI. For example, critical_fork_exit() no longer exists. Instead, MD code ensures that new threads have the correct state when they are created. Also, we no longer try to fixup the idlethreads for APs in MI code. Instead, each arch sets the initial curthread and adjusts the state of the idle thread it borrows in order to perform the initial context switch.
This change is largely a big NOP, but the cleaner separation it provides will allow for more efficient alternative locking schemes in other parts of the kernel (bare critical sections rather than per-CPU spin mutexes for per-CPU data for example).
Reviewed by: grehan, cognet, arch@, others Tested on: i386, alpha, sparc64, powerpc, arm, possibly more
|
143985 |
22-Mar-2005 |
sobomax |
Add USB Communication Device Class Ethernet driver. Originally written for FreeBSD based on aue(4) it was picked by OpenBSD, then from OpenBSD ported to NetBSD and finally NetBSD version merged with original one goes into FreeBSD.
Obtained from: http://www.gank.org/freebsd/cdce/ NetBSD OpenBSD
|
143867 |
20-Mar-2005 |
njl |
s/SLIST/STAILQ to catch up with changes to resource lists.
Missed by: imp
|
143809 |
18-Mar-2005 |
murray |
Add a comment to note that pseudo-device bpf is required for DHCP. This is mentioned in the Handbook but it is not as obvious to new users why bpf is needed compared to the other largely self-explanatory items in GENERIC.
PR: conf/40855 MFC after: 1 week
|
143796 |
18-Mar-2005 |
iedowse |
Split configure() into 3 separate steps like we do on other architectures. This makes it possible to insert hooks before and after the device attachment step.
Tested thanks to: marcel
|
143598 |
14-Mar-2005 |
scottl |
Refactor the bus_dma header files so that the interface is described in sys/bus_dma.h instead of being copied in every single arch. This slightly reorders a flag that was specific to AXP and thus changes the ABI there. The interface still relies on bus_space definitions found in <machine/bus.h> so it cannot be included on its own yet, but that will be fixed at a later date. Add an MD <machine/bus_dma.h> for ever arch for consistency and to allow for future MD augmentation of the API. sparc64 makes heavy use of this right now due to its different bus_dma implemenation.
|
143202 |
07-Mar-2005 |
scottl |
Remove dead code.
|
143063 |
02-Mar-2005 |
joerg |
netchild's mega-patch to isolate compiler dependencies into a central place.
This moves the dependency on GCC's and other compiler's features into the central sys/cdefs.h file, while the individual source files can then refer to #ifdef __COMPILER_FEATURE_FOO where they by now used to refer to #if __GNUC__ > 3.1415 && __BARC__ <= 42.
By now, GCC and ICC (the Intel compiler) have been actively tested on IA32 platforms by netchild. Extension to other compilers is supposed to be possible, of course.
Submitted by: netchild Reviewed by: various developers on arch@, some time ago
|
143057 |
02-Mar-2005 |
marcel |
Make sure fpswa_iface equals NULL when bootinfo.bi_fpswa equals 0. We need to be able to test for the (possible) non-existence of the FPSWA code.
PR: ia64/77591 Submitted by: Christian Kandeler (christian dot kandeler at hob dot de) MFC after: 1 day
|
142956 |
01-Mar-2005 |
wes |
Attempt to doff the pointy hat: implement 'hw.realmem' on remaining architectures. Pointed out by O'Brien, ScottL via email.
Reviewed by: obrien (various)
|
142436 |
25-Feb-2005 |
delphij |
Remove acpi_perf from {ARCH}/conf/NOTES, to make tinderbox happy.
Reported by: tinderbox Inspired by: acpi_perf build structure removal commit
|
142107 |
19-Feb-2005 |
ru |
Use a common multi-inclusion protection, and add such a protection to alpha/include/exec.h.
|
141556 |
09-Feb-2005 |
marcel |
s/descr/oid_descr/
|
141391 |
06-Feb-2005 |
phk |
Since we are quite unlikely to ever face another platform which uses the i8237 without trying to emulate the PC architecture move the register definitions for the i8237 chip into the central include file for the chip, except for the PC98 case which is magic.
Add new isa_dmatc() function which tells us as cheaply as possible if the terminal count has been reached for a given channel.
|
141378 |
06-Feb-2005 |
njl |
Finish the job of sorting all includes and fix the build by including malloc.h before proc.h on sparc64. Noticed by das@
Compiled on: alpha, amd64, i386, pc98, sparc64
|
141369 |
05-Feb-2005 |
njl |
Build cpufreq and acpi_perf on platforms that are likely to be able to use them.
|
141248 |
04-Feb-2005 |
marcel |
Include sys/bus.h before sys/cpu.h. The latter needs device_t.
|
141237 |
04-Feb-2005 |
njl |
Add an implementation of cpu_est_clockrate(9). This function estimates the current clock frequency for the given CPU id in units of Hz.
|
141085 |
31-Jan-2005 |
imp |
nit in /*-
|
140891 |
27-Jan-2005 |
marcel |
Fix handling of post increment: Either the first or second operand is the register with the memory address, and it's that register's value we need to increment or decrement.
MFC after: 3 days
|
140418 |
18-Jan-2005 |
scottl |
Fix compile errors. Bah.
|
140315 |
15-Jan-2005 |
scottl |
Fix an assignment that I missed in the last commit.
|
140311 |
15-Jan-2005 |
scottl |
Add bus_dmamap_load_mbuf_sg() to ia64
|
140256 |
14-Jan-2005 |
jhb |
- Remove some OBE comments regarding cpu_exit(). cpu_exit() is no longer the last action of kern_exit(). Instead, it is a MD callout to cleanup per-process state during exit. - Add notes of concern to Alpha and ia64 about the possible need to drop fp state in cpu_thread_exit() rather than in cpu_exit() since it is per-thread state rather than per-process.
|
139790 |
06-Jan-2005 |
imp |
/* -> /*- for copyright notices, minor format tweaks as necessary
|
139554 |
02-Jan-2005 |
marcel |
Further enhance the handling of misaligned loads and stores: o implement double-extended and single precision loads and stores, o implement double precision stores, o replace the machdep.unaligned_print sysctl with debug.unaligned_print and change the default value to 0, o replace the machdep.unaligned_sigbus sysctl with debug.unaligned_test, o Remmove the fillfd() function. The function is trvial enough for inline assembly.
The debug.unaligned_test sysctl is used to test the emulation of misaligned loads and stores. When PSR.ac is 0, the CPU will handle misaligned memory accesses itselfi and we don't get an exception for it. When PSR.ac is 1, the process needs to be signalled and we should not emulate. The sysctl takes effect when PSR.ac is 1 and tells us that we should emulate and not send a signal.
PR: 72268 MFC after: 1 week
|
139241 |
23-Dec-2004 |
alc |
Modify pmap_enter_quick() so that it expects the page queues to be locked on entry and it assumes the responsibility for releasing the page queues lock if it must sleep.
Remove a bogus comment from pmap_enter_quick().
Using the first change, modify vm_map_pmap_enter() so that the page queues lock is acquired and released once, rather than each time that a page is mapped.
|
138897 |
15-Dec-2004 |
alc |
In the common case, pmap_enter_quick() completes without sleeping. In such cases, the busying of the page and the unlocking of the containing object by vm_map_pmap_enter() and vm_fault_prefault() is unnecessary overhead. To eliminate this overhead, this change modifies pmap_enter_quick() so that it expects the object to be locked on entry and it assumes the responsibility for busying the page and unlocking the object if it must sleep. Note: alpha, amd64, i386 and ia64 are the only implementations optimized by this change; arm, powerpc, and sparc64 still conservatively busy the page and unlock the object within every pmap_enter_quick() call.
Additionally, this change is the first case where we synchronize access to the page's PG_BUSY flag and busy field using the containing object's lock rather than the global page queues lock. (Modifications to the page's PG_BUSY flag and busy field have asserted both locks for several weeks, enabling an incremental transition.)
|
138752 |
12-Dec-2004 |
marcel |
Fix the last of the instability and the cause of the annoying "vm_fault: fault on nofault entry, addr: %lx" panic. The problem was a stale PTE in the TLB that marked the page as not present, even though we had a good PTE in the VHPT. We typically don't yet insert PTEs in the TLB. We do that lazily. The CPU will look for the PTE in the VHPT when there's no PTE in the TLB. Unfortunately this doesn't handle the case of the stale PTE in the TLB. The quick fix is to invalidate the TLB (sloppily) when the VHPT doesn't contain a valid PTE. This is also the only case that may cause a PTE in the TLB that marks a page as non-present.
|
138674 |
11-Dec-2004 |
marcel |
Use primitive types to avoid creating an artificial header dependency: o s/u_long/unsigned long/ o s/uint32_t/unsigned int/g o s/uint64_t/unsigned long/g
Trigger case: multimedia/mpeg2codec
|
138543 |
08-Dec-2004 |
marcel |
Don't obtain the HCDP address directly from the bootinfo structure. Use a function to keep the details at arms length from uart(4).
|
138253 |
01-Dec-2004 |
marcel |
Change gdb_cpu_setreg() to not take the value to which to set the specified register, but a pointer to the in-memory representation of that value. The reason for this is twofold: 1. Not all registers can be represented by a register_t. In particular FP registers fall in that category. Passing the new register value by reference instead of by value makes this point moot. 2. When we receive a G or P packet, both are for writing a register, the packet will have the register value in target-byte order and in the memory representation (modulo the fact that bytes are sent as 2 printable hexadecimal numbers of course). We only need to decode the packet to have a pointer to the register value.
This change fixes the bug of extracting the register value of the P packet as a hexadecimal number instead of as a bit array. The quick (and dirty) fix to bswap the register value in gdb_cpu_setreg() as it has been added on i386 and amd64 can therefore be removed and has in fact been that.
Tested on: alpha, amd64, i386, ia64, sparc64
|
138145 |
28-Nov-2004 |
marcel |
Whitespace fixes: o Remove a bogus comment that relates to alpha. o s/u_int64_t/uint64_t/g o Add bi_spare2 to make the internal padding explicit. o Move BOOTINFO_MAGIC after the field it applies to.
|
138129 |
27-Nov-2004 |
das |
Don't include sys/user.h merely for its side-effect of recursively including other headers.
|
137978 |
21-Nov-2004 |
marcel |
Remove struct ia64_itir and use a plain old uint64_t instead.
|
137914 |
20-Nov-2004 |
das |
Remove UAREA_PAGES.
Reviewed by: arch@
|
137912 |
20-Nov-2004 |
das |
U areas are going away, so don't allocate one for process 0.
Reviewed by: arch@
|
137906 |
20-Nov-2004 |
das |
user.h is included only to get pcb.h, so use the latter directly instead.
|
137708 |
14-Nov-2004 |
marcel |
Remove the BR tag. When the machine doesn't have the DIG64 HCDP table with console settings, we now only need to know at which address the UART lives. Leaving the baudrate unspecified results in us using the baudrate at which the UART operates. This removes one parameter that can interfere with a successful installation out of the box.
|
137117 |
01-Nov-2004 |
jhb |
- Change the ddb paging "support" to use a variable (db_lines_per_page) to control the number of lines per page rather than a constant. The variable can be examined and changed in ddb as '$lines'. Setting the variable to 0 will effectively turn off paging. - Change db_putchar() to force out pending whitespace before outputting newlines and carriage returns so that one can rub out content on the current line via '\r \r' type strings. - Change the simple pager to rub out the --More-- prompt explicitly when the routine exits. - Add some aliases to the simple pager to make it more compatible with more(1): 'e' and 'j' do a single line. 'd' does half a page, and 'f' does a full page.
MFC after: 1 month Inspired by: kris
|
136809 |
23-Oct-2004 |
phk |
Use bioq_takefirst()
|
136680 |
18-Oct-2004 |
phk |
Add new function ttyinitmode() which sets our systemwide default modes on a tty structure.
Both the ".init" and the current settings are initialized allowing the function to be used both at attach and open time.
The function takes an argument to decide if echoing should be enabled. Echoing should not be enabled for regular physical serial ports unless they are consoles, in which case they should be configured by ttyconsolemode() instead.
Use the new function throughout.
|
136521 |
14-Oct-2004 |
njl |
Print flags in the nexus for child devices.
|
136366 |
11-Oct-2004 |
njl |
Move the code for halting the CPU (acpi_cpu_c1) into machdep files. This removes the last MD portion of acpi_cpu.c.
MFC after: 2 weeks
|
136183 |
06-Oct-2004 |
marcel |
Add the Madison II, which is the second generation Madison. The Madison II is model 2 in the Itanium 2 family and has up to 9MB of L3 cache and clocks higher than 1.5Ghz. There's no LV variant AFAICT.
|
136070 |
03-Oct-2004 |
alc |
The physical address stored in the vm_page is page aligned. There is no need to mask off the page offset bits. (This operation made some sense prior to i386/i386/pmap.c revision 1.254 when we passed a physical address rather than a vm_page pointer to pmap_enter().)
|
136050 |
02-Oct-2004 |
alc |
Eliminate unnecessary uses of PHYS_TO_VM_PAGE() from pmap_enter(). These uses predate the change in the pmap_enter() interface that replaced the page's physical address by the address of its vm_page structure. The PHYS_TO_VM_PAGE() was being used to compute the address of the same vm_page structure that was being passed in.
|
135832 |
26-Sep-2004 |
marcel |
...And fix WITNESS builds: declare syscallnames.
|
135798 |
26-Sep-2004 |
marcel |
Fix INVARIANTS build: Include <machine/cpu.h>.
|
135783 |
25-Sep-2004 |
marcel |
Move the IA-32 trap handling from trap() to ia32_trap(). Move the ia32_syscall() function along with it to ia32_trap.c. When COMPAT_IA32 is not defined, we'll raise SIGEMT instead.
|
135590 |
23-Sep-2004 |
marcel |
Redefine a PTE as a 64-bit integral type instead of a struct of bit-fields. Unify the PTE defines accordingly and update all uses.
|
135589 |
22-Sep-2004 |
marcel |
s/u_int#_t/uint#_t/g
|
135583 |
22-Sep-2004 |
marcel |
For the atomic_{add|clear|set|subtract} family of inlines, return the old or previous value instead of void. This is not as is documented in atomic(9), but is API (and ABI) compatible and simply makes sense. This feature will primarily be used for atomic PTE updates in PMAP/ng.
|
135581 |
22-Sep-2004 |
marcel |
MFp4: various style fixes, including o s/u_int/uint/g o s/#define<sp>/#define<tab>/g o indent macro definitions o Improve vertical spacing o Globally align line continuation character
|
135529 |
20-Sep-2004 |
jhb |
- Add support for "paging" in stack trace output. That is, when you do a stack trace from ddb, the output will pause with a '--More--' prompt every 18 lines. If you hit Enter, it will print another line and prompt again. If you hit space it will output another page and then prompt. If you hit 'q' or 'x' it will abort the rest of the stack trace. - Fix the sparc64 userland stack trace to honor the total count of lines to print. This is useful if your trace happens to walk back onto 0xdeadc0de and gets stuck in an endless loop.
MFC after: 1 month Tested on: i386, alpha, sparc64
|
135453 |
19-Sep-2004 |
marcel |
MFp4: Completely remove the remaining EFI includes and add our own (type) definitions instead. While here, abstract more of the internals by providing interface functions.
|
135443 |
18-Sep-2004 |
alc |
Release the page queues lock earlier in pmap_protect() and pmap_remove() in order to reduce contention.
|
135405 |
17-Sep-2004 |
marcel |
Provide our own FPSWA definitions, instead of depending on the Intel EFI headers and put them all in <machine/fpu.h>. The Intel EFI headers conflict with the Intel ACPI headers (duplicate type definitions), so are being phased out in the kernel.
|
135403 |
17-Sep-2004 |
marcel |
Remove useless inclusion of <machine/fpu.h>
|
135262 |
15-Sep-2004 |
phk |
Add new a function isa_dma_init() which returns an errno when it fails and which takes a M_WAITOK/M_NOWAIT flag argument.
Add compatibility isa_dmainit() macro which whines loudly if isa_dma_init() fails.
Problem uncovered by: tegge
|
135096 |
12-Sep-2004 |
marcel |
Catch up with other platforms: switch the default scheduler to 4BSD.
|
134934 |
08-Sep-2004 |
scottl |
Fix a problem with tag->boundary inheritence that has existed since day one and was propagated to nearly every platform. The boundary of the child needs to consider the boundary of the parent and pick the minimum of the two, not the maximum. However, if either is 0 then pick the appropriate one. This bug was exposed by a recent change to ATA, which should now be fixed by this change. The alignment and maxsegsz tag attributes likely also need a similar review in the near future.
This is a MT5 candidate.
Reviewed by: marcel Submitted by: sos (in part)
|
134928 |
08-Sep-2004 |
marcel |
Sync the busdma code with i386. The most tangible upshot is that the alignment and boundary constraints are being respected, which fixes the reported ATA problems with SiI chips. I consider the busdma implementation worrisome nonetheless. Not only is there too much MI code duplicated in MD files, there's a lot of questionable code. I smell a wholesale, cross-platform overhaul coming...
MT5 candidate.
|
134791 |
05-Sep-2004 |
julian |
Refactor a bunch of scheduler code to give basically the same behaviour but with slightly cleaned up interfaces.
The KSE structure has become the same as the "per thread scheduler private data" structure. In order to not make the diffs too great one is #defined as the other at this time.
The KSE (or td_sched) structure is now allocated per thread and has no allocation code of its own.
Concurrency for a KSEGRP is now kept track of via a simple pair of counters rather than using KSE structures as tokens.
Since the KSE structure is different in each scheduler, kern_switch.c is now included at the end of each scheduler. Nothing outside the scheduler knows the contents of the KSE (aka td_sched) structure.
The fields in the ksegrp structure that are to do with the scheduler's queueing mechanisms are now moved to the kg_sched structure. (per ksegrp scheduler private data structure). In other words how the scheduler queues and keeps track of threads is no-one's business except the scheduler's. This should allow people to write experimental schedulers with completely different internal structuring.
A scheduler call sched_set_concurrency(kg, N) has been added that notifies teh scheduler that no more than N threads from that ksegrp should be allowed to be on concurrently scheduled. This is also used to enforce 'fainess' at this time so that a ksegrp with 10000 threads can not swamp a the run queue and force out a process with 1 thread, since the current code will not set the concurrency above NCPU, and both schedulers will not allow more than that many onto the system run queue at a time. Each scheduler should eventualy develop their own methods to do this now that they are effectively separated.
Rejig libthr's kernel interface to follow the same code paths as linkse for scope system threads. This has slightly hurt libthr's performance but I will work to recover as much of it as I can.
Thread exit code has been cleaned up greatly. exit and exec code now transitions a process back to 'standard non-threaded mode' before taking the next step. Reviewed by: scottl, peter MFC after: 1 week
|
134648 |
02-Sep-2004 |
marcel |
Add aac(4) and aacp(4). The driver is 64-bit clean for roughly a year now and has been mentioned on the freebsd-ia64 list.
|
134571 |
31-Aug-2004 |
julian |
Remove an unneeded argument.. The removed argument could trivially be derived from the remaining one. That in turn should be the same as curthread, but it is possible that curthread could be expensive to derive on some syste,s so leave it as an argument. Having both proc and thread as an argumen tjust gives an opportunity for them to get out sync.
MFC after: 3 days
|
134568 |
31-Aug-2004 |
julian |
Remove sched_free_thread() which was only used in diagnostics. It has outlived its usefulness and has started causing panics for people who turn on DIAGNOSTIC, in what is otherwise good code.
MFC after: 2 days
|
134509 |
30-Aug-2004 |
alc |
Remove unnecessary check for curthread == NULL.
|
134502 |
30-Aug-2004 |
marcel |
s/ENTRY/ENTRY_NOPROFILE/g for particular functions that do not follow the C calling convention or are otherwise not regular functions. This allows us to boot a profiling kernel.
|
134410 |
27-Aug-2004 |
marcel |
Catch up with the drive-by renaming of IA32 to COMPAT_IA32. Missed 11 days ago when all the other places were fixed and finally caught by the tinderbox run...
|
134398 |
27-Aug-2004 |
marcel |
Move the kernel-specific logic to adjust frompc from MI to MD. For these two reasons: 1. On ia64 a function pointer does not hold the address of the first instruction of a functions implementation. It holds the address of a function descriptor. Hence the user(), btrap(), eintr() and bintr() prototypes are wrong for getting the actual code address. 2. The logic forces interrupt, trap and exception entry points to be layed-out contiguously. This can not be achieved on ia64 and is generally just bad programming.
The MCOUNT_FROMPC_USER macro is used to set the frompc argument to some kernel address which represents any frompc that falls outside the kernel text range. The macro can expand to ~0U to bail out in that case. The MCOUNT_FROMPC_INTR macro is used to set the frompc argument to some kernel address to represent a call to a trap or interrupt handler. This to avoid that the trap or interrupt handler appear to be called from everywhere in the call graph. The macro can expand to ~0U to prevent adjusting frompc. Note that the argument is selfpc, not frompc.
This commit defines the macros on all architectures equivalently to the original code in sys/libkern/mcount.c. People can take it from here...
Compile-tested on: alpha, amd64, i386, ia64 and sparc64 Boot-tested on: i386
|
134393 |
27-Aug-2004 |
alc |
The machine-independent parts of the virtual memory system always pass a valid pmap to the pmap functions that require one. Remove the checks for NULL. (These checks have their origins in the Mach pmap.c that was integrated into BSD. None of the new code written specifically for FreeBSD included them.)
|
134383 |
27-Aug-2004 |
andre |
Always compile PFIL_HOOKS into the kernel and remove the associated kernel compile option. All FreeBSD packet filters now use the PFIL_HOOKS API and thus it becomes a standard part of the network stack.
If no hooks are connected the entire packet filter hooks section and related activities are jumped over. This removes any performance impact if no hooks are active.
Both OpenBSD and DragonFlyBSD have integrated PFIL_HOOKS permanently as well.
|
134289 |
25-Aug-2004 |
marcel |
Get a step closer to profiling the kernel by fixing the definitions of the MCOUNT_ENTER, MCOUNT_EXIT and MCOUNT_DECL defines. Also make sure there's a prototype of _MCOUNT_DECL(). This allows us to build a kernel. There are still unresolved symbols, so linking fails.
|
134287 |
25-Aug-2004 |
marcel |
Make profiling actually work. The gcc compiler emits a call to the _mcount() stub when profiling is enabled. Emit this code sequence for assembly routines as welli (MCOUNT definition in <machine/asm.h>. We do not pass the GOT entry however as the 4th argument, because it's not used. The _mcount() stub calls __mcount(), which does the actual work. Define _MCOUNT_DECL to define __mcount. We do not have an implementation of mcount(), so we define MCOUNT as empty, but have a weak alias to _mcount() in _mcount.S. Note that the _mcount() stub in the kernel is slightly different from the stub in userland. This is because we do not have to worry about nested routines in the kernel.
|
134263 |
24-Aug-2004 |
njl |
Catch up with i386 nexus.c rev 1.59: add bus_get_resource_list().
|
134261 |
24-Aug-2004 |
obrien |
sr(4) definately won't work on IA64.
|
133888 |
16-Aug-2004 |
arun |
The existing code fails some corner cases. Replace it with ia64_bsp_adjust() which has been tested to work in all cases for arbitrary (bsp, nslots) combinations.
reviewed by: marcel@
|
133879 |
16-Aug-2004 |
marcel |
As I said: the previous commit was untested... Remove an #endif which should have ceased to exist when its corresponding #if was removed.
|
133878 |
16-Aug-2004 |
marcel |
Catch up with the drive-by renaming of IA32 to COMPAT_IA32. It must have been rush hour...
While here, move COMPAT_IA32 from opt_global.h to opt_compat.h like on amd64. Consequently, it's unsafe to use the option in pcb.h. We now unconditionally have the ia32 specific registers in the PCB.
This commit is untested.
|
133875 |
16-Aug-2004 |
arun |
ITC.{i,d} instructions use format M41 not M42.
reviewed by: marcel@
|
133711 |
14-Aug-2004 |
marcel |
Allocate memory in the unwinder with M_NOWAIT. We may need to provide backtraces with locks held.
|
133472 |
11-Aug-2004 |
marcel |
In set_regs(), flush the dirty registers onto the backingstore before we update the registers. That way we don't have any dirty registers to worry about and also know that bsp=bspstore, which makes updating the RSE related registers predictable. This is not the end of it. We need more validity checks, but for now this allows us to complete the gdb testsuite without crashing the kernel.
|
133464 |
11-Aug-2004 |
marcel |
Add __elfN(dump_thread). This function is called from __elfN(coredump) to allow dumping per-thread machine specific notes. On ia64 we use this function to flush the dirty registers onto the backingstore before we write out the PRSTATUS notes.
Tested on: alpha, amd64, i386, ia64 & sparc64 Not tested on: arm, powerpc
|
133405 |
09-Aug-2004 |
marcel |
Better preserve the original protection for the mappings we maintain. The hardware always gives read access for privilege level 0, which means that we cannot use the hardware access rights and privilege level in the PTE to test whether there's a change in protection. So, we save the original vm_prot_t in the PTE as well. Add pmap_pte_prot() to set the proper access rights and privilege level on the PTE given a pmap and the requested protection.
The above allows us to compare the protection in pmap_extract_and_hold() which was missing. While in pmap_extract_and_hold(), add pmap locking.
While here, clean up most (i.e. all but one) PTE macros we inherited from alpha. They were either unused, used inconsistently, badly named or simply weren't beneficial. We save the wired and managed state of the PTE in distinct (bit) fields.
While in pte.h, s/u_int64_t/uint64_t/g
pmap locking obtained from: alc@ feedback & review by: alc@
|
133291 |
08-Aug-2004 |
marcel |
Implement single stepping when we leave the kernel through the EPC syscall path. The basic problem is that we cannot set the single stepping flag directly, because we don't leave the kernel via an interrupt return. So, we need another way to set the single stepping flag. The way we do this is by enabling the lower-privilege transfer trap, which gets raised when we drop the privilege level. However, since we're still running in kernel space (sec), we're not yet done. We clear the lower- privilege transfer trap, enable the taken-branch trap and continue exiting the kernel until we branch into user space. Given the current code, there's a total of two traps this way before we can raise SIGTRAP.
|
133286 |
07-Aug-2004 |
marcel |
Slightly move labels around to make sure we call ast() on our way out after a fork(2) in fork_trampoline(). By moving the epc_syscall_return label immediately before the call to do_ast() in epc_syscall(), we not only achieve that but also handle the detour through exception_return when the frame corresponds to an asynchronous kernel entry. Hence, we simplified fork_trampoline() as a side-effect.
|
133285 |
07-Aug-2004 |
marcel |
De-inline gdb_cpu_signal() because we need to convert the trap vectors related to breakpoints and single stepping into SIGTRAP so gdb(1) knows why the remote target has stopped. In particular, gdb(1) needs to know if the reason is something of its own doing.
|
133135 |
04-Aug-2004 |
arun |
Use a 256MB TR instead of a 64MB TR to make sure that the kernel text/data are covered on APs. This enables the kernel to boot on a 4 way Intel Itanium-2 platform. This has a secondary effect of keeping the TRs identical on BP and the APs.
reviewed by: marcel@
|
133087 |
03-Aug-2004 |
markm |
Making a loadable null.ko for /dev/(null|zero) proved rather unpopular, so remove this (mis)feature.
Encouragement provided by: jhb (and others)
|
133084 |
03-Aug-2004 |
mux |
Instead of calling ia32_pause() conditionally on __i386__ or __amd64__ being defined, define and use a new MD macro, cpu_spinwait(). It only expands to something on i386 and amd64, so the compiled code should be identical.
Name of the macro found by: jhb Reviewed by: jhb
|
133023 |
02-Aug-2004 |
marcel |
Fix 2 typos in previous commit: both s/strct/struct/
|
133020 |
02-Aug-2004 |
marcel |
Add the mem and null devices now that they are optional.
|
133019 |
02-Aug-2004 |
marcel |
Sort the miscellaneous devices to restore ordering after the insertion of the mem and null devices.
|
132965 |
01-Aug-2004 |
markm |
Remove extraneous ';'.
|
132956 |
01-Aug-2004 |
markm |
Break out the MI part of the /dev/[k]mem and /dev/io drivers into their own directory and module, leaving the MD parts in the MD area (the MD parts _are_ part of the modules). /dev/mem and /dev/io are now loadable modules, thus taking us one step further towards a kernel created entirely out of modules. Of course, there is nothing preventing the kernel from having these statically compiled.
|
132897 |
30-Jul-2004 |
alc |
- Add pmap locking to ia64's pmap_enter() and pmap_enter_quick(). (This brings ia64 to parity with alpha, amd64, and i386 in this area.) - Prevent a race in pmap_find_pte(): If pmap_find_pte() sleeps in uma_zalloc(), another thread could allocate a pte at the same address. Instead, sleep at a higher level and retry the lookup before retrying the allocation.
Reviewed and tested by: marcel@
|
132871 |
30-Jul-2004 |
marcel |
Fix -O builds with gcc 3.4 by defining ffs as __builtin_ffs instead of creating an inline function that just calls __builtin_ffs.
|
132808 |
28-Jul-2004 |
phk |
Move a relic to its correct location(s): Put nfs diskless initialization calls with the code they call. (Yet another example of mindless copy&paste).
|
132700 |
27-Jul-2004 |
rwatson |
Pass a thread argument into cpu_critical_{enter,exit}() rather than dereference curthread. It is called only from critical_{enter,exit}(), which already dereferences curthread. This doesn't seem to affect SMP performance in my benchmarks, but improves MySQL transaction throughput by about 1% on UP on my Xeon.
Head nodding: jhb, bmilekic
|
132626 |
25-Jul-2004 |
marcel |
Work-around a gcc code generation bug for function descriptors references (target/16559). This fixes SMP configurations.
Obtained from: arun@
|
132522 |
22-Jul-2004 |
alc |
In pmap_mincore() create a private copy of the pte for use after the pmap lock is released.
|
132487 |
21-Jul-2004 |
alc |
Additional pmap locking
Tested by: marcel@
|
132482 |
21-Jul-2004 |
marcel |
Unify db_stack_trace_cmd(). All it did was look up the thread given the thread ID and call db_trace_thread(). Since arm has all the logic in db_stack_trace_cmd(), rename the new DB_COMMAND function to db_stack_trace to avoid conflicts on arm. While here, have db_stack_trace parse its own arguments so that we can use a more natural radix for IDs. If the ID is not a thread ID, or more precisely when no thread exists with the ID, try if there's a process with that ID and return the first thread in it. This makes it easier to print stack traces from the ps output.
requested by: rwatson@ tested on: amd64, i386, ia64
|
132383 |
19-Jul-2004 |
das |
Make FLT_ROUNDS correctly reflect the dynamic rounding mode.
|
132378 |
19-Jul-2004 |
alc |
Add partial pmap locking.
Tested by: marcel@
|
132235 |
16-Jul-2004 |
alc |
Remove unused fields from the pmap.
|
132226 |
15-Jul-2004 |
phk |
Preparation commit for the tty cleanups that will follow in the near future:
rename ttyopen() -> tty_open() and ttyclose() -> tty_close().
We need the ttyopen() and ttyclose() for the new generic cdevsw functions for tty devices in order to have consistent naming.
|
132220 |
15-Jul-2004 |
alc |
Push down the acquisition and release of the page queues lock into pmap_protect() and pmap_remove(). In general, they require the lock in order to modify a page's pv list or flags. In some cases, however, pmap_protect() can avoid acquiring the lock.
|
132170 |
15-Jul-2004 |
alc |
A loop in pmap_remove() should use TAILQ_FOREACH_SAFE(), not TAILQ_FOREACH(), because the loop deletes elements from the list.
Reviewed by: marcel@
|
132088 |
13-Jul-2004 |
davidxu |
Add ptrace_clear_single_step(), alpha already has it for years, the function will be used by ptrace to clear a thread's single step state.
|
132085 |
13-Jul-2004 |
alc |
Simplify pmap_protect().
|
132082 |
13-Jul-2004 |
alc |
Push down the acquisition and release of the page queues lock into pmap_remove_pages(). (The implementation of pmap_remove_pages() is optional. If pmap_remove_pages() is unimplemented, the acquisition and release of the page queues lock is unnecessary.)
Remove spl calls from the alpha, arm, and ia64 pmap_remove_pages().
|
131969 |
11-Jul-2004 |
marcel |
Add options KDB and GDB. KDB takes on the function of what DDB used to be. Both DDB and GDB specify which KDB backends to include.
|
131960 |
11-Jul-2004 |
marcel |
Remove the now unused GDB stubs. See src/sys/gdb/* for the new KDB backend.
|
131952 |
10-Jul-2004 |
marcel |
Mega update for the KDB framework: turn DDB into a KDB backend. Most of the changes are a direct result of adding thread awareness. Typically, DDB_REGS is gone. All registers are taken from the trapframe and backtraces use the PCB based contexts. DDB_REGS was defined to be a trapframe on all platforms anyway. Thread awareness introduces the following new commands: thread X switch to thread X (where X is the TID), show threads list all threads.
The backtrace code has been made more flexible so that one can create backtraces for any thread by giving the thread ID as an argument to trace.
With this change, ia64 has support for breakpoints.
|
131945 |
10-Jul-2004 |
marcel |
Update for the KDB framework: o ksym_start and ksym_end changed type to vm_offset_t. o Make debugging support conditional upon KDB instead of DDB. o Call kdb_enter() instead of breakpoint(). o Remove implementation of Debugger(). o Call kdb_trap() according to the new world order.
unwinder: o s/db_active/kdb_active/g o Various s/ddb/kdb/g o Add support for unwinding from the PCB as well as the trapframe. Abuse a spare field in the special register set to flag whether the PCB was actually constructed from a trapframe so that we can make the necessary adjustments.
md_var.h: o Add RSE convenience macros. o Add ia64_bsp_adjust() to add or subtract from BSP while taking NaT collections into account.
|
131905 |
10-Jul-2004 |
marcel |
Implement makectx(). The makectx() function is used by KDB to create a PCB from a trapframe for purposes of unwinding the stack. The PCB is used as the thread context and all but the thread that entered the debugger has a valid PCB. This function can also be used to create a context for the threads running on the CPUs that have been stopped when the debugger got entered. This however is not done at the time of this commit.
|
131903 |
10-Jul-2004 |
marcel |
Introduce the KDB debugger frontend. The frontend provides a framework in which multiple (presumably different) debugger backends can be configured and which provides basic services to those backends. Besides providing services to backends, it also serves as the single point of contact for any and all code that wants to make use of the debugger functions, such as entering the debugger or handling of the alternate break sequence. For this purpose, the frontend has been made non-optional. All debugger requests are forwarded or handed over to the current backend, if applicable. Selection of the current backend is done by the debug.kdb.current sysctl. A list of configured backends can be obtained with the debug.kdb.available sysctl. One can enter the debugger by writing to the debug.kdb.enter sysctl.
|
131899 |
10-Jul-2004 |
marcel |
Introduce the GDB debugger backend for the new KDB framework. The backend improves over the old GDB support in the following ways: o Unified implementation with minimal MD code. o A simple interface for devices to register themselves as debug ports, ala consoles. o Compression by using run-length encoding. o Implements GDB threading support.
|
131840 |
08-Jul-2004 |
brian |
Change the following environment variables to kernel options:
bootp -> BOOTP bootp.nfsroot -> BOOTP_NFSROOT bootp.nfsv3 -> BOOTP_NFSV3 bootp.compat -> BOOTP_COMPAT bootp.wired_to -> BOOTP_WIRED_TO
- i.e. back out the previous commit. It's already possible to pxeboot(8) with a GENERIC kernel.
Pointed out by: dwmalone
|
131838 |
08-Jul-2004 |
marcel |
MFamd64 (1.275): Reduce the scope of the Giant lock being held for non-mpsafe syscalls. There was way too much code being covered.
|
131821 |
08-Jul-2004 |
marcel |
Better handle the break instruction trap. The runtime specification has outlined which break numbers are software interrupts, debugger breakpoints and ABI specific breaks. We mostly treated all break numbers we didn't care about as debugger breakpoints.
|
131814 |
08-Jul-2004 |
brian |
Change the following kernel options to environment variables:
BOOTP -> bootp BOOTP_NFSROOT -> bootp.nfsroot BOOTP_NFSV3 -> bootp.nfsv3 BOOTP_COMPAT -> bootp.compat BOOTP_WIRED_TO -> bootp.wired_to
This lets you PXE boot with a GENERIC kernel by putting this sort of thing in loader.conf:
bootp="YES" bootp.nfsroot="YES" bootp.nfsv3="YES" bootp.wired_to="bge1"
or even setting the variables manually from the OK prompt.
|
131662 |
05-Jul-2004 |
alc |
- Correct pmap_extract()'s return type. It should be vm_paddr_t, not vm_offset_t. - Convert pmap_extract() to the ANSI style of declaration.
|
131481 |
02-Jul-2004 |
jhb |
Implement preemption of kernel threads natively in the scheduler rather than as one-off hacks in various other parts of the kernel: - Add a function maybe_preempt() that is called from sched_add() to determine if a thread about to be added to a run queue should be preempted to directly. If it is not safe to preempt or if the new thread does not have a high enough priority, then the function returns false and sched_add() adds the thread to the run queue. If the thread should be preempted to but the current thread is in a nested critical section, then the flag TDF_OWEPREEMPT is set and the thread is added to the run queue. Otherwise, mi_switch() is called immediately and the thread is never added to the run queue since it is switch to directly. When exiting an outermost critical section, if TDF_OWEPREEMPT is set, then clear it and call mi_switch() to perform the deferred preemption. - Remove explicit preemption from ithread_schedule() as calling setrunqueue() now does all the correct work. This also removes the do_switch argument from ithread_schedule(). - Do not use the manual preemption code in mtx_unlock if the architecture supports native preemption. - Don't call mi_switch() in a loop during shutdown to give ithreads a chance to run if the architecture supports native preemption since the ithreads will just preempt DELAY(). - Don't call mi_switch() from the page zeroing idle thread for architectures that support native preemption as it is unnecessary. - Native preemption is enabled on the same archs that supported ithread preemption, namely alpha, i386, and amd64.
This change should largely be a NOP for the default case as committed except that we will do fewer context switches in a few cases and will avoid the run queues completely when preempting.
Approved by: scottl (with his re@ hat)
|
131381 |
30-Jun-2004 |
marcel |
Unbreak build: define __RMAN_RESOURCE_VISIBLE See also src/sys/sys/rman.h rev. 1.21.
|
131312 |
30-Jun-2004 |
njl |
Add machdep quirks functions. On i386, this disables acpi on systems with BIOS dates earlier than Jan 1, 1999. Add prototypes and quirks flags.
|
130965 |
23-Jun-2004 |
alc |
- Remove unused definitions. - Move a definition inside the scope of a #ifdef _KERNEL.
|
130764 |
20-Jun-2004 |
bde |
Backed out previous commit. Blind substitution of dev_t by `struct cdev *' was just wrong here because the dev_t's are user dev_t's.
|
130742 |
19-Jun-2004 |
alc |
Remove dead code related to pv entry allocation.
Reviewed by: marcel@
|
130585 |
16-Jun-2004 |
phk |
Do the dreaded s/dev_t/struct cdev */ Bump __FreeBSD_version accordingly.
|
130362 |
11-Jun-2004 |
alc |
Neither pmap_enter() nor pmap_enter_quick() should create pv entries for unmanaged pages.
Tested by: marcel@
|
130344 |
11-Jun-2004 |
phk |
Deorbit COMPAT_SUNOS.
We inherited this from the sparc32 port of BSD4.4-Lite1. We have neither a sparc32 port nor a SunOS4.x compatibility desire these days.
|
130338 |
11-Jun-2004 |
alc |
Reduce the number of preallocated pv entries and lpte entries in pmap_init().
Tested by: marcel@
|
130077 |
04-Jun-2004 |
phk |
Machine generated patch which changes linedisc calls from accessing linesw[] directly to using the ttyld...() functions
The ttyld...() functions ar inline so there is no performance hit.
|
130028 |
03-Jun-2004 |
tjr |
Remove checks for curthread == NULL - it can't happen.
|
130025 |
03-Jun-2004 |
phk |
Add missing <sys/module.h> instances which were shadowed by the nested include in <sys/kernel.h>
|
130023 |
03-Jun-2004 |
tjr |
Move TDF_DEADLKTREAT into td_pflags (and rename it accordingly) to avoid having to acquire sched_lock when manipulating it in lockmgr(), uiomove(), and uiomove_fromphys().
Reviewed by: jhb
|
129944 |
01-Jun-2004 |
phk |
Gainfully employ the new ttyioctl in the trivial cases.
|
129750 |
26-May-2004 |
tmm |
Retire cpu_sched_exit(); it is not used any more.
|
129444 |
19-May-2004 |
bde |
Moved most of the "MI" definitions and declarations from <machine/profile.h> to <sys/gmon.h>. Cleaned them up a little by not attempting to ifdef for incomplete and out of date support for GUPROF in userland, as in the sparc64 version.
|
129393 |
18-May-2004 |
stefanf |
<stdint.h> should define WINT_M{AX,IN} independent from whether WCHAR_MIN is defined. Otherwise first including <wchar.h> and then <stdint.h> leads to no WINT_M{AX,IN} at all.
PR: 64956 Approved by: das (mentor)
|
129351 |
17-May-2004 |
marcel |
Fix typo in comment. While here, end the sentence with a period and remove the empty line between the fdc and sio devices. The empty line suggests that the comment applies to fdc only while it applies to all following devices and options.
Typo spotted by: ru@
|
129323 |
17-May-2004 |
marcel |
Unbreak build due to previous commit: now that elf_reloc_internal() gets the relocation base passed in relocbase, we cannot declare a local variable with the same name. Assume the argument holds the same value as the local variable did...
|
129321 |
17-May-2004 |
marcel |
filter out the fdc(4) and sio(4) devices and corresponding options. Note that cy(4) uses COM_MULTIPORT, so we need to keep that option.
|
129282 |
16-May-2004 |
peter |
Make a small revision to the api between the elf linker core and the elf_reloc() backends for two reasons. First, to support the possibility of there being two elf linkers in the kernel (eg: amd64), and second, to pass the relocbase explicitly (for relocating .o format kld files).
|
129023 |
07-May-2004 |
marcel |
Revert previous commit. We should not get any FP traps from within the kernel. We can guarantee this by resetting the FP status register. This masks all FP traps. The reason we did get FP traps was that we didn't reset the FP status register in all cases.
Make sure to reset the FP status register in syscall(). This is one of the places where it was forgotten.
While on the subject, reset the FP status register only when we trapped from user space.
|
129022 |
07-May-2004 |
marcel |
Make sure to sanitize the FP status register. Specifically this masks all FP traps, which should not happen in the kernel.
|
128990 |
06-May-2004 |
njl |
Make unnecessary globals static and remove unused includes.
Pointed out by: cscout
|
128979 |
05-May-2004 |
njl |
Add an MI implementation of the ACPI global lock routines and retire the individual asm versions. The global lock is shared between the BIOS and OS and thus cannot use our mutexes. It is defined in section 5.2.9.1 of the ACPI specification.
Reviewed by: marcel, bde, jhb
|
128857 |
03-May-2004 |
marcel |
Floating-point faults and exceptions can happen in the kernel too. Do not panic when it happens; handle them.
Run into by: das
|
128850 |
03-May-2004 |
marcel |
Catch- and cleanup: o Fix and improve comments and references, o Add PFIL_HOOKS, UFS_ACL and UFS_DIRHASH, o Switch from SCHED_4BSD to SCHED_ULE, o Remove SCSI_DELAY (there's no SCSI support),
|
128838 |
02-May-2004 |
obrien |
Spell Ethernet correctly.
|
128790 |
01-May-2004 |
marcel |
Verify the MADT checksum before using the table.
Submitted by: njl
|
128629 |
25-Apr-2004 |
das |
Hide FLT_EVAL_METHOD and DECIMAL_DIG in pre-C99 compilation environments.
PR: 63935 Submitted by: Stefan Farfeleder <stefan@fafoe.narf.at>
|
128508 |
21-Apr-2004 |
njl |
Don't check for NULL, device_get_softc() always succeeds.
|
128393 |
18-Apr-2004 |
alc |
MFamd64 Simplify the sf_buf implementation. In short, make it a veneer over the direct virtual-to-physical mapping.
|
128105 |
11-Apr-2004 |
alc |
Remove a comment that refers to avail_start and avail_end as these variables no longer exist.
|
128097 |
10-Apr-2004 |
alc |
- pmap_kenter_temporary() is unused by machine-independent code. Therefore, move its declaration to the machine-dependent header file on those machines that use it. In principle, only i386 should have it. Alpha and AMD64 should use their direct virtual-to-physical mapping. - Remove pmap_kenter_temporary() from ia64. It is unused. Approved by: marcel@
|
128019 |
07-Apr-2004 |
imp |
Remove advertising clause from University of California Regent's license, per letter dated July 22, 1999 and email from Peter Wemm, Alan Cox and Robert Watson.
Approved by: core, peter, alc, rwatson
|
127922 |
06-Apr-2004 |
alc |
Remove avail_end. As of yesterday, it is unused.
|
127875 |
05-Apr-2004 |
alc |
Remove avail_start on those platforms that no longer use it. (Only amd64 does anything with it beyond simple initialization.)
|
127869 |
05-Apr-2004 |
alc |
Remove unused arguments from pmap_init().
|
127788 |
03-Apr-2004 |
alc |
In some cases, sf_buf_alloc() should sleep with pri PCATCH; in others, it should not. Add a new parameter so that the caller can specify which is the case.
Reported by: dillon
|
127494 |
27-Mar-2004 |
marcel |
MFi386: correctly calculate the top-of-stack when a kthread is created with a larger kernel stack. Remove inclusion of opt_kstack_pages.h now that it's unused.
|
127253 |
21-Mar-2004 |
marcel |
In breakpoint(), use a different immediate to make sure we can distinguish between debugger inserted breakpoints and fixed breakpoints. While here, make sure the break instruction never ends up in the last slot of a bundle by forcing it to be an M-unit instruction. This makes it easier for use to skip over it.
|
127241 |
20-Mar-2004 |
alc |
- Add uiomove_fromphys() implementations to alpha and ia64. These only differ trivially from amd64. - Correct a spelling error in a comment.
|
127239 |
20-Mar-2004 |
marcel |
Introduce the cpumask_t type. The purpose of the type is to create a level of abstraction for any and all CPU mask and CPU bitmap variables so that platforms have the ability to break free from the hard limit of 32 CPUs, simply because we don't have more bits in an u_int. Note that the type is not supposed to solve massive parallelism, where the number of CPUs can be larger than the width of the widest integral type. As such, cpumask_t is not supposed to be a compound type. If such would be necessary in the future, we can deal with the issues then and there. For now, it can be assumed that the type is integral and unsigned.
With this commit, all MD definitions start off as u_int. This allows us to phase-in cpumask_t at our leasure without breaking anything. Once cpumask_t is used consistently, platforms can switch to wider (or smaller) types if such would be beneficial (or not; whatever :-)
Compile-tested on: i386
|
127219 |
20-Mar-2004 |
marcel |
Replace uint64_t with unsigned long in struct dbreg.
|
127217 |
20-Mar-2004 |
marcel |
Remove the last traditional hints. These hints only served the purpose for uart(4) to figure out which device to use as console. Use this file to define hw.uart.console instead so that we don't have to put it in the default loader.conf, which makes it hard to override.
|
127146 |
17-Mar-2004 |
jmg |
sync comment with i386's isa.c.. This removes a comment that is YEARS old...
|
127086 |
16-Mar-2004 |
alc |
Refactor the existing machine-dependent sf_buf_free() into a machine- dependent function by the same name and a machine-independent function, sf_buf_mext(). Aside from the virtue of making more of the code machine- independent, this change also makes the interface more logical. Before, sf_buf_free() did more than simply undo an sf_buf_alloc(); it also unwired and if necessary freed the page. That is now the purpose of sf_buf_mext(). Thus, sf_buf_alloc() and sf_buf_free() can now be used as a general-purpose emphemeral map cache.
|
126919 |
13-Mar-2004 |
scottl |
Now that contigfree() does not require Giant, don't grab it in busdma.
|
126825 |
10-Mar-2004 |
marcel |
Identify the Deerfield processor. Deerfield is a low-voltage variant based on the Madison core and targeting the low end of the spectrum. Its clock frequency is 1Ghz, whereas Madison starts at 1.3Ghz. Since the CPUID information is the same for Madison and Deerfield, we use the clock frequency to identify the processor. Supposedly the Deerfield only uses 62W, which seems to be less than modern Xeon processors (about 70W) and about half what a Madison would need.
|
126728 |
07-Mar-2004 |
alc |
Retire pmap_pinit2(). Alpha was the last platform that used it. However, ever since alpha/alpha/pmap.c revision 1.81 introduced the list allpmaps, there has been no reason for having this function on Alpha. Briefly, when pmap_growkernel() relied upon the list of all processes to find and update the various pmaps to reflect a growth in the kernel's valid address space, pmap_init2() served to avoid a race between pmap initialization and pmap_growkernel(). Specifically, pmap_pinit2() was responsible for initializing the kernel portions of the pmap and pmap_pinit2() was called after the process structure contained a pointer to the new pmap for use by pmap_growkernel(). Thus, an update to the kernel's address space might be applied to the new pmap unnecessarily, but an update would never be lost.
|
126716 |
07-Mar-2004 |
alc |
Integrate the code from pmap_pinit2() into pmap_pinit(), leaving pmap_pinit2() empty.
Approved by: marcel
|
126715 |
07-Mar-2004 |
alc |
Remove unused declarations. (Some time ago, these variables became fields of vm/vm.h's struct kva_md_info.)
|
126649 |
05-Mar-2004 |
le |
Fix syntax errors and wrong function prototypes in several MD header files when using non-GNUC compilers.
PR: kern/58515 Submitted by: Stefan Farfeleder <stefan@fafoe.narf.at> Approved by: grog (mentor), obrien
|
126106 |
22-Feb-2004 |
marcel |
Do not pre-map the I/O port space. On the Intel Tiger 4 this conflicts with a memory mapped I/O range that's immediately before it and is not 256MB aligned. As a result, when an address is accessed in the memory mapped range and a direct mapping is added for it, it overlaps with the pre-mapped I/O port space and causes a machine check.
Based on a patch from: arun@
|
126080 |
21-Feb-2004 |
phk |
Device megapatch 4/6:
Introduce d_version field in struct cdevsw, this must always be initialized to D_VERSION.
Flip sense of D_NOGIANT flag to D_NEEDGIANT, this involves removing four D_NOGIANT flags and adding 145 D_NEEDGIANT flags.
|
126078 |
21-Feb-2004 |
phk |
Device megapatch 3/6:
Add missing D_TTY flags to various drivers.
Complete asserts that dev_t's passed to ttyread(), ttywrite(), ttypoll() and ttykqwrite() have (d_flags & D_TTY) and a struct tty pointer.
Make ttyread(), ttywrite(), ttypoll() and ttykqwrite() the default cdevsw methods for D_TTY drivers and remove the explicit initializations in various drivers cdevsw structures.
|
126076 |
21-Feb-2004 |
phk |
Device megapatch 1/6:
Free approx 86 major numbers with a mostly automatically generated patch.
A number of strategic drivers have been left behind by caution, and a few because they still (ab)use their major number.
|
125975 |
18-Feb-2004 |
phk |
Change the disk(9) API in order to make device removal more robust.
Previously the "struct disk" were owned by the device driver and this gave us problems when the device disappared and the users of that device were not immediately disappearing.
Now the struct disk is allocate with a new call, disk_alloc() and owned by geom_disk and just abandonned by the device driver when disk_create() is called.
Unfortunately, this results in a ton of "s/\./->/" changes to device drivers.
Since I'm doing the sweep anyway, a couple of other API improvements have been carried out at the same time:
The Giant awareness flag has been flipped from DISKFLAG_NOGIANT to DISKFLAG_NEEDSGIANT
A version number have been added to disk_create() so that we can detect, report and ignore binary drivers with old ABI in the future.
Manual page update to follow shortly.
|
125112 |
27-Jan-2004 |
marcel |
Sort PFIL_HOOKS.
|
124935 |
24-Jan-2004 |
jeff |
- Recruit some new ULE users by making it the default scheduler in GENERIC. ULE will be in a probationary period to determine whether it will be left as the default in 5.3 which would likely mean the rest of the 5.x series.
|
124919 |
24-Jan-2004 |
nectar |
Add PFIL_HOOKS to the GENERIC kernel configuration, primarily so that one can load the IPFilter module (which requires PFIL_HOOKS).
Requested by: Many, for over a year
|
124739 |
20-Jan-2004 |
marcel |
Fix handling of FP traps: o For traps, the cr.iip register points to the next instruction to execute on interrupt return (modulo slot). Since we need to get the bundle of the instruction that caused the FP fault/trap, make sure we fetch the previous bundle if the next instruction is in fact the first in a bundle. o When we call the FPSWA handler, we need to tell it whether it's a trap or a fault (first argument). This was hardcoded to mean a fault.
Also, for FP faults, when a fault is converted to a trap, adjust the cr.iip and cr.ipsr registers to point to the next instruction. This makes sure that the SIGFPE handler gets a consistent state.
|
124737 |
20-Jan-2004 |
marcel |
s/framep/tf/g -- this normalizes on the use of tf to point to the trapframe and improves grep-ability.
|
124478 |
13-Jan-2004 |
des |
Whitespace nit.
|
124296 |
09-Jan-2004 |
nectar |
Provide sysarch(2) prototypes in the MD sysarch.h headers. While I'm at it, use the ANSI C generic pointer type for the second argument, thus matching the documentation.
Remove the now extraneous (and now conflicting) function declarations in various libc sources. Remove now unnecessary casts.
Reviewed by: bde
|
124092 |
03-Jan-2004 |
davidxu |
Make sigaltstack as per-threaded, because per-process sigaltstack state is useless for threaded programs, multiple threads can not share same stack. The alternative signal stack is private for thread, no lock is needed, the orignal P_ALTSTACK is now moved into td_pflags and renamed to TDP_ALTSTACK. For single thread or Linux clone() based threaded program, there is no semantic changed, because those programs only have one kernel thread in every process.
Reviewed by: deischen, dfr
|
123929 |
28-Dec-2003 |
silby |
Track three new sendfile-related statistics: - The number of times sendfile had to do disk I/O - The number of times sfbuf allocation failed - The number of times sfbuf allocation had to wait
|
123920 |
28-Dec-2003 |
silby |
Move the declaration of sfbufspeak and sfbufsused to mbuf.h, and use imax instead of max, as sfbufspeak and sfbufsused are signed.
Submitted by: bde
|
123884 |
27-Dec-2003 |
silby |
Track current and peak sfbuf usage, export the values via sysctl.
|
123819 |
24-Dec-2003 |
marcel |
Don't use NULL with integral types.
|
123799 |
24-Dec-2003 |
peter |
Return AE_OK for stub functions returning ACPI_STATUS, not NULL
|
123791 |
24-Dec-2003 |
peter |
GC the unused <machine/kse.h> file.
|
123742 |
23-Dec-2003 |
peter |
Add an additional field to the elf brandinfo structure to support quicker exec-time replacement of the elf interpreter on an emulation environment where an entire /compat/* tree isn't really warranted.
|
123629 |
18-Dec-2003 |
peter |
Add missing #include "opt_compat.h" so that the compatability function freebsd4_freebsd32_sigreturn() is defined when expected. This should unbreak the tinderbox. Sorry.
|
123528 |
14-Dec-2003 |
marcel |
In set_mcontext(), take into account that kse_switchin(2) will eventually be passed an async. context as well as a syscall context. While here, fix a serious bug in that if the trapframe is a syscall frame, but we're restoring an async context, we need to clear the FRAME_SYSCALL flag so that we leave the kernel via exception_restore.
|
123423 |
11-Dec-2003 |
peter |
Assimilate ia64 back into the fold with the common freebsd32/ia32 code. The split-up code is derived from the ia64 code originally.
Note that I have only compile-tested this, not actually run-tested it. The ia64 side of the force is missing some significant chunks of signal delivery code.
|
123420 |
10-Dec-2003 |
peter |
Fix last second typo.
|
123419 |
10-Dec-2003 |
peter |
Use gcc's superior ffs() builtin.
|
123418 |
10-Dec-2003 |
peter |
Use ffs(x) == popcnt(x ^ (x - 1)) to implement 64 bit ffsl(). gcc's ffs() builtin uses this already but truncates the upper 32 bits.
|
123346 |
09-Dec-2003 |
marcel |
Don't panic for misalignment traps when the onfault handler is set. Not all transfers between kernel and user space are byte oriented and thus alignment safe. Especially fuword*() and suword*() are sensitive to alignment but in general more optimal than block copies. By catching the misalignment trap we avoid pessimizing the common case of properly aligned memory accesses which we would do if we were to use byte copies or adding tests for proper alignment.
Note that the expectation that the kernel produces aligned pointers is unchanged. This change therefore relates to possible unaligned pointers generated in userland.
|
123326 |
09-Dec-2003 |
njl |
Use the ACPI-CA definitions for the various APIC tables instead of our own.
|
123288 |
08-Dec-2003 |
obrien |
Move the bktr(4) <arch>/include/ioctl_{bt848,meteor}.h files to dev/bktr as these ioctl's aren't MD. This also means they are installed in /usr/include/dev/bktr now. Also provide compatability wrappers for where these headers lived in 4.x.
|
123255 |
07-Dec-2003 |
marcel |
Simplify the contexts created by the kernel and remove the related flags. We now create asynchronous contexts or syscall contexts only. Syscall contexts differ from the minimal ABI dictated contexts by having the scratch registers saved and restored because that's where we keep the syscall arguments and syscall return values. Since this change affects KSE, have it use kse_switchin(2) for the "new" syscall context.
|
123223 |
07-Dec-2003 |
imp |
Ooops. These are still used by the bktr driver. David O'Brien has plans for dealing, but I'll let him deal.
Pointy hat to: imp@
|
123213 |
07-Dec-2003 |
imp |
Remote meteor driver. It hasn't compiled in over 3 years. If someone makes it compile again, and can test it, we can restore the driver to the tree.
|
122947 |
21-Nov-2003 |
jhb |
- Split cpu_mp_probe() into two parts. cpu_mp_setmaxid() is still called very early (SI_SUB_TUNABLES - 1) and is responsible for setting mp_maxid. cpu_mp_probe() is now called at SI_SUB_CPU and determines if SMP is actually present and sets mp_ncpus and all_cpus. Splitting these up allows an architecture to probe CPUs later than SI_SUB_TUNABLES by just setting mp_maxid to MAXCPU in cpu_mp_setmaxid(). This could allow the CPU probing code to live in a module, for example, since modules sysinit's in modules cannot be invoked prior to SI_SUB_KLD. This is needed to re-enable the ACPI module on i386. - For the alpha SMP probing code, use LOCATE_PCS() instead of duplicating its contents in a few places. Also, add a smp_cpu_enabled() function to avoid duplicating some code. There is room for further code reduction later since much of this code is also present in cpu_mp_start(). - All archs besides i386 still set mp_maxid to the same values they set it to before this change. i386 now sets mp_maxid to MAXCPU.
Tested on: alpha, amd64, i386, ia64, sparc64 Approved by: re (scottl)
|
122918 |
20-Nov-2003 |
marcel |
Set the ACPI processor Id in the PCPU structure so that CPU idling on SMP systems has a chance of working. This was a loose end of the implementation of the ACPI Cx idle states. Since our logical CPU Id is the ACPI processor Id, we do not need to jump through hoops to obtain it.
Approved: re@ (jhb)
|
122841 |
17-Nov-2003 |
peter |
Widen the enable/disable helper function's argument in line with the ithread_create() changes etc. This should be mostly a NOP.
|
122829 |
17-Nov-2003 |
bde |
Fixed a pedantic syntax error (a stray semicolon at the end of PCPU_MD_FIELDS).
|
122821 |
16-Nov-2003 |
alc |
- Remove unnecessary synchronization from sf_buf_init(). (There is only one active CPU when sf_buf_init() is performed.)
|
122780 |
16-Nov-2003 |
alc |
- Modify alpha's sf_buf implementation to use the direct virtual-to- physical mapping. - Move the sf_buf API to its own header file; make struct sf_buf's definition machine dependent. In this commit, we remove an unnecessary field from struct sf_buf on the alpha, amd64, and ia64. Ultimately, we may eliminate struct sf_buf on those architecures except as an opaque pointer that references a vm page.
|
122763 |
15-Nov-2003 |
njl |
Add the pc_acpi_id PCPU member. The new acpi_cpu driver uses this to dereference the softc.
|
122525 |
12-Nov-2003 |
marcel |
Remove ia64_highfp_load() now that it's unused.
|
122518 |
12-Nov-2003 |
marcel |
Further work-out the handling of the high FP registers. The most important change is in cpu_switch() where we disable the high FP registers for the thread that we switch-out if the CPU currently has its high FP registers. This avoids that the high FP registers remain enabled for the thread even when the CPU has unloaded them or the thread migrated to another processor. Likewise, when we switch-in a thread of that has its high FP registers on the CPU, we enable them. This avoids an otherwise harmless, but unnecessary trap to have them enabled.
The code that handles the disabled high FP trap (in trap()) has been turned into a critical section for the most part to avoid being preempted. If there's a race, we bail out and have the processor trap again if necessary.
Avoid using the generic ia64_highfp_save() function when the context is predictable. The function adds unnecessary overhead. Don't use ia64_highfp_load() for the same reason. The function is now unused and can be removed.
These changes make the lazy context switching of the high FP registers in an UP kernel functional.
|
122480 |
11-Nov-2003 |
marcel |
Save and restore the high FP registers in {g|s}_mcontext(). Note that we currently do not keep track of whether the thread has actually used the high FP registers before. If not, we should not save them in the context which automaticly means that we also would not restore them from the context. For now, do it unconditionally so that we can reach functional completeness.
|
122479 |
11-Nov-2003 |
marcel |
Fix a nasty bug that got exposed when the sendsig() and sigreturn() functions switched to using {g|s}et_mcontext(). The problem is that sigreturn(), being a syscall, can be given an async. context (i.e. one corresponding to an interrupt or trap). When this happens, we try to return to user mode via epc_syscall_return with a trapframe that can only be used to return to user mode via exception_restore.
To fix this, we check the frame's flags immediately prior to epc_syscall_return and branch to exception_restore for non-syscall frames. Modify the assertion in set_mcontext() to check that if there's a mismatch, it's because of sigreturn().
|
122389 |
10-Nov-2003 |
marcel |
In get_mcontext(), do not update bspstore and ndirty in the trapframe. Only update them in the newly created context to reflect the state after copying the dirty registers onto the user stack. If we were to update the trapframe, we lose the state at entry into the kernel. We may need that after we create the context, such as for KSE upcalls.
We have to update the trapframe after writing the dirty registers to the user stack for signal delivery to work. But this is best done in sendsig() itself where it applies, not in get_mcontext() where it's done unconditionally.
|
122373 |
09-Nov-2003 |
marcel |
When a thread is being swapped-out, save the high FP registers. We have a pointer in the PCPU to the PCB of the thread that currently has its high FP registers loaded.
|
122368 |
09-Nov-2003 |
marcel |
Use get_mcontext() to construct the signal context in sendsig() and use set_mcontext() to restore the context in sigreturn(). Since we put the syscall number and the syscall arguments in the trapframe (we don't save the scratch registers for syscalls, which allows us to reuse the space to our advantage), create a MD specific flag so that we save the scratch registers even for syscalls. We would not be able to restart a syscall otherwise.
The signal trampoline does not need to flush the regiters anymore, because get_mcontext() already handles that. In fact, if we set up the context correctly, we do not need to have a trampoline at all. This change however only minimally changes the trampoline code. In follow-up commits this can be further optimized.
Note that normally we preserve cfm and iip in the trapframe created by the EPC syscall path when we restore a context in set_mcontext() because those fields are not normally set for a synchronuous context. The kernel puts the return address and frame info of the syscall stub in there. By preserving these fields we hide this detail from userland which allows us to use setcontext(2) for user created contexts. However, sigreturn() is commonly called from the trampoline, which means that if we preserve cfm and iip in all cases, we would return to the trampoline after the sigreturn(), which means we hit the safety net: we call exit(2). So, we do not preserve cfm and iip when we have a synchronous context that also has scratch registers (the uncommon context created by sendsig() only), under the assumption that if such a context is created in userland, something special is going on and the use of cfm and iip is then just another quirk. All this is invisible in the common case.
|
122364 |
09-Nov-2003 |
marcel |
Change the clear_ret argument of get_mcontext() to be a flags argument. Since all callers either passed 0 or 1 for clear_ret, define bit 0 in the flags for use as clear_ret. Reserve bits 1, 2 and 3 for use by MI code for possible (but unlikely) future use. The remaining bits are for use by MD code.
This change is triggered by a need on ia64 to have another knob for get_mcontext().
|
122333 |
08-Nov-2003 |
marcel |
Remove the atkbd, psm, sc and vga devices. Most ia64 boxes out there are zx1 based machines and they don't particularly like it when we poke at them with PC legacy code. The atkbd and psm devices were disabled in the hints file so that one could enable them on machines that support legacy devices, but that's not really something you can expect from a first-time installer. This still leaves syscons (sc) and the vga device, which were enabled by default and wrecking havoc anyway. We could disable them by default like the atkbd and psm devices, but there's really no point in pretending we're in a better shape that way.
|
122266 |
07-Nov-2003 |
scottl |
Document the lockfunc and lockfuncarg arguments to bus_dma_tag_create() in the busdma headers.
|
122245 |
07-Nov-2003 |
jhb |
Regen.
|
122243 |
07-Nov-2003 |
jhb |
Sync with global syscalls.master. ptrace(), dup(), pipe(), ktrace(), ia32_sigaltstack(), sysarch(), issetugid(), utrace(), and ia32_sigaction() are MP safe.
|
122162 |
06-Nov-2003 |
marcel |
Add support for unaligned ld2, st2, st4 and st8. While here, make sure we handle stacked registers properly by taking into account that: 1. bspstore points after the frame (due to cover), 2. we need to adjust for intermediate NaT collections.
|
121933 |
03-Nov-2003 |
marcel |
Handle unaligned 4-byte loads. While in the neighborhood, remove the cr.isr sanity check. We actually encounter insanities, which very likely means that the insanity check itself is insane. Remove an empty comment while I'm at it.
|
121926 |
03-Nov-2003 |
marcel |
Add a bogus definition of __va_list for use by lint. Make it visible only when lint is defined to protect builds with non-GNU compilers.
|
121892 |
02-Nov-2003 |
marcel |
Remove headers copied from i386 and either useless or wrong on ia64. An example of useless is bios.h. An example of wrong is msdos.h (due to the use of long for 32-bit fields).
display.h cannot be removed because it's used by syscons. That header however has no platform dependency and shouldn't really be here.
Removal if these headers may cause build failures in the ports tree. It's the ports that need fixing in that case.
Tested with: buildworld, LINT
|
121635 |
28-Oct-2003 |
marcel |
When switching the RSE to use the kernel stack as backing store, keep the RNAT bit index constant. The net effect of this is that there's no discontinuity WRT NaT collections which greatly simplifies certain operations. The cost of this is that there can be up to 504 bytes of unused stack between the true base of the kernel stack and the start of the RSE backing store. The cost of adjusting the backing store pointer to keep the RNAT bit index constant, for each kernel entry, is negligible.
The primary reasons for this change are: 1. Asynchronuous contexts in KSE processes have the disadvantage of having to copy the dirty registers from the kernel stack onto the user stack. The implementation we had so far copied the registers one at a time without calculating NaT collection values. A process that used speculation would not work. Now that the RNAT bit index is constant, we can block-copy the registers from the kernel stack to the user stack without having to worry about NaT collections. They will be in the right place on the user stack. 2. The ndirty field in the trapframe is now also usable in userland. This was previously not the case because ndirty also includes the space occupied by NaT collections. The value could be off by 8, depending on the discontinuity. Now that the RNAT bit index is contants, we have exactly the same number of NaT collection points on the kernel stack as we would have had on the user stack if we didn't switch backing stores. 3. Debuggers and other applications that use ptrace(2) can now copy the dirty registers from the kernel stack (using ptrace(2)) and copy them whereever they want them (onto the user stack of the inferior as might be the case for gdb) without having to worry about NaT collections in the same way the kernel doesn't have to worry about them.
There's a second order effect caused by the randomization of the base of the backing store, for it depends on the number of dirty registers the processor happened to have at the time of entry into the kernel. The second order effect is that the RSE will have a better cache utilization as compared to having the backing store always aligned at page boundaries. This has not been measured and may be in practice only minimally beneficial, if at all measurable.
|
121622 |
27-Oct-2003 |
marcel |
The previous commit removed both clause 3 and clause 4 from the UCB license. Only clause 3 has been revoked. Restore the fourth clause as clause 3.
Pointed out by: das@
Remove my name as a copyright holder since I don't use a BSD license compatible or comparable to the UCB license. I choose not to add a complete second license for my work for aesthetic reasons, nor to replace the UCB license on grounds of rewriting more than 90% of the source files. The rewrite can also be seen as an enhancement and since the files were practically empty, it's rather trivial to have changed 90% of the files.
|
121600 |
27-Oct-2003 |
marcel |
Add support for userland to access I/O port space. This is primarily added for XFree86. There are 2 reasons for doing this with sysarch(): 1. The memory mapped I/O space is not at a fixed physical address. An application has to use some interface to get the base address. It gets worse if the machine has multiple memory mapped I/O spaces. 2. Access to the memory mapped I/O space needs to happen through a translation that is flagged as uncachable. There's no interface that allows a process to do uncached memory I/O, other than though /dev/mem (possibly).
So, until we either disallow direct access to I/O or bus space from userland or have a better way of doing this, sysarch() has the least negative impact on existing interfaces.
|
121459 |
24-Oct-2003 |
marcel |
Remove unused header. See also ia64/disasm/disasm.h.
|
121457 |
24-Oct-2003 |
marcel |
Remove ia64_pack_bundle() and ia64_unpack_bundle(). They are not used anymore.
|
121456 |
24-Oct-2003 |
marcel |
Remove unused file. db_disasm() has been implemented in db_interface.c now.
|
121454 |
24-Oct-2003 |
marcel |
Implement db_disasm() by using the new disassembler. Temporarily unimplement db_write_breakpoint() and db_clear_breakpoint().
|
121452 |
24-Oct-2003 |
arun |
Use a TR of size 1 << IA64_ID_PAGE_SHIFT instead of 16M to avoid overlapping TR/TC entries (which results in a machine check). Note that we don't look at the size of the memory descriptor, because it doesn't guarantee non-overlap.
With this change, a UP kernel could boot on a Intel Tiger4 machine with the following options:
options LOG2_ID_PAGE_SIZE=26 # 64M options LOG2_PAGE_SIZE=14 # 16K
Approved by: marcel
|
121449 |
24-Oct-2003 |
marcel |
Don't use fuword() or suword() unconditionally. They explicitly disallow reading or writing.
|
121448 |
24-Oct-2003 |
marcel |
Remove two unused fields in the operand structure (o_read & o_write).
|
121416 |
23-Oct-2003 |
marcel |
Cleanup. Remove the md_flags for threads. It's not used. The flags we had were bogus. While here, reassign the copyright to the Project. There's nothing in this files that originates from NetBSD, especially now that the FreeBSD/alpha bits have been removed, but even then the amount of inherited code that we actually used was nil.
|
121415 |
23-Oct-2003 |
marcel |
Reimplement unaligned_fixup() using the new disassembler and a mcontext_t for the register values. Currently only ld8 and ldfd instructions are handled as those are the ones we need now (a misaligned ld8 occurs 4 times in ntpd(8) and a misaligned ldfd occurs once in mozilla 1.4 and 1.5). Other instructions are added when needed.
|
121413 |
23-Oct-2003 |
marcel |
Remove unused include of <machine/inst.h>
|
121412 |
23-Oct-2003 |
marcel |
Remove prototype of unaligned_fixup() and fix a nearby style(9) bug.
|
121411 |
23-Oct-2003 |
marcel |
Add prototypes for spillfd() and unaligned_fixup().
|
121410 |
23-Oct-2003 |
marcel |
Add spillfd(). This function loads a double-precision FP register at the first address and spills it to the second address. This allows unaligned_fixup() to update the context of the process in a way that assures proper rounding. Similar functions for single-and extended-precision are added when needed.
|
121404 |
23-Oct-2003 |
marcel |
Add a new disassembler that improves over the previous disassembler in that it provides an abstract (intermediate) representation for instructions. This significantly improves working with instructions such as emulation of instructions that are not implemented by the hardware (e.g. long branch) or enhancing implemented instructions (e.g. handling of misaligned memory accesses). Not to mention that it's much easier to print instructions.
Functions are included that provide a textual representation for opcodes, completers and operands.
The disassembler supports all ia64 instructions defined by revision 2.1 of the SDM (Oct 2002).
|
121294 |
21-Oct-2003 |
marcel |
Remove md_bspstore from the MD fields of struct thread. Now that the backing store is at a fixed address, there's no need for a per-thread variable.
|
121268 |
20-Oct-2003 |
marcel |
Put the RSE backing store at a fixed address. This change is triggered by libguile that needs to know the base of the RSE backing store. We currently do not export the fixed address to userland by means of a sysctl so user code needs to hardcode it for now. This will be revisited later.
The RSE backing store is now at the bottom of region 4. The memory stack is at the top of region 4. This means that the whole region is usable for the stacks, giving a 61-bit stack space.
Port: lang/guile (depended of x11/gnome2)
|
121228 |
18-Oct-2003 |
njl |
Add the cpu_idle_hook() function pointer so that other idlers can be hooked at runtime. Make C1 sleep (e.g., HLT) be the default. This prepares the way for further ACPI sleep states.
|
121148 |
17-Oct-2003 |
marcel |
Implement cpu_idle() on ia64. We put the processor in a lightweight halt state that minimizes power consumption while still preserving cache and TLB coherency. Halting the processor is not conditional at this time. Tested with UP and SMP kernels.
|
120937 |
09-Oct-2003 |
robert |
Implement preliminary support for the PT_SYSCALL command to ptrace(2).
|
120928 |
09-Oct-2003 |
marcel |
With BETA 5 of libuwx some of the application registers are renamed from UWX_REG_MUMBLE to UWX_REG_AR_MUMBLE. Compatibility defines are present in libuwx. Change the names here so that we don't depend on compatibility defines.
Note that there's now an UWX_REG_PFS and an UWX_REG_AR_PFS and the former is not a compatibility define for the latter AFAICT. Change to UWX_REG_AR_PFS as that seems to be the one we need to handle.
|
120914 |
08-Oct-2003 |
marcel |
Include <sys/smp.h> for the prototype of smp_rendezvous().
|
120831 |
06-Oct-2003 |
bms |
Move pmap_resident_count() from the MD pmap.h to the MI pmap.h. Add a definition of pmap_wired_count(). Add a definition of vmspace_wired_count().
Reviewed by: truckman Discussed with: peter
|
120722 |
03-Oct-2003 |
alc |
Migrate pmap_prefault() into the machine-independent virtual memory layer.
A small helper function pmap_is_prefaultable() is added. This function encapsulate the few lines of pmap_prefault() that actually vary from machine to machine. Note: pmap_is_prefaultable() and pmap_mincore() have much in common. Going forward, it's worth considering their merger.
|
120683 |
03-Oct-2003 |
marcel |
Swap the syscall caller frame info (i.e. the return pointer and frame marker) and the syscall stub frame info in the trap frame. Previously we stored the stub frame info in (rp,pfs) and the caller frame info in (iip,cfm). This ends up being suboptimal for the following reasons: 1. When we create a new context, such as for an execve(2), we had to set the (rp,pfs) pair for the entry point when using the syscall path out of the kernel but we need to set the (iip,cfm) pair when we take the interrupt way out. This is mostly just an inconsistency from the kernel's point of view, but an ugly irregularity from gdb(1)'s point of view. 2. The getcontext(2) and setcontext(2) syscalls had to swap the (rp,pfs) and (iip,cfm) pairs to make the context compatible with one created purely in userland.
Swapping the (rp,pfs) and (iip,cfm) pairs is visible to signal handlers that actually peek at the mcontext_t and to gdb(1). Since this change is made for gdb(1) and we don't care about signal handlers that peek at the mcontext_t because we're still a tier 2 platform, this ABI breakage is academic at this moment in time.
Note that there was no real reason to save the caller frame info in (iip,cfm) and the stub frame info in (rp,pfs).
|
120540 |
28-Sep-2003 |
marcel |
Drop any and all support for varargs. There's no history to worry about because we're still tier 2 and our current compiler, as well as future compilers will not support varargs. This is mostly a no-op in practice, because <sys/varargs.h> should already cause compile failures.
|
120464 |
26-Sep-2003 |
phk |
Set cn_name, not cn_dev
|
120422 |
25-Sep-2003 |
peter |
Add sysentvec->sv_fixlimits() hook so that we can catch cases on 64 bit systems where the data/stack/etc limits are too big for a 32 bit process.
Move the 5 or so identical instances of ELF_RTLD_ADDR() into imgact_elf.c.
Supply an ia32_fixlimits function. Export the clip/default values to sysctl under the compat.ia32 heirarchy.
Have mmap(0, ...) respect the current p->p_limits[RLIMIT_DATA].rlim_max value rather than the sysctl tweakable variable. This allows mmap to place mappings at sensible locations when limits have been reduced.
Have the imgact_elf.c ld-elf.so.1 placement algorithm use the same method as mmap(0, ...) now does.
Note that we cannot remove all references to the sysctl tweakable maxdsiz etc variables because /etc/login.conf specifies a datasize of 'unlimited'. And that causes exec etc to fail since it can no longer find space to mmap things.
|
120375 |
23-Sep-2003 |
nyan |
Implement the bus_space_map() function to allocate resources and initialize a bus_handle, but currently it does only initializing a bus_handle.
|
120296 |
20-Sep-2003 |
marcel |
Fix the last remaining problem encountered by KSE: apparently it is not guaranteed that the RSE writes the NaT collection immediately, sort of atomically, to the backing store when it writes the register immediately prior to the NaT collection point. This means that we cannot assume that the low 9 bits of the backingstore pointer do not point to the NaT collection. This is rather a surprise and I don't know at this time if it's a bug in the Merced or that it's actually a valid condition of the architecture. A quick scan over the sources does not indicate that we depend on the false assumption elsewhere, but it's something to keep in mind.
The fix is to write the saved contents of the ar.rnat register to the backingstore prior to entering the loop that copies the dirty registers from the kernel stack to the user stack.
|
120294 |
20-Sep-2003 |
marcel |
Move uma_small_alloc() and uma_small_free() to uma_machdep.c. These functions reference UMA internals from <vm/uma_int.h>, which makes them highly unwanted in non-UMA specific files.
While here, prune the includes in pmap.c and use __FBSDID(). Move the includes above the descriptive comment.
The copyright of uma_machdep.c is assigned to the project and can be reassigned to the foundation if and when when such is preferrable.
|
120252 |
19-Sep-2003 |
marcel |
Fix the most significant KSE breakage caused by not restoring the restart instruction bits in the PSR. As such, we were returning from interrupt to the instruction in the bundle that caused us to enter the kernel, only now we're returning to a completely different bundle.
While close here: add two KASSERTs to make sure that we restore sync contexts only when entered the kernel through a syscall and restore an async context only when entered the kernel through an interrupt, trap or fault.
While not exactly here, but close enough: use suword64() when we copy the dirty registers from the kernel stack to the user stack. The code was intended to be be replaced shortly after being added, but that was a couple of weeks ago. I might as well avoid that it is a source for panics until it's replaced.
|
120250 |
19-Sep-2003 |
marcel |
Revamp trap(): make it more explicit which kinds of traps/faults we can get (or not) and what we do with them. This fixes the behaviour for NaT consumption and speculation faults in that we now don't panic for user faults.
Remove the dopanic label and move the code to a function. This makes it easier in the simulator to set a breakpoint.
While here, remove the special handling of the old break-based syscall path and move it to where we handle the break vector. While here, reserve a new break immediate for KSE. We currently use the old break- based syscall to deal with restoring async contexts. However, it has the side-effect of also setting the signal mask and callong ast() on the way out. The new break immediate simply restores the context and returns without calling ast().
|
120222 |
19-Sep-2003 |
marcel |
Change TRAPF_USERMODE and CLOCKF_USERMODE to not test for CPL == 3, but for CPL != 0. For some reason yet unknown it is possible for the CPL to be 2. This would previously be counted as kernel mode, which resulted in nasty panics. By changing the test it is now treated as user mode, which is more correct. We still need to figure out how it is possible that the privilege level can be 2 (or 1 for that matter), because it's not used by us. We only use 3 (user mode) and 0 (kernel mode).
|
120212 |
19-Sep-2003 |
marcel |
Include "opt_kstack_pages.h". We export KSTACK_PAGES to assembly and better have the right value.
|
119999 |
12-Sep-2003 |
alc |
Add a new parameter to pmap_extract_and_hold() that is needed to eliminate Giant from vmapbuf().
Idea from: tegge
|
119970 |
10-Sep-2003 |
marcel |
Rewrite the SAPIC initialization to always program the RTEs with what we think is the correct trigger mode and polarity. This allows us to implement BUS_CONFIG_INTR() as an update of the RTE in question. Consequently, we can trust the RTE when we enable an interrupt and avoids that we need to know about the trigger mode and polarity at that time.
|
119946 |
10-Sep-2003 |
jhb |
Move the definitions for ACPI MADT table entries not present in the ACPICA distribution to a MI header so it can be shared with other architectures.
|
119906 |
09-Sep-2003 |
marcel |
Introduce IA64_ID_PAGE_{MASK|SHIFT|SIZE} and LOG2_ID_PAGE_SIZE. The latter is a kernel option for IA64_ID_PAGE_SHIFT, which in turn determines IA64_ID_PAGE_MASK and IA64_ID_PAGE_SIZE.
The constants are used instead of the literal hardcoding (in its various forms) of the size of the direct mappings created in region 6 and 7. The default and probably only workable size is still 256M, but for kicks we use 128M for LINT.
|
119869 |
08-Sep-2003 |
alc |
Introduce a new pmap function, pmap_extract_and_hold(). This function atomically extracts and holds the physical page that is associated with the given pmap and virtual address. Such a function is needed to make the memory mapping optimizations used by, for example, pipes and raw disk I/O MP-safe.
Reviewed by: tegge
|
119868 |
08-Sep-2003 |
wpaul |
Take the support for the 8139C+/8169/8169S/8110S chips out of the rl(4) driver and put it in a new re(4) driver. The re(4) driver shares the if_rlreg.h file with rl(4) but is a separate module. (Ultimately I may change this. For now, it's convenient.)
rl(4) has been modified so that it will never attach to an 8139C+ chip, leaving it to re(4) instead. Only re(4) has the PCI IDs to match the 8169/8169S/8110S gigE chips. if_re.c contains the same basic code that was originally bolted onto if_rl.c, with the following updates:
- Added support for jumbo frames. Currently, there seems to be a limit of approximately 6200 bytes for jumbo frames on transmit. (This was determined via experimentation.) The 8169S/8110S chips apparently are limited to 7.5K frames on transmit. This may require some more work, though the framework to handle jumbo frames on RX is in place: the re_rxeof() routine will gather up frames than span multiple 2K clusters into a single mbuf list.
- Fixed bug in re_txeof(): if we reap some of the TX buffers, but there are still some pending, re-arm the timer before exiting re_txeof() so that another timeout interrupt will be generated, just in case re_start() doesn't do it for us.
- Handle the 'link state changed' interrupt
- Fix a detach bug. If re(4) is loaded as a module, and you do tcpdump -i re0, then you do 'kldunload if_re,' the system will panic after a few seconds. This happens because ether_ifdetach() ends up calling the BPF detach code, which notices the interface is in promiscuous mode and tries to switch promisc mode off while detaching the BPF listner. This ultimately results in a call to re_ioctl() (due to SIOCSIFFLAGS), which in turn calls re_init() to handle the IFF_PROMISC flag change. Unfortunately, calling re_init() here turns the chip back on and restarts the 1-second timeout loop that drives re_tick(). By the time the timeout fires, if_re.ko has been unloaded, which results in a call to invalid code and blows up the system.
To fix this, I cleared the IFF_UP flag before calling ether_ifdetach(), which stops the ioctl routine from trying to reset the chip.
- Modified comments in re_rxeof() relating to the difference in RX descriptor status bit layout between the 8139C+ and the gigE chips. The layout is different because the frame length field was expanded from 12 bits to 13, and they got rid of one of the status bits to make room.
- Add diagnostic code (re_diag()) to test for the case where a user has installed a broken 32-bit 8169 PCI NIC in a 64-bit slot. Some NICs have the REQ64# and ACK64# lines connected even though the board is 32-bit only (in this case, they should be pulled high). This fools the chip into doing 64-bit DMA transfers even though there is no 64-bit data path. To detect this, re_diag() puts the chip into digital loopback mode and sets the receiver to promiscuous mode, then initiates a single 64-byte packet transmission. The frame is echoed back to the host, and if the frame contents are intact, we know DMA is working correctly, otherwise we complain loudly on the console and abort the device attach. (At the moment, I don't know of any way to work around the problem other than physically modifying the board, so until/unless I can think of a software workaround, this will have do to.)
- Created re(4) man page
- Modified rlphy.c to allow re(4) to attach as well as rl(4).
Note that this code works for the sample 8169/Marvell 88E1000 NIC that I have, but probably won't work for the 8169S/8110S chips. RealTek has sent me some sample NICs, but they haven't arrived yet. I will probably need to add an rlgphy driver to handle the on-board PHY in the 8169S/8110S (it needs special DSP initialization).
|
119867 |
07-Sep-2003 |
marcel |
Untangle the code in this file to improve understandability. Both ia64_count_cpus() and ia64_probe_sapics() called a single function to do the the actual work. The difference in behaviour was handled in that function and was further complicated by adding bootverbose related code. As such, even the simplest of changes was hard to comprehend.
Untangling has been done by increasing code duplication and using a more naive style of coding. FWIW, the object file is slightly smaller than before, so things aren't as bad as it may seem.
Triggered by: a simple fix on the P4 branch that never got merged.
|
119861 |
07-Sep-2003 |
alc |
MFamd64/i386 Add necessary page locking to pmap_mincore().
|
119830 |
07-Sep-2003 |
marcel |
MFp4: Revamped GENERIC (and hints). This is some much more pleasant to look at...
|
119828 |
07-Sep-2003 |
marcel |
Replace sio(4) with uart(4). Remove the sio(4) hints and only add those hints used by uart(4) for the determination of the serial console in the absence of the HCDP table.
|
119787 |
05-Sep-2003 |
marcel |
Fix a place where I forgot to change the code that checks whether we return to kernel or userland. This triggered a panic in a KSE application when TDF_USTATCLOCK was set in the case userland was interrupted, but we never called ast() on our way out. As such, we called ast() at some other time. Unfortunately, TDF_USTATCLOCK handling assumes running in the interrupt thread. This was not the case anymore.
To avoid making the same mistake later, interrupt() now returns to its caller whether we interrupted userland or not. This avoids that we have to duplicate the check in assembly, where it's bound to fall off the scope. Now we simply check the return value and call ast() if appropriate.
Run into this: davidxu
|
119649 |
01-Sep-2003 |
marcel |
Use pmap_steal_memory() for the msgbuf instead of trying to squeeze it in the last chunk (phys_avail block). The last chunk very often is not larger than one or two pages, resulting in a msgbuf that's too small to hold a complete verbose boot. Note that pmap_steal_memory() will bzero the memory it "allocates". Consequently, ia64 will never preserve previous msgbufs. This is not a noticable difference in practice. If the msgbuf could be reused, it was invariably too small to have anything preserved anyway.
|
119624 |
01-Sep-2003 |
marcel |
Use direct mapped KVA for the sf_buf allocator, as made possible by the previous commit. While here, fix a typo, reformat comments and fix a long line.
Tested with: ftpd
|
119563 |
29-Aug-2003 |
alc |
Migrate the sf_buf allocator that is used by sendfile(2) and zero-copy sockets into machine-dependent files. The rationale for this migration is illustrated by the modified amd64 allocator. It uses the amd64's direct map to avoid emphemeral mappings in the kernel's address space. On an SMP, the emphemeral mappings result in an IPI for TLB shootdown for each transmitted page. Yuck.
Maintainers of other 64-bit platforms with direct maps should be able to use the amd64 allocator as a reference implementation.
|
119531 |
28-Aug-2003 |
njl |
Minor style cleanups.
|
119469 |
25-Aug-2003 |
marcel |
Change LOG2_PAGE_SIZE from 14 to 15 bits. This will cause the CTASSERT in vm_page.h to be reached and thus slightly increases the overall coverage of LINT on ia64.
|
119377 |
23-Aug-2003 |
marcel |
Add the bits for a LINT kernel. It has been verified to compile. We may need to polish this.
|
119347 |
23-Aug-2003 |
marcel |
Remove PAGE_SIZE_4K, PAGE_SIZE_8K and PAGE_SIZE_16K and replace them with LOG2_PAGE_SIZE. A single option is better to LINT than multiple mutual exclusive ones.
|
119337 |
23-Aug-2003 |
marcel |
Remove unused inclusion of opt_acpi.h
|
119200 |
21-Aug-2003 |
jhb |
Regen.
|
119199 |
21-Aug-2003 |
jhb |
Swap sigaction/sigreturn since they are in the wrong order.
Noticed indirectly by: peter
|
119159 |
20-Aug-2003 |
marcel |
Undo the mistake made in revision 1.77 of trap.c and which was the ultimate trigger for the follow-up fixes in revisions 1.78, 1.80, 1.81 and 1.82 of trap.c. I was simply too pre-occupied with the gateway page and how it blurs kernel space with user space and vice versa that I couldn't see that it was all a load of bollocks.
It's not the IP address that matters, it's the privilege level that counts. We never run in user space with lifted permissions and we sure can not run in kernel space without it. Sure, the gateway page is the exception, but not if you look at the privilege level. It's user space if you run with user permissions and kernel space otherwise.
So, we're back to looking at the privilege level like it should be. There's no other way.
Pointy hat: marcel
|
119015 |
17-Aug-2003 |
gordon |
Fixup the ELF branding information to point to the new home of rtld.
|
119004 |
16-Aug-2003 |
marcel |
In vm_thread_swap{in|out}(), remove the alpha specific conditional compilation and replace it with a call to cpu_thread_swap{in|out}(). This allows us to add similar code on ia64 without cluttering the code even more.
|
118990 |
16-Aug-2003 |
marcel |
Further cleanup <machine/cpu.h> and <machine/md_var.h>: move the MI prototypes of cpu_halt(), cpu_reset() and swi_vm() from md_var.h to cpu.h. This affects db_command.c and kern_shutdown.c.
ia64: move all MD prototypes from cpu.h to md_var.h. This affects madt.c, interrupt.c and mp_machdep.c. Remove is_physical_memory(). It's not used (vm_machdep.c).
alpha: the MD prototypes have been left in cpu.h with a comment that they should be there. Moving them is left for later. It was expected that the impact would be significant enough to be done in a seperate commit.
powerpc: MD prototypes left in cpu.h. Comment added.
Suggested by: bde Tested with: make universe (pc98 incomplete)
|
118981 |
16-Aug-2003 |
marcel |
Fix a range check bug. Don't left-shift the integer argument 'data'. Sign extension happens after the shift, not before so that boundary cases like 0x40000000 will not be caught properly. Instead, right shift ndirty. It is guaranteed to be a multiple of 8. While here, do some manual code motion and code commoning.
Range check bug pointed out by: iedowse
|
118935 |
15-Aug-2003 |
marcel |
Fix the generation of coredumps. We did not take the dirty registers that were on the kernel stack into account. For now we write them out to the register stack of the process before creating the dump. This however is not the final solution. The problem is that we may invalidate the coredump by overwriting vital information due to an invalid backing store pointer. Instead we need to write the dirty registers to an unused region of VM which will result in a seperate segment in the coredump. For now we can at least get to all the registers from a coredump.
|
118934 |
15-Aug-2003 |
marcel |
Add an instruction group break after the move to application register and the move to control register to avoid dependency violations when these functions are used. Note that explicit data and instruction serialization also need to be in a subsequent instruction group. This too requires that we have an igrp break here.
|
118933 |
15-Aug-2003 |
marcel |
Introduce two machine specific ptrace(2) requests: PT_GETKSTACK and PT_SETKSTACK. These requests allow the tracing process to access the dirty registers of the traced process that are on the kernel stack.
Note that there's currently no way to access the rnat register for those dirty registers that are not (yet) covered by a nat collection point. The interface for this is still being slept on.
Also note that implied by these requests is the division of work: The tracing process has to keep track of where registers are spilled and is responsible to figure out where the NaT bit of the stacked registers are at any time during the execution of the traced process. The kernel provides the interfaces but will not abstract the fact that the register stack can be split. This model does not follow the approach taken in Linux where PT_PEEK and PT_POKE deals with this automagically.
|
118853 |
13-Aug-2003 |
marcel |
Don't use VM_MIN_KERNEL_ADDRESS to check if the faulting address is in user space or kernel space. VM_MIN_KERNEL_ADDRESS starts after the gateway page, which means that improper memory accesses to the gateway page while in user mode would panic the kernel. Use VM_MAX_ADDRESS instead. It ends before the gateway page. The difference between VM_MIN_KERNEL_ADDRESS and VM_MAX_ADDRESS is exactly the gateway page.
|
118851 |
13-Aug-2003 |
marcel |
Put an instruction group break between the move to ar.rnat and the move to ar.rsc. The RSE must be in enforced lazy mode when writing to RSE modifyable registers. In this case we restore the RSE NaT collection register ar.rnat. I have seen 2 general exception faults on pluto1 now that indicate that the move to ar.rsc has already happened prior to the move to ar.rnat, meaning that the RSE is not in enforced lazy mode anymore. The ia64 dependency and instruction ordering rules seem to allow having both registers written to in the same instruction group, provided ar.rsc is written to later than ar.rnat (based on the ordering semantics). It appears that we may be pushing our luck. For now, put them in seperate cycles (by means of the instruction group break). If we ever get a general exception fault on the move to ar.rnat again, we have definite proof that something else is fishy.
|
118848 |
12-Aug-2003 |
imp |
Expand inline the relevant parts of src/COPYRIGHT for Matt Dillon's copyrighted files.
Approved by: Matt Dillon
|
118818 |
12-Aug-2003 |
marcel |
Extend identifycpu(): o Differentiate between CPU family and CPU model. There are multiple Itanium 2 models and it's nice to differentiate between them. o Seperately export the CPU family and CPU model with sysctl. o Merced is the only model in the Itanium family. o Add Madison to the Itanium 2 family. We already knew about McKinley. o Print the CPU family between parenthesis, like we do with the i386 CPU class.
My prototype now identifies itself as: CPU: Merced (800.03-Mhz Itanium)
pluto1 and pluto2 will eventually identify themselves as: CPU: McKinley (900.00-Mhz Itanium 2)
|
118811 |
12-Aug-2003 |
marcel |
Cleanup prototypes in cpu.h, including fswintrberr and any references to it. Sort the remaining prototypes in cpu.h.
No functional change.
|
118798 |
11-Aug-2003 |
marcel |
Cleanup and style(9) fixes. No functional change.
|
118739 |
10-Aug-2003 |
marcel |
o move cpu_reset() from vm_machdep.c to machdep.c. o reorder cpu_boot(), cpu_halt() and identifycpu().
No functional change.
|
118717 |
10-Aug-2003 |
marcel |
Now that we can ignore up to 8KB of dirty registers, remove the RSE magic from exec_setregs(). In set_mcontext() we now also don't have to worry that we entered the kernel with more that 512 bytes of dirty registers on the kernel stack. Note that we cannot make any assumptions anymore WRT to NaT collection points in exec_setregs(), so we have to deal with them now.
|
118640 |
08-Aug-2003 |
marcel |
MFi386 1.422 & 1.423: lock page queues in pmap_insert_entry().
|
118607 |
07-Aug-2003 |
jhb |
Consistently use the BSD u_int and u_short instead of the SYSV uint and ushort. In most of these files, there was a mixture of both styles and this change just makes them self-consistent.
Requested by: bde (kern_ktrace.c)
|
118590 |
07-Aug-2003 |
marcel |
Better define the flags in the mcontext_t and properly set the flags when we create contexts. The meaning of the flags are documented in <machine/ucontext.h>. I only list them here to help browsing the commit logs: _MC_FLAGS_ASYNC_CONTEXT _MC_FLAGS_HIGHFP_VALID _MC_FLAGS_KSE_SET_MBOX _MC_FLAGS_RETURN_VALID _MC_FLAGS_SCRATCH_VALID
Yes, _MC_FLAGS_KSE_SET_MBOX is a hack and I'm proud of it :-)
|
118588 |
07-Aug-2003 |
marcel |
o Fix cut-n-paste whitespace corruption in previous commit o For trap-based upcalls the argument (the kse_mailbox) to the UTS must be written onto the kernel stack, not the user stack. While here, deal with the fact that we may be at a NaT collection point.
|
118566 |
06-Aug-2003 |
marcel |
In cpu_set_upcall_kse(), create the upcall according to the entry path into the kernel. Normally it's due to a syscall, but one can also be created as the result of a clock interrupt (for example). This now even more looks like exec_setregs().
While here, add an assert that we don't expect more than 8KB of dirty registers on the kernel stack.
|
118563 |
06-Aug-2003 |
marcel |
o In revision 1.45 of exception.S we changed exception_restore to unconditionally restore ar.k7 (kernel memory stack) and ar.k6 (kernel register stack). I don't know what I was smoking then, but if you unconditionally restore ar.k6, you also want to compute its value unconditionally. By having the computation predicated and dependent on whether we return to user mode, we would end up writing junk (= invalid value for ar.bspstore) if we would return to kernel mode. But the whole point of the unconditional restoration was that there is a grey area where we still need to have ar.k6 restored. If we restore with a junk value, we would end up wedging the machine on the next interrupt. So, unconditionally calculate the value we unconditionally write to ar.k6.
o The previous braino was found while making the following change: We used to clear the lower 9 bits of the value we write to ar.k6. The meaning being that we know that the kernel register stack is at least 512 byte aligned and simply clearing the lower 9 bits allows us to return to a context of which we don't have dirty registers on the kernel stack, even though the context that entered the kernel does have dirty registers on the kernel stack. By masking-off the lower bits, we correctly obtain the base of the register stack without having to worry that we didn't actually reached the base while unwinding it. The change is to mask off the lower 13 bits, knowing that the kernel register stack is always 8KB aligned. The advantage is that we don't have to worry anymore if there's more than 512 bytes of dirty registers on the kernel stack. A situation that frequently occurs. In exec_setregs() in machdep.c:1.147 or older, we had to deal with that situation by copying the active portion of the register stack down in multiples of 512 bytes. Now that we mask off the lower 13 bits we don't have to do that at all. Contemporary IPF processors have a register file that can hold up to 96 stacked registers (=784 bytes [incl. 2 NaT collections]). With no indication that register files grow beyond a couple of hundred registers, we should not have to worry about it anymore... and yes, 640KB is enough for everybody :-) This change helps setcontext(2) and cpu_set_upcall_kse() in that they can return to completely different contexts without having to mess with the kernel stack. Of course exec_setregs() doesn't need to do that anymore as well.
|
118503 |
05-Aug-2003 |
marcel |
o Put the syscall return registers in the context. Not only do we need this for swapcontext(), KSE upcalls initiated from ast() also need to save them so that we properly return the syscall results after having had a context switch. Note that we don't use r11 in the kernel. However, the runtime specification has defined r8-r11 as return registers, so we put r11 in the context as well. I think deischen@ was trying to tell me that we should save the return registers before. I just wasn't ready for it :-)
o The EPC syscall code has 2 return registers and 2 frame markers to save. The first (rp/pfs) belongs to the syscall stub itself. The second (iip/cfm) belongs to the caller of the syscall stub. We want to put the second in the context (note that iip and cfm relate to interrupts. They are only being misused by the syscall code, but are not part of a regular context). This way, when the context is switched to again, we return to the caller of setcontext(2) as one would expect.
o Deal with dirty registers on the kernel stack. The getcontext() syscall will flush the RSE, so we don't expect any dirty registers in that case. However, in thread_userret() we also need to save the context in certain cases. When that happens, we are sure that there are dirty registers on the kernel stack. This implementation simply copies the registers, one at a time, from the kernel stack to the user stack. NAT collections are not dealt with. Hence we don't preserve NaT bits. A better solution needs to be found at some later time. We also don't deal with this in all cases in set_mcontext. No temporay solution is implemented because it's not a showstopper. The problem is that we need to ignore the dirty registers and we automaticly do that for at most 62 registers. When there are more than 62 dirty registers we have a memory "leak".
This commit is fundamental for KSE support.
|
118450 |
04-Aug-2003 |
marcel |
Fix logic bug in the previous commit. Any region less than 5 is a user space region. Hence, we need to test if 5 is greater than the region; not greater equal. This bug caused us to call ast() while interrupting kernel mode.
|
118443 |
04-Aug-2003 |
jhb |
- Since td_critnest is now initialized in MI code, it doesn't have to be set in cpu_critical_fork_exit() anymore. - As far as I can tell, cpu_thread_link() has never been used, not even when it was originally added, so remove it.
|
118414 |
04-Aug-2003 |
marcel |
Cleanup the clock code. This includes: o Remove alpha specific timer code (mc146818A) and compiled-out calibration of said timer. o Remove i386 inherited timer code (i8253) and related acquire and release functions. o Move sysbeep() from clock.c to machdep.c and have it return ENODEV. Console beeps should be implemented using ACPI or if no such device is described, using the sound driver. o Move the sysctls related to adjkerntz, disable_rtc_set and wall_cmos_clock from machdep.c to clock.c, where the variables are. o Don't hardcode a hz value of 1024 in cpu_initclocks() and don't bother faking a stathz that's 1/8 of that. Keep it simple: hz defaults to HZ and stathz equals hz. This is also how it's done for sparc64. o Keep a per-CPU ITC counter (pc_clock) and adjustment (pc_clockadj) to calculate ITC skew and corrections. On average, we adjust the ITC match register once every ~1500 interrupts for a duration of 2 consequtive interruprs. This is to correct the non-deterministic behaviour of the ITC interrupt (there's a delay between the match and the raising of the interrupt). o Add 4 debugging sysctls to monitor clock behaviour. Those are debug.clock_adjust_edges, debug.clock_adjust_excess, debug.clock_adjust_lost and debug.clock_adjust_ticks. The first counts the individual adjustment cycles (when the skew first crosses the threshold), the second counts the number of times the adjustment was excessive (any non-zero value is to be considered a bug), the third counts lost clock interrupts and the last counts the number of interrupts for which we applied an adjustment (debug.clock_adjust_ticks / debug.clock_adjust_edges gives the avarage duration of an individual adjustment -- should be ~2).
While here, remove some nearby (trivial) left-overs from alpha and other cleanups.
|
118402 |
04-Aug-2003 |
marcel |
Fix handling of external interrupts: we weren't calling ast() when interrupting user mode. The net effect of this bug is that a clock interrupt does not cause rescheduling and processes are not preempted. It only takes a "while (1);" to render the machine useless.
This bug was introduced by the context changes and EPC syscall code. Handling of ASTs was moved to C for clarity and ease of maintenance, but was not added for the external interrupt case.
This needs to be revisited. We now have calls to do_ast() in trap(), break_syscall() and ivt_External_Interrupt(). A single call in exception_restore covers these 3 places without duplication. This is where we handled ASTs prior to the overhaul, except that the meat has been moved to do_ast(), a C function. This was the goal to begin with.
Pointy hat: marcel
|
118382 |
03-Aug-2003 |
obrien |
Style sync.
|
118333 |
02-Aug-2003 |
marcel |
Don't use uint64_t. Use unsigned long instead. One is supposed to use ucontext_t without having to include headers other than <ucontext.h>.
|
118296 |
01-Aug-2003 |
marcel |
Write the preserved registers to (and read them from) struct reg and struct fpreg.
|
118244 |
31-Jul-2003 |
bmilekic |
Make sure that when the PV ENTRY zone is created in pmap, that it's created not only with UMA_ZONE_VM but also with UMA_ZONE_NOFREE. In the i386 case in particular, the pmap code would hook a special page allocation routine that allocated from kernel_map and not kmem_map, and so when/if the pageout daemon drained the zones, it could actually push out slabs from the PV ENTRY zone but call UMA's default page_free, which resulted in pages allocated from kernel_map being freed to kmem_map; bad. kmem_free() ignores the return value of the vm_map_delete and just returns. I'm not sure what the exact repercussions could be, but it doesn't look good.
In the PAE case on i386, we also set-up a zone in pmap, so be conservative for now and make that zone also ZONE_NOFREE and ZONE_VM. Do this for the pmap zones for the other archs too, although in some cases it may not be entirely necessarily. We'd rather be safe than sorry at this point.
Perhaps all UMA_ZONE_VM zones should by default be also UMA_ZONE_NOFREE?
May fix some of silby's crashes on the PV ENTRY zone.
|
118239 |
31-Jul-2003 |
peter |
Deal with 'options KSTACK_PAGES' being a global option.
|
118238 |
31-Jul-2003 |
peter |
Cosmetic: fix some disorder of #include "opt_...." files
|
118237 |
31-Jul-2003 |
peter |
Remove leftover relic of pmap_new_thread() etc.
|
118081 |
27-Jul-2003 |
mux |
- Introduce a new busdma flag BUS_DMA_ZERO to request for zero'ed memory in bus_dmamem_alloc(). This is possible now that contigmalloc() supports the M_ZERO flag. - Remove the locking of Giant around calls to contigmalloc() since contigmalloc() now grabs Giant itself.
|
118050 |
26-Jul-2003 |
marcel |
Remove prototype of ia64_pa_access(). The function has been moved to mem.c where it's been made static.
|
118048 |
26-Jul-2003 |
marcel |
Avoid using __aligned(16). Instead define the jmp_buf in terms of long doubles. This gives us 16-byte alignment. Add a CTASSERT for the size of the jmp_buf to detect ABI breakages.
|
118046 |
26-Jul-2003 |
marcel |
Unbreak ia64 builds now -Werror is enabled again. Avoid obsolete memory operand construct.
|
118034 |
25-Jul-2003 |
marcel |
Revert previous commit. We don't use setjmp()/longjmp() for context switching anymore, so there's no need to save and restore GP. This change breaks threaded applications linked against libc_r. Pull the tier 2 card again: relink. This will link against libthr instead.
|
118024 |
25-Jul-2003 |
alc |
MFi386 revision 1.416 Add vm object locking to pmap_prefault().
Note: powerpc and sparc64 do not implement this function.
|
117999 |
25-Jul-2003 |
marcel |
Remove __aligned(16) from the definition of struct _ia64_fpreg. It's a non-standard construct. Instead, redefine struct _ia64_fpreg as a union and put a long double in it. On ia64 and for LP64, this is defined by the ABI to have 16-byte alignment. For ILP32 a long double has 4-byte alignment, but we don't support ILP32.
Note that the in-memory image of a long double does not match the in- memory image of spilled FP registers. This means that one cannot use the fpr_flt field to interpet the bits. For this reason we continue to use an aggregate type.
|
117998 |
25-Jul-2003 |
marcel |
Remove INVARIANT* and WITNESS. This makes the simulator much more pleasant to use.
|
117993 |
25-Jul-2003 |
marcel |
Move ia64_pa_access() from machdep.c to mem.c and declare it static. It's only used in mem.c and cannot accidentally be used elsewhere this way.
|
117984 |
25-Jul-2003 |
marcel |
Disable the single-step trap on a debug related trap, including of course the single-step trap itself.
|
117908 |
23-Jul-2003 |
marcel |
We sloppily created an array for the high FP registers (f32-f127), but this just created a weird inconsistency when porting gdb(1). Instead, we name each high FP register seperately, like we do for all the other registers.
|
117608 |
15-Jul-2003 |
marcel |
Rename thread_siginfo to cpu_thread_siginfo.
|
117499 |
13-Jul-2003 |
marcel |
Enable the high FP registers when we call the FPSWA handler and disable them again afterwards. This fixes a disabled FP fault while in the FPSWA handler. While here, merge the FP fault and FP trap handling code to reduce code duplication. Where code was different, it was not sure it should be.
Trigger case: ports/math/atlas
|
117467 |
12-Jul-2003 |
marcel |
Add logic to trace across/over a trapframe. We have ABI markers in our unwind information for functions that are entry points into the kernel. When stepping to the next frame, the unwinder will let us know when sych a marker was encountered. We use this to stop the current unwind session, query the trapframe and restart a new unwind session based on the new trapframe.
The implementation is a bit sloppy, but at this time there are bigger fish to fry.
|
117437 |
11-Jul-2003 |
marcel |
Add a body directive before the first instruction in epc_syscall(). This results in a zero length prologue and a body that covers the whole function. This is more correct.
|
117436 |
11-Jul-2003 |
marcel |
Remove a gratuitous align directive after the endp directive for IVT entries.
|
117267 |
05-Jul-2003 |
marcel |
Don't call malloc() and free() while in the debugger and unwinding to get a stacktrace. This does not work even with M_NOWAIT when we have WITNESS and is generally a bad idea (pointed out by bde@). We allocate an 8K heap for use by the unwinder when ddb is active. A stack trace roughly takes up half of that in any case, so we have some room for complex unwind situations. We don't want to waste too much space though. Due to the nature of unwinding, we don't worry too much about fragmentation or performance of unwinding while in the debugger. For now we have our own heap management, but we may be able to leverage from existing code at some later time.
While here: o Make sure we actually free the unwind environment after unwinding. This fixes a memory leak. o Replace Doug's license with mine in unwind.c and unwind.h. Both files don't have much, if any, of Doug's code left since the EPC syscall overhaul and the import of the unwinder. o Remove dead code. o Replace M_NOWAIT with M_WAITOK for all remaining malloc() calls.
|
117206 |
03-Jul-2003 |
alc |
Background: pmap_object_init_pt() premaps the pages of a object in order to avoid the overhead of later page faults. In general, it implements two cases: one for vnode-backed objects and one for device-backed objects. Only the device-backed case is really machine-dependent, belonging in the pmap.
This commit moves the vnode-backed case into the (relatively) new function vm_map_pmap_enter(). On amd64 and i386, this commit only amounts to code rearrangement. On alpha and ia64, the new machine independent (MI) implementation of the vnode case is smaller and more efficient than their pmap-based implementations. (The MI implementation takes advantage of the fact that objects in -CURRENT are ordered collections of pages.) On sparc64, pmap_object_init_pt() hadn't (yet) been implemented.
|
117161 |
02-Jul-2003 |
ru |
The .s files were repo-copied to .S files.
Approved by: marcel Repocopied by: joe
|
117142 |
02-Jul-2003 |
marcel |
The use of SYSINIT requires the inclusion of <sys/kernel.h>
|
117139 |
01-Jul-2003 |
mux |
Make this even closer to other busdma backends.
|
117133 |
01-Jul-2003 |
mux |
Sync bounce pages support with the alpha backend. More precisely: o use a mutex to protect the bounce pages structure. o use a SYSINIT function to initialize the bounce pages structures and thus avoid a race condition in alloc_bounce_pages(). o add support for the BUS_DMA_NOWAIT flag in bus_dmamap_load(). o remove obsolete splhigh()/splx() calls. o remove printf() about incorrect locking in busdma_swi() and sync busdma_swi() with the one of the alpha backend. o use __FBSDID.
|
117129 |
01-Jul-2003 |
mux |
Honor the boundary of the busdma tag when allocating bounce pages. This was fixed in revision 1.5 of alpha/alpha/busdma_machdep.c and was never fixed in other busdma backends using bounce pages.
|
117126 |
01-Jul-2003 |
scottl |
Mega busdma API commit.
Add two new arguments to bus_dma_tag_create(): lockfunc and lockfuncarg. Lockfunc allows a driver to provide a function for managing its locking semantics while using busdma. At the moment, this is used for the asynchronous busdma_swi and callback mechanism. Two lockfunc implementations are provided: busdma_lock_mutex() performs standard mutex operations on the mutex that is specified from lockfuncarg. dftl_lock() is a panic implementation and is defaulted to when NULL, NULL are passed to bus_dma_tag_create(). The only time that NULL, NULL should ever be used is when the driver ensures that bus_dmamap_load() will not be deferred. Drivers that do not provide their own locking can pass busdma_lock_mutex,&Giant args in order to preserve the former behaviour.
sparc64 and powerpc do not provide real busdma_swi functions, so this is largely a noop on those platforms. The busdma_swi on is64 is not properly locked yet, so warnings will be emitted on this platform when busdma callback deferrals happen.
If anyone gets panics or warnings from dflt_lock() being called, please let me know right away.
Reviewed by: tmm, gibbs
|
117045 |
29-Jun-2003 |
alc |
- Export pmap_enter_quick() to the MI VM. This will permit the implementation of a largely MI pmap_object_init_pt() for vnode-backed objects. pmap_enter_quick() is implemented via pmap_enter() on sparc64 and powerpc. - Correct a mismatch between pmap_object_init_pt()'s prototype and its various implementations. (I plan to keep pmap_object_init_pt() as the MD hook for device-backed objects on i386 and amd64.) - Correct an error in ia64's pmap_enter_quick() and adjust its interface to match the other versions. Discussed with: marcel
|
117022 |
29-Jun-2003 |
alc |
- Remove the calls to pmap_install() from pmap_object_init_pt(); they are redundant. Discussed with: marcel - MFi386: Add vm object locking to pmap_object_init_pt().
|
116971 |
28-Jun-2003 |
marcel |
Implement cpu_set_upcall_kse(). Elementary testing shows that this function behaves correctly in principle, but is not expected to be 100% complete. In any case, with this commit we have KSE ported enough to start runtime testing with threaded applications and fix whatever bugs or omissions we encounter. Yay!
|
116958 |
28-Jun-2003 |
davidxu |
Add a machine depended function thread_siginfo, SA signal code will use the function to construct a siginfo structure and use the result to export to userland.
Reviewed by: julian
|
116907 |
27-Jun-2003 |
scottl |
Do the first and mostly mechanical step of adding mutex support to the bus_dma async callback scheme. Note that sparc64 does not seem to do async callbacks. Note that ia64 callbacks might not be MPSAFE at the moment. Note that powerpc doesn't seem to do async callbacks due to the implementation being incomplete.
Reviewed by: mostly silence on arch@
|
116570 |
19-Jun-2003 |
marcel |
Add TLS related relocation.
|
116510 |
18-Jun-2003 |
alc |
Fix a performance bug in all of the various implementations of uma_small_alloc(): They always zeroed the page regardless of what the caller requested.
|
116361 |
15-Jun-2003 |
davidxu |
Rename P_THREADED to P_SA. P_SA means a process is using scheduler activations.
|
116355 |
14-Jun-2003 |
alc |
Migrate the thread stack management functions from the machine-dependent to the machine-independent parts of the VM. At the same time, this introduces vm object locking for the non-i386 platforms.
Two details:
1. KSTACK_GUARD has been removed in favor of KSTACK_GUARD_PAGES. The different machine-dependent implementations used various combinations of KSTACK_GUARD and KSTACK_GUARD_PAGES. To disable guard page, set KSTACK_GUARD_PAGES to 0.
2. Remove the (unnecessary) clearing of PG_ZERO in vm_thread_new. In 5.x, (but not 4.x,) PG_ZERO can only be set if VM_ALLOC_ZERO is passed to vm_page_alloc() or vm_page_grab().
|
116328 |
14-Jun-2003 |
alc |
Move the *_new_altkstack() and *_dispose_altkstack() functions out of the various pmap implementations into the machine-independent vm. They were all identical.
|
116324 |
14-Jun-2003 |
marcel |
Remove kernel event tracing. The overhead is significant when running under ski.
|
116227 |
12-Jun-2003 |
marcel |
Make sure pcpu->pc_pcb is pointing to a 16-byte aligned address. The PCB contains FP registers, whose alignment must be 16 bytes at least. Since the PCB pointed to by pc_pcb is immediately after the PCPU itself, round-up the size of thge PCPU to a multiple of 16 bytes. The PCPU is page aligned.
This fixes a misalignment trap caused by stopping a CPU in a SMP kernel, such as been done when entering the debugger.
Reported by: Alan Robinson <alan.robinson@fujitsu-siemens.com>
|
116188 |
11-Jun-2003 |
peter |
GC unused cpu_wait() function
|
115999 |
08-Jun-2003 |
jmallett |
Note that scbus is required for SCSI, not just "required" in general.
Submitted by: Edward Kaplan (tmbg37 on IRC) Reviewed by: rwatson (in principle)
|
115937 |
07-Jun-2003 |
marcel |
pmap_find_vhpt() has been observed to return a NULL pointer when the caller assumes this to not happen by means of performing an indirection without checking the return value. Add KASSERTs to force a kernel with INVARIANTS to panic. This is a short-term measure. The pmap code is scheduled to be overhauled.
|
115936 |
07-Jun-2003 |
marcel |
If we get a fault in the gateway page, which would happen if we try to deliver a signal and the RSE backing store has been exhausted or the backing store pointer has been clobbered, we need to make sure we call userret() and do_ast() when we exit from trap(). Not adjusting the local variable 'user' in this case will prevent the faulty process from being terminated and we end up in an infinite fault repetition.
Faulty process provided by: bento
|
115916 |
06-Jun-2003 |
marcel |
Use TRAPF_USERMODE() to replace an equivalent check in trap(). While here, amend the related comment.
|
115913 |
06-Jun-2003 |
marcel |
Have TRAPF_USERMODE() take into account that the gateway page is not always kernel space. It should be treated as user space when run with user privileges (which is the case for the signal trampolines). This fixes its only use in a KASSERT in subr_trap.c.
|
115859 |
04-Jun-2003 |
marcel |
Fix the dreaded double counting that was present on alpha as well and got fixed two weeks after the ia64 version was copied from the alpha version (see rev 1.32 of sys/alpha/alpha/mem.c). As such, we were missing the same continue as on alpha.
While here, add a default case for the device minor switch and do some general style(9) cleanups.
WARNING: this file still has bugs. When reading from region 6 or region 7, we don't validate the physical address. One can trivially cause a machine check by trying to read from address 0xFFFFFFFFFFFFFFF0 or something that uses the unimplemented physical address bits.
Reported by: Alan Robinson <alan.robinson@fujitsu-siemens.com>
|
115858 |
04-Jun-2003 |
marcel |
Change the second (and last) argument of cpu_set_upcall(). Previously we were passing in a void* representing the PCB of the parent thread. Now we pass a pointer to the parent thread itself. The prime reason for this change is to allow cpu_set_upcall() to copy (parts of) the trapframe instead of having it done in MI code in each caller of cpu_set_upcall(). Copying the trapframe cannot always be done with a simply bcopy() or may not always be optimal that way. On ia64 specifically the trapframe contains information that is specific to an entry into the kernel and can only be used by the corresponding exit from the kernel. A trapframe copied verbatim from another frame is in most cases useless without some additional normalization.
Note that this change removes the assignment to td->td_frame in some implementations of cpu_set_upcall(). The assignment is redundant. A previous call to cpu_thread_setup() already did the exact same assignment. An added benefit of removing the redundant assignment is that we can now change td_pcb without nasty side-effects.
This change officially marks the ability on ia64 for 1:1 threading.
Not tested on: amd64, powerpc Compile & boot tested on: alpha, sparc64 Functionally tested on: i386, ia64
|
115652 |
01-Jun-2003 |
marcel |
Improve set_mcontext: o Don't copy psr verbatim from the user supplied context. Only allow userland to change the processor settings that are part of the user mask.
|
115651 |
01-Jun-2003 |
marcel |
Improve on cpu_set_upcall: o Use pcb and tf for the new pcb and the new trapframe and use pcb0 for the old (current) pcb. The mix of pcb, pcb2 and tf was slightly confusing. o Don't define td->td_frame here. It has already been set previously by cpu_thread_setup. Add a KASSERT to make sure pcb and tf are both non-NULL. o Make sure the number of dirty registers is 0 for the new thread. There are no user registers on the backing store because we heven't enter userland yet.
|
115605 |
01-Jun-2003 |
marcel |
Implement cpu_thread_setup(). This is mostly the same as on i386, except for the fact that trapframes have a size recorded in it that we set here too. We need this for proper thread setup.
Pointed out by: mtm
|
115573 |
31-May-2003 |
marcel |
Now that we have the signal trampolines in the gateway page and the gateway page is considered kernel space, we can panic when we should only SIGSEGV. Hence, add the additional constraint that for page faults we also require running with kernel privileges. The gateway page is the only kernel code running with user privileges, iso this is a correct way to exclude the gateway page from kernel land.
We do not currently exclude the gateway page for other faults as it is not always the right way to do it. Further tuning will happen on a case by case bases.
|
115570 |
31-May-2003 |
marcel |
Implement cpu_set_upcall(). Required by libthr and used by thr_create(2). This implementation is so far only compile tested. But since this is also the last of the functions required to support libthr, we're now functionally complete (for some weird definition of functionally; and complete). Runtime testing can commence.
|
115566 |
31-May-2003 |
marcel |
Implement set_mcontext() and get_mcontext(). Just as for sendsig() and sigreturn(), we cheat and assume the preserved registers are still on-chip and unmodified. This is actually the case, but more by accident than by design. We need to use unwinding eventually or explicitly compile the kernel in a way that the compiler steers clear from using the preserved registers completely.
|
115564 |
31-May-2003 |
marcel |
Make the regset pointers const pointers for the context restore functions. This works better with set_mcontext() and is more precise in general.
|
115563 |
31-May-2003 |
marcel |
Some ia32 related finetuning for the EPC syscall path: o The SDM states that flushing the RSE in the cycle prior to the call to ia32 code yields the best performance. We don't really care to much about performance here, but we do the same anyway. I'm being paranoia and conservative here. o Only initialize the ia32 state registers, not the registers used as scratch by the ia32 engine. This saves a couple of loads from the trapframe, but also helps debugging: we don't clobber useful debugging data (engineering hints :-) o Make sure all general registers constituting ia32 state have been initialized. If there's no useful to be loaded from the trapframe, clear the register. This avoids accidentally leaking NaT bits. o Make sure we set ar.k6 prior to clobbering ar.bspstore and also set ar.k7 prior to setting sp. This fixes a race seen for ia64 native code as well (and previously fixed too).
|
115558 |
31-May-2003 |
marcel |
Make sure we have all the dirty registers in user frames on the backing store before we discard them. It is possible that we enter the kernel (due to an execve in this case) with a lot of dirty user registers and that the RSE has only partially spilled them (to make room for new frames). We cannot move the backing store pointer down (to discard user registers) when not all of the user registers are on the backing store. So, we flush the register stack IFF this happens. Unconditionally doing the flush is too costly, because the condition in which we need to flush is very rare.
This change appears to fix the SIGSEGV that sometimes happen for newly executed processes and so far also appears to fix the last of the corruption. It is possible, although not likely, that this change prevents some other bug from happening, even though it is itself not a fix. Hence the uncertainty. We'll know in a couple of months I guess :-)
|
115416 |
30-May-2003 |
hmp |
Rename BUS_DMAMEM_NOSYNC to BUS_DMA_COHERENT.
The current name is confusing, because it indicates to the client that a bus_dmamap_sync() operation is not necessary when the flag is specified, which is wrong.
The main purpose of this flag is to hint the underlying architecture that DMA memory should be mapped in a coherent way, but the architecture can ignore it. But if the architecture does supports coherent mapping of memory, then it makes bus_dmamap_sync() calls cheap.
This flag is the same as the one in NetBSD's Bus DMA.
Reviewed by: gibbs, scottl, des (implicitly) Approved by: re@ (jhb)
|
115378 |
29-May-2003 |
marcel |
Move the sysctls of the misalignment handler to where they belong and use OID_AUTO instead of fixed IDs.
Approved by: re@ (blanket)
|
115376 |
29-May-2003 |
marcel |
Fix what I think is a cut-n-paste bug: use OID_AUTO for the print_usertrap sysctl instead of CPU_UNALIGNED_PRINT. The latter is used already.
Approved by: re@ (blanket)
|
115344 |
27-May-2003 |
marcel |
A flushrs must be the first in an instruction group.
Approved by: re@ (blanket)
|
115343 |
27-May-2003 |
scottl |
Bring back bus_dmasync_op_t. It is now a typedef to an int, though the BUS_DMASYNC_ definitions remain as before. The does not change the ABI, and reverts the API to be a bit more compatible and flexible. This has survived a full 'make universe'.
Approved by: re (bmah)
|
115342 |
27-May-2003 |
marcel |
Have the unwinder allocate memory with M_NOWAIT. The unwinder is used by DDB and we cannot know in advance whether it's save to sleep. It often enough isn't. We may want to pre-allocate space to cover the most common cases without having to use malloc at all, but that requires some analysis. We leave that for later.
Approved by: re@ (blanket)
|
115341 |
27-May-2003 |
marcel |
Fix fu{byte|word*} and su{byte|word*}: o If the address was not within user space we jumped to fusufault where we would clear pcb_onfault and return 0. There are two bugs here: 1. We never got to the point where we assigned the address of pcb_onfault to r15, which means that we would clobber some random memory location, including I/O space or ROM. 2. We're supposed to return -1 on error. o Make sure we have proper memory ordering for setting pcb_onfault, doing the memory access to user space and clearing pcb_onfault. For the fu* family of functions this means that we need a mf instruction, because we don't have acquire semantics on stores and release semantics on loads (hence st;ld cannot be ordered without intermediate mf).
While here, implement casuptr() so that we are a (small) step closer to supporting libthr and deobfuscate the non-implementation of {f|s}uswintr.
Approved by: re@ (blanket)
|
115339 |
26-May-2003 |
marcel |
Revision 1.99 of this file changed the allocation request from VM_ALLOC_INTERRUPT to VM_ALLOC_SYSTEM. There was no mention of this in commit log as it was considered harmless. Guess what: it does harm. WITNESS showed that we can not safely grab the page queue lock in vm_page_alloc() in all cases as we may have to sleep on it. Revert the request to VM_ALLOC_INTERRUPT to circumvent this. We panic if vm_page_alloc returns 0. I'm not entirely happy about this, but we have bigger fish to fry.
Approved by: re@ (blanket)
|
115298 |
25-May-2003 |
marcel |
Now that we define user mode as any IP address that isn't in the kernel's VA regions, we cannot limit the use of break-based syscalls to user mode only. The signal trampolines are in the gateway page, which is mapped into the process address space in region 5 and thus is kernel space.
We don't special case the gateway page here. Allow break-based syscalls from anywhere in the kernel VA space.
Approved by: re@ (blanket)
|
115296 |
24-May-2003 |
marcel |
Fix a source of instability specific to an EPC userland. We return to userland with interrupts disabled until we restore PSR. However, it has been observed that interrupts do actually happen before they are enabled again. This is a bit surprising and I don't know yet what's going on exactly. Nevertheless, the code was not crafted carefully enough to allow interrupts to happen and we could clobber the kernel stack of another thread when interrupts did happen.
This is what happens: we restore the (memory) stack pointer (sp) and the register stack base prior to restoring ar.k6 and ar.k7. This is not a problem if interrupts don't happen between setting sp/ar.bspstore and ar.k6/ar.k7. Alas, interrupts can happen. Since sp/ar.bspstore already point to the userland stacks, we need to switch to the kernel stack in interrupt. However, ar.k6 and ar.k7 have not been set, which means that we were switching to some unrelated kstack and happily clobbered the trapframe present there if the thread to which the kstack belonged was in kernel mode or otherwise we could have our trapframe clobbered if that other thread enters the kernel. Nasty either way.
We now carefully restore ar.k6 prior to restoring ar.bspstore and likewise for ar.k7 and sp. All we need is the guarantee that an interrupt does not clobber ar.k6 or ar.k7 before we're back in userland. That has been achieved by restoring ar.k6/ar.k7 unconditionally (see exception.s)
While here, remove the disabling of interrupts on EPC entry. It was added as a way to "resolve" the crashes until it was understood what was going on. I think I achieved the latter, so we can remove the patch. Note that setting up a trapframe with interrupts enabled has it's own share of corner cases, but it's better to properly fixed those than to keep a mostly wrong patch around because we're afraid to remove it...
Approved by: re@ (blanket)
|
115295 |
24-May-2003 |
marcel |
Be more careful how we restore interrupts. Don't rewrite most of the PSR only to achieve setting PSR.i back to it's previous value. It makes it impossible to change any of the 30+ other unrelated bits when done between intr_disable() and intr_restore(). That's bad.
Instead have intr_disable() return 1 when interrupts were previously enabled and 0 otherwise and only enable interrupts in intr_restore() when given a non-0 value.
This change specifically disallows using intr_restore() to disable interrupts. The reason is simple: interrupts only need to be restored after they are being disabled, which means that intr_restore() is called with interrupts disabled and we only need to enable them if they were previously enabled.
This change does not fix any bugs, other than that it bugged me...
Approved by: re@ (blanket)
|
115294 |
24-May-2003 |
marcel |
Consistently us the same metric to differentiate between kernel mode and user mode. We need to take into account that the EPC syscall path introduces a grey area in which one can argue either way, including a third: neither.
We now use the region in which the IP address lies. Regions 5, 6 and 7 are kernel VA regions and if the IP lies any any of those regions we assume we're in kernel mode. Hence, we can be in kernel mode even if we're not on the kernel stack and/or have user privileges. There're gremlins living in the twilight zone :-)
For the EPC syscall path this particularly means that the process leaves user mode the moment it calls into the gateway page. This makes the most sense because from a process' point of view the call represents a request to the kernel for some service and that service has been performed if the call returns. With the metric we picked, this also means that we're back in user mode IFF the call returns.
Approved by: re@ (blanket)
|
115291 |
24-May-2003 |
marcel |
Unconditionally restore ar.k7 (memory stack) and ar.k6 (register stack) when returning from an interrupt. Both registers are used on interrupt to switch to the right kernel stack, but other than that they are not used. This means we only have to make sure they contain proper values while in user mode. As such, we conditionally restored these registers based on whether we returned to userland or not. A nice property of conditionally restoring ar.k6 and ar.k7 is that it introduces two invariants: ar.k6 always points to the bottom of the kernel stack and ar.k7 always points to the top of the kernel stack (immediately below the PCB we have there).
However, the EPC syscall path introduces an irregularity: there's no "thin red line" between user and kernel. There's a grey area that's a couple of instructions wide. Any interruption in that grey area is bound to see an inconsistent state. One such state is that we're in kernel space for all practical purposes, but we still need to have ar.k6 and ar.k7 restored as if we're in userland.
Thus: restore ar.k6 and ar.k7 unconditionally at the cost of losing a valuable invariant. Both registers now hold the extend of the usable portion of the kernel stack at any interrupt nesting, which when in userland mean the bottom and the top of the kstack.
|
115276 |
24-May-2003 |
marcel |
Fix an alpha inheritance bug:
On alpha, PAL is involved in context management and after wiring the CPU (in alpha_init()) a context switch was performed to tell PAL about the context. This was bogusly brought over to ia64 where it introduced bugs, because we restored the context from a mostly uninitialized PCB.
The cleanup constitutes: o Remove the unused arguments from ia64_init(). o Don't return from ia64_init(), but instead call mi_startup() directly. This reduces the amount of muckery in assembly and also allows for the next bullet: o Save our currect context prior to calling mi_startup(). The reason for this is that many threads are created from thread0 by cloning the PCB. By saving our context in the PCB, we have something sane to clone. It also ensures that a cloned thread that does not alter the context in any way will return to the saved context, where we're ready for the eventuality with a nice, user unfriendly panic().
The cleanup fixes at least the following bugs: o Entering mi_startup() with the RSE in enforced lazy mode. o Re-execution of ia64_init() in certain "lab" conditions.
While here, add proper unwind directives to __start() so that the unwind knows it has reached the bottom of the (call) stack.
Approved by: re@ (blanket)
|
115274 |
23-May-2003 |
marcel |
Fix a (new) source of instability:
When interrupting a kernel context, we don't need to switch stacks (memory nor register). As such, we were also not restoring the register stack pointer (ar.bspstore). This, however, fails to be valid in 1 situation: when we interrupt a register stack switch as is being done in restorectx(). The problem is that restorectx() needs to have ar.bsp == ar.bspstore before it can assign the new value to ar.bspstore. This is achieved by doing a loadrs prior to assigning to ar.bspstore. If we take an interrupt in between the loadrs and the assignment and we don't make sure we restore the ar.bspstore prior to returning from the interrupt, we switch stacks with possibly non-zero dirty registers, which means that the new frame pointer (ar.bsp) will be invalid.
So, instead of jumping over the restoration of the register frame pointer and related registers, we conditionalize it based on whether we return to kernel context or user context. A future performance tweak is possible by only restoring ar.bspstore when returning to kernel mode *and* when the RSE is in enforced lazy mode. One cannot assume ar.bsp == ar.bspstore if the RSE is not in enforced lazy mode anyway.
While here (well, not quite) don't unconditionally assign to ar.bspstore in exception_save. Only do that when we actually switch stacks. It can only harm us to do it unconditionally.
Approved by: re@ (blanket)
|
115270 |
23-May-2003 |
marcel |
In swapctx(), put the RSE in enforced lazy mode before we flush the register stack. There's nothing really wrong with flushing before putting the RSE in enforced lazy mode, provided you don't depend on ar.bspstore being equal to ar.bsp when the RSE has been put in enforced lazy more. The small window between the flush and setting the RSE may be sufficient to have the RSE eagerly increase the dirty region (and hence cause ar.bspstore != ar.bsp) or have an interrupt that may even get the laziest RSE to do something.
Anyway: we don't depend on ar.bspstore being equal to ar.bsp, so nothing was and is broken. But the code was non-intuitive and easily confuses. This is a source of future bugs.
Note: the advantage of not depending on ar.bspstore is that there's some recilience against an interrupted flushrs. Clobbering is limited to stacked register contents only, not to RSE address clobbering.
Approved: re@ (blanket)
|
115179 |
20-May-2003 |
marcel |
o Fix a definite bogon: the dirty bity fault, instruction access failt and data access fault install the PTE in question into the VHPT table. However, a post-increment was missing and we wrote the raw PTE data into the pagesize/access key field. This leaves a corrupt VHPT entry. o While here, remove the explicit cache purge. Insertion into the translation implicitly purges any overlapping entries. o Make sure there's a cycle break between the itc and the rfi. o Whitespace fixes.
|
115178 |
20-May-2003 |
marcel |
Rename the "IA64 ITC" counter to "ITC" counter. We don't call the "TSC" counter on i386 "I386 TSC".
Approved by: re@ (blanket)
|
115176 |
20-May-2003 |
marcel |
Prevent corruption of the VHPT collision chain by protecting it with a mutex. The only volatile chain operations are insertion and deletion but since updating an existing PTE also updates the VHPT entry itself, and we have the VHPT mutex in both other cases, we also lock when we update an existing PTE even though no chain operation is involved. Note that we perform the insertion and deletion careful enough that we don't need to lock traversals. If we need to lock traversals, we also need to lock from the exception handler, which we can't without creating a trapframe.
We're now able to withstand a -j8 buildworld. More work is needed to withstand Murphy fields. In other words: we still have a bogon...
Approved by: re@ (blanket)
|
115164 |
19-May-2003 |
kan |
sys/sys/limits.h:
- Fix visibilty test for LONG_BIT and WORD_BIT. `#if defined(__FOO_VISIBLE)' is alays wrong because __FOO_VISIBLE is always defined (to 0 for invisibility).
sys/<arch>/include/limits.h sys/<arch>/include/_limits.h:
- Style fixes.
Submitted by: bde Reviewed by: bsdmike Approved by: re (scottl)
|
115152 |
19-May-2003 |
marcel |
Turn pmap_install_pte() into a critical section. We better not get interrupted while writing into the VHPT table. While here, make sure memory accesses a properly ordered. Tag invalidation must happen first so that the hardware VHPT walker will not be able to match this entry while we're updating it and we have to make sure the new new tag gets written only after the PTE is completely updated.
Approved by: re (blanket)
|
115149 |
19-May-2003 |
marcel |
Unconditionally set pcb_current_pmap. WIP versions of the code previously committed cleared pcb_current_pmap prior to changing the region registers, but that was removed before committing. Since we don't normally (at all?) pass a NULL pointer, the bug was mostly harmless. Fix it while I'm here...
I'm here because we need to have data serialization after writing to the region registers. Not doing so was likely the cause of the hangs we were experiencing. General exceptions in cpu_switch may also be caused by the lack of serialization.
Approved by: re (blanket)
|
115148 |
19-May-2003 |
marcel |
pmap_install() needs to be atomic WRT to context switching. Protect switching user regions (region 0-4) with schedlock. Avoid unnecessary recursion on schedlock by moving the core functionality to another function (pmap_switch()) where we assert schedlock is held. Turn pmap_install() into a wrapper that grabs schedlock. This minimizes the number of callsites that need to be changed. Since we already have schedlock in cpu_switch() and cpu_throw(), have them call pmap_switch() directly. These were also the only two calls to pmap_install() outside pmap.c, so make pmap_install() static and remove its prototype from pmap.h
Approved by: re (blanket)
|
115094 |
17-May-2003 |
marcel |
Remove unused files. cpu_switch() and cpu_throw(), normally in swtch.s, can be found in machdep.c.
Approved: re@
|
115084 |
16-May-2003 |
marcel |
Revamp of the syscall path, exception and context handling. The prime objectives are: o Implement a syscall path based on the epc inststruction (see sys/ia64/ia64/syscall.s). o Revisit the places were we need to save and restore registers and define those contexts in terms of the register sets (see sys/ia64/include/_regset.h).
Secundairy objectives: o Remove the requirement to use contigmalloc for kernel stacks. o Better handling of the high FP registers for SMP systems. o Switch to the new cpu_switch() and cpu_throw() semantics. o Add a good unwinder to reconstruct contexts for the rare cases we need to (see sys/contrib/ia64/libuwx)
Many files are affected by this change. Functionally it boils down to: o The EPC syscall doesn't preserve registers it does not need to preserve and places the arguments differently on the stack. This affects libc and truss. o The address of the kernel page directory (kptdir) had to be unstaticized for use by the nested TLB fault handler. The name has been changed to ia64_kptdir to avoid conflicts. The renaming affects libkvm. o The trapframe only contains the special registers and the scratch registers. For syscalls using the EPC syscall path no scratch registers are saved. This affects all places where the trapframe is accessed. Most notably the unaligned access handler, the signal delivery code and the debugger. o Context switching only partly saves the special registers and the preserved registers. This affects cpu_switch() and triggered the move to the new semantics, which additionally affects cpu_throw(). o The high FP registers are either in the PCB or on some CPU. context switching for them is done lazily. This affects trap(). o The mcontext has room for all registers, but not all of them have to be defined in all cases. This mostly affects signal delivery code now. The *context syscalls are as of yet still unimplemented.
Many details went into the removal of the requirement to use contigmalloc for kernel stacks. The details are mostly CPU specific and limited to exception_save() and exception_restore(). The few places where we create, destroy or switch stacks were mostly simplified by not having to construct physical addresses and additionally saving the virtual addresses for later use.
Besides more efficient context saving and restoring, which of course yields a noticable speedup, this also fixes the dreaded SMP bootup problem as a side-effect. The details of which are still not fully understood.
This change includes all the necessary backward compatibility code to have it handle older userland binaries that use the break instruction for syscalls. Support for break-based syscalls has been pessimized in favor of a clean implementation. Due to the overall better performance of the kernel, this will still be notived as an improvement if it's noticed at all.
Approved by: re@ (jhb)
|
115063 |
16-May-2003 |
marcel |
o In pmap_install, don't prevent switching the pmap if we're switching to kernel_pmap. The pmap is not special enough. o Clear the active bit on the pmap we're switching out. o Fix some nearby style(9) bugs.
Approved by: re@
|
115059 |
16-May-2003 |
marcel |
Indent a comment. This makes 1.100.
Still approved by: re@ (blanket)
|
115058 |
16-May-2003 |
marcel |
Turn pmap_growkernel() into a critical section. While here, initialize kernel_vm_end in pmap_bootstrap. Don't delay the initialization until we need to grow the kernel VM space. This BTW happens twice before we enter either single- or multi-user mode. Don't adjust kernel_vm_end while growing based on whether the KPT contains a non-NULL entry. We trust kernel_vm_end to be correct and we make sure it's still correct after growing. Define virtual_avail and virtual_end in terms of VM_MIN_KERNEL_ADDRESS and VM_MAX_KERNEL_ADDRESS (resp). Don't hardcode region knowledge.
|
115057 |
16-May-2003 |
marcel |
Revamp the RID allocation code: o Limit the size of the region ID map to 64KB. This gives a bitmap that is large enough to keep track of 2^19 numbers. The minimal map size is 32KB. The reason we limit the map size is that processor models may have implemented a 24-bit region ID, which would give a 2MB bitmap while the maximum number of allocations is always less than PID_MAX*5, which is less than 2^19. o Allocate all region IDs up-front. The slight downside of reserving more RIDs then a process needs (3 for ia64 native and 1 for ia32) is preferable over the call to pmap_ensure_rid() where RIDs are allocated on demand. On SMP systems this may lead to a race condition. o When allocating a region ID, don't use arc4random(). We're not interested in randomness or uniform distribution across the spectrum. We only need uniqueness. Random numbers may easily collide when the number of allocated RIDs is high, creating a possibly unbounded retry rate.
|
115056 |
16-May-2003 |
marcel |
Move the conditional definition of KSTACK_MAX_PAGES up ahead where it's more visible.
Approved by: re@ (blanket)
|
115019 |
15-May-2003 |
marcel |
This file creates register sets based on the runtime specification. The advantage of using register sets is that you don't focus on each register seperately, but instead instroduce a level of abstraction. This reduces the chance of errors, and also simplifies the code. The register sers form the basis of everything register. The sets in this file are:
struct _special contains all of the control related registers, such as instruction pointer and stack pointer. It also contains interrupt specific registers like the faulting address. The set is roughly split in 3 groups. The first contains the registers that define a context or thread. This is the only group that the kernel needs to switch threads. The second group contains registers needed in addition to the first group needed to switch userland threads. This group contains the thread pointer and the FP control register. The third group contains those registers we need for execption handling and are used on top of the first two groups.
struct _callee_saved, struct _callee_saved_fp These sets contain the preserved registers, including the NaT after spilling. The general registers (including branch registers) are seperated from the FP registers for ptrace(2).
struct _caller_saved, struct _caller_saved_fp These sets contain the scratch registers based on SDM 2.1, This means that both ar.csd and ar.ccd are included here, even though they contain ia32 segment register descriptions. We keep seperate NaT bits for scratch and preserved registers, because they are never saved/restored at the same time.
struct _high_fp The upper 96 FP registers that can be enabled/disabled seperately on the CPU from the lower 32 FP registers. Due to the size of this set, we treat them specially, even though they are defined as scratch registers.
CVS ----------------------------------------------------------------------
|
115018 |
15-May-2003 |
marcel |
This file contains elementary context related functions used to save and restore "sets" of registers in various places. The restorectx and swapctx functions are used by cpu_switch() and deal with the special registers, as well as the preserved registers. The *callee_saved* functions are used to save and restore the preserved registers (integer and floating-point). They are useful for signal delivery and ptrace support. The save_high_fp and restore_high_fp functions are used to "load" and "unload" to and from the CPU as part of lazy context switching. The ia32 specific context functions have been kept with the ia32 code.
Approved by: re@ (blanket)
|
115017 |
15-May-2003 |
marcel |
This file contains the code that implements the syscall path based on the epc instruction. The epc instruction, given the permissions of the page in which the epc is located, allows the privilege level to be increased with little or no overhead. The previous privilege level is recorded in the current frame marker and is restored by a regular (function) return. Since the epc instruction has to live in a page with non-standard properties, we hardwire a "gateway" page in the address space. The address of the gateway page is exported to userland in ar.k7. This allows us to rewire the page without breaking the ABI. The syscall stubs in libc are regular function calls that slightly differ from the normal runtime. The difference is mostly to simplify the stubs themselves by by moving some of the logic to the kernel. The libc stubs call into the gateway page (offset 0), from where the kernel trampolines to the code that sets up a minimal trapframe and arranges to execute from the kernel stack. The way back is basicly the same. The kernel returns to the gateway page, whereby privilege is dropped, and jumps back to the syscall stub. Only the special registers are saved in the trapframe. None of the scratch registers are preserved and since the kernel follows the same runtime model, none of the preserved registers are saved. Future enhancements can include the implementation of lightweight syscalls, where kernel functions are performed without setting up a trapframe. Good candidates are the *context syscalls for example.
Now that there's a gateway page from which code can be executed in a non-privileged context, we also have the ideal place to put the signal trampolines. By moving the signal trampolines from the user stack to the gateway page, we open up the doors to unexecutable stacks. The gateway page contains signal trampolines for both the "legacy" break-based syscall code and the new and improved epc- based syscall code.
Approved: re@ (blanket)
|
114983 |
13-May-2003 |
jhb |
- Merge struct procsig with struct sigacts. - Move struct sigacts out of the u-area and malloc() it using the M_SUBPROC malloc bucket. - Add a small sigacts_*() API for managing sigacts structures: sigacts_alloc(), sigacts_free(), sigacts_copy(), sigacts_share(), and sigacts_shared(). - Remove the p_sigignore, p_sigacts, and p_sigcatch macros. - Add a mutex to struct sigacts that protects all the members of the struct. - Add sigacts locking. - Remove Giant from nosys(), kill(), killpg(), and kern_sigaction() now that sigacts is locked. - Several in-kernel functions such as psignal(), tdsignal(), trapsignal(), and thread_stopped() are now MP safe.
Reviewed by: arch@ Approved by: re (rwatson)
|
114678 |
04-May-2003 |
kan |
Style fixes. Remove DBL_DIG, DBL_MIN, DBL_MAX and their FLT_ counterparts, they were marked for deprecation ever since SUSv1 at least. Only define ULLONG_MIN/MAX and LLONG_MAX if long long type is supported. Restore a lost comment in MI _limits.h file and remove it from sys/limits.h where it does not belong.
|
114616 |
03-May-2003 |
marcel |
Fix c99 victim: the accepted character '0 most now be types as '0'.
|
114553 |
02-May-2003 |
marcel |
Option KADB does not exist. It came from alpha, where it still exists.
|
114347 |
30-Apr-2003 |
marcel |
Kill MID_MACHINE, its a.out specific, the only platform that supports it is i386. All of the other platforms should remove it too. -- peter@
|
114305 |
30-Apr-2003 |
jhb |
Range check the syscall number before looking it up in the syscallnames[] array.
Submitted by: pho
|
114216 |
29-Apr-2003 |
kan |
Deprecate machine/limits.h in favor of new sys/limits.h. Change all in-tree consumers to include <sys/limits.h>
Discussed on: standards@ Partially submitted by: Craig Rodrigues <rodrigc@attbi.com>
|
114208 |
29-Apr-2003 |
marcel |
Revamp the newbus functions: o do not use the in* and out* functions. These functions are used by legacy drivers and thus must have ia32 compatible behaviour. Hence, they need to have fences. Using these functions for newbus would then pessimize performance. o remove the conditional compilation of PIO and/or MEMIO support. It's a PITA without having any significant benefit. We always support them both. Since there are no I/O ports on ia64 (they are simulated by the chipset by translating memory mapped I/O to predefined uncacheable memory regions) the only difference between PIO and MEMIO is in the address calculation. There should be enough ILP that can be exploited here that making these computations compile-time conditional is not worth it. We now also don't use the read* and write* functions. o Add the missing *_8 variants. They were missing, although not missed. It's for completeness. o Do not add the fences that were present in the low-level support functions here. We're using uncacheable memory, which means that accesses are in program order. Change the barrier implementation to not only do a memory fence, but also an acceptance fence. This should more reliably synchronize drivers with the hardware. The memory fence enforces ordering, but does not imply visibility (ie the access does not necessarily have happened). This is what the acceptance deals with.
cpufunc.h cleanup: o Remove the low-level memory mapped I/O support functions. They are not used. Keep the low-level I/O port access functions for legacy drivers and add fences to ensure ia32 compatibility. o Remove the syscons specific functions now that we have moved the proper definitions where they belong. o Replace the ia64_port_address() and ia64_memory_address() functions with macros. There's a bigger change inline functions get inlined when there aren't function callsi and the calculations are simply enough to do it with macros.
Replace the one reference to ia64_memory address in mp_machdep.c to use the macro.
|
114029 |
25-Apr-2003 |
jhb |
- Push down Giant into the sysarch() calls that still need Giant. - Standardize on EINVAL rather than EOPNOTSUPP if the sysarch op value is invalid.
|
114017 |
25-Apr-2003 |
jhb |
Regen.
|
114016 |
25-Apr-2003 |
jhb |
Oops, the thr_* and jail_attach() syscall entries should be NOPROTO rather than STD.
|
113998 |
25-Apr-2003 |
deischen |
Add an argument to get_mcontext() which specified whether the syscall return values should be cleared. The system calls getcontext() and swapcontext() want to return 0 on success but these contexts can be switched to at a later time so the return values need to be cleared in the saved register sets. Other callers of get_mcontext() would normally want the context without clearing the return values.
Remove the i386-specific context saving from the KSE code. get_mcontext() is not i386-specific any more.
Fix a bad pointer in the alpha get_mcontext() code. The context was being bcopy()'d from &td->tf_frame, but tf_frame is itself a pointer, so the thread was being copied instead. Spotted by jake.
Glanced at by: jake Reviewed by: bde (months ago)
|
113989 |
24-Apr-2003 |
jhb |
Regen.
|
113987 |
24-Apr-2003 |
jhb |
Fix the thr_create() entry by adding a trailing \. Also, sync up the MP safe flag for thr_* with the main table.
|
113941 |
23-Apr-2003 |
kan |
Add a new sys/limits.h file which in turn depends on machine/_limits.h to get actual constant values. This is in preparation for machine/limits.h retirement.
Discussed on: standards@ Submitted by: Craig Rodrigues <rodrigc@attbi.com> (*) Modified by: kan
|
113859 |
22-Apr-2003 |
jhb |
- Replace inline implementations of sigprocmask() with calls to kern_sigprocmask() in the various binary compatibility emulators. - Replace calls to sigsuspend(), sigaltstack(), sigaction(), and sigprocmask() that used the stackgap with calls to the corresponding kern_sig*() functions instead without using the stackgap.
|
113833 |
22-Apr-2003 |
davidxu |
Remove single threading detecting code, these code really should be replaced by thread_user_enter(), but current we don't want to enable this in trap.
|
113831 |
22-Apr-2003 |
marcel |
Don't use the tpa instruction to implement pmap_kextract. The tpa instruction requires that a translation is present in the TC. This may trigger a TLB miss and a subsequent call to vm_fault(). This implementation is deliberately non-inline for debugging and profiling purposes. Partial or full inlining should eventually be done.
Valuable insights by: jake
|
113803 |
21-Apr-2003 |
simokawa |
Add FireWire drivers to GENERIC.
|
113686 |
18-Apr-2003 |
jhb |
Use the proc lock to protect p_singlethread and a P_WEXIT test. This fixes a couple of potential KSE panics on non-i386 arch's that weren't holding the proc lock when calling thread_exit().
|
113540 |
16-Apr-2003 |
marcel |
Add the EHCI host controller.
|
113350 |
10-Apr-2003 |
mux |
I deserve a big pointy hat for having missed all those references to bus_dmasync_op_t in my last commit.
|
113347 |
10-Apr-2003 |
mux |
Change the operation parameter of bus_dmamap_sync() from an enum to an int and redefine the BUS_DMASYNC_* constants as flags. This allows us to specify several operations in one call to bus_dmamap_sync() as in NetBSD.
|
113275 |
09-Apr-2003 |
mike |
o In struct prison, add an allprison linked list of prisons (protected by allprison_mtx), a unique prison/jail identifier field, two path fields (pr_path for reporting and pr_root vnode instance) to store the chroot() point of each jail. o Add jail_attach(2) to allow a process to bind to an existing jail. o Add change_root() to perform the chroot operation on a specified vnode. o Generalize change_dir() to accept a vnode, and move namei() calls to callers of change_dir(). o Add a new sysctl (security.jail.list) which is a group of struct xprison instances that represent a snapshot of active jails.
Reviewed by: rwatson, tjr
|
113255 |
08-Apr-2003 |
des |
Introduce an M_ASSERTPKTHDR() macro which performs the very common task of asserting that an mbuf has a packet header. Use it instead of hand- rolled versions wherever applicable.
Submitted by: Hiten Pandya <hiten@unixdaemons.com>
|
113247 |
08-Apr-2003 |
marcel |
Remove COMPAT_FREEBSD4. It's impossible because FreeBSD 4 does not run on ia64 at all.
|
113181 |
06-Apr-2003 |
marcel |
Remove the 32KB VHPT section from the kernel image. We don't really use it because we allocate a VHPT based on the size of the physical memory and even if the allocated VHPT is 32KB, we don't use the in- image section for it. Since the VHPT must be naturally aligned, we save 48K on average (due to alignment). Consequently, we start off with the VHPT disabled (it is assumed the VHPT is disabled because the EFI loader runs without memory address translation and thus has no need to setup the VHPT). It's probably a good idea to explicitly disable the VHPT if we make the use of the VHPT optional.
|
113160 |
06-Apr-2003 |
marcel |
Also set the access bit in the PTE when we get a data dirty bit fault. This avoids an immediate access bit fault when we serviced the dirty bit fault in case the access bit is unset. This typically happens for newly allocated memory that's being zeroed and thus very common.
|
113140 |
05-Apr-2003 |
marcel |
Include <geom/geom_disk.h> and stop including <sys/disk.h>. The former gives us 'struct disk'.
|
113090 |
04-Apr-2003 |
des |
Define ovbcopy() as a macro which expands to the equivalent bcopy() call, to take care of the KAME IPv6 code which needs ovbcopy() because NetBSD's bcopy() doesn't handle overlap like ours.
Remove all implementations of ovbcopy().
Previously, bzero was a function pointer on i386, to save a jmp to bzero_vector. Get rid of this microoptimization as it only confuses things, adds machine-dependent code to an MD header, and doesn't really save all that much.
This commit does not add my pagezero() / pagecopy() code.
|
112946 |
01-Apr-2003 |
phk |
Use bioq_flush() to drain a bio queue with a specific error code. Retain the mistake of not updating the devstat API for now.
Spell bioq_disksort() consistently with the remaining bioq_*().
#include <geom/geom_disk.h> where this is more appropriate.
|
112908 |
01-Apr-2003 |
jeff |
- Add thr and umtx system calls.
|
112898 |
01-Apr-2003 |
jeff |
- Define a new md function 'casuptr'. This atomically compares and sets a pointer that is in user space. It will be used as the basic primitive for a kernel supported user space lock implementation. - Implement this function in x86's support.s - Provide stubs that return -1 in all other architectures. Implementations will follow along shortly.
Reviewed by: jake
|
112896 |
31-Mar-2003 |
jeff |
- Add a placeholder for sigwait
|
112888 |
31-Mar-2003 |
jeff |
- Move p->p_sigmask to td->td_sigmask. Signal masks will be per thread with a follow on commit to kern_sig.c - signotify() now operates on a thread since unmasked pending signals are stored in the thread. - PS_NEEDSIGCHK moves to TDF_NEEDSIGCHK.
|
112883 |
31-Mar-2003 |
jeff |
- Change trapsignal() to accept a thread and not a proc. - Change all consumers to pass in a thread.
Right now this does not cause any functional changes but it will be important later when signals can be delivered to specific threads.
|
112882 |
31-Mar-2003 |
jeff |
- Use sigexit() instead of twiddling the signal mask, catch, ignore, and action bits to allow SIGILL to work as expected. This brings this file in line with other architectures.
|
112721 |
27-Mar-2003 |
das |
Correct LDBL_* constants based on values from i386.
|
112569 |
25-Mar-2003 |
jake |
- Add vm_paddr_t, a physical address type. This is required for systems where physical addresses larger than virtual addresses, such as i386s with PAE. - Use this to represent physical addresses in the MI vm system and in the i386 pmap code. This also changes the paddr parameter to d_mmap_t. - Fix printf formats to handle physical addresses >4G in the i386 memory detection code, and due to kvtop returning vm_paddr_t instead of u_long.
Note that this is a name change only; vm_paddr_t is still the same as vm_offset_t on all currently supported platforms.
Sponsored by: DARPA, Network Associates Laboratories Discussed with: re, phk (cdevsw change)
|
112498 |
22-Mar-2003 |
ru |
Remove bitrot associated with `maxusers'.
Submitted by: bde
|
112436 |
20-Mar-2003 |
mux |
Use atomic operations to increment and decrement the refcount in busdma tags. There are currently no tags shared accross different drivers so this isn't needed at the moment, but it will be required when we'll have a proper newbus method to get the parent busdma tag.
|
112312 |
16-Mar-2003 |
jake |
Made the prototypes for pmap_kenter and pmap_kremove MD. These functions are machine dependent because they are not required to update the tlb when mappings are added or removed, and doing so is machine dependent. In addition, an implementation may require that pages mapped with pmap_kenter have a backing vm_page_t, which is not necessarily true of all physical pages, and so may choose to pass the vm_page_t to pmap_kenter instead of the physical address in order to make this requirement clear.
|
112235 |
14-Mar-2003 |
mux |
Bah, get it right this time and add sys/lock.h before sys/mutex.h.
|
112215 |
14-Mar-2003 |
mux |
Oops, add missing includes. Pass me the pointy hat.
Reported by: jake
|
112196 |
13-Mar-2003 |
mux |
Grab Giant around calls to contigmalloc() and contigfree() so that drivers converted to be MP safe don't have to deal with it.
|
112195 |
13-Mar-2003 |
mux |
Memory allocated with contigmalloc() should be freed with contigfree(), not with free().
|
112051 |
10-Mar-2003 |
marcel |
Fix two rounds of breakages and cleanup. Remove the sccdebug sysctl while I'm here and garbage collect dead code (ssc_clone). Define d_maxsize as DFLTPHYS for now because that's what it will be if we don't define it.
|
111979 |
08-Mar-2003 |
phk |
Centralize the devstat handling for all GEOM disk device drivers in geom_disk.c.
As a side effect this makes a lot of #include <sys/devicestat.h> lines not needed and some biofinish() calls can be reduced to biodone() again.
|
111897 |
05-Mar-2003 |
marcel |
Fix threaded applications on ia64 that are linked dynamicly. We did not save (restore) the global pointer (GP) in the jmpbuf in setjmp (longjmp) because it's not needed in general. GP is considered a scratch register at callsites and hence is always restored after a call (when it's possible that the call resolves to a symbol in a different loadmodule; otherwise GP does not have to be saved and restored at all), including calls to setjmp/longjmp. There's just one problem with this now that we use setjmp/longjmp for context switching: A new context must have GP defined properly for the thread's entry point. This means that we need to put GP in the jmpbuf and consequently that we have to restore is in longjmp. This automaticly requires us to save it as well.
When setjmp/longjmp isn't used for context switching, this can be reverted again.
|
111894 |
05-Mar-2003 |
marcel |
ABI breaker: Move the J_SIGMASK field in the jmpbuf before the J_SIG0 field. While here, rename J_SIG0 to J_SIGSET and remove J_SIG1. The main reason for this change is that the 128-bit sigset_t is now aligned on a 16-byte boundary, which allows us to use 16-byte atomic loads and stores on CPUs that support it. The removal of J_SIG1 is done to avoid confusion: it is never accessed and should not be. Renaming J_SIG0 to J_SIGSET is the icing on the cake that's better done now than later.
|
111883 |
04-Mar-2003 |
jhb |
Replace calls to WITNESS_SLEEP() and witness_list() with equivalent calls to WITNESS_WARN().
|
111815 |
03-Mar-2003 |
phk |
Gigacommit to improve device-driver source compatibility between branches:
Initialize struct cdevsw using C99 sparse initializtion and remove all initializations to default values.
This patch is automatically generated and has been tested by compiling LINT with all the fields in struct cdevsw in reverse order on alpha, sparc64 and i386.
Approved by: re(scottl)
|
111697 |
01-Mar-2003 |
alc |
MFi386 revision 1.88 Remove some long unused declarations.
|
111587 |
27-Feb-2003 |
davidxu |
Needn't kse.h
|
111585 |
27-Feb-2003 |
julian |
Change the process flags P_KSES to be P_THREADED. This is just a cosmetic change but I've been meaning to do it for about a year.
|
111524 |
26-Feb-2003 |
mux |
Correctly set BUS_SPACE_MAXSIZE in all the busdma backends. It was bogusly set to 64 * 1024 or 128 * 1024 because it was bogusly reused in the BUS_DMAMAP_NSEGS definition.
|
111462 |
25-Feb-2003 |
mux |
Cleanup of the d_mmap_t interface.
- Get rid of the useless atop() / pmap_phys_address() detour. The device mmap handlers must now give back the physical address without atop()'ing it. - Don't borrow the physical address of the mapping in the returned int. Now we properly pass a vm_offset_t * and expect it to be filled by the mmap handler when the mapping was successful. The mmap handler must now return 0 when successful, any other value is considered as an error. Previously, returning -1 was the only way to fail. This change thus accidentally fixes some devices which were bogusly returning errno constants which would have been considered as addresses by the device pager. - Garbage collect the poorly named pmap_phys_address() now that it's no longer used. - Convert all the d_mmap_t consumers to the new API.
I'm still not sure wheter we need a __FreeBSD_version bump for this, since and we didn't guarantee API/ABI stability until 5.1-RELEASE.
Discussed with: alc, phk, jake Reviewed by: peter Compile-tested on: LINT (i386), GENERIC (alpha and sparc64) Runtime-tested on: i386
|
111194 |
20-Feb-2003 |
phk |
Change the console interface to pass a "struct consdev *" instead of a dev_t to the method functions.
The dev_t can still be found at struct consdev *->cn_dev.
Add a void *cn_arg element to struct consdev which the drivers can use for retrieving their softc.
|
111119 |
19-Feb-2003 |
imp |
Back out M_* changes, per decision of the TRB.
Approved by: trb
|
111036 |
17-Feb-2003 |
julian |
Fix missed patch in last commit
|
111032 |
17-Feb-2003 |
julian |
Move a bunch of flags from the KSE to the thread. I was in two minds as to where to put them in the first case.. I should have listenned to the other mind.
Submitted by: parts by davidxu@ Reviewed by: jeff@ mini@
|
111031 |
17-Feb-2003 |
marcel |
Define _ALIGNBYTES to be 15. This should have been done right away.
|
111030 |
17-Feb-2003 |
marcel |
Print two new processor features: o Spontaneous deferral (A feature required by dutch railways :-) o 16-byte atomic operations (ld, st, cmpxchg)
|
111028 |
17-Feb-2003 |
jeff |
- Split the struct kse into struct upcall and struct kse. struct kse will soon be visible only to schedulers. This greatly simplifies much the KSE code.
Submitted by: davidxu
|
111024 |
17-Feb-2003 |
jeff |
- Move ke_sticks, ke_iticks, ke_uticks, ke_uu, ke_su, and ke_iu back into the proc. These counters are only examined through calcru.
Submitted by: davidxu Tested on: x86, alpha, UP/SMP
|
111002 |
16-Feb-2003 |
phk |
Remove #include <sys/dkstat.h>
|
110959 |
15-Feb-2003 |
marcel |
Fix misuse of Maxmem in the calculation of the VHPT size. Maxmem is already in pages, so we should not convert from bytes to pages. The result of this bug was bad scaling of the VHPT relative to the available memory.
Submitted by: Arun Sharma <arun@sharma-home.net>
|
110831 |
13-Feb-2003 |
obrien |
Fix the style of the SCHED_4BSD commit.
|
110784 |
13-Feb-2003 |
alc |
MFi386 Remove kptobj. Instead, use VM_ALLOC_NOOBJ.
|
110566 |
08-Feb-2003 |
mike |
Implement fpclassify(): o Add a MD header private to libc called _fpmath.h; this header contains bitfield layouts of MD floating-point types. o Add a MI header private to libc called fpmath.h; this header contains bitfield layouts of MI floating-point types. o Add private libc variables to lib/libc/$arch/gen/infinity.c for storing NaN values. o Add __double_t and __float_t to <machine/_types.h>, and provide double_t and float_t typedefs in <math.h>. o Add some C99 manifest constants (FP_ILOGB0, FP_ILOGBNAN, HUGE_VALF, HUGE_VALL, INFINITY, NAN, and return values for fpclassify()) to <math.h> and others (FLT_EVAL_METHOD, DECIMAL_DIG) to <float.h> via <machine/float.h>. o Add C99 macro fpclassify() which calls __fpclassify{d,f,l}() based on the size of its argument. __fpclassifyl() is never called on alpha because (sizeof(long double) == sizeof(double)), which is good since __fpclassifyl() can't deal with such a small `long double'.
This was developed by David Schultz and myself with input from bde and fenner.
PR: 23103 Submitted by: David Schultz <dschultz@uclink.Berkeley.EDU> (significant portions) Reviewed by: bde, fenner (earlier versions)
|
110335 |
04-Feb-2003 |
harti |
Fix a problem in bus_dmamap_load_{mbuf,uio} when the first mbuf or the first uio segment is empty. In this case no dma segment is create by bus_dmamap_load_buffer, but the calling routine clears the first flag. Under certain combinations of addresses of the first and second mbuf/uio buffer this leads to corrupted DMA segment descriptors. This was already fixed by tmm in sparc64/sparc64/iommu.c.
PR: kern/47733 Reviewed by: sam Approved by: jake (mentor)
|
110296 |
03-Feb-2003 |
jake |
Split statclock into statclock and profclock, and made the method for driving statclock based on profhz when profiling is enabled MD, since most platforms don't use this anyway. This removes the need for statclock_process, whose only purpose was to subdivide profhz, and gets the profiling clock running outside of sched_lock on platforms that implement suswintr. Also changed the interface for starting and stopping the profiling clock to do just that, instead of changing the rate of statclock, since they can now be separate.
Reviewed by: jhb, tmm Tested on: i386, sparc64
|
110255 |
03-Feb-2003 |
marcel |
Don't use the 'c' partition for mounting root. A disklabel is very likely not present under the simulator. If multiple partitions are present on the virtual disk, then the 'a' partition would be the most logical choice. Nowadays partitions are GPT based, which would make the assumption of a disklabel even more questionable. Given all the possible scenarios, assuming a raw "device" seems best.
|
110232 |
02-Feb-2003 |
alfred |
Consolidate MIN/MAX macros into one place (param.h).
Submitted by: Hiten Pandya <hiten@unixdaemons.com>
|
110229 |
02-Feb-2003 |
phk |
We don't need sscopen() and sscclose().
Register sscstrategy directly, instead of using a cdevsw{} for the purpose.
Tested by: marcel
|
110227 |
02-Feb-2003 |
marcel |
Export IA32 from opt_ia32.h to assembly so that we can eliminate saving and restoring ia32 specific registers when switching context and ia32 support has not been compiled-in. The primary reason for this change is that one of the ia32 registers (ar.fcr) is wrongly marked as invalid by the simulator. Now that we avoid using the register when possible, usability is improved. The secundary reason is that it saves us 7 loads and stores.
Note that the PCB will continue to have room for these registers, irrespective of the IA32 option. There are no benefits that make it worthwhile.
|
110211 |
01-Feb-2003 |
marcel |
Remove special casing for running in the simulator from the kernel and instead add platform, firmware and EFI stubs to the loader. The net effect of this change is that besides a special console and disk driver, the kernel has no knowledge of the simulator. This has the following advantages: o Simulator support is much harder to break, o It's easier to make use of more feature complete simulators. This would only need a change in the simulator specific loader, o Running SMP kernels within the simulator. Note that ski at this time does not simulate IPIs, so there's no way to start APs.
The platform, firmware and EFI stubs describe the following hardware: o 4 CPU Itanium, o 128 MB RAM within the 4GB address space, o 64 MB RAM above the 4GB address space.
NOTE: The stubs in the skiloader describe a machine that should in parts be defined by the simulator. Things like processor interrupt block and AP wakeup vector cannot be choosen at random because they require interpretation by the simulator. Currently the simulator is ignorant of this.
This change introduces an unofficial SSC call SSC_SAL_SET_VECTORS which is ignored by the simulator.
Tested with: ski (version 0.943 for linux)
|
110202 |
01-Feb-2003 |
joe |
Put replace spaces with tabs in keeping with the rest of the file.
|
110190 |
01-Feb-2003 |
julian |
Reversion of commit by Davidxu plus fixes since applied.
I'm not convinced there is anything major wrong with the patch but them's the rules..
I am using my "David's mentor" hat to revert this as he's offline for a while.
|
110084 |
30-Jan-2003 |
phk |
Remove D_CANFREE from sscdisk.c.
I belive it got here by copy&paste and I see no signs in the source code that BIO_DELETE was dealt with correctly and can only wonder what kind of trouble this may have caused.
|
109911 |
27-Jan-2003 |
julian |
Unbreak SMP cases for these architectures. statclock_process() changed arguments. note: it may be worth checking if curkse is needed on these architectures.. (and if so, why?)
|
109877 |
26-Jan-2003 |
davidxu |
Move UPCALL related data structure out of kse, introduce a new data structure called kse_upcall to manage UPCALL. All KSE binding and loaning code are gone.
A thread owns an upcall can collect all completed syscall contexts in its ksegrp, turn itself into UPCALL mode, and takes those contexts back to userland. Any thread without upcall structure has to export their contexts and exit at user boundary.
Any thread running in user mode owns an upcall structure, when it enters kernel, if the kse mailbox's current thread pointer is not NULL, then when the thread is blocked in kernel, a new UPCALL thread is created and the upcall structure is transfered to the new UPCALL thread. if the kse mailbox's current thread pointer is NULL, then when a thread is blocked in kernel, no UPCALL thread will be created.
Each upcall always has an owner thread. Userland can remove an upcall by calling kse_exit, when all upcalls in ksegrp are removed, the group is atomatically shutdown. An upcall owner thread also exits when process is in exiting state. when an owner thread exits, the upcall it owns is also removed.
KSE is a pure scheduler entity. it represents a virtual cpu. when a thread is running, it always has a KSE associated with it. scheduler is free to assign a KSE to thread according thread priority, if thread priority is changed, KSE can be moved from one thread to another.
When a ksegrp is created, there is always N KSEs created in the group. the N is the number of physical cpu in the current system. This makes it is possible that even an userland UTS is single CPU safe, threads in kernel still can execute on different cpu in parallel. Userland calls kse_create to add more upcall structures into ksegrp to increase concurrent in userland itself, kernel is not restricted by number of upcalls userland provides.
The code hasn't been tested under SMP by author due to lack of hardware.
Reviewed by: julian
|
109865 |
26-Jan-2003 |
jeff |
- Introduce the SCHED_ULE and SCHED_4BSD options for compile time selection of the scheduler. - Add SCHED_4BSD as the scheduler for all kernel config files in cvs.
|
109799 |
24-Jan-2003 |
dfr |
Fix pmap_extract so that it doesn't panic if the user types 'cat /proc/pid/map'
Submitted by: Arun Sharma <arun.sharma@intel.com>
|
109623 |
21-Jan-2003 |
alfred |
Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0. Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.
|
109615 |
21-Jan-2003 |
jeff |
- Add a VM_WAIT in the appropriate cases where vm_page_alloc() fails and flags indicate that uma_small_alloc should not. This code should be refactored so that there is not so much cross arch duplication.
Reviewed by: jake Spotted by: tmm Tested on: alpha, sparc64 Pointy hat to: jeff and everyone who cut and pasted the bad code. :-)
|
109605 |
21-Jan-2003 |
jake |
Resolve relative relocations in klds before trying to parse the module's metadata. This fixes module dependency resolution by the kernel linker on sparc64, where the relocations for the metadata are different than on other architectures; the relative offset is in the addend of an Elf_Rela record instead of the original value of the location being patched. Also fix printf formats in debug code.
Submitted by: Hartmut Brandt <brandt@fokus.gmd.de> PR: 46732 Tested on: alpha (obrien), i386, sparc64
|
109558 |
20-Jan-2003 |
phk |
We need neither <sys/diskslice.h> nor <sys/disklabel.h> here.
|
109490 |
18-Jan-2003 |
mux |
Don't try to free() map in bus_dmamap_destroy() when it's set to &nobounce_dmamap. A similar bug was fixed by wpaul in revision 1.19 of sys/alpha/alpha/busdma_machdep.c.
|
109342 |
16-Jan-2003 |
dillon |
Merge all the various copies of vm_fault_quick() into a single portable copy.
|
109340 |
15-Jan-2003 |
dillon |
Merge all the various copies of vmapbuf() and vunmapbuf() into a single portable copy. Note that pmap_extract() must be used instead of pmap_kextract().
This is precursor work to a reorganization of vmapbuf() to close remaining user/kernel races (which can lead to a panic).
|
108759 |
06-Jan-2003 |
marcel |
Move ia64_sapics and ia64_sapic_count from interrupt.c to sapic.c and declare them extern in interrupt.c. This eliminates the need for ia64_add_sapic(), which is called from sapic.c. While here, reformat ia64_enable() in interrupt.c to improve indentation and add a sysctl (machdep.apic) to dump the I/O APIC entries currently programmed into all I/O APICs. The latter can help analyze interrupt problems. Note that the sysctl is not intended as a userland (software) interface. It may be changed in the future to include counters so that vmstat -i can make use of it. It may also be removed...
|
108757 |
06-Jan-2003 |
peter |
Move the itm reload to a single place rather than having two identical copies of the reload. Note that we use the precomputed itm_reload value so that we can avoid a division in the kernel. The ia64 cpu does not have integer divide, so this would have been done by a floating point operation.
|
108756 |
06-Jan-2003 |
marcel |
Replace the hardcoding of 255 as the clock interrupt vector with CLOCK_VECTOR and define it as 254, not 255. Vector 255 is already in use as the AP wakeup vector on the HP rx2600.
This needs to be made more dynamic. The likelyhood of vector 254 being in use is pretty small, but we already have code to assign vectors to IPIs (see sal.c) and it's preobably better to have a centralized "vector manager" that hands out vectors based on some imput (like priority).
|
108751 |
06-Jan-2003 |
marcel |
Manually inline handleclock(). There's only a single caller and handleclock itself is trivial.
While here, replace (itc_frequency+hz/2)/hz with itm_reload for consistency. There's now a single place where we determine the ITM reload value.
|
108749 |
06-Jan-2003 |
marcel |
Count interrupts as soon as possible. This makes sure interrupts are counted even when there are no handlers.
|
108737 |
05-Jan-2003 |
marcel |
Don't hardcode the address of the local (S)APIC (aka processor interrupt block). We use the previously hardcoded address as a default only, but will otherwise use whatever ACPI tells us. The address can be found in the MADT table header or in the LAPIC override table entry.
|
108734 |
05-Jan-2003 |
marcel |
Bump the number of interrupts from 65 to 257. This is a waste of space most of the time, but handles machines with lots of I/O (S)APICs. We cannot make this more dynamic without breaking the interface with vmstat. Hence, we need to fix the interface first.
|
108733 |
05-Jan-2003 |
marcel |
Handle 3-digit interrupt numbers (vectors). While here, change the name of unused entries from "intr XXX" to "#XXX". This makes it easier to debug interrupt problems, because vmstat can be hacked more easily to dump all interrupt entries that are in use and not those that have had interrupts.
|
108730 |
05-Jan-2003 |
marcel |
Make all memory I/O addresses (explicitly) 64-bit. Memory mapped devices aren't necessarily mapped within 4GB. I/O port addresses are offsets into the memory mapped I/O port space, which is not larger than 16MB. No need to convert those to 64 bit types.
|
108728 |
05-Jan-2003 |
marcel |
Provide a null-implementation for bus_space_unmap, like i386. bus_space_unmap is required for puc(4).
|
108690 |
05-Jan-2003 |
marcel |
Adopt, adapt and improve: o Make the URL of the handbook match reality o Improve some comments (either wording or formatting) o Sync with i386: comment-out DDB, INVARIANTS, INVARIANT_SUPPORT o Add some more SCSI/RAID controllers: ahd, mpt, asr, ciss, dpt, iir, mly, ida o Remove support for the parallel port o Add NICs: em, bge o Remove NICs: ste, tl, tx, vr, wb o Enable USB support again, except of the UHCI host controller. UHCI still hangs the BigSur (=HP i2000) machines, and makes them useless. The OHCI controller works fine. Note that newer ia64 boxes based on the Intel host controllers (UHCI or EHCI) still won't have USB support. We really need to import the EHCI host controller from NetBSD...
|
108643 |
04-Jan-2003 |
alc |
Hold the page queues lock around pmap_remove_pte() in pmap_enter().
Submitted by: Arun Sharma <adsharma@unix-os.sc.intel.com>
|
108619 |
03-Jan-2003 |
marcel |
Make this build and sync-up: o Add COMPAT_FREEBSD4 o Remove NO_GEOM o Remove commented out options.
|
108533 |
01-Jan-2003 |
schweikh |
Correct typos, mostly s/ a / an / where appropriate. Some whitespace cleanup, especially in troff files.
|
108409 |
29-Dec-2002 |
rwatson |
Synchronize to kern/syscalls.master:1.139.
Obtained from: TrustedBSD Project
|
108175 |
22-Dec-2002 |
tjr |
MB_LEN_MAX is not MD, move it to the MI limits.h.
|
108049 |
18-Dec-2002 |
marcel |
More MFp4: DIG64 structures.
|
108026 |
18-Dec-2002 |
marcel |
Export the physical address of the RSDP to userland by means of the `machdep.acpi_root' sysctl. This is required on ia64 because the root pointer hardly ever, if at all, lives in the first MB of memory and also because scanning the first MB of memory can cause machine checks. This provides a save and reliable way for ACPI tools to work with the tables if ACPI support is present in the kernel. On ia64 ACPI is non-optional.
|
107963 |
17-Dec-2002 |
marcel |
Check that the dump device is large enough. Otherwise we could end up with a dump offset that's smaller than the start of the dump device and either clobber data in preceding partitions or try to write beyond the end of the medium (unsigned wrap).
Implement legacy behaviour to never write to the first 64KB as that is where metadata (ie disklabels) may reside.
|
107923 |
16-Dec-2002 |
marcel |
Regen: swapoff
|
107922 |
16-Dec-2002 |
marcel |
Change swapoff from MNOPROTO to UNIMPL. The former doesn't work.
|
107913 |
15-Dec-2002 |
dillon |
This is David Schultz's swapoff code which I am finally able to commit. This should be considered highly experimental for the moment.
Submitted by: David Schultz <dschultz@uclink.Berkeley.EDU> MFC after: 3 weeks
|
107849 |
14-Dec-2002 |
alfred |
SCARGS removal take II.
|
107839 |
13-Dec-2002 |
alfred |
Backout removal SCARGS, the code freeze is only "selectively" over.
|
107838 |
13-Dec-2002 |
alfred |
Remove SCARGS.
Reviewed by: md5
|
107719 |
10-Dec-2002 |
julian |
Unbreak the KSE code. Keep track of zobie threads using the Per-CPU storage during the context switch. Rearrange thread cleanups to avoid problems with Giant. Clean threads when freed or when recycled.
Approved by: re (jhb)
|
107685 |
08-Dec-2002 |
marcel |
Use one of the bi_spare entries for the DIG64 HCDP table address. The HCDP table is one (non-proprietary) way for the platform to inform the OS about headless operation. This field would normally hold the address as can be found by scanning the EFI system table, which we also pass to the kernel. The apparent duplication allows us to synthesize a HCDP table in the loader by whatever means we can think of, including relocating the platform table into pre- mapped address space. In short: it gives us more freedom.
Approved by: re (blanket)
|
107684 |
08-Dec-2002 |
marcel |
Disable SMP. It reduces the chance that the kernel boots. On top of that, there's some nasty process corruption when running with SMP.
Note that this was already in effect for the 5.0-RC1 kernels in the form of a local patch.
Approved by: re (blanket)
|
107481 |
02-Dec-2002 |
alc |
MFi386 Hold the page queues lock around vm_page_unhold() in vunmapbuf().
Approved by: re (blanket)
|
107395 |
29-Nov-2002 |
marcel |
Implement bus_space_subregion(). Identical to i386.
Approved by: re (carte blanc)
|
107394 |
29-Nov-2002 |
marcel |
Better handle sparse physical memory: Don't use the address range as a measure for available memory to scale the VHPT. Instead, use the previously determined Maxmem.
Approved by: re (carte blanc)
|
107206 |
24-Nov-2002 |
marcel |
MFp4: Add function map_port_space() to map the memory mapped I/O port range as uncacheable virtual memory and call it prior to probing for a console. This removes the dependency on the loader to have done this for us. Note that this change does not include doing the same for APs.
Approved by: re (blanket)
|
107205 |
24-Nov-2002 |
marcel |
Fix comparison that caused a 1-off bug. This appeared harmless for the kernel itself, but SAL on Itanium2 machines spontaneously rebooted the machine.
Approved by: re (blanket) Submitted by: Arun Sharma <adsharma@unix-os.sc.intel.com>
|
107180 |
22-Nov-2002 |
mux |
Under certain circumstances, we were calling kmem_free() from i386 cpu_thread_exit(). This resulted in a panic with WITNESS since we need to hold Giant to call kmem_free(), and we weren't helding it anymore in cpu_thread_exit(). We now do this from a new MD function, cpu_thread_dtor(), called by thread_dtor().
Approved by: re@ Suggested by: jhb
|
107028 |
17-Nov-2002 |
alc |
MFi386 r1.369 - Clear the PG_WRITEABLE flag in pmap_page_protect() if write access is being removed. Return immediately if write access is being removed and PG_WRITEABLE is already clear.
|
106993 |
16-Nov-2002 |
deischen |
Regenerate after adding syscalls.
|
106989 |
16-Nov-2002 |
deischen |
Add *context() syscalls to ia64 32-bit compatability table as requested in kern/syscalls.master.
|
106977 |
16-Nov-2002 |
deischen |
Add getcontext, setcontext, and swapcontext as system calls. Previously these were libc functions but were requested to be made into system calls for atomicity and to coalesce what might be two entrances into the kernel (signal mask setting and floating point trap) into one.
A few style nits and comments from bde are also included.
Tested on alpha by: gallatin
|
106965 |
15-Nov-2002 |
peter |
Do not assume that time_t is an int.
Approved by: re (jhb)
|
106964 |
15-Nov-2002 |
peter |
Test the water. Make time_t long (64 bit) on ia64 since we do not have to worry about ABI vs released systems yet. This is mostly transparent since there is no significant exposure in the syscall interface. The things that go wrong are mostly userland stuff - time(&intvariable).
Reviewed by: dfr, marcel Approved by: re (jhb)
|
106838 |
13-Nov-2002 |
alc |
Move pmap_collect() out of the machine-dependent code, rename it to reflect its new location, and add page queue and flag locking.
Notes: (1) alpha, i386, and ia64 had identical implementations of pmap_collect() in terms of machine-independent interfaces; (2) sparc64 doesn't require it; (3) powerpc had it as a TODO.
|
106755 |
11-Nov-2002 |
marcel |
ia64 ABI breaker: Don't force 16-byte alignment at run-time. Do it at compile-time. This saves us the pointer fiddling by the setjmp functions and reduces complexity. While here, increase the jmp_buf by 16 bytes to an even 512 bytes. Coincidentally, due to the way alignment was handled prior to this change, the jmp_buf has not changed in size, but only in how the space is used. Prior to this change the 16 bytes were reserved for enforcing alignment; now they are reserved by us for future extensions. Therefore, this ABI breaker is relatively save: the failure is always an alignment trap.
|
106753 |
11-Nov-2002 |
alc |
- Clear the page's PG_WRITEABLE flag in the i386's pmap_changebit() if we're removing write access from the page's PTEs. - Export pmap_remove_all() on alpha, i386, and ia64. (It's already exported on sparc64.)
|
106751 |
11-Nov-2002 |
marcel |
Comment-out USB support. A kernel doesn't boot with it. Deal with it later.
|
106697 |
09-Nov-2002 |
des |
Print real / avail memory in megabytes rather than kilobytes.
|
106605 |
07-Nov-2002 |
tmm |
Move the definitions of the hw.physmem, hw.usermem and hw.availpages sysctls to MI code; this reduces code duplication and makes all of them available on sparc64, and the latter two on powerpc. The semantics by the i386 and pc98 hw.availpages is slightly changed: previously, holes between ranges of available pages would be included, while they are excluded now. The new behaviour should be more correct and brings i386 in line with the other architectures.
Move physmem to vm/vm_init.c, where this variable is used in MI code.
|
106503 |
06-Nov-2002 |
jmallett |
Remove what was a temporary bogus assignment of bits of siginfo_t, as it does not look like the prerequisites to fill it in properly will be in the tree for the upcoming release, but it's mostly done, so there is no need for these to stay around to remind us.
|
106486 |
06-Nov-2002 |
marcel |
Define UMA_MD_SMALL_ALLOC so that we can allocate memory with region 7 addresses for use by page tables and kernel stacks.
Obtained from: peter
|
106447 |
05-Nov-2002 |
marcel |
o Remove devices that are commented out. o Enable sc o Remove NO_GEOM. We need GEOM for GPT. o Remove NO_CPU_COPTFLAGS.
|
106446 |
05-Nov-2002 |
marcel |
Remove mcclock. It's an Alpha left-over.
|
106364 |
02-Nov-2002 |
rwatson |
Sync to src/sys/kern/syscalls.master
|
106197 |
30-Oct-2002 |
marcel |
Don't pass the return address to exception_save in register b0. Use a true scratch register. This change and future re-allocations will eventually result in code that we can unwind to to get the preserved registers of the process. This of course means that we cannot trash them while saving the process context.
While re-allocating, remove the register aliases. Abstraction is in this case disadvanteous.
|
106189 |
30-Oct-2002 |
marcel |
Rewrite cpu_switch(). The most notable change is the fact that we now have f16-f31 as part of the context. The PCB has been reorganized to better match how we save and restore the (preserved) registers. This commit also moves the context restoriation to its own function (named pcb_restore), as we did with pcb_save.
Only minimal effort has been put in writing optimal assembly. The expectation is that there will be more rounds of changes.
|
106069 |
28-Oct-2002 |
marcel |
Remove mf.a from sapic_read() and sapic_write(). We only care about ordering and not acceptance. The removal of mf.a leaves behind the mf that accompanied it.
|
106067 |
28-Oct-2002 |
marcel |
Remove mf.a (the acceptance form of the memory fence instruction) from all low-level bus space support functions. There's no need to actually force the read/write to be accepted by the platform before we can do anything else. We still have the mf instruction there, which forces ordering. This too is not required given the semantices of the bus space I/O functions, but it's not at all clear to me if there are any poorly written device drivers that depend on the strict ordering by the processor. The motto here is to take small steps...
|
106066 |
28-Oct-2002 |
marcel |
Make vmstat -i work: o Properly set the pointer to the counter for each interrupt and update the intrnames table. o Remove Alpha cruft from intrcnt.h. o Create INTRNAME_LEN as the single entity that defines the width of the names in the intrnames table (incl. terminatinf '\0').
|
106063 |
27-Oct-2002 |
marcel |
In ipi_send(), perform a mf instruction prior to initiating the IPI. This guarantees that loads and stores emitted before the fence are made visible before the IPI becomes pended. Remove the mf.a instruction after initiating the IPI. There's no guarantee that the IPI becomes pended prior to subsequent reads or writes. Even if there was a guarantee, it would mostly be without any benefit.
|
105977 |
26-Oct-2002 |
peter |
Add COMPAT_FREEBSD4 here too. It has COMPAT_43 as well.
|
105950 |
25-Oct-2002 |
peter |
Split 4.x and 5.x signal handling so that we can keep 4.x signal handling clean and functional as 5.x evolves. This allows some of the nasty bandaids in the 5.x codepaths to be unwound.
Encapsulate 4.x signal handling under COMPAT_FREEBSD4 (there is an anti-foot-shooting measure in place, 5.x folks need this for a while) and finish encapsulating the older stuff under COMPAT_43. Since the ancient stuff is required on alpha (longjmp(3) passes a 'struct osigcontext *' to the current sigreturn(2), instead of the 'ucontext_t *' that sigreturn is supposed to take), add a compile time check to prevent foot shooting there too. Add uniform COMPAT_43 stubs for ia64/sparc64/powerpc.
Tested on: i386, alpha, ia64. Compiled on sparc64 (a few days ago). Approved by: re
|
105900 |
24-Oct-2002 |
julian |
Extract out KSE specific code from machine specific code so that there is ony one copy of it. Fix that one copy so that KSEs with no mailbox in a KSE program are not a cause of page faults (this can legitmatly happen).
Submitted by: (parts) davidxu
|
105891 |
24-Oct-2002 |
jhb |
Oops, I missed a few changes in 'device acpica' -> 'device acpi' change.
Submitted by: Hiten Pandya <hiten@angelica.unixdaemons.com>
|
105890 |
24-Oct-2002 |
jhb |
Rename 'device acpica' to 'device acpi'.
Approved by: msmith, iwasaki
|
105591 |
20-Oct-2002 |
marcel |
In cb_dumphdr() we were calling buf_write() with di->priv as the pointer to a dumperinfo instead of di. A brainfart, surely. This bug went unnoticed for all this time because the pointer is only used by buf_write() when it can write a completely filled buffer to the dump device. This depends on the number of memory chunks that needs to be dumped. This has apparently been low enough that it has never happened up until this point.
|
105500 |
20-Oct-2002 |
marcel |
Remove the special casing for IP addresses that are within the IVT or the do_syscall() function. We have unwind directives to stop the unwinder.
|
105499 |
20-Oct-2002 |
marcel |
Define IVT_ENTRY and IVT_END as special versions of ENTRY and END for defining vectors. As a result, each vector will be a global function with unwind directives to notify the unwinder that we're in an interrupt handler. In the debugger this will show up something like:
Debugger(0xe000000000a211d8, 0xe000000000748960) at Debugger+0x31 panic(0xe000000000a36858, 0xe0000000021d32d0, 0xe000000000ae42e8, ... trap(0x14, 0x100000, 0xe0000000021d32d0, 0x0, 0xa0000000002095f0, ... ivt_Data_TLB(0x14, 0x100000, 0xe0000000021d32d0) at ivt_Data_TLB+0x1f0
|
105490 |
19-Oct-2002 |
peter |
Stake a claim on 418 (__xstat), 419 (__xfstat), 420 (__xlstat)
|
105486 |
19-Oct-2002 |
peter |
Grab 416/417 real estate before I get burned while testing again. This is for the not-quite-ready signal/fpu abi stuff. It may not see the light of day, but I'm certainly not going to be able to validate it when getting shot in the foot due to syscall number conflicts.
|
105476 |
19-Oct-2002 |
rwatson |
Add a placeholder for the execve_mac() system call, similar to SELinux's execve_secure() system call, which permits a process to pass in a label for a label change during exec. This permits SELinux to change the label for the resulting exec without a race following a manual label change on the process. Because this interface uses our general purpose MAC label abstraction, we call it execve_mac(), and wrap our port of SELinux's execve_secure() around it with appropriate sid mappings.
Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories
|
105470 |
19-Oct-2002 |
marcel |
Update the unwind information when modules are loaded and unloaded by using the linker hooks. Since these hooks are called for the kernel as well, we don't need to deal with that with a special SYSINIT. The initialization implicitly performed on the first update of the unwind information is made explicit with a SYSINIT. We now don't need the _ia64_unwind_{start|end} symbols.
|
105469 |
19-Oct-2002 |
marcel |
Add two hooks to signal module load and module unload to MD code. The primary reason for this is to allow MD code to process machine specific attributes, segments or sections in the ELF file and update machine specific state accordingly. An immediate use of this is in the ia64 port where unwind information is updated to allow debugging and tracing in/across modules. Note that this commit does not add the functionality to the ia64 port. See revision 1.9 of ia64/ia64/elf_machdep.c.
Validated on: alpha, i386, ia64
|
105463 |
19-Oct-2002 |
rwatson |
Permits UFS ACLs to be used with the GENERIC kernel. Due to recent ACL configuration changes, this shouldn't result in different code paths for file systems not explicitly configured for ACLs by the system administrator. For UFS1, administrators must still recompile their kernel to add support for extended attributes; for UFS2, it's sufficient to enable ACLs using tunefs or at mount-time (tunefs preferred for reliability reasons). UFS2, for a variety of reasons, including performance and reliability, is the preferred file system for use with ACLs.
Approved by: re
|
105432 |
19-Oct-2002 |
marcel |
Make this compile when DDB is not defined by conditionally compiling all references to ksym_start and ksym_end.
|
105147 |
15-Oct-2002 |
marcel |
Fix kernel module loading on ia64. Cross-module function calls were improperly relocated due to faulty logic in lookup_fdesc() in elf_machdep.c. The symbol index (symidx) was bogusly used for load modules other than the one the relocation applied to. This resulted in bogus bindings and consequently runtime failures.
The fix is to use the symbol index only for the module being relocated and to use the symbol name for look-ups in the modules in the dependent list. As such, we need a function to return the symbol name given the linker file and symbol index.
|
105138 |
15-Oct-2002 |
peter |
The a.out md_coredump stuff isn't referenced anywhere anymore, and hasn't been filled in for ages.. Nuked.
|
105079 |
14-Oct-2002 |
marcel |
Allow kernel dumps to be aborted with ctrl-C.
|
105054 |
13-Oct-2002 |
mike |
Remove the P1003_1B kernel option; it is no longer used.
|
105049 |
13-Oct-2002 |
mike |
struct ia64_fpreg needs to be available outside of the kernel for struct sigcontext.
Pointy hat to: mike
|
105014 |
13-Oct-2002 |
mike |
Add standards visibility conditionals. Change any uses of sigset_t to struct __sigset to avoid depending on objects from <sys/signal.h>.
|
105011 |
12-Oct-2002 |
marcel |
o Fix typo in previous commit: s/sc-nsect/sc->nsect/ o Fix printf format error for %d format with long argument.
|
105010 |
12-Oct-2002 |
marcel |
Plug two holes where we returned to userland without restoring the predicate registers. Even though the ITLB and DTLB interrupts happen often enough, this bug didn't do much harm. The reason is that the interrupt handlers only modify p1 and since this is a preserved (callee-saved) register it is hardly used in code generated by the compiler. Compilers use scratch registers by default. Changing the interrupt handlers to use p6 (ie a scratch register) proved that the bug was in fact fatal.
|
105001 |
12-Oct-2002 |
marcel |
Polish previous commit: o Replace KSTACK_PAGES with pages on panic() in pmap_new_thread(), o Fix style bugs in adjacent code, o Use NULL instead of 0 for pointers, o Save the virtual kstack address if we create an alternate kstack because 1) we can derive the physical (RR7) address from it and 2) we need the virtual address for contigfree() in pmap_dispose_thread(). Thus td_altkstack saves td_md.md_kstackvirt.
|
105000 |
12-Oct-2002 |
marcel |
MFp4: Include machine/vmparam.h to pull in definition of IA64_RR_BASE.
Obtained from: peter
|
104998 |
12-Oct-2002 |
marcel |
Remove the dependency on ia64_cpu.h by not defining pmap_kextract() as a trivial function that only calls ia64_tpa() and hence requires the prototype of ia64_tpa(), but by defining pmap_kextract as ia64_tpa. This solves the inclusion ordering issue in ddb/db_watch.c
|
104939 |
11-Oct-2002 |
peter |
cut/paste the pmap_new_altkstack stuff from the other platforms. It's no different here. Update the rest of the kstack API's for scottl's changes.
|
104938 |
11-Oct-2002 |
peter |
Call uma_zalloc on pvzone with M_NOWAIT, just like i386 and alpha. Otherwise we get hundreds of 'could sleep' during boot.
|
104908 |
11-Oct-2002 |
mike |
Change iov_base's type from `char *' to the standard `void *'. All uses of iov_base which assume its type is `char *' (in order to do pointer arithmetic) have been updated to cast iov_base to `char *'.
|
104741 |
09-Oct-2002 |
peter |
re-regen. Sigh.
|
104740 |
09-Oct-2002 |
peter |
Sigh. Fix fat-fingering of diff. I knew this was going to happen.
|
104739 |
09-Oct-2002 |
peter |
regenerate. sendfile stuff and other recently picked up stubs.
|
104738 |
09-Oct-2002 |
peter |
Try and deal with the #ifdef COMPAT_FREEBSD4 sendfile stuff. This would have been a lot easier if do_sendfile() was usable externally.
|
104736 |
09-Oct-2002 |
peter |
Try and patch up some tab-to-space spammage.
|
104735 |
09-Oct-2002 |
peter |
Add placeholder stubs for nsendfile, mac_syscall, ksem_close, ksem_post, ksem_wait, ksem_trywait, ksem_init, ksem_open, ksem_unlink, ksem_getvalue, ksem_destroy, __mac_get_pid, __mac_get_link, __mac_set_link, extattr_set_link, extattr_get_link, extattr_delete_link.
|
104584 |
06-Oct-2002 |
mike |
Add conditionals to allow va_list to be defined in other headers.
|
104583 |
06-Oct-2002 |
mike |
o Add conditionals to allow va_list to be defined in other headers. o Standardize on _MACHINE_STDARG_H_ to allow multiple header includes. o Restrict the definition of va_copy() to C99 environments.
|
104551 |
06-Oct-2002 |
obrien |
It appears CPU_MAXID should be 1 more than the number of CPU_* defines.
|
104519 |
05-Oct-2002 |
phk |
NB: This commit does *NOT* make GEOM the default in FreeBSD NB: But it will enable it in all kernels not having options "NO_GEOM"
Put the GEOM related options into the intended order.
Add "options NO_GEOM" to all kernel configs apart from NOTES.
In some order of controlled fashion, the NO_GEOM options will be removed, architecture by architecture in the coming days.
There are currently three known issues which may force people to need the NO_GEOM option:
boot0cfg/fdisk: Tries to update the MBR while it is being used to control slices. GEOM does not allow this as a direct operation.
SCSI floppy drives: Appearantly the scsi-da driver return "EBUSY" if no media is inserted. This is wrong, it should return ENXIO.
PC98: It is unclear if GEOM correctly recognizes all variants of PC98 disklabels. (Help Wanted! I have neither docs nor HW)
These issues are all being worked.
Sponsored by: DARPA & NAI Labs.
|
104505 |
05-Oct-2002 |
mike |
Fix namespace issues by using visibility conditionals from <sys/cdefs.h>.
|
104493 |
04-Oct-2002 |
mike |
style(9) <machine/setjmp.h> headers so they look mostly the same.
|
104486 |
04-Oct-2002 |
sam |
New bus_dma interfaces for use by crypto device drivers:
o bus_dmamap_load_mbuf o bus_dmamap_load_uio
Test on i386. Known to compile on alpha and sparc64, but not tested. Otherwise untried.
|
104439 |
04-Oct-2002 |
peter |
Gah, spell extern correctly. Do not trust cut/paste via old mozilla builds.
|
104438 |
04-Oct-2002 |
peter |
List the IO SAPIC delivery mode definitions.
|
104436 |
04-Oct-2002 |
peter |
Declare itc_frequency and itm_reload.
|
104433 |
04-Oct-2002 |
peter |
Do a bit of rude hackery to get clock interrupts on all CPUs. This is partly based on the Alpha system which duplicates the clock to each cpu, instead of doing a clock roundrobin like on i386. This means we get hz * ncpu clocks per second and so we have to seperate clock sampling from actual 'do the work' clock processing. The BSP runs the complete processing, the rest just sample state etc.
Using the on-cpu interval timer is not ideal as it will drift. There is more to be done here, we should use an external clock source.
|
104426 |
04-Oct-2002 |
peter |
Update stubs for post-kseIII.
|
104425 |
04-Oct-2002 |
peter |
Update for post-kseIII
|
104379 |
02-Oct-2002 |
archie |
Let kse_wakeup() take a KSE mailbox pointer argument.
Reviewed by: julian
|
104320 |
01-Oct-2002 |
phk |
Fix the same misinitialization of pmap_prefault_pageorder as on i386.
Suggeste by: jake
|
103972 |
25-Sep-2002 |
archie |
Make the following name changes to KSE related functions, etc., to better represent their purpose and minimize namespace conflicts:
kse_fn_t -> kse_func_t struct thread_mailbox -> struct kse_thr_mailbox thread_interrupt() -> kse_thr_interrupt() kse_yield() -> kse_release() kse_new() -> kse_create()
Add missing declaration of kse_thr_interrupt() to <sys/kse.h>. Regenerate the various generated syscall files. Minor style fixes.
Reviewed by: julian
|
103870 |
23-Sep-2002 |
alfred |
use __packed.
|
103834 |
23-Sep-2002 |
peter |
At great personal risk, add a __packed and __aligned(x) define that expand to __attribute__((packed)) and __attribute__((aligned(x))) respectively. Replace the handful of gcc-ism's that use __attribute__((aligned(16))) etc around the kernel with __aligned(16).
There are over 400 __attribute__((packed)) to deal with, that can come later. I just want to use __packed in new code rather than add more gcc-ism's.
|
103814 |
23-Sep-2002 |
mike |
Be careful not to define GCC-specific optimizations in the non-GCC case.
|
103714 |
20-Sep-2002 |
phk |
(This commit touches about 15 disk device drivers in a very consistent and predictable way, and I apologize if I have gotten it wrong anywhere, getting prior review on a patch like this is not feasible, considering the number of people involved and hardware availability etc.)
If struct disklabel is the messenger: kill the messenger.
Inside struct disk we had a struct disklabel which disk drivers used to communicate certain metrics to the disklayer above (GEOM or the disk mini-layer). This commit changes this communication to use four explicit fields instead.
Amongst the benefits is that the fields do not get overwritten by wrong or bogus on-disk disklabels.
Once that is clear, <sys/disk.h> which is included in the drivers no longer need to pull <sys/disklabel.h> and <sys/diskslice.h> in, the few places that needs them, have gotten explicit #includes for them.
The disklabel inside struct disk is now only for internal use in the disk mini-layer, so instead of embedding it, we malloc it as we need it.
This concludes (modulus any mistakes) the series of disklabel related commits.
I belive it all amounts to a NOP for all the rest of you :-)
Sponsored by: DARPA & NAI Labs.
|
103703 |
20-Sep-2002 |
phk |
For reasons now lost in historical fog, the bounds_check_with_label() function were put in i386/i386/machdep.c from where it has been cut and pasted to other architectures with only minor corruption.
Disklabel is really a MI format in many ways, at least it certainly is when you operate on struct disklabel.
Put bounds_check_with_label() back in subr_disklabel.c where it belongs.
Sponsored by: DARPA & NAI Labs.
|
103646 |
19-Sep-2002 |
jhb |
Implement db_print_backtrace() if DDB is compiled into the kernel. This MD function is just a wrapper around db_stack_trace_cmd() that prints out a backtrace of curthread. Currently, this function is only implemented on i386 and alpha (and the alpha version isn't quite tested yet, will do that in a bit). Other changes:
- For i386, fix a bug in the raw frame address case. The eip we extract from the passed in frame address does not match the frame we received. Thus, instead of printing a bogus frame with the wrong eip, go ahead and advance frame down to the same frame as the eip we are using. - For alpha, attempt to add a way of doing a raw trace for alpha. Instead of passing a frame address in 'addr', pass in a pointer to a structure containing PC and KSP and use those to start the backtrace. The alpha db_print_backtrace() uses asm to read in the current PC and KSP values into such a request.
Tested on: i386 Requested by: many
|
103526 |
18-Sep-2002 |
mike |
Implement C99's va_copy() macro.
|
103436 |
17-Sep-2002 |
peter |
Initiate deorbit burn for the i386-only a.out related support. Moves are under way to move the remnants of the a.out toolchain to ports. As the comment in src/Makefile said, this stuff is deprecated and one should not expect this to remain beyond 4.0-REL. It has already lasted WAY beyond that.
Notable exceptions: gcc - I have not touched the a.out generation stuff there. ldd/ldconfig - still have some code to interface with a.out rtld. old as/ld/etc - I have not removed these yet, pending their move to ports. some includes - necessary for ldd/ldconfig for now.
Tested on: i386 (extensively), alpha
|
103367 |
15-Sep-2002 |
julian |
Allocate KSEs and KSEGRPs separatly and remove them from the proc structure. next step is to allow > 1 to be allocated per process. This would give multi-processor threads. (when the rest of the infrastructure is in place)
While doing this I noticed libkvm and sys/kern/kern_proc.c:fill_kinfo_proc are diverging more than they should.. corrective action needed soon.
|
103109 |
09-Sep-2002 |
kuriyama |
Use "options " rather than "options<tab>".
|
103081 |
07-Sep-2002 |
jmallett |
Fill out two fields (si_pid, si_uid) in the siginfo structure handed back to userland in the signal handler that were not being iflled out before, but should and can be.
This part of sendsig could be slightly refactored to use an MI interface, or ideally, *sendsig*() would have an API change to accept a siginfo_t, which would be filled out by an MI function in the level above sendsig, and said MI function would make a small call into MD code to fill out the MD parts (some of which may be bogus, such as the si_addr stuff in some places). This would eventually make it possible for parts of the kernel sending signals to set up a siginfo with meaningful information.
Reviewed by: mux MFC after: 2 weeks
|
103049 |
07-Sep-2002 |
peter |
Zap the implementations of the i386-aout specific cpu_coredump function. Most of the non-i386 platforms had rather broken implementations anyway.
|
102883 |
03-Sep-2002 |
peter |
Make this compile
|
102874 |
03-Sep-2002 |
mike |
Now that _BSD_CLK_TCK_ and _BSD_CLOCKS_PER_SEC_ are the same on all architectures, move the definition directly into <time.h> and finish the removal of <machine/ansi.h>.
|
102869 |
02-Sep-2002 |
mike |
Align _BSD_CLK_TCK_ and _BSD_CLOCKS_PER_SEC_ with most other platforms. This introduces some binary incompatibilities for dynamically linked programs which make use of clock(3) and times(3).
|
102837 |
02-Sep-2002 |
alc |
o Remove an initialized but unused variable from pmap_remove_all().
|
102815 |
01-Sep-2002 |
marcel |
Sync up: remove device counts.
|
102808 |
01-Sep-2002 |
jake |
Added fields for VM_MIN_ADDRESS, PS_STRINGS and stack protections to sysentvec. Initialized all fields of all sysentvecs, which will allow them to be used instead of constants in more places. Provided stack fixup routines for emulations that previously used the default.
|
102666 |
31-Aug-2002 |
peter |
Take a shot at fixing up a whole stack of style and other embarresing unforced errors that Bruce identified. I have not yet addressed all of his concerns.
|
102663 |
31-Aug-2002 |
peter |
Do not use an object for the pte and pv zones on ia64 because it overrides the pmap_allocf() function that we provide above. We still use the limits via other means.
Submitted by: jeff
|
102600 |
30-Aug-2002 |
peter |
Change hw.physmem and hw.usermem to unsigned long like they used to be in the original hardwired sysctl implementation.
The buf size calculator still overflows an integer on machines with large KVA (eg: ia64) where the number of pages does not fit into an int. Use 'long' there.
Change Maxmem and physmem and related variables to 'long', mostly for completeness. Machines are not likely to overflow 'int' pages in the near term, but then again, 640K ought to be enough for anybody. This comes for free on 32 bit machines, so why not?
|
102561 |
29-Aug-2002 |
jake |
Renamed poorly named setregs to exec_setregs. Moved its prototype to imgact.h with the other exec support functions.
|
102560 |
29-Aug-2002 |
jake |
Fixed printf format errors.
|
102399 |
25-Aug-2002 |
alc |
o Retire pmap_pageable(). It's an advisory routine that none of our platforms implements.
|
102330 |
23-Aug-2002 |
marcel |
s/_BSD_VA_LIST_/__va_list/. The former type doesn't exist anymore.
|
102315 |
23-Aug-2002 |
mike |
Move several MI types from <machine/_types.h> to <sys/_types.h>. These types are unlikely to ever become very MD. They include: clockid_t, ct_rune_t, fflags_t, intrmask_t, mbstate_t, off_t, pid_t, rune_t, socklen_t, timer_t, wchar_t, and wint_t.
While moving them, make a few adjustments (submitted by bde): o __ct_rune_t needs to be precisely `int', not necessarily __int32_t, since the arg type of the ctype functions is int. o __rune_t, __wchar_t and __wint_t inherit this via a typedef of __ct_rune_t. o Some minor wording changes in the comment blocks for ct_rune_t and mbstate_t.
Submitted by: bde (partially)
|
102278 |
22-Aug-2002 |
mux |
Convert NEXUS_ACCESSOR to use the __BUS_ACCESSOR macro instead of reimplementing it.
Approved by: peter
|
102227 |
21-Aug-2002 |
mike |
o Merge <machine/ansi.h> and <machine/types.h> into a new header called <machine/_types.h>. o <machine/ansi.h> will continue to live so it can define MD clock macros, which are only MD because of gratuitous differences between architectures. o Change all headers to make use of this. This mainly involves changing: #ifdef _BSD_FOO_T_ typedef _BSD_FOO_T_ foo_t; #undef _BSD_FOO_T_ #endif to: #ifndef _FOO_T_DECLARED typedef __foo_t foo_t; #define _FOO_T_DECLARED #endif
Concept by: bde Reviewed by: jake, obrien
|
102161 |
20-Aug-2002 |
rwatson |
Correct one more errant whitespace nit that crept in during changes in the arguments to vn_rdwr(). Hopefully the last.
|
102153 |
20-Aug-2002 |
peter |
remove unit counts from atkbdc, pckbd, sc
|
101945 |
15-Aug-2002 |
rwatson |
Correct a minor whitespace nit that sneaked in with my previous commit.
|
101941 |
15-Aug-2002 |
rwatson |
In order to better support flexible and extensible access control, make a series of modifications to the credential arguments relating to file read and write operations to cliarfy which credential is used for what:
- Change fo_read() and fo_write() to accept "active_cred" instead of "cred", and change the semantics of consumers of fo_read() and fo_write() to pass the active credential of the thread requesting an operation rather than the cached file cred. The cached file cred is still available in fo_read() and fo_write() consumers via fp->f_cred. These changes largely in sys_generic.c.
For each implementation of fo_read() and fo_write(), update cred usage to reflect this change and maintain current semantics:
- badfo_readwrite() unchanged - kqueue_read/write() unchanged pipe_read/write() now authorize MAC using active_cred rather than td->td_ucred - soo_read/write() unchanged - vn_read/write() now authorize MAC using active_cred but VOP_READ/WRITE() with fp->f_cred
Modify vn_rdwr() to accept two credential arguments instead of a single credential: active_cred and file_cred. Use active_cred for MAC authorization, and select a credential for use in VOP_READ/WRITE() based on whether file_cred is NULL or not. If file_cred is provided, authorize the VOP using that cred, otherwise the active credential, matching current semantics.
Modify current vn_rdwr() consumers to pass a file_cred if used in the context of a struct file, and to always pass active_cred. When vn_rdwr() is used without a file_cred, pass NOCRED.
These changes should maintain current semantics for read/write, but avoid a redundant passing of fp->f_cred, as well as making it more clear what the origin of each credential is in file descriptor read/write operations.
Follow-up commits will make similar changes to other file descriptor operations, and modify the MAC framework to pass both credentials to MAC policy modules so they can implement either semantic for revocation.
Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs
|
101645 |
10-Aug-2002 |
alc |
o Remove the setting and clearing of the PG_MAPPED flag from the alpha and ia64 pmap. o Remove the PG_MAPPED flag's declaration.
|
101625 |
10-Aug-2002 |
peter |
My quad cpu itanium2 box has its cpu's numbered with a lid starting at 192. Masking off bottom 4 bits is not very good here.
|
101588 |
09-Aug-2002 |
brooks |
Make ppp(4) devices clonable and unloadable.
|
101479 |
07-Aug-2002 |
alc |
o Introduce pmap_page_is_mapped(). Its purpose is to obsolete the PG_MAPPED flag.
|
101251 |
03-Aug-2002 |
peter |
Ignore memory above 4GB for now due to unpleasant pci issues.
|
101199 |
02-Aug-2002 |
alc |
o Lock page queue accesses by vm_page_deactivate().
|
100969 |
30-Jul-2002 |
iwasaki |
Resolve conflicts arising from the ACPI CA 20020725 import.
|
100882 |
29-Jul-2002 |
mike |
Create a new header <machine/_stdint.h> for storing MD parts of <stdint.h>. Previously, parts were defined in <machine/ansi.h> and <machine/limits.h>. This resulted in two problems: (1) Defining macros in <machine/ansi.h> gets in the way of that header only defining types. (2) Defining C99 limits in <machine/limits.h> adds pollution to <limits.h>.
|
100551 |
23-Jul-2002 |
peter |
de-count pci
|
100543 |
23-Jul-2002 |
arr |
- Pass the VM_ALLOC_WIRED flag to vm_page_alloc() in pmap_growkernel() so that we can avoid a call to vm_page_lock_queues().
Approved by: peter
|
100464 |
21-Jul-2002 |
peter |
Add explicit unit count on 'device pci' for ahc/ahd
|
100398 |
20-Jul-2002 |
peter |
Change the max IRQ from 63 to 255. I realize we have to block some out still for the IPI vectors, but 63 isn't enough. There is an fxp at IRQ 86 on the Itanium2 box I have.
|
100385 |
20-Jul-2002 |
peter |
Regenerate
|
100384 |
20-Jul-2002 |
peter |
Infrastructure tweaks to allow having both an Elf32 and an Elf64 executable handler in the kernel at the same time. Also, allow for the exec_new_vmspace() code to build a different sized vmspace depending on the executable environment. This is a big help for execing i386 binaries on ia64. The ELF exec code grows the ability to map partial pages when there is a page size difference, eg: emulating 4K pages on 8K or 16K hardware pages.
Flesh out the i386 emulation support for ia64. At this point, the only binary that I know of that fails is cvsup, because the cvsup runtime tries to execute code in pages not marked executable.
Obtained from: dfr (mostly, many tweaks from me).
|
100274 |
17-Jul-2002 |
peter |
Fix a transcription typo. s/ACPI_PTR/ACPI_POINTER/
|
100269 |
17-Jul-2002 |
peter |
Fix some typos in 1.68 from over a week ago.
|
100268 |
17-Jul-2002 |
peter |
Cap the initial PV and PTE table preallocations. Otherwise we explode on the Itanium2 system I have when we use up *all* of the initial 256MB direct mapped region before we are ready to dynamically expand it.
The machine that I have has 4 cpus and a very big hole in the middle. This makes the bogus '(last_address - first_address) / PAGE_SIZE' calculations especially dangerous and caused many millions of initial PV/PTE's to be preallocated.
|
100267 |
17-Jul-2002 |
peter |
Be sure to use a logical address for the SAL table. For some reason the phsysical address is still mapped at this stage of boot on the Itanium1 SDV boxes we have. But Itanium2 does *not* let us get away with this.
|
100266 |
17-Jul-2002 |
peter |
Update for new ACPICA import. Gah.
|
100189 |
16-Jul-2002 |
jhb |
Various comment and minor style fixes. No actual content changes.
Inspired by: bde
|
100000 |
14-Jul-2002 |
alc |
o Lock page queue accesses by vm_page_wire().
|
99900 |
13-Jul-2002 |
mini |
Add additional cred_free_thread() calls that I had missed the first time.
Pointed out by: jhb
|
99887 |
12-Jul-2002 |
jhb |
Set the thread state of the newly chosen to run thread to TDS_RUNNING in choosethread() in MI C code instead of doing it in in assembly in all the various cpu_switch() functions. This fixes problems on ia64 and sparc64.
Reviewed by: julian, peter, benno Tested on: i386, alpha, sparc64
|
99733 |
10-Jul-2002 |
mike |
Remove label_t and physadr, which seem to have never been used in FreeBSD.
Submitted by: bde
|
99630 |
09-Jul-2002 |
mike |
Remove an unused type.
|
99594 |
08-Jul-2002 |
mike |
Move __offsetof() macro from <machine/ansi.h> to <sys/cdefs.h>. It's hardly MD, since all our platforms share the same macro. It's not really compiler dependent either, but this helps in reducing <machine/ansi.h> to only type definitions.
|
99571 |
08-Jul-2002 |
peter |
Add a special page zero entry point intended to be called via the single threaded VM pagezero kthread outside of Giant. For some platforms, this is really easy since it can just use the direct mapped region. For others, IPI sending is involved or there are other issues, so grab Giant when needed.
We still have preemption issues to deal with, but Alan Cox has an interesting suggestion on how to minimize the problem on x86.
Use Luigi's hack for preserving the (lack of) priority.
Turn the idle zeroing back on since it can now actually do something useful outside of Giant in many cases.
|
99559 |
07-Jul-2002 |
peter |
Collect all the (now equivalent) pmap_new_proc/pmap_dispose_proc/ pmap_swapin_proc/pmap_swapout_proc functions from the MD pmap code and use a single equivalent MI version. There are other cleanups needed still.
While here, use the UMA zone hooks to keep a cache of preinitialized proc structures handy, just like the thread system does. This eliminates one dependency on 'struct proc' being persistent even after being freed. There are some comments about things that can be factored out into ctor/dtor functions if it is worth it. For now they are mostly just doing statistics to get a feel of how it is working.
|
99422 |
05-Jul-2002 |
peter |
Back out proc part of last commit. UMA manages the thread cache only, and we just have to deal with the kstack when told to. We do not have a UMA-managed cache for the proc struct and its associated upage yet. So, go back to the old lazy mechanism. Note that if UMA destroys pages that used to contain proc structures, we'll lose the corresponding upage forever. (zones never did this - once a page was allocated, it stayed attached to the proc zone forever)
|
99421 |
05-Jul-2002 |
peter |
Copy from sparc64/pmap.c rev 1.64 (Retrofit changes from i386/pmap.c rev 1.328-1.331.) but for uarea only. We still have our own broken kstack code here.
|
99149 |
30-Jun-2002 |
iwasaki |
Resolve conflicts arising from the ACPI CA 20020404 import.
|
99117 |
30-Jun-2002 |
mike |
Since printf(3) now supports the `j' conversion specifier, use that when printing intmax_t and uintmax_t.
Forgotten by: mike Noticed by: bde
|
99095 |
29-Jun-2002 |
julian |
Fix reverse ordering of locks. add a comment about locks on some platforms.
Submitted by: jhb@freebsd.org
|
99081 |
29-Jun-2002 |
julian |
Add KSE stubs to MD parts of ia64 code. Dfr will fill these out when we decide to enable KSEs on ia64 (probably not immediatly)
|
99077 |
29-Jun-2002 |
julian |
Add a copy of the sparc64 machine/kse.h to satisfy depencies.. dfr will fill in the correct contents at a later time.
|
99072 |
29-Jun-2002 |
julian |
Part 1 of KSE-III
The ability to schedule multiple threads per process (one one cpu) by making ALL system calls optionally asynchronous. to come: ia64 and power-pc patches, patches for gdb, test program (in tools)
Reviewed by: Almost everyone who counts (at various times, peter, jhb, matt, alfred, mini, bernd, and a cast of thousands)
NOTE: this is still Beta code, and contains lots of debugging stuff. expect slight instability in signals..
|
98773 |
24-Jun-2002 |
dfr |
Add UMA_ZONE_VM flag to the zones which are used for pmap_enter().
|
98765 |
24-Jun-2002 |
jake |
Add an MD callout like cpu_exit, but which is called after sched_lock is obtained, when all other scheduling activity is suspended. This is needed on sparc64 to deactivate the vmspace of the exiting process on all cpus. Otherwise if another unrelated process gets the exact same vmspace structure allocated to it (same address), its address space will not be activated properly. This seems to fix some spontaneous signal 11 problems with smp on sparc64.
|
98727 |
24-Jun-2002 |
mini |
Remove unused diagnostic function cread_free_thread().
Approved by: alfred
|
98484 |
20-Jun-2002 |
peter |
Update an 'XXX what is this?' type comment about suswintr and fuswintr. These are 16 bit short values used only by the profiling code.
|
98480 |
20-Jun-2002 |
peter |
Deorbit suibyte(). It was only used for split address space systems for supporting UIO_USERISPACE (ie: it wasn't used).
|
98474 |
20-Jun-2002 |
peter |
ia32 %edx return comes from td_retval[1], not td_retval[0]
Obtained from: dfr
|
98473 |
20-Jun-2002 |
peter |
Use suword32/64 and fuword32/64 like elsewhere instead of inventing suhword/fuhword.
|
98471 |
20-Jun-2002 |
peter |
panic rather than fault and explode if we fail to contigmalloc a kernel stack. This is still bad(TM), but at least we have a clue when we get hit when contigmalloc fails.
|
98470 |
20-Jun-2002 |
peter |
Use the canonical pmap_{new,dispose,swapin,swapout}_proc() functions, in this case cut/pasted from sparc64 instead of messing with contigmalloc where it is not needed.
|
98469 |
20-Jun-2002 |
peter |
Move the "- 1" into the RQB_FFS(mask) macro itself so that implementations can provide a base zero ffs function if they wish. This changes #define RQB_FFS(mask) (ffs64(mask)) foo = RQB_FFS(mask) - 1; to #define RQB_FFS(mask) (ffs64(mask) - 1) foo = RQB_FFS(mask); On some platforms we can get the "- 1" for free, eg: those that use the C code for ffs64().
Reviewed by: jake (in principle)
|
98001 |
07-Jun-2002 |
jhb |
- Fixup / remove obsolete comments. - ktrace no longer requires Giant so do ktrace syscall events before and after acquiring and releasing Giant, respectively. - For i386, ia32 syscalls on ia64, powerpc, and sparc64, get rid of the goto bad hack and instead use the model on ia64 and alpha were we skip the actual syscall invocation if error != 0. This fixes a bug where if we the copyin() of the arguments failed for a syscall that was not marked MP safe, we would try to release Giant when we had not acquired it.
|
97969 |
06-Jun-2002 |
marcel |
Work around a bug in the Linux version of ski, that's specific to SSC_GET_RTC. This fixes the panic seen shortly after mounting the root file system.
Thanks to: "K.Sumitani" <ksumitani@mui.biglobe.ne.jp>
|
97748 |
02-Jun-2002 |
schweikh |
Fix typo in the BSD copyright: s/withough/without/
Spotted and suggested by: des MFC after: 3 weeks
|
97564 |
30-May-2002 |
dfr |
Move the definition of ElfN_Hashelt to common headers. The only platform which has a different definition for this is alpha.
|
97443 |
29-May-2002 |
marcel |
Remove the definition of struct mca_guid and use the generic struct uuid defined in <sys/uuid.h>.
Use uuid/UUID instead of guid/GUID to emphasize that the identifiers are DCE version 1 identifiers and also to avoid inconsistencies as much a possible.
|
97261 |
25-May-2002 |
jake |
Make the run queue parameters machine dependent. Optimize 64 bit architectures by using a 64 bit word for the bit array which keeps track of non-empty queues.
Reviewed by: peter
|
97090 |
22-May-2002 |
marcel |
o Add records for PCI bus and PCI device errors. o Rename mem_platform_id to mem_oem_id. o Minor style fixes.
|
96973 |
20-May-2002 |
marcel |
Flesh-out ptrace support. This obviously needs more work.
|
96961 |
19-May-2002 |
marcel |
Fix a kernel page fault when accessing user memory. We were combining too much conditions and as such ended up with the kernel map instead of the corresponding process map. While here, remove code to allow access to the stackgap and restyle slightly to improve readability.
This fix specifically fixes the procfs failure we're having when reading the process map (cat /proc/curproc/map)
|
96957 |
19-May-2002 |
marcel |
It's time to build modules by default.
|
96956 |
19-May-2002 |
marcel |
Simplify IA64_CMPXCHG to avoid having braced-groups in expressions. As a minor positive side-effect, code at -O0 is more optimal. As a minor negative side-effect, certain boundary cases yield no better code than non-boundary cases. For example, atomic_set_acq_32(p, 0) does a useless logical OR with value 0. This was previously elimina- ted as part of if/while optimizations. Non-boundary cases yield identical code at -O1 and -O2.
|
96921 |
19-May-2002 |
marcel |
Add record definition for memory checks.
|
96916 |
19-May-2002 |
peter |
Catch another C++ comment
|
96912 |
19-May-2002 |
marcel |
o Remove namespace pollution from param.h: - Don't include ia64_cpu.h and cpu.h - Guard definitions by _NO_NAMESPACE_POLLUTION - Move definition of KERNBASE to vmparam.h
o Move definitions of IA64_RR_{BASE|MASK} to vmparam.h o Move definitions of IA64_PHYS_TO_RR{6|7} to vmparam.h
o While here, remove some left-over Alpha references.
|
96907 |
19-May-2002 |
marcel |
o Move prototypes for restorectx and savectx from cpu.h to pcb.h, o Remove Alpha specific contents of struct md_coredump.
|
96900 |
19-May-2002 |
marcel |
Remove option ACPI_DEBUG. It causes compile failures in the function tracing bits due to __func__ being declared as const.
|
96899 |
19-May-2002 |
marcel |
Cast dumpsize to long long to match printf format.
|
96755 |
16-May-2002 |
trhodes |
More s/file system/filesystem/g
|
96606 |
14-May-2002 |
phk |
Move MI stuff out of MD param.h files.
It can all still be overridden in the MD files should need suddenly arise.
|
96604 |
14-May-2002 |
phk |
Remove the unused definitions of ctod() and dotc().
|
96495 |
13-May-2002 |
marcel |
s/_ALPHA_/_MACHINE_/
|
96494 |
13-May-2002 |
marcel |
Remove reference to the "Alpha Calling Standard".
|
96487 |
13-May-2002 |
jake |
These were repo-copied to dump_machdep.c.
|
96442 |
12-May-2002 |
marcel |
o Rename ia64_count_aps to ia64_count_cpus and reimplement the function to return the total number of CPUs and not the highest CPU id. o Define mp_maxid based on the minimum of the actual number of CPUs in the system and MAXCPU. o In cpu_mp_add, when the CPU id of the CPU we're trying to add is larger than mp_maxid, don't add the CPU. Formerly this was based on MAXCPU. Don't count CPUs when we add them. We already know how many CPUs exist. o Replace MAXCPU with mp_maxid when used in loops that iterate over the id space. This avoids a couple of useless iterations. o In cpu_mp_unleash, use the number of CPUs to determine if we need to launch the CPUs. o Remove mp_hardware as it's not used anymore. o Move the IPI vector array from mp_machdep.c to sal.c. We use the array as a centralized place to collect vector assignments. Note that we still assign vectors to SMP specific IPIs in non-SMP configurations. Rename the array from mp_ipi_vector to ipi_vector. o Add IPI_MCA_RENDEZ and IPI_MCA_CMCV. These are used by MCA. Note that IPI_MCA_CMCV is not SMP specific. o Initialize the ipi_vector array so that we place the IPIs in sensible priority classes. The classes are relative to where the AP wake-up vector is located to guarantee that it's the highest priority (external) interrupt. Class assignment is as follows: class IPI notes x AP wake-up (normally x=15) x-1 MCA rendezvous x-2 AST, Rendezvous, stop x-3 CMCV, test
|
96336 |
10-May-2002 |
marcel |
Add missing #endif
|
96318 |
10-May-2002 |
obrien |
Gcc 3.1 varargs support.
|
96146 |
07-May-2002 |
marcel |
o Add ar.lc to the pcb. o Create pcb_save as the backend for savectx and cpu_switch. o While here, use explicit bundling for pcb_save and optimize for compactness (~87% density).
o Not part of the commit is a backend pcb_restore. restorectx() still jumps halfway into cpu_switch().
|
96062 |
05-May-2002 |
marcel |
o Add struct mca_guid o Add currently known GUIDs o Slight restyling
|
96061 |
05-May-2002 |
marcel |
o Include md_var.h o Remove definition of struct ia64_fdesc o Remove prototype of os_boot_rendez o Use the FDESC_FUNC and FDESC_GP abstractions
|
96059 |
05-May-2002 |
marcel |
Remove definition of struct ia64_fdesc. It's been moved to md_var.h
|
96058 |
05-May-2002 |
marcel |
o Move definition of struct ia64_fdesc here to remove duplication. o Add prototype of os_boot_rendez.
|
96029 |
04-May-2002 |
dfr |
Use region 7 addresses for the slabs in the PV and PT zones so that we don't confuse the zone allocater by translating region 5 addresses to region 7 addresses (which is unavoidable for PTEs).
|
96019 |
04-May-2002 |
marcel |
Make sure we don't index the pm_rid array out of bounds in pmap_ensure_rid(). This can happen because the function is called for both user and kernel addresses, while the rid array only has room for user addresses. This bug got exposed by rev 1.58 of ia64/ia64/pmap.c and rev 1.8 of ia64/include/pmap.h.
|
95929 |
02-May-2002 |
dfr |
The width of segsz_t should be 64, not 32 on ia64.
|
95920 |
02-May-2002 |
marcel |
In pmap_pinit0, remove duplicate initialization.
|
95919 |
02-May-2002 |
marcel |
PCPU(current_pmap) is initialized in pmap_bootstrap. No need to do it again.
|
95893 |
01-May-2002 |
marcel |
Save the MCA info specific to the AP as part of the AP launch.
|
95892 |
01-May-2002 |
marcel |
Make ia64_mca_save_state MP safe. Protect access to the info block, updating the sysctl tree and clearing the SAL state by a spin lock.
|
95863 |
01-May-2002 |
peter |
Connect up kern_envp *before* we use it for getenv() and console probing. It is a bit late after that when we have no consoles. :-]
Also, fix a comment nit and print a warning about missing metadata.
|
95814 |
30-Apr-2002 |
phk |
Don't export timecounter structures under debug. with sysctl, they contain no truly interesting data anymore.
|
95768 |
30-Apr-2002 |
marcel |
Add ar.lc and ar.ec to the trapframe. These are not saved for syscalls, only for exceptions.
While adding this to exception_save and exception_restore, it was hard to find a good place to put the instructions. The code sequence was sufficiently arbitrarily ordered that the density was low (roughly 67%). No explicit bundling was used. Thus, I rewrote the functions to optimize for density (close to 80% now), and added explicit bundles and nop instructions. The immediate operand on the nop instruction has been incremented with each instance, to make debugging a bit easier when looking at recurring patterns. Redundant stops have been removed as much as possible. Future optimizations can focus more on performance. A well-placed lfetch can make all the difference here!
Also, the FRAME_Fxx defines in frame.h were mostly bogus. FRAME_F10 to FRAME_F15 were copied from FRAME_F9 and still had the same index. We don't use them yet, so nothing was broken.
|
95762 |
30-Apr-2002 |
marcel |
Make this work for ski again. Don't call ia64_mca_init() when we're in the simulator.
|
95761 |
30-Apr-2002 |
marcel |
Include md_var.h. It has the prototype of ia64_running_in_simulator().
|
95760 |
30-Apr-2002 |
marcel |
Remove KTR_EXTEND.
|
95710 |
29-Apr-2002 |
peter |
Tidy up some loose ends. i386/ia64/alpha - catch up to sparc64/ppc: - replace pmap_kernel() with refs to kernel_pmap - change kernel_pmap pointer to (&kernel_pmap_store) (this is a speedup since ld can set these at compile/link time) all platforms (as suggested by jake): - gc unused pmap_reference - gc unused pmap_destroy - gc unused struct pmap.pm_count (we never used pm_count - we track address space sharing at the vmspace)
|
95519 |
26-Apr-2002 |
marcel |
Initialize MCA in cpu_startup() so that it's ready before we wake-up the application processors. This allows us to collect unconsumed AP specific error records as part of the wake-up.
|
95518 |
26-Apr-2002 |
marcel |
MCA specific code has been moved to a seperate file. It is expected to grow enough to be in the way here.
|
95517 |
26-Apr-2002 |
marcel |
Machine Check Architecture (MCA) support code. Error records are collected at boot and made available through sysctl(8). At the moment, the following MIB names are created:
hw.mca.count - The number of error records collected. hw.mca.first - The lowest sequence number present. hw.mca.last - The highest sequence number present. hw.mca.<X> - The error record with sequence number <X>.
Using sysctl(8) allows us to easily detect and analyze the records, which is very helpful during development of MCA but can also be used in production as a way to collect machine health statistics.
|
95515 |
26-Apr-2002 |
marcel |
Machine Check Architecture (MCA) structures and constants.
|
95458 |
25-Apr-2002 |
marcel |
The official name for McKinley is: Itanium 2
|
95410 |
25-Apr-2002 |
marcel |
Don't use the symbol name to lookup the symbol value when we can use the symbol index defined by the relocation. The elf_lookup() support function is to be used by elf_reloc() when symbol lookups need to be done. The elf_lookup() function operates on the symbol index and will do a symbol name based lookup when such is required, otherwise it uses the symbol index directly. This solves the problem seen on ia64 where the symbol hash table does not contain local symbols and a symbol name based lookup would fail for those symbols.
Don't pass the symbol name to elf_reloc(), as it isn't used any more.
|
95245 |
22-Apr-2002 |
marcel |
Add ia64_sal_init_state(). This function will initialize the machine check handling. In its current form, it only determines the largest amount of state information it can get from SAL and allocates a region 7 memory block for it.
The next steps involve: o get and log any unconsumed (NVM stored) error records across reboots, o register an OS_MCA handler and enable machine checks.
|
95244 |
22-Apr-2002 |
marcel |
Add state information types.
|
95231 |
21-Apr-2002 |
marcel |
Fix WAW dependency violation on r17 (line 198) that only exists for the SMP case. While on the subject, remove unnecessary stops. I don't know if this resolves the memory corruption I'm seeing, but it does have the potential. We'll see...
|
95229 |
21-Apr-2002 |
marcel |
Implement elf_reloc(). The RT specification says that we can expect both Elf_Rel and Elf_Rela types of relocation, so handle them both even though we only have Rel_Rela ATM. We don't handle 32-bit and big-endian variants yet. Support for that is not trivial enough to implement it without any evidence that we ever need it in the near future.
For the FPTR relocations, we currently use the fptr_storage used by _reloc() is locore.s. This is in no way a real solution, but for now provides the service we need to get the basics going.
A static recursive function lookup_fdesc() is used to find the address of a function in a way that keeps track of the load module so that we can get the correct GP value if we need to construct an OPD (ie there's no OPD yet for the function.
For simplicity, we create an OPD for the IPLT relocations as well and simply fill the user provided function descriptor from the OPD. Since the the official descriptors are unique, this has no bad side effects. Note that we ignore the addend for FPTR relocations, but use the addend for IPLT relocations as an offset to the function address.
This commit allows us to load and relocate modules and modules appear to work correctly, although we probably need to make sure that we set GP correctly in all cases when we have inter-module calls. This especially applies to assembly coded functions that have cross module calls.
|
95202 |
21-Apr-2002 |
dfr |
Setup the child's return values correctly when forking an IA-32 process.
|
95191 |
21-Apr-2002 |
marcel |
Improve self-relocation and fix ABI misinterpretation. The changes here mostly mirror the changes made in boot/efi/libefi/arch/ia64/start.S rev 1.5
Significant difference: We don't handle the IPLT relocation here. For barebones KLD support, we make the fptr_storage global.
|
95025 |
19-Apr-2002 |
marcel |
Remove the bootinfo kludge. We get the address of the bootinfo block from the loader.
|
95019 |
19-Apr-2002 |
alc |
o Remove vm_map_growstack() from ia64's trap_pfault(). o Remove the acquisition and release of Giant from ia64's trap_pfault(). (vm_fault() still acquires it.)
|
94980 |
18-Apr-2002 |
rwatson |
Since WITNESS doesn't just do mutexes, remove "mutex" from the WITNESS comment in GENERIC config files of appropriate platforms. For whatever reason, powerpc didn't use WITNESS in GENERIC.
|
94936 |
17-Apr-2002 |
mux |
Rework the kernel environment subsystem. We now convert the static environment needed at boot time to a dynamic subsystem when VM is up. The dynamic kernel environment is protected by an sx lock.
This adds some new functions to manipulate the kernel environment : freeenv(), setenv(), unsetenv() and testenv(). freeenv() has to be called after every getenv() when you have finished using the string. testenv() only tests if an environment variable is present, and doesn't require a freeenv() call. setenv() and unsetenv() are self explanatory.
The kenv(2) syscall exports these new functionalities to userland, mainly for kenv(1).
Reviewed by: peter
|
94819 |
16-Apr-2002 |
alc |
Remove code that updates vm->vm_ssize. This duplicates work already performed by vm_map_growstack().
|
94779 |
15-Apr-2002 |
peter |
Fix an "oops!" that turned out to be mostly harmless (but gave a warning). I did this right on the sparc64. Store the direct mapped addresses in the correct variables.
Submitted by: jake
|
94777 |
15-Apr-2002 |
peter |
Pass vm_page_t instead of physical addresses to pmap_zero_page[_area]() and pmap_copy_page(). This gets rid of a couple more physical addresses in upper layers, with the eventual aim of supporting PAE and dealing with the physical addressing mostly within pmap. (We will need either 64 bit physical addresses or page indexes, possibly both depending on the circumstances. Leaving this to pmap itself gives more flexibilitly.)
Reviewed by: jake Tested on: i386, ia64 and (I believe) sparc64. (my alpha was hosed)
|
94642 |
14-Apr-2002 |
marcel |
Dotting the i-s: o Use chunk instead of region when we talk about a memory range. Region can be confused with region register and we already call it chunk in machdep.c o Update the twiddle every 16MB
|
94639 |
14-Apr-2002 |
peter |
Allow a kernel to be compiled with both SKI and acpica and still work on real hardware. (SKI used to break the sapic probes)
|
94628 |
13-Apr-2002 |
alc |
Add comment that sigreturn() is MPSAFE.
|
94496 |
12-Apr-2002 |
dfr |
Initialise ar.cflg, which contains the IA-32 registers cr0 and cr4. Since all IA-32 processes use the same values for cr0 and cr4, we initialise them at system startup.
|
94495 |
12-Apr-2002 |
dfr |
Print extra information in printtrap() if the interrupted state was for an IA-32 process. Don't sign extend arguments in ia32_syscall - its not normally going to be useful (e.g. pointers need to be zero extended).
|
94494 |
12-Apr-2002 |
marcel |
Fix definition of va_start: We don't need to take the address of va_list. It's a builtin type. gcc 3.1 doesn't care either way, but gcc 3.2 is more picky and doesn't like the former.
|
94481 |
12-Apr-2002 |
peter |
Really fix uniprocessor on IA64. Note to self: do not use variables before they are initialized. I had correctly figured out that the UP problem was the pcpu current_pmap thing, but didn't fix it right last time.
|
94380 |
10-Apr-2002 |
dfr |
Initial support for executing IA-32 binaries. This will not compile without a few patches for the rest of the kernel to allow the image activator to override exec_copyout_strings and setregs.
None of the syscall argument translation has been done. Possibly, this translation layer can be shared with any platform that wants to support running ILP32 binaries on an LP64 host (e.g. sparc32 binaries?)
|
94378 |
10-Apr-2002 |
dfr |
Save and restore the IA-32 state in cpu_switch(). Probably should only do this if the thread has been executing IA-32 code.
|
94377 |
10-Apr-2002 |
dfr |
Add suhword() and fuhword() for accessing 32-bit values ("half words") in userland. All these functions should be renamed to be explicit about the size of value being read or written.
|
94376 |
10-Apr-2002 |
dfr |
Add exception and syscall support for executing IA-32 binaries.
|
94375 |
10-Apr-2002 |
dfr |
Add ucode values for SIGFPE etc. Copied from i386/include/signal.h.
|
94374 |
10-Apr-2002 |
dfr |
Add fields for saving/restoring the IA-32 state.
|
94373 |
10-Apr-2002 |
dfr |
Add definitions for IA-32 exceptions, interrupts and intercepts.
|
94365 |
10-Apr-2002 |
dfr |
Call ast() from the syscall exit path as well as for full exception restores.
|
94364 |
10-Apr-2002 |
dfr |
Initialise PCPU_GET(current_pmap) in pmap_bootstrap - cpu_switch needs to be sure that it is always correct and this was not true for the first call to cpu_switch. When thread0 resumed later, it ended up calling pmap_install with a null pmap, which is bad.
|
94363 |
10-Apr-2002 |
mike |
Remove the hack for segsz_t from <sys/types.h>; use the normal _BSD_FOO_T_ method for defining segsz_t.
|
94362 |
10-Apr-2002 |
mike |
Add manifest constants: _LITTLE_ENDIAN, _BIG_ENDIAN, _PDP_ENDIAN, and _BYTE_ORDER. These are far more useful than their non-underscored equivalents as these can be used in restricted namespace environments. Mark the non-underscored variants as deprecated.
|
94275 |
09-Apr-2002 |
phk |
GC various bits and pieces of USERCONFIG from all over the place.
|
94271 |
09-Apr-2002 |
dfr |
Define a complete set of accessors for application and control registers.
|
94270 |
09-Apr-2002 |
dfr |
Don't call make_dev from ssccnattach - its far too early to work properly.
|
94026 |
07-Apr-2002 |
peter |
ia64 depends on ACPICA on actual hardware. It might be worth having a seperate SKI config (like we had SIMOS for alpha).
|
93997 |
06-Apr-2002 |
marcel |
Add prototype for bootpc_init when BOOTP is defined.
|
93961 |
06-Apr-2002 |
dfr |
Merge fixes for dbtob() and btodb() from alpha/include/param.h. This stops ffs_snapshot() from using negative numbers for byte offsets in large file systems.
|
93933 |
06-Apr-2002 |
marcel |
Fix a braino in the alignment of the segment contents after dumping the program headers. As a result of this, dumplo was advanced too much causing the end of the dump and most notably the trailing dump header to be written beyond the end of the the dump medium.
|
93818 |
04-Apr-2002 |
jhb |
Change callers of mtx_init() to pass in an appropriate lock type name. In most cases NULL is passed, but in some cases such as network driver locks (which use the MTX_NETWORK_LOCK macro) and UMA zone locks, a name is used.
Tested on: i386, alpha, sparc64
|
93794 |
04-Apr-2002 |
brian |
Back out the previous commit.
In the i386 case, options BOOTP requires options NFS_ROOT as well as options NFSCLIENT. With *both* the NFS options, a bootpc_init() prototype is brought in by nfsclient/nfsdiskless.h.
In the ia64 case, it just doesn't work and my change just pushes it further away from working.
Suggested to be wrong by: bde
|
93793 |
04-Apr-2002 |
bde |
Moved signal handling and rescheduling from userret() to ast() so that they aren't in the usual path of execution for syscalls and traps. The main complication for this is that we have to set flags to control ast() everywhere that changes the signal mask.
Avoid locking in userret() in most of the remaining cases.
Submitted by: luoqi (first part only, long ago, reorganized by me) Reminded by: dillon
|
93785 |
04-Apr-2002 |
brian |
Pre-declare bootpc_init() so that options BOOTP doesn't break the build in ia64 and i386 due to -Werror.
|
93761 |
04-Apr-2002 |
alc |
o Kill the MD grow_stack(). Call the MI vm_map_growstack() in its place. o Eliminate the use of useracc() and grow_stack() from sendsig().
Reviewed by: peter
|
93757 |
04-Apr-2002 |
marcel |
o Add architecture specific segment types. o Add architecture specific segment attributes.
|
93719 |
03-Apr-2002 |
ru |
Dike out a highly insecure UCONSOLE option. TIOCCONS must be able to VOP_ACCESS() /dev/console to succeed.
Obtained from: OpenBSD
|
93717 |
03-Apr-2002 |
marcel |
Make the kernel dump header endianness invariant by always dumping in dump byte order (=network byte order). Swap blocksize and dumptime to avoid extraneous padding on 64-bit architectures. Use CTASSERT instead of runtime checks to make sure the header is 512 bytes large. Various style(9) fixes.
Reviewed by: phk, bde, mike
|
93713 |
03-Apr-2002 |
marcel |
o GC dumplo o Replace the string lit. "ia64" with MACHINE
|
93712 |
03-Apr-2002 |
marcel |
Use a twiddle to show that we're busy dumping. The initial code emitted the total number of pages it still had to dump prior to dumping a block of up to 16 pages. For a 128MB region this would result in 8M number of printf()s. Barf!
The problem in general is that memory typically has one really big region and a number of "scattered" smaller regions. Some may even be just a few pages. The twiddle works best for now, but it doesn't really give a good progress indication for the large regions. Those are the cases where you definitely want good PI to avoid having the user turn into a twiddle :-)
|
93702 |
02-Apr-2002 |
jhb |
- Move the MI mutexes sched_lock and Giant from being declared in the various machdep.c's to being declared in kern_mutex.c. - Add a new function mutex_init() used to perform early initialization needed for mutexes such as setting up thread0's contested lock list and initializing MI mutexes. Change the various MD startup routines to call this function instead of duplicating all the code themselves.
Tested on: alpha, i386
|
93647 |
02-Apr-2002 |
marcel |
Initial implementation of the ia64 kernel dumper. The dumper constructs an ELF image, consisting of the ELF header, for each memory region a program header, followed by the memory contents for each region. It does blocked I/O for the headers as they are typically smaller than DEV_BSIZE.
|
93627 |
02-Apr-2002 |
marcel |
o GC totalphysmem and resvmem. o Rephrase comment describing that the memory region can contain the kernel.
|
93607 |
01-Apr-2002 |
dillon |
Stage-2 commit of the critical*() code. This re-inlines cpu_critical_enter() and cpu_critical_exit() and moves associated critical prototypes into their own header file, <arch>/<arch>/critical.h, which is only included by the three MI source files that need it.
Backout and re-apply improperly comitted syntactical cleanups made to files that were still under active development. Backout improperly comitted program structure changes that moved localized declarations to the top of two procedures. Partially re-apply one of the program structure changes to move 'mask' into an intermediate block rather then in three separate sub-blocks to make the code more readable. Re-integrate bug fixes that Jake made to the sparc64 code.
Note: In general, developers should not gratuitously move declarations out of sub-blocks. They are where they are for reasons of structure, grouping, readability, compiler-localizability, and to avoid developer-introduced bugs similar to several found in recent years in the VFS and VM code.
Reviewed by: jake
|
93593 |
01-Apr-2002 |
jhb |
Change the suser() API to take advantage of td_ucred as well as do a general cleanup of the API. The entire API now consists of two functions similar to the pre-KSE API. The suser() function takes a thread pointer as its only argument. The td_ucred member of this thread must be valid so the only valid thread pointers are curthread and a few kernel threads such as thread0. The suser_cred() function takes a pointer to a struct ucred as its first argument and an integer flag as its second argument. The flag is currently only used for the PRISON_ROOT flag.
Discussed on: smp@
|
93467 |
31-Mar-2002 |
phk |
Centralize the "bootdev" and "dumpdev" variables. They are still pretty bogus all things considered, but at least now they don't camouflage as being MD variables.
|
93458 |
30-Mar-2002 |
marcel |
Transition to a model where the loader passes the address of the bootinfo block in register r8. In locore.s we save the address in the global variable 'pa_bootinfo'. In machdep.c we compare this value against the hardwired address, but don't depend on its validity yet (ie: we still expect the bootinfo block to be at the hardwired address). After a small amount of time, we'll flip the switch and depend on the loader to pass us the address. From that moment on the loader is free to put it anywhere it likes, provided the machine itself likes it as well.
Add some verbosity to aid in the transition. We emit a message if the loader didn't pass the address and we also emit a message if there's no bootinfo block at the hardwired address.
While in locore.s, reduce the number of redundant serialization instructions. A srlz.i is a proper superset of a srlz.d and thus is a valid replacement. Also slightly reorder the movl instructions to improve bundle density.
|
93389 |
29-Mar-2002 |
jake |
Remove abuse of intr_disable/restore in MI code by moving the loop in ast() back into the calling MD code. The MD code must ensure no races between checking the astpening flag and returning to usermode.
Submitted by: peter (ia64 bits) Tested on: alpha (peter, jeff), i386, ia64 (peter), sparc64
|
93312 |
28-Mar-2002 |
obrien |
style(9)
Approved by: jake
|
93273 |
27-Mar-2002 |
jeff |
Add a new mtx_init option "MTX_DUPOK" which allows duplicate acquires of locks with this flag. Remove the dup_list and dup_ok code from subr_witness. Now we just check for the flag instead of doing string compares.
Also, switch the process lock, process group lock, and uma per cpu locks over to this interface. The original mechanism did not work well for uma because per cpu lock names are unique to each zone.
Approved by: jhb
|
93264 |
27-Mar-2002 |
dillon |
Compromise for critical*()/cpu_critical*() recommit. Cleanup the interrupt disablement assumptions in kern_fork.c by adding another API call, cpu_critical_fork_exit(). Cleanup the td_savecrit field by moving it from MI to MD. Temporarily move cpu_critical*() from <arch>/include/cpufunc.h to <arch>/<arch>/critical.c (stage-2 will clean this up).
Implement interrupt deferral for i386 that allows interrupts to remain enabled inside critical sections. This also fixes an IPI interlock bug, and requires uses of icu_lock to be enclosed in a true interrupt disablement.
This is the stage-1 commit. Stage-2 will occur after stage-1 has stabilized, and will move cpu_critical*() into its own header file(s) + other things. This commit may break non-i386 architectures in trivial ways. This should be temporary.
Reviewed by: core Approved by: core
|
93256 |
27-Mar-2002 |
marcel |
o Revert previous commit in asm.h. There's no need to undefine __FBSDID first, because it should not be defined at all, o Remove inclusion of cdefs.h in locore.s.
Pointed out by: peter
|
93192 |
26-Mar-2002 |
obrien |
Get the guarding right. The IA-64 has a different organization for this than our other platforms.
|
93092 |
24-Mar-2002 |
obrien |
Guard against redefining __gnuc_va_list.
|
93088 |
24-Mar-2002 |
marcel |
Undefine __FBSDID before defining it as it's already defined at that point.
|
92998 |
23-Mar-2002 |
obrien |
ASM versions of __FBSDID.
|
92870 |
21-Mar-2002 |
dfr |
Change critical_t to register_t for intr_disable/restore.
|
92869 |
21-Mar-2002 |
dfr |
Change cpu_critical_enter/exit to intr_disable/restore.
|
92865 |
21-Mar-2002 |
peter |
In UP mode, the primary cpu's per-cpu current_pmap was not initialized - this was only done as a side effect of calling cpu_mp_start(). I haven't actually tested that this fixes UP kernels, but it feels about right.
|
92851 |
21-Mar-2002 |
jeff |
Remove references to vm_zone.h and switch over to the new uma API.
Approved by: peter
|
92843 |
20-Mar-2002 |
alfred |
Remove __P.
Reviewd by: peter
|
92824 |
20-Mar-2002 |
jhb |
Change the way we ensure td_ucred is NULL if DIAGNOSTIC is defined. Instead of caching the ucred reference, just go ahead and eat the decerement and increment of the refcount. Now that Giant is pushed down into crfree(), we no longer have to get Giant in the common case. In the case when we are actually free'ing the ucred, we would normally free it on the next kernel entry, so the cost there is not new, just in a different place. This also removse td_cache_ucred from struct thread. This is still only done #ifdef DIAGNOSTIC.
Tested on: i386, alpha
|
92805 |
20-Mar-2002 |
dfr |
Change intr_enable to intr_restore for consistency with sparc64.
|
92782 |
20-Mar-2002 |
dfr |
Replace calls to cpu_critical_enter/exit with appropriate calls to either explicitly disable interrupts or use a real critical section, as appropriate.
|
92781 |
20-Mar-2002 |
dfr |
Recreate intr_disable/intr_enable and implement cpu_critical_enter/exit in terms of that (for now).
|
92696 |
19-Mar-2002 |
peter |
#if 0 some unused variables (only in #if 0 code)
|
92679 |
19-Mar-2002 |
peter |
Enabling the SKI option is a guaranteed breakage for me. Interrupts no longer work. I can only get a box to boot with 'options SMP'.
|
92677 |
19-Mar-2002 |
peter |
My ia64 box for some reason likes to fragment the beginning/end of memory a bit before handing it over to the OS. I occasionally have 11 segments with several 8K or so fragments depending on nvram settings and what I have done under loader(8) before booting. This needs to be revisited.
|
92676 |
19-Mar-2002 |
peter |
Fix some unused variables.
|
92675 |
19-Mar-2002 |
peter |
Move a couple of prototypes together instead of being incompletely scattered around.
|
92674 |
19-Mar-2002 |
peter |
__func__ is a const char *, not a "string" that can be concatenated.
|
92673 |
19-Mar-2002 |
peter |
Fix a pointer/int warning
|
92672 |
19-Mar-2002 |
peter |
#ifdef SMP some variables that are only used elsewhere under #ifdef SMP also.
|
92671 |
19-Mar-2002 |
peter |
Work around an apparent compiler bug with gcc-3.1, although it might be a language feature that I do not know about. gcc is complaining about a left shift >= sizeof type, even when shifting a (cast) 64 bit type left by 43 bits.
|
92670 |
19-Mar-2002 |
peter |
Believe it or not, I ran into the 32MB stack size limit using a natively hosted gcc.
|
92669 |
19-Mar-2002 |
peter |
#if 0 out some unused code.
|
92668 |
19-Mar-2002 |
peter |
Add some #includes after things got broken with the last round of MI include file (<sys/smp.h> I think) tweaks.
|
92667 |
19-Mar-2002 |
peter |
Turn off the ia64 ITC timecounter when SMP is present since it has the same problem as the TSC on the x86 - ie: it is not synchronized. #if 0 out some unused functions, ia64 doesn't calibrate clocks yet.
|
92654 |
19-Mar-2002 |
jeff |
This is the first part of the new kernel memory allocator. This replaces malloc(9) and vm_zone with a slab like allocator.
Reviewed by: arch@
|
92552 |
18-Mar-2002 |
dfr |
Fix spelling.
|
92383 |
16-Mar-2002 |
des |
Move the definition of PT_[GS]ET{,DB,FP}REGS from the MD ptrace.h to the MI ptrace.h, since all platforms define them. Keep the MD ptrace.h around for FIX_SSTEP (which is currently only needed on Alpha).
|
92321 |
15-Mar-2002 |
dfr |
* Stop other cpus when one cpu enters DDB and restart them after it leaves. * Add a sync.i instruction to the code which writes out breakpoints to ensure that the breakpoint is seem by all cpus in the coherence domain.
|
92318 |
15-Mar-2002 |
dfr |
* Remove a breakpoint() I accidentally left in for debugging :-(. * Make cpu_mp_probe() work before the VM system is available and initialise mp_maxid accordingly.
|
92287 |
14-Mar-2002 |
dfr |
Tweak the AP startup code somewhat. With all the other recent changes, this now works pretty well for two processors at least.
Submitted by: marcel, mostly.
|
92286 |
14-Mar-2002 |
dfr |
* Initialise pcb_pmap for new threads. * Add support for forking new threads from &thread0 as well as curthread.
|
92285 |
14-Mar-2002 |
dfr |
* Save and restore PCPU_GET(current_pmap) in pcb_pmap so that we don't lose if a process is preempted while pmap is temporarily switched to another pmap. * For SMP, drop the high-fp state when a thread is switched away from so that if another cpu resumes that thread, it doesn't have to play games with IPI to get ahold of the correct register values.
|
92281 |
14-Mar-2002 |
dfr |
Add pcpu.pc_current_pmap and pcb.pcb_pmap.
|
92280 |
14-Mar-2002 |
dfr |
Add a field to hold the current pmap of a thread.
|
92271 |
14-Mar-2002 |
dfr |
Add ia64_sync_i(), ia64_get_tpr() and ia64_set_tpr().
|
92268 |
14-Mar-2002 |
dfr |
* Add some KTR messages for IPIs. * Don't call ast() from interrupt() - if we switch, then we will miss writing cr.eoi which will prevent the current cpu from receiving interrupts until the current thread is resumed. The call to ast() happens magically in exception_restore where it is safe. * Add DDB 'show irq' command to examine interrupt hardware state.
|
92267 |
14-Mar-2002 |
dfr |
Add debug code to print SAPIC registers.
|
92263 |
14-Mar-2002 |
dfr |
* Use a mutex to protect the RID allocator. * Use ptc.g instead of ptc.l so that TLB shootdowns are broadcast to the coherence domain. * Use smp_rendezvous for pmap_invalidate_all to ensure it happens on all cpus. * Dike out a DIAGNOSTIC printf which didn't compile. * Protect the internals of pmap_install with cpu_critical_enter/exit.
|
92262 |
14-Mar-2002 |
dfr |
Move the call to pmap_bootstrap to after the initialisation of thread0. This allows us to use mutexes in pmap safely. Also initialise fpcurthread for cpu0 so that ia64_fpstate_check doesn't barf during boot.
|
92249 |
14-Mar-2002 |
dfr |
Don't restore r13 when returning to kernel mode. We may have migrated to a different cpu since the exception_save and r13 needs to point at the current cpu's pcpu structure.
|
92124 |
12-Mar-2002 |
peter |
Fix some -Wunused warnings by "using" a macro argument
|
92123 |
12-Mar-2002 |
peter |
Fix a warning (make ucontext_t *ucp a const)
|
92122 |
12-Mar-2002 |
peter |
Stop concatenating __func__ with strings
|
92121 |
12-Mar-2002 |
peter |
Deal with a structure member rename in a recent acpica import
|
92105 |
11-Mar-2002 |
jhb |
Fix a misspelling of mine: s/optomization/optimization/.
Noticed by: bmilekic
|
92020 |
10-Mar-2002 |
dfr |
Add an implementation of cpu_throw() and make restorectx() simply branch to the tail of cpu_switch.
|
92019 |
10-Mar-2002 |
dfr |
Don't try to print the arguments if the value of bsp is outside the kernel - its asking for trouble.
|
92010 |
10-Mar-2002 |
dfr |
Use the right value for the region length in parse_spill_mask.
|
91959 |
09-Mar-2002 |
mike |
o Don't require long long support in bswap64() functions. o In i386's <machine/endian.h>, macros have some advantages over inlines, so change some inlines to macros. o In i386's <machine/endian.h>, ungarbage collect word_swap_int() (previously __uint16_swap_uint32), it has some uses on i386's with PDP endianness.
Submitted by: bde
o Move a comment up in <machine/endian.h> that was accidentially moved down a few revisions ago. o Reenable userland's use of optimized inline-asm versions of byteorder(3) functions. o Fix ordering of prototypes vs. redefinition of byteorder(3) functions, so that the non-GCC (libc asm) case has proper prototypes. o Add proper prototypes for byteorder(3) functions in <sys/param.h>. o Prevent redundant duplicate prototypes by making use of the _BYTEORDER_PROTOTYPED define. o Move the bswap16(), bswap32(), bswap64() C functions into MD space for platforms in which asm versions don't exist. This significantly reduces the complexity of some things at the cost of duplicate code.
Reviewed by: bde
|
91779 |
07-Mar-2002 |
jake |
Include machine/smp.h.
|
91669 |
05-Mar-2002 |
marcel |
Call ast() only when we're handling a user trap.
|
91639 |
04-Mar-2002 |
dfr |
Add PSEUDOFS.
|
91635 |
04-Mar-2002 |
dfr |
Add emulation support for PAL_VM_SUMMARY.
|
91598 |
03-Mar-2002 |
dfr |
* Include <sys/ucontext.h> so that this compiles again. * Move the section which manipulates ia64_pal_base to after cninit() so that we don't risk printing anything before we have a console. * Don't call ia64_probe_sapics() for a SKI build. This should really be dependant on ACPICA being present or something.
|
91504 |
28-Feb-2002 |
arr |
- Move a comment from being on the same line as a #ifdef to the line following it. This should have gone in the previous commit, but misviewed Bruce's patch.
Requested by: bde
|
91475 |
28-Feb-2002 |
arr |
- Fix panic() message and a couple style nits that snuck in from the recent diagnostics commit (rev. 1.84).
|
91406 |
27-Feb-2002 |
jhb |
Simple p_ucred -> td_ucred changes to start using the per-thread ucred reference.
|
91403 |
27-Feb-2002 |
silby |
Fix a horribly suboptimal algorithm in the vm_daemon.
In order to determine what to page out, the vm_daemon checks reference bits on all pages belonging to all processes. Unfortunately, the algorithm used reacted badly with shared pages; each shared page would be checked once per process sharing it; this caused an O(N^2) growth of tlb invalidations. The algorithm has been changed so that each page will be checked only 16 times.
Prior to this change, a fork/sleepbomb of 1300 processes could cause the vm_daemon to take over 60 seconds to complete, effectively freezing the system for that time period. With this change in place, the vm_daemon completes in less than a second. Any system with hundreds of processes sharing pages should benefit from this change.
Note that the vm_daemon is only run when the system is under extreme memory pressure. It is likely that many people with loaded systems saw no symptoms of this problem until they reached the point where swapping began.
Special thanks go to dillon, peter, and Chuck Cranor, who helped me get up to speed with vm internals.
PR: 33542, 20393 Reviewed by: dillon MFC after: 1 week
|
91394 |
27-Feb-2002 |
tmm |
Add the following functions/macros to support byte order conversions and device drivers for bus system with other endinesses than the CPU (using interfaces compatible to NetBSD):
- bwap16() and bswap32(). These have optimized implementations on some architectures; for those that don't, there exist generic implementations. - macros to convert from a certain byte order to host byte order and vice versa, using a naming scheme like le16toh(), htole16(). These are implemented using the bswap functions. - stream bus space access functions, which do not perform a byte order conversion (while the normal access functions would if the bus endianess differs from the CPU endianess).
htons(), htonl(), ntohs() and ntohl() are implemented using the new functions above for kernel usage. None of the above interfaces is currently exported to user land.
Make use of the new functions in a few places where local implementations of the same functionality existed.
Reviewed by: mike, bde Tested on alpha by: mike
|
91090 |
22-Feb-2002 |
julian |
Add some DIAGNOSTIC code. While in userland, keep the thread's ucred reference in a shadow field so that the usual place to store it is NULL. If DIAGNOSTIC is not set, the thread ucred is kept valid until the next kernel entry, at which time it is checked against the process cred and possibly corrected. Produces a BIG speedup in kernels with INVARIANTS set. (A previous commit corrected it for the non INVARIANTS case already)
Reviewed by: dillon@freebsd.org
|
91066 |
22-Feb-2002 |
phk |
Convert p->p_runtime and PCPU(switchtime) to bintime format.
|
90892 |
19-Feb-2002 |
julian |
Duplicate the changes to i386 to keep creds over the user boundary.
|
90885 |
19-Feb-2002 |
mike |
Add C++ support.
|
90868 |
18-Feb-2002 |
mike |
o Move NTOHL() and associated macros into <sys/param.h>. These are deprecated in favor of the POSIX-defined lowercase variants. o Change all occurrences of NTOHL() and associated marcros in the source tree to use the lowercase function variants. o Add missing license bits to sparc64's <machine/endian.h>. Approved by: jake o Clean up <machine/endian.h> files. o Remove unused __uint16_swap_uint32() from i386's <machine/endian.h>. o Remove prototypes for non-existent bswapXX() functions. o Include <machine/endian.h> in <arpa/inet.h> to define the POSIX-required ntohl() family of functions. o Do similar things to expose the ntohl() family in libstand, <netinet/in.h>, and <sys/param.h>. o Prepend underscores to the ntohl() family to help deal with complexities associated with having MD (asm and inline) versions, and having to prevent exposure of these functions in other headers that happen to make use of endian-specific defines. o Create weak aliases to the canonical function name to help deal with third-party software forgetting to include an appropriate header. o Remove some now unneeded pollution from <sys/types.h>. o Add missing <arpa/inet.h> includes in userland.
Tested on: alpha, i386 Reviewed by: bde, jake, tmm
|
90711 |
15-Feb-2002 |
wollman |
Resurrect one of the easiest changes from my big include files roll-up patch from a year ago: give file flags their own type. This does not (yet) change the type used by system calls or library functions. The underlying type was chosen to match what is returned by stat().
|
90598 |
13-Feb-2002 |
rwatson |
Remove WITNESS from GENERIC by default: as we grow more locks, this gets slower, and may be impeding adoption of -CURRENT by developers. We recommend turning on WITNESS by default on crash boxes, and when doing locking development. It will probably get turned on by default for a week or two following any major locking commits, also.
Approved by: all and sundry (jhb, phk, ...)
|
90361 |
07-Feb-2002 |
julian |
Pre-KSE/M3 commit. this is a low-functionality change that changes the kernel to access the main thread of a process via the linked list of threads rather than assuming that it is embedded in the process. It IS still embeded there but remove all teh code that assumes that in preparation for the next commit which will actually move it out.
Reviewed by: peter@freebsd.org, gallatin@cs.duke.edu, benno rice,
|
90344 |
07-Feb-2002 |
phk |
GC the PC_SWITCH* symbols which are not used in assembly anymore.
|
90065 |
01-Feb-2002 |
bde |
Compile osigreturn() unconditionally since it will always be needed on some arches and the syscall table is machine-independent. It was (bogusly) conditional on COMPAT_43, so this usually makes no difference.
ia64: in addition: - replace the bogus cloned comment before osigreturn() by a correct one. osigreturn() is just a stub fo ia64's. - fix the formatting of cloned comment before sigreturn(). - fix the return code. use nosys() instead of returning ENOSYS to get the same semantics as if the syscall is not in the syscall table. Generating SIGSYS is actually correct here. - fix style bugs.
powerpc: copy the cleaned up ia64 stub. This mainly fixes a bogus comment.
sparc64: copy the cleaned up the ia64 stub, since there was no stub before.
|
89493 |
18-Jan-2002 |
marcel |
Add a definition of ddb_regs. ddb_regs is declared as extern in db_machdep.h to fix the link failure (multiple definitions) caused by disabling the emission of common symbols. As a result, there were no definitions at all. While here, remove useless declarations.
|
89492 |
18-Jan-2002 |
marcel |
Remove the definition of bootverbose. This fixes the link failure caused by disabling the emission of common symbols.
|
89491 |
18-Jan-2002 |
marcel |
Declare ddb_regs as extern to avoid creating a tentative definition. This fixes the link failure caused by disabling the emission of common symbols.
|
88903 |
05-Jan-2002 |
peter |
Convert a bunch of 1 << PCPU_GET(cpuid) to PCPU_GET(cpumask).
|
88900 |
05-Jan-2002 |
jhb |
Change the preemption code for software interrupt thread schedules and mutex releases to not require flags for the cases when preemption is not allowed:
The purpose of the MTX_NOSWITCH and SWI_NOSWITCH flags is to prevent switching to a higher priority thread on mutex releease and swi schedule, respectively when that switch is not safe. Now that the critical section API maintains a per-thread nesting count, the kernel can easily check whether or not it should switch without relying on flags from the programmer. This fixes a few bugs in that all current callers of swi_sched() used SWI_NOSWITCH, when in fact, only the ones called from fast interrupt handlers and the swi_sched of softclock needed this flag. Note that to ensure that swi_sched()'s in clock and fast interrupt handlers do not switch, these handlers have to be explicitly wrapped in critical_enter/exit pairs. Presently, just wrapping the handlers is sufficient, but in the future with the fully preemptive kernel, the interrupt must be EOI'd before critical_exit() is called. (critical_exit() can switch due to a deferred preemption in a fully preemptive kernel.)
I've tested the changes to the interrupt code on i386 and alpha. I have not tested ia64, but the interrupt code is almost identical to the alpha code, so I expect it will work fine. PowerPC and ARM do not yet have interrupt code in the tree so they shouldn't be broken. Sparc64 is broken, but that's been ok'd by jake and tmm who will be fixing the interrupt code for sparc64 shortly.
Reviewed by: peter Tested on: i386, alpha
|
88727 |
30-Dec-2001 |
marcel |
Revert previous definition of cpu_throw(). Non-MP configurations were broken as well.
|
88695 |
30-Dec-2001 |
marcel |
Better implement SMP support: o Do not use a special struct to keep track of CPUs we found; instead, use struct pcpu. This handles all the magic WRT thread creation (yay!). o Respect MAXCPU. o Use the vhpt_base and vhpt_size values to initialize the AP. o Style fixes.
Note that this commit temporarily breaks SMP configurations. Previously APs didn't do anything, but they now enter the scheduler. They hold sched_lock for more than 5 secs though and cause a panic. That's what I call progress :-)
|
88693 |
30-Dec-2001 |
marcel |
o Reimplement map_pal_code to work with a global variable ia64_pal_base instead of scanning the EFI tables. This way AP startup code can more easily use the function. o Initialize ia64_pal_base in ia64_init(). When the PAL code doesn't need explicit mapping or no PAL code has been found, ia64_pal_base will be 0. o Remove some unused global variables. o Also in ia64_init(), allocate only 1 page for struct pcpu and remove some Alpha leftovers. o Initialize pc_pcb in cpu_pcpu_init().
|
88692 |
30-Dec-2001 |
marcel |
Make vhpt_base and vhpt_size globals so that they can be used by the AP startup code.
|
88691 |
30-Dec-2001 |
marcel |
Cleanup the IPIs.
|
88690 |
30-Dec-2001 |
marcel |
Remove unused MD fields (pc_pending_ipis, pc_next_asn and pc_current_asngen) and add SMP specific fields (pc_pcb, pc_lid and pc_awake).
|
88689 |
30-Dec-2001 |
marcel |
o Remove temporary implementation of cpu_throw in vm_machdep.c and instead make it an alternate entry-point of cpu_switch() in swtch.s o Add SMP support to cpu_switch().
|
88687 |
30-Dec-2001 |
marcel |
Draft implementation of IPI handling.
|
88686 |
30-Dec-2001 |
marcel |
Add PC_IDLETHREAD. We need it in cpu_switch.
|
88685 |
30-Dec-2001 |
marcel |
Add missing predicate in interruption_Data_TLB. Without this predicate we never used the VHPT entry we found.
While here, normalize the compares.
|
88441 |
23-Dec-2001 |
dfr |
Fix CRITICAL_FORK so that it compiles.
|
88376 |
21-Dec-2001 |
tmm |
Use the new resource_list_print_type() function. Pass the bus device to isa_init() (this is needed for the sparc64 version).
|
88245 |
20-Dec-2001 |
peter |
Replace a bunch of: for (pv = TAILQ_FIRST(&m->md.pv_list); pv; pv = TAILQ_NEXT(pv, pv_list)) { with: TAILQ_FOREACH(pv, &m->md.pv_list, pv_list) {
|
88088 |
18-Dec-2001 |
jhb |
Modify the critical section API as follows: - The MD functions critical_enter/exit are renamed to start with a cpu_ prefix. - MI wrapper functions critical_enter/exit maintain a per-thread nesting count and a per-thread critical section saved state set when entering a critical section while at nesting level 0 and restored when exiting to nesting level 0. This moves the saved state out of spin mutexes so that interlocking spin mutexes works properly. - Most low-level MD code that used critical_enter/exit now use cpu_critical_enter/exit. MI code such as device drivers and spin mutexes use the MI wrappers. Note that since the MI wrappers store the state in the current thread, they do not have any return values or arguments. - mtx_intr_enable() is replaced with a constant CRITICAL_FORK which is assigned to curthread->td_savecrit during fork_exit().
Tested on: i386, alpha
|
87894 |
14-Dec-2001 |
iedowse |
Enable UFS_DIRHASH in the GENERIC kernel.
Suggested by: silby Reviewed by: dillon MFC after: 5 days
|
87702 |
11-Dec-2001 |
jhb |
Overhaul the per-CPU support a bit:
- The MI portions of struct globaldata have been consolidated into a MI struct pcpu. The MD per-CPU data are specified via a macro defined in machine/pcpu.h. A macro was chosen over a struct mdpcpu so that the interface would be cleaner (PCPU_GET(my_md_field) vs. PCPU_GET(md.md_my_md_field)). - All references to globaldata are changed to pcpu instead. In a UP kernel, this data was stored as global variables which is where the original name came from. In an SMP world this data is per-CPU and ideally private to each CPU outside of the context of debuggers. This also included combining machine/globaldata.h and machine/globals.h into machine/pcpu.h. - The pointer to the thread using the FPU on i386 was renamed from npxthread to fpcurthread to be identical with other architectures. - Make the show pcpu ddb command MI with a MD callout to display MD fields. - The globaldata_register() function was renamed to pcpu_init() and now init's MI fields of a struct pcpu in addition to registering it with the internal array and list. - A pcpu_destroy() function was added to remove a struct pcpu from the internal array and list.
Tested on: alpha, i386 Reviewed by: peter, jake
|
87599 |
10-Dec-2001 |
obrien |
Update to C99, s/__FUNCTION__/__func__/, also don't use ANSI string concatenation.
|
87572 |
09-Dec-2001 |
obrien |
style(9)
|
87546 |
09-Dec-2001 |
dillon |
Allow maxusers to be specified as 0 in the kernel config, which will cause the system to auto-size to between 32 and 512 depending on the amount of memory.
MFC after: 1 week
|
87454 |
06-Dec-2001 |
jhb |
Add multiple inclusion protection.
|
87341 |
04-Dec-2001 |
des |
PROCFS requires PSEUDOFS.
|
87158 |
01-Dec-2001 |
mike |
o Stop abusing MD headers with non-MD types. o Hide nonstandard functions and types in <netinet/in.h> when _POSIX_SOURCE is defined. o Add some missing types (required by POSIX.1-200x) to <netinet/in.h>. o Restore vendor ID from Rev 1.1 in <netinet/in.h> and make use of new __FBSDID() macro. o Fix some miscellaneous issues in <arpa/inet.h>. o Correct final argument for the inet_ntop() function (POSIX.1-200x). o Get rid of the namespace pollution from <sys/types.h> in <arpa/inet.h>.
Reviewed by: fenner Partially submitted by: bde
|
87119 |
30-Nov-2001 |
dfr |
* Don't use critical_enter/critical_exit when accessing the VHPT - its pointless and would be inadequate for SMP systems. We will rely on the VM system's locks to serialise this for now. * Change pmap_remove() so that if the range being removed is larger than the number of pages mapped by the pmap, we iterate over the currently mapped pages instead of over the virtual address range. This should make a difference when removing large virtual address ranges from an address space.
|
86951 |
27-Nov-2001 |
dfr |
Minor tweaks to the TLB handling code - avoid movl instructions and add itc.x instructions to attempt to avoid the little flurry of TLB exceptions for handling access, dirty etc.
|
86593 |
19-Nov-2001 |
peter |
s/code/ucode/ (last minute typo)
|
86592 |
19-Nov-2001 |
peter |
Initial cut at calling the EFI-provided FPSWA (Floating Point Software Assist) driver to handle the "messy" floating point cases which cause traps to the kernel for handling.
|
86587 |
19-Nov-2001 |
peter |
Use some (now) spare space for passing through a pointer to the FPSWA Interface provided by EFI (Floating Point SoftWare Assist).
|
86586 |
19-Nov-2001 |
peter |
Remove bootinfo.bi_kernel. It isn't used by the kernel. struct bootinfo should go away on ia64, we should be loader metadata based since that is the only way we can boot (loader, skiload).
|
86443 |
16-Nov-2001 |
peter |
Oops, I accidently merged a whitespace error from the original commit. (whitespace at end of line in rev 1.264 pmap.c). Fix them all.
|
86442 |
16-Nov-2001 |
peter |
Merge rev 1.264 from i386/pmap.c (tegge via alfred): Protect against an infinite loop when prefaulting pages. This can happen when the vm system maps past the end of an object or tries to map a zero length object, the pmap layer misses the fact that offsets wrap into negative numbers and we get stuck.
|
86441 |
16-Nov-2001 |
peter |
Merge rev 1.202 from i386/pmap.c (back in 1998 by John Dyson): Make flushing dirty pages work correctly on filesystems that unexpectedly do not complete writes even with sync I/O requests. This should help the behavior of mmaped files when using softupdates (and perhaps in other circumstances also.)
|
86440 |
16-Nov-2001 |
peter |
Merge rev 1.293 of i386/pmap.c - skip PG_UNMANAGED in pmap_collect()
|
86438 |
16-Nov-2001 |
peter |
Converge with i386/pmap.c - dont refer to curproc, use curthread.
|
86435 |
16-Nov-2001 |
peter |
As part of a general cleanup and reconvergence of related pmap code, start tidying up some loose ends. The DEBUG_VA stuff has long since passed its use-by date. It wasn't used on ia64 but got cut/pasted there.
|
86294 |
12-Nov-2001 |
peter |
Implement eficlock_set() to set hardware clock.
|
86291 |
12-Nov-2001 |
marcel |
o os_boot_rendez is responsible for clearing the IRR bit by reading cr.ivr, as well as writing to cr.eoi. o use global variables to pass information to os_boot_rendez so that it doesn't have to jump through hoops to find it out. This avoids traps on the AP without it even being initialized. This fixes SMP configurations. o Move the probing of the MADT to the end of cpu_startup, instead of at the start of cpu_mp_probe. We need to probe the MADT for non-SMP configurations as well. This fixes uniprocessor configurations. o Serialize AP wake-up by waiting for the AP. We need to do this since we use global variables to for the AP to use. As a side-effect, we can use printf() more easily to see what's going on.
|
86290 |
12-Nov-2001 |
marcel |
Invoke trap() for the alt. ITLB and alt. DTLB interrrupts when the region is not 6 or 7. This changes the behaviour from inserting a bogus region 6 mapping to a kernel panic.
|
86286 |
12-Nov-2001 |
peter |
Remove #if 0'ed code that was replaced by vm_ksubmap_init() and GC'ed on other platforms.
|
86239 |
10-Nov-2001 |
marcel |
Avoid using the .align directive to skip to the next vector offset. It doesn't help us catch overflowing vector entries at compile time. Instead use the .org directive. The last entry in the IVT doesn't strictly need to be limited to 256 bytes, but doing so allows the the VHPT to be placed immediately following the IVT without wasting any space due to alignment.
|
86213 |
09-Nov-2001 |
dfr |
* Make sure we increment pm_stats.resident_count in pmap_enter_quick * Re-organise RID allocation so that we don't accidentally give a RID to two different processes. Also randomise the order to try to reduce collisions in VHPT and TLB. Don't allocate RIDs for regions which are unused. * Allocate space for VHPT based on the size of physical memory. More tuning is needed here. * Add sysctl instrumentation for VHPT - see sysctl vm.stats.vhpt * Fix a bug in pmap_prefault() which prevented it from actually adding pages to the pmap. * Remove ancient dead debugging code. * Add DDB commands for examining translation registers and region registers.
The first change fixes the 'free/cache page %p was dirty' panic which I have been seeing when the system is put under moderate load. It also fixes the negative RSS values in ps which have been confusing me for a while.
With this set of changes the ia64 port is reliable enough to build its own kernels, even with a 20-way parallel build. Next stop buildworld.
|
86212 |
09-Nov-2001 |
dfr |
Raise SIGILL for General Exceptions - its closer to the correct meaning.
|
86211 |
09-Nov-2001 |
dfr |
Reserve more space for phys_avail. Really need to be more careful about overflowing phys_avail.
|
86210 |
09-Nov-2001 |
dfr |
Teach DDB about branch registers.
|
86209 |
09-Nov-2001 |
dfr |
Define PS and VE fields of region register correctly.
|
86204 |
09-Nov-2001 |
marcel |
Implement os_boot_rendez. Application processors are initialized and brought to a point where kernel specific initializations can be done. That will be the next step...
|
86069 |
05-Nov-2001 |
marcel |
Don't pass os_boot_rendez directly to SAL_SET_VECTORS, because it's actually the address of the function descriptor. The fdesc has both the address of the function and it's corresponding gp value. Now that we have a gp value, use it instead of passing 0.
|
85973 |
03-Nov-2001 |
dfr |
Implement <machine/ieeefp.h>
|
85930 |
03-Nov-2001 |
dillon |
Implement i386/i386/pmap.c 1.292 for alpha, ia64 (avoid free page exhaustion / kernel panic for certain madvise() scenarios)
|
85892 |
02-Nov-2001 |
mike |
o Add new header <sys/stdint.h>. o Make <stdint.h> a symbolic link to <sys/stdint.h>. o Move most of <sys/inttypes.h> into <sys/stdint.h>, as per C99. o Remove <sys/inttypes.h>. o Adjust includes in sys/types.h and boot/efi/include/ia64/efibind.h to reflect new location of integer types in <sys/stdint.h>. o Remove previously symbolicly linked <inttypes.h>, instead create a new file. o Add MD headers <machine/_inttypes.h> from NetBSD. o Include <sys/stdint.h> in <inttypes.h>, as required by C99; and include <machine/_inttypes.h> in <inttypes.h>, to fill in the remaining requirements for <inttypes.h>. o Add additional integer types in <machine/ansi.h> and <machine/limits.h> which are included via <sys/stdint.h>.
Partially obtain from: NetBSD Tested on: alpha, i386 Discussed on: freebsd-standards@bostonradio.org Reviewed by: bde, fenner, obrien, wollman
|
85866 |
02-Nov-2001 |
dfr |
Call ast() from exception_restore when we are restoring to user mode.
|
85865 |
02-Nov-2001 |
dfr |
Use static storage for the unwind state so that we can still get backtraces when the VM system is hosed.
|
85856 |
02-Nov-2001 |
dfr |
Remember to actually free the pv_entry in pmap_remove_entry().
|
85852 |
02-Nov-2001 |
peter |
argh! cut/paste typo. :-( (committed on a different machine to what I was testing it on)
|
85850 |
02-Nov-2001 |
peter |
"Fix" a problem that got copied from alpha to ia64 and broke there. When we truncate the msgbuf size because the last chunk is too small, correctly terminate the phys_avail[] array - the VM system tests the *end* for zero, not the start. This leads the VM startup to attempt to recreate a duplicate set of pages for all physical memory.
XXX the msgbuf handling is suspiciously different on i386 vs alpha/ia64...
|
85785 |
31-Oct-2001 |
dfr |
Experiment with rewriting the syscall() wrapper using explicit bundling and trying to reduce stalls from reading certain high latency registers. This should be faster than the old syscall code. Its certainly a lot smaller.
|
85777 |
31-Oct-2001 |
dfr |
Add TF_AR_FPSR, the offset of ar.fpsr in a trapframe.
|
85766 |
31-Oct-2001 |
dfr |
Print the bundle template name on the first slot of the bundle.
|
85685 |
29-Oct-2001 |
dfr |
* Factor out common code for manipulating the RSE backing store. * Implement a fairly simplistic parser for unwinding stack frames. * Use unwind records for DDB's 'trace' command. Also add support for tracing past exceptions to the context which generated the exception.
The stack unwind code requires a toolchain based on binutils-2.11.2 or later and gcc-3.0.1 or later.
|
85684 |
29-Oct-2001 |
dfr |
Make the various bits of SMP code conditional on SMP so that I can still build non-SMP kernels.
|
85682 |
29-Oct-2001 |
dfr |
Various fixes to make stack traces using the unwind tables work properly.
|
85681 |
29-Oct-2001 |
dfr |
Fix disassembly of 'add a=b,c,1' and make the disassembly of the various break and nops consistent.
|
85680 |
29-Oct-2001 |
peter |
The size of the ELF hash table changed from 64 bits in the prototype toolchains to 32 bits in 2.11.2.
Obtained from: dfr
|
85674 |
29-Oct-2001 |
marcel |
o Send a test IPI from the BSP to itself at the same time APs are woken up. o Make IPIs synchronuous by default. If we want asynchronuous IPIs, we may want to make the memory fence controllable.
|
85673 |
29-Oct-2001 |
marcel |
Add an IPI used for testing proper operation of delivering IPIs.
|
85670 |
29-Oct-2001 |
marcel |
Make the clock vector 255 instead of 240. On Lion boxes, 240 is the AP wake-up vector. We probably want a more dynamic approach to assigning vectors in the future...
|
85667 |
29-Oct-2001 |
marcel |
Small correction in the LOCAL_SAPIC structure. The Flags field starts at offset 8; not 6. Hence the structure is 12 bytes and not 10 bytes. Adjust the definition so that the ProcessorEnabled flag is moved from bit 15 to bit 31 in the Flags field.
The definition now matches ACPI 2.0 Errata 1.5.
|
85656 |
29-Oct-2001 |
marcel |
o Do not parse the MADT as a side-effect in AcpiOsGetRootPointer, do it as a side-effect of probing for MP hardware. This allows us to scan for local SAPICs early (especially before MBUF initialization). o Fix the Local SAPIC structure so that matches the Local SAPIC table entry. Now that the Local SAPIC info is the same as the Local APIC info, stop dumping the Local APIC entries. o For every Local SAPIC entry in the MADT that's not disabled, let the SMP code know about it. They represent actual CPUs. o Register the OS_BOOT_RENDEZ entry point and provide a (bogus) implementation for the entry point. o Provide a mapping for internal IPI numbers to ExtINT vectors. o In a MP system, announce the CPUs and start them by sending IPI_AP_WAKEUP to each of them. Not that it makes a difference at this time :-) o Miscellaneous style fixes and other adjustments.
|
85556 |
26-Oct-2001 |
iwasaki |
Add APM compatibility feature to ACPI. This emulates APM device node interface APIs (mainly ioctl) and provides APM services for the applications. The goal is to support most of APM applications without any changes. Implemented ioctls in this commit are: - APMIO_SUSPEND (mapped ACPI S3 as default but changable by sysctl) - APMIO_STANDBY (mapped ACPI S1 as default but changable by sysctl) - APMIO_GETINFO and APMIO_GETINFO_OLD - APMIO_GETPWSTATUS
With above, many APM applications which get batteries, ac-line info. and transition the system into suspend/standby mode (such as wmapm, xbatt) should work with ACPI enabled kernel (if ACPI works well :-)
Reviewed by: arch@, audit@ and some guys
|
85525 |
26-Oct-2001 |
jhb |
Add a per-thread ucred reference for syscalls and synchronous traps from userland. The per thread ucred reference is immutable and thus needs no locks to be read. However, until all the proc locking associated with writes to p_ucred are completed, it is still not safe to use the per-thread reference.
Tested on: x86 (SMP), alpha, sparc64
|
85439 |
24-Oct-2001 |
dfr |
* Clear the TLB on boot. * If a pte for a location given to pmap_enter_quick is valid, just give up - don't panic, even if the mapping is different.
|
85438 |
24-Oct-2001 |
dfr |
If we get an unhandled page fault in kernel mode, either panic (if pcb_onfault is not set) or arrange to restart at the location in pcb_onfault.
This ought to help the stability of a system under moderate load. It certainly stops DDB from hanging the kernel when it tries to access a non-present page.
|
85401 |
24-Oct-2001 |
marcel |
Remove call to cninit_finish. This is part of the multiple low-level console support.
|
85399 |
24-Oct-2001 |
marcel |
Add parse functions for local APIC and I/O APIC entries. Also, show when a local APIC or SAPIC is disabled.
|
85383 |
23-Oct-2001 |
peter |
Fix RAW dependency violation when compiled with gcc-3 Warning: Use of 'br.ret.sptk.many' violates RAW dependency 'PSR.tb' (data)
|
85360 |
23-Oct-2001 |
peter |
Turn off the single-user override. We've been running multi-user for some time. Having a machine boot unattended is useful. :-)
|
85356 |
23-Oct-2001 |
dfr |
Add data serialisations after ptc and mov to rr[] instructions.
|
85335 |
23-Oct-2001 |
mike |
Remove funky right justification.
Pointed out by: bde
|
85330 |
22-Oct-2001 |
dfr |
In the signal trampoline, flush the register stack before calling sigreturn. This appears to fix the last set of problems with csh.
|
85304 |
22-Oct-2001 |
obrien |
Setup for a 200MB FS -- 209715200/512= 409600 sectors.
(DFR's latest ia64-root-*.tar.gz leaves only 7.7M avail when created by dd if=/dev/zero of=ia64-root.fs bs=1024k count=200)
|
85297 |
21-Oct-2001 |
des |
Move procfs_* from procfs_machdep.c into sys_process.c, and rename them to proc_* in the process; procfs_machdep.c is no longer needed.
Run-tested on i386, build-tested on Alpha, untested on other platforms.
|
85294 |
21-Oct-2001 |
des |
[partially forced commit due to pilot error in earlier commit attempt]
{set,fill}_{,fp,db}regs() fixup:
- Add dummy {set,fill}_dbregs() on architectures that don't have them.
- KSEfy the powerpc versions (struct proc -> struct thread).
- Some architectures had the prototypes in md_var.h, some in reg.h, and some in both; for consistency, move them to reg.h on all platforms.
These functions aren't really MD (the implementation is MD, but the interface is MI), so they should move to an MI header, but I haven't figured out which one yet.
Run-tested on i386, build-tested on Alpha, untested on other platforms.
|
85293 |
21-Oct-2001 |
des |
{set,fill}_{,fp,db}regs() fixup:
- Add dummy {set,fill}_dbregs() on architectures that don't have them.
- KSEfy the powerpc versions (struct proc -> struct thread).
- Some architectures had the prototypes in md_var.h, some in reg.h, and some in both; for consistency, move them to reg.h on all platforms.
These functions aren't really MD (the implementation is MD, but the interface is MI), so they should move to an MI header, but I haven't figured out which {set,fill}_{,fp,db}regs() fixup:
- Add dummy {set,fill}_dbregs() on architectures that don't have them.
- KSEfy the powerpc versions (struct proc -> struct thread).
- Some architectures had the prototypes in md_var.h, some in reg.h, and some in both; for consistency, move them to reg.h on all platforms.
These functions aren't really MD (the implementation is MD, but the interface is MI), so they should move to an MI header, but I haven't figured out which one yet.
Run-tested on i386, build-tested on Alpha, untested on other platforms.
|
85286 |
21-Oct-2001 |
dfr |
Add some more names for bits of trapframe.
|
85285 |
21-Oct-2001 |
dfr |
We need to save a bit more information in the partial syscall trapframe in case we need to take a signal.
|
85284 |
21-Oct-2001 |
dfr |
Set ar.fpsr to something sane before trying to handle a trap - the user might have trashed it.
|
85283 |
21-Oct-2001 |
dfr |
Use ia64_set_fpsr() instead of __asm to set ar.fpsr.
|
85282 |
21-Oct-2001 |
dfr |
Add ia64_set_fpsr().
|
85276 |
21-Oct-2001 |
marcel |
Implement the IPI send functions. No mapping between IPI message Id and interrupt vector has been made yet.
|
85269 |
21-Oct-2001 |
marcel |
Add define for the PIB default address and include a reference to the SDM.
|
85255 |
20-Oct-2001 |
mjacob |
Remove wx.
|
85230 |
20-Oct-2001 |
dfr |
Reserve space for signal state.
|
85211 |
20-Oct-2001 |
marcel |
Save the AP wake-up vector from the SAL descriptor under SMP. Note that the descriptor is optional. Add a comment to indicate that we want to register the OS_BOOT_RENDEZ here as well.
|
85210 |
20-Oct-2001 |
marcel |
Make this compile under option SMP.
|
85199 |
19-Oct-2001 |
dfr |
Make a start at an unaligned trap handler. Only integer loads and stores are handled so far.
|
85195 |
19-Oct-2001 |
dfr |
Translate various userland traps into SIGBUS (instead of just panicing).
|
85189 |
19-Oct-2001 |
ru |
Try two on the preprocessing logic.
Reviewed by: obrien
|
85185 |
19-Oct-2001 |
jhb |
Remove unneeded sys/mutex.h includes.
|
85183 |
19-Oct-2001 |
obrien |
Blah, fix braino where ru had to remind me of proper preprocessor syntax. Bad fingers, no cookie.
|
85142 |
19-Oct-2001 |
dfr |
Rework pmap so that it separates the PTE structure from the pv_entry structure. This makes it possible to pre-allocate PTEs for the kernel, which is necessary for a reliable implementation of pmap_kenter(). This also avoids wasting space (about 48 bytes per page) for kernel mappings and user mappings of memory-mapped devices.
This also fixes a bug with the previous version where the implementation required the pv_entry structure to be physically contiguous but did not enforce this (the structure size was not a power of two). This meant that the pv_entry free list was quickly corrupted as soon as the system was even mildly loaded.
|
85109 |
18-Oct-2001 |
dfr |
Shift the code which packs and unpacks instruction bundles out of DDB since it is useful for various emulations duties (e.g. unaligned trap handling).
|
85088 |
18-Oct-2001 |
marcel |
Fix typos in previous commit:
o s/sys_narg/sy_narg/ o s/SYS_MPSAFE/SYF_MPSAFE/
|
85085 |
18-Oct-2001 |
obrien |
Add support for "__gnuc_va_list". Some overly "smart" libraries assume the existence of the __gnuc_va_list type[*] because our compiler is GCC.
[*] __gnuc_va_list is defined in the GCC ginclude/stdarg.h replacement headerwhich we don't use.
|
85082 |
17-Oct-2001 |
jhb |
- Small cleanups to the Giant handling in trap(). - Only release Giant in trap() if we locked it, otherwise we could release Giant in a kernel trap if we didn't get it for a page fault and the previous frame had grabbed the lock. - Only get Giant for !MP safe syscalls.
|
85035 |
16-Oct-2001 |
mjacob |
Make SCSI changer and SES devices standard in generic kernels.
Reviewed by: ken@kdm.org
|
85024 |
16-Oct-2001 |
dfr |
Size the number of pv_entries we use to bootstrap the pv_entry allocator based on the size of physical memory. This should eliminate the tweaking needed for larger memory configurations.
|
84966 |
15-Oct-2001 |
marcel |
When compiling with SKI support, create the fake memory regions when either the memory descriptor in the bootinfo is NULL or the descriptor count is 0.
|
84876 |
13-Oct-2001 |
dfr |
Only the first eight arguments can possibly be in stacked registers.
|
84840 |
12-Oct-2001 |
dfr |
Pass the correct trapframe pointer to fork_exit - sp is trapframe-16.
|
84839 |
12-Oct-2001 |
dfr |
If the faulting instruction is a cmpxchg, then isr.w and isr.r will both be set. We need to check isr.w before isr.r so that we can correctly handle a cmpxchg to a copy-on-write page.
This fixes the hang-after-fork problem for dynamically linked programs.
|
84801 |
11-Oct-2001 |
dfr |
Implement MCOUNT hook for assembler. Probably doesn't work right.
|
84800 |
11-Oct-2001 |
dfr |
Implement mcount trampoline (untested).
|
84798 |
11-Oct-2001 |
dfr |
* Change the calling convention for execve so that it conforms to normal C calling conventions. This allows crt1.c to be written nearly without any inline assembler. * Initialise cpu_model[] so that the hw.model sysctl works properly.
|
84783 |
10-Oct-2001 |
ps |
Make MAXTSIZ, DFLDSIZ, MAXDSIZ, DFLSSIZ, MAXSSIZ, SGROWSIZ loader tunable.
Reviewed by: peter MFC after: 2 weeks
|
84751 |
10-Oct-2001 |
dfr |
Add a definition for the ia64's special PLT_RESERVE entry in the _DYNAMIC section.
|
84732 |
09-Oct-2001 |
dfr |
Clarify a comment.
Requested by: jhb
|
84714 |
09-Oct-2001 |
dfr |
Don't include isavar.h - we don't need it.
|
84713 |
09-Oct-2001 |
dfr |
Add a minimalist kernel config which can run inside SKI.
|
84677 |
08-Oct-2001 |
dfr |
Make printtrap() more informative.
|
84638 |
07-Oct-2001 |
dfr |
Implement inline versions of ntohl etc.
|
84637 |
07-Oct-2001 |
des |
Dissociate ptrace from procfs.
Until now, the ptrace syscall was implemented as a wrapper that called various functions in procfs depending on which ptrace operation was requested. Most of these functions were themselves wrappers around procfs_{read,write}_{,db,fp}regs(), with only some extra error checks, which weren't necessary in the ptrace case anyway.
This commit moves procfs_rwmem() from procfs_mem.c into sys_process.c (renaming it to proc_rwmem() in the process), and implements ptrace() directly in terms of procfs_{read,write}_{,db,fp}regs() instead of having it fake up a struct uio and then call procfs_do{,db,fp}regs().
It also moves the prototypes for procfs_{read,write}_{,db,fp}regs() and proc_rwmem() from proc.h to ptrace.h, and marks all procfs files except procfs_machdep.c as "optional procfs" instead of "standard".
|
84632 |
07-Oct-2001 |
dfr |
* Use srlz.i to serialise changes to psr.ic * Don't enable psr.i at the same time as psr.dt and psr.ic
These changes improve stability considerably.
|
84621 |
07-Oct-2001 |
dfr |
Remove bogus include.
|
84592 |
06-Oct-2001 |
dfr |
Move console probes until after we set boothowto so that 'boot -h' works.
|
84590 |
06-Oct-2001 |
dfr |
Assume round-to-nearest mode for floating point.
|
84583 |
06-Oct-2001 |
dfr |
Delete legacy pcib code - we can't possibly work without acpi on ia64.
|
84581 |
06-Oct-2001 |
marcel |
o Change ia64_memory_address to explicitly take a u_int64_t o Add memcpy_fromio, memcpy_io, memcpy_toio, memset_io, memsetw and memsetw_io. I'm not sure this is the right place for it, though.
|
84557 |
05-Oct-2001 |
dfr |
Add BOOTP support.
|
84556 |
05-Oct-2001 |
dfr |
Fix some dependency violations (don't know why gas didn't catch this).
|
84555 |
05-Oct-2001 |
dfr |
Use physical addresses, not virtual addresses when calling PHYS_TO_VM_PAGE.
|
84554 |
05-Oct-2001 |
dfr |
Eliminate some alpha craziness.
|
84553 |
05-Oct-2001 |
dfr |
In in_cksumdata, len must be a signed type.
|
84543 |
05-Oct-2001 |
dfr |
Low-level code for programming the I/O SAPIC.
|
84541 |
05-Oct-2001 |
dfr |
Wire up most of the interrupt handling infrastructure. Not sure it works right yet but its enough for the ATA probe to work. The SCSI probes which follow are broken though.
|
84540 |
05-Oct-2001 |
dfr |
Fix typo which meant that we never actually found the ACPI 2.0 table.
|
84535 |
05-Oct-2001 |
dfr |
Disable interrupts when we are in DDB.
|
84534 |
05-Oct-2001 |
dfr |
Add ia64_get_lid().
|
84477 |
04-Oct-2001 |
dfr |
Don't pretend the argument to clockattach is a device - it isn't.
|
84476 |
04-Oct-2001 |
dfr |
* Don't pretend the object passed to clockattach is a device - it isn't. * Declare itc_frequency properly.
|
84475 |
04-Oct-2001 |
dfr |
Use EFI (or some reasonable simulation) to read the RTC.
|
84474 |
04-Oct-2001 |
dfr |
Fake the EFI runtime call GetTime.
|
84447 |
04-Oct-2001 |
dfr |
Add low-level ACPI support code and make a start on parsing the ACPI interrupt information.
|
84412 |
03-Oct-2001 |
dfr |
The encoding for the bus being passed to SAL was completely wrong.
|
84381 |
02-Oct-2001 |
mjacob |
Fix problem where a user buffer outside of the area being tested will be corrupted.
PR: 29194 Obtained from: Tor.Egge@fast.no MFC after: 2 weeks
|
84349 |
02-Oct-2001 |
marcel |
Remove redundant and misplaced "options DDB" line.
|
84131 |
29-Sep-2001 |
dfr |
Support for SKI is now an option.
|
84130 |
29-Sep-2001 |
dfr |
Make sio0 a console device.
|
84129 |
29-Sep-2001 |
dfr |
Add a couple of arguments to ia64_init. I'll use them later to improve the method of passing bootinfo from the loader.
|
84128 |
29-Sep-2001 |
dfr |
Various changes to use the firmware on a real machine.
|
84127 |
29-Sep-2001 |
dfr |
* Read parameters for ptc.e instruction from PAL Code. * Add pmap_unmapdev().
|
84126 |
29-Sep-2001 |
dfr |
Fake PAL Code for SKI.
|
84124 |
29-Sep-2001 |
dfr |
Start hooking up devices.
|
84123 |
29-Sep-2001 |
dfr |
Add pmap_unmapdev().
|
84122 |
29-Sep-2001 |
dfr |
Fill out the firmware interfaces somewhat.
|
84121 |
29-Sep-2001 |
dfr |
Add code to initialise firmware resources (and to fake them if we are running in simulation).
|
84120 |
29-Sep-2001 |
dfr |
Add shims for calling PAL Code in physical mode.
|
84118 |
29-Sep-2001 |
dfr |
Add some move definitions.
|
84117 |
29-Sep-2001 |
dfr |
Call cpu_boot from cpu_reset.
|
84116 |
29-Sep-2001 |
dfr |
Give up on the backtrace if the calculated pc isn't in region 7.
|
84115 |
29-Sep-2001 |
dfr |
Use PAGE_SHIFT instead of a hardcoded constant for log2(PAGE_SIZE).
|
84114 |
29-Sep-2001 |
dfr |
* Preserve ar.rsc in ia64_change_mode. * Convert sp to/from physical in ia64_change_mode. * Add a shim for calling EFI procedures in virtual mode.
|
84113 |
29-Sep-2001 |
dfr |
Change END(locorestart) to END(__start).
|
83983 |
26-Sep-2001 |
rwatson |
o Modify the access control checks for the ia64 /dev/mem (and friends) to use securelevel_gt() instead of direct variable checks.
Obtained from: TrustedBSD Project
|
83964 |
26-Sep-2001 |
dfr |
Tidy up and fix a runtime warning.
|
83936 |
25-Sep-2001 |
brooks |
The faith(4) device is no longer a count device so don't specify a count.
|
83913 |
24-Sep-2001 |
dfr |
Use b6 instead of b1 - b1 is supposed to be preserved and b6 is scratch.
|
83912 |
24-Sep-2001 |
dfr |
Make the Alternate {I,D} TLB vector code actually work for virtual addresses greater than 256M (the page size for region 6 and 7).
|
83908 |
24-Sep-2001 |
dfr |
Don't try to access external files from SKI unless we are actually running in SKI.
|
83907 |
24-Sep-2001 |
dfr |
Increase the number of bootstrap PVs.
|
83906 |
24-Sep-2001 |
dfr |
Include <machine/pte.h> instead of <machine/pmap.h>
|
83905 |
24-Sep-2001 |
dfr |
We need different call stubs for static and stacked calling conventions.
|
83900 |
24-Sep-2001 |
dfr |
Factor out PTE and related definitions from pmap.h - they are useful in the loader.
|
83895 |
24-Sep-2001 |
dfr |
Fix a few comment typos from the last commit.
|
83893 |
24-Sep-2001 |
dfr |
Add some code which can be used to change to/from physical mode when calling various firmware functions.
|
83872 |
24-Sep-2001 |
obrien |
+ Fix misplacement of `txp' + Document our -CURRENT debugging bits
|
83856 |
23-Sep-2001 |
dfr |
Add definitions of SAL System Table.
|
83837 |
22-Sep-2001 |
dfr |
Don't activate the ssc console unless we are running in SKI.
|
83836 |
22-Sep-2001 |
dfr |
Add implementations of readx() and writex().
|
83835 |
22-Sep-2001 |
dfr |
Add declaration of ia64_running_in_simulator().
|
83834 |
22-Sep-2001 |
dfr |
* Turn off memory descriptor debugging - its served its purpose. * Don't get confused when memory regions don't lie on page boundaries - remember our page size is typically larger than the firmware's page size. * Add a function ia64_running_in_simulator() which is intended to detect whether the kernel is running in SKI or on real hardware.
|
83833 |
22-Sep-2001 |
dfr |
Remove a redundant stop.
|
83767 |
21-Sep-2001 |
dfr |
Fix a warning and make sure we flush the cache after writing an instruction bundle otherwise the CPU won't see the changed bundle.
|
83766 |
21-Sep-2001 |
dfr |
Add ia64_fc().
|
83734 |
20-Sep-2001 |
dfr |
If two @fptr relocations refer to the same symbol, use the same fptr structure to resolve them. This is necessary to allow code to compare function pointers.
|
83733 |
20-Sep-2001 |
dfr |
Don't clear the single-step bit after a trap - leave it up to the debugger. The code was broken anyway - it clear every bit *except* the single-step bit (oops).
|
83732 |
20-Sep-2001 |
dfr |
The second instruction in an MLX bundle is slot one, not slot two, even though the actual opcode is stored in the value in slot two.
|
83727 |
20-Sep-2001 |
dfr |
Tidy.
|
83718 |
20-Sep-2001 |
dfr |
Don't include NFS headers. I have no idea why they were here in the first place - NFS has no assembler in it.
|
83691 |
20-Sep-2001 |
peter |
Replicate a change from alpha/genassym.c to other arches. This should fix nfs-related build breakage.
|
83651 |
18-Sep-2001 |
peter |
Cleanup and split of nfs client and server code. This builds on the top of several repo-copies.
|
83644 |
18-Sep-2001 |
jhb |
Whitespace fixes.
|
83643 |
18-Sep-2001 |
jhb |
- If we ever do the per-cpu KTR stuff, the index won't be volatile as it will be private to each CPU. - Re-style(9) the globaldata structures. There really needs to be a MI struct pcpu that has a MD struct mdpcpu member at some point.
|
83622 |
18-Sep-2001 |
dfr |
Add ia64_get_cpuid().
|
83611 |
18-Sep-2001 |
dfr |
Flesh out identifycpu().
|
83522 |
15-Sep-2001 |
dfr |
Rearrange so we search for I/O port space as early as possible (i.e. before console probing). Also fix a confusion between EFI's page size which is fixed at 4096 and our own page size which is variable at compile time.
|
83520 |
15-Sep-2001 |
dfr |
Avoid the region used for thread0's trapframe when setting up the stack for ia64_init. If we use this area for ia64_init's stack, it ends up containing garbage which causes cpu_fork to die horribly later.
|
83512 |
15-Sep-2001 |
dfr |
Use the MI console code to initialise the console.
|
83511 |
15-Sep-2001 |
dfr |
Implement inx() and outx() functions for accessing I/O ports.
|
83510 |
15-Sep-2001 |
dfr |
Add ia64_mf_a() which executes an mf.a instruction.
|
83509 |
15-Sep-2001 |
dfr |
* Use Intel's EFI headers instead of home-grown ones. * Use the bootinfo's memory map if present instead of hard-coding SKI's memory map. * Record the location of the I/O Port Space if present in the memory map.
|
83506 |
15-Sep-2001 |
dfr |
Fill out some gaps in ia64 DDB support. This involves generalising DDB's breakpoint handling slightly to cope with the fact that ia64 instructions are not located on byte boundaries.
|
83499 |
15-Sep-2001 |
dfr |
Sync the PCI NIC sections with i386.
|
83407 |
13-Sep-2001 |
dfr |
* Enable dynamically linked kernel. This involves adding a self-relocator to locore to process the @fptr relocations in the dynamic executable. * Don't initialise the timer until *after* we install the timecounter to avoid a race between timecounter initialisation and hardclock. * Tidy up bootinfo somewhat including adding sanity checks for when the kernel is loaded without a recognisable bootinfo.
|
83366 |
12-Sep-2001 |
julian |
KSE Milestone 2 Note ALL MODULES MUST BE RECOMPILED make the kernel aware that there are smaller units of scheduling than the process. (but only allow one thread per process at this time). This is functionally equivalent to teh previousl -current except that there is a thread associated with each process.
Sorry john! (your next MFC will be a doosie!)
Reviewed by: peter@freebsd.org, dillon@freebsd.org
X-MFC after: ha ha ha ha
|
83359 |
12-Sep-2001 |
marcel |
o Fix struct ssc_time and enable the SSC call to get the RTC. o Print a message that the TODR is not set in sscclock_set.
|
83301 |
10-Sep-2001 |
dfr |
* Make a start on a realistic definition for bootinfo. * Switch to proc0's stack and backing store before calling ia64_init so that we don't rely on the loader's stack at all. * Change kernel entry point name from locorestart to __start.
|
83276 |
10-Sep-2001 |
peter |
Rip some well duplicated code out of cpu_wait() and cpu_exit() and move it to the MI area. KSE touched cpu_wait() which had the same change replicated five ways for each platform. Now it can just do it once. The only MD parts seemed to be dealing with fpu state cleanup and things like vm86 cleanup on x86. The rest was identical.
XXX: ia64 and powerpc did not have cpu_throw(), so I've put a functional stub in place.
Reviewed by: jake, tmm, dillon
|
83223 |
08-Sep-2001 |
peter |
Missing part of dillon's coredump commit. cpu_coredump() was still passing IO_NODELOCKED to vn_rdwr(), this would cause operations on the unlocked core vnode and softupdates nastiness if an a.out binary cored.
|
83198 |
07-Sep-2001 |
dfr |
Add options to select between 4k, 8k and 16k page sizes on ia64. The default is now 8k.
|
83197 |
07-Sep-2001 |
dfr |
Typo in comment.
|
83196 |
07-Sep-2001 |
dfr |
* Track ref/mod information properly when a mapping changes. * Fix a panic in pmap_remove() for a non-current pmap.
|
83195 |
07-Sep-2001 |
dfr |
Remove old setjmp/longjmp stubs.
|
83163 |
06-Sep-2001 |
jhb |
Call sendsig() with the proc lock held and return with it held.
|
83156 |
06-Sep-2001 |
dfr |
Add struct tags to avoid warnings in kernel code.
|
83047 |
05-Sep-2001 |
obrien |
style(9) the structure definitions.
|
82938 |
04-Sep-2001 |
peter |
Nuke #if 0'ed "setredzone()" stub. We never used it, and probably never will. I've implemented an optional redzone as part of the KSE upage breakup.
|
82867 |
03-Sep-2001 |
dfr |
Add a working version of setjmp/longjmp.
Obtained from: Intel's EFI toolkit.
|
82857 |
03-Sep-2001 |
peter |
Since we're cross compiling from x86, ignore the x86 CPUTYPE by default.
|
82849 |
03-Sep-2001 |
peter |
Dont conflict with sysctl debug.mddebug
|
82788 |
02-Sep-2001 |
peter |
Sync with i386 / alpha. Whitespace unindent / style prep for kse.
|
82785 |
02-Sep-2001 |
peter |
Merge from i386: various cleanups including moving the map calculations to MI code. This gets ia64 to compile again.
|
82632 |
31-Aug-2001 |
peter |
Same as i386/i386/pmap.c: clean up some style. This is irrelevant since it is inside #if 0'ed code, but it would be a shame if this stuff got cut/pasted elsewhere.
|
82530 |
30-Aug-2001 |
mike |
o Remove some GCCisms in src/powerpc/include/endian.h. o Unify <machine/endian.h>'s across all architectures. o Make bswapXX() functions use a different spelling of u_int16_t and friends to reduce namespace pollution. The bswapXX() functions don't actually exist, but we'll probably import these at some point. Atleast one driver (if_de) depends on bswapXX() for big endian cases. o Deprecate byteorder(3) prototypes from <sys/types.h>, these are now prototyped indirectly in <arpa/inet.h>. o Deprecate in_addr_t and in_port_t typedefs in <sys/types.h>, these are now typedef'd in <arpa/inet.h>. o Change byteorder(3) prototypes to use standards compliant uint32_t (spelled __uint32_t to reduce namespace pollution). o Document new preferred headers and standards compliance.
Discussed with: bde PR: 29946 Reviewed by: bmilekic
|
82393 |
27-Aug-2001 |
peter |
Enable hardwiring of things like tunables from embedded enironments that do not start from loader(8).
|
82313 |
25-Aug-2001 |
peter |
vm_page_zero_idle() is no longer MD.
|
82107 |
21-Aug-2001 |
peter |
Strip out some #if's for old implementations of global data pointers.
|
82025 |
21-Aug-2001 |
peter |
Make COMPAT_43 optional again. XXX we need COMPAT_FBSD3 etc for this stuff.
|
81763 |
16-Aug-2001 |
obrien |
style(9) and make consistent across platforms
|
81727 |
15-Aug-2001 |
ache |
OFF_T -> OFF (more standard style)
|
81720 |
15-Aug-2001 |
ache |
Add OFF_T_MAX/OFF_T_MIN
|
81675 |
15-Aug-2001 |
obrien |
Style changes to commonize the various platforms.
|
81493 |
10-Aug-2001 |
jhb |
- Close races with signals and other AST's being triggered while we are in the process of exiting the kernel. The ast() function now loops as long as the PS_ASTPENDING or PS_NEEDRESCHED flags are set. It returns with preemption disabled so that any further AST's that arrive via an interrupt will be delayed until the low-level MD code returns to user mode. - Use u_int's to store the tick counts for profiling purposes so that we do not need sched_lock just to read p_sticks. This also closes a problem where the call to addupc_task() could screw up the arithmetic due to non-atomic reads of p_sticks. - Axe need_proftick(), aston(), astoff(), astpending(), need_resched(), clear_resched(), and resched_wanted() in favor of direct bit operations on p_sflag. - Fix up locking with sched_lock some. In addupc_intr(), use sched_lock to ensure pr_addr and pr_ticks are updated atomically with setting PS_OWEUPC. In ast() we clear pr_ticks atomically with clearing PS_OWEUPC. We also do not grab the lock just to test a flag. - Simplify the handling of Giant in ast() slightly.
Reviewed by: bde (mostly)
|
81265 |
08-Aug-2001 |
peter |
Zap 'ptrace(PT_READ_U, ...)' and 'ptrace(PT_WRITE_U, ...)' since they are a really nasty interface that should have been killed long ago when 'ptrace(PT_[SG]ETREGS' etc came along. The entity that they operate on (struct user) will not be around much longer since it is part-per-process and part-per-thread in a post-KSE world.
gdb does not actually use this except for the obscure 'info udot' command which does a hexdump of as much of the child's 'struct user' as it can get. It carries its own #defines so it doesn't break compiles.
|
81255 |
07-Aug-2001 |
jhb |
Grab Giant arond page faults. ia64 boots again in the simulator now.
|
81198 |
06-Aug-2001 |
dfr |
Make this compile again.
|
81197 |
06-Aug-2001 |
dfr |
Remove usage of nonexistent vm_mtx.
|
80729 |
31-Jul-2001 |
jhb |
GC some obsolete alpha code.
|
80700 |
31-Jul-2001 |
jake |
Use a machine dependent type, Elf_Hashelt, for the elements of the elf dynamic symbol table buckets and chains. The sparc64 toolchain uses 32 bit .hash entries, unlike other 64 bits architectures (alpha), which use 64 bit entries.
Discussed with: dfr, jdp
|
80431 |
27-Jul-2001 |
peter |
Make PMAP_SHPGPERPROC tunable. One shouldn't need to recompile a kernel for this, since it is easy to run into with large systems with lots of shared mmap space.
Obtained from: yahoo
|
80421 |
26-Jul-2001 |
peter |
Call the early tunable setup functions as soon as kern_envp is available. Some things depend on hz being set not long after this.
|
80399 |
26-Jul-2001 |
bmilekic |
- Do not handle the per-CPU containers in mbuf code as though the cpuids were indices in a dense array. The cpuids are a sparse set and treat them as such, setting up containers only for CPUs activated during mb_init().
- Fix netstat(1) and systat(1) to treat the per-CPU stats area as a sparse map, in accordance with the above.
This allows us to properly boot with certain CPUs disactivated. However, if we later decide to re-activate said CPUs, we will barf until we decide to implement CPU spinon/spinoff callback hooks to allow for said CPUs' per-CPU containers to get configured on their activation.
Reported by: mjacob Partially (sys/ diffs) Submitted by: mjacob
|
79573 |
11-Jul-2001 |
bsd |
Add 'hwatch' and 'dhwatch' ddb commands analogous to 'watch' and 'dwatch'. The new commands install hardware watchpoints if supported by the architecture and if there are enough registers to cover the desired memory area.
No objection by: audit@, hackers@
MFC after: 2 weeks
|
79418 |
08-Jul-2001 |
julian |
A set of changes to reduce the number of include files the kernel takes from /usr/include. I cannot check them on alpha.. (will try beast)
Briefly looked at by: Warner Losh <imp@harmony.village.org>
|
79265 |
05-Jul-2001 |
dillon |
Move vm_page_zero_idle() from machine-dependant sections to a machine-independant source file, vm/vm_zeroidle.c. It was exactly the same for all platforms and updating them all was getting annoying.
|
79263 |
04-Jul-2001 |
dillon |
Reorg vm_page.c into vm_page.c, vm_pageq.c, and vm_contig.c (for contigmalloc). Also removed some spl's and added some VM mutexes, but they are not actually used yet, so this commit does not really make any operational changes to the system.
vm_page.c relates to vm_page_t manipulation, including high level deactivation, activation, etc... vm_pageq.c relates to finding free pages and aquiring exclusive access to a page queue (exclusivity part not yet implemented). And the world still builds... :-)
|
79224 |
04-Jul-2001 |
dillon |
With Alfred's permission, remove vm_mtx in favor of a fine-grained approach (this commit is just the first stage). Also add various GIANT_ macros to formalize the removal of Giant, making it easy to test in a more piecemeal fashion. These macros will allow us to test fine-grained locks to a degree before removing Giant, and also after, and to remove Giant in a piecemeal fashion via sysctl's on those subsystems which the authors believe can operate without Giant.
|
79123 |
03-Jul-2001 |
jhb |
Allow Giant to be recursed when a process terminates.
|
79106 |
02-Jul-2001 |
brooks |
gif(4) and stf(4) modernization:
- Remove gif dependencies from stf. - Make gif and stf into modules - Make gif cloneable.
PR: kern/27983 Reviewed by: ru, ume Obtained from: NetBSD MFC after: 1 week
|
79053 |
01-Jul-2001 |
imp |
Don't need the .keep_me files. Obrien and I committed past each other.
Add 0-9 to the list of possible kernel names at matsushita-san's suggestion.
Submitted by: Makoto MATSUSHITA-san <matusita@jp.FreeBSD.org>
|
79030 |
30-Jun-2001 |
obrien |
Ensure sys/${MACHINE}/compile/FOO exists
Reviewed by: arch, imp, peter, and the USENIX terminal room secret kernel cabal
|
79026 |
30-Jun-2001 |
imp |
Really do proper keepme files in the compile directories. Use .cvsignore file for [A-Za-z]* to keep these directories around rather than waste a file on .keepme. This should also make people's built trees place nice with CVS.
Idea for .cvsignore: peter (although I suggested the regexp) Pointed out by: Makoto MATSUSHITA-san <matusita@jp.FreeBSD.org> Llama's costuming by: Fernamdo Llamas
|
79019 |
30-Jun-2001 |
obrien |
Ensure sys/${MACHINE}/compile/FOO exists
Reviewed by: arch, imp, peter and the USENIX terminal room secret kernel cabal
|
79008 |
30-Jun-2001 |
imp |
Repo copy i8237.h to dev/ic so we can get rid of some of the final vestiges of includes of i386 files from non-i386 ports.
|
78983 |
29-Jun-2001 |
jhb |
Move ast() and userret() to sys/kern/subr_trap.c now that they are MI.
|
78962 |
29-Jun-2001 |
jhb |
Add a new MI pointer to the process' trapframe p_frame instead of using various differently named pointers buried under p_md.
Reviewed by: jake (in principle)
|
78888 |
27-Jun-2001 |
jhb |
Catch up to mbuf allocator changes from last September so this compiles again.
|
78887 |
27-Jun-2001 |
jhb |
Make this compile again. Broken since June 1.
|
78762 |
25-Jun-2001 |
jhb |
Fix cut-n-paste brain-o.
Pointy-hat to: me
|
78636 |
22-Jun-2001 |
jhb |
- Grab the proc lock around CURSIG and postsig(). Don't release the proc lock until after grabbing the sched_lock to avoid CURSIG racing with psignal. - Don't grab Giant for addupc_task() as it isn't needed.
Reported by: tegge (signal race), bde (addupc_task a while back)
|
78269 |
15-Jun-2001 |
peter |
oops. prepare_usermode() died in August 2000 in the MI and x86 code.
Issue raised by: scottl
|
77931 |
09-Jun-2001 |
obrien |
Fix style of defines.
|
77800 |
06-Jun-2001 |
joerg |
Nuke the various poorly maintained copies of ioctl_fd.h. The file is not machine-dependant, thus it has been moved out (repo-copied) into <sys/fdcio.h>.
|
77796 |
06-Jun-2001 |
jhb |
Don't hold sched_lock across addupc_task().
Reported by: David Taylor <davidt@yadt.co.uk> Submitted by: bde
|
77626 |
02-Jun-2001 |
phk |
Properly wrap mtx_intr_enable() macro in "do $bla while (0)"
|
77582 |
01-Jun-2001 |
tmm |
Clean up the code exporting interrupt statistics via sysctl a bit: - move the sysctl code to kern_intr.c - do not use INTRCNT_COUNT, but rather eintrcnt - intrcnt to determine the length of the intrcnt array - move the declarations of intrnames, eintrnames, intrcnt and eintrcnt from machine-dependent include files to sys/interrupt.h - remove the hw.nintr sysctl, it is not needed. - fix various style bugs
Requested by: bde Reviewed by: bde (some time ago)
|
77507 |
30-May-2001 |
jhb |
Catch up to the axeing of MFS and fix the ia64 build.
Forgotten by: a Danish axe-wielder
|
77448 |
30-May-2001 |
jhb |
- Catch up to the VM mutex changes. - Sort includes in a few places.
|
77414 |
29-May-2001 |
phk |
Remove MFS options from all example kernel configs.
|
77031 |
23-May-2001 |
ru |
- FDESC, FIFO, NULL, PORTAL, PROC, UMAP and UNION file systems were repo-copied from sys/miscfs to sys/fs.
- Renamed the following file systems and their modules: fdesc -> fdescfs, portal -> portalfs, union -> unionfs.
- Renamed corresponding kernel options: FDESC -> FDESCFS, PORTAL -> PORTALFS, UNION -> UNIONFS.
- Install header files for the above file systems.
- Removed bogus -I${.CURDIR}/../../sys CFLAGS from userland Makefiles.
|
76784 |
18-May-2001 |
obrien |
Style changes -- revert ordering to mostly two revs ago. Embellish some comments, fix tab'ing.
Requested by: bde
|
76770 |
17-May-2001 |
jhb |
- Move the setting of bootverbose to a MI SI_SUB_TUNABLES SYSINIT. - Attach a writable sysctl to bootverbose (debug.bootverbose) so it can be toggled after boot. - Move the printf of the version string to a SI_SUB_COPYRIGHT SYSINIT just afer the display of the copyright message instead of doing it by hand in three MD places.
|
76701 |
16-May-2001 |
obrien |
Consistently define the rune types. Follow NetBSD's lead and add a _BSD_MBSTATE_T_ type.
|
76700 |
16-May-2001 |
obrien |
Move the int typedefs to the top so they can be used in defining other types. Ensure every platform has __offsetof. Make multiple inclusion detection consistent with other <platform>/include/*.h files.
|
76659 |
16-May-2001 |
jhb |
Lock the procfs functions for doing a single step and reading/writing registers better. Hold sched_lock not only for checking the flag but also while performing the actual operation to ensure the process doesn't get swapped out by another CPU while we the operation is being performed.
|
76651 |
15-May-2001 |
jhb |
"Sir, the deorbit burn completed succesfully."
RIP {sys/machine}/ipl.h.
|
76650 |
15-May-2001 |
jhb |
Remove unneeded includes of sys/ipl.h and machine/ipl.h.
|
76554 |
13-May-2001 |
phk |
Convert DEVFS from an "opt-in" to an "opt-out" option.
If for some reason DEVFS is undesired, the "NODEVFS" option is needed now.
Pending any significant issues, DEVFS will be made mandatory in -current on july 1st so that we can start reaping the full benefits of having it.
|
76494 |
11-May-2001 |
jhb |
Simplify the vm fault trap handling code a bit by using if-else instead of duplicating code in the then case and then using a goto to jump around the else case.
|
76440 |
10-May-2001 |
jhb |
- Split out the support for per-CPU data from the SMP code. UP kernels have per-CPU data and gdb on the i386 at least needs access to it. - Clean up includes in kern_idle.c and subr_smp.c.
Reviewed by: jake
|
76411 |
09-May-2001 |
jhb |
Add include of sys/mutex.h and resort include of sys/lock.h.
|
76410 |
09-May-2001 |
jhb |
Add needed sys/lock.h include.
|
76322 |
06-May-2001 |
phk |
Actually biofinish(struct bio *, struct devstat *, int error) is more general than the bioerror().
Most of this patch is generated by scripts.
|
76078 |
27-Apr-2001 |
jhb |
Overhaul of the SMP code. Several portions of the SMP kernel support have been made machine independent and various other adjustments have been made to support Alpha SMP.
- It splits the per-process portions of hardclock() and statclock() off into hardclock_process() and statclock_process() respectively. hardclock() and statclock() call the *_process() functions for the current process so that UP systems will run as before. For SMP systems, it is simply necessary to ensure that all other processors execute the *_process() functions when the main clock functions are triggered on one CPU by an interrupt. For the alpha 4100, clock interrupts are delievered in a staggered broadcast fashion, so we simply call hardclock/statclock on the boot CPU and call the *_process() functions on the secondaries. For x86, we call statclock and hardclock as usual and then call forward_hardclock/statclock in the MD code to send an IPI to cause the AP's to execute forwared_hardclock/statclock which then call the *_process() functions. - forward_signal() and forward_roundrobin() have been reworked to be MI and to involve less hackery. Now the cpu doing the forward sets any flags, etc. and sends a very simple IPI_AST to the other cpu(s). AST IPIs now just basically return so that they can execute ast() and don't bother with setting the astpending or needresched flags themselves. This also removes the loop in forward_signal() as sched_lock closes the race condition that the loop worked around. - need_resched(), resched_wanted() and clear_resched() have been changed to take a process to act on rather than assuming curproc so that they can be used to implement forward_roundrobin() as described above. - Various other SMP variables have been moved to a MI subr_smp.c and a new header sys/smp.h declares MI SMP variables and API's. The IPI API's from machine/ipl.h have moved to machine/smp.h which is included by sys/smp.h. - The globaldata_register() and globaldata_find() functions as well as the SLIST of globaldata structures has become MI and moved into subr_smp.c. Also, the globaldata list is only available if SMP support is compiled in.
Reviewed by: jake, peter Looked over by: eivind
|
75914 |
24-Apr-2001 |
dfr |
When switching backing store during signal delivery, do the switch before creating the register frame for calling the handler. Also discard that frame before switching back to the old backing store after the handler returns.
|
75913 |
24-Apr-2001 |
dfr |
Align stack pointer and backing store pointer to 16 byte boundary when delivering signals.
|
75912 |
24-Apr-2001 |
dfr |
Don't trash the user's pr on syscalls.
|
75701 |
19-Apr-2001 |
dfr |
Don't unwrap the function descriptor used as the callout argument to fork_exit(). The MI version of fork_exit() needs a real function descriptor, not a simple function pointer.
|
75700 |
19-Apr-2001 |
dfr |
Don't take the Giant mutex for clock interrupts.
|
75668 |
18-Apr-2001 |
dfr |
Don't panic when we try to modify the kernel pmap.
|
75667 |
18-Apr-2001 |
dfr |
Print an approximation of the function arguments in the stack trace.
|
75666 |
18-Apr-2001 |
dfr |
Implement a simple stack trace for DDB. This will have to be redone if/when we change to a more modern toolchain.
|
75665 |
18-Apr-2001 |
dfr |
Record the right value for tf_ndirty for kernel interruptions so that we can examine the interrupted register stack frame in DDB.
|
75528 |
15-Apr-2001 |
obrien |
Turn on kernel debugging support (DDB, INVARIANTS, INVARIANT_SUPPORT, WITNESS) by default while SMPng is still being developed.
Submitted by: jhb
|
75421 |
11-Apr-2001 |
jhb |
Rename the IPI API from smp_ipi_* to ipi_* since the smp_ prefix is just "redundant noise" and to match the IPI constant namespace (IPI_*).
Requested by: bde
|
75002 |
29-Mar-2001 |
obrien |
Reduce the emasculation of bounds_check_with_label() by one line, so we propagate a bio error condition to the caller and above.
|
74927 |
28-Mar-2001 |
jhb |
Convert the allproc and proctree locks from lockmgr locks to sx locks.
|
74912 |
28-Mar-2001 |
jhb |
Rework the witness code to work with sx locks as well as mutexes. - Introduce lock classes and lock objects. Each lock class specifies a name and set of flags (or properties) shared by all locks of a given type. Currently there are three lock classes: spin mutexes, sleep mutexes, and sx locks. A lock object specifies properties of an additional lock along with a lock name and all of the extra stuff needed to make witness work with a given lock. This abstract lock stuff is defined in sys/lock.h. The lockmgr constants, types, and prototypes have been moved to sys/lockmgr.h. For temporary backwards compatability, sys/lock.h includes sys/lockmgr.h. - Replace proc->p_spinlocks with a per-CPU list, PCPU(spinlocks), of spin locks held. By making this per-cpu, we do not have to jump through magic hoops to deal with sched_lock changing ownership during context switches. - Replace proc->p_heldmtx, formerly a list of held sleep mutexes, with proc->p_sleeplocks, which is a list of held sleep locks including sleep mutexes and sx locks. - Add helper macros for logging lock events via the KTR_LOCK KTR logging level so that the log messages are consistent. - Add some new flags that can be passed to mtx_init(): - MTX_NOWITNESS - specifies that this lock should be ignored by witness. This is used for the mutex that blocks a sx lock for example. - MTX_QUIET - this is not new, but you can pass this to mtx_init() now and no events will be logged for this lock, so that one doesn't have to change all the individual mtx_lock/unlock() operations. - All lock objects maintain an initialized flag. Use this flag to export a mtx_initialized() macro that can be safely called from drivers. Also, we on longer walk the all_mtx list if MUTEX_DEBUG is defined as witness performs the corresponding checks using the initialized flag. - The lock order reversal messages have been improved to output slightly more accurate file and line numbers.
|
74903 |
28-Mar-2001 |
jhb |
Switch from save/disable/restore_intr() to critical_enter/exit().
|
74902 |
28-Mar-2001 |
jhb |
Catch up to the mtx_saveintr -> mtx_savecrit change.
|
74900 |
28-Mar-2001 |
jhb |
- Switch from using save/disable/restore_intr to using critical_enter/exit and change the u_int mtx_saveintr member of struct mtx to a critical_t mtx_savecrit. - On the alpha we no longer need a custom _get_spin_lock() macro to avoid an extra PAL call, so remove it. - Partially fix using mutexes with WITNESS in modules. Change all the _mtx_{un,}lock_{spin,}_flags() macros to accept explicit file and line parameters and rename them to use a prefix of two underscores. Inside of kern_mutex.c, generate wrapper functions for _mtx_{un,}lock_{spin,}_flags() (only using a prefix of one underscore) that are called from modules. The macros mtx_{un,}lock_{spin,}_flags() are mapped to the __mtx_* macros inside of the kernel to inline the usual case of mutex operations and map to the internal _mtx_* functions in the module case so that modules will use WITNESS and KTR logging if the kernel is compiled with support for it.
|
74897 |
28-Mar-2001 |
jhb |
- Add the new critical_t type used to save state inside of critical sections. - Add implementations of the critical_enter() and critical_exit() functions and remove restore_intr() and save_intr(). - Remove the somewhat bogus disable_intr() and enable_intr() functions on the alpha as the alpha actually uses a priority level and not simple bit flag on the CPU.
|
74810 |
26-Mar-2001 |
phk |
Send the remains (such as I have located) of "block major numbers" to the bit-bucket.
|
74746 |
24-Mar-2001 |
ume |
Unbreak build on alpha. - Move in_port_t to sys/types.h. - Nuke in_addr_t from each endian.h.
Reported by: jhb
|
74733 |
24-Mar-2001 |
jhb |
- Define and use MAXCPU like the alpha and i386 instead of NCPUS. - Sort the sys/mutex.h include in mp_machdep.c into a closer to correct location.
|
74732 |
24-Mar-2001 |
jhb |
Stick a prototype for handleclock() in machine/clock.h and include it interrupt.c to quiet a warning.
|
74670 |
23-Mar-2001 |
tmm |
Export intrnames and intrcnt as sysctls (hw.nintr, hw.intrnames and hw.intrcnt).
Approved by: rwatson
|
74384 |
17-Mar-2001 |
peter |
Use a generic implementation of the Fowler/Noll/Vo hash (FNV hash). Make the name cache hash as well as the nfsnode hash use it.
As a special tweak, create an unsigned version of register_t. This allows us to use a special tweak for the 64 bit versions that significantly speeds up the i386 version (ie: int64 XOR int64 is slower than int64 XOR int32).
The code layout is a little strange for the string function, but I was able to get between 5 to 10% improvement over the original version I started with. The layout affects gcc code generation choices and this way was fastest on x86 and alpha.
Note that 'CPUTYPE=p3' etc makes a fair difference to this. It is around 45% faster with -march=pentiumpro on a p6 cpu.
|
74031 |
09-Mar-2001 |
dfr |
Allow the config file to specify a root filesystem filename.
|
74030 |
09-Mar-2001 |
dfr |
Adjust a comment slightly.
|
74016 |
09-Mar-2001 |
jhb |
Fix mtx_legal2block. The only time that it is bad to block on a mutex is if we hold a spin mutex, since we can trivially get into deadlocks if we start switching out of processes that hold spinlocks. Checking to see if interrupts were disabled was a sort of cheap way of doing this since most of the time interrupts were only disabled when holding a spin lock. At least on the i386. To fix this properly, use a per-process counter p_spinlocks that counts the number of spin locks currently held, and instead of checking to see if interrupts are disabled in the witness code, check to see if we hold any spin locks. Since child processes always start up with the sched lock magically held in fork_exit(), we initialize p_spinlocks to 1 for child processes. Note that proc0 doesn't go through fork_exit(), so it starts with no spin locks held.
Consulting from: cp
|
73936 |
07-Mar-2001 |
jhb |
Unrevert the pmap_map() changes. They weren't broken on x86.
Sense beaten into me by: peter
|
73931 |
07-Mar-2001 |
jhb |
- Release Giant a bit earlier on syscall exit. - Don't try to grab Giant before postsig() in userret() as it is no longer needed. - Don't grab Giant before psignal() in ast() but get the proc lock instead.
|
73929 |
07-Mar-2001 |
jhb |
Grab the process lock while calling psignal and before calling psignal.
|
73922 |
07-Mar-2001 |
jhb |
Use the proc lock to protect p_pptr when waking up our parent in cpu_exit() and remove the mpfixme() message that is now fixed.
|
73903 |
07-Mar-2001 |
jhb |
Back out the pmap_map() change for now, it isn't completely stable on the i386.
|
73867 |
06-Mar-2001 |
jhb |
Don't psignal() a process from forward_hardclock() but set the appropriate pending flag in p_sflag instead.
|
73862 |
06-Mar-2001 |
jhb |
- Rework pmap_map() to take advantage of direct-mapped segments on supported architectures such as the alpha. This allows us to save on kernel virtual address space, TLB entries, and (on the ia64) VHPT entries. pmap_map() now modifies the passed in virtual address on architectures that do not support direct-mapped segments to point to the next available virtual address. It also returns the actual address that the request was mapped to. - On the IA64 don't use a special zone of PV entries needed for early calls to pmap_kenter() during pmap_init(). This gets us in trouble because we end up trying to use the zone allocator before it is initialized. Instead, with the pmap_map() change, the number of needed PV entries is small enough that we can get by with a static pool that is used until pmap_init() is complete.
Submitted by: dfr Debugging help: peter Tested by: me
|
73555 |
04-Mar-2001 |
dfr |
Fix a couple of typos which became obvious when I started to actually use this on real hardware.
|
72991 |
24-Feb-2001 |
jhb |
sched_swi -> swi_sched
|
72990 |
24-Feb-2001 |
jhb |
Don't include machine/mutex.h and relocate sys/mutex.h's include to be closer to alphabetical order and identical to that of the alpha.
|
72988 |
24-Feb-2001 |
jhb |
Clockframes have a trapframe stored in a cf_tf member, not ct_tf.
|
72985 |
24-Feb-2001 |
jhb |
Whitespace nits.
|
72984 |
24-Feb-2001 |
jhb |
Pass in process to mark ast on to aston().
|
72907 |
22-Feb-2001 |
jhb |
Axe pcb_schednest as it is no longer used.
|
72906 |
22-Feb-2001 |
jhb |
Rename switch_trampoline() to fork_trampoline() on the alpha and ia64.
Suggested by: dfr
|
72905 |
22-Feb-2001 |
jhb |
Don't set the sched_lock lesting level for new processes as it is no longer used.
|
72904 |
22-Feb-2001 |
jhb |
Catch comments up to child_return() -> fork_return() as well.
|
72903 |
22-Feb-2001 |
jhb |
Synch up with the other architectures: - Remove unneeded spl()'s around mi_switch() in userret(). - Don't hold sched_lock across addupc_task(). - Remove the MD function child_return() now that the MI function fork_return() is used instead. - Use TRAPF_USERMODE() instead of dinking with the trapframe directly to check for ast's in kernel mode. - Check astpending(curproc) and resched_wanted() in ast() and return if neither is true. - Use astoff() rather than setting the non-existent per-cpu variable astpending to 0 to clear an ast.
|
72899 |
22-Feb-2001 |
jhb |
Use the MI fork_return() fork trampoline callout function for child processes instead of the MD child_return().
|
72898 |
22-Feb-2001 |
jhb |
- Don't dink with sched_lock in cpu_switch() since mi_switch() does this for us. - Change the switch_trampoline() to call fork_exit() passing in the required arguments instead of calling the fork trampoline callout function directly. Warning: this hasn't been tested.
Looked over by: dfr
|
72896 |
22-Feb-2001 |
jhb |
- Axe the now unused ASS_* assertions for interrupt status. - Use ia64_get_psr() instead of save_intr() in mtx_legal2block().
|
72895 |
22-Feb-2001 |
jhb |
Add a inline function to read the psr.
|
72894 |
22-Feb-2001 |
jhb |
Add a mtx_intr_enable() macro.
|
72893 |
22-Feb-2001 |
jhb |
Axe the astpending per-cpu variable.
|
72892 |
22-Feb-2001 |
jhb |
Add TRAPF_PC() and TRAPF_USERMODE() macros and redefine CLKF_PC() and CLKF_USERMODE() in terms of them.
|
72884 |
22-Feb-2001 |
jhb |
Catch up to new MI astpending and need_resched handling.
|
72569 |
17-Feb-2001 |
ume |
Correct disordering which is corresponding to bde's fix to i386/include/ansi.h.
|
72510 |
15-Feb-2001 |
ume |
Correct 2nd argument of getnameinfo(3) to socklen_t.
Reviewed by: itojun
|
72376 |
12-Feb-2001 |
jake |
Implement a unified run queue and adjust priority levels accordingly.
- All processes go into the same array of queues, with different scheduling classes using different portions of the array. This allows user processes to have their priorities propogated up into interrupt thread range if need be. - I chose 64 run queues as an arbitrary number that is greater than 32. We used to have 4 separate arrays of 32 queues each, so this may not be optimal. The new run queue code was written with this in mind; changing the number of run queues only requires changing constants in runq.h and adjusting the priority levels. - The new run queue code takes the run queue as a parameter. This is intended to be used to create per-cpu run queues. Implement wrappers for compatibility with the old interface which pass in the global run queue structure. - Group the priority level, user priority, native priority (before propogation) and the scheduling class into a struct priority. - Change any hard coded priority levels that I found to use symbolic constants (TTIPRI and TTOPRI). - Remove the curpriority global variable and use that of curproc. This was used to detect when a process' priority had lowered and it should yield. We now effectively yield on every interrupt. - Activate propogate_priority(). It should now have the desired effect without needing to also propogate the scheduling class. - Temporarily comment out the call to vm_page_zero_idle() in the idle loop. It interfered with propogate_priority() because the idle process needed to do a non-blocking acquire of Giant and then other processes would try to propogate their priority onto it. The idle process should not do anything except idle. vm_page_zero_idle() will return in the form of an idle priority kernel thread which is woken up at apprioriate times by the vm system. - Update struct kinfo_proc to the new priority interface. Deliberately change its size by adjusting the spare fields. It remained the same size, but the layout has changed, so userland processes that use it would parse the data incorrectly. The size constraint should really be changed to an arbitrary version number. Also add a debug.sizeof sysctl node for struct kinfo_proc.
|
72358 |
11-Feb-2001 |
markm |
RIP <machine/lock.h>.
Some things needed bits of <i386/include/lock.h> - cy.c now has its own (only) copy of the COM_(UN)LOCK() macros, and IMASK_(UN)LOCK() has been moved to <i386/include/apic.h> (AKA <machine/apic.h>). Reviewed by: jhb
|
72226 |
09-Feb-2001 |
jhb |
Move the initailization of the proc lock for proc0 very early into the MD startup code.
|
72221 |
09-Feb-2001 |
jhb |
Remove bogus #if 0'd code that dinked with the saved interrupt state in sched_lock.
|
72200 |
09-Feb-2001 |
bmilekic |
Change and clean the mutex lock interface.
mtx_enter(lock, type) becomes:
mtx_lock(lock) for sleep locks (MTX_DEF-initialized locks) mtx_lock_spin(lock) for spin locks (MTX_SPIN-initialized)
similarily, for releasing a lock, we now have:
mtx_unlock(lock) for MTX_DEF and mtx_unlock_spin(lock) for MTX_SPIN. We change the caller interface for the two different types of locks because the semantics are entirely different for each case, and this makes it explicitly clear and, at the same time, it rids us of the extra `type' argument.
The enter->lock and exit->unlock change has been made with the idea that we're "locking data" and not "entering locked code" in mind.
Further, remove all additional "flags" previously passed to the lock acquire/release routines with the exception of two:
MTX_QUIET and MTX_NOSWITCH
The functionality of these flags is preserved and they can be passed to the lock/unlock routines by calling the corresponding wrappers:
mtx_{lock, unlock}_flags(lock, flag(s)) and mtx_{lock, unlock}_spin_flags(lock, flag(s)) for MTX_DEF and MTX_SPIN locks, respectively.
Re-inline some lock acq/rel code; in the sleep lock case, we only inline the _obtain_lock()s in order to ensure that the inlined code fits into a cache line. In the spin lock case, we inline recursion and actually only perform a function call if we need to spin. This change has been made with the idea that we generally tend to avoid spin locks and that also the spin locks that we do have and are heavily used (i.e. sched_lock) do recurse, and therefore in an effort to reduce function call overhead for some architectures (such as alpha), we inline recursion for this case.
Create a new malloc type for the witness code and retire from using the M_DEV type. The new type is called M_WITNESS and is only declared if WITNESS is enabled.
Begin cleaning up some machdep/mutex.h code - specifically updated the "optimized" inlined code in alpha/mutex.h and wrote MTX_LOCK_SPIN and MTX_UNLOCK_SPIN asm macros for the i386/mutex.h as we presently need those.
Finally, caught up to the interface changes in all sys code.
Contributors: jake, jhb, jasone (in no particular order)
|
72011 |
04-Feb-2001 |
peter |
Clean up some leftovers from the root mount cleanup that was done some time ago. FFS_ROOT and CD9660_ROOT are obsolete.
|
71984 |
04-Feb-2001 |
peter |
All the world is not an i386. Merge rev 1.438 of i386/i386/machdep.c. Make buffer_map a system map.
|
71880 |
31-Jan-2001 |
peter |
Remove count for NSIO. The only places it was used it were incorrect. (alpha-gdbstub.c got sync'ed up a bit with the i386 version)
|
71803 |
29-Jan-2001 |
dfr |
Flesh out EFI support somewhat.
|
71785 |
29-Jan-2001 |
peter |
Send "#if NISA > 0" to the bit-bucket and replace it with an option. These were compile-time "is the isa code present?" tests and not 'how many isa busses' tests.
|
71731 |
28-Jan-2001 |
marcel |
Add gd_witness_spin_check.
|
71730 |
28-Jan-2001 |
marcel |
Fix typo.
|
71729 |
28-Jan-2001 |
marcel |
Improve kernel bootstrapping: o Use objdump instead of gensetdefs(1) to build the linker sets. o Allow overriding of nm and objdump in resp. genassym.sh and gensetdefs.pl for non-native toolchains.
Reviewed by: arch Perl improvements: Jos Backus <josb@cncdsl.com>, benno
|
71684 |
26-Jan-2001 |
dfr |
Initialise proc0.p_heldmtx and proc0.p_contested and call mtx_enter(&Giant, MTX_DEF) after Giant is initialised.
Reviewed by: jhb
|
71596 |
24-Jan-2001 |
dfr |
Change cpuno to cpuid.
|
71595 |
24-Jan-2001 |
dfr |
Fix typo.
|
71576 |
24-Jan-2001 |
jasone |
Convert all simplelocks to mutexes and remove the simplelock implementations.
|
71554 |
24-Jan-2001 |
jhb |
- Proc locking. - P_OWEUPC -> PS_OWEUPC.
|
71553 |
24-Jan-2001 |
jhb |
- Proc locking. - Update userret() to take a struct trapframe * as a second argument. - Axe have_giant and use mtx_owned(&Giant) where appropriate.
|
71552 |
24-Jan-2001 |
jhb |
- Proc locking. - P_FOO -> PS_FOO.
|
71551 |
24-Jan-2001 |
jhb |
- Proc locking. - Bring across forwarded_statclock() fixes from i386 and alpha.
|
71350 |
21-Jan-2001 |
des |
First step towards an MP-safe zone allocator: - have zalloc() and zfree() always lock the vm_zone. - remove zalloci() and zfreei(), which are now redundant.
Reviewed by: bmilekic, jasone
|
71337 |
21-Jan-2001 |
jake |
Make intr_nesting_level per-process, rather than per-cpu. Setup interrupt threads to run with it always >= 1, so that malloc can detect M_WAITOK from "interrupt" context. This is also necessary in order to context switch from sched_ithd() directly.
Reviewed By: peter
|
71320 |
21-Jan-2001 |
jasone |
Remove MUTEX_DECLARE() and MTX_COLD. Instead, postpone full mutex initialization until after malloc() is safe to call, then iterate through all mutexes and complete their initialization.
This change is necessary in order to avoid some circular bootstrapping dependencies.
|
71241 |
19-Jan-2001 |
peter |
Remove the now-empty ipl_funcs.c file on all platforms.
|
71240 |
19-Jan-2001 |
peter |
Remove the static splXXX functions and replace them by static __inline stubs. Remove the xxx_imask variables which have been all but gone for a while.
|
71228 |
19-Jan-2001 |
bmilekic |
Implement MTX_RECURSE flag for mtx_init(). All calls to mtx_init() for mutexes that recurse must now include the MTX_RECURSE bit in the flag argument variable. This change is in preparation for an upcoming (further) mutex API cleanup. The witness code will call panic() if a lock is found to recurse but the MTX_RECURSE bit was not set during the lock's initialization.
The old MTX_RECURSE "state" bit (in mtx_lock) has been renamed to MTX_RECURSED, which is more appropriate given its meaning.
The following locks have been made "recursive," thus far: eventhandler, Giant, callout, sched_lock, possibly some others declared in the architecture-specific code, all of the network card driver locks in pci/, as well as some other locks in dev/ stuff that I've found to be recursive.
Reviewed by: jhb
|
71103 |
16-Jan-2001 |
phk |
These files have been on deathrow for a couple of months, no appeal.
|
71037 |
14-Jan-2001 |
markm |
Remove NOBLOCKRANDOM as a compile-time option. Instead, provide exactly the same functionality via a sysctl, making this feature a run-time option.
The default is 1(ON), which means that /dev/random device will NOT block at startup.
setting kern.random.sys.seeded to 0(OFF) will cause /dev/random to block until the next reseed, at which stage the sysctl will be changed back to 1(ON).
While I'm here, clean up the sysctls, and make them dynamic. Reviewed by: des Tested on Alpha by: obrien
|
70952 |
12-Jan-2001 |
jake |
Remove unused per-cpu variables inside_intr and ss_eflags.
|
70928 |
11-Jan-2001 |
jake |
- Remove compatibility macros for accessing per-cpu variables. __FreeBSD_version 500015 can be used to detect their disappearance. - Move the symbols for SMP_prvspace and lapic from globals.s to locore.s. - Remove globals.s with extreme prejudice.
|
70861 |
10-Jan-2001 |
jake |
Use PCPU_GET, PCPU_PTR and PCPU_SET to access all per-cpu variables other then curproc.
|
70789 |
08-Jan-2001 |
obrien |
Put VCS ids in a consistent place and form.
|
70786 |
08-Jan-2001 |
obrien |
Remove seconds types we don't use that came in thru the NetBSD heiratage.
|
70723 |
06-Jan-2001 |
jake |
Implement accessors for per-cpu variables which don't depend on the symbols in globals.s.
PCPU_GET(name) returns the value of the per-cpu variable PCPU_PTR(name) returns a pointer to the per-cpu variable PCPU_SET(name, val) sets the value of the per-cpu variable
In general these are not yet used, compatibility macros remain.
Unifdef SMP struct globaldata, this makes variables such as cpuid available for UP as well.
Rebuilding modules is probably a good idea, but I believe old modules will still work, as most of the old infrastructure remains.
|
70509 |
30-Dec-2000 |
dfr |
Don't include <stddef.h> for offsetof() - its also defined in <sys/types.h>
|
70508 |
30-Dec-2000 |
dfr |
Merge ALIGN changes from alpha code.
|
70507 |
30-Dec-2000 |
dfr |
Fix typo.
|
70317 |
23-Dec-2000 |
jake |
Protect proc.p_pptr and proc.p_children/p_sibling with the proctree_lock.
linprocfs not locked pending response from informal maintainer.
Reviewed by: jhb, -smp@
|
70210 |
20-Dec-2000 |
marcel |
Resolve RAW dependency violation between tbit and adds.
|
70034 |
14-Dec-2000 |
jhb |
Remove the "machine dependent" KTR trace buffer ddb commands. The code was exactly the same on all platforms.
|
69966 |
13-Dec-2000 |
obrien |
Sync with i386/GENERIC rev 1.294 removing "COMPAT_OLDPCI".
This fixed the broken kernel build on the Alpha.
|
69947 |
13-Dec-2000 |
jake |
- Change the allproc_lock to use a macro, ALLPROC_LOCK(how), instead of explicit calls to lockmgr. Also provides macros for the flags pased to specify shared, exclusive or release which map to the lockmgr flags. This is so that the use of lockmgr can be easily replaced with optimized reader-writer locks. - Add some locking that I missed the first time.
|
69881 |
12-Dec-2000 |
jake |
- Add code to detect if a system call returns with locks other than Giant held and panic if so (conditional on witness). - Change witness_list to return the number of locks held so this is easier. - Add kern/syscalls.c to the kernel build if witness is defined so that the panic message can contain the name of the offending system call. - Add assertions that Giant and sched_lock are not held when returning from a system call, which were missing for alpha and ia64.
|
69812 |
10-Dec-2000 |
marcel |
o Remove mcclock o s/alpha/ia64/g
|
69781 |
08-Dec-2000 |
dwmalone |
Convert more malloc+bzero to malloc+M_ZERO.
Submitted by: josh@zipperup.org Submitted by: Robert Drehmel <robd@gmx.net>
|
69586 |
05-Dec-2000 |
jake |
Remove the last of the MD netisr code. It is now all MI. Remove spending, which was unused now that all software interrupts have their own thread. Make the legacy schednetisr use an atomic op for setting bits in the netisr mask.
Reviewed by: jhb
|
69399 |
30-Nov-2000 |
alfred |
remove unneded sys/ucred.h includes
|
69379 |
30-Nov-2000 |
marcel |
Don't use p->p_sigstk.ss_flags to keep state of whether the process is on the alternate stack or not. For compatibility with sigstack(2) state is being updated if such is needed.
We now determine whether the process is on the alternate stack by looking at its stack pointer. This allows a process to siglongjmp from a signal handler on the alternate stack to the place of the sigsetjmp on the normal stack. When maintaining state, this would have invalidated the state information and causing a subsequent signal to be delivered on the normal stack instead of the alternate stack.
PR: 22286
|
69207 |
26-Nov-2000 |
jlemon |
Add 'mpsafe' parameter to callout_init() in MD bits.
Reminded by: jake
|
69022 |
22-Nov-2000 |
jake |
Protect the following with a lockmgr lock:
allproc zombproc pidhashtbl proc.p_list proc.p_hash nextpid
Reviewed by: jhb Obtained from: BSD/OS and netbsd
|
69003 |
21-Nov-2000 |
markm |
Add a consistent API to a feature that most modern CPUs have; a fast counter register in-CPU.
This is to be used as a fast "timer", where linearity is more important than time, and multiple lines in the linearity caused by multiple CPUs in an SMP machine is not a problem.
This adds no code whatsoever to the FreeBSD kernel until it is actually used, and then as a single-instruction inline routine (except for the 80386 and 80486 where it is some more inline code around nanotime(9).
Reviewed by: bde, kris, jhb
|
68974 |
20-Nov-2000 |
phk |
Make programs which still #include <machine/{mouse,console}.h> fail at compiletime, with an explanatory error message. Previously they would only get a warning.
These files will be finally removed 2001-01-15
|
68889 |
19-Nov-2000 |
jake |
- Protect the callout wheel with a separate spin mutex, callout_lock. - Use the mutex in hardclock to ensure no races between it and softclock. - Make softclock be INTR_MPSAFE and provide a flag, CALLOUT_MPSAFE, which specifies that a callout handler does not need giant. There is still no way to set this flag when regstering a callout.
Reviewed by: -smp@, jlemon
|
68862 |
17-Nov-2000 |
jake |
- Split the run queue and sleep queue linkage, so that a process may block on a mutex while on the sleep queue without corrupting it. - Move dropping of Giant to after the acquire of sched_lock.
Tested by: John Hay <jhay@icomtek.csir.co.za> jhb
|
68808 |
16-Nov-2000 |
jhb |
Don't release and acquire Giant in mi_switch(). Instead, release and acquire Giant as needed in functions that call mi_switch(). The releases need to be done outside of the sched_lock to avoid potential deadlocks from trying to acquire Giant while interrupts are disabled.
Submitted by: witness
|
68762 |
15-Nov-2000 |
jhb |
Don't perform an mi_switch() when we release Giant during cpu_exit(). We are about to call cpu_switch() anyways.
Found by: witness
|
68520 |
09-Nov-2000 |
marcel |
Make MINSIGSTKSZ machine dependent, and have the sigaltstack syscall compare against a variable sv_minsigstksz in struct sysentvec as to properly take the size of the machine- and ABI dependent struct sigframe into account.
The SVR4 and iBCS2 modules continue to have a minsigstksz of 8192 to preserve behavior. The real values (if different) are not known at this time. Other ABI modules use the real values.
The native MINSIGSTKSZ is now defined as follows:
Arch MINSIGSTKSZ ---- ----------- alpha 4096 i386 2048 ia64 12288
Reviewed by: mjacob Suggested by: bde
|
67708 |
27-Oct-2000 |
phk |
Convert all users of fldoff() to offsetof(). fldoff() is bad because it only takes a struct tag which makes it impossible to use unions, typedefs etc.
Define __offsetof() in <machine/ansi.h>
Define offsetof() in terms of __offsetof() in <stddef.h> and <sys/types.h>
Remove myriad of local offsetof() definitions.
Remove includes of <stddef.h> in kernel code.
NB: Kernelcode should *never* include from /usr/include !
Make <sys/queue.h> include <machine/ansi.h> to avoid polluting the API.
Deprecate <struct.h> with a warning. The warning turns into an error on 01-12-2000 and the file gets removed entirely on 01-01-2001.
Paritials reviews by: various. Significant brucifications by: bde
|
67689 |
27-Oct-2000 |
markm |
As the blocking model has seems to be troublesome for many, disable it for now with an option.
This option is already deprecated, and will be removed when the entropy-harvesting code is fast enough to warrant it.
|
67636 |
26-Oct-2000 |
dfr |
Minor build fixes.
|
67551 |
25-Oct-2000 |
jhb |
- Overhaul the software interrupt code to use interrupt threads for each type of software interrupt. Roughly, what used to be a bit in spending now maps to a swi thread. Each thread can have multiple handlers, just like a hardware interrupt thread. - Instead of using a bitmask of pending interrupts, we schedule the specific software interrupt thread to run, so spending, NSWI, and the shandlers array are no longer needed. We can now have an arbitrary number of software interrupt threads. When you register a software interrupt thread via sinthand_add(), you get back a struct intrhand that you pass to sched_swi() when you wish to schedule your swi thread to run. - Convert the name of 'struct intrec' to 'struct intrhand' as it is a bit more intuitive. Also, prefix all the members of struct intrhand with 'ih_'. - Make swi_net() a MI function since there is now no point in it being MD.
Submitted by: cp
|
67540 |
25-Oct-2000 |
jhb |
Implement atomic_{set,clear,add,subtract}_{acq_,rel_,}_ptr()
|
67522 |
24-Oct-2000 |
dfr |
* Various fixes to breakage introduced by the atomic and mutex reorgs. * Fixes to the signal delivery code. Not quite right yet.
I would have preferred to wait until I have signal delivery actually working but the current kernel in CVS doesn't build.
|
67488 |
24-Oct-2000 |
obrien |
Adjust comments Submitted by: bde
Add ISO C99's long long type limits. Reviewed by: bde
|
67474 |
23-Oct-2000 |
mjacob |
CURTHD now defines in globals.h
|
67403 |
20-Oct-2000 |
jhb |
Define the mtx_legal2block() macro used in the witness code that managed to get lost during the MI mutex conversion.
Reported by: Steve Kargl <sgk@troutmask.apl.washington.edu>
|
67358 |
20-Oct-2000 |
jhb |
- machine/mutex.h -> sys/mutex.h - Catch up to the MI mutex structure due to saveflags,saveipl,savepsr becoming saveintr.
|
67357 |
20-Oct-2000 |
jhb |
- machine/mutex.h -> sys/mutex.h - Use MUTEX_DECLARE() and MTX_COLD for Giant and sched_lock.
|
67352 |
20-Oct-2000 |
jhb |
- Make the mutex code almost completely machine independent. This greatly reducues the maintenance load for the mutex code. The only MD portions of the mutex code are in machine/mutex.h now, which include the assembly macros for handling mutexes as well as optionally overriding the mutex micro-operations. For example, we use optimized micro-ops on the x86 platform #ifndef I386_CPU. - Change the behavior of the SMP_DEBUG kernel option. In the new code, mtx_assert() only depends on INVARIANTS, allowing other kernel developers to have working mutex assertiions without having to include all of the mutex debugging code. The SMP_DEBUG kernel option has been renamed to MUTEX_DEBUG and now just controls extra mutex debugging code. - Abolish the ugly mtx_f hack. Instead, we dynamically allocate seperate mtx_debug structures on the fly in mtx_init, except for mutexes that are initiated very early in the boot process. These mutexes are declared using a special MUTEX_DECLARE() macro, and use a new flag MTX_COLD when calling mtx_init. This is still somewhat hackish, but it is less evil than the mtx_f filler struct, and the mtx struct is now the same size with and without mutex debugging code. - Add some micro-micro-operation macros for doing the actual atomic operations on the mutex mtx_lock field to make it easier for other archs to override/optimize mutex ops if needed. These new tiny ops also clean up the code in some places by replacing long atomic operation function calls that spanned 2-3 lines with a short 1-line macro call. - Don't call mi_switch() from mtx_enter_hard() when we block while trying to obtain a sleep mutex. Calling mi_switch() would bogusly release Giant before switching to the next process. Instead, inline most of the code from mi_switch() in the mtx_enter_hard() function. Note that when we finally kill Giant we can back this out and go back to calling mi_switch().
|
67351 |
20-Oct-2000 |
jhb |
- Expand the set of atomic operations to optionally include memory barriers in most of the atomic operations. Now for these operations, you can use the normal atomic operation, you can use the operation with a read barrier, or you can use the operation with a write barrier. The function names follow the same semantics used in the ia64 instruction set. An atomic operation with a read barrier has the extra suffix 'acq', due to it having "acquire" semantics. An atomic operation with a write barrier has the extra suffix 'rel'. These suffixes are inserted between the name of the operation to perform and the typename. For example, the atomic_add_int() function now has 3 variants: - atomic_add_int() - this is the same as the previous function - atomic_add_acq_int() - this function combines the add operation with a read memory barrier - atomic_add_rel_int() - this function combines the add operation with a write memory barrier - Add 'ptr' to the list of types that we can perform atomic operations on. This allows one to do atomic operations on uintptr_t's. This is useful in the mutex code, for example, because the actual mutex lock is a pointer. - Add two new operations for doing loads and stores with memory barriers. The new load operations use a read barrier before the load, and the new store operations use a write barrier after the load. For example, atomic_load_acq_int() will atomically load an integer as well as enforcing a read barrier.
|
67350 |
20-Oct-2000 |
jhb |
Axe the barrier_{read,write,rw}() helper functions as this method of doing memory barriers doesn't really scale well for the ia64. Also, memory barriers are more a property of the CPU than bus space.
Requested by: dfr
|
67325 |
19-Oct-2000 |
dfr |
Don't force bootverbose anymore.
|
67324 |
19-Oct-2000 |
dfr |
Decrease the number of ticks between clock interrupts by a factor of ten to place more pressure on the exception handling code.
|
67323 |
19-Oct-2000 |
dfr |
* Disable interrupts when restoring a trapframe. * Make sure we reset ar.k6 (used to hold the kernel stack pointer when we are returning to user mode after a syscall.
|
67285 |
18-Oct-2000 |
jhb |
Add in a simple API for memory barriers to machine/bus.h: - barrier_read() enforces a memory read barrier - barrier_write() enforces a memory write barrier - barrier_rw() enforces a memory read/write barrier
|
67247 |
17-Oct-2000 |
ps |
Implement write combining for crashdumps. This is useful when write caching is disabled on both SCSI and IDE disks where large memory dumps could take up to an hour to complete.
Taking an i386 scsi based system with 512MB of ram and timing (in seconds) how long it took to complete a dump, the following results were obtained:
Before: After: WCE TIME WCE TIME ------------------ ------------------ 1 141.820972 1 15.600111 0 797.265072 0 65.480465
Obtained from: Yahoo! Reviewed by: peter
|
67213 |
16-Oct-2000 |
dfr |
In pmap_remove_pv(), only manipulate the page's list if the pv is managed.
|
67212 |
16-Oct-2000 |
dfr |
Do a full exception_restore after an execve syscall to ensure that the new program gets the right values for its arguments etc.
|
67211 |
16-Oct-2000 |
dfr |
Clear the register stack frame before using loadrs to invalidate the stacked registers.
|
67210 |
16-Oct-2000 |
dfr |
Clear ar.pfs for the child process in cpu_fork - switch_trampoline doesn't want a stack frame.
|
67201 |
16-Oct-2000 |
dfr |
Track changes to trapframe.
|
67199 |
16-Oct-2000 |
dfr |
* Correct some of my misunderstandings about how best to switch to the kernel backing store. * Implement syscalls via break instructions. * Fix backing store copying in cpu_fork() so that the child gets the right register values.
This thing is actually starting to work now. This set of changes takes me up to the second execve (the one which runs the first shell). Next stop single-user mode :-).
|
67196 |
16-Oct-2000 |
dfr |
Use the right mask for extracting sof from cr.ifs.
|
67195 |
16-Oct-2000 |
dfr |
Remember to re-initialise cr.itm on clock interrupts so that we get more than just one tick.
|
67194 |
16-Oct-2000 |
dfr |
Merge a fix from the alpha port - put softintr in the right place in the table.
|
67193 |
16-Oct-2000 |
dfr |
Give names to app registers and control registers. Fix a typo handling mov from branch register instructions.
|
67158 |
15-Oct-2000 |
phk |
Move DELAY() from <machine/clock.h> to <sys/systm.h>
|
67032 |
12-Oct-2000 |
dfr |
Implement a rudimentary interrupt handling system which should be good enough for clock interrupts in SKI.
|
67031 |
12-Oct-2000 |
dfr |
Turn off a debugging printf.
|
67020 |
12-Oct-2000 |
dfr |
* Fix exception handling so that it actually works. We can now handle exceptions from both kernel and user mode. * Fix context switching so that we can switch back to a proc which we switched away from (we were saving the state in the wrong place). * Implement lazy switching of the high-fp state. This needs to be looked at again for SMP to cope with the case of a process migrating from one processor to another while it has the high-fp state. * Make setregs() work properly. I still think this should be called cpu_exec() or something. * Various other minor fixes.
With this lot, we can execve() /sbin/init and we get all the way up to its first syscall. At that point, we stop because syscall handling is not done yet.
|
67018 |
12-Oct-2000 |
dfr |
Fix this so that it can cope with transfers to/from regions which are not physically contiguous.
|
67017 |
12-Oct-2000 |
dfr |
* Allocate kernel stacks with contigmalloc() to make exception handling safe - we can't afford to take a TLB trap when we are writing a trapframe. Possibly revisit this later. * Various fixes to pmap_enter() so that it actually works properly.
|
67016 |
12-Oct-2000 |
dfr |
Some minor fixes and simplifications.
|
66937 |
10-Oct-2000 |
dfr |
* Add rudimentary DDB support (no kgdb, no backtrace, no single step). * Track recent changes to SWI code. * Allocate RIDs for pmaps (untested). * Implement assembler version of cpu_switch - its cleaner that way.
|
66860 |
09-Oct-2000 |
phk |
Initiate deorbit burn sequence for <machine/mouse.h>.
Replace all in-tree uses with <sys/mouse.h> which repo-copied a few moments ago from src/sys/i386/include/mouse.h by peter. This is also the appropriate fix for exo-tree sources.
Put warnings in <machine/mouse.h> to discourage use. November 15th 2000 the warnings will be converted to errors. January 15th 2001 the <machine/mouse.h> files will be removed.
|
66834 |
08-Oct-2000 |
phk |
Initiate deorbit burn sequence for <machine/console.h>.
Replace all in-tree uses with necessary subset of <sys/{fb,kb,cons}io.h>. This is also the appropriate fix for exo-tree sources.
Put warnings in <machine/console.h> to discourage use. November 15th 2000 the warnings will be converted to errors. January 15th 2001 the <machine/console.h> files will be removed.
|
66802 |
08-Oct-2000 |
bmilekic |
Cleanup comment in machine/param.h regarding mbuf-related sizes, and get rid of MCLOFSET, which does not appear to be used anywhere anymore, and if it is, it probably shouldn't be.
|
66739 |
06-Oct-2000 |
bde |
Work around a bug by adding struct tags. gcc-2.95 apparently gets the check in the [basic.link] section of the C++ standard wrong. gcc-2.7.2.3 apparently doesn't do the check, so the bug doesn't affect RELENG_3.
PR: 16170, 21427 Submitted by: Max Khon <fjoe@lark.websci.ru> (i386 version) Discussed with: jdp
|
66721 |
06-Oct-2000 |
jasone |
Reduce userland namespace polution.
#include <proc.h>, since curproc is needed.
|
66633 |
04-Oct-2000 |
dfr |
Next round of fixes to the ia64 code. This includes simulated clock and disk drivers along with a load of fixes to context switching, fork handling and a load of other stuff I can't remember now. This takes us as far as start_init() before it dies. I guess now I will have to finish off the VM system and syscall handling :-).
|
66486 |
30-Sep-2000 |
dfr |
Next round of ia64 work, including fixes to context switching, implementing cpu_fork(), copy*str(), bcopy(), copy{in,out}(). With these changes, my test kernel reaches the mountroot prompt.
|
66464 |
29-Sep-2000 |
dfr |
Ansify and fix warnings.
|
66463 |
29-Sep-2000 |
dfr |
Implement dirty and access bit exceptions.
|
66462 |
29-Sep-2000 |
dfr |
Bodge the simplelocks in a way which works UP.
|
66460 |
29-Sep-2000 |
dfr |
Use write-back instead of write-combining for region 7.
|
66459 |
29-Sep-2000 |
dfr |
Add a few more files for the ia64 port.
|
66458 |
29-Sep-2000 |
dfr |
This is the first snapshot of the FreeBSD/ia64 kernel. This kernel will not work on any real hardware (or fully work on any simulator). Much more needs to happen before this is actually functional but its nice to see the FreeBSD copyright message appear in the ia64 simulator.
|