Cross Reference: /freebsd-10.0-release/sys/i386/i386/swtch.s

History log of /freebsd-10.0-release/sys/i386/i386/swtch.s
Revision	Date	Author	Comments (<<< Hide modified files) (Show modified files >>>)
# 259065	07-Dec-2013	gjb	- Copy stable/10 (r259064) to releng/10.0 as part of the 10.0-RELEASE cycle. - Update __FreeBSD_version [1] - Set branch name to -RC1 [1] 10.0-CURRENT __FreeBSD_version value ended at '55', so start releng/10.0 at '100' so the branch is started with a value ending in zero. Approved by: re (implicit) Sponsored by: The FreeBSD Foundation /freebsd-10.0-release /freebsd-10.0-release/sys/conf/newvers.sh /freebsd-10.0-release/sys/sys/param.h
# 256281	10-Oct-2013	gjb	Copy head (r256279) to stable/10 as part of the 10.0-RELEASE cycle. Approved by: re (implicit) Sponsored by: The FreeBSD Foundation
# 237027	13-Jun-2012	jkim	- Fix resumectx() prototypes to reflect reality. - For i386, simply jump to resumectx() with PCB in %ecx. - Fix a style(9) nit while I am here.
# 236772	08-Jun-2012	iwasaki	Add x86/acpica/acpi_wakeup.c for amd64 and i386. Difference of suspend/resume procedures are minimized among them. common: - Add global cpuset suspended_cpus to indicate APs are suspended/resumed. - Remove acpi_waketag and acpi_wakemap from acpivar.h (no longer used). - Add some variables in acpi_wakecode.S in order to minimize the difference among amd64 and i386. - Disable load_cr3() because now CR3 is restored in resumectx(). amd64: - Add suspend/resume related members (such as MSR) in PCB. - Modify savectx() for above new PCB members. - Merge acpi_switch.S into cpu_switch.S as resumectx(). i386: - Merge(and remove) suspendctx() into savectx() in order to match with amd64 code. Reviewed by: attilio@, acpi@
# 235622	18-May-2012	iwasaki	Add SMP/i386 suspend/resume support. Most part is merged from amd64. - i386/acpica/acpi_wakecode.S Replaced with amd64 code (from realmode to paging enabling code). - i386/acpica/acpi_wakeup.c Replaced with amd64 code (except for wakeup_pagetables stuff). - i386/include/pcb.h - i386/i386/genassym.c Added PCB new members (CR0, CR2, CR4, DS, ED, FS, SS, GDT, IDT, LDT and TR) needed for suspend/resume, not for context switch. - i386/i386/swtch.s Added suspendctx() and resumectx(). Note that savectx() was not changed and used for suspending (while amd64 code uses it). BSP and AP execute the same sequence, suspendctx(), acpi_wakecode() and resumectx() for suspend/resume (in case of UP system also). - i386/i386/apic_vector.s Added cpususpend(). - i386/i386/mp_machdep.c - i386/include/smp.h Added cpususpend_handler(). - i386/include/apicvar.h - kern/subr_smp.c - sys/smp.h Added IPI_SUSPEND and suspend_cpus(). - i386/i386/initcpu.c - i386/i386/machdep.c - i386/include/md_var.h - pc98/pc98/machdep.c Moved initializecpu() declarations to md_var.h. MFC after: 3 days
# 210617	29-Jul-2010	jkim	MFamd64: r210615 Fix another fallout from r208833. savectx() is used to save CPU context for crash dump (dumppcb) and kdb (stoppcbs). For both cases, we cannot have a valid pointer in pcb_save. This should restore the previous behaviour.
# 208833	05-Jun-2010	kib	Introduce the x86 kernel interfaces to allow kernel code to use FPU/SSE hardware. Caller should provide a save area that is chained into the stack of the areas; pcb save_area for usermode FPU state is on top. The pcb now contains a pointer to the current FPU saved area, used during FPUDNA handling and context switches. There is also a facility to allow the kernel thread to use pcb save_area. Change the dreaded warnings "npxdna in kernel mode!" into the panics when FPU usage is not registered. KPI discussed with: fabient Tested by: pho, fabient Hardware provided by: Sentex Communications MFC after: 1 month
# 187948	31-Jan-2009	obrien	Change some movl's to mov's. Newer GAS no longer accept 'movl' instructions for moving between a segment register and a 32-bit memory location. Looked at by: jhb
# 181792	16-Aug-2008	kmacy	Call in to xen for privileged aspects of context switching MFC after: 1 month
# 171916	22-Aug-2007	jkoshy	Assign sizes to assembly language support functions. Approved by: re (kensmith)
# 171480	17-Jul-2007	jeff	- Add support for blocking and releasing threads to i386 cpu_switch(). This is required for per-cpu scheduler lock support. Obtained from: attilio Tested by: current@ many users Approved by: re
# 170368	06-Jun-2007	davidxu	Backout experimental adaptive-spin umtx code.
# 165369	20-Dec-2006	davidxu	Add a lwpid field into per-cpu structure, the lwpid represents current running thread's id on each cpu. This allow us to add in-kernel adaptive spin for user level mutex. While spinning in user space is possible, without correct thread running state exported from kernel, it hardly can be implemented efficiently without wasting cpu cycles, however exporting thread running state unlikely will be implemented soon as it has to design and stablize interfaces. This implementation is transparent to user space, it can be disabled dynamically. With this change, mutex ping-pong program's performance is improved massively on SMP machine. performance of mysql super-smack select benchmark is increased about 7% on Intel dual dual-core2 Xeon machine, it indicates on systems which have bunch of cpus and system-call overhead is low (athlon64, opteron, and core-2 are known to be fast), the adaptive spin does help performance. Added sysctls: kern.threads.umtx_dflt_spins if the sysctl value is non-zero, a zero umutex.m_spincount will cause the sysctl value to be used a spin cycle count. kern.threads.umtx_max_spins the sysctl sets upper limit of spin cycle count. Tested on: Athlon64 X2 3800+, Dual Xeon 5130
# 154502	18-Jan-2006	davidxu	Eliminate a stale instruction introduced in revision 1.136.
# 153836	29-Dec-2005	davidxu	Remove pcb_switchout, it has not been used for a long time.
# 153726	25-Dec-2005	davidxu	Move global variable private_tss into per-cpu area. Reviewed by: jhb
# 149142	16-Aug-2005	jhb	Clarify a comment.
# 145034	13-Apr-2005	peter	Change the segment limits to 4GB, we set the user accessible bit on all of the kernel address space already. Intel recommend this anyway, because using a non-4GB limit adds an additional clock cycle to address generation. We were able to install 4GB segments into the LDT, so any limits we imposed on %cs and %ds were academic anyway. More importantly, this allows us to make a page in the kernel readable to user applications, for holding things like the signal trampoline and other fun things. Move the user %cs/%ds segments from the LDT to the GDT. There was no good reason for them to be there anyway. The old LDT entries are still there but we can now relax the restriction that prevented users from emptying the default LDT entries. Putting user and kernel %cs and %ds together allows us to access the fast sysenter/sysexit/syscall/sysret instructions. syscall/sysret in particular require that the user/kernel segments be laid out this way. Reserve a slot specifically for NDIS while here. Create two user controllable slots in the GDT that are context switched with the (kernel) thread. This allows user applications to set two user privilige selectors to arbitary values. Create i386_set_fsbase(void *base) and friends. (get/set, fs/gs). For i386, %gs is used by tls and the thread libraries and this means that user processes no longer have to have the cost of having a custom LDT, and we will no longer to do a ldt switch when activating a kthread/ithread in the usual case any more. In other words, we can now set the base address for %fs and %gs to arbitary addresses without the pain of messing with ldt segments.
# 132213	15-Jul-2004	jhb	Fix a typo in a comment.
# 130164	06-Jun-2004	phk	Remove filename+line number from panic messages.
# 128019	07-Apr-2004	imp	Remove advertising clause from University of California Regent's license, per letter dated July 22, 1999 and email from Peter Wemm, Alan Cox and Robert Watson. Approved by: core, peter, alc, rwatson
# 124732	19-Jan-2004	phk	Add linenumber and source filename to panic(9) output. Ideally a traceback should be printed too, any takers ?
# 120602	30-Sep-2003	jeff	- On my Pentium4-M laptop, invalpg takes ~1100 cycles if the page is found in the TLB and ~1600 if it is not. Therefore, it is more effecient to invalidate the TLB after operations that use CMAP rather than before. - So that the tlb is invalidated prior to switching off of a processor, we must change the switchin functions to switchout functions. - Remove td_switchout from the thread and move it to the x86 pcb. - Move the code that calls switchout into swtch.s. These changes make this optimization truely x86 specific.
# 117372	09-Jul-2003	peter	unifdef -DLAZY_SWITCH and start to tidy up the associated glue.
# 116930	27-Jun-2003	peter	Tidy up leftover lazy_switch instrumentation that is no longer needed. This cleans up some #ifdef hell.
# 116929	27-Jun-2003	peter	groan. I can't win today. Fix manual transcription error so that the PAE ifdef is correct. Pointy hat assigned by: kan
# 116928	27-Jun-2003	peter	Make LAZY_SWITCH work with PAE
# 116926	27-Jun-2003	peter	Fix the false IPIs on smp when using LAZY_SWITCH caused by pmap_activate() not releasing the pm_active bit in the old pmap.
# 113148	05-Apr-2003	peter	Unbreak the !LAZY_SWITCH case. I #ifdef'ed too much when I added the ifdefs prior to commit and killed the same-address-space test. Submitted by: bde
# 112993	02-Apr-2003	peter	Commit a partial lazy thread switch mechanism for i386. it isn't as lazy as it could be and can do with some more cleanup. Currently its under options LAZY_SWITCH. What this does is avoid %cr3 reloads for short context switches that do not involve another user process. ie: we can take an interrupt, switch to a kthread and return to the user without explicitly flushing the tlb. However, this isn't as exciting as it could be, the interrupt overhead is still high and too much blocks on Giant still. There are some debug sysctls, for stats and for an on/off switch. The main problem with doing this has been "what if the process that you're running on exits while we're borrowing its address space?" - in this case we use an IPI to give it a kick when we're about to reclaim the pmap. Its not compiled in unless you add the LAZY_SWITCH option. I want to fix a few more things and get some more feedback before turning it on by default. This is NOT a replacement for Bosko's lazy interrupt stuff. This was more meant for the kthread case, while his was for interrupts. Mine helps a little for interrupts, but his helps a lot more. The stats are enabled with options SWTCH_OPTIM_STATS - this has been a pseudo-option for years, I just added a bunch of stuff to it. One non-trivial change was to select a new thread before calling cpu_switch() in the first place. This allows us to catch the silly case of doing a cpu_switch() to the current process. This happens uncomfortably often. This simplifies a bit of the asm code in cpu_switch (no longer have to call choosethread() in the middle). This has been implemented on i386 and (thanks to jake) sparc64. The others will come soon. This is actually seperate to the lazy switch stuff. Glanced at by: jake, jhb
# 109715	22-Jan-2003	peter	Now that TPR isn't bogusly raised at boot, there is no need to clear it at context switch.
# 100432	21-Jul-2002	peter	Move SWTCH_OPTIM_STATS related code out of cpufunc.h. (This sort of stat gathering is not an x86 cpu feature)
# 99887	12-Jul-2002	jhb	Set the thread state of the newly chosen to run thread to TDS_RUNNING in choosethread() in MI C code instead of doing it in in assembly in all the various cpu_switch() functions. This fixes problems on ia64 and sparc64. Reviewed by: julian, peter, benno Tested on: i386, alpha, sparc64
# 99072	29-Jun-2002	julian	Part 1 of KSE-III The ability to schedule multiple threads per process (one one cpu) by making ALL system calls optionally asynchronous. to come: ia64 and power-pc patches, patches for gdb, test program (in tools) Reviewed by: Almost everyone who counts (at various times, peter, jhb, matt, alfred, mini, bernd, and a cast of thousands) NOTE: this is still Beta code, and contains lots of debugging stuff. expect slight instability in signals..
# 93264	27-Mar-2002	dillon	Compromise for critical()/cpu_critical() recommit. Cleanup the interrupt disablement assumptions in kern_fork.c by adding another API call, cpu_critical_fork_exit(). Cleanup the td_savecrit field by moving it from MI to MD. Temporarily move cpu_critical() from <arch>/include/cpufunc.h to <arch>/<arch>/critical.c (stage-2 will clean this up). Implement interrupt deferral for i386 that allows interrupts to remain enabled inside critical sections. This also fixes an IPI interlock bug, and requires uses of icu_lock to be enclosed in a true interrupt disablement. This is the stage-1 commit. Stage-2 will occur after stage-1 has stabilized, and will move cpu_critical() into its own header file(s) + other things. This commit may break non-i386 architectures in trivial ways. This should be temporary. Reviewed by: core Approved by: core
# 91328	26-Feb-2002	dillon	revert last commit temporarily due to whining on the lists.
# 91315	26-Feb-2002	dillon	STAGE-1 of 3 commit - allow (but do not require) interrupts to remain enabled in critical sections and streamline critical_enter() and critical_exit(). This commit allows an architecture to leave interrupts enabled inside critical sections if it so wishes. Architectures that do not wish to do this are not effected by this change. This commit implements the feature for the I386 architecture and provides a sysctl, debug.critical_mode, which defaults to 1 (use the feature). For now you can turn the sysctl on and off at any time in order to test the architectural changes or track down bugs. This commit is just the first stage. Some areas of the code, specifically the MACHINE_CRITICAL_ENTER #ifdef'd code, is strictly temporary and will be cleaned up in the STAGE-2 commit when the critical_() functions are moved entirely into MD files. The following changes have been made: critical_enter() and critical_exit() for I386 now simply increment and decrement curthread->td_critnest. They no longer disable hard interrupts. When critical_exit() decrements the counter to 0 it effectively calls a routine to deal with whatever interrupts were deferred during the time the code was operating in a critical section. Other architectures are unaffected. * fork_exit() has been conditionalized to remove MD assumptions for the new code. Old code will still use the old MD assumptions in regards to hard interrupt disablement. In STAGE-2 this will be turned into a subroutine call into MD code rather then hardcoded in MI code. The new code places the burden of entering the critical section in the trampoline code where it belongs. * I386: interrupts are now enabled while we are in a critical section. The interrupt vector code has been adjusted to deal with the fact. If it detects that we are in a critical section it currently defers the interrupt by adding the appropriate bit to an interrupt mask. * In order to accomplish the deferral, icu_lock is required. This is i386-specific. Thus icu_lock can only be obtained by mainline i386 code while interrupts are hard disabled. This change has been made. * Because interrupts may or may not be hard disabled during a context switch, cpu_switch() can no longer simply assume that PSL_I will be in a consistent state. Therefore, it now saves and restores eflags. * FAST INTERRUPT PROVISION. Fast interrupts are currently deferred. The intention is to eventually allow them to operate either while we are in a critical section or, if we are able to restrict the use of sched_lock, while we are not holding the sched_lock. * ICU and APIC vector assembly for I386 cleaned up. The ICU code has been cleaned up to match the APIC code in regards to format and macro availability. Additionally, the code has been adjusted to deal with deferred interrupts. * Deferred interrupts use a per-cpu boolean int_pending, and masks ipending, spending, and fpending. Being per-cpu variables it is not currently necessary to lock; bus cycles modifying them. Note that the same mechanism will enable preemption to be incorporated as a true software interrupt without having to further hack up the critical nesting code. * Note: the old critical_enter() code in kern/kern_switch.c is currently #ifdef to be compatible with both the old and new methodology. In STAGE-2 it will be moved entirely to MD code. Performance issues: One of the purposes of this commit is to enhance critical section performance, specifically to greatly reduce bus overhead to allow the critical section code to be used to protect per-cpu caches. These caches, such as Jeff's slab allocator work, can potentially operate very quickly making the effective savings of the new critical section code's performance very significant. The second purpose of this commit is to allow architectures to enable certain interrupts while in a critical section. Specifically, the intention is to eventually allow certain FAST interrupts to operate rather then defer. The third purpose of this commit is to begin to clean up the critical_enter()/critical_exit()/cpu_critical_enter()/ cpu_critical_exit() API which currently has serious cross pollution in MI code (in fork_exit() and ast() for example). The fourth purpose of this commit is to provide a framework that allows kernel-preempting software interrupts to be implemented cleanly. This is currently used for two forward interrupts in I386. Other architectures will have the choice of using this infrastructure or building the functionality directly into critical_enter()/ critical_exit(). Finally, this commit is designed to greatly improve the flexibility of various architectures to manage critical section handling, software interrupts, preemption, and other highly integrated architecture-specific details.
# 90372	07-Feb-2002	peter	Attempt to patch up some style bugs introduced in the previous commit
# 90361	07-Feb-2002	julian	Pre-KSE/M3 commit. this is a low-functionality change that changes the kernel to access the main thread of a process via the linked list of threads rather than assuming that it is embedded in the process. It IS still embeded there but remove all teh code that assumes that in preparation for the next commit which will actually move it out. Reviewed by: peter@freebsd.org, gallatin@cs.duke.edu, benno rice,
# 89466	17-Jan-2002	bde	Changed the type of pcb_flags from u_char to u_int and adjusted things. This removes the only atomic operation on a char type in the entire kernel.
# 87702	11-Dec-2001	jhb	Overhaul the per-CPU support a bit: - The MI portions of struct globaldata have been consolidated into a MI struct pcpu. The MD per-CPU data are specified via a macro defined in machine/pcpu.h. A macro was chosen over a struct mdpcpu so that the interface would be cleaner (PCPU_GET(my_md_field) vs. PCPU_GET(md.md_my_md_field)). - All references to globaldata are changed to pcpu instead. In a UP kernel, this data was stored as global variables which is where the original name came from. In an SMP world this data is per-CPU and ideally private to each CPU outside of the context of debuggers. This also included combining machine/globaldata.h and machine/globals.h into machine/pcpu.h. - The pointer to the thread using the FPU on i386 was renamed from npxthread to fpcurthread to be identical with other architectures. - Make the show pcpu ddb command MI with a MD callout to display MD fields. - The globaldata_register() function was renamed to pcpu_init() and now init's MI fields of a struct pcpu in addition to registering it with the internal array and list. - A pcpu_destroy() function was added to remove a struct pcpu from the internal array and list. Tested on: alpha, i386 Reviewed by: peter, jake
# 85711	29-Oct-2001	jhb	Fix a typo in comment and #ifdef fixes: GRAP_PRIO -> GRAB_PRIO so that x86 SMP kernels actually boot again to single user mode. Pointy hat to: jhb Noticed by: jlemon
# 85627	28-Oct-2001	jhb	- More whitespace and comment cleanups. - Remove unused sw1a label. A breakpoint can be set in choosethread() for the same effect. Reviewed by: bde Submitted by: bde (partly)
# 85491	25-Oct-2001	jhb	Currently no code does a CROSSJUMP() to sw1a, so we don't need a CROSSJUMPTARGET() for it. Submitted by: bde
# 85490	25-Oct-2001	jhb	Use %ecx instead of %ebx for the scratch register while updating %dr7 since %ecx isn't a call safe register and thus we don't have to save and restore it. Submitted by: bde
# 85489	25-Oct-2001	jhb	- Fix typo in comment from previous revision. - Fix a bug in the LDT changes where the wrong argument was passed to set_user_ldt() from cpu_switch(). The bug was passing a pointer to the ldt, but set_user_ldt() takes a pointer to the process' mdproc structure. Submitted by: bde
# 85487	25-Oct-2001	jhb	Whitespace, comment, and string fixes. Submitted by: bde (mostly)
# 85449	24-Oct-2001	jhb	Split the per-process Local Descriptor Table out of the PCB and into struct mdproc. Submitted by: Andrew R. Reiter <arr@watson.org> Silence on: -current
# 83659	19-Sep-2001	peter	Fix a mistake I made with the pcb movement relative to the stack in the KSE patch. We need to leave the 16 bytes here for enabling the trapframe to be converted to a vm86trapframe if we're switching to a vm86 context.
# 83366	12-Sep-2001	julian	KSE Milestone 2 Note ALL MODULES MUST BE RECOMPILED make the kernel aware that there are smaller units of scheduling than the process. (but only allow one thread per process at this time). This is functionally equivalent to teh previousl -current except that there is a thread associated with each process. Sorry john! (your next MFC will be a doosie!) Reviewed by: peter@freebsd.org, dillon@freebsd.org X-MFC after: ha ha ha ha
# 79893	19-Jul-2001	bsd	swtch.s: During context save, use the correct bit mask for clearing the non-reserved bits of dr7. During context restore, load dr7 in such a way as to not disturb reserved bits. machdep.c: Don't explicitly disallow the setting of the reserved bits in dr7 since we now keep from setting them when we load dr7 from the PCB. This allows one to write back the dr7 value obtained from the system without triggering an EINVAL (one of the reserved bits always seems to be set after taking a trace trap). MFC after: 7 days
# 76903	20-May-2001	bde	Use a critical region to protect saving of the npx state in savectx(). Not doing this was fairly harmless because savectx() is only called for panic dumps and the bug could at worse reset the state. savectx() is still missing saving of (volatile) debug registers, and still isn't called for core dumps.
# 76650	15-May-2001	jhb	Remove unneeded includes of sys/ipl.h and machine/ipl.h.
# 73011	25-Feb-2001	jake	Remove the leading underscore from all symbols defined in x86 asm and used in C or vice versa. The elf compiler uses the same names for both. Remove asnames.h with great prejudice; it has served its purpose. Note that this does not affect the ability to generate an aout kernel due to gcc's -mno-underscores option. moral support from: peter, jhb
# 72930	22-Feb-2001	peter	Activate USER_LDT by default. The new thread libraries are going to depend on this. The linux ABI emulator tries to use it for some linux binaries too. VM86 had a bigger cost than this and it was made default a while ago. Reviewed by: jhb, imp
# 72746	20-Feb-2001	jhb	- Don't call clear_resched() in userret(), instead, clear the resched flag in mi_switch() just before calling cpu_switch() so that the first switch after a resched request will satisfy the request. - While I'm at it, move a few things into mi_switch() and out of cpu_switch(), specifically set the p_oncpu and p_lastcpu members of proc in mi_switch(), and handle the sched_lock state change across a context switch in mi_switch(). - Since cpu_switch() no longer handles the sched_lock state change, we have to setup an initial state for sched_lock in fork_exit() before we release it.
# 72376	11-Feb-2001	jake	Implement a unified run queue and adjust priority levels accordingly. - All processes go into the same array of queues, with different scheduling classes using different portions of the array. This allows user processes to have their priorities propogated up into interrupt thread range if need be. - I chose 64 run queues as an arbitrary number that is greater than 32. We used to have 4 separate arrays of 32 queues each, so this may not be optimal. The new run queue code was written with this in mind; changing the number of run queues only requires changing constants in runq.h and adjusting the priority levels. - The new run queue code takes the run queue as a parameter. This is intended to be used to create per-cpu run queues. Implement wrappers for compatibility with the old interface which pass in the global run queue structure. - Group the priority level, user priority, native priority (before propogation) and the scheduling class into a struct priority. - Change any hard coded priority levels that I found to use symbolic constants (TTIPRI and TTOPRI). - Remove the curpriority global variable and use that of curproc. This was used to detect when a process' priority had lowered and it should yield. We now effectively yield on every interrupt. - Activate propogate_priority(). It should now have the desired effect without needing to also propogate the scheduling class. - Temporarily comment out the call to vm_page_zero_idle() in the idle loop. It interfered with propogate_priority() because the idle process needed to do a non-blocking acquire of Giant and then other processes would try to propogate their priority onto it. The idle process should not do anything except idle. vm_page_zero_idle() will return in the form of an idle priority kernel thread which is woken up at apprioriate times by the vm system. - Update struct kinfo_proc to the new priority interface. Deliberately change its size by adjusting the spare fields. It remained the same size, but the layout has changed, so userland processes that use it would parse the data incorrectly. The size constraint should really be changed to an arbitrary version number. Also add a debug.sizeof sysctl node for struct kinfo_proc.
# 72358	11-Feb-2001	markm	RIP <machine/lock.h>. Some things needed bits of <i386/include/lock.h> - cy.c now has its own (only) copy of the COM_(UN)LOCK() macros, and IMASK_(UN)LOCK() has been moved to <i386/include/apic.h> (AKA <machine/apic.h>). Reviewed by: jhb
# 72276	10-Feb-2001	jhb	- Make astpending and need_resched process attributes rather than CPU attributes. This is needed for AST's to be properly posted in a preemptive kernel. They are backed by two new flags in p_sflag: PS_ASTPENDING and PS_NEEDRESCHED. They are still accesssed by their old macros: aston(), astoff(), etc. For completeness, an astpending() macro has been added to check for a pending AST, and clear_resched() has been added to clear need_resched(). - Rename syscall2() on the x86 back to syscall() to be consistent with other architectures.
# 71294	20-Jan-2001	jake	Rename the ASSYM MTX_RECURSE to MTX_RECURSECNT in order to not conflict with the flag of the same name.
# 71257	19-Jan-2001	peter	Use #ifdef DEV_NPX from opt_npx.h instead of #if NNPX > 0 from npx.h
# 70714	06-Jan-2001	jake	Use %fs to access per-cpu variables in uni-processor kernels the same as multi-processor kernels. The old way made it difficult for kernel modules to be portable between uni-processor and multi-processor kernels. It is no longer necessary to jump through hoops. - always load %fs with the private segment on entry to the kernel - change the type of the self referntial pointer from struct privatespace to struct globaldata - make the globaldata symbol have value 0 in all cases, so the symbols in globals.s are always offsets, not aliases for fields in globaldata - define the globaldata space used for uniprocessor kernels in C, rather than assembler - change the assmebly language accessors to use %fs, add a macro PCPU_ADDR(member, reg), which loads the register reg with the address of the per-cpu variable member
# 70006	14-Dec-2000	jake	Use _lapic+offset to access the local apic from assembly language files, rather than the symbols in globals.s. The offsets are generated by genassym.
# 69971	13-Dec-2000	jake	Introduce a new potientially cleaner interface for accessing per-cpu variables from i386 assembly language. The syntax is PCPU(member) where member is the capitalized name of the per-cpu variable, without the gd_ prefix. Example: movl %eax,PCPU(CURPROC). The capitalization is due to using the offsets generated by genassym rather than the symbols provided by linking with globals.o. asmacros.h is the wrong place for this but it seemed as good a place as any for now. The old implementation in asnames.h has not been removed because it is still used to de-mangle the symbols used by the C variables for the UP case.
# 69779	08-Dec-2000	jake	Revert the previous change I made to cpu_switch. It doesn't help as much as I thought it would and according to bde was a pessimization.
# 69536	02-Dec-2000	jake	Change cpu_switch to explicitly popl the callers program counter and pushl that of the new process, rather than doing a movl (%esp) and assuming that the stack has been setup right. This make the initial stack setup slightly more sane, and will make it easier to stick an interrupted process onto the run queue without its knowing.
# 68860	17-Nov-2000	jhb	- Change extra sanity checks in cpu_switch() to be conditional on INVARIANTS instead of DIAGNOSTIC. - Remove the p_wchan check as it no longer applies since a process may be switched out during CURSIG() within msleep() or mawait(). - Remove an extra sanity check only needed during the early SMPng work.
# 67095	13-Oct-2000	peter	savectx() is now used exclusively by the crash dump system. Move the i386 specific gunk (copy %cr3 to the pcb) from the MI dumpsys() to the MD savectx().
# 66855	09-Oct-2000	bde	Unremoved used include of <machine/ipl.h>. Removing it in rev.1.95 significantly pessimized syscalls by arranging to do null rescheduling on return from every syscall. (AST_RESCHED was not defined, and the mask ~AST_RESCHED gets replaced by the useless mask ~0. This bug has been fixed before, in rev.1.92.)
# 66716	06-Oct-2000	jhb	- Change fast interrupts on x86 to push a full interrupt frame and to return through doreti to handle ast's. This is necessary for the clock interrupts to work properly. - Change the clock interrupts on the x86 to be fast instead of threaded. This is needed because both hardclock() and statclock() need to run in the context of the current process, not in a separate thread context. - Kill the prevproc hack as it is no longer needed. - We really need Giant when we call psignal(), but we don't want to block during the clock interrupt. Instead, use two p_flag's in the proc struct to mark the current process as having a pending SIGVTALRM or a SIGPROF and let them be delivered during ast() when hardclock() has finished running. - Remove CLKF_BASEPRI, which was #ifdef'd out on the x86 anyways. It was broken on the x86 if it was turned on since cpl is gone. It's only use was to bogusly run softclock() directly during hardclock() rather than scheduling an SWI. - Remove the COM_LOCK simplelock and replace it with a clock_lock spin mutex. Since the spin mutex already handles disabling/restoring interrupts appropriately, this also lets us axe all the *_intr() fu. - Back out the hacks in the APIC_IO x86 cpu_initclocks() code to use temporary fast interrupts for the APIC trial. - Add two new process flags P_ALRMPEND and P_PROFPEND to mark the pending signals in hardclock() that are to be delivered in ast(). Submitted by: jakeb (making statclock safe in a fast interrupt) Submitted by: cp (concept of delaying signals until ast())
# 66698	05-Oct-2000	jhb	- Heavyweight interrupt threads on the alpha for device I/O interrupts. - Make softinterrupts (SWI's) almost completely MI, and divorce them completely from the x86 hardware interrupt code. - The ihandlers array is now gone. Instead, there is a MI shandlers array that just contains SWI handlers. - Most of the former machine/ipl.h files have moved to a new sys/ipl.h. - Stub out all the spl*() functions on all architectures. Submitted by: dfr
# 66206	22-Sep-2000	msmith	Implement halt-on-idle in the !SMP case, which should significantly reduce power consumption on most systems.
# 65557	06-Sep-2000	jasone	Major update to the way synchronization is done in the kernel. Highlights include: * Mutual exclusion is used instead of spl(). See mutex(9). (Note: The alpha port is still in transition and currently uses both.) Per-CPU idle processes. * Interrupts are run in their own separate kernel threads and can be preempted (i386 only). Partially contributed by: BSDi (BSD/OS) Submissions by (at least): cp, dfr, dillon, grog, jake, jhb, sheldonh
# 60346	11-May-2000	peter	Move <machine/ipl.h> outside #ifdef SMP because it supplies AST_RESCHED. Without this, it shows up as an undefined symbol in /kernel. (!) (This looks very freaky when doing a nm /kernel!)
# 58764	29-Mar-2000	dillon	The SMP cleanup commit broke need_resched, this fixes that and also removed unncessary MPLOCKED and 'lock' prefixes from the interrupt nesting level, since (A) the MP lock is held at the time, and (B) since the neting level is restored prior to return any interrupted code will see a consistent value.
# 58717	28-Mar-2000	dillon	Commit major SMP cleanups and move the BGL (big giant lock) in the syscall path inward. A system call may select whether it needs the MP lock or not (the default being that it does need it). A great deal of conditional SMP code for various deadended experiments has been removed. 'cil' and 'cml' have been removed entirely, and the locking around the cpl has been removed. The conditional separately-locked fast-interrupt code has been removed, meaning that interrupts must hold the CPL now (but they pretty much had to anyway). Another reason for doing this is that the original separate-lock for interrupts just doesn't apply to the interrupt thread mechanism being contemplated. Modifications to the cpl may now ONLY occur while holding the MP lock. For example, if an otherwise MP safe syscall needs to mess with the cpl, it must hold the MP lock for the duration and must (as usual) save/restore the cpl in a nested fashion. This is precursor work for the real meat coming later: avoiding having to hold the MP lock for common syscalls and I/O's and interrupt threads. It is expected that the spl mechanisms and new interrupt threading mechanisms will be able to run in tandem, allowing a slow piecemeal transition to occur. This patch should result in a moderate performance improvement due to the considerable amount of code that has been removed from the critical path, especially the simplification of the spl*() calls. The real performance gains will come later. Approved by: jkh Reviewed by: current, bde (exception.s) Some work taken from: luoqi's patch
# 55316	02-Jan-2000	obrien	The issue fixed in revision 1.88 was raised by Gerard Roudier <groudier@club-internet.fr>
# 55312	02-Jan-2000	phk	Move the "sti" instruction to right before the "hlt" to close a tiny race condition. Obtained from: bde and/or obrien
# 50477	27-Aug-1999	peter	$Id$ -> $FreeBSD$
# 50036	18-Aug-1999	peter	Use the MI process selection. We use a quick routine to decide whether to get the mplock and enter the kernel to run a process in the SMP case.
# 48729	10-Jul-1999	bde	Go back to the old (icu.s rev.1.7 1993) way of keeping the AST-pending bit separate from ipending, since this is simpler and/or necessary for SMP and may even be better for UP. Reviewed by: alc, luoqi, tegge
# 48691	09-Jul-1999	jlemon	Implement support for hardware debug registers on the i386. Submitted by: Brian Dean <brdean@unx.sas.com>
# 48505	03-Jul-1999	alc	An SMP-specific change: Add the lock prefix to RMW operations on ipending.
# 47678	01-Jun-1999	jlemon	Unifdef VM86. Reviewed by: silence on on -current
# 47081	12-May-1999	luoqi	Unbreak VESA on SMP.
# 46548	06-May-1999	bde	Fixed profiling of elf kernels. Made high resolution profiling compile for elf kernels (it is broken for all kernels due to lack of egcs support). Renaming of many assembler labels is avoided by declaring by declaring the labels that need to be visible to gprof as having type "function" and depending on the elf version of gprof being zealous about discarding the others. A few type declarations are still missing, mainly for SMP. PR: 9413 Submitted by: Assar Westerlund <assar@sics.se> (initial parts)
# 46129	27-Apr-1999	luoqi	Enable vmspace sharing on SMP. Major changes are, - %fs register is added to trapframe and saved/restored upon kernel entry/exit. - Per-cpu pages are no longer mapped at the same virtual address. - Each cpu now has a separate gdt selector table. A new segment selector is added to point to per-cpu pages, per-cpu global variables are now accessed through this new selector (%fs). The selectors in gdt table are rearranged for cache line optimization. - fask_vfork is now on as default for both UP and SMP. - Some aio code cleanup. Reviewed by: Alan Cox <alc@cs.rice.edu> John Dyson <dyson@iquest.net> Julian Elischer <julian@whistel.com> Bruce Evans <bde@zeta.org.au> David Greenman <dg@root.com>
# 45252	02-Apr-1999	alc	Put in place the infrastructure for improved UP and SMP TLB management. In particular, replace the unused field pmap::pm_flag by pmap::pm_active, which is a bit mask representing which processors have the pmap activated. (Thus, it is a simple Boolean on UPs.) Also, eliminate an unnecessary memory reference from cpu_switch() in swtch.s. Assisted by: John S. Dyson <dyson@iquest.net> Tested by: Luoqi Chen <luoqi@watermarkgroup.com>, Poul-Henning Kamp <phk@critter.freebsd.dk>
# 44917	20-Mar-1999	alc	Eliminate a pointless TLB flush from the SMP idle loop. Submitted by: Luoqi Chen <luoqi@watermarkgroup.com> Reviewed by: "John S. Dyson" <toor@dyson.iquest.net>
# 44844	18-Mar-1999	jlemon	Change btrl/btsl to cmpl/movl, since each cpu now has their own copy of private_tss, and there's no need to use a bit array. Also fixes the problem of using `je' after btrl, since cmpl sets ZF. Noticed by: Luoqi, on -current
# 37919	28-Jul-1998	bde	Micro-optimized and cleaned up the clearing of switchtime in idle(). Cleaned up the conditionals in the disgusting SMP ifdef in idle().
# 36441	28-May-1998	phk	Some cleanups related to timecounters and weird ifdefs in <sys/time.h>. Clean up (or if antipodic: down) some of the msgbuf stuff. Use an inline function rather than a macro for timecounter delta. Maintain process "on-cpu" time as 64 bits of microseconds to avoid needless second rollover overhead. Avoid calling microuptime the second time in mi_switch() if we do not pass through _idle in cpu_switch() This should reduce our context-switch overhead a bit, in particular on pre-P5 and SMP systems. WARNING: Programs which muck about with struct proc in userland will have to be fixed. Reviewed, but found imperfect by: bde
# 36214	19-May-1998	dufault	Remove option for SCHED_FIFO. With this optional, SCHED_FIFO is the same as RTPRIO_IDLE when it falls through to the default.
# 35977	12-May-1998	dyson	Some temporary fixes to SMP to make it more scheduling and signal friendly. This is a result of discussions on the mailing lists. Kudos to those who have found the issue and created work-arounds. I have chosen Tor's fix for now, before we can all work the issue more completely. Submitted by: Tor Egge
# 35075	06-Apr-1998	peter	_curpcb is always defined in globals.s instead of here in #ifdefs
# 34925	28-Mar-1998	dufault	Finish _POSIX_PRIORITY_SCHEDULING. Needs P1003_1B and _KPOSIX_PRIORITY_SCHEDULING options to work. Changes: Change all "posix4" to "p1003_1b". Misnamed files are left as "posix4" until I'm told if I can simply delete them and add new ones; Add _POSIX_PRIORITY_SCHEDULING system calls for FreeBSD and Linux; Add man pages for _POSIX_PRIORITY_SCHEDULING system calls; Add options to LINT; Minor fixes to P1003_1B code during testing.
# 34028	04-Mar-1998	dufault	Reviewed by: msmith, bde long ago Fix for RTPRIO scheduler to eliminate invalid context switches.
# 33134	06-Feb-1998	eivind	Back out DIAGNOSTIC changes.
# 33108	04-Feb-1998	eivind	Turn DIAGNOSTIC into a new-style option.
# 31723	15-Dec-1997	tegge	Add support for low resolution SMP kernel profiling. - A nonprofiling version of s_lock (called s_lock_np) is used by mcount. - When profiling is active, more registers are clobbered in seemingly simple assembly routines. This means that some callers needed to save/restore extra registers. - The stack pointer must have space for a 'fake' return address in idle, to avoid stack underflow.
# 31709	14-Dec-1997	dyson	After one of my analysis passes to evaluate methods for SMP TLB mgmt, I noticed some major enhancements available for UP situations. The number of UP TLB flushes is decreased much more than significantly with these changes. Since a TLB flush appears to cost minimally approx 80 cycles, this is a "nice" enhancement, equiv to eliminating between 40 and 160 instructions per TLB flush. Changes include making sure that kernel threads all use the same PTD, and eliminate unneeded PTD switches at context switch time.
# 30265	10-Oct-1997	peter	Convert the VM86 option from a global option to an option only depended on by the files that use it. Changing the VM86 option now only causes a recompile of a dozen files or so rather than the entire kernel.
# 29663	21-Sep-1997	peter	Implement the parts needed for VM86 under SMP.
# 29213	07-Sep-1997	fsmp	General cleanup of the lock pushdown code. They are grouped and enabled from machine/smptests.h: #define PUSHDOWN_LEVEL_1 #define PUSHDOWN_LEVEL_2 #define PUSHDOWN_LEVEL_3 #define PUSHDOWN_LEVEL_4_NOT
# 29151	05-Sep-1997	peter	Argh, what was I thinking?? Don't (yet) halt the CPU in the idle loop while waiting for an interrupt (rather than spinning on the runqueue status bits), since the other cpu can put stuff in there and the sleeping cpu may not get an interrupt for a while. When we have a reschedule IPI, this can come back. Pointed out by: fsmp
# 28808	26-Aug-1997	peter	Clean up the SMP AP bootstrap and eliminate the wretched idle procs. - We now have enough per-cpu idle context, the real idle loop has been revived (cpu's halt now with nothing to do). - Some preliminary support for running some operations outside the global lock (eg: zeroing "free but not yet zeroed pages") is present but appears to cause problems. Off by default. - the smp_active sysctl now behaves differently. It's merely a 'true/false' option. Setting smp_active to zero causes the AP's to halt in the idle loop and stop scheduling processes. - bootstrap is a lot safer. Instead of sharing a statically compiled in stack a number of times (which has caused lots of problems) and then abandoning it, we use the idle context to boot the AP's directly. This should help >2 cpu support since the bootlock stuff was in doubt. - print physical apic id in traps.. helps identify private pages getting out of sync. (You don't want to know how much hair I tore out with this!) More cleanup to follow, this is more of a checkpoint than a 'finished' thing.
# 27993	08-Aug-1997	dyson	VM86 kernel support. Work done by BSDI, Jonathan Lemon <jlemon@americantv.com>, Mike Smith <msmith@gsoft.com.au>, Sean Eric Fagan <sef@kithrup.com>, and probably alot of others. Submitted by: Jnathan Lemon <jlemon@americantv.com>
# 27893	04-Aug-1997	fsmp	Eliminate frequent silo overflows by restoring the TEST_LOPRIO code. This code was eliminated when the PEND_INTS algorithm was added. But it was discovered that PEND_INTS only worsen latency for FAST_INTR() routines, which can't be marked pending. Noticed & debugged by: dave adkins <adkin003@gold.tc.umn.edu>
# 27780	31-Jul-1997	fsmp	Converted the TEST_LOPRIO code to default. Created mplock functions that save/restore NO registers. Minor cleanup.
# 27535	20-Jul-1997	bde	Removed unused #includes.
# 27407	15-Jul-1997	fsmp	Tighten up asm code for TEST_PRIO and other misc. things. Use some new defines in place of "magic numbers".
# 27133	30-Jun-1997	bde	Un-inline a call to spl0(). It is not time critical, and was only inline because there was no non-inline spl0() to call. Don't frob intr_nesting_level in idle() or cpu_switch(). Interrupts are mostly disabled then, so the frobbing had little effect.
# 26812	22-Jun-1997	peter	Preliminary support for per-cpu data pages. This eliminates a lot of #ifdef SMP type code. Things like _curproc reside in a data page that is unique on each cpu, eliminating the expensive macros like: #define curproc (SMPcurproc[cpunumber()]) There are some unresolved bootstrap and address space sharing issues at present, but Steve is waiting on this for other work. There is still some strictly temporary code present that isn't exactly pretty. This is part of a larger change that has run into some bumps, this part is standalone so it should be safe. The temporary code goes away when the full idle cpu support is finished. Reviewed by: fsmp, dyson
# 26494	07-Jun-1997	bde	Preserve %fs and %gs across context switches. This has a relatively low cost since it is only done in cpu_switch(), not for every exception. The extra state is kept in the pcb, and handled much like the npx state, with similar deficiencies (the state is not preserved across signal handlers, and error handling loses state).
# 26309	31-May-1997	peter	Include file updates.. <machine/spl.h> -> <machine/ipl.h>, add <machine/ipl.h> to those files that were depending on getting SWI_* implicitly via <machine/cpufunc.h>
# 26267	29-May-1997	peter	remove no longer needed opt_smp.h includes
# 25243	28-Apr-1997	fsmp	cleaned out an old FIXME.
# 25164	26-Apr-1997	peter	Man the liferafts! Here comes the long awaited SMP -> -current merge! There are various options documented in i386/conf/LINT, there is more to come over the next few days. The kernel should run pretty much "as before" without the options to activate SMP mode. There are a handful of known "loose ends" that need to be fixed, but have been put off since the SMP kernel is in a moderately good condition at the moment. This commit is the result of the tinkering and testing over the last 14 months by many people. A special thanks to Steve Passe for implementing the APIC code!
# 25083	22-Apr-1997	jdp	Make the necessary changes so that an ELF kernel can be built. I have successfully built, booted, and run a number of different ELF kernel configurations, including GENERIC. LINT also builds and links cleanly, though I have not tried to boot it. The impact on developers is virtually nil, except for two things. All linker sets that might possibly be present in the kernel must be listed in "sys/i386/i386/setdefs.h". And all C symbols that are also referenced from assembly language code must be listed in "sys/i386/include/asnames.h". It so happens that failure to do these things will have no impact on the a.out kernel. But it will break the build of the ELF kernel. The ELF bootloader works, but it is not ready to commit quite yet.
# 25038	20-Apr-1997	phk	Fix up the "hlt vector" change I made. Reviewed by: bde, bde, bde
# 24933	14-Apr-1997	phk	Forget all about APM. Instead of "hlt" call through a vector which APM can then fiddle with. Default for the vector is to "htl; ret"
# 24691	07-Apr-1997	peter	The biggie: Get rid of the UPAGES from the top of the per-process address space. (!) Have each process use the kernel stack and pcb in the kvm space. Since the stacks are at a different address, we cannot copy the stack at fork() and allow the child to return up through the function call tree to return to user mode - create a new execution context and have the new process begin executing from cpu_switch() and go to user mode directly. In theory this should speed up fork a bit. Context switch the tss_esp0 pointer in the common tss. This is a lot simpler since than swithching the gdt[GPROC0_SEL].sd.sd_base pointer to each process's tss since the esp0 pointer is a 32 bit pointer, and the sd_base setting is split into three different bit sections at non-aligned boundaries and requires a lot of twiddling to reset. The 8K of memory at the top of the process space is now empty, and unmapped (and unmappable, it's higher than VM_MAXUSER_ADDRESS). Simplity the pmap code to manage process contexts, we no longer have to double map the UPAGES, this simplifies and should measuably speed up fork(). The following parts came from John Dyson: Set PG_G on the UPAGES that are now in kernel context, and invalidate them when swapping them out. Move the upages object (upobj) from the vmspace to the proc structure. Now that the UPAGES (pcb and kernel stack) are out of user space, make rfork(..RFMEM..) do what was intended by sharing the vmspace entirely via reference counting rather than simply inheriting the mappings.
# 22975	22-Feb-1997	peter	Back out part 1 of the MCFH that changed $Id$ to $FreeBSD$. We are not ready for it yet.
# 21673	14-Jan-1997	jkh	Make the long-awaited change from $Id$ to $FreeBSD$ This will make a number of things easier in the future, as well as (finally!) avoiding the Id-smashing problem which has plagued developers for so long. Boy, I'm glad we're not using sup anymore. This update would have been insane otherwise.
# 18963	16-Oct-1996	bde	Fixed miscounting for non-statistical (GUPROF) profiling: - use CROSSJUMP() and CROSSJUMP_LABEL() for conditional jumps from idle() into cpu_switch() and vice versa. - moved badsw code to after cpu_switch(). Cosmetic changes: - moved sw0 string to be immediately after its caller (badsw). - removed unused #include.
# 18380	19-Sep-1996	phk	Add APM_IDLE_CPU option, that is off by default. I maintain that it saves more power to simply "hlt" the CPU than to spend tons of time trying to tell the APM bios to do the same. In particular if you do it 100 times a second...
# 17371	31-Jul-1996	bde	Eliminated pcb_inl. It was always 0 because context switches don't occur in interrupt handlers.
# 17366	31-Jul-1996	dg	Converted timer/run queues to 4.4BSD queue style. Removed old and unused sleep(). Implemented wakeup_one() which may be used in the future to combat the "thundering herd" problem for some special cases. Reviewed by: dyson
# 16725	25-Jun-1996	bde	trap.c: Fixed profiling of system times. It was pre-4.4Lite and didn't support statclocks. System times were too small by a factor of 8. Handle deferred profiling ticks the 4.4Lite way: use addupc_task() instead of addupc(). Call addupc_task() directly instead of using the ADDUPC() macro. Removed vestigial support for PROFTIMER. switch.s: Removed addupc(). resourcevar.h: Removed ADDUPC() and declarations of addupc(). cpu.h: Updated a comment. i386's never were tahoe's, and the deferred profiling tick became (possibly) multiple ticks in 4.4Lite. Obtained from: mostly from NetBSD
# 16723	25-Jun-1996	bde	Save John Polstra's initial fix for profiling for reference. The multiplication in addupc() overflowed for addresses >= 256K, assuming the usual profil(2) scale parameter of 0x8000. addupc() will go away soon. Submitted by: John Polstra <jdp@polstra.com>
# 15501	01-May-1996	bde	Don't return unused values in cpu_switch() or savectx(). Don't preserve unused registers in the NPX case in savectx().
# 15379	25-Apr-1996	phk	Fix cpu_fork for real. Suggested by: bde
# 15301	18-Apr-1996	phk	Fix a bogon. cpu_fork & savectx ecpected cpu_switch to restore %eax, they shouldn't.
# 15232	13-Apr-1996	bde	Use PCB_SAVEFPU_SIZE instead of a too-small size in savectx(). This bug only affected FPU emulators. It might have caused bogus FPU states in core dumps and in the child pcb after a fork. Emulated FPU states in core dumps don't work for other reasons, and the child FPU state is reinitialized by exec, so the problem might not have caused any noticeable affects. Cleaned up #includes.
# 14691	19-Mar-1996	nate	Always enable interrupts before calling the APM idle/busy routines. Suggested by: phk@FreeBSD.org
# 14577	12-Mar-1996	nate	Removed undocumented an unused APM_SLOWSTART code.
# 13908	04-Feb-1996	dg	Rewrote cpu_fork so that it doesn't use pmap_activate, and removed pmap_activate since it's not used anymore. Changed cpu_fork so that it uses one line of inline assembly rather than calling mvesp() to get the current stack pointer. Removed mvesp() since it is no longer being used.
# 13852	02-Feb-1996	dg	Killed last change - it was bogus. cpu_switch() already assumes that return address is on the stack.
# 13740	30-Jan-1996	dg	savectx() strikes again: the saved stack pointer wasn't properly adjusted to remove the return address. It's only the frame pointer and luck that allowed the code to work at all.
# 13580	23-Jan-1996	dg	Simplified savectx() a little and fixed a bug that caused it to return garbage in the child process rather than "1" like it is supposed to. Reviewed by: bde
# 13203	03-Jan-1996	wollman	Converted two options over to the new scheme: USER_LDT and KTRACE.
# 12952	21-Dec-1995	dg	Rewrote most of the ddb stack traceback code. These changes are smarter about decoding trap/syscall/interrupt frames and generally works better than the previous stuff. Removed some special (incorrect) frobbing of the frame pointer that was messing some things up with the new traceback code.
# 12722	10-Dec-1995	phk	Staticize and cleanup. remove a TON of #includes from machdep.
# 12702	09-Dec-1995	phk	Remove various unused symbols and procedures.
# 10546	03-Sep-1995	dyson	Machine dependent routines to support pre-zeroed free pages. This significantly improves demand zero performance.
# 6512	17-Feb-1995	phk	This is the latest version of the APM stuff from HOSOKAWA, I have looked briefly over it, and see some serious architectural issues in this stuff. On the other hand, I doubt that we will have any solution to these issues before 2.1, so we might as well leave this in. Most of the stuff is bracketed by #ifdef's so it shouldn't matter too much in the normal case. Reviewed by: phk Submitted by: HOSOKAWA, Tatsumi <hosokawa@mt.cs.keio.ac.jp>
# 5769	21-Jan-1995	bde	Don't count context switches here, they are already counted in mi_switch().
# 4929	03-Dec-1994	bde	i386/exception.s, Keep track of interrupt nesting level. It is normally 0 for syscalls and traps, but is fudged to 1 for their exit processing in case they metamorphose into an interrupt handler. i386/genassym.c; Remove support for the obsolete pcb_iml and pcb_cmap2. Add support for pcb_inl. i386/swtch.s: Fudge the interrupt nesting level across context switches and in the idle loop so that the work for preemptive context switches gets counted as interrupt time, the work for voluntary context switches gets counted mostly as system time (the part when curproc == 0 gets counted as interrupt time), and only truly idle time gets counted as idle time. Remove obsolete support (commented out and otherwise) for pcb_iml. Load curpcb just before curproc instead of just after so that curpcb is always valid if curproc is. A few more changes like this may fix tracing through context switches. Remove obsolete function swtch_to_inactive(). include/cpu.h: Use the new interrupt nesting level variable to implement a non-fake CLF_INTR() so that accounting for the interrupt state works. You can use top, iostat or (best) an up to date systat to see interrupt overheads. I see the expected huge interrupt overheads for ISA devices (on a 486DX/33, about 55% for an IDE drive transferring 1250K/sec and the same for a WD8013EBT network card transferring 1100K/sec). The huge interrupt overheads for serial devices are unfortunately normally invisible. include/pcb.h: Remove the obsolete pcb_iml and pcb_cmap2. Replace them by padding to preserve binary compatibility. Use part of the new padding for pcb_inl. isa/icu.s: isa/vector.s: Keep track of interrupt nesting level.
# 4012	30-Oct-1994	bde	locore.s: Build a dummy frame at the top of tmpstk to help debuggers trace the stack when the system is idle. swtch.s: idle(): Initialize the frame pointer so that debuggers don't try to trace a bogus stack. Load the frame pointer, load the stack pointer and switch out the old stack in the unique order that never leaves one of the pointers pointers invalid so that debuggers can trace idle(). Disabling interrupts provides sufficient validity for normal operation, but debuggers use (trace) traps.
# 3842	25-Oct-1994	dg	Moved initialization of tmpstk so that it immediately follows the kernel text. Fixed rounding bug that caused the last page of kernel text to be read/write instead of read-only. This is important now that tmpstk can crash into it. Removed +4 bias of tmpstk because it screws up ddb's ability to traceback correctly.
# 3291	02-Oct-1994	dg	"idle priority" support. Based on code from Henrik Vestergaard Draboel, but substantially rewritten by me.
# 3258	01-Oct-1994	dg	Laptop Advanced Power Management support by HOSOKAWA Tatsumi. Submitted by: HOSOKAWA Tatsumi
# 2457	02-Sep-1994	dg	Converted P_LINK -> P_FORW, P_RLINK -> P_BACK, minor optimization.
# 2441	01-Sep-1994	dg	Realtime priority scheduling support. Submitted by: Henrik Vestergaard Draboel
# 2410	30-Aug-1994	bde	Don't define LOCORE (as nothing) in sources. It is now defined consistently (as 1) in Makefile.i386 for all assembler sources.
# 2138	19-Aug-1994	dg	Removed bogus save of CMAP2.
# 1888	06-Aug-1994	dg	Made the tmpstk start at tmpstk. Not doing so causes problems for the debugger. Submitted by: John Dyson
# 1825	03-Aug-1994	dg	Merged in post-1.1.5 work done by myself and John Dyson. This includes: me: 1) TLB flush optimization that effectively eliminates half of all of the TLB flushes. This works by only flushing the TLB when a page is "present" in memory (i.e. the valid bit is set in the page table entry). See section 5.3.5 of the Intel 386 Programmer's Reference Manual. 2) The handling of "CMAP" has been improved to catch attempts at multiple simultaneous use. John: 1) Added pmap_qenter/pmap_qremove functions for fast mapping of pages into the kernel. This is for future optimizations and support for the upcoming merged VM/buffer cache. Reviewed by: John Dyson
# 1549	25-May-1994	rgrimes	The big 4.4BSD Lite to FreeBSD 2.0.0 (Development) patch. Reviewed by: Rodney W. Grimes Submitted by: John Dyson and David Greenman
# 1379	20-Apr-1994	dg	Bug fixes and performance improvements from John Dyson and myself: 1) check va before clearing the page clean flag. Not doing so was causing the vnode pager error 5 messages when paging from NFS. (pmap.c) 2) put back interrupt protection in idle_loop. Bruce didn't think it was necessary, John insists that it is (and I agree). (swtch.s) 3) various improvements to the clustering code (vm_machdep.c). It's now enabled/used by default. 4) bad disk blocks are now handled properly when doing clustered IOs. (wd.c, vm_machdep.c) 5) bogus bad block handling fixed in wd.c. 6) algorithm improvements to the pageout/pagescan daemons. It's amazing how well 4MB machines work now.
# 1321	02-Apr-1994	dg	New interrupt code from Bruce Evans. In additional to Bruce's attached list of changes, I've made the following additional changes: 1) i386/include/ipl.h renamed to spl.h as the name conflicts with the file of the same name in i386/isa/ipl.h. 2) changed all use of mask (i.e. netmask, biomask, ttymask, etc) to _imask (net_imask, etc). 3) changed vestige of splnet use in if_is to splimp. 4) got rid of "impmask" completely (Bruce had gotten rid of netmask), and are now using net_imask instead. 5) dozens of minor cruft to glue in Bruce's changes. These require changes I made to config(8) as well, and thus it must be rebuilt. -DG from Bruce Evans: sio: o No diff is supplied. Remove the define of setsofttty(). I hope that is enough. .s: o i386/isa/debug.h no longer exists. The event counters became too much trouble to maintain. All function call entry and exception entry counters can be recovered by using profiling kernel (the new profiling supports all entry points; however, it is too slow to leave enabled all the time; it also). Only BDBTRAP() from debug.h is now used. That is moved to exception.s. It might be worth preserving SHOW_BITS() and calling it from _mcount() (if enabled). o T_ASTFLT is now only set just before calling trap(). o All exception handlers set SWI_AST_MASK in cpl as soon as possible after entry and arrange for _doreti to restore it atomically with exiting. It is not possible to set it atomically with entering the kernel, so it must be checked against the user mode bits in the trap frame before committing to using it. There is no place to store the old value of cpl for syscalls or traps, so there are some complications restoring it. Profiling stuff (mostly in .s): o Changes to kern/subr_mcount.c, gcc and gprof are not supplied yet. o All interesting labels `foo' are renamed `_foo' and all uninteresting labels `_bar' are renamed `bar'. A small change to gprof allows ignoring labels not starting with underscores. o MCOUNT_LABEL() is to provide names for counters for times spent in exception handlers. o FAKE_MCOUNT() is a version of MCOUNT() suitable for exception handlers. Its arg is the pc where the exception occurred. The new mcount() pretends that this was a call from that pc to a suitable MCOUNT_LABEL(). o MEXITCOUNT is to turn off any timer started by MCOUNT(). /usr/src/sys/i386/i386/exception.s: o The non-BDB BPTTRAP() macros were doing a sti even when interrupts were disabled when the trap occurred. The sti (fixed) sti is actually a no-op unless you have my changes to machdep.c that make the debugger trap gates interrupt gates, but fixing that would make the ifdefs messier. ddb seems to be unharmed by both interrupts always disabled and always enabled (I had the branch in the fix back to front for some time :-(). o There is no known pushal bug. o tf_err can be left as garbage for syscalls. /usr/src/sys/i386/i386/locore.s: o Fix and update BDE_DEBUGGER support. o ENTRY(btext) before initialization was dangerous. o Warm boot shot was longer than intended. /usr/src/sys/i386/i386/machdep.c: o DON'T APPLY ALL OF THIS DIFF. It's what I'm using, but may require other changes. Use the following: o Remove aston() and setsoftclock(). Maybe use the following: o No netisr.h. o Spelling fix. o Delay to read the Rebooting message. o Fix for vm system unmapping a reduced area of memory after bounds_check_with_label() reduces the size of a physical i/o for a partition boundary. A similar fix is required in kern_physio.c. o Correct use of __CONCAT. It never worked here for non- ANSI cpp's. Is it time to drop support for non-ANSI? o gdt_segs init. 0xffffffffUL is bogus because ssd_limit is not 32 bits. The replacement may have the same value :-), but is more natural. o physmem was one page too low. Confusing variable names. Don't use the following: o Better numbers of buffers. Each 8K page requires up to 16 buffer headers. On my system, this results in 5576 buffers containing [up to] 2854912 bytes of memory. The usual allocation of about 384 buffers only holds 192K of disk if you use it on an fs with a block size of 512. o gdt changes for bdb. o TGT -> IDT changes for bdb. o #ifdefed changes for bdb. /usr/src/sys/i386/i386/microtime.s: o Use the correct asm macros. I think asm.h was copied from Mach just for microtime and isn't used now. It certainly doesn't belong in <sys>. Various macros are also duplicated in sys/i386/boot.h and libc/i386/.h. o Don't switch to and from the IRR; it is guaranteed to be selected (default after ICU init and explicitly selected in isa.c too, and never changed until the old microtime clobbered it). /usr/src/sys/i386/i386/support.s: o Non-essential changes (none related to spls or profiling). o Removed slow loads of %gs again. The LDT support may require not relying on %gs, but loading it is not the way to fix it! Some places (copyin ...) forgot to load it. Loading it clobbers the user %gs. trap() still loads it after certain types of faults so that fuword() etc can rely on it without loading it explicitly. Exception handlers don't restore it. If we want to preserve the user %gs, then the fastest method is to not touch it except for context switches. Comparing with VM_MAXUSER_ADDRESS and branching takes only 2 or 4 cycles on a 486, while loading %gs takes 9 cycles and using it takes another. o Fixed a signed branch to unsigned. /usr/src/sys/i386/i386/swtch.s: o Move spl0() outside of idle loop. o Remove cli/sti from idle loop. sw1 does a cli, and in the unlikely event of an interrupt occurring and whichqs becoming zero, sw1 will just jump back to _idle. o There's no spl0() function in asm any more, so use splz(). o swtch() doesn't need to be superaligned, at least with the new mcounting. o Fixed a signed branch to unsigned. o Removed astoff(). /usr/src/sys/i386/i386/trap.c: o The decentralized extern decls were inconsistent, of course. o Fixed typo MATH_EMULTATE in comments. / o Removed unused variables. o Old netmask is now impmask; print it instead. Perhaps we should print some of the new masks. o BTW, trap() should not print anything for normal debugger traps. /usr/src/sys/i386/include/asmacros.h: o DON'T APPLY ALL OF THIS DIFF. Just use some of the null macros as necessary. /usr/src/sys/i386/include/cpu.h: o CLKF_BASEPRI() changes since cpl == SWI_AST_MASK is now normal while the kernel is running. o Don't use var++ to set boolean variables. It fails after a mere 4G times :-) and is slower than storing a constant on [3-4]86s. /usr/src/sys/i386/include/cpufunc.h: o DON'T APPLY ALL OF THIS DIFF. You need mainly the include of <machine/ipl.h>. Unfortunately, <machine/ipl.h> is needed by almost everything for the inlines. /usr/src/sys/i386/include/ipl.h: o New file. Defines spl inlines and SWI macros and declares most variables related to hard and soft interrupt masks. /usr/src/sys/i386/isa/icu.h: o Moved definitions to <machine/ipl.h> /usr/src/sys/i386/isa/icu.s: o Software interrupts (SWIs) and delayed hardware interrupts (HWIs) are now handled uniformally, and dispatching them from splx() is more like dispatching them from _doreti. The dispatcher is essentially (handler[ffs(ipending & ~cpl)](). o More care (not quite enough) is taken to avoid unbounded nesting of interrupts. o The interface to softclock() is changed so that a trap frame is not required. o Fast interrupt handlers are now handled more uniformally. Configuration is still too early (new handlers would require bits in <machine/ipl.h> and functions to vector.s). o splnnn() and splx() are no longer here; they are inline functions (could be macros for other compilers). splz() is the nontrivial part of the old splx(). /usr/src/sys/i386/isa/ipl.h o New file. Supposed to have only bus-dependent stuff. Perhaps the h/w masks should be declared here. /usr/src/sys/i386/isa/isa.c: o DON'T APPLY ALL OF THIS DIFF. You need only things involving mask and MASK and comments about them. netmask is now a pure software mask. It works like the softclock mask. /usr/src/sys/i386/isa/vector.s: o Reorganize AUTO_EOI macros. o Option FAST_INTR_HANDLER_USERS_ES for people who don't trust fastintr handlers. o fastintr handlers need to metamorphose into ordinary interrupt handlers if their SWI bit has become set. Previously, sio had unintended latency for handling output completions and input of SLIP framing characters because this was not done. /usr/src/sys/net/netisr.h: o The machine-dependent stuff is now imported from <machine/ipl.h>. /usr/src/sys/sys/systm.h o DON'T APPLY ALL OF THIS DIFF. You need mainly the different splx() prototype. The spl*() prototypes are duplicated as inlines in <machine/ipl.h> but they need to be duplicated here in case there are no inlines. I sent systm.h and cpufunc.h to Garrett. We agree that spl0 should be replaced by splnone and not the other way around like I've done. /usr/src/sys/kern/kern_clock.c o splsoftclock() now lowers cpl so the direct call to softclock() works as intended. o softclock() interface changed to avoid passing the whole frame (some machines may need another change for profile_tick()). o profiling renamed _profiling to avoid ANSI namespace pollution. (I had to improve the mcount() interface and may as well fix it.) The GUPROF variant doesn't actually reference profiling here, but the 'U' in GUPROF should mean to select the microtimer mcount() and not change the interface.
# 1051	31-Jan-1994	dg	WINE/user LDT support from John Brezak, ported to FreeBSD by Jeffrey Hsu <hsu@soda.berkeley.edu>.
# 981	17-Jan-1994	dg	Improvements mostly from John Dyson, with a little bit from me. * Removed pmap_is_wired * added extra cli/sti protection in idle (swtch.s) * slight code improvement in trap.c * added lots of comments * improved paging and other algorithms in VM system
# 974	14-Jan-1994	dg	"New" VM system from John Dyson & myself. For a run-down of the major changes, see the log of any effected file in the sys/vm directory (swap_pager.c for instance).
# 757	13-Nov-1993	dg	First steps in rewriting locore.s, and making info useful when the machine panics. i386/i386/locore.s: 1) got rid of most .set directives that were being used like #define's, and replaced them with appropriate #define's in the appropriate header files (accessed via genassym). 2) added comments to header inclusions and global definitions, and global variables 3) replaced some hardcoded constants with cpp defines (such as PDESIZE and others) 4) aligned all comments to the same column to make them easier to read 5) moved macro definitions for ENTRY, ALIGN, NOP, etc. to /sys/i386/include/asmacros.h 6) added #ifdef BDE_DEBUGGER around all of Bruce's debugger code 7) added new global '_KERNend' to store last location+1 of kernel 8) cleaned up zeroing of bss so that only bss is zeroed 9) fix zeroing of page tables so that it really does zero them all - not just if they follow the bss. 10) rewrote page table initialization code so that 1) works correctly and 2) write protects the kernel text by default 11) properly initialize the kernel page directory, upages, p0stack PT, and page tables. The previous scheme was more than a bit screwy. 12) change allocation of virtual area of IO hole so that it is fixed at KERNBASE + 0xa0000. The previous scheme put it right after the kernel page tables and then later expected it to be at KERNBASE +0xa0000 13) change multiple bogus settings of user read/write of various areas of kernel VM - including the IO hole; we should never be accessing the IO hole in user mode through the kernel page tables 14) split kernel support routines such as bcopy, bzero, copyin, copyout, etc. into a seperate file 'support.s' 15) split swtch and related routines into a seperate 'swtch.s' 16) split routines related to traps, syscalls, and interrupts into a seperate file 'exception.s' 17) remove some unused global variables from locore that got inserted by Garrett when he pulled them out of some .h files. i386/isa/icu.s: 1) clean up global variable declarations 2) move in declaration of astpending and netisr i386/i386/pmap.c: 1) fix calculation of virtual_avail. It previously was calculated to be right in the middle of the kernel page tables - not a good place to start allocating kernel VM. 2) properly allocate kernel page dir/tables etc out of kernel map - previously only took out 2 pages. i386/i386/machdep.c: 1) modify boot() to print a warning that the system will reboot in PANIC_REBOOT_WAIT_TIME amount of seconds, and let the user abort with a key on the console. The machine will wait for ever if a key is typed before the reboot. The default is 15 seconds, but can be set to 0 to mean don't wait at all, -1 to mean wait forever, or any positive value to wait for that many seconds. 2) print "Rebooting..." just before doing it. kern/subr_prf.c: 1) remove PANICWAIT as it is deprecated by the change to machdep.c i386/i386/trap.c: 1) add table of trap type strings and use it to print a real trap/ panic message rather than just a number. Lot's of work to be done here, but this is the first step. Symbolic traceback is in the TODO. i386/i386/Makefile.i386: 1) add support in to build support.s, exception.s and swtch.s ...and various changes to various header files to make all of the above happen.