History log of /freebsd-10.1-release/sys/sparc64/include/
Revision Date Author Comments
272461 03-Oct-2014 gjb

Copy stable/10@r272459 to releng/10.1 as part of
the 10.1-RELEASE process.

Approved by: re (implicit)
Sponsored by: The FreeBSD Foundation


266204 16-May-2014 ian

MFC r257854 (discussed with alc@)

As of r257209, all architectures have defined
VM_KMEM_SIZE_SCALE. In other words, every architecture is now
auto-sizing the kmem arena. This revision changes kmeminit() so
that the definition of VM_KMEM_SIZE_SCALE becomes mandatory and
the definition of VM_KMEM_SIZE becomes optional.

Replace or eliminate all existing definitions of VM_KMEM_SIZE.
With auto-sizing enabled, VM_KMEM_SIZE effectively became an
alternate spelling for VM_KMEM_SIZE_MIN on most architectures.
Use VM_KMEM_SIZE_MIN for clarity.


264496 15-Apr-2014 tijl

MFC r263998:

Rename __wchar_t so it no longer conflicts with __wchar_t from clang 3.4
-fms-extensions.


259510 17-Dec-2013 kib

MFC r257228:
Add bus_dmamap_load_ma() function to load map with the array of
vm_pages.


256281 10-Oct-2013 gjb

Copy head (r256279) to stable/10 as part of the 10.0-RELEASE cycle.

Approved by: re (implicit)
Sponsored by: The FreeBSD Foundation


255937 29-Sep-2013 marius

Implement GET_STACK_USAGE.
Discussed with: mav

Approved by: re (kib)
MFC after: 1 week


255318 06-Sep-2013 glebius

Fix build with gcc. Move sf_buf_alloc()/sf_buf_free() declarations
to MD headers.


253994 06-Aug-2013 marius

Add MD (for now) atomic_store_acq_<type>() and use it in pmap_activate()
to get the semantics when setting the PMAP right. Prior to r251782, the
latter already used implicit acquire semantics, which - currently - means
to not employ additional explicit memory barriers under the hood (see also
r225889).


253940 04-Aug-2013 attilio

Remove unused member.

Sponsored by: EMC / Isilon storage division
Reviewed by: alc
Tested by: pho


253750 28-Jul-2013 avg

Revert r253748,253749

This WIP should not have been committed yet.

Pointyhat to: avg


253748 28-Jul-2013 avg

put contents of cpu.h under _KERNEL

no userland-serviceable parts inside

MFC after: 20 days


253266 12-Jul-2013 marius

Prefix the alias macros for members of struct __mcontext with an underscore
in order to avoid a clash in the net80211 code.


252434 01-Jul-2013 kib

Fix issues with zeroing and fetching the counters, on x86 and ppc64.
Issues were noted by Bruce Evans and are present on all architectures.

On i386, a counter fetch should use atomic read of 64bit value,
otherwise carry from the increment on other CPU could be lost for the
given fetch, making error of 2^32. If 64bit read (cmpxchg8b) is not
available on the machine, it cannot be SMP and it is enough to disable
preemption around read to avoid the split read.

On x86 the counter increment is not atomic on purpose, which makes it
possible for the store of the incremented result to override just
zeroed per-cpu slot. The effect would be a counter going off by
arbitrary value after zeroing. Perform the counter zeroing on the
same processor which does the increments, making the operations
mutually exclusive. On i386, same as for the fetching, if the
cmpxchg8b is not available, machine is not SMP and we disable
preemption for zeroing.

PowerPC64 is treated the same as amd64.

For other architectures, the changes made to allow the compilation to
succeed, without fixing the issues with zeroing or fetching. It
should be possible to handle them by using the 64bit loads and stores
atomic WRT preemption (assuming the architectures also converted from
using critical sections to proper asm). If architecture does not
provide the facility, using global (spin) mutex would be non-optimal
but working solution.

Noted by: bde
Sponsored by: The FreeBSD Foundation


251783 15-Jun-2013 ed

Remove conflicting macros from SPARC64's atomic(9) header.

The atomic_load() and atomic_store() macros conflict with the equally
named macros from <stdatomic.h>. Remove them, as they are only used to
implement functions that are not present on any of the other
architectures.


250338 07-May-2013 attilio

Rename VM_NDOMAIN into MAXMEMDOM and move it into machine/param.h in
order to match the MAXCPU concept. The change should also be useful
for consolidation and consistency.

Sponsored by: EMC / Isilon storage division
Obtained from: jeff
Reviewed by: alc


249268 08-Apr-2013 glebius

Merge from projects/counters: counter(9).

Introduce counter(9) API, that implements fast and raceless counters,
provided (but not limited to) for gathering of statistical data.

See http://lists.freebsd.org/pipermail/freebsd-arch/2013-April/014204.html
for more details.

In collaboration with: kib
Reviewed by: luigi
Tested by: ae, ray
Sponsored by: Nginx, Inc.


249265 08-Apr-2013 glebius

Merge from projects/counters:

Pad struct pcpu so that its size is denominator of PAGE_SIZE. This
is done to reduce memory waste in UMA_PCPU_ZONE zones.

Sponsored by: Nginx, Inc.


246713 12-Feb-2013 kib

Reform the busdma API so that new types may be added without modifying
every architecture's busdma_machdep.c. It is done by unifying the
bus_dmamap_load_buffer() routines so that they may be called from MI
code. The MD busdma is then given a chance to do any final processing
in the complete() callback.

The cam changes unify the bus_dmamap_load* handling in cam drivers.

The arm and mips implementations are updated to track virtual
addresses for sync(). Previously this was done in a type specific
way. Now it is done in a generic way by recording the list of
virtuals in the map.

Submitted by: jeff (sponsored by EMC/Isilon)
Reviewed by: kan (previous version), scottl,
mjacob (isp(4), no objections for target mode changes)
Discussed with: ian (arm changes)
Tested by: marius (sparc64), mips (jmallet), isci(4) on x86 (jharris),
amd64 (Fabian Keil <freebsd-listen@fabiankeil.de>)


246554 08-Feb-2013 kib

The 'end' word was missed in the comment.

MFC after: 3 days


245850 23-Jan-2013 marius

Revert the part of r239864 which removed obtaining the SMP mutex around
reading registers from other CPUs. As it turns out, the hardware doesn't
really like concurrent IPI'ing causing adverse effects. Also the thought
deadlock when using this spin lock here and the targeted CPU(s) are also
holding or in case of nested locks can't actually happen. This is due to
the fact that on sparc64, spinlock_enter() only raises the PIL but doesn't
disable interrupts completely. Thus direct cross calls as used for the
register reading (and all other MD IPI needs) still will be executed by
the targeted CPU(s) in that case.

MFC after: 3 days


243046 15-Nov-2012 jeff

- Implement run-time expansion of the KTR buffer via sysctl.
- Implement a function to ensure that all preempted threads have switched
back out at least once. Use this to make sure there are no stale
references to the old ktr_buf or the lock profiling buffers before
updating them.

Reviewed by: marius (sparc64 parts), attilio (earlier patch)
Sponsored by: EMC / Isilon Storage Division


242534 03-Nov-2012 attilio

Rework the known rwlock to benefit about staying on their own
cache line in order to avoid manual frobbing but using
struct rwlock_padalign.

Reviewed by: alc, jimharris


241780 20-Oct-2012 marius

- Give PIL_PREEMPT the lowest priority just above low/stray interrupts.
The reason for this is that the SPARC v9 architecture allows nested
interrupts of higher priority/level than that of the current interrupt
to occur (and we can't just entirely bypass this model, also, at least
for tick interrupts, this also wouldn't be wise). However, when a
preemption interrupt interrupts another interrupt of lower priority,
f.e. PIL_ITHREAD, and that one in turn is nested by a third interrupt,
f.e. PIL_TICK, with SCHED_ULE the execution of interrupts higher than
PIL_PREEMPT may be migrated to another CPU. In particular, tl1_ret(),
which is responsible for restoring the state of the CPU prior to entry
to the interrupt based on the (also migrated) trap frame, then is run
on a CPU which actually didn't receive the interrupt in question,
causing an inappropriate processor interrupt level to be "restored".
In turn, this causes interrupts of the first level, i.e. PIL_ITHREAD
in the above scenario, to be blocked on the target of the migration
until the correct PIL happens to be restored again on that CPU again.
Making PIL_PREEMPT the lowest real priority, this effectively prevents
this scenario from happening, as preemption interrupts no longer can
interrupt any other interrupt besides stray ones (which is no issue).
Thanks to attilio@ and especially mav@ for helping me to understand
this problem at the 201208DevSummit.
- Give PIL_STOP (which is also used for IPI_STOP_HARD, given that there's
no real equivalent to NMIs on SPARC v9) the highest possible priority
just below the hardwired PIL_TICK, so it has a chance to interrupt
more things.

MFC after: 1 week


241374 09-Oct-2012 attilio

Add an unified macro to deny ability from the compiler to reorder
instruction loads/stores at its will.
The macro __compiler_membar() is currently supported for both gcc and
clang, but kernel compilation will fail otherwise.

Reviewed by: bde, kib
Discussed with: dim, theraven
MFC after: 2 weeks


241371 09-Oct-2012 attilio

Reverts r234074,234105,234564,234723,234989,235231-235232 and part of
r234247.
Use, instead, the static intializer introduced in r239923 for x86 and
sparc64 intr_cpus, unwinding the code to the initial version.

Reviewed by: marius


240190 07-Sep-2012 gavin

Prevent indent(1) from reformatting this comment, as it contains
a formatting-sensitive table.


239941 31-Aug-2012 marius

Add a global MD macro for the VIS block size instead of duplicating
it and using magic values all over the place.

MFC after: 1 week


239864 29-Aug-2012 marius

- Unlike cache invalidation and TLB demapping IPIs, reading registers from
other CPUs doesn't require locking so get rid of it. As the latter is used
for the timecounter on certain machine models, using a spin lock in this
case can lead to a deadlock with the upcoming callout(9) rework.
- Merge r134227/r167250 from x86:
Avoid cross-IPI SMP deadlock by using the smp_ipi_mtx spin lock not only
for smp_rendezvous_cpus() but also for the MD cache invalidation and TLB
demapping IPIs.
- Mark some unused function arguments as such.

MFC after: 1 week


239079 05-Aug-2012 marius

Merge r236494 from x86:

Isolate the global TTE list lock from data and other locks to prevent false
sharing within the cache.

MFC after: 3 days


237517 24-Jun-2012 andrew

Make the wchar_t type machine dependent.

This is required for ARM EABI. Section 7.1.1 of the Procedure Call for the
ARM Architecture (AAPCS) defines wchar_t as either an unsigned int or an
unsigned short with the former preferred.

Because of this requirement we need to move the definition of __wchar_t to
a machine dependent header. It also cleans up the macros defining the limits
of wchar_t by defining __WCHAR_MIN and __WCHAR_MAX in the same machine
dependent header then using them to define WCHAR_MIN and WCHAR_MAX
respectively.

Discussed with: bde


237433 22-Jun-2012 kib

Implement mechanism to export some kernel timekeeping data to
usermode, using shared page. The structures and functions have vdso
prefix, to indicate the intended location of the code in some future.

The versioned per-algorithm data is exported in the format of struct
vdso_timehands, which mostly repeats the content of in-kernel struct
timehands. Usermode reading of the structure can be lockless.
Compatibility export for 32bit processes on 64bit host is also
provided. Kernel also provides usermode with indication about
currently used timecounter, so that libc can fall back to syscall if
configured timecounter is unknown to usermode code.

The shared data updates are initiated both from the tc_windup(), where
a fast task is queued to do the update, and from sysctl handlers which
change timecounter. A manual override switch
kern.timecounter.fast_gettime allows to turn off the mechanism.

Only x86 architectures export the real algorithm data, and there, only
for tsc timecounter. HPET counters page could be exported as well, but
I prefer to not further glue the kernel and libc ABI there until
proper vdso-based solution is developed.

Minimal stubs neccessary for non-x86 architectures to still compile
are provided.

Discussed with: bde
Reviewed by: jhb
Tested by: flo
MFC after: 1 month


237430 22-Jun-2012 kib

Reserve AT_TIMEKEEP auxv entry for providing usermode the pointer to
timekeeping information.

MFC after: 1 week


237168 16-Jun-2012 alc

The page flag PGA_WRITEABLE is set and cleared exclusively by the pmap
layer, but it is read directly by the MI VM layer. This change introduces
pmap_page_is_write_mapped() in order to completely encapsulate all direct
access to PGA_WRITEABLE in the pmap layer.

Aesthetics aside, I am making this change because amd64 will likely begin
using an alternative method to track write mappings, and having
pmap_page_is_write_mapped() in place allows me to make such a change
without further modification to the MI VM layer.

As an added bonus, tidy up some nearby comments concerning page flags.

Reviewed by: kib
MFC after: 6 weeks


236214 29-May-2012 alc

Replace all uses of the vm page queues lock by a r/w lock that is private
to this pmap.c. This new r/w lock is used primarily to synchronize access
to the TTE lists. However, it will be used in a somewhat unconventional
way. As finer-grained TTE list locking is added to each of the pmap
functions that acquire this r/w lock, its acquisition will be changed from
write to read, enabling concurrent execution of the pmap functions with
finer-grained locking.

Reviewed by: attilio
Tested by: flo
MFC after: 10 days


235941 24-May-2012 bz

MFp4 bz_ipv6_fast:

in_cksum.h required ip.h to be included for struct ip. To be
able to use some general checksum functions like in_addword()
in a non-IPv4 context, limit the (also exported to user space)
IPv4 specific functions to the times, when the ip.h header is
present and IPVERSION is defined (to 4).

We should consider more general checksum (updating) functions
to also allow easier incremental checksum updates in the L3/4
stack and firewalls, as well as ponder further requirements by
certain NIC drivers needing slightly different pseudo values
in offloading cases. Thinking in terms of a better "library".

Sponsored by: The FreeBSD Foundation
Sponsored by: iXsystems

Reviewed by: gnn (as part of the whole)
MFC After: 3 days


235232 10-May-2012 marius

Fix mismerge in r235231.


235231 10-May-2012 marius

Merge r234989 from x86:

Revert part of r234723 by re-enabling the SMP protection for intr_bind().


234785 29-Apr-2012 dim

Add a convenience macro for the returns_twice attribute, and apply it to
the prototypes of the appropriate functions (getcontext, savectx,
setjmp, sigsetjmp and vfork).

MFC after: 2 weeks


234723 26-Apr-2012 attilio

Clean up the intr* MD KPI from the SMP dependency, removing a cause of
discrepancy between modules and kernel, but deal with SMP differences
within the functions themselves.

As an added bonus this also helps in terms of code readability.

Requested by: gibbs
Reviewed by: jhb, marius
MFC after: 1 week


232745 09-Mar-2012 dim

Add casts to __uint16_t to the __bswap16() macros on all arches which
didn't already have them. This is because the ternary expression will
return int, due to the Usual Arithmetic Conversions. Such casts are not
needed for the 32 and 64 bit variants.

While here, add additional parentheses around the x86 variant, to
protect against unintended consequences.

MFC after: 2 weeks


232356 01-Mar-2012 jhb

- Change contigmalloc() to use the vm_paddr_t type instead of an unsigned
long for specifying a boundary constraint.
- Change bus_dma tags to use bus_addr_t instead of bus_size_t for boundary
constraints.

These allow boundary constraints to be fully expressed for cases where
sizeof(bus_addr_t) != sizeof(bus_size_t). Specifically, it allows a
driver to properly specify a 4GB boundary in a PAE kernel.

Note that this cannot be safely MFC'd without a lot of compat shims due
to KBI changes, so I do not intend to merge it.

Reviewed by: scottl


230633 27-Jan-2012 marius

Now that we have a working OF_printf() since r230631 and a OF_panic()
helper since r230632, use these for output and panicing during the
early cycles and move cninit() until after the static per-CPU data
has been set up. This solves a couple of issue regarding the non-
availability of the static per-CPU data:
- panic() not working and only making things worse when called,
- having to supply a special DELAY() implementation to the low-level
console drivers,
- curthread accesses of mutex(9) usage in low-level console drivers
that aren't conditional due to compiler optimizations (basically,
this is the problem described in r227537 but in this case for
keyboards attached via uart(4)). [1]

PR: 164123 [1]


230632 27-Jan-2012 marius

- Now that we have a working OF_printf() since r230631, use it for
implementing a simple OF_panic() that may be used during the early
cycles when panic() isn't available, yet.
- Mark cpu_{exit,shutdown}() as __dead2 as appropriate.


230630 27-Jan-2012 marius

For machines where the kernel address space is unrestricted increase
VM_KMEM_SIZE_SCALE to 2, awaiting more insight from alc@. As it turns
out, the VM apparently has problems with machines that have large holes
in the physical address space, causing the kmem_suballoc() call in
kmeminit() to fail with a VM_KMEM_SIZE_SCALE of 1. Using a value of 2
allows these, namely Blade 1500 with 2GB of RAM, to boot.

PR: 164227


230628 27-Jan-2012 marius

Mark cpu_{halt,reset}() as __dead2 as appropriate.


230475 23-Jan-2012 das

Add C11 macros describing subnormal numbers to float.h.

Reviewed by: bde


228469 13-Dec-2011 ed

Replace __signed by signed.

The signed keyword is an integral part of the C syntax. There's no need
to use __signed.


228222 03-Dec-2011 marius

Revert r225889 a bit. While it's correct that in total store order there's
no need to additionally add CPU memory barriers to the acquire variants of
atomic(9), these are documented to also include compiler memory barriers.
So add the latter, which were previously included by using membar(), back.


228022 27-Nov-2011 marius

For sparc64 also adjust the geometry of da(4) driven disks to not overflow
the 16-bit cylinders field of the VTOC8 disk label (at around 502GB). The
geometry chosen for disks above that limit allows to use disks up to 2TB,
which is the limit of the extended VTOC8 format. The geometry used for
disks smaller than the 16-bit cylinders limit stays the same as used by
cam_calc_geometry(9) for extended translation.
Thanks to Hans-Joerg Sirtl for providing hardware for testing this change.

MFC after: 3 days


227539 15-Nov-2011 marius

Define curthread as an inline function that loads the thread pointer
directly from g7, the pcpu pointer. This guarantees correct behavior
when the thread migrates to a different CPU.
Commit message stolen from r205431. Additional testing by Peter Jeremy.

MFC after: 3 days


226607 21-Oct-2011 das

People porting FreeBSD to new architectures ought not have to
implement a deprecated FPU control interface in addition to the
standard one. To make this clearer, further deprecate ieeefp.h
by not declaring the function prototypes except on architectures
that implement them already.

Currently i386 and amd64 implement the ieeefp.h interface for
compatibility, and for fp[gs]etprec(), which doesn't exist on
most other hardware. Powerpc, sparc64, and ia64 partially implement
it and probably shouldn't, and other architectures don't implement it
at all.


226112 07-Oct-2011 kib

Remove unused define.

MFC after: 1 month


226054 06-Oct-2011 marius

- Use atomic operations rather than sched_lock for safely assigning pm_active
and pc_pmap for SMP. This is key to allowing adding support for SCHED_ULE.
Thanks go to Peter Jeremy for additional testing.
- Add support for SCHED_ULE to cpu_switch().

Committed from: 201110DevSummit


225931 02-Oct-2011 marius

Make sparc64 compatible with NEW_PCIB and enable it:
- Implement bus_adjust_resource() methods as far as necessary and in non-PCI
bridge drivers as far as feasible without rototilling them.
- As NEW_PCIB does a layering violation by activating resources at layers
above pci(4) without previously bubbling up their allocation there, move
the assignment of bus tags and handles from the bus_alloc_resource() to
the bus_activate_resource() methods like at least the other NEW_PCIB
enabled architectures do. This is somewhat unfortunate as previously
sparc64 (ab)used resource activation to indicate whether SYS_RES_MEMORY
resources should be mapped into KVA, which is only necessary if their
going to be accessed via the pointer returned from rman_get_virtual() but
not for bus_space(9) as the later always uses physical access on sparc64.
Besides wasting KVA if we always map in SYS_RES_MEMORY resources, a driver
also may deliberately not map them in if the firmware already has done so,
possibly in a special way. So in order to still allow a driver to decide
whether a SYS_RES_MEMORY resource should be mapped into KVA we let it
indicate that by calling bus_space_map(9) with BUS_SPACE_MAP_LINEAR as
actually documented in the bus_space(9) page. This is implemented by
allocating a separate bus tag per SYS_RES_MEMORY resource and passing the
resource via the previously unused bus tag cookie so we later on can call
rman_set_virtual() in sparc64_bus_mem_map(). As a side effect this now
also allows to actually indicate that a SYS_RES_MEMORY resource should be
mapped in as cacheable and/or read-only via BUS_SPACE_MAP_CACHEABLE and
BUS_SPACE_MAP_READONLY respectively.
- Do some minor cleanup like taking advantage of rman_init_from_resource(),
factor out the common part of bus tag allocation into a newly added
sparc64_alloc_bus_tag(), hook up some missing newbus methods and replace
some homegrown versions with the generic counterparts etc.
- While at it, let apb_attach() (which can't use the generic NEW_PCIB code
as APB bridges just don't have the base and limit registers implemented)
regarding the config space registers cached in pcib_softc and the SYSCTL
reporting nodes set up.


225890 01-Oct-2011 marius

- Add protective parentheses to macros as far as possible.
- Move {r,w,}mb() to the top of this file where they live on most of the
other architectures.


225889 01-Oct-2011 marius

In total store which we use for running the kernel and all of the userland
atomic operations behave as if the were followed by a memory barrier so
there's no need to include ones in the acquire variants of atomic(9).
Removing these results a small performance improvement, specifically this
is sufficient to compensate the performance loss seen in the worldstone
benchmark seen when using SCHED_ULE instead of SCHED_4BSD.
This change is inspired by Linux even more radically doing the equivalent
thing some time ago.
Thanks go to Peter Jeremy for additional testing.


225887 30-Sep-2011 marius

Use the extended integer condition code when comparing 64-bit values. Given
that ATOMIC_INC_LONG currently is unused this happened to not be fatal.


225886 30-Sep-2011 marius

- Right-justify backslashes as suggested by style(9).
- Rename ATOMIC_INC_ULONG to ATOMIC_INC_LONG in order to be consistent with
the names of the other macros in this file an adjust accordingly.


224232 20-Jul-2011 marius

Merge from r224217:
Bump MAXCPU to 64.

Approved by: re (kib)


224207 19-Jul-2011 attilio

Add the possibility to specify from kernel configs MAXCPU value.
This patch is going to help in cases like mips flavours where you
want a more granular support on MAXCPU.

No MFC is previewed for this patch.

Tested by: pluknet
Approved by: re (kib)


223800 05-Jul-2011 marius

- pmap_cache_remove() and pmap_protect_tte() are only used within pmap.c
so static'ize them.
- Correct a typo.


223719 02-Jul-2011 marius

- For Cheetah- and Zeus-class CPUs don't flush all unlocked entries from
the TLBs in order to get rid of the user mappings but instead traverse
them an flush only the latter like we also do for the Spitfire-class.
Also flushing the unlocked kernel entries can cause instant faults which
when called from within cpu_switch() are handled with the scheduler lock
held which in turn can cause timeouts on the acquisition of the lock by
other CPUs. This was easily seen with a 16-core V890 but occasionally
also happened with 2-way machines.
While at it, move the SPARC64-V support code entirely to zeus.c. This
causes a little bit of duplication but is less confusing than partially
using Cheetah-class bits for these.
- For SPARC64-V ensure that 4-Mbyte page entries are stored in the 1024-
entry, 2-way set associative TLB.
- In {d,i}tlb_get_data_sun4u() turn off the interrupts in order to ensure
that ASI_{D,I}TLB_DATA_ACCESS_REG actually are read twice back-to-back.

Tested by: Peter Jeremy (16-core US-IV), Michael Moll (2-way SPARC64-V)


223379 21-Jun-2011 marius

Fix whitespace


223378 21-Jun-2011 marius

On machines where we don't need to lock the kernel TSB into the dTLB and
thus may basically use the entire 64-bit kernel address space reduce
VM_KMEM_SIZE_SCALE to 1 allowing kernel to use more memory.


223126 15-Jun-2011 marius

Don't include curcpu in the mask which is used as the IPI cookie as we
have to ignore it when sending the IPI anyway. Actually I can't think of
a good reason why this ever was done that way in the first place as it's
not even usefull for debugging.
While at it replace the use of pc_other_cpus as it's slated for deorbit.


222828 07-Jun-2011 marius

Adapt CATR() to r222813. This is somewhat tricky as we can't afford using
more than three temporary register in several places CATR() is used so
this code trades instructions in for registers. Actually, this still isn't
sufficient and CATR() has the side-effect of clobbering %y. Luckily, with
the current uses of CATR() this either doesn't matter or we are able to
(save and) restore it.
Now that there's only one use of AND() and TEST() left inline these.


222813 07-Jun-2011 attilio

etire the cpumask_t type and replace it with cpuset_t usage.

This is intended to fix the bug where cpu mask objects are
capped to 32. MAXCPU, then, can now arbitrarely bumped to whatever
value. Anyway, as long as several structures in the kernel are
statically allocated and sized as MAXCPU, it is suggested to keep it
as low as possible for the time being.

Technical notes on this commit itself:
- More functions to handle with cpuset_t objects are introduced.
The most notable are cpusetobj_ffs() (which calculates a ffs(3)
for a cpuset_t object), cpusetobj_strprint() (which prepares a string
representing a cpuset_t object) and cpusetobj_strscan() (which
creates a valid cpuset_t starting from a string representation).
- pc_cpumask and pc_other_cpus are target to be removed soon.
With the moving from cpumask_t to cpuset_t they are now inefficient
and not really useful. Anyway, for the time being, please note that
access to pcpu datas is protected by sched_pin() in order to avoid
migrating the CPU while reading more than one (possible) word
- Please note that size of cpuset_t objects may differ between kernel
and userland. While this is not directly related to the patch itself,
it is good to understand that concept and possibly use the patch
as a reference on how to deal with cpuset_t objects in userland, when
accessing kernland members.
- KTR_CPUMASK is changed and now is represented through a string, to be
set as the example reported in NOTES.

Please additively note that no MAXCPU is bumped in this patch, but
private testing has been done until to MAXCPU=128 on a real 8x8x2(htt)
machine (amd64).

Please note that the FreeBSD version is not yet bumped because of
the upcoming pcpu changes. However, note that this patch is not
targeted for MFC.

People to thank for the time spent on this patch:
- sbruno, pluknet and Nicholas Esborn (nick AT desert DOT net) tested
several revision of the patches and really helped in improving
stability of this work.
- marius fixed several bugs in the sparc64 implementation and reviewed
patches related to ktr.
- jeff and jhb discussed the basic approach followed.
- kib and marcel made targeted review on some specific part of the
patch.
- marius, art, nwhitehorn and andreast reviewed MD specific part of
the patch.
- marius, andreast, gonzo, nwhitehorn and jceel tested MD specific
implementations of the patch.
- Other people have made contributions on other patches that have been
already committed and have been listed separately.

Companies that should be mentioned for having participated at several
degrees:
- Yahoo! for having offered the machines used for testing on big
count of CPUs.
- The FreeBSD Foundation for having sponsored my devsummit attendance,
which has been instrumental.
- Sandvine for having offered offices and infrastructure during
development.

(I really hope I didn't forget anyone, if it happened I apologize in
advance).


221855 13-May-2011 mdf

Move the ZERO_REGION_SIZE to a machine-dependent file, as on many
architectures (i386, for example) the virtual memory space may be
constrained enough that 2MB is a large chunk. Use 64K for arches
other than amd64 and ia64, with special handling for sparc64 due to
differing hardware.

Also commit the comment changes to kmem_init_zero_region() that I
missed due to not saving the file. (Darn the unfamiliar development
environment).

Arch maintainers, please feel free to adjust ZERO_REGION_SIZE as you
see fit.

Requested by: alc
MFC after: 1 week
MFC with: r221853


221750 10-May-2011 marius

Add an ATOMIC_CLEAR_LONG.


220939 22-Apr-2011 marius

Correct spelling in comments.

Submitted by: brucec


219608 13-Mar-2011 marius

Remove the advertising clause from the UCB license according to the
July 22, 1999 addendum.


219567 12-Mar-2011 marius

Sync licenses and the corresponding RCS IDs with NetBSD, mainly switching
the licenses of Matthew R. Green and the TNF to 2-clause.

Obtained from: NetBSD


218909 21-Feb-2011 brucec

Fix typos - remove duplicate "the".

PR: bin/154928
Submitted by: Eitan Adler <lists at eitanadler.com>
MFC after: 3 days


218773 17-Feb-2011 alc

Remove pmap fields that are either unused or not fully implemented.

Discussed with: kib


217515 17-Jan-2011 jkim

Add reader/writer lock around mem_range_attr_get() and mem_range_attr_set().
Compile sys/dev/mem/memutil.c for all supported platforms and remove now
unnecessary dev_mem_md_init(). Consistently define mem_range_softc from
mem.c for all platforms. Add missing #include guards for machine/memdev.h
and sys/memrange.h. Clean up some nearby style(9) nits.

MFC after: 1 month


217192 09-Jan-2011 kib

Move repeated MAXSLP definition from machine/vmparam.h to sys/vmmeter.h.
Update the outdated comments describing MAXSLP and the process
selection algorithm for swap out.

Comments wording and reviewed by: alc


217182 09-Jan-2011 das

Fix the value for DECIMAL_DIG on UltraSparcs. The previous value of
35 wasn't quite big enough to ensure correct rounding for very-close-
to-halfway cases.


217147 08-Jan-2011 tijl

On mixed 32/64 bit architectures (mips, powerpc) use __LP64__ rather than
architecture macros (__mips_n64, __powerpc64__) when 64 bit types (and
corresponding macros) are different from 32 bit. [1]

Correct the type of INT64_MIN, INT64_MAX and UINT64_MAX.

Define (U)INTMAX_C as an alias for (U)INT64_C matching the type definition
for (u)intmax_t. Do this on all architectures for consistency.

Suggested by: bde [1]
Approved by: kib (mentor)


217145 08-Jan-2011 tijl

Fix types of some values in machine/_limits.h.

On some architectures UCHAR_MAX and USHRT_MAX had type unsigned int.
However, lacking integer suffixes for types smaller than int, their type
should correspond to that of an object of type unsigned char (or short)
when used in an expression with objects of type int. In that case unsigned
char (short) are promoted to int (i.e. signed) so the type of UCHAR_MAX and
USHRT_MAX should also be int.

Where MIN/MAX constants implicitly have the correct type the suffix has
been removed.

While here, correct some comments.

Reviewed by: bde
Approved by: kib (mentor)


217097 07-Jan-2011 kib

Add AT_STACKPROT elf aux vector. Will be used to inform rtld about the
initial stack protection set by the kernel image activator.


216961 04-Jan-2011 marius

Reserve INTR_MD[1-4] similarly to what BUS_DMA_BUS[1-4] are intended for
and switch sparc64 to use the first one for bus error filter handlers of
bridge drivers instead of (ab)using INTR_FAST for that so we eventually
can get rid of the latter.

Reviewed by: jhb
MFC after: 1 month


216803 29-Dec-2010 marius

On UltraSPARC-III+ and greater take advantage of ASI_ATOMIC_QUAD_LDD_PHYS,
which takes an physical address instead of an virtual one, for loading TTEs
of the kernel TSB so we no longer need to lock the kernel TSB into the dTLB,
which only has a very limited number of lockable dTLB slots. The net result
is that we now basically can handle a kernel TSB of any size and no longer
need to limit the kernel address space based on the number of dTLB slots
available for locked entries. Consequently, other parts of the trap handlers
now also only access the the kernel TSB via its physical address in order
to avoid nested traps, as does the PMAP bootstrap code as we haven't taken
over the trap table at that point, yet. Apart from that the kernel TSB now
is accessed via a direct mapping when we are otherwise taking advantage of
ASI_ATOMIC_QUAD_LDD_PHYS so no further code changes are needed. Most of this
is implemented by extending the patching of the TSB addresses and mask as
well as the ASIs used to load it into the trap table so the runtime overhead
of this change is rather low. Currently the use of ASI_ATOMIC_QUAD_LDD_PHYS
is not yet enabled on SPARC64 CPUs due to lack of testing and due to the
fact it might require minor adjustments there.
Theoretically it should be possible to use the same approach also for the
user TSB, which already is not locked into the dTLB, avoiding nested traps.
However, for reasons I don't understand yet OpenSolaris only does that with
SPARC64 CPUs. On the other hand I think that also addressing the user TSB
physically and thus avoiding nested traps would get us closer to sharing
this code with sun4v, which only supports trap level 0 and 1, so eventually
we could have a single kernel which runs on both sun4u and sun4v (as does
Linux and OpenBSD).

Developed at and committed from: 27C3


216802 29-Dec-2010 marius

- Move the macros for generating load and store instructions to asmacros.h
so they can be shared by different source files and extend them by a
variant for atomic compare and swap.
- Consistently use EMPTY.


216801 29-Dec-2010 marius

Rename the "xor" parameter to "xorval" as the former is a reserved keyword
in C++.

Submitted by: gahr


216628 21-Dec-2010 marius

Extend the hack of r182730 to trick GAS/GCC into compiling access to
STICK/STICK_COMPARE independently of the selected instruction set by
TICK_COMPARE so tick.c as of r214358 once again can be compiled with
gcc -mcpu=v9 for reference purposes.


216625 21-Dec-2010 marius

Revert r216080 so kmem_map is capped at 3/5 of the currently rather modest
kernel address space in order to leave space for the buffer cache, pipes,
thread stacks, etc on machines with more physical memory until we take
advantage of ASI_ATOMIC_QUAD_LDD_PHYS on CPUs providing it so we don't need
to lock the kernel TSB pages into the dTLB, basically making the entire
64-bit kernel address space available on relevant machines.

Submitted by: alc


216143 03-Dec-2010 brucec

Revert r216134. This checkin broke platforms where bus_space are macros:
they need to be a single statement, and do { } while (0) doesn't work in this
situation so revert until a solution can be devised.


216134 02-Dec-2010 brucec

Disallow passing in a count of zero bytes to the bus_space(9) functions.

Passing a count of zero on i386 and amd64 for [I386|AMD64]_BUS_SPACE_MEM
causes a crash/hang since the 'loop' instruction decrements the counter
before checking if it's zero.

PR: kern/80980
Discussed with: jhb


216080 30-Nov-2010 fjoe

Change VM_KMEM_SIZE_MAX to be just (VM_MAX_KERNEL_ADDRESS - VM_MIN_KERNEL_ADDRESS)

Suggested by: marius


216016 28-Nov-2010 fjoe

Define VM_KMEM_SIZE_MAX on sparc64. Otherwise kernel built with
DEBUG_MEMGUARD panics early in kmeminit() with the message
"kmem_suballoc: bad status return of 1" because of zero "size" argument
passed to kmem_suballoc() due to "vm_kmem_size_max" being zero.

The problem also exists on ia64.


215093 10-Nov-2010 alc

Enable reservation-based physical memory allocation. Even without the
creation of large page mappings in the pmap, it can provide modest
performance benefits. In particular, for a "buildworld" on a 2x 1GHz
Ultrasparc IIIi it reduced the wall clock time by 2.2% and the system
time by 12.6%.

Tested by: marius@


215054 09-Nov-2010 jhb

- Remove <machine/mutex.h>. Most of the headers were empty, and the
contents of the ones that were not empty were stale and unused.
- Now that <machine/mutex.h> no longer exists, there is no need to allow it
to override various helper macros in <sys/mutex.h>.
- Rename various helper macros for low-level operations on mutexes to live
in the _mtx_* or __mtx_* namespaces. While here, change the names to more
closely match the real API functions they are backing.
- Drop support for including <sys/mutex.h> in assembly source files.

Suggested by: bde (1, 2)


214071 19-Oct-2010 marius

- Wrap exchanging td_intr_frame and calling the event timer callback in
a critical section as apparently required by both. I don't think either
belongs in the event timer front-ends but the callback should handle
this as necessary instead just like for example intr_event_handle()
does but this is how the other architectures currently handle it, either
explicitly or implicitly.
- Further rename and reword references to hardclock as this front-end no
longer has a notion of actually calling it.


213578 08-Oct-2010 marius

In the replacement text of the __bswapN_const() macros cast the argument
to the expected type so they work like the corresponding __bswapN_var()
functions and the compiler doesn't complain when arguments of different
width are passed.


212709 15-Sep-2010 marius

Add a VIS-based block copy function for SPARC64 V and later, which
additionally takes advantage of the prefetch cache of these CPUs.
Unlike the uncommitted US-III version, which provide no measurable
speedup or even resulted in a slight slowdown on certain CPUs models
compared to using the US-I version with these, the SPARC64 version
actually results in a slight improvement.


212705 15-Sep-2010 marius

Add macros for alternate entry points.


212541 13-Sep-2010 mav

Refactor timer management code with priority to one-shot operation mode.
The main goal of this is to generate timer interrupts only when there is
some work to do. When CPU is busy interrupts are generating at full rate
of hz + stathz to fullfill scheduler and timekeeping requirements. But
when CPU is idle, only minimum set of interrupts (down to 8 interrupts per
second per CPU now), needed to handle scheduled callouts is executed.
This allows significantly increase idle CPU sleep time, increasing effect
of static power-saving technologies. Also it should reduce host CPU load
on virtualized systems, when guest system is idle.

There is set of tunables, also available as writable sysctls, allowing to
control wanted event timer subsystem behavior:
kern.eventtimer.timer - allows to choose event timer hardware to use.
On x86 there is up to 4 different kinds of timers. Depending on whether
chosen timer is per-CPU, behavior of other options slightly differs.
kern.eventtimer.periodic - allows to choose periodic and one-shot
operation mode. In periodic mode, current timer hardware taken as the only
source of time for time events. This mode is quite alike to previous kernel
behavior. One-shot mode instead uses currently selected time counter
hardware to schedule all needed events one by one and program timer to
generate interrupt exactly in specified time. Default value depends of
chosen timer capabilities, but one-shot mode is preferred, until other is
forced by user or hardware.
kern.eventtimer.singlemul - in periodic mode specifies how much times
higher timer frequency should be, to not strictly alias hardclock() and
statclock() events. Default values are 2 and 4, but could be reduced to 1
if extra interrupts are unwanted.
kern.eventtimer.idletick - makes each CPU to receive every timer interrupt
independently of whether they busy or not. By default this options is
disabled. If chosen timer is per-CPU and runs in periodic mode, this option
has no effect - all interrupts are generating.

As soon as this patch modifies cpu_idle() on some platforms, I have also
refactored one on x86. Now it makes use of MONITOR/MWAIT instrunctions
(if supported) under high sleep/wakeup rate, as fast alternative to other
methods. It allows SMP scheduler to wake up sleeping CPUs much faster
without using IPI, significantly increasing performance on some highly
task-switching loads.

Tested by: many (on i386, amd64, sparc64 and powerc)
H/W donated by: Gheorghe Ardelean
Sponsored by: iXsystems, Inc.


211412 17-Aug-2010 kib

Supply some useful information to the started image using ELF aux vectors.
In particular, provide pagesize and pagesizes array, the canary value
for SSP use, number of host CPUs and osreldate.

Tested by: marius (sparc64)
MFC after: 1 month


211197 11-Aug-2010 jhb

Update various places that store or manipulate CPU masks to use cpumask_t
instead of int or u_int. Since cpumask_t is currently u_int on all
platforms this should just be a cosmetic change.


211071 08-Aug-2010 marius

- As it is not possible for sched_bind(9) to context switch with
td_critnest > 1 when not already running on the desired CPU read the
TICK counter of the BSP via a direct cross trap request in that case
instead.
- Treat the STICK based timecounter the same way as the TICK based one
regarding its quality and obtaining the counter value from the BSP.
Like the TICK timers the STICK ones also are only synchronized during
their startup (which might not result in good synchronicity in the
first place) but not afterwards and might drift over time, causing
problems when the time is read from different CPUs (see r135972).


211050 08-Aug-2010 marius

- Introduce a cpu_ipi_single() function pointer in order to send IPIs
to single CPUs more efficiently with Cheetah(-class) and Jalapeno CPUs.
Besides being used to implement the ipi_cpu() introduced in r210939,
cpu_ipi_single() will also be used internally by the sparc64 MD code.
- Factor out the Jalapeno support from the Cheetah IPI send functions
in order to be able to more easily and efficiently implement support
for more than 32 target CPUs as well as a workaround for Cheetah+
erratum 25 for the latter.


211049 08-Aug-2010 marius

For CPUs which ignore TD_CV and support hardware unaliasing don't
bother doing page coloring. This results in a small but measurable
performance improvement in buildworld times.


210939 06-Aug-2010 jhb

Add a new ipi_cpu() function to the MI IPI API that can be used to send an
IPI to a specific CPU by its cpuid. Replace calls to ipi_selected() that
constructed a mask for a single CPU with calls to ipi_cpu() instead. This
will matter more in the future when we transition from cpumask_t to
cpuset_t for CPU masks in which case building a CPU mask is more expensive.

Submitted by: peter, sbruno
Reviewed by: rookie
Obtained from: Yahoo! (x86)
MFC after: 1 month


210601 29-Jul-2010 mav

Adapt sparc64 and sun4v timer code for the new event timers infrastructure.

Reviewed by: marius@


210550 27-Jul-2010 jhb

Very rough first cut at NUMA support for the physical page allocator. For
now it uses a very dumb first-touch allocation policy. This will change in
the future.
- Each architecture indicates the maximum number of supported memory domains
via a new VM_NDOMAIN parameter in <machine/vmparam.h>.
- Each cpu now has a PCPU_GET(domain) member to indicate the memory domain
a CPU belongs to. Domain values are dense and numbered from 0.
- When a platform supports multiple domains, the default freelist
(VM_FREELIST_DEFAULT) is split up into N freelists, one for each domain.
The MD code is required to populate an array of mem_affinity structures.
Each entry in the array defines a range of memory (start and end) and a
domain for the range. Multiple entries may be present for a single
domain. The list is terminated by an entry where all fields are zero.
This array of structures is used to split up phys_avail[] regions that
fall in VM_FREELIST_DEFAULT into per-domain freelists.
- Each memory domain has a separate lookup-array of freelists that is
used when fulfulling a physical memory allocation. Right now the
per-domain freelists are listed in a round-robin order for each domain.
In the future a table such as the ACPI SLIT table may be used to order
the per-domain lookup lists based on the penalty for each memory domain
relative to a specific domain. The lookup lists may be examined via a
new vm.phys.lookup_lists sysctl.
- The first-touch policy is implemented by using PCPU_GET(domain) to
pick a lookup list when allocating memory.

Reviewed by: alc


210334 21-Jul-2010 attilio

KTR_CTx are long time aliased by existing classes so they can't serve
their purpose anymore. Axe them out.

Sponsored by: Sandvine Incorporated
Discussed with: jhb, emaste
Possible MFC: TBD


210176 16-Jul-2010 mav

Allocate proper ammount of memory for interrupt names on sparc64 and
sun4v, same as done on other architectures. This removes garbage from
`vmstat -ia` output.

Reviewed by: marius@


209695 04-Jul-2010 marius

- Pin the IPI cache and TLB demap functions in order to prevent migration
between determining the other CPUs and calling cpu_ipi_selected(), which
apart from generally doing the wrong thing can lead to a panic when a
CPU is told to IPI itself (which sun4u doesn't support).
Reported and tested by: Nathaniel W Filardo
- Add __unused where appropriate.

MFC after: 3 days


208453 23-May-2010 kib

Reorganize syscall entry and leave handling.

Extend struct sysvec with three new elements:
sv_fetch_syscall_args - the method to fetch syscall arguments from
usermode into struct syscall_args. The structure is machine-depended
(this might be reconsidered after all architectures are converted).
sv_set_syscall_retval - the method to set a return value for usermode
from the syscall. It is a generalization of
cpu_set_syscall_retval(9) to allow ABIs to override the way to set a
return value.
sv_syscallnames - the table of syscall names.

Use sv_set_syscall_retval in kern_sigsuspend() instead of hardcoding
the call to cpu_set_syscall_retval().

The new functions syscallenter(9) and syscallret(9) are provided that
use sv_*syscall* pointers and contain the common repeated code from
the syscall() implementations for the architecture-specific syscall
trap handlers.

Syscallenter() fetches arguments, calls syscall implementation from
ABI sysent table, and set up return frame. The end of syscall
bookkeeping is done by syscallret().

Take advantage of single place for MI syscall handling code and
implement ptrace_lwpinfo pl_flags PL_FLAG_SCE, PL_FLAG_SCX and
PL_FLAG_EXEC. The SCE and SCX flags notify the debugger that the
thread is stopped at syscall entry or return point respectively. The
EXEC flag augments SCX and notifies debugger that the process address
space was changed by one of exec(2)-family syscalls.

The i386, amd64, sparc64, sun4v, powerpc and ia64 syscall()s are
changed to use syscallenter()/syscallret(). MIPS and arm are not
converted and use the mostly unchanged syscall() implementation.

Reviewed by: jhb, marcel, marius, nwhitehorn, stas
Tested by: marcel (ia64), marius (sparc64), nwhitehorn (powerpc),
stas (mips)
MFC after: 1 month


208349 20-May-2010 marius

Change ad_firmware_geom_adjust() to operate on a struct disk * only and
hook it up to ada(4) also. While at it, rename *ad_firmware_geom_adjust()
to *ata_disk_firmware_geom_adjust() etc now that these are no longer
limited to ad(4).

Reviewed by: mav
MFC after: 3 days


207537 02-May-2010 marius

Add support for SPARC64 V (and where it already makes sense for other
HAL/Fujitsu) CPUs. For the most part this consists of fleshing out the
MMU and cache handling, it doesn't add pmap optimizations possible with
these CPU, yet, though.
With these changes FreeBSD runs stable on Fujitsu Siemens PRIMEPOWER 250
and likely also other models based on SPARC64 V like 450, 650 and 850.
Thanks go to Michael Moll for providing access to a PRIMEPOWER 250.


207410 30-Apr-2010 kmacy

On Alan's advice, rather than do a wholesale conversion on a single
architecture from page queue lock to a hashed array of page locks
(based on a patch by Jeff Roberson), I've implemented page lock
support in the MI code and have only moved vm_page's hold_count
out from under page queue mutex to page lock. This changes
pmap_extract_and_hold on all pmaps.

Supported by: Bitgravity Inc.

Discussed with: alc, jeffr, and kib


207269 27-Apr-2010 kib

Style: use #define<TAB> instead of #define<SPACE>.

Noted by: bde, pluknet gmail com
MFC after: 11 days


207243 26-Apr-2010 marius

Add OF_getscsinitid(), a helper similar to OF_getetheraddr() but for
obtaining the initiator ID to be used for SPI controllers from the
Open Firmware device tree.


207152 24-Apr-2010 kib

Move the constants specifying the size of struct kinfo_proc into
machine-specific header files. Add KINFO_PROC32_SIZE for struct
kinfo_proc32 for architectures providing COMPAT_FREEBSD32. Add
CTASSERT for the size of struct kinfo_proc32.

Submitted by: pluknet
Reviewed by: imp, jhb, nwhitehorn
MFC after: 2 weeks


206480 11-Apr-2010 marius

Update for UltraSPARC-IV{,+} and SPARC64 V, VI, VII and VIIIfx CPUs.


206450 10-Apr-2010 marius

Correct the DCR_IPE macro to refer to the right bit. Also improve the
associated comment as besides US-IV+ these bits are only available with
US-III++, i.e. the 1.2GHz version of the US-III+.


205409 21-Mar-2010 marius

- The firmware of Sun Fire V1280 has a misfeature of setting %wstate to
7 which corresponds to WSTATE_KMIX in OpenSolaris whenever calling into
it which totally screws us even when restoring %wstate afterwards as
spill/fill traps can happen while in OFW. The rather hackish OpenBSD
approach of just setting the equivalent of WSTATE_KERNEL to 7 also is
no option as we treat %wstate as a bit field. So in order to deal with
this problem actually implement spill/fill handlers for %wstate 7 which
just act as the WSTATE_KERNEL ones except of theoretically also handling
32-bit, turn off interrupts completely so we don't even take IPIs while
in OFW which should ensure we only take spill/fill traps at most and
restore %wstate after calling into OFW once we have taken over the trap
table. While at it, actually set WSTATE_{,PROM}_KMIX before calling into
OFW just like OpenSolaris does, which should at least help testing this
change on non-V1280.
- Remove comments referring to the %wstate usage in BSD/OS.
- Remove the no longer used RSF_ALIGN_RETRY macro.
- Correct some trap table addresses in comments.
- Ensure %wstate is set to WSTATE_KERNEL when taking over the trap table.
- Ensure PSTATE_AM is off when entering or exiting to OFW as well as that
interrupts are also completely off when exiting to OFW as the firmware
trap table shouldn't be used to handle our interrupts.


205269 17-Mar-2010 marius

o Add support for UltraSparc-IV+:
- Swap the configuration of the first and second large dTLB as with
US-IV+ these can only hold entries of certain page sizes each, which
we happened to chose the non-working way around.
- Additionally ensure that the large iTLB is set up to hold 8k pages
(currently this happens to be a NOP though).
- Add a workaround for US-IV+ erratum #2.
- Turn off dTLB parity error reporting as otherwise we get seemingly
false positives when copying in the user window by simulating a
fill trap on return to usermode. Given that these parity errors can
be avoided by disabling multi issue mode and the problem could be
reproduced with a second machine this appears to be a silicon bug of
some sort.
- Add a membar #Sync also before the stores to ASI_DCACHE_TAG. While
at it, turn of interrupts across the whole cheetah_cache_flush() for
simplicity instead of around every flush. This should have next to no
impact as for cheetah-class machines we typically only need to flush
the caches a few times during boot when recovering from peeking/poking
non-existent PCI devices, if at all.
- Just use KERNBASE for FLUSH as we also do elsewhere as the US-IV+
documentation doesn't seem to mention that these CPUs also ignore the
address like previous cheetah-class CPUs do. Again the code changing
LSU_IC is executed seldom enough that the negligible optimization of
using %g0 instead should have no real impact.

With these changes FreeBSD runs stable on V890 equipped with US-IV+
and -j128 buildworlds in a loop for days are no problem. Unfortunately,
the performance isn't were it should be as a buildworld on a 4x1.5GHz
US-IV+ V890 takes nearly 3h while on a V440 with (theoretically) less
powerfull 4x1.5GHz US-IIIi it takes just over 1h. It's unclear whether
this is related to the supposed silicon bug mentioned above or due to
another issue. The documentation (which contains a sever bug in the
description of the bits added to the context registers though) at least
doesn't mention any requirements for changes in the CPU handling besides
those implemented and the cache as well as the TLB configurations and
handling look fine.
o Re-arrange cheetah_init() so it's easier to add support for SPARC64
V up to VIIIfx CPUs, which only require parts of this initialization.


205263 17-Mar-2010 marius

Add macros for the VER.impl of SPARC64 II to VIIIfx.


205258 17-Mar-2010 marius

- Add TTE and context register bits for the additional page sizes supported
by UltraSparc-IV and -IV+ as well as SPARC64 V, VI, VII and VIIIfx CPUs.
- Replace TLB_PCXR_PGSZ_MASK and TLB_SCXR_PGSZ_MASK with TLB_CXR_PGSZ_MASK
which just is the complement of TLB_CXR_CTX_MASK instead of trying to
assemble it from the page size bits which vary across CPUs.
- Add macros for the remainder of the SFSR bits, which are useful for at
least debugging purposes.


204646 03-Mar-2010 joel

The NetBSD Foundation has granted permission to remove clause 3 and 4 from
the software.

Obtained from: NetBSD


204164 21-Feb-2010 marius

Some machines can not only consist of CPUs running at different speeds
but also of different types, f.e. Sun Fire V890 can be equipped with a
mix of UltraSPARC IV and IV+ CPUs, requiring different MMU initialization
and different workarounds for model specific errata. Therefore move the
CPU implementation number from a global variable to the per-CPU data.
Functions which are called before the latter is available are passed the
implementation number as a parameter now.

This file was missed in r204152.


204152 20-Feb-2010 marius

Some machines can not only consist of CPUs running at different speeds
but also of different types, f.e. Sun Fire V890 can be equipped with a
mix of UltraSPARC IV and IV+ CPUs, requiring different MMU initialization
and different workarounds for model specific errata. Therefore move the
CPU implementation number from a global variable to the per-CPU data.
Functions which are called before the latter is available are passed the
implementation number as a parameter now.


203846 13-Feb-2010 marius

Predict KASSERTs to be true.


203844 13-Feb-2010 marius

- Add the 'cmp' and 'core' pseudo-busses which are used to group CPU cores
to the exclusion lists as the CPU nodes aren't handled as regular devices
either. Also add the pseudo-devices found in Sun Fire V1280.
- Allow nexus_attach() and nexus_alloc_resource() to be used by drivers
derived from nexus(4) for subordinate busses.
- Don't add the zero-sized memory resources of glue devices to the resource
lists.


203843 13-Feb-2010 marius

Resurrect nexusvar.h from r167307.


203838 13-Feb-2010 marius

- Search the whole OFW device tree instead of only the children of the
root nexus device for the CPUs as starting with UltraSPARC IV the 'cpu'
nodes hang off of from 'cmp' (chip multi-threading processor) or 'core'
or combinations thereof. Also in large UltraSPARC III based machines
the 'cpu' nodes hang off of 'ssm' (scalable shared memory) nodes which
group snooping-coherency domains together instead of directly from the
nexus.
It would be great if we could use newbus to deal with the different ways
the 'cpu' devices can hang off of pseudo ones but unfortunately both
cpu_mp_setmaxid() and sparc64_init() have to work prior to regular device
probing.
- Add support for UltraSPARC IV and IV+ CPUs. Due to the fact that these
are multi-core each CPU has two Fireplane config registers and thus the
module/target ID has to be determined differently so the one specific
to a certain core is used. Similarly, starting with UltraSPARC IV the
individual cores use a different property in the OFW device tree to
indicate the CPU/core ID as it no longer is in coincidence with the
shared slot/socket ID.
This involves changing the MD KTR code to not directly read the UPA
module ID either. We use the MID stored in the per-CPU data instead of
calling cpu_get_mid() as a replacement in order prevent clobbering any
registers as side-effect in the assembler version. This requires CATR()
invocations from mp_startup() prior to mapping the per-CPU pages to be
removed though.
While at it additionally distinguish between CPUs with Fireplane and
JBus interconnects as these also use slightly different sizes for the
JBus/agent/module/target IDs.
- Make sparc64_shutdown_final() static as it's not used outside of
machdep.c.


203829 13-Feb-2010 marius

- Assert that HEAPSZ is a multiple of PAGE_SIZE as at least the firmware
of Sun Fire V1280 doesn't round up the size itself but instead lets
claiming of non page-sized amounts of memory fail.
- Change parameters and variables related to the TLB slots to unsigned
which is more appropriate.
- Search the whole OFW device tree instead of only the children of the
root nexus device for the BSP as starting with UltraSPARC IV the 'cpu'
nodes hang off of from 'cmp' (chip multi-threading processor) or 'core'
or combinations thereof. Also in large UltraSPARC III based machines
the 'cpu' nodes hang off of 'ssm' (scalable shared memory) nodes which
group snooping-coherency domains together instead of directly from the
nexus.
- Add support for UltraSPARC IV and IV+ BSPs. Due to the fact that these
are multi-core each CPU has two Fireplane config registers and thus the
module/target ID has to be determined differently so the one specific
to a certain core is used. Similarly, starting with UltraSPARC IV the
individual cores use a different property in the OFW device tree to
indicate the CPU/core ID as it no longer is in coincidence with the
shared slot/socket ID.
While at it additionally distinguish between CPUs with Fireplane and
JBus interconnects as these also use slightly different sizes for the
JBus/agent/module/target IDs.
- Check the return value of init_heap(). This requires moving it after
cons_probe() so we can panic when appropriate. This should be fine as
the PowerPC OFW loader uses that order for quite some time now.


200948 24-Dec-2009 marius

Merge from amd64/i386:
Implement support for interrupt descriptions.


200923 23-Dec-2009 marius

- Add support for the IOMMUs of Fire JBus to PCIe and Oberon Uranus
to PCIe bridges.
- Add support for talking the PROM mappings over to the kernel IOTSB
just like we do with the kernel TSB in order to allow OFW drivers
to continue to work.
- Change some members, parameters and variables to unsigned where
more appropriate.


200922 23-Dec-2009 marius

Fix whitespace according to style(9).


200878 22-Dec-2009 marius

- Add macros for the states of the interrupt clear registers.
- Change INTMAP_VEC() to take an INO as its second argument rather
than an INR. The former is what I actually intended with this
macro and how it's currently used.


200876 22-Dec-2009 marius

Make these constants unsigned which is more appropriate.


199135 10-Nov-2009 kib

Extract the code that records syscall results in the frame into MD
function cpu_set_syscall_retval().

Suggested by: marcel
Reviewed by: marcel, davidxu
PowerPC, ARM, ia64 changes: marcel
Sparc64 tested and reviewed by: marius, also sunv reviewed
MIPS tested by: gonzo
MFC after: 1 month


198502 26-Oct-2009 marius

Sync with the other archs and wrapper the prototype of in_cksum_skip(9)
in #ifdef _KERNEL.

Submitted by: Ulrich Spoerlein
MFC after: 1 month


198203 18-Oct-2009 marius

Change the load base to below 2GB so PIE binaries work including when
compiled to use the Medium/Low code model, which we currently default
to for the userland. GNU/Linux has moved their default to Medium/Middle
some time ago, which probably explains why the current GNU ld(1) uses
a base in the range between 32 and 44 bits instead.

Submitted by: kib


197933 10-Oct-2009 kib

Define architectural load bases for PIE binaries. Addresses were selected
by looking at the bases used for non-relocatable executables by gnu ld(1),
and adjusting it slightly.

Discussed with: bz
Reviewed by: kan
Tested by: bz (i386, amd64), bsam (linux)
MFC after: some time


197316 18-Sep-2009 alc

Add a new sysctl for reporting all of the supported page sizes.

Reviewed by: jhb
MFC after: 3 weeks


196994 08-Sep-2009 phk

Get rid of the _NO_NAMESPACE_POLLUTION kludge by creating an
architecture specific include file containing the _ALIGN*
stuff which <sys/socket.h> needs.


196196 13-Aug-2009 attilio

* Completely Remove the option STOP_NMI from the kernel. This option
has proven to have a good effect when entering KDB by using a NMI,
but it completely violates all the good rules about interrupts
disabled while holding a spinlock in other occasions. This can be the
cause of deadlocks on events where a normal IPI_STOP is expected.
* Adds an new IPI called IPI_STOP_HARD on all the supported architectures.
This IPI is responsible for sending a stop message among CPUs using a
privileged channel when disponible. In other cases it just does match a
normal IPI_STOP.
Right now the IPI_STOP_HARD functionality uses a NMI on ia32 and amd64
architectures, while on the other has a normal IPI_STOP effect. It is
responsibility of maintainers to eventually implement an hard stop
when necessary and possible.
* Use the new IPI facility in order to implement a new userend SMP kernel
function called stop_cpus_hard(). That is specular to stop_cpu() but
it does use the privileged channel for the stopping facility.
* Let KDB use the newly introduced function stop_cpus_hard() and leave
stop_cpus() for all the other cases
* Disable interrupts on CPU0 when starting the process of APs suspension.
* Style cleanup and comments adding

This patch should fix the reboot/shutdown deadlocks many users are
constantly reporting on mailing lists.

Please don't forget to update your config file with the STOP_NMI
option removal

Reviewed by: jhb
Tested by: pho, bz, rink
Approved by: re (kib)


195808 21-Jul-2009 marius

Add a MD __PCI_BAR_ZERO_VALID which denotes that BARs containing 0
actually specify valid bases that should be treated just as normal.
The PCI specifications have no indication that 0 would be a magic value
indicating a disabled BAR as commonly used on at least amd64 and i386
but not sparc64. It's unclear what to do in pci_delete_resource()
instead of writing 0 to a BAR though as there's no (other) way do
disable individual BARs so its decoding is left enabled in case of
__PCI_BAR_ZERO_VALID for now.

Approved by: re (kib), jhb
MFC after: 1 week


195649 12-Jul-2009 alc

Add support to the virtual memory system for configuring machine-
dependent memory attributes:

Rename vm_cache_mode_t to vm_memattr_t. The new name reflects the
fact that there are machine-dependent memory attributes that have
nothing to do with controlling the cache's behavior.

Introduce vm_object_set_memattr() for setting the default memory
attributes that will be given to an object's pages.

Introduce and use pmap_page_{get,set}_memattr() for getting and
setting a page's machine-dependent memory attributes. Add full
support for these functions on amd64 and i386 and stubs for them on
the other architectures. The function pmap_page_set_memattr() is also
responsible for any other machine-dependent aspects of changing a
page's memory attributes, such as flushing the cache or updating the
direct map. The uses include kmem_alloc_contig(), vm_page_alloc(),
and the device pager:

kmem_alloc_contig() can now be used to allocate kernel memory with
non-default memory attributes on amd64 and i386.

vm_page_alloc() and the device pager will set the memory attributes
for the real or fictitious page according to the object's default
memory attributes.

Update the various pmap functions on amd64 and i386 that map pages to
incorporate each page's memory attributes in the mapping.

Notes: (1) Inherent to this design are safety features that prevent
the specification of inconsistent memory attributes by different
mappings on amd64 and i386. In addition, the device pager provides a
warning when a device driver creates a fictitious page with memory
attributes that are inconsistent with the real page that the
fictitious page is an alias for. (2) Storing the machine-dependent
memory attributes for amd64 and i386 as a dedicated "int" in "struct
md_page" represents a compromise between space efficiency and the ease
of MFCing these changes to RELENG_7.

In collaboration with: jhb

Approved by: re (kib)


195376 05-Jul-2009 sam

Cleanup ALIGNED_POINTER:
o add to platforms where it was missing (arm, i386, powerpc, sparc64, sun4v)
o define as "1" on amd64 and i386 where there is no restriction
o make the type returned consistent with ALIGN
o remove _ALIGNED_POINTER
o make associated comments consistent

Reviewed by: bde, imp, marcel
Approved by: re (kensmith)


195149 28-Jun-2009 marius

- Work around the broken loader behavior of not demapping no longer
used kernel TLB slots when unloading the kernel or modules, which
results in havoc when loading a kernel and modules which take up
less TLB slots afterwards as the unused but locked ones aren't
accounted for in virtual_avail. Eventually this should be fixed
in the loader which isn't straight forward though and the kernel
should be robust against this anyway. [1]
- Ensure that the addresses allocated directly from phys_avail[] by
pmap_bootstrap_alloc() are always colored properly. This implicit
assumption was broken in r194784 as unlike the other consumers the
DPCPU area allocated for the BSP isn't a multiple of PAGE_SIZE *
DCACHE_COLORS. [2]
- Remove the no longer used global msgbuf_phys.
- Remove the redundant ekva parameter of pmap_bootstrap_alloc().
- Correct some outdated function names in ktr(9) invocations.

Requested by: jhb [1]
Reported by: gavin [2]
Approved by: re (kib)
MFC after: 2 weeks


195060 26-Jun-2009 alc

Correct the #endif comment.

Noticed by: jmallett
Approved by: re (kib)


195033 26-Jun-2009 alc

This change is the next step in implementing the cache control functionality
required by video card drivers. Specifically, this change introduces
vm_cache_mode_t with an appropriate VM_CACHE_DEFAULT definition on all
architectures. In addition, this changes adds a vm_cache_mode_t parameter
to kmem_alloc_contig() and vm_phys_alloc_contig(). These will be the
interfaces for allocating mapped kernel memory and physical memory,
respectively, with non-default cache modes.

In collaboration with: jhb


194784 23-Jun-2009 jeff

Implement a facility for dynamic per-cpu variables.
- Modules and kernel code alike may use DPCPU_DEFINE(),
DPCPU_GET(), DPCPU_SET(), etc. akin to the statically defined
PCPU_*. Requires only one extra instruction more than PCPU_* and is
virtually the same as __thread for builtin and much faster for shared
objects. DPCPU variables can be initialized when defined.
- Modules are supported by relocating the module's per-cpu linker set
over space reserved in the kernel. Modules may fail to load if there
is insufficient space available.
- Track space available for modules with a one-off extent allocator.
Free may block for memory to allocate space for an extent.

Reviewed by: jhb, rwatson, kan, sam, grehan, marius, marcel, stas


191309 20-Apr-2009 rwatson

Don't conditionally define CACHE_LINE_SHIFT, as we anticipate sizing
a fair number of static data structures, making this an unlikely
option to try to change without also changing source code. [1]

Change default cache line size on ia64, sparc64, and sun4v to 128
bytes, as this was what rtld-elf was already using on those
platforms. [2]

Suggested by: bde [1], jhb [2]
MFC after: 2 weeks


191278 19-Apr-2009 rwatson

Add description and cautionary note regarding CACHE_LINE_SIZE.

MFC after: 2 weeks
Suggested by: alc


191276 19-Apr-2009 rwatson

For each architecture, define CACHE_LINE_SHIFT and a derived
CACHE_LINE_SIZE constant. These constants are intended to
over-estimate the cache line size, and be used at compile-time
when a run-time tuning alternative isn't appropriate or
available.

Defaults for all architectures are 64 bytes, except powerpc
where it is 128 bytes (used on G5 systems).

MFC after: 2 weeks
Discussed on: arch@


190107 19-Mar-2009 marius

- There's no need to wrap kdb_active and kdb_trap() in #ifdef KDB as
they're always available.
- Remove unused variable. [1]
- Add a missing const.
- Sort includes.

Submitted by: Christoph Mallon [1]


189926 17-Mar-2009 kib

Add AT_EXECPATH ELF auxinfo entry type. The value's a_ptr is a pointer
to the full path of the image that is being executed.
Increase AT_COUNT.

Remove no longer true comment about types used in Linux ELF binaries,
listed types contain FreeBSD-specific entries.

Reviewed by: kan


188456 10-Feb-2009 marius

Improve r185008 so the streaming cache is only flushed when
a mapping actually met the threshold.


188455 10-Feb-2009 marius

- Use the generally more appropriate PROM base rather than the
kernel one as the non-faulting flush address in the loader so
we can can change KERNBASE and VM_MIN_KERNEL_ADDRESS if we
ever want to without needing to worry about using a compatible
loader.
- Correctly check for LOADER_DEBUG.
- Add a missing const for page_sizes[].


186682 01-Jan-2009 marius

- Currently the PMAP code is laid out to let the kernel TSB cover the
whole KVA space using one locked 4MB dTLB entry per GB of physical
memory. On Cheetah-class machines only the dt16 can hold locked
entries though, which would be completely consumed for the kernel
TSB on machines with >= 16GB. Therefore limit the KVA space to use
no more than half of the lockable dTLB slots, given that we need
them also for other things.
- Add sanity checks which ensure that we don't exhaust the (lockable)
TLB slots.


186347 20-Dec-2008 nwhitehorn

Modularize the Open Firmware client interface to allow run-time switching
of OFW access semantics, in order to allow future support for real-mode
OF access and flattened device frees. OF client interface modules are
implemented using KOBJ, in a similar way to the PPC PMAP modules.

Because we need Open Firmware to be available before mutexes can be used on
sparc64, changes are also included to allow KOBJ to be used very early in
the boot process by only using the mutex once we know it has been initialized.

Reviewed by: marius, grehan


186212 17-Dec-2008 imp

AT_DEBUG and AT_BRK were OBE like 10 years ago, so retire them.

Reviewed by: peter


186128 15-Dec-2008 nwhitehorn

Adapt parts of the sparc64 Open Firmware bus enumeration code (in particular,
the code for parsing interrupt maps) to PowerPC and reflect their new MI
status by moving them to the shared dev/ofw directory.

This commit also modifies the OFW PCI enumeration procedure on PowerPC to
allow the bus to find non-firmware-enumerated devices that Apple likes to add,
and adds some useful Open Firmware properties (compat and name) to the pnpinfo
string of children on OFW SBus, EBus, PCI, and MacIO links. Because of the
change to PCI enumeration on PowerPC, X has started working again on PPC
machines with Grackle hostbridges.

Reviewed by: marius
Obtained from: sparc64


185162 22-Nov-2008 kmacy

- bump __FreeBSD version to reflect added buf_ring, memory barriers,
and ifnet functions

- add memory barriers to <machine/atomic.h>
- update drivers to only conditionally define their own

- add lockless producer / consumer ring buffer
- remove ring buffer implementation from cxgb and update its callers

- add if_transmit(struct ifnet *ifp, struct mbuf *m) to ifnet to
allow drivers to efficiently manage multiple hardware queues
(i.e. not serialize all packets through one ifq)
- expose if_qflush to allow drivers to flush any driver managed queues

This work was supported by Bitgravity Inc. and Chelsio Inc.


185109 19-Nov-2008 marius

Use the interrupt level right below PIL_FAST for executing interrupt
filters instead of PIL_FAST and allow special filters and handlers
for interrupts which need to be able to interrupt even filters, f.e.
bus error interrupts, to be registered with the revived INTR_FAST
at PIL_FAST.


185008 16-Nov-2008 marius

- Allow the front-end to specify that iommu(4) should disable
rerun of the streaming cache for silicon bug workarounds.
- Announce the presence of a streaming cache on attach for
informational purposes.
- For performance reasons don't do unnecessary flushes of the
streaming cache when coherent mappings are synced.
- Fix some minor style issues.


183201 20-Sep-2008 marius

Use the STICK timers only when absolutely necessary, i.e. if a machine
consists of CPUs running at different speeds, for driving hardclock as
these timers in turn are driven at frequencies as low as 5MHz, resulting
in bad granularity compared to the TICK timers. However, don't employ
the workaround for the BlackBird erratum #1 when using the TICK timer
on machines with cheetah-class CPUs for performance reasons.

Reported by: Florian Smeets


183142 18-Sep-2008 marius

- Newer firmware versions no longer provide SUNW,stop-self so just
disable interrupts and loop forever with these.
- Hide all MP-related bits in <machine/smp.h> underneath #ifdef SMP.
- Inline ipi_all_but_self(9) and ipi_selected(9). We don't expose any
additional bits but save a few cycles by doing so.
- Remove ipi_all(9), which actually only called panic(9). It can't be
implemented natively anyway and having it removed at least causes
MI users to fail already fail when linking.


182878 08-Sep-2008 marius

For cheetah-class CPUs ensure that the dt512_0 is set to hold 8k pages
for all three contexts and configure the dt512_1 to hold 4MB pages for
them (e.g. for direct mappings).
This might allow for additional optimization by using the faulting
page sizes provided by AA_DMMU_TAG_ACCESS_EXT for bypassing the page
size walker for the dt512 in the superpage support code.

Submitted by: nwhitehorn (initial patch)


182773 04-Sep-2008 marius

Use the PROM provided SUNW,set-trap-table to take over the trap
table. This is required in order to set obp-control-relinquished
within the PROM, allowing to safely read the OFW translations node.
Without this, f.e. a `ofwdump -ap` triggers a fatal reset error or
worse things on machines based on USIII and beyond.
In theory this should allow to remove touching %tba in cpu_setregs(),
in practice we seem to currently face a chicken and egg problem when
doing so however.


182768 04-Sep-2008 marius

Flesh out MMU and cache handling of cheetah-class CPUs.


182767 04-Sep-2008 marius

The physical address space of cheetah-class CPUs has been extended
to 43 bits so update TD_PA_BITS accordingly. For the most part this
increase is transparent to the existing code except for when reading
the physical address from ASI_{D,I}TLB_DATA_ACCESS_REG, which we
only do in the loader and which was already adjusted in r182478, or
from the OFW translations node.
While at it, ensure we are only taking valid OFW mapping entries
into account.


182730 03-Sep-2008 marius

- USIII-based machines can consist of CPUs running at different
frequencies (and having different cache sizes) so use the STICK
(System TICK) timer, which was introduced due to this and is
driven by the same frequency across all CPUs, instead of the
TICK timer, whose frequency varies with the CPU clock, to drive
hardclock. We try to use the STICK counter with all CPUs that are
USIII or beyond, even when not necessary due to identical CPUs,
as we can can also avoid the workaround for the BlackBird erratum
#1 there. Unfortunately, using the STICK counter currently causes
a hang with USIIIi MP machines for reasons unknown, so we still
use the TICK timer there (which is okay as they can only consist
of identical CPUs).
- Given that we only (try to) synchronize the (S)TICK timers of APs
with the BSP during startup, we could end up spinning forever in
DELAY(9) if that function is migrated to another CPU while we're
spinning due to clock drift afterwards, so pin to the CPU in order
to avoid migration. Unfortunately, pinning doesn't work at the
point DELAY(9) is required by the low-level console drivers, yet,
so switch to a function pointer, which is updated accordingly, for
implementing DELAY(9). For USIII and beyond, this would also allow
to easily use the STICK counter instead of the TICK one here,
there's no benefit in doing so however.
While at it, use cpu_spinwait(9) for spinning in the delay-
functions. This currently is a NOP though.
- Don't set the TICK timer of the BSP to 0 during at startup as
there's no need to do so.
- Implement cpu_est_clockrate().
- Unfortunately, USIIIi-based machines don't provide a timecounter
device besides the STICK and TICK counters (well, in theory the
Tomatillo bridges have a performance counter that can be (ab)used
as timecounter by configuring it to count bus cycles, though unlike
the performance counter of Schizo bridges, the Tomatillo one is
broken and counts Sun knows what in this mode). This means that
we've to use a (S)TICK counter for timecounting, which has the old
problem of not being in sync across CPUs, so provide an additional
timecounter function which binds itself to the BSP but has an
adequate low priority.


182689 02-Sep-2008 marius

- USIII-based machines can consist of CPUs having different cache
sizes (and running at different frequencies) so move the cacheinfo
to the PCPU data. While at it, remove some redundant and/or unused
members from struct cacheinfo.
- In sparc64_init don't assume the first CPU node we find in the OFW
device tree is the BSP.


182078 23-Aug-2008 marius

Update the comment regarding the workaround for the BlackBird
TICK_COMPARE bug and the instruction alignment used for it based
on information found in the OpenSolaris source.

MFC after: 3 days


181875 19-Aug-2008 jhb

Export 'struct pcpu' to userland w/o requiring _KERNEL. A few ports
already define _KERNEL to get to this and I'm about to add hooks to
libkvm to access per-CPU data.

MFC after: 1 week


181701 13-Aug-2008 marius

cosmetic changes and style fixes


181642 12-Aug-2008 marius

Assume OpenSolaris knows better and use their value for VM_MAX_PROM_ADDRESS.


181398 07-Aug-2008 marius

- Reimplement {d,i}tlb_enter() and {d,i}tlb_va_to_pa() in C. There's
no particular reason for them to be implemented in assembler and
having them in C allows easier extension as well as using more C
macros and {d,i}tlb_slot_max rather than hard-coding magic (and
actually spitfire-only) values.
- Fix the compilation of pmap_print_tte().
- Change pmap_print_tlb() to use ldxa() rather than re-rolling it
inline as well as TLB_DAR_SLOT and {d,i}tlb_slot_max rather than
hardcoding magic (and actually spitfire-only) values.
- While at it, suffix the above mentioned functions with "_sun4u" to
underline they're architecture-specific.
- Use __FBSDID and macros instead of magic values in locore.S.
- Remove unused includes and smp_stack in locore.S.


180297 05-Jul-2008 marius

Revert the addition of "__volatile" to "__asm" done in r180011, since
the condition codes where added to the clobber lists in r180073 the
former is unnecessary.


180073 27-Jun-2008 marius

Improve r180011 by explicitly adding the condition codes to the
clobber list.

Suggested by: Christoph Mallon


180011 25-Jun-2008 marius

Use "__asm __volatile" rather than "__asm" for instruction sequences
that modify condition codes (the carry bit, in this case). Without
"__volatile", the compiler might add the inline assembler instructions
between unrelated code which also uses condition codes, modifying the
latter.
This prevents the TCP pseudo header checksum calculation done in
tcp_output() from having effects on other conditions when compiled
with GCC 4.2.1 at "-O2" and "options INET6" left out. [1]

Reported & tested by: Boris Kochergin [1]
MFC after: 3 days


179990 25-Jun-2008 ed

Remove the unused major/minor numbers from iodev and memdev.

Now that st_rdev is being automatically generated by the kernel, there
is no need to define static major/minor numbers for the iodev and
memdev. We still need the minor numbers for the memdev, however, to
distinguish between /dev/mem and /dev/kmem.

Approved by: philip (mentor)


178860 08-May-2008 marius

- Remove the BUS_HANDLE_MIN checking in the __BUS_DEBUG_ACCESS macro;
for UPA it should have fulfilled its purpose by now and Fireplane-
and JBus-based machines are way to messy in organization to implement
something equivalent.
- Fix a bunch of style(9) bugs.


178840 07-May-2008 marius

- Use the name returned by device_get_nameunit(9) for the name of the
counter-timer timecounter so the associated SYSCTL nodes don't clash on
machines having multiple U2P and U2S bridges as well as establishing a
clear mapping between these bridges and their timecounter device.
- Don't bother setting up a "nice" name for the IOMMU, just use the name
returned by device_get_nameunit(9), too.
- Fix some minor style(9) bugs.
- Use __FBSDID in counter.c

MFC after: 1 week


178445 23-Apr-2008 marius

- Include <machine/utrap.h> so this header doesn't have an MD
dependency.
- Make prototypes style(9) compliant.

MFC after: 1 week


178443 23-Apr-2008 marius

o Rename ic_eoi to ic_clear to emphasize the functions it points
don't send and EOI which works like on amd64/i386 and blocks all
interrupts on the relevant interrupt controller.
o Replace the post_filter and post_inthread hooks registered when
creating the interrupt events with just ic_clear as on sparc64 we
don't need to do any disable->EOI->enable dance to unblock all but
the relevant interrupt while running the filter or handler; just
not clearing the interrupt already has the same effect.
o Merge from amd64/i386:
- Split the intr_table_lock into an sx lock used for most things,
and a spin lock to protect intrcnt_index.
- Add support for binding interrupts to CPUs, including for the
bus_bind_intr(9) interface, a assign_cpu hook and initially
shuffling interrupts arround in a round-robin fashion.

Reviewed by: jhb
MFC after: 1 month


178048 09-Apr-2008 marius

- Add support for IPI_PREEMPT. [1]
- Add my copyright to mp_machdep.c for having implemented support for
USIII and up and some fixes.

Obtained from: sun4v (modulo style(9) bugs) [1]


177661 27-Mar-2008 jb

When building a kernel module, define MAXCPU the same as SMP so
that modules work with and without SMP.


177642 26-Mar-2008 phk

The "free-lance" timer in the i8254 is only used for the speaker
these days, so de-generalize the acquire_timer/release_timer api
to just deal with speakers.

The new (optional) MD functions are:
timer_spkr_acquire()
timer_spkr_release()
and
timer_spkr_setfreq()

the last of which configures the timer to generate a tone of a given
frequency, in Hz instead of 1/1193182th of seconds.

Drop entirely timer2 on pc98, it is not used anywhere at all.

Move sysbeep() to kern/tty_cons.c and use the timer_spkr*() if
they exist, and do nothing otherwise.

Remove prototypes and empty acquire-/release-timer() and sysbeep()
functions from the non-beeping archs.

This eliminate the need for the speaker driver to know about
i8254frequency at all. In theory this makes the speaker driver MI,
contingent on the timer_spkr_*() functions existing but the driver
does not know this yet and still attaches to the ISA bus.

Syscons is more tricky, in one function, sc_tone(), it knows the hz
and things are just fine.

In the other function, sc_bell() it seems to get the period from
the KDMKTONE ioctl in terms if 1/1193182th second, so we hardcode
the 1193182 and leave it at that. It's probably not important.

Change a few other sysbeep() uses which obviously knew that the
argument was in terms of i8254 frequency, and leave alone those
that look like people thought sysbeep() took frequency in hertz.

This eliminates the knowledge of i8254_freq from all but the actual
clock.c code and the prof_machdep.c on amd64 and i386, where I think
it would be smart to ask for help from the timecounters anyway [TBD].


177565 24-Mar-2008 marius

- Const'ify the bus_stream_asi and bus_type_asi arrays.
- Replace hard-coded functions names missed in bus_machdep.c rev. 1.44
with __func__.
- Break some long lines.

MFC after: 1 month


177373 19-Mar-2008 pjd

Oops. Use atomic_add_long() for atomic_fetchadd_long() (not atomic_add_int())
for sparc64 and sun4v.

Noticed by: marius


177276 16-Mar-2008 pjd

Implement atomic_fetchadd_long() for all architectures and document it.

Reviewed by: attilio, jhb, jeff, kris (as a part of the uidinfo_waitfree.patch)


176994 09-Mar-2008 marius

- Do as the comment in pmap_bootstrap() suggests and flush all non-locked
TLB entries possibly left over by the firmware and also do so while
bootstrapping APs.
- Use __FBSDID.

MFC after: 1 month


176197 11-Feb-2008 marius

The Sun disk label only uses 16-bit fields for cylinders, heads and
sectors so the geometry of large IDE disks has to be adjusted. This
corresponds to what the OpenSolaris dad(7D) driver does except that
the latter only tweaks sectors and effectively limits the mediasize
to 128GB so the cylinders and heads fields won't ever overflow. Not
limiting the mediasize is a compromise between allowing to use Sun
disk label as far as possible and being able to use the entire disk
with another disk label.
This allows to use the full capacity of large IDE disks if they were
not labeled under (Open)Solaris (in both ways of the meaning).

MFC after: 2 weeks


174938 27-Dec-2007 alc

Add configuration knobs for the superpage reservation system. Initially,
the reservation will only be enabled on amd64.


174405 07-Dec-2007 jkoshy

Add stubs to unbreak LINT.


174195 02-Dec-2007 rwatson

Break out stack(9) from ddb(4):

- Introduce per-architecture stack_machdep.c to hold stack_save(9).
- Introduce per-architecture machine/stack.h to capture any common
definitions required between db_trace.c and stack_machdep.c.
- Add new kernel option "options STACK"; we will build in stack(9) if it is
defined, or also if "options DDB" is defined to provide compatibility
with existing users of stack(9).

Add new stack_save_td(9) function, which allows the capture of a stacktrace
of another thread rather than the current thread, which the existing
stack_save(9) was limited to. It requires that the thread be neither
swapped out nor running, which is the responsibility of the consumer to
enforce.

Update stack(9) man page.

Build tested: amd64, arm, i386, ia64, powerpc, sparc64, sun4v
Runtime tested: amd64 (rwatson), arm (cognet), i386 (rwatson)


172317 25-Sep-2007 alc

Change the management of cached pages (PQ_CACHE) in two fundamental
ways:

(1) Cached pages are no longer kept in the object's resident page
splay tree and memq. Instead, they are kept in a separate per-object
splay tree of cached pages. However, access to this new per-object
splay tree is synchronized by the _free_ page queues lock, not to be
confused with the heavily contended page queues lock. Consequently, a
cached page can be reclaimed by vm_page_alloc(9) without acquiring the
object's lock or the page queues lock.

This solves a problem independently reported by tegge@ and Isilon.
Specifically, they observed the page daemon consuming a great deal of
CPU time because of pages bouncing back and forth between the cache
queue (PQ_CACHE) and the inactive queue (PQ_INACTIVE). The source of
this problem turned out to be a deadlock avoidance strategy employed
when selecting a cached page to reclaim in vm_page_select_cache().
However, the root cause was really that reclaiming a cached page
required the acquisition of an object lock while the page queues lock
was already held. Thus, this change addresses the problem at its
root, by eliminating the need to acquire the object's lock.

Moreover, keeping cached pages in the object's primary splay tree and
memq was, in effect, optimizing for the uncommon case. Cached pages
are reclaimed far, far more often than they are reactivated. Instead,
this change makes reclamation cheaper, especially in terms of
synchronization overhead, and reactivation more expensive, because
reactivated pages will have to be reentered into the object's primary
splay tree and memq.

(2) Cached pages are now stored alongside free pages in the physical
memory allocator's buddy queues, increasing the likelihood that large
allocations of contiguous physical memory (i.e., superpages) will
succeed.

Finally, as a result of this change long-standing restrictions on when
and where a cached page can be reclaimed and returned by
vm_page_alloc(9) are eliminated. Specifically, calls to
vm_page_alloc(9) specifying VM_ALLOC_INTERRUPT can now reclaim and
return a formerly cached page. Consequently, a call to malloc(9)
specifying M_NOWAIT is less likely to fail.

Discussed with: many over the course of the summer, including jeff@,
Justin Husted @ Isilon, peter@, tegge@
Tested by: an earlier version by kris@
Approved by: re (kensmith)


172066 06-Sep-2007 marius

o Revamp the sparc64 interrupt code in order to be able to interface
with the INTR_FILTER-enabled MI code. Basically this consists of
registering an interrupt controller (of which there can be multiple
and optionally different ones either per host-to-foo bridge or shared
amongst host-to-foo bridges in any one machine) along with an interrupt
vector as specific argument for all the interrupt vectors used by a
given host-to-foo bridge (roughly similar to registering interrupt
sources on amd64 and i386), providing functions to enable, clear and
disable the interrupts of the children beneath the bridge.
This also includes:
- No longer entering a critical section in tl0_intr() and tl1_intr()
for executing interrupt handlers but rather let the handlers enter
it themselves so in the case of intr_event_handle() we don't enter
a nested critical section.
- Adding infrastructure for binding delivery of interrupt vectors to
specific CPUs which later on can be interfaced with the code from
amd64/i386 for binding interrupts to specific CPUs.
- Getting rid of the wrapper hack introduced along the lines of the
API changes for INTR_FILTER which as a side-effect caused interrupts
associated with ithread handlers only to get the elevated priority
of those associated with filters ("fast handlers") (this removes the
hack also in the non-INTR_FILTER case).
- Disabling (by not clearing) an interrupt in the interrupt controller
until all associated handlers have been executed, which is crucial
for the typical locking strategy of NIC drivers in order to work
correctly in case of shared interrupts. This was a more or less
theoretical problem on sparc64 though, as shared interrupts are
rather uncommon there except for the on-board SCCs and UARTs.
Note that due to the behavior of at least of some of the interrupt
controllers used on sparc64 an enable+EOI instead of a disable+EOI
approach (as implied by the INTR_FILTER MI code and implemented on
other architectures) is used as the latter can cause lost interrupts
or in the worst case interrupt starvation.
o Correct a typo in sbus_alloc_resource() which caused (pass-through)
allocations to only work down to the grandchildren of the bus, which
wasn't a real problem so far as we don't support any devices which are
great-grandchildren or greater of a U2S bridge, yet.
o In fhc(4) use bus_{read,write}_4() instead of bus_space_{read,write}_4()
in order to get rid of sc_bh and sc_bt in the fhc_softc. Also get rid
of some other unneeded members in fhc_softc.

Reviewed by: marcel (earlier version)
Approved by: re (kensmith)


172064 06-Sep-2007 marius

Style(9) fix - use #define<tab> consistently.

Approved by: re (kensmith)


171730 05-Aug-2007 marius

- Divorce the IOTSBs, which so far where handled via a global list
instead of per IOMMU, so we no longer need to program all of them
identically in systems having multiple IOMMUs. This continues the
rototilling of the nexus(4) done about 5 months ago, which amongst
others changed nexus(4) and the drivers for host-to-foo bridges
to provide bus_get_dma_tag methods, allowing to handle DMA tags in
a hierarchical way and to link them with devices.
This still doesn't move the silicon bug workarounds for Sabre (and
in the uncommitted schizo(4) for Tomatillo) bridges into special
bus_dma_tag_create() and bus_dmamap_sync() methods though, as w/o
fully newbus'ified bus_dma_tag_create() and bus_dma_tag_destroy()
this still requires too much hackery, i.e. per-child parent DMA
tags in the parent driver.
- Let the host-to-foo drivers supply the maximum physical address
of the IOMMU accompanying the bridges. Previously iommu(4) hard-
coded an upper limit of 16GB, which actually only applies to the
IOMMUs of the Hummingbird and Sabre bridges. The Psycho variants
as well as the U2S in fact can can translate to up to 2TB, i.e.
translate to 41-bit physical addresses. According to the recently
available Tomatillo documentation these bridges even translate to
43-bit physical addresses and hints at the Schizo bridges doing
43 bits as well.
This fixes the issue the FreeBSD 6.0 todo list item "Max RAM on
sparc64" was refering to and pretty much obsoletes the lack of
support for bounce buffers on sparc64.

Thanks to Nathan Whitehorn for pointing me at the Tomatillo manual.

Approved by: re (kensmith)


170846 16-Jun-2007 marius

- Add support for sending IPIs with USIII and greater sun4u CPUs.
These CPUs use an enhanced layout of the interrupt vector dispatch
and dispatch status registers in order to allow sending IPIs to
multiple targets simultaneously. Thus support for these CPUs was
put in a newly added cheetah_ipi_selected(). This is intended to
be pointed to by cpu_ipi_selected, which now is a function pointer,
in order to avoid cpu_impl checks once booted. Alternatively it
can point to spitfire_ipi_selected(), which was renamed from
cpu_ipi_selected(). Consequently cpu_ipi_send() was also renamed
to spitfire_ipi_send() (there's no need for a cheetah equivalent
of this so far). Initialization of the cpu_ipi_selected pointer
and other requirements is done in mp_init(), which was renamed
from mp_tramp_alloc(), as cpu_mp_start() isn't called on UP
systems while cpu_ipi_selected() is. As a side-effect this allows
to make mp_tramp static to sys/sparc64/sparc64/mp_machdep.c.
For the sake of avoiding #ifdef SMP and for keeping the history in
place cheetah_ipi_selected() and spitfire_ipi_{selected,send}()
where not put into/moved to sys/sparc64/sparc64/{cheetah,spitfire}.c
- Add some CTASSERTs and KASSERTs ensuring that MAXCPU doesn't
exceed the data types we use to store the CPU bit fields or the
number of USIII and greater CPUs supported by the current
cheetah_ipi_selected() implementation (which for JBus-CPUs is
only 4; that should be fine though as according to OpenSolaris
there are no sun4u machines with more than 4 JBus-CPUs).
- In cpu_mp_start() don't enumerate and start more than MAXCPU CPUs
as we can't handle more than that.
- In cpu_mp_start() check for upa-portid vs. portid depending on
cpu_impl for consistency with nexus(4).
- In spitfire_ipi_selected() add KASSERTs ensuring that a CPU isn't
told to IPI itself as sun4u CPUs just can't do that.
- In spitfire_ipi_send() do a MEMBAR #Sync after writing the
interrupt vector data as we want to make sure the payload was
actually written before we trigger the dispatch.
- In spitfire_ipi_send() also verify IDR_BUSY when checking whether
the dispatch was successful as it has to be cleared for this to
be the case.
- Remove some redundant variables.


170473 09-Jun-2007 marcel

Add kdb_cpu_sync_icache(), intended to synchronize instruction
caches with data caches after writing to memory. This typically
is required to make breakpoints work on ia64 and powerpc. For
those architectures the function is implemented.


170291 04-Jun-2007 attilio

Rework the PCPU_* (MD) interface:
- Rename PCPU_LAZY_INC into PCPU_INC
- Add the PCPU_ADD interface which just does an add on the pcpu member
given a specific value.

Note that for most architectures PCPU_INC and PCPU_ADD are not safe.
This is a point that needs some discussions/work in the next days.

Reviewed by: alc, bde
Approved by: jeff (mentor)


170262 04-Jun-2007 alc

Add the machine-specific definitions for configuring the new physical
memory allocator.

Approved by: re


169796 20-May-2007 marius

- Staticize cpu_ipi_send() and cpu_mp_unleash() as these aren't
referenced outside of mp_machdep.c
- Replace a magic 14 with the newly added IDC_ITID_SHIFT macro.
- Remove the global mp_boot_mid variable as it's not really necessary
and just replacing it with PCPU_GET(mid) doesn't have any impact on
performance once booted.
- Replace PCPU_GET(cpuid) with the curcpu shortcut.
- Replace hardcoded function names in panic strings etc with __func__
so they don't need to be updated when renaming the function.
- Use register_t instead of u_long for variables used to hold the
return value of intr_disable() so we don't need to apply any
knowledge about the actual width of that value here.
- Improve the wording of some comments.
- Fix several style(9) bugs.


169795 20-May-2007 marius

- Also identify USIIIi+, USIV and USIV+ CPUs.
- Use __FBSDID in identcpu.c.
- Remove #ifndef SUN4V around global cpu_impl variable; it doesn't
hurt on sun4v for now and once setPQL2() is gone sun4v can stop
sharing identcpu.c with sparc64, making the reminder of this file
also sparc64-only again. [1]

Submitted by: kmacy [1]


169730 19-May-2007 kan

Include machine/pcb.hto turn extern struct pcb stoppcbs[]; construct
into the valid C.


169488 11-May-2007 marius

- Add bits for userland profiling. For sun4u this is compile-tested only.
- Replace magic 14 with PIL_TICK.


169291 05-May-2007 alc

Define every architecture as either VM_PHYSSEG_DENSE or
VM_PHYSSEG_SPARSE depending on whether the physical address space is
densely or sparsely populated with memory. The effect of this
definition is to determine which of two implementations of
vm_page_array and PHYS_TO_VM_PAGE() is used. The legacy
implementation is obtained by defining VM_PHYSSEG_DENSE, and a new
implementation that trades off time for space is obtained by defining
VM_PHYSSEG_SPARSE. For now, all architectures except for ia64 and
sparc64 define VM_PHYSSEG_DENSE. Defining VM_PHYSSEG_SPARSE on ia64
allows the entirety of my Itanium 2's memory to be used. Previously,
only the first 1 GB could be used. Defining VM_PHYSSEG_SPARSE on
sparc64 allows USIIIi-based systems to boot without crashing.

This change is a combination of Nathan Whitehorn's patch and my own
work in perforce.

Discussed with: kmacy, marius, Nathan Whitehorn
PR: 112194


168920 21-Apr-2007 sepotvin

Add support for specifying a minimal size for vm.kmem_size in the loader via
vm.kmem_size_min. Useful when using ZFS to make sure that vm.kmem size will
be at least 256mb (for example) without forcing a particular value via vm.kmem_size.

Approved by: njl (mentor)
Reviewed by: alc


167429 11-Mar-2007 alc

Push down the implementation of PCPU_LAZY_INC() into the machine-dependent
header file. Reimplement PCPU_LAZY_INC() on amd64 and i386 making it
atomic with respect to interrupts.

Reviewed by: bde, jhb


167308 07-Mar-2007 marius

Rototill the sparc64 nexus(4) (actually this brings in the code the
sun4v nexus(4) in turn is based on):
o Change nexus(4) to manage the resources of its children so the
respective device drivers don't need to figure them out of OFW
themselves.
o Change nexus(4) to provide the ofw_bus KOBJ interface instead of
using IVARs for supplying the OFW node and the subset of standard
properties of its children. Together with the previous change this
also allows to fully take advantage of newbus in that drivers like
fhc(4), which attach on multiple parent busses, no longer require
different bus front-ends as obtaining the OFW node and properties
as well as resource allocation works the same for all supported
busses. As such this change also is part 4/4 of allowing creator(4)
to work in USIII-based machines as it allows this driver to attach
on both nexus(4) and upa(4). On the other hand removing these IVARs
breaks API compatibility with the powerpc nexus(4) but which isn't
that bad as a) sparc64 currently doesn't share any device driver
hanging off of nexus(4) with powerpc and b) they were no longer
compatible regarding OFW-related extensions at the pci(4) level
since quite some time.
o Provide bus_get_dma_tag methods in nexus(4) and its children in
order to handle DMA tags in a hierarchical way and get rid of the
sparc64_root_dma_tag kludge. Together with the previous two items
this changes also allows to completely get rid of the nexus(4)
IVAR interface. It also includes:
- pushing the constraints previously specified by the nexus_dmatag
down into the DMA tags of psycho(4) and sbus(4) as it's their
IOMMUs which induce these restrictions (and nothing at the
nexus(4) or anything that would warrant specifying them there),
- fixing some obviously wrong constraints of the psycho(4) and
sbus(4) DMA tags, which happened to not actually be used with
the sparc64_root_dma_tag kludge in place and therefore didn't
cause problems so far,
- replacing magic constants for constraints with macros as far
as it is obvious as to where they come from.
This doesn't include taking advantage of the newbus way to get
the parent DMA tags implemented by this change in order to divorce
the IOTSBs of the PCI and SBus IOMMUs or for implementing the
workaround for the DMA sync bug in Sabre (and Tomatillo) bridges,
yet, though.
o Get rid of the notion that nexus(4) (mostly) reflects an UPA bus
by replacing ofw_upa.h and with ofw_nexus.h (which was repo-copied
from ofw_upa.h) and renaming its content, which actually applies to
all of Fireplane/Safari, JBus and UPA (in the host bus case), as
appropriate.
o Just use M_DEVBUF instead of a separate M_NEXUS malloc type for
allocating the device info for the children of nexus(4). This is
done in order to not need to export M_NEXUS when deriving drivers
for subordinate busses from the nexus(4) class.
o Use the DEFINE_CLASS_0() macro to declare the nexus(4) driver so
we can derive subclasses from it.
o Const'ify the nexus_excl_name and nexus_excl_type arrays as well
as add 'associations' and 'rsc', which are pseudo-devices without
resources and therefore of no real interest for nexus(4), to the
former.
o Let the nexus(4) device memory rman manage the entire 64-bit address
space instead of just the UPA_MEMSTART to UPA_MEMEND subregion as
Fireplane/Safari- and JBus-based machines use multiple ranges,
which can't be as easily divided as in the case of UPA (limiting
the address space only served for sanity checking anyway).
o Use M_WAITOK instead of M_NOWAIT when allocating the device info
for children of nexus(4) in order to give one less opportunity
for adding devices to nexus(4) to fail.
o While adapting the drivers affected by the above nexus(4) changes,
change them to take advantage of rman_get_rid() instead of caching
the RIDs assigned to allocated resources, now that the RIDs of
resources are correctly set.
o In iommu(4) and nexus(4) replace hard-coded functions names, which
actually became outdated in several places, in panic strings and
status massages with __func__. [1]
o Use driver_filter_t in prototypes where appropriate.
o Add my copyright to creator(4), fhc(4), nexus(4), psycho(4) and
sbus(4) as I changed considerable amounts of these drivers as well
as added a bunch of new features, workarounds for silicon bugs etc.
o Fix some white space nits.

Due to lack of access to Exx00 hardware, these changes, i.e. central(4)
and fhc(4), couldn't be runtime tested on such a machine. Exx00 are
currently reported to panic before trying to attach nexus(4) anyway
though.

PR: 76052 [1]
Approved by: re (kensmith)


166901 23-Feb-2007 piso

o break newbus api: add a new argument of type driver_filter_t to
bus_setup_intr()

o add an int return code to all fast handlers

o retire INTR_FAST/IH_FAST

For more info: http://docs.freebsd.org/cgi/getmsg.cgi?fetch=465712+0+current/freebsd-current

Reviewed by: many
Approved by: re@


166105 19-Jan-2007 marius

Convert the remainder of the low hanging fruits regarding including
headers in .S directly rather than getting to their macros through
genassym.c/assym.s so there are less headers genassym.c has to be
kept in sync with.
While at it fix some stytle(9) bugs (indentation, prototype format,
sort headers, etc) and remove trailing whitespace.


166096 18-Jan-2007 marius

- Rename UPA_BUS_SPACE to NEXUS_BUS_SPACE; besides an UPA bus, nexus(4)
may also reflect a Fireplane/Safari or JBus bus (or a virtual bus which
in turn reflects a JBus bus or something like that...).
- In the both the sparc64 and sun4v bus_machdep.c use __FBSDID.
- Spell SBus the official way in comments.
- Replace hardcoded function names (all of which were actually outdated)
in panic and status strings with __func__.
- Fix whitespace nits.


166092 18-Jan-2007 marius

Remove the compat shims for the ISA old-stlye in{b,w,l}()/out{b,w,l}()
and friends along with all hacks required to implement them. None of
the drivers currently built (as part of GENERIC, LINT or modules) on
sparc64 or sun4v and none of those we might want to use there in
future uses them, AFAICT there actually never was a driver hooked up
to the sparc64 or sun4v build that correctly used these functions
(and it looks like that due to a bug read{b,w,l}()/write{b,w,l}() and
the other functions working on a memory handle never actually worked on
sun4v). All they ever were good for on sparc64 and sun4v was erroneously
dragging in dependencies on isa(4) in drivers like f.e. dpt(4), si(4)
and syscons(4) in source files that supposedly were bus-neutral and
hiding issues with drivers like f.e. ng_bt3c(4) that used these
functions with busses other than isa(4) and therefore couldn't work on
these platforms.


165967 12-Jan-2007 imp

Remove 3rd clause, renumber, ok per email


165324 18-Dec-2006 kmacy

add new large page sizes for use by shared loader


163808 30-Oct-2006 marius

In the replacement text of the __bswapN_const() macros encapsulate the
argument in parentheses so these macros are safe to use and invocations
with an expression as the argument like __bswap32_const(42 << 23 | 13)
work as expected. Additionally, mask all the individually shifted bytes
as appropriate so the bytes which exceed the width of the respective
__bswapN_const() macro in invocations like __bswap16_const(0xdead600d)
are ignored like it's the case with the corresponding __bswapN_var()
function.

MFC after: 3 days


163151 09-Oct-2006 kmacy

unbreak sparc64 loader build
re-add accidentally deleted asi value
remove sun4v only header include

Approved by: rwatson (mentor)
Reviewed by: jmg


163146 09-Oct-2006 kmacy

kernel clean up to make the sun4v kernel build

Reviewed by: jmg
Approved by: rwatson (mentor)


163016 04-Oct-2006 jb

PR:
Submitted by:
Reviewed by:
Approved by:
Obtained from:
MFC after:
Security:
Move the relocation definitions to the common elf header so that DTrace
can use them on one architecture targeted to a different one.

Add the additional ELF types defines in Sun's "Linker and Libraries"
manual.


162954 02-Oct-2006 phk

First part of a little cleanup in the calendar/timezone/RTC handling.

Move relevant variables to <sys/clock.h> and fix #includes as necessary.

Use libkern's much more time- & spamce-efficient BCD routines.


162487 21-Sep-2006 kan

Use __builtin_va_start instead of __builtin_stdarg_start. GCC4 obsoletes
the former and __builtin_va_start was present in all GCC version 3.1 and
later.


160525 20-Jul-2006 alc

Add pmap_clear_write() to the interface between the virtual memory
system's machine-dependent and machine-independent layers. Once
pmap_clear_write() is implemented on all of our supported
architectures, I intend to replace all calls to pmap_page_protect() by
calls to pmap_clear_write(). Why? Both the use and implementation of
pmap_page_protect() in our virtual memory system has subtle errors,
specifically, the management of execute permission is broken on some
architectures. The "prot" argument to pmap_page_protect() should
behave differently from the "prot" argument to other pmap functions.
Instead of meaning, "give the specified access rights to all of the
physical page's mappings," it means "don't take away the specified
access rights from all of the physical page's mappings, but do take
away the ones that aren't specified." However, owing to our i386
legacy, i.e., no support for no-execute rights, all but one invocation
of pmap_page_protect() specifies VM_PROT_READ only, when the intent
is, in fact, to remove only write permission. Consequently, a
faithful implementation of pmap_page_protect(), e.g., ia64, would
remove execute permission as well as write permission. On the other
hand, some architectures that support execute permission have
basically ignored whether or not VM_PROT_EXECUTE is passed to
pmap_page_protect(), e.g., amd64 and sparc64. This change represents
the first step in replacing pmap_page_protect() by the less subtle
pmap_clear_write() that is already implemented on amd64, i386, and
sparc64.

Discussed with: grehan@ and marcel@


159583 13-Jun-2006 marius

- Complete breaking out the definition of bus_space_{tag,handle}_t by
moving the typedef of bus_space_tag_t from sys/sparc64/include/bus.h
to sys/sparc64/include/_bus.h. This brings sparc64 in sync with the
other platforms and fixes the compilation of drivers which include
<sys/rman.h> before <machine/bus.h> after sys/sys/rman.h rev. 1.34.
- Remove the definition of bus_type_t from sys/sparc64/include/_bus.h
as it's unused since sys/sparc64/include/bus.h rev. 1.6 and
sys/sparc64/sparc64/bus_machdep.c rev. 1.3.
- Remove some pointless comments.


159031 29-May-2006 alc

MFalpha/amd64/arm/ia64
Retire pmap_track_modified(). We no longer need it because we do not
create managed mappings within the clean submap. To prevent regressions,
add assertions blocking the creation of managed mappings within the clean
submap.


158445 11-May-2006 phk

Clean out sysctl machdep.* related defines.

The cmos clock related stuff should really be in MI code.


157448 03-Apr-2006 marcel

Eliminate HAVE_STOPPEDPCBS. On ia64 the PCPU holds a pointer to the
PCB in which the context of stopped CPUs is stored. To access this
PCB from KDB, we introduce a new define, called KDB_STOPPEDPCB. The
definition, when present, lives in <machine/kdb.h> and abstracts
where MD code saves the context. Define KDB_STOPPEDPCB on i386,
amd64, alpha and sparc64 in accordance to previous code.


157239 29-Mar-2006 marius

Add convenience macros for the bits in ASI_ESTATE_ERROR_EN_REG (used
for ECC handling) and the additional uses of the ASIs 0x77 and 0x7f
as well as their bits (used for a CPU bug workaround).

MFC after: 3 days


157224 28-Mar-2006 marius

Sync with the other archs and declare the memory location referenced by
the address argument of the bus_space_write_multi_*() familiy as const.

Prodded by: damien


154419 16-Jan-2006 kris

Correct typos (s/OFERFLOW/OVERFLOW/).

Reviewed by: jhb


154256 12-Jan-2006 marius

- The inline asm in this file uses output operands before all input
operands are consumed so use the appropriate constraint modifier.
Before this change GCC used one register for both an input and an
unrelated output operand of in_addword(), causing the input to be
overwritten before it was consumed and thus breaking in_addword().
For in_cksum_hdr() and in_pseudo() this change is more or less
cosmetic.
- Fix a misspelling in a nearby comment.

Reported & tested by: yongari
MFC after: 1 week


153666 22-Dec-2005 jhb

Tweak how the MD code calls the fooclock() methods some. Instead of
passing a pointer to an opaque clockframe structure and requiring the
MD code to supply CLKF_FOO() macros to extract needed values out of the
opaque structure, just pass the needed values directly. In practice this
means passing the pair (usermode, pc) to hardclock() and profclock() and
passing the boolean (usermode) to hardclock_cpu() and hardclock_process().
Other details:
- Axe clockframe and CLKF_FOO() macros on all architectures. Basically,
all the archs were taking a trapframe and converting it into a clockframe
one way or another. Now they can just extract the PC and usermode values
directly out of the trapframe and pass it to fooclock().
- Renamed hardclock_process() to hardclock_cpu() as the latter is more
accurate.
- On Alpha, we now run profclock() at hz (profhz == hz) rather than at
the slower stathz.
- On Alpha, for the TurboLaser machines that don't have an 8254
timecounter, call hardclock() directly. This removes an extra
conditional check from every clock interrupt on Alpha on the BSP.
There is probably room for even further pruning here by changing Alpha
to use the simplified timecounter we use on x86 with the lapic timer
since we don't get interrupts from the 8254 on Alpha anyway.
- On x86, clkintr() shouldn't ever be called now unless using_lapic_timer
is false, so add a KASSERT() to that affect and remove a condition
to slightly optimize the non-lapic case.
- Change prototypeof arm_handler_execute() so that it's first arg is a
trapframe pointer rather than a void pointer for clarity.
- Use KCOUNT macro in profclock() to lookup the kernel profiling bucket.

Tested on: alpha, amd64, arm, i386, ia64, sparc64
Reviewed by: bde (mostly)


153194 07-Dec-2005 obrien

style(9) nits


153193 07-Dec-2005 obrien

Add Sparc TLS relocation definitions.


153179 06-Dec-2005 jhb

- Cleanup whitespace and extra ()s in vtophys() macros.
- Move vtophys() macros next to vtopte() where vtopte() exists to match
comments above vtopte().
- Remove references to the alternate address space in the comment above
vtopte(). amd64 never had the alternate address space, and i386 lost it
prior to PAE support being added.
- s/entires/entries/ in comments.

Reviewed by: alc


153175 06-Dec-2005 marius

Use <sys/ktr.h> directly in .S files instead of exporting the
KTR_* class macros via genassym.c. Together with sys/sys/ktr.h
rev. 1.34 this has the desired side-effect of providing a default
value for KTR_COMPILE. Thus this fixes warnings from -Wundef
regarding KTR_COMPILE not being defined for .S files.

Requested by: ru
Reviewed by: ru


153168 06-Dec-2005 ru

Drop _MACHINE_ARCH and _MACHINE defines (not to be confused with
MACHINE_ARCH and MACHINE). Their purpose was to be able to test
in cpp(1), but cpp(1) only understands integer type expressions.
Using such unsupported expressions introduced a number of subtle
bugs, which were discovered by compiling with -Wundef.


153061 03-Dec-2005 marius

- Move the declaration of struct upa_ranges and the UPA_RANGE_* macros
from sys/sparc64/include/ofw_upa.h to sys/sparc64/pci/ofw_pci.h and
rename them to struct ofw_pci_ranges and OFW_PCI_RANGE_* respectively.
This ranges struct only applies to host-PCI bridges but no to other
bridges found on UPA. At the same time it applies to all host-PCI
bridges regardless of whether the interconnection bus is Fireplane/
Safari, JBus or UPA.
- While here rename the PCI_CS_* macros in sys/sparc64/pci/ofw_pci.h
to OFW_PCI_CS_* in order to be consistent and change this header to
use uintXX_t instead of u_intXX_t.


152022 03-Nov-2005 jhb

Add stoppcbs[] arrays on Alpha and sparc64 and have each CPU save its
current context in the IPI_STOP handler so that we can get accurate stack
traces of threads on other CPUs on these two archs like we do now on i386
and amd64.

Tested on: alpha, sparc64


151658 25-Oct-2005 jhb

Reorganize the interrupt handling code a bit to make a few things cleaner
and increase flexibility to allow various different approaches to be tried
in the future.
- Split struct ithd up into two pieces. struct intr_event holds the list
of interrupt handlers associated with interrupt sources.
struct intr_thread contains the data relative to an interrupt thread.
Currently we still provide a 1:1 relationship of events to threads
with the exception that events only have an associated thread if there
is at least one threaded interrupt handler attached to the event. This
means that on x86 we no longer have 4 bazillion interrupt threads with
no handlers. It also means that interrupt events with only INTR_FAST
handlers no longer have an associated thread either.
- Renamed struct intrhand to struct intr_handler to follow the struct
intr_foo naming convention. This did require renaming the powerpc
MD struct intr_handler to struct ppc_intr_handler.
- INTR_FAST no longer implies INTR_EXCL on all architectures except for
powerpc. This means that multiple INTR_FAST handlers can attach to the
same interrupt and that INTR_FAST and non-INTR_FAST handlers can attach
to the same interrupt. Sharing INTR_FAST handlers may not always be
desirable, but having sio(4) and uhci(4) fight over an IRQ isn't fun
either. Drivers can always still use INTR_EXCL to ask for an interrupt
exclusively. The way this sharing works is that when an interrupt
comes in, all the INTR_FAST handlers are executed first, and if any
threaded handlers exist, the interrupt thread is scheduled afterwards.
This type of layout also makes it possible to investigate using interrupt
filters ala OS X where the filter determines whether or not its companion
threaded handler should run.
- Aside from the INTR_FAST changes above, the impact on MD interrupt code
is mostly just 's/ithread/intr_event/'.
- A new MI ddb command 'show intrs' walks the list of interrupt events
dumping their state. It also has a '/v' verbose switch which dumps
info about all of the handlers attached to each event.
- We currently don't destroy an interrupt thread when the last threaded
handler is removed because it would suck for things like ppbus(8)'s
braindead behavior. The code is present, though, it is just under
#if 0 for now.
- Move the code to actually execute the threaded handlers for an interrrupt
event into a separate function so that ithread_loop() becomes more
readable. Previously this code was all in the middle of ithread_loop()
and indented halfway across the screen.
- Made struct intr_thread private to kern_intr.c and replaced td_ithd
with a thread private flag TDP_ITHREAD.
- In statclock, check curthread against idlethread directly rather than
curthread's proc against idlethread's proc. (Not really related to intr
changes)

Tested on: alpha, amd64, i386, sparc64
Tested on: arm, ia64 (older version of patch by cognet and marcel)


151344 14-Oct-2005 kris

Add a default value for VM_BCACHE_SIZE_MAX of 400MB. This is copied from
amd64, and is a factor of 3 less than the value previously auto-sized on
a 12GB machine, which would cause an overflow in calculations involving the
maxbcache int, causing bufinit() to loop forever at boot.

Reviewed by: mlaier, peter


150627 27-Sep-2005 jhb

Add a new atomic_fetchadd() primitive that atomically adds a value to a
variable and returns the previous value of the variable.

Tested on: i386, alpha, sparc64, arm (cognet)
Reviewed by: arch@
Submitted by: cognet (arm)
MFC after: 1 week


149337 20-Aug-2005 stefanf

Move MINSIGSTKSZ from <machine/signal.h> to <machine/_limits.h> and rename
it to __MINSIGSTKSZ. Define MINSIGSTKSZ in <sys/signal.h>.

This is done in order to use MINSIGSTKSZ for the macro PTHREAD_STACK_MIN
in <pthread.h> (soon <limits.h>) without having to include the whole
<sys/signal.h> header.

Discussed with: bde


149327 20-Aug-2005 stefanf

Remove a stale occurrence of 'alpha' in a comment.


148453 27-Jul-2005 jhb

Add extra constraints to tell the compiler that the memory be modified
in the arm __swp() and sparc64 casa() and casax() functions is actually
being used as an input and output and not just the value of the register
that points to the memory location. This was the underlying source of
the mbuf refcount problems on sparc64 a while back. For arm this should be
a nop because __swp() has a constraint to clobber all memory which can
probably be removed now.

Reviewed by: alc, cognet
MFC after: 1 week


148067 15-Jul-2005 jhb

Convert the atomic_ptr() operations over to operating on uintptr_t
variables rather than void * variables. This makes it easier and simpler
to get asm constraints and volatile keywords correct.

MFC after: 3 days
Tested on: i386, alpha, sparc64
Compiled on: ia64, powerpc, amd64
Kernel toolchain busted on: arm


147191 09-Jun-2005 jkoshy

MFP4:

- Implement sampling modes and logging support in hwpmc(4).

- Separate MI and MD parts of hwpmc(4) and allow sharing of
PMC implementations across different architectures.
Add support for P4 (EMT64) style PMCs to the amd64 code.

- New pmcstat(8) options: -E (exit time counts) -W (counts
every context switch), -R (print log file).

- pmc(3) API changes, improve our ability to keep ABI compatibility
in the future. Add more 'alias' names for commonly used events.

- bug fixes & documentation.


146734 29-May-2005 nyan

Remove bus_{mem,p}io.h and related code for a micro-optimization on i386
and amd64. The optimization is a trivial on recent machines.

Reviewed by: -arch (imp, marcel, dfr)


146411 19-May-2005 marius

- Collapse eeprom_ebus.c and eeprom_sbus.c into eeprom.c and
eeprom_ebus_attach() and eeprom_sbus_attach() into eeprom_attach()
respectively. Since the introduction of the ofw_bus interface some
time ago and now that ebus(4) also uses SYS_RES_MEMORY for the
memory resources since ebus.c rev. 1.22 there is no longer a
need to have separate front-ends for ebus(4), fhc(4) and sbus(4).
- Fail gracefully instead of panicing when the model can't be
determined.
- Don't leak resources when mk48txx_attach() fails.
- Use FBSDID.


145332 20-Apr-2005 marcel

Add empty header (except of the multiple-inclusion protection) to
get hwpmc(4) to compile on this platform.


145253 18-Apr-2005 imp

Break out the definition of bus_space_{tag,handle}_t and a few other types
into _bus.h to help with name space polution from including all of bus.h.
In a few days, I'll commit changes to the MI code to take advantage of thse
sepration (after I've made sure that these changes don't break anything in
the main tree, I've tested in my trees, but you never know...).

Suggested by: bde (in 2002 or 2003 I think)
Reviewed in principle by: jhb


145150 16-Apr-2005 marius

- Add a workaround for a bug in BlackBird CPUs (said to be part of the
SpitFire erratum #54) which can cause writes to the TICK_CMPR register
to fail. This seems to fix the dying clocks problem reported by jhb@
and kris@. [1]
- In tick_start() don't reset the tick counter of the boot processor to
zero. It's initially reset in _start() and afterwards but _before_
tick_start() is called on the BSP the APs synchronise with the tick
counter of the BSP in mp_startup(). Resetting the tick counter of the
BSP in tick_start() probably also was the cause of problems seen when
using the CPU tick counter as timecounter on SMP machines.
Not resetting the tick counter of the BSP in mp_startup() makes the
tick counters and tick interrupts between the BSP and APs be pretty
much in sync as it's supposed to be. This also means there's no longer
a real reason to have separate tick_start() and tick_start_ap() so
merge them and zap tick_start_ap(). This is also a first step in
simplifying the interface to the tick counters in preparation to use
alternate clock hardware where available.
- Switch to the algorithm used on FreeBSD/ia64 for updating the tick
interrupt register and which compensates the clock drift caused by
varying delays between when the tick interrupts actually trigger and
when they are serviced. Not compensating the clock drift mainly hurts
interactive performance especially when using WITNESS. [2]
For further information about the algorithm also see the commit log
of sys/ia64/ia64/interrupt.c rev. 1.38.
On sparc64 the sysctls for monitoring the behaviour of the tick
interrupts are machdep.tick.adjust_edges, machdep.tick.adjust_excess,
machdep.tick.adjust_missed and machdep.tick.adjust_ticks.
- In tick_init() just use tick_stop() for stopping the tick interrupts
until a proper handler is set up later. This also stops the system
tick interrupt on USIII systems earlier.
- In tick_start() check for a rough upper limit of HZ.
- Some minor changes, e.g. use FBSDID, remove unused headers, etc.

Info obtained from: Linux [1]
Ok'ed by: marcel [2]
Additional testing by: kris (earlier version of the workaround), jhb
X-MFC after: 3 days [1]


145149 16-Apr-2005 marius

Fix a style(9) bug in the stxa_sync() macro (DO NOT use function calls
in initializers).


144637 04-Apr-2005 jhb

Divorce critical sections from spinlocks. Critical sections as denoted by
critical_enter() and critical_exit() are now solely a mechanism for
deferring kernel preemptions. They no longer have any affect on
interrupts. This means that standalone critical sections are now very
cheap as they are simply unlocked integer increments and decrements for the
common case.

Spin mutexes now use a separate KPI implemented in MD code: spinlock_enter()
and spinlock_exit(). This KPI is responsible for providing whatever MD
guarantees are needed to ensure that a thread holding a spin lock won't
be preempted by any other code that will try to lock the same lock. For
now all archs continue to block interrupts in a "spinlock section" as they
did formerly in all critical sections. Note that I've also taken this
opportunity to push a few things into MD code rather than MI. For example,
critical_fork_exit() no longer exists. Instead, MD code ensures that new
threads have the correct state when they are created. Also, we no longer
try to fixup the idlethreads for APs in MI code. Instead, each arch sets
the initial curthread and adjusts the state of the idle thread it borrows
in order to perform the initial context switch.

This change is largely a big NOP, but the cleaner separation it provides
will allow for more efficient alternative locking schemes in other parts
of the kernel (bare critical sections rather than per-CPU spin mutexes
for per-CPU data for example).

Reviewed by: grehan, cognet, arch@, others
Tested on: i386, alpha, sparc64, powerpc, arm, possibly more


143598 14-Mar-2005 scottl

Refactor the bus_dma header files so that the interface is described in
sys/bus_dma.h instead of being copied in every single arch. This slightly
reorders a flag that was specific to AXP and thus changes the ABI there.
The interface still relies on bus_space definitions found in <machine/bus.h>
so it cannot be included on its own yet, but that will be fixed at a later
date. Add an MD <machine/bus_dma.h> for ever arch for consistency and to
allow for future MD augmentation of the API. sparc64 makes heavy use of
this right now due to its different bus_dma implemenation.


143190 06-Mar-2005 alc

Declare as volatile the memory location referenced by a pointer rather than
the pointer's value.


143140 04-Mar-2005 joerg

Addendum to netchild's C compiler abstraction mega-patch which somehow
have been forgotten in my previous commit.

Submitted by: netchild


143063 02-Mar-2005 joerg

netchild's mega-patch to isolate compiler dependencies into a central
place.

This moves the dependency on GCC's and other compiler's features into
the central sys/cdefs.h file, while the individual source files can
then refer to #ifdef __COMPILER_FEATURE_FOO where they by now used to
refer to #if __GNUC__ > 3.1415 && __BARC__ <= 42.

By now, GCC and ICC (the Intel compiler) have been actively tested on
IA32 platforms by netchild. Extension to other compilers is supposed
to be possible, of course.

Submitted by: netchild
Reviewed by: various developers on arch@, some time ago


142045 18-Feb-2005 marius

Silence witness warnings about duplicate pmap lock emitted since
rev. 1.145 of sys/sparc64/sparc64/pmap.c.

Submitted by: alc


141753 12-Feb-2005 marius

- Re-write OF_decode_addr() with a bus-neutral approach, adding support
for nodes hanging off of Central (untested), FireHose (untested) and
PCI (tested) busses.
- Add an additional parameter to OF_decode_addr() which specifies the
index of the register bank to decode.

These should allow to eventually add support for the Z8530 hanging off of
FireHose to uart(4) and to write support for PCI-based graphics adapters.

Suggested by: tmm (back in '03)


140982 29-Jan-2005 ru

Hopefully unbreak modules build.


140485 19-Jan-2005 jhb

Add a small API to manage the MD user trap structures. Specifically, we
now use a pool mutex to manage the reference counts. This fixes races
resulting in use-after-free.

Tested by: kris, David Cornejo dave at dogwood dot com
Reported by: bmilekic's MemGuard
MFC after: 1 week


140281 15-Jan-2005 scottl

Add the bus_dmamap_load_mbuf_sg() function to sparc64.


139825 07-Jan-2005 imp

/* -> /*- for license, minor formatting changes


139264 24-Dec-2004 scottl

Identify USIIIi processors.

Submitted by: Gavin Atkinson
PR: 75468


138253 01-Dec-2004 marcel

Change gdb_cpu_setreg() to not take the value to which to set the
specified register, but a pointer to the in-memory representation of
that value. The reason for this is twofold:
1. Not all registers can be represented by a register_t. In particular
FP registers fall in that category. Passing the new register value
by reference instead of by value makes this point moot.
2. When we receive a G or P packet, both are for writing a register,
the packet will have the register value in target-byte order and
in the memory representation (modulo the fact that bytes are sent
as 2 printable hexadecimal numbers of course). We only need to
decode the packet to have a pointer to the register value.

This change fixes the bug of extracting the register value of the P
packet as a hexadecimal number instead of as a bit array. The quick
(and dirty) fix to bswap the register value in gdb_cpu_setreg() as
it has been added on i386 and amd64 can therefore be removed and has
in fact been that.

Tested on: alpha, amd64, i386, ia64, sparc64


137914 20-Nov-2004 das

Remove UAREA_PAGES.

Reviewed by: arch@


137813 17-Nov-2004 marius

o Sync with the NetBSD mk48txx driver (the result simplyfies some changes
I have in mind for the genclock interface):
- Recognize the MK48T18 as well (differs from the MK48T08 only in
packaging options and voltages).
- Allow MD code to provide functions for reading/writing NVRAM/RTC
locations.
If passed NULL, the old behaviour using bus_space_{read,write}_1() is
used. Otherwise, all access to the chip goes via the MD functions.
This is necessary for mvmeppc boards where the mk48txx NVRAM/RTC is
not directly addressable.
- Cleanup MI mk48txx(4) todclock driver:
- Prepare mk48txxvar.h and leave only register definitions in
mk48txxreg.h.
- Define struct mk48txx_softc as usual devices and allocate necessary
members in it.
- Change mk48txx_attach() to only take a device_t.
o While converting the sparc64 eeprom driver to the above changes:
- Remove some dead code and stale comments.
- Use the NVRAM size provided by the mk48txx driver instead of hardcoding
it as suggested by a comment.
- Add a comment about why it doesn't make much sense to read the hostid
directly from the NVRAM except for displaying it when attaching.
- Don't print the hostid if it reads all zero because it's stored
elsewhere.


135943 29-Sep-2004 kensmith

We seem to have occasions where sending an IPI takes significantly
longer than 'normal'. The cause is still being tracked down but
in the meantime there are machines where raising IPI_RETRIES does
help - it's not just a case of the machine staying locked up longer
and then panic-ing anyway. Several helpful folks on sparc64@ tried
a patch that helped figure out what to raise this number to.

Discussed on: sparc64@
MFC after: 3 days


134398 27-Aug-2004 marcel

Move the kernel-specific logic to adjust frompc from MI to MD. For
these two reasons:
1. On ia64 a function pointer does not hold the address of the first
instruction of a functions implementation. It holds the address
of a function descriptor. Hence the user(), btrap(), eintr() and
bintr() prototypes are wrong for getting the actual code address.
2. The logic forces interrupt, trap and exception entry points to
be layed-out contiguously. This can not be achieved on ia64 and is
generally just bad programming.

The MCOUNT_FROMPC_USER macro is used to set the frompc argument to
some kernel address which represents any frompc that falls outside
the kernel text range. The macro can expand to ~0U to bail out in
that case.
The MCOUNT_FROMPC_INTR macro is used to set the frompc argument to
some kernel address to represent a call to a trap or interrupt
handler. This to avoid that the trap or interrupt handler appear to
be called from everywhere in the call graph. The macro can expand
to ~0U to prevent adjusting frompc. Note that the argument is selfpc,
not frompc.

This commit defines the macros on all architectures equivalently to
the original code in sys/libkern/mcount.c. People can take it from
here...

Compile-tested on: alpha, amd64, i386, ia64 and sparc64
Boot-tested on: i386


133862 16-Aug-2004 marius

Instead of "OpenFirmware", "openfirmware", etc. use the official spelling
"Open Firmware" from IEEE 1275 and OpenFirmware.org (no pun intended).

Ok'ed by: tmm


133728 14-Aug-2004 marius

- Make OF_getetheraddr() honour the "local-mac-address?" system config
variable. If set to "true" OF_getetheraddr() will now return the unique
MAC address stored in the "local-mac-address" property of the device's
OFW node if present and the host address/system default MAC address if
the node doesn't doesn't have such a property. If set to "false" the
host address will be returned for all devices like before this change.
This brings the behaviour of device drivers for NICs with OFW support/
FCode, i.e. dc(4) for on-board DM9102A on Sun machines, gem(4) and hme(4),
regarding "local-mac-address?" in line with NetBSD and Solaris.
The man pages of the respective drivers will be updated separately to
reflect this change.
- Remove OF_getetheraddr2() which was used as a stopgap in dc(4). Its
functionality is now part of OF_getetheraddr().


133589 12-Aug-2004 marius

- Introduce an ofw_bus kobj-interface for retrieving the OFW node and a
subset ("compatible", "device_type", "model" and "name") of the standard
properties in drivers for devices on Open Firmware supported busses. The
standard properties "reg", "interrupts" und "address" are not covered by
this interface because they are only of interest in the respective bridge
code. There's a remaining standard property "status" which is unclear how
to support properly but which also isn't used in FreeBSD at present.
This ofw_bus kobj-interface allows to replace the various (ebus_get_node(),
ofw_pci_get_node(), etc.) and partially inconsistent (central_get_type()
vs. sbus_get_device_type(), etc.) existing IVAR ones with a common one.
This in turn allows to simplify and remove code-duplication in drivers for
devices that can hang off of more than one OFW supported bus.
- Convert the sparc64 Central, EBus, FHC, PCI and SBus bus drivers and the
drivers for their children to use the ofw_bus kobj-interface. The IVAR-
interfaces of the Central, EBus and FHC are entirely replaced by this. The
PCI bus driver used its own kobj-interface and now also uses the ofw_bus
one. The IVARs special to the SBus, e.g. for retrieving the burst size,
remain.
Beware: this causes an ABI-breakage for modules of drivers which used the
IVAR-interfaces, i.e. esp(4), hme(4), isp(4) and uart(4), which need to be
recompiled.
The style-inconsistencies introduced in some of the bus drivers will be
fixed by tmm@ in a generic clean-up of the respective drivers later (he
requested to add the changes in the "new" style).
- Convert the powerpc MacIO bus driver and the drivers for its children to
use the ofw_bus kobj-interface. This invloves removing the IVARs related
to the "reg" property which were unused and a leftover from the NetBSD
origini of the code. There's no ABI-breakage caused by this because none
of these driver are currently built as modules.
There are other powerpc bus drivers which can be converted to the ofw_bus
kobj-interface, e.g. the PCI bus driver, which should be done together
with converting powerpc to use the OFW PCI code from sparc64.
- Make the SBus and FHC front-end of zs(4) and the sparc64 eeprom(4) take
advantage of the ofw_bus kobj-interface and simplify them a bit.

Reviewed by: grehan, tmm
Approved by: re (scottl)
Discussed with: tmm
Tested with: Sun AX1105, AXe, Ultra 2, Ultra 60; PPC cross-build on i386


133451 10-Aug-2004 alc

Add pmap locking to many of the functions.

Implement the protection check required by the pmap_extract_and_hold()
specification.

Remove the acquisition and release of Giant from pmap_extract_and_hold() and
pmap_protect().

Many thanks to Ken Smith for resolving a sparc64-specific initialization
problem in my original patch.

Tested by: kensmith@


133084 03-Aug-2004 mux

Instead of calling ia32_pause() conditionally on __i386__ or __amd64__
being defined, define and use a new MD macro, cpu_spinwait(). It only
expands to something on i386 and amd64, so the compiled code should be
identical.

Name of the macro found by: jhb
Reviewed by: jhb


132956 01-Aug-2004 markm

Break out the MI part of the /dev/[k]mem and /dev/io drivers into
their own directory and module, leaving the MD parts in the MD
area (the MD parts _are_ part of the modules). /dev/mem and /dev/io
are now loadable modules, thus taking us one step further towards
a kernel created entirely out of modules. Of course, there is nothing
preventing the kernel from having these statically compiled.


132700 27-Jul-2004 rwatson

Pass a thread argument into cpu_critical_{enter,exit}() rather than
dereference curthread. It is called only from critical_{enter,exit}(),
which already dereferences curthread. This doesn't seem to affect SMP
performance in my benchmarks, but improves MySQL transaction throughput
by about 1% on UP on my Xeon.

Head nodding: jhb, bmilekic


131952 10-Jul-2004 marcel

Mega update for the KDB framework: turn DDB into a KDB backend.
Most of the changes are a direct result of adding thread awareness.
Typically, DDB_REGS is gone. All registers are taken from the
trapframe and backtraces use the PCB based contexts. DDB_REGS was
defined to be a trapframe on all platforms anyway.
Thread awareness introduces the following new commands:
thread X switch to thread X (where X is the TID),
show threads list all threads.

The backtrace code has been made more flexible so that one can
create backtraces for any thread by giving the thread ID as an
argument to trace.

With this change, ia64 has support for breakpoints.


131948 10-Jul-2004 marcel

Remove obsolete prototype of kdb_trap().


131905 10-Jul-2004 marcel

Implement makectx(). The makectx() function is used by KDB to create
a PCB from a trapframe for purposes of unwinding the stack. The PCB
is used as the thread context and all but the thread that entered the
debugger has a valid PCB.
This function can also be used to create a context for the threads
running on the CPUs that have been stopped when the debugger got
entered. This however is not done at the time of this commit.


131903 10-Jul-2004 marcel

Introduce the KDB debugger frontend. The frontend provides a framework
in which multiple (presumably different) debugger backends can be
configured and which provides basic services to those backends.
Besides providing services to backends, it also serves as the single
point of contact for any and all code that wants to make use of the
debugger functions, such as entering the debugger or handling of the
alternate break sequence. For this purpose, the frontend has been
made non-optional.
All debugger requests are forwarded or handed over to the current
backend, if applicable. Selection of the current backend is done by
the debug.kdb.current sysctl. A list of configured backends can be
obtained with the debug.kdb.available sysctl. One can enter the
debugger by writing to the debug.kdb.enter sysctl.


131899 10-Jul-2004 marcel

Introduce the GDB debugger backend for the new KDB framework. The
backend improves over the old GDB support in the following ways:
o Unified implementation with minimal MD code.
o A simple interface for devices to register themselves as debug
ports, ala consoles.
o Compression by using run-length encoding.
o Implements GDB threading support.


131224 28-Jun-2004 scottl

Retire BUS_DMAMAP_NSEGS for sparc64


131223 28-Jun-2004 scottl

Switch sparc64 busdma to use a dynamically allocated segment list rather
than a a stack-limited list. This removes the artifical limit on s/g list
size.
cvs: ----------------------------------------------------------------------


130764 20-Jun-2004 bde

Backed out previous commit. Blind substitution of dev_t by `struct cdev *'
was just wrong here because the dev_t's are user dev_t's.


130585 16-Jun-2004 phk

Do the dreaded s/dev_t/struct cdev */
Bump __FreeBSD_version accordingly.


130164 06-Jun-2004 phk

Remove filename+line number from panic messages.


129749 26-May-2004 tmm

Move the per-CPU vmspace pointer fixup that is required before a
struct vmspace is freed from cpu_sched_exit() to pmap_release().

This has the advantage of being able to rely on MI code to decide
when a free should occur, instead of having to inspect the reference
count ourselves.

At the same time, turn the per-CPU vmspace pointer into a pmap pointer,
so that pmap_release() can deal with pmaps exclusively.

Reviewed (and embrassing bug spotted) by: jake


129569 22-May-2004 marius

Use unsigned types for the arguments of the atomic(9) operations,
like described in the man page and done on all other architectures.

OK'ed by: tmm


129568 22-May-2004 marius

Switch from BSD-style u_intXX_t to ISO C99 uintXX_t.


129444 19-May-2004 bde

Moved most of the "MI" definitions and declarations from <machine/profile.h>
to <sys/gmon.h>. Cleaned them up a little by not attempting to ifdef
for incomplete and out of date support for GUPROF in userland, as in
the sparc64 version.


129393 18-May-2004 stefanf

<stdint.h> should define WINT_M{AX,IN} independent from whether WCHAR_MIN is
defined. Otherwise first including <wchar.h> and then <stdint.h> leads to no
WINT_M{AX,IN} at all.

PR: 64956
Approved by: das (mentor)


129068 09-May-2004 alc

Correct the implementation of pmap_page_is_mapped(): It should return TRUE
only if the page has one or more managed mappings.


129051 08-May-2004 marius

- Remove the old sparc64 OFW PCI code (as opposed to the former
"options OFW_NEWPCI").
This is a bit overdue, the new sparc64 OFW PCI code which is
meant to replace the old one is in place for 10 months and
enabled by default in GENERIC for 8 months. FreeBSD 5.2 and
5.2.1 also shipped with the new code enabled by default.
- Some minor clean-up, e.g. remove functions that encapsulated
the #ifdefs for OFW_NEWPCI, remove unused resp. no longer
required includes, etc.

Approved by: tmm, no objections on freebsd-sparc64


128776 30-Apr-2004 tmm

Some cleanups to the nexus code:
- Remove second license, the first was not that different and should be
fine.
- Add nexus_attach(), and do not perform its task in nexus_probe() any
more.
- Remove nexus_write_ivar(), since it was quite pointless.
- Remove superfluous devinfo members.
- Clean up some comments, minor style issues and prototypes.


128629 25-Apr-2004 das

Hide FLT_EVAL_METHOD and DECIMAL_DIG in pre-C99 compilation
environments.

PR: 63935
Submitted by: Stefan Farfeleder <stefan@fafoe.narf.at>


128103 11-Apr-2004 alc

Remove avail_end. It is not used.


127977 07-Apr-2004 imp

Remove advertising clause from University of California Regent's
license, per letter dated July 22, 1999 and email from Peter Wemm,
Alan Cox and Robert Watson.

Approved by: core, peter, alc, rwatson


127875 05-Apr-2004 alc

Remove avail_start on those platforms that no longer use it. (Only amd64
does anything with it beyond simple initialization.)


127239 20-Mar-2004 marcel

Introduce the cpumask_t type. The purpose of the type is to create a
level of abstraction for any and all CPU mask and CPU bitmap variables
so that platforms have the ability to break free from the hard limit
of 32 CPUs, simply because we don't have more bits in an u_int. Note
that the type is not supposed to solve massive parallelism, where
the number of CPUs can be larger than the width of the widest integral
type. As such, cpumask_t is not supposed to be a compound type. If
such would be necessary in the future, we can deal with the issues
then and there. For now, it can be assumed that the type is integral
and unsigned.

With this commit, all MD definitions start off as u_int. This allows
us to phase-in cpumask_t at our leasure without breaking anything.
Once cpumask_t is used consistently, platforms can switch to wider
(or smaller) types if such would be beneficial (or not; whatever :-)

Compile-tested on: i386


126817 10-Mar-2004 gad

Change time_t from a 32-bit value to a 64-bit value, on FreeBSD/sparc64
only. This is a MAJOR incompatible change for the sparc64 platform,
but will not effect FreeBSD on other architectures.

Reviewed by: imp for UPDATING, freebsd-sparc for the change itself.


126649 05-Mar-2004 le

Fix syntax errors and wrong function prototypes in several MD header
files when using non-GNUC compilers.

PR: kern/58515
Submitted by: Stefan Farfeleder <stefan@fafoe.narf.at>
Approved by: grog (mentor), obrien


125081 27-Jan-2004 kensmith

- Fix for sparc64 to use new __panic() function

Adapted from patch by: David Cornejo <dcornejo@firetide.com>
Reviewed by: freebsd-sparc64 (harti)
Approved by: rwatson (mentor)


124296 09-Jan-2004 nectar

Provide sysarch(2) prototypes in the MD sysarch.h headers. While I'm
at it, use the ANSI C generic pointer type for the second argument,
thus matching the documentation.

Remove the now extraneous (and now conflicting) function declarations
in various libc sources. Remove now unnecessary casts.

Reviewed by: bde


124259 08-Jan-2004 mux

Some integrated Davicom cards in sparc64 boxes have an all zeros
MAC address in the EEPROM, and we need to get it from OpenFirmware.
This isn't very pretty but time is lacking to do this in a better
way this near 5.2-RELEASE. This is a RELENG_5_2 candidate.

Original version by: Marius Strobl <marius@alchemy.franken.de>
Tested by: Pete Bentley <pete@sorted.org>
Reviewed by: jake


123791 24-Dec-2003 peter

GC the unused <machine/kse.h> file.


122780 16-Nov-2003 alc

- Modify alpha's sf_buf implementation to use the direct virtual-to-
physical mapping.
- Move the sf_buf API to its own header file; make struct sf_buf's
definition machine dependent. In this commit, we remove an
unnecessary field from struct sf_buf on the alpha, amd64, and ia64.
Ultimately, we may eliminate struct sf_buf on those architecures
except as an opaque pointer that references a vm page.


122464 11-Nov-2003 jake

Fix a bug in the data access error recorvery. Before re-enabling the data
cache after a data access error we must discard all cache lines. When
disabled existing cache lines are not invalidated by stores to memory, so
we risk reading stale data that was cached before the data access error if
we don't flush them. This is especially fatal when the memory involved
is the active part of the kernel or user stack. For good measure we also
flush the instruction cache.

This fixes random crashes when the X server probes the PCI bus through
/dev/pci.


120831 06-Oct-2003 bms

Move pmap_resident_count() from the MD pmap.h to the MI pmap.h.
Add a definition of pmap_wired_count().
Add a definition of vmspace_wired_count().

Reviewed by: truckman
Discussed with: peter


120710 03-Oct-2003 alc

Make PAGE_SIZE and related quantities signed on sparc64. (They are signed
quantities on every other architecture.) This change is required in order
to move pmap_prefault() out of the pmap and into the machine-independent
layer.


120611 30-Sep-2003 mux

Allow the compiler to micro-optimize byte swapping functions by
evaluating them at compile time rather than at run time. As for x86
and amd64, this requires GCC and it's enabled only if __OPTIMIZE__ is
defined (ie, if at least -O is used).

Reviewed by: jake


120422 25-Sep-2003 peter

Add sysentvec->sv_fixlimits() hook so that we can catch cases on 64 bit
systems where the data/stack/etc limits are too big for a 32 bit process.

Move the 5 or so identical instances of ELF_RTLD_ADDR() into imgact_elf.c.

Supply an ia32_fixlimits function. Export the clip/default values to
sysctl under the compat.ia32 heirarchy.

Have mmap(0, ...) respect the current p->p_limits[RLIMIT_DATA].rlim_max
value rather than the sysctl tweakable variable. This allows mmap to
place mappings at sensible locations when limits have been reduced.

Have the imgact_elf.c ld-elf.so.1 placement algorithm use the same
method as mmap(0, ...) now does.

Note that we cannot remove all references to the sysctl tweakable
maxdsiz etc variables because /etc/login.conf specifies a datasize
of 'unlimited'. And that causes exec etc to fail since it can no
longer find space to mmap things.


120375 23-Sep-2003 nyan

Implement the bus_space_map() function to allocate resources and initialize
a bus_handle, but currently it does only initializing a bus_handle.


119697 02-Sep-2003 marcel

Add function OF_decode_addr(). This function obtains the physical
address of the device identified by its phandle_t by traversing OFW's
device tree. The space and address returned by this function can
subsequently be passed to sparc64_fake_bustag() to construct a valid
tag and handle for use by the newbus I/O functions.

Use of this function is expected to be limited to pre-newbus access to
devices, such as consoles and keyboards.

Partially obtained from: tmm
Reviewed by: jake, jmg, tmm
SBus testing made possible by: jake
Tested with: LINT


119628 01-Sep-2003 kan

Standardize idempotentcy ifdefs. Consistently use _MACHINE_VARARGS_H_
symbol.


119380 24-Aug-2003 jake

"md" files for syscons.


118990 16-Aug-2003 marcel

Further cleanup <machine/cpu.h> and <machine/md_var.h>: move the MI
prototypes of cpu_halt(), cpu_reset() and swi_vm() from md_var.h to
cpu.h. This affects db_command.c and kern_shutdown.c.

ia64: move all MD prototypes from cpu.h to md_var.h. This affects
madt.c, interrupt.c and mp_machdep.c. Remove is_physical_memory().
It's not used (vm_machdep.c).

alpha: the MD prototypes have been left in cpu.h with a comment
that they should be there. Moving them is left for later. It was
expected that the impact would be significant enough to be done in
a seperate commit.

powerpc: MD prototypes left in cpu.h. Comment added.

Suggested by: bde
Tested with: make universe (pc98 incomplete)


118848 12-Aug-2003 imp

Expand inline the relevant parts of src/COPYRIGHT for Matt Dillon's
copyrighted files.

Approved by: Matt Dillon


118443 04-Aug-2003 jhb

- Since td_critnest is now initialized in MI code, it doesn't have to be
set in cpu_critical_fork_exit() anymore.
- As far as I can tell, cpu_thread_link() has never been used, not even
when it was originally added, so remove it.


118239 31-Jul-2003 peter

Deal with 'options KSTACK_PAGES' being a global option.


118081 27-Jul-2003 mux

- Introduce a new busdma flag BUS_DMA_ZERO to request for zero'ed
memory in bus_dmamem_alloc(). This is possible now that
contigmalloc() supports the M_ZERO flag.
- Remove the locking of Giant around calls to contigmalloc() since
contigmalloc() now grabs Giant itself.


117707 17-Jul-2003 jake

Avoid exposing declarations for kernel variables to userland.

PR: 54528


117661 16-Jul-2003 jmg

change CLASS depending upon __ELF_WORD_SIZE. This is necessary if
someone wants to try to run 32bit binaries on sparc64.


117658 16-Jul-2003 jmg

add support for interrupt counting on sparc64. This copies part of the
code from i386. The code has a slight bogon that interrupts are counted
twice. Once on the ithread dispatch and once on the dispatch for the vector

vmstat -i and systat -vm now contains interrupt counts.

Reviewed by: jake


117390 10-Jul-2003 tmm

Lock down the IOMMU bus_dma implementation to make it safe to use
without Giant held.

A quick outline of the locking strategy:
Since all IOMMUs are synchronized, there is a single lock, iommu_mtx,
which protects the hardware registers (where needed) and the global and
per-IOMMU software states. As soon as the IOMMUs are divorced, each struct
iommu_state will have its own mutex (and the remaining global state
will be moved into the struct).
The dvma rman has its own internal mutex; the TSB slots may only be
accessed by the owner of the corresponding resource, so neither needs
extra protection.
Since there is a second access path to maps via LRU queues, the consumer-
provided locking is not sufficient; therefore, each map which is on a
queue is additionally protected by iommu_mtx (in part, there is one
member which only the map owner may access). Each map on a queue may
be accessed and removed from or repositioned in a queue in any context as
long as the lock is held; only the owner may insert a map.
To reduce lock contention, some bus_dma functions remove the map from
the queue temporarily (on behalf of the map owner) for some operations and
reinsert it when they are done. Shorter operations and operations which are
not done on behalf of the lock owner are completely covered by the lock.

To facilitate the locking, reorganize the streaming buffer handling;
while being there, fix an old oversight which would cause the streaming
buffer to always be flushed, regardless of whether streaming was enabled
in the TSB entry. The streaming buffer is still disabled for now, since
there are a number of drivers which lack critical bus_dmamp_sync() calls.

Additional testing by: jake


117126 01-Jul-2003 scottl

Mega busdma API commit.

Add two new arguments to bus_dma_tag_create(): lockfunc and lockfuncarg.
Lockfunc allows a driver to provide a function for managing its locking
semantics while using busdma. At the moment, this is used for the
asynchronous busdma_swi and callback mechanism. Two lockfunc implementations
are provided: busdma_lock_mutex() performs standard mutex operations on the
mutex that is specified from lockfuncarg. dftl_lock() is a panic
implementation and is defaulted to when NULL, NULL are passed to
bus_dma_tag_create(). The only time that NULL, NULL should ever be used is
when the driver ensures that bus_dmamap_load() will not be deferred.
Drivers that do not provide their own locking can pass
busdma_lock_mutex,&Giant args in order to preserve the former behaviour.

sparc64 and powerpc do not provide real busdma_swi functions, so this is
largely a noop on those platforms. The busdma_swi on is64 is not properly
locked yet, so warnings will be emitted on this platform when busdma
callback deferrals happen.

If anyone gets panics or warnings from dflt_lock() being called, please
let me know right away.

Reviewed by: tmm, gibbs


117119 01-Jul-2003 tmm

Add the new sparc64 OFW PCI framework, conditional on options OFW_NEWPCI
for now. It introduces a OFW PCI bus driver and a generic OFW PCI-PCI
bridge driver. By utilizing these, the PCI handling is much more elegant
now.

The advantages of the new approach are:
- Device enumeration should hopefully be more like on Solaris now,
so unit numbers should match what's printed on the box more
closely.
- Real interrupt routing is implemented now, so cardbus bridges
etc. have at least a chance to work.
- The quirk tables are gone and have been replaced by (hopefully
sufficient) heuristics.
- Much cleaner code.

There was also a report that previously bogus interrupt assignments
are fixed now, which can be attributed to the new heuristics.

A pitfall, and the reason why this is not the default yet, is that
it changes device enumeration, as mentioned above, which can make
it necessary to change the system configuration if more than one
unit of a device type is present (on a system with two hme cars,
for example, it is possible that hme0 becomes hme1 and vice versa
after enabling the option). Systems with multiple disk controllers
may need to be booted into single user (and require manual specification
of the root file system on boot) to adjust the fstab.
Nevertheless, I would like to encourage users to use this option,
so that it can be made the default soon.

In detail, the changes are:
- Introduce an OFW PCI bus driver; it inherits most methods from the
generic PCI bus driver, but uses the firmware for enumeration,
performs additional initialization for devices and firmware-specific
interrupt routing. It also implements an OFW-specific method to allow
child devices to get their firmware nodes.
- Introduce an OFW PCI-PCI bridge driver; again, it inherits most
of the generic PCI-PCI bridge driver; it has it's own method for
interrupt routing, as well as some sparc64-specific methods (one to
get the node again, and one to adjust the bridge bus range, since
we need to reenumerate all PCI buses).
- Convert the apb driver to the new way of handling things.
- Provide a common framework for OFW bridge drivers, used be the two
drivers above.
- Provide a small common framework for interrupt routing (for all
bridge types).
- Convert the psycho driver to the new framework; this gets rid of a
bunch of old kludges in pci_read_config(), and the whole
preinitialization (ofw_pci_init()).
- Convert the ISA MD part and the EBus driver to the new way
interrupts and nodes are handled.
- Introduce types for firmware interrupt properties.
- Rename the old sparcbus_if to ofw_pci_if by repo copy (it is only
required for PCI), and move it to a more correct location (new
support methodsx were also added, and an old one was deprecated).
- Fix a bunch of minor bugs, perform some cleanups.

In some cases, I introduced some minor code duplication to keep the
new code clean, in hopes that the old code will be unifdef'ed soon.

Reviewed in part by: imp
Tested by: jake, Marius Strobl <marius@alchemy.franken.de>,
Sergey Mokryshev <mokr@mokr.net>,
Chris Jackman <cjackNOSPAM@klatsch.org>
Info on u30 firmware provided by: kris


116659 22-Jun-2003 jmg

add support for peeking at pci busses on UltraSparc systems. This prevents
data access errors when trying to read/write to non-existant PCI devices.

fix the psycho bridge to use peek for probing devices. This no longer
fakes it if the OFW node doesn't exist (and the reg == 0).

Reviewed by: jake, tmm


116541 18-Jun-2003 tmm

Further cleanup of the sparc64 busdma implementation:
- Move prototypes for sparc64-specific helper functions from bus.h to
bus_private.h
- Move the method pointers from struct bus_dma_tag into a separate
structure; this saves some memory, and allows to use a single method
table for each busdma backend, so that the bus drivers need no longer
be changed if the methods tables need to be modified.
- Remove the hierarchical tag method lookup. It was never really useful,
since the layering is fixed, and the current implementations do not
need to call into parent implementations anyway. Each tag inherits
its method table pointer and cookie from the parent (or the root tag)
now, and the method wrapper macros directly use the method table
of the tag.
- Add a method table to the non-IOMMU backend, remove unnecessary
prototypes, remove the extra parent tag argument.
- Rename sparc64_dmamem_alloc_map() and sparc64_dmamem_free_map() to
sparc64_dma_alloc_map() and sparc64_dma_free_map(), move them to a
better place and use them for all map allocations and deallocations.
- Add a method table to the iommu backend, and staticize functions,
remove the extra parent tag argument.
- Change the psycho and sbus drivers to just set cookie and method table
in the root tag.
- Miscellaneous small fixes.


116355 14-Jun-2003 alc

Migrate the thread stack management functions from the machine-dependent
to the machine-independent parts of the VM. At the same time, this
introduces vm object locking for the non-i386 platforms.

Two details:

1. KSTACK_GUARD has been removed in favor of KSTACK_GUARD_PAGES. The
different machine-dependent implementations used various combinations
of KSTACK_GUARD and KSTACK_GUARD_PAGES. To disable guard page, set
KSTACK_GUARD_PAGES to 0.

2. Remove the (unnecessary) clearing of PG_ZERO in vm_thread_new. In
5.x, (but not 4.x,) PG_ZERO can only be set if VM_ALLOC_ZERO is passed
to vm_page_alloc() or vm_page_grab().


116213 11-Jun-2003 tmm

Remove the psycho and sbus iommu function stubs, and put the pointer
to the iommu_state structure directly into dt_cookie. The stubs have
not been needed for a long time now.


115971 07-Jun-2003 jake

- Declare sparc64_memreg and sparc64_nmemreg in machine/ofw_mem.h.
- On startup print the total physical memory, instead of what we're told is
free by the firmware, to avoid astonishing users.


115970 07-Jun-2003 jake

BKPT_INST is supposed to be a breakpoint, not 0.


115417 30-May-2003 tmm

Fix interrupt assignment for non-builtin PCI devices on e450s.

This machine uses a non-standard scheme to specify the interrupts to
be assigned for devices in PCI slots; instead of giving the INO
or full interrupt number (which is done for the other devices in this
box), the firmware interrupt properties contain intpin numbers, which
have to be swizzled as usual on PCI-PCI bridges; however, the PCI host
bridge nodes have no interrupt map, so we need to guess the
correct INO by slot number of the device or the closest PCI-PCI
bridge leading to it, and the intpin.

To do this, this fix makes the following changes:
- Add a newbus method for sparc64 PCI host bridges to guess
the INO, and glue code in ofw_pci_orb_callback() to invoke it based
on a new quirk entry. The guessing is only done for interrupt numbers
too low to contain any IGN found on e450s.
- Create another new quirk entry was created to prevent mapping of EBus
interrupts at PCI level; the e450 has full INOs in the interrupt
properties of EBus devices, so trying to remap them could cause
problems.
- Set both quirk entries for e450s; remove the no-swizzle entry.
- Determine the psycho half (bus A or B) a driver instance manages
in psycho_attach()
- Implement the new guessing method for psycho, using the slot number,
psycho half and property value (intpin).

Thanks go to the testers, especially Brian Denehy, who tested many kernels
for me until I had found the right workaround.

Tested by: Brian Denehy <B.Denehy@90east.com>, jake, fenner,
Marius Strobl <marius@alchemy.franken.de>,
Marian Dobre <mari@onix.ro>
Approved by: re (scottl)


115416 30-May-2003 hmp

Rename BUS_DMAMEM_NOSYNC to BUS_DMA_COHERENT.

The current name is confusing, because it indicates to
the client that a bus_dmamap_sync() operation is not
necessary when the flag is specified, which is wrong.

The main purpose of this flag is to hint the underlying
architecture that DMA memory should be mapped in a coherent
way, but the architecture can ignore it. But if the
architecture does supports coherent mapping of memory, then
it makes bus_dmamap_sync() calls cheap.

This flag is the same as the one in NetBSD's Bus DMA.

Reviewed by: gibbs, scottl, des (implicitly)
Approved by: re@ (jhb)


115343 27-May-2003 scottl

Bring back bus_dmasync_op_t. It is now a typedef to an int, though the
BUS_DMASYNC_ definitions remain as before. The does not change the ABI,
and reverts the API to be a bit more compatible and flexible. This has
survived a full 'make universe'.

Approved by: re (bmah)


115316 26-May-2003 scottl

De-orbit bus_dmamem_alloc_size(). It's a hack and was never used anyways.
No need for it to pollute the 5.x API any further.

Approved by: re (bmah)


115164 19-May-2003 kan

sys/sys/limits.h:

- Fix visibilty test for LONG_BIT and WORD_BIT. `#if defined(__FOO_VISIBLE)'
is alays wrong because __FOO_VISIBLE is always defined (to 0 for
invisibility).

sys/<arch>/include/limits.h
sys/<arch>/include/_limits.h:

- Style fixes.

Submitted by: bde
Reviewed by: bsdmike
Approved by: re (scottl)


114678 04-May-2003 kan

Style fixes.
Remove DBL_DIG, DBL_MIN, DBL_MAX and their FLT_ counterparts, they
were marked for deprecation ever since SUSv1 at least.
Only define ULLONG_MIN/MAX and LLONG_MAX if long long type is
supported.
Restore a lost comment in MI _limits.h file and remove it from
sys/limits.h where it does not belong.


114373 01-May-2003 peter

Slight reorg and added AMD64 support. A couple of the MODINFOMD_* values
that were added to sparc64 and later powerpc, really should have been in
the MI area. But changing that now with insufficient preperation will
just cause too much pain.

Move MD_FETCH() to the MI sys/linker.h file to avoid another two copies
of it.


114257 29-Apr-2003 jake

Allow fast instruction and data access mmu miss traps to be handled by
user trap handlers.


114216 29-Apr-2003 kan

Deprecate machine/limits.h in favor of new sys/limits.h.
Change all in-tree consumers to include <sys/limits.h>

Discussed on: standards@
Partially submitted by: Craig Rodrigues <rodrigc@attbi.com>


114188 29-Apr-2003 jake

- Fix placement of cvs ids in previous commit to match .S files in libc.
- gcc uses 32 byte alignment for functions regardless of profiling, so
follow suit.


114085 26-Apr-2003 obrien

I was wrong, the ENTRY bits in asm.h did have a purpose -- for userland.
Restore the bits and remove them from asmacros.h. *.S will now be asm.h
consumers.

Approved by: jake


114072 26-Apr-2003 obrien

The ENTRY bits were in two places. Remove the one not used (asm.h), but
presurve the nice comment by adding it to asmacros.h.


114071 26-Apr-2003 obrien

Two tokens that don't together form a vaid preprocssor token cannot be
pasted together using ANSI-C token concatinatation. GCC's cpp, at least,
produces the desired result w/o using "##".


113941 23-Apr-2003 kan

Add a new sys/limits.h file which in turn depends on machine/_limits.h
to get actual constant values. This is in preparation for machine/limits.h
retirement.

Discussed on: standards@
Submitted by: Craig Rodrigues <rodrigc@attbi.com> (*)
Modified by: kan


113453 13-Apr-2003 jake

- Move the routine for flushing all user mappings from the tlb from pmap to
the cpu dependent files. It will need to be done differently for USIII.
- Simplify the logic for detecting context rollovers. Instead of dealing
with it when the next context switch would cause the context numbers to
rollover, deal with it when they actually do rollover.
- Move some things around in cpu_switch so that we only do 1 membar #Sync
when switching address space, instead of 2.
- Detect kernel threads by comparing the new vm space to vmspace0, instead
if checking if the tlb context is 0.
- Removed some debug code.


113350 10-Apr-2003 mux

I deserve a big pointy hat for having missed all those references
to bus_dmasync_op_t in my last commit.


113347 10-Apr-2003 mux

Change the operation parameter of bus_dmamap_sync() from an
enum to an int and redefine the BUS_DMASYNC_* constants as
flags. This allows us to specify several operations in one
call to bus_dmamap_sync() as in NetBSD.


113238 08-Apr-2003 jake

Use vm_paddr_t for physical addresses.


113171 06-Apr-2003 jake

Make the pmap stats writeable. It can be useful to clear them.


113166 06-Apr-2003 jake

Use the vis block copy/zero functions for pmap_copy_page and pmap_zero_page.
These are called through function pointers so that different implementations
can be provided for cheetah, where the block load instructions may or may
not be a win, and so they can be disabled with the machdep.use_vis tunable.
In terms of raw bandwidth the integer versions are faster, but not allocating
lines in the L2 cache for useless data gives a measurable improvement in user
time for the benchmarks I tested (mostly buildworld with -j8).

As far as I can tell the instructions used are implemented on everything
back to UltraSPARC I, so there should not be a problem with different cpu
types.


113027 03-Apr-2003 jake

Add optimized block copy and zero functions using vis instructions, which
can do 64 bytes at a time and don't allocate lines in the L2 cache. These
assume that everything is 64 byte aligned, and that there's more than 128
bytes of data (best for whole pages). The block load and store instructions
don't follow normal memory ordering rules and require either a memory barrier
or move between registers before the data can actually be used. This
implementation correctly shuffles around 3 out of the 4 sets of registers
in order to avoid memory barriers expect for the last 2 blocks.


113023 03-Apr-2003 jake

- Add space for kernel floating point registers to the pcb. These will be
used to support block copy and zero operations in the kernel which use the
floating point registers.
- While I'm changing the size, improve the layout of struct pcb, sort by size,
then alphabetical etc.
- Add some assertions to validate assumptions made about how the pcb is
allocated.


112924 01-Apr-2003 jake

- Add a flags field to struct pcb. Use this to keep track of wether or
not the pcb has floating point registers saved in it.
- Implement get_mcontext and set_mcontext.


112920 01-Apr-2003 jake

- Rename pcb_fpstate to pcb_ufp (user floating point), and change it to
a simple array of 64 ints.
- Use a critical section when saving floating point state in cpu_fork
instead of sched_lock.


112917 01-Apr-2003 jake

Rename pcb_fp to pcb_sp, so as to not be confused with floating point
state.


112697 27-Mar-2003 jake

Handle the fictitious pages created by the device pager. For fictitious
pages which represent actual physical memory we must strip off the fake
page in order to allow illegal aliases to be detected. Otherwise we map
uncacheable in the virtual and physical caches and set the side effect bit,
as is required for mapping device memory.

This fixes gstat on sparc64, which wants to mmap kernel memory through a
character device.


112569 25-Mar-2003 jake

- Add vm_paddr_t, a physical address type. This is required for systems
where physical addresses larger than virtual addresses, such as i386s
with PAE.
- Use this to represent physical addresses in the MI vm system and in the
i386 pmap code. This also changes the paddr parameter to d_mmap_t.
- Fix printf formats to handle physical addresses >4G in the i386 memory
detection code, and due to kvtop returning vm_paddr_t instead of u_long.

Note that this is a name change only; vm_paddr_t is still the same as
vm_offset_t on all currently supported platforms.

Sponsored by: DARPA, Network Associates Laboratories
Discussed with: re, phk (cdevsw change)


112399 19-Mar-2003 jake

- Remove unused cache flushing routines. These will not necessary work
on future UltraSPARC cpus for which the data cache is not direct mapped.
- Move UltraSPARC I and II (spitfire, blackbird, sapphire, sabre) specific
functions to spitfire.c, and add cheetah.c for UltraSPARC III specific
functions. Initially just cache flushing, but there are a few other
functions that will need to move here.
- Add an ipi handler for data cache flushing on UltraSPARC III.
- Use function pointers to select the right cache flushing functions based
on cpu_impl.

With this it is possible to boot single user from an mfs root on UltraSPARC
III systems, including spinning up secondary processors. There is currently
no support for the host to pci bridge, and no documentation for it is
publically available.

Thanks to Oleg Derevenetz for providing access to a system with UltraSPARC
III+ cpus.


112366 18-Mar-2003 jake

Remove unused fields.


112312 16-Mar-2003 jake

Made the prototypes for pmap_kenter and pmap_kremove MD. These functions
are machine dependent because they are not required to update the tlb when
mappings are added or removed, and doing so is machine dependent.
In addition, an implementation may require that pages mapped with pmap_kenter
have a backing vm_page_t, which is not necessarily true of all physical
pages, and so may choose to pass the vm_page_t to pmap_kenter instead of the
physical address in order to make this requirement clear.


111524 26-Feb-2003 mux

Correctly set BUS_SPACE_MAXSIZE in all the busdma backends.
It was bogusly set to 64 * 1024 or 128 * 1024 because it was
bogusly reused in the BUS_DMAMAP_NSEGS definition.


111383 24-Feb-2003 obrien

Make the 'a' parameter of bus_space_write_multi_stream_*() a const pointer.


111353 23-Feb-2003 obrien

The rest of our platforms make bus_space_write_multi_stream_2's 'a' a
const pointer.


111347 23-Feb-2003 obrien

Add an empty bus_space_unmap() like Alpha has. puc(4) uses it.


110566 08-Feb-2003 mike

Implement fpclassify():
o Add a MD header private to libc called _fpmath.h; this header
contains bitfield layouts of MD floating-point types.
o Add a MI header private to libc called fpmath.h; this header
contains bitfield layouts of MI floating-point types.
o Add private libc variables to lib/libc/$arch/gen/infinity.c for
storing NaN values.
o Add __double_t and __float_t to <machine/_types.h>, and provide
double_t and float_t typedefs in <math.h>.
o Add some C99 manifest constants (FP_ILOGB0, FP_ILOGBNAN, HUGE_VALF,
HUGE_VALL, INFINITY, NAN, and return values for fpclassify()) to
<math.h> and others (FLT_EVAL_METHOD, DECIMAL_DIG) to <float.h> via
<machine/float.h>.
o Add C99 macro fpclassify() which calls __fpclassify{d,f,l}() based
on the size of its argument. __fpclassifyl() is never called on
alpha because (sizeof(long double) == sizeof(double)), which is good
since __fpclassifyl() can't deal with such a small `long double'.

This was developed by David Schultz and myself with input from bde and
fenner.

PR: 23103
Submitted by: David Schultz <dschultz@uclink.Berkeley.EDU>
(significant portions)
Reviewed by: bde, fenner (earlier versions)


110053 29-Jan-2003 scottl

Fix another mistake in the bus_dmamem_alloc_size() thing

Submitted by: tmm


110047 29-Jan-2003 scottl

Fix some more missing dt_ prefixes for dma tag fields.


110030 29-Jan-2003 scottl

Implement bus_dmamem_alloc_size() and bus_dmamem_free_size() as
counterparts to bus_dmamem_alloc() and bus_dmamem_free(). This allows
the caller to specify the size of the allocation instead of it defaulting
to the max_size field of the busdma tag.

This is intended to aid in converting drivers to busdma. Lots of
hardware cannot understand scatter/gather lists, which forces the
driver to copy the i/o buffers to a single contiguous region
before sending it to the hardware. Without these new methods, this
would require a new busdma tag for each operation, or a complex
internal allocator/cache for each driver.

Allocations greater than PAGE_SIZE are rounded up to the next
PAGE_SIZE by contigmalloc(), so this is not suitable for multiple
static allocations that would be better served by a single
fixed-length subdivided allocation.

Reviewed by: jake (sparc64)


109651 21-Jan-2003 tmm

Fixes for a number of problems in the IOMMU code:
1.) Fix an off-by-one in the DVMA space handling, which would make it
possible to allocate one page beyond the end of the DVMA area.
This page was aliased to the first page. Apparently, this bug was
responsible for the trashed nvram/eeprom some people were reporting,
in conjunction with a number of unfortunate coincidences.
2.) Fix broken boundary and and lowaddr calculations.
3.) Fix a memory leak on an error path.
4.) Update a outdated comment to reflect the introduction of IOMMU_MAX_PRE,
make the usage of IOMMU_MAX_PRE more consistent and KASSERT that the
preallocation size is not 0.
5.) Fix a case where an error return was lost.
6.) When signalling an error to the caller by invoking the callback, do
not use a segment pointer of NULL for compatability with existing
drivers.

Also, increase the maximum segment number to 64; it is rather arbitrary,
with the exception of the of the stack space consumed by the segment
array.

Special thanks go to Harti Brandt <brandt@fokus.fraunhofer.de> for
spotting 4 and 5, and testing many iterations of patches.

Pointy hats to: tmm


109036 10-Jan-2003 jake

Don't allow user process to set an invalid window state through sigreturn.

Spotted by: tmm


108917 08-Jan-2003 jake

Implement bus_space_subregion.


108830 06-Jan-2003 tmm

Change the iommu code to be able to handle more than one DVMA area per
map. Use this new feature to implement iommu_dvmamap_load_mbuf() and
iommu_dvmamap_load_uio() functions in terms of a new helper function,
iommu_dvmamap_load_buffer(). Reimplement the iommu_dvmamap_load()
to use it, too.
This requires some changes to the map format; in addition to that,
remove unused or redundant members.
Add SBus and Psycho wrappers for the new functions, and make them
available through the respective DMA tags.


108821 06-Jan-2003 tmm

- remove the unused parent DMA tag argument from
_nexus_dmamap_load_buffer()
- implement nexus_dmamap_load() in terms of _nexus_dmamap_load_buffer().
Note that this is untested, as this code is not currently used (but
might be later for UPA devices).
- move BUS_DMAMAP_NSEGS to bus_private.h
- disable the ecache flushing in nexus_dmamap_sync(); it should not be
needed, although the docs are not entirely clear on that.


108815 06-Jan-2003 tmm

Prefix the members of struct bus_space_tag and struct bus_dma_tag with
a uniqifier. No functional changes.


108808 06-Jan-2003 tmm

Look for the correct method in sparc64_dmamap_load_mbuf() and
sparc64_dmamap_load_uio().


108802 06-Jan-2003 tmm

Some cleanup:
- move some constants into iommureg.h
- correct some comments
- use KASSERT() in one place instead of rolling our own
- take a sanity check out of #ifdef DIAGNOSTIC
- fix a syntax error in normally #ifdef'ed out debug code


108700 05-Jan-2003 jake

- Reorganize PMAP_STATS to scale a little better.
- Add some more stats for things that are now considered interesting.


108697 05-Jan-2003 jake

Make imgact_elf32.c compile on sparc64.

Obtained from: ia64


108533 01-Jan-2003 schweikh

Correct typos, mostly s/ a / an / where appropriate. Some whitespace cleanup,
especially in troff files.


108386 29-Dec-2002 jake

Use memset instead of __builtin_memset. Apparently there's an inline
memset in libkern which causes problems; why that's there is beyond me.


108332 27-Dec-2002 jake

Define UMA_MD_SMALL_ALLOC so that uma_small_alloc and uma_small_free will
be used for zones that allocate objects of less 1 page. The biggest advantage
of this is that all of a sudden the majority of kernel malloc-ed data doesn't
need kva allocated for it. Besides microbenchmarks I haven't seen a measurable
performance improvement from doing this.


108245 23-Dec-2002 jake

- Change the way the direct mapped region is implemented to be generally
useful for accessing more than 1 page of contiguous physical memory, and
to use 4mb tlb entries instead of 8k. This requires that the system only
use the direct mapped addresses when they have the same virtual colour as
all other mappings of the same page, instead of being able to choose the
colour and cachability of the mapping.
- Adapt the physical page copying and zeroing functions to account for not
being able to choose the colour or cachability of the direct mapped
address. This adds a lot more cases to handle. Basically when a page has
a different colour than its direct mapped address we have a choice between
bypassing the data cache and using physical addresses directly, which
requires a cache flush, or mapping it at the right colour, which requires
a tlb flush. For now we choose to map the page and do the tlb flush.

This will allows the direct mapped addresses to be used for more things
that don't require normal pmap handling, including mapping the vm_page
structures, the message buffer, temporary mappings for crash dumps, and will
provide greater benefit for implementing uma_small_alloc, due to the much
greater tlb coverage.


108187 22-Dec-2002 jake

- Add a spin lock to single thread cache invalidation and tlb flush ipis,
which allows ipis to be sent outside of Giant.
- Remove the ap boot mutex, which is unused.


108175 22-Dec-2002 tjr

MB_LEN_MAX is not MD, move it to the MI limits.h.


108166 21-Dec-2002 jake

- Add a pmap pointer to struct md_page, and use this to find the pmap that
a mapping belongs to by setting it in the vm_page_t structure that backs
the tsb page that the tte for a mapping is in. This allows the pmap that
a mapping belongs to to be found without keeping a pointer to it in the
tte itself.
- Remove the pmap pointer from struct tte and use the space to make the
tte pv lists doubly linked (TAILQs), like on other architectures. This
makes entering or removing a mapping O(1) instead of O(n) where n is the
number of pmaps a page is mapped by (including kernel_pmap).
- Use atomic ops for setting and clearing bits in the ttes, now that they
return the old value and can be easily used for this purpose.
- Use __builtin_memset for zeroing ttes instead of bzero, so that gcc will
inline it (4 inline stores using %g0 instead of a function call).
- Initially set the virtual colour for all the vm_page_ts to be equal to their
physical colour. This will be more useful once uma_small_alloc is
implemented, but basically pages with virtual colour equal to phsyical
colour are easier to handle at the pmap level because they can be safely
accessed through cachable direct virtual to physical mappings with that
colour, without fear of causing illegal dcache aliases.

In total these changes give a minor performance improvement, about 1%
reduction in system time during buildworld.


108155 21-Dec-2002 jake

Removed unused pmap_qenter_flags.


108153 21-Dec-2002 jake

Make the atomic arithmetic functions return the old value, since they're
all implemented with cas anyway.


107477 01-Dec-2002 tmm

Always initialize the UPA target module id in the interrupt mapping
register to the one of the processor doing the interrupt setup. This
is required since this field is preinitialized to 0, but there exist
machines which have no processor with a MID of 0 (e.g. e450s with 1 or 2
processors).

Add some more macros for handle the interrupt mapping registers, and
rename some existing ones for consistency.

Approved by: re


106838 13-Nov-2002 alc

Move pmap_collect() out of the machine-dependent code, rename it
to reflect its new location, and add page queue and flag locking.

Notes: (1) alpha, i386, and ia64 had identical implementations
of pmap_collect() in terms of machine-independent interfaces;
(2) sparc64 doesn't require it; (3) powerpc had it as a TODO.


106753 11-Nov-2002 alc

- Clear the page's PG_WRITEABLE flag in the i386's pmap_changebit()
if we're removing write access from the page's PTEs.
- Export pmap_remove_all() on alpha, i386, and ia64. (It's already
exported on sparc64.)


106555 07-Nov-2002 tmm

Add two new workaround for firmware anomalies:
1. At least some Netra t1 models have PCI buses with no associated
interrupt map, but obviously expect the PCI swizzle to be done with
the interrupt number from the higher level as intpin. In this case,
the mapping also needs to continue at parent bus nodes.
To handle that, add a quirk table based on the "name" property of
the root node to avoid breaking other boxen. This property is now
retrieved and printed at boot.
2. On SPARCengine Ultra AX machines, interrupt numbers are not mapped
at all, and full interrupt numbers (not just INOs) are given in
the interrupt properties. This is more or less cosmetical; the
PCI interrupt numbers would be wrong, but the psycho resource
allocation method would pass the right numbers on anyway.

Tested by: mux (1), Maxim Mazurok <maxim@km.ua> (2)


106050 27-Oct-2002 jake

Don peril sensitive sun glasses and change the default system call vector
for sparc64 from trap #9 to trap #65. This is one of the ABI "blessed"
system call vectors and is different from any other system that we might
want to emulate, making the emulation easier by reducing the number of
code paths that need to be shared. Compatibility with old applications
is provided with COMPAT_FREEBSD4.
Add defines for a few special traps that we may need to implement for
compatibility with 32bit applications, and add comments on which vectors
are used for what in other systems, and which are available.
Pass magic flags to trap() for deprecated or unimplemented system call
vectors so they will deliver SIGSYS instead of SIGILL.

This piggy backs nicely with the recent sigaction(2) system call number
change, and provided the rules are followed for upgrading past it, this
change should not be noticed.


105950 25-Oct-2002 peter

Split 4.x and 5.x signal handling so that we can keep 4.x signal
handling clean and functional as 5.x evolves. This allows some of the
nasty bandaids in the 5.x codepaths to be unwound.

Encapsulate 4.x signal handling under COMPAT_FREEBSD4 (there is an
anti-foot-shooting measure in place, 5.x folks need this for a while) and
finish encapsulating the older stuff under COMPAT_43. Since the ancient
stuff is required on alpha (longjmp(3) passes a 'struct osigcontext *'
to the current sigreturn(2), instead of the 'ucontext_t *' that sigreturn
is supposed to take), add a compile time check to prevent foot shooting
there too. Add uniform COMPAT_43 stubs for ia64/sparc64/powerpc.

Tested on: i386, alpha, ia64. Compiled on sparc64 (a few days ago).
Approved by: re


105946 25-Oct-2002 tmm

Initialize tick_MHz and related variables much earlier. After the last
revision of tick.c, this was done at SI_SUB_CLOCKS, which is too late
because tick_MHz is required for DELAY() to work.

Reviewed by: jake


105939 25-Oct-2002 jake

Greatly improve readability of trap() by using a table to convert between
trap types and signals to send. Rearrange KASSERTs to better handle faults
early before curthread is setup, or in the case that it gets corrupted or
set to 0.


105733 22-Oct-2002 jake

- Expand struct trapframe to 256 bytes, make all fields fixed width and the
same size. Add some fields that previously overlapped with something else
or were missing.
- Make struct regs and struct mcontext (minus floating point) the same as
struct trapframe so converting between them is easy (null).
- Add space for saving floating point state to struct mcontext. This requires
that it be 64 byte aligned.
- Add assertions that none of these structures change size, as they are part
of the ABI.
- Remove some dead code in sendsig().
- Save and restore %gsr in struct trapframe. Remember to restore %fsr.
- Add some comments to exception.S.


105531 20-Oct-2002 tmm

Add kernel dump support, based on the ia64 version (which was committed
as sparc64/sparc64/dump_machdep.c a while back).
Other than ia64 (which uses ELF), sparc64 uses a homegrown format for
the dumps (headers are required because the physical address and size of
the tsb must be noted, and because physical memory may be discontiguous);
ELF would not offer any advantages here.

Reviewed by: jake


105454 19-Oct-2002 tmm

Explicitely specify an alignment for struct pcb. While all regular pcb's
are positioned and aligned by md code, dumppcb is just a static
variable and requires this.


105138 15-Oct-2002 peter

The a.out md_coredump stuff isn't referenced anywhere anymore, and
hasn't been filled in for ages.. Nuked.


105044 13-Oct-2002 mike

#ifdef _KERNEL not #if _KERNEL.

Pointy hat to: mike


105014 13-Oct-2002 mike

Add standards visibility conditionals. Change any uses of sigset_t to
struct __sigset to avoid depending on objects from <sys/signal.h>.


104584 06-Oct-2002 mike

Add conditionals to allow va_list to be defined in other headers.


104583 06-Oct-2002 mike

o Add conditionals to allow va_list to be defined in other headers.
o Standardize on _MACHINE_STDARG_H_ to allow multiple header includes.
o Restrict the definition of va_copy() to C99 environments.


104539 05-Oct-2002 mux

Add two extern's for adjkerntz and wall_cmos_clock, all other
archs have them there, alghough the variable are declared in
subr_clock.c. These should probably be moved into some MI
place.

Approved by: jake


104505 05-Oct-2002 mike

Fix namespace issues by using visibility conditionals from
<sys/cdefs.h>.


104493 04-Oct-2002 mike

style(9) <machine/setjmp.h> headers so they look mostly the same.


104486 04-Oct-2002 sam

New bus_dma interfaces for use by crypto device drivers:

o bus_dmamap_load_mbuf
o bus_dmamap_load_uio

Test on i386. Known to compile on alpha and sparc64, but not tested.
Otherwise untried.


104304 01-Oct-2002 jake

Convert the bus space accessors from macros to inlines. This fixes some
problems with drivers that expect functions rather than function like
macros.

Reviewed by: tmm


104271 01-Oct-2002 jake

Get rid of the TODO macro in the few places that still need work; either
comment it out or change to explicit panics. It conflicts with things
like #if TODO in drivers.


104265 01-Oct-2002 jake

Add needed include of queue.h.


104075 28-Sep-2002 jake

Renamed intr_enqueue to intr_vector and intr_dequeue to intr_fast, to
better reflect how they are called.


103814 23-Sep-2002 mike

Be careful not to define GCC-specific optimizations in the non-GCC
case.


103747 21-Sep-2002 mux

Don't include opt_bus.h here, it breaks stuff trying to
include machine/bus.h.

Reviewed by: tmm


103526 18-Sep-2002 mike

Implement C99's va_copy() macro.


103436 17-Sep-2002 peter

Initiate deorbit burn for the i386-only a.out related support. Moves are
under way to move the remnants of the a.out toolchain to ports. As the
comment in src/Makefile said, this stuff is deprecated and one should not
expect this to remain beyond 4.0-REL. It has already lasted WAY beyond
that.

Notable exceptions:
gcc - I have not touched the a.out generation stuff there.
ldd/ldconfig - still have some code to interface with a.out rtld.
old as/ld/etc - I have not removed these yet, pending their move to ports.
some includes - necessary for ldd/ldconfig for now.

Tested on: i386 (extensively), alpha


103322 14-Sep-2002 tmm

Use the definitions in machine/fsr.h instead of duplicating these magic
numbers here (the values need to correspond to the %fsr ones for some
libc functions to work right).


103321 14-Sep-2002 tmm

Clean up a bit, and add some more macros to access %fsr fields.


102874 03-Sep-2002 mike

Now that _BSD_CLK_TCK_ and _BSD_CLOCKS_PER_SEC_ are the same on all
architectures, move the definition directly into <time.h> and finish
the removal of <machine/ansi.h>.


102871 02-Sep-2002 mike

Align _BSD_CLK_TCK_ and _BSD_CLOCKS_PER_SEC_ with most other
platforms. This introduces some binary incompatibilities for
dynamically linked programs which make use of clock(3) and times(3).

Approved by: jake


102600 30-Aug-2002 peter

Change hw.physmem and hw.usermem to unsigned long like they used to be
in the original hardwired sysctl implementation.

The buf size calculator still overflows an integer on machines with large
KVA (eg: ia64) where the number of pages does not fit into an int. Use
'long' there.

Change Maxmem and physmem and related variables to 'long', mostly for
completeness. Machines are not likely to overflow 'int' pages in the
near term, but then again, 640K ought to be enough for anybody. This
comes for free on 32 bit machines, so why not?


102561 29-Aug-2002 jake

Renamed poorly named setregs to exec_setregs. Moved its prototype to
imgact.h with the other exec support functions.


102555 29-Aug-2002 jake

Removed legacy signal trampoline.


102315 23-Aug-2002 mike

Move several MI types from <machine/_types.h> to <sys/_types.h>.
These types are unlikely to ever become very MD. They include:
clockid_t, ct_rune_t, fflags_t, intrmask_t, mbstate_t, off_t, pid_t,
rune_t, socklen_t, timer_t, wchar_t, and wint_t.

While moving them, make a few adjustments (submitted by bde):
o __ct_rune_t needs to be precisely `int', not necessarily __int32_t,
since the arg type of the ctype functions is int.
o __rune_t, __wchar_t and __wint_t inherit this via a typedef of
__ct_rune_t.
o Some minor wording changes in the comment blocks for ct_rune_t and
mbstate_t.

Submitted by: bde (partially)


102301 23-Aug-2002 jake

Removed unneeded include of machine/types.h (which no longer exists).


102227 21-Aug-2002 mike

o Merge <machine/ansi.h> and <machine/types.h> into a new header
called <machine/_types.h>.
o <machine/ansi.h> will continue to live so it can define MD clock
macros, which are only MD because of gratuitous differences between
architectures.
o Change all headers to make use of this. This mainly involves
changing:
#ifdef _BSD_FOO_T_
typedef _BSD_FOO_T_ foo_t;
#undef _BSD_FOO_T_
#endif
to:
#ifndef _FOO_T_DECLARED
typedef __foo_t foo_t;
#define _FOO_T_DECLARED
#endif

Concept by: bde
Reviewed by: jake, obrien


102043 18-Aug-2002 jake

Fix warning. These structs should probably be removed altogether.


102040 18-Aug-2002 jake

Add pmap support for user mappings of multiple page sizes (super pages).
This supports all hardware page sizes (8K, 64K, 512K, 4MB), but only 8k
pages are actually used as of yet.


101957 16-Aug-2002 jake

Minor style. Removed unused declaration.


101955 16-Aug-2002 jake

Demark sections of code that need special fault handling with labels.
Check if the trapped pc is inside of the demarked sections to implement
fault recovery for copyin etc, instead of pcb_onfault. Handle recovery
from data access exceptions as well as page faults.

Inspired by: bde's sys.dif


101898 15-Aug-2002 jake

Store the number of itlb and dtlb entries separately; they may be different.
Find the prom node for the boot cpu earlier and store it in the per-cpu
area, so that cache_init can be called earlier.


101653 10-Aug-2002 jake

Auto size available kernel virtual address space based on phsyical memory
size. This avoids blowing out kva in kmeminit() on large memory machines
(4 gigs or more).

Reviewed by: tmm


101479 07-Aug-2002 alc

o Introduce pmap_page_is_mapped(). Its purpose is to obsolete
the PG_MAPPED flag.


101186 01-Aug-2002 jake

Forgot to commit this.

Spotted by: scottl


101082 31-Jul-2002 jake

These file are no longer used (moved to userland and/or merged into
pmap.c).


100910 30-Jul-2002 jake

Add definitions for statistical and high-resolution profiling. The calling
conventions for _mcount and __cyg_profile_func_enter are different, so
statistical profiling kernels build and link but don't actually work.
IWBNI one could tell gcc to only generate calls to the former.

Define uintfptr_t properly for userland, but not for the kernel (I hope).


100882 29-Jul-2002 mike

Create a new header <machine/_stdint.h> for storing MD parts of
<stdint.h>. Previously, parts were defined in <machine/ansi.h> and
<machine/limits.h>. This resulted in two problems:
(1) Defining macros in <machine/ansi.h> gets in the way of that
header only defining types.
(2) Defining C99 limits in <machine/limits.h> adds pollution to
<limits.h>.


100840 29-Jul-2002 jake

Add _ALIGN_DATA and _ALIGN_TEXT macros.


100823 28-Jul-2002 mike

Revert the previous delta; uintfptr_t needs to be available to
userland for libc/gmon to compile, so the typedef in <machine/types.h>
isn't good enough. This is really ugly since we end up with the
actual value which uintfptr_t is typedef'd from, in multiple places.
This is bug for bug compatible with the other FreeBSD architectures.

Noticed by: sparc64 tinderbox


100783 28-Jul-2002 jake

Add declarations for btext and etext.


100780 27-Jul-2002 jake

uintfptr_t has moved to machine/types.h.


100771 27-Jul-2002 jake

Implement a direct mapped address region, like alpha and ia64. This
basically maps all of physical memory 1:1 to a range of virtual addresses
outside of normal kva. The advantage of doing this instead of accessing
phsyical addresses directly is that memory accesses will go through the
data cache, and will participate in the normal cache coherency algorithm
for invalidating lines in our own and in other cpus' data caches. So
we don't have to flush the cache manually or send IPIs to do so on other
cpus. Also, since the mappings never change, we don't have to flush them
from the tlb manually.
This makes pmap_copy_page and pmap_zero_page MP safe, allowing the idle
zero proc to run outside of giant.

Inspired by: ia64


100718 26-Jul-2002 jake

Remove the tlb argument to tlb_page_demap (itlb or dtlb), in order to better
match the pmap_invalidate api.


100188 16-Jul-2002 tmm

When multiple IOMMUs are present in a system, use a single TSB for all
of them, and couple them by always performing all operations on all
present IOMMUs. This is required because with the current API there
is no way to determine on which bus a busdma operation is performed.

While being there, clean up the iommu code a bit.

This should be a step in the direction of allow some of larger machines
to work; tests have shown that there still seem to be problems left.


100185 16-Jul-2002 tmm

Add new UltraSPARC-III VIS II instructions.


100181 16-Jul-2002 tmm

Add new LSU bits for UltraSPARC-III.


100180 16-Jul-2002 tmm

Add ASI definitions of UltraSPARC-III (Cu) processors, and add some
previously missing US-I and II ones.


99897 13-Jul-2002 jake

Use a fixed address for KERNBASE, so it doesn't change if the size of KVA
is increased. Its confusing for all the kernel addresses to change, and
doesn't serve much purpose as far as conserving address space.


99896 13-Jul-2002 jake

Identify UltraSPARC-III and UltraSPARC-III+ cpus.


99879 12-Jul-2002 tmm

When sending cache flushing IPIs, don't try to IPI the triggering CPU
itself; this causes undefined behaviour on UltraSPARCs. In particular,
the interrupt packet data words will not necessarily be delivered
correctly, which would result in a crash.
This bug also caused the cache-flushing work to be done twice on the
triggering CPU (when it did not cause crashes).

Reviewed by: jake


99733 10-Jul-2002 mike

Remove label_t and physadr, which seem to have never been used in
FreeBSD.

Submitted by: bde


99594 08-Jul-2002 mike

Move __offsetof() macro from <machine/ansi.h> to <sys/cdefs.h>. It's
hardly MD, since all our platforms share the same macro. It's not
really compiler dependent either, but this helps in reducing
<machine/ansi.h> to only type definitions.


99117 30-Jun-2002 mike

Since printf(3) now supports the `j' conversion specifier, use that
when printing intmax_t and uintmax_t.

Forgotten by: mike
Noticed by: bde


99026 29-Jun-2002 julian

Add files that are new for KSE.


99013 29-Jun-2002 peter

Remove a couple of __P() stragglers.


98813 25-Jun-2002 jake

pmap_kremove can no longer be used to remove the magic device mappings
installed with pmap_kenter_flags, since the physical addresses may not
have an associated vm_page. Add a function to do this.

Tested by: Tomi Vainio <Tomi.Vainio@Sun.COM>


98705 23-Jun-2002 mux

Add a missing prototype to fix a warning.


98469 20-Jun-2002 peter

Move the "- 1" into the RQB_FFS(mask) macro itself so that
implementations can provide a base zero ffs function if they wish.
This changes
#define RQB_FFS(mask) (ffs64(mask))
foo = RQB_FFS(mask) - 1;
to
#define RQB_FFS(mask) (ffs64(mask) - 1)
foo = RQB_FFS(mask);
On some platforms we can get the "- 1" for free, eg: those that use the
C code for ffs64().

Reviewed by: jake (in principle)


98350 17-Jun-2002 jake

Add constants for the min and max prom addresses. Use these instead of
magic numbers. Use stxa_sync instead of stxa; membar #Sync; to ensure
that no instruction is placed between the two. This can cause random
corruption even though interrupts are already disabled.


98033 08-Jun-2002 jake

Remove test code.


98032 08-Jun-2002 jake

Remove code from trap which is handled in userland now.


98031 08-Jun-2002 jake

Fix bizarre SMP problems. The secondary cpus sometimes start up with junk
in their tlb which the prom doesn't clear out, so we have to do so manually
before mapping the kernel page table or the cpu can hang due various
conditions which cause undefined behaviour from the tlb.


97829 04-Jun-2002 jake

Bump TSB_PAGES_SHIFT to 4. Less sucks too much.


97564 30-May-2002 dfr

Move the definition of ElfN_Hashelt to common headers. The only platform
which has a different definition for this is alpha.


97508 29-May-2002 jake

Forward declare struct trapframe.


97449 29-May-2002 jake

Add an MD page flag for tracking if a page is cacheable or not, so that
we don't flush all mappings of a physical page in order to make it
virtually cachable again, if it is already cachable.


97447 29-May-2002 jake

Merge the code in pv.c into pmap.c directly. Place all page mappings onto
the pv lists in the vm_page, even unmanaged kernel mappings. This is so
that the virtual cachability of these mappings can be tracked when a page
is mapped to more than one virtual address. All virtually cachable
mappings of a physical page must have the same virtual colour, or illegal
alises can be created in the data cache. This is a bit tricky because we
still have to recognize managed and unmanaged mappings, even though they
are all on the pv lists.


97446 29-May-2002 jake

Add pv list linkage and a pmap pointer to struct tte. Remove separately
allocated pv entries and use the linkage in the tte for pv operations.


97445 29-May-2002 jake

Use a contrived 'tlb_entry' structure for passing the mappings for the
kernel text and data from the loader to the kernel, so that the tte format
is not part of the loader->kernel ABI.


97444 29-May-2002 jake

Remove pmap.pm_pvlist and make the functions that use it no-ops. These are
all optimizations for architectures which have large sparse page tables,
and/or can't put the pv linkage inside of the page table entries.


97265 25-May-2002 jake

Convert the interrupt queue from an array to a linked list. Implement
intr_dequeue in asm so that it can easily be modified to do light weight
context switching.


97262 25-May-2002 jake

Minor style.


97261 25-May-2002 jake

Make the run queue parameters machine dependent. Optimize 64 bit
architectures by using a 64 bit word for the bit array which keeps
track of non-empty queues.

Reviewed by: peter


97031 21-May-2002 jake

Update tsb_tte_enter prototype per tsb.c rev 1.20.


97027 21-May-2002 jake

Redefine the tte accessor macros to take a pointer to a tte, instead of the
value of the tag or data field.
Add macros for getting the page shift, size and mask for the physical page
that a tte maps (which may be one of several sizes).
Use the new cache functions for invalidating single pages.


97001 20-May-2002 jake

Add SMP aware cache flushing functions, which operate on a single physical
page. These send IPIs if necessary in order to keep the caches in sync on
all cpus.


97000 20-May-2002 jake

Forward declare struct trapframe.


96998 20-May-2002 jake

De-inline the tlb demap functions. These were so big that gcc3.1 refused
to inline them anyway. ;)


96672 15-May-2002 obrien

style sync with other platforms.


96606 14-May-2002 phk

Move MI stuff out of MD param.h files.

It can all still be overridden in the MD files should need suddenly arise.


96491 13-May-2002 jake

Fix IF_SEXT(val, 32). The constants need to have type long to
handle size > 16.


96422 11-May-2002 jake

Add a support macro to convert the 5-bit packed register field of
a floating point instruction into a 6-bit register number for
double and quad arguments.
Make use of the new INSFPdq_RN macro where apporpriate; this
is required for correctly handling the "high" fp registers
(>= %f32).
Fix a number of bugs related to the handling of the high registers
which were caused by using __fpu_[gs]etreg() where __fpu_[gs]etreg64()
should be used (the former can only access the low, single-precision,
registers).

Submitted by: tmm


96317 10-May-2002 obrien

Gcc 3.1 varargs support.


96241 09-May-2002 obrien

Comment two values I was looking at for GDB.


96208 08-May-2002 jake

Remove unneeded include.


95744 29-Apr-2002 jake

Add support for an alternate signal trampoline; add a sysarch call to register
an alternate trampoling with the kernel.


95710 29-Apr-2002 peter

Tidy up some loose ends.
i386/ia64/alpha - catch up to sparc64/ppc:
- replace pmap_kernel() with refs to kernel_pmap
- change kernel_pmap pointer to (&kernel_pmap_store)
(this is a speedup since ld can set these at compile/link time)
all platforms (as suggested by jake):
- gc unused pmap_reference
- gc unused pmap_destroy
- gc unused struct pmap.pm_count
(we never used pm_count - we track address space sharing at the vmspace)


94512 12-Apr-2002 mike

Include <sys/cdefs.h> for definition of __BSD_VISIBLE.

Pointy hat to: mike


94363 10-Apr-2002 mike

Remove the hack for segsz_t from <sys/types.h>; use the normal
_BSD_FOO_T_ method for defining segsz_t.


94362 10-Apr-2002 mike

Add manifest constants: _LITTLE_ENDIAN, _BIG_ENDIAN, _PDP_ENDIAN, and
_BYTE_ORDER. These are far more useful than their non-underscored
equivalents as these can be used in restricted namespace environments.
Mark the non-underscored variants as deprecated.


94256 09-Apr-2002 jake

Oops. machine/emul.h didn't exist yet.


94254 09-Apr-2002 jake

Rename some fields in struct frame to be compatible with NetBSD/OpenBSD,
and add some compatibility defines. Add fields for ins and locals to
struct reg also for the same reason; these aren't filled in yet because
getting at those registers sucks and I'd rather not save them in the
trapframe just for this. Reorder struct reg to be ABI compatible as
well. Add needed include of machine/emul.h.

This gets pmdb (poor man's debugger) from OpenBSD mostly compiling but it
doesn't work yet :(


93949 06-Apr-2002 jake

Provide an implementation of KTR_CPU that doesn't use pcpu, so we don't
crash and burn if its not setup yet. Add timestamp, cpu, and (fake) file
and line recording to the asm version of CTR.


93854 05-Apr-2002 tmm

Add missing header for the eeprom driver frontents.


93687 02-Apr-2002 tmm

Fix crashes that would happen when more than one 4MB page was used to
hold the kernel text, data and loader metadata by not using a fixed slot
to store the TSB page(s) into. Enter fake 8k page entries into the kernel
TSB that cover the 4M kernel page(s), sot that pmap_kenter() will work
without having to treat these pages as a special case.

Problem reported by: mjacob, obrien
Problem spotted and 4M page handling proposed by: jake


93685 02-Apr-2002 tmm

Remove the superfluous second argument from the IOTSBSLOT() macro.


93684 02-Apr-2002 tmm

Lower UPA_MEMSTART to 0x1c000000000. This is required for some larger
Enterprise machines.


93607 01-Apr-2002 dillon

Stage-2 commit of the critical*() code. This re-inlines cpu_critical_enter()
and cpu_critical_exit() and moves associated critical prototypes into their
own header file, <arch>/<arch>/critical.h, which is only included by the
three MI source files that need it.

Backout and re-apply improperly comitted syntactical cleanups made to files
that were still under active development. Backout improperly comitted program
structure changes that moved localized declarations to the top of two
procedures. Partially re-apply one of the program structure changes to
move 'mask' into an intermediate block rather then in three separate
sub-blocks to make the code more readable. Re-integrate bug fixes that Jake
made to the sparc64 code.

Note: In general, developers should not gratuitously move declarations out
of sub-blocks. They are where they are for reasons of structure, grouping,
readability, compiler-localizability, and to avoid developer-introduced bugs
similar to several found in recent years in the VFS and VM code.

Reviewed by: jake


93600 01-Apr-2002 jake

Move the CTASSERT macro from MD code to systm.h alongside KASSERT so other
code can use it. This takes a single constant argument and fails to compile
if it is 0 (false). The main application of this is to make assertions about
structure sizes at compile time, in order to validate assumptions made in
other code. Examples:

CTASSERT(sizeof(struct foo) == FOO_SIZEOF);
CTASSERT(sizeof(struct foo) == (1 << FOO_SHIFT));

Requested by: jhb, phk


93264 27-Mar-2002 dillon

Compromise for critical*()/cpu_critical*() recommit. Cleanup the interrupt
disablement assumptions in kern_fork.c by adding another API call,
cpu_critical_fork_exit(). Cleanup the td_savecrit field by moving it
from MI to MD. Temporarily move cpu_critical*() from <arch>/include/cpufunc.h
to <arch>/<arch>/critical.c (stage-2 will clean this up).

Implement interrupt deferral for i386 that allows interrupts to remain
enabled inside critical sections. This also fixes an IPI interlock bug,
and requires uses of icu_lock to be enclosed in a true interrupt disablement.

This is the stage-1 commit. Stage-2 will occur after stage-1 has stabilized,
and will move cpu_critical*() into its own header file(s) + other things.
This commit may break non-i386 architectures in trivial ways. This should
be temporary.

Reviewed by: core
Approved by: core


93120 25-Mar-2002 tmm

Add missing declarations.


93092 24-Mar-2002 obrien

Guard against redefining __gnuc_va_list.


93070 24-Mar-2002 tmm

Revamp the busdma implementation a bit:
- change the IOMMU support code so that it supports overcommittting the
available DVMA memory, while still allocating as lazily as possible.
This is achieved by limiting the preallocation, and deferring the
allocation to map load time when it fails. In the latter case, the
DVMA memory reserved for unloaded maps can be stolen to free up enough
memory for loading a map.
- allow NULL settings in the method tables, and search the parent tags
until an appropriate implementation is found. This allows to remove some
kluges in the old implementation.


93067 24-Mar-2002 tmm

Make the OpenFirmware interrupt mapping code more generic, to reduce
the bus-dependent code and to be able to support more systems. The core
of the new code is mostly obtained from NetBSD.
Kluge the interrupt routing methods of the psycho and apb drivers so
that an intline of 0 can be handled for now; real routing is still not
possible (all intline registers are preinitialized instead); this will
require a sparc64-specific adaption of the driver for generic PCI-PCI
bridges with a custom routing method to work right.


93053 23-Mar-2002 tmm

Add code to print the fault virtual address for uncorrectable DMA errors
caused by IOMMU misses to aid debugging. This will only work on
UltraSPARC-IIi and IIe.


93052 23-Mar-2002 tmm

De-__P(), de-K&R, remove superfluous comments and prototypes, some
style fixes. No functional changes.


93002 23-Mar-2002 jake

Fix a deadlock condition with tlb shootdown ipi delivery. Since ipis are
not blocked by raising the pil, a reciever may be interrupted while holding
a spinlock. If the sender does not defer interrupts throughout the entire
operation it may be interrupted and try to acquire a spinlock held by a
reciever, leading to a deadlock due to the synchronization used by the
ipi handlers themselves.

Submitted by: tmm


92998 23-Mar-2002 obrien

ASM versions of __FBSDID.


92861 21-Mar-2002 imp

intr_disable returns register_t


92850 21-Mar-2002 jeff

Remove references to vm_zone.h and switch over to the new uma API.

Reviewed by: jake


92844 21-Mar-2002 alfred

Remove __P.

profile.h and bus.h were excluded because there is currently WIP.

Reviewed by: tmm


92654 19-Mar-2002 jeff

This is the first part of the new kernel memory allocator. This replaces
malloc(9) and vm_zone with a slab like allocator.

Reviewed by: arch@


92383 16-Mar-2002 des

Move the definition of PT_[GS]ET{,DB,FP}REGS from the MD ptrace.h to the
MI ptrace.h, since all platforms define them. Keep the MD ptrace.h around
for FIX_SSTEP (which is currently only needed on Alpha).


92213 13-Mar-2002 jake

Fix ifdef LOCORE protection.


92205 13-Mar-2002 jake

Add support for starting and stopping cpus with ipis.
Stop the other cpus when shutting down or entering the debugger.

Submitted by: tmm


92202 13-Mar-2002 jake

Add support for driving the clocks on secondary cpus.

Submitted by: tmm


92199 13-Mar-2002 jake

Make IPI_WAIT use a bit mask of the cpus that a pmap is active on and only
wait for those cpus, instead of all of them by using a count. Oops.
Make the pointer to the mask that the primary cpu spins on volatile, so
gcc doesn't optimize out an important load. Oops again.
Activate tlb shootdown ipi synchronization now that it works. We have
all involved cpus wait until all the others are done. This may not be
necessary, it is mostly for sanity.
Make the trigger level interrupt ipi handler work.

Submitted by: tmm


92198 13-Mar-2002 jake

Add an ATOMIC_CLEAR_INT macro.

Submitted by: tmm


92051 11-Mar-2002 tmm

Fix the type of some constants, and make some macros safer by casting
the argument.


92050 11-Mar-2002 tmm

Add convenience macros to extract the cc0 and cc1 from format 2 and 3
instructions.


91974 09-Mar-2002 jake

Increase VM_KMEM_SIZE to 16 megs from 12. Define VM_KMEM_SIZE_SCALE so that
the number of physical pages per KVA page allocated scales properly with
memory size. This fixes problems with kmem_map being too small.

Noticed by: mike, wollman
Submitted by: tmm


91959 09-Mar-2002 mike

o Don't require long long support in bswap64() functions.
o In i386's <machine/endian.h>, macros have some advantages over
inlines, so change some inlines to macros.
o In i386's <machine/endian.h>, ungarbage collect word_swap_int()
(previously __uint16_swap_uint32), it has some uses on i386's with
PDP endianness.

Submitted by: bde

o Move a comment up in <machine/endian.h> that was accidentially moved
down a few revisions ago.
o Reenable userland's use of optimized inline-asm versions of
byteorder(3) functions.
o Fix ordering of prototypes vs. redefinition of byteorder(3)
functions, so that the non-GCC (libc asm) case has proper
prototypes.
o Add proper prototypes for byteorder(3) functions in <sys/param.h>.
o Prevent redundant duplicate prototypes by making use of the
_BYTEORDER_PROTOTYPED define.
o Move the bswap16(), bswap32(), bswap64() C functions into MD space
for platforms in which asm versions don't exist. This significantly
reduces the complexity of some things at the cost of duplicate code.

Reviewed by: bde


91783 07-Mar-2002 jake

Implement delivery of tlb shootdown ipis. This is currently more fine grained
than the other implementations; we have complete control over the tlb, so we
only demap specific pages. We take advantage of the ranged tlb flush api
to send one ipi for a range of pages, and due to the pm_active optimization
we rarely send ipis for demaps from user pmaps.

Remove now unused routines to load the tlb; this is only done once outside
of the tlb fault handlers.
Minor cleanups to the smp startup code.

This boots multi user with both cpus active on a dual ultra 60 and on a
dual ultra 2.


91782 07-Mar-2002 jake

Modify the tlb demap API to take a pmap instead of a tlb context number.
Due to allocating tlb contexts on the fly, we only ever need to demap the
primary context, non-primary contexts have already been implicitly flushed
by context switching. All we really need to tell is if its a kernel demap
or not, and its easier just to compare against the kernel_pmap which is a
constant.


91617 04-Mar-2002 jake

Add support for starting secondary cpus in kernel, as opposed to relying
on the loader to do it. Improve smp startup code to be less racy and to
defer certain things until the right time. This almost boots single user
on my dual ultra 60, it is still very fragile:

SMP: AP CPU #1 Launched!
Enter full pathname of shell or RETURN for /bin/sh:
# ls
Debugger("trapsig")
Stopped at Debugger+0x1c: ta %xcc, 1
db> heh
No such command
db>


91616 04-Mar-2002 jake

Dig the information about which tlb slots were used to map the kernel out
of the metadata passed by the loader.


91613 04-Mar-2002 jake

Allocate tlb contexts on the fly in cpu_switch, instead of statically 1 to 1
with pmaps. When the context numbers wrap around we flush all user mappings
from the tlb. This makes use of the array indexed by cpuid to allow a pmap
to have a different context number on a different cpu. If the context numbers
are then divided evenly among cpus such that none are shared, we can avoid
sending tlb shootdown ipis in an smp system for non-shared pmaps. This also
removes a limit of 8192 processes (pmaps) that could be active at any given
time due to running out of tlb contexts.

Inspired by: the brown book
Crucial bugfix from: tmm


91394 27-Feb-2002 tmm

Add the following functions/macros to support byte order conversions and
device drivers for bus system with other endinesses than the CPU (using
interfaces compatible to NetBSD):

- bwap16() and bswap32(). These have optimized implementations on some
architectures; for those that don't, there exist generic implementations.
- macros to convert from a certain byte order to host byte order and vice
versa, using a naming scheme like le16toh(), htole16().
These are implemented using the bswap functions.
- stream bus space access functions, which do not perform a byte order
conversion (while the normal access functions would if the bus endianess
differs from the CPU endianess).

htons(), htonl(), ntohs() and ntohl() are implemented using the new
functions above for kernel usage. None of the above interfaces is currently
exported to user land.

Make use of the new functions in a few places where local implementations
of the same functionality existed.

Reviewed by: mike, bde
Tested on alpha by: mike


91361 27-Feb-2002 jake

Minimal testing has shown that a 4 page tsb is a nice sweet spot for current
work loads. It tapers off after that as gcc's working set generally just fits.

compiling bin/csh:

TSB_PAGES = 2
213.33 real 77.59 user 110.01 sys
TSB_PAGES = 4
116.43 real 75.78 user 19.16 sys
TSB_PAGES = 8
119.27 real 76.38 user 18.12 sys

Testing by: tmm


91360 27-Feb-2002 jake

Parameterize the number of pages to allocate for the per-cpu area on
PCPU_PAGES.


91359 27-Feb-2002 jake

Make cpu_identify take the value of the ver register and cpuid as arguments
so we can print nice things about non-current cpus.


91338 27-Feb-2002 jake

Wrap long lines.


91336 27-Feb-2002 jake

Add a macro for shift of an integer (1 << shift == sizeof). Move the pointer
define to live alongside it. For kicks assert at compile time that they are
correct. Use these instead of magic numbers.


91331 26-Feb-2002 obrien

Define basic macros required by GDB.


91288 26-Feb-2002 jake

Convert pmap.pm_context to an array of contexts indexed by cpuid. This
doesn't make sense for SMP right now, but it is a means to an end.


91274 26-Feb-2002 jake

Allow the user tsb to span multiple pages. Make the default 2 pages for now
until we do some testing to see what's best. This gives a massive reduction
in system time for processes with a relatively large working set. The size
of the tsb directly affects the rss size that a user process can keep mapped.
When it starts to get full replacements occur and the process takes a lot of
soft vm faults. Increasing the default from 1 page to 2 gives the following
before and after numbers for compiling vfs_bio.c:

before:
14.27 real 6.56 user 5.69 sys
after:
8.57 real 6.11 user 1.62 sys

This should make self hosted builds more tolerable.


91257 25-Feb-2002 jake

Remove code to lock the user tsb into the tlb. We can handle faults on it
now, as we do for normal wired kernel memory.


91246 25-Feb-2002 jake

Implement a nested window state. This avoids attempting to spill a user
window to the user stack while in a nested kernel trap. We do this for
entry to the kernel from user mode, but if we get an interrupt in kernel
mode while there are still user windows in the cpu, and we attempt to spill
to the user stack, we may take too many nested traps and overflow the trap
stack, causing a red state exception. This is needed by upcoming changes
to allow the user tsb to not be locked in the tlb.

Reviewed by: tmm


91224 25-Feb-2002 jake

Modify the tte format to not include the tlb context number and to store the
virtual page number in a much more convenient way; all in one piece. This
greatly simplifies the comparison for a matching tte, and allows the fault
handlers to be much simpler due to not having to load wierd masks.
Rewrite the tlb fault handlers to account for the new format. These are also
written to allow faults on the user tsb inside of the fault handlers; the
kernel fault handler must be aware of this and not clobber the other's
registers. The faults do not yet occur due to other support that is needed
(and still under my desk).

Bug fixes from: tmm


91172 23-Feb-2002 jake

Add inlines for demapping a range of pages from the itlb and dtlb. This
will be used to reduce the number of tlb shootdown ipis in an smp system
by sending one ipi for a whole range of pages, instead of one per page.
Munge the context demap operations slightly to support demapping a non-primary
context.


91170 23-Feb-2002 jake

Use intr_disable/intr_restore instead of TLB_ATOMIC_START/END.

Submitted by: tmm


91168 23-Feb-2002 jake

Adapt the tsb_foreach interface to take a source and a destination pmap so
that it can be used for pmap_copy. Other consumers ignore the second pmap.
Add statistics gathering for tsb_foreach.
Implement pmap_copy.


91163 23-Feb-2002 jake

Add macros to extract the UPA module id from the UPA config register.
This is the hardware cpuid.


91157 23-Feb-2002 jake

Include intr_machdep.h only for !LOCORE.


91147 23-Feb-2002 jake

Add metadata types for dtlb and itlb data, and number of slots used.


90868 18-Feb-2002 mike

o Move NTOHL() and associated macros into <sys/param.h>. These are
deprecated in favor of the POSIX-defined lowercase variants.
o Change all occurrences of NTOHL() and associated marcros in the
source tree to use the lowercase function variants.
o Add missing license bits to sparc64's <machine/endian.h>.
Approved by: jake
o Clean up <machine/endian.h> files.
o Remove unused __uint16_swap_uint32() from i386's <machine/endian.h>.
o Remove prototypes for non-existent bswapXX() functions.
o Include <machine/endian.h> in <arpa/inet.h> to define the
POSIX-required ntohl() family of functions.
o Do similar things to expose the ntohl() family in libstand, <netinet/in.h>,
and <sys/param.h>.
o Prepend underscores to the ntohl() family to help deal with
complexities associated with having MD (asm and inline) versions, and
having to prevent exposure of these functions in other headers that
happen to make use of endian-specific defines.
o Create weak aliases to the canonical function name to help deal with
third-party software forgetting to include an appropriate header.
o Remove some now unneeded pollution from <sys/types.h>.
o Add missing <arpa/inet.h> includes in userland.

Tested on: alpha, i386
Reviewed by: bde, jake, tmm


90711 15-Feb-2002 wollman

Resurrect one of the easiest changes from my big include files roll-up
patch from a year ago: give file flags their own type. This does not
(yet) change the type used by system calls or library functions.
The underlying type was chosen to match what is returned by stat().


90705 15-Feb-2002 tmm

Add a delta missed in the last iommu.c commit. This unbreaks the sparc64
kernel build.

Pointy hat to: tmm


90624 13-Feb-2002 tmm

Avoid crashing in early boot when WITNESS is enabled by moving the
mtx_init() for intr_table_lock after the globaldata pointer
initialization.


90620 13-Feb-2002 tmm

Use stxa_sync() when accessing the diagnostic registers to invalidate
caches; this is needed to avoid undefined behaviour.
Clean up a bit.


90619 13-Feb-2002 tmm

Add support for the counter-timer which is included in the Sun U2S and
U2P bridges as a time counter.


90616 13-Feb-2002 tmm

Merge r1.42 of iommu.c and r1.9 of iommuvar.h from NetBSD (this adds
support for managing both streaming caches on psycho pairs).
Use explicit bus space accesses instead of mapping the device memory into
kva.
Move DVMA allocation to the map creation/dma memory allocation functions.


90615 13-Feb-2002 tmm

Clean up bus space debugging support; change sparc64_bus_mem_map() to
take a bus tag and handle as argument instead of a i/o space id and a
physical address, now that nexus handles device memory resource
allocations.


90614 13-Feb-2002 tmm

Define constants for the CPU implementation id; export the dectected id
as cpu_impl.


90611 13-Feb-2002 tmm

Add a few new functions/macros: intr_disable() and intr_restore() to
disable interrupts completely, and stxa_sync(), which performs a store
immediately followed by a membar #Sync with interrupts disabled (this
is needed for writes to diagnostic registers).


90473 10-Feb-2002 obrien

Add this FreeBSD standard header.


89425 16-Jan-2002 jake

Add extern to avoid sloppy common style declarations.

Tripped over by: jhb, mux@sneakerz.org


89080 08-Jan-2002 tmm

Add upa.h, which I had previously forgotten, to unbreak the sparc64
kernel build.

Pointy hat to: tmm


89051 08-Jan-2002 jake

Add initial smp support. This gets as far as allowing the secondary
cpu(s) into the kernel, and sync-ing them up to "kernel" mode so we can
send them ipis, which also work.

Thanks to John Baldwin for providing me with access to the hardware
that made this possible.

Parts obtained from: bsd/os


89042 08-Jan-2002 jake

Add a macro for getting the tlbs (itlb and/or dtlb) which the given
tte may be mapped by.


89041 08-Jan-2002 jake

Prototype pmap_map_tsb().


89040 08-Jan-2002 jake

Remove PANIC_STACK_PAGES which is no longer used.
Redefine the compile time assertion macro to take one parameter.


89039 08-Jan-2002 jake

Add declarations needed by last commit.


89035 08-Jan-2002 jake

Add a md field to pcpu for the upa module id.
Remove the alt_stack field.
Use the defines for the register variables declared in C, so that they
don't get out of sync with the assembler.


89034 08-Jan-2002 jake

Define CKLF_PC in terms of TRAPF_PC.


89033 08-Jan-2002 jake

Add a mov() macro, which is used in conjunction with the register defines
for setting reserved global registers from c.


89032 08-Jan-2002 jake

Update comments and defines to reflect that normal and alternate g6 point
to the current pcb.
Remove interrupt global defines; they use PCPU_REG now.
Move ATOMIC_INC_INT here from exception.s, add ATOMIC_DEC_INT.
Add a KASSERT macro for use in assembler.


89031 08-Jan-2002 jake

Add asis for the upa config reg, which contains the hardware cpu id, and
for the interrupt send register, which is used for dispatching ipis.


88826 02-Jan-2002 tmm

1. Implement an optimization for pmap_remove() and pmap_protect(): if a
substantial fraction of the number of entries of tte's in the tsb
would need to be looked up, traverse the tsb instead. This is crucial
in some places, e.g. when swapping out a process, where a certain
pmap_remove() call would take very long time to complete without this.
2. Implement pmap_qenter_flags(), which will become used later
3. Reactivate the instruction cache flush done when mapping as executable.
This is required e.g. when executing files via NFS, but is known to
cause problems on UltraSPARC-IIe CPU's. If you have such a CPU, you
will need to comment this call out for now.

Submitted by: jake (3)


88823 02-Jan-2002 tmm

Correct the defintion of struct ofw_upa_regs, and use it instead of
struct ofw_nexus_reg. Implement UPA device memory management in the
nexus driver.
Adapt the psycho driver to these changes, and do some minor cleanup work
while being there.


88790 01-Jan-2002 jake

Define __ASM__ so that libc will know not to define C things.


88789 01-Jan-2002 jake

Add a define for the fp restore soft trap type.
Only declare C things if __ASM__ is not defined.


88783 01-Jan-2002 jake

Implement sysarch(SPARC_UTRAP_INSTALL).

Forgot this file in last commit.


88782 01-Jan-2002 jake

Implement user trap delivery as specified by the sparc abi. This provides
an efficient way for the kernel to bounce certain mundane traps back to
userland for handling there. A user trap handler returns directly to the
trapping user code, rather than going through the kernel again. Only a
handful of instructions are actually executed in kernel mode.
Implement sysarch(SPARC_UTRAP_INSTALL).
Add code to handle sharing of the user trap table across forks and unsharing
at exec.

This can be used to implement efficient tracking of floating point register
usage in userland, fe by a thread library, and to handle alignment fault
fixups and instruction emulation in userland, for which the code may need
to be different for 32bit and 64bit binaries.


88781 01-Jan-2002 jake

Add a panic stack, which is used as a known good stack when there is
something wrong with the kernel stack.
Add code to check the kernel stack pointer in various important places
and try hard not to go down in flames if its wrong.


88699 30-Dec-2001 tmm

Add bus_common.h, which contains some definiton that apply to both PCI
and SBus.

Obtained from: NetBSD


88664 29-Dec-2001 jake

Add a header for user trap types required by the sparc abi.
For simplicity the corresponding kernel types use the same
numerical values.


88663 29-Dec-2001 jake

Adapt for used by upcoming fp emulation code.
Comment.

Submitted by: tmm


88655 29-Dec-2001 jake

Prototype dcache_inval_phys.

Submitted by: tmm


88653 29-Dec-2001 jake

Add comments as to why VM_MAXUSER_ADDRESS is magic (abitrary).
Define the KVA_RANGE in terms of ttes, not sttes.
Remove UPT_MIN_ADDRESS. We no longer use a hard coded address for
the user tsb.


88652 29-Dec-2001 jake

Make tte bit constants explicitly unsigned and long.
Add a wierd soft bit.
Remove struct stte.


88651 29-Dec-2001 jake

Add definitions for dcache color bits, which may move to cache.h.
Add fields to md_page for tracking virtual page color, and pv entry
lists.
Fix pmap_track_modified to work for non-kernel pmaps. This is due to
kernel virtual addresses potentially overlapping with userland addresses.


88650 29-Dec-2001 jake

Implement pv entries as separate structures from the ttes, like other
architectures do.


88649 29-Dec-2001 jake

Remove support for multi level tsbs, making this code much simpler and
much less magic, fragile, broken. Use ttes rather than sttes.
We still use the replacement scheme used by the original code, which
is pretty cool.

Many crucial bug fixes from: tmm


88632 29-Dec-2001 jake

Rename definitions to better match the hardware wstate fields.
Don't include WSTATE_TRANSITION in WSTATE_NORMAL_MASK.


88631 29-Dec-2001 jake

Add definitions for TSTATE_MM_* and TSTATE_{I,X}CC_*.
Implement TSTATE_SECURE in terms of PSTATE_SECURE.


88630 29-Dec-2001 jake

Rename and renumber trap types to comply with the user trap types as
specified by the sparc abi. We use numerically higher values for all
internal kernel types.
Remove soft trap types which need to be exposed to userland. They will
move to utrap.h.


88629 29-Dec-2001 jake

1. Certain tlb operations need to be atomic, so disable interrupts for
their duration. This is still only effective as long as they are
only used in the static kernel. Code in modules may cause instruction
faults which makes these break in different ways anyway.
2. Add a load bearing membar #Sync.
3. Add an inline for demapping an entire context.

Submitted by: tmm (1, 2)


88628 29-Dec-2001 jake

jmpbuf is no longer a ucontext_t since it does not need to be passed
to sigreturn.
Add definitions for array offsets.


88627 29-Dec-2001 jake

Add fprs to struct fpreg.


88626 29-Dec-2001 jake

Define PSTATE_MM_MASK in terms of PSTATE_MM_SIZE.
Implement PSTATE_SECURE.


88625 29-Dec-2001 jake

Remove pcb_y. It has moved to trapframe.


88624 29-Dec-2001 jake

Don't concatentate __func__.
Make page size constants explicitly long and unsigned.
Add a macro for compile time assertions.


88623 29-Dec-2001 jake

Make it clear that IH_SHIFT is expected to be that of a pointer.
Make intr_handlers an array of function pointers instead of
small structures.


88622 29-Dec-2001 jake

Add definitions for magic numbers used in asm.
Bloat trapframe with many extra fields so we don't need extra structures.
Use small data types where possible.
Remove second copy of TF_DONE.
Remove mmuframe.


88621 29-Dec-2001 jake

Change fpblock to be an array of ints instead longs.
Change fp_init_thread to take a thread instead of a pcb so we
can get at the trapframe.


88620 29-Dec-2001 jake

Add "memory" to the clobber list for membars.

Submitted by: tmm


88619 29-Dec-2001 jake

Implement CLKF_USERMODE so that user time is accounted properly.

Submitted by: tmm


88618 29-Dec-2001 jake

Add definitions for the number of bits in the icc and xcc fields
of the ccr, as well as the shifts and masks for each.

Submitted by: tmm


88617 29-Dec-2001 jake

Use ASI_P instead of ASI_N if _KERNEL isn't defined so that these
can be used in userland.

Submitted by: tmm


88616 29-Dec-2001 jake

Add macros for dedicated register variables, for use in assmebler.
Add a PUTS macro.


88436 23-Dec-2001 jake

- Add a file for machine dependant loader metdata types. Include this in
machdep.c.
- Adapt to critical_* changes.


88088 18-Dec-2001 jhb

Modify the critical section API as follows:
- The MD functions critical_enter/exit are renamed to start with a cpu_
prefix.
- MI wrapper functions critical_enter/exit maintain a per-thread nesting
count and a per-thread critical section saved state set when entering
a critical section while at nesting level 0 and restored when exiting
to nesting level 0. This moves the saved state out of spin mutexes so
that interlocking spin mutexes works properly.
- Most low-level MD code that used critical_enter/exit now use
cpu_critical_enter/exit. MI code such as device drivers and spin
mutexes use the MI wrappers. Note that since the MI wrappers store
the state in the current thread, they do not have any return values or
arguments.
- mtx_intr_enable() is replaced with a constant CRITICAL_FORK which is
assigned to curthread->td_savecrit during fork_exit().

Tested on: i386, alpha


87702 11-Dec-2001 jhb

Overhaul the per-CPU support a bit:

- The MI portions of struct globaldata have been consolidated into a MI
struct pcpu. The MD per-CPU data are specified via a macro defined in
machine/pcpu.h. A macro was chosen over a struct mdpcpu so that the
interface would be cleaner (PCPU_GET(my_md_field) vs.
PCPU_GET(md.md_my_md_field)).
- All references to globaldata are changed to pcpu instead. In a UP kernel,
this data was stored as global variables which is where the original name
came from. In an SMP world this data is per-CPU and ideally private to each
CPU outside of the context of debuggers. This also included combining
machine/globaldata.h and machine/globals.h into machine/pcpu.h.
- The pointer to the thread using the FPU on i386 was renamed from
npxthread to fpcurthread to be identical with other architectures.
- Make the show pcpu ddb command MI with a MD callout to display MD
fields.
- The globaldata_register() function was renamed to pcpu_init() and now
init's MI fields of a struct pcpu in addition to registering it with
the internal array and list.
- A pcpu_destroy() function was added to remove a struct pcpu from the
internal array and list.

Tested on: alpha, i386
Reviewed by: peter, jake


87572 09-Dec-2001 obrien

style(9)


87158 01-Dec-2001 mike

o Stop abusing MD headers with non-MD types.
o Hide nonstandard functions and types in <netinet/in.h> when
_POSIX_SOURCE is defined.
o Add some missing types (required by POSIX.1-200x) to <netinet/in.h>.
o Restore vendor ID from Rev 1.1 in <netinet/in.h> and make use of new
__FBSDID() macro.
o Fix some miscellaneous issues in <arpa/inet.h>.
o Correct final argument for the inet_ntop() function (POSIX.1-200x).
o Get rid of the namespace pollution from <sys/types.h> in
<arpa/inet.h>.

Reviewed by: fenner
Partially submitted by: bde


86558 18-Nov-2001 tmm

Add a structure defintion for the id prom contents.

Obtained from: NetBSD


86556 18-Nov-2001 jake

Standardize idempotentcy ifdefs.


86551 18-Nov-2001 jake

Add kernel headers needed to build libc. Some are bogus and/or just enough
to compile.

Mostly obtained from: netbsd


86531 18-Nov-2001 jake

Make jmpbuf the same size as ucontext_t so that it can be passed
to sigreturn.

Obtained from: alpha


86530 18-Nov-2001 jake

1. Split fp.h into fp.h and fsr.h so that the latter can be included
in asm files.
2. Temporarily cause subnormal operands in floating point operations
to be treated as zeros so that comlpetion of the operation does not
need to be emulated.
3. Catch fp_exception_other and correctly skip over the unfinished
instruction, but basically ignore them. Emulating the instruction
is not yet supported.
4. Zero td_retval[1] as well in syscall().

Submitted by: tmm (2, 3)


86525 18-Nov-2001 jake

1. Remove kdbframe. Bad idea.
2. Add a TF_DONE macro, which fiddles a trapframe to make the retry on
return from traps act like a done (advance past the trapping
instruction instead of re-executing).
3. Flush the windows before entering the debugger, since it is no
longer done in the breakpoint trap vector.
4. Print a warning if trace <pid> is attempted, it is not yet implemented.
5. Print traps better and decode system calls in traces.

Submitted by: rwatson (4)


86524 18-Nov-2001 jake

Implement SET. Set execption.s 1.12.

Submitted by: tmm


86523 18-Nov-2001 jake

1. Convert the tstate saved in the pcb to a pstate and test for PSTATE_PEF
to determine if a process is using floating point. in order to avoid
sign extending a 13 bit immediate.
2. We don't need to context switch cwp anymore, it is better to just
fiddle the save tstate on return from traps. See exception.s 1.10
and 1.12.
3. Completely remove pcb_cwp.
4. Implement vmapbuf, vunmapbuf and vm_fault_quick. Completely remove
TODOs from vm_machdep.c (yay!).

Submitted by: tmm (1, 3, 4)
Obtained from: existing archs (4)


86521 18-Nov-2001 jake

1. Remove bootinfo and just pass loader metadata to the kernel.
2. Remove mcontext.mc_sp, it is redundant. Adjust spare space to make
ucontext_t a nice size.
3. Raise pil in the debugger.

Submitted by: tmm (3)


86520 18-Nov-2001 jake

1. Implement ascopyto() and ascopyfrom() for copying to an alternate address
space from kernel space and from an alternate address space to kernel
space.
2. Remove the unused and unprototyped physcopy() and physzero() and replace
with the more versatile ascopy() and aszero(), inspired by the above.
These can be used to copy and zero physical pages of memory without mapping
them into kernel space first.
3. Use magic numbers for the offsets in the jmpbuf structure like other
platforms.
4. Use SET.

Submitted by: tmm (1, 4)


86230 09-Nov-2001 tmm

Support for the UltraSpac DVMA MMU (IOMMU), ported from NetBSD.


86229 09-Nov-2001 tmm

Add some OpenFirmware bus support code and definitions.


86228 09-Nov-2001 tmm

Add bus_space and busdma support for sparc64.


86227 09-Nov-2001 tmm

Add a nexus device for sparc64, which uses the OpenFirmware to attach UPA
devices (mostly host bridges) and handles interrupt allocation and setup.


86226 09-Nov-2001 tmm

Header file updates needed for the cache code: add/correct some ASI
definitions and add PAGE_*_MIN and -_MAX macros.


86221 09-Nov-2001 tmm

Add cache handling code for sparc64.


86146 06-Nov-2001 tmm

Add code to emulate unimplemented (non-fp) instructions and to fixup
unaligned accesses, and instr.h, which contrains definitions for the
sparc64 instruction set (partly from NetBSD).
Make use of some definitions from instr.h in db_disasm.c.


86144 06-Nov-2001 tmm

Add optimized implementations of in_cksum_skip() and related functions
for sparc64.


85892 02-Nov-2001 mike

o Add new header <sys/stdint.h>.
o Make <stdint.h> a symbolic link to <sys/stdint.h>.
o Move most of <sys/inttypes.h> into <sys/stdint.h>, as per C99.
o Remove <sys/inttypes.h>.
o Adjust includes in sys/types.h and boot/efi/include/ia64/efibind.h
to reflect new location of integer types in <sys/stdint.h>.
o Remove previously symbolicly linked <inttypes.h>, instead create a
new file.
o Add MD headers <machine/_inttypes.h> from NetBSD.
o Include <sys/stdint.h> in <inttypes.h>, as required by C99; and
include <machine/_inttypes.h> in <inttypes.h>, to fill in the
remaining requirements for <inttypes.h>.
o Add additional integer types in <machine/ansi.h> and
<machine/limits.h> which are included via <sys/stdint.h>.

Partially obtain from: NetBSD
Tested on: alpha, i386
Discussed on: freebsd-standards@bostonradio.org
Reviewed by: bde, fenner, obrien, wollman


85586 27-Oct-2001 jake

Implement elf_reloc. This makes klds work.

Obtained from: netbsd


85335 23-Oct-2001 mike

Remove funky right justification.

Pointed out by: bde


85294 21-Oct-2001 des

[partially forced commit due to pilot error in earlier commit attempt]

{set,fill}_{,fp,db}regs() fixup:

- Add dummy {set,fill}_dbregs() on architectures that don't have them.

- KSEfy the powerpc versions (struct proc -> struct thread).

- Some architectures had the prototypes in md_var.h, some in reg.h, and
some in both; for consistency, move them to reg.h on all platforms.

These functions aren't really MD (the implementation is MD, but the interface
is MI), so they should move to an MI header, but I haven't figured out which
one yet.

Run-tested on i386, build-tested on Alpha, untested on other platforms.


85256 20-Oct-2001 jake

Fix get_cyclecount. Wrap in ifdef _KERNEL.


85245 20-Oct-2001 jake

Add a definition for normal kernel window state.


85241 20-Oct-2001 jake

Parameterize the size of the kernel virtual address space on KVA_PAGES.
Don't use a hard coded address constant for the virtual address of the
kernel tsb. Allocate kernel virtual address space for the kernel tsb
at runtime.
Remove unused parameter to pmap_bootstrap.
Adapt pmap.c to use KVA_PAGES.
Map the message buffer too.
Add some traces.
Implement pmap_protect.


85236 20-Oct-2001 jake

Add support for physical address hardware watchpoints.


85235 20-Oct-2001 jake

Change the stray count in struct intr_vector to a vector number that can
be used to index tables of counters.
Remove intr_dispatch() inline, it is implemented directly in tl*_intr now.
Count stray interrupts in a table of counters like intrcnt.
Disable interrupts briefly when setting up the interrupt vector table.
We must disable interrupts completely, not just raise the pil.
Pass pointers to the intr_vector structures rather than a vector number
to sched_ithd and intr_stray.


85234 20-Oct-2001 jake

Remove traces that are loud and not that useful. Remove nested include
of ktr.h.


85233 20-Oct-2001 jake

Remove an unused macro arg.


85232 20-Oct-2001 jake

Include a whole interrupt queue in struct globaldata instead of just a
pointer. Minor style.


85231 20-Oct-2001 jake

Add fields for boothowto and the kernel environment to boothowto.


85187 19-Oct-2001 obrien

Try two on the preprocessing logic.

Reviewed by: ru


85183 19-Oct-2001 obrien

Blah, fix braino where ru had to remind me of proper preprocessor syntax.
Bad fingers, no cookie.


85108 18-Oct-2001 obrien

My attempts at minimizing the number of #def's got me in trouble.


85085 18-Oct-2001 obrien

Add support for "__gnuc_va_list". Some overly "smart" libraries assume
the existence of the __gnuc_va_list type[*] because our compiler is GCC.

[*] __gnuc_va_list is defined in the GCC ginclude/stdarg.h replacement
headerwhich we don't use.


84849 12-Oct-2001 tmm

Add inthand_add() and inthand_remove() for use by the MD bus code and
some glue code.


84847 12-Oct-2001 tmm

Save the floating point context to the right pcb in cpu_fork(), and add
an empty stub for is_physical_memory().


84846 12-Oct-2001 tmm

Make the NTOHL, NTOHS, HTONL and HTONS macros (which are nops on
sparc64) empty to avoid compiler warnings.


84844 12-Oct-2001 tmm

Add pmap_kenter_flags(), which is used by MD bus code that will be
committed soon, add a stub form pmap_kenter_temporary(), and implement
pmap_extract() and pmap_kextract().


84783 10-Oct-2001 ps

Make MAXTSIZ, DFLDSIZ, MAXDSIZ, DFLSSIZ, MAXSSIZ, SGROWSIZ loader
tunable.

Reviewed by: peter
MFC after: 2 weeks


84194 30-Sep-2001 jake

Add contents to struct *reg.


84192 30-Sep-2001 jake

Use %ver to identify the cpu instead of openfirmware.

Submitted by: robert


84188 30-Sep-2001 jake

Add a place holder for PSTATE_SECURE, which detects if user code it
trying to set bad pstate bits.


84186 30-Sep-2001 jake

Split the low level trap code into trap, interrupt and syscall, its
easier and hopefully this code is done changing radically.

Don't use the mmu tlb register to address the kernel page table, nor
the 8k pointer register. The hardware will do some of the page table
lookup by storing the the base address in an internal register and
calculating the address of the tte in the table. However it is limited
to a 1 meg tsb, which only maps 512 megs. The kernel page table only
has one level, so its easy to just do it by hand, which has the advantage
of supporting abitrary amounts of kvm and only costs a few more instructions.

Increase kvm to 1 gig now that its easy to do so and so we don't waste
most of a 4 meg page.

Fix some traces. Fix more proc locking.

Call tsb_stte_promote if we get a soft fault on a mapping in the upper
levels of the tsb. If there is an invalid or unreferenced mapping
in the primary tsb, it will be replaced.

Immediately fail for faults occuring in {f,s}uswintr.


84183 30-Sep-2001 jake

Move the kernel to end of the first 4 gigabytes of address space, so that
one 4 meg page can map both the kernel and the openfirmware mappings.
Add the openfirmware mappings to the kernel tsb so we can call the firmware
on the kernel trap table and access kernel memory normally.
Implement pmap_swapout_proc, pmap_swapin_proc, pmap_swapout_thread,
pmap_swapin_thread, pmap_activate, pmap_page_exists, and pmap_phys_address.


84182 30-Sep-2001 jake

Add a macro to get the context from a tte tag, not necesarily a whole
tte. Remove the old inline.


84180 30-Sep-2001 jake

Don't use types that require other headers.


84179 30-Sep-2001 jake

Wrap hardware trap types in ifdef _kernel.


84178 30-Sep-2001 jake

Move the pcb the to the top of the kernel stack.
Add a guard page at the bottom of the kernel stack. Its unclear how easy
it will be to detect these faults and do something useful.
Setup the registers on exec how the c runtime expects.
Implement various {fill,set}_*regs.
Fix proc locking.


84177 30-Sep-2001 jake

Don't overflow the ktr buffer <gulp>.


84176 30-Sep-2001 jake

Implement PCPU_ADDR. Align functions on 16 bytes boundaries.


83643 18-Sep-2001 jhb

- If we ever do the per-cpu KTR stuff, the index won't be volatile as it
will be private to each CPU.
- Re-style(9) the globaldata structures. There really needs to be a MI
struct pcpu that has a MD struct mdpcpu member at some point.


83366 12-Sep-2001 julian

KSE Milestone 2
Note ALL MODULES MUST BE RECOMPILED
make the kernel aware that there are smaller units of scheduling than the
process. (but only allow one thread per process at this time).
This is functionally equivalent to teh previousl -current except
that there is a thread associated with each process.

Sorry john! (your next MFC will be a doosie!)

Reviewed by: peter@freebsd.org, dillon@freebsd.org

X-MFC after: ha ha ha ha


83053 05-Sep-2001 obrien

style(9) the structure definitions.


82945 04-Sep-2001 obrien

style(9) the structure names


82907 03-Sep-2001 jake

Change tf_arg to uintptr_t from void * to reflect the fact that
non-pointer values may be passed in it. Add appropriate casts.

The interrupt type is now passed in tf_arg instead tf_type.


82906 03-Sep-2001 jake

Implement a slightly different window spill/fill algorithm for dealing
with user windows in kernel mode. We split the windows using %otherwin,
but instead of spilling user window directly to the pcb, we attempt to
spill to user space. If this fails because a stack page is not resident
(or the stack is smashed), the fault handler at tl 2 will detect the
situation and resume at tl 1 again where recovery code can spill to the
pcb. Any windows that have been saved to the pcb will be copied out to
the user stack on return from kernel mode.

Add a first stab at 32 bit window handling. This uses much of the same
recovery code as above because the alignment of the stack pointer is used
to detect 32 bit code. Attempting to spill a 32 bit window to a 64 bit
stack, or vice versa, will cause an alignment fault. The recovery code
then changes the window state to vector to a 32 bit spill/fill handler
and retries the faulting instruction.

Add ktr traces in useful places during trap processing.

Adjust comments to reflect new code and add many more.


82905 03-Sep-2001 jake

Move the alternate global register stack to struct globaldata.


82904 03-Sep-2001 jake

Add ktr traces.


82903 03-Sep-2001 jake

Implement pv_bit_count which is used by pmap_ts_referenced.

Remove the modified tte bit and add a softwrite bit. Mappings are only
writeable if they have been written to, thus in general modify just
duplicates the write bit. The softwrite bit makes it easier to distinguish
mappings which should be writeable but are not yet modified.

Move the exec bit down one, it was being sign extended when used as an
immediate operand.

Use the lock bit to mean tsb page and remove the tsb bit. These are the
only form of locked (tsb) entries we support and we need to conserve bits
where possible.

Implement pmap_copy_page and pmap_is_modified and friends.

Detect mappings that are being being upgraded from read-only to read-write
due to copy-on-write and update the write bit appropriately.

Make trap_mmu_fault do the right thing for protection faults, which is
necessary to implement copy on write correctly. Also handle a bunch
more userland trap types and add ktr traces.


82902 03-Sep-2001 jake

Implement signals.


82901 03-Sep-2001 jake

Move %ver definitions from pstate.h to ver.h. Add definitions for normal
kernel pstate values, which include a memory store order override.


82900 03-Sep-2001 jake

Add simple macros for tracing in assembler files. There are quite
a few places where we cannot even call a function, and these have
proven to be very useful debugging tools for such situations.


82899 03-Sep-2001 jake

Use the correct copyrights. Note where most of this came from.

Requested by: obrien


82898 03-Sep-2001 jake

Bump UPAGES to 4. The pcb can be rather large.


82897 03-Sep-2001 jake

mtx_savecrit is a pil level, not a pstate value, thus mtx_intr_enable
was not doing its thing.


82896 03-Sep-2001 jake

Add a flushw() macro.


82895 03-Sep-2001 jake

Add atomic_load and store functions without membars, fwiw.


82894 03-Sep-2001 jake

The definition for ASI_IMMU_TAG_TARGET_REG was wrong. Sort.


82530 30-Aug-2001 mike

o Remove some GCCisms in src/powerpc/include/endian.h.
o Unify <machine/endian.h>'s across all architectures.
o Make bswapXX() functions use a different spelling of u_int16_t and
friends to reduce namespace pollution. The bswapXX() functions
don't actually exist, but we'll probably import these at some
point. Atleast one driver (if_de) depends on bswapXX() for big
endian cases.
o Deprecate byteorder(3) prototypes from <sys/types.h>, these are
now prototyped indirectly in <arpa/inet.h>.
o Deprecate in_addr_t and in_port_t typedefs in <sys/types.h>, these
are now typedef'd in <arpa/inet.h>.
o Change byteorder(3) prototypes to use standards compliant uint32_t
(spelled __uint32_t to reduce namespace pollution).
o Document new preferred headers and standards compliance.

Discussed with: bde
PR: 29946
Reviewed by: bmilekic


82011 20-Aug-2001 jake

Rename fp_init_pcb to fp_init_proc. Set the FEF bit in fprs register;
according the SCD it should be set if no user trap handler in set.

Submitted by: tmm


82008 20-Aug-2001 jake

Add variables needed by hardware watchpoint support.

Submitted by: tmm


82007 20-Aug-2001 jake

Add code for supporting hardware watch points.

Submitted by: tmm


82006 20-Aug-2001 jake

Add a system call trap type and syscall() call request handler.
Also add support for hardware watch point traps.

Submitted by: tmm


82005 20-Aug-2001 jake

Add support for splitting the register windows on entry to the
kernel from usermode. The remaining user windows are spilled
to the pcb as necessary. The user land window fault handlers
fill directly from the pcb on return.
Add system call entry points.

Submitted by: tmm


82004 20-Aug-2001 jake

db_expr_t is signed.


82003 20-Aug-2001 jake

Add definitions for bits in condition code register and the load store
unit control registers. Move tstate definitions to their own file.

Submitted by: tmm


82002 20-Aug-2001 jake

Add a definition for the load store unit control register.


82000 20-Aug-2001 obrien

Sync globals.h up with the other platforms. There is still some cruft in
here, but now all the platforms have the same cruft. Consistantly spell
the `struct globaldata *' "globalp".

Reviewed by: peter


81894 18-Aug-2001 jake

Spell ta 1 correctly as ta %xcc, 1. Use %pil for critical enter/exit
instead of pstate.ie. Note that popc is not implemented in hardware
on certain ultras, so we can't use it for inline ffs (suck).


81893 18-Aug-2001 jake

Gcc 3.0 requires a .register pseudo-op for certain global registers when
used in assembly language. Tell it to ignore the registers for now.


81763 16-Aug-2001 obrien

style(9) and make consistent across platforms


81727 15-Aug-2001 ache

OFF_T -> OFF (more standard style)


81720 15-Aug-2001 ache

Add OFF_T_MAX/OFF_T_MIN


81673 15-Aug-2001 obrien

Sync up with the latest ansi.h in other platforms -- especially RUNE and
wchar bits.


81613 14-Aug-2001 jake

Don't define ELF_RTLD_ADDR twice.


81493 10-Aug-2001 jhb

- Close races with signals and other AST's being triggered while we are in
the process of exiting the kernel. The ast() function now loops as long
as the PS_ASTPENDING or PS_NEEDRESCHED flags are set. It returns with
preemption disabled so that any further AST's that arrive via an
interrupt will be delayed until the low-level MD code returns to user
mode.
- Use u_int's to store the tick counts for profiling purposes so that we
do not need sched_lock just to read p_sticks. This also closes a
problem where the call to addupc_task() could screw up the arithmetic
due to non-atomic reads of p_sticks.
- Axe need_proftick(), aston(), astoff(), astpending(), need_resched(),
clear_resched(), and resched_wanted() in favor of direct bit operations
on p_sflag.
- Fix up locking with sched_lock some. In addupc_intr(), use sched_lock
to ensure pr_addr and pr_ticks are updated atomically with setting
PS_OWEUPC. In ast() we clear pr_ticks atomically with clearing
PS_OWEUPC. We also do not grab the lock just to test a flag.
- Simplify the handling of Giant in ast() slightly.

Reviewed by: bde (mostly)


81392 10-Aug-2001 jake

Correct copyright language.


81391 10-Aug-2001 jake

Add code to program the tick register and to setup its interrupt handler.


81390 10-Aug-2001 jake

Add early code to support interrupts.


81378 10-Aug-2001 jake

Add trap types for interrupts. Ad definitions to get the interrupt level
from the trap type.


81377 10-Aug-2001 jake

1. Add code to demap pages from the tlb for user contexts.
2. Add a context argument to most functions, instead of extracting it from
from the tte.

Submitted by: tmm (1).


81376 10-Aug-2001 jake

Add fields that point to per-cpu interrupt data.


81375 10-Aug-2001 jake

Add a field to trapframe for saving the pil.


81373 10-Aug-2001 jake

Add asis for interrupt registers.


81335 09-Aug-2001 obrien

Restore the proper copyright on this and remove the gratuitous changes from
sys/alpha/include/elf.h.


81334 09-Aug-2001 obrien

The author isn't a [UC] Regents. Correct the copyright language.


81265 08-Aug-2001 peter

Zap 'ptrace(PT_READ_U, ...)' and 'ptrace(PT_WRITE_U, ...)' since they
are a really nasty interface that should have been killed long ago
when 'ptrace(PT_[SG]ETREGS' etc came along. The entity that they
operate on (struct user) will not be around much longer since it
is part-per-process and part-per-thread in a post-KSE world.

gdb does not actually use this except for the obscure 'info udot'
command which does a hexdump of as much of the child's 'struct user'
as it can get. It carries its own #defines so it doesn't break
compiles.


81179 06-Aug-2001 jake

The kernel runs at a much lower address now.


81178 06-Aug-2001 jake

Fix macros for dealing with tte contexts.
Add tte bits for initializing tsbs and for specifying managed mappings.


81176 06-Aug-2001 jake

Oops. Last commit to tsb.h should have gone here.

Fix macros for eadling with tte contexts and add macros for sfsr fields.


81175 06-Aug-2001 jake

Fix macros for setting and extracting the context field in ttes and
add macros for the fields in sfsr.


81174 06-Aug-2001 jake

Add a vm_object and page count to struct pmap for allocating tsb pages.


81147 05-Aug-2001 tmm

Sigh. Add two files needed for the sparc64 fp contect switching code
that were forgotten in the last commit.

Pointy hat to: tmm


81135 04-Aug-2001 tmm

Add floating point context switching code for sparc64.

Reviewed by: jake


81087 03-Aug-2001 jake

Move some code related to managing pv entries from the pmap module to
the pv module. It works now that vtophys for sttes works.


81086 03-Aug-2001 jake

Fix a bug translating virtual translation table entry addresses to physical
addresses. It helps to use the physical address that the virtual address
actually maps to (doh!). Comment out some code that crashes.

Found independently by: tmm


81083 03-Aug-2001 jake

Add an Elfhashelt type for sparc64.


80709 31-Jul-2001 jake

Flesh out the sparc64 port considerably. This contains:
- mostly complete kernel pmap support, and tested but currently turned
off userland pmap support
- low level assembly language trap, context switching and support code
- fully implemented atomic.h and supporting cpufunc.h
- some support for kernel debugging with ddb
- various header tweaks and filling out of machine dependent structures


80708 31-Jul-2001 jake

Add skeleton machine dependent headers and c files for a port of freebsd
to a new architecture. This is the base of the sparc64 port, but contains
limited machine dependent code, and can be used a base for ports. Included
are:
- standard machine dependent headers, tweaked for a 64 bit, big endian
architecture, including empty versions of all the machine dependent
structures
- a machine independent atomic.h, which can be used until a port has
support for interrupts and the operations really need to be atomic
- stub versions of all the machine dependent functions, which panic
when called and print out the name of the function that needs to
be implemented. functions which are normally in assembly files are
not included, but this should reduce the number of different undefined
references on the first few compiles from hundreds to 5 or 6
Given minimal startup code and console support it should be trivial to
make this compile and run the first few sysinits on almost any architecture.

Requested by: alfred, imp, jhb