History log of /freebsd-current/sys/compat/linuxkpi/common/src/linux_compat.c
Revision Date Author Comments
# ae38a1a1 15-May-2024 Emmanuel Vadot <manu@FreeBSD.org>

linuxkpi: spinlock: Simplify code

Just use a typedef for spinlock_t, no need to create a useless
structure.

Reviewed by: bz, emaste
Sponsored by: Beckhoff Automation GmbH & Co. KG
Differential Revision: https://reviews.freebsd.org/D45205


# 61fb195e 08-Apr-2024 Vladimir Kondratyev <wulf@FreeBSD.org>

LinuxKPI: Improve timer_shutdown_sync

timer_shutdown_sync not only shuts down a timer but also prevents it from being rearmed.

Sponsored by: Serenity CyberSecurity, LLC
Reviewed by: emaste
MFC after: 1 week


# 41fb6dc3 19-Jan-2024 Konstantin Belousov <kib@FreeBSD.org>

kcmp(2): implement for linuxkpi cdevs

Reviewed by: brooks, markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D43518


# b8c88a61 24-Dec-2023 Vladimir Kondratyev <wulf@FreeBSD.org>

LinuxKPI: Add x86_vendor field to struct cpuinfo_x86

and initialize it at linuxkpi module load.

Sponsored by: Serenity Cyber Security, LLC
Reviewed by: manu
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D42820


# c612e3c2 24-Dec-2023 Vladimir Kondratyev <wulf@FreeBSD.org>

LinuxKPI: Add xen/xen.h header

It contains proxy-implementation of xen_initial_domain() and
xen_pv_domain() required by latest drm-kmod.

Sponsored by: Serenity Cyber Security, LLC
Reviewed by: manu, bz
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D42797


# 488e8a7f 23-Oct-2023 Bjoern A. Zeeb <bz@FreeBSD.org>

LinuxKPI: reduce impact of large MAXCPU

Start scaling arrays dynamically instead of using MAXCPU, resulting in
extra allocations on startup but reducing the overall memory footprint.
For the static single CPU mask we provide two versions to further save
memory depending on a low or high CPU count system. The threshold to
switch is currently at 128 CPUs on 64bit platforms.
More detailed comments on the implementations can be found in the code.

If I am not wrong, on a MAXCPU=65536 system the memory footprint should
go down from roughly 512M to 1.5M for the static single CPU mask.

Submitted by: olce (most of this final version)
Sponsored by: The FreeBSD Foundation
PR: 274316
Differential Revision: https://reviews.freebsd.org/D42345
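
The 512M figure quoted above can be reproduced with a little arithmetic,
assuming the static table holds one full-width mask per possible CPU (an
assumption read from the description, not taken from the code). The helper
name below is hypothetical and only illustrates the calculation:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: bytes needed for a table of `ncpu` masks,
 * where each mask is `ncpu` bits wide (one bit per possible CPU).
 */
static uint64_t
sketch_cpumask_table_bytes(uint64_t ncpu)
{
	uint64_t bytes_per_mask;

	assert(ncpu % 8 == 0);		/* keep the arithmetic exact */
	bytes_per_mask = ncpu / 8;	/* one bit per CPU */
	return (ncpu * bytes_per_mask);	/* one mask per CPU */
}
```

With ncpu = 65536 each mask is 8192 bytes, and 65536 such masks come to
2^29 bytes, which matches the 512M mentioned in the commit message.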


# 80446fc7 08-Dec-2023 Jean-Sébastien Pédron <dumbbell@FreeBSD.org>

linuxkpi: Move `struct kobject` code to `linux_kobject.c`

[Why]
`linux_compat.c` is already too long. I will need to add `struct kset`
in a follow-up commit, so let's move the existing `struct kobject` code
to its own file.

Reviewed by: manu
Approved by: manu
Differential Revision: https://reviews.freebsd.org/D43019


# 8a8e86b8 07-Dec-2023 Jean-Sébastien Pédron <dumbbell@FreeBSD.org>

Revert "linuxkpi: `GFP_KERNEL` equals `M_NOWAIT` now"

This change seems to break some drivers such as the mlx5*(4) drivers.

As kib@ says:
> According to the 'official' Linux kernel documentation, the GFP_KERNEL
> flag implies sleepable context.

It was introduced while working on the new vt(4)/DRM integration [1].
During this work, doing sleepable allocations broke vt(4) and the DRM
drivers. However, I made further improvements and some locking-related
fixes to the new integration without revisiting the need for it.

After more testing, the improvements to the integration mentioned above
seem to make the change to `GFP_KERNEL` unneeded now. I can thus
revert it to restore expectations of other drivers.

This reverts commit 14dcd40983748596d116d91acb934a8a95ac76bc.

[1] https://github.com/freebsd/drm-kmod/pull/243

Reviewed by: kib
Approved by: kib
Differential Revision: https://reviews.freebsd.org/D42962


# 14dcd409 24-Nov-2023 Jean-Sébastien Pédron <dumbbell@FreeBSD.org>

linuxkpi: `GFP_KERNEL` equals `M_NOWAIT` now

... instead of `M_WAITOK`.

[Why]
The reason is that in some places in the DRM drivers (in particular, the
framebuffer management code), kmalloc() is called from a non-sleepable
context, such as after a call to mtx_lock(9) with an MTX_DEF mutex.

If `GFP_KERNEL` is defined as `M_WAITOK`, we hit an assertion from
witness(4).

[How]
The definition of `GFP_KERNEL` is changed to `M_NOWAIT`. This means that
callers should verify the return value of kmalloc(). Fortunately, this
is always the case in Linux.

Reviewed by: bz, emaste, manu
Approved by: manu
Differential Revision: https://reviews.freebsd.org/D42054


# 685dc743 16-Aug-2023 Warner Losh <imp@FreeBSD.org>

sys: Remove $FreeBSD$: one-line .c pattern

Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/


# 2e07e885 16-May-2023 Bjoern A. Zeeb <bz@FreeBSD.org>

LinuxKPI: implement timer_{delete,shutdown}_sync()

Implement timer_{delete,shutdown}_sync(), which do not seem to require
anything additional to the already existing del_timer_sync().

Sponsored by: The FreeBSD Foundation
MFC after: 10 days
Reviewed by: hselasky
Differential Revision: https://reviews.freebsd.org/D40124


# ad513b4d 23-May-2023 Bjoern A. Zeeb <bz@FreeBSD.org>

LinuxKPI: add utsname for init_utsname() with release

A wireless driver is requesting release from the result of
init_utsname(). Populate the field on startup.

MFC after: 10 days
Reviewed by: emaste
Differential Revision: https://reviews.freebsd.org/D40248


# a27902c1 10-Feb-2023 Jean-Sébastien Pédron <dumbbell@FreeBSD.org>

linuxkpi: Define `cpu_data(cpu)`

`cpu_data(cpu)` evaluates to a `struct cpuinfo_x86` filled with
attributes of the given CPU number. The CPU number is an index in the
`__cpu_data[]` array with MAXCPU entries. On FreeBSD, we simply
initialize all of them like we do with `boot_cpu_data`.

While here, we add the `x86_model` field to the `struct cpuinfo_x86`. We
use `CPUID_TO_MODEL()` to set it.

At the same time, we fix the value of `x86` which should have been set
to the CPU family. It was using the same implementation as
`CPUID_TO_MODEL()` before. It now uses `CPUID_TO_FAMILY()`.

Reviewed by: manu
Approved by: manu
Differential Revision: https://reviews.freebsd.org/D38542
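
The family/model distinction described above can be sketched as follows.
This is a minimal illustration that follows the vendor-documented layout of
CPUID leaf 1 EAX; it is not the text of the FreeBSD `CPUID_TO_FAMILY()` and
`CPUID_TO_MODEL()` macros, and the function names are hypothetical:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical decoder: display family from CPUID.1:EAX. */
static unsigned
sketch_cpuid_family(uint32_t id)
{
	unsigned family = (id >> 8) & 0xf;	/* base family, bits 11:8 */

	if (family == 0xf)
		family += (id >> 20) & 0xff;	/* extended family, bits 27:20 */
	return (family);
}

/* Hypothetical decoder: display model from CPUID.1:EAX. */
static unsigned
sketch_cpuid_model(uint32_t id)
{
	unsigned base_family = (id >> 8) & 0xf;
	unsigned model = (id >> 4) & 0xf;	/* base model, bits 7:4 */

	/* The extended model (bits 19:16) applies to families 6 and 15. */
	if (base_family == 0x6 || base_family == 0xf)
		model |= ((id >> 16) & 0xf) << 4;
	return (model);
}
```

For the signature 0x000906EA this yields family 6 and model 0x9E, which
shows why reusing the model extraction for the `x86` (family) field gave the
wrong value.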


# 58cf3a69 11-Nov-2022 Jean-Sébastien Pédron <dumbbell@FreeBSD.org>

linuxkpi: Define `boot_cpu_data.x86_max_cores`

Reviewed by: manu
Approved by: manu
Differential Revision: https://reviews.freebsd.org/D36971


# c72dd0aa 11-Nov-2022 Jean-Sébastien Pédron <dumbbell@FreeBSD.org>

linuxkpi: Introduce `vma_set_file()`

This code was moved from the i915 driver in Linux 5.11.

Reviewed by: manu
Approved by: manu
Differential Revision: https://reviews.freebsd.org/D36957


# e2361e04 31-Oct-2022 Bjoern A. Zeeb <bz@FreeBSD.org>

LinuxKPI: implement cpumask_of()

Add a static set of cpumasks for all (possible) cpus with only the one
indexed cpu enabled in each set.
This is needed for cpumask_of(_cpuid) which returns a cpumask (cpuset)
with only cpu _cpuid enabled and is used by one wireless driver at least.

MFC after: 3 days
Reviewed by: emaste
Differential Revision: https://reviews.freebsd.org/D37223
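
The idea of a static set of single-CPU masks can be sketched in userspace C
as below. All names (`sketch_cpumask`, `SKETCH_NCPU`, and so on) are
hypothetical stand-ins, and the fixed CPU count is an assumption made to keep
the example small; the actual LinuxKPI types and sizes differ:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define	SKETCH_NCPU	256			/* hypothetical CPU limit */
#define	SKETCH_WORDS	(SKETCH_NCPU / 64)

struct sketch_cpumask {
	uint64_t bits[SKETCH_WORDS];
};

/* One mask per possible CPU, each with exactly one bit set. */
static struct sketch_cpumask sketch_masks[SKETCH_NCPU];
static bool sketch_masks_ready;

static void
sketch_masks_init(void)
{
	memset(sketch_masks, 0, sizeof(sketch_masks));
	for (unsigned cpu = 0; cpu < SKETCH_NCPU; cpu++)
		sketch_masks[cpu].bits[cpu / 64] = UINT64_C(1) << (cpu % 64);
	sketch_masks_ready = true;
}

/* Return the precomputed mask that has only `cpu` enabled. */
static const struct sketch_cpumask *
sketch_cpumask_of(unsigned cpu)
{
	assert(cpu < SKETCH_NCPU);
	if (!sketch_masks_ready)
		sketch_masks_init();
	return (&sketch_masks[cpu]);
}

static bool
sketch_cpumask_test(const struct sketch_cpumask *m, unsigned cpu)
{
	assert(cpu < SKETCH_NCPU);
	return ((m->bits[cpu / 64] >> (cpu % 64)) & 1);
}
```

Returning a pointer into a precomputed table keeps the lookup allocation-free,
at the cost of memory that grows with the square of the CPU count.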


# af3c7888 30-Sep-2022 Ed Schouten <ed@FreeBSD.org>

Alter the prototype of qsort_r(3) to match POSIX, which adopted the
glibc-based interface.

Unfortunately, the glibc maintainers, despite knowing of the existence
of the FreeBSD qsort_r(3) interface in 2004 and refusing to add the
same interface to glibc on grounds of lack of standardization
and portability concerns, decided it was a good idea to introduce
their own qsort_r(3) interface in 2007 as a GNU extension with a
slightly different and incompatible interface.

With the adoption of their interface as POSIX standard, let's switch
to the same prototype; there is no need to remain incompatible.

C++ and C applications written for the historical FreeBSD interface
get source level compatibility when building in C++ mode, or when
building with a C compiler with C11 generics support, provided that
the caller passes a fifth parameter of qsort_r() that exactly matches
the historical FreeBSD comparator function pointer type and does not
redefine the historical qsort_r(3) prototype in their source code.

Symbol versioning is used to keep old binaries working.

MFC: never
Relnotes: yes
Reviewed by: cem, imp, hps, pauamma
Differential revision: https://reviews.freebsd.org/D17083


# 7ae99f80 22-Sep-2022 John Baldwin <jhb@FreeBSD.org>

pmap_unmapdev/bios: Accept a pointer instead of a vm_offset_t.

This matches the return type of pmap_mapdev/bios.

Reviewed by: kib, markj
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D36548


# b2c86006 09-Aug-2022 Emmanuel Vadot <manu@FreeBSD.org>

linuxkpi: Add asm/processor.h

Also fill the boot_cpu_data struct as drm needs it.

Reviewed by: bz
Obtained from: drm-kmod
Sponsored by: Beckhoff Automation GmbH & Co. KG
Differential Revision: https://reviews.freebsd.org/D36107


# 132b00f9 04-Apr-2022 Warner Losh <imp@FreeBSD.org>

linuxkpi: move io_mapping_create_wc to .c

Move io_mapping_create_wc to .c because it encodes the size of struct
io_mapping so we move this from the client module to the linuxkpi
module.

Sponsored by: Netflix
Reviewed by: hselasky, emaste
Differential Revision: https://reviews.freebsd.org/D34776


# aca0bcbc 04-Apr-2022 Warner Losh <imp@FreeBSD.org>

linuxkpi: Move cdev_alloc into .c file

Move cdev_alloc into linux_compat.c since it encodes the size of struct
linux_cdev into the client modules otherwise.

Sponsored by: Netflix
Reviewed by: hselasky, emaste
Differential Revision: https://reviews.freebsd.org/D34771


# 1341ac9f 04-Apr-2022 Warner Losh <imp@FreeBSD.org>

linuxkpi: Move class_create to .c file

class_create encodes the size of struct class into the generated
code. Move from .h file to .c file to move this knowledge from the
client modules that call this into the linuxkpi module.

Sponsored by: Netflix
Reviewed by: hselasky, emaste
Differential Revision: https://reviews.freebsd.org/D34769


# 702b6875 04-Apr-2022 Warner Losh <imp@FreeBSD.org>

linuxkpi: Move device_create_groups_vargs to linux_compat.c

device_create_groups_vargs encodes the size of struct device. Move
definition from .h to .c to move this size into the linuxkpi module
rather than encoding it in all client driver modules.

Sponsored by: Netflix
Reviewed by: hselasky, emaste
Differential Revision: https://reviews.freebsd.org/D34768


# 36929b55 04-Apr-2022 Warner Losh <imp@FreeBSD.org>

linuxkpi: move kobject_create to .c file

kobject_create knows the size of struct kobject. Move it to
linux_compat.c so this knowledge is confined to the loadable module and
not the clients.

Sponsored by: Netflix
Reviewed by: hselasky, emaste
Differential Revision: https://reviews.freebsd.org/D34767


# aca2a7fa 07-Mar-2022 Eric van Gyzen <vangyzen@FreeBSD.org>

stack_zero is not needed before stack_save

The man page was recently clarified to commit to this contract.

MFC after: 1 week
Sponsored by: Dell EMC Isilon


# 04d42cb4 22-Nov-2021 Vladimir Kondratyev <wulf@FreeBSD.org>

LinuxKPI: Implement default sysfs kobject attribute operations

Required by drm-kmod 5.7

MFC after: 1 week
Reviewed by: hselasky
Differential Revision: https://reviews.freebsd.org/D33292


# dbc920bd 06-Nov-2021 Vladimir Kondratyev <wulf@FreeBSD.org>

LinuxKPI: Implement interval_tree

Required by drm-kmod

MFC after: 1 week
Reviewed by: hselasky, manu
Differential Revision: https://reviews.freebsd.org/D32869


# 2390a144 02-Nov-2021 Hans Petter Selasky <hselasky@FreeBSD.org>

LinuxKPI: Add sysctl(8) knob to control verbosity of WARN_ON's.

The purpose of this change is to reduce the amount of dmesg(8) noise when
VT switching after a panic.

Submitted by: Greg V <greg@unrelenting.technology>
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D30174
Sponsored by: NVIDIA Networking


# 60d962e0 17-Oct-2021 Jessica Clarke <jrtc27@FreeBSD.org>

LinuxKPI: Implement _ioremap_attr for riscv

Now that riscv implements pmap_mapdev_attr we can enable the non-stub
implementation for riscv, which is needed for drm-kmod to not fail at
run time for drivers that need to map I/O regions.

Reviewed by: hselasky, bz
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D32446


# 2b68eb8e 01-Oct-2021 Mateusz Guzik <mjg@FreeBSD.org>

vfs: remove thread argument from VOP_STAT

and fo_stat.


# 062f1500 29-Sep-2021 Vladimir Kondratyev <wulf@FreeBSD.org>

LinuxKPI: Remove vma argument from fault method of vm_operations_struct

It has been absent from Linux since 4.11.
Keeping it in FreeBSD requires several #ifdefs in drm-kmod.

Reviewed by: emaste, hselasky, manu
Differential revision: https://reviews.freebsd.org/D32169


# 7d92d483 29-Sep-2021 Vladimir Kondratyev <wulf@FreeBSD.org>

LinuxKPI: Invoke release handler when file is destroyed by fput()

Required by drm-kmod 5.6

Reviewed by: hselasky, manu
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D32067


# 66ea3906 29-Sep-2021 Vladimir Kondratyev <wulf@FreeBSD.org>

LinuxKPI: Remove FreeBSD struct resource from all LKPI headers

except linux/pci.h to avoid conflicts with Linux version.
This allows drm-kmod to #define resource globally and strip some #ifdef-s

Reviewed by: hselasky, manu
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D31673


# a81b36c6 29-Sep-2021 Vladimir Kondratyev <wulf@FreeBSD.org>

LinuxKPI: Implement get_file_rcu()

get_file_rcu() grabs a file if the file->f_count is not zero.

Required by drm-kmod 5.6

Reviewed by: hselasky, manu (previous version)
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D31672


# 5ca9d417 29-Jun-2021 Dmitry Chagin <dchagin@FreeBSD.org>

LinuxKPI: Rename a short description of the kmalloc type.

To avoid duplication in the vmstat -m output rename the kmalloc type short
description to 'lkpikmalloc' as the Linux emulation layer historically names
its linux malloc type as 'linux'.

Reviewed by: hselasky, kib, emaste
Differential Revision: https://reviews.freebsd.org/D30928
MFC after: 2 weeks


# 1fd26da9 29-Jun-2021 Dmitry Chagin <dchagin@FreeBSD.org>

LinuxKPI: Put compat code under appropriate condition.

Reviewed by: hselasky, emaste, kib
Differential Revision: https://reviews.freebsd.org/D30927
MFC after: 2 weeks


# 945accf5 29-Jun-2021 Dmitry Chagin <dchagin@FreeBSD.org>

LinuxKPI: Use the proper API to determine the ABI of the running process.

Reviewed by: markj, hselasky, kib
Differential Revision: https://reviews.freebsd.org/D30924
MFC after: 2 weeks


# d16b6cb1 30-May-2021 Bjoern A. Zeeb <bz@FreeBSD.org>

LinuxKPI: enhance the irq KPI for managed and threaded operations.

Move request_irq() to an internal function which serves request_irq()
and the newly added request_threaded_irq() and devm_request_threaded_irq().
Likewise factor out parts of free_irq() to also be used with
devm_free_irq(). Add the storage and call to a thread_handler in case
of IRQ_WAKE_THREAD.
This is needed for the iwlwifi driver.

Sponsored by: The FreeBSD Foundation
MFC after: 10 days
Reviewed by: hselasky
Differential Revision: https://reviews.freebsd.org/D30549


# 801cf532 27-May-2021 Bjoern A. Zeeb <bz@FreeBSD.org>

LinuxKPI: add KPI for netdev_notifier_info returning ifp

While currently the ifp gets cast to a net_device and then returned
and consumers are expecting an ifp again, allow parallel usage now and
in the future by extending and also passing the ifp directly back in
the netdev_notifier_info. Add a function to return the ifp instead of
the net_device.

Sponsored by: The FreeBSD Foundation
MFC after: 10 days
Suggested by: hselasky
Reviewed by: hselasky
Differential Revision: https://reviews.freebsd.org/D30522


# 5fce8027 24-May-2021 Bjoern A. Zeeb <bz@FreeBSD.org>

LinuxKPI: add cpu.h for cpumask_*()

Add linux/cpu.h for cpumask_*() functions found in wireless drivers
and make sure cpu_online_mask is always initialised.

Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Reviewed by: hselasky
Differential Revision: https://reviews.freebsd.org/D30421


# b8f113ca 11-May-2021 Hans Petter Selasky <hselasky@FreeBSD.org>

Implement cdev_device_add() and cdev_device_del() in the LinuxKPI.

MFC after: 1 week
Sponsored by: Mellanox Technologies // NVIDIA Networking


# 67807f50 11-May-2021 Hans Petter Selasky <hselasky@FreeBSD.org>

cdev_del() should only put its kernel object in the LinuxKPI.

The destructor takes care of the rest.

MFC after: 1 week
Sponsored by: Mellanox Technologies // NVIDIA Networking


# 904390b4 11-May-2021 Hans Petter Selasky <hselasky@FreeBSD.org>

Implement read-only VM_SHARED flag in the LinuxKPI.

For use by mmap(2) callbacks.

MFC after: 1 week
Sponsored by: Mellanox Technologies // NVIDIA Networking


# 8011fb79 30-Mar-2021 Konstantin Belousov <kib@FreeBSD.org>

linuxkpi: drop single-use variable

Reviewed by: hselasky
Sponsored by: Mellanox Technologies/NVidia Networking
MFC after: 1 week


# f6b10883 30-Mar-2021 Konstantin Belousov <kib@FreeBSD.org>

linuxkpi: avoid counting per-thread use for the embedded linux cdevs

The counter is not used to control destroy.

Reviewed by: hselasky
Sponsored by: Mellanox Technologies/NVidia Networking
MFC after: 1 week


# 7f9867f8 30-Mar-2021 Konstantin Belousov <kib@FreeBSD.org>

linuxkpi: do not destroy/free embedded linux cdevs

They have their own lifetime managed by the containing objects.
Premature and unexpected free causes corruption.

Reviewed by: hselasky
Sponsored by: Mellanox Technologies/NVidia Networking
MFC after: 1 week


# 28b482e2 30-Mar-2021 Konstantin Belousov <kib@FreeBSD.org>

linuxkpi: rename cdev to ldev

The variables hold pointers to a linux_cdev, not to a FreeBSD cdev.

Reviewed by: hselasky
Sponsored by: Mellanox Technologies/NVidia Networking
MFC after: 1 week


# 7b0125cb 30-Mar-2021 Konstantin Belousov <kib@FreeBSD.org>

linuxkpi: copy ldev into local to test and free the same pointer

Reviewed by: hselasky
Sponsored by: Mellanox Technologies/NVidia Networking
MFC after: 1 week


# fdcfe8a2 21-Mar-2021 Bjoern A. Zeeb <bz@FreeBSD.org>

LinuxKPI: netdevice notifier callback argument

Introduce struct netdev_notifier_info as a container to pass
net_device to the callback functions.
Adjust netdev_notifier_info_to_dev() to return the net_device field.

Add explicit casts from ifp to ni->dev even though currently
struct net_device is defined to struct ifnet. This is needed in
preparation for untangling this and improving the net_device compat
code.

Obtained-from: bz_iwlwifi
Sponsored-by: The FreeBSD Foundation
MFC-after: 2 weeks
Reviewed-by: hselasky
Differential Revision: https://reviews.freebsd.org/D29365


# bc042266 23-Mar-2021 Bjoern A. Zeeb <bz@FreeBSD.org>

LinuxKPI: add net_ratelimit()

Add a net_ratelimit() compat implementation based on ppsratecheck().
Add a sysctl to allow tuning of the number of messages.

Sponsored-by: The FreeBSD Foundation
MFC-after: 2 weeks
Reviewed-by: hselasky
Differential Revision: https://reviews.freebsd.org/D29399


# dfb33cb0 10-Mar-2021 Hans Petter Selasky <hselasky@FreeBSD.org>

Allocating the LinuxKPI current structure from a software interrupt thread
must be done using the M_NOWAIT flag after 1ae20f7c70ea .

MFC after: 1 week
Sponsored by: Mellanox Technologies // NVIDIA Networking


# d1cbe790 10-Mar-2021 Hans Petter Selasky <hselasky@FreeBSD.org>

Allocating the LinuxKPI current structure from an interrupt thread must be
done using the M_NOWAIT flag after 1ae20f7c70ea .

MFC after: 1 week
Sponsored by: Mellanox Technologies // NVIDIA Networking


# fa765ca7 28-Jan-2021 Bjoern A. Zeeb <bz@FreeBSD.org>

LinuxKPI: implement devres() framework parts and two examples

This code implements a version of the devres framework found
working for various iwlwifi use cases and also providing functions
for ttm_page_alloc_dma.c from DRM.

Part of the framework replicates the consumed KPI, while others
are internal helper functions.

In addition the simple devm_k*malloc() consumers were implemented
and kvasprintf() was enhanced to also work for the devm_kasprintf()
case.
Admittedly lkpi_devm_kmalloc_release() could be avoided but for
the overall understanding of the code and possible memory tracing
it may still be helpful.

Further devres consumers are implemented for iwlwifi but will follow
later as the main reason for this change is to sort out overlap with
DRM.

Sponsored-by: The FreeBSD Foundation
Obtained-from: bz_iwlwifi
MFC After: 3 days
Reviewed-by: hselasky, manu
Differential Revision: https://reviews.freebsd.org/D28189


# e90afaa0 08-Nov-2020 Mateusz Guzik <mjg@FreeBSD.org>

kqueue: save space by using only one func pointer for assertions


# 1a180032 01-Sep-2020 Mateusz Guzik <mjg@FreeBSD.org>

compat: clean up empty lines in .c and .h files


# 0e123c13 14-Aug-2020 Emmanuel Vadot <manu@FreeBSD.org>

linuxkpi: Add a few wait_bit functions

The Linux function does a lot more than that, as multiple waitqueues could be fetched
from a static table based on the hash of the argument, but since in DRM it's only used
in one place just add a single variable.
We will probably need to change that in the future but it's ok with DRM even with current
Linux.

Reviewed by: hselasky
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D26054


# 51ea7bea 07-Aug-2020 Mateusz Guzik <mjg@FreeBSD.org>

vfs: add VOP_STAT

The current scheme of calling VOP_GETATTR adds avoidable overhead.

An example with tmpfs doing fstat (ops/s):
before: 7488958
after: 7913833

Reviewed by: kib (previous version)
Differential Revision: https://reviews.freebsd.org/D25910


# 42f0f394 24-May-2020 Emmanuel Vadot <manu@FreeBSD.org>

linuxkpi: Fix mod_timer and del_timer_sync

mod_timer is supposed to return 1 if the modified timer was pending, which
is exactly what callout_reset does, so return the value after checking
that it's a correct one in case the API changes.
del_timer_sync returns int, so add a function and handle that.

Reviewed by: hselasky
Differential Revision: https://reviews.freebsd.org/D24983
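
The return-value contract described above can be illustrated with a toy
model. This sketch does not use callout(9); `struct sketch_timer` and both
functions are hypothetical and only mirror the Linux convention of returning
1 when the timer was pending and 0 otherwise:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct sketch_timer {
	bool pending;		/* armed and not yet fired */
	unsigned long expires;	/* expiry time in arbitrary ticks */
};

/* Re-arm the timer; report whether it was already pending. */
static int
sketch_mod_timer(struct sketch_timer *t, unsigned long expires)
{
	int was_pending;

	assert(t != NULL);
	was_pending = t->pending ? 1 : 0;
	t->expires = expires;
	t->pending = true;
	return (was_pending);
}

/* Stop the timer; report whether it was pending when stopped. */
static int
sketch_del_timer_sync(struct sketch_timer *t)
{
	int was_pending;

	assert(t != NULL);
	was_pending = t->pending ? 1 : 0;
	t->pending = false;
	return (was_pending);
}
```

Callers ported from Linux may branch on these values, which is why a void
wrapper is not sufficient.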


# 42f8ef4b 04-May-2020 Hans Petter Selasky <hselasky@FreeBSD.org>

Fix warning about sleeping with non-sleepable lock when allocating
"current" from linux_cdev_pager_populate() in the LinuxKPI:

Backtrace:
witness_debugger()
witness_warn()
uma_zalloc_arg()
malloc()
linux_alloc_current()
linux_cdev_pager_populate()
vm_fault()
vm_fault_trap()
trap_pfault()
trap()
calltrap()

Suggested by: avg@
MFC after: 1 week
Sponsored by: Mellanox Technologies


# 1328771d 03-Mar-2020 Hans Petter Selasky <hselasky@FreeBSD.org>

When closing a LinuxKPI file always use the real release function to avoid
resource leakage when destroying a LinuxKPI character device.

Submitted by: Andrew Boyer <aboyer@pensando.io>
Reviewed by: kib@
PR: 244572
MFC after: 1 week
Sponsored by: Mellanox Technologies


# 77632fc7 27-Feb-2020 Hans Petter Selasky <hselasky@FreeBSD.org>

Extend the range of the return value from nsecs_to_jiffies64() to support
Mesa's drm_syncobj usage, in the LinuxKPI.

While at it optimise the jiffies conversion functions to avoid repeated
and constant calculations.

Submitted by: Greg V <greg@unrelenting.technology>
Differential Revision: https://reviews.freebsd.org/D23846
MFC after: 1 week
Sponsored by: Mellanox Technologies


# 39a3542b 15-Feb-2020 Pawel Biernacki <kaktus@FreeBSD.org>

Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (2 of many)

r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are
still not MPSAFE (or already are but aren’t properly marked). Use it in
preparation for a general review of all nodes.
This is a non-functional change that adds annotations to SYSCTL_NODE and
SYSCTL_PROC nodes using one of the soon-to-be-required flags.

Reviewed by: hselasky, kib, zeising
Approved by: kib (mentor)
Differential Revision: https://reviews.freebsd.org/D23631


# 2a3529df 28-Jan-2020 Konstantin Belousov <kib@FreeBSD.org>

Provide support for fdevname(3) on linuxkpi-backed devices.

Reported and tested by: manu
Reviewed by: hselasky, manu
Sponsored by: Mellanox Technologies
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D23386


# a7e348d7 15-Jan-2020 Mark Johnston <markj@FreeBSD.org>

Handle a NULL thread pointer in linux_close_file().

This can happen if a file is closed during unix socket GC. The same bug
was fixed for devfs descriptors in r228361.

PR: 242913
Reported and tested by: iz-rpi03@hs-karlsruhe.de
Reviewed by: hselasky, kib
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D23178


# b249ce48 03-Jan-2020 Mateusz Guzik <mjg@FreeBSD.org>

vfs: drop the mostly unused flags argument from VOP_UNLOCK

Filesystems which want to use it in limited capacity can employ the
VOP_UNLOCK_FLAGS macro.

Reviewed by: kib (previous version)
Differential Revision: https://reviews.freebsd.org/D21427


# 3cf3b4e6 21-Dec-2019 Jeff Roberson <jeff@FreeBSD.org>

Make page busy state deterministic on free. Pages must be xbusy when
removed from objects including calls to free. Pages must not be xbusy
when freed and not on an object. Strengthen assertions to match these
expectations. In practice very little code had to change busy handling
to meet these rules but we can now make stronger guarantees to busy
holders and avoid conditionally dropping busy in free.

Refine vm_page_remove() and vm_page_replace() semantics now that we have
stronger guarantees about busy state. This removes redundant and
potentially problematic code that has proliferated.

Discussed with: markj
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D22822


# 0012f373 14-Oct-2019 Jeff Roberson <jeff@FreeBSD.org>

(4/6) Protect page valid with the busy lock.

Atomics are used for page busy and valid state when the shared busy is
held. The details of the locking protocol and valid and dirty
synchronization are in the updated vm_page.h comments.

Reviewed by: kib, markj
Tested by: pho
Sponsored by: Netflix, Intel
Differential Revision: https://reviews.freebsd.org/D21594


# fee2a2fa 09-Sep-2019 Mark Johnston <markj@FreeBSD.org>

Change synchronization rules for vm_page reference counting.

There are several mechanisms by which a vm_page reference is held,
preventing the page from being freed back to the page allocator. In
particular, holding the page's object lock is sufficient to prevent the
page from being freed; holding the busy lock or a wiring is sufficient as
well. These references are protected by the page lock, which must
therefore be acquired for many per-page operations. This results in
false sharing since the page locks are external to the vm_page
structures themselves and each lock protects multiple structures.

Transition to using an atomically updated per-page reference counter.
The object's reference is counted using a flag bit in the counter. A
second flag bit is used to atomically block new references via
pmap_extract_and_hold() while removing managed mappings of a page.
Thus, the reference count of a page is guaranteed not to increase if the
page is unbusied, unmapped, and the object's write lock is held. As
a consequence of this, the page lock no longer protects a page's
identity; operations which move pages between objects are now
synchronized solely by the objects' locks.

The vm_page_wire() and vm_page_unwire() KPIs are changed. The former
requires that either the object lock or the busy lock is held. The
latter no longer has a return value and may free the page if it releases
the last reference to that page. vm_page_unwire_noq() behaves the same
as before; the caller is responsible for checking its return value and
freeing or enqueuing the page as appropriate. vm_page_wire_mapped() is
introduced for use in pmap_extract_and_hold(). It fails if the page is
concurrently being unmapped, typically triggering a fallback to the
fault handler. vm_page_wire() no longer requires the page lock and
vm_page_unwire() now internally acquires the page lock when releasing
the last wiring of a page (since the page lock still protects a page's
queue state). In particular, synchronization details are no longer
leaked into the caller.

The change excises the page lock from several frequently executed code
paths. In particular, vm_object_terminate() no longer bounces between
page locks as it releases an object's pages, and direct I/O and
sendfile(SF_NOCACHE) completions no longer require the page lock. In
these latter cases we now get linear scalability in the common scenario
where different threads are operating on different files.

__FreeBSD_version is bumped. The DRM ports have been updated to
accommodate the KPI changes.

Reviewed by: jeff (earlier version)
Tested by: gallatin (earlier version), pho
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D20486


# e12be321 20-May-2019 Conrad Meyer <cem@FreeBSD.org>

Include eventhandler.h in more compilation units

This was enumerated with exhaustive search for sys/eventhandler.h includes,
cross-referenced against EVENTHANDLER_* usage with the comm(1) utility. Manual
checking was performed to avoid redundant includes in some drivers where a
common os_bsd.h (for example) included sys/eventhandler.h indirectly, but it is
possible some of these are redundant with driver-specific headers in ways I
didn't notice.

(These CUs did not show up as missing eventhandler.h in tinderbox.)

X-MFC-With: r347984


# 47e2723a 16-May-2019 Johannes Lundberg <johalun@FreeBSD.org>

LinuxKPI: Update access_ok macro for v5.0.

Check LINUXKPI_VERSION macro for backwards compatibility.
It's recommended to update any drivers that depend on the older KPI
so we can deprecate < 5.0 code as we update to newer Linux version.
This patch is part of D19565

Reviewed by: hps
Approved by: imp (mentor), hps
MFC after: 1 week


# 02927c76 14-May-2019 Johannes Lundberg <johalun@FreeBSD.org>

LinuxKPI: Let del_timer return a value to match Linux.

This patch is part of https://reviews.freebsd.org/D19565.

Reviewed by: hps
Approved by: imp (mentor), hps
MFC after: 1 week


# 4580f5ea 06-May-2019 Hans Petter Selasky <hselasky@FreeBSD.org>

Allow controlling pr_debug at runtime in the LinuxKPI.

Turning on pr_debug at compile time makes it non-optional at runtime.
This often means that the amount of debugging output is unbearable.
Allow the developer to turn on pr_debug output only when needed.

Build tested drm-current-kmod prior to commit.

MFC after: 1 week
Submitted by: kib@
Sponsored by: Mellanox Technologies


# af248a7c 25-Apr-2019 Johannes Lundberg <johalun@FreeBSD.org>

Don't call cdev_init where cdev_alloc is called. cdev_alloc already
handles initialization.

Reported by: johalun
Reviewed by: hps
Approved by: imp (mentor), hps
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D19565


# ff9be73e 20-Apr-2019 Ed Maste <emaste@FreeBSD.org>

Enable ioremap for aarch64 in the LinuxKPI

Required for Mellanox drivers (e.g. on Ampere eMAG at Packet.com).

PR: 237055
Submitted by: Greg V <greg@unrelenting.technology>
Reviewed by: hselasky
Differential Revision: https://reviews.freebsd.org/D19987


# 9362b6a3 30-Dec-2018 Konstantin Belousov <kib@FreeBSD.org>

Fix 32bit gcc builds after r342625.

MFC after: 1 week
Sponsored by: Mellanox Technologies


# f823a36e 30-Dec-2018 Konstantin Belousov <kib@FreeBSD.org>

Fix linux_destroy_dev() behaviour when there are still files open on
the cdev being destroyed.

Currently linux_destroy_dev() waits for the reference count on the
linux cdev to drain, and each open file holds a reference.
Practically it means that linux_destroy_dev() is blocked until all
userspace processes that have the cdev open exit. FreeBSD devfs does
not have such a problem, because the device refcount only prevents freeing
of the cdev memory, and a separate 'active methods' counter blocks
destroy_dev() until all threads leave the cdevsw methods. After that,
attempts to enter cdevsw methods are refused with an error.

Implement a somewhat similar mechanism for LinuxKPI cdevs. Demote the cdev
refcount to only mean a hold on the linux cdev memory. Add a sirefs
count that tracks the number of threads inside the cdev methods and
carries a single-bit indicator that the cdev is being destroyed. In the
latter case, the call is redirected to the dummy cdev.

Reviewed by: markj
Discussed with: hselasky
Tested by: zeising
MFC after: 1 week
Sponsored by: Mellanox Technologies
Differential revision: https://reviews.freebsd.org/D18606


# e5a3393a 30-Dec-2018 Konstantin Belousov <kib@FreeBSD.org>

Implement zap_vma_ptes() for managed device objects.

Reviewed by: markj
Discussed with: hselasky
Tested by: zeising
MFC after: 1 week
Sponsored by: Mellanox Technologies
Differential revision: https://reviews.freebsd.org/D18606


# 069598b9 30-Dec-2018 Konstantin Belousov <kib@FreeBSD.org>

Use IDX_TO_OFF().

Reviewed by: markj
Discussed with: hselasky
Tested by: zeising
MFC after: 1 week
Sponsored by: Mellanox Technologies
Differential revision: https://reviews.freebsd.org/D18606


# 9c7b53cc 05-Dec-2018 Slava Shwartsman <slavash@FreeBSD.org>

linuxkpi: Fix for use-after-free when tearing down character devices.

Make sure we hold a reference on the character device for every opened file
to prevent the character device from being freed prematurely.

Submitted by: hselasky@
Approved by: hselasky (mentor)
MFC after: 1 week
Sponsored by: Mellanox Technologies


# f1863400 03-Dec-2018 Konstantin Belousov <kib@FreeBSD.org>

Improve procstat reporting for the linux cdev file descriptors.

If there is a vnode attached to the linux file, use it to fill
kinfo_file. Otherwise, report a new KF_TYPE_DEV file type, without
supplying any type-specific information.

KF_TYPE_DEV is supposed to be used by most devfs-specific file types.

Sponsored by: Mellanox Technologies
MFC after: 1 week


# e35079db 30-Oct-2018 Hans Petter Selasky <hselasky@FreeBSD.org>

Implement the dump_stack() function in the LinuxKPI.

Submitted by: Johannes Lundberg <johalun0@gmail.com>
MFC after: 3 days
Sponsored by: Mellanox Technologies


# 4b706099 30-Mar-2018 Hans Petter Selasky <hselasky@FreeBSD.org>

Optimise use of Giant in the LinuxKPI.

- Make sure Giant is locked when calling PCI device methods.
Newbus currently requires this.

- Avoid unlocking Giant right before acquiring the sleepqueue lock.
This can save a task switch.

MFC after: 1 week
Sponsored by: Mellanox Technologies


# 83630517 23-Mar-2018 Ed Maste <emaste@FreeBSD.org>

linuxkpi whitespace cleanup

Reviewed by: hselasky, markj
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D14807


# be15e133 08-Mar-2018 Hans Petter Selasky <hselasky@FreeBSD.org>

Implement proper support for complete_all() in the LinuxKPI.

When complete_all() is called there might be multiple waiters. The
current implementation could only handle one waiter. Make sure the
completion is sticky when complete_all() is called to be compatible
with Linux.

Found by: Johannes Lundberg <johalun0@gmail.com>
MFC after: 1 week
Sponsored by: Mellanox Technologies
Sponsored by: Limelight Networks


# 9555cfd2 02-Mar-2018 Hans Petter Selasky <hselasky@FreeBSD.org>

Rename callout member in struct timer_list to match the one in struct
delayed_work in the LinuxKPI. This allows the timer_pending() function
macro to be used with delayed work structures.

No functional nor structural change.

MFC after: 1 week
Submitted by: Johannes Lundberg <johalun0@gmail.com>
Sponsored by: Mellanox Technologies
Sponsored by: Limelight Networks


# 94944062 22-Feb-2018 Hans Petter Selasky <hselasky@FreeBSD.org>

Return correct error code to user-space when a system call receives a
signal in the LinuxKPI.

The read(), write() and mmap() system calls can return either EINTR or
ERESTART upon receiving a signal. Add code to figure out the correct
return value by temporarily storing the return code from the relevant
FreeBSD kernel APIs in the Linux task structure.

MFC after: 3 days
Sponsored by: Mellanox Technologies


# 0628fc90 18-Feb-2018 Hans Petter Selasky <hselasky@FreeBSD.org>

Make the vm_fault structure in the LinuxKPI compatible with
newer versions of the Linux kernel. No functional change.

MFC after: 1 week
Submitted by: Johannes Lundberg <johalun0@gmail.com>
Sponsored by: Mellanox Technologies
Sponsored by: Limelight Networks


# f71d0b0d 01-Feb-2018 Hans Petter Selasky <hselasky@FreeBSD.org>

Fix some recent regressions after r328436 in the LinuxKPI:

1) The OPW() function macro should have the same return type as the
function it executes.
2) The DEVFS I/O-limit should be enforced for all character device reads
and writes.
3) The character device file handle should be passable, same as for
DEVFS based file handles.

Reported by: jbeich @
MFC after: 1 week
Sponsored by: Mellanox Technologies


# 3f3735db 01-Feb-2018 Hans Petter Selasky <hselasky@FreeBSD.org>

Make sure the LinuxKPI's internal ERESTARTSYS error code gets translated
into ERESTART for mmap and page fault calls as well.

MFC after: 1 week
Sponsored by: Mellanox Technologies


# e23ae408 26-Jan-2018 Hans Petter Selasky <hselasky@FreeBSD.org>

Decouple Linux files from the belonging character device right after open
in the LinuxKPI. This is done by calling finit() just before returning a magic
value of ENXIO in the "linux_dev_fdopen" function.

The Linux file structure should mimic the BSD file structure as much as
possible. This patch decouples the Linux file structure from the belonging
character device right after the "linux_dev_fdopen" function has returned.
This fixes an issue which allows a Linux file handle to exist after a
character device has been destroyed and removed from the directory index
of /dev. Only when the reference count of the BSD file handle reaches zero,
the Linux file handle is destroyed. This fixes use-after-free issues related
to accessing the Linux file structure after the character device has been
destroyed.

While at it add a missing NULL check for non-present file operation.
Calling a NULL pointer will result in a segmentation fault.

Reviewed by: kib @
MFC after: 1 week
Sponsored by: Mellanox Technologies


# 0c19d064 13-Nov-2017 Hans Petter Selasky <hselasky@FreeBSD.org>

Properly handle the case where the linux_cdev_handle_insert() function
in the LinuxKPI returns NULL. This happens when the VM area's private
data handle already exists and could cause a so-called NULL pointer
dereferencing issue prior to this fix.

Found by: greg@unrelenting.technology
MFC after: 1 week
Sponsored by: Mellanox Technologies


# 8ead3a99 03-Nov-2017 Hans Petter Selasky <hselasky@FreeBSD.org>

Remove redundant dev->si_drv1 NULL checks in the LinuxKPI.
This pointer is checked during the linux_dev_open() callback and does
not need to be NULL checked again. It should always be set for
character devices belonging to the "linuxcdevsw" and technically
there is no need to NULL check this pointer at all.

Suggested by: kib @
MFC after: 1 week
Sponsored by: Mellanox Technologies


# 627ac5b4 13-Oct-2017 Hans Petter Selasky <hselasky@FreeBSD.org>

Don't call selrecord() outside the select system call in the LinuxKPI, because
then td->td_sel is NULL and this will result in a segfault inside selrecord().
This happens when only using kqueue() to poll for read and write events.
If select() and kqueue() are mixed there won't be a segfault.

Reported by: Johannes Lundberg
MFC after: 1 week
Sponsored by: Mellanox Technologies


# 40f53a7c 22-Sep-2017 Hans Petter Selasky <hselasky@FreeBSD.org>

Add support for 32-bit compatibility IOCTLs in the LinuxKPI.

Bump the FreeBSD version to force recompilation of external
kernel modules due to structure change.

PR: 222504
Submitted by: Greg V <greg@unrelenting.technology>
MFC after: 1 week
Sponsored by: Mellanox Technologies


# 6dec7efa 09-Sep-2017 Hans Petter Selasky <hselasky@FreeBSD.org>

Properly implement poll_wait() in the LinuxKPI. This prevents direct
use of the linux_poll_wakeup() function from unsafe contexts, which
can lead to use-after-free issues.

Instead of calling linux_poll_wakeup() directly use the wake_up()
family of functions in the LinuxKPI to do this.

Bump the FreeBSD version to force recompilation of external kernel modules.

MFC after: 1 week
Sponsored by: Mellanox Technologies


# ebf85480 11-Aug-2017 Hans Petter Selasky <hselasky@FreeBSD.org>

Make sure the "vm_flags" and "vm_page_prot" fields get set correctly
in the VM area structure in the LinuxKPI when doing mmap() and that
unsupported bits are masked away.

While at it fix some redundant use of parentheses inside some related
macros.

Found by: KrishnamRaju ErapaRaju <Krishna2@chelsio.com>
MFC after: 1 week
Sponsored by: Mellanox Technologies


# f6800be3 10-Aug-2017 Hans Petter Selasky <hselasky@FreeBSD.org>

Use integer type to pass around jiffies and/or ticks values in the
LinuxKPI because in FreeBSD ticks are 32-bit.

MFC after: 1 week
Sponsored by: Mellanox Technologies


# 2b79a966 02-Aug-2017 Hans Petter Selasky <hselasky@FreeBSD.org>

Fix LinuxKPI regression after r321920. The mda_unit and si_drv0 fields are not
wide enough to hold the full 64-bit dev_t. Instead use the "dev" field in
the "linux_cdev" structure to store and lookup this value.

While at it remove superfluous use of parentheses inside the
MAJOR(), MINOR() and MKDEV() macros in the LinuxKPI.

MFC after: 1 week
Sponsored by: Mellanox Technologies


# dac6b88a 08-Jul-2017 Mark Johnston <markj@FreeBSD.org>

Add some helper definitions to fs.h in the LinuxKPI.

Add a field to struct linux_file to allow the creation of anonymous
shmem objects.

MFC after: 1 week


# 61157228 07-Jul-2017 Hans Petter Selasky <hselasky@FreeBSD.org>

Complete r320189 which allows a NULL VM fault handler in the LinuxKPI.
Instead of mapping a dummy page upon a page fault, map the page
pointed to by the physical address given by IDX_TO_OFF(vmap->vm_pfn).
To simplify the implementation use OBJT_DEVICE to implement our own
linux_cdev_pager_fault() instead of using the existing
linux_cdev_pager_populate().

Some minor code factoring while at it.

Reviewed by: markj @
MFC after: 1 week
Sponsored by: Mellanox Technologies


# 47d8a7d4 21-Jun-2017 Mark Johnston <markj@FreeBSD.org>

Add missing lock destructor invocations to the LinuxKPI unload handler.

MFC after: 1 week


# cde3f930 21-Jun-2017 Hans Petter Selasky <hselasky@FreeBSD.org>

Allow the VM fault handler to be NULL in the LinuxKPI when handling a
memory map request. When the VM fault handler is NULL a return code of
VM_PAGER_BAD is returned from the character device's pager populate
handler. This fixes compatibility with Linux.

MFC after: 1 week
Sponsored by: Mellanox Technologies


# 864092bc 07-Jun-2017 Justin Hibbits <jhibbits@FreeBSD.org>

Remove ARM and MIPS from linuxkpi ioremap_attr definition

ARM and MIPS fail universe builds.

ARM and MIPS are missing the following:
* VM_MEMATTR_WRITE_THROUGH
* VM_MEMATTR_WRITE_COMBINING

Pointy-hat to: jhibbits


# 287e7a86 07-Jun-2017 Justin Hibbits <jhibbits@FreeBSD.org>

Add more #ifdef arch checks to the linuxkpi

arm, mips, and powerpc all implement pmap_mapdev_attr() and pmap_unmapdev(),
so add those archs to the checks. powerpc also includes the atomic_swap_*()
functions, so add that to the supported list as well. Not tested except by
compiling powerpc.

Reviewed by: markj


# 67e984c8 02-Jun-2017 Hans Petter Selasky <hselasky@FreeBSD.org>

Improve kqueue() support in the LinuxKPI. Some applications using
kqueue() do not set non-blocking I/O mode for event driven read of
file descriptors. This means the LinuxKPI internal kqueue read and
write event flags must be updated before the next read and/or write
system call. Else the read and/or write system call may block. This
can happen when there is no more data to read following a previous
read event. Then the application also gets blocked from processing
other events. This situation can also be solved by the applications
setting and using non-blocking I/O mode.

MFC after: 1 week
Sponsored by: Mellanox Technologies


# 639af71a 02-Jun-2017 Hans Petter Selasky <hselasky@FreeBSD.org>

Add support for setting the non-blocking I/O flag for LinuxKPI
character devices. In Linux the FIONBIO IOCTL is handled by the kernel
and not the drivers. Also need to return success for the FIOASYNC ioctl
due to existing logic in kern_fcntl() even though it is not supported
currently.

MFC after: 1 week
Sponsored by: Mellanox Technologies


# 8600ba1a 01-Jun-2017 Hans Petter Selasky <hselasky@FreeBSD.org>

Make sure the selrecord() function is only called from within system
polling contexts in the LinuxKPI.

After the kqueue() support was added to the LinuxKPI in r319409 the
Linux poll file operation will be used outside the system file polling
callback function, which can cause a NULL-pointer panic inside
selrecord() because curthread->td_sel is set to NULL. This patch moves
the selrecord() call away from poll_wait() and to the system file poll
callback function in the LinuxKPI, which essentially wraps the Linux
one. This is similar to what the cuse(3) module is currently doing.
Refer to sys/fs/cuse/*.[ch] for more details.

MFC after: 1 week
Sponsored by: Mellanox Technologies


# 328c75d6 01-Jun-2017 Hans Petter Selasky <hselasky@FreeBSD.org>

Translate the ERESTARTSYS error code into ERESTART in the LinuxKPI
ioctl(), read() and write() system call handlers. This error code is
internal to the kernel and should not be seen by user-space programs
according to Linux.
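
The translation amounts to a small mapping step, sketched below in plain C.
The sketch_ names are hypothetical; the numeric values (512 for Linux's
ERESTARTSYS, -1 for FreeBSD's kernel-internal ERESTART) are stated here as
assumptions for the sketch rather than quoted from the commit.

```c
#include <assert.h>
#include <errno.h>

/* Assumed values, for illustration only. */
#define SKETCH_ERESTARTSYS 512	/* Linux kernel-internal restart code */
#define SKETCH_ERESTART    (-1)	/* FreeBSD kernel-internal restart code */

/*
 * Convert the return value of a Linux-style file operation, which is zero
 * or a negative errno such as -EINVAL, into a FreeBSD-style positive error.
 */
static int
sketch_linux_to_bsd_error(int linux_ret)
{
	int error;

	assert(linux_ret <= 0);	/* only error returns are handled here */
	error = -linux_ret;
	if (error == SKETCH_ERESTARTSYS)
		error = SKETCH_ERESTART;	/* never leak 512 to user-space */
	return (error);
}
```

Ordinary errors pass through with only their sign flipped; the restart code
alone is rewritten.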

Submitted by: Yanko Yankulov <yanko.yankulov@gmail.com>
MFC after: 1 week
Sponsored by: Mellanox Technologies


# a6b28ee0 01-Jun-2017 Hans Petter Selasky <hselasky@FreeBSD.org>

Add generic kqueue() and kevent() support to the LinuxKPI character
devices. The implementation allows read and write filters to be
created and piggybacks on the poll() file operation to determine when
a filter should trigger. The piggyback mechanism is simply to check
for the EWOULDBLOCK or EAGAIN return code from read(), write() or
ioctl() system calls and then update the kqueue() polling state bits.
The implementation is similar to the one found in the cuse(3) module.
Refer to sys/fs/cuse/*.[ch] for more details.

MFC after: 1 week
Sponsored by: Mellanox Technologies


# dff36e69 31-May-2017 Hans Petter Selasky <hselasky@FreeBSD.org>

Implement in_atomic() function in the LinuxKPI.

Obtained from: kmacy @
MFC after: 1 week
Sponsored by: Mellanox Technologies


# 90b30e65 31-May-2017 Hans Petter Selasky <hselasky@FreeBSD.org>

Properly set the .d_name field in the cdevsw structure for the
LinuxKPI.

Obtained from: kmacy @
MFC after: 1 week
Sponsored by: Mellanox Technologies


# d56f1ed8 31-May-2017 Hans Petter Selasky <hselasky@FreeBSD.org>

Make sure the VMAP's "vm_file" field is referenced in a Linux
compatible way by the linux_dev_mmap_single() function in the
LinuxKPI.

MFC after: 1 week
Sponsored by: Mellanox Technologies


# cca15f28 31-May-2017 Hans Petter Selasky <hselasky@FreeBSD.org>

Remove the VMA handle from its list before calling the LinuxKPI VMA
close operation to prevent other threads from reusing the VM object
handle pointer.

MFC after: 1 week
Sponsored by: Mellanox Technologies


# ea67550b 30-May-2017 Hans Petter Selasky <hselasky@FreeBSD.org>

Fix a reference count leak in the LinuxKPI due to calling VM open when
it shouldn't be called.

Background:
The Linux VM open operation is called when a new VMA is
created on top of the current VMA. This is done through either mremap
flow or split_vma, usually due to mlock, madvise, munmap and so
on. This is currently not supported by the LinuxKPI.

MFC after: 1 week
Sponsored by: Mellanox Technologies


# f5a9867b 30-May-2017 Hans Petter Selasky <hselasky@FreeBSD.org>

Fixes for refcounting "struct linux_file" in the LinuxKPI.

- Allow "struct linux_file" to be refcounted when its "_file" member
is NULL by using its "f_count" field. The reference counts are
transferred to the file structure when the file descriptor is
installed.

- Add missing vdrop() calls for error cases during open().

- Set the "_file" member of "struct linux_file" during open. This
allows use of refcounting through get_file() and fput() with LinuxKPI
character devices.

MFC after: 1 week
Sponsored by: Mellanox Technologies


# 02fb845b 18-May-2017 Mark Johnston <markj@FreeBSD.org>

Fix a few uses of kern_yield() in the TTM and the LinuxKPI.

kern_yield(0) effectively causes the calling thread to be rescheduled
immediately since it resets the thread's priority to the highest possible
value. This can cause livelocks when the pattern
"while (!trylock()) kern_yield(0);" is used since the thread holding the
lock may linger on the runqueue for the CPU on which the looping thread is
running.

MFC after: 1 week


# 6c106233 05-May-2017 Mark Johnston <markj@FreeBSD.org>

Use pmap_invalidate_cache() to implement wbinvd_on_all_cpus().

Suggested by: jhb
X-MFC with: r317651


# 67960816 05-May-2017 Hans Petter Selasky <hselasky@FreeBSD.org>

Fix for use after free in the LinuxKPI.

Background:
The same VM object might be shared by multiple processes and the
mm_struct is usually freed when a process exits.

Grab a reference on the mm_struct while the vmap is in the
linux_vma_head list in case the first process which inserted a VM
object has exited.

Tested by: kwm @
MFC after: 1 week
Sponsored by: Mellanox Technologies


# c12488bb 01-May-2017 Mark Johnston <markj@FreeBSD.org>

Add on_each_cpu() and wbinvd_on_all_cpus().

Reviewed by: hselasky
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D10550


# b602c283 19-Apr-2017 Mark Johnston <markj@FreeBSD.org>

Drop Giant before sleeping in linux_wait_for_{timeout_,}common().

Reported and tested by: Pete Wright <pete@nomadlogic.org>
Reviewed by: hselasky (previous version)
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D10414


# 76fe8c93 09-Apr-2017 Hans Petter Selasky <hselasky@FreeBSD.org>

Fix compilation of LinuxKPI for PowerPC.

Found by: emaste @
MFC after: 1 week
Sponsored by: Mellanox Technologies


# 1ea4c857 06-Apr-2017 Hans Petter Selasky <hselasky@FreeBSD.org>

Implement proper support for memory map operations in the LinuxKPI,
like open, close and fault using the character device pager.

Some notes about the implementation:

1) Linux drivers set the vm_ops and vm_private_data fields during a
mmap() call to indicate that the driver wants to use the LinuxKPI VM
operations. Else these operations are not used.

2) The vm_private_data pointer is associated with a VM area structure
and inserted into an internal LinuxKPI list. If the vm_private_data
pointer already exists, the existing VM area structure is used instead
of the allocated one which gets freed.

3) The LinuxKPI's vm_private_data pointer is used as the callback
handle for the FreeBSD VM object. The VM subsystem in FreeBSD has a
similar list to identify equal handles and will only call the
character device pager's close function once.

4) All LinuxKPI VM operations are serialized through the mmap_sem
semaphore, which is per process, which prevents simultaneous access
to the shared VM area structure when receiving page faults.

Obtained from: kmacy @
MFC after: 1 week
Sponsored by: Mellanox Technologies


# 6e3e6544 04-Apr-2017 Hans Petter Selasky <hselasky@FreeBSD.org>

Unify error handling when si_drv1 is NULL in the LinuxKPI.

Make sure the character device poll callback function does not return
an error code, but a POLLXXX value, in case of failure.

MFC after: 1 week
Sponsored by: Mellanox Technologies


# 8f7eee5a 23-Mar-2017 Hans Petter Selasky <hselasky@FreeBSD.org>

Use ppsratecheck() for ratelimiting in the LinuxKPI.

Suggested by: cem @
MFC after: 1 week
Sponsored by: Mellanox Technologies


# e9db3df2 23-Mar-2017 Hans Petter Selasky <hselasky@FreeBSD.org>

Add support for ratelimited printouts in the LinuxKPI.

MFC after: 1 week
Sponsored by: Mellanox Technologies


# 3803a97f 16-Mar-2017 Hans Petter Selasky <hselasky@FreeBSD.org>

Use __LP64__ to detect presence of suword64() to fix linking and
loading of the LinuxKPI on 32-bit platforms.

Reported by: lwhsu @
MFC after: 1 week
Sponsored by: Mellanox Technologies


# 0e05589b 16-Mar-2017 Hans Petter Selasky <hselasky@FreeBSD.org>

Implement more userspace memory access functions in the LinuxKPI.

Obtained from: kmacy @
MFC after: 1 week
Sponsored by: Mellanox Technologies


# 40402727 16-Mar-2017 Hans Petter Selasky <hselasky@FreeBSD.org>

Add basic support for VIMAGE to the LinuxKPI and ibcore.

Support is implemented by mapping Linux's "struct net" into FreeBSD's
"struct vnet". Currently only vnet0 is supported by ibcore.

MFC after: 1 week
Sponsored by: Mellanox Technologies


# 5f50a414 14-Mar-2017 Hans Petter Selasky <hselasky@FreeBSD.org>

Set "current" pointer for LinuxKPI interrupts and timer callbacks.

MFC after: 1 week
Sponsored by: Mellanox Technologies


# ca2ad6bd 06-Mar-2017 Hans Petter Selasky <hselasky@FreeBSD.org>

LinuxKPI workqueue cleanup.

This change makes the workqueue implementation behave more like in
Linux, both functionality wise and structure wise.

All workqueue code has been moved to linux_work.c

Add an atomic based statemachine to the work_struct to ensure proper
operation. Prior to this change work_struct was directly mapped to a
FreeBSD task. When a taskqueue has multiple threads the same task may
end up being executed on more than one worker thread simultaneously.
This might cause problems with code coming from Linux, which expects
serial behaviour, similar to Linux tasklets.

Move all global workqueue function names into the linux_xxx domain to
avoid symbol name clashes in the future.

Implement a few more workqueue related functions and macros.

Create two multithreaded taskqueues for the LinuxKPI during module
load, one for time-consuming callbacks and one for non-time consuming
callbacks.

MFC after: 1 week
Sponsored by: Mellanox Technologies


# def277d3 06-Mar-2017 Hans Petter Selasky <hselasky@FreeBSD.org>

Implement add_timer_on() function in the LinuxKPI.

Obtained from: kmacy @
MFC after: 1 week
Sponsored by: Mellanox Technologies


# 1f827dab 03-Mar-2017 Hans Petter Selasky <hselasky@FreeBSD.org>

Update the LinuxKPI RCU and SRCU wrappers for the concurrency kit, CK.

- Optimise the RCU implementation to not allocate and free
ck_epoch_records during runtime. Instead allocate two sets of
ck_epoch_records per CPU for general purpose use. The first set is
only used for reader locks and the second set is only used for
synchronization and barriers and is protected with a regular mutex to
prevent simultaneous issues.

- Move the task structure away from the rcu_head structure and into
the per-CPU structures. This allows the size of the rcu_head structure
to be reduced down to the size of two pointers.

- Fix a bug where the linux_rcu_barrier() function only waited for one
per-CPU epoch record to be completed instead of all.

- Use a critical section or a mutex to protect ck_epoch_begin() and
ck_epoch_end() depending on RCU or SRCU type. All the ck_epoch_xxx()
functions, except ck_epoch_register(), ck_epoch_unregister() and
ck_epoch_recycle() are not re-entrant and need a critical section or
a mutex to operate in the LinuxKPI, after inspecting the CK
implementation of the above mentioned functions. The simultaneous
issues arise from per-CPU epoch records being shared between multiple
threads depending on the amount of taskswitching and how many threads
are involved with the RCU and SRCU operations.

- Properly free all epoch records by using safe list traversal at
LinuxKPI module unload. It turns out that ck_epoch_recycle() always
has the records on an internal list and uses a flag in the epoch
record to track allocated and free entries. This would lead to use
after free during module unload.

- Remove redundant synchronize_rcu() call from the
linux_compat_uninit() function. Let the linux_rcu_runtime_uninit()
function do the final rcu_barrier() instead.

MFC after: 1 week
Sponsored by: Mellanox Technologies


# 1a01b4e5 21-Feb-2017 Hans Petter Selasky <hselasky@FreeBSD.org>

Replace dummy implementation of RCU in the LinuxKPI with one based on
the in-kernel concurrency kit's ck_epoch API. Factor RCU hlist_xxx()
functions into own rculist.h header file.

Obtained from: kmacy @
MFC after: 1 week
Sponsored by: Mellanox Technologies


# 1e3db1de 20-Feb-2017 Hans Petter Selasky <hselasky@FreeBSD.org>

Make the LinuxKPI task struct persistent across system calls.

A set of helper functions have been added to manage the life of the
LinuxKPI task struct. When an external system call or task is invoked,
a check is made to create the task struct on demand. A thread
destructor callback is registered to free the task struct when a
thread exits to avoid memory leaks.

This change lays the ground for emulating the Linux kernel more
closely which is a dependency by the code using the LinuxKPI APIs.

A new dedicated td_lkpi_task field has been added to struct thread
instead of abusing td_retval[1].

Fix some header file inclusions to make LINT kernel build properly
after this change.

Bump the __FreeBSD_version to force a rebuild of all kernel modules.

MFC after: 1 week
Sponsored by: Mellanox Technologies


# 1125dbc0 25-Dec-2016 Hans Petter Selasky <hselasky@FreeBSD.org>

Implement register and unregister chrdev in the LinuxKPI.

Obtained from: kmacy @
MFC after: 1 week
Sponsored by: Mellanox Technologies


# 8eeb3e17 27-May-2016 Hans Petter Selasky <hselasky@FreeBSD.org>

The SCHEDULER_STOPPED() macro already contains a predict false statement.
Remove superfluous unlikely() wrapper.

Suggested by: glebius
MFC after: 1 week
Sponsored by: Mellanox Technologies


# 1d9b99e5 24-May-2016 Hans Petter Selasky <hselasky@FreeBSD.org>

Implement Linux module parameters as read-only tunable SYSCTLs.

Bool module parameters are no longer supported, because there is no
equivalent in FreeBSD.

There are two macros available which control the behaviour of the
LinuxKPI module parameters:

- LINUXKPI_PARAM_PARENT allows the consumer to set the SYSCTL parent
where the modules parameters will be created.

- LINUXKPI_PARAM_PREFIX defines a parameter name prefix, which is
added to all created module parameters.

Sponsored by: Mellanox Technologies
MFC after: 1 week


# 85714218 25-May-2016 Hans Petter Selasky <hselasky@FreeBSD.org>

Add checks for SCHEDULER_STOPPED() so that code using the LinuxKPI can
run after a panic(). This for example allows a LinuxKPI based graphics
stack to receive prints during a panic.

Obtained from: kmacy @
MFC after: 1 week
Sponsored by: Mellanox Technologies


# 3ce12630 24-May-2016 Hans Petter Selasky <hselasky@FreeBSD.org>

Set "current" for all PCI enumeration callbacks.

Obtained from: kmacy @
MFC after: 1 week
Sponsored by: Mellanox Technologies


# aad02fb4 22-May-2016 Hans Petter Selasky <hselasky@FreeBSD.org>

Add more list_xxx() functions to the LinuxKPI.

Obtained from: kmacy @
MFC after: 1 week
Sponsored by: Mellanox Technologies


# 03219fba 16-May-2016 Hans Petter Selasky <hselasky@FreeBSD.org>

Properly implement "cpu_has_clflush" macro.

Suggested by: kib, jhb
MFC after: 1 week
Sponsored by: Mellanox Technologies


# 3a8bec33 12-May-2016 Hans Petter Selasky <hselasky@FreeBSD.org>

Fix handling of IOCTLs in the LinuxKPI.

Linux requires that all IOCTL data resides in userspace. FreeBSD
always moves the main IOCTL structure into a kernel buffer before
invoking the IOCTL handler and then copies it back into userspace,
before returning. Hide this difference in the "linux_copyin()" and
"linux_copyout()" functions by remapping userspace addresses in the
range from 0x10000 to 0x20000, to the kernel IOCTL data buffer.

It is assumed that the userspace code, data and stack segments start
no lower than memory address 0x400000, which is also stated by "man 1
ld", which means any valid userspace pointer can be passed to regular
LinuxKPI handled IOCTLs.
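
The address remapping can be sketched as a range check in front of the
normal copy routine. The following userspace model is an assumption-laden
illustration: struct sketch_task, its fields and sketch_remap_copyin() are
hypothetical names, and memcpy() stands in for the kernel copy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Window of fake userspace addresses, taken from the text above. */
#define SKETCH_IOCTL_MIN 0x10000UL
#define SKETCH_IOCTL_MAX 0x20000UL

/* Hypothetical per-task record of the kernel IOCTL data buffer. */
struct sketch_task {
	void *ioctl_buf;
	size_t ioctl_len;
};

/*
 * If uaddr lies inside the window, copy from the kernel IOCTL buffer and
 * return 1. Otherwise return 0 so the caller can fall back to a regular
 * copy from userspace.
 */
static int
sketch_remap_copyin(const struct sketch_task *t, uintptr_t uaddr,
    void *kaddr, size_t len)
{
	size_t off;

	assert(t != NULL && kaddr != NULL);
	if (uaddr < SKETCH_IOCTL_MIN || uaddr >= SKETCH_IOCTL_MAX)
		return (0);
	off = uaddr - SKETCH_IOCTL_MIN;
	if (t->ioctl_buf == NULL || off > t->ioctl_len ||
	    len > t->ioctl_len - off)
		return (0);	/* request does not fit in the IOCTL buffer */
	memcpy(kaddr, (const char *)t->ioctl_buf + off, len);
	return (1);
}
```

Since real userspace pointers are assumed to start at 0x400000 or above,
they never fall inside the window and always take the fallback path.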

Bump the FreeBSD version to force recompilation of all kernel modules.

Discussed with: kmacy @
MFC after: 1 week
Sponsored by: Mellanox Technologies


# 464d20bc 12-May-2016 Hans Petter Selasky <hselasky@FreeBSD.org>

Create a dummy "task_struct" on the stack which is returned by
"current" inside all LinuxKPI file operation callbacks. The "current"
is frequently used for various debug prints, printing the thread name
and thread ID for example.

Obtained from: kmacy @
MFC after: 1 week
Sponsored by: Mellanox Technologies


# b3c89b5a 11-May-2016 Hans Petter Selasky <hselasky@FreeBSD.org>

Return a proper error code instead of panicking when an I/O vector
having the wrong number of entries is detected.

MFC after: 1 week
Sponsored by: Mellanox Technologies


# 0754e66c 09-May-2016 Hans Petter Selasky <hselasky@FreeBSD.org>

Fix file polling bug.

Ensure the actual poll result is returned by the "linux_file_poll()"
function instead of zero which means no data is available.

MFC after: 3 days
Sponsored by: Mellanox Technologies


# b0338411 31-Mar-2016 Navdeep Parhar <np@FreeBSD.org>

Add wait_event_interruptible_timeout to linuxkpi.

Submitted by: Krishnamraju Eraparaju @ Chelsio
Reviewed by: hselasky@
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D5776


# a1e1814d 22-Feb-2016 Svatopluk Kraus <skra@FreeBSD.org>

As <machine/pmap.h> is included from <vm/pmap.h>, there is no need to
include it explicitly when <vm/pmap.h> is already included.

Reviewed by: alc, kib
Differential Revision: https://reviews.freebsd.org/D5373


# e23cd1b9 19-Jan-2016 John Baldwin <jhb@FreeBSD.org>

Initialize vm_page_prot to VM_MEMATTR_DEFAULT instead of 0.

If a driver's Linux mmap callback passed vm_page_prot through unchanged,
then linux_dev_mmap_single() would try to apply whatever VM_MEMATTR_xxx
value 0 is to the mapping. On x86, VM_MEMATTR_DEFAULT is the PAT value
for write-back (WB) which is 6, while 0 maps to the PAT value for
uncacheable (UC). Thus, any mmap request that did not explicitly set
page_prot tried to map memory as UC, triggering the warning in
sg_pager_getpages().

Tested by: np
Reported by: Krishnamraju Eraparaju @ Chelsio
MFC after: 3 days
Sponsored by: Chelsio Communications


# 0c510167 08-Jan-2016 Hans Petter Selasky <hselasky@FreeBSD.org>

LinuxKPI style changes:
- Properly prefix internal functions with "linux_" instead of only a
single underscore to avoid future namespace collisions.
- Make some functions global instead of inline to ease debugging and
to avoid unnecessary code duplication.
- Remove no longer existing kthread_create() function's prototype.

MFC after: 1 week
Sponsored by: Mellanox Technologies


# e10c4cc0 04-Jan-2016 Hans Petter Selasky <hselasky@FreeBSD.org>

Implement RCU mechanism using shared exclusive locks.

MFC after: 1 week
Sponsored by: Mellanox Technologies


# 06204f8e 30-Dec-2015 Hans Petter Selasky <hselasky@FreeBSD.org>

Minor LinuxKPI code cleanup:
- Declare some static functions in linux_compat.c instead of inside
various header files.
- Prefix FreeBSD local functions in the LinuxKPI with "linux_" to
avoid symbol name conflicts in the future and to make debugging
easier.
- Make the "struct kobj_ktype" declarations constant to shave off a
few bytes from the data segment.

MFC after: 1 week
Sponsored by: Mellanox Technologies


# 337cb9f0 31-Dec-2015 Hans Petter Selasky <hselasky@FreeBSD.org>

Make the kobject refcounting compliant with Linux. Refcounting on the
parent kobject cannot be factored out and must be done by the kobject
consumers.

MFC after: 1 week
Sponsored by: Mellanox Technologies


# 26019405 28-Dec-2015 Hans Petter Selasky <hselasky@FreeBSD.org>

Reduce memory consumption when allocating kobject strings in the
LinuxKPI. Compute string length before allocating memory instead of
using fixed size allocations. Make kobject_set_name_vargs() global
instead of inline to save some bytes when compiling.
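
Computing the length first is commonly done with a sizing pass of
vsnprintf(), as in the userspace sketch below. The sketch_ function names
are hypothetical and malloc() stands in for the kernel allocator; this is
not the kobject_set_name_vargs() implementation itself.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Format into a buffer that is exactly as large as the result requires. */
static char *
sketch_vformat_name(const char *fmt, va_list args)
{
	va_list copy;
	char *buf;
	int len;

	assert(fmt != NULL);

	/* First pass: a NULL buffer of size 0 only reports the length. */
	va_copy(copy, args);
	len = vsnprintf(NULL, 0, fmt, copy);
	va_end(copy);
	if (len < 0)
		return (NULL);

	/* Second pass: allocate len + 1 bytes and format for real. */
	buf = malloc((size_t)len + 1);
	if (buf == NULL)
		return (NULL);
	vsnprintf(buf, (size_t)len + 1, fmt, args);
	return (buf);
}

/* Variadic wrapper; the caller owns the returned string and must free() it. */
static char *
sketch_format_name(const char *fmt, ...)
{
	va_list args;
	char *name;

	va_start(args, fmt);
	name = sketch_vformat_name(fmt, args);
	va_end(args);
	return (name);
}
```

The va_copy() is needed because the argument list is traversed twice, once
for sizing and once for formatting.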

MFC after: 1 week
Sponsored by: Mellanox Technologies


# c4e58b4e 20-Dec-2015 Hans Petter Selasky <hselasky@FreeBSD.org>

Implement drain_workqueue() function.

MFC after: 1 week
Sponsored by: Mellanox Technologies


# 55d445d3 21-Dec-2015 Hans Petter Selasky <hselasky@FreeBSD.org>

Minor workqueue cleanup:
- Make some functions global instead of inline to ease debugging.
- Fix some minor style issues.

MFC after: 1 week
Sponsored by: Mellanox Technologies


# f727a767 13-Nov-2015 Hans Petter Selasky <hselasky@FreeBSD.org>

Add assert and note about the size of "unsigned long" inside the
LinuxKPI for the future.

Sponsored by: Mellanox Technologies


# 86845417 12-Nov-2015 Hans Petter Selasky <hselasky@FreeBSD.org>

Build fixes:
- Add some missing I/O functions for non-i386 and amd64 platforms.
- Stub ioremap() to NULL using a macro to ensure non-existing memory
attributes are not referred when they do not exist.
- Add more header files to linux/list.h to resolve driver compilation
issues on Sparc64 and PowerPC platforms.

Sponsored by: Mellanox Technologies


# 8d59ecb2 29-Oct-2015 Hans Petter Selasky <hselasky@FreeBSD.org>

Finish process of moving the LinuxKPI module into the default kernel build.

- Move all files related to the LinuxKPI into sys/compat/linuxkpi and
its subfolders.
- Update sys/conf/files and some Makefiles to use new file locations.
- Added description of COMPAT_LINUXKPI to sys/conf/NOTES which in turn
adds the LinuxKPI to all LINT builds.
- The LinuxKPI can be added to the kernel by setting the
COMPAT_LINUXKPI option. The OFED kernel option no longer builds the
LinuxKPI into the kernel. This was done to keep the build rules for
the LinuxKPI in sys/conf/files simple.
- Extend the LinuxKPI module to include support for USB by moving the
Linux USB compat from usb.ko to linuxkpi.ko.
- Bump the FreeBSD_version.
- A universe kernel build has been done.

Reviewed by: np @ (cxgb and cxgbe related changes only)
Sponsored by: Mellanox Technologies