#
82e96613 |
|
30-Apr-2024 |
Jules Irenge <jbi.octave@gmail.com> |
RDMA/mlx5: Remove NULL check before dev_{put, hold} Coccinelle reports a warning WARNING: NULL check before dev_{put, hold} functions is not needed The reason is the call netdev_{put, hold} of dev_{put,hold} will check NULL There is no need to check before using dev_{put, hold} Signed-off-by: Jules Irenge <jbi.octave@gmail.com> Link: https://lore.kernel.org/r/ZjGC4qXrOwZE0aHi@octinomon.home Signed-off-by: Leon Romanovsky <leon@kernel.org>
|
#
d727d27d |
|
06-Dec-2023 |
Mark Bloch <mbloch@nvidia.com> |
RDMA/mlx5: Expose register c0 for RDMA device This patch introduces improvements for matching egress traffic sent by the local device. When applicable, all egress traffic from the local vport is now tagged with the provided value. This enhancement is particularly useful for FDB steering purposes. The primary focus of this update is facilitating the transmission of traffic from the hypervisor to a VF. To achieve this, one must initiate an SQ on the hypervisor and subsequently create a rule in the FDB that matches on the eswitch manager vport and the SQN of the aforementioned SQ. Obtaining the SQN can be had from SQ opened, and the eswitch manager vport match can be substituted with the register c0 value exposed by this patch. Signed-off-by: Mark Bloch <mbloch@nvidia.com> Reviewed-by: Michael Guralnik <michaelgur@nvidia.com> Link: https://lore.kernel.org/r/aa4120a91c98ff1c44f1213388c744d4cb0324d6.1701871118.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org>
|
#
2ef422f0 |
|
24-Oct-2023 |
George Kennedy <george.kennedy@oracle.com> |
IB/mlx5: Fix init stage error handling to avoid double free of same QP and UAF In the unlikely event that workqueue allocation fails and returns NULL in mlx5_mkey_cache_init(), delete the call to mlx5r_umr_resource_cleanup() (which frees the QP) in mlx5_ib_stage_post_ib_reg_umr_init(). This will avoid attempted double free of the same QP when __mlx5_ib_add() does its cleanup. Resolves a splat: Syzkaller reported a UAF in ib_destroy_qp_user workqueue: Failed to create a rescuer kthread for wq "mkey_cache": -EINTR infiniband mlx5_0: mlx5_mkey_cache_init:981:(pid 1642): failed to create work queue infiniband mlx5_0: mlx5_ib_stage_post_ib_reg_umr_init:4075:(pid 1642): mr cache init failed -12 ================================================================== BUG: KASAN: slab-use-after-free in ib_destroy_qp_user (drivers/infiniband/core/verbs.c:2073) Read of size 8 at addr ffff88810da310a8 by task repro_upstream/1642 Call Trace: <TASK> kasan_report (mm/kasan/report.c:590) ib_destroy_qp_user (drivers/infiniband/core/verbs.c:2073) mlx5r_umr_resource_cleanup (drivers/infiniband/hw/mlx5/umr.c:198) __mlx5_ib_add (drivers/infiniband/hw/mlx5/main.c:4178) mlx5r_probe (drivers/infiniband/hw/mlx5/main.c:4402) ... </TASK> Allocated by task 1642: __kmalloc (./include/linux/kasan.h:198 mm/slab_common.c:1026 mm/slab_common.c:1039) create_qp (./include/linux/slab.h:603 ./include/linux/slab.h:720 ./include/rdma/ib_verbs.h:2795 drivers/infiniband/core/verbs.c:1209) ib_create_qp_kernel (drivers/infiniband/core/verbs.c:1347) mlx5r_umr_resource_init (drivers/infiniband/hw/mlx5/umr.c:164) mlx5_ib_stage_post_ib_reg_umr_init (drivers/infiniband/hw/mlx5/main.c:4070) __mlx5_ib_add (drivers/infiniband/hw/mlx5/main.c:4168) mlx5r_probe (drivers/infiniband/hw/mlx5/main.c:4402) ... Freed by task 1642: __kmem_cache_free (mm/slub.c:1826 mm/slub.c:3809 mm/slub.c:3822) ib_destroy_qp_user (drivers/infiniband/core/verbs.c:2112) mlx5r_umr_resource_cleanup (drivers/infiniband/hw/mlx5/umr.c:198) mlx5_ib_stage_post_ib_reg_umr_init (drivers/infiniband/hw/mlx5/main.c:4076 drivers/infiniband/hw/mlx5/main.c:4065) __mlx5_ib_add (drivers/infiniband/hw/mlx5/main.c:4168) mlx5r_probe (drivers/infiniband/hw/mlx5/main.c:4402) ... Fixes: 04876c12c19e ("RDMA/mlx5: Move init and cleanup of UMR to umr.c") Link: https://lore.kernel.org/r/1698170518-4006-1-git-send-email-george.kennedy@oracle.com Suggested-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: George Kennedy <george.kennedy@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
#
02e7d139 |
|
19-Oct-2023 |
Patrisious Haddad <phaddad@nvidia.com> |
RDMA/mlx5: Change the key being sent for MPV device affiliation Change the key that we send from IB driver to EN driver regarding the MPV device affiliation, since at that stage the IB device is not yet initialized, so its index would be zero for different IB devices and cause wrong associations between unrelated master and slave devices. Instead use a unique value from inside the core device which is already initialized at this stage. Fixes: 0d293714ac32 ("RDMA/mlx5: Send events from IB driver about device affiliation state") Signed-off-by: Patrisious Haddad <phaddad@nvidia.com> Link: https://lore.kernel.org/r/ac7e66357d963fc68d7a419515180212c96d137d.1697705185.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org>
|
#
b28ad324 |
|
20-Sep-2023 |
Patrisious Haddad <phaddad@nvidia.com> |
IB/mlx5: Rename 400G_8X speed to comply to naming convention Rename 400G_8X speed to comply to naming convention. Signed-off-by: Patrisious Haddad <phaddad@nvidia.com> Reviewed-by: Mark Zhang <markzhang@nvidia.com> Link: https://lore.kernel.org/r/ac98447cac8379a43fbdb36d56e5fb2b741a97ff.1695204156.git.leon@kernel.org Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>
|
#
948f0bf5 |
|
20-Sep-2023 |
Patrisious Haddad <phaddad@nvidia.com> |
IB/mlx5: Add support for 800G_8X lane speed Add a check for 800G_8X speed when querying PTYS and report it back correctly when needed. Signed-off-by: Patrisious Haddad <phaddad@nvidia.com> Reviewed-by: Mark Zhang <markzhang@nvidia.com> Link: https://lore.kernel.org/r/26fd0b6e1fac071c3eb779657bb3d8ba47f47c4f.1695204156.git.leon@kernel.org Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>
|
#
0d293714 |
|
21-Sep-2023 |
Patrisious Haddad <phaddad@nvidia.com> |
RDMA/mlx5: Send events from IB driver about device affiliation state Send blocking events from IB driver whenever the device is done being affiliated or if it is removed from an affiliation. This is useful since now the EN driver can register to those event and know when a device is affiliated or not. Signed-off-by: Patrisious Haddad <phaddad@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Link: https://lore.kernel.org/r/a7491c3e483cfd8d962f5f75b9a25f253043384a.1695296682.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org>
|
#
dab994bc |
|
20-Sep-2023 |
Shay Drory <shayd@nvidia.com> |
RDMA/mlx5: Fix NULL string error checkpath is complaining about NULL string, change it to 'Unknown'. Fixes: 37aa5c36aa70 ("IB/mlx5: Add UARs write-combining and non-cached mapping") Signed-off-by: Shay Drory <shayd@nvidia.com> Link: https://lore.kernel.org/r/8638e5c14fadbde5fa9961874feae917073af920.1695203958.git.leonro@nvidia.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
|
#
58dbd642 |
|
12-Apr-2023 |
Patrisious Haddad <phaddad@nvidia.com> |
RDMA/mlx5: Handles RoCE MACsec steering rules addition and deletion Add RoCE MACsec rules when a gid is added for the MACsec netdevice and handle their cleanup when the gid is removed or the MACsec SA is deleted. Also support alias IP for the MACsec device, as long as we don't have more ips than what the gid table can hold. In addition handle the case where a gid is added but there are still no SAs added for the MACsec device, so the rules are added later on when the SAs are added. Signed-off-by: Patrisious Haddad <phaddad@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
|
#
758ce14a |
|
02-May-2022 |
Patrisious Haddad <phaddad@nvidia.com> |
RDMA/mlx5: Implement MACsec gid addition and deletion Handle MACsec IP ambiguity issue, since mlx5 hw can't support programming both the MACsec and the physical gid when they have the same IP address, because it wouldn't know to whom to steer the traffic. Hence in such case we delete the physical gid from the hw gid table, which would then cause all traffic sent over it to fail, and we'll only be able to send traffic over the MACsec gid. Signed-off-by: Patrisious Haddad <phaddad@nvidia.com> Reviewed-by: Raed Salem <raeds@nvidia.com> Reviewed-by: Mark Zhang <markzhang@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
|
#
674dd4e2 |
|
22-Jun-2023 |
Maher Sanalla <msanalla@nvidia.com> |
net/mlx5: Rename mlx5_comp_vectors_count() to mlx5_comp_vectors_max() To accurately represent its purpose, rename the function that retrieves the value of maximum vectors from mlx5_comp_vectors_count() to mlx5_comp_vectors_max(). Signed-off-by: Maher Sanalla <msanalla@nvidia.com> Reviewed-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
#
ee4d269e |
|
05-Jun-2023 |
Maher Sanalla <msanalla@nvidia.com> |
RDMA/mlx5: Initiate dropless RQ for RAW Ethernet functions Delay drop data is initiated for PFs that have the capability of rq_delay_drop and are in roce profile. However, PFs with RAW ethernet profile do not initiate delay drop data on function load, causing kernel panic if delay drop struct members are accessed later on in case a dropless RQ is created. Thus, stage the delay drop initialization as part of RAW ethernet PF loading process. Fixes: b5ca15ad7e61 ("IB/mlx5: Add proper representors support") Signed-off-by: Maher Sanalla <msanalla@nvidia.com> Reviewed-by: Maor Gottlieb <maorg@nvidia.com> Link: https://lore.kernel.org/r/2e9d386785043d48c38711826eb910315c1de141.1685960567.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org>
|
#
88c9483f |
|
16-Mar-2023 |
Maher Sanalla <msanalla@nvidia.com> |
IB/mlx5: Add support for 400G_8X lane speed Currently, when driver queries PTYS to report which link speed is being used on its RoCE ports, it does not check the case of having 400Gbps transmitted over 8 lanes. Thus it fails to report the said speed and instead it defaults to report 10G over 4 lanes. Add a check for the said speed when querying PTYS and report it back correctly when needed. Fixes: 08e8676f1607 ("IB/mlx5: Add support for 50Gbps per lane link modes") Signed-off-by: Maher Sanalla <msanalla@nvidia.com> Reviewed-by: Aya Levin <ayal@nvidia.com> Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Link: https://lore.kernel.org/r/ec9040548d119d22557d6a4b4070d6f421701fd4.1678973994.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org>
|
#
594cac11 |
|
17-Jan-2023 |
Or Har-Toov <ohartoov@nvidia.com> |
RDMA/mlx5: Use query_special_contexts for mkeys Use query_sepcial_contexts to get the correct value of mkeys such as null_mkey, terminate_scatter_list_mkey and dump_fill_mkey, as FW will change them in certain configurations. Link: https://lore.kernel.org/r/000236f0a9487d48809f87bcc3620a3964b2d3d3.1673960981.git.leon@kernel.org Signed-off-by: Or Har-Toov <ohartoov@nvidia.com> Reviewed-by: Michael Guralnik <michaelgur@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
#
1c71222e |
|
26-Jan-2023 |
Suren Baghdasaryan <surenb@google.com> |
mm: replace vma->vm_flags direct modifications with modifier calls Replace direct modifications to vma->vm_flags with calls to modifier functions to be able to track flag changes and to keep vma locking correctness. [akpm@linux-foundation.org: fix drivers/misc/open-dice.c, per Hyeonggon Yoo] Link: https://lkml.kernel.org/r/20230126193752.297968-5-surenb@google.com Signed-off-by: Suren Baghdasaryan <surenb@google.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Mel Gorman <mgorman@techsingularity.net> Acked-by: Mike Rapoport (IBM) <rppt@kernel.org> Acked-by: Sebastian Reichel <sebastian.reichel@collabora.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com> Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arjun Roy <arjunroy@google.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: David Hildenbrand <david@redhat.com> Cc: David Howells <dhowells@redhat.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: David Rientjes <rientjes@google.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Greg Thelen <gthelen@google.com> Cc: Hugh Dickins <hughd@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: Joel Fernandes <joelaf@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Kent Overstreet <kent.overstreet@linux.dev> Cc: Laurent Dufour <ldufour@linux.ibm.com> Cc: Lorenzo Stoakes <lstoakes@gmail.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Minchan Kim <minchan@google.com> Cc: Paul E. McKenney <paulmck@kernel.org> Cc: Peter Oskolkov <posk@google.com> Cc: Peter Xu <peterx@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Punit Agrawal <punit.agrawal@bytedance.com> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Shakeel Butt <shakeelb@google.com> Cc: Soheil Hassas Yeganeh <soheil@google.com> Cc: Song Liu <songliubraving@fb.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
85f9e38a |
|
02-Feb-2023 |
Leon Romanovsky <leon@kernel.org> |
RDMA/mlx5: Remove impossible check of mkey cache cleanup failure mlx5_mkey_cache_cleanup() can't fail and can be changed to be void. Link: https://lore.kernel.org/r/1acd9528995d083114e7dec2a2afc59436406583.1675328463.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
|
#
312b8f79 |
|
04-Jan-2023 |
Mark Zhang <markzhang@nvidia.com> |
RDMA/mlx: Calling qp event handler in workqueue context Move the call of qp event handler from atomic to workqueue context, so that the handler is able to block. This is needed by following patches. Signed-off-by: Mark Zhang <markzhang@nvidia.com> Reviewed-by: Patrisious Haddad <phaddad@nvidia.com> Link: https://lore.kernel.org/r/0cd17b8331e445f03942f4bb28d447f24ac5669d.1672821186.git.leonro@nvidia.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
|
#
dca55da0 |
|
01-Nov-2022 |
Jiri Pirko <jiri@nvidia.com> |
RDMA/mlx5: Track netdev to avoid deadlock during netdev notifier unregister When removing a network namespace with mlx5 devlink instance being in it, following callchain is performed: cleanup_net (takes down_read(&pernet_ops_rwsem) devlink_pernet_pre_exit() devlink_reload() mlx5_devlink_reload_down() mlx5_unload_one_devl_locked() mlx5_detach_device() del_adev() mlx5r_remove() __mlx5_ib_remove() mlx5_ib_roce_cleanup() mlx5_remove_netdev_notifier() unregister_netdevice_notifier (takes down_write(&pernet_ops_rwsem) This deadlocks. Resolve this by converting to register_netdevice_notifier_dev_net() which does not take pernet_ops_rwsem and moves the notifier block around according to netdev it takes as arg. Use previously introduced netdev added/removed events to track uplink netdev to be used for register_netdevice_notifier_dev_net() purposes. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
#
ca7ef7ad |
|
22-Aug-2022 |
Daisuke Matsuda <matsuda-daisuke@fujitsu.com> |
IB/mlx5: Remove duplicate header inclusion related to ODP rdma/ib_umem.h and rdma/ib_verbs.h are included by rdma/ib_umem_odp.h. This patch removes the redundant entries. Link: https://lore.kernel.org/r/20220823025131.862811-1-matsuda-daisuke@fujitsu.com Signed-off-by: Daisuke Matsuda <matsuda-daisuke@fujitsu.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>
|
#
13ad1125 |
|
31-Jul-2022 |
Aharon Landau <aharonl@nvidia.com> |
RDMA/mlx5: Don't compare mkey tags in DEVX indirect mkey According to the ib spec: If the CI supports the Base Memory Management Extensions defined in this specification, the L_Key format must consist of: 24 bit index in the most significant bits of the R_Key, and 8 bit key in the least significant bits of the R_Key Through a successful Allocate L_Key verb invocation, the CI must let the consumer own the key portion of the returned R_Key Therefore, when creating a mkey using DEVX, the consumer is allowed to change the key part. The kernel should compare only the index part of a R_Key to determine equality with another R_Key. Adding capability in order not to break backward compatibility. Fixes: 534fd7aac56a ("IB/mlx5: Manage indirection mkey upon DEVX flow for ODP") Link: https://lore.kernel.org/r/3d669aacea85a3a15c3b3b953b3eaba3f80ef9be.1659255945.git.leonro@nvidia.com Signed-off-by: Aharon Landau <aharonl@nvidia.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>
|
#
9ca05b0f |
|
28-Aug-2022 |
Maher Sanalla <msanalla@nvidia.com> |
RDMA/mlx5: Rely on RoCE fw cap instead of devlink when setting profile When the RDMA auxiliary driver probes, it sets its profile based on devlink driverinit value. The latter might not be in sync with FW yet (In case devlink reload is not performed), thus causing a mismatch between RDMA driver and FW. This results in the following FW syndrome when the RDMA driver tries to adjust RoCE state, which fails the probe: "0xC1F678 | modify_nic_vport_context: roce_en set on a vport that doesn't support roce" To prevent this, select the PF profile based on FW RoCE capability instead of relying on devlink driverinit value. To provide backward compatibility of the RoCE disable feature, on older FW's where roce_rw is not set (FW RoCE capability is read-only), keep the current behavior e.g., rely on devlink driverinit value. Fixes: fbfa97b4d79f ("net/mlx5: Disable roce at HCA level") Reviewed-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Michael Guralnik <michaelgur@nvidia.com> Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Maher Sanalla <msanalla@nvidia.com> Link: https://lore.kernel.org/r/cb34ce9a1df4a24c135cb804db87f7d2418bd6cc.1661763459.git.leonro@nvidia.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
|
#
4b83c3ca |
|
08-Aug-2022 |
Mark Bloch <mbloch@nvidia.com> |
RDMA/mlx5: Use the proper number of ports The cited commit allowed the driver to operate over HCAs that have 4 physical ports. Use the number of ports of the RDMA device in the for loop instead of using the struct size. Fixes: 4cd14d44b11d ("net/mlx5: Support devices with more than 2 ports") Link: https://lore.kernel.org/r/a54a56c2ede16044a29d119209b35189c662ac72.1659944855.git.leonro@nvidia.com Signed-off-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>
|
#
01137808 |
|
26-Jul-2022 |
Aharon Landau <aharonl@nvidia.com> |
RDMA/mlx5: Rename the mkey cache variables and functions After replacing the MR cache with an Mkey cache, rename the variables and functions to fit the new meaning. Link: https://lore.kernel.org/r/20220726071911.122765-6-michaelgur@nvidia.com Signed-off-by: Aharon Landau <aharonl@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
#
04876c12 |
|
12-Apr-2022 |
Aharon Landau <aharonl@nvidia.com> |
RDMA/mlx5: Move init and cleanup of UMR to umr.c The first patch in a series to split UMR logic to a dedicated file. As a start, move the init and cleanup of UMR resources to umr.c. Link: https://lore.kernel.org/r/849e632dd1945a2534712a320cc5779f2149ba96.1649747695.git.leonro@nvidia.com Signed-off-by: Aharon Landau <aharonl@nvidia.com> Reviewed-by: Michael Guralnik <michaelgur@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
#
e945c653 |
|
03-Apr-2022 |
Jason Gunthorpe <jgg@ziepe.ca> |
RDMA: Split kernel-only global device caps from uverbs device caps Split out flags from ib_device::device_cap_flags that are only used internally to the kernel into kernel_cap_flags that is not part of the uapi. This limits the device_cap_flags to being the same bitmap that will be copied to userspace. This cleanly splits out the uverbs flags from the kernel flags to avoid confusion in the flags bitmap. Add some short comments describing which each of the kernel flags is connected to. Remove unused kernel flags. Link: https://lore.kernel.org/r/0-v2-22c19e565eef+139a-kern_caps_jgg@nvidia.com Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
#
34a30d76 |
|
01-Mar-2022 |
Mark Bloch <mbloch@nvidia.com> |
net/mlx5: Lag, expose number of lag ports Downstream patches will add support for hardware lag with more than 2 ports. Add a way for users to query the number of lag ports. Signed-off-by: Mark Bloch <mbloch@nvidia.com> Reviewed-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
#
de8bdb47 |
|
06-Apr-2022 |
Leon Romanovsky <leon@kernel.org> |
RDMA/mlx5: Drop crypto flow steering API The mlx5 flow steering crypto API was intended to be used in FPGA devices, which is not supported for years already. The removal of mlx5 crypto FPGA code together with inability to configure encryption keys makes the low steering API completely unusable. So delete the code, so any ESP flow steering requests will fail with not supported error, as it is happening now anyway as no device support this type of API. Link: https://lore.kernel.org/r/634a5face7734381463d809bfb89850f6998deac.1649232994.git.leonro@nvidia.com Reviewed-by: Raed Salem <raeds@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
|
#
74ec29bd |
|
06-Apr-2022 |
Leon Romanovsky <leon@kernel.org> |
RDMA/mlx5: Delete never supported IPsec flow action The IPSEC_REQUIRED_METADATA capability bit is never set, and can be safely removed from the flow action flags. Link: https://lore.kernel.org/r/697cd60bd5c9b6a004c449c1a41c2798fac844ff.1649232994.git.leonro@nvidia.com Reviewed-by: Raed Salem <raeds@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
|
#
66771a1c |
|
18-Feb-2022 |
Moshe Shemesh <moshe@nvidia.com> |
net/mlx5: Move debugfs entries to separate struct Move the debugfs entry pointers under priv to their own struct. Add get function for device debugfs root. Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
#
27963d3d |
|
21-Dec-2021 |
David E. Box <david.e.box@linux.intel.com> |
RDMA/irdma: Use auxiliary_device driver data helpers Use auxiliary_get_drvdata and auxiliary_set_drvdata helpers. Reviewed-by: Cezary Rojewski <cezary.rojewski@intel.com> Signed-off-by: David E. Box <david.e.box@linux.intel.com> Link: https://lore.kernel.org/r/20211221235852.323752-2-david.e.box@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
d2c8a155 |
|
22-Sep-2021 |
Meir Lichtinger <meirl@nvidia.com> |
IB/mlx5: Enable UAR to have DevX UID UID field was added to alloc_uar and dealloc_uar PRM command, to specify DevX UID for UAR. This change enables firmware validating user access to its own UAR resources. For the kernel allocated UARs the UID will stay 0 as of today. Signed-off-by: Meir Lichtinger <meirl@nvidia.com> Reviewed-by: Yishai Hadas <yishaih@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
|
#
20da44df |
|
23-Jul-2021 |
Leon Romanovsky <leon@kernel.org> |
RDMA/mlx5: Drop in-driver verbs object creations There is no real value in bypassing IB/core APIs for creating standard objects with standard types. The open-coded variant didn't have any restrack task management calls and caused to such objects to be not present when running rdmatoool. Link: https://lore.kernel.org/r/f745590e5fb7d56f90fdb25f64ee3983ba17e1e4.1627040189.git.leonro@nvidia.com Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
#
514aee66 |
|
23-Jul-2021 |
Leon Romanovsky <leon@kernel.org> |
RDMA: Globally allocate and release QP memory Convert QP object to follow IB/core general allocation scheme. That change allows us to make sure that restrack properly kref the memory. Link: https://lore.kernel.org/r/48e767124758aeecc433360ddd85eaa6325b34d9.1627040189.git.leonro@nvidia.com Reviewed-by: Gal Pressman <galpress@amazon.com> #efa Tested-by: Gal Pressman <galpress@amazon.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> #rdma and core Tested-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Tested-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
#
0dc0da15 |
|
23-Jul-2021 |
Leon Romanovsky <leon@kernel.org> |
RDMA/mlx5: Rework custom driver QP type creation Starting from commit 2b1f747071c5 ("RDMA/core: Allow drivers to disable restrack DB") the restrack is able to handle non-standard QP types either. That change allows us to rewrite custom QP calls to their IB/core counterparts, so we will use general QP creation flow even for the driver QP types. Link: https://lore.kernel.org/r/51682ab82298748941f38bd23ee3bf77ef1cab7b.1627040189.git.leonro@nvidia.com Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
#
8c9e7f03 |
|
23-Jul-2021 |
Leon Romanovsky <leon@kernel.org> |
RDMA/mlx5: Delete device resource mutex that didn't protect anything The dev->devr.mutex was intended to protect GSI QP pointer change in the struct mlx5_ib_port_resources when it is accessed from the pkey_change_work. However that pointer isn't changed during the runtime and once IB/core adds MAD, it stays stable. Link: https://lore.kernel.org/r/6e338c561033df20d92e1371fc6a7a0d93aad945.1627040189.git.leonro@nvidia.com Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
#
b0791dbf |
|
23-Jul-2021 |
Leon Romanovsky <leon@kernel.org> |
RDMA/mlx5: Cancel pkey work before destroying device resources In the driver release flow, we are ensuring that notifier is disabled and no new works can be added to pkey_change_handler. It means that we can cancel that handler before destroying resources to make sure that our unwind routine is symmetrical to the allocation one. Link: https://lore.kernel.org/r/f2b1ea1bad952e4e7a48a6f731de9e0344986b29.1627040189.git.leonro@nvidia.com Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
#
11656f59 |
|
21-Jun-2021 |
Lior Nahmanson <liorna@mellanox.com> |
RDMA/mlx5: Add DCS offload support DCS is an offload to SW load balancing of DC initiator work requests. A single DCI can be connected to only one target at the time and can't start new connection until the previous work request is completed. This limitation will cause to delay when the initiator process needs to transfer data to multiple targets at the same time. The SW solution is to use a process that handling and spreading the work request on many DCIs according to destinations. This feature is an offload to this process and coming to reduce the load from the CPU and improve the performance. Link: https://lore.kernel.org/r/491c2c2afdb5b07de7f03eab3f93cf0704549dbc.1624258894.git.leonro@nvidia.com Reviewed-by: Meir Lichtinger <meirl@nvidia.com> Signed-off-by: Lior Nahmanson <liorna@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
#
c446d9da |
|
03-Aug-2021 |
Mark Bloch <mbloch@nvidia.com> |
RDMA/mlx5: Add shared FDB support Shared FDB allows to create a single RDMA device that holds representors from both eswitches. As shared FDB is only active when both uplink representors are enslaved there is a single RDMA port that represents both uplinks. The number of ports is the number of vports on both eswitches minus one as we only need 1 port for both uplinks. Signed-off-by: Mark Bloch <mbloch@nvidia.com> Reviewed-by: Mark Zhang <markzhang@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
#
da78fe5f |
|
09-Aug-2021 |
Maor Gottlieb <maorg@nvidia.com> |
RDMA/mlx5: Fix crash when unbind multiport slave Fix the below crash when deleting a slave from the unaffiliated list twice. First time when the slave is bound to the master and the second when the slave is unloaded. Fix it by checking if slave is unaffiliated (doesn't have ib device) before removing from the list. RIP: 0010:mlx5r_mp_remove+0x4e/0xa0 [mlx5_ib] Call Trace: auxiliary_bus_remove+0x18/0x30 __device_release_driver+0x177/x220 device_release_driver+0x24/0x30 bus_remove_device+0xd8/0x140 device_del+0x18a/0x3e0 mlx5_rescan_drivers_locked+0xa9/0x210 [mlx5_core] mlx5_unregister_device+0x34/0x60 [mlx5_core] mlx5_uninit_one+0x32/0x100 [mlx5_core] remove_one+0x6e/0xe0 [mlx5_core] pci_device_remove+0x36/0xa0 __device_release_driver+0x177/0x220 device_driver_detach+0x3c/0xa0 unbind_store+0x113/0x130 kernfs_fop_write_iter+0x110/0x1a0 new_sync_write+0x116/0x1a0 vfs_write+0x1ba/0x260 ksys_write+0x5f/0xe0 do_syscall_64+0x3d/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae Fixes: 93f8244431ad ("RDMA/mlx5: Convert mlx5_ib to use auxiliary bus") Link: https://lore.kernel.org/r/17ec98989b0ba88f7adfbad68eb20bce8d567b44.1628587493.git.leonro@nvidia.com Signed-off-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
#
4a754d76 |
|
29-Jun-2021 |
Leon Romanovsky <leon@kernel.org> |
RDMA/mlx5: Don't access NULL-cleared mpi pointer The "dev->port[i].mp.mpi" is set to NULL during mlx5_ib_unbind_slave_port() execution, however that field is needed to add device to unaffiliated list. Such flow causes to the following kernel panic while unloading mlx5_ib module in multi-port mode, hence the device should be added to the list prior to unbind call. RPC: Unregistered rdma transport module. RPC: Unregistered rdma backchannel transport module. BUG: kernel NULL pointer dereference, address: 0000000000000000 #PF: supervisor write access in kernel mode #PF: error_code(0x0002) - not-present page PGD 0 P4D 0 Oops: 0002 [#1] SMP NOPTI CPU: 4 PID: 1904 Comm: modprobe Not tainted 5.13.0-rc7_for_upstream_min_debug_2021_06_24_12_08 #1 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 RIP: 0010:mlx5_ib_cleanup_multiport_master+0x18b/0x2d0 [mlx5_ib] Code: 00 04 0f 85 c4 00 00 00 48 89 df e8 ef fa ff ff 48 8b 83 40 0d 00 00 48 8b 15 b9 e8 05 00 4a 8b 44 28 20 48 89 05 ad e8 05 00 <48> c7 00 d0 57 c5 a0 48 89 50 08 48 89 02 39 ab 88 0a 00 00 0f 86 RSP: 0018:ffff888116ee3df8 EFLAGS: 00010296 RAX: 0000000000000000 RBX: ffff8881154f6000 RCX: 0000000000000080 RDX: ffffffffa0c557d0 RSI: ffff88810b69d200 RDI: 000000000002d8a0 RBP: 0000000000000002 R08: ffff888110780408 R09: 0000000000000000 R10: ffff88812452e1c0 R11: fffffffffff7e028 R12: 0000000000000000 R13: 0000000000000080 R14: ffff888102c58000 R15: 0000000000000000 FS: 00007f884393a740(0000) GS:ffff8882f5a00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 00000001249f6004 CR4: 0000000000370ea0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: mlx5_ib_stage_init_cleanup+0x16/0xd0 [mlx5_ib] __mlx5_ib_remove+0x33/0x90 [mlx5_ib] mlx5r_remove+0x22/0x30 [mlx5_ib] auxiliary_bus_remove+0x18/0x30 __device_release_driver+0x177/0x220 driver_detach+0xc4/0x100 bus_remove_driver+0x58/0xd0 auxiliary_driver_unregister+0x12/0x20 mlx5_ib_cleanup+0x13/0x897 [mlx5_ib] __x64_sys_delete_module+0x154/0x230 ? exit_to_user_mode_prepare+0x104/0x140 do_syscall_64+0x3f/0x80 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7f8842e095c7 Code: 73 01 c3 48 8b 0d d9 48 2c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 b8 b0 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d a9 48 2c 00 f7 d8 64 89 01 48 RSP: 002b:00007ffc68f6e758 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0 RAX: ffffffffffffffda RBX: 00005638207929c0 RCX: 00007f8842e095c7 RDX: 0000000000000000 RSI: 0000000000000800 RDI: 0000563820792a28 RBP: 00005638207929c0 R08: 00007ffc68f6d701 R09: 0000000000000000 R10: 00007f8842e82880 R11: 0000000000000206 R12: 0000563820792a28 R13: 0000000000000001 R14: 0000563820792a28 R15: 00007ffc68f6fb40 Modules linked in: xt_MASQUERADE nf_conntrack_netlink nfnetlink iptable_nat xt_addrtype xt_conntrack nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 br_netfilter overlay rdma_ucm ib_iser libiscsi scsi_transport_iscsi rdma_cm iw_cm ib_ipoib ib_cm ib_umad mlx5_ib(-) mlx4_ib ib_uverbs ib_core mlx4_en mlx4_core mlx5_core ptp pps_core [last unloaded: rpcrdma] CR2: 0000000000000000 ---[ end trace a0bb7e20804e9e9b ]--- Fixes: 7ce6095e3bff ("RDMA/mlx5: Don't add slave port to unaffiliated list") Link: https://lore.kernel.org/r/899ac1b33a995be5ec0e16a4765c4e43c2b1ba5b.1624956444.git.leonro@nvidia.com Reviewed-by: Itay Aveksis <itayav@nvidia.com> Reviewed-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
#
33652951 |
|
16-Jun-2021 |
Aharon Landau <aharonl@nvidia.com> |
RDMA/mlx5: Support real-time timestamp directly from the device Currently, if the user asks for a real-time timestamp, the device will return a free-running one, and the timestamp will be translated to real-time in the user-space. When the device supports only real-time timestamp and not free-running, the creation of the QP will fail even though the user needs supported the real-time one. To prevent this, we will return the real-time timestamp directly from the device. Link: https://lore.kernel.org/r/c6cfc8e6f038575c5c2de6505830f7e74e4de80d.1623829775.git.leonro@nvidia.com Signed-off-by: Aharon Landau <aharonl@nvidia.com> Reviewed-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
#
915e4af5 |
|
11-Jun-2021 |
Jason Gunthorpe <jgg@ziepe.ca> |
RDMA: Remove rdma_set_device_sysfs_group() The driver's device group can be specified as part of the ops structure like the device's port group. No need for the complicated API. Link: https://lore.kernel.org/r/8964785a34fd3a29ff5b6693493f575b717e594d.1623427137.git.leonro@nvidia.com Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
#
7ce6095e |
|
31-May-2021 |
Leon Romanovsky <leon@kernel.org> |
RDMA/mlx5: Don't add slave port to unaffiliated list The mlx5_ib_bind_slave_port() doesn't remove multiport device from the unaffiliated list, but mlx5_ib_unbind_slave_port() did it. This unbalanced flow caused to the situation where mlx5_ib_unaffiliated_port_list was changed during iteration. Fixes: 32f69e4be269 ("{net, IB}/mlx5: Manage port association for multiport RoCE") Link: https://lore.kernel.org/r/2726e6603b1e6ecfe76aa5a12a063af72173bcf7.1622477058.git.leonro@nvidia.com Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
#
c906b86e |
|
10-May-2021 |
Sergey Gorenko <sergeygo@nvidia.com> |
RDMA/mlx5: Add SQD2RTS bit to the alloc ucontext response The new bit in the comp_mask is needed to mark that kernel supports SQD2RTS transition for the modify QP command. Link: https://lore.kernel.org/r/7ce705fedac1b2b8e3a2f4013e04244dc5946344.1620641808.git.leonro@nvidia.com Reviewed-by: Evgenii Kochetov <evgeniik@nvidia.com> Signed-off-by: Sergey Gorenko <sergeygo@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
#
97f30d32 |
|
10-May-2021 |
Maor Gottlieb <maorg@nvidia.com> |
RDMA/mlx5: Recover from fatal event in dual port mode When there is fatal event on the slave port, the device is marked as not active. We need to mark it as active again when the slave is recovered to regain full functionality. Fixes: d69a24e03659 ("IB/mlx5: Move IB event processing onto a workqueue") Link: https://lore.kernel.org/r/8906754455bb23019ef223c725d2c0d38acfb80b.1620711734.git.leonro@nvidia.com Signed-off-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
#
dedbc2d3 |
|
18-Apr-2021 |
Parav Pandit <parav@nvidia.com> |
IB/mlx5: Set right RoCE l3 type and roce version while deleting GID Currently when GID is deleted, it zero out all the fields of the RoCE address in the SET_ROCE_ADDRESS command for a specified index. roce_version = 0 means RoCEv1 in the SET_ROCE_ADDRESS command. This assumes that device has RoCEv1 always enabled which is not always correct. For example Subfunction does not support RoCEv1. Due to this assumption a previously added RoCEv2 GID is always deleted as RoCEv1 GID. This results in a below syndrome: mlx5_core.sf mlx5_core.sf.4: mlx5_cmd_check:777:(pid 4256): SET_ROCE_ADDRESS(0x761) op_mod(0x0) failed, status bad parameter(0x3), syndrome (0x12822d) Hence set the right RoCE version during GID deletion provided by the core. Link: https://lore.kernel.org/r/d3f54129c90ca329caf438dbe31875d8ad08d91a.1618753425.git.leonro@nvidia.com Signed-off-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
#
cea85fa5 |
|
11-Apr-2021 |
Maor Gottlieb <maorg@nvidia.com> |
RDMA/mlx5: Add support in MEMIC operations MEMIC buffer, in addition to regular read and write operations, can support atomic operations from the host. Introduce and implement new UAPI to allocate address space for MEMIC operations such as atomic. This includes: 1. Expose new IOCTL for request mapping of MEMIC operation. 2. Hold the operations address in a list, so same operation to same DM will be allocated only once. 3. Manage refcount on the mlx5_ib_dm object, so it would be keep valid until all addresses were unmapped. Link: https://lore.kernel.org/r/20210411122924.60230-7-leon@kernel.org Signed-off-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
#
251b9d78 |
|
11-Apr-2021 |
Maor Gottlieb <maorg@nvidia.com> |
RDMA/mlx5: Re-organize the DM code 1. Inline the checks from check_dm_type_support() into their respective allocation functions. 2. Fix use after free when driver fails to copy the MEMIC address to the user by moving the allocation code into their respective functions, hence we avoid the explicit call to free the DM in the error flow. 3. Split mlx5_ib_dm struct to memic and icm proper typesafety throughout. Fixes: dc2316eba73f ("IB/mlx5: Fix device memory flows") Link: https://lore.kernel.org/r/20210411122924.60230-5-leon@kernel.org Signed-off-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
#
831df883 |
|
11-Apr-2021 |
Maor Gottlieb <maorg@nvidia.com> |
RDMA/mlx5: Move all DM logic to separate file Move all device memory related code to a separate file. Link: https://lore.kernel.org/r/20210411122924.60230-4-leon@kernel.org Signed-off-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
#
e5dc370b |
|
18-Mar-2021 |
Shay Drory <shayd@nvidia.com> |
RDMA/mlx5: Set ODP caps only if device profile support ODP Currently, ODP caps are set during the init stage of mlx5_ib_dev, regardless of whether the device profile supports ODP or not. There is no point in setting ODP caps if the device profile doesn't support ODP. Hence, move setting the ODP caps to the odp_init stage. Link: https://lore.kernel.org/r/20210318135259.681264-1-leon@kernel.org Reviewed-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Shay Drory <shayd@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
#
1fb7f897 |
|
01-Mar-2021 |
Mark Bloch <mbloch@nvidia.com> |
RDMA: Support more than 255 rdma ports Current code uses many different types when dealing with a port of a RDMA device: u8, unsigned int and u32. Switch to u32 to clean up the logic. This allows us to make (at least) the core view consistent and use the same type. Unfortunately not all places can be converted. Many uverbs functions expect port to be u8 so keep those places in order not to break UAPIs. HW/Spec defined values must also not be changed. With the switch to u32 we now can support devices with more than 255 ports. U32_MAX is reserved to make control logic a bit easier to deal with. As a device with U32_MAX ports probably isn't going to happen any time soon this seems like a non issue. When a device with more than 255 ports is created uverbs will report the RDMA device as having 255 ports as this is the max currently supported. The verbs interface is not changed yet because the IBTA spec limits the port size in too many places to be u8 and all applications that relies in verbs won't be able to cope with this change. At this stage, we are extending the interfaces that are using vendor channel solely Once the limitation is lifted mlx5 in switchdev mode will be able to have thousands of SFs created by the device. As the only instance of an RDMA device that reports more than 255 ports will be a representor device and it exposes itself as a RAW Ethernet only device CM/MAD/IPoIB and other ULPs aren't effected by this change and their sysfs/interfaces that are exposes to userspace can remain unchanged. While here cleanup some alignment issues and remove unneeded sanity checks (mainly in rdmavt), Link: https://lore.kernel.org/r/20210301070420.439400-1-leon@kernel.org Signed-off-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
#
2904bb37 |
|
04-Mar-2021 |
Yishai Hadas <yishaih@nvidia.com> |
IB/core: Split uverbs_get_const/default to consider target type Change uverbs_get_const/uverbs_get_const_default to work properly with both signed/unsigned parameters. Current APIs mix s64 and u64 which leads to incorrect check when u64 value was supplied and its upper bit was set. In that case uverbs_get_const() / uverbs_get_const_default() lower bound check may fail unexpectedly, target is unsigned (lower bound is 0) but value became negative as of the s64 usage. Split to have two different APIs, no change to callers as the required API will be called internally according to the target type. Link: https://lore.kernel.org/r/20210304130501.1102577-3-leon@kernel.org Signed-off-by: Yishai Hadas <yishaih@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
#
7852546f |
|
04-Mar-2021 |
Maor Gottlieb <maorg@nvidia.com> |
RDMA/mlx5: Fix query RoCE port mlx5_is_roce_enabled returns the devlink RoCE init value, therefore it should be used only when driver is loaded. Instead we just need to read the roce_en field. In addition, rename mlx5_is_roce_enabled to mlx5_is_roce_init_enabled. Fixes: 7a58779edd75 ("IB/mlx5: Improve query port for representor port") Link: https://lore.kernel.org/r/20210304124517.1100608-2-leon@kernel.org Signed-off-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
#
658cfceb |
|
11-Mar-2021 |
Mark Bloch <mbloch@nvidia.com> |
RDMA/mlx5: Use representor E-Switch when getting netdev and metadata Now that a pointer to the managing E-Switch is stored in the representor use it. Signed-off-by: Mark Bloch <mbloch@nvidia.com> Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Acked-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
#
db72438c |
|
02-Feb-2021 |
Yishai Hadas <yishaih@nvidia.com> |
RDMA/mlx5: Cleanup the synchronize_srcu() from the ODP flow Cleanup the synchronize_srcu() from the ODP flow as it was found to be a very heavy time consumer as part of dereg_mr. For example de-registration of 10000 ODP MRs each with size of 2M hugepage took 19.6 sec comparing de-registration of same number of non ODP MRs that took 172 ms. The new locking scheme uses the wait_event() mechanism which follows the use count of the MR instead of using synchronize_srcu(). By that change, the time required for the above test took 95 ms which is even better than the non ODP flow. Once fully dropped the srcu usage, had to come with a lock to protect the XA access. As part of using the above mechanism we could also clean the num_deferred_work stuff and follow the use count instead. Link: https://lore.kernel.org/r/20210202071309.2057998-1-leon@kernel.org Signed-off-by: Yishai Hadas <yishaih@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
#
13179652 |
|
03-Feb-2021 |
Parav Pandit <parav@nvidia.com> |
IB/mlx5: Use rdma_for_each_port for port iteration Instead of open coding the loop for port iteration, use rdma_for_each_port macro provided by core. To use such macro, early initialization of phys_port_cnt is needed. Hence, initialize such constant early enough in the init stage. Whichever functions are called with port using rdma_for_each_port(), convert their port type from u8 to unsigned int to match the core API. Link: https://lore.kernel.org/r/20210203130133.4057329-6-leon@kernel.org Signed-off-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
#
7416790e |
|
03-Feb-2021 |
Parav Pandit <parav@nvidia.com> |
RDMA/core: Introduce and use API to read port immutable data Currently mlx5 driver caches port GID table length for 2 ports. It is also cached by IB core as port immutable data. When mlx5 representor ports are present, which are usually more than 2, invalid access to port_caps array can happen while validating the GID table length which is only for 2 ports. To avoid this, take help of the IB cores port immutable data by exposing an API to read the port immutable fields. Remove mlx5 driver's internal cache, thereby reduce code and data. Link: https://lore.kernel.org/r/20210203130133.4057329-5-leon@kernel.org Signed-off-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
#
7a58779e |
|
03-Feb-2021 |
Parav Pandit <parav@nvidia.com> |
IB/mlx5: Improve query port for representor port Improve query port functionality for representor port as below. 1. RoCE Qkey violation counters are not applicable for representor port. 2. Avoid setting gid_tbl_len twice for representor port. 3. Avoid setting ip_gids and IB_PORT_CM_SUP property for representor port as GID table is empty and CM support is not present in representor mode. Link: https://lore.kernel.org/r/20210203130133.4057329-4-leon@kernel.org Signed-off-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
#
2019d70e |
|
03-Feb-2021 |
Parav Pandit <parav@nvidia.com> |
IB/mlx5: Avoid calling query device for reading pkey table length Pkey table length for all the ports of the device is the same. Currently get_ports_cap() reads and stores it for each port by querying the device which reads more than just pkey table length. For representor device ports which can be in range of hundreds, it queries is for each such port and end up returning same value for all the ports. When in representor mode, modify QP accesses pkey port caps for a port index that can be outside of the port_caps table. Hence, simplify the logic to query the max pkey table length only once during device initialization sequence. Link: https://lore.kernel.org/r/20210203130133.4057329-3-leon@kernel.org Signed-off-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
#
3ce60f44 |
|
03-Feb-2021 |
Parav Pandit <parav@nvidia.com> |
IB/mlx5: Move mlx5_port_caps from mlx5_core_dev to mlx5_ib_dev mlx5_port_caps are RDMA specific capabilities. These are not used by the mlx5_core_device at all. Move them to mlx5_ib_dev where it is used and reduce the scope of it to multiple drivers. Link: https://lore.kernel.org/r/20210203130133.4057329-2-leon@kernel.org Signed-off-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
#
d6fd59e1 |
|
27-Jan-2021 |
Parav Pandit <parav@nvidia.com> |
IB/mlx5: Support default partition key for representor port Representor port has only one default pkey. Hence have simpler query pkey callback or it. Link: https://lore.kernel.org/r/20210127150010.1876121-4-leon@kernel.org Signed-off-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
#
d286ac1d |
|
27-Jan-2021 |
Parav Pandit <parav@nvidia.com> |
IB/mlx5: Return appropriate error code instead of ENOMEM When mlx5_ib_stage_init_init() fails, return the error code related to failure instead of -ENOMEM. Fixes: 16c1975f1032 ("IB/mlx5: Create profile infrastructure to add and remove stages") Link: https://lore.kernel.org/r/20210127150010.1876121-8-leon@kernel.org Signed-off-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
#
90da7dc8 |
|
15-Dec-2020 |
Jianxin Xiong <jianxin.xiong@intel.com> |
RDMA/mlx5: Support dma-buf based userspace memory region Implement the new driver method 'reg_user_mr_dmabuf'. Utilize the core functions to import dma-buf based memory region and update the mappings. Add code to handle dma-buf related page fault. Link: https://lore.kernel.org/r/1608067636-98073-5-git-send-email-jianxin.xiong@intel.com Signed-off-by: Jianxin Xiong <jianxin.xiong@intel.com> Reviewed-by: Sean Hefty <sean.hefty@intel.com> Acked-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Acked-by: Christian Koenig <christian.koenig@amd.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
#
ab40530a |
|
13-Jan-2021 |
Parav Pandit <parav@nvidia.com> |
IB/mlx5: Add mutex destroy call to cap_mask_mutex mutex mutex_destroy() call for device's cap_mask_mutex mutex is missing, let's add it to annotate destruction. Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters") Link: https://lore.kernel.org/r/20210113121703.559778-4-leon@kernel.org Signed-off-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
#
de641d74 |
|
17-Jan-2021 |
Parav Pandit <parav@nvidia.com> |
Revert "RDMA/mlx5: Fix devlink deadlock on net namespace deletion" This reverts commit fbdd0049d98d44914fc57d4b91f867f4996c787b. Due to commit in fixes tag, netdevice events were received only in one net namespace of mlx5_core_dev. Due to this when netdevice events arrive in net namespace other than net namespace of mlx5_core_dev, they are missed. This results in empty GID table due to RDMA device being detached from its net device. Hence, revert back to receive netdevice events in all net namespaces to restore back RDMA functionality in non init_net net namespace. The deadlock will have to be addressed in another patch. Fixes: fbdd0049d98d ("RDMA/mlx5: Fix devlink deadlock on net namespace deletion") Link: https://lore.kernel.org/r/20210117092633.10690-1-leon@kernel.org Signed-off-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
#
1c3aa6bd |
|
13-Jan-2021 |
Mark Bloch <mbloch@nvidia.com> |
RDMA/mlx5: Fix wrong free of blue flame register on error If the allocation of the fast path blue flame register fails, the driver should free the regular blue flame register allocated a statement above, not the one that it just failed to allocate. Fixes: 16c1975f1032 ("IB/mlx5: Create profile infrastructure to add and remove stages") Link: https://lore.kernel.org/r/20210113121703.559778-6-leon@kernel.org Reported-by: Hans Petter Selasky <hanss@nvidia.com> Signed-off-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
#
2cb091f6 |
|
13-Jan-2021 |
Parav Pandit <parav@nvidia.com> |
IB/mlx5: Fix error unwinding when set_has_smi_cap fails When set_has_smi_cap() fails, multiport master cleanup is missed. Fix it by doing the correct error unwinding goto. Fixes: a989ea01cb10 ("RDMA/mlx5: Move SMI caps logic") Link: https://lore.kernel.org/r/20210113121703.559778-3-leon@kernel.org Signed-off-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
#
93f82444 |
|
03-Oct-2020 |
Leon Romanovsky <leon@kernel.org> |
RDMA/mlx5: Convert mlx5_ib to use auxiliary bus The conversion to auxiliary bus solves long standing issue with existing mlx5_ib<->mlx5_core coupling. It required to have both modules in initramfs if one of them needed for the boot. Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
|
#
f946e45f |
|
26-Oct-2020 |
Meir Lichtinger <meirl@mellanox.com> |
IB/mlx5: Add support for NDR link speed The IBTA specification has new speed - NDR. That speed supports signaling rate of 100Gb. mlx5 IB driver translates link modes reported by ConnectX device to IB speed and width. Added translation of new 100Gb, 200Gb and 400Gb link modes to NDR IB type and width of x1, x2 or x4 respectively. Link: https://lore.kernel.org/r/20201026133738.1340432-3-leon@kernel.org Signed-off-by: Meir Lichtinger <meirl@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
#
8010d74b |
|
26-Oct-2020 |
Jason Gunthorpe <jgg@ziepe.ca> |
RDMA/mlx5: Split the WR setup out of mlx5_ib_update_xlt() The memory allocation is quite complicated, and makes this function hard to understand. Refactor things so that a function call sets up the WR, SG, DMA mapping and buffer, further splitting that into buffer and DMA/wr. This also slightly changes the buffer allocation logic to try an order 0 page allocation (with OOM warnings on) before going to the emergency page. Link: https://lore.kernel.org/r/20201026132314.1336717-4-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
#
f22c30aa |
|
26-Oct-2020 |
Jason Gunthorpe <jgg@ziepe.ca> |
RDMA/mlx5: Move xlt_emergency_page_mutex into mr.c This is the only user, so remove the wrappers. Link: https://lore.kernel.org/r/20201026132314.1336717-3-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
#
1c7fd726 |
|
07-Oct-2020 |
Joe Perches <joe@perches.com> |
RDMA: Convert sysfs device * show functions to use sysfs_emit() Done with cocci script: @@ identifier d_show; identifier dev, attr, buf; @@ ssize_t d_show(struct device *dev, struct device_attribute *attr, char *buf) { <... return - sprintf(buf, + sysfs_emit(buf, ...); ...> } @@ identifier d_show; identifier dev, attr, buf; @@ ssize_t d_show(struct device *dev, struct device_attribute *attr, char *buf) { <... return - snprintf(buf, PAGE_SIZE, + sysfs_emit(buf, ...); ...> } @@ identifier d_show; identifier dev, attr, buf; @@ ssize_t d_show(struct device *dev, struct device_attribute *attr, char *buf) { <... return - scnprintf(buf, PAGE_SIZE, + sysfs_emit(buf, ...); ...> } @@ identifier d_show; identifier dev, attr, buf; expression chr; @@ ssize_t d_show(struct device *dev, struct device_attribute *attr, char *buf) { <... return - strcpy(buf, chr); + sysfs_emit(buf, chr); ...> } @@ identifier d_show; identifier dev, attr, buf; identifier len; @@ ssize_t d_show(struct device *dev, struct device_attribute *attr, char *buf) { <... len = - sprintf(buf, + sysfs_emit(buf, ...); ...> return len; } @@ identifier d_show; identifier dev, attr, buf; identifier len; @@ ssize_t d_show(struct device *dev, struct device_attribute *attr, char *buf) { <... len = - snprintf(buf, PAGE_SIZE, + sysfs_emit(buf, ...); ...> return len; } @@ identifier d_show; identifier dev, attr, buf; identifier len; @@ ssize_t d_show(struct device *dev, struct device_attribute *attr, char *buf) { <... len = - scnprintf(buf, PAGE_SIZE, + sysfs_emit(buf, ...); ...> return len; } @@ identifier d_show; identifier dev, attr, buf; identifier len; @@ ssize_t d_show(struct device *dev, struct device_attribute *attr, char *buf) { <... - len += scnprintf(buf + len, PAGE_SIZE - len, + len += sysfs_emit_at(buf, len, ...); ...> return len; } @@ identifier d_show; identifier dev, attr, buf; expression chr; @@ ssize_t d_show(struct device *dev, struct device_attribute *attr, char *buf) { ... - strcpy(buf, chr); - return strlen(buf); + return sysfs_emit(buf, chr); } Link: https://lore.kernel.org/r/7f406fa8e3aa2552c022bec680f621e38d1fe414.1602122879.git.joe@perches.com Signed-off-by: Joe Perches <joe@perches.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
#
676a80ad |
|
03-Oct-2020 |
Jason Gunthorpe <jgg@ziepe.ca> |
RDMA: Remove AH from uverbs_cmd_mask Drivers that need a uverbs AH should instead set the create_user_ah() op similar to reg_user_mr(). MODIFY_AH and QUERY_AH cmds were never implemented so are just deleted. Link: https://lore.kernel.org/r/11-v1-caa70ba3d1ab+1436e-ucmd_mask_jgg@nvidia.com Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
#
1f11a761 |
|
03-Oct-2020 |
Jason Gunthorpe <jgg@ziepe.ca> |
RDMA: Check create_flags during create_qp Each driver should check that the QP attrs create_flags is supported. Unfortuantely when create_flags was added to the QP attrs the drivers were not updated. uverbs_ex_cmd_mask was used to block it - even though kernel drivers use these flags too. Check that flags is zero in all drivers that don't use it, remove IB_USER_VERBS_EX_CMD_CREATE_QP from uverbs_ex_cmd_mask. Fix the error code to be EOPNOTSUPP. Link: https://lore.kernel.org/r/8-v1-caa70ba3d1ab+1436e-ucmd_mask_jgg@nvidia.com Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
#
1c407cb5 |
|
03-Oct-2020 |
Jason Gunthorpe <jgg@ziepe.ca> |
RDMA: Check flags during create_cq Each driver should check that the CQ attrs is supported. Unfortuantely when flags was added to the CQ attrs the drivers were not updated, uverbs_ex_cmd_mask was used to block it. This was missed when create CQ was converted to ioctl, so non-zero flags could have been passed into drivers. Check that flags is zero in all drivers that don't use it, remove IB_USER_VERBS_EX_CMD_CREATE_CQ from uverbs_ex_cmd_mask. Fixes: 41b2a71fc848 ("IB/uverbs: Move ioctl path of create_cq and destroy_cq to a new file") Link: https://lore.kernel.org/r/7-v1-caa70ba3d1ab+1436e-ucmd_mask_jgg@nvidia.com Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
#
26e990ba |
|
03-Oct-2020 |
Jason Gunthorpe <jgg@ziepe.ca> |
RDMA: Check attr_mask during modify_qp Each driver should check that it can support the provided attr_mask during modify_qp. IB_USER_VERBS_EX_CMD_MODIFY_QP was being used to block modify_qp_ex because the driver didn't check RATE_LIMIT. Link: https://lore.kernel.org/r/6-v1-caa70ba3d1ab+1436e-ucmd_mask_jgg@nvidia.com Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
#
652caba5 |
|
03-Oct-2020 |
Jason Gunthorpe <jgg@ziepe.ca> |
RDMA: Check srq_type during create_srq uverbs was blocking srq_types the driver doesn't support based on the CREATE_XSRQ cmd_mask. Fix all drivers to check for supported srq_types during create_srq and move CREATE_XSRQ to the core code. Link: https://lore.kernel.org/r/5-v1-caa70ba3d1ab+1436e-ucmd_mask_jgg@nvidia.com Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
#
44ce37bc |
|
03-Oct-2020 |
Jason Gunthorpe <jgg@ziepe.ca> |
RDMA: Move more uverbs_cmd_mask settings to the core These functions all depend on the driver providing a specific op: - REREG_MR is rereg_user_mr(). bnxt_re set this without providing the op - ATTACH/DEATCH_MCAST is attach_mcast()/detach_mcast(). usnic set this without providing the op - OPEN_QP doesn't involve the driver but requires a XRCD. qedr provides xrcd but forgot to set it, usnic doesn't provide XRCD but set it anyhow. - OPEN/CLOSE_XRCD are the ops alloc_xrcd()/dealloc_xrcd() - CREATE_SRQ/DESTROY_SRQ are the ops create_srq()/destroy_srq() - QUERY/MODIFY_SRQ is op query_srq()/modify_srq(). hns sets this but sometimes supplies a NULL op. - RESIZE_CQ is op resize_cq(). bnxt_re sets this boes doesn't supply an op - ALLOC/DEALLOC_MW is alloc_mw()/dealloc_mw(). cxgb4 provided an (now deleted) implementation but no userspace All drivers were checked that no drivers provide the op without also setting uverbs_cmd_mask so this should have no functional change. Link: https://lore.kernel.org/r/4-v1-caa70ba3d1ab+1436e-ucmd_mask_jgg@nvidia.com Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
#
c074bb1e |
|
03-Oct-2020 |
Jason Gunthorpe <jgg@ziepe.ca> |
RDMA: Remove elements in uverbs_cmd_mask that all drivers set This is a step toward eliminating uverbs_cmd_mask. Preset this list in the core code. Only the op reg_user_mr wasn't already being required from the drivers. Link: https://lore.kernel.org/r/3-v1-caa70ba3d1ab+1436e-ucmd_mask_jgg@nvidia.com Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
#
b8e3130d |
|
03-Oct-2020 |
Jason Gunthorpe <jgg@ziepe.ca> |
RDMA: Remove uverbs_ex_cmd_mask values that are linked to functions Since a while now the uverbs layer checks if the driver implements a function before allowing the ucmd to proceed. This largely obsoletes the cmd_mask stuff, but there is some tricky bits in drivers preventing it from being removed. Remove the easy elements of uverbs_ex_cmd_mask by pre-setting them in the core code. These are triggered soley based on the related ops function pointer. query_device_ex is not triggered based on an op, but all drivers already implement something compatible with the extension, so enable it globally too. Link: https://lore.kernel.org/r/2-v1-caa70ba3d1ab+1436e-ucmd_mask_jgg@nvidia.com Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
#
fbdd0049 |
|
26-Oct-2020 |
Parav Pandit <parav@nvidia.com> |
RDMA/mlx5: Fix devlink deadlock on net namespace deletion When a mlx5 core devlink instance is reloaded in different net namespace, its associated IB device is deleted and recreated. Example sequence is: $ ip netns add foo $ devlink dev reload pci/0000:00:08.0 netns foo $ ip netns del foo mlx5 IB device needs to attach and detach the netdevice to it through the netdev notifier chain during load and unload sequence. A below call graph of the unload flow. cleanup_net() down_read(&pernet_ops_rwsem); <- first sem acquired ops_pre_exit_list() pre_exit() devlink_pernet_pre_exit() devlink_reload() mlx5_devlink_reload_down() mlx5_unload_one() [...] mlx5_ib_remove() mlx5_ib_unbind_slave_port() mlx5_remove_netdev_notifier() unregister_netdevice_notifier() down_write(&pernet_ops_rwsem);<- recurrsive lock Hence, when net namespace is deleted, mlx5 reload results in deadlock. When deadlock occurs, devlink mutex is also held. This not only deadlocks the mlx5 device under reload, but all the processes which attempt to access unrelated devlink devices are deadlocked. Hence, fix this by mlx5 ib driver to register for per net netdev notifier instead of global one, which operats on the net namespace without holding the pernet_ops_rwsem. Fixes: 4383cfcc65e7 ("net/mlx5: Add devlink reload") Link: https://lore.kernel.org/r/20201026134359.23150-1-parav@nvidia.com Signed-off-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
#
e0477b34 |
|
08-Oct-2020 |
Jason Gunthorpe <jgg@ziepe.ca> |
RDMA: Explicitly pass in the dma_device to ib_register_device The code in setup_dma_device has become rather convoluted, move all of this to the drivers. Drives now pass in a DMA capable struct device which will be used to setup DMA, or drivers must fully configure the ibdev for DMA and pass in NULL. Other than setting the masks in rvt all drivers were doing this already anyhow. mthca, mlx4 and mlx5 were already setting up maximum DMA segment size for DMA based on their hardweare limits in: __mthca_init_one() dma_set_max_seg_size (1G) __mlx4_init_one() dma_set_max_seg_size (1G) mlx5_pci_init() set_dma_caps() dma_set_max_seg_size (2G) Other non software drivers (except usnic) were extended to UINT_MAX [1, 2] instead of 2G as was before. [1] https://lore.kernel.org/linux-rdma/20200924114940.GE9475@nvidia.com/ [2] https://lore.kernel.org/linux-rdma/20200924114940.GE9475@nvidia.com/ Link: https://lore.kernel.org/r/20201008082752.275846-1-leon@kernel.org Link: https://lore.kernel.org/r/6b2ed339933d066622d5715903870676d8cc523a.1602590106.git.mchehab+huawei@kernel.org Suggested-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
#
1c15b4f2 |
|
23-Sep-2020 |
Avihai Horon <avihaih@nvidia.com> |
RDMA/core: Modify enum ib_gid_type and enum rdma_network_type Separate IB_GID_TYPE_IB and IB_GID_TYPE_ROCE to two different values, so enum ib_gid_type will match the gid types of the new query GID table API which will be introduced in the following patches. This change in enum ib_gid_type requires to change also enum rdma_network_type by separating RDMA_NETWORK_IB and RDMA_NETWORK_ROCE_V1 values. Link: https://lore.kernel.org/r/20200923165015.2491894-3-leon@kernel.org Signed-off-by: Avihai Horon <avihaih@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
#
f8225e34 |
|
26-Sep-2020 |
Leon Romanovsky <leon@kernel.org> |
RDMA/mlx5: Reuse existing fields in parent QP storage object Remove duplication of mlx5_ib_qp and mlx5_ib_gsi_qp fields. This change returns the memory footprint of mlx5_ib QP to be as it was before embedding GSI QP. Link: https://lore.kernel.org/r/20200926102450.2966017-3-leon@kernel.org Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
#
0ec52f01 |
|
14-Sep-2020 |
Jason Gunthorpe <jgg@ziepe.ca> |
RDMA/mlx5: Disable IB_DEVICE_MEM_MGT_EXTENSIONS if IB_WR_REG_MR can't work set_reg_wr() always fails if !umr_modify_entity_size_disabled because mlx5_ib_can_use_umr() always fails. Without set_reg_wr() IB_WR_REG_MR doesn't work and that means the device should not advertise IB_DEVICE_MEM_MGT_EXTENSIONS. Fixes: 841b07f99a47 ("IB/mlx5: Block MR WR if UMR is not possible") Link: https://lore.kernel.org/r/20200914112653.345244-5-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
#
376ceb31 |
|
16-Sep-2020 |
Aharon Landau <aharonl@mellanox.com> |
RDMA: Fix link active_speed size According to the IB spec active_speed size should be u16 and not u8 as before. Changing it to allow further extensions in offered speeds. Link: https://lore.kernel.org/r/20200917090223.1018224-4-leon@kernel.org Signed-off-by: Aharon Landau <aharonl@mellanox.com> Reviewed-by: Michael Guralnik <michaelgur@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
#
8310e327 |
|
03-Sep-2020 |
Alex Vesker <valex@nvidia.com> |
RDMA/mlx5: Allow DM allocation for sw_owner_v2 enabled devices sw_owner_v2 will replace sw_owner for future devices, this means that if sw_owner_v2 is set sw_owner should be ignored and DM allocation is required for sw_owner_v2 devices to function. Link: https://lore.kernel.org/r/20200903073857.1129166-3-leon@kernel.org Signed-off-by: Alex Vesker <valex@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
#
c0a6b5ec |
|
02-Sep-2020 |
Leon Romanovsky <leon@kernel.org> |
RDMA: Convert RWQ table logic to ib_core allocation scheme Move struct ib_rwq_ind_table allocation to ib_core. Link: https://lore.kernel.org/r/20200902081623.746359-3-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
#
d18bb3e1 |
|
02-Sep-2020 |
Leon Romanovsky <leon@kernel.org> |
RDMA: Clean MW allocation and free flows Move allocation and destruction of memory windows under ib_core responsibility and clean drivers to ensure that no updates to MW ib_core structures are done in driver layer. Link: https://lore.kernel.org/r/20200902081623.746359-2-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
#
e27014bd |
|
16-Sep-2020 |
Aharon Landau <aharonl@mellanox.com> |
RDMA/mlx5: Delete duplicated mlx5_ptys_width enum Combine two same enums to avoid duplication. Signed-off-by: Aharon Landau <aharonl@mellanox.com> Reviewed-by: Michael Guralnik <michaelgur@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
|
#
639bf441 |
|
16-Sep-2020 |
Aharon Landau <aharonl@mellanox.com> |
net/mlx5: Refactor query port speed functions The functions mlx5_query_port_link_width_oper and mlx5_query_port_ib_proto_oper are always called together, so combine them to a new function called mlx5_query_port_oper to avoid duplication. And while the mlx5i_get_port_settings is the same as mlx5_query_port_oper therefore let's remove it. According to the IB spec link_width_oper and ib_proto_oper should be u16 and not as written u8, so perform casting as a preparation to cross-RDMA patch which will fix that type for all drivers in the RDMA subsystem. Fixes: ada68c31ba9c ("net/mlx5: Introduce a new header file for physical port functions") Signed-off-by: Aharon Landau <aharonl@mellanox.com> Reviewed-by: Michael Guralnik <michaelgur@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
|
#
91a7c58f |
|
07-Sep-2020 |
Leon Romanovsky <leon@kernel.org> |
RDMA: Restore ability to fail on PD deallocate The IB verbs objects are counted by the kernel and ib_core ensures that deallocate PD will success so it will be called once all other objects that depends on PD will be released. This is achieved by managing various reference counters on such objects. The mlx5 driver didn't follow this standard flow when allowed DEVX objects that are not managed by ib_core to be interleaved with the ones under ib_core responsibility. In such interleaved scenarios deallocate command can fail and ib_core will leave uobject in internal DB and attempt to clean it later to free resources anyway. This change partially restores returned value from dealloc_pd() for all drivers, but keeping in mind that non-DEVX devices and kernel verbs paths shouldn't fail. Fixes: 21a428a019c9 ("RDMA: Handle PD allocations by IB/core") Link: https://lore.kernel.org/r/20200907120921.476363-2-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
#
df561f66 |
|
23-Aug-2020 |
Gustavo A. R. Silva <gustavoars@kernel.org> |
treewide: Use fallthrough pseudo-keyword Replace the existing /* fall through */ comments and its variants with the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary fall-through markings when it is the case. [1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
|
#
71cab8ef |
|
26-Jul-2020 |
Leon Romanovsky <leon@kernel.org> |
RDMA/mlx5: Delete unreachable code Delete two occurrences of unreachable code discovered by the Coverity. Link: https://lore.kernel.org/r/20200727095746.495915-1-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
#
530c8632 |
|
07-Jul-2020 |
Aya Levin <ayal@mellanox.com> |
IB/mlx5: Fix 50G per lane indication Some released FW versions mistakenly don't set the capability that 50G per lane link-modes are supported for VFs (ptys_extended_ethernet capability bit). Use PTYS.ext_eth_proto_capability instead, as this indication is always accurate. If PTYS.ext_eth_proto_capability is valid (has a non-zero value) conclude that the HCA supports 50G per lane. Otherwise, conclude that the HCA doesn't support 50G per lane. Fixes: 08e8676f1607 ("IB/mlx5: Add support for 50Gbps per lane link modes") Link: https://lore.kernel.org/r/20200707110612.882962-3-leon@kernel.org Signed-off-by: Aya Levin <ayal@mellanox.com> Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com> Reviewed-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
#
1e2b5a90 |
|
02-Jul-2020 |
Leon Romanovsky <leon@kernel.org> |
RDMA/mlx5: Delete one-time used functions Merge them into their callers, usually the only thing the caller did was to call the one function, so this is clearer. Link: https://lore.kernel.org/r/20200702081809.423482-7-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
#
d8b7515e |
|
02-Jul-2020 |
Leon Romanovsky <leon@kernel.org> |
RDMA/mlx5: Cleanup DEVX initialization flow Move DEVX initialization and cleanup flows to the devx.c instead of having almost empty functions in main.c Link: https://lore.kernel.org/r/20200702081809.423482-6-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
#
f7c4ffda |
|
02-Jul-2020 |
Leon Romanovsky <leon@kernel.org> |
RDMA/mlx5: Separate flow steering logic from main.c Move flow steering logic to be in separate file and rename flow.c to be fs.c because it is better describe the content. Link: https://lore.kernel.org/r/20200702081809.423482-5-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
#
64825827 |
|
02-Jul-2020 |
Leon Romanovsky <leon@kernel.org> |
RDMA/mlx5: Separate counters from main.c There are number of counters types supported in mlx5_ib: HW counters, congestion counters, Q-counters and flow counters. Almost all supporting code was placed in main.c that made almost impossible to maintain the code anymore. Let's create separate code namespace for the counters to easy future generalization effort. Link: https://lore.kernel.org/r/20200702081809.423482-4-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
#
b572ebe6 |
|
02-Jul-2020 |
Leon Romanovsky <leon@kernel.org> |
RDMA/mlx5: Separate restrack callbacks initialization from main.c The restrack code has separate .c, so move callbacks initialization to that file to improve code locality. Link: https://lore.kernel.org/r/20200702081809.423482-3-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
#
ac47bf5e |
|
02-Jul-2020 |
Leon Romanovsky <leon@kernel.org> |
RDMA/mlx5: Limit the scope of mlx5_ib_enable_driver function The mlx5_ib_enable_driver() is local function and doesn't need to be shared in mlx5_ib, so change it's signature to have static keyword in it. Link: https://lore.kernel.org/r/20200702081809.423482-2-leon@kernel.org Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
#
28ad5f65 |
|
30-Jun-2020 |
Leon Romanovsky <leon@kernel.org> |
RDMA: Move XRCD to be under ib_core responsibility Update the code to allocate and free ib_xrcd structure in the ib_core instead of inside drivers. Link: https://lore.kernel.org/r/20200630101855.368895-4-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
#
3b023e1b |
|
30-Jun-2020 |
Leon Romanovsky <leon@kernel.org> |
RDMA/core: Create and destroy counters in the ib_core Move allocation and destruction of counters under ib_core responsibility Link: https://lore.kernel.org/r/20200630101855.368895-2-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
#
05f71ef9 |
|
29-Jun-2020 |
Yishai Hadas <yishaih@mellanox.com> |
RDMA/mlx5: Introduce UAPI to query PD attributes Introduce UAPI to query PD attributes, this can be used to retrieve PD attributes by having the PD handle of the created one and owning the command FD for the ucontxet. Link: https://lore.kernel.org/r/20200630093916.332097-7-leon@kernel.org Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
#
0fb556b2 |
|
29-Jun-2020 |
Yishai Hadas <yishaih@mellanox.com> |
RDMA/mlx5: Implement the query ucontext functionality Implement the query ucontext functionality by returning the original ucontext data as part of an extra mlx5 attribute that holds the driver UAPI response. Link: https://lore.kernel.org/r/20200630093916.332097-6-leon@kernel.org Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
#
45ec21c9 |
|
29-Jun-2020 |
Yishai Hadas <yishaih@mellanox.com> |
RDMA/mlx5: Refactor mlx5_ib_alloc_ucontext() response Refactor mlx5_ib_alloc_ucontext() to set its response fields in a cleaner way. It includes, - Move the relevant code to a self contained function. - Calculate the response length once and drop redundant code all around. - Reuse previously set ucontext fields once preparing the response. The self contained function will be used in next patch as part of implementing the query ucontext functionality. Link: https://lore.kernel.org/r/20200630093916.332097-5-leon@kernel.org Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
#
f4375443 |
|
06-Jul-2020 |
Leon Romanovsky <leon@kernel.org> |
RDMA/mlx5: Get XRCD number directly for the internal use The mlx5_ib creates XRC domain and uses for creating internal SRQ. However all that is needed is XRCD number and not full blown ib_xrcd objects. Update the code to get and store the number only. Link: https://lore.kernel.org/r/20200706122716.647338-2-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
#
28b5fa68 |
|
23-Jun-2020 |
Maor Gottlieb <maorg@mellanox.com> |
RDMA/mlx5: Add support to get MR resource in RAW format Add support to get MR (mkey) resource dump in RAW format. Link: https://lore.kernel.org/r/20200623113043.1228482-12-leon@kernel.org Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
#
1ccecc88 |
|
23-Jun-2020 |
Maor Gottlieb <maorg@mellanox.com> |
RDMA/mlx5: Add support to get CQ resource in RAW format Add support to get CQ resource dump in RAW format. Link: https://lore.kernel.org/r/20200623113043.1228482-11-leon@kernel.org Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
#
1776dd23 |
|
23-Jun-2020 |
Maor Gottlieb <maorg@mellanox.com> |
RDMA/mlx5: Add support to get QP resource in RAW format Add a generic function to use the resource dump mechanism to get the QP resource data. Link: https://lore.kernel.org/r/20200623113043.1228482-10-leon@kernel.org Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
#
f4434529 |
|
23-Jun-2020 |
Maor Gottlieb <maorg@mellanox.com> |
RDMA: Add dedicated MR resource tracker function In order to avoid double multiplexing of the resource when it is a MR, add a dedicated callback function. Link: https://lore.kernel.org/r/20200623113043.1228482-5-leon@kernel.org Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
#
4d12c04c |
|
28-May-2020 |
Jason Gunthorpe <jgg@ziepe.ca> |
RDMA: Remove 'max_map_per_fmr' Now that FMR support is gone, this attribute can be deleted from all places. Link: https://lore.kernel.org/r/13-v3-f58e6669d5d3+2cf-fmr_removal_jgg@mellanox.com Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
802dcc7f |
|
26-May-2020 |
Mark Zhang <markz@mellanox.com> |
RDMA/mlx5: Support TX port affinity for VF drivers in LAG mode The mlx5 VF driver doesn't set QP tx port affinity because it doesn't know if the lag is active or not, since the "lag_active" works only for PF interfaces. In this case for VF interfaces only one lag is used which brings performance issue. Add a lag_tx_port_affinity CAP bit; When it is enabled and "num_lag_ports > 1", then driver always set QP tx affinity, regardless of lag state. Link: https://lore.kernel.org/r/20200527055014.355093-1-leon@kernel.org Signed-off-by: Mark Zhang <markz@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
5f62a521 |
|
26-May-2020 |
Leon Romanovsky <leon@kernel.org> |
RDMA/mlx5: Set ECE options during modify QP The most common way to set ECE option will be during modify QP command in INIT2RTR, RTR2RTS and RTS2RTS stages, so update mlx5 to support it. The new bit in the comp_mask is needed to mark that kernel supports ECE and can receive data instead of "reserved" field in the struct mlx5_ib_modify_qp. Link: https://lore.kernel.org/r/20200526115440.205922-8-leon@kernel.org Reviewed-by: Mark Zhang <markz@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
0ac8903c |
|
19-May-2020 |
Jason Gunthorpe <jgg@ziepe.ca> |
RDMA/core: Allow the ioctl layer to abort a fully created uobject While creating a uobject every create reaches a point where the uobject is fully initialized. For ioctls that go on to copy_to_user this means they need to open code the destruction of a fully created uobject - ie the RDMA_REMOVE_DESTROY sort of flow. Open coding this creates bugs, eg the CQ does not properly flush the events list when it does its error unwind. Provide a uverbs_finalize_uobj_create() function which indicates that the uobject is fully initialized and that abort should call to destroy_hw to destroy the uobj->object and related. Methods can call this function if they go on to have error cases after setting uobj->object. Once done those error cases can simply do return, without an error unwind. Link: https://lore.kernel.org/r/20200519072711.257271-2-leon@kernel.org Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
daeee976 |
|
12-May-2020 |
Shay Drory <shayd@mellanox.com> |
RDMA/mlx5: Update mlx5_ib driver name Current description doesn't include new devices, change it by updating to have more generic description and remove DRIVER_NAME and DRIVER_VERSION defines. Link: https://lore.kernel.org/r/20200513095304.210240-1-leon@kernel.org Signed-off-by: Shay Drory <shayd@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
8c112a5f |
|
03-May-2020 |
Maor Gottlieb <maorg@mellanox.com> |
RDMA/mlx5: Add support in steering default miss User can configure default miss rule in order to skip matching in the user domain and forward the packet to the kernel steering domain. When user requests a default miss rule, we add steering rule to forward the traffic to the next namespace. Link: https://lore.kernel.org/r/20200504053012.270689-5-leon@kernel.org Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Mark Zhang <markz@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
14c129e3 |
|
03-May-2020 |
Maor Gottlieb <maorg@mellanox.com> |
{IB/net}/mlx5: Simplify don't trap code The fs_core already supports creation of rules with multiple actions/destinations. Refactor fs_core to handle the case when don't trap rule is created with destination. Adapt the calling code in the driver. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Mark Zhang <markz@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
|
#
8d93efb8 |
|
06-May-2020 |
Mark Bloch <markb@mellanox.com> |
RDMA/mlx5: Assign profile before calling stages Assign the profile to the IB device before executing stages. This will allow to check which profile is being used from within a stage. Link: https://lore.kernel.org/r/20200506071602.7177-2-leon@kernel.org Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
029e88fd |
|
06-May-2020 |
Leon Romanovsky <leon@kernel.org> |
RDMA/mlx5: Move all WR logic from qp.c to separate file Split qp.c by removing all WR logic to separate file. Link: https://lore.kernel.org/r/20200506065513.4668-4-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
31578def |
|
06-May-2020 |
Leon Romanovsky <leon@kernel.org> |
RDMA/mlx5: Update mlx5_ib to use new cmd interface Reuse newly introduced mlx5_cmd_exec_in() and mlx5_cmd_exec_inout() to reduce code duplication in mlx5_ib module. Link: https://lore.kernel.org/r/20200506065513.4668-2-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
5ac55dfc |
|
03-May-2020 |
Mark Zhang <markz@mellanox.com> |
RDMA/mlx5: Set UDP source port based on the grh.flow_label Calculate UDP source port based on the grh.flow_label. If grh.flow_label is not valid, we will use minimal supported UDP source port. Link: https://lore.kernel.org/r/20200504051935.269708-6-leon@kernel.org Signed-off-by: Mark Zhang <markz@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
cfc1a89e |
|
30-Apr-2020 |
Maor Gottlieb <maorg@mellanox.com> |
RDMA/mlx5: Set lag tx affinity according to slave The patch sets the lag tx affinity of the data QPs and the GSI QPs according to the LAG xmit slave. For GSI QPs, in case the link layer is Ethenet (RoCE) we create two GSI QPs, one for each physical port. When the driver selects the GSI QP, it will consider the port affinity result. For connected QPs, the driver sets the affinity of the xmit slave. The above, ensures that RC QP and it's corresponding GSI QP will transmit from the same physical port. Link: https://lore.kernel.org/r/20200430192146.12863-17-maorg@mellanox.com Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
2be08c30 |
|
27-Apr-2020 |
Leon Romanovsky <leon@kernel.org> |
RDMA/mlx5: Delete create QP flags obfuscation There is no point in redefinition of stable and exposed to users create flags. Their values won't be changed and it is equal to used by the mlx5. Delete the mlx5 definitions and use IB/core fields. Link: https://lore.kernel.org/r/20200427154636.381474-14-leon@kernel.org Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
dff8e2d1 |
|
24-Apr-2020 |
Erez Shitrit <erezsh@mellanox.com> |
net/mlx5: Use aligned variable while allocating ICM memory The alignment value is part of the input structure, so use it and spare extra memory allocation when is not needed. Now, using the new ability when allocating icm for Direct-Rule insertion. Signed-off-by: Ariel Levkovich <lariel@mellanox.com> Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
#
333fbaa0 |
|
04-Apr-2020 |
Leon Romanovsky <leon@kernel.org> |
net/mlx5: Move QP logic to mlx5_ib The mlx5_core doesn't need any functionality coded in qp.c, so move that file to drivers/infiniband/ be under mlx5_ib responsibility. Reviewed-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
|
#
bfd745f8 |
|
03-Apr-2020 |
Leon Romanovsky <leon@kernel.org> |
RDMA/mlx5: Delete Q counter allocations command Remove mlx5_ib implementation of Q counter allocation logic together with cleaning boolean which controlled validity of the counter. It is not needed, because counter_id == 0 means that counter is not valid. Reviewed-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
|
#
66247fbb |
|
03-Apr-2020 |
Leon Romanovsky <leon@kernel.org> |
net/mlx5: Remove Q counter low level helper APIs mlx5 core users are encouraged to use low level API (mlx5_cmd_exec) without the need of helper functions, do this for q counters, remove helper functions and call mlx5_cmd_exec directly from users. This will help reduce the total amount of code and reduction of the mlx5_core symbol table. Reviewed-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
|
#
af9c3841 |
|
24-Mar-2020 |
Michael Guralnik <michaelgur@mellanox.com> |
RDMA/mlx5: Add support for RDMA TX flow table Enable user application to add rules for RDMA TX steering table. Rules in this steering table will allow to steer transmitted RDMA traffic. Link: https://lore.kernel.org/r/20200324061425.1570190-3-leon@kernel.org Signed-off-by: Michael Guralnik <michaelgur@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
0a2fd01c |
|
24-Mar-2020 |
Yishai Hadas <yishaih@mellanox.com> |
IB/mlx5: Move to fully dynamic UAR mode once user space supports it Move to fully dynamic UAR mode once user space supports it. In this case we prevent any legacy mode of UARs on the allocated context and prevent redundant allocation of the static ones. Link: https://lore.kernel.org/r/20200324060143.1569116-6-leon@kernel.org Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Reviewed-by: Michael Guralnik <michaelgur@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
342ee59d |
|
24-Mar-2020 |
Yishai Hadas <yishaih@mellanox.com> |
IB/mlx5: Expose UAR object and its alloc/destroy commands Expose UAR object and its alloc/destroy commands to be used over the ioctl interface by user space applications. This API supports both BF & NC modes and enables a dynamic allocation of UARs once really needed. As the number of driver objects were limited by the core ones when the merged tree is prepared, had to decrease the number of core objects to enable the new UAR object usage. Link: https://lore.kernel.org/r/20200324060143.1569116-2-leon@kernel.org Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Reviewed-by: Michael Guralnik <michaelgur@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
1f3db161 |
|
17-Mar-2020 |
Yishai Hadas <yishaih@mellanox.com> |
IB/mlx5: Generally use the WC auto detection test result Now that we have direct and reliable detection of WC support by the system, use is broadly. The only case we have to worry about is when the WC autodetector cannot run. For this fringe case generally assume that that WC is available, except in the well defined case of no PAT support on x86 which is tested by calling arch_can_pci_mmap_wc(). If WC is wrongly assumed to be available then it causes a small performance hit on paths in userspace that are tuned to the assumption that WC is available. There is no functional loss. It is very unlikely that any platforms exist that lack WC and also care about the micro optimization of WC in the fringe case where autodetection does not work. By removing the fairly bogus CONFIG tests this makes WC work broadly on all arches and all platforms. Link: https://lore.kernel.org/r/20200318100323.46659-1-leon@kernel.org Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Reviewed-by: Michael Guralnik <michaelgur@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
f743ff3b |
|
10-Mar-2020 |
Saeed Mahameed <saeedm@mellanox.com> |
RDMA/mlx5: Replace spinlock protected write with atomic var mkey variant calculation was spinlock protected to make it atomic, replace that with one atomic variable. Link: https://lore.kernel.org/r/20200310082238.239865-4-leon@kernel.org Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
fc6a9f86 |
|
10-Mar-2020 |
Saeed Mahameed <saeedm@mellanox.com> |
{IB,net}/mlx5: Assign mkey variant in mlx5_ib only mkey variant is not required for mlx5_core use, move the mkey variant counter to mlx5_ib. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
|
#
a762d460 |
|
10-Mar-2020 |
Leon Romanovsky <leon@kernel.org> |
RDMA/mlx5: Use offsetofend() instead of duplicated variant Convert mlx5 driver to use offsetofend() instead of its duplicated variant. Link: https://lore.kernel.org/r/20200310091438.248429-5-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
41e684ef |
|
05-Mar-2020 |
Alex Vesker <valex@mellanox.com> |
IB/mlx5: Replace tunnel mpls capability bits for tunnel_offloads Until now the flex parser capability was used in ib_query_device() to indicate tunnel_offloads_caps support for mpls_over_gre/mpls_over_udp. Newer devices and firmware will have configurations with the flexparser but without mpls support. Testing for the flex parser capability was a mistake, the tunnel_stateless capability was intended for detecting mpls and was introduced at the same time as the flex parser capability. Otherwise userspace will be incorrectly informed that a future device supports MPLS when it does not. Link: https://lore.kernel.org/r/20200305123841.196086-1-leon@kernel.org Cc: <stable@vger.kernel.org> # 4.17 Fixes: e818e255a58d ("IB/mlx5: Expose MPLS related tunneling offloads") Signed-off-by: Alex Vesker <valex@mellanox.com> Reviewed-by: Ariel Levkovich <lariel@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
ec16b6bb |
|
05-Mar-2020 |
Mark Zhang <markz@mellanox.com> |
RDMA/mlx5: Fix the number of hwcounters of a dynamic counter When we read the global counter and there's any dynamic counter allocated, the value of a hwcounter is the sum of the default counter and all dynamic counters. So the number of hwcounters of a dynamically allocated counter must be same as of the default counter, otherwise there will be read violations. This fixes the KASAN slab-out-of-bounds bug: BUG: KASAN: slab-out-of-bounds in rdma_counter_get_hwstat_value+0x36d/0x390 [ib_core] Read of size 8 at addr ffff8884192a5778 by task rdma/10138 CPU: 7 PID: 10138 Comm: rdma Not tainted 5.5.0-for-upstream-dbg-2020-02-06_18-30-19-27 #1 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014 Call Trace: dump_stack+0xb7/0x10b print_address_description.constprop.4+0x1e2/0x400 ? rdma_counter_get_hwstat_value+0x36d/0x390 [ib_core] __kasan_report+0x15c/0x1e0 ? mlx5_ib_query_q_counters+0x13f/0x270 [mlx5_ib] ? rdma_counter_get_hwstat_value+0x36d/0x390 [ib_core] kasan_report+0xe/0x20 rdma_counter_get_hwstat_value+0x36d/0x390 [ib_core] ? rdma_counter_query_stats+0xd0/0xd0 [ib_core] ? memcpy+0x34/0x50 ? nla_put+0xe2/0x170 nldev_stat_get_doit+0x9c7/0x14f0 [ib_core] ... do_syscall_64+0x95/0x490 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x7fcc457fe65a Code: bb 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 8b 05 fa f1 2b 00 45 89 c9 4c 63 d1 48 63 ff 85 c0 75 15 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 76 f3 c3 0f 1f 40 00 41 55 41 54 4d 89 c5 55 RSP: 002b:00007ffc0586f868 EFLAGS: 00000246 ORIG_RAX: 000000000000002c RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fcc457fe65a RDX: 0000000000000020 RSI: 00000000013db920 RDI: 0000000000000003 RBP: 00007ffc0586fa90 R08: 00007fcc45ac10e0 R09: 000000000000000c R10: 0000000000000000 R11: 0000000000000246 R12: 00000000004089c0 R13: 0000000000000000 R14: 00007ffc0586fab0 R15: 00000000013dc9a0 Allocated by task 9700: save_stack+0x19/0x80 __kasan_kmalloc.constprop.7+0xa0/0xd0 mlx5_ib_counter_alloc_stats+0xd1/0x1d0 [mlx5_ib] rdma_counter_alloc+0x16d/0x3f0 [ib_core] rdma_counter_bind_qpn_alloc+0x216/0x4e0 [ib_core] nldev_stat_set_doit+0x8c2/0xb10 [ib_core] rdma_nl_rcv_msg+0x3d2/0x730 [ib_core] rdma_nl_rcv+0x2a8/0x400 [ib_core] netlink_unicast+0x448/0x620 netlink_sendmsg+0x731/0xd10 sock_sendmsg+0xb1/0xf0 __sys_sendto+0x25d/0x2c0 __x64_sys_sendto+0xdd/0x1b0 do_syscall_64+0x95/0x490 entry_SYSCALL_64_after_hwframe+0x49/0xbe Fixes: 18d422ce8ccf ("IB/mlx5: Add counter_alloc_stats() and counter_update_stats() support") Link: https://lore.kernel.org/r/20200305124052.196688-1-leon@kernel.org Signed-off-by: Mark Zhang <markz@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
30f2fe40 |
|
19-Feb-2020 |
Yishai Hadas <yishaih@mellanox.com> |
IB/mlx5: Introduce UAPIs to manage packet pacing Introduce packet pacing uobject and its alloc and destroy methods. This uobject holds mlx5 packet pacing context according to the device specification and enables managing packet pacing device entries that are needed by DEVX applications. Link: https://lore.kernel.org/r/20200219190518.200912-3-leon@kernel.org Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
79db784e |
|
27-Feb-2020 |
Parav Pandit <parav@mellanox.com> |
IB/mlx5: Fix missing congestion control debugfs on rep rdma device Cited commit missed to include low level congestion control related debugfs stage initialization. This resulted in missing debugfs entries for cc_params of a RDMA device. Add them back. Fixes: b5ca15ad7e61 ("IB/mlx5: Add proper representors support") Link: https://lore.kernel.org/r/20200227125407.99803-1-leon@kernel.org Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
91b74bf5 |
|
17-Feb-2020 |
Alexander Lobakin <alobakin@pm.me> |
IB/mlx5: Optimize u64 division on 32-bit arches Commit f164be8c0366 ("IB/mlx5: Extend caps stage to handle VAR capabilities") introduced a straight "/" division of the u64 variable "bar_size". This was fixed with commit 685eff513183 ("IB/mlx5: Use div64_u64 for num_var_hw_entries calculation"). However, div64_u64() is redundant here as mlx5_var_table::stride_size is of type u32. Make the actual code way more optimized on 32-bit kernels using div_u64() and fix 80 chars break-through by the way. Fixes: 685eff513183 ("IB/mlx5: Use div64_u64 for num_var_hw_entries calculation") Link: https://lore.kernel.org/r/20200217073629.8051-1-alobakin@dlink.ru Signed-off-by: Alexander Lobakin <alobakin@dlink.ru> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
0f0d3827 |
|
15-Feb-2020 |
Paul Blakey <paulb@mellanox.com> |
net/mlx5: E-Switch, Move source port on reg_c0 to the upper 16 bits Multi chain support requires the miss path to continue the processing from the last chain id, and for that we need to save the chain miss tag (a mapping for 32bit chain id) on reg_c0 which will come in a next patch. Currently reg_c0 is exclusively used to store the source port metadata, giving it 32bit, it is created from 16bits of vcha_id, and 16bits of vport number. We will move this source port metadata to upper 16bits, and leave the lower bits for the chain miss tag. We compress the reg_c0 source port metadata to 16bits by taking 8 bits from vhca_id, and 8bits from the vport number. Since we compress the vport number to 8bits statically, and leave two top ids for special PF/ECPF numbers, we will only support a max of 254 vports with this strategy. Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
#
685eff51 |
|
06-Feb-2020 |
Jason Gunthorpe <jgg@ziepe.ca> |
IB/mlx5: Use div64_u64 for num_var_hw_entries calculation On i386: ERROR: "__udivdi3" [drivers/infiniband/hw/mlx5/mlx5_ib.ko] undefined! ERROR: "__divdi3" [drivers/infiniband/hw/mlx5/mlx5_ib.ko] undefined! Fixes: f164be8c0366 ("IB/mlx5: Extend caps stage to handle VAR capabilities") Reported-by: Randy Dunlap <rdunlap@infradead.org> Acked-by: Randy Dunlap <rdunlap@infradead.org> # build-tested Reported-by: Alexander Lobakin <alobakin@dlink.ru> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
9b6d3bbc |
|
12-Feb-2020 |
Leon Romanovsky <leon@kernel.org> |
RDMA/mlx5: Prevent overflow in mmap offset calculations The cmd and index variables declared as u16 and the result is supposed to be stored in u64. The C arithmetic rules doesn't promote "(index >> 8) << 16" to be u64 and leaves the end result to be u16. Fixes: 7be76bef320b ("IB/mlx5: Introduce VAR object and its alloc/destroy methods") Link: https://lore.kernel.org/r/20200212072635.682689-10-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
8889f6fa |
|
30-Jan-2020 |
Jason Gunthorpe <jgg@ziepe.ca> |
RDMA/core: Make the entire API tree static Compilation of mlx5 driver without CONFIG_INFINIBAND_USER_ACCESS generates the following error. on x86_64: ld: drivers/infiniband/hw/mlx5/main.o: in function `mlx5_ib_handler_MLX5_IB_METHOD_VAR_OBJ_ALLOC': main.c:(.text+0x186d): undefined reference to `ib_uverbs_get_ucontext_file' ld: drivers/infiniband/hw/mlx5/main.o:(.rodata+0x2480): undefined reference to `uverbs_idr_class' ld: drivers/infiniband/hw/mlx5/main.o:(.rodata+0x24d8): undefined reference to `uverbs_destroy_def_handler' This is happening because some parts of the UAPI description are not static. This is a hold over from earlier code that relied on struct pointers to refer to object types, now object types are referenced by number. Remove the unused globals and add statics to the remaining UAPI description elements. Remove the redundent #ifdefs around mlx5_ib_*defs and obsolete mlx5_ib_get_devx_tree(). The compiler now trims alot more unused code, including the above problematic definitions when !CONFIG_INFINIBAND_USER_ACCESS. Fixes: 7be76bef320b ("IB/mlx5: Introduce VAR object and its alloc/destroy methods") Reported-by: Randy Dunlap <rdunlap@infradead.org> Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
61dc7b01 |
|
14-Nov-2019 |
Paul Blakey <paulb@mellanox.com> |
net/mlx5: Refactor mlx5_create_auto_grouped_flow_table Refactor mlx5_create_auto_grouped_flow_table() to use ft_attr param which already carries the max_fte, prio and flags memebers, and is used the same in similar mlx5_create_flow_table() function. Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
#
d7fab916 |
|
15-Jan-2020 |
Avihai Horon <avihaih@mellanox.com> |
IB/mlx5: Expose RoCE accelerator counters Introduce the following RoCE accelerator counters: * roce_adp_retrans - number of adaptive retransmission for RoCE traffic. * roce_adp_retrans_to - number of times RoCE traffic reached time out due to adaptive retransmission. * roce_slow_restart - number of times RoCE slow restart was used. * roce_slow_restart_cnps - number of times RoCE slow restart generate CNP packets. * roce_slow_restart_trans - number of times RoCE slow restart change state to slow restart. Link: https://lore.kernel.org/r/20200115145459.83280-3-leon@kernel.org Signed-off-by: Avihai Horon <avihaih@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
a73a8955 |
|
15-Jan-2020 |
Moni Shoua <monis@mellanox.com> |
IB/mlx5: Mask out unsupported ODP capabilities for kernel QPs The ODP handler for WQEs in RQ or SRQ is not implented for kernel QPs. Therefore don't report support in these if query comes from a kernel user. Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
|
#
48357091 |
|
15-Jan-2020 |
Leon Romanovsky <leon@kernel.org> |
RDMA/mlx5: Don't fake udata for kernel path Kernel paths must not set udata and provide NULL pointer, instead of faking zeroed udata struct. Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
|
#
3f59b6c3 |
|
12-Dec-2019 |
Yishai Hadas <yishaih@mellanox.com> |
IB/mlx5: Add mmap support for VAR Add mmap support for VAR, it uses the 'offset' command mode with involvement of IB core APIs to find the previously allocated mmap entry. Link: https://lore.kernel.org/r/20191212110928.334995-6-leon@kernel.org Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
7be76bef |
|
12-Dec-2019 |
Yishai Hadas <yishaih@mellanox.com> |
IB/mlx5: Introduce VAR object and its alloc/destroy methods Introduce VAR object and its alloc/destroy KABI methods. The internal implementation uses the IB core API to manage mmap/munamp calls. Link: https://lore.kernel.org/r/20191212110928.334995-5-leon@kernel.org Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
f164be8c |
|
12-Dec-2019 |
Yishai Hadas <yishaih@mellanox.com> |
IB/mlx5: Extend caps stage to handle VAR capabilities Extend caps stage to handle VAR capabilities. Link: https://lore.kernel.org/r/20191212110928.334995-4-leon@kernel.org Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
4cca96a8 |
|
12-Dec-2019 |
Parav Pandit <parav@mellanox.com> |
IB/mlx5: Do reverse sequence during device removal When IB device profile initialization completes, device is marked as active. However, IB device is not marked inactive, during device removal flow. It should be the mirror of the add flow. Hence, mark it inactive during remove sequence. Link: https://lore.kernel.org/r/20191212113024.336702-2-leon@kernel.org Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
dc2316eb |
|
11-Dec-2019 |
Yishai Hadas <yishaih@mellanox.com> |
IB/mlx5: Fix device memory flows Fix device memory flows so that only once there will be no live mmaped VA to a given allocation the matching object will be destroyed. This prevents a potential scenario that existing VA that was mmaped by one process might still be used post its deallocation despite that it's owned now by other process. The above is achieved by integrating with IB core APIs to manage mmap/munmap. Only once the refcount will become 0 the DM object and its underlay area will be freed. Fixes: 3b113a1ec3d4 ("IB/mlx5: Support device memory type attribute") Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Link: https://lore.kernel.org/r/20191212100237.330654-3-leon@kernel.org Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
ed9085fe |
|
12-Dec-2019 |
Maor Gottlieb <maorg@mellanox.com> |
IB/mlx5: Fix steering rule of drop and count There are two flow rule destinations: QP and packet. While users are setting DROP packet rule, the QP should not be set as a destination. Fixes: 3b3233fbf02e ("IB/mlx5: Add flow counters binding support") Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Raed Salem <raeds@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Link: https://lore.kernel.org/r/20191212091214.315005-4-leon@kernel.org Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
eb243d1d |
|
20-Nov-2019 |
Ingo Molnar <mingo@kernel.org> |
x86/mm/pat: Rename <asm/pat.h> => <asm/memtype.h> pat.h is a file whose main purpose is to provide the memtype_*() APIs. PAT is the low level hardware mechanism - but the high level abstraction is memtype. So name the header <memtype.h> as well - this goes hand in hand with memtype.c and memtype_interval.c. Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
#
9c0015ef |
|
06-Nov-2019 |
Danit Goldberg <danitg@mellanox.com> |
IB/mlx5: Implement callbacks for getting VFs GUID attributes Implement the IB defined callback mlx5_ib_get_vf_guid used to query FW for VFs attributes and return node and port GUIDs. Signed-off-by: Danit Goldberg <danitg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
|
#
c16339b6 |
|
15-Nov-2019 |
Mark Zhang <markz@mellanox.com> |
IB/mlx5: Support extended number of strides for Striding RQ Extends the minimum single WQE strides from 64 to 8, which is exposed by the "min_single_wqe_log_num_of_strides" field of striding_rq_caps. Choose right number of strides based on FW capability. Link: https://lore.kernel.org/r/20191115154555.247856-1-leon@kernel.org Signed-off-by: Mark Zhang <markz@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
94de879c |
|
08-Nov-2019 |
Michael Guralnik <michaelgur@mellanox.com> |
IB/mlx5: Load profile according to RoCE enablement state When RoCE is disabled load mlx5_ib in raw_eth profile. Clean pf_profile roce capability checks as it will not be used without roce capability. Signed-off-by: Michael Guralnik <michaelgur@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
#
b5a498ba |
|
08-Nov-2019 |
Michael Guralnik <michaelgur@mellanox.com> |
IB/mlx5: Rename profile and init methods Rename uplink_rep_profile and its unique init and cleanup stages to suit its upcoming use as the profile when RoCE is disabled. Signed-off-by: Michael Guralnik <michaelgur@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
#
c043ff2c |
|
30-Oct-2019 |
Michal Kalderon <michal.kalderon@marvell.com> |
RDMA: Connect between the mmap entry and the umap_priv structure The rdma_user_mmap_io interface created a common interface for drivers to correctly map hw resources and zap them once the ucontext is destroyed enabling the drivers to safely free the hw resources. However, this meant the drivers need to delay freeing the resource to the ucontext destroy phase to ensure they were no longer mapped. The new mechanism for a common way of handling user/driver address mapping enabled notifying the driver if all umap_priv mappings were removed, and enabled freeing the hw resources when they are done with and not delay it until ucontext destroy. Since not all drivers use the mechanism, NULL can be sent to the rdma_user_mmap_io interface to continue working as before. Drivers that use the mmap_xa interface can pass the entry being mapped to the rdma_user_mmap_io function to be linked together. Link: https://lore.kernel.org/r/20191030094417.16866-4-michal.kalderon@marvell.com Signed-off-by: Ariel Elior <ariel.elior@marvell.com> Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
09b0965e |
|
04-Nov-2019 |
Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
IB: mlx5: no need to check return value of debugfs_create functions When calling debugfs functions, there is no need to ever check the return value. The function can work or not, but the code logic should never do something different based on this. Cc: Leon Romanovsky <leon@kernel.org> Cc: Doug Ledford <dledford@redhat.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: linux-rdma@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
e53a9d26 |
|
28-Oct-2019 |
Parav Pandit <parav@mellanox.com> |
IB/mlx5: Introduce and use mlx5_core_is_vf() Instead of deciding a given device is virtual function or not based on a device is PF or not, use already defined MLX5_COREDEV_VF by introducing an helper API mlx5_core_is_vf(). This enables to clearly identify PF, VF and non virtual functions. Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Vu Pham <vuhuong@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
#
11f552e2 |
|
10-Jun-2019 |
Michael Guralnik <michaelgur@mellanox.com> |
IB/mlx5: Test write combining support Linux can run in all sorts of physical machines and VMs where write combining may or may not be supported. Currently there is no way to reliably tell if the system supports WC, or not. The driver uses WC to optimize posting work to the HCA, and getting this wrong in either direction can cause a significant performance loss. Add a test in mlx5_ib initialization process to test whether write-combining is supported on the machine. The test will run as part of the enable_driver callback to ensure that the test runs after the device is setup and can create and modify the QP needed, but runs before the device is exposed to the users. The test opens UD QP and posts NOP WQEs, the WQE written to the BlueFlame is different from the WQE in memory, requesting CQE only on the BlueFlame WQE. By checking whether we received a completion on one of these WQEs we can know if BlueFlame succeeded and this write-combining must be supported. Change reporting of BlueFlame support to be dependent on write-combining support instead of the FW's guess as to what the machine can do. Link: https://lore.kernel.org/r/20191027062234.10993-1-leon@kernel.org Signed-off-by: Michael Guralnik <michaelgur@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
5256edcb |
|
09-Oct-2019 |
Jason Gunthorpe <jgg@ziepe.ca> |
RDMA/mlx5: Rework implicit ODP destroy Use SRCU in a sensible way by removing all MRs in the implicit tree from the two xarrays (the update operation), then a synchronize, followed by a normal single threaded teardown. This is only a little unusual from the normal pattern as there can still be some work pending in the unbound wq that may also require a workqueue flush. This is tracked with a single atomic, consolidating the redundant existing atomics and wait queue. For understand-ability the entire ODP implicit create/destroy flow now largely exists in a single pair of functions within odp.c, with a few support functions for tearing down an unused child. Link: https://lore.kernel.org/r/20191009160934.3143-13-jgg@ziepe.ca Reviewed-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
806b101b |
|
09-Oct-2019 |
Jason Gunthorpe <jgg@ziepe.ca> |
RDMA/mlx5: Use a dedicated mkey xarray for ODP There is a per device xarray storing mkeys that is used to store every mkey in the system. However, this xarray is now only read by ODP for certain ODP designated MRs (ODP, implicit ODP, MW, DEVX_INDIRECT). Create an xarray only for use by ODP, that only contains ODP related MKeys. This xarray is protected by SRCU and all erases are protected by a synchronize. This improves performance: - All MRs in the odp_mkeys xarray are ODP MRs, so some tests for is_odp() can be deleted. The xarray will also consume fewer nodes. - normal MR's are never mixed with ODP MRs in a SRCU data structure so performance sucking synchronize_srcu() on every MR destruction is not needed. - No smp_load_acquire(live) and xa_load() double barrier on read Due to the SRCU locking scheme care must be taken with the placement of the xa_store(). Once it completes the MR is immediately visible to other threads and only through a xa_erase() & synchronize_srcu() cycle could it be destroyed. Link: https://lore.kernel.org/r/20191009160934.3143-4-jgg@ziepe.ca Reviewed-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
50211ec9 |
|
09-Oct-2019 |
Jason Gunthorpe <jgg@ziepe.ca> |
RDMA/mlx5: Split sig_err MR data into its own xarray The locking model for signature is completely different than ODP, do not share the same xarray that relies on SRCU locking to support ODP. Simply store the active mlx5_core_sig_ctx's in an xarray when signature MRs are created and rely on trivial xarray locking to serialize everything. The overhead of storing only a handful of SIG related MRs is going to be much less than an xarray full of every mkey. Link: https://lore.kernel.org/r/20191009160934.3143-3-jgg@ziepe.ca Reviewed-by: Artemy Kovalyov <artemyko@mellanox.com> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
68abaa76 |
|
20-Oct-2019 |
Ran Rozenstein <ranro@mellanox.com> |
IB/mlx5: Remove dead code mlx5_ib_dc_atomic_is_supported function is not used anywhere. Remove the dead code. Fixes: a60109dc9a95 ("IB/mlx5: Add support for extended atomic operations") Link: https://lore.kernel.org/r/20191020064454.8551-1-leon@kernel.org Signed-off-by: Ran Rozenstein <ranro@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
4061ff7a |
|
16-Oct-2019 |
Erez Alfasi <ereza@mellanox.com> |
RDMA/nldev: Provide MR statistics Add RDMA nldev netlink interface for dumping MR statistics information. Output example: $ ./ibv_rc_pingpong -o -P -s 500000000 local address: LID 0x0001, QPN 0x00008a, PSN 0xf81096, GID :: $ rdma stat show mr dev mlx5_0 mrn 2 page_faults 122071 page_invalidations 0 Link: https://lore.kernel.org/r/20191016062308.11886-5-leon@kernel.org Signed-off-by: Erez Alfasi <ereza@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
e1b95ae0 |
|
16-Oct-2019 |
Erez Alfasi <ereza@mellanox.com> |
RDMA/mlx5: Return ODP type per MR Provide an ODP explicit/implicit type as part of 'rdma -dd resource show mr' dump. For example: $ rdma -dd resource show mr dev mlx5_0 mrn 1 rkey 0xa99a lkey 0xa99a mrlen 50000000 pdn 9 pid 7372 comm ibv_rc_pingpong drv_odp explicit For non-ODP MRs, we won't print "drv_odp ..." at all. Link: https://lore.kernel.org/r/20191016062308.11886-4-leon@kernel.org Signed-off-by: Erez Alfasi <ereza@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
36609056 |
|
07-Oct-2019 |
Yamin Friedman <yaminf@mellanox.com> |
RDMA/mlx5: Add capability for max sge to get optimized performance Allows the IB device to provide a value of maximum scatter gather entries per RDMA READ. In certain cases it may be preferable for a device to perform UMR memory registration rather than have many scatter entries in a single RDMA READ. This provides a significant performance increase in devices capable of using different memory registration schemes based on the number of scatter gather entries. This general capability allows each device vendor to fine tune when it is better to use memory registration. Link: https://lore.kernel.org/r/20191007135933.12483-4-leon@kernel.org Signed-off-by: Yamin Friedman <yaminf@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
6f26b2ac |
|
02-Oct-2019 |
Erez Alfasi <ereza@mellanox.com> |
IB/mlx5: Remove unnecessary else statement 'else' is not generally useful after a break or return. Remove this unnecessary statement. Link: https://lore.kernel.org/r/20191002122517.17721-4-leon@kernel.org Signed-off-by: Erez Alfasi <ereza@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
5d44adeb |
|
16-Sep-2019 |
Danit Goldberg <danitg@mellanox.com> |
IB/mlx5: Free mpi in mp_slave mode ib_add_slave_port() allocates a multiport struct but never frees it. Don't leak memory, free the allocated mpi struct during driver unload. Cc: <stable@vger.kernel.org> Fixes: 32f69e4be269 ("{net, IB}/mlx5: Manage port association for multiport RoCE") Link: https://lore.kernel.org/r/20190916064818.19823-3-leon@kernel.org Signed-off-by: Danit Goldberg <danitg@mellanox.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
2b688ea5 |
|
15-Aug-2019 |
Maor Gottlieb <maorg@mellanox.com> |
net/mlx5: Add flow steering actions to fs_cmd shim layer Add flow steering actions: modify header and packet reformat to the fs_cmd shim layer. This allows each namespace to define possibly different functionality for alloc/dealloc action commands. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
#
c9b9dcb4 |
|
29-Aug-2019 |
Ariel Levkovich <lariel@mellanox.com> |
net/mlx5: Move device memory management to mlx5_core Move the device memory allocation and deallocation commands SW ICM memory to mlx5_core to expose this API for all mlx5_core users. This comes as preparation for supporting SW steering in kernel where it will be required to allocate and register device memory for direct rule insertion. In addition, an API to register this device memory for future remote access operations is introduced using the create_mkey commands. Signed-off-by: Ariel Levkovich <lariel@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
#
d8abe884 |
|
19-Aug-2019 |
Mark Zhang <markz@mellanox.com> |
RDMA/mlx5: RDMA_RX flow type support for user applications Currently user applications can only steer TCP/IP(NIC RX/RX) traffic. This patch adds RDMA_RX as a new flow type to allow the user to insert steering rules to control RDMA traffic. Two destinations are supported(but not set at the same time): devx flow table object and QP. Signed-off-by: Mark Zhang <markz@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Link: https://lore.kernel.org/r/20190819113626.20284-4-leon@kernel.org Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
c571feca |
|
06-Aug-2019 |
Jason Gunthorpe <jgg@ziepe.ca> |
RDMA/odp: use mmu_notifier_get/put for 'struct ib_ucontext_per_mm' This is a significant simplification, no extra list is kept per FD, and the interval tree is now shared between all the ucontexts, reducing overhead if there are multiple ucontexts active. Link: https://lore.kernel.org/r/20190806231548.25242-7-jgg@ziepe.ca Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
ce51346f |
|
19-Aug-2019 |
Moni Shoua <monis@mellanox.com> |
RDMA/core: Make invalidate_range a device operation The callback function 'invalidate_range' is implemented in a driver so the place for it is in the ib_device_ops structure and not in ib_ucontext. Signed-off-by: Moni Shoua <monis@mellanox.com> Reviewed-by: Guy Levi <guyle@mellanox.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Link: https://lore.kernel.org/r/20190819111710.18440-11-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
00815752 |
|
15-Aug-2019 |
Moni Shoua <monis@mellanox.com> |
IB/mlx5: Report and handle ODP support properly ODP depends on the several device capabilities, among them is the ability to send UMR WQEs with that modify atomic and entity size of the MR. Therefore, only if all conditions to send such a UMR WQE are met then driver can report that ODP is supported. Use this check of conditions in all places where driver needs to know about ODP support. Also, implicit ODP support depends on ability of driver to send UMR WQEs for an indirect mkey. Therefore, verify that all conditions to do so are met when reporting support. Fixes: c8d75a980fab ("IB/mlx5: Respect new UMR capabilities") Signed-off-by: Moni Shoua <monis@mellanox.com> Reviewed-by: Guy Levi <guyle@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Link: https://lore.kernel.org/r/20190815083834.9245-7-leon@kernel.org Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
9dc4cfff |
|
13-Aug-2019 |
Leon Romanovsky <leon@kernel.org> |
RDMA/mlx5: Annotate lock dependency in bind/unbind slave port NULL-ing notifier_call is performed under protection of mlx5_ib_multiport_mutex lock. Such protection is not easily spotted and better to be guarded by lockdep annotation. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Link: https://lore.kernel.org/r/20190813102814.22350-1-leon@kernel.org Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
72a7720f |
|
07-Aug-2019 |
Kamal Heib <kamalheib1@gmail.com> |
RDMA: Introduce ib_port_phys_state enum In order to improve readability, add ib_port_phys_state enum to replace the use of magic numbers. Signed-off-by: Kamal Heib <kamalheib1@gmail.com> Reviewed-by: Andrew Boyer <aboyer@tobark.org> Acked-by: Michal Kalderon <michal.kalderon@marvell.com> Acked-by: Bernard Metzler <bmt@zurich.ibm.com> Link: https://lore.kernel.org/r/20190807103138.17219-2-kamalheib1@gmail.com Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
23eaf3b5 |
|
31-Jul-2019 |
Leon Romanovsky <leon@kernel.org> |
RDMA/mlx5: Release locks during notifier unregister The below kernel panic was observed when created bond mode LACP with GRE tunnel on top. The reason to it was not released spinlock during mlx5 notify unregsiter sequence. [ 234.562007] BUG: scheduling while atomic: sh/10900/0x00000002 [ 234.563005] Preemption disabled at: [ 234.566864] ------------[ cut here ]------------ [ 234.567120] DEBUG_LOCKS_WARN_ON(val > preempt_count()) [ 234.567139] WARNING: CPU: 16 PID: 10900 at kernel/sched/core.c:3203 preempt_count_sub+0xca/0x170 [ 234.569550] CPU: 16 PID: 10900 Comm: sh Tainted: G W 5.2.0-rc1-for-linust-dbg-2019-05-25_04-57-33-60 #1 [ 234.569886] Hardware name: Dell Inc. PowerEdge R720/0X3D66, BIOS 2.6.1 02/12/2018 [ 234.570183] RIP: 0010:preempt_count_sub+0xca/0x170 [ 234.570404] Code: 03 38 d0 7c 08 84 d2 0f 85 b0 00 00 00 8b 15 dd 02 03 04 85 d2 75 ba 48 c7 c6 00 e1 88 83 48 c7 c7 40 e1 88 83 e8 76 11 f7 ff <0f> 0b 5b c3 65 8b 05 d3 1f d8 7e 84 c0 75 82 e8 62 c3 c3 00 85 c0 [ 234.570911] RSP: 0018:ffff888b94477b08 EFLAGS: 00010286 [ 234.571133] RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000 [ 234.571391] RDX: 0000000000000000 RSI: 0000000000000004 RDI: 0000000000000246 [ 234.571648] RBP: ffff888ba5560000 R08: fffffbfff08962d5 R09: fffffbfff08962d5 [ 234.571902] R10: 0000000000000001 R11: fffffbfff08962d4 R12: ffff888bac6e9548 [ 234.572157] R13: ffff888babfaf728 R14: ffff888bac6e9568 R15: ffff888babfaf750 [ 234.572412] FS: 00007fcafa59b740(0000) GS:ffff888bed200000(0000) knlGS:0000000000000000 [ 234.572686] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 234.572914] CR2: 00007f984f16b140 CR3: 0000000b2bf0a001 CR4: 00000000001606e0 [ 234.573172] Call Trace: [ 234.573336] _raw_spin_unlock+0x2e/0x50 [ 234.573542] mlx5_ib_unbind_slave_port+0x1bc/0x690 [mlx5_ib] [ 234.573793] mlx5_ib_cleanup_multiport_master+0x1d3/0x660 [mlx5_ib] [ 234.574039] mlx5_ib_stage_init_cleanup+0x4c/0x360 [mlx5_ib] [ 234.574271] ? kfree+0xf5/0x2f0 [ 234.574465] __mlx5_ib_remove+0x61/0xd0 [mlx5_ib] [ 234.574688] ? __mlx5_ib_remove+0xd0/0xd0 [mlx5_ib] [ 234.574951] mlx5_remove_device+0x234/0x300 [mlx5_core] [ 234.575224] mlx5_unregister_device+0x4d/0x1e0 [mlx5_core] [ 234.575493] remove_one+0x4f/0x160 [mlx5_core] [ 234.575704] pci_device_remove+0xef/0x2a0 [ 234.581407] ? pcibios_free_irq+0x10/0x10 [ 234.587143] ? up_read+0xc1/0x260 [ 234.592785] device_release_driver_internal+0x1ab/0x430 [ 234.598442] unbind_store+0x152/0x200 [ 234.604064] ? sysfs_kf_write+0x3b/0x180 [ 234.609441] ? sysfs_file_ops+0x160/0x160 [ 234.615021] kernfs_fop_write+0x277/0x440 [ 234.620288] ? __sb_start_write+0x1ef/0x2c0 [ 234.625512] vfs_write+0x15e/0x460 [ 234.630786] ksys_write+0x156/0x1e0 [ 234.635988] ? __ia32_sys_read+0xb0/0xb0 [ 234.641120] ? trace_hardirqs_off_thunk+0x1a/0x1c [ 234.646163] do_syscall_64+0x95/0x470 [ 234.651106] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 234.656004] RIP: 0033:0x7fcaf9c9cfd0 [ 234.660686] Code: 73 01 c3 48 8b 0d c0 6e 2d 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 83 3d cd cf 2d 00 00 75 10 b8 01 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 ee cb 01 00 48 89 04 24 [ 234.670128] RSP: 002b:00007ffd3b01ddd8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 [ 234.674811] RAX: ffffffffffffffda RBX: 000000000000000d RCX: 00007fcaf9c9cfd0 [ 234.679387] RDX: 000000000000000d RSI: 00007fcafa5c1000 RDI: 0000000000000001 [ 234.683848] RBP: 00007fcafa5c1000 R08: 000000000000000a R09: 00007fcafa59b740 [ 234.688167] R10: 00007ffd3b01d8e0 R11: 0000000000000246 R12: 00007fcaf9f75400 [ 234.692386] R13: 000000000000000d R14: 0000000000000001 R15: 0000000000000000 [ 234.696495] irq event stamp: 153067 [ 234.700525] hardirqs last enabled at (153067): [<ffffffff83258c39>] _raw_spin_unlock_irqrestore+0x59/0x70 [ 234.704665] hardirqs last disabled at (153066): [<ffffffff83259382>] _raw_spin_lock_irqsave+0x22/0x90 [ 234.708722] softirqs last enabled at (153058): [<ffffffff836006c5>] __do_softirq+0x6c5/0xb4e [ 234.712673] softirqs last disabled at (153051): [<ffffffff81227c1d>] irq_exit+0x17d/0x1d0 [ 234.716601] ---[ end trace 5dbf096843ee9ce6 ]--- Fixes: df097a278c75 ("IB/mlx5: Use the new mlx5 core notifier API") Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Link: https://lore.kernel.org/r/20190731083852.584-1-leon@kernel.org Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
3e1f000f |
|
23-Jul-2019 |
Parav Pandit <parav@mellanox.com> |
IB/mlx5: Support per device q counters in switchdev mode When parent mlx5_core_dev is in switchdev mode, q_counters are not applicable to multiple non uplink vports. Hence, have make them limited to device level. While at it, correct __mlx5_ib_qp_set_counter() and __mlx5_ib_modify_qp() to use u16 set_id as defined by the device. Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Link: https://lore.kernel.org/r/20190723073117.7175-3-leon@kernel.org Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
5dcecbc9 |
|
23-Jul-2019 |
Parav Pandit <parav@mellanox.com> |
IB/mlx5: Refactor code for counters allocation To support per device counters in switchdev mode (instead of per port counter), refactor query routines to work on mlx5_ib_counter structure instead of mlx5_ib_port structure. Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Link: https://lore.kernel.org/r/20190723073117.7175-2-leon@kernel.org Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
a5c9c299 |
|
23-Jul-2019 |
Parav Pandit <parav@mellanox.com> |
IB/mlx5: Avoid unnecessary typecast IB device pointer is already available while deallocating IB device, Hence do not typecast it. Link: https://lore.kernel.org/r/20190723065733.4899-9-leon@kernel.org Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
96e2fd73 |
|
08-Jul-2019 |
Leon Romanovsky <leon@kernel.org> |
RDMA/mlx5: Set RDMA DIM to be enabled by default Enable RDMA DIM by default for better user experience. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
89705e92 |
|
05-Jul-2019 |
Danit Goldberg <danitg@mellanox.com> |
IB/mlx5: Report correctly tag matching rendezvous capability Userspace expects the IB_TM_CAP_RC bit to indicate that the device supports RC transport tag matching with rendezvous offload. However the firmware splits this into two capabilities for eager and rendezvous tag matching. Only if the FW supports both modes should userspace be told the tag matching capability is available. Cc: <stable@vger.kernel.org> # 4.13 Fixes: eb761894351d ("IB/mlx5: Fill XRQ capabilities") Signed-off-by: Danit Goldberg <danitg@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Reviewed-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
18d422ce |
|
02-Jul-2019 |
Mark Zhang <markz@mellanox.com> |
IB/mlx5: Add counter_alloc_stats() and counter_update_stats() support Add support for ib callback counter_alloc_stats() and counter_update_stats(). Signed-off-by: Mark Zhang <markz@mellanox.com> Reviewed-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
45842fc6 |
|
02-Jul-2019 |
Mark Zhang <markz@mellanox.com> |
IB/mlx5: Support statistic q counter configuration Add support for ib callbacks counter_bind_qp(), counter_unbind_qp() and counter_dealloc(). Signed-off-by: Mark Zhang <markz@mellanox.com> Reviewed-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
318d535c |
|
02-Jul-2019 |
Mark Zhang <markz@mellanox.com> |
IB/mlx5: Add counter set id as a parameter for mlx5_ib_query_q_counters() Add counter set id as a parameter so that this API can be used for querying any q counter. Signed-off-by: Mark Zhang <markz@mellanox.com> Reviewed-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
50ba3c18 |
|
30-Jun-2019 |
Leon Romanovsky <leon@kernel.org> |
RDMA/mlx5: Use proper allocation API to get zeroed memory There is no need in custom memory zeroing, because it can be done by using kzalloc from the beginning. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
e337dd53 |
|
30-Jun-2019 |
Yishai Hadas <yishaih@mellanox.com> |
IB/mlx5: Register DEVX with mlx5_core to get async events Register DEVX with with mlx5_core to get async events. This will enable to dispatch the applicable events to its consumers in down stream patches. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
4e0e2ea1 |
|
30-Jun-2019 |
Yishai Hadas <yishaih@mellanox.com> |
net/mlx5: Report EQE data upon CQ completion Report EQE data upon CQ completion to let upper layers use this data. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Acked-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
|
#
f6455de0 |
|
28-Jun-2019 |
Bodong Wang <bodong@mellanox.com> |
net/mlx5: E-Switch, Refactor eswitch SR-IOV interface Devlink eswitch mode is not necessarily related to SR-IOV, e.g, ECPF can be at offload mode when SR-IOV is not enabled. Rename the interface and eswitch mode names to decouple from SR-IOV, and cleanup eswitch messages accordingly. Signed-off-by: Bodong Wang <bodong@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
#
669ff1e3 |
|
25-Jun-2019 |
Jianbo Liu <jianbol@mellanox.com> |
RDMA/mlx5: Add vport metadata matching for IB representors If vport metadata matching is enabled in eswitch, the rule created must be changed to match on the metadata, instead of source port. Signed-off-by: Jianbo Liu <jianbol@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
#
bb0ee7dc |
|
25-Jun-2019 |
Jianbo Liu <jianbol@mellanox.com> |
net/mlx5: Add flow context for flow tag Refactor the flow data structures, add new flow_context and move flow_tag into it, as flow_tag doesn't belong to the rule action. Signed-off-by: Jianbo Liu <jianbol@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
#
c0a6cbb9 |
|
11-Jun-2019 |
Israel Rukshin <israelr@mellanox.com> |
RDMA/core: Rename signature qp create flag and signature device capability Rename IB_QP_CREATE_SIGNATURE_EN to IB_QP_CREATE_INTEGRITY_EN and IB_DEVICE_SIGNATURE_HANDOVER to IB_DEVICE_INTEGRITY_HANDOVER. Signed-off-by: Israel Rukshin <israelr@mellanox.com> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
62e3c379 |
|
11-Jun-2019 |
Max Gurtovoy <maxg@mellanox.com> |
RDMA/mlx5: Add attr for max number page list length for PI operation PI offload (protection information) is a feature that each RDMA provider can implement differently. Thus, introduce new device attribute to define the maximal length of the page list for PI fast registration operation. For example, mlx5 driver uses a single internal MR to map both data and protection SGL's, so it's equal to max_fast_reg_page_list_len / 2. Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
6c984472 |
|
11-Jun-2019 |
Max Gurtovoy <maxg@mellanox.com> |
RDMA/mlx5: Implement mlx5_ib_map_mr_sg_pi and mlx5_ib_alloc_mr_integrity mlx5_ib_map_mr_sg_pi() will map the PI and data dma mapped SG lists to the mlx5 memory region prior to the registration operation. In the new API, the mlx5 driver will allocate an internal memory region for the UMR operation to register both PI and data SG lists. The internal MR will use KLM mode in order to map 2 (possibly non-contiguous/non-align) SG lists using 1 memory key. In the new API, each ULP will use 1 memory region for the signature operation (instead of 3 in the old API). This memory region will have a key that will be exposed to remote server to perform RDMA operation. The internal memory key that will map the SG lists will stay private. Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Israel Rukshin <israelr@mellanox.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
09d985be |
|
12-Jun-2019 |
Maor Gottlieb <maorg@mellanox.com> |
RDMA/mlx5: Enable decap and packet reformat on FDB If FDB flow tables support decap operation, enable it on creation, This allows to perform decapsulation of tunnelled packets by steering rules. If FDB flow tables support reformat operation, enable it on creation as well. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Petr Vorel <pvorel@suse.cz> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
cecae747 |
|
12-Jun-2019 |
Maor Gottlieb <maorg@mellanox.com> |
RDMA/mlx5: Consider eswitch encap mode When flow steering is created, then the encap support should consider the eswitch encap mode. If the eswitch flow table (FDB) supports encap then it shouldn't be supported on NIC RX flow tables. Fixes: 4adda1122c490 ('RDMA/mlx5: Enable decap and packet reformat on flow tables') Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Petr Vorel <pvorel@suse.cz> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
e39afe3d |
|
28-May-2019 |
Leon Romanovsky <leon@kernel.org> |
RDMA: Convert CQ allocations to be under core responsibility Ensure that CQ is allocated and freed by IB/core and not by drivers. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Acked-by: Gal Pressman <galpress@amazon.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Tested-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
7a154142 |
|
05-Jun-2019 |
Jason Gunthorpe <jgg@ziepe.ca> |
RDMA: Move owner into struct ib_device_ops This more closely follows how other subsytems work, with owner being a member of the structure containing the function pointers. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
72c6ec18 |
|
05-Jun-2019 |
Jason Gunthorpe <jgg@ziepe.ca> |
RDMA: Move uverbs_abi_ver into struct ib_device_ops No reason for every driver to emit code to set this, just make it part of the driver's existing static const ops structure. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
b9560a41 |
|
05-Jun-2019 |
Jason Gunthorpe <jgg@ziepe.ca> |
RDMA: Move driver_id into struct ib_device_ops No reason for every driver to emit code to set this, just make it part of the driver's existing static const ops structure. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
37eb86c4 |
|
20-May-2019 |
Michal Kubecek <mkubecek@suse.cz> |
mlx5: avoid 64-bit division Commit 25c13324d03d ("IB/mlx5: Add steering SW ICM device memory type") breaks i386 build by introducing three 64-bit divisions. As the divisor is MLX5_SW_ICM_BLOCK_SIZE() which is always a power of 2, we can replace the division with bit operations. Fixes: 25c13324d03d ("IB/mlx5: Add steering SW ICM device memory type") Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
10bf13c3 |
|
06-May-2019 |
Leon Romanovsky <leon@kernel.org> |
RDMA/mlx5: Remove MAYEXEC flag MAYEXEC flag was mistakenly added in the commit cited in the fixes line. Fixes: 4eb6ab13b991 ("RDMA: Remove rdma_user_mmap_page") Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
25c13324 |
|
05-May-2019 |
Ariel Levkovich <lariel@mellanox.com> |
IB/mlx5: Add steering SW ICM device memory type This patch adds support for allocating, deallocating and registering a new device memory type, STEERING_SW_ICM. This memory can be allocated and used by a privileged user for direct rule insertion and management of the device's steering tables. The type is provided by the user via the dedicated attribute in the alloc_dm ioctl command. Signed-off-by: Ariel Levkovich <lariel@mellanox.com> Reviewed-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
4056b12e |
|
05-May-2019 |
Ariel Levkovich <lariel@mellanox.com> |
IB/mlx5: Warn on allocated MEMIC buffers during cleanup Adding a warning on allocated MEMIC buffers that weren't freed prior to driver tear down. Signed-off-by: Ariel Levkovich <lariel@mellanox.com> Reviewed-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
3b113a1e |
|
05-May-2019 |
Ariel Levkovich <lariel@mellanox.com> |
IB/mlx5: Support device memory type attribute This patch intoruduces a new mlx5_ib driver attribute to the DM allocation method - the DM type. In order to allow addition of new types in downstream patches this patch also refactors the allocation, deallocation and registration handlers to consider the requested type and perform the necessary actions according to it. Since not all future device memory types will be such that are mapped to user memory, the mandatory page index output attribute is modified to be optional. Signed-off-by: Ariel Levkovich <lariel@mellanox.com> Reviewed-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
a70c0739 |
|
02-May-2019 |
Parav Pandit <parav@mellanox.com> |
RDMA: Introduce and use GID attr helper to read RoCE L2 fields Instead of RoCE drivers figuring out vlan, smac fields while working on QP/AH, provide a helper routine to read the L2 fields such as vlan_id and source mac address. This moves logic from mlx5 driver to core for wider usage for RoCE ports. This is a preparation patch to allow detaching netdev in subsequent patch. Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
6cfdc7e4 |
|
29-Apr-2019 |
Aya Levin <ayal@mellanox.com> |
IB/mlx5: Restrict 'DELAY_DROP_TIMEOUT' subtype to Ethernet interfaces Subtype 'DELAY_DROP_TIMEOUT' (under 'GENERAL' event) is restricted to Ethernet interfaces. This patch doesn't change functionality or breaks current flow. In the downstream patch, non Ethernet (like IB) interfaces will receive 'GENERAL' event. Fixes: 5d3c537f9070 ("net/mlx5: Handle event of power detection in the PCIE slot") Signed-off-by: Aya Levin <ayal@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
#
c42260f1 |
|
29-Apr-2019 |
Vu Pham <vuhuong@mellanox.com> |
net/mlx5: Separate and generalize dma device from pci device The mlx5 Sub-Function (SF) sub device will be introduced in subsequent patches. It will be created as mediated device and belong to mdev bus. It is necessary to treat dma operations on PF, VF and SF in uniform way, hence reduce the dependency on pdev pci dev struct and work directly out of newly introduced 'struct device' from previous patch. This patch does not change any functionality. Signed-off-by: Vu Pham <vuhuong@mellanox.com> Reviewed-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
#
4eb6ab13 |
|
16-Apr-2019 |
Jason Gunthorpe <jgg@ziepe.ca> |
RDMA: Remove rdma_user_mmap_page Upon further research drivers that want this should simply call the core function vm_insert_page(). The VMA holds a reference on the page and it will be automatically freed when the last reference drops. No need for disassociate to sequence the cleanup. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
ddcdc368 |
|
16-Apr-2019 |
Jason Gunthorpe <jgg@ziepe.ca> |
RDMA/mlx5: Use get_zeroed_page() for clock_info get_zeroed_page() returns a virtual address for the page which is better than allocating a struct page and doing a permanent kmap on it. Cc: stable@vger.kernel.org Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Reviewed-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
d5e560d3 |
|
16-Apr-2019 |
Jason Gunthorpe <jgg@ziepe.ca> |
RDMA/mlx5: Use rdma_user_map_io for mapping BAR pages Since mlx5 supports device disassociate it must use this API for all BAR page mmaps, otherwise the pages can remain mapped after the device is unplugged causing a system crash. Cc: stable@vger.kernel.org Fixes: 5f9794dc94f5 ("RDMA/ucontext: Add a core API for mmaping driver IO memory") Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Reviewed-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
|
#
c660133c |
|
16-Apr-2019 |
Jason Gunthorpe <jgg@ziepe.ca> |
RDMA/mlx5: Do not allow the user to write to the clock page The intent of this VMA was to be read-only from user space, but the VM_MAYWRITE masking was missed, so mprotect could make it writable. Cc: stable@vger.kernel.org Fixes: 5c99eaecb1fc ("IB/mlx5: Mmap the HCA's clock info to user-space") Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Reviewed-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
|
#
5fb58c9e |
|
28-Mar-2019 |
Mark Bloch <markb@mellanox.com> |
RDMA/mlx5: Don't create IB representors when in multiport RoCE mode Switchdev mode and mutiport RoCE mode aren't compatible at this point. Don't create IB reps when a user switches to switchdev mode and the driver operates in that mode. Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
d3b5cc1c |
|
28-Mar-2019 |
Mark Bloch <markb@mellanox.com> |
RDMA/mlx5: Initialize roce port info before multiport master init When working in mutliport RoCE mode it is possible to attach a slave before the master. In that case the slave is waiting for a master to be attached. When the master is attached it goes over the list of waiting slaves, finds a slave that is compatible and tries to bind it to itself. The call stack is: mlx5_ib_init_multiport_master() -> mlx5_ib_bind_slave_port() In the bind function we will create a netdev notifier, but this is done before we initialize the RoCE structure (this is done at a later stage by the master in the ROCE stage). Once events are delivered to that notifier we will use mlx5_ib_get_native_port_mdev() to get the actual port and as the native port is zero we will access an invalid index in the port structure. Move the RoCE structure initialization to an earlier stage. Fixes: 32f69e4be269 ("{net, IB}/mlx5: Manage port association for multiport RoCE") Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
7f575103 |
|
28-Mar-2019 |
Mark Bloch <markb@mellanox.com> |
RDMA/mlx5: Allow DEVX and raw creation flow on reps Remove the limitations that were in place and provide support for DEVX and raw flow creation on reps. Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
3b70508a |
|
28-Mar-2019 |
Mark Bloch <markb@mellanox.com> |
RDMA/mlx5: Create flow table with max size supported Instead of failing the request, just use the supported number of flow entries. Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
13a43765 |
|
28-Mar-2019 |
Mark Bloch <markb@mellanox.com> |
RDMA/mlx5: Access the prio bypass inside the FDB flow table namespace Now that we have a specific prio inside the FDB namespace allow retrieving it from the RDMA side. Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
7249c8ea |
|
10-Apr-2019 |
Guy Levi <guyle@mellanox.com> |
IB/mlx5: Fix scatter to CQE in DCT QP creation When scatter to CQE is enabled on a DCT QP it corrupts the mailbox command since it tried to treat it as as QP create mailbox command instead of a DCT create command. The corrupted mailbox command causes userspace to malfunction as the device doesn't create the QP as expected. A new mlx5 capability is exposed to user-space which ensures that it will not enable the feature on DCT without this fix in the kernel. Fixes: 5d6ff1babe78 ("IB/mlx5: Support scatter to CQE for DC transport type") Signed-off-by: Guy Levi <guyle@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
fb652d32 |
|
28-Mar-2019 |
Mark Bloch <markb@mellanox.com> |
RDMA/mlx5: Remove VF representor profile Now that we have a single IB device with multiple ports we can remove the VF representor profile. Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
26628e2d |
|
28-Mar-2019 |
Mark Bloch <markb@mellanox.com> |
RDMA/mlx5: Move to single device multiport ports in switchdev mode Move from IB device (representor) per virtual function to single IB device with port per virtual function (port 1 represents the uplink). As number of ports is a static property of an IB device, declare the IB device with as many port as the possible according to the PCI bus. Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
a989ea01 |
|
28-Mar-2019 |
Mark Bloch <markb@mellanox.com> |
RDMA/mlx5: Move SMI caps logic We store the SMI information in the core device's struct, make sure we set that information only once (and not per port), while here make the for loop based on the actual size of the array. Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
35b0aa67 |
|
28-Mar-2019 |
Mark Bloch <markb@mellanox.com> |
RDMA/mlx5: Refactor netdev affinity code The design of representors is such that once an IB representor is created, the netdev of representor already exists, we can use that fact to simplify the netdev affinity code. Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
6a4d00be |
|
28-Mar-2019 |
Mark Bloch <markb@mellanox.com> |
RDMA/mlx5: Move rep into port struct In preparation of moving into a model of single IB device multiple ports move rep to be part of the port structure. We mark a representor device by setting is_rep, no functional change with this patch. Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
5d8f6a0e |
|
28-Mar-2019 |
Mark Bloch <markb@mellanox.com> |
RDMA/mlx5: Use correct size for device resources On allocation we use the array size and on destruction num_ports, use the array size of destruction as well, in this context the array corresponds to the native/actual ports on the NIC so no need to adjust this logic for representors. Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
da796ccb |
|
28-Mar-2019 |
Mark Bloch <markb@mellanox.com> |
RDMA/mlx5: Move ports allocation to outside of INIT stage In downstream patches we will need access to the ports before doing any stages, in order to set net device per representor. Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
4a6dc855 |
|
28-Mar-2019 |
Mark Bloch <markb@mellanox.com> |
RDMA/mlx5: Free IB device on remove Simplify the code and move the deallocation of the IB device into the remove function. Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
95579e78 |
|
28-Mar-2019 |
Mark Bloch <markb@mellanox.com> |
RDMA/mlx5: Move netdev info into the port struct Netdev info is stored in a separate array and holds data relevant on a per port basis, move it to be part of the port struct. Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
68e326de |
|
03-Apr-2019 |
Leon Romanovsky <leon@kernel.org> |
RDMA: Handle SRQ allocations by IB/core Convert SRQ allocation from drivers to be in the IB/core Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
d3456914 |
|
03-Apr-2019 |
Leon Romanovsky <leon@kernel.org> |
RDMA: Handle AH allocations by IB/core Simplify drivers by ensuring lifetime of ib_ah object. The changes in .create_ah() go hand in hand with relevant update in .destroy_ah(). We will use this opportunity and convert .destroy_ah() to don't fail, as it was suggested a long time ago, because there is nothing to do in case of failure during destroy. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
aa8106f1 |
|
29-Mar-2019 |
Huy Nguyen <huyn@mellanox.com> |
net/mlx5: Add explicit bar address field Add bar_addr field to store bar-0 address to avoid calling pci_resource_start with hard-coded bar-0 as parameter. Also note that different mlx5 device types will have bar_addr on different bars. This patch does not change any functionality. Signed-off-by: Huy Nguyen <huyn@mellanox.com> Signed-off-by: Vu Pham <vuhuong@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
#
ff23dfa1 |
|
31-Mar-2019 |
Shamir Rabinovitch <shamir.rabinovitch@oracle.com> |
IB: Pass only ib_udata in function prototypes Now when ib_udata is passed to all the driver's object create/destroy APIs the ib_udata will carry the ib_ucontext for every user command. There is no need to also pass the ib_ucontext via the functions prototypes. Make ib_udata the only argument psssed. Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
bdeacabd |
|
31-Mar-2019 |
Shamir Rabinovitch <shamir.rabinovitch@oracle.com> |
IB: Remove 'uobject->context' dependency in object destroy APIs Now that we have the udata passed to all the ib_xxx object destroy APIs and the additional macro 'rdma_udata_to_drv_context' to get the ib_ucontext from ib_udata stored in uverbs_attr_bundle, we can finally start to remove the dependency of the drivers in the ib_xxx->uobject->context. Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
c4367a26 |
|
31-Mar-2019 |
Shamir Rabinovitch <shamir.rabinovitch@oracle.com> |
IB: Pass uverbs_attr_bundle down ib_x destroy path The uverbs_attr_bundle with the ucontext is sent down to the drivers ib_x destroy path as ib_udata. The next patch will use the ib_udata to free the drivers destroy path from the dependency in 'uobject->context' as we already did for the create path. Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
cd272875 |
|
11-Mar-2019 |
Aya Levin <ayal@mellanox.com> |
IB/mlx5: Fix mapping of link-mode to IB width and speed Add mapping of link mode: CAUI4 100Gbps CR4/KR4 with 4 lines and 25Gbps. Fix mapping of link mode: GAUI2 50Gbps CR2/KR2 to be 2 lines with 25Gbps. Fixes: 08e8676f1607 ("IB/mlx5: Add support for 50Gbps per lane link modes") Signed-off-by: Aya Levin <ayal@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
a2a074ef |
|
12-Feb-2019 |
Leon Romanovsky <leon@kernel.org> |
RDMA: Handle ucontext allocations by IB/core Following the PD conversion patch, do the same for ucontext allocations. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
a6bc3875 |
|
17-Feb-2019 |
Moni Shoua <monis@mellanox.com> |
IB/mlx5: Protect against prefetch of invalid MR When deferring a prefetch request we need to protect against MR or PD being destroyed while the request is still enqueued. The first step is to validate that PD owns the lkey that describes the MR and that the MR that the lkey refers to is owned by that PD. The second step is to dequeue all requests when MR is destroyed. Since PD can't be destroyed while it owns MRs it is guaranteed that when a worker wakes up the request it refers to is still valid. Now, it is possible to refrain from taking a reference on the device since it is assured to be present as pd. While that, replace the dedicated ordered workqueue with the system unbound workqueue to reuse an existing resource and improve performance. This will also fix a bug of queueing to the wrong workqueue. Fixes: 813e90b1aeaa ("IB/mlx5: Add advise_mr() support") Reported-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
08e8676f |
|
12-Feb-2019 |
Aya Levin <ayal@mellanox.com> |
IB/mlx5: Add support for 50Gbps per lane link modes Driver now supports new link modes: 50Gbps per lane support for 50G/100G/200G. This patch reads the correct field (legacy vs. extended) based on a FW indication bit, and adds a translation function (link modes to IB width and speed) to the new link modes. Signed-off-by: Aya Levin <ayal@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
#
a08b4ed1 |
|
12-Feb-2019 |
Aya Levin <ayal@mellanox.com> |
net/mlx5: Add support to ext_* fields introduced in Port Type and Speed register This patch exposes new link modes (including 50Gbps per lane), and ext_* fields which describes the new link modes in Port Type and Speed register (PTYS). Access functions, translation functions (speed <-> HW bits) and link max speed function were modified. Signed-off-by: Aya Levin <ayal@mellanox.com> Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
#
bc4e12ff |
|
12-Feb-2019 |
Aya Levin <ayal@mellanox.com> |
net/mlx5: Refactor queries to speed fields in Port Type and Speed register This patch fascicles queries to speed related fields in Port Type and Speed register (PTYS) into a single API. I addition, this patch refactors functions which serves only Ethernet driver: remove the protocol type as an input parameter, move code from 'core' directory into 'en' directory and add 'eth' prefix to the function's name. The patch also encapsulates functions that are not used outside the Ethernet driver removes redundant include files. Signed-off-by: Aya Levin <ayal@mellanox.com> Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
#
f0666f1f |
|
12-Feb-2019 |
Bodong Wang <bodong@mellanox.com> |
IB/mlx5: Use unified register/load function for uplink and VF vports IB driver maintains different registration and load function calls for uplink and VF vports. This is not necessary as they only differ with each other on their profiles. This patch doesn't change any functionality. Signed-off-by: Bodong Wang <bodong@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
#
21a428a0 |
|
03-Feb-2019 |
Leon Romanovsky <leon@kernel.org> |
RDMA: Handle PD allocations by IB/core The PD allocations in IB/core allows us to simplify drivers and their error flows in their .alloc_pd() paths. The changes in .alloc_pd() go hand in had with relevant update in .dealloc_pd(). We will use this opportunity and convert .dealloc_pd() to don't fail, as it was suggested a long time ago, failures are not happening as we have never seen a WARN_ON print. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
cf34e1fe |
|
27-Jan-2019 |
Parav Pandit <parav@mellanox.com> |
IB/mlx5: Consider vlan of lower netdev for macvlan GID entries When a given netdev of the GID entry is macvlan netdevice, and if the lower netdevice is vlan device, GID entry for macvlan based IP address needs to inherit the vlan of the lower netdevice. Therefore, attempt to find out if the lower device exist and if so, if it is vlan device and setup the vlan tag correctly. Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Yuval Avnery <yuvalav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
459cc69f |
|
29-Jan-2019 |
Leon Romanovsky <leon@kernel.org> |
RDMA: Provide safe ib_alloc_device() function All callers to ib_alloc_device() provide a larger size than struct ib_device and rely on the fact that struct ib_device is embedded in their driver specific structure as the first member. Provide a safer variant of ib_alloc_device() that checks and enforces this approach to make sure the drivers are using it right. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
f3ffed0c |
|
30-Jan-2019 |
Kamal Heib <kamalheib1@gmail.com> |
IB/mlx5: Make mlx5_ib_stage_odp_cleanup() static The function mlx5_ib_stage_odp_cleanup() is only used in main.c Fixes: d5d284b829a6 ("{net,IB}/mlx5: Move Page fault EQ and ODP logic to RDMA") Signed-off-by: Kamal Heib <kamalheib1@gmail.com> Acked-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
73eb8f03 |
|
22-Jan-2019 |
Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
infiniband: mlx5: no need to check return value of debugfs_create functions When calling debugfs functions, there is no need to ever check the return value. The function can work or not, but the code logic should never do something different based on this. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
6113cc44 |
|
17-Jan-2019 |
Maor Gottlieb <maorg@mellanox.com> |
IB/mlx5: Don't override existing ip_protocol Two flow specifications can set the ip protocol field in the flow table entry: 1) IB_FLOW_SPEC_TCP/UDP/GRE - set the ip protocol accordingly. 2) IB_FLOW_SPEC_IPV4/6 - has ip_protocol field for users who want to receive specific L4 packets. We need to avoid overriding of the ip_protocol with zeros, in case that the user first put the L4 specification and only then the L3. Fixes: ca0d47538528b ('IB/mlx5: Add support in TOS and protocol to flow steering') Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
534fd7aa |
|
13-Jan-2019 |
Yishai Hadas <yishaih@mellanox.com> |
IB/mlx5: Manage indirection mkey upon DEVX flow for ODP Manage indirection mkey upon DEVX flow to support ODP. To support a page fault event on the indirection mkey it needs to be part of the device mkey radix tree. Both the creation and the deletion flows for a DEVX object which is indirection mkey were adapted to handle that. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Reviewed-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
8e8aa145 |
|
14-Jan-2019 |
Gustavo A. R. Silva <gustavo@embeddedor.com> |
RDMA/mlx5: Replace kzalloc with kcalloc Replace kzalloc() function with its 2-factor argument form, kcalloc(). This patch replaces cases of: kzalloc(a * b, gfp) with: kcalloc(a, b, gfp) This code was detected with the help of Coccinelle. Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Acked-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
54747231 |
|
18-Dec-2018 |
Parav Pandit <parav@mellanox.com> |
RDMA: Introduce and use rdma_device_to_ibdev() Introduce and use rdma_device_to_ibdev() API for those drivers which are registering one sysfs group and also use in ib_core. In subsequent patch, device->provider_ibdev one-to-one mapping is no longer holds true during accessing sysfs entries. Therefore, introduce an API rdma_device_to_ibdev() that provides such information. Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
ea4baf7f |
|
18-Dec-2018 |
Parav Pandit <parav@mellanox.com> |
RDMA: Rename port_callback to init_port Most provider routines are callback routines which ib core invokes. _callback suffix doesn't convey information about when such callback is invoked. Therefore, rename port_callback to init_port. Additionally, store the init_port function pointer in ib_device_ops, so that it can be accessed in subsequent patches when binding rdma device to net namespace. Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
8cbfaac3 |
|
09-Jan-2019 |
Leon Romanovsky <leon@kernel.org> |
RDMA: Clear PD objects during their allocation As part of an audit process to update drivers to use rdma_restrack_add() ensure that PD objects is cleared before access. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
13859d5d |
|
08-Jan-2019 |
Leon Romanovsky <leon@kernel.org> |
RDMA/mlx5: Embed into the code flow the ODP config option Convert various places to more readable code, which embeds CONFIG_INFINIBAND_ON_DEMAND_PAGING into the code flow. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
e502b8b0 |
|
08-Jan-2019 |
Leon Romanovsky <leon@kernel.org> |
RDMA/core: Don't depend device ODP capabilities on kconfig option Device capability bits are exposing what specific device supports from HW perspective. Those bits are not dependent on kernel configurations and RDMA/core should ensure that proper interfaces to users will be disabled if CONFIG_INFINIBAND_ON_DEMAND_PAGING is not set. Fixes: f4056bfd8ccf ("IB/core: Add on demand paging caps to ib_uverbs_ex_query_device") Fixes: 8cdd312cfed7 ("IB/mlx5: Implement the ODP capability query verb") Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
aa74be6e |
|
08-Dec-2018 |
Yishai Hadas <yishaih@mellanox.com> |
IB/mlx5: Allocate the per-port Q counter shared when DEVX is supported The per-port Q counter is some kernel resource and as such may be used by few UID(s) upon DEVX usage. To enable using it for QP/RQ when DEVX context is used need to allocate it with a sharing mode indication to let firmware allows its usage. The UID = 0xffff was chosen to mark it. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
623d1543 |
|
20-Dec-2018 |
Jason Gunthorpe <jgg@ziepe.ca> |
IB/mlx5: Fix wrong error unwind The destroy_workqueue on error unwind is missing, and the code jumps to the wrong exit label. Fixes: 813e90b1aeaa ("IB/mlx5: Add advise_mr() support") Acked-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
842a9c83 |
|
11-Dec-2018 |
Or Gerlitz <ogerlitz@mellanox.com> |
IB/mlx5: Simplify netdev unbinding When dealing with netdev unregister events, we just need to know that this is our currently bounded netdev. There's no need to do any further checks/queries. This patch doesn't change any functionality. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
813e90b1 |
|
11-Dec-2018 |
Moni Shoua <monis@mellanox.com> |
IB/mlx5: Add advise_mr() support The verb advise_mr() is used to give advice to the kernel about an address range that belongs to a MR. Implement the verb and register it on the device. The current implementation supports the only known advice to date, prefetch. Signed-off-by: Moni Shoua <monis@mellanox.com> Reviewed-by: Guy Levi <guyle@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
7c34ec19 |
|
23-Aug-2018 |
Aviv Heller <avivh@mellanox.com> |
net/mlx5: Make RoCE and SR-IOV LAG modes explicit With the introduction of SR-IOV LAG, checking whether LAG is active is no longer good enough, since RoCE and SR-IOV LAG each entails different behavior by both the core and infiniband drivers. This patch introduces facilities to discern LAG type, in addition to mlx5_lag_is_active(). These are implemented in such a way as to allow more complex mode combinations in the future. Signed-off-by: Aviv Heller <avivh@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
#
06cc74af |
|
12-Dec-2018 |
Mark Bloch <markb@mellanox.com> |
IB/mlx5: Unify e-switch representors load approach between uplink and VFs When in switchdev mode and the add function is called by the core level driver, make sure we only register the callbacks, but don't create the mlx5 IB device or initialize anything. With this change all the IB devices in switchdev mode are created only once the load callback is invoked by the e-switch core sub-module. This follows the design paradigm under which the all the Eth representors must be loaded before any of IB reprs is loaded. Signed-off-by: Mark Bloch <markb@mellanox.com> Acked-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
#
3023a1e9 |
|
10-Dec-2018 |
Kamal Heib <kamalheib1@gmail.com> |
RDMA: Start use ib_device_ops Make all the required change to start use the ib_device_ops structure. Signed-off-by: Kamal Heib <kamalheib1@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
96458233e |
|
10-Dec-2018 |
Kamal Heib <kamalheib1@gmail.com> |
RDMA/mlx5: Initialize ib_device_ops struct Initialize ib_device_ops with the supported operations using ib_set_device_ops(). Signed-off-by: Kamal Heib <kamalheib1@gmail.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
d764970b |
|
09-Dec-2018 |
Michael Guralnik <michaelgur@mellanox.com> |
IB/mlx5: Add 2X width support to query_port Report to the user 2x width over MAD interface. Signed-off-by: Michael Guralnik <michaelgur@mellanox.com> Reviewed-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
4106a758 |
|
09-Dec-2018 |
Michael Guralnik <michaelgur@mellanox.com> |
IB/mlx5: Report CapabilityMask2 in ib_query_port CapabilityMask2 exists when IB_PORT_CAP_MASK2_SUP is set in the original capability mask. In such cases, query its value and report it in query port. Signed-off-by: Michael Guralnik <michaelgur@mellanox.com> Reviewed-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
5886a96a |
|
10-Dec-2018 |
Oz Shlomo <ozsh@mellanox.com> |
net/mlx5: Revise gre and nvgre key formats GRE RFC defines a 32 bit key field. NVGRE RFC splits the 32 bit key field to 24 bit VSID (gre_key_h) and 8 bit flow entropy (gre_key_l). Define the two key parsing alternatives in a union, thus enabling both access methods. Signed-off-by: Oz Shlomo <ozsh@mellanox.com> Reviewed-by: Eli Britstein <elibr@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
#
7e11b911 |
|
30-Nov-2018 |
Danit Goldberg <danitg@mellanox.com> |
IB/mlx5: Report packet based credit mode device capability Report packet based credit mode capability via the mlx5 DV interface. Signed-off-by: Danit Goldberg <danitg@mellanox.com> Reviewed-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
fb98153b |
|
25-Nov-2018 |
Yishai Hadas <yishaih@mellanox.com> |
IB/mlx5: Enforce DEVX privilege by firmware Enforce DEVX privilege by firmware, this enables future device functionality without the need to make driver changes unless a new privilege type will be introduced. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Reviewed-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
f3da6577 |
|
28-Nov-2018 |
Leon Romanovsky <leon@kernel.org> |
RDMA/mlx5: Initialize SRQ tables on mlx5_ib Transfer initialization and cleanup from mlx5_priv struct of mlx5_core_dev to be part of mlx5_ib_dev. This completes removal of SRQ from mlx5_core. Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
|
#
81773ce5 |
|
28-Nov-2018 |
Leon Romanovsky <leon@kernel.org> |
RDMA/mlx5: Use stages for callback to setup and release DEVX Reuse existing infrastructure to initialize and release DEVX uid. The DevX interface is intended for user space access, so it is supposed to be initialized before ib_register_device(). Also it isn't supported in switchdev mode and don't need to initialize it in that mode. Fixes: 76dc5a8406bf ("IB/mlx5: Manage device uid for DEVX white list commands") Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
|
#
09e574fa |
|
26-Nov-2018 |
Saeed Mahameed <saeedm@mellanox.com> |
IB/mlx5: Handle raw delay drop general event Handle FW general event rq delay drop as it was received from FW via mlx5 notifiers API, instead of handling the processed software version of that event. After this patch we can safely remove all software processed FW events types and definitions. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
#
134e9349 |
|
26-Nov-2018 |
Saeed Mahameed <saeedm@mellanox.com> |
IB/mlx5: Handle raw port change event rather than the software version Use the FW version of the port change event as forwarded via new mlx5 notifiers API. After this patch, processed software version of the port change event will become deprecated and will be totally removed in downstream patches. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
#
df097a27 |
|
26-Nov-2018 |
Saeed Mahameed <saeedm@mellanox.com> |
IB/mlx5: Use the new mlx5 core notifier API Remove the deprecated mlx5_interface->event mlx5_ib callback and use new mlx5 notifier API to subscribe for mlx5 events. For native mlx5_ib devices profiles pf_profile/nic_rep_profile register the notifier callback mlx5_ib_handle_event which treats the notifier context as mlx5_ib_dev. For vport repesentors, don't register any notifier, same as before, they didn't receive any mlx5 events. For slave port (mlx5_ib_multiport_info) register a different notifier callback mlx5_ib_event_slave_port, which knows that the event is coming for mlx5_ib_multiport_info and prepares the event job accordingly. Before this on the event handler work we had to ask mlx5_core if this is a slave port mlx5_core_is_mp_slave(work->dev), now it is not needed anymore. mlx5_ib_multiport_info notifier registration is done on mlx5_ib_bind_slave_port and de-registration is done on mlx5_ib_unbind_slave_port. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
#
bfc5d839 |
|
20-Nov-2018 |
Mark Bloch <markb@mellanox.com> |
RDMA/mlx5: Attach a DEVX counter via raw flow creation Allow a user to attach a DEVX counter via mlx5 raw flow creation. In order to attach a counter we introduce a new attribute: MLX5_IB_ATTR_CREATE_FLOW_ARR_COUNTERS_DEVX A counter can be attached to multiple flow steering rules. Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
36e235c8 |
|
12-Nov-2018 |
Jason Gunthorpe <jgg@ziepe.ca> |
RDMA/mlx5: Use the uapi disablement APIs instead of code Rely on UAPI_DEF_IS_OBJ_SUPPORTED instead of manipulating the contents of the driver's definition list. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
|
#
0cbf432d |
|
12-Nov-2018 |
Jason Gunthorpe <jgg@ziepe.ca> |
RDMA/uverbs: Use a linear list to describe the compiled-in uapi The 'tree' data structure is very hard to build at compile time, and this makes it very limited. The new radix tree based compiler can handle a more complex input language that does not require the compiler to perfectly group everything into a neat tree structure. Instead use a simple list to describe to input, where the list elements can be of various different 'opcodes' instructing the radix compiler what to do. Start out with opcodes chaining to other definition lists and chaining to the existing 'tree' definition. Replace the very top level of the 'object tree' with this list type and get rid of struct uverbs_object_tree_def and DECLARE_UVERBS_OBJECT_TREE. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
|
#
dfb631a1 |
|
12-Nov-2018 |
Jason Gunthorpe <jgg@ziepe.ca> |
RDMA/mlx5: Do not generate the uabi specs unconditionally For DM there is no reason not to add the spec for the START_OFFSET, if DM is not supported then ib_dev.alloc_dm is already set to NULL which ensures we do not call the method. For IPSEC, the core code should be setting ib_dev.create_flow_action_esp to NULL to disable it, not relying on wonky manipulation of the specs. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
|
#
db7a691a |
|
21-Nov-2018 |
Michael Guralnik <michaelgur@mellanox.com> |
IB/mlx5: Avoid load failure due to unknown link width If the firmware reports a connection width that is not 1x, 4x, 8x or 12x it causes the driver to fail during initialization. To prevent this failure every time a new width is introduced to the RDMA stack, we will set a default 4x width for these widths which ar unknown to the driver. This is needed to allow to run old kernels with new firmware. Cc: <stable@vger.kernel.org> # 4.1 Fixes: 1b5daf11b015 ("IB/mlx5: Avoid using the MAD_IFC command under ISSI > 0 mode") Signed-off-by: Michael Guralnik <michaelgur@mellanox.com> Reviewed-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
d5d284b8 |
|
19-Nov-2018 |
Saeed Mahameed <saeedm@mellanox.com> |
{net,IB}/mlx5: Move Page fault EQ and ODP logic to RDMA Use the new generic EQ API to move all ODP RDMA data structures and logic form mlx5 core driver into mlx5_ib driver. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Acked-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
|
#
f2f3df55 |
|
19-Nov-2018 |
Saeed Mahameed <saeedm@mellanox.com> |
net/mlx5: EQ, Privatize eq_table and friends Move unnecessary EQ table structures and declaration from the public include/linux/mlx5/driver.h into the private area of mlx5_core and into eq.c/eq.h. Introduce new mlx5 EQ APIs: mlx5_comp_vectors_count(dev); mlx5_comp_irq_get_affinity_mask(dev, vector); And use them from mlx5_ib or mlx5e netdevice instead of direct access to mlx5_core internal structures. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
|
#
9afc97c2 |
|
01-Nov-2018 |
Sagi Grimberg <sagi@grimberg.me> |
mlx5: remove support for ib_get_vector_affinity Devices that does not use managed affinity can not export a vector affinity as the consumer relies on having a static mapping it can map to upper layer affinity (e.g. sw queues). If the driver allows the user to set the device irq affinity, then the affinitization of a long term existing entites is not relevant. For example, nvme-rdma controllers queue-irq affinitization is determined at init time so if the irq affinity changes over time, we are no longer aligned. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Acked-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
d5634fee |
|
19-Sep-2018 |
Paul Blakey <paulb@mellanox.com> |
net/mlx5: Add a no-append flow insertion mode If no-append flag is set, we will add a new FTE, instead of appending the actions of the inserted rule when the same match already exists. While here, move the has_flow_tag boolean indicator to be a flag too. This patch doesn't change any functionality. Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanmox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
#
171c7625b |
|
02-Oct-2018 |
Mark Bloch <markb@mellanox.com> |
net/mlx5: Use flow counter IDs and not the wrapping cache object Currently, when a flow rule is created using the FS core layer, the caller has to pass the entire flow counter object and not just the counter HW handle (ID). This requires both the FS core and the caller to have knowledge about the inner implementation of the FS layer flow counters cache and limits the possible users. Move to use the counter ID across the place when dealing with flows. Doing this decoupling, now can we privatize the inner implementation of the flow counters. Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
#
508a523f |
|
11-Oct-2018 |
Parav Pandit <parav@mellanox.com> |
RDMA/drivers: Use core provided API for registering device attributes Use rdma_set_device_sysfs_group() to register device attributes and simplify the driver. Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
f6a8a19b |
|
14-Aug-2018 |
Denis Drozdov <denisd@mellanox.com> |
RDMA/netdev: Hoist alloc_netdev_mqs out of the driver netdev has several interfaces that expect to call alloc_netdev_mqs from the core code, with the driver only providing the arguments. This is incompatible with the rdma_netdev interface that returns the netdev directly. Thus re-organize the API used by ipoib so that the verbs core code calls alloc_netdev_mqs for the driver. This is done by allowing the drivers to provide the allocation parameters via a 'get_params' callback and then initializing an allocated netdev as a second step. Fixes: cd565b4b51e5 ("IB/IPoIB: Support acceleration options callbacks") Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Denis Drozdov <denisd@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
#
3df6e023 |
|
20-Sep-2018 |
Yishai Hadas <yishaih@mellanox.com> |
IB/mlx5: Enable DEVX on IB IB has additional protections with SELinux that cannot be extended to the DEVX domain. SELinux can restrict access to pkeys. The first version of DEVX blocked IB entirely until this could be understood. Since DEVX requires CAP_NET_RAW, it supersedes the SELinux restriction and allows userspace to form arbitrary packets with arbitrary pkeys. Thus we enable IB for DEVX when CAP_NET_RAW is given. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
76dc5a84 |
|
20-Sep-2018 |
Yishai Hadas <yishaih@mellanox.com> |
IB/mlx5: Manage device uid for DEVX white list commands Manage device uid for DEVX white list commands. The created device uid will be used on white list commands if the user didn't supply its own uid. This will enable the firmware to filter out non privileged functionality as of the recognition of the uid. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
e349f858 |
|
25-Sep-2018 |
Jason Gunthorpe <jgg@ziepe.ca> |
RDMA: Fully setup the device name in ib_register_device The current code has two copies of the device name, ibdev->dev and dev_name(&ibdev->dev), and they are setup at different times, which is very confusing. Set them both up at the same time and make dev_name() the lead name, which is the proper use of the driver core APIs. To make it very clear that the name is not valid until registration pass it in to the ib_register_device() call rather than messing with ibdev->name directly. Also the reorganization now checks that dev_name is unique even if it does not contain a %. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Acked-by: Adit Ranadive <aditr@vmware.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Acked-by: Devesh Sharma <devesh.sharma@broadcom.com> Reviewed-by: Shiraz Saleem <shiraz.saleem@intel.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
|
#
0430e74f |
|
24-Sep-2018 |
Mark Bloch <markb@mellanox.com> |
RDMA/mlx5: Remove superfluous version print When profiles were introduced to MLX5 IB an unneeded version print when creating an MLX5 IB device was added. Remove the print, we still have a printk for driver version in mlx5_ib_add(). Fixes: 16c1975f1032 ("IB/mlx5: Create profile infrastructure to add and remove stages") Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
d2d19121 |
|
20-Sep-2018 |
Yishai Hadas <yishaih@mellanox.com> |
IB/mlx5: Set uid as part of TD commands Set uid as part of TD commands so that the firmware can manage the TD object in a secured way. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
58895f0d |
|
20-Sep-2018 |
Yishai Hadas <yishaih@mellanox.com> |
IB/mlx5: Set uid upon PD allocation Set uid as part of PD allocation, this uid is used for other mlx5 objects upon calling the firmware. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
539ec982 |
|
20-Sep-2018 |
Yishai Hadas <yishaih@mellanox.com> |
IB/mlx5: Set uid as part of MCG commands Set uid as part of MCG commands so that the firmware can manage the MCG object in a secured way. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
a1069c1c |
|
20-Sep-2018 |
Yishai Hadas <yishaih@mellanox.com> |
IB/mlx5: Use uid as part of PD commands Use uid as part of PD commands so that the firmware can manage the PD object in a secured way. For example when a QP is created its uid must match the CQ uid which it uses. Next patches in this series will use the uid from the PD, then will come a patch to set the uid on the PD so that all objects will be properly work in one change. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
0042f9e4 |
|
17-Sep-2018 |
Mark Bloch <markb@mellanox.com> |
RDMA/mlx5: Enable vport loopback when user context or QP mandate A user can create a QP which can accept loopback traffic, but that's not enough. We need to enable loopback on the vport as well. Currently vport loopback is enabled only when more than 1 users are using the IB device, update the logic to consider whatever a QP which supports loopback was created, if so enable vport loopback even if there is only a single user. Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
a560f1d9 |
|
17-Sep-2018 |
Mark Bloch <markb@mellanox.com> |
RDMA/mlx5: Refactor transport domain bookkeeping logic In preparation to enable loopback on a single user context move the logic that enables/disables loopback to separate functions and group variables under a single struct. Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
f27a0d50 |
|
16-Sep-2018 |
Jason Gunthorpe <jgg@ziepe.ca> |
RDMA/umem: Use umem->owning_mm inside ODP Since ODP had a single struct mmu_notifier located in the ucontext it could only handle a single MM at a time, and this prevented it from using the new owning_mm system. With the prior rework it is now simple to let ODP track multiple MMs per ucontext, finish the job so that the per_mm is allocated on a mm by mm basis, and freed when the last umem is dropped from the ucontext. As a side effect the new saner locking removes the lockdep splat about nesting the umem_rwsem between mmu_notifier_unregister and ib_umem_odp_release. It also makes ODP work with multiple processes, across, fork, etc. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
e2cd1d1a |
|
16-Sep-2018 |
Jason Gunthorpe <jgg@ziepe.ca> |
RDMA/mlx5: Use rdma_user_mmap_io Rely on the new core code helper to map BAR memory from the driver. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
a7ee18bd |
|
06-Sep-2018 |
Mark Bloch <markb@mellanox.com> |
RDMA/mlx5: Allow creating a matcher for a NIC TX flow table Currently a matcher can only be created and attached to a NIC RX flow table. Extend it to allow it on NIC TX flow tables as well. In order to achieve that, we: 1) Expose a new attribute: MLX5_IB_ATTR_FLOW_MATCHER_FLOW_FLAGS. enum ib_flow_flags is used as valid flags. Only IB_FLOW_ATTR_FLAGS_EGRESS is supported. 2) Remove the requirement to have a DEVX or QP destination when creating a flow. A flow added to NIC TX flow table will forward the packet outside of the vport (Wire or E-Switch in the SR-iOV case). Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
b47fd4ff |
|
06-Sep-2018 |
Mark Bloch <markb@mellanox.com> |
RDMA/mlx5: Add NIC TX namespace when getting a flow table Add the ability to get a NIC TX flow table when using _get_flow_table(). This will allow to create a matcher and a flow rule on the NIC TX path. Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
b823dd6d |
|
06-Sep-2018 |
Mark Bloch <markb@mellanox.com> |
RDMA/mlx5: Refactor raw flow creation Move struct mlx5_flow_act to be passed from the method entry point, this will allow to add support for flow action for the raw create flow path. Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
501f14e3 |
|
06-Sep-2018 |
Mark Bloch <markb@mellanox.com> |
RDMA/mlx5: Don't overwrite action if already set We support only a single action type per flow rule, in case the user passes the same type of flow actions fail the flow creation. Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
2ea26203 |
|
06-Sep-2018 |
Mark Bloch <markb@mellanox.com> |
RDMA/mlx5: Refactor flow action parsing to be more generic Make the parsing of flow actions more generic so it could be used by mlx5 raw create flow. Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
e806f932 |
|
01-Sep-2018 |
Mark Bloch <markb@mellanox.com> |
RDMA/mlx5: Enable attaching packet reformat action to steering flows Any matching rules will be mutated based on the packet reformat context which is attached to that given flow rule. Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
5c2db53f |
|
01-Sep-2018 |
Mark Bloch <markb@mellanox.com> |
RDMA/mlx5: Enable reformat on NIC RX if supported A L3_TUNNEL_TO_L2 decap flow action requires to enable the encap bit on the flow table, enable it if supported. This will allow to attach those flow actions to NIC RX steering. We don't enable if running on a representor. Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
10a30896 |
|
01-Sep-2018 |
Mark Bloch <markb@mellanox.com> |
RDMA/mlx5: Enable attaching DECAP action to steering flows Any matching packet will be stripped of it's VXLAN tunnel, only the inner L2 onward is left. The user will receive the decapsulated packet. Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
4adda112 |
|
01-Sep-2018 |
Mark Bloch <markb@mellanox.com> |
RDMA/mlx5: Enable decap and packet reformat on flow tables If NIC RX flow tables support decap operation, enable it on creation, This allows to perform decapsulation of tunnelled packets by steering rules. If NIC TX flow tables support reformat operation, enable it on creation. We don't enable those capabilities on representors as the E-Switch should handle packet modification (can be configured via TC) and as current hardware can't handle both FDB and NIC flow tables with decap/packet reformat support. Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
b1085be3 |
|
01-Sep-2018 |
Mark Bloch <markb@mellanox.com> |
RDMA/mlx5: Enable attaching modify header to steering flows When creating a flow steering rule, allow the user to attach a modify header action. Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
78dd0c43 |
|
01-Sep-2018 |
Mark Bloch <markb@mellanox.com> |
RDMA/mlx5: Add NIC TX steering support Just like ingress steering, allow a user to create steering rules that match egress vport traffic. We expose the same number of priorities as the bypass (NIC RX) steering. Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
b4749bf2 |
|
28-Aug-2018 |
Mark Bloch <markb@mellanox.com> |
RDMA/mlx5: Add a new flow action verb - modify header Expose the ability to create a flow action which changes packet headers. The data passed from userspace should be modify header actions as defined by HW specification. Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
c6a21c38 |
|
28-Aug-2018 |
Majd Dibbiny <majd@mellanox.com> |
IB/mlx5: Change TX affinity assignment in RoCE LAG mode In the current code, the TX affinity is per RoCE device, which can cause unfairness between different contexts. e.g. if we open two contexts, and each open 10 QPs concurrently, all of the QPs of the first context might end up on the first port instead of distributed on the two ports as expected To overcome this unfairness between processes, we maintain per device TX affinity, and per process TX affinity. The allocation algorithm is as follow: 1. Hold two tx_port_affinity atomic variables, one per RoCE device and one per ucontext. Both initialized to 0. 2. In mlx5_ib_alloc_ucontext do: 2.1. ucontext.tx_port_affinity = device.tx_port_affinity 2.2. device.tx_port_affinity += 1 3. In modify QP INIT2RST: 3.1. qp.tx_port_affinity = ucontext.tx_port_affinity % MLX5_PORT_NUM 3.2. ucontext.tx_port_affinity += 1 Signed-off-by: Majd Dibbiny <majd@mellanox.com> Reviewed-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
7d96c9b1 |
|
09-Aug-2018 |
Jason Gunthorpe <jgg@ziepe.ca> |
IB/uverbs: Have the core code create the uverbs_root_spec There is no reason for drivers to do this, the core code should take of everything. The drivers will provide their information from rodata to describe their modifications to the core's base uapi specification. The core uses this to build up the runtime uapi for each device. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
|
#
9f49a5b5 |
|
29-Jul-2018 |
Jason Gunthorpe <jgg@ziepe.ca> |
RDMA/netdev: Use priv_destructor for netdev cleanup Now that the unregister_netdev flow for IPoIB no longer relies on external code we can now introduce the use of priv_destructor and needs_free_netdev. The rdma_netdev flow is switched to use the netdev common priv_destructor instead of the special free_rdma_netdev and the IPOIB ULP adjusted: - priv_destructor needs to switch to point to the ULP's destructor which will then call the rdma_ndev's in the right order - We need to be careful around the error unwind of register_netdev as it sometimes calls priv_destructor on failure - ULPs need to use ndo_init/uninit to ensure proper ordering of failures around register_netdev Switching to priv_destructor is a necessary pre-requisite to using the rtnl new_link mechanism. The VNIC user for rdma_netdev should also be revised, but that is left for another patch. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Denis Drozdov <denisd@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
|
#
bccd0622 |
|
26-Jul-2018 |
Jason Gunthorpe <jgg@ziepe.ca> |
IB/uverbs: Add UVERBS_ATTR_FLAGS_IN to the specs language This clearly indicates that the input is a bitwise combination of values in an enum, and identifies which enum contains the definition of the bits. Special accessors are provided that handle the mandatory validation of the allowed bits and enforce the correct type for bitwise flags. If we had introduced this at the start then the kabi would have uniformly used u64 data to pass flags, however today there is a mixture of u64 and u32 flags. All places are converted to accept both sizes and the accessor fixes it. This allows all existing flags to grow to u64 in future without any hassle. Finally all flags are, by definition, optional. If flags are not passed the accessor does not fail, but provides a value of zero. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
|
#
2577188e |
|
23-Jul-2018 |
Qing Huang <qing.huang@oracle.com> |
IB/mlx5: avoid excessive warning msgs when creating VFs on 2nd port When a CX5 device is configured in dual-port RoCE mode, after creating many VFs against port 1, creating the same number of VFs against port 2 will flood kernel/syslog with something like "mlx5_*:mlx5_ib_bind_slave_port:4266:(pid 5269): port 2 already affiliated." So basically, when traversing mlx5_ib_dev_list, mlx5_ib_add_slave_port() repeatedly attempts to bind the new mpi structure to every device on the list until it finds an unbound device. Change the log level from warn to dbg to avoid log flooding as the warning should be harmless. Signed-off-by: Qing Huang <qing.huang@oracle.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
cb80fb18 |
|
23-Jul-2018 |
Yishai Hadas <yishaih@mellanox.com> |
IB/mlx5: Enable driver uapi commands for flow steering Expose the mlx5 flow steering parsing trees, exposing the functionality to user space. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
6346f0bf |
|
23-Jul-2018 |
Yishai Hadas <yishaih@mellanox.com> |
IB/mlx5: Add support for a flow table destination for driver flow steering Add support to set a destination that is a flow table, this can come from the DEVX destination. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
d4be3f44 |
|
23-Jul-2018 |
Yishai Hadas <yishaih@mellanox.com> |
IB/mlx5: Support adding flow steering rule by raw description Add support to set a public flow steering rule when its destination is a TIR by using raw specification data. The logic follows the verbs API but instead of using ib_spec(s) the raw, device specific, description is used. This allows supporting specialty matchers without having to define new matches in the verbs struct based language. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
32269441 |
|
23-Jul-2018 |
Yishai Hadas <yishaih@mellanox.com> |
IB/mlx5: Introduce driver create and destroy flow methods Introduce driver create and destroy flow methods on the uverbs flow object. This allows the driver to get its specific device attributes to match the underlay specification while still using the generic ib_flow object for cleanup and code sharing. The IB object's attributes are set via the ib_set_flow() helper function. The specific implementation for the given specification is added in downstream patches. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
aa09ea6e |
|
18-Jul-2018 |
Kamal Heib <kamalheib1@gmail.com> |
RDMA/mlx5: Remove set but not used variables Remove "uctx" and "pa" variables that were set but not used. Fixes: a8b92ca1b0e5 ("IB/mlx5: Introduce DEVX") Fixes: 8f0622873358 ("RDMA/mlx5: Remove debug prints of VMA pointers") Signed-off-by: Kamal Heib <kamalheib1@gmail.com> Acked-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
b02289b3 |
|
04-Jul-2018 |
Artemy Kovalyov <artemyko@mellanox.com> |
RDMA: Validate grh_required when handling AVs Extend the existing grh_required flag to check when AV's are handled that a GRH is present. Since we don't want to do query_port during the AV checks for performance reasons move the flag into the immutable_data. Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
2f944c0f |
|
04-Jul-2018 |
Jason Gunthorpe <jgg@ziepe.ca> |
RDMA: Fix storage of PortInfo CapabilityMask in the kernel The internal flag IP_BASED_GIDS was added to a field that was being used to hold the port Info CapabilityMask without considering the effects this will have. Since most drivers just use the value from the HW MAD it means IP_BASED_GIDS will also become set on any HW that sets the IBA flag IsOtherLocalChangesNoticeSupported - which is not intended. Fix this by keeping port_cap_flags only for the IBA CapabilityMask value and store unrelated flags externally. Move the bit definitions for this to ib_mad.h to make it clear what is happening. To keep the uAPI unchanged define a new set of flags in the uapi header that are only used by ib_uverbs_query_port_resp.port_cap_flags which match the current flags supported in rdma-core, and the values exposed by the current kernel. Fixes: b4a26a27287a ("IB: Report using RoCE IP based gids in port caps") Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
|
#
921c0f5b |
|
08-Jul-2018 |
Parav Pandit <parav@mellanox.com> |
IB/mlx5: Honor cnt_set_id_valid flag instead of set_id It is incorrect to depend on set_id value to know if counters were allocated or not. set_id_valid field is set to true when counters were allocated. Therefore, use set_id_valid while deciding to free counters. Cc: <stable@vger.kernel.org> # 4.15 Fixes: aac4492ef23a ("IB/mlx5: Update counter implementation for dual port RoCE") Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
e3f1ed1f |
|
07-Jul-2018 |
Leon Romanovsky <leon@kernel.org> |
RDMA/mlx5: Remove unused port number parameter Clean up a little bit code to drop unused port_num parameter. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
83bb4442 |
|
03-Jul-2018 |
Jason Gunthorpe <jgg@ziepe.ca> |
RDMA/uverbs: Remove UA_FLAGS This bit of boilerplate isn't really necessary, we can use bitfields instead of a flags enum and the macros can then individually initialize them through the __VA_ARGS__ like everything else. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
|
#
9a119cd5 |
|
03-Jul-2018 |
Jason Gunthorpe <jgg@ziepe.ca> |
RDMA/uverbs: Get rid of the & in method specifications Hide it inside the macros. The & is confusing and interferes with using this as a generic DSL in later patches. Since this also touches almost every line, also run the specs through clang-format (with 'BinPackParameters: false') to make the maintenance easier. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
|
#
87fc2a62 |
|
03-Jul-2018 |
Jason Gunthorpe <jgg@ziepe.ca> |
RDMA/uverbs: Store the specs_root in the struct ib_uverbs_device The specs are required to operate the uverbs file, so they belong inside the ib_uverbs_device, not inside the ib_device. The spec passed in the ib_device is just a communication from the driver and should not be used during runtime. This also changes the lifetime of the spec memory to match the ib_uverbs_device, however at this time the spec_root can still contain driver pointers after disassociation, so it cannot be used if ib_dev is NULL. This is preparation for another series. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
|
#
25bb36e7 |
|
18-Jun-2018 |
Yonatan Cohen <yonatanc@mellanox.com> |
IB/mlx5: Expose dump and fill memory key MLX5 IB HCA offers the memory key, dump_fill_mkey to boost performance, when used in a send or receive operations. It is used to force local HCA operations to skip the PCI bus access, while keeping track of the processed length in the ibv_sge handling. Meaning, instead of a PCI write access the HCA leaves the target memory untouched, and skips filling that packet section. Similar behavior is done upon send, the HCA skips data in memory relevant to this key and saves PCI bus access. This functionality saves PCI read/write operations. Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Reviewed-by: Guy Levi <guyle@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
a93b632c |
|
01-Jul-2018 |
Maor Gottlieb <maorg@mellanox.com> |
IB/mlx5: Fix GRE flow specification Currently the driver sets the mask of the gre_protocol to 0xffff without consideration in the user request. Fix it by copy the mask from the verbs spec. Fixes: da2f22ae7707 ("IB/mlx5: Add support for GRE flow specification") Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Ariel Levkovich <lariel@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
7496a511 |
|
02-Jul-2018 |
Bart Van Assche <bvanassche@acm.org> |
IB/mlx5: Remove set-but-not-used variables Avoid that the compiler complains about set-but-not-used variables when building with W=1. This patch does not change any functionality. Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Cc: Leon Romanovsky <leonro@mellanox.com> Acked-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
15177999 |
|
27-Jun-2018 |
Leon Romanovsky <leon@kernel.org> |
RDMA/mlx5: Don't leak UARs in case of free fails The failure in releasing one UAR doesn't mean that we can't continue to release rest of system pages, so don't return too early. As part of cleanup, there is no need to print warning if mlx5_cmd_free_uar() fails because such warning will be printed as part of mlx5_cmd_exec(). Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
aff2252a |
|
31-May-2018 |
Or Gerlitz <ogerlitz@mellanox.com> |
IB/mlx5: Avoid dealing with vport representors if not being e-switch manager In smartnic env, the host (PF) driver might not be an e-switch manager, hence the switchdev mode representors are running on the embedded cpu (EC) and not at the host. As such, we should avoid dealing with vport representors if not being esw manager. Fixes: b5ca15ad7e61 ('IB/mlx5: Add proper representors support') Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
#
d0e84c0a |
|
19-Jun-2018 |
Yishai Hadas <yishaih@mellanox.com> |
IB/mlx5: Add support for drain SQ & RQ This patch follows the logic from ib_core but considers the internal device state upon executing the involved commands. Specifically, Upon internal error state modify QP to an error state can be assumed to be success as each in-progress WR going to be flushed in error in any case as expected by that modify command. In addition, As the drain should never fail the driver makes sure that post_send/recv will succeed even if the device is already in an internal error state. As such once the driver will supply the simulated/SW CQEs the CQE for the drain WR will be handled as well. In case of an internal error state the CQE for the drain WR may be completed as part of the main task that handled the error state or by the task that issued the drain WR. As the above depends on scheduling the code takes the relevant locks and actions to make sure that the completion handler for that WR will always be called after that the post_send/recv were issued but not in parallel to the other task that handles the error flow. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
9f876f3d |
|
21-Jun-2018 |
Talat Batheesh <talatb@mellanox.com> |
IB/mlx5: Support RoCE ICRC encapsulated error counter This patch adds support to query the counter that counts the RoCE packets with corrupted ICRC (Invariant Cyclic Redundancy Code). This counter will be under /sys/class/infiniband/<mlx5-dev>/ports/<port>/hw_counters/ rx_icrc_encapsulated - The number of RoCE packets with ICRC error. Signed-off-by: Talat Batheesh <talatb@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
cfdeb893 |
|
19-Jun-2018 |
Leon Romanovsky <leon@kernel.org> |
RDMA/mlx5: Refactor transport domain checks Put all relevant checks for transport domain in the mlx5_ib_alloc/dealloc_transport_domain functions. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
c59450c4 |
|
17-Jun-2018 |
Yishai Hadas <yishaih@mellanox.com> |
IB/mlx5: Expose DEVX tree Expose DEVX tree to be used by upper layers. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
a8b92ca1 |
|
16-Jun-2018 |
Yishai Hadas <yishaih@mellanox.com> |
IB/mlx5: Introduce DEVX Introduce DEVX to enable direct device commands in downstream patches from this series. In that mode of work the firmware manages the isolation between processes' resources and as such a DEVX user id is created and assigned to the given user context upon allocation request. A capability check is done to make sure that this feature is really supported by the firmware prior to creating the DEVX user id. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
33023fb8 |
|
18-Jun-2018 |
Steve Wise <larrystevenwise@gmail.com> |
IB/core: add max_send_sge and max_recv_sge attributes This patch replaces the ib_device_attr.max_sge with max_send_sge and max_recv_sge. It allows ulps to take advantage of devices that have very different send and recv sge depths. For example cxgb4 has a max_recv_sge of 4, yet a max_send_sge of 16. Splitting out these attributes allows much more efficient use of the SQ for cxgb4 with ulps that use the RDMA_RW API. Consider a large RDMA WRITE that has 16 scattergather entries. With max_sge of 4, the ulp would send 4 WRITE WRs, but with max_sge of 16, it can be done with 1 WRITE WR. Acked-by: Sagi Grimberg <sagi@grimberg.me> Acked-by: Christoph Hellwig <hch@lst.de> Acked-by: Selvin Xavier <selvin.xavier@broadcom.com> Acked-by: Shiraz Saleem <shiraz.saleem@intel.com> Acked-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
47ec3866 |
|
13-Jun-2018 |
Parav Pandit <parav@mellanox.com> |
RDMA: Convert drivers to use sgid_attr instead of sgid_index The core code now ensures that all driver callbacks that receive an rdma_ah_attrs will have a sgid_attr's pointer if there is a GRH present. Drivers can use this pointer instead of calling a query function with sgid_index. This simplifies the drivers and also avoids races where a gid_index lookup may return different data if it is changed. Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
|
#
f4df9a7c |
|
04-Jun-2018 |
Parav Pandit <parav@mellanox.com> |
RDMA: Use GID from the ib_gid_attr during the add_gid() callback Now that ib_gid_attr contains the GID, make use of that in the add_gid() callback functions for the provider drivers to simplify the add_gid() implementations. Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
e31abf76 |
|
06-Jun-2018 |
weiyongjun (A) <weiyongjun1@huawei.com> |
IB/mlx5: Fix return value check in flow_counters_set_data() In case of error, the function mlx5_fc_create() returns ERR_PTR() and never returns NULL. The NULL test in the return value check should be replaced with IS_ERR(). Fixes: 3b3233fbf02e ("IB/mlx5: Add flow counters binding support") Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Acked-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
299eafee |
|
07-Jun-2018 |
Gustavo A. R. Silva <gustavo@embeddedor.com> |
IB/mlx5: Fix memory leak in mlx5_ib_create_flow In case memory resources for *ucmd* were allocated, release them before return. Addresses-Coverity-ID: 1469857 ("Resource leak") Fixes: 3b3233fbf02e ("IB/mlx5: Add flow counters binding support") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
1a1e03dc |
|
31-May-2018 |
Raed Salem <raeds@mellanox.com> |
IB/mlx5: Add counters read support This patch implements the uverbs counters read API, it will use the specific read counters function to the given type to accomplish its task. Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Raed Salem <raeds@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
5e95af5f |
|
31-May-2018 |
Raed Salem <raeds@mellanox.com> |
IB/mlx5: Add flow counters read support Implements the flow counters read wrapper. Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Raed Salem <raeds@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
3b3233fb |
|
31-May-2018 |
Raed Salem <raeds@mellanox.com> |
IB/mlx5: Add flow counters binding support Associates a counters with a flow when IB_FLOW_SPEC_ACTION_COUNT is part of the flow specifications. The counters user space placements of location and description (index, description) pairs are passed as private data of the counters flow specification. Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Raed Salem <raeds@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
b29e2a13 |
|
31-May-2018 |
Raed Salem <raeds@mellanox.com> |
IB/mlx5: Add counters create and destroy support This patch implements the device counters create and destroy APIs and introducing some internal management structures. Downstream patches in this series will add the functionality to support flow counters binding and reading. Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Raed Salem <raeds@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
59082a32 |
|
31-May-2018 |
Matan Barak <matanb@mellanox.com> |
IB/core: Support passing uhw for create_flow This is required when user-space drivers need to pass extra information regarding how to handle this flow steering specification. Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Boris Pismenny <borisp@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
2cb40791 |
|
29-May-2018 |
Leon Romanovsky <leon@kernel.org> |
RDMA/mlx5: Don't check return value of zap_vma_ptes() There is no need to check return value of zap_vma_ptes() because there is nothing to do with this knowledge. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
a0976f41 |
|
28-May-2018 |
Wei Hu(Xavier) <xavier.huwei@huawei.com> |
RDMA/uverbs: Hoist the common process of disassociate_ucontext into ib core This patch hoisted the common process of disassociate_ucontext callback function into ib core code, and these code are common to ervery ib_device driver. Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com> Acked-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
6f1006a4 |
|
27-May-2018 |
Yonatan Cohen <yonatanc@mellanox.com> |
IB/mlx5: Introduce a new mini-CQE format The new mini-CQE format includes the stride index, byte count and packet checksum. Stride index is needed for striding WQ feature. This patch exposes this capability and enables its setting via mlx5 UHW data as part of query device and cq creation. Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Reviewed-by: Guy Levi <guyle@mellanox.com> Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
572f46bf |
|
27-May-2018 |
Yonatan Cohen <yonatanc@mellanox.com> |
IB/mlx5: Refactor CQE compression response Refactor CQE compression response to be fully set only when it`s really supported. There is no change from user perspective because anyway resp.cqe_comp_caps.max_num was set to zero. Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com>W Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
8f062287 |
|
21-May-2018 |
Leon Romanovsky <leon@kernel.org> |
RDMA/mlx5: Remove debug prints of VMA pointers Remove various prints of VMA pointers. Reported-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Michal Kalderon <michal.kalderon@cavium.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
e818e255 |
|
13-May-2018 |
Ariel Levkovich <lariel@mellanox.com> |
IB/mlx5: Expose MPLS related tunneling offloads This patch reports the device's capbilities to offload encapsulated MPLS tunnel protocols to user-space: - Capability to offload MPLS over GRE. - Capability to offload MPLS over UDP. Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Ariel Levkovich <lariel@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
71c6e863 |
|
13-May-2018 |
Ariel Levkovich <lariel@mellanox.com> |
IB/mlx5: Add support for MPLS flow specification This patch introduces support for the MPLS flow spec and allows the creation of rules that are matching on the MPLS label. Applying the rule matching depends on the flow specs order and the location of the MPLS in the spec list as there are different configurations to be made in the device in the cases of MPLSoGRE and MPLSoUDP vs. non-encapsulated MPLS. Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Ariel Levkovich <lariel@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
da2f22ae |
|
13-May-2018 |
Ariel Levkovich <lariel@mellanox.com> |
IB/mlx5: Add support for GRE flow specification This patch introduces support for the GRE flow spec and allowing the creation of rules based on the protocol and key fields that are part of GRE protocol header. Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Ariel Levkovich <lariel@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
37da2a03 |
|
07-May-2018 |
Daria Velikovsky <daria@mellanox.com> |
RDMA/mlx5: Use proper spec flow label type Flow label is defined as u32 in the in ipv6 flow spec, but used internally in the flow specs parsing as u8. That was causing loss of part of flow_label value. Fixes: 2d1e697e9b716 ('IB/mlx5: Add support to match inner packet fields') Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Daria Velikovsky <daria@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
444261ca |
|
23-Apr-2018 |
Leon Romanovsky <leon@kernel.org> |
RDMA/mlx5: Properly check return value of mlx5_get_uars_page Starting from commit 72f36be06138 ("net/mlx5: Fix mlx5_get_uars_page to return error code") the mlx5_get_uars_page() call returns error in case of failure, but it was mistakenly overlooked in the merge commit. Fixes: e7996a9a77fc ("Merge tag v4.15 of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git") Reported-by: Alaa Hleihel <alaa@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
84a6a7a9 |
|
23-Apr-2018 |
Parav Pandit <parav@mellanox.com> |
IB/mlx5: Fix represent correct netdevice in dual port RoCE In commit bcf87f1dbbec ("IB/mlx5: Listen to netdev register/unresiter events in switchdev mode") incorrectly mapped primary device's netdevice to 2nd port netdevice. It always represented primary port's netdevice for 2nd port netdevice when ib representors were not used. This results into failing to process CM request arriving on 2nd port due to incorrect mapping of netdevice. This fix corrects it by considering the right mdev. Cc: <stable@vger.kernel.org> # 4.16 Fixes: bcf87f1dbbec ("IB/mlx5: Listen to netdev register/unresiter events in switchdev mode") Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
6082d9c9 |
|
12-Apr-2018 |
Israel Rukshin <israelr@mellanox.com> |
net/mlx5: Fix mlx5_get_vector_affinity function Adding the vector offset when calling to mlx5_vector2eqn() is wrong. This is because mlx5_vector2eqn() checks if EQ index is equal to vector number and the fact that the internal completion vectors that mlx5 allocates don't get an EQ index. The second problem here is that using effective_affinity_mask gives the same CPU for different vectors. This leads to unmapped queues when calling it from blk_mq_rdma_map_queues(). This doesn't happen when using affinity_hint mask. Fixes: 2572cf57d75a ("mlx5: fix mlx5_get_vector_affinity to start from completion vector 0") Fixes: 05e0cc84e00c ("net/mlx5: Fix get vector affinity helper function") Signed-off-by: Israel Rukshin <israelr@mellanox.com> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
|
#
25d0e2db |
|
14-Apr-2018 |
Zhu Yanjun <yanjun.zhu@oracle.com> |
IB/mlx5: remove duplicate header file The header file fs_helpers.h is included twice. So it should be removed. Fixes: 802c2125689d ("IB/mlx5: Add IPsec support for egress and ingress") CC: Srinivas Eeda <srinivas.eeda@oracle.com> CC: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Acked-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
6c29f57e |
|
05-Apr-2018 |
Ariel Levkovich <lariel@mellanox.com> |
IB/mlx5: Device memory mr registration support Adding mlx5_ib driver implementation for reg_dm_mr callback which allows registering device memory (DM) as an MR for local and remote access. Signed-off-by: Ariel Levkovich <lariel@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
24da0016 |
|
05-Apr-2018 |
Ariel Levkovich <lariel@mellanox.com> |
IB/mlx5: Device memory support in mlx5_ib This patch adds the mlx5_ib driver implementation for the device memory allocation API. It implements the ib_device callbacks for allocation and deallocation operations as well as a new mmap command support which allows mapping an allocated device memory to a VMA. The change also adds reporting of device memory maximum size and alignment parameters reported in device capabilities. The allocation/deallocation operations are using new firmware commands to allocate MEMIC memory on the device. Signed-off-by: Ariel Levkovich <lariel@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
2d93fc85 |
|
28-Mar-2018 |
Matan Barak <matanb@mellanox.com> |
IB/mlx5: Add ability to hash by IPSEC_SPI when creating a TIR When a Raw Ethernet QP is created, we actually create a few objects. One of these objects is a TIR. Currently, a TIR could hash (and spread the traffic) by IP or port only. Adding a hashing by IPSec SPI to TIR creation with the required UAPI bit. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
c03faa56 |
|
28-Mar-2018 |
Matan Barak <matanb@mellanox.com> |
IB/mlx5: Add information for querying IPsec capabilities Users should be able to query for IPSec support. Adding a few capabilities bits as part of the driver specific part in alloc_ucontext: MLX5_USER_ALLOC_UCONTEXT_FLOW_ACTION_FLAGS_ESP_AES_GCM_REQ_METADATA Payload's header is returned with metadata representing the IPSec decryption state. MLX5_USER_ALLOC_UCONTEXT_FLOW_ACTION_FLAGS_ESP_AES_GCM_RX Support ESP_AES_GCM in ingress path. MLX5_USER_ALLOC_UCONTEXT_FLOW_ACTION_FLAGS_ESP_AES_GCM_TX Support ESP_AES_GCM in egress path. MLX5_USER_ALLOC_UCONTEXT_FLOW_ACTION_FLAGS_ESP_AES_GCM_SPI_RSS_ONLY Hardware doesn't support matching SPI in flow steering rules but just hashing and spreading the traffic accordingly. Signed-off-by: Aviad Yehezkel <aviadye@mellanox.com> Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
802c2125 |
|
28-Mar-2018 |
Aviad Yehezkel <aviadye@mellanox.com> |
IB/mlx5: Add IPsec support for egress and ingress This commit introduces support for the esp_aes_gcm flow specification for the Innova device. To that end we add support for egress steering and some validations that an IPsec rule is indeed valid. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Aviad Yehezkel <aviadye@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
349705c1 |
|
28-Mar-2018 |
Matan Barak <matanb@mellanox.com> |
IB/mlx5: Add modify_flow_action_esp verb Adding implementation in mlx5 driver to modify action_xfrm object. This merely call the accel layer. Currently a user can modify only the ESN parameters. Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Aviad Yehezkel <aviadye@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
c6475a0b |
|
28-Mar-2018 |
Aviad Yehezkel <aviadye@mellanox.com> |
IB/mlx5: Add implementation for create and destroy action_xfrm Adding implementation in mlx5 driver to create and destroy action_xfrm object. This merely call the accel layer. A user may pass MLX5_IB_XFRM_FLAGS_REQUIRE_METADATA flag which states that [s]he expects a metadata header to be added to the payload. This header represents information regarding the transformation's state. Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Aviad Yehezkel <aviadye@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
8c84660b |
|
28-Mar-2018 |
Matan Barak <matanb@mellanox.com> |
IB/mlx5: Initialize the parsing tree root without the help of uverbs In order to have a custom parsing tree, a provider driver needs to assign its parsing tree to ib_device specs_tree field. Otherwise, the uverbs client assigns a common default parsing tree for it. In downstream patches, the mlx5_ib driver gains a custom parsing tree, which contains both the common objects and a new flags field for the UVERBS_FLOW_ACTION_ESP_CREATE command. This patch makes mlx5_ib assign its own tree to specs_root, which later on will be extended. Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
414448d2 |
|
01-Apr-2018 |
Parav Pandit <parav@mellanox.com> |
RDMA: Use ib_gid_attr during GID modification Now that ib_gid_attr contains device, port and index, simplify the provider APIs add_gid() and del_gid() to use device, port and index fields from the ib_gid_attr attributes structure. Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
3e44e0ee |
|
01-Apr-2018 |
Parav Pandit <parav@mellanox.com> |
IB/providers: Avoid null netdev check for RoCE Now that IB core GID cache ensures that all RoCE entries have an associated netdev remove null checks from the provider drivers for clarity. Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
32927e28 |
|
20-Mar-2018 |
Mark Bloch <markb@mellanox.com> |
IB/mlx5: Don't clean uninitialized UMR resources In case we failed to create UMR resources, mark them as invalid so we won't try to destroy them on the unwind path. Add the relevant checks to destroy_umrc_res(), this is done for the unlikely event ib_register_device() or create_umr_res() err out and we try to destroy invalid objects. Fixes: 42cea83f9524 ("IB/mlx5: Fix cleanup order on unload") Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
0ede73bc |
|
19-Mar-2018 |
Matan Barak <matanb@mellanox.com> |
IB/uverbs: Extend uverbs_ioctl header with driver_id Extending uverbs_ioctl header with driver_id and another reserved field. driver_id should be used in order to identify the driver. Since every driver could have its own parsing tree, this is necessary for strace support. Downstream patches take off the EXPERIMENTAL flag from the ioctl() IB support and thus we add some reserved fields for future usage. Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
61147f39 |
|
19-Mar-2018 |
Bodong Wang <bodong@mellanox.com> |
IB/mlx5: Packet packing enhancement for RAW QP Enable RAW QP to be able to configure burst control by modify_qp. By using burst control with rate limiting, user can achieve best performance and accuracy. The burst control information is passed by user through udata. This patch also reports burst control capability for mlx5 related hardwares, burst control is only marked as supported when both packet_pacing_burst_bound and packet_pacing_typical_size are supported. Signed-off-by: Bodong Wang <bodong@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
7672ed33 |
|
15-Mar-2018 |
Honggang Li <honli@redhat.com> |
IB/mlx5: Set the default active rate and width to QDR and 4X Before commit f1b65df5a232 ("IB/mlx5: Add support for active_width and active_speed in RoCE"), the mlx5_ib driver set the default active_width and active_speed to IB_WIDTH_4X and IB_SPEED_QDR. When the RoCE port is down, the RoCE port does not negotiate the active width with the remote side, causing the active width to be zero. When running userspace ibstat to view the port status, ibstat will panic as it reads an invalid width from sys file. This patch restores the original behavior. Fixes: f1b65df5a232 ("IB/mlx5: Add support for active_width and active_speed in RoCE"). Signed-off-by: Honggang Li <honli@redhat.com> Reviewed-by: Hal Rosenstock <hal@mellanox.com> Reviewed-by: Noa Osherovich <noaos@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
42cea83f |
|
14-Mar-2018 |
Mark Bloch <markb@mellanox.com> |
IB/mlx5: Fix cleanup order on unload On load we create private CQ/QP/PD in order to be used by UMR, we create those resources after we register ourself as an IB device, and we destroy them after we unregister as an IB device. This was changed by commit 16c1975f1032 ("IB/mlx5: Create profile infrastructure to add and remove stages") which moved the destruction before we unregistration. This allowed to trigger an invalid memory access when unloading mlx5_ib while there are open resources: BUG: unable to handle kernel paging request at 00000001002c012c ... Call Trace: mlx5_ib_post_send_wait+0x75/0x110 [mlx5_ib] __slab_free+0x9a/0x2d0 delay_time_func+0x10/0x10 [mlx5_ib] unreg_umr.isra.15+0x4b/0x50 [mlx5_ib] mlx5_mr_cache_free+0x46/0x150 [mlx5_ib] clean_mr+0xc9/0x190 [mlx5_ib] dereg_mr+0xba/0xf0 [mlx5_ib] ib_dereg_mr+0x13/0x20 [ib_core] remove_commit_idr_uobject+0x16/0x70 [ib_uverbs] uverbs_cleanup_ucontext+0xe8/0x1a0 [ib_uverbs] ib_uverbs_cleanup_ucontext.isra.9+0x19/0x40 [ib_uverbs] ib_uverbs_remove_one+0x162/0x2e0 [ib_uverbs] ib_unregister_device+0xd4/0x190 [ib_core] __mlx5_ib_remove+0x2e/0x40 [mlx5_ib] mlx5_remove_device+0xf5/0x120 [mlx5_core] mlx5_unregister_interface+0x37/0x90 [mlx5_core] mlx5_ib_cleanup+0xc/0x225 [mlx5_ib] SyS_delete_module+0x153/0x230 do_syscall_64+0x62/0x110 entry_SYSCALL_64_after_hwframe+0x21/0x86 ... We restore the original behavior by breaking the UMR stage into two parts, pre and post IB registration stages, this way we can restore the original functionality and maintain clean separation of logic between stages. Fixes: 16c1975f1032 ("IB/mlx5: Create profile infrastructure to add and remove stages") Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
c44ef998 |
|
13-Mar-2018 |
Ilya Lesokhin <ilyal@mellanox.com> |
IB/mlx5: Maintain a single emergency page The mlx5 driver needs to be able to issue invalidation to ODP MRs even if it cannot allocate memory. To this end it preallocates emergency pages to use when the situation arises. This flow should be extremely rare enough, that we don't need to worry about contention and therefore a single emergency page is good enough. Signed-off-by: Ilya Lesokhin <ilyal@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
3346c487 |
|
20-Aug-2017 |
Boris Pismenny <borisp@mellanox.com> |
{net,IB}/mlx5: Add flow steering helpers Add helper functions that check if a protocol is part of a flow steering match criteria. Signed-off-by: Boris Pismenny <borisp@mellanox.com> Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Aviad Yehezkel <aviadye@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
#
a9db0ecf |
|
16-Aug-2017 |
Matan Barak <matanb@mellanox.com> |
{net,IB}/mlx5: Add has_tag to mlx5_flow_act The has_tag member will indicate whether a tag action was specified in flow specification. A flow tag 0 = MLX5_FS_DEFAULT_FLOW_TAG is assumed a valid flow tag that is currently used by mlx5 RDMA driver, whereas in HW flow_tag = 0 means that the user doesn't care about flow_tag. HW always provide a flow_tag = 0 if all flow tags requested on a specific flow are 0. So we need a way (in the driver) to differentiate between a user really requesting flow_tag = 0 and a user who does not care, in order to be able to report conflicting flow tags on a specific flow. Signed-off-by: Matan Barak <matanb@mellanox.com> Reviewed-by: Aviad Yehezkel <aviadye@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
#
075572d4 |
|
16-Aug-2017 |
Boris Pismenny <borisp@mellanox.com> |
IB/mlx5: Pass mlx5_flow_act struct instead of multiple arguments Group and pass all function arguments of parse_flow_attr call in one common struct mlx5_flow_act. This patch passes all the action arguments of parse_flow_attr in one common struct mlx5_flow_act. It allows us to scale the number of actions without adding new arguments to the function. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Boris Pismenny <borisp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Acked-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
c33251a3 |
|
31-Jan-2018 |
Aviad Yehezkel <aviadye@mellanox.com> |
IB/mlx5: Removed not used parameters Signed-off-by: Aviad Yehezkel <aviadye@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Acked-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
210b1f78 |
|
05-Mar-2018 |
Mark Bloch <markb@mellanox.com> |
IB/mlx5: When not in dual port RoCE mode, use provided port as native The series that introduced dual port RoCE mode assumed that we don't have a dual port HCA that use the mlx5 driver, this is not the case for Connect-IB HCAs. This reasoning led to assigning 1 as the native port index which causes issue when the second port is used. For example query_pkey() when called on the second port will return values of the first port. Make sure that we assign the right port index as the native port index. Fixes: 32f69e4be269 ("{net, IB}/mlx5: Manage port association for multiport RoCE") Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
aba46213 |
|
25-Feb-2018 |
Daniel Jurgens <danielj@mellanox.com> |
{net, IB}/mlx5: Raise fatal IB event when sys error occurs All other mlx5_events report the port number as 1 based, which is how FW reports it in the port event EQE. Reporting 0 for this event causes mlx5_ib to not raise a fatal event notification to registered clients due to a seemingly invalid port. All switch cases in mlx5_ib_event that go through the port check are supposed to set the port now, so just do it once at variable declaration. Fixes: 89d44f0a6c73("net/mlx5_core: Add pci error handlers to mlx5_core driver") Reviewed-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
b5ca15ad |
|
23-Jan-2018 |
Mark Bloch <markb@mellanox.com> |
IB/mlx5: Add proper representors support This commit adds full support for IB representor: 1) Representors profile, We add two new profiles: nic_rep_profile - This profile will be used to create an IB device that represents the PF/UPLINK. rep_profile - This profile will be used to create an IB device that represents VFs. Each VF will be its own representor. 2) Proper load/unload callbacks, Those are called by the E-Switch when moving to/from switchdev mode. 3) Different flow DB handling for when we in switchdev mode. Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
#
8e6efa3a |
|
05-Nov-2017 |
Mark Bloch <markb@mellanox.com> |
IB/mlx5: When in switchdev mode, expose only raw packet capabilities Currently in switchdev mode we allow only for raw packet QPs. Expose the right capabilities and set the gid table length to 0, also make sure we don't try to enable RoCE, so split the function to enable RoCE so representors can enable only the notifier needed for net device events. Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
#
bcf87f1d |
|
16-Jan-2018 |
Mark Bloch <markb@mellanox.com> |
IB/mlx5: Listen to netdev register/unresiter events in switchdev mode Currently we listen to netdev register/unregister event based on PCI device. When in switchdev mode PF and representors share the same PCI device, so in order to pair ib device and netdev in switchdev mode compare the netdev that triggered the event to that of the representor. Expose a function that lets you receive the netdev associated what a given representor. Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
#
018a94ee |
|
16-Jan-2018 |
Mark Bloch <markb@mellanox.com> |
IB/mlx5: Add match on vport when in switchdev mode When we point to a representor, it means we are in switchdev mode. The flow db is shared between PF and virtual function representors so each rule created needs to have a match on its specific source port. Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
#
9a4ca38d |
|
16-Jan-2018 |
Mark Bloch <markb@mellanox.com> |
IB/mlx5: Allocate flow DB only on PF IB device A flow DB is a shared resource between PF and representors, need to allocate it only when creating the PF IB device. Once we add IB representors, they will use the flow db which was created by the PF. Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
#
fc385b7a |
|
16-Jan-2018 |
Mark Bloch <markb@mellanox.com> |
IB/mlx5: Add basic regiser/unregister representors code Create the basic infrastructure of registering and unregistering IB representors. The load/unload callbacks are left empty and proper implementation will be introduced in following patches. Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
#
10bea9c8 |
|
19-Jan-2018 |
Leon Romanovsky <leon@kernel.org> |
RDMA/mlx5: Remove redundant allocation warning print The kmalloc() failure to allocate memory generates enough information and doesn't need to be accompanied by another driver print. Fixes: d69a24e03659 ("IB/mlx5: Move IB event processing onto a workqueue") Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
5c99eaec |
|
16-Jan-2018 |
Feras Daoud <ferasda@mellanox.com> |
IB/mlx5: Mmap the HCA's clock info to user-space This patch maps the new page to user space applications to allow converting a user space completion timestamp to system wall time at the lowest possible latency cost. By using a versioning scheme we allow compatibility between current and future userspace libraries. The change moves mlx5_ib_mmap_cmd enum from mlx5_ib.h to the abi header file mlx5-abi.h. Reviewed-by: Alex Vesker <valex@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Feras Daoud <ferasda@mellanox.com> Signed-off-by: Eitan Rabin <rabin@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
72f36be0 |
|
20-Nov-2017 |
Eran Ben Elisha <eranbe@mellanox.com> |
net/mlx5: Fix mlx5_get_uars_page to return error code Change mlx5_get_uars_page to return ERR_PTR in case of allocation failure. Change all callers accordingly to check the IS_ERR(ptr) instead of NULL. Fixes: 59211bd3b632 ("net/mlx5: Split the load/unload flow into hardware and software flows") Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
#
8978cc92 |
|
09-Jan-2018 |
Eran Ben Elisha <eranbe@mellanox.com> |
{net,ib}/mlx5: Don't disable local loopback multicast traffic when needed There are systems platform information management interfaces (such as HOST2BMC) for which we cannot disable local loopback multicast traffic. Separate disable_local_lb_mc and disable_local_lb_uc capability bits so driver will not disable multicast loopback traffic if not supported. (It is expected that Firmware will not set disable_local_lb_mc if HOST2BMC is running for example.) Function mlx5_nic_vport_update_local_lb will do best effort to disable/enable UC/MC loopback traffic and return success only in case it succeeded to changed all allowed by Firmware. Adapt mlx5_ib and mlx5e to support the new cap bits. Fixes: 2c43c5a036be ("net/mlx5e: Enable local loopback in loopback selftest") Fixes: c85023e153e3 ("IB/mlx5: Add raw ethernet local loopback support") Fixes: bded747bb432 ("net/mlx5: Add raw ethernet local loopback firmware command") Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Cc: kernel-team@fb.com Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
#
da005f9f |
|
09-Jan-2018 |
Colin Ian King <colin.king@canonical.com> |
IB/mlx5: remove redundant assignment of mdev The initial assignment to mdev is redundant as mdev is re-assigned later and the first assigned value is never read. Remove this redundant assignment. Cleans up clang warning: drivers/infiniband/hw/mlx5/main.c:359:24: warning: Value stored to 'mdev' during its initialization is never read Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
85c7c014 |
|
04-Jan-2018 |
Daniel Jurgens <danielj@mellanox.com> |
IB/mlx5: Don't advertise RAW QP support in dual port mode When operating in dual port RoCE mode FW doesn't support steering for raw QPs on the slave port. They still work on the master port, but the user has no way of knowing which port is the master. The capability is reported per device, not per port. Signed-off-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
cfe4e37f |
|
04-Jan-2018 |
Daniel Jurgens <danielj@mellanox.com> |
{net, IB}/mlx5: Change set_roce_gid to take a port number When in dual port mode setting a RoCE GID for any port flows through the master ports mlx5_core_dev. Provide an interface to set the port when sending this command. Signed-off-by: Daniel Jurgens <danielj@mellanox.com> Reviewed-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
aac4492e |
|
04-Jan-2018 |
Daniel Jurgens <danielj@mellanox.com> |
IB/mlx5: Update counter implementation for dual port RoCE Update the counter interface for multiple ports. Some counter sets always comes from the primary device. Port specific counters should be accessed per mlx5_core_dev not always through the IB master mdev. Signed-off-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
a9e546e7 |
|
04-Jan-2018 |
Parav Pandit <parav@mellanox.com> |
IB/mlx5: Change debugfs to have per port contents When there are multiple ports for single IB(RoCE) device, support debugfs entries to be available for each port. Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
b3cbd6f0 |
|
04-Jan-2018 |
Daniel Jurgens <danielj@mellanox.com> |
IB/mlx5: Implement dual port functionality in query routines Port operations must be routed to their native mlx5_core_dev. A multiport RoCE device registers itself as having 2 ports even before a 2nd port is affiliated. If an unaffilated port is queried use capability information from the master port, these values are the same. Signed-off-by: Daniel Jurgens <danielj@mellanox.com> Reviewed-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
d69a24e0 |
|
04-Jan-2018 |
Daniel Jurgens <danielj@mellanox.com> |
IB/mlx5: Move IB event processing onto a workqueue Because mlx5_ib_event can be called from atomic context move event handling onto a workqueue. A mutex lock is required to get the IB device for slave ports, so move event processing onto a work queue. When an IB event is received, check if the mlx5_core_dev is a slave port, if so attempt to get the IB device it's affiliated with. If found process the event for that device, otherwise return. Signed-off-by: Daniel Jurgens <danielj@mellanox.com> Reviewed-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
32f69e4b |
|
04-Jan-2018 |
Daniel Jurgens <danielj@mellanox.com> |
{net, IB}/mlx5: Manage port association for multiport RoCE When mlx5_ib_add is called determine if the mlx5 core device being added is capable of dual port RoCE operation. If it is, determine whether it is a master device or a slave device using the num_vhca_ports and affiliate_nic_vport_criteria capabilities. If the device is a slave, attempt to find a master device to affiliate it with. Devices that can be affiliated will share a system image guid. If none are found place it on a list of unaffiliated ports. If a master is found bind the port to it by configuring the port affiliation in the NIC vport context. Similarly when mlx5_ib_remove is called determine the port type. If it's a slave port, unaffiliate it from the master device, otherwise just remove it from the unaffiliated port list. The IB device is registered as a multiport device, even if a 2nd port is not available for affiliation. When the 2nd port is affiliated later the GID cache must be refreshed in order to get the default GIDs for the 2nd port in the cache. Export roce_rescan_device to provide a mechanism to refresh the cache after a new port is bound. In a multiport configuration all IB object (QP, MR, PD, etc) related commands should flow through the master mlx5_core_dev, other commands must be sent to the slave port mlx5_core_mdev, an interface is provide to get the correct mdev for non IB object commands. Signed-off-by: Daniel Jurgens <danielj@mellanox.com> Reviewed-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
7fd8aefb |
|
04-Jan-2018 |
Daniel Jurgens <danielj@mellanox.com> |
IB/mlx5: Make netdev notifications multiport capable When multiple RoCE ports are supported registration for events on multiple netdevs is required. Refactor the event registration and handling to support multiple ports. Signed-off-by: Daniel Jurgens <danielj@mellanox.com> Reviewed-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
508562d6 |
|
04-Jan-2018 |
Daniel Jurgens <danielj@mellanox.com> |
IB/mlx5: Reduce the use of num_port capability Remove use of the num_ports general capability throughout. The number of ports will be variable in the future, and reported in a different way. Signed-off-by: Daniel Jurgens <danielj@mellanox.com> Reviewed-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
776a3906 |
|
02-Jan-2018 |
Moni Shoua <monis@mellanox.com> |
IB/mlx5: Add support for DC target QP A DC Target (DCT) QP is represented in the hardware as a unique object. This object is created by CREATE_DCT command and destroyed by DESTROY_DCT command. However, in the driver we describe it as a QP. The hardware command that creates a DCT needs parameters that the verb create_qp() does not provide. Those remaining parameters are provided with the call to the verb modify_qp(). Therefore we delay the actual creation of a DCT in the hardware until the stage of modify_qp() to RTR. A support for query_qp() was added as well. It uses QUERY_DCT command to retrieve the applicable fields. Signed-off-by: Moni Shoua <monis@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
3cc297db |
|
01-Jan-2018 |
Mark Bloch <markb@mellanox.com> |
IB/mlx5: Move locks initialization to the corresponding stage Unconditional locks/list and ODP srcu initialization should be done in the INIT stage. Remove those from the CAPS stage and move them to the proper stage. Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
c8b89924 |
|
01-Jan-2018 |
Mark Bloch <markb@mellanox.com> |
IB/mlx5: Move loopback initialization to the corresponding stage The loopback stage only initializes a lock, move it to be in the CAPS initialization phase and get rid loopback step completely. Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
5e1e7612 |
|
01-Jan-2018 |
Mark Bloch <markb@mellanox.com> |
IB/mlx5: Move hardware counters initialization to the corresponding stage Now that we have a stage just for hardware counters, move all relevant initialization logic into one place. Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
07321b3c |
|
01-Jan-2018 |
Mark Bloch <markb@mellanox.com> |
IB/mlx5: Move ODP initialization to the corresponding stage Now that we have a stage just for ODP, move all relevant initialization logic into one place. Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
c11a226a |
|
01-Jan-2018 |
Mark Bloch <markb@mellanox.com> |
IB/mlx5: Move RoCE/ETH initialization to the corresponding stage Now that we have a stage just for RoCE/ETH, move all relevant initialization logic into one place. Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
16c1975f |
|
01-Jan-2018 |
Mark Bloch <markb@mellanox.com> |
IB/mlx5: Create profile infrastructure to add and remove stages Today we have single function which is used when we add an IB interface, break this function into multiple functions. Create stages and a generic mechanism to execute each stage. This is in preparation for RDMA/IB representors which might not need all stages or will do things differently in some of the stages. This patch doesn't change any functionality. Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
4ed131d0 |
|
24-Dec-2017 |
Yishai Hadas <yishaih@mellanox.com> |
IB/mlx5: Expose dynamic mmap allocation This patch exposes the option to dynamic allocates a UAR, this functionality will be used in downstream patch in this series as part of QP creation. Specifically, the user space driver asks for a UAR allocation in a given page index, upon success this UAR and its bfregs can be used as part of QP creation by the user space driver. To enable allocating more than 256 UARs the page index is encoded in an extra one byte just after the command byte. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
31a78a5a |
|
24-Dec-2017 |
Yishai Hadas <yishaih@mellanox.com> |
IB/mlx5: Extend UAR stuff to support dynamic allocation This patch extends the alloc context flow to be prepared for working with dynamic UAR allocations. Currently upon alloc context there is some fix size of UARs that are allocated (named 'static allocation') and there is no option to user application to ask for more or control which UAR will be used by which QP. In this patch the driver prepares its data structures to manage both the static and the dynamic allocations and let the user driver knows about the max value of dynamic blue-flame registers that are allowed. Downstream patches from this series will enable the dynamic allocation and the association as part of QP creation. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
4e2b53a5 |
|
24-Dec-2017 |
Maor Gottlieb <maorg@mellanox.com> |
IB/mlx5: Report inner RSS capability Add missing inner RSS support capability as part of the RSS supported fields. In addition change MLX5_RX_HASH_INNER to 1UL << 31 in order to define it as unsigned. Fixes: 309fa3470fca ("IB/mlx5: Add support for RSS on the inner packet") Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
ad9a3668 |
|
24-Dec-2017 |
Majd Dibbiny <majd@mellanox.com> |
IB/mlx5: Serialize access to the VMA list User-space applications can do mmap and munmap directly at any time. Since the VMA list is not protected with a mutex, concurrent accesses to the VMA list from the mmap and munmap can cause data corruption. Add a mutex around the list. Cc: <stable@vger.kernel.org> # v4.7 Fixes: 7c2344c3bbf9 ("IB/mlx5: Implements disassociate_ucontext API") Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
72c7fe90 |
|
06-Dec-2017 |
Pravin Shedge <pravin.shedge4linux@gmail.com> |
drivers: infiniband: remove duplicate includes These duplicate includes have been found with scripts/checkincludes.pl but they have been removed manually to avoid removing false positives. Signed-off-by: Pravin Shedge <pravin.shedge4linux@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
71a0ff65 |
|
21-Dec-2017 |
Majd Dibbiny <majd@mellanox.com> |
IB/mlx5: Fix congestion counters in LAG mode Congestion counters are counted and queried per physical function. When working in LAG mode, CNP packets can be sent or received on both of the functions, thus congestion counters should be aggregated from the two physical functions. Fixes: e1f24a79f424 ("IB/mlx5: Support congestion related counters") Signed-off-by: Majd Dibbiny <majd@mellanox.com> Reviewed-by: Aviv Heller <avivh@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
87ab3f52 |
|
13-Nov-2017 |
Yonatan Cohen <yonatanc@mellanox.com> |
IB/mlx5: Add CQ moderation capability to query_device query_device can now obtain the maximum values for cq_max_count and cq_period, needed for cq moderation. Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com> Reviewed-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
b0e9df6d |
|
13-Nov-2017 |
Yonatan Cohen <yonatanc@mellanox.com> |
IB/mlx5: Exposing modify CQ callback to uverbs layer Exposed mlx5_ib_modify_cq to be called from ib device verb list. Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com> Reviewed-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
31fde034 |
|
30-Oct-2017 |
Majd Dibbiny <majd@mellanox.com> |
IB/mlx5: Assign send CQ and recv CQ of UMR QP The UMR's QP is created by calling mlx5_ib_create_qp directly, and therefore the send CQ and the recv CQ on the ibqp weren't assigned. Assign them right after calling the mlx5_ib_create_qp to assure that any access to those pointers will work as expected and won't crash the system as might happen as part of reset flow. Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters") Signed-off-by: Majd Dibbiny <majd@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
b1383aa6 |
|
29-Oct-2017 |
Noa Osherovich <noaos@mellanox.com> |
IB/mlx5: Add PCI write end padding support Add the PCI write end padding flag to device_cap_flags enum and set it during mlx5_ib_query_device so it will be reported to user-space. During WQ/QP creation, set that capability for WQ/QP if user requested it and HW supports it. PCI write end padding modification is not supported for now. There's no such flag for a QP but for a WQ, create and modify use the same flag. Return an error if PCI write end padding flag is set during modify_wq. Signed-off-by: Noa Osherovich <noaos@mellanox.com> Reviewed-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
f95ef6cb |
|
18-Oct-2017 |
Maor Gottlieb <maorg@mellanox.com> |
IB/mlx5: Add tunneling offloads support The device can support receive Stateless Offloads for the inner packet's fields only when the packet is processed by TIR which is enabled to support tunneling. Otherwise, the device treats the packet as an ordinary non-tunneling packet and receive offloads can be done only for the outer packet's field. In order to enable receive Stateless Offloading support for incoming tunneling traffic the TIR should be created with tunneled_offload_en. Tunneling offloads is supported only be raw ethernet QP. This patch includes: * New QP creation flag for tunneling offloads. * Reports device capabilities. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
7a0c8f42 |
|
18-Oct-2017 |
Guy Levi <guyle@mellanox.com> |
IB/mlx5: Support padded 128B CQE feature In some benchmarks and some CPU architectures, writing the CQE on a full cache line size improves performance by saving memory access operations (read-modify-write) relative to partial cache line change. This patch lets the user to configure the device to pad the CQE up to 128B in case its content is less than 128B. Currently the driver supports only padding for a CQE size of 128B. Signed-off-by: Guy Levi <guyle@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
de57f2ad |
|
18-Oct-2017 |
Guy Levi <guyle@mellanox.com> |
IB/mlx5: Support 128B CQE compression feature In commit 1cbe6fc86ccf ("IB/mlx5: Add support for CQE compressing") the concept of CQE compression was introduced and added a support for 64B CQE size. This change update the code to support 128B CQE size as well. Signed-off-by: Guy Levi <guyle@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
b4f34597 |
|
17-Oct-2017 |
Noa Osherovich <noaos@mellanox.com> |
IB/mlx5: Expose multi-packet RQ capabilities This patch reports the device's striding RQ capabilities to the user-space: - min/max_single_stride_log_num_of_bytes: Log of min/max number of bytes in a single stride. - min/max_single_wqe_log_num_of_strides: Log of min/max number of strides in a single WQE. - supported_qpts: A bit mask to know which QP types support multi- packet RQ, for now only Raw Packet QPs. Signed-off-by: Noa Osherovich <noaos@mellanox.com> Reviewed-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
e19cd282 |
|
01-Oct-2017 |
Parav Pandit <parav@mellanox.com> |
IB/mlx5: Fix label order in error path handling When UAR get_page fails, it needs to continue to cleanup debugfs for congestion control parameters. Labels for error path were incorrectly ordered. This patch fixes to do correct cleanup on debugfs init failure and uar get page failure. Fixes: 4a2da0b8c078 ("IB/mlx5: Add debug control parameters for congestion control") Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
78b1beb0 |
|
24-Sep-2017 |
Leon Romanovsky <leon@kernel.org> |
IB/core: Fix typo in the name of the tag-matching cap struct The tag matching functionality is implemented by mlx5 driver by extending XRQ, however this internal kernel information was exposed to user space applications with *xrq* name instead of *tm*. This patch renames *xrq* to *tm* to handle that. Fixes: 8d50505ada72 ("IB/uverbs: Expose XRQ capabilities") Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
cbafad87 |
|
17-Sep-2017 |
Sudip Mukherjee <sudipm.mukherjee@gmail.com> |
IB/mlx5: fix debugfs cleanup If delay_drop_debugfs_init() fails in any of the operations to create debugfs, it is calling delay_drop_debugfs_cleanup() as part of its cleanup. But delay_drop_debugfs_cleanup() checks for 'dbg' and since we have not yet pointed 'dbg' to the debugfs we need to cleanup, the cleanup fails and we are left with stray debugfs elements and also a memory leak. Fixes: 4a5fd5d2965c ("IB/mlx5: Add necessary delay drop assignment") Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com> Acked-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
eb761894 |
|
17-Aug-2017 |
Artemy Kovalyov <artemyko@mellanox.com> |
IB/mlx5: Fill XRQ capabilities Provide driver specific values for XRQ capabilities. Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com> Reviewed-by: Yossi Itigin <yosefe@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
1a56ff6d |
|
17-Aug-2017 |
Artemy Kovalyov <artemyko@mellanox.com> |
IB/core: Separate CQ handle in SRQ context Before this change CQ attached to SRQ was part of XRC specific extension. Moving CQ handle out makes it available to other types extending SRQ functionality. Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com> Reviewed-by: Yossi Itigin <yosefe@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
050da902 |
|
17-Aug-2017 |
Bodong Wang <bodong@mellanox.com> |
IB/mlx5: Report mlx5 enhanced multi packet WQE capability Expose enhanced multi packet WQE capability to user space through query_device by uhw. Signed-off-by: Bodong Wang <bodong@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
795b609c |
|
17-Aug-2017 |
Bodong Wang <bodong@mellanox.com> |
IB/mlx5: Allow posting multi packet send WQEs if hardware supports Set the field to allow posting multi packet send WQEs if hardware supports this feature. This doesn't mean the send WQEs will be for multi packet unless the send WQE was prepared according to multi packet send WQE format. User space shall use flag MLX5_IB_ALLOW_MPW to check if hardware supports MPW and allows MPW in SQ context. Signed-off-by: Bodong Wang <bodong@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
a550ddfc |
|
17-Aug-2017 |
Yishai Hadas <yishaih@mellanox.com> |
IB/mlx5: Add support for multi underlay QP Set underlay QPN as part of flow rule when it's applicable. There is one root flow table in the NIC RX namespace and all the underlay QPs steer the traffic to this flow table. In order to prevent QP to get traffic which is not target to its underlay QP, we need to set the underlay QP number as part of the steering matching. Note: When multicast traffic is sent the QPN filtering is done by the firmware as some early step. Adding the QPN match on the flow table entry is wrong as by that time the target QPN holds the multicast address (e.g. FF(s)) and it won't match. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
96dc3fc5 |
|
17-Aug-2017 |
Noa Osherovich <noaos@mellanox.com> |
IB/mlx5: Expose software parsing for Raw Ethernet QP Software parsing (SWP) is a feature that can be used to instruct the device to stop using its internal parser and to parse packets on the transmit path according to offsets set for each packets. Through this feature, the device allows the handling of checksum and LSO by the hardware according to the location of IP and TCP/UDP headers. Enable SW parsing on Raw Ethernet send queue by default if firmware supports it and report these capabilities to user space. Signed-off-by: Noa Osherovich <noaos@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
84305d71 |
|
17-Aug-2017 |
Leon Romanovsky <leon@kernel.org> |
RDMA/mlx5: Limit scope of get vector affinity local function The mlx5_ib_get_vector_affinity() call is local to main.c file and there is no need to be declared globally visible. Fixes: 40b24403f33e ("mlx5: support ->get_vector_affinity") Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
4a5fd5d2 |
|
17-Aug-2017 |
Maor Gottlieb <maorg@mellanox.com> |
IB/mlx5: Add necessary delay drop assignment Assign the statistics and configuration structure pointer on success. Fixes: fe248c3a5837 ('IB/mlx5: Add delay drop configuration and statistics') Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
ec255879 |
|
22-Aug-2017 |
Majd Dibbiny <majd@mellanox.com> |
IB/mlx5: Always return success for RoCE modify port CM layer calls ib_modify_port() regardless of the link layer. For the Ethernet ports, qkey violation and Port capabilities are meaningless. Therefore, always return success for ib_modify_port calls on the Ethernet ports. Cc: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Majd Dibbiny <majd@mellanox.com> Reviewed-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
e093111d |
|
27-Jun-2017 |
Amrani, Ram <Ram.Amrani@cavium.com> |
IB/core: Fix input len in multiple user verbs Most user verbs pass user data to the kernel with the inclusion of the ib_uverbs_cmd_hdr structure. This is problematic because the vendor has no ideas if the verb was called by a legacy verb or an extended verb. Also, the incosistency between the verbs is confusing. Fixes: 565197dd8fb1 ("IB/core: Extend ib_uverbs_create_cq") Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com> Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
9abb0d1b |
|
27-Jun-2017 |
Leon Romanovsky <leon@kernel.org> |
RDMA: Simplify get firmware interface There is a need to forward FW version to user space application through RDMA netlink. In order to make it safe, there is need to declare nla_policy and limit the size of FW string. The new define IB_FW_VERSION_NAME_MAX will limit the size of FW version string. That define was chosen to be equal to ETHTOOL_FWVERS_LEN, because many drivers anyway are limited by that value indirectly. The introduction of this define allows us to remove the string size from get_fw_str function signature. Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
|
#
40b24403 |
|
13-Jul-2017 |
Sagi Grimberg <sagi@grimberg.me> |
mlx5: support ->get_vector_affinity Simply refer to the generic affinity mask helper. Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
9c2d33d4 |
|
27-Jun-2017 |
Colin Ian King <colin.king@canonical.com> |
net/mlx5: fix spelling mistake: "alloated" -> "allocated" Trivial fix to spelling mistake in mlx5_ib_dbg message Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
58dcb60a |
|
18-Jun-2017 |
Parav Pandit <parav@mellanox.com> |
IB/mlx5: Expose extended error counters This patch adds below requester and responder side error counters, which will be exposed by hardware counters interface and are supported as part of query Q counters command extension. +---------------------------+-------------------------------------+ | Name | Description | |---------------------------+-------------------------------------| |resp_local_length_error | Number of times responder detected | | | local length errors | |---------------------------+-------------------------------------| |resp_cqe_error | Number of CQEs completed with error | | | at responder | |---------------------------+-------------------------------------| |req_cqe_error | Number of CQEs completed with error | | | at requester | |---------------------------+-------------------------------------| |req_remote_invalid_request | Number of times requester detected | | | remote invalid request error | |---------------------------+-------------------------------------| |req_remote_access_error | Number of times requester detected | | | remote access error | |---------------------------+-------------------------------------| |resp_remote_access_error | Number of times responder detected | | | remote access error | |---------------------------+-------------------------------------| |resp_cqe_flush_error | Number of CQEs completed with | | | flushed with error at responder | |---------------------------+-------------------------------------| |req_cqe_flush_error | Number of CQEs completed with | | | flushed with error at requester | +---------------------------+-------------------------------------+ Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Reviewed-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
1d54f890 |
|
08-Jun-2017 |
Yishai Hadas <yishaih@mellanox.com> |
IB/mlx5: Report RX checksum capabilities for IPoIB Report RX checksum capabilities when 'ipoib_enhanced_offloads' capabilities are set and support checksum. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
81e30880 |
|
08-Jun-2017 |
Yishai Hadas <yishaih@mellanox.com> |
IB/mlx5: Add multicast flow steering support for underlay QP In order to add multicast flow steering support, there is need to block the attaching of mcg flow for underlay QP, recognize multicast IB_FLOW_SPEC_IPV4 based on its IP and enable creating/destroying flow for IB layer. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
fe248c3a |
|
30-May-2017 |
Maor Gottlieb <maorg@mellanox.com> |
IB/mlx5: Add delay drop configuration and statistics Add debugfs interface for monitor the number of delay drop timeout events and the number of existing dropless RQs in the system. In addition add debugfs interface for configuring the global timeout value which is used in the SET_DELAY_DROP command. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
03404e8a |
|
30-May-2017 |
Maor Gottlieb <maorg@mellanox.com> |
IB/mlx5: Add support to dropless RQ RQs that were configured for "delay drop" will prevent packet drops when their WQEs are depleted. Marking an RQ to be drop-less is done by setting delay_drop_en in RQ context using CREATE_RQ command. Since this feature is globally activated/deactivated by using the SET_DELAY_DROP command on all the marked RQs, we activated/deactivated it according to the number of RQs with 'delay_drop' enabled. When timeout is expired, then the feature is deactivated. Therefore the driver handles the delay drop timeout event and reactivate it. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
4a2da0b8 |
|
30-May-2017 |
Parav Pandit <parav@mellanox.com> |
IB/mlx5: Add debug control parameters for congestion control This patch adds debug control parameters for congestion control which can be read or written through debugfs. They are for reaction point and notification point nodes. These control parameters are as below: +------------------------------+-----------------------------------------+ | Name | Description | |------------------------------+-----------------------------------------| |rp_clamp_tgt_rate | When set target rate is updated to | | | current rate | |------------------------------+-----------------------------------------| |rp_clamp_tgt_rate_ati | When set update target rate based on | | | timer as well | |------------------------------+-----------------------------------------| |rp_time_reset | time between rate increase if no | | | CNP is received unit in usec | |------------------------------+-----------------------------------------| |rp_byte_reset | Number of bytes between rate inease if | | | no CNP is received | |------------------------------+-----------------------------------------| |rp_threshold | Threshold for reaction point rate | | | control | |------------------------------+-----------------------------------------| |rp_ai_rate | Rate for target rate, unit in Mbps | |------------------------------+-----------------------------------------| |rp_hai_rate | Rate for hyper increase state | | | unit in Mbps | |------------------------------+-----------------------------------------| |rp_min_dec_fac | Minimum factor by which the current | | | transmit rate can be changed when | | | processing a CNP, unit is percerntage | |------------------------------+-----------------------------------------| |rp_min_rate | Minimum value for rate limit, | | | unit in Mbps | |------------------------------+-----------------------------------------| |rp_rate_to_set_on_first_cnp | Rate that is set when first CNP is | | | received, unit is Mbps | |------------------------------+-----------------------------------------| |rp_dce_tcp_g | Used to calculate alpha | |------------------------------+-----------------------------------------| |rp_dce_tcp_rtt | Time between updates of alpha value, | | | unit is usec | |------------------------------+-----------------------------------------| |rp_rate_reduce_monitor_period | Minimum time between consecutive rate | | | reductions | |------------------------------+-----------------------------------------| |rp_initial_alpha_value | Initial value of alpha | |------------------------------+-----------------------------------------| |rp_gd | When CNP is received, flow rate is | | | reduced based on gd, rp_gd is given as | | | log2(rp_gd) | |------------------------------+-----------------------------------------| |np_cnp_dscp | dscp code point for generated cnp | |------------------------------+-----------------------------------------| |np_cnp_prio_mode | 802.1p priority for generated cnp | |------------------------------+-----------------------------------------| |np_cnp_prio | cnp priority mode | +------------------------------+-----------------------------------------+ Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Reviewed-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
fd65f1b8 |
|
30-May-2017 |
Moni Shoua <monis@mellanox.com> |
IB/mlx5: Change logic for dispatching IB events for port state The old logic ignored link state. This led to missing IB events like when link goes down on the switch while admin state is up or to redundant events like when admin state goes up while link is down. To fix that, probe the port state on NETDEV events and compare to last known state to decide if IB events needs to be dispatched. FIxes: 5ec8c83e3ad3 ("IB/mlx5: Port events in RoCE now rely on netdev events") Signed-off-by: Moni Shoua <monis@mellanox.com> Reviewed-by: Noa Osherovich <noaos@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
c85023e1 |
|
30-May-2017 |
Huy Nguyen <huyn@mellanox.com> |
IB/mlx5: Add raw ethernet local loopback support Currently, unicast/multicast loopback raw ethernet (non-RDMA) packets are sent back to the vport. A unicast loopback packet is the packet with destination MAC address the same as the source MAC address. For multicast, the destination MAC address is in the vport's multicast filter list. Moreover, the local loopback is not needed if there is one or none user space context. After this patch, the raw ethernet unicast and multicast local loopback are disabled by default. When there is more than one user space context, the local loopback is enabled. Note that when local loopback is disabled, raw ethernet packets are not looped back to the vport and are forwarded to the next routing level (eswitch, or multihost switch, or out to the wire depending on the configuration). Signed-off-by: Huy Nguyen <huyn@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
e1267b01 |
|
25-Jun-2017 |
Leon Romanovsky <leon@kernel.org> |
RDMA: Remove useless MODULE_VERSION All modules in drivers/infiniband defined and used MODULE_VERSION, which was pointless because the kernel version describes their state more accurate then those arbitrary numbers. Signed-off-by: Leon Romanovsky <leon@kernel.org> Acked-by: Sagi Grimbrg <sagi@grimberg.me> Reviewed-by: Sagi Grimberg <sagi@grimbeg.me> Acked-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Acked-by: Selvin Xavier <selvin.xavier@broadcom.com> Acked-by: Ram Amrani <Ram.Amrani@cavium.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Acked-by: Adit Ranadive <aditr@vmware.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
8e959601 |
|
30-Jun-2017 |
Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> |
IB/core, opa_vnic, hfi1, mlx5: Properly free rdma_netdev IPOIB is calling free_rdma_netdev even though alloc_rdma_netdev has returned -EOPNOTSUPP. Move free_rdma_netdev from ib_device structure to rdma_netdev structure thus ensuring proper cleanup function is called for the rdma net device. Fix the following trace: ib0: Failed to modify QP to ERROR state BUG: unable to handle kernel paging request at 0000000000001d20 IP: hfi1_vnic_free_rn+0x26/0xb0 [hfi1] Call Trace: ipoib_remove_one+0xbe/0x160 [ib_ipoib] ib_unregister_device+0xd0/0x170 [ib_core] rvt_unregister_device+0x29/0x90 [rdmavt] hfi1_unregister_ib_device+0x1a/0x100 [hfi1] remove_one+0x4b/0x220 [hfi1] pci_device_remove+0x39/0xc0 device_release_driver_internal+0x141/0x200 driver_detach+0x3f/0x80 bus_remove_driver+0x55/0xd0 driver_unregister+0x2c/0x50 pci_unregister_driver+0x2a/0xa0 hfi1_mod_cleanup+0x10/0xf65 [hfi1] SyS_delete_module+0x171/0x250 do_syscall_64+0x67/0x150 entry_SYSCALL64_slow_path+0x25/0x25 Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
095b0927 |
|
14-May-2017 |
Ilan Tayari <ilant@mellanox.com> |
IB/mlx5: Respect mlx5_core reserved GIDs Reserved gids are taken by the mlx5_core, report smaller GID table size to IB core. Set mlx5_query_roce_port's return value back to int. In case of error, return an indication. This rolls back some of the change in commit 50f22fd8ecf9 ("IB/mlx5: Set mlx5_query_roce_port's return value to void") Change set_roce_addr to use gid_set function, instead of directly sending the command. Signed-off-by: Ilan Tayari <ilant@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
#
bd10838a |
|
28-May-2017 |
Or Gerlitz <ogerlitz@mellanox.com> |
net/mlx5: Fix some spelling mistakes Fixed few places where endianness was misspelled and one spot whwere output was: CHECK: 'endianess' may be misspelled - perhaps 'endianness'? CHECK: 'ouput' may be misspelled - perhaps 'output'? Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
#
022d038a |
|
14-Jun-2017 |
Alex Vesker <valex@mellanox.com> |
IB/ipoib: Limit call to free rdma_netdev for capable devices Limit calls to free_rdma_netdev() for capable devices only. Fixes: cd565b4b51e5 ('IB/IPoIB: Support acceleration options callbacks') Signed-off-by: Alex Vesker <valex@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
6e8484c5 |
|
28-May-2017 |
Max Gurtovoy <maxg@mellanox.com> |
RDMA/mlx5: set UMR wqe fence according to HCA cap Cache the needed umr_fence and set the wqe ctrl segmennt accordingly. Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Acked-by: Leon Romanovsky <leon@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
b359911d |
|
22-Feb-2017 |
Tariq Toukan <tariqt@mellanox.com> |
IB/mlx5: Bump driver version Remove date and bump version for mlx5_ib driver. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
#
1b9a07ee |
|
10-May-2017 |
Leon Romanovsky <leon@kernel.org> |
{net, IB}/mlx5: Replace mlx5_vzalloc with kvzalloc Commit a7c3e901a46f ("mm: introduce kv[mz]alloc helpers") added proper implementation of mlx5_vzalloc function to the MM core. This made the mlx5_vzalloc function useless, so let's remove it. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
#
693dfd5a |
|
27-Apr-2017 |
Erez Shitrit <erezsh@mellanox.com> |
IB/mlx5: Enable IPoIB acceleration Enable mlx5 IPoIB acceleration by declaring mlx5_ib_{alloc,free}_rdma_netdev and assigning the mlx5 IPoIB rdma_netdev callbacks. In addition, this patch brings in sync mlx5's IPoIB parts for net and IB trees. As a precaution, we disabled IPoIB acceleration by default (in the mlx5_core Kconfig file). Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
f1b65df5 |
|
20-Apr-2017 |
Noa Osherovich <noaos@mellanox.com> |
IB/mlx5: Add support for active_width and active_speed in RoCE Add missing calculation and translation of active_width and active_speed for RoCE. Fixes: 3f89a643eb295 ('IB/mlx5: Extend query_device/port to ...') Signed-off-by: Noa Osherovich <noaos@mellanox.com> Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
50f22fd8 |
|
20-Apr-2017 |
Noa Osherovich <noaos@mellanox.com> |
IB/mlx5: Set mlx5_query_roce_port's return value to void In case of an error, the properties reported to user are zeroed out, so no need for a return value. Signed-off-by: Noa Osherovich <noaos@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
e1f24a79 |
|
15-Apr-2017 |
Parav Pandit <parav@mellanox.com> |
IB/mlx5: Support congestion related counters This patch adds support to query the congestion related hardware counters through new command and links them with other hw counters being available in hw_counters sysfs location. In order to reuse existing infrastructure it renames related q_counter data structures to more generic counters to reflect q_counters and congestion counters and maybe some other counters in the future. New hardware counters: * rp_cnp_handled - CNP packets handled by the reaction point * rp_cnp_ignored - CNP packets ignored by the reaction point * np_cnp_sent - CNP packets sent by notification point to respond to CE marked RoCE packets * np_ecn_marked_roce_packets - CE marked RoCE packets received by notification point It also avoids returning ENOSYS which is specific for invalid system call and produces the following checkpatch.pl warning. WARNING: ENOSYS means 'invalid syscall nr' and nothing else + return -ENOSYS; Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
a22ed86c |
|
03-Apr-2017 |
Slava Shwartsman <slavash@mellanox.com> |
IB/mlx5: Add drop flow steering rule support A drop rule is described by an action drop and no destination. If a user specified IB_FLOW_SPEC_ACTION_DROP then set the action to MLX5_FLOW_CONTEXT_ACTION_DROP and clear the destination. Signed-off-by: Slava Shwartsman <slavash@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
19cc7524 |
|
03-Apr-2017 |
Ariel Levkovich <lariel@mellanox.com> |
IB/mlx5: Use IP version matching to classify IP traffic This change adds the ability for flow steering to classify IPv4/6 packets with MPLS tag (Ethertype 0x8847 and 0x8848) as standard IP packets and hit IPv4/6 classifed steering rules. When user added a flow rule with IP classification, driver was implicitly adding ethertype matching to the created rule in order to distinguish between IPv4 and IPv6 protocols. Since IP packets with MPLS tag header have MPLS ethertype, they missed the rule and ended up hitting the default filters. Such behavior prevented from MPLS packets to undergo inbound traffic load balancing flows (if such were defined by configuring RSS) to achieve higher throughput - the way that non-MPLS IP packets performed. Since our device is able to look past the MPLS tag and identify the next protocol we introduce this solution which replaces Ethertype matching by the device's capability to perform IP version parsing and matching in order to distinguish between IPv4 and IPv6. Therefore, whenever a flow with IP spec is added and device support IP version matching, driver will implicitly add IP version matching to the rule (Based on the IP spec type) without Ethertype matching which will cause relevant MPLS tagged packets to hit this rule as well. Otherwise (device doesn't support IP version matching), we fall back to setting Ethertype matching. If the user's filters specify an L2 ethertype and an IP spec the rule will then match both the ethertype and the IP version. The device's support for IP version matching is reported by the device via dedicated capability bit in query_device_cap and named outer/inner_ip_version. Signed-off-by: Ariel Levkovich <lariel@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
0f750966 |
|
03-Apr-2017 |
Ariel Levkovich <lariel@mellanox.com> |
IB/mlx5: Add inner spec and IPv6 validation in user's flow attribute list This change fixes an incomplete validation of the user's flow attributes list. Previous implementation validated only matching of IPv4 Ethertype to IPv4 spec of outer headers (in case both Ethernet with specified Ethertype and IP specs were present) and lacked the validation of: 1. Matching of IPv6 Ethertype in Ethernet spec (if such exists) to an IPv6 protocol spec (if such exists). 2. Validation of Ethertype to IP protocol matching on inner headers specs. Which could cause some combinations of unmatching Ethernet and IP protocols to pass validation and apply on the device. The fix adds validation of IPv6 Ethertype and IP spec as well as performing the scan on both outer and inner attributes. Fixes: 038d2ef87572 ("Add flow steering support") Signed-off-by: Ariel Levkovich <lariel@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
00b7c2ab |
|
28-Mar-2017 |
Maor Gottlieb <maorg@mellanox.com> |
IB/mlx5: Enlarge autogroup flow table In order to enlarge the flow group size to 8k, we decrease the number of flow group types to 6 and increase the flow table size to 64k. Flow group size is calculated as follow: group_size = table_size / (#group_types + 1) Fixes: 038d2ef87572 ('IB/mlx5: Add flow steering support') Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
dac388ef |
|
28-Mar-2017 |
Maor Gottlieb <maorg@mellanox.com> |
IB/mlx5: Check supported flow table size Check that the required flow table size is supported by device. Return ENOMEM error if no space left. In addition change the create flow table routine to return ENOMEM instead of ENOSPC. Fixes: 038d2ef87572 ('IB/mlx5: Add flow steering support') Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
13776612 |
|
28-Mar-2017 |
Maor Gottlieb <maorg@mellanox.com> |
IB/mlx5: Change vma from shared to private Anonymous VMA (->vm_ops == NULL) cannot be shared, otherwise it would lead to SIGBUS. Remove the shared flags from the vma after we change it to be anonymous. This is easily reproduced by doing modprobe -r while running a user-space application such as raw_ethernet_bw. Fixes: 7c2344c3bbf97 ('IB/mlx5: Implements disassociate_ucontext API') Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
ecc7d83b |
|
28-Mar-2017 |
Maor Gottlieb <maorg@mellanox.com> |
IB/mlx5: Take write semaphore when changing the vma struct When the driver disassociate user context, it changes the vma to anonymous by setting the vm_ops to null and zap the vma ptes. In order to avoid race in the kernel, we need to take write lock before we change the vma entries. Fixes: 7c2344c3bbf97 ('IB/mlx5: Implements disassociate_ucontext API') Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
0881e7bd |
|
05-Feb-2017 |
Ingo Molnar <mingo@kernel.org> |
sched/headers: Prepare to move the get_task_struct()/put_task_struct() and related APIs from <linux/sched.h> to <linux/sched/task.h> But first update usage sites with the new header dependency. Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
#
6e84f315 |
|
08-Feb-2017 |
Ingo Molnar <mingo@kernel.org> |
sched/headers: Prepare for new header dependencies before moving code to <linux/sched/mm.h> We are going to split <linux/sched/mm.h> out of <linux/sched.h>, which will have to be picked up from other headers and a couple of .c files. Create a trivial placeholder <linux/sched/mm.h> file that just maps to <linux/sched.h> to make this patch obviously correct and bisectable. The APIs that are going to be moved first are: mm_alloc() __mmdrop() mmdrop() mmdrop_async_fn() mmdrop_async() mmget_not_zero() mmput() mmput_async() get_task_mm() mm_access() mm_release() Include the new header in the files that are going to need it. Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
#
cdbe33d0 |
|
13-Feb-2017 |
Eli Cohen <eli@mellanox.com> |
IB/mlx5: Fix configuration of port capabilities When the "ib_virt" cap is set, configuration of port capabilities need to be done through mlx5_core_modify_hca_vport_context. Since modify_hca_vport_context accepts mask and value, there is no need to read the port capabilities and calculate the new cap values so we avoid the mutex when ib_virt is set. Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
c4550c63 |
|
24-Jan-2017 |
Or Gerlitz <ogerlitz@mellanox.com> |
IB: Query ports via the core instead of direct into the driver Change the drivers to call ib_query_port in their get port immutable handler instead of their own query port handler. Doing this required to set the core cap flags of this device before the ib_query_port call is made, since the IB core might need these caps to serve the port query. Drivers are ensured by the IB core that the port attributes passed to the port query verb implementation are zero, and hence we removed the zeroing from the drivers. This patch doesn't add any new functionality. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Acked-by: Adit Ranadive <aditr@vmware.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
72cd5717 |
|
24-Jan-2017 |
Or Gerlitz <ogerlitz@mellanox.com> |
IB/mlx5: Support raw packet protocol Mark support for the new raw packet protocol on Eth ports. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
81713d37 |
|
18-Jan-2017 |
Artemy Kovalyov <artemyko@mellanox.com> |
IB/mlx5: Add implicit MR support Add implicit MR, covering entire user address space. The MR is implemented as an indirect KSM MR consisting of 1GB direct MRs. Pages and direct MRs are added/removed to MR by ODP. Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
e8161334 |
|
18-Jan-2017 |
Noa Osherovich <noaos@mellanox.com> |
IB/mlx5: Expose vlan offloads capabilities Check device's capabilities and report which raw packet capabilities are supported. Signed-off-by: Noa Osherovich <noaos@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
7c16f477 |
|
18-Jan-2017 |
Kamal Heib <kamalh@mellanox.com> |
IB/mlx5: Expose Q counters groups only if they are supported by FW This patch modify the Q counters implementation, so each one of the three Q counters groups will be exposed by the driver only if they are supported by the firmware. Signed-off-by: Kamal Heib <kamalh@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
1ffd3a26 |
|
18-Jan-2017 |
Leon Romanovsky <leon@kernel.org> |
IB/mlx5: Replace ENOTSUPP usage with EOPNOTSUPP Flow steering is supposed to return EOPNOTSUPP error for unsupported fields and not ENOTSUPP error. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
2ac693f9 |
|
18-Jan-2017 |
Moses Reuben <mosesr@mellanox.com> |
IB/mlx5: Add flow tag support Set flow tag in flow table entry, when IB_FLOW_SPEC_ACTION_TAG is part of the flow specifications. Flow tag doesn't support multicast flows, so it's passing to hardware only when used. Signed-off-by: Moses Reuben <mosesr@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
5abb0da9 |
|
18-Jan-2017 |
Leon Romanovsky <leon@kernel.org> |
IB/mlx5: Remove deprecated module parameter Commit 9603b61de1ee ("mlx5: Move pci device handling from mlx5_ib to mlx5_core") moved prof_sel module parameter from mlx5_ib to mlx5_core and marked it as deprecated in 2014. Three years after deprecation, it is time to remove the deprecated module parameter. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Jack Morgenstein <jackm@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
ed88451e |
|
18-Jan-2017 |
Majd Dibbiny <majd@mellanox.com> |
IB/mlx5: Assign DSCP for R-RoCE QPs Address Path For Routable RoCE QPs, the DSCP should be set in the QP's address path. The DSCP's value is derived from the traffic class. Fixes: 2811ba51b049 ("IB/mlx5: Add RoCE fields to Address Vector") Cc: Achiad Shochat <achiad@mellanox.com> Signed-off-by: Majd Dibbiny <majd@mellanox.com> Reviewed-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
c43f1112 |
|
18-Jan-2017 |
Maor Gottlieb <maorg@mellanox.com> |
IB/mlx5: Add additional checks before processing MADs Check the has_smi bit in vport context and class version of MADs before allowing MADs processing to take place. MAD_IFC SMI commands can be executed only if smi bit is set. Fixes: e126ba97dba9 ('mlx5: Add driver for Mellanox Connect-IB adapters') Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Parvi Kaustubhi <parvik@mellanox.com> Reviewed-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
45bded2c |
|
18-Jan-2017 |
Kamal Heib <kamalh@mellanox.com> |
IB/mlx5: Verify that Q counters are supported Make sure that the Q counters are supported by the FW before trying to allocate/deallocte them, this will avoid driver load failure when they aren't supported by the FW. Fixes: 0837e86a7a34 ('IB/mlx5: Add per port counters') Cc: <stable@vger.kernel.org> # v4.7+ Signed-off-by: Kamal Heib <kamalh@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
78984898 |
|
30-Nov-2016 |
Or Gerlitz <ogerlitz@mellanox.com> |
IB/mlx5: Enable Eth VFs to query their min-inline value for user-space For some mlx5 HW models (CX4, CX4Lx), the VF driver needs to put part of the packet headers on the TX descriptor so the e-switch can do proper matching and steering. This is called "min-inline", it's advertized to the VF by the FW and also enforced on them by the HW, such that if they don't obey, their packets are dropped. SRIOV VF libmlx5 instances should take into account the min-inline value of their vports. For that end, we provide this value through the vendor response part of init_ucontext command. The min inline value is reported in a way which will let newer libmlx5 instances realize that they are running over an older kernel and act accordingly (e.g apply some educated guess). Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
#
9b0c289e |
|
20-Jan-2017 |
Bart Van Assche <bvanassche@acm.org> |
IB/mlx5: Switch from dma_device to dev.parent Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Matan Barak <matanb@mellanox.com> Cc: Leon Romanovsky <leonro@mellanox.com> Acked-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
10543365 |
|
09-Oct-2016 |
Mohamad Haj Yahia <mohamad@mellanox.com> |
net/mlx5: Add support to s-tag in mlx5 firmware interface Add svlan_tag and rename vlan_tag to cvlan_tag in flow table entry match param. Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
|
#
30aa60b3 |
|
03-Jan-2017 |
Eli Cohen <eli@mellanox.com> |
IB/mlx5: Support 4k UAR for libmlx5 Add fields to structs to convey to kernel an indication whether the library supports multi UARs per page and return to the library the size of a UAR based on the queried value. Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
#
b037c29a |
|
03-Jan-2017 |
Eli Cohen <eli@mellanox.com> |
IB/mlx5: Allow future extension of libmlx5 input data Current check requests that new fields in struct mlx5_ib_alloc_ucontext_req_v2 that are not known to the driver be zero. This was introduced so new libraries passing additional information to the kernel through struct mlx5_ib_alloc_ucontext_req_v2 will be notified by old kernels that do not support their request by failing the operation. This schecme is problematic since it requires libmlx5 to issue the requests with descending input size for struct mlx5_ib_alloc_ucontext_req_v2. To avoid this, we require that new features that will obey the following rules: If the feature requires one or more fields in the response and the at least one of the fields can be encoded such that a zero value means the kernel ignored the request then this field will provide the indication to the library. If no response is required or if zero is a valid response, a new field should be added that indicates to the library whether its request was processed. Fixes: b368d7cb8ceb ('IB/mlx5: Add hca_core_clock_offset to udata in init_ucontext') Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
#
5fe9dec0 |
|
03-Jan-2017 |
Eli Cohen <eli@mellanox.com> |
IB/mlx5: Use blue flame register allocator in mlx5_ib Make use of the blue flame registers allocator at mlx5_ib. Since blue flame was not really supported we remove all the code that is related to blue flame and we let all consumers to use the same blue flame register. Once blue flame is supported we will add the code. As part of this patch we also move the definition of struct mlx5_bf to mlx5_ib.h as it is only used by mlx5_ib. Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
#
2f5ff264 |
|
03-Jan-2017 |
Eli Cohen <eli@mellanox.com> |
mlx5: Fix naming convention with respect to UARs This establishes a solid naming conventions for UARs. A UAR (User Access Region) can have size identical to a system page or can be fixed 4KB depending on a value queried by firmware. Each UAR always has 4 blue flame register which are used to post doorbell to send queue. In addition, a UAR has section used for posting doorbells to CQs or EQs. In this patch we change names to reflect this conventions. Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
#
de8d6e02 |
|
03-Jan-2017 |
Eli Cohen <eli@mellanox.com> |
IB/mlx5: Fix kernel to user leak prevention logic The logic was broken as it failed to update the response length for architectures with PAGE_SIZE larger than 4kB. As a result further extension of the ucontext response struct would fail. Fixes: d69e3bcf7976 ('IB/mlx5: Mmap the HCA's core clock register to user-space') Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
#
d9aaed83 |
|
02-Jan-2017 |
Artemy Kovalyov <artemyko@mellanox.com> |
{net,IB}/mlx5: Refactor page fault handling * Update page fault event according to last specification. * Separate code path for page fault EQ, completion EQ and async EQ. * Move page fault handling work queue from mlx5_ib static variable into mlx5_core page fault EQ. * Allocate memory to store ODP event dynamically as the events arrive, since in atomic context - use mempool. * Make mlx5_ib page fault handler run in process context. Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
7d0cc6ed |
|
02-Jan-2017 |
Artemy Kovalyov <artemyko@mellanox.com> |
IB/mlx5: Add MR cache for large UMR regions In this change we turn mlx5_ib_update_mtt() into generic mlx5_ib_update_xlt() to perfrom HCA translation table modifiactions supporting both atomic and process contexts and not limited by number of modified entries. Using this function we increase preallocated MRs up to 16GB. Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
9f885201 |
|
02-Jan-2017 |
Leon Romanovsky <leon@kernel.org> |
IB/mlx5: Reorder code in query device command The order of features exposed by private mlx5-abi.h file is CQE zipping, packet pacing and multi-packet WQE. The internal order implemented in mlx5_ib_query_device() is multi-packet WQE, CQE zipping and packet pacing. Such difference hurts code readability, so let's sync, while mlx5-abi.h (exposed to userspace) is the primary order. This commit doesn't change any functionality. Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
626bc02d |
|
05-Dec-2016 |
Bart Van Assche <bvanassche@acm.org> |
mlx5: Use { } instead of { 0 } to init struct Detected by sparse. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Eli Cohen <eli@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
7d29f349 |
|
01-Dec-2016 |
Bodong Wang <bodong@mellanox.com> |
IB/mlx5: Properly adjust rate limit on QP state transitions - Add MODIFY_QP_EX CMD to extend modify_qp. - Rate limit will be updated in the following state transactions: RTR2RTS, RTS2RTS. The limit will be removed when SQ is in RST and ERR state. Signed-off-by: Bodong Wang <bodong@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
d949167d |
|
01-Dec-2016 |
Bodong Wang <bodong@mellanox.com> |
IB/mlx5: Report mlx5 packet pacing capabilities when querying device Enable mlx5 based hardware to report packet pacing capabilities from kernel to user space. Packet pacing allows to limit the rate to any number between the maximum and minimum, based on user settings. The capabilities are exposed to user space through query_device by uhw. The following capabilities are reported: 1. The maximum and minimum rate limit in kbps supported by packet pacing. 2. Bitmap showing which QP types are supported by packet pacing operation. Signed-off-by: Bodong Wang <bodong@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
ca5b91d6 |
|
27-Nov-2016 |
Or Gerlitz <ogerlitz@mellanox.com> |
IB/mlx5: Support RAW Ethernet when RoCE is disabled On some environments, such as certain SRIOV VF configurations, RoCE is not supported for mlx5 Ethernet ports. Currently, the driver will not open IB device on that port. This is problematic, since we do want user-space RAW Ethernet (RAW_PACKET QPs) functionality to remain in place. For that end, enhance the relevant driver flows such that we do create a device instance in that case. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
45f95acd |
|
27-Nov-2016 |
Or Gerlitz <ogerlitz@mellanox.com> |
IB/mlx5: Rename RoCE related helpers to reflect being Eth ones This is a pre-step towards having mlx5 IB device also over Eth ports where RoCE is not supported. We change the roce enable/disable and roce_lag init/fini function names to have _eth instead of _roce. This patch doesn't change any functionality. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
d012f5d6 |
|
27-Nov-2016 |
Or Gerlitz <ogerlitz@mellanox.com> |
IB/mlx5: Refactor registration to netdev notifier Refactor the netdev notifier registration into a small helper function. This is a pre-step towards having mlx5 IB device over an Ethernet port which doesn't support RoCE. Also, renamed the de-registration helper and the new helper as netdev notifier and not roce, to make it clear this is not only used with roce. This patch doesn't change any functionality. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
41c450fd |
|
22-Nov-2016 |
Moni Shoua <monis@mellanox.com> |
IB/mlx5: Make create/destroy_ah available to userspace Advertise that create_ah and destroy_ah verbs are accessible from uverbs interface. Signed-off-by: Moni Shoua <monis@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
6ad279c5 |
|
22-Nov-2016 |
Moni Shoua <monis@mellanox.com> |
IB/mlx5: Report that device has udata response in create_ah To make mlx5 user driver aware of whether kernel driver returns dmac in user data response add a new flag that will be returned back to user-space through alloc_ucontext. Signed-off-by: Moni Shoua <monis@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
2d1e697e |
|
14-Nov-2016 |
Moses Reuben <mosesr@mellanox.com> |
IB/mlx5: Add support to match inner packet fields Add support to match packet fields which are tunneled, i.e. support matching the header of the inner packet which is the result of or bit operation of the original header and the IB_FLOW_SPEC_INNER type. The combination of IB_FLOW_SPEC_INNER | IB_FLOW_SPEC_VXLAN_TUNNEL is not needed to be checked, because the IB core has this check already. Signed-off-by: Moses Reuben <mosesr@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
ffb30d8f |
|
14-Nov-2016 |
Moses Reuben <mosesr@mellanox.com> |
IB/mlx5: Support Vxlan tunneling specification Add support to receive specific Vxlan packet in ConnectX-4. Signed-off-by: Moses Reuben <mosesr@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
7e43a2a5 |
|
30-Oct-2016 |
Bodong Wang <bodong@mellanox.com> |
IB/mlx5: Report mlx5 CQE compression caps during query The capabilities include: - Max number of compressed and aggregated CQEs in a single session, while zero means unsupported. - For Responder, there are two formats of mini CQE: mini CQE with Rx hash and mini CQE with checksum. They're mutual exclusive. Signed-off-by: Bodong Wang <bodong@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
191ded4a |
|
30-Oct-2016 |
Bodong Wang <bodong@mellanox.com> |
IB/mlx5: Report mlx5 multi packet WQE caps during query The capabilities whether hardware support multi packet WQE or not is exposed to user space through query_device by uhw. Signed-off-by: Bodong Wang <bodong@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
288c01b7 |
|
27-Oct-2016 |
Eli Cohen <eli@mellanox.com> |
IB/mlx5: Fix reported max SGE calculation Add the 512 bytes limit of RDMA READ and the size of remote address to the max SGE calculation. Fixes: e126ba97dba9 ('mlx5: Add driver for Mellanox Connect-IB adapters') Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
86695a65 |
|
27-Oct-2016 |
Maor Gottlieb <maorg@mellanox.com> |
IB/mlx5: Put non zero value in max_ah We put INT_MAX since this is the max value that can be held. Though there is no hardware limitation, this is practically a large enough number so we can use it. Fixes: e126ba97dba9 ('mlx5: Add driver for Mellanox Connect-IB adapters') Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
dbaaff2a |
|
27-Oct-2016 |
Eli Cohen <eli@mellanox.com> |
IB/mlx5: Fix fatal error dispatching When an internal error condition is detected, make sure to set the device inactive after dispatching the event so ULPs can get a notification of this event. Fixes: e126ba97dba9 ('mlx5: Add driver for Mellanox Connect-IB adapters') Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Mohamad Haj Yahia <mohamad@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
90be7c8a |
|
27-Oct-2016 |
Majd Dibbiny <majd@mellanox.com> |
IB/mlx5: Fix memory leak in query device We need to free dev->port when we fail to enable RoCE or initialize node data. Fixes: 0837e86a7a34 ('IB/mlx5: Add per port counters') Signed-off-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
66958ed9 |
|
07-Nov-2016 |
Hadar Hen Zion <hadarh@mellanox.com> |
net/mlx5: Support encap id when setting new steering entry In order to support steering rules which add encapsulation headers, encap_id parameter is needed. Add new mlx5_flow_act struct which holds action related parameter: action, flow_tag and encap_id. Use mlx5_flow_act struct when adding a new steering rule. This patch doesn't change any functionality. Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
c9f1b073 |
|
07-Nov-2016 |
Hadar Hen Zion <hadarh@mellanox.com> |
net/mlx5: Add creation flags when adding new flow table When creating flow tables, allow the caller to specify creation flags. Currently no flags are used and as such this patch doesn't add any new functionality. Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
74491de9 |
|
31-Aug-2016 |
Mark Bloch <markb@mellanox.com> |
net/mlx5: Add multi dest support Currently when calling mlx5_add_flow_rule we accept only one flow destination, this commit allows to pass multiple destinations. This change forces us to change the return structure to a more flexible one. We introduce a flow handle (struct mlx5_flow_handle), it holds internally the number for rules created and holds an array where each cell points the to a flow rule. From the consumers (of mlx5_add_flow_rule) point of view this change is only cosmetic and requires only to change the type of the returned value they store. From the core point of view, we now need to use a loop when allocating and deleting rules (e.g given to us a flow handler). Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>
|
#
bdc37924 |
|
29-Sep-2016 |
Saeed Mahameed <saeedm@mellanox.com> |
IB/mlx5: Skip handling unknown events Do not dispatch unknown mlx5 core events on mlx5_ib_event. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>
|
#
b47bd6ea |
|
25-Oct-2016 |
Daniel Jurgens <danielj@mellanox.com> |
{net, ib}/mlx5: Make cache line size determination at runtime. ARM 64B cache line systems have L1_CACHE_BYTES set to 128. cache_line_size() will return the correct size. Fixes: cf50b5efa2fe('net/mlx5_core/ib: New device capabilities handling.') Signed-off-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
3085e29e |
|
22-Sep-2016 |
Leon Romanovsky <leon@kernel.org> |
IB/mlx5: Move and decouple user vendor structures This patch decouples and moves vendors specific structures to common UAPI folder which will be visible to all consumers. These structures are used by user-space library driver (libmlx5) and currently manually copied to that library. This move will allow cross-compile against these files and simplify introduction of vendor specific data. Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
bd99fdea |
|
25-Aug-2016 |
Yuval Shaia <yuval.shaia@oracle.com> |
IB/{core,hw}: Add constant for node_desc Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
4babcf97 |
|
18-Sep-2016 |
Aviv Heller <avivh@mellanox.com> |
IB/mlx5: Set unique device name on LAG IB bond device name is now 'mlx5_bond_X', instead of 'mlx5_X'. Signed-off-by: Aviv Heller <avivh@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
88621dfe |
|
18-Sep-2016 |
Aviv Heller <avivh@mellanox.com> |
IB/mlx5: Port status track LAG master, when LAG is active When LAG is active, port up/down events should be triggered by tracking the LAG master, and not one of the two slave netdevs. In the same manner, ib_query_port() should return the details of the LAG master. Signed-off-by: Aviv Heller <avivh@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
9ef9c640 |
|
18-Sep-2016 |
Aviv Heller <avivh@mellanox.com> |
IB/mlx5: Merge vports flow steering during LAG This is done in two steps: 1) Issuing CREATE_VPORT_LAG in order to have Ethernet traffic from both ports arriving on PF0 root flowtable, so we will be able to catch all raw-eth traffic on PF0. 2) Creation of LAG demux flowtable in order to direct all non-raw-eth traffic back to its source port, assuring that normal Ethernet traffic "jumps" to the root flowtable of its RX port (non-LAG behavior). Signed-off-by: Aviv Heller <avivh@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
5ec8c83e |
|
18-Sep-2016 |
Aviv Heller <avivh@mellanox.com> |
IB/mlx5: Port events in RoCE now rely on netdev events Since ib_query_port() in RoCE returns the state of its netdev as the port state, it makes sense to propagate the port up/down events to ib_core when the netdev port state changes, instead of relying on traditional core events. This also keeps both the event and ib_query_port() synchronized. Signed-off-by: Aviv Heller <avivh@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
466fa6d2 |
|
30-Aug-2016 |
Maor Gottlieb <maorg@mellanox.com> |
IB/mlx5: Add support of more IPv6 fields to flow steering Add support to receive Traffic Class, specific IPv6 protocol or IPv6 flow label. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
ca0d4753 |
|
30-Aug-2016 |
Maor Gottlieb <maorg@mellanox.com> |
IB/mlx5: Add support in TOS and protocol to flow steering Add support to receive TOS or specific IPv4 protocol. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
c47ac6ae |
|
30-Aug-2016 |
Maor Gottlieb <maorg@mellanox.com> |
IB/mlx5: Add validation to flow specifications parsing Add validation check that all set fields in flow specification are supported by vendor. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
cc0e5d42 |
|
28-Aug-2016 |
Maor Gottlieb <maorg@mellanox.com> |
IB/mlx5: Add sniffer support to steering Add support to create sniffer rule. This rule receive all incoming and outgoing packets from the port. A user could create such rule by using IB_FLOW_ATTR_SNIFFER type. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
d9d4980a |
|
28-Aug-2016 |
Maor Gottlieb <maorg@mellanox.com> |
IB/mlx5: Increase flow table reference count in create rule Move the reference count increasing of flow table to be in create_flow_rule, it will increase the reference count for each rule creation and not for each flow. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
dd063d0e |
|
28-Aug-2016 |
Maor Gottlieb <maorg@mellanox.com> |
IB/mlx5: Fix coverity warning Fix covertiy warning of passing "&flow_attr" to function "create_flow_rule" which uses it as an array. In addition pass flow attributes argument as const. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
5497adc6 |
|
28-Aug-2016 |
Maor Gottlieb <maorg@mellanox.com> |
IB/mlx5: Save flow table priority handler instead of index Saving the flow table priority object's pointer in the flow handle is necessary for downstream patches since the sniffer flow table isn't placed at the standard flow_db structure but in a different database. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
7055a294 |
|
28-Aug-2016 |
Maor Gottlieb <maorg@mellanox.com> |
IB/mlx5: Fix steering resource leak Fix multicast flow rule leak on adding unicast rule failure. Fixes: 038d2ef87572 ('IB/mlx5: Add flow steering support') Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
31f69a82 |
|
28-Aug-2016 |
Yishai Hadas <yishaih@mellanox.com> |
IB/mlx5: Expose RSS related capabilities Expose RSS related capabilities on both IB and vendor channels. In addition to the IB capabilities the driver reports some extra capabilities on its vendor channel: - Bit mask of the supported types of hash functions. - Bit mask of the supported RX fields that can participate in the RX hashing. Those capabilities are applicable only when the link layer is Ethernet. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
ed082d36 |
|
04-Sep-2016 |
Christoph Hellwig <hch@lst.de> |
IB/core: add support to create a unsafe global rkey to ib_create_pd Instead of exposing ib_get_dma_mr to ULPs and letting them use it more or less unchecked, this moves the capability of creating a global rkey into the RDMA core, where it can be easily audited. It also prints a warning everytime this feature is used as well. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
ee3da804 |
|
12-Sep-2016 |
Maor Gottlieb <maorg@mellanox.com> |
IB/mlx5: Set source mac address in FTE Set the source mac address in the FTE when L2 specification is provided. Fixes: 038d2ef87572 ('IB/mlx5: Add flow steering support') Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
7fae6655 |
|
12-Sep-2016 |
Noa Osherovich <noaos@mellanox.com> |
IB/mlx5: Enable MAD_IFC commands for IB ports only MAD_IFC command is supported only for physical functions (PF) and when physical port is IB. The proposed fix enforces it. Fixes: d603c809ef91 ("IB/mlx5: Fix decision on using MAD_IFC") Reported-by: David Chang <dchang@suse.com> Signed-off-by: Noa Osherovich <noaos@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
d9f88e5a |
|
28-Aug-2016 |
Yishai Hadas <yishaih@mellanox.com> |
IB/mlx5: Use TIR number based on selector Use TIR number based on selector, it should be done to differentiate between RSS QP to RAW one. Reported-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Tested-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
82d200cc |
|
23-Aug-2016 |
Chris Wilson <chris@chris-wilson.co.uk> |
IB/mlx5: Remove superfluous include of io-mapping.h This file does not use any structs or functions defined by io-mapping.h (nor does it directly use iomap, ioremap, iounamp or friends). Remove it to simplify verification of changes to io-mapping.h The include existed since its inception in commit e126ba97dba9edeb6fafa3665b5f8497fc9cdf8c Author: Eli Cohen <eli@mellanox.com> Date: Sun Jul 7 17:25:49 2013 +0300 mlx5: Add driver for Mellanox Connect-IB adapters which looks like a copy across from the Mellanox ethernet driver. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Eli Cohen <eli@mellanox.com> Cc: Jack Morgenstein <jackm@dev.mellanox.co.il> Cc: Or Gerlitz <ogerlitz@mellanox.com> Cc: Matan Barak <matanb@mellanox.com> Cc: Leon Romanovsky <leonro@mellanox.com> Cc: Doug Ledford <dledford@redhat.com> Cc: Sean Hefty <sean.hefty@intel.com> Cc: Hal Rosenstock <hal.rosenstock@gmail.com> Cc: linux-rdma@vger.kernel.org Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Laurence Oberman <loberman@redhat.com> Tested-by: Laurence Oberman <loberman@redhat.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
d5beb7f2 |
|
02-Jun-2016 |
Noa Osherovich <noaos@mellanox.com> |
net/mlx5: Separate query_port_proto_oper for IB and EN Replaced mlx5_query_port_proto_oper with separate functions per link type. The functions should take different arguments so no point in trying to unite them. Signed-off-by: Noa Osherovich <noaos@mellanox.com> Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>
|
#
c4f287c4 |
|
19-Jul-2016 |
Saeed Mahameed <saeedm@mellanox.com> |
net/mlx5: Unify and improve command interface Now as all commands use mlx5 ifc interface, instead of doing two calls for executing a command we embed command status checking into mlx5_cmd_exec to simplify the interface. Also we do here some cleanup for redundant software structures (inbox/outbox) and functions and improved command failure output. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>
|
#
61961500 |
|
12-Jul-2016 |
Wei Yongjun <yongjun_wei@trendmicro.com.cn> |
IB/mlx5: Fix duplicate const warning Fixes the following sparse warning: drivers/infiniband/hw/mlx5/main.c:2574:25: warning: duplicate const Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Acked-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
c5bb1730 |
|
04-Jul-2016 |
Maor Gottlieb <maorg@mellanox.com> |
net/mlx5: Refactor mlx5_add_flow_rule Reduce the set of arguments passed to mlx5_add_flow_rule by introducing flow_spec structure. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
c7342823 |
|
15-Jun-2016 |
Ira Weiny <ira.weiny@intel.com> |
IB/mlx5: Support device FW version string And remove sysfs entry in favor of the common code. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
0ad17a8f |
|
17-Jun-2016 |
Mark Bloch <markb@mellanox.com> |
IB/mlx5: Add port protocol stats Expose new counters using the get protocol stats callback. We expose the following counters: |------------------------------------------------------------------------| | Name | IB | EN | Description | |------------------------------------------------------------------------| |rx_write_requests | + | - | Number of received WRITE requests for | | | | | the associated QP. | |------------------------------------------------------------------------| |rx_read_requests | + | - | Number of received READ requests for | | | | | the associated QP. | |------------------------------------------------------------------------| |rx_atomic_requests | + | - | Number of received ATOMIC requests for | | | | | the associated QP. | |------------------------------------------------------------------------| |out_of_buffer | + | + | Number of drops occurred due to lack | | | | | of WQE for the associated QPs/RQs. | |------------------------------------------------------------------------| |out_of_sequence | + | - | Number of errors in the packet | | | | | transport sequence number | |------------------------------------------------------------------------| |duplicate_request | + | + | Number of received duplicated packets. | | | | | A request that previously executed is | | | | | named duplicated. | |------------------------------------------------------------------------| |rnr_nak_retry_err | + | + | Number of received RNR NAC packets. | | | | | The QP retry limit did not exceed. | |------------------------------------------------------------------------| |packet_seq_err | + | + | Number of received NAK - sequence error| | | | | packets. The QP retry limit did not | | | | | exceed. | |------------------------------------------------------------------------| |implied_nak_err | + | + | Number of times the requester detected | | | | | an ACK with a PSN larger than expected | | | | | PSN for RDMA READ or ATOMIC response | | | | | The QP retry limit did not exceed. | |------------------------------------------------------------------------| |local_ack_timeout_err| + | - | Number of NO ACK responses from | | | | | responder within timer interval. | | | | | The QP retry limit did not exceed. | |------------------------------------------------------------------------| Counters are available if all of them are supported. Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Reviewed-by: Christoph Lameter <cl@linux.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
0837e86a |
|
17-Jun-2016 |
Mark Bloch <markb@mellanox.com> |
IB/mlx5: Add per port counters In order to support statistics for ports, we attach each QP to a counter set which is dedicate to this port. Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
402ca536 |
|
17-Jun-2016 |
Bodong Wang <bodong@mellanox.com> |
IB/mlx5: Report mlx5 TSO capabilities when querying device Enable mlx5 based hardware to report TCP segmentation offload (TSO) capabilities from kernel to user space. A TSO enabled NIC will accept big chunks of data with sizes greater than MTU for TCP traffic. The TSO engine will break the data into separate packets and will insert headers automatically. The capabilities are exposed to user space through query_device by uhw directly. The following capabilities are reported: 1. The maximum payload size in bytes supported for segmentation by TSO engine. 2. Bitmap showing which QP types are supported by TSO operation. The bitmap is built by members from 'enmu ib_qp_type'. For example, similar code should be performed if UD QP is supported: supported_qpts |= 1 << IB_QPT_UD; To make user-space library aware of whether kernel supports uhw or not, a new flag: cmds_supp_uhw will be returned back to user-space through alloc_ucontext. Signed-off-by: Bodong Wang <bodong@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
026bae0c |
|
17-Jun-2016 |
Maor Gottlieb <maorg@mellanox.com> |
IB/mlx5: Enable flow steering for IPv6 traffic Enable flow steering for IPv6 traffic by using an IPv6 spec. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
89ea94a7 |
|
17-Jun-2016 |
Maor Gottlieb <maorg@mellanox.com> |
IB/mlx5: Reset flow support for IB kernel ULPs The driver exposes interfaces that directly relate to HW state. Upon fatal error, consumers of these interfaces (ULPs) that rely on completion of all their posted work-request could hang, thereby introducing dependencies in shutdown order. To prevent this from happening, we manage the relevant resources (CQs, QPs) that are used by the device. Upon a fatal error, we now generate simulated completions for outstanding WQEs that were not completed at the time the HW was reset. It includes invoking the completion event handler for all involved CQs so that the ULPs will poll those CQs. When polled we return simulated CQEs with IB_WC_WR_FLUSH_ERR return code enabling ULPs to clean up their resources and not wait forever for completions upon receiving remove_one. The above change requires an extra check in the data path to make sure that when device is in error state, the simulated CQEs will be returned and no further WQEs will be posted. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
7c2344c3 |
|
17-Jun-2016 |
Maor Gottlieb <maorg@mellanox.com> |
IB/mlx5: Implements disassociate_ucontext API Implements the IB core disassociate_ucontext API. The driver detaches the HW resources for a given user context to prevent a dependency between application termination and device disconnect. This is done by managing the VMAs that were mapped to the HW bars such as doorbell and blueflame. When need to detach, remap them to an arbitrary kernel page returned by the zap API. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
c5f90929 |
|
23-May-2016 |
Yishai Hadas <yishaih@mellanox.com> |
IB/mlx5: Add Receive Work Queue Indirection table operations Some mlx5 based hardwares support a RQ table object. This RQ table points to a few RQ objects. We implement the receive work queue indirection table API (create and destroy) by using this hardware object. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Matan Barak <matanb@mellanox.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
79b20a6c |
|
23-May-2016 |
Yishai Hadas <yishaih@mellanox.com> |
IB/mlx5: Add receive Work Queue verbs A QP can be created without internal WQs "packaged" inside it, this QP can be configured to use "external" WQ object as its receive/send queue. WQ is a necessary component for RSS technology since RSS mechanism is supposed to distribute the traffic between multiple Receive Work Queues Receive WQs are implemented by RQs. Implement the WQ creation, modification and destruction verbs. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Matan Barak <matanb@mellanox.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
2cc6ad5f |
|
04-Jun-2016 |
Noa Osherovich <noaos@mellanox.com> |
IB/mlx5: Check BlueFlame HCA support BlueFlame support is reported only for PFs when the HCA capability is on. Fixes: 938fe83c8dcbb ('net/mlx5_core: New device capabilities...') Signed-off-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Noa Osherovich <noaos@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
bc5c6eed |
|
04-Jun-2016 |
Noa Osherovich <noaos@mellanox.com> |
IB/mlx5: Limit query HCA clock When PAGE_SIZE is larger than 4K, the user shouldn't be able to query the HCA core clock. This counter is within 4KB boundary and the user-space shall not read information that's after this boundary. Fixes: b368d7cb8ceb7 ('IB/mlx5: Add hca_core_clock_offset to...') Signed-off-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Noa Osherovich <noaos@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
c0fcebf5 |
|
04-Jun-2016 |
Eran Ben Elisha <eranbe@mellanox.com> |
IB/mlx5: Fix FW version diaplay in sysfs Add a 4-digit padding to show FW version in proper format. Fixes: 9603b61de1eee ('mlx5: Move pci device handling from...') Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Noa Osherovich <noaos@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
2788cf3b |
|
04-Jun-2016 |
Noa Osherovich <noaos@mellanox.com> |
IB/mlx5: Return PORT_ERR in Active to Initializing tranisition FW port-change events are fired on Active <-> non Active port state transitions only. When the port state changes from Active to Initializing (Active -> Down -> Initializing), a single event is fired. The HCA transitions from Down to Initializing unless prevented from doing so, hence the driver should also propagate events when the port state is Initializing to consumers so they'll be aware that the port is no longer Active and act accordingly. Fixes: e126ba97dba9e ('mlx5: Add driver for Mellanox Connect-IB...') Signed-off-by: Noa Osherovich <noaos@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
da6d6ba3 |
|
04-Jun-2016 |
Maor Gottlieb <maorg@mellanox.com> |
IB/mlx5: Set flow steering capability bit Flow steering is supported by mlx5 device when the following features are supported by firmware: 1. NIC RX flow table. 2. Device has enough flow steering levels. 3. Atomic modification of flow table entry. 4. Flow tables chaining. To check if flow steering is supported it's enough to check if the driver opened the mlx5 bypass namespace. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
cff5a0f3 |
|
17-Apr-2016 |
Majd Dibbiny <majd@mellanox.com> |
IB/mlx5: Report Scatter FCS device capability when supported Report Scatter FCS support when the Firmware supports as well. Signed-off-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
37aa5c36 |
|
27-Apr-2016 |
Guy Levi <guyle@mellanox.com> |
IB/mlx5: Add UARs write-combining and non-cached mapping By this patch, the user space library will be able to improve performance using appropriate ringing DoorBell method according to the memory type it asked for. Currently only one mapping command is allowed for UARs: MLX5_IB_MMAP_REGULAR_PAGE. Using this mapping, the kernel maps the UARs to write-combining (WC) if the system supports it. If the system is not supporting WC the UARs are mapped to non-cached(NC). In this case the user space library can't tell which mapping is applied. This patch adds 2 new mapping commands: MLX5_IB_MMAP_WC_PAGE and MLX5_IB_MMAP_NC_PAGE. For these commands the kernel maps exactly as requested and fails if it can't. Since there is no generic way to check if the requested memory region can be mapped as WC, driver enables conclusive WC mapping only for x86, PowerPC and ARM which support WC for the device's memory region. Signed-off-by: Guy Levy <guyle@mellanox.com> Signed-off-by: Moshe Lazer <moshel@mellanox.com> Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
6cbac1e4 |
|
14-Apr-2016 |
Matan Barak <matanb@mellanox.com> |
IB/mlx5: Allow mapping the free running counter on PROT_EXEC The current mlx5 code disallows mapping the free running counter of mlx5 based hardwares when PROT_EXEC is set. Although this behaviour is correct, Linux does add an implicit VM_EXEC to the vm_flags if the READ_IMPLIES_EXEC bit is set in the process personality. This happens for example if the process stack is executable. This causes libmlx5 to output a warning and prevents the user from reading the free running clock. Executing the init segment of the hardware isn't a security risk (at least no more than executing a process own stack), so we just prevent writes to there. Fixes: d69e3bcf7976 ('IB/mlx5: Mmap the HCA's core clock register to user-space') Signed-off-by: Matan Barak <matanb@mellanox.com> Reviewed-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
d63cd286 |
|
28-Apr-2016 |
Maor Gottlieb <maorg@mellanox.com> |
net/mlx5: Add user chosen levels when allocating flow tables Currently, consumers of the flow steering infrastructure can't choose their own flow table levels and are limited to one flow table per level. This just waste levels. Instead, we introduce here the possibility to use multiple flow tables in a level. The user is free to connect these flow tables, while following the rule (FTEs in FT of level x could only point to FTs of level y where y > x). In addition this patch switch the order of the create/destroy flow tables of the NIC(vlan and main). Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
986ef95e |
|
31-Mar-2016 |
Sagi Grimberg <sagi@grimberg.me> |
IB/mlx5: Expose correct max_sge_rd limit mlx5 devices (Connect-IB, ConnectX-4, ConnectX-4-LX) has a limitation where rdma read work queue entries cannot exceed 512 bytes. A rdma_read wqe needs to fit in 512 bytes: - wqe control segment (16 bytes) - rdma segment (16 bytes) - scatter elements (16 bytes each) So max_sge_rd should be: (512 - 16 - 16) / 16 = 30. Cc: linux-stable@vger.kernel.org Reported-by: Christoph Hellwig <hch@lst.de> Tested-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sagi Grimberg <sagig@grimberg.me> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
046339ea |
|
21-Apr-2016 |
Saeed Mahameed <saeedm@mellanox.com> |
net/mlx5e: Device's mtu field is u16 and not int For set/query MTU port firmware commands the MTU field is 16 bits, here I changed all the "int mtu" parameters of the functions wrapping those firmware commands to be u16. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
eff901d3 |
|
11-Mar-2016 |
Eli Cohen <eli@mellanox.com> |
IB/mlx5: Implement callbacks for manipulating VFs Implement the IB defined callbacks used to manipulate the policy for the link state, set GUIDs or get statistics information. This functionality is added into a new file that will be used to add any SRIOV related functionality to the mlx5 IB layer. The following callbacks have been added: mlx5_ib_get_vf_config mlx5_ib_set_vf_link_state mlx5_ib_get_vf_stats mlx5_ib_set_vf_guid In addition, publish whether this device is based on a virtual function. In mlx5 supported devices, virtual functions are implemented as vHCAs. vHCAs have their own QP number space so it is possible that two vHCAs will use a QP with the same number at the same time. Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
d603c809 |
|
11-Mar-2016 |
Eli Cohen <eli@mellanox.com> |
IB/mlx5: Fix decision on using MAD_IFC Fix the condition that dictates when MAD_IFC should be used. According to firmware specifications, MAD_IFC commands must be used only if the ib_virt capability is off. Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
35d19011 |
|
07-Mar-2016 |
Maor Gottlieb <maorg@mellanox.com> |
IB/mlx5: Add support for don't trap rules Each bypass flow steering priority will be split into two priorities: 1. Priority for don't trap rules. 2. Priority for normal rules. When user creates a flow using IB_FLOW_ATTR_FLAGS_DONT_TRAP flag, the driver creates two flow rules, one used for receiving the traffic and the other one for forwarding the packet to continue matching in lower or equal priorities. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
b005d316 |
|
29-Feb-2016 |
Sagi Grimberg <sagig@mellanox.com> |
mlx5: Add arbitrary sg list support Allocate proper context for arbitrary scatterlist registration If ib_alloc_mr is called with IB_MR_MAP_ARB_SG, the driver allocate a private klm list instead of a private page list. Set the UMR wqe correctly when posting the fast registration. Also, expose device cap IB_DEVICE_MAP_ARB_SG according to the device id (until we have a FW bit that correctly exposes it). Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
911f4331 |
|
03-Mar-2016 |
Sagi Grimberg <sagig@mellanox.com> |
IB/mlx5: Expose correct max_fast_reg_page_list_len While documentation indicates that the number of translation entries per memory key is unlimited, in practice, we can only fit a finite amount of translation entries in a single registration wqe (which is log_max_klm_list_size). Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
add08d76 |
|
03-Mar-2016 |
Christoph Hellwig <hch@lst.de> |
IB/mlx5: Convert UMR CQ to new CQ API Simplifies the code, and makes it more fair vs other users by using a softirq for polling. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Haggai Eran <haggaie@mellanox.com> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
d2370e0a |
|
29-Feb-2016 |
Matan Barak <matanb@mellanox.com> |
IB/mlx5: Add memory windows allocation support This patch adds user-space support for memory windows allocation and deallocation. It also exposes the supported types via query_device_caps verb. Signed-off-by: Matan Barak <matanb@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Tested-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
56e11d62 |
|
29-Feb-2016 |
Noa Osherovich <noaos@mellanox.com> |
IB/mlx5: Added support for re-registration of MRs This patch adds support for re-registration of memory regions in MLX5. The functionality is basically the same as deregister followed by register, but attempts to reuse the existing resources as much as possible. Original memory keys are kept if possible, saving the need to communicate new ones to remote peers. Signed-off-by: Noa Osherovich <noaos@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
7722f47e |
|
29-Feb-2016 |
Haggai Eran <haggaie@mellanox.com> |
IB/mlx5: Create GSI transmission QPs when P_Key table is changed Whenever the P_Key table is changed, we create the required GSI transmission QPs on-demand. Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
d16e91da |
|
29-Feb-2016 |
Haggai Eran <haggaie@mellanox.com> |
IB/mlx5: Add GSI QP wrapper mlx5 creates special GSI QPs that has limited ability to control the P_Key of transmitted packets. The sent P_Key is taken from the QP object, similarly to what happens with regular UD QPs. Create a software wrapper around GSI QPs that with the following patches will be able to emulate the functionality of a GSI QP including control of the P_Key per work request. Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
f0313965 |
|
21-Feb-2016 |
Erez Shitrit <erezsh@mellanox.com> |
IB/mlx5: Implement UD QP offloads for IPoIB in the TX flow In order to support LSO and CSUM in the TX flow the driver does the following: * LSO bit for the enum mlx5_ib_qp_flags was added, indicates QP that supports LSO offloads. * Enables the special offload when the QP is created, and enable the special work request id (IB_WR_LSO) when comes. * Calculates the size of the WQE according to the new WQE format that support these offloads. * Handles the new WQE format when arrived, sets the relevant fields, and copies the needed data. Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
ada68c31 |
|
22-Feb-2016 |
Achiad Shochat <achiad@mellanox.com> |
net/mlx5: Introduce a new header file for physical port functions All the device physical port access functions are implemented in the port.c file. We just extract the exposure of these functions from driver.h into a dedicated header file called port.h. Signed-off-by: Achiad Shochat <achiad@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
a168a41c |
|
28-Jan-2016 |
Majd Dibbiny <majd@mellanox.com> |
IB/mlx5: Fix reqlen validation in mlx5_ib_alloc_ucontext Older libraries that don't have all the new req_v2 fields should be able to work as well. Today, if the library uses v2, it will fail to allocate context since the size of reqlen is smaller than the req_v2 size. Fix the validation to be with the original req_v2 size and not the current. Fixes: f72300c56c3b ('IB/mlx5: Expose CQE version to user-space') Signed-off-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
d4584ddf |
|
28-Jan-2016 |
Matan Barak <matanb@mellanox.com> |
IB/mlx5: Add CREATE_CQ and CREATE_QP to uverbs_ex_cmd_mask The mlx5_ib driver supports the extended create_cq and create_qp user verbs. In the current mechanism, a vendor supporting an extended uverb should set the appropriate bit in the uverbs_ex_cmd_mask field. Adding the actual support by setting the required bits in order to support features like completion time-stamping and cross-channel. Fixes: 972ecb821379 ('IB/mlx5: Add create_cq extended command') Fixes: ddf9529be19c ('IB/core: Allow setting create flags in QP init attribute') Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
146d2f1a |
|
14-Jan-2016 |
majd@mellanox.com <majd@mellanox.com> |
IB/mlx5: Allocate a Transport Domain for each ucontext Transport Domain groups several TIS and TIR object. By grouping these object, it defines wheather local loopback packets that are sent from the TIS objects in the group are received by the TIR objects in the same group. Allocate a Transport Domain(TD) for each user context to be used in the future by Raw Packet QP for Self-Loopback Control. Signed-off-by: Majd Dibbiny <majd@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
f72300c5 |
|
14-Jan-2016 |
Haggai Abramovsky <hagaya@mellanox.com> |
IB/mlx5: Expose CQE version to user-space Per user context, work with CQE version that both the user-space and the kernel support. Report this CQE version via the response of the alloc_ucontext command. Signed-off-by: Haggai Abramovsky <hagaya@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
dfbee859 |
|
14-Jan-2016 |
Haggai Abramovsky <hagaya@mellanox.com> |
IB/mlx5: Fix data validation in mlx5_ib_alloc_ucontext The wrong buffer size was passed to ib_is_udata_cleared. Signed-off-by: Haggai Abramovsky <hagaya@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
c876a1b7 |
|
09-Jan-2016 |
Leon Romanovsky <leon@kernel.org> |
IB/mlx5: Fix passing casted pointer in mlx5_query_port_roce Fix static checker warning: drivers/infiniband/hw/mlx5/main.c:149 mlx5_query_port_roce() warn: passing casted pointer '&props->qkey_viol_cntr' to 'mlx5_query_nic_vport_qkey_viol_cntr()' 32 vs 16. Fixes: 3f89a643eb29 ("IB/mlx5: Extend query_device/port to support RoCE") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
50ca6ed2 |
|
19-Jan-2016 |
Leon Romanovsky <leon@kernel.org> |
IB/mlx5: Delete locally redefined variable Fix the following sparse warning: drivers/infiniband/hw/mlx5/main.c:1061:29: warning: symbol 'pfn' shadows an earlier one drivers/infiniband/hw/mlx5/main.c:1030:21: originally declared here Fixes: d69e3bcf7976 ('IB/mlx5: Mmap the HCA's core clock register to user-space') Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
9f177686 |
|
13-Jan-2016 |
Leon Romanovsky <leon@kernel.org> |
IB/mlx5: Expose correct maximum number of CQE capacity Maximum number of EQE capacity per CQ was mistakenly exposed as CQE. Fix that. Fixes: 938fe83c8dcb ("net/mlx5_core: New device capabilities handling") Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Cc: <stable@vger.kernel.org> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
038d2ef8 |
|
11-Jan-2016 |
Maor Gottlieb <maorg@mellanox.com> |
IB/mlx5: Add flow steering support Adding flow steering support by creating a flow-table per priority (if rules exist in the priority). mlx5_ib uses autogrouping and thus only creates the required destinations. Also includes adding of these flow steering utilities 1. Parsing verbs flow attributes hardware steering specs. 2. Check if flow is multicast - this is required in order to decide to which flow table will we add the steering rule. 3. Set outer headers in flow match criteria to zeros. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
88115fe7 |
|
18-Dec-2015 |
Bodong Wang <bodong@mellanox.com> |
IB/mlx5: report tx/rx checksum cap in query results This patch will report the tx/rx checksum cap for raw qp via the query device results. Signed-off-by: Bodong Wang <bodong@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
da7525d2 |
|
14-Dec-2015 |
Eran Ben Elisha <eranbe@mellanox.com> |
IB/mlx5: Advertise atomic capabilities in query device In order to ensure IB spec atomic correctness in atomic operations, if HW is configured to host endianness, advertise IB_ATOMIC_HCA. if not, advertise IB_ATOMIC_NONE. Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
051f2630 |
|
19-Dec-2015 |
Leon Romanovsky <leon@kernel.org> |
IB/mlx5: Add driver cross-channel support Add support of cross-channel functionality to mlx5 driver. This includes ability to ignore overrun for CQ which intended for cross-channel, export device capability and configure the QP to be sync master/slave queues. The cross-channel enabled QP supports combination of three possible properties: * WQE processing on the receive queue of this QP * WQE processing on the send queue of this QP * WQE are supported on the send queue Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
d69e3bcf |
|
15-Dec-2015 |
Matan Barak <matanb@mellanox.com> |
IB/mlx5: Mmap the HCA's core clock register to user-space In order to read the HCA's current cycles register, we need to map it to user-space. Add support to map this register via mmap command. Signed-off-by: Matan Barak <matanb@mellanox.com> Reviewed-by: Moshe Lazer <moshel@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
b368d7cb |
|
15-Dec-2015 |
Matan Barak <matanb@mellanox.com> |
IB/mlx5: Add hca_core_clock_offset to udata in init_ucontext Pass hca_core_clock_offset to user-space is mandatory in order to let the user-space read the free-running clock register from the right offset in the memory mapped page. Passing this value is done by changing the vendor's command and response of init_ucontext to be in extensible form. Signed-off-by: Matan Barak <matanb@mellanox.com> Reviewed-by: Moshe Lazer <moshel@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
7c60bcbb |
|
15-Dec-2015 |
Matan Barak <matanb@mellanox.com> |
IB/mlx5: Add support for hca_core_clock and timestamp_mask Reporting the hca_core_clock (in kHZ) and the timestamp_mask in query_device extended verb. timestamp_mask is used by users in order to know what is the valid range of the raw timestamps, while hca_core_clock reports the clock frequency that is used for timestamps. Signed-off-by: Matan Barak <matanb@mellanox.com> Reviewed-by: Moshe Lazer <moshel@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
e53505a8 |
|
23-Dec-2015 |
Achiad Shochat <achiad@mellanox.com> |
IB/mlx5: Support RoCE Advertise RoCE support for IB/core layer and set the hardware to work in RoCE mode. Signed-off-by: Achiad Shochat <achiad@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
2811ba51 |
|
23-Dec-2015 |
Achiad Shochat <achiad@mellanox.com> |
IB/mlx5: Add RoCE fields to Address Vector Set the address handle and QP address path fields according to the link layer type (IB/Eth). Signed-off-by: Achiad Shochat <achiad@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
3cca2606 |
|
23-Dec-2015 |
Achiad Shochat <achiad@mellanox.com> |
IB/mlx5: Support IB device's callbacks for adding/deleting GIDs These callbacks write into the mlx5 RoCE address table. Upon del_gid we write a zero'd GID. Signed-off-by: Achiad Shochat <achiad@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
3f89a643 |
|
23-Dec-2015 |
Achiad Shochat <achiad@mellanox.com> |
IB/mlx5: Extend query_device/port to support RoCE Using the vport access functions to retrieve the Ethernet specific information and return this information in ib_query_device and ib_query_port. Signed-off-by: Achiad Shochat <achiad@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
fc24fc5e |
|
23-Dec-2015 |
Achiad Shochat <achiad@mellanox.com> |
IB/mlx5: Support IB device's callback for getting its netdev For Eth ports only: Maintain a net device pointer in mlx5_ib_device and update it upon NETDEV_REGISTER and NETDEV_UNREGISTER events if the net-device and IB device have the same PCI parent device. Implement the get_netdev callback to return this net device. Signed-off-by: Achiad Shochat <achiad@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
ebd61f68 |
|
23-Dec-2015 |
Achiad Shochat <achiad@mellanox.com> |
IB/mlx5: Support IB device's callback for getting the link layer Make the existing mlx5_ib_port_link_layer() signature match the ib device callback signature (add port_num parameter). Refactor it to use a sub function so that the link layer could be queried also before the ibdev is created. Signed-off-by: Achiad Shochat <achiad@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
dd01e66a |
|
13-Oct-2015 |
Sagi Grimberg <sagig@mellanox.com> |
IB/mlx5: Remove old FRWR API support No ULP uses it anymore, go ahead and remove it. Keep only the local invalidate part of the handlers. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Acked-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
8a187ee5 |
|
13-Oct-2015 |
Sagi Grimberg <sagig@mellanox.com> |
IB/mlx5: Support the new memory registration API Support the new memory registration API by allocating a private page list array in mlx5_ib_mr and populate it when mlx5_ib_map_mr_sg is invoked. Also, support IB_WR_REG_MR by setting the exact WQE as IB_WR_FAST_REG_MR, just take the needed information from different places: - page_size, iova, length, access flags (ib_mr) - page array (mlx5_ib_mr) - key (ib_reg_wr) The IB_WR_FAST_REG_MR handlers will be removed later when all the ULPs will be converted. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Acked-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
adec640e |
|
28-Aug-2015 |
Christoph Hellwig <hch@lst.de> |
mlx5: stop including <asm-generic/kmap_types.h> <linux/highmem.h> is the placace the get the kmap type flags, asm-generic files are generic implementations only to be used by architecture code. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
#
81fb5e26 |
|
24-Sep-2015 |
Sagi Grimberg <sagig@mellanox.com> |
IB/mlx5: Remove pa_lkey usages Since mlx5 driver cannot rely on registration using the reserved lkey (global_dma_lkey) it used to allocate a private physical address lkey for each allocated pd. Commit 96249d70dd70 ("IB/core: Guarantee that a local_dma_lkey is available") just does it in the core layer so we can go ahead and use that. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
c6790aa9 |
|
24-Sep-2015 |
Sagi Grimberg <sagig@mellanox.com> |
IB/mlx5: Remove support for IB_DEVICE_LOCAL_DMA_LKEY Commit 96249d70dd70 ("IB/core: Guarantee that a local_dma_lkey is available") allows ULPs that make use of the local dma key to keep working as before by allocating a DMA MR with local permissions and converted these consumers to use the MR associated with the PD rather then device->local_dma_lkey. ConnectIB has some known issues with memory registration using the local_dma_lkey (SEND, RDMA, RECV seems to work ok). Thus don't expose support for it (remove device->local_dma_lkey setting), and take advantage of the above commit such that no regression is introduced to working systems. The local_dma_lkey support will be restored in CX4 depending on FW capability query. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
b37c788f |
|
30-Jul-2015 |
Jason Gunthorpe <jgg@ziepe.ca> |
IB/mlx5: Remove ib_get_dma_mr calls The pd now has a local_dma_lkey member which completely replaces ib_get_dma_mr, use it instead. Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
b3778ba8 |
|
30-Jul-2015 |
Sagi Grimberg <sagig@mellanox.com> |
mlx5: Drop mlx5_ib_alloc_fast_reg_mr Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
9bee178b |
|
30-Jul-2015 |
Sagi Grimberg <sagig@mellanox.com> |
IB: Modify ib_create_mr API Use ib_alloc_mr with specific parameters. Change the existing callers. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
8b91ffc1 |
|
30-Jul-2015 |
Sagi Grimberg <sagig@mellanox.com> |
IB/core: Get rid of redundant verb ib_destroy_mr This was added in a thought of uniting all mr allocation and deallocation routines but the fact is we have a single deallocation routine already, ib_dereg_mr. And, move mlx5_ib_destroy_mr specific logic into mlx5_ib_dereg_mr (includes only signature stuff for now). And, fixup the only callers (iser/isert) accordingly. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
18ebd407 |
|
27-Jul-2015 |
Sagi Grimberg <sagig@mellanox.com> |
mlx4, mlx5, mthca: Expose max_sge_rd correctly Applications must not assume that max_sge and max_sge_rd are the same, Hence expose max_sge_rd correctly as well. Reported-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
e0238a6a |
|
21-Jul-2015 |
Sagi Grimberg <sagig@mellanox.com> |
mlx5: Expose correct page_size_cap in device attributes Should be all the page sizes that are supported by the device. Reported-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
a3c87420 |
|
20-Jul-2015 |
Sagi Grimberg <sagig@mellanox.com> |
mlx5: Fix missing device local_dma_lkey The mlx5 driver exposes device capability IB_DEVICE_LOCAL_DMA_LKEY but does not set the the device local_dma_lkey. This breaks rpcrdma drivers. Query and set this lkey when creating the device resources. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
337877a4 |
|
06-Jun-2015 |
Ira Weiny <ira.weiny@intel.com> |
IB/core: Add ability for drivers to report an alternate MAD size. Add max MAD size to the device immutable data set and have all drivers that support MADs report the current IB MAD size (IB_MGMT_MAD_SIZE) to the core. Verify MAD size data in both the MAD core and when reading the immutable data. OPA drivers will report alternate MAD sizes in subsequent patches. Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
2528e33e |
|
11-Jun-2015 |
Matan Barak <matanb@mellanox.com> |
IB/core: Pass hardware specific data in query_device Vendors should be able to pass vendor specific data to/from user-space via query_device uverb. In order to do this, we need to pass the vendors' specific udata. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
8e37210b |
|
11-Jun-2015 |
Matan Barak <matanb@mellanox.com> |
IB/core: Change ib_create_cq to use struct ib_cq_init_attr Currently, ib_create_cq uses cqe and comp_vecotr instead of the extendible ib_cq_init_attr struct. Earlier patches already changed the vendors to work with ib_cq_init_attr. This patch changes the consumers too. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
bcf4c1ea |
|
11-Jun-2015 |
Matan Barak <matanb@mellanox.com> |
IB/core: Change provider's API of create_cq to be extendible Add a new ib_cq_init_attr structure which contains the previous cqe (minimum number of CQ entries) and comp_vector (completion vector) in addition to a new flags field. All vendors' create_cq callbacks are changed in order to work with the new API. This commit does not change any functionality. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-By: Devesh Sharma <devesh.sharma@avagotech.com> to patch #2 Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
facc9699 |
|
11-Jun-2015 |
Saeed Mahameed <saeedm@mellanox.com> |
net/mlx5e: Fix HW MTU settings Previously we configured HW MTU to be netdev->mtu, actually we need to configure netdev->mtu + (ETH_HLEN + VLAN_HLEN + ETH_FCS_LEN). Also, query MTU can not fail, hence make the relevant helper a void functionm, add mlx5e_set_dev_port_mtu, helper function to handle MTU setting. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
4aa17b28 |
|
04-Jun-2015 |
Haggai Abramonvsky <hagaya@mellanox.com> |
mlx5: Enable mutual support for IB and Ethernet Ethernet functionality is only available when working in ISSI > 0 mode. Previously, the IB driver wasn't ready to work on that mode, and hence building both the IB driver and the Ethernet functionality in the core driver were disallowed by Kconfigs. Now, once we have all the pre-steps in place, we can remove this limitation. The last steps in the IB driver for getting that setup to work are: create dummy SRQ for the driver's use (until now we could use XRC_SRQ as SRQ and XRC_SRQ, after moving to ISSI > 0, we separate XRC SRQs from basic SRQs) and adapt the create QP function to be compatible with ISSI > 0. Signed-off-by: Haggai Abramovsky <hagaya@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
647241ea |
|
04-Jun-2015 |
Majd Dibbiny <majd@mellanox.com> |
IB/mlx5: Don't create IB instance over Ethernet ports Since we still don't have RoCE support in mlx5, avoid creating IB driver instance over Ethernet ports. Signed-off-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
1b5daf11 |
|
04-Jun-2015 |
Majd Dibbiny <majd@mellanox.com> |
IB/mlx5: Avoid using the MAD_IFC command under ISSI > 0 mode In ISSI > 0 mode, most of the MAD_IFC command features are deprecated, and can't be used. Therefore, when in that mode, we replace all of them with other commands that provide the required functionality. Signed-off-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
938fe83c |
|
28-May-2015 |
Saeed Mahameed <saeedm@mellanox.com> |
net/mlx5_core: New device capabilities handling - Query all supported types of dev caps on driver load. - Store the Cap data outbox per cap type into driver private data. - Introduce new Macros to access/dump stored caps (using the auto generated data types). - Obsolete SW representation of dev caps (no need for SW copy for each cap). - Modify IB driver to use new macros for checking caps. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
f9b22e35 |
|
13-May-2015 |
Ira Weiny <ira.weiny@intel.com> |
IB/core: Convert core to use bitfield for caps Remove query_protocol callback Use the new Core Capability bits for: rdma_protocol_* rdma_cap_ib_mad rdma_cap_ib_smi rdma_cap_ib_cm rdma_cap_iw_cm rdma_cap_ib_sa rdma_cap_ib_mcast rdma_cap_af_ib rdma_cap_eth_ah Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
7738613e |
|
13-May-2015 |
Ira Weiny <ira.weiny@intel.com> |
IB/core: Add per port immutable struct to ib_device As of commit 5eb620c81ce3 "IB/core: Add helpers for uncached GID and P_Key searches"; pkey_tbl_len and gid_tbl_len are immutable data which are stored in the ib_device. The per port core capability flags to be added later are also immutable data to be stored in the ib_device object. In preparation for this create a structure for per port immutable data and place the pkey and gid table lengths within this structure. "get_port_immutable" is added as a mandatory device function to allow the drivers to fill in this data. Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
6b90a6d6 |
|
05-May-2015 |
Michael Wang <yun.wang@profitbricks.com> |
IB/Verbs: Implement new callback query_protocol() Add new callback query_protocol() and implement for each HW. Mapping List: node-type link-layer transport protocol nes RNIC ETH IWARP IWARP amso1100 RNIC ETH IWARP IWARP cxgb3 RNIC ETH IWARP IWARP cxgb4 RNIC ETH IWARP IWARP usnic USNIC_UDP ETH USNIC_UDP USNIC_UDP ocrdma IB_CA ETH IB IBOE mlx4 IB_CA IB/ETH IB IB/IBOE mlx5 IB_CA IB IB IB ehca IB_CA IB IB IB ipath IB_CA IB IB IB mthca IB_CA IB IB IB qib IB_CA IB IB IB Signed-off-by: Michael Wang <yun.wang@profitbricks.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Tested-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Sean Hefty <sean.hefty@intel.com> Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Tested-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
64613d94 |
|
02-Apr-2015 |
Saeed Mahameed <saeedm@mellanox.com> |
net/mlx5_core: Extend struct mlx5_interface to support multiple protocols Preparation for ethernet driver. Signed-off-by: Achiad Shochat <achiad@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
233d05d2 |
|
02-Apr-2015 |
Saeed Mahameed <saeedm@mellanox.com> |
net/mlx5_core: Move completion eqs from mlx5_ib to mlx5_core Preparation for ethernet driver. These functions will be used in drivers other than mlx5_ib. Signed-off-by: Achiad Shochat <achiad@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
6cf0a15f |
|
02-Apr-2015 |
Saeed Mahameed <saeedm@mellanox.com> |
IB/mlx5: Fix Mellanox copyright note Signed-off-by: Achiad Shochat <achiad@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
1707cb4a |
|
08-Feb-2015 |
Haggai Eran <haggaie@mellanox.com> |
IB/mlx5: Enable the ODP capability query verb Re-enable the on-demand paging capability query through the extended query device verb. Signed-off-by: Haggai Eran <haggaie@mellanox.com> Reviewed-by: Yann Droneaud <ydroneaud@opteya.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
|
#
f614fc15 |
|
12-Jan-2015 |
Dan Carpenter <dan.carpenter@oracle.com> |
IB/mlx5: Fix error code in get_port_caps() The current code returns success when kmalloc() fails. It should return an error code, -ENOMEM. Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
|
#
43c61165 |
|
05-Feb-2015 |
Yann Droneaud <ydroneaud@opteya.com> |
Revert "IB/core: Add support for extended query device caps" While commit 7e36ef8205ff ("IB/core: Temporarily disable ex_query_device uverb") is correct as it makes the extended QUERY_DEVICE uverb (which came as part of commit 5a77abf9a97a ("IB/core: Add support for extended query device caps") and commit 860f10a799c8 ("IB/core: Add flags for on demand paging support")) not available to userspace, it doesn't address the initial issue regarding ib_copy_to_udata() [1][2]. Additionally, further discussions around this new uverb seems to conclude it would require a different data structure than the one currently described in <rdma/ib_user_verbs.h> [3]. Both of these issues require a revert of the changes, so this patch partially reverts commit 8cdd312cfed7 ("IB/mlx5: Implement the ODP capability query verb") and commit 860f10a799c8 ("IB/core: Add flags for on demand paging support") and fully reverts commit 5a77abf9a97a ("IB/core: Add support for extended query device caps"). [1] "Re: [PATCH v3 06/17] IB/core: Add support for extended query device caps" http://mid.gmane.org/1418733236.2779.26.camel@opteya.com [2] "Re: [PATCH] IB/core: Temporarily disable ex_query_device uverb" http://mid.gmane.org/1423067503.3030.83.camel@opteya.com [3] "RE: [PATCH v1 1/5] IB/uverbs: ex_query_device: answer must not depend on request's comp_mask" http://mid.gmane.org/2807E5FD2F6FDA4886F6618EAC48510E0CC12C30@CRSMSX101.amr.corp.intel.com Cc: Eli Cohen <eli@mellanox.com> Cc: Haggai Eran <haggaie@mellanox.com> Cc: Ira Weiny <ira.weiny@intel.com> Cc: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Cc: Sagi Grimberg <sagig@mellanox.com> Cc: Shachar Raindel <raindel@mellanox.com> Signed-off-by: Yann Droneaud <ydroneaud@opteya.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
|
#
b4cfe447 |
|
11-Dec-2014 |
Haggai Eran <haggaie@mellanox.com> |
IB/mlx5: Implement on demand paging by adding support for MMU notifiers * Implement the relevant invalidation functions (zap MTTs as needed) * Implement interlocking (and rollback in the page fault handlers) for cases of a racing notifier and fault. * With this patch we can now enable the capability bits for supporting RC send/receive/RDMA read/RDMA write, and UD send. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Shachar Raindel <raindel@mellanox.com> Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
|
#
6aec21f6 |
|
11-Dec-2014 |
Haggai Eran <haggaie@mellanox.com> |
IB/mlx5: Page faults handling infrastructure * Refactor MR registration and cleanup, and fix reg_pages accounting. * Create a work queue to handle page fault events in a kthread context. * Register a fault handler to get events from the core for each QP. The registered fault handler is empty in this patch, and only a later patch implements it. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Shachar Raindel <raindel@mellanox.com> Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
|
#
8cdd312c |
|
11-Dec-2014 |
Haggai Eran <haggaie@mellanox.com> |
IB/mlx5: Implement the ODP capability query verb The patch adds infrastructure to query ODP capabilities in the mlx5 driver. The code will read the capabilities from the device, and enable only those capabilities that both the driver and the device supports. At this point ODP is not supported, so no capability is copied from the device, but the patch exposes the global ODP device capability bit. Signed-off-by: Shachar Raindel <raindel@mellanox.com> Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
|
#
1c3ce90d |
|
14-Sep-2014 |
Eli Cohen <eli@dev.mellanox.co.il> |
IB/mlx5: Fix possible array overflow The check to verify that userspace does not provide an invalid index to the micro UAR was placed too late. Fix this by moving the check before using the index. Reported by: Shachar Raindel <raindel@mellanox.com> Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
|
#
eefd56e5 |
|
14-Sep-2014 |
Eli Cohen <eli@dev.mellanox.co.il> |
IB/mlx5: Clear umr resources after ib_unregister_device Some ULPs may make use of resources created in create_umr_res so make sure to call destroy_umrc_res after returning from ib_unregister_device, which makes sure all ULPs have closed their resources. Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
|
#
c7a08ac7 |
|
01-Oct-2014 |
Eli Cohen <eli@mellanox.com> |
net/mlx5_core: Update device capabilities handling Rearrange struct mlx5_caps so it has a "gen" field to represent the current capabilities configured for the device. Max capabilities can also be queried from the device. Also update capabilities struct to contain more fields as per the latest revision if firmware specification. Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
4d2f9bbb |
|
28-Jul-2014 |
Jack Morgenstein <jackm@dev.mellanox.co.il> |
mlx5: Adjust events to use unsigned long param instead of void * In the event flow, we currently pass only a port number in the void *data argument. Rather than pass a pointer to the event handlers, we should use an "unsigned long" parameter, and pass the port number value directly. In the future, if necessary for some events, we can use the unsigned long parameter to pass a pointer. Based on a patch by Eli Cohen <eli@mellanox.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
f241e749 |
|
28-Jul-2014 |
Jack Morgenstein <jackm@dev.mellanox.co.il> |
mlx5: minor fixes (mainly avoidance of hidden casts) There were many places where parameters which should be u8/u16 were integer type. Additionally, in 2 places, a check for a non-null pointer was added before dereferencing the pointer (this is actually a bug fix). Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
9603b61d |
|
28-Jul-2014 |
Jack Morgenstein <jackm@dev.mellanox.co.il> |
mlx5: Move pci device handling from mlx5_ib to mlx5_core In preparation for a new mlx5 device which is VPI (i.e., ports can be either IB or ETH), move the pci device functionality from mlx5_ib to mlx5_core. This involves the following changes: 1. Move mlx5_core_dev struct out of mlx5_ib_dev. mlx5_core_dev is now an independent structure maintained by mlx5_core. mlx5_ib_dev now has a pointer to that struct. This requires changing a lot of places where the core_dev struct was accessed via mlx5_ib_dev (now, this needs to be a pointer dereference). 2. All PCI initializations are now done in mlx5_core. Thus, it is now mlx5_core which does pci_register_device (and not mlx5_ib, as was previously). 3. mlx5_ib now registers itself with mlx5_core as an "interface" driver. This is very similar to the mechanism employed for the mlx4 (ConnectX) driver. Once the HCA is initialized (by mlx5_core), it invokes the interface drivers to do their initializations. 4. There is a new event handler which the core registers: mlx5_core_event(). This event handler invokes the event handlers registered by the interfaces. Based on a patch by Eli Cohen <eli@mellanox.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
f360d88a |
|
01-Apr-2014 |
Eli Cohen <eli@dev.mellanox.co.il> |
IB/mlx5: Add block multicast loopback support Add support for the block multicast loopback QP creation flag along the proper firmware API for that. Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
|
#
2dea9094 |
|
23-Feb-2014 |
Sagi Grimberg <sagig@mellanox.com> |
IB/mlx5: Expose support for signature MR feature Currently support only T10-DIF types of signature handover operations (types 1|2|3). Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
|
#
d5436ba0 |
|
23-Feb-2014 |
Sagi Grimberg <sagig@mellanox.com> |
IB/mlx5: Collect signature error completion This commit takes care of the generated signature error CQE generated by the HW (if happened). The underlying mlx5 driver will handle signature error completions and will mark the relevant memory region as dirty. Once the consumer gets the completion for the transaction, it must check for signature errors on signature memory region using a new lightweight verb ib_check_mr_status(). In case the user doesn't check for signature error (i.e. doesn't call ib_check_mr_status() with status check IB_MR_CHECK_SIG_STATUS), the memory region cannot be used for another signature operation (REG_SIG_MR work request will fail). Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
|
#
3121e3c4 |
|
23-Feb-2014 |
Sagi Grimberg <sagig@mellanox.com> |
mlx5: Implement create_mr and destroy_mr Support create_mr and destroy_mr verbs. Creating ib_mr may be done for either ib_mr that will register regular page lists like alloc_fast_reg_mr routine, or indirect ib_mrs that can register other (pre-registered) ib_mrs in an indirect manner. In addition user may request signature enable, that will mean that the created ib_mr may be attached with signature attributes (BSF, PSVs). Currently we only allow direct/indirect registration modes. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
|
#
169a1d85 |
|
19-Feb-2014 |
Amir Vadai <amirv@mellanox.com> |
net,IB/mlx: Bump all Mellanox driver versions Bump all Mellanox driver versions. Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
1a4c3a3d |
|
06-Feb-2014 |
Eli Cohen <eli@dev.mellanox.co.il> |
IB/mlx5: Don't set "block multicast loopback" capability Currently Connect-IB does not support blocking multicast loopback, so don't set IB_DEVICE_BLOCK_MULTICAST_LOOPBACK in the device caps. Reported by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
|
#
78c0f98c |
|
30-Jan-2014 |
Eli Cohen <eli@dev.mellanox.co.il> |
IB/mlx5: Fix binary compatibility with libmlx5 Commit c1be5232d21d ("Fix micro UAR allocator") broke binary compatibility between libmlx5 and mlx5_ib since it defines a different value to the number of micro UARs per page, leading to wrong calculation in libmlx5. This patch defines struct mlx5_ib_alloc_ucontext_req_v2 as an extension to struct mlx5_ib_alloc_ucontext_req. The extended size is determined in mlx5_ib_alloc_ucontext() and in case of old library we use uuarn 0 which works fine -- this is acheived due to create_user_qp() falling back from high to medium then to low class where low class will return 0. For new libraries we use the more sophisticated allocation algorithm. Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Yann Droneaud <ydroneaud@opteya.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
|
#
c1be5232 |
|
14-Jan-2014 |
Eli Cohen <eli@dev.mellanox.co.il> |
IB/mlx5: Fix micro UAR allocator The micro UAR (uuar) allocator had a bug which resulted from the fact that in each UAR we only have two micro UARs avaialable, those at index 0 and 1. This patch defines iterators to aid in traversing the list of available micro UARs when allocating a uuar. In addition, change the logic in create_user_qp() so that if high class allocation fails (high class means lower latency), we revert to medium class and not to the low class. Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
|
#
746b5583 |
|
23-Oct-2013 |
Eli Cohen <eli@dev.mellanox.co.il> |
IB/mlx5: Multithreaded create MR Use asynchronous commands to execute up to eight concurrent create MR commands. This is to fill memory caches faster so we keep consuming from there. Also, increase timeout for shrinking caches to five minutes. Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
|
#
ada9f5d0 |
|
11-Sep-2013 |
Sagi Grimberg <sagig@mellanox.com> |
IB/mlx5: Fix eq names to display nicely in /proc/interrupts It's helpful for a driver to put the pci slot name in its interrupt names, so /proc/interrupts will show the pci slot of the device. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
|
#
81bea28f |
|
11-Sep-2013 |
Eli Cohen <eli@mellanox.com> |
IB/mlx5: Disable atomic operations Currently Atomic operations don't work properly. Disable them for the time being. Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
|
#
a0c84c32 |
|
11-Sep-2013 |
Eli Cohen <eli@mellanox.com> |
IB/mlx5: Avoid async events on invalid port number On a single ported Connect-IB, its possible for the firmware to issue events on the non-existing 2nd port. Make sure to ignore events generated for such ports. Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
|
#
92b0ca7c |
|
25-Jul-2013 |
Dan Carpenter <dan.carpenter@oracle.com> |
IB/mlx5: Fix stack info leak in mlx5_ib_alloc_ucontext() We don't set "resp.reserved". Since it's at the end of the struct that means we don't have to copy it to the user. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
|
#
281d1a92 |
|
29-Jul-2013 |
Wei Yongjun <yongjun_wei@trendmicro.com.cn> |
IB/mlx5: Fix error return code in init_one() Fix to return a negative error code from the error handling case instead of 0, as done elsewhere in this function. Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: Roland Dreier <roland@purestorage.com>
|
#
ad32b95f |
|
08-Jul-2013 |
Roland Dreier <roland@purestorage.com> |
IB/mlx5: Make profile[] static in main.c Signed-off-by: Roland Dreier <roland@purestorage.com>
|
#
e126ba97 |
|
07-Jul-2013 |
Eli Cohen <eli@mellanox.com> |
mlx5: Add driver for Mellanox Connect-IB adapters The driver is comprised of two kernel modules: mlx5_ib and mlx5_core. This partitioning resembles what we have for mlx4, except that mlx5_ib is the pci device driver and not mlx5_core. mlx5_core is essentially a library that provides general functionality that is intended to be used by other Mellanox devices that will be introduced in the future. mlx5_ib has a similar role as any hardware device under drivers/infiniband/hw. Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> [ Merge in coccinelle fixes from Fengguang Wu <fengguang.wu@intel.com>. - Roland ] Signed-off-by: Roland Dreier <roland@purestorage.com>
|