This was mistakenly comparing the exit code of the syscall, which is
always -1 and not the corresponding error-code. Added unit tests to
ensure we don't regress.
Signed-off-by: Dave Tucker <dave@dtucker.co.uk>
This adds support for loading XDP programs that are multi-buffer
capable, which is signalled using the xdp.frags section name. When this
is set, we should set the BPF_F_XDP_HAS_FRAGS flag when loading the
program into the kernel.
Signed-off-by: Dave Tucker <dave@dtucker.co.uk>
This commit adds from_pin() which allows the creation of a Program
from a path on bpffs. This is useful to be able to call `attach` or
other APIs for programs that are already loaded to the kernel.
This differs from #444 since it implements this on the concrete program
type, not the Program enum, allowing the user to pass in any additional
context that isn't available from bpf_prog_info.
Signed-off-by: Dave Tucker <dave@dtucker.co.uk>
Fix a bug which was resulting in `ENOTSUPP` following
the `BPF_MAP_CREATE` Syscall. This fix was initially
found by libbpf maintainers in:
https://github.com/libbpf/libbpf/issues/355.
Signed-off-by: Andrew Stoycos <astoycos@redhat.com>
- Rename `new_tc_link` to `attached`
- Simplified example for `attached`
The following was not requested in reviews:
- Moved example from `SchedClassifierLink` tor
`SchedClassifierLink::attached' since we're only showing an
example of that one method
Signed-off-by: Andre Fredette <afredette@redhat.com>
This is the proposed solution for Step 2 of issue #414
“For cases where you have done program.take_link() to manage
ownership of TcLink we need an API similar to PinnedLink::from_pin
that can reconstruct a TcLink”
As long as a user application continues to run after executing
`take_link()`, the `SchedClassifierLink` returned can be used to
detach the program. However, if we want to handle cases where the
application exits or crashes, we need a way to save and reconstruct
the link, and to do that, we also need to know the information
required for the reconstruction -- namely, the `interface`,
`attach_type`, `priority`, and `handle`. The user knows the first
two because they are required to execute `attach()` in the first
place; however, the user will not know the others if they let the
system choose them.
This pr solves the problems by adding an `impl` for
`SchedClassifierLink` with an accessor for `tc_options` and a `new()`
function.
Signed-off-by: Andre Fredette <afredette@redhat.com>
Currently aya will just report a standard outer level
error on failure. Ensure that we also report the inner
error condition back to the user
Signed-off-by: Andrew Stoycos <astoycos@redhat.com>
To split the crate into two, several changes were made:
1. Most `pub(crate)` are now `pub` to allow access from Aya;
2. Parts of BpfError are merged into, for example, RelocationError;
3. BTF part of Features is moved into the new crate;
4. `#![deny(missing_docs)]` is removed temporarily;
5. Some other code gets moved into the new crate, mainly:
- aya::{bpf_map_def, BtfMapDef, PinningType},
- aya::programs::{CgroupSock*AttachType},
The new crate is currenly allowing missing_docs. Member visibility
will be adjusted later to minimize exposure of implementation details.
Refs: #473
Aya::obj depends on bindgen generated files, and we start
by migrating bindgen generated files.
This commit adds the new aya-obj crate to the workplace
and migrates generated files into the crate. We use core
instead of std in an effort to make the final crate no_std.
Bindgen was run against libbpf v1.0.1.
Refs: #473
Kernel 4.15 added a new eBPF program that can
be used with cgroup v2 to control & observe device
access (e.g. read, write, mknod) - `BPF_PROG_TYPE_CGROUP_DEVICE`.
We add the ability to create these programs with the `cgroup_device`
proc macro which creates the `cgroup/dev` link section. Device
details are available to the eBPF program in `DeviceContext`.
The userspace representation is provided with the `CgroupDevice`
structure.
Fixes: #212
Signed-off-by: Milan <milan@mdaverde.com>
Use a struct called TcOptions for setting priority and handle in SchedClassifier attach
struct TcOptions implements the Default trait, so for the simple use
case in which the defaults are acceptable, we can call attach as
follows:
attach(“eth0”, TcAttachType::Ingress, TcOptions::default())
To specify all options:
attach(“eth0”, TcAttachType::Ingress, TcOptions { priority: (50), handle: (3) })
Or, some options:
attach(“eth0”, TcAttachType::Ingress, TcOptions { priority: (50), ..Default::default() })
Signed-off-by: Andre Fredette <afredette@redhat.com>
Implements step 1 of https://github.com/aya-rs/aya/issues/414.
- Adds handle to the SchedClassifier attach API
- Saves handle in the TcLink sruct and uses it when detaching programs
NOTE: this changes the API, so it will require a bump in the Aya version.
Signed-off-by: Andre Fredette <afredette@redhat.com>
Fix some broken rust doc links.
Make sure rustdoc build fail on warnings
so we catch these broken links in CI.
Signed-off-by: Andrew Stoycos <astoycos@redhat.com>
switch map() and map_mut() from returning a
`Result` to an `Option` since it's just getting
a value from a Hashmap, and to stay in line with
the Programs API.
Remove `MapError::MapNotFound`
Signed-off-by: Andrew Stoycos <astoycos@redhat.com>
Remove MapError::UnexpectedMapType
Add Macro for converting from aya::Map to
u32 (map type) for use in
`MapError::InvalidMapType { map_type: x }`
Signed-off-by: Andrew Stoycos <astoycos@redhat.com>
Respond to more review comments:
Revert to try_from in doctests so we don't need
to explicitly specify type parameters.
Fixup some documentation
Remove explit types in `try_from` methods
Signed-off-by: Andrew Stoycos <astoycos@redhat.com>
Implement Copy for MapData so that
when `take_map` is used we create a
1 to 1 mapping of MapData to internal
FileDescriptor. This will ensure
that when MapData is used in multiple
tasks that we don't drop the FD before
all tasks are done using it.
Signed-off-by: Andrew Stoycos <astoycos@redhat.com>
Create a new type called `SockMapFd` which is
solely used when a program needs to attach
to a socket map. In the future this same
tatic could be used for other use cases
so we may make this more generic.
Signed-off-by: Andrew Stoycos <astoycos@redhat.com>
Build completing tests passing
Refactor the Map API to better align
with the aya programs API. Specifically
remove all internal locking mechanisms
and custom Deref/DerefMut implementations.
They are replaced with a Map enum
and AsRef/AsMut implementations.
All Try_From implementations have been moved
to standardized enums, with a slightly
special one for PerfEventArray's.
Also cleanup/fix all associated tests and
documentation.
Signed-off-by: Andrew Stoycos <astoycos@redhat.com>
Following the lead of crates like tokio and nix, we now annotate APIs
that require optional features. This helps in cases where a user wants
to have an `AsyncPerfEventArray` which is documented on crates.io, but
it's not obvious that you have to enable the `async` feature.
Signed-off-by: Dave Tucker <dave@dtucker.co.uk>
`override_syscall` performs integer-to-pointer conversion. This is
considered harmful on the newest Rust nightly which provides
`ptr::from_exposed_addr`, but there is no other way on Rust stable than
doing `as *const T`, which is what miri is unhappy about.
Signed-off-by: Michal Rostecki <vadorovsky@gmail.com>
Add BpfLoader::set_max_entries, which sets the max_entries for the
specified map, as the load-time option.
The max_entries set at map initialization in the ebpf component can be
overwritten by this method called on the userspace component.
If you want to set max_entries for multiple maps in an ebpf component,
you can do so by calling set_max_entries in the form of a method chain.
Fixes: #308
Refs: #292
Add `from_pinned` to allow loading BPF maps
from pinned points in the bpffs and
`from_fd` to allow loading BPF maps from
RawFds aquired via some other means eg
a unix socket.
These functions return an
aya::Map which has not been used previously
but will be the future abstraction once
all bpf maps are represented as an enum.
Signed-off-by: Andrew Stoycos <astoycos@redhat.com>
This commit fixes a bug and adds some missing lifecycle APIs.
1. Adds PinnedLink::from_path to create a pinned link from bpffs
2. Adds From<PinnedLink> for FdLink to allow for ^ to be converted
3. Adds From<FdLink> for XdpLink
Signed-off-by: Dave Tucker <dave@dtucker.co.uk>
Files changed:\nM aya/src/generated/linux_bindings_aarch64.rs
M aya/src/generated/linux_bindings_armv7.rs
M aya/src/generated/linux_bindings_riscv64.rs
M aya/src/generated/linux_bindings_x86_64.rs
Files changed:\nM aya/src/generated/btf_internal_bindings.rs
M aya/src/generated/linux_bindings_aarch64.rs
M aya/src/generated/linux_bindings_armv7.rs
M aya/src/generated/linux_bindings_riscv64.rs
M aya/src/generated/linux_bindings_x86_64.rs
M bpf/aya-bpf-bindings/src/aarch64/bindings.rs
M bpf/aya-bpf-bindings/src/aarch64/helpers.rs
M bpf/aya-bpf-bindings/src/armv7/bindings.rs
M bpf/aya-bpf-bindings/src/armv7/helpers.rs
M bpf/aya-bpf-bindings/src/riscv64/bindings.rs
M bpf/aya-bpf-bindings/src/riscv64/helpers.rs
M bpf/aya-bpf-bindings/src/x86_64/bindings.rs
M bpf/aya-bpf-bindings/src/x86_64/helpers.rs
This commit removes reliance on generated BtfType structs, as
well as adding a dedicated struct for each BTF type. As such,
we can now add nice accessors like `bits()` and `encoding()`
for Int vs. inlined shift/mask operations.
Signed-off-by: Dave Tucker <dave@dtucker.co.uk>
1. Removes OwnedLink
2. Allows Links to be converted into FdLink
3. Introduces a PinnedLink type to handle wrap FdLink when pinned and
support un-pinning
Signed-off-by: Dave Tucker <dave@dtucker.co.uk>
This allows for FdLinks to also be pinned to BpfFs.
In order for it to be called, the user would first call
`take_link` to get the underlying link. This can then
be destructured to an FdLink where FdLink::pin() may be called.
Signed-off-by: Dave Tucker <dave@dtucker.co.uk>
This allows for `pin` to be called as `Xdp::pin()` or
Program::pin() - the same way that unload() can be used.
This simplifies the use of this API.
Signed-off-by: Dave Tucker <dave@dtucker.co.uk>
This commit allows for BTF maps in the .maps ELF section to be parsed.
It reads the necessary information from the BTF section of the ELF file.
While the btf_ids of Keys and Values types are stored, they are not (yet)
used.
When creating a BTF map, we pass the btf_key_type_id and
btf_value_type_id.
Signed-off-by: Dave Tucker <dave@dtucker.co.uk>
Kernels before 5.11 don't use cgroup accounting, so they might reach the
RLIMIT_MEMLOCK when creating maps. After this change, we raise a warning
recommending to raise the RLIMIT_MEMLOCK.
This allows for Extension programs already loaded to the kernel to be
attached to another program that is BTF-compatible with the one provided
at `load()` time
Signed-off-by: Dave Tucker <dave@dtucker.co.uk>
This removes the ProgramFd trait with a struct that wraps a RawFd.
Program::fd() has been implemented as well as fd() for each Program
Type. This allows for a better API than requiring the use of the
ProgramFd trait.
Signed-off-by: Dave Tucker <dave@dtucker.co.uk>
bpf_map_update_elem is used in lieu of bpf_map_push_elem to maintain support for kernel version < 4.20. The kernel expects a null pointer for the key for this use case. With this change, if you pass None as key to `bpf_map_update_elem`, it will pass null as key.
Files changed:\nM aya/src/generated/linux_bindings_aarch64.rs
M aya/src/generated/linux_bindings_armv7.rs
M aya/src/generated/linux_bindings_riscv64.rs
M aya/src/generated/linux_bindings_x86_64.rs
This allows access to XdpLink, XdpLinkId etc... which is currently
unavailable since these modules are private
Signed-off-by: Dave Tucker <dave@dtucker.co.uk>
Files changed:\nM aya/src/generated/linux_bindings_riscv64.rs
M bpf/aya-bpf-bindings/src/riscv64/bindings.rs
M bpf/aya-bpf-bindings/src/riscv64/getters.rs
M bpf/aya-bpf-bindings/src/riscv64/helpers.rs
Files changed:\nM aya/src/generated/linux_bindings_aarch64.rs
M aya/src/generated/linux_bindings_armv7.rs
M aya/src/generated/linux_bindings_x86_64.rs
M bpf/aya-bpf-bindings/src/aarch64/bindings.rs
M bpf/aya-bpf-bindings/src/aarch64/getters.rs
M bpf/aya-bpf-bindings/src/aarch64/helpers.rs
M bpf/aya-bpf-bindings/src/armv7/bindings.rs
M bpf/aya-bpf-bindings/src/armv7/getters.rs
M bpf/aya-bpf-bindings/src/armv7/helpers.rs
M bpf/aya-bpf-bindings/src/x86_64/bindings.rs
M bpf/aya-bpf-bindings/src/x86_64/getters.rs
M bpf/aya-bpf-bindings/src/x86_64/helpers.rs
Since we support multiple maps in the same section, the section_index is
no longer a unique way to identify maps. This commit uses the symbol
index as the identifier, but falls back to section_index for rodata
and bss maps since we don't retrieve the symbol_index during parsing.
Signed-off-by: Dave Tucker <dave@dtucker.co.uk>
Files changed:\nM aya/src/generated/linux_bindings_aarch64.rs
M aya/src/generated/linux_bindings_armv7.rs
M aya/src/generated/linux_bindings_x86_64.rs
M bpf/aya-bpf-bindings/src/aarch64/bindings.rs
M bpf/aya-bpf-bindings/src/aarch64/getters.rs
M bpf/aya-bpf-bindings/src/aarch64/helpers.rs
M bpf/aya-bpf-bindings/src/armv7/bindings.rs
M bpf/aya-bpf-bindings/src/armv7/getters.rs
M bpf/aya-bpf-bindings/src/armv7/helpers.rs
M bpf/aya-bpf-bindings/src/x86_64/bindings.rs
M bpf/aya-bpf-bindings/src/x86_64/getters.rs
M bpf/aya-bpf-bindings/src/x86_64/helpers.rs
Remove LinkRef and remove the Rc<RefCell<_>> that was used to store
type-erased link values in ProgramData. Among other things, this allows
`Bpf` to be `Send`, which makes it easier to use it with async runtimes.
Change the link API to:
let link_id = prog.attach(...)?;
...
prog.detach(link_id)?;
Link ids are strongly typed, so it's impossible to eg:
let link_id = uprobe.attach(...)?;
xdp.detach(link_id);
As it would result in a compile time error.
Links are still stored inside ProgramData, and unless detached
explicitly, they are automatically detached when the parent program gets
dropped.
This commit uses the symbol table to discover all maps inside an ELF
section. Instead of doing what libbpf does - divide the section data
in to equal sized chunks - we read in to section data using the
symbol address and offset, thus allowing us to support definitions
of varying lengths.
Signed-off-by: Dave Tucker <dave@dtucker.co.uk>
This changes PerfBuffer::read_events() to call BytesMut::reserve()
internally, and deprecates PerfBufferError::MoreSpaceNeeded.
This makes for a more ergonomic API, and allows for a more idiomatic
usage of BytesMut. For example consider:
let mut buffers = vec![BytesMut::with_capacity(N), ...];
loop {
let events = oob_cpu_buf.read_events(&mut buffers).unwrap();
for buf in &mut buffers[..events.read] {
let sub: Bytes = buf.split_off(n).into();
process_sub_buf(sub);
}
...
}
This is a common way to process perf bufs, where a sub buffer is split
off from the original buffer and then processed. In the next iteration
of the loop when it's time to read again, two things can happen:
- if processing of the sub buffer is complete and `sub` has been
dropped, read_events() will call buf.reserve(sample_size) and hit a fast
path in BytesMut that will just restore the original capacity of the
buffer (assuming sample_size <= N).
- if processing of the sub buffer hasn't ended (eg the buffer has been
stored or is being processed in another thread),
buf.reserve(sample_size) will actually allocate the new memory required
to read the sample.
In other words, calling buf.reserve(sample_size) inside read_events()
simplifies doing zero-copy processing of buffers in many cases.
`BPF_PROG_TYPE_SOCKET_FILTER` program expands the sectionname's kind with `socket` not `socket_filter`.
So current eBPF program with socket filter always fails.
This patch fixes it.
Fix https://github.com/aya-rs/aya/issues/227
fa037a88e2 allowed for cgroup skb programs
that did not specify an attach direction to use the cgroup/skb section
name per the convention established in libbpf. It did not add the
necessary code to load programs from those sections which is added in
this commit
Signed-off-by: Dave Tucker <dave@dtucker.co.uk>
Handle relocations against .text symbols in all instructions not just
calls. Makes it so that let x = &some_function triggers linking of
some_function in the current program and handles the resulting
relocation accordingly.
Among other things, enables the use of bpf_for_each_map_elem.
This replaces the / character with a . which is allowed in the kernel
names. Not allowing a forward slash is perhaps a kernel bug, but lets
fix it up here as it's commonly used for Aya
Signed-off-by: Dave Tucker <dave@dtucker.co.uk>
Aya will now perform sanitzation and fixups in a single phase, requiring
only one pass over the BTF. This modifies the parsed BTF in place.
Signed-off-by: Dave Tucker <dave@dtucker.co.uk>
Currently errors can occur if the verifier output is > buffer as we get
ENOMEM. We should only provide a log_buf if initial load failed, then
retry up to 10 times to get full verifier output.
To DRY this logic it has been moved to a function so its shared with
program loading
Signed-off-by: Dave Tucker <dave@dtucker.co.uk>
one verifier loop to rule them all
Signed-off-by: Dave Tucker <dave@dtucker.co.uk>
If an argument has a type, it must also have a name, see btf_func_check
in the kernel.
Given:
SEC("lsm/syslog")
int BPF_PROG(syslog_audit, int type, int ret_prev)
{
return 0;
}
Fixes:
error: BTF error: the BPF_BTF_LOAD syscall failed. Verifier output: magic: 0xeb9f
version: 1
flags: 0x0
hdr_len: 24
type_off: 0
type_len: 76
str_off: 76
str_len: 128
btf_total_size: 228
[1] FUNC_PROTO (anon) return=2 args=(3 (anon))
[2] INT int size=4 bits_offset=0 nr_bits=32 encoding=SIGNED
[3] PTR (anon) type_id=4
[4] INT long long unsigned int size=8 bits_offset=0 nr_bits=64 encoding=(none)
[5] FUNC syslog_audit type_id=1
[5] FUNC syslog_audit type_id=1 Invalid arg#1
: Invalid argument (os error 22)
The union of `size` and `type` is unused in BTF_KIND_ARRAY.
Type information of elements is in the btf_array struct that follows in
the type_ field while the index type is in the index_type field.
For BTF_KIND_INT, only the offset should be compared and size and
signedness should be ignored.
Signed-off-by: Dave Tucker <dave@dtucker.co.uk>
This requires loading the BTF to kernel when loading all programs as
well as implementing Extension program type
Signed-off-by: Dave Tucker <dave@dtucker.co.uk>
This allows for parsed BTF to be re-encoded such that it could be loaded
in to the kernel. It moves bytes_of to the utils package. We could use
Object::bytes_of, but this requires the impl of the Pod trait on
generated code.
Signed-off-by: Dave Tucker <dave@dtucker.co.uk>
This commit marks .rodata maps as BPF_F_RDONLY_PROG when loaded to
prevent a BPF program mutating them.
Initial map data is populated by the loader using the new
`BpfLoader::set_global()` API. The loader will mark
is marked as frozen using bpf_map_freeze to prevent map data
being changed from userspace.
Signed-off-by: Dave Tucker <dave@dtucker.co.uk>
fentry and fexit programs are similar to kprobe and kretprobe, but they
are newer and they have practically zero overhead to call before or
after kernel function. Also, fexit programs are focused on access to
arguments rather than the return value.
Those kind of programs were introduced in the following patchset:
https://lwn.net/Articles/804112/
Signed-off-by: Michal Rostecki <mrostecki@opensuse.org>
Map iteration can yield stale keys and values by virtue of sharing a
data structure with BPF programs which can modify it. However, all
accesses remain perfectly safe and will not cause memory corruption or
data races.
Map and ProgramData objects had unnecessarily cloned strings for their
names, despite them being just as easily available to external users via
bpf.maps() and bpf.programs().
This commit improves section detection.
Previously, a section named "xdp_metadata" would be interpretted as a
program section, which is incorrect. This commit first attempts to
identify a BPF section by name, then by section.kind() ==
SectionKind::Text (executable code). The computed section kind is
stored in the Section so variants can be easily matched on later.
Signed-off-by: Dave Tucker <dave@dtucker.co.uk>
When a BPF program doesn't specify the target kernel version, the
most compatible option is to set the program kernel version to match
the currently running kernel.
In kernel 4.15 and additional parameter was added to allow maps to have
names but using this breaks on older kernels.
This change makes it so the name is only added on kernels 4.15 and
newer.
This commit fixes name parsing of sk_skb sections such that both named
and unnamed variants will work correctly.
Signed-off-by: Dave Tucker <dave@dtucker.co.uk>
This change adds support for the following program types:
* raw tracepoint
* LSM
Supporting LSM programs involved a necessity of supporting more
load_attrs for the BPF_PROG_LOAD operation, concretely:
* expected_attach_type - for LSM programs, it has always to be set to
BPF_LSM_MAC
* attach_btf_obj_fd - it's often used to reference the file descriptor of
program's BTF info, altough in case of LSM programs, it only has to
contain the value 0, which means the vmlinux object file (usually
/sys/kernel/btf/vmlinux)
* attach_btf_id - ID of the BTF object, which in case of LSM programs is
the ID of the function (the LSM hook)
The example of LSM program using that functionality can be found here:
https://github.com/vadorovsky/aya-example-lsmFixes: #9
Signed-off-by: William Findlay <william@williamfindlay.com>
Signed-off-by: Michal Rostecki <mrostecki@opensuse.org>
I found a corner case in my own development workflow that caused the existing macro to not
work properly. The following changes appear to fix things. Ideally, we could add some test
cases to CI to prevent regressions. This would require creating a dedicated directory to
hold test cases so that we can "include" them at compile time.
This is a helper macro that can be used to include bytes at compile-time that can then be
used in Bpf::load(). Unlike std's include_bytes!(), this macro also ensures that the
resulting byte array is correctly aligned so that it can be parsed as an ELF binary.
Signed-off-by: William Findlay <william@williamfindlay.com>
This commit adds 2 new methods to aya::sys
- bpf_pin_object
- bpf_get_object
Which allow the pinning and retrieval of programs/maps to bpffs.
It adds a `Program.pin` API, such that a loaded program can be pinned.
For map pinning, the user must ensure the `pinning u32` in the
`bpf_map_def` is set to 1, maps will be pinned using a new builder API.
BpfLoader::new().map_pin_path("/sys/fs/bpf/myapp").load_file("myapp.o")
This will pin all maps whose definition requests pinning to path + name.
Signed-off-by: Dave Tucker <dave@dtucker.co.uk>
The size of Unknown should be ty_size, otherwise when it is encountered,
we never advance the cursor and it creates an infinite loop.
Signed-off-by: Dave Tucker <dave@dtucker.co.uk>
qdisc_detach_program can be used to detach all the programs that have
the given name. It's useful when you want to detach programs that were
attached by some other process (eg. iproute2), or when you want to
detach programs that were previously left attached because the program
that attached them was killed.
LLVM will split .text into .text.hot .text.unlikely etc and move the
content around in order to improve locality. We need to parse all the
text sections or relocations can potentially fail.
When a perf map has max_entries=0, max_entries is dynamically set at
load time to the number of possible cpus as reported by
/sys/devices/system/cpu/possible.
This change fixes a bug where instead of setting max_entries to the
number of possible cpus, we were setting it to the cpu index of the last
possible cpu.
XDP_FLAGS_REPLACE was added in 5.7. Now for kernels >= 5.7 whenever we
detach an XDP program we pass along the program fd we expect to be
detaching. For older kernels, we just detach whatever is attached, which
is not great but it's the way the API worked pre XDP_FLAGS_REPLACE.
Make MapKeys not use IterableMap. Leave only ProgramArray::get,
ProgramArray::set and ProgramArray::unset exposed as the other syscalls
don't work consistently for program arrays.
Change get() from -> Result<Option<V>, MapError> to -> Result<V,
MapError> where MapError::KeyNotFound is returned instead of Ok(None) to
signify that the key is not present.