26 changes: 19 additions & 7 deletions BUILTINS.md
@@ -83,17 +83,23 @@ fn main() -> i32 {

---

#### `attach(handle, target, flags)`
#### `attach(handle, target, flags)` / `attach(handle, opts, flags)`
**Signature:** `attach(handle: ProgramHandle, target: str(128), flags: u32) -> u32`
**Signature:** `attach(handle: ProgramHandle, opts: perf_options, flags: u32) -> u32`
**Variadic:** No
**Context:** Userspace only

**Description:** Attach a loaded eBPF program to a target interface or attachment point.
**Description:** Attach a loaded eBPF program to a target interface or attachment point, or to a perf event counter described by `perf_options`. Both forms take three arguments, keeping a uniform call shape across all program types.

**Parameters:**
- `handle`: Program handle returned from `load()`
- `target`: Target interface name (e.g., "eth0", "lo") or attachment point
- `flags`: Attachment flags (context-dependent)
- Standard form:
- `handle`: Program handle returned from `load()`
- `target`: Target interface name (e.g., "eth0", "lo") or attachment point
- `flags`: Attachment flags (context-dependent)
- Perf event form:
- `handle`: Program handle returned from `load()`
- `opts`: `perf_options` value — only `perf_type` and `perf_config` are required; all other fields have defaults
- `flags`: Reserved (pass `0`)

**Return Value:**
- Returns `0` on success
@@ -106,11 +112,17 @@ var result = attach(prog, "eth0", 0)
if (result != 0) {
print("Failed to attach program")
}

// Minimal perf attach: every field other than perf_type and perf_config uses its default:
// pid=-1 (all procs), cpu=0, period=1_000_000, wakeup=1, flags=false
var perf_prog = load(on_branch_miss)
attach(perf_prog, perf_options { perf_type: perf_type_hardware, perf_config: branch_misses }, 0)
detach(perf_prog)
```

**Context-specific implementations:**
- **eBPF:** Not available
- **Userspace:** Uses `bpf_prog_attach` system call
- **Userspace:** Uses `attach_bpf_program_by_fd` for standard targets and `ks_attach_perf_event` for perf events
- **Kernel Module:** Not available

---
@@ -340,7 +352,7 @@ fn main() -> i32 {
|----------|------|-----------|---------------|-------|
| `print()` | ✅ | ✅ | ✅ | Different output destinations |
| `load()` | ❌ | ✅ | ❌ | Program management only |
| `attach()` | ❌ | ✅ | ❌ | Program management only |
| `attach()` | ❌ | ✅ | ❌ | Standard attach and perf_options attach |
| `detach()` | ❌ | ✅ | ❌ | Program management only |
| `register()` | ❌ | ✅ | ❌ | struct_ops registration |
| `test()` | ❌ | ✅ | ❌ | Testing framework only |
60 changes: 60 additions & 0 deletions README.md
@@ -119,6 +119,13 @@ fn traffic_shaper(ctx: *__sk_buff) -> i32 {
// Trace system call entry
return 0
}

// Perf event program for hardware counter sampling
@perf_event
fn on_branch_miss(ctx: *bpf_perf_event_data) -> i32 {
// Runs on each hardware branch-miss sample (once per `period` events)
return 0
}
```

### Type System
@@ -261,6 +268,58 @@ fn main() -> i32 {
}
```

### Hardware Performance Counter Programs

Use `@perf_event` to attach eBPF programs to hardware or software performance counters. `perf_options` keeps the kernel's tagged `perf_type + perf_config` model, so adding new perf event families does not require flattening everything into one enum. Only `perf_type` and `perf_config` are required; all other fields have sensible defaults. If you need the current count in userspace, call `perf_read(prog)` after `attach(...)`:

```kernelscript
// eBPF program fires on every hardware branch-miss sample
@perf_event
fn on_branch_miss(ctx: *bpf_perf_event_data) -> i32 {
return 0
}

fn main() -> i32 {
var prog = load(on_branch_miss)

// Minimal form — defaults: pid=-1 (all procs), cpu=0,
// period=1_000_000, wakeup=1, all flags=false
attach(prog, perf_options { perf_type: perf_type_hardware, perf_config: branch_misses }, 0)
var count = perf_read(prog)
print("branch misses: %lld", count)

detach(prog) // disables counter, destroys BPF link, closes fd
return 0
}
```

**Available `perf_type` values:**

| Enum value | Linux constant |
|---|---|
| `perf_type_hardware` | `PERF_TYPE_HARDWARE` |
| `perf_type_software` | `PERF_TYPE_SOFTWARE` |
| `perf_type_tracepoint` | `PERF_TYPE_TRACEPOINT` |
| `perf_type_hw_cache` | `PERF_TYPE_HW_CACHE` |
| `perf_type_raw` | `PERF_TYPE_RAW` |
| `perf_type_breakpoint` | `PERF_TYPE_BREAKPOINT` |

**Common `perf_config` constants:**

| Constant | Intended `perf_type` | Linux config |
|---|---|---|
| `cpu_cycles` | `perf_type_hardware` | `PERF_COUNT_HW_CPU_CYCLES` |
| `instructions` | `perf_type_hardware` | `PERF_COUNT_HW_INSTRUCTIONS` |
| `cache_references` | `perf_type_hardware` | `PERF_COUNT_HW_CACHE_REFERENCES` |
| `cache_misses` | `perf_type_hardware` | `PERF_COUNT_HW_CACHE_MISSES` |
| `branch_instructions` | `perf_type_hardware` | `PERF_COUNT_HW_BRANCH_INSTRUCTIONS` |
| `branch_misses` | `perf_type_hardware` | `PERF_COUNT_HW_BRANCH_MISSES` |
| `page_faults` | `perf_type_software` | `PERF_COUNT_SW_PAGE_FAULTS` |
| `context_switches` | `perf_type_software` | `PERF_COUNT_SW_CONTEXT_SWITCHES` |
| `cpu_migrations` | `perf_type_software` | `PERF_COUNT_SW_CPU_MIGRATIONS` |

For event families with a richer config space, such as `perf_type_hw_cache`, pass the kernel-compatible encoded `perf_config` value directly.

📖 **For detailed language specification, syntax reference, and advanced features, please read [`SPEC.md`](SPEC.md).**

🔧 **For complete builtin functions reference, see [`BUILTINS.md`](BUILTINS.md).**
@@ -304,6 +363,7 @@ my_project/
- `tc` - Traffic control programs
- `probe` - Kernel function probing
- `tracepoint` - Kernel tracepoint programs
- `perf_event` - Hardware/software performance counter programs

**Available struct_ops:**
- `tcp_congestion_ops` - TCP congestion control
127 changes: 126 additions & 1 deletion SPEC.md
@@ -35,7 +35,7 @@ var flows : hash<IpAddress, PacketStats>(1024)
KernelScript uses a simple and clear scoping model that eliminates ambiguity:

- **`@helper` functions**: Kernel-shared functions - accessible by all eBPF programs, compile to eBPF bytecode
- **Attributed functions** (e.g., `@xdp`, `@tc`, `@tracepoint`): eBPF program entry points - compile to eBPF bytecode
- **Attributed functions** (e.g., `@xdp`, `@tc`, `@tracepoint`, `@perf_event`): eBPF program entry points - compile to eBPF bytecode
- **Regular functions**: User space - functions and data structures compile to native executable
- **Maps and global configs**: Shared resources accessible from both kernel and user space
- **No wrapper syntax**: Direct, flat structure without unnecessary nesting
@@ -440,6 +440,131 @@ kernelscript init tracepoint/syscalls/sys_enter_read my_syscall_tracer
# appropriate KernelScript templates with correct context types
```

#### 3.1.3 Perf Event Programs

`@perf_event` programs attach eBPF logic to hardware or software performance counters via `perf_event_open(2)`. The eBPF function is invoked for every counter sample; the userspace side controls which counter to monitor through a `perf_options` struct literal passed to the standard 3-argument `attach()`.
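
To make "every counter sample" concrete: with a sampling period of N, the handler runs roughly once per N counted events, not once per event. The sketch below is plain C arithmetic; `ks_expected_samples` is a hypothetical name used only for illustration, and treating a period of 0 as "no samples" is an assumption of this sketch.

```c
#include <stdint.h>

/* Hypothetical helper (not part of KernelScript): approximate number of
 * handler invocations for `events` counted events at a given sample period.
 * A period of 0 is treated as "no sampling" here, which is an assumption. */
uint64_t ks_expected_samples(uint64_t events, uint64_t period)
{
    if (period == 0)
        return 0;
    return events / period;
}
```

With the default period of 1,000,000, ten million branch misses would produce about ten handler invocations; with `period: 1`, the handler runs on every event.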

**Syntax:**
```kernelscript
@perf_event
fn <handler_name>(ctx: *bpf_perf_event_data) -> i32 {
// runs on every sample
return 0
}
```

The context type is always `*bpf_perf_event_data` (from `vmlinux.h`).

**Userspace lifecycle:**
```kernelscript
fn main() -> i32 {
var prog = load(my_handler)

// Only perf_type + perf_config are required; all other fields use language-level defaults:
// pid=-1, cpu=0, period=1_000_000, wakeup=1, inherit/exclude_*=false
attach(prog, perf_options { perf_type: perf_type_hardware, perf_config: branch_misses }, 0)

// Alternatively, override specific fields as needed:
attach(prog, perf_options {
perf_type: perf_type_hardware,
perf_config: cache_misses,
cpu: 2,
period: 500000,
exclude_kernel: true,
}, 0)

var count = perf_read(prog)
print("count: %lld", count)

detach(prog) // IOC_DISABLE → bpf_link__destroy → close(perf_fd)
return 0
}
```

**`perf_options` fields and defaults:**

| Field | Type | Default | Description |
|---|---|---|---|
| `perf_type` | `perf_type` | *(required)* | `perf_event_attr.type` tag |
| `perf_config` | `u64` | *(required)* | `perf_event_attr.config` value for that type |
| `pid` | `i32` | `-1` | -1 = all processes; ≥0 = specific PID |
| `cpu` | `i32` | `0` | ≥0 = specific CPU; -1 = any CPU (pid must be ≥0) |
| `period` | `u64` | `1000000` | Sample after this many events |
| `wakeup` | `u32` | `1` | Wake userspace after N samples |
| `inherit` | `bool` | `false` | Inherit to forked children |
| `exclude_kernel` | `bool` | `false` | Exclude kernel-mode samples |
| `exclude_user` | `bool` | `false` | Exclude user-mode samples |
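
The defaults in the table above can be pictured as a C constructor that takes only the two required fields. This is a minimal sketch: the struct layout and the names `ks_perf_options_sketch` and `ks_perf_options_defaults` are assumptions for illustration, and the generated `ks_perf_options` type may differ.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative layout only; the real generated struct may differ. */
typedef struct {
    uint32_t perf_type;      /* perf_event_attr.type */
    uint64_t perf_config;    /* perf_event_attr.config */
    int32_t  pid;            /* -1 = all processes */
    int32_t  cpu;            /* -1 = any CPU */
    uint64_t period;         /* sample after this many events */
    uint32_t wakeup;         /* wake userspace after N samples */
    bool     inherit;
    bool     exclude_kernel;
    bool     exclude_user;
} ks_perf_options_sketch;

/* Hypothetical constructor: caller supplies the two required fields,
 * every other field gets the default listed in the table. */
ks_perf_options_sketch ks_perf_options_defaults(uint32_t perf_type,
                                                uint64_t perf_config)
{
    ks_perf_options_sketch o = {
        .perf_type      = perf_type,
        .perf_config    = perf_config,
        .pid            = -1,
        .cpu            = 0,
        .period         = 1000000,
        .wakeup         = 1,
        .inherit        = false,
        .exclude_kernel = false,
        .exclude_user   = false,
    };
    return o;
}
```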

**`pid` / `cpu` rules enforced at runtime:**

| `pid` | `cpu` | Meaning |
|---|---|---|
| ≥ 0 | ≥ 0 | Specific process on specific CPU |
| ≥ 0 | -1 | Specific process on any CPU |
| -1 | ≥ 0 | All processes on specific CPU (system-wide) |
| -1 | -1 | **Invalid** — rejected with error |
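
The four rows above reduce to a small runtime check. The following is a sketch under assumptions: `ks_check_pid_cpu` is a hypothetical name, and returning `-EINVAL` is one plausible error convention, not necessarily what the generated code does.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical validator mirroring the pid/cpu table:
 * returns 0 for a valid combination, -EINVAL otherwise. */
int ks_check_pid_cpu(int32_t pid, int32_t cpu)
{
    if (pid < -1 || cpu < -1)
        return -EINVAL;          /* values below -1 have no meaning */
    if (pid == -1 && cpu == -1)
        return -EINVAL;          /* "all processes on any CPU" is rejected */
    return 0;
}
```

The second example in this PR (`pid: 0, cpu: -1`) passes this check, as does the default (`pid: -1, cpu: 0`).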

**`perf_type` enum:**

| Value | Linux constant |
|---|---|
| `perf_type_hardware` | `PERF_TYPE_HARDWARE` |
| `perf_type_software` | `PERF_TYPE_SOFTWARE` |
| `perf_type_tracepoint` | `PERF_TYPE_TRACEPOINT` |
| `perf_type_hw_cache` | `PERF_TYPE_HW_CACHE` |
| `perf_type_raw` | `PERF_TYPE_RAW` |
| `perf_type_breakpoint` | `PERF_TYPE_BREAKPOINT` |

**Common `perf_config` constants:**

| Value | Intended `perf_type` | Linux constant |
|---|---|---|
| `cpu_cycles` | `perf_type_hardware` | `PERF_COUNT_HW_CPU_CYCLES` |
| `instructions` | `perf_type_hardware` | `PERF_COUNT_HW_INSTRUCTIONS` |
| `cache_references` | `perf_type_hardware` | `PERF_COUNT_HW_CACHE_REFERENCES` |
| `cache_misses` | `perf_type_hardware` | `PERF_COUNT_HW_CACHE_MISSES` |
| `branch_instructions` | `perf_type_hardware` | `PERF_COUNT_HW_BRANCH_INSTRUCTIONS` |
| `branch_misses` | `perf_type_hardware` | `PERF_COUNT_HW_BRANCH_MISSES` |
| `page_faults` | `perf_type_software` | `PERF_COUNT_SW_PAGE_FAULTS` |
| `context_switches` | `perf_type_software` | `PERF_COUNT_SW_CONTEXT_SWITCHES` |
| `cpu_migrations` | `perf_type_software` | `PERF_COUNT_SW_CPU_MIGRATIONS` |
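
The "Intended `perf_type`" column matters because a config value is only meaningful together with its type: in the Linux UAPI headers, `PERF_COUNT_HW_CACHE_REFERENCES` and `PERF_COUNT_SW_PAGE_FAULTS` are both the number 2. The sketch below shows the tagged lookup in C; `ks_event_name_sketch` is a hypothetical helper, not part of the generated code.

```c
#include <linux/perf_event.h>
#include <stdint.h>

/* Hypothetical lookup: the same numeric config resolves to a different
 * event depending on the perf_type tag it is paired with. */
const char *ks_event_name_sketch(uint32_t type, uint64_t config)
{
    if (type == PERF_TYPE_HARDWARE) {
        switch (config) {
        case PERF_COUNT_HW_CPU_CYCLES:          return "cpu_cycles";
        case PERF_COUNT_HW_INSTRUCTIONS:        return "instructions";
        case PERF_COUNT_HW_CACHE_REFERENCES:    return "cache_references";
        case PERF_COUNT_HW_CACHE_MISSES:        return "cache_misses";
        case PERF_COUNT_HW_BRANCH_INSTRUCTIONS: return "branch_instructions";
        case PERF_COUNT_HW_BRANCH_MISSES:       return "branch_misses";
        default:                                return "unknown";
        }
    }
    if (type == PERF_TYPE_SOFTWARE) {
        switch (config) {
        case PERF_COUNT_SW_PAGE_FAULTS:      return "page_faults";
        case PERF_COUNT_SW_CONTEXT_SWITCHES: return "context_switches";
        case PERF_COUNT_SW_CPU_MIGRATIONS:   return "cpu_migrations";
        default:                             return "unknown";
        }
    }
    return "unknown";
}
```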

For event families with a richer config space, such as `perf_type_hw_cache`, provide the encoded kernel `perf_config` value directly instead of relying on a flattened enum.
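
For `PERF_TYPE_HW_CACHE`, the `perf_event_open(2)` man page defines the config as cache id, operation id, and result id packed into one value. A minimal C sketch of that encoding follows; `ks_hw_cache_config` is a hypothetical helper name, while the `PERF_COUNT_HW_CACHE_*` constants come from `<linux/perf_event.h>`.

```c
#include <linux/perf_event.h>
#include <stdint.h>

/* Encode a PERF_TYPE_HW_CACHE config:
 *   cache id | (operation id << 8) | (result id << 16)
 * Hypothetical helper; the bit layout follows perf_event_open(2). */
uint64_t ks_hw_cache_config(uint64_t cache_id, uint64_t op_id,
                            uint64_t result_id)
{
    return cache_id | (op_id << 8) | (result_id << 16);
}
```

For example, L1 data cache read misses would be `ks_hw_cache_config(PERF_COUNT_HW_CACHE_L1D, PERF_COUNT_HW_CACHE_OP_READ, PERF_COUNT_HW_CACHE_RESULT_MISS)`, and the resulting number is what would be supplied as `perf_config` alongside `perf_type_hw_cache`.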

**Generated C helpers (emitted when `attach(prog, perf_options{...}, flags)` is used):**

| Function | Signature | Description |
|---|---|---|
| `ks_open_perf_event` | `int (ks_perf_options)` | Calls `perf_event_open(2)`, returns fd |
| `ks_attach_perf_event` | `int (int prog_fd, ks_perf_options, int flags)` | Full open-reset-attach-enable lifecycle |
| `ks_read_perf_count` | `int64_t (int perf_fd)` | Reads current 64-bit counter via `read()` |
| `ks_perf_read` | `int64_t (int prog_fd)` | High-level read via program handle |
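
As an illustration of the "reads current 64-bit counter via `read()`" row: with the default `read_format`, reading a perf fd yields a single 64-bit count. The sketch below is not the generated `ks_read_perf_count`; the name `ks_read_count_sketch` and the `-1` error convention are assumptions.

```c
#include <stdint.h>
#include <unistd.h>

/* Sketch: read one 64-bit value from a file descriptor.
 * Returns the value, or -1 on error or short read (assumed convention). */
int64_t ks_read_count_sketch(int fd)
{
    uint64_t value = 0;
    ssize_t n = read(fd, &value, sizeof(value));
    if (n != (ssize_t)sizeof(value))
        return -1;
    return (int64_t)value;
}
```

Because the function only relies on `read()`, it can be exercised without perf privileges by pointing it at any descriptor that delivers eight bytes, such as the read end of a pipe.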

**Attach sequence (compiler-generated, inside `ks_attach_perf_event`):**
1. `ks_attr.attr.disabled = 1` — open counter without starting it
2. `syscall(SYS_perf_event_open, ...)` → `perf_fd`
3. `ioctl(perf_fd, PERF_EVENT_IOC_RESET, 0)` — zero the counter
4. `bpf_program__attach_perf_event(prog, perf_fd)` — link BPF program
5. `ioctl(perf_fd, PERF_EVENT_IOC_ENABLE, 0)` — **start counting**
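
The five steps above can be sketched in C roughly as follows. This is not the generated `ks_attach_perf_event`: the function names are hypothetical, the libbpf link step is only marked by a comment so the block has no libbpf dependency, and error handling is reduced to returning `-1`.

```c
#define _GNU_SOURCE
#include <linux/perf_event.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Step 1: describe a sampling counter that is opened in the disabled state. */
void ks_fill_attr_sketch(struct perf_event_attr *attr, uint32_t type,
                         uint64_t config, uint64_t period, uint32_t wakeup)
{
    memset(attr, 0, sizeof(*attr));
    attr->size = sizeof(*attr);
    attr->type = type;
    attr->config = config;
    attr->sample_period = period;
    attr->wakeup_events = wakeup;
    attr->disabled = 1;
}

/* Steps 2, 3 and 5. Returns the perf fd, or -1 on failure. */
int ks_open_reset_enable_sketch(struct perf_event_attr *attr, pid_t pid, int cpu)
{
    /* Step 2: open the counter (group_fd = -1, close-on-exec). */
    int fd = (int)syscall(SYS_perf_event_open, attr, pid, cpu, -1,
                          PERF_FLAG_FD_CLOEXEC);
    if (fd < 0)
        return -1;

    /* Step 3: zero the counter. */
    if (ioctl(fd, PERF_EVENT_IOC_RESET, 0) < 0) {
        close(fd);
        return -1;
    }

    /* Step 4 would go here: bpf_program__attach_perf_event(prog, fd)
     * from libbpf, omitted to keep this sketch dependency-free. */

    /* Step 5: start counting only after the program is linked. */
    if (ioctl(fd, PERF_EVENT_IOC_ENABLE, 0) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

Opening the counter disabled and enabling it last means no samples are taken before the BPF program is attached.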

**Detach sequence (compiler-generated):**
1. `ioctl(perf_fd, PERF_EVENT_IOC_DISABLE, 0)` — stop counting
2. `bpf_link__destroy(link)` — unlink BPF program
3. `close(perf_fd)` — release the kernel perf event
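
A matching C sketch of the detach order, under the same assumptions as any illustration here: `ks_detach_perf_sketch` is a hypothetical name, the libbpf call is shown only as a comment, and the return convention (`0` on success, `-1` if any step failed) is assumed.

```c
#include <linux/perf_event.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Sketch of the detach order: disable, unlink, close.
 * Every step is attempted even if an earlier one fails, so the fd
 * is not leaked. Returns 0 on success, -1 if any step failed. */
int ks_detach_perf_sketch(int perf_fd)
{
    int rc = 0;

    /* Step 1: stop counting. */
    if (ioctl(perf_fd, PERF_EVENT_IOC_DISABLE, 0) < 0)
        rc = -1;

    /* Step 2 would go here: bpf_link__destroy(link) from libbpf. */

    /* Step 3: release the kernel perf event. */
    if (close(perf_fd) < 0)
        rc = -1;

    return rc;
}
```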

**Compiler implementation:**
- Detects `attach(prog, perf_options_value, flags)` (three-argument form with `perf_options` second arg) and routes to `ks_attach_perf_event`
- Exposes omitted `perf_options` fields as language-level defaults (partial struct literal)
- Validates `pid ≥ -1`, `cpu ≥ -1`, and rejects `pid == -1 && cpu == -1` at runtime
- Passes `PERF_FLAG_FD_CLOEXEC` to `perf_event_open(2)` so the perf fd is closed on `exec` rather than leaked to child programs
- BPF program section is `SEC("perf_event")`
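
The first bullet describes routing by the type of the second argument. As a loose analogy only (the KernelScript compiler does this in its own type checker, not in C), C11 `_Generic` can select a function from an argument's static type. All names below are hypothetical stubs that just report which route was taken.

```c
#include <stdint.h>

/* Stand-in for the perf options type; layout is illustrative only. */
typedef struct {
    uint32_t perf_type;
    uint64_t perf_config;
} ks_perf_options_sketch;

enum { KS_ROUTE_STANDARD = 1, KS_ROUTE_PERF = 2 };

/* Stub for the standard (string target) attach path. */
int ks_route_standard(int prog_fd, const char *target, uint32_t flags)
{
    (void)prog_fd; (void)target; (void)flags;
    return KS_ROUTE_STANDARD;
}

/* Stub for the perf_options attach path. */
int ks_route_perf(int prog_fd, ks_perf_options_sketch opts, uint32_t flags)
{
    (void)prog_fd; (void)opts; (void)flags;
    return KS_ROUTE_PERF;
}

/* Pick the route at compile time from the second argument's type. */
#define ks_attach_sketch(prog_fd, second, flags)              \
    _Generic((second),                                        \
        char *: ks_route_standard,                            \
        const char *: ks_route_standard,                      \
        ks_perf_options_sketch: ks_route_perf)((prog_fd), (second), (flags))
```

Both call shapes keep three arguments: `ks_attach_sketch(fd, "eth0", 0)` resolves to the standard stub, while passing a `ks_perf_options_sketch` variable resolves to the perf stub.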

**Project Initialization:**
```bash
# Initialize a perf_event project
kernelscript init perf_event my_perf_monitor
```

### 3.2 Named Configuration Blocks
```kernelscript
// Named configuration blocks - globally accessible
25 changes: 25 additions & 0 deletions examples/perf_cache_miss.ks
@@ -0,0 +1,25 @@
// perf_cache_miss.ks
// Demonstrates @perf_event program type in KernelScript.
// The eBPF program runs on each hardware cache-miss sample (once per `period` events).
// The userspace side opens the perf event and attaches the BPF program.

@perf_event
fn on_cache_miss(ctx: *bpf_perf_event_data) -> i32 {
return 0
}

fn main() -> i32 {
var prog = load(on_cache_miss)

// Only perf_type + perf_config are required. This call also overrides period and
// inherit; the remaining fields keep their defaults: pid=-1 (all procs), cpu=0,
// wakeup=1, exclude_kernel/exclude_user=false.
attach(prog, perf_options { perf_type: perf_type_hardware, perf_config: cache_misses, period: 10000000, inherit: true }, 0)
print("Cache-miss perf_event demo attached")
var count = perf_read(prog)
print("Cache-miss count: %lld", count)

detach(prog)
print("Cache-miss perf_event demo detached")
return 0
}
32 changes: 32 additions & 0 deletions examples/perf_page_fault.ks
@@ -0,0 +1,32 @@
// perf_page_fault.ks
// Demonstrates @perf_event program type in KernelScript.
// The eBPF program runs on every software page-fault event.
// The userspace side opens the perf event and attaches the BPF program.

@perf_event
fn on_page_fault(ctx: *bpf_perf_event_data) -> i32 {
return 0
}

fn main() -> i32 {
var prog = load(on_page_fault)

// pid: 0 = current process, cpu: -1 = any CPU (standard per-process monitoring).
// page_faults (PERF_COUNT_SW_PAGE_FAULTS) is a convenient software event for a demo:
// first touches of newly mapped heap/stack pages cause minor page faults, with no scheduler dependency.
attach(prog, perf_options { perf_type: perf_type_software, perf_config: page_faults, pid: 0, cpu: -1, period: 1 }, 0)
print("Page-fault perf_event demo attached")

// Busy loop to generate activity while the counter is attached. A loop over a
// local variable touches little memory, so the reported count may be small.
var x: i64 = 0
for (i in 0..10000000) {
x = x + 1
}

var count = perf_read(prog)
print("Page-fault count: %lld", count)

detach(prog)
print("Page-fault perf_event demo detached")
return 0
}
3 changes: 2 additions & 1 deletion src/ast.ml
@@ -40,7 +40,7 @@ type probe_type =

(** Program types supported by KernelScript *)
type program_type =
| Xdp | Tc | Probe of probe_type | Tracepoint | StructOps
| Xdp | Tc | Probe of probe_type | Tracepoint | StructOps | PerfEvent

(** Map types for eBPF maps *)
type map_type =
@@ -658,6 +658,7 @@ let string_of_program_type = function
| Probe Kprobe -> "kprobe"
| Tracepoint -> "tracepoint"
| StructOps -> "struct_ops"
| PerfEvent -> "perf_event"

let string_of_map_type = function
| Hash -> "hash"