Skip to content

Add perfetto tracing to host tooling#2502

Open
jmacnak wants to merge 1 commit into
google:mainfrom
jmacnak:perfetto
Open

Add perfetto tracing to host tooling#2502
jmacnak wants to merge 1 commit into
google:mainfrom
jmacnak:perfetto

Conversation

@jmacnak
Copy link
Copy Markdown
Member

@jmacnak jmacnak commented May 1, 2026

@jmacnak jmacnak force-pushed the perfetto branch 7 times, most recently from e958a29 to 8330f27 Compare May 6, 2026 17:15
@jmacnak jmacnak added the kokoro:force-run Trigger a presubmit build unconditionally. label May 6, 2026
@GoogleCuttlefishTesterBot GoogleCuttlefishTesterBot removed the kokoro:force-run Trigger a presubmit build unconditionally. label May 6, 2026
@jmacnak jmacnak force-pushed the perfetto branch 4 times, most recently from 19542bf to 6e9eaf0 Compare May 7, 2026 18:24
@jmacnak jmacnak force-pushed the perfetto branch 5 times, most recently from 1d90f66 to 1727b13 Compare May 15, 2026 20:30
@jmacnak jmacnak requested a review from Databean May 18, 2026 14:54
@jmacnak
Copy link
Copy Markdown
Member Author

jmacnak commented May 18, 2026

This needs a couple upstream changes to Perfetto but generally ready for a first look.

Copy link
Copy Markdown
Member

@Databean Databean left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not finished reviewing, but I'll leave the comments I have so far.

Comment thread base/cvd/cuttlefish/host/commands/run_cvd/server_loop_impl.cpp
Comment thread base/cvd/cuttlefish/host/commands/run_cvd/main.cc
Comment thread base/cvd/cuttlefish/host/libs/tracing/tracing.h Outdated
Comment thread base/cvd/cuttlefish/host/libs/tracing/tracing.cpp Outdated
Comment thread base/cvd/cuttlefish/host/libs/tracing/tracing.h Outdated
Comment thread base/cvd/cuttlefish/host/libs/tracing/tracing.cpp Outdated
Comment on lines +280 to +284
worker_condition_variable_.wait(lock, [this] { return !trace_events_.empty() || worker_shutting_down_; });

if (worker_shutting_down_ && trace_events_.empty()) {
break;
}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Confirming my understanding: once worker_shutting_down_ is true, this will never wait on worker_condition_variable_ again, but will still make sure all events are handled before going into shutdown?

Comment thread base/cvd/cuttlefish/host/libs/tracing/tracing.cpp Outdated
Comment thread base/cvd/cuttlefish/host/libs/tracing/tracing.cpp Outdated

const bool enabled_ = false;

std::mutex mutex_;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using std::recursive_mutex would avoid the extra Locked methods at the cost of some runtime performance.

LalitMaganti pushed a commit to google/perfetto that referenced this pull request May 19, 2026
Bug: b/378560215

This would be useful for
google/android-cuttlefish#2502 where we could
eliminate most of
https://github.com/google/android-cuttlefish/pull/2502/changes#diff-d28a094890e585d0e9bec027b845e31c3c15ebd9c2262dd814dd6e7537ab2af8
when using the c library via `dlopen()` and we do not want to directly
depend on the library itself.
@jmacnak jmacnak force-pushed the perfetto branch 4 times, most recently from d806762 to 7be473c Compare May 19, 2026 20:15
Example cvd create trace:
https://ui.perfetto.dev/#!/?s=315a987bda469405966eb46797201e11b12d247c

Example cvd fetch trace:
https://ui.perfetto.dev/#!/?s=8e6e182ffab8e92f0f0a690b205c73313ec71f17

Perfetto uses a shared memory buffer between producer processes and the
traced consumer daemon. `fork()` is problematic for this as the parent
and child processes would end up clobbering each other when the shared
memory buffer is inherited by the child. To prevent this, we introduce
a perfetto wrapper that uses `pthread_atfork()` to register handlers that
shutdown and restart tracing before and after forking.

Perfetto also depends on global static data and explicitly calls out that
it can not be restarted unless this data is cleared [1]. With this, the
Perfetto wrapper utilizes `dlopen()` and `dlclose()` during init and
shutdown.

Perfetto also uses thread local data which requires extra care around
`dlclose()` to prevent some thread local destructors from running after
`dlclose()` has already unloaded the Perfetto library. To work around
this, the Perfetto wrapper isolates all Perfetto calls into a single
worker thread.

[1] https://github.com/google/perfetto/blob/ab21398d658ce3bccc192b226d2731add6c02890/include/perfetto/tracing/tracing.h#L252

Bug: b/378560215
Copy link
Copy Markdown
Member

@Databean Databean left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, still not finished: I haven't fully reviewed TracingState, but have looked at everything else.

}

Result<bool> DiskBuilder::BuildCompositeDiskIfNecessary() {
CF_TRACEF("BuildCompositeDisk: %s", composite_disk_path_.c_str());
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this and the other CF_TRACEF still need to use .c_str() if absl::StrFormat is producing the string?

std::string downloaded;

{
CF_TRACEF("Download: %s", artifact_name_.c_str());
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Does this and the other CF_TRACEF still need .c_str()?

setenv(kCuttlefishInstanceEnvVarName, instance.id().c_str(),
/* overwrite */ 1);

CF_TRACEF("Running run_cvd: %s", instance.instance_name().c_str());
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: is the .c_str() still necessary?

"Dependency issue detected, not performing any setup.");
// TODO(b/189153501): This can potentially be parallelized.
for (auto& feature : ordered_features) {
CF_TRACEF("FeatureSetup %s", feature->Name().c_str());
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: is the .c_str() still necessary?

// Emit trace end events for all of the active traces. If we are forking,
// the trace begin events will be re-emitted after the fork.
const uint64_t shutdown_timestamp = GetTimestamp();
for (auto& [_, thread_state] : thread_states_) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: _ is not a placeholder, this declares a variable named _.

struct LibraryCloser {
void operator()(void* handle) const;
};
std::unique_ptr<void, LibraryCloser> handle;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: handle seems like private state to PerfettoFunctions, I don't see it accessed outside of PerfettoFunctions::Load.

Comment on lines +36 to +37
absl::MutexLock lock(mutex_);
trace_events_.push_back(event);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the worker thread fails to start, this will accumulate trace events without ever draining them. Maybe this can reuse the shutting_down_ variable: if WorkerThreadLoop exits early it can set shutting_down_, and EnqueueTraceEvent can drop events if shutting_down_ is true.

Comment on lines +103 to +106
if (!LoadPerfetto()) {
LOG(WARNING) << "Failed to initialize tracing: failed to load Perfetto.";
return;
}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I can tell, this code upholds the property that perfetto_ is not used if LoadPerfetto() fails. However, this is only only implicit: perfetto_ can be in uninitialized or initialized states, and the initialization state is never checked.

What do you think of accepting the PerfettoFunctions instance as a constructor argument? If you want to get really fancy, it would also be possible to write unit tests with a mock PerfettoFunctions instance by filling in the function pointers with mocks / stubs.

PFN_PerfettoTePublishCategories PerfettoTePublishCategories = nullptr;
uint64_t* perfetto_te_process_track_uuid_ptr = nullptr;

static Result<PerfettoFunctions> Load();
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Related to the other comment about initialization in PerfettoWorkerThread, part of the intention of static factory functions (https://abseil.io/tips/42) is that the static factory function is the only way of producing an instance. Can the default constructor be made private?

if (res != 0) {
LOG(ERROR) << "Failed to register pthread fork callbacks.";
}
std::atexit(TracingAtExit);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Abseil has its own concept of "run on exit" handlers:

https://github.com/abseil/abseil-cpp/blob/8fb4507d10a24a1204e98f77ad19c27c291e13e9/absl/log/log.h#L28-L31

Oddly this seems to be internal-only: it references process_state.h which is not public. Despite this, if I understand correclty abseil still uses abort (ignoring atexit handlers) internally to exit the process in LOG(FATAL) and CHECK cases:

https://github.com/abseil/abseil-cpp/blob/8fb4507d10a24a1204e98f77ad19c27c291e13e9/absl/log/internal/conditions.h#L241

However, this contradicts the abseil documentation which claims that LOG(QFATAL) and QCHECK are the only version that don't run exit handlers. As far as I can tell from the source though, LOG(FATAL) calls abort and LOG(QFATAL) calls _exit, neither of which is much help.

All that to say, not sure there's anything we can do about it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants