Speaker Labeling & Call for Community Contribution Workflow #6

@MrH2T

After several days of intensive development, I have completed a preliminary pass of speaker labeling for the game strings. The current scripts and raw data are available in the output/results/ directory (see all_chapters_speaker_results.csv and all_chapters_dump_aligned_results.csv).

Currently, the automated labeling accuracy is roughly 50%–60%. To refine this dataset further, I would like us to set up a community-driven maintenance workflow. This would allow contributors to verify existing labels, correct misidentifications, or submit their own labeled datasets to improve overall labeling accuracy.

https://drive.google.com/file/d/1WZeFz4ouzA3zf1D3Y4TKI8ni1AziPaKE/view?usp=drive_link

(I don't know why I cannot attach files)
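
As a rough sketch of how the verification workflow proposed above could work, the snippet below overlays a contributor's corrections on the automated labels and reports how often the automated label was already right on the verified strings. The column names (`string_id`, `text`, `speaker`, `verified`), the function names, and the sample rows are all assumptions made for illustration; they are not the actual schema of all_chapters_speaker_results.csv.

```python
import csv
import io


def apply_corrections(auto_rows, corrections):
    """Overlay community-verified speakers on top of the automated labels."""
    verified = {c["string_id"]: c["speaker"] for c in corrections}
    merged = []
    for row in auto_rows:
        fixed = dict(row)
        if row["string_id"] in verified:
            # A contributor has checked this string: trust their label.
            fixed["speaker"] = verified[row["string_id"]]
            fixed["verified"] = "yes"
        else:
            fixed["verified"] = "no"
        merged.append(fixed)
    return merged


def accuracy_on_verified(auto_rows, corrections):
    """Fraction of verified strings whose automated label was already correct."""
    verified = {c["string_id"]: c["speaker"] for c in corrections}
    checked = [r for r in auto_rows if r["string_id"] in verified]
    if not checked:
        return None  # nothing verified yet, so accuracy is undefined
    hits = sum(1 for r in checked if r["speaker"] == verified[r["string_id"]])
    return hits / len(checked)


# Made-up sample data; the real CSV layout may differ.
AUTO_CSV = """string_id,text,speaker
1,Hello there,Alice
2,Who are you?,Alice
3,Let's go,Bob
"""

# Hypothetical contributor file: one verified speaker per checked string.
CORRECTIONS_CSV = """string_id,speaker
1,Alice
2,Bob
"""

auto_rows = list(csv.DictReader(io.StringIO(AUTO_CSV)))
corrections = list(csv.DictReader(io.StringIO(CORRECTIONS_CSV)))

merged = apply_corrections(auto_rows, corrections)
accuracy = accuracy_on_verified(auto_rows, corrections)

for row in merged:
    print(row["string_id"], row["speaker"], row["verified"])
print("accuracy on verified strings:", accuracy)
```

Keeping corrections in a separate file, rather than editing the automated output in place, would let the labeling scripts be re-run without losing contributors' work, and the verified subset could double as a test set for measuring accuracy.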
