About the Role:
We are looking for hands-on Ubuntu desktop users to help refine multimodal datasets for AI training.
This role involves replaying real computer tasks inside an Ubuntu environment, capturing accurate screenshots, and producing PyAutoGUI-style action annotations step-by-step.
This is not a coding or engineering role.
It requires excellent computer usage skills, patience, and strong attention to detail.
Responsibilities:
- Navigate and explore the Ubuntu desktop environment to complete guided tasks.
- Follow clear, step-by-step instructions provided by the team.
- Observe how the system behaves when tasks are performed.
- Record actions, results, and observations accurately in annotation tools.
- Flag unexpected behavior or issues using predefined guidelines.
- Maintain consistency and quality across repetitive tasks.
Requirements:
1. Reconstruct GUI Task Trajectories
- For each provided task:
  - Interpret the task instruction and text-only trajectory (Observation / Thought / Action).
  - Recreate the workflow inside an Ubuntu desktop environment.
  - At each step, capture a clear screenshot of the current screen state.
  - Write the exact PyAutoGUI action JSON for the action taken next.
  - Update the thought and observation so that they match the screenshot you captured.
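As a rough illustration of what one annotated step might look like, here is a minimal Python sketch. The field names (step, screenshot, observation, thought, action), the file name, and the coordinates are assumptions made up for this example; the actual schema will come from the project guidelines.

```python
import json

# Hypothetical step record. The field names are illustrative assumptions,
# not the project's actual annotation schema.
def make_step(step_index, screenshot, observation, thought, action):
    return {
        "step": step_index,
        "screenshot": screenshot,
        "observation": observation,
        "thought": thought,
        "action": action,
    }

step = make_step(
    step_index=1,
    screenshot="step_001.png",
    observation="The Files window is open and shows the Home folder.",
    thought="The task needs the Documents folder, so I should open it.",
    # Mirrors a call like pyautogui.click(x=412, y=265, button="left").
    # The coordinates are made up for this example.
    action={"type": "click", "x": 412, "y": 265, "button": "left"},
)

# Serialize the record as the JSON that would accompany the screenshot.
step_json = json.dumps(step, indent=2)
print(step_json)
```

The observation describes what the screenshot shows, the thought explains why the next action follows from it, and the action records exactly what is done next.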
2. Create New Tasks
- Perform realistic Ubuntu desktop tasks based on the provided prompts (e.g., in VLC, LibreOffice, GIMP, or a browser), and annotate each one with screenshots and action JSON.
3. Maintain Annotation Quality
- Ensure screenshots match the correct UI state for each step.
- Avoid impossible/inaccurate actions (wrong coordinates, inactive buttons, incorrect app state).
- Follow consistent file naming and directory structure.
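Some of these checks can be partly automated. The sketch below is only an illustration: the step_001.png naming pattern, the action fields, and the 1920x1080 screen size are assumptions, not the project's actual conventions.

```python
import re

# Assumed naming convention for this example: step_001.png, step_002.png, ...
SCREENSHOT_NAME = re.compile(r"^step_\d{3}\.png$")

def check_step(action, screenshot_name, screen_width, screen_height):
    """Return a list of problems found in one annotated step (empty if none)."""
    problems = []
    if not SCREENSHOT_NAME.match(screenshot_name):
        problems.append(f"unexpected screenshot name: {screenshot_name}")
    # Only actions that carry coordinates (e.g. clicks) are bounds-checked.
    if "x" in action and "y" in action:
        x, y = action["x"], action["y"]
        if not (0 <= x < screen_width and 0 <= y < screen_height):
            problems.append(
                f"coordinates ({x}, {y}) fall outside the "
                f"{screen_width}x{screen_height} screen"
            )
    return problems

good = check_step({"type": "click", "x": 412, "y": 265}, "step_001.png", 1920, 1080)
bad = check_step({"type": "click", "x": 2500, "y": 265}, "Screenshot 3.png", 1920, 1080)
print(good)
print(bad)
```

A script like this only catches mechanical mistakes. Whether a button was active or the application was in the right state still has to be confirmed by looking at the screenshot.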
Must Have:
- Linux / Ubuntu Usage (Mandatory)
  - You must be comfortable using Ubuntu Desktop as your primary or secondary OS, including:
    - File Manager
    - Desktop navigation, window switching, and settings
    - Understanding of basic file paths and user directories
- Open-source Application Experience (Required)
  - You should know how to perform common tasks in:
    - Chrome
    - VLC Media Player
    - LibreOffice Writer / Calc / Impress
    - GIMP
    - Thunderbird
    - VS Code
- Annotation Accuracy
  - Able to match screenshots with the correct next-step actions.
  - Able to identify UI elements and click at the correct coordinates.
  - Comfortable writing simple JSON.
  - Must be comfortable writing accurate screen-state descriptions (observations) and generating model-style thought processes that logically lead to the next action.
- Patient, detail-oriented, consistent.
- Able to follow structured guidelines.
- Not easily bored by step-by-step GUI work.
What This Role Is Not:
- Not a programming job.
- Not a research job.
- Not a creative engineering role.
- Not suitable for people who want only coding or high-level design work.
- Not suitable for beginners who only use smartphones or who use Windows casually.
- This is a precision-driven, GUI-based desktop annotation job.