Turing.com review by full-stack developer from Nepal

"I earn more than any of my previous jobs"

- Roshan, Fullstack Developer from Nepal


Roshan, a full-stack developer, recently shared his Turing.com review with the Turing Newsdesk. He revealed that the organization gave him a steady and secure high-paying job right in the comfort of his home. He also added that Turing enabled him to interact with engineers across the globe and grow in the process.

Life before Turing jobs

A full-stack developer by profession, Roshan is based in the beautiful city of Kathmandu, Nepal. With high personal ambitions and expectations, Roshan felt constrained by the limited opportunities available in Nepal.

"I worked as a freelance developer, and the jobs were not steady at all," he recalls.

How did he learn about Turing's US software jobs?

Set on finding good job prospects, Roshan scanned the web searching for remote-work opportunities. He came across Turing on DesignRush.

"I applied immediately. There are some tests and a take-home challenge that you can start in your own time. It took a while for me to complete. But after I finished, I got a call from Turing, and soon after that, I was hired," he says.

How has his journey with Turing.com been so far?

Top developers like Roshan need a bigger stage to truly prosper.

"As an engineer, I get to learn so much working with smart professionals from all over the world. Financially speaking, I get more than any of my previous jobs coupled with the security of steady long-term employment," he mentions.

What's his take on Turing developers?

"I've definitely grown much faster since I joined this organization. Turing presented me with opportunities that are very hard to find where I live," shares the Kathmandu-based developer.

What's the final verdict?

The full-stack enthusiast did not hesitate to share his excitement. "I am working full-time with a Silicon Valley company without having to relocate!" he exclaims.

If you want to work with the best US-based companies from the comfort of your home, join Turing's boundaryless team today!

Interested in U.S. software jobs?

Apply to Turing today.


Explore remote developer jobs

AI Quality Analyst - Portuguese (Portugal)

About Turing:
Based in San Francisco, California, Turing is the world’s leading research accelerator for frontier AI labs and a trusted partner for global enterprises deploying advanced AI systems. Turing supports customers in two ways: first, by accelerating frontier research with high-quality data, advanced training pipelines, plus top AI researchers who specialize in coding, reasoning, STEM, multilinguality, multimodality, and agents; and second, by applying that expertise to help enterprises transform AI from proof of concept into proprietary intelligence with systems that perform reliably, deliver measurable impact, and drive lasting results on the P&L.

Role Overview:

As an AI Quality Analyst, you will evaluate a new personalization feature for Gemini. You will assess how well the model uses information from your past Gemini conversations, Gmail, Google Search, and YouTube activity to make responses more relevant and helpful. This role requires a unique blend of creativity and analytical rigor. You will actively design prompts from the perspective of your own personal experiences. You will then use your analytical skills to assess the quality of the model's personalized responses, evaluating dimensions like Grounding, Integration, and Helpfulness.


Key Qualifications

  • Portuguese Proficiency: Ability to read and write in Portuguese with a high degree of competence, as Portuguese is the focus language for this project.
  • Personal Account Usage: Willingness to use your primary personal Google account (not a testing account) and enable personal data sources for a genuine assessment.
  • Schedule Flexibility: Full-time availability in your local time zone is required. We are staffing a global, 24-hour operations team.
  • Exceptional Analytical Thinking: Demonstrated ability to evaluate nuanced and ambiguous AI responses, specifically assessing personalization quality.
  • Creative Prompt Engineering: Experience in designing creative, multi-turn starting prompts based on personal context to thoroughly test the model's capabilities.
  • Strong Evaluation Acumen: Understanding of personalization concepts, including the ability to identify incorrect personalization, poor inferences, and forced connections.
  • Meticulous Attention to Detail: The ability to review Side-by-Side (SxS) model responses and spot subtle differences in naturalness and overnarrating.
  • Excellent Written Communication: Superior ability to write clear, concise, and structured rationales for model rankings, explicitly referencing specific turn numbers.
  • Feedback: Ability to provide constructive feedback and detailed annotations.
  • Communication: Excellent communication and collaboration skills.
  • Independence: Self-motivated and able to work independently in a remote setting.
  • Technical Setup: Desktop/Laptop set up with a good internet connection.


Description:

In this role, you will be part of a dynamic team focused on evaluating the quality of personalized AI interactions. Your day-to-day work will involve:

  • Designing and executing multi-turn conversational prompts (typically 1-5 turns) that require the AI to utilize your personal information and experiences.
  • Evaluating model responses based on your intent from the starting prompt, checking if the personalization was appropriately applied.
  • Analyzing responses for Grounding issues, ensuring claims about you are supported by evidence and not flawed inferences or hallucinations.
  • Assessing Integration quality to ensure personal data is woven naturally into the response without robotic "overnarrating".
  • Rigorously evaluating and stack-ranking two model responses side-by-side (SxS) to determine which is overall more helpful, easy to use, and enjoyable.
  • Writing clear, defensible rationales for your comparisons, explicitly referencing where issues or positive aspects occurred in the conversation.
  • Extracting and verifying "Debug Info" from the model to confirm that chat summaries and data sources were properly utilized.
  • Maintaining strict data hygiene by deleting evaluation conversations to prevent them from polluting your future chat history.


Education & Experience

  • BS/BA degree or equivalent experience in a relevant field (e.g., Policy, Law, Ethics, Linguistics, Journalism, Computer Science, or a related analytical field).
  • Experience in data annotation, AI quality evaluation, content moderation, or a related role is strongly preferred.

Offer Details:

  • Commitments Required: at least 4 hours per day and up to 40 hours per week, with 4 hours of overlap with PST.
  • Engagement type: Contractor
  • Engagement Length: 3 months
  • Our offered rate for this project is $15 per hour.

Evaluation Process:

  • Shortlisted candidates will be sent a Job Interest Form.
  • After the profile review, an assessment will be shared, which must be completed within 24 hours.
  • Based on the assessment outcomes, shortlisted candidates will be contacted to discuss the pre‑onboarding requirements.
Software
10K+ employees
Domain-Specific Languages
AI Engineer

About Turing


Based in San Francisco, California, Turing is the world’s leading research accelerator for frontier AI labs and a trusted partner for global enterprises deploying advanced AI systems. Turing supports customers in two ways: first, by accelerating frontier research with high-quality data, advanced training pipelines, plus top AI researchers who specialize in coding, reasoning, STEM, multilinguality, multimodality, and agents; and second, by applying that expertise to help enterprises transform AI from proof of concept into proprietary intelligence with systems that perform reliably, deliver measurable impact, and drive lasting results on the P&L.


Role Overview


We are looking for an AI/ML Engineer specializing in LLM post-training and reinforcement learning workflows. The role focuses on fine-tuning open-weight models, building reward systems, and improving model performance through scalable training, evaluation, and data curation.


What does day-to-day life look like?

  • Design and execute fine-tuning pipelines for open-weight models (Qwen, Llama, Mistral families) using SFT → DPO → GRPO progressions on tool-use and agentic data.
  • Implement and tune LoRA / QLoRA adapters for parameter-efficient fine-tuning; understand when full fine-tuning vs PEFT is the right call.
  • Build reward functions and verifiers for RL training, including programmatic verifiers, LLM-as-judge rubrics, and state-transition checks against gym environments.
  • Generate, curate, and filter RL tool-use training data: golden trajectories, preference pairs, on-policy rollouts, and rejection-sampled completions.
  • Run distributed training on multi-GPU setups; manage inference at scale with vLLM (including extended-context configurations via YaRN / RoPE scaling).
  • Diagnose failure modes: reward hacking, distribution collapse, KL blow-up, tool-selection errors vs state-transition errors, format drift.
  • Define and track evaluation metrics (pass@k, pass^k, trajectory-level scoring, rubric-based vs binary scoring) and own model-quality reporting against benchmarks.
  • Partner with annotation, eval, and client teams to translate data-quality signals into training improvements.

Requirements

  • 3+ years of hands-on ML engineering experience, with at least 1 year specifically on LLM post-training.
  • Demonstrated production or research experience with at least three of: SFT, LoRA/QLoRA, DPO, PPO, GRPO, RLHF.
  • Strong PyTorch fundamentals; working familiarity with Hugging Face TRL, Accelerate, DeepSpeed or FSDP, and vLLM.
  • Experience designing reward signals or verifiers for RL training, not just running training scripts.
  • Solid grasp of tokenization, attention, chat templates, tool-calling formats (OpenAI/Anthropic-style), and common failure modes in agent training.
  • Comfort with Python, distributed training, GPU profiling, and reading research papers and turning them into working code.

Strongly Preferred:


  • Experience training tool-use or agentic models (function calling, multi-step tool selection, planner-executor patterns).
  • Experience with synthetic data generation pipelines and rejection sampling.
  • Familiarity with MCP, LangChain/LangGraph, or similar agent frameworks.
  • Exposure to evals at scale: building harnesses, designing rubrics, dealing with judge variance and reward hacking.
  • Cloud/infra: RunPod, AWS, GCP; container workflows; long-context inference tuning.


Perks of Freelancing With Turing

  • Work in a fully remote environment.
  • Opportunity to work on cutting-edge AI projects with leading LLM companies.

Offer Details

  • Commitments Required: 40 hours per week with 4 hours of overlap with PST.
  • Engagement Type: Contractor assignment (no medical/paid leave)
  • Duration of contract: 2 months (expected start date is next week)
  • Location: India, Pakistan, Bangladesh, Brazil

Evaluation Process

  • 2 rounds of technical interviews (90 mins)
1-10 employees
Python, Machine Learning

Apply for the best jobs


Work full-time at top U.S. companies

Create your profile, pass Turing Tests, and get job offers in as little as 2 weeks.