The Alexa Customer Journeys team is looking for a talented Language Data Scientist to solve emerging and novel data and linguistic challenges in Alexa.
From human evaluations to Responsible AI safeguards to Retrieval-Augmented Generation and beyond, we are building exciting Alexa experiences leveraging Generative AI.
As a Language Data Scientist, you will start by diving deep into a couple of critical projects across Alexa experiences.
You will collaborate with fellow language data scientists, program managers, and stakeholders in science, engineering, and product teams to understand the role data plays in developing data sets and exemplars that meet customer needs.
You will analyze and automate processes for collecting and annotating LLM inputs and outputs to assess data quality and measurement.
You will apply state-of-the-art Generative AI techniques to analyze how well our data represents human language and run experiments to gauge downstream interactions.
You will work collaboratively with other language data scientists and scientists to design and implement principled strategies for data optimization.
BASIC QUALIFICATIONS
- 3+ years of data querying languages (e.g. SQL), scripting languages (e.g. Python) or statistical / mathematical software (e.g. R, SAS, Matlab) experience
- PhD in Computational Linguistics, Linguistics with a computational component, or an equivalent field; alternatively, MA / MS with 3+ years of experience, or Bachelor's with 5+ years of experience
- Excellent knowledge of semantics, pragmatics, conversation analysis, and / or discourse analysis
- Experience designing and executing data collection projects, including guidelines, label set, and annotation workflow development
- Experience developing and evaluating data annotation and data quality metrics
- Experience designing and executing psychology / linguistic / cognitive science surveys or experiments with human participants
PREFERRED QUALIFICATIONS
- Voice assistants experience
- Working with LLMs
- Experience with synthetic dataset creation
- Experience with surveying
- Experience working with a diverse array of languages or language varieties