Research Fellow, User Modeling & Personal Memory
koodos
USD 5k / month
Posted on May 17, 2026
Remote, NYC, or hybrid | Summer 2026 (10–12 weeks, flexible)
Why join koodos labs?
koodos labs is a consumer AI research and product company dedicated to ensuring that the internet knows you and can do things for you, on your terms. We are the company behind Shelf, the leading personal context platform used by millions of people to track and understand what they consume. Read about the co-founders and their story here. We've raised three (unannounced) rounds from the world's leading investors, and our Board includes Pinterest's co-founder, Facebook's first head of monetization, and a leading GP.
You should join koodos labs for the people and our mission. We aspire to build the best team of the 2020s — a place where it's good to be from. If you join us, we promise to be the best place to grow your career, with the best people you've ever worked with.
Read more about working at koodos labs here.
What we're building
Shelf is used by millions of people to track what they're watching, reading, listening to & more, and to keep up with what others are into. Shelf connects to your favorite platforms, shares back insights from your behavior, and will soon let you share that context with services to enable personalized intelligence, on your terms.
The opportunity
AI models get evaluated on everything except the thing that matters most for truly personalized AI: do they actually understand you?
There is no standard benchmark for how well a system can build a coherent model of a person from the full stack of user signals. Nobody has formally characterized how those signals are structured, so nobody can measure progress against that structure. And because the modeling happens inside closed platforms, the people being represented have no visibility into how they are represented.
No agreed taxonomy. No methodology for how signals compound into a holistic representation of you.
You'll work directly with our CTO, with meaningful exposure to the founding team and the freedom to shape the direction.
What makes this role rare: you get to do rigorous benchmark research and ground it in real-world data. Shelf has millions of users with cross-category consumption data — the substrate to test whether your benchmark actually carves reality at the joints. Most academic research never gets to do this part.
Your primary deliverable is a publishable, open-source benchmark for user understanding.
And then you make it real — building the infrastructure to generate profiles for Shelf's millions of users, and putting your taxonomy on the line against real signals at scale.
The work
Phase 1: Build the benchmark
A rigorous, reproducible, publishable eval framework for user understanding across signal types.
Survey the state of the art. What approaches exist today, and how effective are they?
Formalize the taxonomy of signals that make up a person's identity to an AI system. Build on our current taxonomy. Identify where it over-specifies, where it under-specifies, where inferences systematically fail.
Design the eval methodology — signals, tasks, metrics. Construct the dataset. Run baselines against current state-of-the-art models.
Phase 2: Make it real with Shelf data
Once the benchmark is in good shape, we'll point it at Shelf's user base — millions of people with cross-category consumption data — and build the infrastructure to generate profiles at scale. You'll:
Build the pipeline to generate profiles for Shelf users at scale
Validate the taxonomy against real signals — where does it hold up, where does it break?
Identify where current personalization systems systematically fail on real user data
Feed findings back into the benchmark, and into Shelf's intelligence layer
This is where the work compounds beyond a paper: the same infra becomes the core representation of users inside Shelf's product.
Once we're ready, we'll open-source the benchmark and write the paper, targeting RecSys, UMAP, NeurIPS, or an adjacent venue.
Ideal candidate
We recognize that a confidence gap might discourage amazing candidates from applying. Every job description is a wish list, so please reach out if this role really excites you.
You're likely a good fit if you:
Are a postdoc, PhD student, or graduate student in ML, NLP, information retrieval, recommender systems, cognitive science, HCI, or an adjacent field, or are a self-taught open-source contributor
Can do the research and ship the implementation. We're not looking for pure theory — we want someone who can build the eval pipeline, work with real data, and write production-quality code alongside the paper
Have full-stack engineering experience, particularly with data pipelines and automation (fluency with AI tools like Claude Code is a strong prerequisite)
Are fluent in benchmark design — you have views on what PersonaMem gets right, what it misses, and what makes an eval actually stick
Are comfortable on the conceptual side (is the profile structure carving reality at the joints?) and the technical side (dataset construction, eval pipelines, baselines)
Are a clear writer
Logistics
Remote, NYC, or hybrid — your call. 10–12 weeks, Summer 2026. Compensation: $5,000/month stipend. Academic credit arrangeable.
How to apply
Send your CV and a writing sample (published paper preferred, course project acceptable) to jad@koodos.com.