Research Fellow, User Modeling & Personal Memory

koodos

USD 5k / month

Posted on May 17, 2026

Remote, NYC, or hybrid | Summer 2026 (10–12 weeks, flexible)

Why join koodos labs?

koodos labs is a consumer AI research and product company dedicated to ensuring that the internet knows you and can do things for you, on your terms. We are the company behind Shelf, the leading personal context platform used by millions of people to track and understand what they consume. Read about the co-founders and their story here. We've raised three (unannounced) rounds from the world's leading investors, and our Board comprises Pinterest's co-founder, Facebook's first head of monetization, and a leading GP.

You should join koodos labs for the people and our mission. We aspire to build the best team of the 2020s — a place where it's good to be from. If you join us, we promise to be the best place to grow your career, with the best people you've ever worked with.

Read more about working at koodos labs here.

What we're building

Shelf is used by millions of people to track what they're watching, reading, listening to, and more, and to keep up with what others are into. Shelf connects to your favorite platforms, shares back insights from your behavior, and will soon let you share that context with services to enable personalized intelligence, on your terms.

The opportunity

AI models get evaluated on everything except the thing that matters most for truly personalized AI: do they actually understand you?

There is no standard benchmark for how well a system can build a coherent model of a person from the full stack of user signals. Nobody has formally characterized the structure of such a model, so nobody can measure progress against it. And because the modeling happens inside closed platforms, the people being represented have no visibility into how they are represented.

No agreed taxonomy. No methodology for how signals compound into a holistic representation of you.

You'll work directly with our CTO, with meaningful exposure to the founding team and the freedom to shape the direction.

What makes this role rare: you get to do rigorous benchmark research and ground it in real-world data. Shelf has millions of users with cross-category consumption data — the substrate to test whether your benchmark actually carves reality at the joints. Most academic research never gets to do this part.

Your primary deliverable is a publishable, open-source benchmark for user understanding.

And then you make it real — building the infrastructure to generate profiles for Shelf's millions of users, and putting your taxonomy on the line against real signals at scale.

The work

Phase 1: Build the benchmark

A rigorous, reproducible, publishable eval framework for user understanding across signal types.

Survey the state of the art. What approaches exist today, and how effective are they?

Formalize the taxonomy of signals that make up a person's identity to an AI system. Build on our current taxonomy. Identify where it over-specifies, where it under-specifies, where inferences systematically fail.

Design the eval methodology — signals, tasks, metrics. Construct the dataset. Run baselines against current state-of-the-art models.

Phase 2: Make it real with Shelf data

Once the benchmark is in good shape, we'll point it at Shelf's user base — millions of people with cross-category consumption data — and build the infrastructure to generate profiles at scale. You'll:

Build the pipeline to generate profiles for Shelf users at scale

Validate the taxonomy against real signals — where does it hold up, where does it break?

Identify where current personalization systems systematically fail on real user data

Feed findings back into the benchmark, and into Shelf's intelligence layer

This is where the work compounds beyond a paper: the same infra becomes the core representation of users inside Shelf's product.

And once we're ready, we can open-source the benchmark and write the paper, targeting RecSys, UMAP, NeurIPS, or an adjacent venue.

Ideal candidate

💡

We recognize that a confidence gap might discourage amazing candidates from applying. Every job description is a wish list, so please reach out if this role really excites you.

You're likely a good fit if you:

Are a post-doc, PhD, or graduate student in ML, NLP, information retrieval, recommender systems, cognitive science, HCI, or an adjacent field, or are an autodidactic OSS contributor

Can do the research and ship the implementation. We're not looking for pure theory — we want someone who can build the eval pipeline, work with real data, and write production-quality code alongside the paper

Have full-stack engineering experience, particularly with data pipelines and automation (fluency with AI tools like Claude Code is a strong prerequisite)

Are fluent in benchmark design — you have views on what PersonaMem gets right, what it misses, and what makes an eval actually stick

Are comfortable on the conceptual side (is the profile structure carving reality at the joints?) and the technical side (dataset construction, eval pipelines, baselines)

Are a clear writer

Logistics

Remote, NYC, or hybrid — your call. 10–12 weeks, Summer 2026. Compensation: $5,000/month stipend. Academic credit arrangeable.

How to apply

Send your CV and a writing sample (published paper preferred, course project acceptable) to jad@koodos.com.
