Structured Bites #1 : Synthetic Data is a real hassle


We're Live!

Hey Reader,

Welcome to Structured Bites, where I send over a curated list of things I've enjoyed reading, resources I've found useful, and a quick summary of what I've written over the past week.

Over the past 2-3 weeks, my main theme has been synthetic data and evaluations, as I've been tinkering with some new ways to generate datasets from scratch.

That, and finding this great song, 就说你想说的, by a Taiwanese band called 告五人.

One thing that's really stood out for me has been the importance of user intent and diversity with a good objective metric in mind.

Whether it's simple cosine similarity, recall of retrieved results or the efficacy of tool calls, having a simple binary metric lets you iterate quickly.
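
Here's a minimal sketch of what I mean by a binary metric, using recall of retrieved results as the example. The function names, the threshold and the tiny eval set are all made up for illustration; the only idea is to turn a score into a pass/fail verdict per example and then track the pass rate.

```python
def recall_at_k(retrieved, relevant, k):
    """Fraction of relevant ids that show up in the top-k retrieved ids."""
    relevant_set = set(relevant)
    if not relevant_set:
        return 0.0
    top_k = set(retrieved[:k])
    return len(top_k & relevant_set) / len(relevant_set)


def passes(retrieved, relevant, k=5, threshold=1.0):
    """Binary verdict: did the top-k results recover enough relevant ids?"""
    return recall_at_k(retrieved, relevant, k) >= threshold


# Hypothetical eval set: (retrieved ids in ranked order, relevant ids).
examples = [
    (["a", "b", "c"], ["a"]),
    (["d", "e", "f"], ["z"]),
    (["g", "h", "i"], ["h", "i"]),
    (["j", "k", "l"], ["l", "m"]),
]

# One number to watch while iterating: the share of examples that pass.
pass_rate = sum(passes(r, rel, k=3) for r, rel in examples) / len(examples)
print(f"pass rate: {pass_rate:.2f}")
```

The same pattern works for cosine similarity or tool calls: pick a threshold or an exact-match check, and count passes.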

If you'd like to chat about some problems you're facing or bounce around some ideas on topics such as synthetic data, evaluations or UX, you can schedule some time for a quick chat here at no cost, or just send me a message on Twitter at ivanleomk.

Alternatively, feel free to forward this email to someone who might find this useful.

What I've enjoyed

Here are some links that I've enjoyed over the past few weeks:

  1. Claude Insights and Observations : A new paper released by Anthropic that shows how they analyze broad usage trends among Claude users. I highly recommend watching the video interview that they released alongside it before diving into the paper.
  2. A Practical Guide to Technical Writing : A short article by Hamel Husain talking about his experience with technical writing and how to grow an audience organically over time. I really enjoyed this and I hope you like it as much as I did.
  3. A guide to ROPE : A short article walking through the intuition for why RoPE (rotary position embeddings) works the way it does in embedding positional information for tokens in a transformer.
  4. Sequence To Sequence : What a Decade : A short talk by Ilya Sutskever reflecting on his seminal sequence-to-sequence paper ten years on, after it was selected for the Test of Time award at NeurIPS.

What I've Written

Over the past 2-3 weeks, I've written a good number of posts on good UX for LLM applications, writing better evals and some general reflections. I've collected them in an easy format below.

  • Simplify Your LLM Evals : Binary evals are a better starting point for applications, so simplify them as much as possible. We can convert subjective evals to binary ones by being creative, as seen in the data extraction case study we walk through here.
  • Is there value in a wrapper? : A common criticism of LLM applications is that they're just wrappers. I think there are more levers we can pull - using user data, better UX and thinking deeply about serving infrastructure.
  • How to look at your data : While looking at your data is a meme, it's rare to find an example of how to do so. I wrote up a small case study based on a problem I was facing.
  • How to tame your LLM Application : LLM Applications are tricky businesses - I walk through 5 different levels that developers can progress through to build more resilient and reliable applications.
  • Are Eval Improvements just pure chance? : Eval improvements are important but you should definitely do some statistical analysis before committing 5 weeks of engineering time. See how to determine whether your gains are real or just statistical chance.
  • What makes good documentation : Good documentation is often treated as a chore or an API listing. That's a huge missed opportunity for any developer or company with a public-facing product.
  • Why User Intent matters the most for Synthetic Data : Conventional wisdom tells you to vary tone and length, but what matters most is your understanding of what users actually want to use your application for. See how user intent materially impacts the quality of synthetic data.
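
On the "pure chance" question from the list above, here's a rough sketch of one common sanity check: a paired bootstrap over per-example pass/fail results. This is not necessarily the method used in the post; the function name and the sample outcomes are hypothetical, and it assumes both systems were scored on the same eval examples.

```python
import random


def bootstrap_win_rate(baseline, candidate, n_resamples=2000, seed=0):
    """Share of resamples in which the candidate's pass count beats the baseline's.

    baseline and candidate are per-example 0/1 outcomes on the same eval set.
    """
    if len(baseline) != len(candidate):
        raise ValueError("both systems must be scored on the same examples")
    rng = random.Random(seed)  # fixed seed so reruns are reproducible
    n = len(baseline)
    wins = 0
    for _ in range(n_resamples):
        # Resample example indices with replacement, keeping pairs together.
        idx = [rng.randrange(n) for _ in range(n)]
        diff = sum(candidate[i] - baseline[i] for i in idx)
        if diff > 0:
            wins += 1
    return wins / n_resamples


# Hypothetical results on a 20-example eval set.
baseline = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 1, 0, 1]
candidate = [1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 0, 1, 0, 1]

print(f"candidate wins in {bootstrap_win_rate(baseline, candidate):.0%} of resamples")
```

If the candidate only wins in a little over half of the resamples, the gain is hard to tell apart from noise, and a larger eval set is probably worth more than more engineering time.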

Ivan Leo

Research Engineer at 567 Labs. I experiment and tinker with LLMs and maintain instructor on the side. Massive adventure junkie and outdoorsy person.
