I don’t really understand how OpenAI’s o1 works, but I found today’s Stratechery update helpful, contrasting o1’s approach with that of other LLMs, which can sometimes blindly follow the wrong path:
In summary, there are two important things happening: first, o1 is explicitly trained on how to solve problems, and second, o1 is designed to generate multiple problem-solving streams at inference time, choose the best one, and iterate through each step in the process when it realizes it made a mistake.
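To make that second point a bit more concrete, here is a minimal toy sketch of the general idea of sampling several candidate reasoning steps, scoring them, and backtracking when a step looks wrong. Everything in it is hypothetical, the generate_candidate_steps, score and looks_like_mistake functions are stand-ins, and it is not a description of how o1 is actually implemented:

import random

# Hypothetical stand-ins, purely illustrative - not OpenAI's API or o1's real machinery.
def generate_candidate_steps(state, n=4):
    """Pretend to sample n possible next reasoning steps from the current state."""
    return [f"{state} -> step{random.randint(0, 99)}" for _ in range(n)]

def score(candidate):
    """Pretend reward model: higher is better."""
    return random.random()

def looks_like_mistake(candidate):
    """Pretend verifier that flags a step as a likely dead end."""
    return random.random() < 0.2

def solve(problem, max_steps=5):
    state = problem
    history = [state]
    for _ in range(max_steps):
        # Generate several problem-solving streams and keep the best-scoring one.
        candidates = generate_candidate_steps(state)
        best = max(candidates, key=score)
        if looks_like_mistake(best):
            # Instead of blindly following the wrong path, stay at the
            # last good state and try a fresh batch of candidates.
            continue
        history.append(best)
        state = best
    return history

print(solve("prove 2 + 2 = 4"))

The contrast with a plain LLM is that the plain model would commit to a single stream of tokens, while the sketch above spends extra inference-time compute exploring alternatives and discarding steps that look like mistakes.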