
RLHF is the training technique that transformed a raw language model into the helpful, conversational ChatGPT users interact with today.
This is why ChatGPT feels more like a helpful assistant than a raw text predictor.
Reference:
TaskLoco™ — The Sticky Note GOAT