Shared Links
[Link] Circuit Tracing: Revealing Computational Graphs in Language Models
A group of Anthropic-affiliated scientists has released a paper where they study how human concepts are represented across Claude 3.5 Haiku's neurons and how these features interact to produce model outputs.
This is an especially difficult task because these concepts are not contained within a single neuron. Neurons are polysemantic, meaning each one encodes multiple unrelated concepts in its representation. To make matters worse, superposition means that a feature's representation is built from a combination of multiple neurons, not just one. In this paper, the researchers build a Local Replacement Model, replacing the neural network's components with simpler, interpretable functions that mimic their behavior. For each prompt, they also show Attribution Graphs that help visualize how the model processes information and how the features smeared across the model's neurons influence its outputs.
Also check out the companion paper, On the Biology of a Large Language Model, where the researchers use interactive Attribution Graphs to study how models plan ahead when a generation requires thinking through many steps before answering.
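The local replacement idea can be pictured with a small, hedged sketch: a sparse, interpretable module is trained to mimic one component's input/output behavior, so its features can later be labeled and traced through an attribution graph. This is only an illustrative approximation, not the paper's exact architecture or training setup; all names, dimensions, and data below are made up.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for one MLP block of a transformer (not Claude's actual weights).
d_model = 512
original_mlp = nn.Sequential(
    nn.Linear(d_model, 4 * d_model),
    nn.GELU(),
    nn.Linear(4 * d_model, d_model),
)

class SparseReplacementMLP(nn.Module):
    """A simpler, interpretable module trained to mimic the original MLP's behavior.

    Feature activations are non-negative and (ideally) sparse, so individual
    features can be inspected, labeled, and traced through an attribution graph.
    """
    def __init__(self, d_model: int, n_features: int):
        super().__init__()
        self.encoder = nn.Linear(d_model, n_features)  # activations -> feature space
        self.decoder = nn.Linear(n_features, d_model)  # feature space -> model space

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        features = torch.relu(self.encoder(x))  # which interpretable features fired?
        return self.decoder(features)

replacement = SparseReplacementMLP(d_model, n_features=4096)
optimizer = torch.optim.Adam(replacement.parameters(), lr=1e-3)

# Fit the replacement on (input, output) pairs from the original component; with real
# residual-stream activations instead of random ones, its features would approximate
# the concepts smeared across many polysemantic neurons.
for _ in range(200):
    x = torch.randn(64, d_model)  # placeholder for activations collected on real prompts
    loss = ((replacement(x) - original_mlp(x).detach()) ** 2).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```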
[Link] Claude Think Tool
The Anthropic team describes an interesting approach to LLM thinking capabilities. Instead of making the model think deeply before answering or taking an action, they experimented with giving the model a think tool. The think tool does nothing but record a thought in the conversation state. However, it lets the model decide when it's appropriate to stop and think more carefully about the current state and the best approach to move forward.
The thinking done with the think tool will not be as deep, and it will be more focused on newly obtained information. It is therefore especially useful when the model has to carefully analyze the outputs of complex tools and act on them thoughtfully.
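As a rough sketch of how such a tool might be wired up, assuming the Anthropic Messages API tool format (the description wording, model name, and prompt are illustrative placeholders, not Anthropic's exact published definition):

```python
import anthropic

# Illustrative "think" tool: it fetches nothing and changes no state; its only job
# is to record a thought so the model can pause and reason before its next action.
think_tool = {
    "name": "think",
    "description": (
        "Use this tool to think about something. It does not obtain new information "
        "or change any state; it only records the thought, giving you room to reason "
        "about complex tool outputs before acting on them."
    ),
    "input_schema": {
        "type": "object",
        "properties": {
            "thought": {"type": "string", "description": "A thought to think about."}
        },
        "required": ["thought"],
    },
}

client = anthropic.Anthropic()
response = client.messages.create(
    model="claude-3-7-sonnet-latest",  # placeholder model name
    max_tokens=1024,
    tools=[think_tool],
    messages=[{"role": "user", "content": "Handle this refund request following policy."}],
)

# If the model decides to pause and think, the tool "result" is just an acknowledgement:
# the value lies in the thought being written into the conversation state.
for block in response.content:
    if block.type == "tool_use" and block.name == "think":
        print("Model paused to think:", block.input["thought"])
```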
[Quote] The Einstein AI model
These benchmarks test if AI models can find the right answers to a set of questions we already know the answer to. However, real scientific breakthroughs will come not from answering known questions, but from asking challenging new questions and questioning common conceptions and previous ideas. - Thomas Wolf
An interesting reflection from Thomas Wolf of HuggingFace. Current LLMs have limited potential to make breakthroughs since they cannot think outside the box of their training data. We might be able to let LLMs explore beyond their known world through mechanisms like reinforcement learning with live environment feedback, or other mechanisms we haven't thought of yet. Still, significant breakthroughs will be hard for LLMs, since the ones that make a huge impact are usually far removed from established knowledge, and thus far from the model's current probability space.