Several computational models attempt to explain the brain mechanisms underlying the origin of addictive behaviors. A. David Redish, professor in the Department of Neuroscience at the University of Minnesota, has perhaps been the pioneer. In 2004 he published in Science the article "Addiction as a computational process gone awry", a milestone for many subsequent computational models.
According to Redish and Johnson (2007), the mammalian brain has two systems with differing levels of search: (1) a flexible system, which can be learned quickly but is computationally expensive to use, and (2) an inflexible system, which can act quickly but must be learned slowly. The flexible system allows the planning of multiple paths to achieve a goal; the inflexible system only retrieves the remembered action for a given situation. The authors hypothesize a unified system incorporating three subsystems: a situation recognition system and two contrasting decision systems, namely a flexible, planning-capable system that accommodates multiple paths to goals and takes into account the value of potential outcomes, and an inflexible, habit-like system.
The flexible decision-making system requires three things: recognition of the situation, recognition of a means of achieving an outcome from that situation, and evaluation of the value of achieving that outcome.
The inflexible decision-making system entails a simple association between situation and action. Here, evaluation entails recalling the learned value associated with taking an action in the situation. To evaluate the value of an outcome, the system needs a signal that recognizes hedonic value. Two brain structures that have been suggested to be involved in the evaluation of an outcome are the orbitofrontal cortex and the ventral striatum. Neurons in the ventral striatum show reward correlates and anticipate predicted reward. The hippocampus projects to the ventral striatum, and ventral striatal firing patterns reflect hippocampal activity.
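The contrast between the two decision systems can be illustrated with a minimal Python sketch. It is not the model of Redish and Johnson: the toy world, the situation and action names, the numeric values, and the functions habit_decide and plan_decide are all hypothetical, chosen only to show a cached situation-action lookup next to a search over possible paths.

```python
from collections import deque

# Hypothetical toy world: situation -> {action: resulting situation}.
TRANSITIONS = {
    "home": {"walk": "street", "stay": "home"},
    "street": {"enter_bar": "bar", "enter_gym": "gym"},
    "bar": {},
    "gym": {},
}

# Hypothetical values of reaching each outcome (assumed numbers).
OUTCOME_VALUE = {"bar": 5.0, "gym": 3.0}

# Inflexible system: a slowly learned table of situation -> (action, cached value).
HABIT_TABLE = {"home": ("walk", 2.0), "street": ("enter_bar", 4.0)}


def habit_decide(situation):
    """Retrieve the remembered action and its cached value; no search is done."""
    return HABIT_TABLE.get(situation)


def plan_decide(situation, outcome_value=OUTCOME_VALUE, max_depth=3):
    """Search forward over paths and return (first_action, goal, value)
    for the most valuable reachable outcome, or None if nothing is reachable."""
    best = None
    queue = deque([(situation, None, 0)])
    visited = {situation}
    while queue:
        state, first_action, depth = queue.popleft()
        if first_action is not None:
            value = outcome_value.get(state, 0.0)
            if best is None or value > best[2]:
                best = (first_action, state, value)
        if depth == max_depth:
            continue
        for action, nxt in TRANSITIONS.get(state, {}).items():
            if nxt not in visited:
                visited.add(nxt)
                # Remember the first step of the path that led here.
                queue.append((nxt, first_action or action, depth + 1))
    return best
```

In this sketch, changing the outcome values passed to plan_decide immediately changes the chosen goal, whereas habit_decide keeps returning the stored action until the table itself is relearned. The planner pays for that flexibility by examining every reachable situation, while the habit system performs a single lookup.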
Hedonic signals appear to be carried by opioid signaling. If endogenous opioids signal the actual hedonic evaluation of an achieved outcome, then when faced with potential outcome signals arriving from the hippocampus, one might expect similar processes to evaluate the value of expected outcomes. According to Redish and Johnson (art. cit., p. 330), this predicts that the effect of hippocampal planning signals on accumbens structures will be to trigger evaluative processes similar to those that occur in response to actual achieved outcomes. This has implications for craving, which can be defined as the intense desire for something. Because the flexible system only entails the recognition that an action can lead to a potential path to a goal, and does not entail a commitment to action, craving does not necessarily lead to action selection. When the hippocampal component reaches a goal that is evaluated to have a high value, this will produce a strong desire to achieve that goal: the psychological effect of that recognition is to produce craving.
Competitive opioid antagonists have been used clinically to reduce craving. The hypothesis that reward signals are released on recognition of a pathway to a high-value outcome implies that blocking those reward signals would not only reduce the subjective hedonic value of receiving reward, but would also reduce craving for those rewards. If that reward signal is based on opioid signaling, then this may explain why an opioid antagonist such as naltrexone can reduce craving.
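The reasoning about antagonists can be made concrete with a small Python sketch. It assumes, purely for illustration, that a single opioid-dependent signal evaluates both achieved and imagined outcomes and that an antagonist scales that signal down linearly. The functions hedonic_signal and craving, the blockade parameter, and the numbers are hypothetical and are not taken from Redish and Johnson.

```python
def hedonic_signal(outcome_value, blockade=0.0):
    """Evaluative signal for an outcome.

    blockade is an assumed fraction (0.0 to 1.0) of opioid signaling
    removed by a competitive antagonist; linear scaling is an assumption.
    """
    if not 0.0 <= blockade <= 1.0:
        raise ValueError("blockade must be between 0.0 and 1.0")
    return (1.0 - blockade) * outcome_value


def craving(planned_outcome_value, blockade=0.0):
    """Craving modeled as the same evaluative signal, triggered by an
    imagined (planned) outcome rather than an achieved one."""
    return hedonic_signal(planned_outcome_value, blockade)


# Without an antagonist, a high-value imagined outcome yields strong craving.
baseline = craving(5.0)
# With partial blockade, both the hedonic response and the craving shrink.
treated = craving(5.0, blockade=0.5)
```

Because craving reuses hedonic_signal in this sketch, any reduction in the hedonic response to a received reward is mirrored by a reduction in craving for it, which is the qualitative prediction described above.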