Question

Explain this in simple terms (make sure everything important is in it).

All these approaches embraced the basic idea of distributed representations, in which stimuli are represented by overlapping sets of nodes or stimulus elements. Similarity emerges naturally from the fact that two similar stimuli (such as yellow and orange lights, or golden retrievers and cocker spaniels) activate elements belonging to both sets. Thus, what is learned about one stimulus will tend to transfer or generalize to other stimuli that activate some of the same nodes.

distributed representation
A representation in which information is coded as a pattern of activation distributed across many different nodes.
Figure 6.4A shows how distributed representations might be used in a network model that is only slightly more complicated than the model of Figure 6.2. This network has three layers of nodes and two layers of weights; by convention, we refer to it as a two-layer network, counting the number of layers of weights. In the network model in Figure 6.4A, each stimulus activates an input node that is connected, by a layer of fixed (nonmodifiable) weights, to several nodes in an internal representation layer. These fixed weights, which do not change during learning, are drawn in light blue (except when activated) to help distinguish them from the modifiable weights, which are shown in gray (except when activated). The internal representation nodes are then connected to a final output node via the layer of modifiable weights, which do change during learning.
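
To make that wiring concrete, here is a minimal sketch in Python of the kind of network being described. The use of seven internal nodes and the exact nodes each colour feeds are assumptions chosen to fit the description (the text only states that yellow activates internal nodes 3, 4, and 5); they are not values given in the passage.

```python
# Minimal sketch of the two-layer network (assumed layout, for illustration).
# The dictionary keys play the role of the discrete input nodes (one per colour).

# Fixed (nonmodifiable) weights: each input node fans out to three adjacent
# internal-representation nodes, numbered 1-7 here.  Yellow -> 3, 4, 5 as in
# the text; the other assignments are assumed so that neighbouring colours overlap.
FIXED_CONNECTIONS = {
    "green":         [1, 2, 3],
    "yellow-green":  [2, 3, 4],
    "yellow":        [3, 4, 5],
    "yellow-orange": [4, 5, 6],
    "orange":        [5, 6, 7],
}

# Modifiable weights: one weight from each internal node to the single output
# node.  They all start at 0.0 and are the only weights that learning changes.
modifiable_weights = {node: 0.0 for node in range(1, 8)}
```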

Thus, the presentation of a yellow light would activate the “yellow” node in the input layer, which in turn would activate three nodes in the internal representation layer (nodes 3, 4, and 5); but notice that two of these internal representation nodes (3 and 4) could also be activated by a yellow-green light, and nodes 4 and 5 could also be activated by a yellow-orange light. In this manner, yellow-green, yellow, and yellow-orange all activate overlapping sets of internal representation nodes.

The order of nodes in Figure 6.4A forms a topographic representation, meaning that nodes responding to physically similar stimuli, such as yellow and yellow-orange light, are placed next to each other in the model (a concept introduced in Chapter 3). Thus, in Figure 6.4A, there is more overlap between the representations for yellow and yellow-orange than between the representations for yellow and orange. There is no overlap between the representations of very different colors, such as green and orange.
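
Using the same assumed node assignments as the sketch above, the amount of overlap between two colours' internal representations can be counted directly; the counts mirror the similarity ordering just described.

```python
# Overlap between the assumed internal representations of different colours.
representation = {
    "green":         {1, 2, 3},
    "yellow":        {3, 4, 5},   # from the text
    "yellow-orange": {4, 5, 6},
    "orange":        {5, 6, 7},
}

def overlap(a, b):
    """Number of internal-representation nodes shared by two colours."""
    return len(representation[a] & representation[b])

print(overlap("yellow", "yellow-orange"))  # 2 shared nodes: high similarity
print(overlap("yellow", "orange"))         # 1 shared node:  lower similarity
print(overlap("green", "orange"))          # 0 shared nodes: no overlap at all
```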

Now suppose this network model is trained to respond to a yellow light, which activates the three internal representation nodes, as shown in Figure 6.4A. Note that this network transforms a representation of yellow light in the input row to a different representation in the middle row. The representation of yellow at the input nodes is a discrete-component representation because each light activates one and only one input node (and no other color activates that node). In contrast, the representation at the middle nodes is a distributed representation because the yellow color’s representation is distributed over three nodes (3, 4, and 5). So this network can be viewed as one that converts a discrete representation at the input nodes to a distributed representation at the middle nodes. As you will see next, this distributed representation allows the model to account for similarity and generalization.
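
That conversion can also be written as a one-hot input vector passed through the fixed weight matrix; the resulting pattern is the distributed representation. Everything beyond yellow's nodes 3, 4, and 5 is again an assumed layout used only for illustration.

```python
# A one-hot (discrete) input converted into a distributed pattern by the fixed weights.
colors = ["green", "yellow-green", "yellow", "yellow-orange", "orange"]

# fixed_w[i][j] = 1 if input node i feeds internal node j (index 0 = internal node 1).
fixed_w = [
    [1, 1, 1, 0, 0, 0, 0],  # green         -> nodes 1, 2, 3
    [0, 1, 1, 1, 0, 0, 0],  # yellow-green  -> nodes 2, 3, 4
    [0, 0, 1, 1, 1, 0, 0],  # yellow        -> nodes 3, 4, 5 (as in the text)
    [0, 0, 0, 1, 1, 1, 0],  # yellow-orange -> nodes 4, 5, 6
    [0, 0, 0, 0, 1, 1, 1],  # orange        -> nodes 5, 6, 7
]

def internal_pattern(color):
    """One-hot input vector times the fixed weight matrix -> distributed activation."""
    one_hot = [1 if c == color else 0 for c in colors]
    return [sum(one_hot[i] * fixed_w[i][j] for i in range(len(colors))) for j in range(7)]

print(internal_pattern("yellow"))  # [0, 0, 1, 1, 1, 0, 0] -- activity spread over nodes 3-5
```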

One way to determine how much the modifiable associative weights should be increased is to use the Rescorla-Wagner learning rule. As presented in Chapter 4, this rule states that weights should be changed in proportion to the error on each trial, where “error” is the mismatch between the output the network actually produced and the correct, or desired, output, which is determined by the actual outcome. Following many trials of training in which yellow is paired with reward, the weights from the three active internal representation nodes would each come to equal 0.33, as shown in Figure 6.4A, where the lines connecting internal representation nodes 3, 4, and 5 to the output node have been thickened. Now, when presentation of a yellow light activates nodes 3, 4, and 5, the result is a net response activation of 1.0 in the output node, the sum of these three weights. (Note that the weights from internal representation nodes that have never been activated or associated with reward remain at their initial value of 0.0.)
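
Here is a rough sketch of that training process using the Rescorla-Wagner delta rule: on each trial, every active node's weight changes by the learning rate times the prediction error (desired output minus actual output). The learning rate of 0.1 and the 100 trials are arbitrary illustrative choices, and the node layout is the same assumed one as before; the weights on nodes 3, 4, and 5 settle near 0.33 and the trained output near 1.0, as the passage describes.

```python
# Rescorla-Wagner training of the modifiable weights: yellow repeatedly paired with reward.
FIXED_CONNECTIONS = {
    "green": [1, 2, 3], "yellow-green": [2, 3, 4], "yellow": [3, 4, 5],
    "yellow-orange": [4, 5, 6], "orange": [5, 6, 7],
}
weights = {node: 0.0 for node in range(1, 8)}  # modifiable weights, all 0.0 before learning
LEARNING_RATE = 0.1                            # arbitrary illustrative value

def output_for(color):
    """Output-node activation = sum of modifiable weights from the active internal nodes."""
    return sum(weights[node] for node in FIXED_CONNECTIONS[color])

for _ in range(100):                           # many yellow + reward trials
    error = 1.0 - output_for("yellow")         # desired output (reward = 1.0) minus actual output
    for node in FIXED_CONNECTIONS["yellow"]:   # nodes 3, 4, 5 were active, so they share the change
        weights[node] += LEARNING_RATE * error

print([round(weights[n], 2) for n in (3, 4, 5)])  # [0.33, 0.33, 0.33]
print(round(output_for("yellow"), 2))             # 1.0
```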

Compare the distributed network in Figure 6.4A, which was trained to respond to a yellow light, with the discrete-component network in Figure 6.2, which was trained with the very same stimulus and outcome pairings. In the distributed network of Figure 6.4A, the learning is distributed over weights from three internal representation nodes, each with a trained weight of 0.33; in contrast, the discrete-component network in Figure 6.2 localizes this same respond-to-yellow rule into a single weight of 1.0 from one input node. Both network models give a response of 1.0 when the original yellow light is presented.

The difference between the distributed network of Figure 6.4 and the discrete-component network of Figure 6.2 becomes apparent only on presentation of stimuli that are similar, but not identical, to the trained stimulus. The distributed network is able to generalize. This generalization behavior can be assessed by testing a yellow-orange light, as shown in Figure 6.4B. Here, yellow-orange activates an internal representation that has considerable overlap with the representation activated by the trained yellow light. Specifically, nodes 4 and 5 are activated by the yellow-orange light as well, and each of these internal representation nodes contributes to partial activation of the output node. The result is a reasonably strong output activation of 0.66, proportional to the two-thirds overlap between the representations for yellow and yellow-orange light. If the same network were tested with an orange light, there would be less overlap with the representation of yellow light and a consequently even weaker response.
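
Plugging the trained weights (0.33 on nodes 3, 4, and 5) back into the same sketch reproduces these numbers; the node assignments for yellow-orange and orange remain assumptions consistent with the overlaps the passage describes.

```python
# Testing generalization with weights already trained to respond to yellow.
FIXED_CONNECTIONS = {"yellow": [3, 4, 5], "yellow-orange": [4, 5, 6], "orange": [5, 6, 7]}
trained_weights = {node: (0.33 if node in (3, 4, 5) else 0.0) for node in range(1, 8)}

def response(color):
    """Sum the trained weights of whichever internal nodes this colour activates."""
    return sum(trained_weights[node] for node in FIXED_CONNECTIONS[color])

print(round(response("yellow"), 2))         # 0.99 ~ 1.0: the trained stimulus
print(round(response("yellow-orange"), 2))  # 0.66: shares two of yellow's three nodes
print(round(response("orange"), 2))         # 0.33: shares only one node, so weaker still
```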

Figure 6.5 shows that when this model is used to test generalization to a series of novel lights, it produces a stimulus-generalization gradient that decreases smoothly for stimuli of increasing distance from the trained stimulus, similar to the pigeons’ generalization gradient shown in Figure 6.1. This ability to capture animals’ and humans’ natural tendency to generalize—that is, to treat similar stimuli similarly—contrasts markedly with the (lack of a) generalization gradient in Figure 6.3, which was produced using the network that had only a discrete-component representation. Thus, even though the two models can learn the same initial task (respond to yellow light), they differ considerably in their generalization performance.
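
Sweeping the same trained sketch across the whole ordered series of colours yields the kind of peaked, smoothly falling gradient described for Figure 6.5 (strongest at the trained yellow, weaker with distance). The assumed series here is short, so the responses only fall to 0.33 at the ends; with more colours farther from yellow they would fall to 0.

```python
# Generalization gradient after training on yellow, over the whole (assumed) colour series.
FIXED_CONNECTIONS = {
    "green":         [1, 2, 3],
    "yellow-green":  [2, 3, 4],
    "yellow":        [3, 4, 5],
    "yellow-orange": [4, 5, 6],
    "orange":        [5, 6, 7],
}
trained_weights = {node: (0.33 if node in (3, 4, 5) else 0.0) for node in range(1, 8)}

for color, nodes in FIXED_CONNECTIONS.items():
    response = sum(trained_weights[node] for node in nodes)
    print(f"{color:<14} {round(response, 2)}")

# Expected output, a gradient peaked at the trained colour:
#   green          0.33
#   yellow-green   0.66
#   yellow         0.99
#   yellow-orange  0.66
#   orange         0.33
```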