Yes, you can use an activation function such as ReLU (f(x) = max(0, x)).
An example of the weights for such a network:
Layer1: [[-1, 1], [1, -1]] Layer2: [[1], [1]]
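As a minimal sketch (assuming NumPy, inputs treated as row vectors, and no bias terms), the ReLU activation and the two weight matrices above could be written as:

```python
import numpy as np

def relu(x):
    # ReLU: f(x) = max(0, x), applied element-wise
    return np.maximum(0, x)

# Weights from the example above; column j holds the incoming weights of hidden/output unit j
W1 = np.array([[-1,  1],
               [ 1, -1]])   # input (2) -> hidden (2)
W2 = np.array([[1],
               [1]])        # hidden (2) -> output (1)
```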
For the first (hidden) layer (see the sketch after this list):
- If the input is [0,0], both nodes will have activation 0: ReLU(-1*0 + 1*0) = 0 and ReLU(1*0 + -1*0) = 0
- If the input is [1,0], one node will have activation 0: ReLU(-1*1 + 1*0) = 0, and the other activation 1: ReLU(1*1 + -1*0) = 1
- If the input is [0,1], one node will have activation 1: ReLU(-1*0 + 1*1) = 1, and the other activation 0: ReLU(1*0 + -1*1) = 0
- If the input is [1,1], both nodes will have activation 0: ReLU(-1*1 + 1*1) = 0 and ReLU(1*1 + -1*1) = 0
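Continuing the sketch above (reusing `relu` and `W1`), the hidden-layer activations for all four inputs match the hand calculations:

```python
# Hidden activations for each XOR input
for x in ([0, 0], [1, 0], [0, 1], [1, 1]):
    h = relu(np.array(x) @ W1)
    print(x, "->", h)
# [0, 0] -> [0 0]
# [1, 0] -> [0 1]
# [0, 1] -> [1 0]
# [1, 1] -> [0 0]
```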
For the second (output) layer: since the weights are [[1], [1]] (and, because of ReLU, there can be no negative activations coming from the previous layer), this layer simply sums the activations of layer 1 (see the sketch after this list):
- If the input is [0,0], the sum of activations in the previous layer is 0
- If the input is [1,0], the sum of activations in the previous layer is 1
- If the input is [0,1], the sum of activations in the previous layer is 1
- If the input is [1,1], the sum of activations in the previous layer is 0
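Putting both layers together (again reusing the definitions from the first sketch), the full forward pass reproduces the XOR truth table:

```python
# Full forward pass: ReLU hidden layer followed by the summing output layer
for x in ([0, 0], [1, 0], [0, 1], [1, 1]):
    h = relu(np.array(x) @ W1)   # hidden activations
    y = h @ W2                   # output = sum of hidden activations
    print(x, "->", int(y[0]))
# [0, 0] -> 0
# [1, 0] -> 1
# [0, 1] -> 1
# [1, 1] -> 0
```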
Although this solution happens to work in the example above, it is limited to using the label zero (0) for the False examples of the XOR problem. If, for example, we used ones for the False examples and twos for the True examples, this approach would no longer work.
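To illustrate why the labelling matters (a sketch under the same no-bias assumption, with hypothetical random weights): without bias terms, the input [0, 0] produces zero pre-activations in every layer, so the network can only ever output 0 for that case, no matter what the weights are.

```python
# With no biases, the all-zero input always yields 0, regardless of the weights
rng = np.random.default_rng(0)           # hypothetical random weights
W1_any = rng.normal(size=(2, 2))
W2_any = rng.normal(size=(2, 1))
print(relu(np.array([0, 0]) @ W1_any) @ W2_any)   # -> [0.]
```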