m8ta
{1566}
ref: -1992 tags: evolution baldwin effect ackley artificial life date: 03-21-2022 23:20 gmt revision:0 [head]

Interactions between learning and evolution

  • Ran simulated evolution and learning on a population of agents over ~100k lifetimes.
  • Each agent can live for several hundred timesteps within a gridworld-like environment.
  • Said gridworld environment has plants (food), trees (shelter), carnivores, and other agents (for mating)
  • Agent behavior is parameterized by an action network and an evaluation network.
    • The action network transforms sensory input into actions
    • The evaluation network sets the valence (positive or negative) of the sensory signals
      • This evaluation network modifies the weights of the action network using a gradient-based RL algorithm called CRBP (complementary reinforcement back-propagation). CRBP reinforces the action just taken based on the temporal derivative of the evaluation (the reward signal), and reinforces the complement (the opposite output) when the action does not increase reward, with some epsilon-greedy exploration.
        • It's not perfect, but as they astutely say, any reinforcement learning algorithm involves some search, so generally heuristics are required to select new actions in the face of uncertainty.
      • Observe that it seems easier to make a good evaluation network than action network (evaluation network is lower dimensional -- one output!)
    • Networks are implemented as one-layer perceptrons (boring, but they had limited computational resources back then)
  • Showed (roughly) that in winner populations you get:
    • When learning is an option, the population will learn, and with time this will grow to anticipation / avoidance
    • This will transition to the Baldwin effect; learned behavior becomes instinctive
      • But, interestingly, only when the problem is incompletely solved!
      • If it's completely solved by learning (e.g. super fast), then there is no selective leverage on innate behavior over many generations.
      • Likewise, the survival problem to be solved needs to be stationary and consistent for long enough for the Baldwin effect to occur.
    • Avoidance is a form of shielding, and learning no longer matters for this behavior
    • Even longer term, shielding leads to goal regression: avoidance instincts allow the evaluation network to do something else, set new goals.
      • In their study this included goals such as approaching predators (!).
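
The CRBP update described in the list above can be sketched in a few lines. This is only a toy illustration, not the paper's code: the one-dimensional "food left / food right" task, the `evaluate` function standing in for the evaluation network, the learning rates, and all names here are assumptions made for the example. Exploration comes from sampling the stochastic output rather than from an explicit epsilon-greedy rule.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class ActionNet:
    """Single-layer perceptron with stochastic binary outputs (hypothetical sketch)."""

    def __init__(self, n_in, n_out, rng):
        self.rng = rng
        self.w = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
                  for _ in range(n_out)]

    def probs(self, x):
        # Probability that each output bit fires, given sensory input x.
        return [sigmoid(sum(wi * xi for wi, xi in zip(row, x)))
                for row in self.w]

    def act(self, x):
        # Sample binary actions; the sampling noise provides exploration.
        p = self.probs(x)
        o = [1 if self.rng.random() < pj else 0 for pj in p]
        return p, o

    def crbp_update(self, x, p, o, reinforce, lr_pos=0.3, lr_neg=0.05):
        # Positive reinforcement: push outputs toward what was just done.
        # Otherwise: push (more gently) toward the complement, 1 - o.
        lr = lr_pos if reinforce else lr_neg
        for j, row in enumerate(self.w):
            target = o[j] if reinforce else 1 - o[j]
            err = target - p[j]  # delta-rule error for a sigmoid unit
            for i, xi in enumerate(x):
                row[i] += lr * err * xi


def evaluate(distance):
    """Stand-in for the evaluation network: closer to food scores higher."""
    return -distance


def train(steps=2000, seed=0):
    rng = random.Random(seed)
    net = ActionNet(n_in=3, n_out=1, rng=rng)
    for _ in range(steps):
        food_right = rng.random() < 0.5
        # Input: [food-left flag, food-right flag, bias].
        x = [0.0, 1.0, 1.0] if food_right else [1.0, 0.0, 1.0]
        distance = 5
        before = evaluate(distance)
        p, o = net.act(x)
        moved_right = o[0] == 1
        distance += -1 if moved_right == food_right else 1
        after = evaluate(distance)
        # Reinforce only if the evaluation increased over this timestep.
        net.crbp_update(x, p, o, reinforce=(after > before))
    return net


if __name__ == "__main__":
    net = train()
    print("P(move right | food right):", net.probs([0.0, 1.0, 1.0])[0])
    print("P(move right | food left): ", net.probs([1.0, 0.0, 1.0])[0])
```

With a single output bit the complement of a wrong action is exactly the right one, so this toy converges easily; with several output bits the complement is only a heuristic guess at a better action, which is the "some search is always required" point made above.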

Altogether (historically) interesting, but some of these ideas might well have been anticipated by some simple hand calculations.

{1533}
ref: -2009 tags: Baldwin effect finches date: 02-22-2021 17:35 gmt revision:0 [head]

Evolutionary significance of phenotypic accommodation in novel environments: an empirical test of the Baldwin effect

Up until reading this, I had thought that the Baldwin effect refers to the fact that when animals gain an ability to learn, this allows them to take new ecological roles without genotypic adaptation. This is a component of the effect, but is not the original meaning, which is the opposite: when a species adapts to a novel environment through phenotypic adaptation (say, adapting to colder weather through within-lifetime variation), evolution tends to push these changes into the germ line. The result is something to the effect of Lamarckian evolution.

In the case of house finches, as discussed in the link above, this pertains to increased brood variability and sexual dimorphism, caused by varied maternal habits and hormones under environmental stress. This variance is then rapidly operated on by natural selection to tune the finch to its new environment, including Montana, where the single author did most of his investigation.

There are of course countless other details here, but still this is an illuminating demonstration of how evolution works to move information into the genome.