• howrar@lemmy.ca
    8 months ago

    Counterexample: There exists an optimal deterministic policy for any finite MDP.
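
A rough sketch of the idea, in case it helps: for a finite, discounted MDP you can run value iteration and then act greedily with respect to the resulting values, which gives a deterministic policy whose value matches the optimum. The toy MDP below (two states, two actions, the transition table `P`, and `gamma = 0.9`) is entirely made up for illustration, and the helper names are hypothetical.

```python
# Hypothetical toy MDP: P[s][a] is a list of (probability, next_state, reward).
P = {
    0: {0: [(1.0, 0, 0.0)], 1: [(0.8, 1, 1.0), (0.2, 0, 0.0)]},
    1: {0: [(1.0, 1, 2.0)], 1: [(1.0, 0, 0.0)]},
}
GAMMA = 0.9  # assumed discount factor


def q_value(P, V, s, a, gamma):
    """Expected return of taking action a in state s, then following V."""
    return sum(p * (r + gamma * V[s2]) for p, s2, r in P[s][a])


def value_iteration(P, gamma, tol=1e-10):
    """Iterate the Bellman optimality backup until the values stop changing."""
    V = {s: 0.0 for s in P}
    while True:
        new_V = {s: max(q_value(P, V, s, a, gamma) for a in P[s]) for s in P}
        if max(abs(new_V[s] - V[s]) for s in P) < tol:
            return new_V
        V = new_V


def evaluate(P, policy, gamma, tol=1e-10):
    """Iteratively evaluate a deterministic policy (a dict state -> action)."""
    V = {s: 0.0 for s in P}
    while True:
        new_V = {s: q_value(P, V, s, policy[s], gamma) for s in P}
        if max(abs(new_V[s] - V[s]) for s in P) < tol:
            return new_V
        V = new_V


V_star = value_iteration(P, GAMMA)
# Greedy policy: one fixed action per state, so it is deterministic.
greedy = {s: max(P[s], key=lambda a: q_value(P, V_star, s, a, GAMMA)) for s in P}
V_greedy = evaluate(P, greedy, GAMMA)

for s in P:
    print(s, greedy[s], round(V_star[s], 4), round(V_greedy[s], 4))
```

The printed optimal values and the values of the greedy policy should agree up to the tolerance, so no randomization is needed to reach the optimum in this setting.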