Reinforcement Learning: Bottom-up Programming for Ethical Machines. Marten Kaas

Giorgi Vachnadze
Feb 3, 2021

In “Raising Ethical Machines: Bottom-Up Methods to Implementing Machine Ethics,” Marten H. L. Kaas offers an alternative method for building ethical decision-making into A.I. systems.

Traditionally, the strategy for deploying ethical decision-making algorithms was to start from general ethical theories and presuppositions and then apply them to particular situations. This is known as the “top-down” approach. But what if we were to reverse this method and allow the A.I. to abstract its own rules locally, from regional regularities and particular cases?

Asimov’s Laws are a classical top-down model for Machine Ethics. But according to Kaas, just like most top-down models, they face serious limitations. Specifically:

  1. Top-down rules require agreement and they need to be stated explicitly.
  2. They are rigid.
  3. They are limited to a domain.

The first problem is that we cannot lay down an explicit rule for every context and possibility. Nor do we possess universally accepted definitions for the terms used in ethical programming. It is not easy to code common sense, because much of human decision-making is intuitive. “There are simply no widely agreed upon explications of ethical concepts, e.g., ‘fairness’ or ‘goodness’” (Thompson, S.J. 2020).

Second, while concepts, terms, notions and definitions are often contextual, flexible or elusive, top-down modelling attempts to make them fixed, rigid and universal. Ethical problems are highly situational, and they often arise spontaneously, without any clearly traceable rules for their emergence. Top-down approaches fail to account for this level of complexity.

The third problem is another way of underlining the argument from complexity, but this time it concerns not the rules themselves but their domain of application. Even if an A.I. possesses a vast database of explicit rules, it will be limited in the diversity of contexts in which it can apply them. It will become highly specialized within some delimited set of domains while failing miserably outside of them.
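The domain limitation can be made concrete with a minimal sketch of a top-down controller. This is an illustration only, not something taken from Kaas: the situation names, the action names and the `top_down_action` helper are all hypothetical. Every rule must be written out by hand, so a situation nobody anticipated has no answer at all, even when it closely resembles a covered one.

```python
# Hypothetical top-down controller: each rule is stated explicitly by hand.
# Situation and action names are invented for illustration.
RULES = {
    "human_near_hole": "block_path",
    "human_carrying_load": "offer_help",
}

def top_down_action(situation: str) -> str:
    """Look up the explicit rule for a situation; fail if none was written."""
    if situation not in RULES:
        raise KeyError(f"no rule covers situation: {situation!r}")
    return RULES[situation]

print(top_down_action("human_near_hole"))    # covered by an explicit rule
try:
    top_down_action("human_near_cliff")      # a similar danger, but no rule exists
except KeyError as err:
    print("unhandled:", err)
```

The lookup works inside the domain its authors foresaw and breaks immediately outside it, which is the failure mode described above.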

So how is a bottom-up approach supposed to resolve these problems and account for the vagueness and complexity involved in ethical decision-making? Ethics is not a technical enterprise: there are no calculations or rules of thumb that we could rely on to be ethical. Strictly speaking, an ethical algorithm is a contradiction in terms.

Bottom-up approaches operate through feedback loops that continually update and modify the system’s internal set of rules. The result is a self-correcting system that aims to replicate human creativity and intuition. Since the feedback process is purely instrumental, focusing on one task at a time, it does not attempt to lay down a universal set of rules or laws that could be appealed to at any moment. The system neither relies on a minimal set of axioms nor attempts to reduce all activity to one.

“Training a machine using bottom-up approaches to prevent humans from falling into a hole (i.e., minimize harm), for example, does not require the explication of “harm.” Rather, the machine is given feedback about how well or poorly it performed, such that, over many training sessions, the machine will have eventually learned, without ever having an understanding of “harm,” how to minimize harm to humans” (Thompson, S.J. 2020).

From this example we see that the notion of harm, despite its inherent contextual complexity and the lack of a perfect universal or abstract definition, is used correctly, that is, applied appropriately in a specific context. To do so, the A.I. does not need to retrieve the concept “harm” from some internal database and then navigate the situation. It simply solves the problem.
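The hole example can be sketched as a toy program. What follows is a simplified illustration under several assumptions, not the setup Kaas describes: the grid layout, the reward values and the choice of tabular Q-learning are all hypothetical, and the agent here learns to keep itself out of the hole rather than to protect a human. The point it shares with the quoted passage is that the word “harm” never appears in the program's logic; the agent only receives a number telling it how well or poorly an episode went.

```python
import random

# Toy world (hypothetical layout): S = start, H = hole, G = goal.
#   row 0:  S . H G
#   row 1:  . . . .
ROWS, COLS = 2, 4
START, HOLE, GOAL = (0, 0), (0, 2), (0, 3)
ACTIONS = [(0, 1), (1, 0), (0, -1), (-1, 0)]  # right, down, left, up

def step(state, action):
    """Apply a move (walls keep the agent in place); return (next_state, reward, done)."""
    r = min(max(state[0] + action[0], 0), ROWS - 1)
    c = min(max(state[1] + action[1], 0), COLS - 1)
    nxt = (r, c)
    if nxt == HOLE:
        return nxt, -1.0, True   # negative feedback only; no concept of harm is defined
    if nxt == GOAL:
        return nxt, 1.0, True
    return nxt, 0.0, False

def train(episodes=2000, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = {(r, c): [0.0] * len(ACTIONS) for r in range(ROWS) for c in range(COLS)}
    for _ in range(episodes):
        state = START
        for _ in range(100):  # cap the episode length
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))      # explore
            else:
                a = q[state].index(max(q[state]))    # exploit current estimates
            nxt, reward, done = step(state, ACTIONS[a])
            target = reward if done else reward + gamma * max(q[nxt])
            q[state][a] += alpha * (target - q[state][a])
            state = nxt
            if done:
                break
    return q

def greedy_path(q, max_steps=20):
    """Follow the learned values greedily from the start cell."""
    state, path = START, [START]
    for _ in range(max_steps):
        a = q[state].index(max(q[state]))
        state, _, done = step(state, ACTIONS[a])
        path.append(state)
        if done:
            break
    return path

q_table = train()
print(greedy_path(q_table))
```

After training, the greedy path should detour through the bottom row and reach the goal without entering the hole, even though nothing in the code says what the hole means.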

The type of programming that occurs with bottom-up models is more akin to training than to teaching; teaching is closer to the top-down approach. The A.I. is trained within specific contexts and situations, such as obstacle fields, where it learns to overcome difficulties and find solutions.

Reinforcement Learning (RL) is a bottom-up approach to programming. Instead of explicit rules of operation, RL uses a goal-oriented approach in which a “rule” emerges as a temporary side effect of an effectively resolved problem. That same rule can be discarded at any later moment, once it proves no longer effective. The point of RL modelling is to help the A.I. mimic a living organism as closely as possible, thereby compensating for what is commonly held to be the main drawback of Machine Learning: the supposed impossibility of training a machine, which is precisely what RL is meant to provide.
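One loose way to picture a “rule” that emerges and is later discarded is a two-action learner whose preferences track recent feedback. This is a sketch under stated assumptions rather than a method from the source: the actions `A` and `B`, the reward numbers and the `run_phase` helper are hypothetical, and the “rule” is read here as nothing more than the action the learner currently prefers.

```python
import random

def run_phase(values, rewards, steps, rng, alpha=0.1, epsilon=0.1):
    """Update action-value estimates in place from reward feedback.

    Returns the action preferred once the phase is over.
    """
    actions = list(values)
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.choice(actions)             # occasionally explore
        else:
            action = max(actions, key=values.get)    # otherwise follow the current "rule"
        # Constant step size: recent feedback outweighs older feedback.
        values[action] += alpha * (rewards[action] - values[action])
    return max(actions, key=values.get)

rng = random.Random(0)
values = {"A": 0.0, "B": 0.0}   # hypothetical actions, no prior preference

# Phase 1: action A pays off best, so "prefer A" emerges as the working rule.
rule_before = run_phase(values, {"A": 1.0, "B": 0.5}, steps=500, rng=rng)

# Phase 2: the situation changes and A stops paying off.
rule_after = run_phase(values, {"A": 0.0, "B": 0.5}, steps=500, rng=rng)

print(rule_before, rule_after)
```

With these reward settings the preference should shift from `A` to `B` once `A` stops working. No rule was ever written down, and none had to be explicitly revoked; the old preference simply decays under new feedback.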


Thompson, S. J. (Ed.). (2020). Machine Law, Ethics, and Morality in the Age of Artificial Intelligence. IGI Global.