Published Dec 3, 2023
Preface: This text is intended to convey one subtle concept related to the philosophy of knowledge (that induction constrains and cannot support deduction) and my extrapolations from it. I am publishing this Dec 3, 2023 as a rough draft and will update sections if they receive useful criticisms. The text requires further editing and may be repetitive. It begins with the ideas I believe most strongly and, while progressing toward a design concept for AGI, also progresses toward more technical arguments that I have lower confidence in. I welcome critiques of any of the ideas.
Introduction
The knowledge present in modern AI models is created by induction and lacks the explanatory structure that humans have used to unique advantage. I consider explanatory knowledge and the knowledge contained in our most powerful AI systems to have specific and distinct properties. I believe that the problem of creating explanatory knowledge is solvable with current technological capabilities and this could rapidly yield superintelligent machines that far exceed human abilities in every problem solving domain. I attempt to describe how this may be accomplished.
The ideas in this text are inspired by insights from Karl Popper, and by the further refinements from David Deutsch. The ideas relate to a central problem in the philosophy of science (highlighted by Karl Popper and David Miller): reconciling the progress of scientific knowledge with the impossibility of supporting a theory (an element of deductive knowledge) with evidence (an element of inductive knowledge). Those familiar with David Deutsch’s work may notice that I attempt to improve some of his definitions, including his characterization of induction. This paper skips over a more detailed philosophical critique of statistics to focus on the application to AI systems; however, I do plan to publish a more detailed explanation of the foundational idea. In short, the concept resembles Hempel’s deductive-nomological model of scientific explanation. Bayesian statistics (and inductive methods generally) contain the assumption that the future will resemble the past. Deductive models of the world contain the assumption that the world is connected and constrained by a unifying structure. The different properties of these two categories of knowledge mean that a deductive world model cannot be derived by inductive methods. Inductive methods can eventually generate approximate knowledge of how to intentionally construct a deductive world model (as humans have), yet I believe we can help our AI systems “wake up” rather than waiting for them to discover how to do so on their own.
Two Categories of Knowledge
Humans have a unique ability to translate their observations of the world into formal statements using a Turing-complete language (e.g. “P*V=n*R*T”, “All cats are eukaryotes”), then compose those statements, thus creating self-consistent explanatory structures (e.g. the collection of statements in the curriculum of “CHMB41 - Organic Chemistry I”). When combined with compatible inputs, explanatory structures can compute outputs across specific statements (e.g. “synthesizing 1000 kg of ammonia requires ‘a’ kg of hydrogen gas, which requires ‘b’ kJ of heat to synthesize, which requires ‘c’ kg of fuel, which costs ‘d’ dollars, etc.”).
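The composition of statements into a chain of computations can be sketched in a few lines. The conversion factors below are hypothetical placeholders, not real chemistry or prices; only the compositional structure of the statements is the point.

```python
# Sketch: composing discrete statements into an explanatory chain.
# All conversion factors are hypothetical placeholders (assumptions),
# standing in for the 'a', 'b', 'c', 'd' of the ammonia example.

# Each "statement" is a formal rule mapping one quantity to another.
statements = {
    "kg_H2_per_kg_NH3": 0.18,   # hypothetical: hydrogen needed per kg ammonia
    "kJ_per_kg_H2": 1200.0,     # hypothetical: heat to synthesize 1 kg hydrogen
    "kg_fuel_per_kJ": 0.00002,  # hypothetical: fuel burned per kJ of heat
    "usd_per_kg_fuel": 0.9,     # hypothetical: fuel price
}

def cost_of_ammonia(kg_nh3: float) -> float:
    """Compose the statements: NH3 -> H2 -> heat -> fuel -> dollars."""
    kg_h2 = kg_nh3 * statements["kg_H2_per_kg_NH3"]
    kj = kg_h2 * statements["kJ_per_kg_H2"]
    kg_fuel = kj * statements["kg_fuel_per_kJ"]
    return kg_fuel * statements["usd_per_kg_fuel"]

print(cost_of_ammonia(1000.0))
```

Because each statement is discrete and the relationships are absolute, outputs can be computed across any chain of compatible statements without loss of precision.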
Knowledge is a computationally reducible pattern stored within an information media, which maps to data of an observable external physical pattern (the world). All knowledge interacts with nature via measurement (e.g. nominal, ordinal, interval and ratio patterns). Measurement is the translation of a physical pattern into data within an information media. Observation is the mapping of a pattern within a knowledge structure to a pattern in data. Measurement is always theory-laden and can never be proven to be a true representation of the world. That is, truth cannot provably traverse the gap between the physical world and a virtual representation of the world that exists in an information media. Knowledge is inextricably an imperfect virtual representation of the patterns in the physical world.
There are two major forms of knowledge which I will call “interrogative” and “assertive”; these terms are roughly interchangeable with a posteriori (inductive) and a priori (deductive) knowledge, respectively. The new contrived terms are used to guide readers’ intuitions and evade possible conflicts with existing definitions. All learning requires a mechanism to guide pattern creation in the knowledge-storing medium. Interrogative learning uses a continuous mechanism (e.g. loss-minimizing backpropagation) whereas assertive learning uses a binary mechanism (e.g. is a hypothesis compatible with the data or not?). The physical power of a knowledge structure relative to the world resides in the latent ability of the knowledge to outmaneuver or subdue physical patterns of its environment that would infringe upon its vehicle (information media). A unit of interrogative knowledge is a feature. The unit of assertive knowledge is a statement (synonymous with theory and proposition).
Interrogative knowledge has an empirical relationship to nature. It is theoryless, not composed of discrete statements. It is created by induction, responsively adjusting the state of a knowledge-storing information media to approximate patterns measured from the world (input data). Thus, interrogative knowledge is affected by both confirming and disconfirming observations (“evidence”). Interrogative knowledge creation can be applied to low dimensional (e.g. simple linear regression) or high dimensional data (e.g. gradient descent). The structure of interrogative knowledge is continuous; features in a purely interrogative knowledge structure can only consist of “relatively discrete” (fundamentally continuous) approximations, meaning that these structures are not efficiently composable or useful for reasoning.
Interrogative knowledge constitutes much of the knowledge in biological (especially non-human) brains and artificial neural networks. This mode of knowledge creation has the capacity to learn high dimensional patterns in data. It has the flexibility to efficiently approximate existing patterns in any environment it can observe. Learning is not constrained by a requirement that the patterns learned be connected by logical rules. Interrogative knowledge characterizes the world with the assumption that the future resembles the past, or that true patterns repeat. The knowledge structure is somewhat like an impression of an object in high dimensional clay. The impressions exist in a malleable learning substrate that is accommodating but imprecise, and increasingly intricate patterns in training data tend to get smoothed over. Interrogative learning is particularly effective at generating power over a local environment (adapting to a reward function). The rate of interrogative knowledge creation is approximately logarithmic (initially rapid, then slowing) because obvious patterns in data can always be approximated quickly but subtle patterns cannot be distinguished from noise.
Assertive knowledge has a binary relationship to nature; it is either consistent with observation or not. It is composed only of formal statements (theories or propositions). It is created through a process of conjecture (statement formation) and refutation (statement elimination). In this process, the state of a knowledge-storing information media (i.e. specific statements or compositions of statements) is compared with observations and evaluated for consistency (a binary outcome); consistent knowledge persists and inconsistent knowledge does not. Thus, assertive knowledge is only affected by error detection; the knowledge cannot be confirmed and it is unaffected by magnitudes of error. Assertive knowledge creation can only focus on a specific feature (pattern) in training data. Assertive knowledge is created when statement addition, subtraction, or substitution increases the consistency of the knowledge structure with the training data (while maintaining non-refutability relative to the training data). The process of assertive knowledge creation is analogous to reverse engineering. The structure of assertive knowledge is discrete; any statements in a purely assertive knowledge structure have absolute relationships, making them efficient to compose and useful for reasoning. Explanation is a sub-type of assertive knowledge.
Assertive knowledge is the kind of knowledge present in genes and in scientific explanations. It has the capacity for learning high dimensional patterns, where dimensionality extends from the composition and execution of interacting statements (like the machine instructions of an OS). It is constrained to learning only discrete patterns that are compatible with other parts of the knowledge structure. Assertive knowledge characterizes the world with the assumption that all data is connected and constrained by a unified structure of causality. The knowledge structure is somewhat like the assembly of rigid components in an idealized mechanical system. The components are modular, specific components can only interact with other components in absolutely constrained ways, chains of interactions do not decrease precision, and a defect in a single component can cause large parts of the system to break. Assertive knowledge is particularly effective at generating power over the world (a global environment). The rate of assertive knowledge creation is approximately exponential (initially slow, then accelerating) because statement improvement can have global effects on the knowledge structure and can eliminate noise from data.
From this description of knowledge and its division into two types with distinct properties, I believe many insights can be generated about the definition of life, our genetic evolutionary history, and the adaptive effects of biological neural networks; I suspect also about the Fermi paradox and the composition of our laws of physics, though I believe those insights have little use in addressing the problem of creating AGI. I will not further discuss them in this text.
Explanation
Explanation is meta-knowledge. It is knowledge about a knowledge structure. Creation of explanatory knowledge requires attention to (and observation of) the relationships between units of knowledge. These relationships consist of the components shared among assertive knowledge statements (e.g. variables, constants, operators).
Informally, one may think of explanatory knowledge as “statements about what is there, what it does, and how and why” (as stated by David Deutsch). Because interrogative knowledge does not contain discrete components that can be used to form meta-patterns in a knowledge structure, explanatory knowledge is a sub-set of assertive knowledge.
Explanatory knowledge is a high order structure relative to assertive knowledge generally, meaning it is constructed using units of assertive knowledge, either atomic (the basic symbols which categorize observations) or composite (statement) units. Once an explanation is created it can begin to be observed as a pattern in data from the environment.
When explanatory knowledge maps to a unit of assertive knowledge, that assertive knowledge becomes explicit (to a degree that increases in relationship to the connectivity of the explanatory network). Knowledge is implicit when explanatory knowledge about its relationships with other units of knowledge is absent (i.e. those relationships are not described by the knowledge structure). Knowledge becomes more explicit when the number of explanatory connections in the structure increases. The knowledge within a genome, for example, lacks explanatory meta-knowledge even though it can execute the functions coded within it. When explanations are created about a knowledge structure, the knowledge transitions from implicit to explicit (the opposite transition also occurs when explanations are lost).
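One way to picture the implicit-to-explicit transition is as connectivity in a graph of statements, where edges are explanatory connections formed by shared components (variables, constants, operators). The statements and connections below are illustrative assumptions, not a proposed formalism.

```python
# Sketch: explicitness as connectivity in a graph of statements.
# Nodes are assertive statements; edges are explanatory connections
# formed by shared components. All entries are illustrative assumptions.

statements = ["P*V=n*R*T", "all cats are eukaryotes", "dU=T*dS-P*dV"]

# Explanatory meta-knowledge: pairs of statements linked by shared
# components (here, both gas-law statements involve P and V).
explanations = [("P*V=n*R*T", "dU=T*dS-P*dV")]

def explicitness(statement):
    """Count explanatory connections touching the statement."""
    return sum(statement in edge for edge in explanations)

# A statement with no explanatory connections remains implicit (count 0),
# like the knowledge in a genome; connected statements are more explicit.
print([explicitness(s) for s in statements])  # -> [1, 0, 1]
```

Adding further explanations (edges) raises the connection counts, which is the sense in which knowledge becomes more explicit as the explanatory network grows.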
The Relationship Between Interrogative and Assertive Knowledge
Knowledge can only be created relative to an environment (the substrate manifesting the patterns to be learned). An environment can consist of measurements (data) of the physical world, but it can also consist of measurements of other knowledge structures (virtual worlds). Interrogative and assertive knowledge structures can both learn from other interrogative or assertive knowledge structures. For assertive knowledge to be created, it must fit within (it is constrained by) the patterns of its environment. For interrogative knowledge to be created, it must adjust toward (approximate) the patterns of its environment.
Because the high dimensional shape of interrogative knowledge adjusts toward patterns of the world, it can relatively quickly construct an approximate model of its environment. Such a model can serve as an effective intermediary virtual substrate for constraining the shape of assertive knowledge about the same physical environment. For example, the assertive knowledge in science can be varied and selected within the constraints of our learned intuitions about the world prior to attempts at testing assertive knowledge against nature; whereas the assertive knowledge contained in a genome can only improve by random variation and selection by the physical world. The use of an intermediary interrogative knowledge structure as a kind of orthotic device can enable efficient statement generation and testing, focusing statement generation (creativity) and bypassing the complexity of testing statements in the physical world (constructing physical devices that can collect the pertinent data).
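The orthotic-device idea can be sketched as a two-stage filter: conjectures are first screened cheaply against a learned surrogate of the world, and only the survivors earn an expensive test against the world itself. The "world", the surrogate, and the candidate laws below are toy assumptions.

```python
# Sketch: an interrogative surrogate as a cheap pre-filter for conjectures.
# The "world" stands in for expensive physical experiments; the surrogate
# is a learned approximation with a small systematic error (assumptions).

def world(x):
    return x * x          # ground truth: expensive to query in reality

def surrogate(x):
    return x * x + 0.3    # learned approximation: close but imprecise

# Conjectures: candidate laws y = a * x^2 with different coefficients.
conjectures = {a: (lambda x, a=a: a * x * x) for a in (0.5, 1.0, 2.0)}
probe = [1.0, 2.0, 3.0]

def plausible(h, tol=1.0):
    """Cheap binary filter: does the conjecture survive the surrogate?"""
    return all(abs(h(x) - surrogate(x)) <= tol for x in probe)

candidates = [a for a, h in conjectures.items() if plausible(h)]
# Only surviving candidates are tested against the (expensive) world.
confirmed = [a for a in candidates
             if all(abs(conjectures[a](x) - world(x)) < 1e-9 for x in probe)]
print(candidates, confirmed)
```

The surrogate's imprecision means the filter is conservative rather than conclusive, which mirrors the role of learned intuitions in science: they focus creativity and reduce the number of statements that must be tested against nature.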