\documentclass[a4paper]{article}

%% Language and font encodings
\usepackage[english]{babel}
\usepackage[utf8]{inputenc}
\usepackage[T1]{fontenc}

%% Sets page size and margins
\usepackage[a4paper,top=3cm,bottom=2cm,left=3cm,right=3cm,marginparwidth=1.75cm]{geometry}

%% Useful packages
\usepackage{amsmath}
\usepackage{graphicx}
\usepackage[colorinlistoftodos]{todonotes}
\usepackage[colorlinks=true, allcolors=blue]{hyperref}

\title{Distributed Parallel Terraced Scan}
%% \title{The Brain as a Distributed System}?
\author{Lucas Saldyt, Alexandre Linhares}

\begin{document}
\maketitle

\begin{abstract}
We investigate FARG architectures in general, and Copycat in particular. One of the foundations of these models is the \emph{Parallel Terraced Scan}: a psychologically plausible mechanism that enables a system to fluidly move between different modes of processing. Previous work has modeled decision-making under the Parallel Terraced Scan using a central variable of \emph{Temperature}. We explore how the same behavior can be maintained while decision-making is distributed throughout the system.
\end{abstract}
\section{Introduction}

This paper stems from the work of Mitchell (1993) and Hofstadter \& FARG (1995). The goals of this project are twofold.

Firstly, we focus on effectively simulating intelligent processes through increasingly distributed decision-making.

...

% Written by Linhares:
The Parallel Terraced Scan is a major innovation of FARG architectures.
It corresponds to the psychologically plausible behavior of briefly browsing, say, a book, and delving deeper whenever something sparks one's interest.
This type of behavior seems to change the intensity of an activity very fluidly, based on local, contextual cues.
It is found in high-level decisions such as marriage, and in low-level decisions such as a foraging predator choosing whether to further explore a particular area.
Previous FARG models have used a central temperature $T$ to implement this behavior.
We explore how to maintain the same behavior while distributing decision-making throughout the system.

...

Specifically, we begin by attempting different refactors of the Copycat architecture.
First, we experiment with different treatments of temperature, adjusting the formulas that depend on it.
Then, we experiment with two methods for replacing temperature with a distributed metric.
First, we remove temperature destructively, deleting every line of code that mentions it, simply to see what effect this has.
Then, we move toward a surgical removal of temperature, leaving affected structures intact or replacing them with effective distributed mechanisms.
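To make the contrast between centralized and distributed decision-making concrete, here is a minimal sketch in Python rather than Copycat's actual code. All names and formulas are illustrative assumptions, not Copycat's real API: the centralized style consults one global value, while the distributed style consults a value computed purely from local context.
\begin{verbatim}
import random

# Centralized style: one global temperature (0..100) biases
# every probabilistic decision in the whole system.
TEMPERATURE = 50.0  # hypothetical global, echoing classic Copycat

def decide_centralized(weight):
    # Higher temperature -> choices closer to a coin flip;
    # lower temperature -> choices closer to greedy.
    bias = TEMPERATURE / 100.0
    p = (1.0 - bias) * weight + bias * 0.5
    return random.random() < p

# Distributed style: each structure derives its own local
# "temperature" from its neighborhood alone.
class Structure:
    def __init__(self, strength, neighbor_strengths):
        self.strength = strength
        self.neighbors = neighbor_strengths

    def local_temperature(self):
        # Illustrative: local disorder = mean weakness nearby.
        nearby = self.neighbors + [self.strength]
        return 100.0 * (1.0 - sum(nearby) / len(nearby))

    def decide(self):
        bias = self.local_temperature() / 100.0
        p = (1.0 - bias) * self.strength + bias * 0.5
        return random.random() < p
\end{verbatim}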
Secondly, we focus on the creation of a `normal science' framework for FARG architectures.
By `normal science' we mean the term coined by Thomas Kuhn: the collaborative enterprise of furthering understanding within a paradigm.
Today, ``normal science'' is simply not done on FARG architectures (nor on most computational cognitive architectures; see Addyman \& French 2012).
Unlike mathematical theories or experiments, which can be replicated by following the materials and methods, computational models generally have dozens of finely tuned variables, undocumented procedures, multiple assumptions about the user's computational environment, and so on.
It then becomes close to impossible to reproduce a result or to test a new idea.
This paper focuses on the introduction of statistical techniques, the reduction of ``magic numbers'', the improvement and documentation of formulas, and proposals for effective comparison with human subjects.

We also discuss, in general, the nature of the brain as a distributed system.
While the removal of a single global variable may initially seem trivial, one must realize that Copycat and other cognitive architectures have many central structures.
This paper explores the justification of these central structures in general.
Is it possible to model intelligence with them, or are they harmful?

...
\section{Distributed Decision Making and Normal Science}
\subsection{Distributed Decision Making}

The distributed nature of decision-making is essential to modeling intelligent processes [..]
\subsection{Normal Science}

An objective, scientifically oriented framework is essential to making progress in the domain of cognitive science.

[John von Neumann, \emph{The Computer and the Brain}?
``He pointed out that there were good grounds merely in terms of electrical analysis to show that the mind, the brain itself, could not be working on a digital system. It did not have enough accuracy; or... it did not have enough memory. ...And he wrote some classical sentences saying there is a statistical language in the brain... different from any other statistical language that we use... this is what we have to discover. ...I think we shall make some progress along the lines of looking for what kind of statistical language would work.'']

This is the notion that the brain obeys statistical, entropic mathematics.
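One hedged way to make this notion concrete: if the brain's ``statistical language'' is entropic, then a natural quantity to track over a distribution $p_1, \ldots, p_n$ of competing structures or answers is the Shannon entropy
\[
H = -\sum_{i=1}^{n} p_i \log_2 p_i ,
\]
which is maximal when all options are equally likely (maximal disorder) and zero when a single option dominates. Read this way, a Copycat-style temperature can be seen as a rough, centralized proxy for the entropy of the workspace; we offer this connection as an interpretive assumption, not an established result.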
\section{Notes}
According to the differences we can enumerate between brains and computers, and given that computers are universal and have vastly improved in the past five decades, it is clear that computers are capable of simulating intelligent processes.
[Cite Von Neumann].
The main obstacle now lies in our comprehension of intelligent processes.
Once we truly understand the brain, writing software that emulates intelligence will be a comparatively simple software-engineering task.
However, we must be careful to remain true to what we already know about intelligent processes, so that we may come closer to learning more about them and eventually replicating them in full.

The largest difference between the computer and the brain is the distributed nature of computation.
Specifically, our computers as they exist today have central processing units, where essentially all computation happens.
Our brains, on the other hand, have no central location where all processing happens.
Luckily, the speed advantage and universality of computers make it possible to simulate the distributed behavior of the brain.
However, this simulation is only possible if computers are programmed with concern for the distributed nature of the brain.
[Actually, I go back and forth on this: global variables might be plausible, but likely aren't.]
Also, even though the brain is distributed, some clustered processes must take place.

So, centralized structures should be removed from the Copycat software, because removing them will likely improve the accuracy with which intelligent processes are simulated.
It is not clear to what degree this refactor should take place.
The easiest target is the central variable, temperature, but other central structures exist.
This paper focuses primarily on temperature, and on the unwanted global unification associated with it.

Even though Copycat uses simulated parallel code, if Copycat were actually parallelized, the global variable of temperature would prevent most codelets from running at the same time.
If this global variable and other constricting centralized structures were removed, Copycat's code would more closely replicate intelligent processes and could be run much faster.
From a functional-programming perspective (i.e., LISP, the original language of Copycat), the brain should simply be carrying out the same function in many locations (mapping neuron.process() across each of its neurons, if you will...).
However, introducing global variables violates this model: every codelet that reads or writes the global must pause and synchronize on it, as the sketch below illustrates.
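The sketch is illustrative Python, not Copycat's real code; process() stands in for the hypothetical per-unit function from the analogy above. A shared global forces every worker through the same lock, whereas the purely functional version can be mapped across a process pool with no synchronization at all.
\begin{verbatim}
from multiprocessing import Pool
from threading import Lock

# Centralized: a shared global forces synchronization.
temperature = 50.0
temperature_lock = Lock()

def codelet_with_global(strength):
    with temperature_lock:  # every codelet queues here
        t = temperature
    return strength * (100.0 - t) / 100.0

# Functional: each unit depends only on its own inputs,
# so the same function can run everywhere at once.
def process(neuron_state):
    activation, inputs = neuron_state
    return activation + sum(inputs)  # purely local update

if __name__ == "__main__":
    neurons = [(0.1, [0.2, 0.3]), (0.4, [0.1]), (0.0, [0.5, 0.5])]
    with Pool() as pool:
        print(pool.map(process, neurons))  # embarrassingly parallel
\end{verbatim}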
Global variables seem like a construct that people use to model the real world.
...

It is entirely possible that, at the level of abstraction Copycat uses, global variables are perfectly acceptable.
For example, a quick grep search of Copycat shows that the workspace singleton also exists as a global variable.
Making all of Copycat distributed would clearly require a full rewrite of the software.

If Copycat can be run such that codelets actually execute at the same time (without pausing to access globals), then it will much better replicate the human brain.

However, I question the assumption that the human brain has absolutely no centralized processing.
For example, input and output channels (i.e., speech mechanisms) are not accessible from the entire brain.
Also, research on brain regions (for example, concerning Wernicke's or Broca's areas) leads me to believe that some brain regions truly are ``specialized'', which lends some support to the existence of centralized structures in a computer model of the brain.
However, these centralized structures may themselves be emergent.

So, to reiterate, two hypotheses exist:
\begin{enumerate}
\item A computer model of the brain can contain centralized structures and still be effective in its modeling.
\item A computer model cannot have any centralized structures if it is going to be effective in its modeling.
\end{enumerate}

Another important problem is defining the word ``effective''.
I suppose that ``effective'' would mean capable of solving problems well.
However, it is not clear to me that removing temperature increases the ability to solve problems effectively.
Is this because models are allowed to have centralized structures, or because temperature is not the only centralized structure?

Clearly, creating a version of Copycat that has no centralized structures will take an excessive amount of effort.
\subsection{Steps/plan}

Normal Science:
\begin{enumerate}
\item Introduce statistical techniques
\item Reduce magic-number usage; document reasoning and math
\item Propose effective human-subject comparison
\end{enumerate}

Temperature:
\begin{enumerate}
\item Propose formula improvements
\item Experiment with a destructive removal of temperature
\item Experiment with a ``surgical'' removal of temperature
\item Assess different Copycat versions with and without temperature
\end{enumerate}
\subsection{Semi-structured Notes}

Biological or psychological plausibility only matters if it actually affects the presence of intelligent processes.
For example, neurons do not exist in Copycat because we feel that they are not required to simulate the processes being studied.
Instead, Copycat uses higher-level structures to simulate the same emergent processes that neurons do.
However, codelets, and the control of them, rely on a global function representing tolerance to irrelevant structures.
Other higher-level structures in Copycat likely rely on globals as well.
Another central structure in Copycat is the ``rule'' structure, of which there is only one.
While some global variables might be viable, others may actually obstruct the ability to model intelligent processes.
For example, a distributed notion of temperature would not only increase biological and psychological plausibility, but also increase Copycat's effectiveness at producing acceptable answer distributions.

We must also realize that Copycat is only a model, so even if we take goals (level of abstraction) and biological plausibility into account...
It is only worth changing temperature if doing so affects the model.
Arguably, it does affect the model. (Or rather, we hypothesize that it does. There is only one way to find out for sure, and that is the point of this paper.)

So, maybe this is a paper about goals, model accuracy, and an attempt to find which cognitive details matter and which do not. It also might provide some insight into making a ``Normal Science'' framework.

Copycat is full of random, uncommented parameters and formulas.
Personally, I would advocate for removing, or at least documenting, as many of these as possible.
In an ideal model, every number present would either come from an existing mathematical formula or be present for a very good (emergent and explainable, so that no other number would make sense in the same place) reason.
However, settling on so-called ``magic'' numbers because the authors of the program believed their parameterizations were correct is very dangerous.
If we removed random magic numbers, we would gain confidence in our model, progress toward a normal science, and gain a better understanding of cognitive processes.
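As a small, hypothetical illustration of the difference (the names and values below are invented for this sketch, not taken from Copycat's source):
\begin{verbatim}
# Before: a bare "magic number" with no justification.
def urgency(strength):
    return strength ** 3.0

# After: the same parameter, named, documented, and testable.
# CONTRAST_EXPONENT is an assumed contrast-sharpening exponent,
# chosen so that weak structures are strongly suppressed; any
# change to it should be justified by its measured effect on
# answer distributions.
CONTRAST_EXPONENT = 3.0

def urgency_documented(strength):
    return strength ** CONTRAST_EXPONENT
\end{verbatim}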
Similarly, much of the testing of Copycat is based on human perception of answer distributions.
However, I suggest that we move to a more statistical approach.
For example, we could fix some baseline answer distribution, modify Copycat to obtain other answer distributions, and compare the distributions with a statistical significance test; this would actually indicate what effect each change had.
This paper will include code changes and proposals that lead Copycat (and FARG projects in general) toward a more statistical and verifiable approach.
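Here is a minimal sketch of such a comparison, assuming we have tallied answer counts for the classic problem ``abc $\rightarrow$ abd; ijk $\rightarrow$ ?'' from a baseline Copycat and a modified one. The counts are invented for illustration; the test itself is a standard chi-squared test of homogeneity.
\begin{verbatim}
from scipy.stats import chi2_contingency

# Hypothetical answer counts from two Copycat variants.
baseline = {"ijl": 700, "ijd": 200, "ijk": 60, "other": 40}
modified = {"ijl": 760, "ijd": 160, "ijk": 50, "other": 30}

answers = sorted(baseline)
table = [[baseline[a] for a in answers],
         [modified[a] for a in answers]]

chi2, p, dof, _ = chi2_contingency(table)
print(f"chi2={chi2:.2f}, dof={dof}, p={p:.4f}")
print("significant" if p < 0.05 else "not significant")
\end{verbatim}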
While there is a good argument that Copycat represents an individual with biases, and is therefore incomparable to a distributed group of individuals, I believe that additional effort should be made to test Copycat against human subjects.
I may include in this paper a concrete proposal for how such an experiment might be done.

Let us simply test the hypotheses:

$H_1$: Copycat will have an improved answer distribution if temperature is turned into a set of distributed metrics. Here ``improved'' means significantly different, with increased frequencies of more desirable answers and decreased frequencies of less desirable ones; desirability will be determined by some concrete metric, such as the number of relationships that are preserved or mirrored (a hypothetical instance of such a metric is sketched below).

$H_0$: Copycat's answer distribution will be unaffected by changing temperature into a set of distributed metrics.
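The following purely illustrative sketch counts the successor relationships of the target string that an answer preserves; the real metric would need careful justification.
\begin{verbatim}
# Count successor pairs (e.g., i->j) shared by target and answer.
def successor_pairs(s):
    return {(a, b) for a, b in zip(s, s[1:]) if ord(b) - ord(a) == 1}

def desirability(target, answer):
    return len(successor_pairs(target) & successor_pairs(answer))

print(desirability("ijk", "ijl"))  # keeps i->j          -> 1
print(desirability("ijk", "ijk"))  # keeps i->j and j->k -> 2
\end{verbatim}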
\subsection{Random Notes}

This is all just free-flowing, unstructured notes. Don't take anything too seriously :).

Below is a list of relevant primary and secondary sources I am reviewing.

Biological/psychological plausibility:
\begin{verbatim}
http://www.cell.com/trends/cognitive-sciences/abstract/S1364-6613(16)30217-0
"There is no evidence for a single site of working memory storage."
https://ekmillerlab.mit.edu/2017/01/10/the-distributed-nature-of-working-memory/

Creativity as a distributed process (SECONDARY: review primaries)
https://blogs.scientificamerican.com/beautiful-minds/the-real-neuroscience-of-creativity/
"cognition results from the dynamic interactions of distributed brain areas
operating in large-scale networks"
http://scottbarrykaufman.com/wp-content/uploads/2013/08/Bressler_Large-Scale_Brain_10.pdf
\end{verbatim}
\bibliographystyle{alpha}
\bibliography{sample}

\end{document}