Moves notes into paper structure

This commit is contained in:
LSaldyt
2017-12-02 13:36:48 -07:00
parent b6789b96f9
commit bb3bdf251d


@ -26,49 +26,45 @@
\maketitle
\begin{abstract}
We investigate the distributed nature of computation in a FARG architecture, Copycat.
One of the foundations of FARG models is the \emph{Parallel Terraced Scan}--a psychologically plausible mechanism that enables a system to fluidly move between different modes of processing.
Previous work has modeled decision-making under Parallel Terraced Scan by using a central variable of \emph{Temperature}.
However, it is unlikely that this design decision accurately replicates the processes in the human brain.
This paper proposes several changes to copycat architectures that will increase their modeling accuracy.
\end{abstract}
\section{Introduction}
This paper stems from Mitchell's (1993) and Hofstadter's \& FARG's (1995) work on the copycat program.
This project focuses on effectively simulating intelligent processes through increasingly distributed decision-making.
In the process of evaluating the distributed nature of copycat, this paper also proposes a "Normal Science" framework.
First, copycat uses a "Parallel Terraced Scan" as a humanistically inspired search algorithm.
The Parallel Terraced Scan corresponds to the psychologically-plausible behavior of briefly browsing, say, a book, and delving deeper whenever something sparks one's interest.
In a way, it is a mix between a depth-first and breadth-first search.
This type of behavior seems to very fluidly change the intensity of an activity based on local, contextual cues.
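As a purely illustrative sketch (not copycat's implementation; the function and variable names here are hypothetical), this behavior can be thought of as interest-weighted sampling over candidate structures:
\begin{verbatim}
import random

# Hypothetical sketch of a parallel terraced scan: every candidate gets a
# little attention, and candidates that look promising receive more of it.
def terraced_scan(candidates, evaluate, rounds=100):
    interest = {c: 1.0 for c in candidates}           # uniform initial interest
    for _ in range(rounds):
        total = sum(interest.values())
        weights = [interest[c] / total for c in candidates]
        chosen = random.choices(candidates, weights=weights, k=1)[0]
        interest[chosen] += evaluate(chosen)           # evaluate returns a non-negative score
    return max(interest, key=interest.get)
\end{verbatim}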
Previous FARG models use centralized structures, like the global temperature value, to control the behavior of the Parallel Terraced Scan.
This paper explores how to maintain the same behavior while distributing decision-making throughout the system.
Specifically, this paper attempts different refactors of the copycat architecture.
First, the probability adjustment formulas based on temperature are changed.
Then, we experiment with two methods for replacing temperature with a distributed metric.
Initially, temperature is removed destructively, essentially removing any lines of code that mention it, simply to see what effect it has.
Then, a surgical removal of temperature is attempted, leaving affected structures intact or replacing them with effective distributed mechanisms.
To evaluate the distributed nature of copycat, this paper focuses on the creation of a `normal science' framework.
By `normal science,' this paper means the term coined by Thomas Kuhn--the collaborative enterprise of furthering understanding within a paradigm.
Today, "normal science" is simply not done on FARG architectures (and on most computational cognitive architectures too... see Addyman \& French 2012).
Unlike mathematical theories or experiments, which can be replicated by following the materials and methods, computational models generally have dozens of finely tuned variables, undocumented procedures, multiple assumptions about the user's computational environment, and so on.
It then becomes close to impossible to reproduce a result, or to test some new idea scientifically.
This paper focuses on the introduction of statistical techniques, reduction of "magic numbers", improvement and documentation of formulas, and proposals for statistical human comparison.
We also discuss, in general, the nature of the brain as a distributed system.
While the removal of a single global variable may initially seem trivial, one must realize that copycat and other cognitive architectures have many central structures.
This paper explores the justification of these central structures in general.
Is it possible to model intelligence with them, or are they harmful?
\section{Theory}
Given the differences we can enumerate between brains and computers, and since computers are universal and have vastly improved in the past five decades, it is clear that computers are capable of simulating intelligent processes.
[Cite Von Neumann].
@ -90,6 +86,7 @@
Even though copycat uses simulated parallel code, if copycat were actually parallelized, the global variable of temperature would prevent most copycat codelets from running at the same time.
If this global variable and other constricting centralized structures were removed, copycat's code would more closely replicate intelligent processes and could run much faster.
From a functional-programming perspective (e.g. LISP, the original language of copycat), the brain should simply be carrying out the same function in many locations (i.e. mapping neuron.process() across each of its neurons, if you will).
Note that this may be more similar to the behavior of a GPU than a CPU.
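To illustrate the point (a minimal sketch, not copycat's code; the helper names are hypothetical), codelets that touch only local state can be mapped across workers directly, whereas reads and writes of a shared global temperature would have to be serialized:
\begin{verbatim}
from concurrent.futures import ProcessPoolExecutor

# Hypothetical codelet that uses only the state handed to it; because nothing
# global is read or written, many of these can run at the same time.
def run_local_codelet(local_state):
    return sum(local_state) / len(local_state)

def run_all(codelet_states):
    with ProcessPoolExecutor() as pool:    # map, GPU-style, over codelets
        return list(pool.map(run_local_codelet, codelet_states))

if __name__ == "__main__":
    print(run_all([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))
\end{verbatim}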
However, in violating this model with the introduction of global variables......
Global variables seem like a construct that people use to model the real world.
@ -118,8 +115,6 @@
Clearly, creating a model of copycat that doesn't have centralized structures will take an excessive amount of effort.
\break
The calculation for temperature in the first place is extremely convoluted (in the Python version of copycat). The calculation for temperature in the first place is extremely convoluted (in the Python version of copycat).
@ -161,7 +156,7 @@
For example, the global formula for temperature converts the raw importance value for each object into a relative importance value for each object.
If a distributed metric were used, this importance value would have to be left in its raw form.
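Schematically (this is not the actual copycat formula, only the kind of global normalization being described), the conversion needs a sum over every object in the workspace, which is precisely what a purely local metric would not have access to:
\begin{verbatim}
# Schematic illustration only: raw importances are converted to relative ones
# by normalizing over all objects, a whole-workspace (global) operation.
def relative_importances(raw):
    total = sum(raw.values()) or 1.0   # guard against an empty workspace
    return {obj: value / total for obj, value in raw.items()}

# e.g. relative_importances({"a": 2.0, "b": 6.0}) -> {"a": 0.25, "b": 0.75}
\end{verbatim}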
\break
The original copycat was written in LISP, a mixed-paradigm language.
Because of LISP's preference for functional code, global variables are conventionally marked with surrounding asterisks.
@ -181,7 +176,42 @@
Alternatively, codelets could be equated to ants in an anthill (see anthill analogy in GEB).
Instead of querying a global structure, codelets could query their neighbors, the same way that ants query their neighbors (rather than, say, relying on instructions from their queen).
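A minimal sketch of what that could look like (hypothetical class and attribute names, not copycat's API): each codelet derives a local estimate of temperature from the codelets it can see, rather than reading a global value:
\begin{verbatim}
# Hypothetical sketch: a codelet estimates "temperature" from its neighborhood.
class Codelet:
    def __init__(self, satisfaction):
        self.satisfaction = satisfaction   # local measure of structure quality, in [0, 1]
        self.neighbors = []                # other codelets this one can query

    def local_temperature(self):
        group = [self] + self.neighbors
        # high satisfaction in the neighborhood -> low local temperature
        return 1.0 - sum(c.satisfaction for c in group) / len(group)
\end{verbatim}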
Biological or psychological plausibility only matters if it actually affects the presence of intelligent processes.
For example, neurons don't exist in copycat because we feel that they aren't required to simulate the processes being studied.
Instead, copycat uses higher-level structures to simulate the same emergent processes that neurons do.
However, the control of codelets relies on a global function representing tolerance to irrelevant structures.
Other higher-level structures in copycat likely rely on globals as well.
Another central variable in copycat is the "rule" structure, of which there is only one.
While some global variables might be viable, others may actually obstruct the ability to model intelligent processes.
For example, a distributed notion of temperature will not only increase biological and psychological plausibility, but also increase copycat's effectiveness at producing acceptable answer distributions.
We must also realize that copycat is only a model, so even when we take goals (level of abstraction) and biological plausibility into account, it is only worth changing temperature if it affects the model.
Arguably, it does affect the model. (Or, rather, we hypothesize that it does. There is only one way to find out for sure, and that's the point of this paper)
So, maybe this is a paper about goals, model accuracy, and an attempt to find which cognitive details matter and which don't. It also might provide some insight into making a "Normal Science" framework.
Copycat is full of random, uncommented parameters and formulas.
Personally, I would advocate for removing or at least documenting as many of these as possible.
In an ideal model, all of the numbers present would either come from existing mathematical formulas or be present for a very good (emergent and explainable, so that no other number would make sense in the same place) reason.
However, settling on so-called "magic" numbers because the authors of the program believed that their parameterizations were correct is very dangerous.
If we removed random magic numbers, we would gain confidence in our model, progress towards a normal science, and gain a better understanding of cognitive processes.
Similarly, a lot of the testing of copycat is based on human perception of answer distributions.
However, I suggest that we move to a more statistical approach.
For example, deciding on some baseline answer distribution, modifying copycat to obtain other answer distributions, and then comparing the distributions with a statistical significance test would actually indicate what effect each change had.
This paper will include code changes and proposals that lead copycat (and FARG projects in general) to a more statistical and verifiable approach.
While there is a good argument that copycat represents an individual with biases and is therefore incomparable to a distributed group of individuals, I believe that additional effort should be made to test copycat against human subjects.
I may include in this paper a concrete proposal for how such an experiment might be done.
Let us simply test the hypothesis $H_i$: copycat will have an improved answer distribution (significantly different, with increased frequencies of more desirable answers and decreased frequencies of less desirable answers, where desirability will be determined by some concrete metric, such as the number of relationships that are preserved or mirrored) if temperature is turned into a set of distributed metrics.
The null hypothesis $H_0$ is that copycat's answer distribution will be unaffected by changing temperature to a set of distributed metrics.
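As a sketch of the kind of test meant here (answer counts are invented for illustration; scipy is assumed to be available), a baseline answer distribution and a modified one can be compared with a $\chi^2$ test of independence:
\begin{verbatim}
from scipy.stats import chi2_contingency

# Invented counts for the classic "abc -> abd, ijk -> ?" problem.
baseline = {"ijl": 612, "ijd": 310, "ijk": 45, "other": 33}
modified = {"ijl": 655, "ijd": 281, "ijk": 38, "other": 26}

answers = sorted(set(baseline) | set(modified))
table = [[baseline.get(a, 0) for a in answers],
         [modified.get(a, 0) for a in answers]]

chi2, p, dof, _ = chi2_contingency(table)
print(f"chi^2 = {chi2:.2f}, dof = {dof}, p = {p:.4f}")
# Reject H_0 (no change in the answer distribution) when p is below the chosen alpha.
\end{verbatim}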
\subsection{Normal Science}
\subsubsection{Scientific Style}
\subsubsection{Scientific Testing}
\subsection{Distribution}
\subsubsection{Von Neumann Discussion}
An objective, scientifically oriented framework is essential to making progress in the domain of cognitive science.
[John Von Neumann: The Computer and the Brain?
He pointed out that there were good grounds merely in terms of electrical analysis to show that the mind, the brain itself, could not be working on a digital system. It did not have enough accuracy; or... it did not have enough memory. ...And he wrote some classical sentences saying there is a statistical language in the brain... different from any other statistical language that we use... this is what we have to discover. ...I think we shall make some progress along the lines of looking for what kind of statistical language would work.]
Notion that the brain obeys statistical, entropic mathematics
\subsubsection{Turing Completeness}
\subsubsection{Computers Can Simulate Brains}
\subsubsection{Simulation of Distributed Processes}
\subsubsection{Efficiency of True Distribution}
\subsubsection{Temperature in Copycat}
\subsubsection{Other Centralizers in Copycat}
\subsubsection{The Motivation for Removing Centralizers in Copycat}
\section{Methods}
\subsection{Formula Adjustments}
\subsubsection{Temperature Probability Adjustment}
This research began with adjustments to probability weighting formulas.
@ -237,7 +267,7 @@ After some experimentation and reading the original copycat documentation, it wa
The following formulas let $U = p^r$ if $p < 0.5$ and let $U = p^{\frac{1}{r}}$ if $p \geq 0.5$.
This controls whether/when curving happens.
Now, the parameter $r$ simply controls the degree to which curving happens.
Different values of $r$ between $10$ and $1$ were experimented with, at increasingly smaller step sizes.
$2$ and $1.05$ are both good choices at opposite "extremes".
$2$ works because it is large enough to produce novel changes in behavior at extreme temperatures without totally disregarding the original probabilities.
Values above $2$ do not work because they make probabilities too uniform.
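A direct transcription of this rule (how $r$ is ultimately tied to temperature is omitted here):
\begin{verbatim}
def curve_probability(p, r):
    """Apply the curving rule above: p**r below 0.5, p**(1/r) at or above 0.5.

    r = 1 leaves probabilities unchanged; larger r curves more strongly.
    """
    return p ** r if p < 0.5 else p ** (1.0 / r)

# e.g. with r = 2:    curve_probability(0.25, 2)    == 0.0625
#      with r = 1.05: curve_probability(0.25, 1.05) is about 0.233
\end{verbatim}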
@ -265,58 +295,14 @@ At this point, I plan on using the git branch "feature-normal-science-framework"
Then, I'll do a massive cross-formula answer distribution comparison with $\chi^2$ tests. This will give me an idea about which formula and which changes are best.
I'll also be able to compare all of these answer distributions to the frequencies obtained in temperature removal branches of the repository.
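A sketch of what such a cross-formula comparison could look like (formula labels and counts are hypothetical; scipy assumed):
\begin{verbatim}
from itertools import combinations
from scipy.stats import chi2_contingency

# Hypothetical answer counts per formula variant, in a fixed answer order.
distributions = {
    "original": [612, 310, 45, 33],
    "r = 2":    [655, 281, 38, 26],
    "r = 1.05": [620, 305, 44, 31],
}

# Pairwise chi^2 tests between every pair of variants.
for (name_a, counts_a), (name_b, counts_b) in combinations(distributions.items(), 2):
    chi2, p, _, _ = chi2_contingency([counts_a, counts_b])
    print(f"{name_a} vs {name_b}: chi^2 = {chi2:.2f}, p = {p:.4f}")
\end{verbatim}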
\subsubsection{Temperature Calculation Adjustment}
\subsubsection{Temperature Usage Adjustment}
\subsection{$\chi^2$ Distribution Testing}
\section{Results}
\subsection{$\chi^2$ Table}
\section{Discussion}
\subsection{Distributed Computation Accuracy}
\subsection{Prediction}
\bibliographystyle{alpha}
\bibliography{sample}