Trying To Think
Tuesday, September 20, 2005
Surface Metaphysics and Data Mining
Surface metaphysics claims that to exist is simply to be a pattern (that is capable of being tracked). Data Mining finds patterns, using the operational definition that a pattern enables prediction of new data. This seems similar, but a puzzling difference is that in Data Mining, it is the data that "exists", and the patterns are just useful abstractions of it (and so, applying a Buddhist perspective on abstractions, they don't really exist at all, but are useful as a "screen" over reality). Can we reconcile this, and have a surface metaphysics perspective of data mining? And if so, how would data mining techniques and assumptions affect metaphysics (and vice versa)?
The different definition of a pattern is also interesting; what patterns would data mining find if it used the surface metaphysics definition? Are the two equivalent? (If so, SM should probably favour the more useful operational definition.)
Monday, January 31, 2005
Data Mining
Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations.
Witten and Frank, QA76.9.D343.W58/2000
Data Mining is the extraction of implicit, previously unknown, and potentially useful information from data. The idea is to build computer programs that sift through databases automatically, seeking regularities or patterns. Strong patterns, if found, will likely generalize to make accurate predictions on future data. Of course, there will be problems. Many patterns will be banal and uninteresting. Others will be spurious, contingent on accidental coincidences in the particular dataset used. And real data is imperfect: some parts are garbled, some missing. (pg xix)
Useful patterns allow us to make non-trivial predictions on new data. (pg3)
One way of visualizing the problem of learning - and one that distinguishes it from statistical approaches - is to imagine a search through a space of possible concept descriptions for one that fits the data. (pg27)
Viewing generalization as a search in a space of possible concepts makes it clear that the most important decisions in a machine learning system are:
- the concept description language
- the order in which the space is searched
- the way that overfitting to the particular training data is avoided
These three properties are generally referred to as the bias of the search and are called language bias, search bias, and overfitting-avoidance bias. (pg29)
When building a decision tree, a commitment to split early on using a particular attribute might turn out later to be ill-considered in the light of how the tree develops below that node. To get around these problems, a "beam search" could be used where irrevocable commitments are not made, but instead a set of several active alternatives - whose number is the "beam width" - are pursued in parallel. (pg31)
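A minimal sketch of the beam search idea, not taken from the book. The function names (`beam_search`, `expand`, `score`) and the toy score table are invented for illustration; the table is rigged so that the greedy choice at the first step leads away from the best final state.

```python
def beam_search(start, expand, score, beam_width, steps):
    """Keep the `beam_width` best partial solutions alive at every step."""
    beam = [start]
    for _ in range(steps):
        candidates = [child for state in beam for child in expand(state)]
        if not candidates:
            break
        # Stable sort: ties keep the order in which candidates were generated.
        candidates.sort(key=score, reverse=True)
        beam = candidates[:beam_width]
    return max(beam, key=score)


# Hypothetical search space: bit strings grown one character at a time.
SCORES = {
    "0": 1, "1": 2,
    "01": 3, "10": 2, "11": 2,
    "011": 10, "101": 3, "110": 3, "111": 4,
}


def expand(state):
    return [state + "0", state + "1"]


def score(state):
    return SCORES.get(state, 0)


greedy = beam_search("", expand, score, beam_width=1, steps=3)
wider = beam_search("", expand, score, beam_width=2, steps=3)
```

With a beam width of 1 the search commits to "1" at the first step and never sees "011"; with a width of 2 the weaker-looking "0" branch survives long enough to pay off.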
If disjunction is allowed, useless concept descriptions that merely summarize the data become possible, whereas if it is prohibited, some concepts are unlearnable. To get around this problem, it is common to search the concept space starting with the simplest concept descriptions and proceeding to more complex ones later: simplest-first ordering. (pg32)
Neural Networks: Bigus, 1996
Genetic Algorithms: Goldberg, 1989
Four basically different styles of learning appear in data mining applications. In classification learning, a learning scheme takes a set of classified examples from which it is expected to learn a way of classifying unseen examples. In association learning, any association between features is sought, not just ones that predict a particular class value. In clustering, groups of examples that belong together are sought. In numeric prediction, the outcome to be predicted is not a discrete class but a numeric quantity. (pg38)
Inductive logic programming: Bergadano and Gunetti, 1996
Instance based representation: "just store the instances themselves and operate by relating new instances whose class is unknown to existing ones whose class is known. Instead of trying to create rules, work directly from the examples themselves .. instance-based learning is lazy, deferring the real work for as long as possible .. each new instance is compared with existing ones using a distance metric, and the closest existing instance is used to assign the class to the new one. This is called the nearest-neighbour classification method" (pg72)
Comparing against all instances is slow, so "deciding which instances to save and which to discard is another key problem in instance-based learning ... when training instances are discarded, the result is to save just a few proto-typical examples of each class ... these prototypical examples serve as a kind of explicit knowledge representation" (pg73-4)
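The nearest-neighbour method described above can be sketched in a few lines. This is an illustration only: it assumes numeric attributes and Euclidean distance (the notes just say "a distance metric"), and the stored instances are made up.

```python
import math


def nearest_neighbour(stored, query):
    """Return the class of the stored instance closest to `query`.

    `stored` is a list of (features, label) pairs; every stored instance
    is compared with the query, which is the slow part noted above.
    """
    best_label, best_distance = None, math.inf
    for features, label in stored:
        distance = math.dist(features, query)  # Euclidean distance
        if distance < best_distance:
            best_label, best_distance = label, distance
    return best_label


# Hypothetical stored instances: two numeric attributes and a class.
STORED = [
    ((1.0, 1.0), "a"),
    ((1.5, 2.0), "a"),
    ((5.0, 5.0), "b"),
    ((6.0, 4.5), "b"),
]

label_near_a = nearest_neighbour(STORED, (1.2, 0.8))
label_near_b = nearest_neighbour(STORED, (5.5, 5.5))
```

Keeping only a few prototypical instances per class amounts to shrinking `STORED`, which speeds up the loop at the risk of moving the class boundaries.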
Other algorithms provide a hierarchical structure of clusters, so that at the top level the instance space divides into just a few clusters, each of which divides into its own subclusters at the next level down, and so on. ... diagrams like these are called dendrograms. This term means just the same thing as tree diagram. (pg76)
Chapter 4 – Basic Algorithms
In the infinite variety of possible datasets there are many different kinds of structure that can occur, and a data mining tool – no matter how capable – that is looking for one class of structure may completely miss regularities of a different kind, regardless of how rudimentary those might be. The result: a baroque and opaque classification structure of one kind instead of a simple, elegant, immediately comprehensible structure of another. (pg78)
1R (“one rule” - pg78ff): for each attribute, find the majority class for each of its values. Count the errors for each attribute (i.e. the number of instances that do not belong to the majority class of their attribute value). The rules for the attribute with the fewest errors become the rules we use. Missing values are treated as just another attribute value, and numeric values can be partitioned into ranges based on class changes, with a minimum number of instances in each partition to avoid overfitting.
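A rough sketch of 1R for nominal attributes only (no missing values, no numeric ranges). The function name `one_r` and the six-row dataset are invented for illustration and are not the book's example.

```python
from collections import Counter, defaultdict


def one_r(instances, attributes, class_key):
    """Return (attribute, rules, errors) for the attribute with fewest errors."""
    best = None
    for attr in attributes:
        # For each value of this attribute, count how often each class occurs.
        counts = defaultdict(Counter)
        for row in instances:
            counts[row[attr]][row[class_key]] += 1
        # One rule per value: predict the majority class for that value.
        rules = {value: c.most_common(1)[0][0] for value, c in counts.items()}
        # Errors: instances that are not in the majority class of their value.
        errors = sum(sum(c.values()) - c.most_common(1)[0][1]
                     for c in counts.values())
        if best is None or errors < best[2]:
            best = (attr, rules, errors)
    return best


# Hypothetical toy data.
DATA = [
    {"outlook": "sunny", "windy": "no", "play": "no"},
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "overcast", "windy": "no", "play": "yes"},
    {"outlook": "rainy", "windy": "no", "play": "yes"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
    {"outlook": "overcast", "windy": "yes", "play": "yes"},
]

attribute, rules, errors = one_r(DATA, ["outlook", "windy"], "play")
```

On this data "outlook" makes one error (the two "rainy" rows disagree) while "windy" makes two, so the rules for "outlook" are chosen.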
Naïve Bayes (pg82ff): For each class, work out how often each attribute value occurs (as a fraction of the instances in that class). For new instances, multiply these fractions together with the overall frequency of the class to work out the most likely class. “Naïve” assumes that all the attributes are equally important and independent. Adding the “Laplace estimator” handles attribute values that produce zero probabilities. Assigning an “a priori” initial probability for each attribute removes the assumption that the attributes are equal and produces a full Bayes (but working out what the initial values should be is difficult, and they don’t make much difference for large datasets).
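A small sketch of Naïve Bayes for nominal attributes, with a Laplace estimator that adds 1 to every count. The function names and the toy data are invented for illustration, and adding exactly 1 is an assumption (other constants are possible).

```python
from collections import Counter, defaultdict


def train_nb(instances, attributes, class_key):
    """Count classes, and attribute values per class."""
    class_counts = Counter(row[class_key] for row in instances)
    value_counts = defaultdict(Counter)  # (attribute, class) -> value counts
    values = defaultdict(set)            # attribute -> distinct values seen
    for row in instances:
        for attr in attributes:
            value_counts[(attr, row[class_key])][row[attr]] += 1
            values[attr].add(row[attr])
    return class_counts, value_counts, values


def predict_nb(model, attributes, query):
    """Return normalised class probabilities for `query`."""
    class_counts, value_counts, values = model
    total = sum(class_counts.values())
    scores = {}
    for cls, n in class_counts.items():
        score = n / total  # overall frequency of the class
        for attr in attributes:
            count = value_counts[(attr, cls)][query[attr]]
            # Laplace estimator: add 1 so an unseen value never gives zero.
            score *= (count + 1) / (n + len(values[attr]))
        scores[cls] = score
    norm = sum(scores.values())
    return {cls: s / norm for cls, s in scores.items()}


# Hypothetical toy data.
DATA = [
    {"outlook": "sunny", "windy": "no", "play": "no"},
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "overcast", "windy": "no", "play": "yes"},
    {"outlook": "rainy", "windy": "no", "play": "yes"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
    {"outlook": "overcast", "windy": "yes", "play": "yes"},
]
ATTRIBUTES = ["outlook", "windy"]

model = train_nb(DATA, ATTRIBUTES, "play")
probabilities = predict_nb(model, ATTRIBUTES, {"outlook": "sunny", "windy": "yes"})
```

No "sunny" instance has class "yes" in this data, so without the estimator the probability of "yes" would be exactly zero; with it, "yes" stays possible but unlikely.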
Numeric values are handled by calculating the mean and standard deviation for each class and attribute, and then calculating the probability density function (pg86-7).
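For a numeric attribute, the density calculation might look like the sketch below. It assumes the values are normally distributed within each class, which the next note flags as something to check; the temperature readings are made up.

```python
import math
import statistics


def gaussian_density(x, mean, std):
    """Normal probability density at x for the given mean and standard deviation."""
    exponent = -((x - mean) ** 2) / (2 * std ** 2)
    return math.exp(exponent) / (math.sqrt(2 * math.pi) * std)


# Hypothetical temperature readings for the instances of one class.
temps_for_class = [64.0, 68.0, 70.0, 72.0, 75.0]
mean = statistics.mean(temps_for_class)
std = statistics.stdev(temps_for_class)  # sample standard deviation

# This density takes the place of the value fraction used for nominal attributes.
density = gaussian_density(66.0, mean, std)
```

The density is not itself a probability, but since the same attribute value is scored for every class, the classes can still be compared.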
Problems: must be careful to choose independent attributes; must choose the correct distribution for numeric data (i.e. might not be normally distributed), or use “kernel density estimation”, or discretize the data first.
Decision Trees: select an attribute, and create a branch for each of its values. Then, using another attribute on each branch, create branches for those instances which reach that branch. Continue until all instances reaching a branch have the same classification.
We select first the attributes that create the purest branches (i.e. the branch has the smallest number of classifications). Purity is measured by information (which has units of bits): the amount of information needed to classify an instance at that node. Quinlan's C4.5 is discussed in Chapter 6.
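A sketch of how the purity measure could be computed, assuming "information" here means the entropy of the class distribution at a node. The helper names and the toy data are invented for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Bits needed, on average, to name the class of one instance."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(instances, attr, class_key):
    """Information before splitting on `attr` minus the weighted information after."""
    before = entropy([row[class_key] for row in instances])
    groups = {}
    for row in instances:
        groups.setdefault(row[attr], []).append(row[class_key])
    after = sum(len(g) / len(instances) * entropy(g) for g in groups.values())
    return before - after


# Hypothetical toy data.
DATA = [
    {"outlook": "sunny", "windy": "no", "play": "no"},
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "overcast", "windy": "no", "play": "yes"},
    {"outlook": "rainy", "windy": "no", "play": "yes"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
    {"outlook": "overcast", "windy": "yes", "play": "yes"},
]

gain_outlook = information_gain(DATA, "outlook", "play")
gain_windy = information_gain(DATA, "windy", "play")
```

Splitting on "outlook" leaves two pure branches and one mixed one, so it gains far more information than "windy" and would be chosen first.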
Note - this talk about information is similar to information theory (Dretske's book)
While decision trees use divide-and-conquer (seek the attribute to best split the data into classes), the opposite approach is a covering approach - take each class in turn and seek a way of covering all instances in it, while excluding instances that aren't in the class (-> a set of rules). We start with a rule that maximises the probability of getting the class right, and add rules until we have the desired tolerance. We then continue to add rules for each class.
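A simplified sketch of a covering approach for one class. It assumes that each rule is a conjunction of attribute=value tests, grown one test at a time by picking the test with the highest accuracy on the instances the rule still covers, and that a rule is finished once it covers only the target class (a zero-error tolerance). The function names and data are invented for illustration.

```python
def learn_rule(instances, attributes, class_key, target):
    """Grow one rule (attribute -> value) that covers only `target` if possible."""
    rule = {}
    covered = list(instances)
    while any(row[class_key] != target for row in covered):
        best = None  # (accuracy, hits, attribute, value)
        for attr in attributes:
            if attr in rule:
                continue
            for value in sorted({row[attr] for row in covered}):
                subset = [row for row in covered if row[attr] == value]
                hits = sum(row[class_key] == target for row in subset)
                candidate = (hits / len(subset), hits, attr, value)
                if best is None or candidate[:2] > best[:2]:
                    best = candidate
        if best is None:  # no attributes left to test
            break
        rule[best[2]] = best[3]
        covered = [row for row in covered if row[best[2]] == best[3]]
    return rule


def cover_class(instances, attributes, class_key, target):
    """Add rules until every instance of `target` is covered by some rule."""
    remaining = list(instances)
    rules = []
    while any(row[class_key] == target for row in remaining):
        rule = learn_rule(remaining, attributes, class_key, target)
        rules.append(rule)
        # Remove the instances this rule covers, then look for the next rule.
        remaining = [row for row in remaining
                     if not all(row[a] == v for a, v in rule.items())]
    return rules


# Hypothetical toy data.
DATA = [
    {"outlook": "sunny", "windy": "no", "play": "no"},
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "overcast", "windy": "no", "play": "yes"},
    {"outlook": "rainy", "windy": "no", "play": "yes"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
    {"outlook": "overcast", "windy": "yes", "play": "yes"},
]

rules_for_no = cover_class(DATA, ["outlook", "windy"], "play", "no")
```

Running `cover_class` once per class gives the full rule set; unlike a decision tree, each rule here can be read on its own.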
Thursday, November 25, 2004
Change - Eight Lectures on the I Ching
- Hellmut Wilhelm, PL 2464.Z7.W513
To be aware of what is constant in the flux of nature and life is the first step in abstract thinking ... the concept of constancy in change provides the first guarantee of meaningful action (p23)
The Creative and the Receptive are indeed the gateway to the Changes. The Creative is the representation of light things and the Receptive of dark things. In that the natures of the dark and light are joined, the firm and the yielding receive form. (p31)
The Book of Changes contains the measure of heaven and earth; therefore it enables us to comprehend the tao of heaven and earth and its order.
Looking upward we contemplate with its help the signs in the heavens; looking down, we examine the lines of the earth. Thus we come to know the circumstances of the dark and the light. Going back to the beginnings of things and pursuing them to the end, we come to know the lessons of birth and death. The union of seed and power produces all things; the escape of the soul brings about change. Through this we come to know the conditions of outgoing and returning spirits.
Since in this way man comes to resemble heaven and earth, he is not in conflict with them. His wisdom embraces all things, and his tao brings order into the whole world; therefore he does not err. He is active everywhere but does not let himself be carried away. He rejoices in heaven and has knowledge of fate, therefore he is free of care. He is content with his circumstances and genuine in his kindness, therefore he can practice love.
In it are included the forms and the scope of everything in the heavens and on earth, so that nothing escapes it. In it all things everywhere are completed, so that none is missing. Therefore by means of it we can penetrate the tao of day and night, and so understand it. Therefore the spirit is bound to no one place, nor the Book of Changes to any one form.
(I, 315-19, quoted on p69)
It is important to think of this representation as very concrete. Today we tend to speak of “symbols” in such a context, each person varying at will the distance between the symbol and the thing symbolized. In a magical world view, however, such as the one which has left its impress on the oldest strata of our book, a thing and its image are identical. (p35)
This attempt to view the totality of changing phenomena in terms of such a strict law of form may appear strange to us. The fact, however, that nature lends itself more easily to such systemizations than does the human mind is witnessed – to cite one example – by the arrangement, as rigid as natural, of the elements in the unbroken order of their atomic numbers. The occasional gaps, it became clear, were to be attributed to the state of chemical research and not to defects in the system. (p49)
The clouds pass and the rain does its work, and all individual beings flow into their forms.
(II, 3, quoted on p51)
He who is noble and has no corresponding position, he who stands high and has no following, he who has able people under him who do not have his support, that man will have cause for regret at every turn.
(II, 16, quoted on p58)
The cosmos was not yet strange to him; it was not the subject of a specialised science; he lived in direct contact with the law of change, and the images were at hand, out of the store of ideas offered by the time and a living tradition. (p65)
Where disorder develops, words are the first steps. If the prince is not discreet, he loses his servant. If the servant is not discreet, he loses his life. If germinating things are not handled with discretion, the perfecting of them is impeded. Therefore the superior man is careful to maintain silence and does not go forth.
(I, 248, quoted on p71)
Natural-Born Cyborgs (Andy Clark, T14.5.C58/2003)
The more closely the smart world becomes tailored to an individual’s specific needs, habits and preferences, the harder it will become to tell where that person stops and this tailor-made, co-evolving smart world begins. (p30)
Well-fitted transparent technologies have the potential to impact what we feel capable of doing, where we feel we are located, and what kinds of problems we find ourselves capable of solving. (p34)
Study of chimps: those who learnt to use plastic tokens to stand for objects were able to solve problems that other chimps were not (p70)
The whole imposing edifice of human science itself is testimony, I believe, to the power and scope of this species of cognitive shortcut (p71-72)
There is, to be sure, a kind of low grade, approximate numerical sensibility that is probably innate and that we share with infants and other animals. Such a capacity allows us to judge that there are one, two, three, or many items present, and to judge that one array is greater than another. But the capacity to know that 25 + 376 is precisely 401 depends, Dehaene et al. argue, upon the operation of distinct, culturally inculcated, and language-specific abilities. (p72)
J. Elman, “Learning and Development in Neural Networks: The Importance of Starting Small”, Cognition, 48 (1993): 71-99
S. Fahlman and C. Lebiere, “The Cascade-Correlation Learning Architecture”, in Advances in Neural Information Processing Systems 2, ed. D. Touretzky (1990)
C. Thornton, Truth from Trash (MIT Press, 2000)
Languages of Art
An Approach to a Theory of Symbols
- Nelson Goodman, BH301.S8.G6/1976
The artist’s task in representing an object before him is to decide what light rays, under gallery conditions, will succeed in rendering what he sees. This is not a matter of copying but of conveying. (p14)
The measure of realism is habituation, but descriptions do not become depictions by habituation. The most commonplace nouns of English have not become pictures. (p41)
Representation is thus disengaged from perverted ideas of it as an idiosyncratic physical process like mirroring, and is recognized as a symbolic relationship that is relative and variable. (p43)
With representation and expression alike, certain relationships become firmly fixed for certain people by habit; but in neither case are these relationships absolute, universal, or immutable (p50)
With progressive loss of its virility as a figure of speech, a metaphor becomes not less but more like literal truth. What vanishes is not its veracity but its vivacity. Metaphors, like new styles of representation, become more literal as their novelty wanes. (p68)
In metaphor … a term with an extension established by habit is applied elsewhere under the influence of that habit; there is both departure from and deference to the precedent. When one use of a term precedes and informs another, the second is the metaphorical one. As time goes on, the history may fade and the two uses tend to achieve equality and independence; the metaphor freezes, or rather evaporates, and the residue is a pair of literal uses – mere ambiguity instead of metaphor. (p71)
The shifts in range that occur in metaphor, then, usually amount to no mere distribution of family goods but to an expedition abroad. A whole set of alternative labels, a whole apparatus of organization, takes over new territory. What occurs is a transfer of a schema, a migration of concepts, an alienation of categories. Indeed, a metaphor might be regarded as a calculated category-mistake – or rather as a happy and revitalizing, even if bigamous, second marriage. (p73)
Whatever reverence may be felt for classes or attributes, surely classes are not moved from realm to realm, nor are attributes somehow extracted from some objects and inserted into others. Rather a set of terms, of alternative labels, is transported; and the organization they effect in the alien realm is guided by their habitual use in the home realm. (p74)
The incessant use of metaphor springs not merely from the love of literary color but also from urgent need of economy. If we could not readily transfer schemata to make new sortings and orderings, we should have to burden ourselves with unmanageably many different schemata, either by adoption of a vast vocabulary of elementary terms or by prodigious elaboration of composite ones. (p80)
In effect, the fact that a literary work is in a definite notation, consisting of certain signs or characters that are to be combined by concatenation, provides the means for distinguishing the properties constitutive of the work from all contingent properties - that is, for fixing the required features and the limits of permissible variation in each. Merely by determining that the copy before us is spelled correctly we can determine that it meets all requirements for the work in question. In painting, on the contrary, with no such alphabet of characters, none of the pictorial properties - none of the properties the picture has as such - is distinguished as constitutive; no such feature can be dismissed as contingent, and no deviation as insignificant. (p116)
Initially, perhaps, all arts are autographic. Where the works are transitory, as in singing or reciting, or require many persons for their production, as in architecture and symphonic music, a notation may be devised in order to transcend the limitations of time and the individual. This involves establishing a distinction between the constitutive and the contingent properties of the work. (pg 121)
When there is a theoretically decisive test for determining that an object has all the constitutive properties of the work in question without determining how or by whom the work was produced, there is no requisite history of production and hence no forgery of a given work. Such a test is provided by a suitable notational system with an articulate set of characters and of relative positions for them. (pg 122)
That the characters must thus be disjoint may not seem very important or striking; but it is an absolutely essential and, I think, rather remarkable feature of notations. (pg 133)
The syntactic requirements of disjointness and of finite differentiation are clearly independent of each other. The first but not the second is satisfied by the scheme of classification of straight marks that counts every difference in length, however small, as a difference of character. The second but not the first is satisfied by a scheme where all inscriptions are conspicuously different but some two characters have at least one inscription in common. (pg 137)
A symbol scheme is analog if syntactically dense; a system is analog if syntactically and semantically dense. Analog systems are thus both syntactically and semantically undifferentiated in the extreme: for every character there are infinitely many others such that for some mark, we cannot possibly determine that the mark does not belong to all, and such that for some object we cannot possibly determine that the object does not comply with all. ... A digital scheme, in contrast, is discontinuous throughout; and in a digital system the characters of such a scheme are one-one correlated with compliance-classes of a similarly discontinuous set ... To be digital a system must be not merely discontinuous but differentiated throughout, syntactically and semantically. If ... it is also unambiguous and syntactically and semantically disjoint, it will therefore be notational. (p160-1)
If the subject matter is antecedently atomized, we tend to adopt an articulate symbol scheme and a digital system. Or if we are predisposed to apply an available articulate symbol scheme to a previously undifferentiated field, we try to provide the symbols with differentiated compliance-classes by dividing, combining, deleting; the fractional quantities not registered by our meter tend to be disregarded, and the smallest units it discriminates to be taken as the atomic units of what is measured. Should a prior structuring authoritatively resist such surgery, we may lay aside our articulate symbol scheme and turn to an analog system. (p162-3)
You see no experiment can be repeated exactly. There will always be something different ... What it comes to when you say you repeat an experiment is that you repeat all the features of an experiment which a theory determines are relevant. In other words you repeat the experiment as an example of the theory. (Sir George Thomson, quoted on p177)
Thursday, November 04, 2004
Some sort of contest?
"Apparently there was some kind of contest last night. We hope that guy sings "She Bangs" in that funny voice won."
Wednesday, November 03, 2004
Getting Things Done (1)
In knowledge work .. the task is not given; it has to be determined. ... There is usually no right answer; there are choices instead. And results have to be clearly specified, if productivity is to be achieved.
Most often, the reason something is "on your mind" is that you want it to be different than it currently is, and yet:
- you haven't clarified exactly what the intended outcome is
- you haven't decided what the very next physical action step is, and/or
- you haven't put reminders of the outcome and the action required in a system you trust
Things rarely get stuck because of the lack of time. They get stuck because the doing of them has not been defined. (pg19)
Tuesday, November 02, 2004
... can be thought of as partitioning a space. Each input node of the NN is a dimension of the space; for any given array of input values, we have a point in the space. The outputs of the NN divide up this space, and so the point will exist in one (or more) regions.
All a NN does is map input values to output values. All this description does is create a visualisation of mapping.
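A toy version of this picture, as a sketch under strong assumptions: two input nodes, no hidden layer, and two output units that each apply a simple threshold. The weights and biases are made up; the post does not specify any architecture.

```python
def unit_output(weights, bias, point):
    """Threshold unit: 1 on one side of its dividing line, 0 on the other."""
    activation = sum(w * x for w, x in zip(weights, point)) + bias
    return 1 if activation > 0 else 0


def region(units, point):
    """The tuple of unit outputs names the region of input space holding `point`."""
    return tuple(unit_output(weights, bias, point) for weights, bias in units)


# Hypothetical output units over a two-dimensional input space.
UNITS = [
    ((1.0, 1.0), -1.0),  # fires when x0 + x1 > 1
    ((1.0, -1.0), 0.0),  # fires when x0 > x1
]

origin_region = region(UNITS, (0.0, 0.0))
right_region = region(UNITS, (2.0, 0.0))
upper_region = region(UNITS, (0.0, 2.0))
```

Each unit cuts the plane with one straight line, so two units give up to four regions; every array of input values is a point that lands in exactly one of them here. Swapping in hidden-node outputs for `UNITS` gives the same picture for internal representations.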
Internal representations: these can be thought of in the same way, simply by taking hidden nodes instead of output nodes. But this is not what I want to do - I want to create "objects" or "pictures" that can be thought to mediate the mappings.
Modularity: an important assumption in Cog Sci is that mental processes (and in particular, the abstract mental modules we theorize about) are modular. This seems to be the case, or at least we have theories of mind (folk and scientific) that are modular and have some success.
But how can we modularise a NN? Can we look at it (or its behaviour) and deduce functional modules, or can we only take a functional stance towards the behaviour as a whole (following Dennett - does he address the idea of modularisation in his talk of stances?).
e.g. a NN may say words aloud, following the grammatical and ad-hoc rules of English. We can break this function up into modules, and brain damage in humans shows that different modules may be independently impaired.
Must the NN also have these modules, or may they be distributed through the NN? If the latter (which seems likely), then is it incorrect to modularise the NN's reading, even if it is functionally the same as ours? It would not show the same patterns of damage, but wouldn't the same modular theory of mind apply?
Even worse, the NN must have physical sub-structures, which can be considered as functional modules. These might not be the same functional modules as we have. And so the same function (reading words aloud) should be modularised differently?
What this might all mean: we have intuitions about how our minds should be modularised. These intuitions (generally) are based on modules that happen to exist, simply because of the idiosyncratic development of our brains, and have no deeper significance. (Of course, there might be only a small number of possible ways for a brain to develop, given the tasks that it has to perform. There might be a natural modularisation in the task itself, which the brain might as well mirror. If this is the case, then the NN that reads words aloud would probably show the same functional modules, if it is allowed to develop in a similar way to our brains.)
to do: think of some examples.
Wednesday, June 16, 2004
Main Essay - Theories of Memory
Memory is not a simple phenomenon, and there are two traditional divisions: procedural memory (embodied skills and habits, which can be referred to as “knowing how”) and propositional memory (or “semantic memory” - memory for facts that can be referred to as “knowing that”). A third type of memory, often called episodic memory, consists of recollecting episodes in one’s personal life. Episodic memory is either considered a third form of memory, or it is grouped with propositional memory and called “declarative memory”, since both kinds of memory are meant to represent the world (Sutton, 2004, pg2). This grouping of memory into two or three types is appealing, but not final or uncontroversial: Eichenbaum and Cohen (2002) point out that "there is at this time no consensus on just how many memory systems there are or on how to categorize them according to cognitive dimensions" (pg13). Conner and Carr (1982) add that “if we look at the very varied forms that our memories take, it is not easy to draw any hard and fast lines between them” (pg 206). They continue to argue that memory can be thought of as a continuum of related phenomena, with episodic memories containing the most individual and perceptual detail, and habit memory being “an accumulated compost of experience stripped of their individuating properties and their wealth of perceptual details” (pg218). There is a variety of phenomena that are grouped under the heading of memory, but since all share the function of allowing access to the past, and since there seems to be no clear demarcation between the phenomena, it is tempting to assume that there is a single underlying mechanism. This essay will examine what this mechanism (or mechanisms) could be.
How is it possible for the past to be known to us? One extreme answer is to argue that the past is directly accessible to us. Known as direct realism, this theory seems unintuitive, since the thing that seems to distinguish past events from present ones is that past events are no longer directly accessible. However, proponents of direct realism argue that it is the most straightforward way of interpreting our claims about memory. For example, Laird (1920) asserts that “memory does not mean the existence of present representatives of past things. It is the mind’s awareness of past things themselves … memories can be explained by the hypothesis of direct acquaintance with the past without further ado” (pg56-7). This, of course, is not an explanation at all, as Laird himself later admits: “It is plainly impossible to explain the fact of memory. Memory is possible, and that is all we need to know” (pg59). While it is a straightforward way of interpreting our claim to memory, Laird’s version of direct realism does not advance any explanation of memory. Our goal is to explain the phenomenon, and not simply describe it, as Sutton (2003) asserts: “The genuine phenomenology of ‘direct’ access to the past … cannot be deemed primitive and inexplicable” (pg7).
In contrast to direct realism, indirect realism claims that it is a representation that we observe, and not reality itself. In the case of memory, it is the observation of this representation that creates the experience of remembering the past. The subject is therefore removed from the past, cut off “behind a veil of memory ideas” (Sutton 2003, pg3), so that the past itself is never observed directly. Aristotle (1973) argued that our memory consists of images or pictures – “a copy or a souvenir” (pg107) that represent our experience. He uses the example of a picture, which can be simply considered as a picture, but fulfils its intended function when it is considered to stand in for the thing that the picture naturally represents. The picture represents this thing simply because it resembles it. Locke (1999) continues this theory: in memory (and perception in general), there is a “double awareness” of the internal object (the “idea”, which is perceived) and the material thing, which is seen (pg599). Aristotle claimed that images were essential to memory, and traditionally the role of imagery continued to be thought of as not only central, but essential to memory.
One particular problem for the image theory of memory is that of distinguishing the images of memory from those of imagination or perception. Audi (1998) says that the memory images “might even be sense-data if they are vivid enough” but usually “my memorial images … might be conceived as a kind of residue of perception” (pg60). Given the great similarity, how is it that we reliably tell them apart? Two historical replies to this are from Hume and Russell (both quoted in Dancy, 1985, pg186). Hume’s answer (that the memory images are more vivid, forceful and lively than those of imagination) and Russell’s (that there is a notion of pastness, or familiarity, associated with memory) both describe the fact that we distinguish memory from imagination, but do not provide any detail of how this is done.
The image theory provides a good description of episodic memory, for remembering our past experiences is reasonably like accessing a perceptual image of them. However, apart from the problem of distinguishing memories from perception or imaginings, it is not clear how the image is created (and why some experiences are memorable, and others not), how it is accessed at the appropriate time, and why the perceptual image differs in some ways from the original perception (for example, we usually remember our experiences from an outsider’s point-of-view, Jaynes, 1976, pg29).
The centrality of images is also open to dispute. Although images play at least a phenomenological role in many of our memories, this is not always the case, and some people experience little imagery for any memories (Conner and Carr, 1982, pg210). Audi (1998) claims that “remembering an event surely does not require acquaintance with an image of it”, and he goes on to argue that “I might remember what color your sweater was even if I cannot bring the color itself to mind” (pg61). While a memory may, at times, seem to consist simply in the recollection of an image, “if memory can work equally well without them their role is clearly an inessential one” (Conner and Carr, 1982, pg210). Susan Engel argues that "one creates the memory at the moment one needs it, rather than merely pulling out an intact item, image, or story" (quoted in Sutton, 2004, pg9). It is the components that produce such images that are the basis of memory, not the phenomenological image that is produced. Additionally, consideration of non-episodic memory argues against the importance of images. It is difficult to see how images could play any role in procedural memory, or in the semantic memory of propositions. The image theory can only be applied to episodic memory, and has little explanatory power even there.
The trace theory maintains that, instead of being mediated by images, the past is preserved by neurological traces. There are versions of trace theory that see these traces as local and distinct. For example, Robert Hooke thought memories are "in themselves distinct; and therefore that not two of them can be in the same space, but that they are actually different and separate one from another" (quoted in Sutton 2004, pg 5). However, modern trace theories more usually refer to dynamic, distributed systems. Recall of an event is not seen as finding the correct item that represents that event; instead “occurrent remembering is the temporary activation of a particular pattern or vector across the units of a neural network” (Sutton 2004, pg3).
The mechanism of procedural memory in simple organisms has been described in these terms (for example, Eric Kandel’s work in the mid-1970s, on the Californian sea snail Aplysia, described in Spitzer, 1999, pg42). If we are willing to extend this neurological basis to all types of memory, we will need to explain how the same mechanism can produce the apparent principled difference between the types of memory, i.e. the obvious phenomenological difference. Such differences can be accommodated in the theory in two ways. Firstly, different types of memory can be mediated by different neurological structures. Eichenbaum and Cohen (2001) have demonstrated that "declarative memory supports a relational representation ... conversely, non-declarative forms of memory, such as procedural memory, involve individual representations ... such memories are isolated in that they are encoded only with the brain modules in which perceptual or motor processing is engaged during learning" (pg54). Additionally, there are two general ways of encoding memory: bias, or modulating response to stimulus (briefly for working memory, or longer term for cortical maps for stimuli), and ability to sustain or reactivate response in absence of the stimulus (pg133). The second way to accommodate the difference between types of memory is in the mechanism that actualises the trace, and so constructs the memory. The same basic information about a past event may be recalled in the context of an autobiographical recollection, or as a proposition, if it can be encoded in a way that is appropriate to both forms of recollection.
It is an important aspect of modern trace theory that “traces (whatever they may be) are "merely potential contributors to recollection", providing one kind of continuity between experience and remembering; so traces are invoked merely as one relevant causal/explanatory factor” (Sutton, 2004, pg6). That is, the trace is only that part of the mechanism that provides the causal link to the past, and should not be identified as the whole system of memory and recollection. Laird (1920) argues that “the mere fact that the brain endures and retains traces of former simulation does not explain memory” (pg59), and it is the case that the mechanisms of recall are an important aspect of memory. Modern trace theory accepts this point and acknowledges "the engram (the stored fragments of an episode) and the memory … are not the same thing" (Schacter, quoted in Sutton, 2004, pg6).
The major problem that faces a trace theory is to explain how the traces represent the events which caused them. This problem of representation does not affect the image theory, for the image was simply a copy, or residue, of the perceptual event (although Dennett, 1997, pg69, sees this view as suffering from a circular definition of resemblance). Martin and Deutscher argue that an analysis of remembering should include the requirement that (in cases of genuine remembering) "the state or set of states produced by the past experience must constitute a structural analogue of the thing remembered" (quoted in Sutton, 2004, pg7). Some structural relationship must exist between the traces and the events, else they do not represent the events. The traces cannot be a mere copy, or residue, of the perceptual experience (otherwise we would have an image theory of memory), but instead must encode the relevant features in ways that can enable the later recall.
Spitzer (1999, pg83ff) reports on various experiments on rats which demonstrate one possible relationship between the traces and the experiences they represent. A map-like structure is developed in the hippocampus that exhibits a simple resemblance to the surroundings that it models. But this simple structural isomorphism cannot explain the relationship between traces and other events, which do not have a spatial structure. How should the taste of sangria be encoded in a spatial network? What are the relevant features, and how can they be represented? Answering these questions is a difficulty for trace theory. In fact, the problem is greater than simply identifying a possible mapping, for the structures involved “need not remain the same over time, or might not always involve identifiable determinate forms over time” (Sutton 2004, pg8). In fact, for dynamic versions of trace theory, the traces cannot remain the same, for they must “live with our interests and with them they change” (Bartlett, quoted in Sutton, 2004, pg8), yet they must continue to represent, at least at some level, the same thing.
Eichenbaum and Cohen (2001) argue that “memory should be conceived as being intimately intertwined with information processing in the cortex, indeed so much so that the ‘memory’ and ‘processing’ are inherently indistinguishable … information processing and memory combine to constitute the structure of our knowledge about the world” (pg133). Memory consists in nothing more than the fact that our cognitive systems change as a result of the information they process, and those changes affect later processing. A memory trace is a change that can be used as the basis of an occurrent recall. Andy Clark (2001) speaks of memory traces as functioning as
“internal stand-ins for potentially absent, abstract, or non-existent states of affairs. A ‘stand-in’, in this strong sense is an item designed not just to carry information about some state of affairs … but to allow the system to key its behaviour to specific states of affairs even in the absence of direct physical connection” (pg 129)
A memory trace should not be considered as a simple, passive carrier of information about the past. Instead, it is a change to a system (such as consciousness or occurrent semantic knowledge) that affects behaviour just as if the system were able to gain direct access to those aspects of the past. A full description of how the trace represents the event is therefore not to be given as a set of mapping rules, but instead is one aspect of a description of the behaviour of the occurrent system. A full description of the occurrent system is required to understand the function and meaning of the representation. This can be quite involved; for example, Sutton (2003) points out that autobiographical memory involves “the internalisation of cultural schemes”, which would provide the appropriate scaffolding on top of “flexible internal processes” (pg5). An explanation of autobiographical memory would therefore span personal, subpersonal, and social levels of explanation. The problem here is the same as the general problem of the mental representation of meaning. Dennett (1997) argues that a representation “means what it does because of its particular position in the ongoing economy of your brain’s internal activities and their role in governing your body’s complex activities” (pg70). This is not to say that structural resemblances cannot be found (for example, in the map-like structures of the hippocampus), but there is no need to insist that they must be found at the neurological level. Sutton (2004) argues that “the structures which underpin retention … might not always involve identifiable determinate forms over time” (pg8).
Direct realism and the image theory are both appealing descriptions of the phenomena of memory. However, to explain memory, we need a theory that details how we actually store and then access the events of the past to produce these phenomena. Trace theory is such a theory: it appeals to the neurological changes left by these past events, and describes how these may be used to affect current behaviour and enable us to construct our memory of past events.
Word count: 2487
Aristotle, “De sensu and De memoria”, translation by G. R. T. Ross, New York : Arno Press, 1973.
Audi, Robert. “Epistemology: a contemporary introduction to the theory of knowledge” Routledge: New York, 1998
Clark, Andy, “Reasons, Robots and the Extended Mind”, Mind and Language, Vol 16 No 2 April 2001, pp 121-145, Blackwell Publishers: Oxford
O'Connor, D.J. and Carr, Brian (1982) “Memory”, Ch 5 in “Introduction to the Theory of Knowledge”, in “Knowledge and Reality: Selected Readings”, Sydney: Macquarie University, 2004
Dancy, J, “Introduction to Contemporary Epistemology”, Oxford: Blackwell, 1985
Dennett, Daniel C , “Kinds of minds : towards an understanding of consciousness”, London : Phoenix, 1997
Eichenbaum, Howard and Cohen, Neal J., “From conditioning to conscious recollection : memory systems of the brain”, Oxford : Oxford University Press, 2001
Jaynes, Julian, “The Origin Of Consciousness in The Breakdown of the Bicameral Mind”, New York : Houghton Mifflin, 1976
Laird, John, “A Study in Realism”, Cambridge 1920
Locke, “An Essay Concerning Human Understanding”, in Readings in Epistemology, compiled by Jack S. Crumley II, Mountain View, Calif. : Mayfield Pub. Co., 1999.
Spitzer, Manfred, “The Mind Within The Net :models of learning, thinking, and acting”, Cambridge, Mass : The MIT Press, 1999
Sutton, John, “Memory: Philosophical Issues”, in the Encyclopedia of Cognitive Science, ed. Nadel, 2003
Sutton, John "Memory", The Stanford Encyclopedia of Philosophy (Summer 2004 Edition), Edward N. Zalta (ed.), forthcoming URL =
Second Short Essay - Coherence Theory of Justification
The coherence theory of justification, or “epistemic coherentism”, claims that “all justification of beliefs depends on coherence within a system of beliefs” (pg82, Moser et al, 1998). It is usefully contrasted with foundationalism, in which “the direction of justification is all one-way, and … there are some comparatively fixed points in the structure, the basic beliefs” (pg110, Dancy, 1985); i.e. we justify any given belief by referring to other, more basic beliefs. The coherence theory of justification does without these basic beliefs, and justifies beliefs on the coherence of the belief-set they create. This belief-set is coherent “to the extent that the members are mutually explanatory and consistent.” (p112, Dancy, 1985)
One objection to epistemic coherentism is that it does not seem to be compatible with empiricism - “the view that the evidence of the senses … is a sort of evidence appropriate to genuine knowledge” (pg188, Moser et al, 1998). If we are empiricists, then our belief-set needs to be more than simply coherent – our beliefs must also be consistent with our experience. The empiricist objection to epistemic coherentism has two parts: one part concerns whether perceptions can be part of the belief-set; the other concerns the special status of perceptual beliefs.
The first can be called the “isolation objection” (pg85, Moser et al, 1998). Beliefs are justified by reference to the coherence of the belief-set, but this belief-set traditionally excludes data such as perceptual states, which are non-propositional. To allow for empiricism, the gap between experience and belief must somehow be bridged. Naturally, other models of belief must also bridge the gap between experience and belief. For example, the empirical foundationalist wishes for his basic beliefs to be directly related in some way to non-propositional perceptual states. The isolation objection can be raised against any model of belief, and if the required link between perception and belief cannot be established, then it is empirical justification in general that has been undermined, and not only an empirical coherence model of justification.
One way of allowing interaction is to deny that there is a fundamental distinction between belief and experience (following Kant, as quoted in Dancy, 1985). In this case, it would be arbitrary as to whether we extend the belief-set to include experience, or allow experience to somehow influence the set. In either case, beliefs that are wholly disconnected from experience could not be justified.
The second part to the objection is that beliefs which are grounded in experience ought to have a privileged justification, by virtue of having been caused by experience. There is no place for any special roles within simple coherence, because the nature or source of the belief is irrelevant; justification is provided only by the belief’s effect on the coherence of the set.
One response to this is to propose a weaker version of coherency theory, which allows for differences between beliefs. This theory would distinguish between beliefs with “antecedent security” (the security or justification that a belief brings with it, regardless of coherence) and “subsequent security” (acquired through a belief’s contribution to the coherency of the belief-set). This form of coherency seems simply to be “another name for a form of foundationalism” (pg122, Dancy, 1985), since the beliefs with antecedent security will provide a foundation for those with only subsequent security.
A second response is to acknowledge that sensory beliefs have the following status: we accept them as true so long as nothing counts against them. This acknowledgment seems to fit the demands of empiricism. We can then argue that this is our approach to all beliefs: “any belief will remain until there is some reason to reject it” (pg124, Dancy, 1985). We are naturally credulous, and this credulity is essential for any learning: “For how can a child immediately doubt what it is taught? That could only mean that he was incapable of learning certain language games.” (283, Wittgenstein, 1999)
Arguably, there ought to be an additional empirical demand on our theory of justification: that it take more evidence to reject a sensory belief than a non-sensory one. Coherence theory can accommodate this demand by adding “stubborn empiricism” as a belief to the belief-set. If our belief-set contains the belief that sensory experience is inherently reliable, only overwhelming incoherence from other beliefs will be sufficient to reject a sensory belief (Dancy, 1985). This seems to be a better alternative to weak coherency, for it is not only a simpler theory, but also allows for belief sets that are not empirical (such as delusional or fictional belief sets).
The coherence theory of justification is, therefore, compatible with empiricism. Perceptions can influence the belief-set, and so ground it in experience, and the special role of perceptual beliefs demanded by empiricism can be handled without resorting to foundationalism.
First Short Essay - JTB account of knowledge
The traditional account of knowledge attempts to describe the requirements of propositional knowledge. It states that there are three requirements, all of which must be fulfilled. First, a person claiming to know something is also claiming to believe it; Moser et al (1998) describe belief as “a logically necessary condition for knowing” (pg15). However, mere belief is insufficient for knowledge; the statement must be a true one. In the past, people believed that the earth was flat, but they were wrong if they claimed to know the earth was flat, since it is not (Moser et al, 1998, pg15). In particular, it is obvious that our beliefs can be mistaken, and having truth as a prerequisite for knowledge allows for this fact (Moser et al, 1998, pg74-75). Additionally, lucky guesses, even if believed, do not count as genuine knowledge – the true belief must have “supporting reasons” (Moser et al, 1998, pg15). This justification is the third requirement of knowledge, and completes the justified true belief (JTB) definition. In the words of Moser et al, “If you have good reasons in support of the truth of your belief, and your belief is true and is based on good reasons, then you have knowledge, according to the traditional analysis” (Moser et al, 1998, pg16).
We will set aside a priori statements as special cases of knowledge, and instead focus on a posteriori statements which make claims about the external world, such as “The earth is not flat”, and “London is the capital of England”. In particular, we will focus on “Gettier-style counterexamples”. The following is based on one of the counterexamples in Gettier, 2004:
Smith and Jones are both applying for a job. Smith believes, and has evidence for, two facts: (a) Jones will get the job, and (b) Jones has 10 coins in his pocket. Smith concludes, and has a justified belief, that (c) the person who gets the job will have 10 coins in their pocket. It turns out, however, that Smith gets the job; since Smith happens to have 10 coins in his pocket, (c) remains true. Although (c) is an example of justified true belief, it is not intuitive to say that Smith knew (c).
In this example, and others like it, belief and truth are treated as simple, definite and atomic. On analysis, few of our beliefs are absolute, and many propositions are too complex for a simple truth value. However, restating the problem with a more complex analysis of Smith’s belief-states does not seem to remove the quandary. Also, the treatment of truth, though naïve, seems sufficient for the simple propositions that the example uses. Instead, Gettier-style counterexamples highlight that we need to be more specific about what we mean by "justification". They show that a justified belief can be true, and yet not count as knowledge, when the justification, though seemingly overwhelming, is only coincidentally related to the truth. We need to add a further condition to the JTB analysis, which clarifies the relationship between justification and the truth.
Two such conditions have been proposed (Moser et al, 1998, pg95-98):
1. A causal relationship. We naturally expect that our justification will have something to do with the relevant facts of the world. The situation we assume is that our justification is caused by the relevant facts of the world. Obviously, something will have caused our justification; the issue is whether the cause was related to the proposition in question. In Gettier-style counterexamples, this is not the case.
2. No defeating facts for our justification. We do not expect there to be additional facts that reveal one or more of our assumptions to be incorrect. Furthermore, we also do not expect there to be further evidence that, while not actually dismissing our previous justification, would add justification to opposing propositions.
We naturally expect objective reality to be consistent. This consistency means that justification that is caused by a fact of the world would not be defeated by other facts of the world. Therefore, given our intuitions about objective reality, we would expect this second condition to be a consequence of the first. The defeating facts in Gettier-style counterexamples can act as our additional criterion of knowledge because they reveal that the justification is unrelated to the truth.
If the facts of the world have not caused our justification, then we do not have knowledge. This lack of causal relationship means that there are defeating facts. In the Gettier-style counterexamples, we can see this lack of causal relationship, and defeating facts are exhibited that undermine the justification. The counterexamples challenge our intuitions about knowledge, and force us to analyse the concept in greater detail. This analysis reveals a further condition implicit in the concept of knowledge: our justification must be related to the truth.
Tuesday, March 30, 2004
Week 4 (24th March)
1. an epistemic state.
2. an intentional state. pictures and words are also intentional.
3. a propositional attitude. ie. an attitude towards a proposition (proposition = something that expresses content that can be true or false). Other propositional attitudes are desire, hope and fear, but only belief can be true or false. The others also have an affect, or phenomenal content.
Even though beliefs can influence perceptions ("theory laden theory of perception"), perception must in general be prior to belief, in order to form beliefs. Perception is of course inferential, with inference being made on the basis of expectations and beliefs.
Unconscious beliefs. Not enough room in consciousness for all our beliefs. If we stop being conscious of a belief, do we stop having it?
Dispositional Account of Belief.
Are beliefs just dispositions to act (or talk)? (disposition = tendency, inherent in the object, to behave a certain way in certain conditions). Do all beliefs -> actions? Depends on strength of belief, + action is not always directly or simply related (belief -> a cluster of dispositions). Can we eliminate talk of beliefs? (= doxastic eliminativism).
How are beliefs related?
Beliefs are inferentially promiscuous. Obviously the promiscuousness has limits, otherwise learning something would lead to learning all its ramifications. Also, we can hold inconsistent beliefs (but can we believe them if we know they are inconsistent?)
Doxastic holism: our beliefs and concepts are anchored in other beliefs and concepts
Sub-doxastic states: encapsulated beliefs, not inferentially promiscuous (eg. illusions. Even though we know the illusion, we still believe one line is longer, and cannot stop that perception).
= no such thing as beliefs. Just a convenient, folk psych fiction. Should really talk about states in the brain, neuro-psych, etc.
Is this self-defeating? Can doxastic eliminativism be believed? Supporters would say "we predict doxastic eliminativism will be true", instead of "believe".
Thursday, March 18, 2004
Week 3 (17th March 2004)
Thoughts on essay writing:
And so each venture
Is a new beginning, a raid on the inarticulate
With shabby equipment always deteriorating
In the general mess of imprecision of feeling,
Undisciplined squads of emotion. And what there is to conquer
By strength and submission, has already been discovered
Once or twice, or several times, by men whom one cannot hope
To emulate - but there is no competition -
There is only the fight to recover what has been lost
And found and lost again and again: and now, under conditions
That seem unpropitious. But perhaps neither gain nor loss.
For us, there is only the trying. The rest is not our business.
- T.S. Eliot, East Coker (No. 2 of 'Four Quartets')
- A philosophical account of knowledge details the necessary and sufficient conditions for knowledge. This is a conceptual analysis, not a way to work out how to get or evaluate knowledge. We want to describe the concept of knowledge in terms of other, related concepts. So we can include "truth" in our concept of knowledge (since a claim to knowledge implies that the proposition is true), even though the truth is in practice inaccessible.
Russell: A man looks at the town clock, which has always been reliable in the past. The clock says 1 o'clock, so he claims to know that the time is one o'clock. However, the clock is actually broken, and stuck at one o'clock, so he doesn't have knowledge. As a final twist, it turns out that it really is one o'clock.
So, he has justified true belief, but not knowledge.
- Did he really have justification? Certainly it was not complete, but justification never is. Justification is in degrees, and the degree of justification -> degree of legitimate claim to knowledge (which does not mean a *correct* claim to knowledge).
1. The justification must be related to the truth to count as real justification. There must be a causal relationship. In Russell's clock example, the truth of the time and the justification for believing the time are not related; it is just a coincidence that they are the same. Examination of the justification shows that it is not the right kind. So knowledge is JTB caused and sustained by the facts of the world. This is supposed to be a problem with knowledge of universals (eg. "All human beings have brains"), because there are no facts in the world about these universals (really??).
2. Have a "Gettier defeater" as a fourth criterion of knowledge. In the Gettier counterexamples, there are always facts that, if known, would have prevented JTB. The fourth criterion could be something like "there are no facts that, if known, would be sufficient to remove the justification". I'll be arguing this in the essay still, and just remove the attacks on the concept of truth.
Alternatives to JTB
1. The concept of "objective truth" is not valid. eg. "replacement pragmatism", Rorty, etc. I think you can argue this point without becoming a pragmatist, and I think this is the buddhist position, but this needs more thought.
2. the concept of "conceptual analysis" is not valid. eg. "replacement naturalism", Quine (who is, of course, a genius), etc. We should explain what knowledge is, not try to describe it in terms of other concepts. Eventually I guess we end up talking about neural nets. And then we stop talking. Normative concepts are about how things ought to be. Replacement naturalism says we should only use descriptive concepts.
3. Knowledge is not an interesting concept, because the truth is inaccessible (yes, this is the same argument that was frowned upon at the start of the lecture as "missing the point". I agree that it doesn't really miss the point, but arguing that takes some time). The interesting thing, and what we always seem to end up talking about, is justification.
Monday, March 15, 2004
First short essay.
Basic idea: Truth is a criterion of the JTB definition of knowledge. I dislike the idea of objective truth about the external world (for reasons given in metaphysics quotes, eg. "The meaning of a concept derives not from its reference to some independently real object, but rather through the circumstance that it recommends to us a particular way of looking at the world and suggests a certain appropriate form of action"). And so I attack this as a criterion, and suggest instead that undefeatable justification should be the criterion. This avoids Gettier problems as well.
Thursday, March 11, 2004
Lecture 2 (10 March 2004)
Learnt about a few distinctions:
1. a priori vs a posteriori. An epistemic distinction (ie. a distinction about how we know). A priori knowledge is known without experience, a posteriori can only be known through experience. eg. "2+2=4" vs "rain is wet".
2. analytic vs synthetic. A semantic distinction (ie. to do with meaning). Analytic statements have the truth or falsity contained within the statement, synthetic statements need reference to other facts. eg. "all bachelors are unmarried" vs "John is a bachelor".
3. necessity vs contingency. Necessary truths (or falsehoods) have to be true or false - it's not possible for them to be otherwise. Contingent truths may have been true or false; they happen to be either true or false, but it would have been possible for them to be otherwise. eg. "2+2=4" vs "dinosaurs are extinct"
Within necessity and contingency we have the logical (or "conceptual") and nomological (or "physical") distinction. If logical, the necessity or contingency covers all possible worlds; if nomological, it only covers worlds with the same laws of nature as ours. If something is nomologically necessary, it is true in all worlds with laws of nature like ours; if it is nomologically contingent, then it is true in some worlds with laws of nature like ours and false in others (so there will be at least one with it true, and at least one with it false). "dinosaurs are extinct" is nomologically contingent; "water is wet" is nomologically necessary, but not logically necessary (it is possible to conceive of different laws of nature where H2O is not wet, although arguably it would no longer be "water" - more on this later).
Logical is the default and usual meaning of necessity and contingency.
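To check I have the quantifiers the right way round, here is a rough sketch in Python. The three "worlds" and their facts are made up purely for illustration (an assumption of mine, not anything from the lecture); the point is only that logical necessity quantifies over every world, while the nomological versions quantify over the worlds that share our laws of nature.

```python
# Toy model: each "world" is a hypothetical dict of facts.
# All of the values below are assumptions made for illustration.
WORLDS = [
    {"our_laws": True, "water_is_wet": True, "dinosaurs_extinct": True},
    {"our_laws": True, "water_is_wet": True, "dinosaurs_extinct": False},
    {"our_laws": False, "water_is_wet": False, "dinosaurs_extinct": True},
]


def logically_necessary(prop, worlds):
    """True in every possible world."""
    return all(prop(w) for w in worlds)


def nomologically_necessary(prop, worlds):
    """True in every world with laws of nature like ours."""
    return all(prop(w) for w in worlds if w["our_laws"])


def nomologically_contingent(prop, worlds):
    """True in at least one world with our laws, and false in at least one."""
    values = [prop(w) for w in worlds if w["our_laws"]]
    return any(values) and not all(values)


def water_is_wet(world):
    return world["water_is_wet"]


def dinosaurs_extinct(world):
    return world["dinosaurs_extinct"]


def two_plus_two(world):
    # Does not depend on the world at all.
    return 2 + 2 == 4


print(logically_necessary(two_plus_two, WORLDS))
print(nomologically_necessary(water_is_wet, WORLDS))
print(logically_necessary(water_is_wet, WORLDS))
print(nomologically_contingent(dinosaurs_extinct, WORLDS))
```

Since the worlds with our laws are a subset of all worlds, anything logically necessary comes out nomologically necessary as well, which matches the note below.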
a priori = analytic = logically necessary (and therefore nomologically necessary as well)
a posteriori = synthetic = logically contingent (nomologically ??? - work this out)
Arguments against Coextensiveness Thesis
Kant: some statements are a priori and synthetic eg. causation and maths. "every event has a cause" is a priori, but not true simply by virtue of the words.
Saul Kripke: there are necessary truths that are a posteriori (eg. "Water is H2O" - must be the case, but known by experience) and contingent truths that are a priori (eg. "The metre bar in Paris is one metre long" - depends on no other facts, but could have been otherwise).