What Are the Majestic Documents?
The term “Majestic documents” refers generally to thousands of pages of purportedly classified government documents that are said to prove the existence of a Top Secret group of scientists and military personnel, Majestic 12, formed in 1947 under President Harry Truman and charged with investigating crashed extraterrestrial spacecraft and their occupants. Majestic 12 personnel allegedly included a number of noteworthy political, scientific, and military figures, including: Rear Admiral Roscoe Hillenkoetter, the first CIA Director; Dr. Vannevar Bush, wartime head of the Office of Scientific Research and Development; James Forrestal, Secretary of the Navy and first Secretary of Defense; General Nathan Twining, head of Air Materiel Command at Wright-Patterson Air Force Base and later Chairman of the Joint Chiefs of Staff; and Dr. Donald Menzel, an astronomer at Harvard University. More specifically, the Majestic documents refer to a series of allegedly classified documents leaked from 1981 to the present day by unidentified sources concerning Majestic 12 and the United States government’s knowledge of intelligent extraterrestrials and their technology.1 The documents date from 1942 to 1999.
Due to the explosive nature of their content, the Majestic documents are considered by many to be the core evidence for a genuine extraterrestrial reality and alien visitation of planet Earth in the 20th century. United States government personnel have denied their authenticity, primarily on the basis of an opinion rendered by AFOSI, the U.S. Air Force counterintelligence office. The AFOSI report focused on certain features of the documents it considered anachronistic, along with other historical inconsistencies (see Section 1.2 below). The charges of the AFOSI have been coherently rebutted, and so validation and debunking efforts have resulted in a stalemate.
This impasse notwithstanding, other documents discovered before and after the alleged leaking of the Majestic documents appear to validate the existence of the group Majestic-12. In 1985, a document referring to a joint National Security Council (NSC) MJ-12 “Special Studies Project” group was discovered by Jaime Shandera in the National Archives.2 This document, a 1954 memorandum from Robert Cutler to General Nathan Twining, became known by UFO researchers as the Cutler-Twining memo. The Cutler-Twining memo shared certain stylistic traits with a 1953 memorandum between Cutler and Twining discovered in 1981 among General Twining’s papers at the Library of Congress. Canadian documents discovered in 1978, three years before the first alleged leak of the first Majestic documents, note the existence of a highly-classified UFO study group operating within the Pentagon’s U.S. Research and Development Board, and headed by Dr. Vannevar Bush. Although the name of the group is not given, these Canadian documents appear to support the existence of Majestic 12. While this may be the case, proof for the existence of Majestic 12 does not logically translate into authentication for the Majestic Documents themselves or their content on other points.
The Majestic documents have undergone thorough forensic authentication with respect to non-linguistic issues and methods.3 The primary researchers who have put considerable effort into authenticating the documents are Stanton Friedman4 and the father-son team of Dr. Robert and Ryan Wood.5 These researchers have tested the documents in the following ways:6
1. Physical dating of the ink, pencil and paper.
2. Dating by matching the reproductive process (typography) of the typewriter, printer, copy machine, or mimeographic machine.
3. Dating by use of language of the period.
4. Watermarks and chemical composition of paper.
5. Comparison of handwriting.
6. Comparison with known events of record.
7. Comparison with known styles for government memoranda and correspondence.
8. Comparison with known or expected security procedures.
9. Logic of content.
10. Records of provenance.
11. Eyewitness testimony of individuals mentioned in documents.
The Wood team was able to solicit the expertise of specialists in their authentication effort. For comparison of typewriter impressions and watermarks, James Black served as their primary expert. Mr. Black is a Fellow of the Questioned Documents Section of the American Academy of Forensic Sciences and a former chairman of the Questioned Documents Subcommittee of the American Society for Testing and Materials.7 For examination of paper, ink, and watermarks, the Wood team sought the services of the Speckin Forensic Laboratories. The Speckin website states that the laboratory is:
. . . [A]n International forensic firm specializing in consulting with plaintiff and defense lawyers involving issues concerning: Forgery, Sequencing of Entries, Alterations, Additions, Rewritings, Ink Dating and Paper, Typewriting, Facsimiles, Photocopies, Fingerprints, Narcotic and Street Drug Analysis, Analytical and Forensic Chemistry, DNA, Firearms and Toolmark Examination, Shoe and Tire Prints, Handwriting, Crime Scene Reconstruction Criminal Forensic Matters and Computer Forensics.8
A variety of concerns have been raised in the course of forensic authentication procedures and publication of these efforts, such as apparent anachronistic statements, possible typewriter impression inconsistencies, grammatical errors, departures from standard styles, printing flaws, and virtually identical signatures on different documents. Examples of each of these concerns have been catalogued and answered by the Wood team.9
To date, such criticisms of the Majestic Documents have failed to deliver conclusive evidence of forgery. However, Stanton Friedman has successfully detected several fakes among the cache. The forgeries were photocopies of authentic documents with certain content and vocabulary changes designed to alter the content toward a discussion of Majestic 12. These forgeries are explained and illustrated on Friedman’s website.10 The presence of these forgeries does raise the spectre that all the Majestic documents may be contrived, especially since an estimated seventy percent of the documents are photocopies. However, it is important to note that no other fakes have been conclusively detected.
Notwithstanding the examinations noted above, the Majestic documents have never been subjected to scientific linguistic analysis to determine the validity of their authorship. While the Wood team and Mr. Friedman mention in several of the cited publications and websites that the Majestic documents have also undergone “linguistic” testing, the same publications and online sources offer no evidence of such testing. The Wood team and Mr. Friedman fail to define what they mean by terms like “linguistic testing” or “linguistic analysis,” and offer no proof that genuine forensic linguistic analysis of the type conducted for this paper ever took place as part of their authentication efforts. Additionally, while the Speckin Forensic Laboratories website mentions that the company does work in “computer forensics” (see above), the Woods offer no evidence in their writings or website that Speckin ever tested the Majestic documents in this way.
Only Stanton Friedman makes any attempt to describe an effort to have the Majestic documents tested linguistically and, as his description makes clear, no modern forensic computational linguistic work was actually done:
At the suggestion of attorney Bob Bletchman, I had obtained 27 examples of Hillenkoetter’s various writings from the Truman Library. Dr. Wescott reviewed these and the EBD [Eisenhower Briefing Document] and stated in an April 7, 1988, letter to Bob . . . “In my opinion there is no compelling reason to regard any of these communications as fraudulent or to believe that any of them were written by anyone other than Hillenkoetter himself. This statement holds for the controversial presidential briefing memorandum of November 18, 1952, as well as for the letters, both official and personal.”11
The above account contains no information on what Dr. Wescott (now deceased) did with the documents given to him. Several considerations suggest that Dr. Wescott likely did little more than look at the documents, rather than conducting actual tests. First, the development of the field of computational linguistics and the use of computers for natural language processing of necessity followed the development of computers and processing power. In 1988 these research methods were known, but not widely available. Second, Dr. Wescott’s areas of expertise included neither authorship attribution research nor computer forensic linguistics. Rather, the focus of Dr. Wescott’s work was anthropological linguistics.12 Despite his distinguished academic career, a search of linguistics databases produces no evidence that Dr. Wescott ever did any work in these areas. This is no doubt because his teaching career ended at roughly the time these fields were beginning to blossom.
These observations are significant, since training as a linguist, especially one who earned his Ph.D. in 1948, does not guarantee one has any knowledge of any given subfield within one’s discipline. For example, what would a podiatrist know about heart surgery? A cardiologist about neuro-medicine? A defense attorney about patent law? A microbiologist about frogs? The answer to all would be very little: enough to perhaps converse with other nonspecialists, but not nearly enough to be considered competent by specialists. The point is that a doctoral degree in linguistics hardly guarantees any sort of expertise in a specific sub-discipline of linguistics, especially one that dovetailed with computer science. Dr. Wescott had perhaps used a computer by 1988, but his academic record gives no indication that he was either proficient in their use or involved in applying computers to language processing and authorship attribution. Consequently, he would be disqualified from having anything meaningful to contribute to any discussion of computational methods of authorship attribution.
It should also be noted that Dr. Wescott’s assessment lacks conviction. At best his amateur opinion in this sub-discipline of linguistics offers the conclusion that he has no basis to draw an actual conclusion. As UFO researcher Paul Kimball points out, Wescott himself made it clear that he had given no conclusive answer or endorsement to authenticity. In a letter to the International UFO Reporter, Wescott wrote: “I have no strong conviction favoring either rather polarized position in the matter . . . I wrote that I thought its [the EBD] fraudulence [was] unproved . . . I could equally well have maintained that its authenticity is unproved . . . inconclusiveness seems to me to be of its essence.”13
This is all that is offered in terms of linguistic testing and evidence for the Majestic documents. The thoroughness and care with which Friedman and the Woods have addressed other forensic issues is sorely lacking with respect to modern methods of linguistic analysis specifically designed to determine (or rule out) the authorship of documents. The absence of demonstrable testing data in any form of publication puts the burden of proof on these and other researchers to prove they have indeed subjected the Majestic documents to linguistic analysis.
This study fills the existing research void created by the absence of strictly linguistic approaches to the problem of authenticating the Majestic documents. The goal of the research presented in this study was to determine whether the Majestic documents that carry a signature were indeed written by the people to whom authorship is attributed. Toward achieving this goal, the study employed state-of-the-art computational linguistic methods of authorship attribution. In some cases, these techniques have been pioneered by Dr. Carol Chaski, a recognized leader in this type of linguistic research.14 These methods have been employed, validated, and approved numerous times in various courts of law. It is the opinion of the authors that the utilization of these methods is the most reliable and testable means of authenticating or refuting the authorship attribution of those Majestic documents that bear the name of an author.
The focus of this study, as noted, is validation or falsification of the authorship attributions of the Majestic documents. As such, the scientific methods employed for this study cannot be used to validate the content of any of the Majestic documents whose authorship proves genuine. The computational methods of the research cannot determine the truth of written content. They can only determine whether or not that content was written by the attributed author. Refutation of attributed authorship would prove a document is a forgery, and so the content of that document would therefore be considered spurious. The converse is not true, however. Authentication of authorship means only that the document was written by the person to whom it is attributed. This suggests that the content is genuine, but does not actually prove that to be the case. Additionally, computational methods of authorship attribution lend nothing to the necessary enterprise of interpreting written content. The study should be characterized as preliminary because further testing that could be applied to the documents is currently cost-prohibitive. As funding becomes available, other methods will be applied for redundancy and validation of the results presented in this paper.
The remainder of this paper details the application of computational linguistic methods to determine the authenticity of authorship attributions of the Majestic documents. The paper is divided into the following sections:
• Description of the Majestic documents included and excluded in the study.
• Overview of the linguistic testing methods used in the study.
• Explanation and interpretation of the test results.
• Overview of how these same methods have held up in courts of law.
• Suggestions for future linguistic research of the Majestic documents.
Source of the Majestic Documents for Testing
The Majestic documents tested were obtained online via www.majesticdocuments.com, the website repository for the Majestic documents maintained by Dr. Robert Wood and his son Ryan Wood. The Woods have had the Majestic documents posted free to the public for several years as part of their efforts to expose the public to this material.
For authorship attribution testing to be undertaken, the document under question must have been attributed to some author. As such, only those documents among the Majestic documents that specifically bear the name of a signatory author were considered for testing. Famous Majestic documents such as the Eisenhower Briefing, for example, were not tested because there is no claim in the briefing as to the author of the briefing. Researchers and amateurs refer to the Eisenhower Briefing as though its authorship by Dwight D. Eisenhower was self-evident. The document itself makes clear that Eisenhower was not the author, as the very first page informs the reader that the briefing was “prepared for President-elect Dwight D. Eisenhower.” Another famous Majestic document not bearing an author name and therefore excluded from testing is the SOM1-01 manual for Extraterrestrial Entities and Technology, Recovery and Disposal. Additionally, the Einstein-Oppenheimer document could not be tested because it represents overlapping authorship.
Another criterion was applied to the list of documents that passed the initial litmus test of bearing an author name. The testing methods employed require that a document be more substantial than a couple of sentences, and so length was an issue. The need for length notwithstanding, a document of this brevity that met the third criterion below would have been included in testing due to content importance. There was no instance, however, of a document of insufficient length being important enough in terms of content to still test that document. An example of a document too brief for testing would be the “Malcolm Grow to Lt. Gen. Twining – Aero Medical Laboratory” (20 September 1947), which is a single sentence.
The third criterion was pragmatic, and driven in part by cost considerations. Of those documents that bore a signature and were of sufficient length (more than a sentence or two), preference for testing was given to those documents that contained specific reference to the existence of extraterrestrial biological entities (EBEs) or claims of an extraterrestrial origin for salvaged wreckage. Any document that appeared important for validating the extraterrestrial hypothesis (ETH) as an explanation for UFOs was included in the testing. For example, a document that mentioned the retrieval or transport of wreckage from Roswell or some other event famous for its connection to the UFO question may have been deferred for testing if there was nothing in the document that specifically pointed to the ETH or an EBE. The mere mention of “Roswell” or “Wright Patterson” would not be sufficient to mandate testing. In brief, there had to be something compelling about the document for it to merit testing.
Fourth, some of the Majestic documents could not be tested because they contained no prose text. An example is the document entitled, “Majestic Twelve Project, Purpose and Table of Contents (Summer, 1952?).” This document is simply a table of contents. Even if a document of this nature had an attributed author, it could not be tested by linguistic means.
Lastly, documents that were clearly secondhand in nature were not chosen for testing. An example is the lengthy Bowen manuscript. While the Wood team labels this document as “high interest,” it is not written by a person who would be “in the know” with respect to the high levels of security needed to be a primary witness to either evidence for the ETH and EBEs or to discussions within Majestic Twelve. While it may be true that, as the Wood team states, “Bowen was personally connected to many top people,”15 it defies coherence to argue, on one hand, that Majestic-12 and its activities were so secret that evidence of its existence only became available in the 1980s, and on the other, to suggest that members of Majestic-12 were sharing the nation’s most highly classified secrets with an outsider like Mr. Bowen. The secondary nature of the Bowen manuscript is acknowledged by the Wood team, as they note its status as “a well written snapshot of the public history of flying saucers from 1947 to 1954.”16 The operative word in this comment is “public,” which reveals its peripheral importance in terms of content.
The Majestic documents tested by Dr. Chaski were typed and proofed by Dr. Michael S. Heiser, Amy C. Ward, and Joe E. (“Free”) Ward, of Roswell, NM. Only the prose content of the documents was typed out for testing, along with salutations and benedictions. Date formulas, stamps, handwritten annotations, military file numbers, memoranda headings, etc. were not typed out since authorship attribution testing concerns the testing of written prose content for author-particular stylistics. Misspellings and grammatical errors in usage were preserved in the prose content reproduced for testing. Documents were saved as text (.txt) files.
The following spreadsheet chart (Chart 1) contains the seventeen documents allegedly written by nine authors that were tested by Dr. Chaski. Unknown to Dr. Chaski, I included several documents previously demonstrated as fraudulent by Stanton Friedman (see Section 2.7). I did so to test Dr. Chaski’s analysis independently. The identity of these fraudulent documents is revealed below under the test results.
Thirty documents known to have been composed by the nine authors to whom the Majestic documents were attributed served as the data pool for computational stylistic comparison.17 The chart below (Chart 2) reveals that these “known author” documents were selected with sensitivity to sameness of word and character count, genre, chronological era, and recipient. While the enterprise of authorship attribution by computational linguistic methods does not require sameness of subject matter for document comparison, several of the “known author” documents contained similar subject matter (e.g., space technology). In some instances, the “known author” document references an event in one of the unverified documents (e.g., the 1942 Los Angeles sighting).
The material in this section draws heavily upon the peer-reviewed article by Dr. Chaski.18
Dr. Chaski explains that, when it comes to document attribution in the legal world, methods for determining authorship “must work in conjunction with the standard investigative and forensic techniques which are currently available.”19 Determining authorship of a typewritten document, whether originally or subsequently put into electronic form, can be approached in three ways: “. . . biometric analysis of the computer user; qualitative analysis of ‘idiosyncrasies’ in the language in questioned and known documents; and quantitative, computational stylometric analysis of the language in questioned and known documents.”20
With respect to the Majestic documents, the first method is not possible: there is no way to analyze actual keystroke pattern dynamics. This method is technically non-linguistic. The second method “assesses errors and ‘idiosyncrasies’ based on the examiner’s experience.”21 This method also has the disadvantage of requiring the pre-existence of a stylistic database against which to measure presumed idiosyncrasies. Chaski elaborates:
This approach, known as forensic stylistics, could be quantified through databasing, as suggested by McMenamin (2001), but at this time the databases which would be required have not been fully developed. Without the databases to ground the significance of stylistic features, the examiner’s intuition about the significance of a stylistic feature can lead to methodological subjectivity and bias. Another approach to quantifying is counting particular errors or idiosyncrasies and inputting this into a statistical classification procedure. When the forensic stylistics approach was quantified in this way by Koppel and Schler (2000), using 100 “stylemarkers” in a Support Vector Machine (Vapnik 1995) and C4.5 (Quinlan 1993) analysis, the highest accuracy for author attribution was 72%.22
The third approach, stylometry, “is quantitative and computational, focusing on readily computable and countable language features, e.g. word length, phrase length, sentence length, vocabulary frequency, distribution of words of different lengths.”23 Stylometric analysis also may include analysis of function word frequency and punctuation.24
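The kinds of countable features named in this description can be illustrated with a short sketch. It is not the procedure used in this study: the function name `stylometric_profile`, the very small function-word list, and the sample sentence are all assumptions made purely for illustration.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny function-word list (an assumption for
# this sketch); a real study would use a much longer, justified list.
FUNCTION_WORDS = ("the", "of", "and", "to", "in", "that", "it", "is")


def stylometric_profile(text):
    """Return a few surface-level stylometric features for one document."""
    # Naive sentence split on terminal punctuation; abbreviations such as
    # "Dr." would be miscounted, which is acceptable for an illustration.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[a-z']+", text.lower())
    if not words or not sentences:
        raise ValueError("text must contain at least one word and one sentence")

    word_counts = Counter(words)
    length_counts = Counter(len(w) for w in words)
    total = len(words)
    return {
        "mean_word_length": sum(len(w) for w in words) / total,
        "mean_sentence_length": total / len(sentences),
        # Share of words having each length (distribution of word lengths).
        "word_length_distribution": {
            length: count / total for length, count in sorted(length_counts.items())
        },
        # Relative frequency of each function word.
        "function_word_rates": {fw: word_counts[fw] / total for fw in FUNCTION_WORDS},
        # Internal punctuation marks (commas, semicolons, colons) per word.
        "punctuation_per_word": len(re.findall(r"[,;:]", text)) / total,
    }


# Invented sample text, not drawn from any tested document.
sample = "The board met in secret. It is said that the report was sent to the President."
profile = stylometric_profile(sample)
print(profile["mean_sentence_length"], profile["function_word_rates"]["the"])
```

Profiles of this kind, computed for a questioned document and for each set of known-author documents, are the sort of numeric input a statistical comparison would then work from.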
As one of the leaders in the field of the development of authorship attribution techniques that meet legal standards for evidence, Dr. Chaski has developed “a computational, stylometric method which has obtained 95% accuracy and has been successfully used in investigating and adjudicating several crimes involving digital evidence.”25 Chaski elaborates on her method (ALIAS26):
[My] syntactic analysis method (Chaski 1997, 2001, 2004) has obtained an accuracy rate of 95%. The primary difference between the syntactic analysis method and other computational stylometric methods is the syntactic method’s linguistic sophistication and foundation in linguistic theory. Typical stylometric features such as word length and sentence length are easy to compute even if not very interesting in terms of linguistic theory, but the more difficult to compute features such as phrasal type are also more theoretically grounded in linguistic science and experimental psycholinguistics.27
As noted above (Sec. 1.3), with respect to the Majestic documents, Dr. Chaski’s testing was not as thorough as it could have been due to expense. Variations on the capabilities of ALIAS were employed to test the Majestic documents. The testing is therefore referred to as preliminary in this paper. Future testing will allow a full exploitation of the capabilities of ALIAS.
Specifically, the method employed in this initial round of testing by Dr. Chaski was an “n-gram” approach. N-gram approaches involve pattern detection of a specific number (n) of parts-of-speech labels or words in sequence. Once these sequences are found, they can be sorted by similarity.28 In regard to her own pioneering techniques in the field, which were used for testing the Majestic documents, Dr. Chaski noted:
“N-gram approaches for author identification have been very successful on large documents, approaching 98% accuracy verified. I wanted to make sure that an n-gram approach would also work on short documents. Another problem is that some n-gram approaches are very biased toward document length, so the wordier person always gets selected as the author. I was able to fix both problems and get ~90% accuracy on short documents with verbose known authors not being favored over concise known authors or vice versa. The exact details are proprietary, as this is a real advance in the field.”29
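Because the details of Dr. Chaski's method are proprietary, the following is only a generic sketch of what a word n-gram comparison can look like. It is not ALIAS and does not reproduce her results. The function names, the choice of word bigrams, the use of cosine similarity, and the toy texts are all assumptions for illustration. Cosine similarity compares the direction of the frequency vectors rather than their size, which is one simple way to keep a longer known-author sample from scoring higher on length alone.

```python
import math
import re
from collections import Counter


def ngram_profile(text, n=2):
    """Relative frequencies of word n-grams (sequences of n words) in a text."""
    words = re.findall(r"[a-z']+", text.lower())
    grams = Counter(tuple(words[i:i + n]) for i in range(len(words) - n + 1))
    total = sum(grams.values())
    return {gram: count / total for gram, count in grams.items()} if total else {}


def cosine_similarity(p, q):
    """Cosine of the angle between two sparse frequency vectors (0.0 to 1.0)."""
    dot = sum(value * q.get(gram, 0.0) for gram, value in p.items())
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    return dot / (norm_p * norm_q) if norm_p and norm_q else 0.0


def rank_candidates(questioned, known_by_author, n=2):
    """Sort candidate authors by similarity to the questioned text, best first."""
    q_profile = ngram_profile(questioned, n)
    scores = {
        author: cosine_similarity(q_profile, ngram_profile(" ".join(docs), n))
        for author, docs in known_by_author.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Invented toy texts; the author labels are placeholders, not real people.
known = {
    "author_a": ["it is my view that the matter is closed",
                 "it is my view that we must wait"],
    "author_b": ["please advise at once regarding the shipment",
                 "please advise at once if the order changes"],
}
questioned = "it is my view that the order is late"
ranking = rank_candidates(questioned, known)
print(ranking)
```

A ranking like this only orders the candidates supplied to it. It cannot show that none of them wrote the questioned text, which is why accuracy has to be established separately on documents of known authorship.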
One final word on the testing enterprise is necessary. It is acknowledged that many of the Majestic documents were not handwritten or even typed by the author to whom they are attributed. The typical practice, especially for presidents, would be to dictate the content of correspondence to a secretary who would type and reproduce the content. This reality is not at odds with Dr. Chaski’s testing methods since memoranda and correspondence are not produced by distinct psycho-linguistic processes. In other words, there is no significant linguistic difference between dictating a letter as one would desire it be written and the mental connection to the act of typing those thoughts oneself.
(pt II pending)
* Special Thanks To Dr. Michael S. Heiser
1). See the chronological listing of the reception of the Majestic documents reconstructed by Dr. Robert Wood and Ryan Wood, http://www.majesticdocuments.com/sources.php, accessed June 5, 2007. A table summary of the circumstances of the source and provenance of each document can be found in Dr. Robert Wood, “Mounting Evidence for the Authenticity of MJ-12 Documents,” paper presented at the International MUFON Symposium, Irvine, CA; July 21, 2001, 5. Accessed at 18.104.22.168/pdf/rmwood_mufon2001.pdf on June 5, 2007.
2). Shandera was one of the early recipients of the Majestic documents.
3). See Wood, “Mounting Evidence,” 6-10.
4). Stanton Friedman’s website biography reads in part: “Stanton Friedman received the BSc and MSc degrees in physics from the University of Chicago in 1955 and 1956. He was employed for 14 years as a nuclear physicist for such companies as GE, GM, Westinghouse, TRW Systems, Aerojet General Nucleonics, and McDonnell Douglas on such advanced, classified, eventually cancelled, projects as nuclear aircraft, fission and fusion rockets, and nuclear powerplants for space.” Accessed at www.v-j-enterprises.com/sfbio.html on June 5, 2007.
5). Dr. Robert Wood holds a B.S. in Aeronautical Engineering from the University of Colorado and a Ph.D. in Physics from Cornell University. He spent 43 years in research and development with Douglas Aircraft and McDonnell Douglas before retiring in 1993. Ryan Wood holds a B.S. in Mathematics and Computer Science from California Polytechnic State University at San Luis Obispo. He has held various positions in marketing, consulting, and sales for Intel Corporation, Digital Equipment, and Toshiba.
6). See Stanton T. Friedman, Top Secret / Majic (New York: Marlowe and Company, 1996); idem, “Final Report on Operation Majestic Twelve,” unpublished paper, 1990; Wood, “Mounting Evidence”; idem, “Validating the New Majestic Documents,” paper presented at the International MUFON Symposium, St. Louis, MO; July 15, 2000; Robert M. and Ryan Wood, “Another Look at Majestic,” MUFON UFO Journal No. 371, March 1999.
7). Wood, “Mounting Evidence,” 6-7.
9). Wood, “Mounting Evidence,” 9-10.
12). See www.utc.edu/Research/SunTrustChair/chair_previous_wescott_index.html accessed June 6, 2007.
13). International UFO Reporter, vol. 13, no. 4, July / August 1988, p. 19. Cited by Paul Kimball, “MJ-12 – The Wescott ‘Analysis’ Red Herring,” The Other Side of the Truth, July 14, 2005, accessed at redstarfilms.blogspot.com/2005_07_01_archive.html on June 6, 2007.
14). Dr. Chaski holds an M.A. and Ph.D. in linguistics from Brown University. Computational linguistics is one of her specialties, and her work in this field has been recognized and validated through peer review, numerous legal cases, and scientific grant funding. See www.linguisticevidence.org/FLCV.htm.
17). Numbers and names of these documents were invented by Dr. Heiser as a means of categorization. The “known author” documents in the spreadsheet above were drawn from a larger number of possible documents.
18). Carol E. Chaski, “Who’s at the Keyboard? Authorship Attribution in Digital Evidence Investigations,” International Journal of Digital Evidence 4:1 (Spring, 2005). Accessed online, June 10, 2007.
19). Chaski, “Keyboard,” 1.
22). Ibid., 2. See the bibliography for the articles cited by Chaski.
23). Ibid., 2.
24). See Carol E. Chaski, “Empirical Evaluations of Language-Based Author Identification Techniques,” International Journal of Speech, Language and the Law 8:1 (2005): 5; John Olsson, “Using Groups of Common Textual Features for Authorship Attribution,” Forensic Linguistics Institute, Nebraska Linguistics Institute (n.d.), 1-10; accessed at www.thetext.co.uk/authorship/authorship.doc, June 12, 2007; Michael Gamon, “Linguistic Correlates of Style: Authorship Classification with Deep Linguistic Structures,” Microsoft Research, Microsoft Corp.; accessed at research.microsoft.com/nlp/publications/coling2004_authorship.pdf, June 11, 2007; Shlomo Argamon and Shlomo Levitan, “Measuring the Usefulness of Function Words for Authorship Attribution,” Illinois Institute of Technology; accessed at lingcog.iit.edu/doc/paper_162_argamon.pdf, June 11, 2007.
25). Ibid., 1.
26). ALIAS is a computer program written by Dr. Chaski. Dr. Chaski’s method is currently under patent pending status with the U.S. Patent Office.
27). Ibid., 2. See the bibliography for works cited.
28). See Chaski, “Keyboard,” 5.
29). Personal communication with the author in email, June 12, 2007.