The Snorlaxiser « Myself, Coding, Ranting, and Madness

31 Oct 2010 0:00 Tags: None

Disclaimer: The copyright to Snorlax as a concept is probably owned by someone. If anyone asks, this is satire or some such, and is meant as harmless fun. The code presented here uses my own name, in fact. I don't think anyone's copyrighted my name. Might be a good idea. I look forward to the cease and desist letters.

Anyway, the point of the Snorlaxiser is to replace the text content of a DOM tree (or subtree) with text made up pseudo-randomly from a selection of syllables, whilst maintaining the layout, punctuation, etc.

The code, written in JavaScript (so that one day I may get round to making it into a Google Chrome plug-in), is based around three primary functions:

Tree navigator - Finds all of the text nodes which are descendants of a given node, and runs some function on each of them
Syllable counter - Estimates the number of syllables in a given word, which is used to maintain the feel of the document
Text converter - Replaces each word with an appropriate number of syllables

The tree navigator is very easy to write as a recursive function:

function convertNodes(node) {
    if (node.nodeType == 3) { // 3 is a Text Node
        node.nodeValue = convertText(node.nodeValue); // Call the conversion function
    } else {
        // Even if this isn't a text node, perhaps some of its children are...
        var m = node.childNodes.length;
        for (var i = 0; i < m; i++) convertNodes(node.childNodes[i]);
    }
}

This function is not properly optimised, but it is in the form most obvious when writing from scratch.
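For anyone worried about very deep trees, the same walk can be done without recursion. What follows is only a sketch: forEachTextNode, textNode and elementNode are names I have made up for illustration, and the plain objects merely stand in for DOM nodes so that it runs outside a browser.

```javascript
// Hypothetical helper: visit every text node under `root` using an explicit stack
function forEachTextNode(root, visit) {
    var stack = [root];
    while (stack.length > 0) {
        var node = stack.pop();
        if (node.nodeType === 3) { // 3 is a Text Node
            visit(node);
        } else {
            // Push children in reverse so that they are popped in document order
            for (var i = node.childNodes.length - 1; i >= 0; i--) {
                stack.push(node.childNodes[i]);
            }
        }
    }
}

// Plain objects standing in for DOM nodes (an assumption made for this demo only)
function textNode(value) {
    return { nodeType: 3, nodeValue: value, childNodes: [] };
}
function elementNode(children) {
    return { nodeType: 1, nodeValue: null, childNodes: children };
}

var tree = elementNode([
    textNode('Hello '),
    elementNode([textNode('big')]),
    textNode(' world')
]);

// Record the order of the visits, and upper-case each text node in place
var seen = [];
forEachTextNode(tree, function (node) {
    seen.push(node.nodeValue);
    node.nodeValue = node.nodeValue.toUpperCase();
});
```

In a browser, document.createTreeWalker with NodeFilter.SHOW_TEXT is another way of getting at just the text nodes. I have not measured whether either is actually faster than the recursive version.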

The syllable counter uses a simple method: count the number of separated vowel groups, so that each run of consecutive vowels counts once. There is also an additional check which tries to catch 'magic-e' words, where a final e after a vowel and a consonant is not pronounced. Running it on a range of haiku generally yielded a result between 15 and 18 (the type of haiku I was using should have 17 syllables), an accuracy that I felt was sufficient for such a silly purpose as this code. Suggestions for improving this are most welcome in the comments.

var syllableCount = function(string) {
    var matches = string.match(syllableCount.pattern);
    if (matches == null) return 0; // No vowels found...

    var currentSyllableCount = matches.length;

    // Discount one syllable for every 'magic-e' ending found
    if (string.match(syllableCount.silentE) != null) currentSyllableCount -= string.match(syllableCount.silentEs).length;

    return currentSyllableCount;
};
syllableCount.pattern = new RegExp("[aeiouy]([^aeiouy]|$)", 'gim'); // Vowel followed by non-vowel or end of string. Matches all in multi-line string, case insensitively.
syllableCount.silentE = new RegExp("[aeiouy][^aeiouy]e([^a-z]s|[^a-z]|$)", 'i'); // words ending vce / vces where v is some vowel, c is some consonant
syllableCount.silentEs = new RegExp("[aeiouy][^aeiouy]e([^a-z]s|[^a-z]|$)", 'gim'); // as above, but match all in multi-line string (previous matches only first - used to find if there are any quickly)
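To see what the estimate does on individual words, here is a cut-down sketch of the same two rules. The name estimateSyllables and the sample words are my own choices for illustration, and the silent-e check is simplified to handle one word at a time.

```javascript
// A vowel followed by a non-vowel or the end of the string, i.e. the end of a vowel group
var vowelGroupEnd = /[aeiouy]([^aeiouy]|$)/gi;
// Simplified 'magic-e' check: vowel, consonant, then e at the end of the word
var silentE = /[aeiouy][^aeiouy]e([^a-z]|$)/gi;

// Hypothetical helper: vowel groups minus silent-e endings
function estimateSyllables(word) {
    var groups = word.match(vowelGroupEnd);
    if (groups === null) return 0; // No vowels found
    var silent = word.match(silentE);
    return groups.length - (silent === null ? 0 : silent.length);
}

var sampleWords = ['haiku', 'name', 'syllable', 'rhythm'];
var estimates = {};
for (var i = 0; i < sampleWords.length; i++) {
    estimates[sampleWords[i]] = estimateSyllables(sampleWords[i]);
}
// estimates is { haiku: 2, name: 1, syllable: 3, rhythm: 1 }
```

The first three come out right. 'rhythm' is counted as one syllable rather than two, which is the sort of miss that accounts for the spread in the haiku totals.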

And now for the text replacement function. This generates a pseudo-random string which matches the opening case of each word, and attempts to maintain the syllable count of each word. The output is built from the syllables supplied after the function. A member of the upperSyllables array is used as the first syllable of a word which originally began with a capital letter. For all other syllables, a member of the syllables array is used.


function convertText(str) {
    var words = str.match(convertText.splitIntoWords); // Attempt to split the input string into a series of words
    if (words == null) return str; // If there are no words, just return the original string

    // Output buffer, initialised with the opening non-letters from the input string (if there are any)
    var initial = str.match(convertText.initialPunctuation);
    var out = (initial == null) ? '' : initial[0];

    for (var i = 0; i < words.length; i++) {
        var syllCount = syllableCount(words[i]); // Get the (estimated) number of syllables

        if (words[i].match(convertText.startsWithUpper) == null) { // null = no match, so does not start with upper
            out += convertText.syllables[Math.floor(Math.random()*convertText.syllables.length)]; // Add lower syllable
        } else {
            out += convertText.upperSyllables[Math.floor(Math.random()*convertText.upperSyllables.length)]; // Add upper syllable
        }

        for (var j = 1; j < syllCount; j++) { // The first syllable is dealt with above. All others are lower.
            out += convertText.syllables[Math.floor(Math.random()*convertText.syllables.length)]; // Add lower syllable
        }

        // Add any trailing non-letters to the string (maintains spacing, punctuation, numbers, etc.)
        var trailing_punctuation = words[i].match(convertText.trailingPunctuation);
        if (trailing_punctuation != null) out += trailing_punctuation[0];
    }

    return out;
}
convertText.splitIntoWords = new RegExp("[a-z\']+([^a-z\']+|$)", 'gim'); // Split words based on each group of non-letters. These are kept with the preceding word.
convertText.initialPunctuation = new RegExp("^[^a-z]+", 'i'); // Any initial non-letters at the very start of the string (no 'm' flag, so later lines are not matched). Others are included in their preceding word
convertText.trailingPunctuation = new RegExp("[^a-z']+$", 'i'); // Get the punctuation from a word
convertText.startsWithUpper = new RegExp("^[A-Z]", ''); // First character is upper case

// The main syllable array
convertText.syllables = new Array (
    'ben',
    'e',
    'dict'
);

// The array of syllables to start a word with when the word starts with a capital letter
convertText.upperSyllables = new Array (
    'Ben',
    'E'
);
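Since the output itself is random, the easiest part to check by hand is the word splitting that keeps the layout intact. The sketch below reuses the same two patterns as regex literals; splitWithTrailing and the sample sentence are made up for illustration and are not part of the code above.

```javascript
// Same patterns as splitIntoWords and trailingPunctuation, written as literals
var splitIntoWords = /[a-z']+([^a-z']+|$)/gi;
var trailingPunctuation = /[^a-z']+$/i;

// Hypothetical helper: separate each piece into its letters and its trailing non-letters
function splitWithTrailing(str) {
    var pieces = str.match(splitIntoWords) || [];
    var result = [];
    for (var i = 0; i < pieces.length; i++) {
        var trailing = pieces[i].match(trailingPunctuation);
        var tail = (trailing === null) ? '' : trailing[0];
        result.push({
            word: pieces[i].slice(0, pieces[i].length - tail.length),
            tail: tail
        });
    }
    return result;
}

var sample = "Hello, world! It's 42 degrees.";
var demo = splitWithTrailing(sample);

// Gluing the words and tails back together reproduces the input exactly,
// because this sample has no leading non-letters
var rebuilt = '';
for (var k = 0; k < demo.length; k++) {
    rebuilt += demo[k].word + demo[k].tail;
}
```

Here the sentence splits into four pieces, and the number 42 travels along as part of the tail after "It's". That is how spacing, punctuation and numbers survive the conversion while only the letters get replaced.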

Feel free to take and mess with this code: there's nothing special in this post, so share and share alike. Mention me in passing, if you have the chance. Or, possibly, even comment?
