The Snorlaxiser
Disclaimer: The copyright to Snorlax as a concept is probably owned by someone. If anyone asks, this is satire or some such, and is meant as harmless fun. The code presented here uses my own name, in fact. I don't think anyone's copyrighted my name. Might be a good idea. I look forward to the cease and desist letters.
Anyway, the point of the Snorlaxiser is to replace the text content of a DOM tree (or subtree) with text made up pseudo-randomly from a selection of syllables, whilst maintaining the layout, punctuation, etc.
The code, written in JavaScript (so that one day I may get round to making it into a Google Chrome plug-in), is based around three primary functions:
- Tree navigator - Finds all of the text nodes which are descendants of a given node, and runs some function on each of them
- Syllable counter - Estimates the number of syllables in a given word, which helps maintain the feel of the document
- Text converter - Replaces each word with an appropriate number of syllables
The tree navigator is very easy to write as a recursive function:
function convertNodes(node) {
    if (node.nodeType === 3) { // 3 is a Text Node
        node.nodeValue = convertText(node.nodeValue); // Call the conversion function
    } else {
        // Even if this isn't a text node, perhaps some of its children are...
        var m = node.childNodes.length;
        for (var i = 0; i < m; i++) convertNodes(node.childNodes[i]);
    }
}
This function is not properly optimised, but it is in the form most obvious when writing from scratch.
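For what it's worth, one possible way to avoid the recursion is to keep an explicit stack of nodes still to visit. This is only a sketch, not a measured optimisation: the name convertNodesIterative and the transform callback parameter are my own assumptions, not part of the code above.

```javascript
// Hypothetical alternative to convertNodes: walks the tree with an explicit
// stack instead of recursion, and applies `transform` (an assumed callback
// taking and returning a string) to the value of every text node it finds.
function convertNodesIterative(root, transform) {
    var TEXT_NODE = 3; // Same numeric node type that convertNodes checks for
    var stack = [root];
    while (stack.length > 0) {
        var node = stack.pop();
        if (node.nodeType === TEXT_NODE) {
            node.nodeValue = transform(node.nodeValue);
        } else {
            // Push the children in reverse so they are popped in document order
            for (var i = node.childNodes.length - 1; i >= 0; i--) {
                stack.push(node.childNodes[i]);
            }
        }
    }
}
```

In a browser this would presumably be called as convertNodesIterative(document.body, convertText). Passing the conversion function in as an argument also makes it easy to try the walker with something simpler first.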
The syllable counter uses a simple method: count the number of separated vowels. There is also an additional check which tries to catch 'magic-e' words. Running it on a range of haiku generally yielded a result between 15 and 18 (the type of haiku I was using should have 17 syllables), an accuracy that I felt was sufficient for such a silly purpose as this code. Suggestions for improving this are most welcome in the comments.
var syllableCount = function(string) {
    var matches = string.match(syllableCount.pattern);
    if (matches === null) return 0; // No vowels found...
    var currentSyllableCount = matches.length;
    // Quick check for any 'magic-e' ending first; only then count them all and subtract
    if (syllableCount.silentE.test(string)) currentSyllableCount -= string.match(syllableCount.silentEs).length;
    return currentSyllableCount;
};
syllableCount.pattern = new RegExp("[aeiouy]([^aeiouy]|$)", 'gim'); // Vowel followed by non-vowel or end of string. Matches all in multi-line string, case insensitively.
syllableCount.silentE = new RegExp("[aeiouy][^aeiouy]e([^a-z]s|[^a-z]|$)", 'i'); // Words ending vce (including vce's), where v is some vowel and c is some consonant
syllableCount.silentEs = new RegExp("[aeiouy][^aeiouy]e([^a-z]s|[^a-z]|$)", 'gim'); // As above, but matches all in a multi-line string (the previous one matches only the first, and is used to quickly find whether there are any)
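To give a feel for how rough the estimate is, here is a small self-contained sketch of the same idea run on a few single words. The names estimateSyllables, vowelGroup and magicE are my own for this illustration; the two patterns follow the ones above, minus the multi-line flag, which makes no difference for single words.

```javascript
// Hypothetical stand-alone version of the syllable estimate, for illustration only
var vowelGroup = /[aeiouy]([^aeiouy]|$)/gi;             // Vowel followed by a non-vowel or the end
var magicE = /[aeiouy][^aeiouy]e([^a-z]s|[^a-z]|$)/gi;  // vowel, consonant, e at the end of a word

function estimateSyllables(text) {
    var vowels = text.match(vowelGroup);
    if (vowels === null) return 0; // No vowels at all
    var silent = text.match(magicE);
    // Each 'magic-e' ending was counted as a vowel above, so take it off again
    return vowels.length - (silent === null ? 0 : silent.length);
}

var samples = ['cat', 'like', 'banana', 'rhythm'];
var estimates = samples.map(function (word) { return estimateSyllables(word); });
// estimates is [1, 1, 3, 1]
// "rhythm" really has two syllables, so the estimate is one short there
```

The first three come out right, while "rhythm" shows the sort of word the separated-vowel approach misses.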
And now for the text replacement function. This generates a pseudo-random string which matches the opening case of each word, and attempts to maintain the syllable count of each word. The output is based on the syllables supplied after the end of the function. A member of the upperSyllables array is used as the first syllable of a word which originally began with a capital letter. For all other syllables, a member of the syllables array is used.
function convertText(str) {
    var words = str.match(convertText.splitIntoWords); // Attempt to split the input string into a series of words
    if (words === null) return str; // If there are no words, just return the original string
    // Output buffer, initialised with the opening non-letters from the input string (if there are any)
    var initial = str.match(convertText.initialPunctuation);
    var out = (initial === null) ? '' : initial[0];
    for (var i = 0; i < words.length; i++) {
        var syllCount = syllableCount(words[i]); // Get the (estimated) number of syllables
        if (words[i].match(convertText.startsWithUpper) === null) { // null = no match, so does not start with upper
            out += convertText.syllables[Math.floor(Math.random() * convertText.syllables.length)]; // Add lower syllable
        } else {
            out += convertText.upperSyllables[Math.floor(Math.random() * convertText.upperSyllables.length)]; // Add upper syllable
        }
        for (var j = 1; j < syllCount; j++) { // The first syllable is dealt with above. All others are lower.
            out += convertText.syllables[Math.floor(Math.random() * convertText.syllables.length)]; // Add lower syllable
        }
        // Add any trailing non-letters to the string (maintains spacing, punctuation, numbers, etc.)
        var trailingPunctuation = words[i].match(convertText.trailingPunctuation);
        if (trailingPunctuation !== null) out += trailingPunctuation[0];
    }
    return out;
}
convertText.splitIntoWords = new RegExp("[a-z\']+([^a-z\']+|$)", 'gim'); // Split into words at each group of non-letters. These are kept with the preceding word.
convertText.initialPunctuation = new RegExp("^[^a-z]+", 'i'); // Any non-letters at the very start of the string (no 'm' flag, so ^ cannot match at the start of a later line). Others are included in their preceding word
convertText.trailingPunctuation = new RegExp("[^a-z']+$", 'i'); // Get the punctuation from the end of a word
convertText.startsWithUpper = new RegExp("^[A-Z]", ''); // First character is upper case
// The main syllable array
convertText.syllables = new Array (
    'ben',
    'e',
    'dict'
);
// The array of syllables to start a word with when the word starts with a capital letter
convertText.upperSyllables = new Array (
    'Ben',
    'E'
);
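The way the layout survives is easiest to see by looking at the chunks on their own. The following is just an illustrative sketch: chunkText is a made-up helper name, and the two patterns are written as literals based on splitIntoWords and initialPunctuation, without the multi-line flag.

```javascript
// Patterns based on convertText.splitIntoWords and convertText.initialPunctuation
var splitIntoWords = /[a-z']+([^a-z']+|$)/gi; // A word plus the non-letters that follow it
var initialPunctuation = /^[^a-z]+/i;         // Non-letters before the first word

// Hypothetical helper: returns the leading non-letters and the list of word chunks
function chunkText(str) {
    var lead = str.match(initialPunctuation);
    var words = str.match(splitIntoWords) || [];
    return { lead: lead === null ? '' : lead[0], words: words };
}

var sample = '"Hello, world!" It\'s 42 degrees.';
var chunks = chunkText(sample);
// chunks.lead  is '"'
// chunks.words is ['Hello, ', 'world!" ', "It's 42 ", 'degrees.']
// Joining the lead and the chunks back together gives the original string
var rebuilt = chunks.lead + chunks.words.join('');
```

Because every non-letter ends up either in the lead or at the tail of exactly one chunk, swapping only the letters of each chunk leaves the spacing, punctuation and numbers where they were.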
Feel free to take and mess with this code: there's nothing special in this post, so share and share alike. Mention me in passing, if you have the chance. Or, possibly, even comment?