Problem:

Generate a sentence that can be read and understood. It must contain a subject, verb, and object, and tenses and plurals must match. The program must also be able to generate several different sentences to qualify.

Rules:

  • Hard-coding the sentences is not permitted, nor is reading them directly from a file (I'm looking at you, unclemeat)
  • You can have any number of word lists
  • Submit an example sentence or 2 that have been generated by your program
  • Any language is accepted
  • It's a popularity-contest, so the most upvoted answer wins
 
+1 for your edit description. –  Mike 2 days ago
6  
I think it's clear from some of the answers (MatLab I'm looking at you) that you should modify the rules such that data-mining is not allowed to pull consecutive words from any source. –  Carl Witthoft yesterday
 
While I'm being a smartass: since it's purely a popularity contest, someone should just post a HotModelBikini jpg. That'll get more votes than anything. –  Carl Witthoft yesterday
5  
I'll upvote anyone who uses repetitions of "buffalo" or "fish" as sample sentences! –  Yimin Rong yesterday
4  
Most answers here either mine valid, full sentences from text sources, or generate output that does not meet the criteria. To me, both approaches seem against the spirit of the question! If someone really wants to impress, might I suggest a program that starts with a set of valid sentence structures like [Adjective] [pl. noun] [verb] [adjective] [pl. noun] and pulls from a real dictionary (maybe using one of the Dictionary APIs available out there) to fill in the blanks? I'd write it myself if I had a few minutes to spare! :( After all... Lazy Developers Write Lousy Programs. –  Brian Lacy yesterday

26 Answers

Matlab

why

example of outputs:

>> why
The programmer suggested it.
>> why
To please a very terrified and smart and tall engineer.
>> why
The tall system manager obeyed some engineer.
>> why
He wanted it that way.

[This is one of Matlab's easter eggs]

3  
+1 I like Easter eggs :D –  Silviu Burcea 2 days ago
2  
you can see the code here: opg1.ucsd.edu/~sio221/SIO_221A_2009/SIO_221_Data/Matlab5/… –  Elisha 2 days ago
3  
The second example is not a sentence. It's an infinitive phrase. –  WChargin yesterday
 
Most of the answers here produce not only sentences (look at the other highly voted answers, for example). The task doesn't say it must create only sentences; it says it must be able to produce sentences. –  Elisha 4 hours ago

Bash

fgrep '/* ' /usr/src/linux* -r | cut -d '*' -f 2 | head -$((RANDOM)) | tail -1

Requirements: linux kernel source installed in /usr/src

This pulls random comments out of the kernel source. Whether the sentences are actually understandable is open to debate.

Examples of actual output:

  • end of packet for rx
  • I don't know what to do
  • 256 byte packet data buffer.
  • The rest of this junk is to help gdb figure out what goes where
  • Convert page list back to physical addresses, what a mess.
  • ???
  • Only Sun can take such nice parts and fuck up the programming interface
10  
Good one! You should pull them all and submit it as an official fortune database. –  Jason C 2 days ago
2  
You'd better use /usr/src/linux*, as, for instance, gentoo, names the folder /usr/src/linux-$version –  mniip 2 days ago
9  
"???" best comment ever –  PacMani 2 days ago
2  
Isn't the first rule ('nor is reading them directly from a file') violated? –  kuldeep.kamboj 2 days ago
4  
I'd say searching through system source code and filtering out the text from comments doesn't really count as "reading directly". –  Riot yesterday

PHP

Given enough time, this will produce all literature, past, present and future. The rules didn't mention that no other text may be produced.

The string 'TOS...' provides a logarithmic scale frequency of the letters to more closely match English. This is used to generate a larger string with the approximate relative letter frequencies.

$a = ord('A');
$s = '';

foreach (str_split('TOSRWQPPUALRQTTRGUUUQMMLMFZ') as $i=>$f)
{
    if (!ctype_alpha($c = chr($a + $i)))
        $c = ' ';
    $s .= str_repeat($c, round(exp((ord($f) - $a) / 3.976)));
}

$l = strlen($s) - 1;
for (;;)
    echo substr($s, mt_rand(0, $l), 1);

Running it, I have discovered such literary gems as:

  • GO NOW - You as a subject is implied.
  • IM AOK - I'm A-OK
  • IM FDR - I'm F(ranklin) D(elano) R(oosevelt)

Also, numerous invectives to concisely express displeasure with the current situation. [Some letters redacted.]

  • F**K
  • S**T

As well, the following using the fine-tuned scaling:

  • IS IT ON
  • I AM STU
  • I SEE HTML
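
For readers who do not use PHP, the weighting trick can be sketched in Python. This is only an illustration of the idea, not the answer's own code: the names `letter_weights` and `random_text` are made up here, and `random.choices` stands in for the PHP approach of building one long string and indexing into it. Each character of the scale string is decoded as round(exp((code - 'A') / 3.976)), exactly as in the PHP loop, and the 27th entry is the weight of the space.

```python
import math
import random

SCALE = "TOSRWQPPUALRQTTRGUUUQMMLMFZ"    # copied from the PHP code above
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ "  # 26 letters, then the space


def letter_weights(scale=SCALE):
    """Decode the log-scale string into one integer weight per character."""
    return {
        ch: round(math.exp((ord(code) - ord("A")) / 3.976))
        for ch, code in zip(ALPHABET, scale)
    }


def random_text(length, rng=random):
    """Draw `length` characters independently, proportional to their weights."""
    weights = letter_weights()
    chars = list(weights)
    return "".join(rng.choices(chars, weights=[weights[c] for c in chars], k=length))


print(random_text(60))
```

With this decoding the space is the most likely character and J (code 'A', weight 1) the least likely, which is why short word-like runs show up fairly often.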
36  
Why, a bunch of monkeys could do the same! –  Tim S. 2 days ago
7  
I likes! Now make a program that processes the letters coming out of that and finds understandable sentences! :) –  TheDoctor 2 days ago
1  
+1 - any chances to automate the discovering part? The task was seemingly to produce *one*(?) sentence. BTW: how much time did you spend ;) –  Wolf 2 days ago
13  
How did you get F**K and S**T provided there is no * in 'ABCDEFGHIJKMLNOPQRSTUVWXYZ '? –  glglgl 2 days ago
1  
@glglgl - letters CHIU redacted. –  Yimin Rong yesterday

Java

Pulls the intro sentence from a random Wikipedia article:

import java.io.InputStream;
import java.net.URL;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class RandomSentence {
    public static void main (String[] args) throws Exception {
        String sentence;
        do {
            InputStream in = new URL("https://en.wikipedia.org/wiki/Special:Random").openStream();
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(in);
            String intro = doc.getElementsByTagName("p").item(0).getTextContent();
            sentence = intro.replaceAll("\\([^(]*\\) *", "").replaceAll("\\[[^\\[]*\\]", "").split("\\.( +[A-Z0-9]|$)")[0];
        } while (sentence.endsWith(":") || sentence.length() < 30 || sentence.contains("?"));
        System.out.println(sentence + ".");
    }
}

Sometimes you get unlucky; I try to minimize this by setting a minimum sentence length and filtering out sentences that end with ":" (all disambiguation pages start that way) or contain a "?" (there seem to be many articles with unresolved unknown info marked by question marks). Sentence boundaries are a period followed by whitespace followed by a number or capital letter.

I also filter out text in parentheses (the result is still a valid sentence) to try and remove some periods that aren't sentence boundaries. I filter out square brackets to remove source citation numbers. Example (5 runs):

  • Idle Cure was an arena rock band from Long Beach, California.
  • Self-focusing is a non-linear optical process induced by the change in refractive index of materials exposed to intense electromagnetic radiation.
  • TB10Cs4H3 is a member of the H/ACA-like class of non-coding RNA molecule that guide the sites of modification of uridines to pseudouridines of substrate RNAs.
  • The Six-headed Wild Ram in Sumerian mythology was one of the Heroes slain by Ninurta, patron god of Lagash, in ancient Iraq.
  • Sugar daddy is a slang term for a man who offers to support a typically younger woman or man after establishing a relationship that is usually sexual.
  • Old Bethel United Methodist Church is located at 222 Calhoun St., Charleston, South Carolina.
  • Douglas Geers is an American composer.

If you notice any grammar issues, well, that's your fault for not being a diligent Wikipedia editor! ;-)
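
The cleaning and filtering steps can be tried offline, without hitting Wikipedia. The following is a rough Python sketch that reuses the three regular expressions from the Java code; the function name `first_sentence`, the `min_length` parameter and the sample text (including its parenthetical and second sentence) are invented for illustration.

```python
import re


def first_sentence(intro, min_length=30):
    """Return the cleaned first sentence of `intro`, or None if it is rejected."""
    text = re.sub(r"\([^(]*\) *", "", intro)  # drop parenthesised asides
    text = re.sub(r"\[[^\[]*\]", "", text)    # drop citation markers like [1]
    # A boundary is a period followed by spaces and a capital or digit, or by the end.
    sentence = re.split(r"\.(?: +[A-Z0-9]|$)", text)[0]
    if sentence.endswith(":") or len(sentence) < min_length or "?" in sentence:
        return None  # the Java loop would fetch another random article here
    return sentence + "."


# Made-up input: only the core sentence comes from the examples above.
sample = ("Douglas Geers (a made-up aside) is an American composer.[1] "
          "This second sentence is invented for the test.")
print(first_sentence(sample))
```

The group in the split pattern is non-capturing because Python's `re.split` would otherwise return the captured text as extra list items; Java's `String.split` has no such behaviour.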

 
I like it! +1 +1 +1 ... –  TheDoctor 2 days ago
2  
There is definitely a difference between "valid" and "understandable". I've got some substrate RNA pseudouridines for you right here, baby. –  Jason C 2 days ago
 
Also - mostly - of good quality :) –  Wolf 2 days ago
1  
+1 Six-headed Wild Ram, Sugar daddy! –  David Conrad 6 hours ago
 
I was so psyched when sugar daddy popped up! –  Jason C 4 hours ago

C

char*strerror(),i;main(){for(;--i;)puts(strerror(i));}

Example output:

Software caused connection abort
Interrupted system call should be restarted

There are also plenty of valid sentences output that do not have a subject, verb and object:

Timer expired
File exists


PHP + Project Gutenberg

I wrote a PHP script that turns a plain text document into a set of word bigrams, which it then uses to generate random sentences. Here are some of the better examples it generated from the entire plain text version of Patrick Henry's "Give Me Liberty Or Give Me Death" speech, including the Project Gutenberg small print:

  • The Project Gutenberg Etext of nations, and slavery!

  • We apologize for the 200th anniversary of this Small Print!

  • YOU DON'T HAVE NO OTHER WARRANTIES OF ANY KIND, EXPRESS OR INCIDENTAL DAMAGES, But for me, death!

You can try it out for yourself here. Refresh the page for a new batch of sentences.

If you want to run the source code yourself, don't forget to load $src_text with your chosen plain text.

<html>
<head>
<title>Give Me Liberty Or Give Me Death</title>
<style>
body { margin:4em 6em; text-align:center; background-color:#feb; }
h1 { font-weight:normal; font-size:2em; margin-bottom:2em; }
blockquote { font-style:italic; }
</style>
</head>
<body>
<h1>A collection of quotes randomly generated from Patrick Henry's speech
<a href="http://www.gutenberg.org/ebooks/6">Give Me Liberty Or Give Me Death</a>
(and its accompanying Project Gutenberg blurb).</h1>
<?php

/* Give Me Liberty Or Give Me Death */
/* Plain text available from http://www.gutenberg.org/ebooks/6 */
$src_text = file_get_contents('libertyordeath.txt');

$bigrams = array();
$openers = array();
$loc = 0;
$new_sentence = true;
$last = false;
while (preg_match('/\'?\w+[^\s\[\]\*\(\)"#@]*/',$src_text,$matches,PREG_OFFSET_CAPTURE,$loc)) {
  $w = $matches[0][0];
  $loc = $matches[0][1]+strlen($w);
  $bareword = preg_replace('/\W/','',$w);
  if ($last) {
    if (!isset($bigrams[$last][$w])) $bigrams[$last][$w] = 1;
    else $bigrams[$last][$w]++;
  }
  if (!isset($bigrams[$bareword])) $bigrams[$bareword] = array();
  $last = $bareword;
  if ($new_sentence && preg_match('/^[A-Z]/',$w)) {
    if (!isset($openers[$w])) $openers[$w] = 1;
    else $openers[$w]++;
    $new_sentence = false;
  }
  if (ends_sentence($w)) {
    $new_sentence = true;
    $last = false;
  }
}

/* Now generate ten random sentences */

for ($ns=0; $ns<10; $ns++) {

  echo "<blockquote><p>";

  /* Choose a starting word */

  $sum = 0;
  foreach ($openers as $w=>$c) $sum += $c;
  $r = mt_rand(0,$sum);
  foreach ($openers as $w=>$c) {
    $r -= $c;
    if ($r<=0) break;
  }

  /* Barf out additional words until end of sentence reached */

  while(1) {
    echo "$w ";
    if (ends_sentence($w)) break;
    $bareword = preg_replace('/\W/','',$w);
    $sum = 0;
    foreach ($bigrams[$bareword] as $w=>$c) $sum += $c;
    $r = mt_rand(0,$sum);
    foreach ($bigrams[$bareword] as $w=>$c) {
      $r -= $c;
      if ($r<=0) break;
    }
  }

  echo "</p></blockquote>\n";
}

function ends_sentence($w) {
  if (!preg_match('/[\.\?!]$/',$w)) return false;
  if (preg_match('/^(\w|St|Mr|Ms|Mrs|Messrs|i\.e|e\.g|etc|Rd)\./i',$w)) return false;
  return true;
}

?>
</body>
</html>
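
The core of the bigram approach fits in a few lines. Below is a simplified Python sketch, not a port of the PHP: it keeps punctuation attached to words, treats "The" and "the" as different keys, and uses a tiny made-up corpus (`TOY_TEXT`) instead of the speech. The function names are hypothetical. Storing every observed follower in a list, duplicates included, gives the same frequency weighting as the PHP counters.

```python
import random
from collections import defaultdict

# Toy corpus invented for this sketch; swap in any plain text.
TOY_TEXT = "The cat saw the dog. The dog saw a bird! A bird saw the cat."
ENDINGS = ".?!"


def build_model(text):
    """Return (openers, followers): sentence-initial words and word -> next words."""
    openers, followers = [], defaultdict(list)
    previous = None
    for word in text.split():
        if previous is None:
            openers.append(word)              # first word of a sentence
        else:
            followers[previous].append(word)  # one more observed bigram
        previous = None if word[-1] in ENDINGS else word
    return openers, followers


def generate(openers, followers, rng=random, max_words=50):
    """Random-walk the bigram table until a word ends the sentence."""
    word = rng.choice(openers)
    words = [word]
    # max_words is a safety cap in case the table contains a cycle.
    while word[-1] not in ENDINGS and followers[word] and len(words) < max_words:
        word = rng.choice(followers[word])
        words.append(word)
    return " ".join(words)


openers, followers = build_model(TOY_TEXT)
print(generate(openers, followers))
```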
 
+10 This one really nails the spirit of the challenge! I can't find it now but there used to be an online Google-based sentence generator that worked in a similar way, but the bigrams (or optionally larger n-grams) were derived from Google search results by searching for a word and observing what followed it in the search result preview snippets. Maybe I will recreate it and post it here. –  Jason C yesterday
1  
+1 for your third example. –  Brian Minton yesterday

Bash

Inspired by the Matlab answer. Assumes you have aptitude installed.

r=$[ RANDOM % 7 ]
a=''
for i in `seq $r`; do a=$a'v'; done
if [ $r -ne 0 ]; then a='-'$a; fi
aptitude $a moo

Possible outputs (screenshot from this wikipedia article)

enter image description here

1  
I don't think . /----\ -------/ \ / \ / | -----------------/ --------\ ---------------------------------------------- is a valid sentence. –  svick 13 hours ago
 
@svick you win can be a sentence (the object "the argument" is implied). And even if it isn't, the question does not forbid cases where the output isn't valid. –  ace 13 hours ago
 
+1 for the last sentence :)))) –  Songo 11 hours ago

Soooo... Since this is a popularity-contest, I had some fun with eval and with functions. Basically, I generate a random number and then use eval to call a function chosen by that number (in your face, switch!).

PHP, ~9k valid outputs

<?php

//Subjects
function s1(){ echo "I "; $m = rand(1,20); eval ("v".$m."(0);");}
function s2(){ echo "You "; $m = rand(1,20); eval ("v".$m."(0);");}
function s3(){ echo "He "; $m = rand(1,20); eval ("v".$m."(1);");}
function s4(){ echo "She "; $m = rand(1,20); eval ("v".$m."(1);");}
function s5(){ echo "We "; $m = rand(1,20); eval ("v".$m."(0);");}
function s6(){ echo "They "; $m = rand(1,20); eval ("v".$m."(0);");}

//Verbs
function v1($n){ echo "want"; if($n==1)echo"s"; echo " to "; $z = rand(1,10); eval ("a".$z."();");}
function v2($n){ echo "need"; if($n==1)echo"s"; echo " to "; $z = rand(1,10); eval ("a".$z."();");}
function v3($n){ echo "ha"; if($n==1){echo"s";}else{echo"ve";} echo " to "; $z = rand(1,10); eval ("a".$z."();");}
function v4($n){ echo "wanted to "; $z = rand(1,10); eval ("a".$z."();");}
function v5($n){ echo "needed to "; $z = rand(1,10); eval ("a".$z."();");}
function v6($n){ echo "had to "; $z = rand(1,10); eval ("a".$z."();");}
function v7($n){ echo "eat"; if($n==1)echo"s"; echo " "; $w = rand(1,20); eval ("o".$w."();");}
function v8($n){ echo "think"; if($n==1)echo"s"; echo " about "; $w = rand(1,20); eval ("o".$w."();");}
function v9($n){ echo "ate "; $w = rand(1,20); eval ("o".$w."();");}
function v10($n){ echo "thought about "; $w = rand(1,20); eval ("o".$w."();");}
function v11($n){ echo "draw"; if($n==1)echo"s"; echo " "; $w = rand(1,20); eval ("o".$w."();");}
function v12($n){ echo "drew "; $w = rand(1,20); eval ("o".$w."();");}
function v13($n){ echo "smell"; if($n==1)echo"s"; echo " like "; $w = rand(1,20); eval ("o".$w."();");}
function v14($n){ echo "shot "; $w = rand(1,20); eval ("o".$w."();");}
function v15($n){ echo "destroy"; if($n==1)echo"s"; echo " "; $w = rand(1,20); eval ("o".$w."();");}
function v16($n){ echo "destroyed "; $w = rand(1,20); eval ("o".$w."();");}
function v17($n){ echo "melt"; if($n==1)echo"s"; echo " "; $w = rand(1,20); eval ("o".$w."();");}
function v18($n){ echo "saw "; $w = rand(1,20); eval ("o".$w."();");}
function v19($n){ echo "ha"; if($n==1){echo"s";}else{echo"ve";} echo " "; $w = rand(1,20); eval ("o".$w."();");}
function v20($n){ echo "had "; $w = rand(1,20); eval ("o".$w."();");}

//Auxiliaries
function a1(){ echo "punch "; $w = rand(1,20); eval ("o".$w."();");}
function a2(){ echo "drive "; $w = rand(1,20); eval ("o".$w."();");}
function a3(){ echo "mount "; $w = rand(1,20); eval ("o".$w."();");}
function a4(){ echo "see "; $w = rand(1,20); eval ("o".$w."();");}
function a5(){ echo "have "; $w = rand(1,20); eval ("o".$w."();");}
function a6(){ echo "eat "; $w = rand(1,20); eval ("o".$w."();");}
function a7(){ echo "stun "; $w = rand(1,20); eval ("o".$w."();");}
function a8(){ echo "kiss "; $w = rand(1,20); eval ("o".$w."();");}
function a9(){ echo "Ted "; $w = rand(1,20); eval ("o".$w."();");} //See "How I met Your Mother" for further informations :)
function a10(){ echo "blow "; $w = rand(1,20); eval ("o".$w."();");}

//Objects
function o1(){ echo "a cow!<br>";}
function o2(){ echo "a meatball!<br>";} 
function o3(){ echo "a car!<br>";} 
function o4(){ echo "shoes!<br>";} 
function o5(){ echo "pigs!<br>";} 
function o6(){ echo "a telephone!<br>";} 
function o7(){ echo "some bottles of water!<br>";} 
function o8(){ echo "a laptop!<br>";} 
function o9(){ echo "my shorts!<br>";} //Quote needed
function o10(){ echo "anchovies!<br>";}
function o11(){ echo "an alarm clock!<br>";}
function o12(){ echo "every second!<br>";}
function o13(){ echo "until the end!<br>";}
function o14(){ echo "sitting!<br>";}
function o15(){ echo "a sword!<br>";}
function o16(){ echo "fire!<br>";}
function o17(){ echo "the dust!<br>";}
function o18(){ echo "in the bedroom!<br>";}
function o19(){ echo "a poor ant!<br>";}
function o20(){ echo "a pencil!<br>";}

//Testing
$n = rand(1,6); eval ("s".$n."();");
$n = rand(1,6); eval ("s".$n."();");
$n = rand(1,6); eval ("s".$n."();");
$n = rand(1,6); eval ("s".$n."();");

?>

Some outputs...

She draws a sword!
They thought about sitting!
You eat my shorts!
He wanted to Ted a cow!
You want to mount a poor ant!
She smells like anchovies!
He wanted to have shoes!
They wanted to see a pencil!
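
The same subject-verb agreement can be had without eval. Here is a minimal Python sketch, assuming plain lookup tables in place of the generated function names; it uses only a small subset of the word lists above, and the names `SUBJECTS`, `VERBS`, `OBJECTS` and `sentence` are made up for the example.

```python
import random

# (subject, is_third_person_singular), mirroring the 0/1 flag passed to v1..v20
SUBJECTS = [("I", False), ("You", False), ("He", True), ("She", True),
            ("We", False), ("They", False)]
# (base form, third-person-singular form); a subset of the PHP verb list
VERBS = [("eat", "eats"), ("draw", "draws"), ("have", "has"), ("destroy", "destroys")]
OBJECTS = ["a cow", "a sword", "my shorts", "anchovies"]


def sentence(rng=random):
    """Pick a subject, conjugate a random verb to match it, and add an object."""
    subject, third = rng.choice(SUBJECTS)
    base, third_form = rng.choice(VERBS)
    verb = third_form if third else base
    return f"{subject} {verb} {rng.choice(OBJECTS)}!"


for _ in range(4):
    print(sentence())
```

Keeping both verb forms in the table handles irregular cases such as have/has, which the PHP covers with a separate branch.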
1  
+1 for "eat my shorts" –  ace 20 hours ago
 
I'd use PHP_EOL to be compatible with both web and CLI. –  nyuszika7h 18 hours ago

Python

This entry selects words from the whole system dictionary. It takes advantage of the fact that you can make most nouns into verbs and vice versa, and it uses a few heuristics to avoid obvious impossibilities.

It produces a few nearly sane statements:

The snigger westernizes the bacteriologist.
A drizzle stoked the sentiments.

Many insane ones:

Tipper's orthopaedic knitwear plates a payroll.
A fibula teletypewritered a yogi.
The protozoan's spiralling skydive coats this veterinarian

And a lot of stuff that sounds like Monty Python making lewd innuendos:

That rolling indictment tarries some bang's bulge.
Some inflammatory tush's intermarriage sextants some postman.
Some pentagon's manufacturer squeaked the wolverine.
A disagreeable participant is entertaining my optimized spoonful.

Code:

import random
import string

words = open("/usr/share/dict/words").readlines()
articles=('','a ','the ','some ','this ','that ','my ')
pl_articles=('','some ','those ','many ','the ')

possesive=False
modifiers=0

def getword():
    global possesive
    while True:
        w=words[random.randrange(len(words))].rstrip()
        if w[0] in string.ascii_uppercase: continue
        if "'" in w:
            if not possesive: possesive = True
            else: continue
        print w
        return w

def is_modifier(w):
    return ("'" in w or
        w[-2:] in ('ry','ed','er','ic','al')  or
        w[-3:] in ('ing','est','ble','ous') or
        w[-4:] in ('less','ical') )

def is_verb(w):
    return (
        w[-2:] in ('ed',) or
        w[-3:] in ('ing','ize') )


obstr=''
while True:
    w=getword()
    if is_modifier(w):
        if modifiers<2:
            obstr+=w+' '
            modifiers+=1
        else: continue
    elif w[-2:]=='ly': continue
    else:
        if w[-1] == 's': continue
        art = articles[random.randrange(len(articles))]
        obstr= obstr+w+' '
        if art == 'a ' and obstr[0] in 'aeiou': art='an '
        obstr= string.capwords(art+obstr,'.')
        break

verbstr=''
while True:
    w=getword()
    if "'" in w: continue
    if w[-4:]=="ness": continue
    if w[-2:]=='ly': verbstr+=w+' '
    elif w[-3:]=='ing':
        verbstr+='is '+w+' '
        break
    elif is_verb(w):
        verbstr= verbstr+w+' '
        break
    elif is_modifier(w): continue
    else:
        if w[-1] != 's':
            w=w+'ed' if w[-1]!='e' else w+'d'
        verbstr= verbstr+w+' '
        break

substr=''
while True:
    w=getword()
    if is_modifier(w):
        if  modifiers<2:
            substr+=w+' '
            modifiers+=1
        else: continue
    elif w[-2:]=='ly': continue
    else:
        substr = substr+w
        if w[-1] == 's':
            art = pl_articles[random.randrange(len(pl_articles))] 
        else:
            art = articles[random.randrange(len(articles))] 
            if art == 'a ' and substr[0] in 'aeiou': art='an '
        substr= art+substr
        break

print obstr+verbstr+substr+'.'
 
The example sentences are making me laugh so hard, I'm crying! xD –  mikhailcazi 1 hour ago

Python

import this


The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!
5  
Could you argue that import antigravity leads to the output I LEARNED IT LAST NIGHT! EVERYTHING IS SO SIMPLE!? :D –  ace yesterday
 
Undoubtedly, yes. –  user3058846 yesterday

Playing with the Mathematica internal dictionary:

res = {};
SeedRandom[42 + 1];
Do[
  (While[
    If[(c = Flatten@WordData[RandomChoice[WordData[All]], "Examples"][[All, 2]]) != {},
     StringPosition[(c1 = RandomChoice@c), "'" | "-" | "\\" | "`"] != {}, True, True]];
   sp = ToLowerCase /@ StringSplit[c1, (WhitespaceCharacter .. | ",")];
   toChange = RandomSample[Range@#, RandomInteger[IntegerPart[{#/2, #}]]] &@Length@sp;
   If[StringPosition[ToString@WordData[sp[[#]], "Definitions"],  "WordData"] == {}, 
    sp[[#]] = RandomChoice@ WordData[All, RandomChoice@WordData[sp[[#]], "PartsOfSpeech"]])]
             & /@ toChange;
   AppendTo[res, StringJoin@Riffle[sp, " "]];)
  ,
  {10}];
res

You get lucky, say, 70% of the time. It generates things like:

a amygdaloid electric circuit
yonder Parkia was unrestrictive though ragged
his longanimous society
Doctor of Education unintelligible reply to kibbutz
little musical theater against Julius Caesar
an Tai nuthatch
mow down in sportive center contra thy niggardliness
the required extrinsic detergents
sans necromantic sorcerer
these vena pectoralis opposite mine latria trophy wife trend-setting investors brown
what man-portable field of fire
umbra charmingly whereunto my answer
another screw-loose debris storm scentless aslant Aral Sea complex waffle
for professed delight mongoloid type metal

but sometimes:

mine adoption pro least battle of Lutzen would cash draw in during whiles Hejira of the cleaver
nine common shiner subduction genus Seiurus heartwarming her audience

Oh well, its use of English is better than mine.


Python

As you know, you can do anything in Python with a few imports. This simple task can be accomplished with this two-line Python script.

import random

print ("I like the number "+str(random.uniform(0,1)))

The number of sentences generated by this script is quite huge: 10^12 different sentences. If reading a sentence takes you ~0.5 sec, then reading them all will take more than 15000 years!
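
The reading-time estimate is easy to check. A quick sketch, taking the 10^12 count and the 0.5 s reading speed from the paragraph above at face value and assuming a 365.25-day year:

```python
SENTENCES = 10**12           # count quoted above
SECONDS_PER_SENTENCE = 0.5   # reading speed quoted above
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # assumption: Julian year

years = SENTENCES * SECONDS_PER_SENTENCE / SECONDS_PER_YEAR
print(f"{years:,.0f} years")
```

That comes to roughly 15,800 years, which agrees with the "more than 15000 years" figure.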

Some sample sentences:

  • I like the number 0.444371877853
  • I like the number 0.358614422548

Yet all the generated sentences contain a subject, a verb and an object.


Prolog

Use Prolog's backtracking and a generative grammar approximating English grammar to generate all possible sentences.

This version has a fairly limited vocabulary and sentence structure, but it should be pretty easy to extend.

The code:

% Define the vocabulary
verb(V) :- V = 'eats' | V = 'fights' | V = 'finds'.
subj_pronoun(P) :- P = 'he' | P = 'she' | P = 'it'.
obj_pronoun(P) :- P = 'him' | P = 'her' | P = 'it'.
name(N) :- N = 'alice' | N = 'bob'.
noun(N) :- N = 'cat' | N = 'door' | N = 'pen'.
article(H) :- H = 'the' | H = 'a'.

% Grammar
subject_phrase_short(H) :- subj_pronoun(H)
                         | name(H).
% Subordinate clause. Don't use verb_phrase here to avoid recursive clauses.
sub_clause([Which, Verb|T], Rest) :- Which = 'which', verb(Verb),
                                     object_noun_phrase_short(T, Rest).
subject_phrase([H|T], Rest) :- subject_phrase_short(H), Rest = T.
object_noun_phrase_short([A, N | T], Rest) :- article(A), noun(N), Rest = T
                                            | obj_pronoun(A), Rest = [N|T].
object_phrase(L, Rest) :- object_noun_phrase_short(L, Rest)
                        | object_noun_phrase_short(L, Rest1), sub_clause(Rest1, Rest).
verb_phrase([H|T], Rest) :- verb(H), object_phrase(T, Rest).
sentence(S) :- subject_phrase(S, Rest), verb_phrase(Rest, []).

Run this query:

sentence(L).

to generate all possible sentences in this language.

Some sample outputs:

L = [he, eats, the, cat] ;
L = [she, finds, a, door] ;
L = [alice, fights, the, door] ;
L = [he, fights, the, cat, which, eats, the, pen] ;
L = [alice, eats, him, which, finds, the, cat] ;

(EDIT: Allow object subordinate clauses).
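
To see how large this language is, the same grammar can be enumerated in Python with generators standing in for backtracking. This is a sketch under the assumption that it mirrors the rules above faithfully; the function names are made up, and the order of results differs from what Prolog prints.

```python
from itertools import product

VERBS = ["eats", "fights", "finds"]
SUBJ_PRONOUNS = ["he", "she", "it"]
OBJ_PRONOUNS = ["him", "her", "it"]
NAMES = ["alice", "bob"]
NOUNS = ["cat", "door", "pen"]
ARTICLES = ["the", "a"]


def short_objects():
    """article + noun, or a bare object pronoun."""
    for article, noun in product(ARTICLES, NOUNS):
        yield [article, noun]
    for pronoun in OBJ_PRONOUNS:
        yield [pronoun]


def object_phrases():
    """A short object, optionally followed by one 'which <verb> <object>' clause."""
    for obj in short_objects():
        yield obj
        for verb, inner in product(VERBS, short_objects()):
            yield obj + ["which", verb] + inner


def sentences():
    """subject, verb, object phrase: every combination the grammar allows."""
    for subject, verb, obj in product(SUBJ_PRONOUNS + NAMES, VERBS, object_phrases()):
        yield [subject, verb] + obj


all_sentences = list(sentences())
print(len(all_sentences))
```

The count works out as 5 subjects × 3 verbs × (9 short objects + 9 × 3 × 9 objects with a clause) = 3780 sentences.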

 
Any example sentence outputs? –  TheDoctor 18 hours ago

VBA/Excel

[edit 2]

Have taught it how to conjugate verbs, examples below are simple past tense:

The moderate wild cocaine slid abreast of the historic instant decision. The regional safe chapter snapped inside of the numerous random entity. The yellow right domain removed behind the magnetic fragile gender. The physical fatal pollution began past the dead poor sensation. The cognitive brave theater went to the front of the fragile aware literature. The conventional actual output resisted away from the favorite immune site. The fixed economic twin recognized out of the evil human necessity.

The relevant code follows, excluding a bunch of boring ancillary parsing and looping functions. The main parts that are missing are the various word lists (by parts of speech) which do pluralization, tenses, conjugations, etc.

All of the word roots are picked randomly, but I force them to be arranged in a particular sentence pattern:

Debug.Print getWords("ad adj adj nns vpa1s pl ad adj adj nns")

... which is what I used to generate the output above. It follows the general form of, "The quick red fox jumped over the lazy brown dog."

Function getWords(strStruc As String) As String
    Dim i As Long
    Dim s As Long
    Dim strIn As String
    Dim strOut As String

    getWords = ""
    s = numElements(strStruc)
    For i = 1 To s
        strIn = parsePattern(strStruc, i)
        Select Case strIn
            Case ",", ";", ":", """" 'punctuation
                strOut = strIn
                getWords = Trim(getWords)
            Case "ai", "ad" 'indefinite article, definite article
                strOut = getArticle(strIn)
            Case "adj" 'adjective
                strOut = getWord("adj", 1)
            Case "nns" 'noun nominative singular
                strOut = getWord("n", 1)
            Case "nnp" 'noun nominative plural
                strOut = getWord("n", 2)
            Case "nps" 'noun possessive singular
                strOut = getWord("n", 3)
            Case "npp" 'noun possessive plural
                strOut = getWord("n", 4)
            Case "vpr1s" 'Present 1st Person Singular
                strOut = getWord("v", 1)
            Case "vpr2s" 'Present 2nd Person Singular
                strOut = getWord("v", 2)
            Case "vpr3s" 'Present 3rd Person Singular
                strOut = getWord("v", 3)
            Case "vi" 'Infinitive
                strOut = getWord("v", 4)
            Case "vpp" 'Present Participle
                strOut = getWord("v", 5)
            Case "vi" 'Imperative/Subjunctive
                strOut = getWord("v", 6)
            Case "vpa1s" 'Past Tense First Person
                strOut = getWord("v", 7)
            Case "vpa2s" 'Past Tense Second Person
                strOut = getWord("v", 8)
            Case "vpa3s" 'Past Tense Third Person
                strOut = getWord("v", 9)
            Case "vppr1s" 'Present Progressive First Person Singular
                strOut = getWord("v", 10)
            Case "vppr2s" 'Present Progressive Second Person Singular
                strOut = getWord("v", 11)
            Case "vppr3s" 'Present Progressive Third Person Singular
                strOut = getWord("v", 12)
            Case "vppe1s" 'Present Perfect First Person Singular
                strOut = getWord("v", 13)
            Case "vppe2s" 'Present Perfect Second Person Singular
                strOut = getWord("v", 14)
            Case "vppe3s" 'Present Perfect Third Person Singular
                strOut = getWord("v", 15)
            Case "vi1s" 'Imperfect First Person Singular
                strOut = getWord("v", 16)
            Case "vi2s" 'Imperfect Second Person Singular
                strOut = getWord("v", 17)
            Case "vi3s" 'Imperfect Third Person Singular
                strOut = getWord("v", 18)
            Case "vsf" 'Simple Future
                strOut = getWord("v", 19)
            Case "vfp" 'Future Progressive
                strOut = getWord("v", 20)
            Case "vc" 'Conditional
                strOut = getWord("v", 21)
            Case "vcp" 'Conditional Perfect
                strOut = getWord("v", 22)
            Case "vci" 'Conditional Imperfect
                strOut = getWord("v", 23)
            Case "pl" 'location prepositions
                strOut = getWord("pl", 1)
        End Select
        getWords = getWords & strOut & " "
    Next i
End Function
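
Since the word lists are not shown, here is a small Python sketch of the pattern-filling idea only. Everything in `LEXICON` is a hypothetical stand-in for the spreadsheet data, it covers just the five codes used in the pattern above, and it skips the punctuation handling that the VBA does.

```python
import random

# Hypothetical word lists keyed by the pattern codes; not the author's data.
LEXICON = {
    "ad": ["the"],                           # definite article
    "adj": ["quick", "lazy", "brown", "red"],
    "nns": ["fox", "dog", "theater"],        # noun, nominative singular
    "vpa1s": ["jumped", "slid", "went"],     # simple past
    "pl": ["over", "behind", "inside of"],   # location prepositions
}


def get_words(pattern, rng=random):
    """Fill each space-separated slot code with a random word from its list."""
    words = [rng.choice(LEXICON[code]) for code in pattern.split()]
    text = " ".join(words)
    return text[0].upper() + text[1:] + "."


print(get_words("ad adj adj nns vpa1s pl ad adj adj nns"))
```

A dictionary lookup replaces the long Select Case here, so adding a new slot type only needs a new key.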

[begin original post]

Still a work in progress, need to add logic for tenses and noun/verb pluralization, viz.:

Your average travel our supposed dose nor a temperature boost beyond my tomato.

... which is parsable, but doesn't make much sense.

The programming enable their dirty fisherman far our pork cast instead no sentence.

Right. Not really a sentence, but better than some JavaScript error messages.

His appeal lift every live question that my lady outline top her English.

The innuendo routine is almost top-notch tho' ...

Code to follow anon. Does this contest have a deadline?

[edit 1]

Code that generated the above.

Function getWord(sht As Worksheet) As String
    Dim i As Long
    Dim freq As Long
    Dim c As Long
    Dim f As Double
    Dim fSum As Double

    c = 4
    fSum = WorksheetFunction.Count(sht.Columns(c))
    f = Rnd() * fSum
    i = 2
    Do
        If i >= f Then Exit Do
        i = i + 1
    Loop
    getWord = sht.Cells(i, 1).Value
End Function
Function PCase(str As String) As String
    PCase = UCase(Left(str, 1)) & Right(str, Len(str) - 1)
End Function
Sub doMakeSentences01()
    Dim shtIn As Worksheet
    Dim shtOut As Worksheet
    Dim strSheet As String
    Dim rIn As Long
    Dim rOut As Long
    Dim cFreq As Long
    Dim c As Long
    Dim strPattern As String
    Dim w As Long
    Dim strOut As String
    Dim strIn As String
    Dim strWord As String

    cFreq = 4
    Set shtOut = Sheets("Output")
    rOut = shtOut.Range("A65536").End(xlUp).Row + 1

    strPattern = "anvajncanvian"
    For rOut = rOut To rOut + 1000
        strOut = ""
        For w = 1 To Len(strPattern)
            Set shtIn = Sheets(Mid(strPattern, w, 1))
            strWord = getWord(shtIn)
            If w = 1 Then strWord = PCase(strWord)
            strOut = strOut & strWord & " "
        Next w
        strOut = Trim(strOut) & "."
        shtOut.Cells(rOut, 1).Value = strOut
    Next rOut
End Sub
5  
Where is your code? –  ace 2 days ago
 
See my edit for the code. –  Brandon R. Gates yesterday

In Python:

import random
l = ['Buffalo ']
while random.randint(0,5) > 0:
    l.append('buffalo ')
print "".join(l)

Samples:

  • Buffalo buffalo buffalo
  • Buffalo buffalo buffalo buffalo buffalo buffalo buffalo buffalo

Unfortunately, it has poor handling of punctuation and capitalization, but then again those weren't listed as requirements.

Also, here is a reference.


Shell Scripting

This script displays the title of the question currently at the top of this site's front page. My assumption is that a question title will always be human-readable. The output changes dynamically: whenever a new question is posted, running the script gives the latest question title.

curl "codegolf.stackexchange.com" -s |  w3m -dump -T text/html > foo.txt
awk 'f;/more tags/{f=1}' foo.txt > foo1.txt
sed '8q;d' foo1.txt
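
The text-extraction step of that pipeline (find the first line containing "more tags", then take the 8th line after it) could be sketched in Python roughly as below. This is only an illustration working on text that has already been dumped; the function name `line_after_marker` and the sample text are made up for the example, and the offset of 8 simply mirrors the `sed '8q;d'` call, so it depends on the page layout at the time.

```python
def line_after_marker(text, marker="more tags", offset=8):
    """Return the line `offset` lines after the first line containing `marker`.

    Returns None if the marker is missing or the text is too short.
    """
    lines = text.splitlines()
    for idx, line in enumerate(lines):
        if marker in line:
            target = idx + offset
            return lines[target].strip() if target < len(lines) else None
    return None


if __name__ == "__main__":
    # Made-up stand-in for the w3m dump of the front page.
    dump = "\n".join(
        ["header", "foo more tags bar"]
        + ["filler %d" % i for i in range(1, 8)]
        + ["  Generate an understandable sentence", "footer"]
    )
    print(line_after_marker(dump))
```

Fetching the page itself is left out on purpose, so the sketch runs without network access.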

Trial 1 output

Find words containing every vowel

Trial 2 output

Hello World 0.0!

EDIT

To avoid using any files, I can use the script below.

value1=$(curl "codegolf.stackexchange.com" -s |  w3m -dump -T text/html)
echo "$value1" | grep -A 8 "more tags" | tail -1

Output

Generate an understandable sentence
1  
nor is reading them directly from a file... –  rafaelcastrocouto yesterday
 
I have made the changes to not use a file. Now, it just used the variables. How about this one? –  Ramesh yesterday
1  
removed down vote! –  rafaelcastrocouto yesterday

Bash

Trying to run a program that exists but is not installed gives this (in Linux Mint 13).

$ say
The program 'say' is currently not installed.  To run 'say' please ask your administrator to install the package 'gnustep-gui-runtime'

JavaScript (ES6)

var t='';for(f of [_=>foo,_=>null.a,_=>0..toString(0)])try{f()}catch(e){t+=e.message+'\n';}t

Running it in the console produces

foo is not defined
null has no properties
radix must be an integer at least 2 and no greater than 36
 
Even shorter: t='';for(f of [_=>foo,_=>null.a,_=>0..toString(0)])try{f()}catch(e){t+=e.message+'\n'}t –  toothbrush 20 hours ago

A work in progress using JSoup and simpleNLG

import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

import simplenlg.framework.NLGFactory;
import simplenlg.lexicon.Lexicon;
import simplenlg.phrasespec.SPhraseSpec;
import simplenlg.realiser.english.Realiser;

/**
 * Scrapes words from Wiktionary then assembles them into sentences
 * 
 * @author pureferret
 *
 */
public class SentenceBuilder {
    static ArrayList<String> ListOfWordTypes= new ArrayList<>(Arrays.asList("Noun","Verb","Adjective","Adverb","Proper noun","Conjunction"));
    private static String RandomWiktWord ="http://toolserver.org/~hippietrail/randompage.fcgi?langname=English";  
    /**
     * @param args
     */
    public static void main(String[] args) {
        Lexicon lexicon = Lexicon.getDefaultLexicon();
        NLGFactory nlgFactory = new NLGFactory(lexicon);
        Realiser realiser = new Realiser(lexicon);

        ArrayList<String> nounList = new ArrayList<String>();
        ArrayList<String> verbList = new ArrayList<String>();
        ArrayList<String> adjeList = new ArrayList<String>();
        ArrayList<String> adveList = new ArrayList<String>();
        ArrayList<String> pnouList = new ArrayList<String>();
        ArrayList<String> conjList = new ArrayList<String>();


        String word= null;
        String wordType = null;

        try {
            newDoc:
            while( nounList.size()<1 ||
                    verbList.size()<1 ||
//                  adjeList.size()<2 ||
//                  adveList.size()<2 ||
                    pnouList.size()<1){
                Document doc = Jsoup.connect(RandomWiktWord).get();
                Element bodyElem = doc.body();
                word = bodyElem.select("h1>span[dir=auto]").get(0).ownText();
                int wtIdx = 0;
                while(wtIdx<bodyElem.select("div#mw-content-text span.mw-headline").size()){
                    wordType = bodyElem.select("div#mw-content-text span.mw-headline").get(wtIdx).id()
                            .replace("_", " ");
                    wtIdx++;
                    switch (wordType) {
                    case "Proper noun":
                        pnouList.add(word);
                        continue newDoc;
                    case "Noun":
                        nounList.add(word);
                        continue newDoc;
                    case "Verb":
                        verbList.add(word);
                        continue newDoc;
                    case "Adjective":
                        adjeList.add(word);
                        continue newDoc;
                    case "Adverb":
                        adveList.add(word);
                        continue newDoc;
                    case "Conjunction":
                        conjList .add(word);
                        continue newDoc;
                    default:
                        break;
                    }
                }
            }
                SPhraseSpec p = nlgFactory.createClause();
                p.setSubject(pnouList.get(0));
                p.setVerb(verbList.get(0));
                p.setObject(nounList.get(0));

                String output2 = realiser.realiseSentence(p); // Realiser created earlier.
                System.out.println(output2);

        } catch (IOException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
            System.err.println(word + " is a " + wordType);
        } catch (IndexOutOfBoundsException e) {
            e.printStackTrace();
            System.err.println(word + " is a " + wordType);
        }
    }

}

Issues:

  • Sentences are too simple
  • Occasionally 404s (without good handling!)
  • Only generates one sentence at a time
  • Uses a switch case!

Sample outputs:

Popoloca prickethes runner beans.
Tropic of Capricorn beams up bodles.
Beijing synonymiseds pillow boxes.
Chukchis enculturateds influencing.


Perl 5

OK, the guts of the program is just this:

use v5.14;
my %pad = (
    ...
);
sub pad { shift =~ s(\{(.+?)\}){pad($pad{$1}[rand(@{$pad{$1}})])}rogue }
say ucfirst pad '{START}';

It's basically a "madlib" engine. To actually generate interesting sentences, you need to populate %pad with some data. Here's an example %pad...

my %pad = (
  START => ['{complex}.'],
  complex => [
    '{simple}',
    '{simple}, and {simple}',
    '{simple}, and {complex}',
    '{simple}, but {simple}',
    '{simple}, yet {simple}',
    'even though everybody knows {simple}, {simple}',
    'not only {simple}, but also {simple}',
  ],
  simple => [
    '{thing} {verb}s {thing}',
    '{thing} {verb}s {adverb}',
    '{thing} is {adjective}',
    '{things} {verb} {thing}',
    '{things} {verb} {adverb}',
    '{things} are {adjective}',
    '{thing} {past_verb} {thing}',
    '{things} {past_verb} {thing}',
  ],
  thing => [
    'the {adjective} gorilla',
    'the {adjective} mailbox',
    'Archbishop Desmond Tutu',
    'the beef salad sandwich',
    'the {adjective} stegosaur',
    'the summit of Mt Everest',
    'Chuck Norris',
    'the cast of television\'s "Glee"',
    'a {adjective} chocolate cake',
  ],
  things => [
    '{adjective} shoes',
    'spider webs',
    'millions of {adjective} eels',
    '{adjective} children',
    '{adjective} monkeys',
    '{things} and {things}',
    'the British crown jewels',
  ],
  verb => [
    'love',
    'hate',
    'eat',
    'drink',
    'follow',
    'worship',
    'respect',
    'reject',
    'welcome',
    'jump',
    'resemble',
    'grow',
    'encourage',
    'capture',
    'fascinate',
  ],
  past_verb => [  # too irregular to derive from {verb}
    'loved',
    'ate',
    'followed',
    'worshipped',
    'welcomed',
    'jumped',
    'made love to',
    'melted',
  ],
  adverb => [
    'greedily',
    'punctually',
    'noisily',
    'gladly',
    'regularly',
  ],
  adjective => [
    'enormous',
    'tiny',
    'haunted',
    'ghostly',
    'sparkling',
    'highly-decorated',
    'foul-smelling',
    '{adjective} (yet {adjective})',
    'expensive',
    'yellow',
    'green',
    'lilac',
    'tall',
    'short',
  ],
);

Here are some samples of the wisdom I've discovered from that %pad. These sentences have not been edited for length, punctuation, grammar, etc., though I have culled some uninteresting ones and rearranged them: they are no longer in the order they were generated; instead I'm trying to use them to tell a story, one I hope you will find both touching and thought-provoking.

  • Spider webs are short.
  • Spider webs fascinate regularly.
  • Short monkeys are sparkling, but spider webs drink greedily.
  • Sparkling (yet foul-smelling) monkeys followed the tiny (yet sparkling) gorilla.
  • The summit of Mt Everest welcomed the highly-decorated stegosaur.
  • Not only the summit of Mt Everest is expensive, but also the cast of television's "Glee" followed the sparkling gorilla.
  • The cast of television's "Glee" resembles the lilac mailbox.
  • The expensive mailbox is tall, and the expensive stegosaur jumps Chuck Norris, yet green shoes jumped the beef salad sandwich.
  • The beef salad sandwich loved Chuck Norris.
  • Millions of sparkling eels are green (yet ghostly).
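
For readers who don't speak Perl, the "madlib" expansion described above can be sketched in Python along these lines. This is a rough equivalent under my own assumptions, not a port of the answer: the names `PAD`, `expand` and `sentence` are invented for the example, and the grammar is a heavily trimmed stand-in for the %pad shown earlier.

```python
import random
import re

# Hypothetical, heavily trimmed grammar in the spirit of %pad.
PAD = {
    "START": ["{simple}."],
    "simple": ["{thing} {verb}s {thing}", "{things} {verb} {thing}"],
    "thing": ["the {adjective} gorilla", "Chuck Norris"],
    "things": ["spider webs", "{adjective} monkeys"],
    "verb": ["love", "follow"],
    "adjective": ["tiny", "sparkling"],
}

PLACEHOLDER = re.compile(r"\{(.+?)\}")


def expand(template, pad=PAD, rng=random):
    """Replace each {name} with a random, recursively expanded entry of pad[name]."""
    return PLACEHOLDER.sub(
        lambda m: expand(rng.choice(pad[m.group(1)]), pad, rng), template
    )


def sentence(pad=PAD, rng=random):
    """Expand {START} and capitalise the first character, like ucfirst."""
    text = expand("{START}", pad, rng)
    return text[:1].upper() + text[1:]


if __name__ == "__main__":
    for _ in range(3):
        print(sentence())
```

Because entries may refer to themselves (as `{things} and {things}` does in the original data), a grammar like this only terminates with probability rather than certainty; the trimmed example avoids self-reference to keep things simple.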

Some meta code in Python

python -c "import urllib2, pprint; pprint.pprint([str(x[:x.find('<')]) for x in unicode(urllib2.urlopen('http://codegolf.stackexchange.com/questions/21571/generate-an-understandable-sentence').read(), 'utf8').split('<p>') if x.find('<') >= 1])"

First few lines of output:

'Generate a sentence that can be read and understood. It must contain a subject, verb, and object, and tenses and plurals must match. The program must also be able to generate several different sentences to qualify.', 'example of outputs:', "[This is one of Matlab's easter eggs]", 'Requirements: linux kernel source installed in /usr/src', 'This pulls random comments out of the kernel source. Whether the sentences are actually ', 'Examples of actual output:', 'Given enough time, this will produce all literature, past, present and future. '


Python

Result:

$ python mksentence.py
infringement lecture attainment
Produce more? (Y/N)y
impeachment recoup ornament
Produce more? (Y/N)y
maladjustment edit discouragement
Produce more? (Y/N)y
embellishment guest punishment
Produce more? (Y/N)y
settlement section escapement
Produce more? (Y/N)y
segment withhold recruitment
Produce more? (Y/N)

I used the word list from the question "Find words containing every vowel".

More rules could be added. For example, if a word ends with "ness" and the same word without that suffix also exists in the set, then it's a noun.
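
A minimal sketch of that "ness" rule, kept separate from the script itself: the helper name `nouns_by_suffix` and the tiny word set are made up for illustration, and the rule is applied literally, so a word like "happiness" (stem "happi") would be missed.

```python
def nouns_by_suffix(words, suffix="ness"):
    """Return the words that end in `suffix` and whose stem is also in `words`."""
    words = set(words)
    return {
        w
        for w in words
        if w.endswith(suffix)
        and len(w) > len(suffix)
        and w[: -len(suffix)] in words
    }


if __name__ == "__main__":
    sample = {"kind", "kindness", "harness", "sad", "sadness"}
    # "harness" is skipped because "har" is not in the word set.
    print(sorted(nouns_by_suffix(sample)))
```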

Source code:

#!/usr/bin/env python
# vim: set fileencoding=utf-8 ts=4 sw=4 tw=72 :

from __future__ import (unicode_literals, absolute_import,
                        division, print_function)

import random                     

if __name__ == "__main__":        
    filename = 'corncob_lowercase.txt'
    noun = set()
    verb = set()
    whole_words_set = {word.rstrip() for word in open(filename)}

    for word in whole_words_set:
        if word.endswith('ment'):
            noun.add(word)
        elif word.endswith('ing'):
            if word[:-3] in whole_words_set:
                verb.add(word[:-3])
            elif word[:-3]+"e" in whole_words_set:
                verb.add(word[:-3]+"e")
    noun_list = list(noun)
    verb_list = list(verb)
    while True:                   
        sentence = "%s %s %s" % (random.choice(noun_list),
                                 random.choice(verb_list),
                                 random.choice(noun_list))
        print(sentence)
        if input("Produce more? (Y/N)").lower() == "n":
            break
1  
Do I really suck at Python and English, or are you outputting 3 nouns instead of 2 nouns and a verb? –  ace 2 days ago
 
@ace Oops, I decided to fix the code at the last minutes :-( –  yegle 2 days ago

PHP

<?php
  $trends = file_get_contents('http://www.google.com/trends/hottrends/widget?pn=p1&tn=30');
  preg_match_all("/widget-title-in-list'>(.+?)</", $trends, $m);

  $q = urlencode($m[1][array_rand($m[1])]);
  $page = file_get_contents("http://www.google.com/search?q=$q&btnI=1");
  preg_match_all('/[A-Z]([\w,]+ ){2,}[\w, ]+?[.!]/', strip_tags($page), $m);

  echo $m[0][array_rand($m[0])];

This fetches the 30 top trending Google searches, picks one at random, performs an "I'm Feeling Lucky" search for it, and then displays a random sentence of at least 3 words from the resulting page.
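
The sentence-picking step can be tried out without any network access. Below is a rough Python sketch that reuses the same regular expression on a text that is already in hand; the names `find_sentences` and `random_sentence` are invented for the example, and the sample text is stitched together from the outputs listed in this answer.

```python
import random
import re

# Same pattern as in the PHP code: a capital letter, at least two tokens of
# word characters/commas each followed by a space, then more text ending in
# '.' or '!'. A non-capturing group is used so findall returns whole matches.
SENTENCE_RE = re.compile(r"[A-Z](?:[\w,]+ ){2,}[\w, ]+?[.!]")


def find_sentences(text):
    """Return every substring of `text` that looks like a sentence."""
    return SENTENCE_RE.findall(text)


def random_sentence(text, rng=random):
    """Return one matching sentence at random, or None if nothing matches."""
    sentences = find_sentences(text)
    return rng.choice(sentences) if sentences else None


if __name__ == "__main__":
    page = (
        "Kate graduated from high school a year early. ok. "
        "This article is about the American basketball player."
    )
    print(random_sentence(page))
```

The pattern only checks shape, not grammar, which is why fragments such as the "April 15, 2014, to promote compliance..." example can slip through.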

Examples:

"She was considered a medal favourite in the event."

"Kate graduated from high school a year early."

"April 15, 2014, to promote compliance with the policy on biographies of living people."

"On behalf of Bryan, we, his family, would like to thank everyone for the outpouring of love, prayers and support."

"This article is about the American basketball player."

"Sorry, your browser either has JavaScript disabled or does not have any supported player."


MS Word

I'm not sure if this is acceptable, but since HTML is, I think this should be acceptable too.

 =rand(1,1)

Sample sentences:

On the Insert tab, the galleries include items that are designed to coordinate with the overall look of your document.

You can use these galleries to insert tables, headers, footers, lists, cover pages, and other document building blocks.

You can also specify any number of sentences and paragraphs.


Yet another Python script

The answer from user3058846 isn't bad, but it displays every sentence, every time. Here, I propose a script that outputs a random sentence from the Zen of Python:

from random import choice
import subprocess
proc = subprocess.Popen("python -c 'import this'", shell=True, stdout=subprocess.PIPE, universal_newlines=True)
# Get the output of proc as text, split it by newline, and drop empty lines
sentences = [x for x in proc.communicate()[0].splitlines() if x]
print(choice(sentences))

In one line, for fans:

from random import choice;import subprocess;print(choice([x for x in subprocess.Popen("python -c 'import this'",shell=True,stdout=subprocess.PIPE,universal_newlines=True).communicate()[0].splitlines() if x]))

(Boooh, dirty.)

Examples:

>>> a()  # <--- a is just the one-liner above
Explicit is better than implicit.
>>> a() 
Although never is often better than *right* now.
>>> a() 
Errors should never pass silently.
>>> a() 
Special cases aren't special enough to break the rules.

Batch

@echo off
for /l %%a in (1,1,%1) do for /f %%b in (wordlist.txt) do set /p "=%%b "<nul

Output

H:\uprof>sentance.bat 1
Technically not hard-coded, mate.
H:\uprof>sentance.bat 2
Technically not hardcoded mate. Technically not hardcoded mate.

Where wordlist.txt contains the following:

Technically
not
hard-coded,
mate.

Using Wikipedia's definition of Hard-coding:

Hard coding (also, hard-coding or hardcoding) refers to the software development practice of embedding what may, perhaps only in retrospect, be regarded as input or configuration data directly into the source code of a program or other executable object, or fixed formatting of the data, instead of obtaining that data from external sources or generating data or formatting in the program itself with the given input.

The input data (wordlist.txt) is from an external source, and technically not hard coded in the source code.

sorry

2  
Technically not a proper sentence, mate. :) It is technically lacking in subjects and verbs. –  Jonathan Van Matre 2 days ago
3  
It technically is hard-coded; you just put the bytes somewhere else... –  Jason C 2 days ago
1  
I'm sorry unclemeat, i'm afraid you can't do that. –  TheDoctor 2 days ago
5  
I argue that you have created a minimal scripted language where every line that contains a string causes that string to be output to stdout, and that sentance.bat is your interpreter, and wordlist.txt is your source code. So there. –  Jason C 2 days ago
6  
I technically gave you a negative vote. –  nitro2k01 2 days ago