6. Processing Raw Text

Lecture notes

Further reading

Before Class (code)

  1. Write a program to:
    1. Get the text of "The Adventure of the Speckled Band" (Sherlock Holmes):
      go to Project Gutenberg and search for 'Speckled Band'
      find the URL of the plain-text file
      read it in
      chop off the extra text (anything that is not part of the story)
    2. Save it as a file called "spec.txt"
    3. Read the file back in and count its characters, lines, and tokens
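
The steps above can be sketched roughly as follows. This is one possible approach, not the course's reference solution. The function names (fetch_text, chop_extra, count_text, main) are made up for illustration. The URL is left as a parameter because finding it is part of the exercise. The sketch assumes the file marks its boilerplate with lines beginning "*** START OF" and "*** END OF", and it counts tokens as whitespace-separated strings, which is only one of several reasonable definitions.

```python
import re
from urllib.request import urlopen


def fetch_text(url):
    """Download a plain-text file and decode it (assumes UTF-8, optional BOM)."""
    with urlopen(url) as response:
        return response.read().decode("utf-8-sig")


def chop_extra(text):
    """Keep only what lies between the START and END marker lines.

    Assumption: the markers look like '*** START OF ...' and '*** END OF ...'.
    If a marker is missing, that side of the text is left untouched.
    """
    start = re.search(r"^\*\*\* ?START OF .*$", text, flags=re.MULTILINE)
    end = re.search(r"^\*\*\* ?END OF .*$", text, flags=re.MULTILINE)
    begin = start.end() if start else 0
    finish = end.start() if end else len(text)
    return text[begin:finish].strip()


def count_text(text):
    """Return (characters, lines, tokens); tokens are split on whitespace."""
    return len(text), len(text.splitlines()), len(text.split())


def main(url, path="spec.txt"):
    """Fetch, trim, save, re-read, and report the counts."""
    with open(path, "w", encoding="utf-8") as out:
        out.write(chop_extra(fetch_text(url)))
    with open(path, encoding="utf-8") as infile:
        chars, lines, tokens = count_text(infile.read())
    print(f"{chars} characters, {lines} lines, {tokens} tokens")
```

To run it, call main() with the URL you found on Project Gutenberg. If the file you download is a whole collection rather than the single story, you will need a further slicing step between the story's title and the title of the next one.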

Practical work (code)

HG251: Language and the Computer, Francis Bond.