Today's project was to load my 100,000 email and Usenet messages into a
MySQL full text index so I can search things I've written. Not bad:
15 minutes to parse and load the messages, 5 minutes to build the index.
Queries take a tenth or two of a second. MySQL supports a rich boolean query language.
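
For a flavor of the boolean query language, here is a minimal Python sketch that builds a boolean-mode search. The table name `messages`, the columns `id`, `subject` and `body`, and the helper `build_boolean_search` are all made up for illustration; they are not the schema from this project. The `+` and `-` operators and `MATCH ... AGAINST (... IN BOOLEAN MODE)` are standard MySQL full text syntax.

```python
def build_boolean_search(required=(), excluded=(), optional=()):
    """Return (sql, params) for a MySQL boolean-mode full text search.

    Hypothetical schema: a `messages` table with a FULLTEXT index
    over (subject, body). Words are used as given; this sketch does
    not strip boolean operators that a user might type into a word.
    """
    terms = [f"+{word}" for word in required]    # +word: must be present
    terms += [f"-{word}" for word in excluded]   # -word: must be absent
    terms += list(optional)                      # bare word: raises the rank
    sql = (
        "SELECT id, subject FROM messages "
        "WHERE MATCH (subject, body) AGAINST (%s IN BOOLEAN MODE)"
    )
    # %s is the placeholder style used by the common Python MySQL drivers.
    return sql, (" ".join(terms),)


sql, params = build_boolean_search(
    required=["python"], excluded=["java"], optional=["mysql"]
)
```

The pair can be passed to a DB-API cursor as `cursor.execute(sql, params)`, which leaves the quoting of the search string to the driver.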
What I like best is how easy this was. I spent weeks building Funes, a Java mail search program that was never useful. With Python and MySQL it took me just a few hours and the result is better! Goodbye, grepmail.

I'm not the only MySQL fulltext enthusiast: Jeremy Zawodny's blog has a great entry with comments, and Mitchell Harper has a useful introductory article. There's also some performance discussion on a PHP forum.

One trick: for speed, run myisamchk -a on the table after building the full text index. And do your big load before creating the index; afterwards, inserts are slow.
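
The load-first, index-second ordering might look roughly like the sketch below. It assumes a DB-API cursor from one of the usual Python MySQL drivers; the function `load_then_index`, the `messages` table, its columns, and the index name `ft_messages` are hypothetical stand-ins, not the code from this project.

```python
def load_then_index(cursor, rows, batch_size=1000):
    """Bulk-load (subject, body) rows, then build the FULLTEXT index.

    `cursor` is assumed to be a DB-API cursor on a MySQL connection.
    The schema here is invented for illustration.
    """
    cursor.execute(
        "CREATE TABLE IF NOT EXISTS messages ("
        "id INT AUTO_INCREMENT PRIMARY KEY, "
        "subject TEXT, body TEXT) ENGINE=MyISAM"
    )
    # Do the big load first, while the table has no full text index.
    for start in range(0, len(rows), batch_size):
        cursor.executemany(
            "INSERT INTO messages (subject, body) VALUES (%s, %s)",
            rows[start:start + batch_size],
        )
    # Only now build the index, in one pass over the loaded data.
    cursor.execute(
        "ALTER TABLE messages ADD FULLTEXT INDEX ft_messages (subject, body)"
    )
```

myisamchk -a is a separate command-line step run against the table's files afterwards, so it is left out of the sketch.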