
Tuesday, December 9, 2008

Lucene 2.4.0 - Hello World

A Hello World application for Lucene 2.4.0, for playing around with Lucene's indexing and searching capabilities. The original code comes from the Lucene tutorial mentioned here.

Some of the API used in that code, such as Hits, which creates costly Document objects, has been deprecated. The revised code, after addressing the compilation warnings, is shown below.




import java.io.IOException;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriter.MaxFieldLength;
import org.apache.lucene.queryParser.ParseException;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;

public class HelloLucene {
    public static void main(String[] args) throws IOException, ParseException {
        // 1. create the index
        Directory index = new RAMDirectory();
        IndexWriter w = new IndexWriter(index, new StandardAnalyzer(), true, new MaxFieldLength(25000));

        addDoc(w, "Lucene in Action");
        addDoc(w, "Lucene for Dummies");
        addDoc(w, "Managing Gigabytes");
        addDoc(w, "The Art of Computer Science");
        w.close();

        // 2. query
        String querystr = args.length > 0 ? args[0] : "lucene";
        Query q = new QueryParser("title", new StandardAnalyzer()).parse(querystr);

        // 3. search
        IndexSearcher s = new IndexSearcher(index);
        TopDocs docs = s.search(q, null, 100);

        // 4. display results
        System.out.println("Found " + docs.totalHits + " hits.");
        ScoreDoc[] hits = docs.scoreDocs;
        int i = 0;
        for (ScoreDoc scoreDoc : hits) {
            System.out.println((i + 1) + ". " + s.doc(scoreDoc.doc));
            ++i;
        }
        s.close();
    }

    private static void addDoc(IndexWriter w, String value) throws IOException {
        Document doc = new Document();
        doc.add(new Field("title", value, Field.Store.YES, Field.Index.ANALYZED));
        w.addDocument(doc);
    }
}

6 comments:

Anonymous said...

Thanks - I was having a hard time finding examples that didn't use deprecated methods.

A couple of details:
We don't need to create that hits array, we can iterate with
for (ScoreDoc scoreDoc : docs.scoreDocs){

And also i isn't being incremented in the loop.
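
Both points in this comment can be tried without Lucene on the classpath. The following is only a minimal sketch: the class `NumberedHits` and the helper `formatHits` are hypothetical names, and plain strings stand in for the stored documents that `s.doc(...)` would return in the post.

```java
import java.util.ArrayList;
import java.util.List;

public class NumberedHits {
    // Hypothetical helper: numbers each entry starting from 1,
    // the way the display loop in the post numbers its hits.
    static List<String> formatHits(String[] titles) {
        List<String> lines = new ArrayList<String>();
        int i = 0;
        for (String title : titles) {      // iterate over the array directly
            lines.add((i + 1) + ". " + title);
            ++i;                            // without this, every line would start with "1."
        }
        return lines;
    }

    public static void main(String[] args) {
        // Stand-in data; in the post these would come from the searcher.
        String[] titles = { "Lucene in Action", "Lucene for Dummies" };
        for (String line : formatHits(titles)) {
            System.out.println(line);
        }
    }
}
```

The enhanced for loop has no index of its own, so the counter has to be kept and incremented by hand.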

Anonymous said...

thx a lot dude

i was having the same problem as the "anonymous".
=x

by the way

u forgot to add the "i++" on the FOR.

c ya

Anonymous said...

Same as other. Thanks a lot!

Anonymous said...

Note to Anonymous respondent: the assignment

ScoreDoc [] hits = docs.scoreDocs;

doesn't create anything; it just assigns a member variable of the docs object to a local variable.

Why the good committers of Lucene decided to make fields of their objects public is another matter.

You really should fix the non-increment bug -- it's cribbed from the original bug.
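
The point about the assignment not creating anything can be checked in plain Java. This is a sketch under stated assumptions: `AliasDemo`, `Results` and `sharesArray` are hypothetical names, and `Results` is only a stand-in for an object with a public array field, loosely modelled on `docs.scoreDocs`; it is not Lucene code.

```java
public class AliasDemo {
    // Hypothetical stand-in for a result object that exposes a public array field.
    static class Results {
        public int[] ids = { 7, 3, 9 };
    }

    // Returns true when a local variable assigned from the field
    // refers to the very same array object as the field.
    static boolean sharesArray(Results r) {
        int[] local = r.ids;   // copies the reference only, no new array is allocated
        local[0] = 42;         // write through the local variable
        return local == r.ids && r.ids[0] == 42;
    }

    public static void main(String[] args) {
        System.out.println(sharesArray(new Results()));
    }
}
```

Because the local variable and the field point at one array, a write through either is visible through the other. That is also why public mutable fields are usually avoided.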

Anonymous said...

I've just updated LuceneTutorial.com's code examples to Lucene 2.4.

Would love to get some feedback on the type of tutorials/apps you'd like to see on that site.

pseudonym said...

Thanks everyone for your comments.

I have fixed the bug related to ++i.

I have also put in a new syntax highlighter so the code looks much better. Hope this helps.