org.mmbase.module.lucene.extraction.impl
Class TextMiningExtractor

java.lang.Object
  extended by org.mmbase.module.lucene.extraction.impl.TextMiningExtractor
All Implemented Interfaces:
Extractor

public class TextMiningExtractor
extends Object
implements Extractor

Uses the textmining library to extract text from a Word document.

Version:
$Id: TextMiningExtractor.java 35592 2009-06-02 23:56:16Z michiel $
Author:
Wouter Heijke, Michiel Meeuwissen

Constructor Summary
TextMiningExtractor()
           
 
Method Summary
 String extract(InputStream input)
          Extract text from a source
 String getMimeType()
          Returns the MIME type this Extractor handles
 void setMimeType(String mimetype)
          Sets the MIME type this Extractor handles
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

TextMiningExtractor

public TextMiningExtractor()
Method Detail

setMimeType

public void setMimeType(String mimetype)
Description copied from interface: Extractor
Sets the MIME type this Extractor handles

Specified by:
setMimeType in interface Extractor
Parameters:
mimetype - String representing the MIME Type

getMimeType

public String getMimeType()
Description copied from interface: Extractor
Returns the MIME type this Extractor handles

Specified by:
getMimeType in interface Extractor
Returns:
String representing the MIME Type
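
One plausible use of the MIME type accessors is to select an extractor by content type. The sketch below is illustrative only and is not taken from MMBase: the nested Extractor interface is a local stand-in that mirrors the three documented method signatures, while PlaceholderExtractor, MimeRegistrySketch, register and lookup are hypothetical names. The value "application/msword" is the registered MIME type for legacy Word files; whether TextMiningExtractor is configured with it by default is an assumption.

```java
import java.io.InputStream;
import java.util.HashMap;
import java.util.Map;

class MimeRegistrySketch {

    // Local stand-in that mirrors the three methods documented for Extractor.
    interface Extractor {
        void setMimeType(String mimetype);
        String getMimeType();
        String extract(InputStream input) throws Exception;
    }

    // Hypothetical placeholder implementation; it performs no real extraction.
    static class PlaceholderExtractor implements Extractor {
        private String mimeType;

        public void setMimeType(String mimetype) { this.mimeType = mimetype; }
        public String getMimeType() { return mimeType; }
        public String extract(InputStream input) { return ""; }
    }

    // Hypothetical registry keyed by the MIME type each extractor reports.
    private final Map<String, Extractor> byMimeType = new HashMap<>();

    void register(Extractor extractor) {
        byMimeType.put(extractor.getMimeType(), extractor);
    }

    // Returns the registered extractor, or null when none handles the type.
    Extractor lookup(String mimeType) {
        return byMimeType.get(mimeType);
    }

    public static void main(String[] args) {
        Extractor word = new PlaceholderExtractor();
        word.setMimeType("application/msword"); // assumed configuration value

        MimeRegistrySketch registry = new MimeRegistrySketch();
        registry.register(word);

        System.out.println(registry.lookup("application/msword") == word);
        System.out.println(registry.lookup("application/pdf") == null);
    }
}
```

Note that an extractor must have its MIME type set before it is registered, because the registry reads the key once at registration time.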

extract

public String extract(InputStream input)
               throws Exception
Description copied from interface: Extractor
Extract text from a source

Specified by:
extract in interface Extractor
Parameters:
input - InputStream where the data comes from
Returns:
String representing the extracted text
Throws:
Exception
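
Because extract declares the broad checked Exception, callers have to decide how to handle failures and when to close the stream. The following is a minimal sketch under stated assumptions, not MMBase code: the nested Extractor interface is a local stand-in mirroring the documented signatures, Utf8Extractor is a hypothetical substitute that decodes bytes as UTF-8 rather than parsing a Word document, and extractText is a hypothetical helper. Nothing in this page says whether extract closes its input, so the helper closes the stream itself.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

class ExtractCallSketch {

    // Local stand-in that mirrors the three methods documented for Extractor.
    interface Extractor {
        void setMimeType(String mimetype);
        String getMimeType();
        String extract(InputStream input) throws Exception;
    }

    // Hypothetical stand-in: decodes the stream as UTF-8 instead of parsing Word.
    static class Utf8Extractor implements Extractor {
        private String mimeType;

        public void setMimeType(String mimetype) { this.mimeType = mimetype; }
        public String getMimeType() { return mimeType; }

        public String extract(InputStream input) throws Exception {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = input.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
            return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    // Closes the stream after use and rethrows any failure as unchecked.
    static String extractText(Extractor extractor, byte[] data) {
        try (InputStream in = new ByteArrayInputStream(data)) {
            return extractor.extract(in);
        } catch (Exception e) {
            throw new IllegalStateException(
                "Extraction failed for " + extractor.getMimeType(), e);
        }
    }

    public static void main(String[] args) {
        Extractor extractor = new Utf8Extractor();
        extractor.setMimeType("text/plain");

        byte[] data = "hello".getBytes(StandardCharsets.UTF_8);
        System.out.println(extractText(extractor, data));
    }
}
```

With the real class, the same calling pattern would apply with a TextMiningExtractor instance and a stream over a Word file in place of the stand-ins.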


MMBase 2.0-SNAPSHOT