org.mmbase.util.transformers
Class CP1252Surrogator

java.lang.Object
  extended by org.mmbase.util.transformers.ReaderTransformer
      extended by org.mmbase.util.transformers.ConfigurableReaderTransformer
          extended by org.mmbase.util.transformers.CP1252Surrogator
All Implemented Interfaces:
Serializable, CharTransformer, ConfigurableTransformer, Transformer

public class CP1252Surrogator
extends ConfigurableReaderTransformer
implements CharTransformer

Replaces ("surrogates") the Windows CP1252 characters that are not valid in ISO-8859-1. It can also repair wrongly encoded Strings, that is, Strings built from byte arrays that were actually CP1252 but were decoded as ISO-8859-1 when the Java String was created.
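
The "wrongly encoded" case can be illustrated with the JDK alone. The following is a minimal sketch, not the MMBase implementation: the class name WrongEncodingRepairSketch and the helper repair are hypothetical. It assumes every character of the input is at most U+00FF, which holds for any String that was decoded as ISO-8859-1.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class WrongEncodingRepairSketch {

    // Hypothetical helper, not part of MMBase: undo a wrong ISO-8859-1 decode.
    // ISO-8859-1 maps every byte to the code point with the same value, so
    // encoding back to ISO-8859-1 recovers the original bytes unchanged.
    static String repair(String wronglyDecoded) {
        byte[] originalBytes = wronglyDecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(originalBytes, Charset.forName("windows-1252"));
    }

    public static void main(String[] args) {
        // 0x80 is the euro sign in CP1252, but a C1 control character in ISO-8859-1.
        byte[] cp1252Bytes = {(byte) 0x80, '5'};
        String wrong = new String(cp1252Bytes, StandardCharsets.ISO_8859_1);
        String repaired = repair(wrong);

        System.out.println(Integer.toHexString(wrong.charAt(0)));    // 80
        System.out.println(Integer.toHexString(repaired.charAt(0))); // 20ac
    }
}
```

Once repaired, the String contains real CP1252 characters such as U+20AC, which are the characters this class would then replace when the output must be valid ISO-8859-1.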

Since:
MMBase-1.7.2
Version:
$Id: CP1252Surrogator.java 41057 2010-02-16 00:12:33Z michiel $
Author:
Michiel Meeuwissen
See Also:
Serialized Form

Field Summary
static int WELL_ENCODED
           
static int WRONG_ENCODED
           
 
Fields inherited from class org.mmbase.util.transformers.ConfigurableReaderTransformer
to
 
Constructor Summary
CP1252Surrogator()
           
CP1252Surrogator(int conf)
           
 
Method Summary
 String getEncoding()
          Returns the encoding that is currently active
static byte[] getTestBytes()
           
static String getTestString()
           
static void main(String[] args)
          For testing only.
 Writer transform(Reader r, Writer w)
           
 Map<String,Config> transformers()
          Returns which transformations can be done by an object of this class.
 
Methods inherited from class org.mmbase.util.transformers.ConfigurableReaderTransformer
configure, toString
 
Methods inherited from class org.mmbase.util.transformers.ReaderTransformer
transform, transform, transformBack, transformBack, transformBack
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 
Methods inherited from interface org.mmbase.util.transformers.CharTransformer
transform, transform, transformBack, transformBack, transformBack
 
Methods inherited from interface org.mmbase.util.transformers.Transformer
toString
 

Field Detail

WELL_ENCODED

public static final int WELL_ENCODED
See Also:
Constant Field Values

WRONG_ENCODED

public static final int WRONG_ENCODED
See Also:
Constant Field Values
Constructor Detail

CP1252Surrogator

public CP1252Surrogator()

CP1252Surrogator

public CP1252Surrogator(int conf)
Method Detail

transform

public Writer transform(Reader r,
                        Writer w)
Specified by:
transform in interface CharTransformer
Specified by:
transform in class ReaderTransformer
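
This page gives no description for transform(Reader, Writer). As a rough illustration of what replacing characters while copying from a Reader to a Writer could look like, here is a minimal sketch. The class name SurrogateSketch and the replacement table are hypothetical; the surrogates the real class writes may differ.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

public class SurrogateSketch {

    // Hypothetical replacement table for a few CP1252 characters that have
    // no ISO-8859-1 equivalent. Returns null when no replacement is needed.
    static String surrogate(char c) {
        switch (c) {
            case '\u20AC': return "EUR";  // euro sign
            case '\u2018':
            case '\u2019': return "'";    // curly single quotes
            case '\u201C':
            case '\u201D': return "\"";   // curly double quotes
            case '\u2026': return "...";  // horizontal ellipsis
            default:       return null;
        }
    }

    // Copies r to w character by character, substituting surrogates on the way.
    static Writer transform(Reader r, Writer w) {
        try {
            int c;
            while ((c = r.read()) != -1) {
                String replacement = surrogate((char) c);
                if (replacement != null) {
                    w.write(replacement);
                } else {
                    w.write(c);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return w;
    }

    public static void main(String[] args) {
        Writer out = transform(new StringReader("\u201Chi\u201D \u20AC5\u2026"), new StringWriter());
        System.out.println(out); // "hi" EUR5...
    }
}
```

Returning the Writer that was passed in mirrors the signature shown above and lets the call be chained, as in the main method.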

transformers

public Map<String,Config> transformers()
Description copied from interface: ConfigurableTransformer
Returns which transformations can be done by an object of this class.

Specified by:
transformers in interface ConfigurableTransformer
Specified by:
transformers in class ConfigurableReaderTransformer
Returns:
A Map of String keys to Config values.

getEncoding

public String getEncoding()
Description copied from interface: ConfigurableTransformer
Returns the encoding that is currently active

Specified by:
getEncoding in interface ConfigurableTransformer
Specified by:
getEncoding in class ConfigurableReaderTransformer
Returns:
A String representing the encoding that is currently used.

getTestBytes

public static byte[] getTestBytes()

getTestString

public static String getTestString()

main

public static void main(String[] args)
For testing only. Use on a UTF-8 terminal:

    java -Dfile.encoding=UTF-8 org.mmbase.util.transformers.CP1252Surrogator

Or, on an ISO-8859-1 terminal (you will see question marks for the CP1252 characters):

    java -Dfile.encoding=ISO-8859-1 org.mmbase.util.transformers.CP1252Surrogator

Or, if - may God forbid - you have a CP1252 terminal:

    java -Dfile.encoding=CP1252 org.mmbase.util.transformers.CP1252Surrogator

You can simulate this last case with something like:

    java -Dfile.encoding=CP1252 org.mmbase.util.transformers.CP1252Surrogator | konwert cp1252-utf8
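
The question marks on an ISO-8859-1 terminal appear because characters that the output encoding cannot represent are replaced during encoding. The sketch below reproduces that effect with the JDK alone; the class name TerminalEncodingSketch and the helper asSeenOn are hypothetical, and the round trip through a byte array only approximates what a terminal would display.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class TerminalEncodingSketch {

    // Hypothetical helper: encode text for a terminal charset and decode it
    // again, so unmappable characters show up as the charset's replacement.
    static String asSeenOn(String text, Charset terminal) {
        return new String(text.getBytes(terminal), terminal);
    }

    public static void main(String[] args) {
        // U+00E9 exists in ISO-8859-1; U+20AC (euro sign) exists only in CP1252.
        String sample = "caf\u00E9 \u20AC";

        String latin1 = asSeenOn(sample, StandardCharsets.ISO_8859_1);
        String cp1252 = asSeenOn(sample, Charset.forName("windows-1252"));
        String utf8 = asSeenOn(sample, StandardCharsets.UTF_8);

        System.out.println(latin1.equals("caf\u00E9 ?")); // true
        System.out.println(cp1252.equals(sample));        // true
        System.out.println(utf8.equals(sample));          // true
    }
}
```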



MMBase 2.0-SNAPSHOT