public class UCDParser
extends java.lang.Object
Parses a UCD (parseUCD(String)) or a list of UCD words (parseWordList(Reader, boolean)).
Although the static function parseUCD(String) uses a UCDParser already initialized with the list of all official IVOA UCD words (see defaultParser), it is possible to create an instance of UCDParser with a custom list of UCD words.
The main function of this parser prompts for a UCD, parses this UCD and finally returns some information about it.
To get more information about the version of the parser and/or the word lists, use getVersion() (Ucidy/parser version) and getFullVersion() (version of this parser and of all used lists).
Modifier and Type | Field and Description |
---|---|
static UCDParser | defaultParser: Default UCD parser which is initialized with a list of only the official IVOA UCD words and the deprecated ones (for better error messages and suggestions). |
DeprecatedUCDWordList | deprecatedWords: List of all deprecated words. |
static java.lang.String | FILE_UCD_DEPRECATED: Default path to the file listing all deprecated UCD words and their replacement. |
static java.lang.String | FILE_UCD_WORDS: Default path to the PSV file listing all official IVOA UCDs. |
UCDWordList | knownWords: List of all known words. |
protected static java.util.logging.Logger | LOGGER: Logger used to report errors during the initialization of the default UCDParser. |
protected static int | NB_MAX_ERRORS: Maximum number of consecutive errors while parsing a PSV file, before stopping the parsing. |
static java.lang.String | UCIDY_VERSION: Current version of this Ucidy instance. |
static java.lang.String | VERSION_UCD_WORDS: Version of the UCD words listed in FILE_UCD_WORDS and FILE_UCD_DEPRECATED. |

Constructor and Description |
---|
UCDParser(): Create a UCD parser with an empty list of known words. |
UCDParser(java.lang.String wordsListVersion): Create a UCD parser with an empty list of known words with a declared version. |
UCDParser(UCDWordList words): Build a UCD parser with the given list of known words. |
UCDParser(UCDWordList words, DeprecatedUCDWordList deprecatedWords): Build a UCD parser with the given list of known words and a given list of deprecated UCD words. |

Modifier and Type | Method and Description |
---|---|
java.lang.String | getFullVersion(): Get a paragraph giving the version of this parser and the version of all used lists. |
static java.lang.String | getVersion(): Get the Ucidy/parser version. |
static void | main(java.lang.String[] args) |
UCD | parse(java.lang.String ucdStr): Parse the given string representing a UCD into an object representation: UCD. |
static java.lang.String[] | parseDeprecatedFileLine(java.lang.String line): Parse a line of a file listing the deprecated UCD words. |
static DeprecatedUCDWordList | parseDeprecatedWordList(java.io.Reader reader, UCDWordList lstWords): Create a DeprecatedUCDWordList with all deprecated UCD words declared inside the specified input. |
static int | parseDeprecatedWordList(java.io.Reader reader, UCDWordList lstWords, DeprecatedUCDWordList lstDeprecatedWords): Add inside the given DeprecatedUCDWordList all deprecated UCD words declared inside the specified input. |
static UCDWord | parsePSVLine(java.lang.String psvLine, boolean recommended): Parse a line of a PSV (Pipe Separated Value) file as the definition of a UCD word. |
static UCD | parseUCD(java.lang.String ucdStr): Parse the given UCD and try to resolve each word as a known UCD word among the IVOA official list. |
static UCDWordList | parseWordList(java.io.Reader reader, boolean recommended): Create a UCDWordList with all UCD words declared using the PSV (Pipe-Separated-Value) format inside the specified input. |
static int | parseWordList(java.io.Reader reader, boolean recommended, UCDWordList words): Add inside the given UCDWordList all UCD words declared using the PSV (Pipe-Separated-Value) format inside the specified input. |

public static final UCDParser defaultParser
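Default UCD parser which is initialized with a list of only the official IVOA UCD words and the deprecated ones (for better error messages and suggestions).
This parser is generally used through parseUCD(String) but can also be used directly.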
Note: Initialization errors and warnings are reported using the Java Util Logging API (JUL).
public final DeprecatedUCDWordList deprecatedWords
List of all deprecated words.
Important: When a UCD word cannot be resolved, this UCDParser will try to find it among the deprecated words. If a match is found, the suggested UCD replacement will be proposed to the user.
This field is NEVER null.
public static final java.lang.String FILE_UCD_DEPRECATED
Default path to the file listing all deprecated UCD words and their replacement.
This path must be relative to the class path.
See parseDeprecatedWordList(Reader, UCDWordList) for more details about the expected file format.
Important: This list of deprecated words should be for the same UCD standard version as FILE_UCD_WORDS.
public static final java.lang.String FILE_UCD_WORDS
Default path to the PSV file listing all official IVOA UCDs.
This path must be relative to the class path.
See parseWordList(Reader, boolean) for more details about the expected file format.
public final UCDWordList knownWords
List of all known words.
Important: "Known" means here that all words of this list are used as reference when building a UCD object, whenever a word of the UCD matches a listed UCDWord. There is therefore no guarantee that ALL UCDWord objects stored in this list are valid, recognised and/or recommended. This special status of a UCDWord depends on its initialization.
This field is NEVER null.
protected static final java.util.logging.Logger LOGGER
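Logger used to report errors during the initialization of the default UCDParser.
protected static final int NB_MAX_ERRORS
Maximum number of consecutive errors while parsing a PSV file, before stopping the parsing.
public static final java.lang.String UCIDY_VERSION
Current version of this Ucidy instance.
public static final java.lang.String VERSION_UCD_WORDS
Version of the UCD words listed in FILE_UCD_WORDS and FILE_UCD_DEPRECATED.
public UCDParser()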
Create a UCD parser with an empty list of known words.
You can however fill this list directly through the field knownWords.
Note: The list of deprecated words is also empty by default. As for the known words, this list can be updated directly through the field deprecatedWords.
public UCDParser(java.lang.String wordsListVersion)
Create a UCD parser with an empty list of known words with a declared version.
You can however fill this list directly through the field knownWords.
Note: The list of deprecated words is also empty by default. As for the known words, this list can be updated directly through the field deprecatedWords.
Parameters:
wordsListVersion - Version of the empty UCD words lists created by default.
public UCDParser(UCDWordList words)
Build a UCD parser with the given list of known words.
Note: The given list can be modified (i.e. additions and deletions are allowed) through the field knownWords or directly by manipulating the given list (it is stored in this UCDParser object by reference).
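The by-reference behaviour described in this note can be illustrated with a minimal stand-alone sketch. The class WordHolder below is a hypothetical stand-in built on java.util.List; it is not part of Ucidy and only mimics a constructor that keeps the list it is given instead of copying it.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a parser that keeps the list it receives (not Ucidy code). */
final class WordHolder {
    /** Same object as the list passed to the constructor: no copy is made. */
    final List<String> knownWords;

    WordHolder(List<String> words) {
        this.knownWords = words; // stored by reference
    }

    public static void main(String[] args) {
        List<String> words = new ArrayList<>();
        WordHolder holder = new WordHolder(words);

        // A change made on the caller's list is visible through the field...
        words.add("pos.eq.ra");
        System.out.println(holder.knownWords.contains("pos.eq.ra"));

        // ...and a change made through the field is visible to the caller.
        holder.knownWords.remove("pos.eq.ra");
        System.out.println(words.isEmpty());
    }
}
```

The practical consequence is that a list shared between several holders of this kind would be modified for all of them at once; passing a copy avoids that.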
Parameters:
words - List of all known words.
public UCDParser(UCDWordList words, DeprecatedUCDWordList deprecatedWords)
Build a UCD parser with the given list of known words and a given list of deprecated UCD words.
Note: The given lists can be modified (i.e. additions and deletions are allowed) through the fields knownWords and deprecatedWords or directly by manipulating the given lists (they are stored in this UCDParser object by reference).
Parameters:
words - List of all known words.
deprecatedWords - List of all deprecated words.
public final java.lang.String getFullVersion()
Get a paragraph giving the version of this parser and the version of all used lists.
See also: getVersion(), UCDWordList.getVersion()
public static final java.lang.String getVersion()
Get the Ucidy/parser version.
Note: This function is equivalent to UCIDY_VERSION.
public static void main(java.lang.String[] args) throws java.lang.Throwable
Throws:
java.lang.Throwable
public UCD parse(java.lang.String ucdStr)
Parse the given string representing a UCD into an object representation: UCD.
Each word of this UCD is searched in the list of known UCD words. If a match is found, the full definition of this UCD word will be set in the UCD.
If none can be found, the word is searched in the list of deprecated words. If still not found, a not recognised UCDWord will be created instead, with a list of the closest recognised UCD words (if any).
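The lookup order described above (known words first, then deprecated words, then "not recognised") can be sketched with plain collections. Everything in this block is hypothetical: the class LookupOrderSketch, the use of a Set and a Map in place of UCDWordList and DeprecatedUCDWordList, and the placeholder words old.word and new.word. The sketch compares words exactly and leaves out the search for the closest recognised words.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of the lookup order; it does not use or reproduce Ucidy classes. */
final class LookupOrderSketch {

    /** Classify each word of a UCD (words are separated by ';'). */
    static List<String> classify(String ucd, Set<String> known, Map<String, String> deprecated) {
        List<String> report = new ArrayList<>();
        if (ucd == null || ucd.trim().isEmpty())
            return report; // nothing to classify

        for (String raw : ucd.split(";")) {
            String word = raw.trim();
            if (known.contains(word)) {
                // 1. Found among the known words.
                report.add(word + ": known");
            } else if (deprecated.containsKey(word)) {
                // 2. Not known, but listed as deprecated: propose the replacement.
                report.add(word + ": deprecated, use " + deprecated.get(word));
            } else {
                // 3. Neither known nor deprecated.
                report.add(word + ": not recognised");
            }
        }
        return report;
    }

    public static void main(String[] args) {
        Set<String> known = Set.of("phot.mag", "em.opt.V");
        Map<String, String> deprecated = Map.of("old.word", "new.word");
        classify("phot.mag;old.word;foo.bar", known, deprecated).forEach(System.out::println);
    }
}
```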
Parameters:
ucdStr - The string serializing a UCD.
Returns:
null if the given string is null or empty.
public static java.lang.String[] parseDeprecatedFileLine(java.lang.String line) throws java.lang.NullPointerException, java.text.ParseException
Parse a line of a file listing the deprecated UCD words.
The expected syntax for such a line is 2 values separated by a space: the deprecated UCD word followed by a suggested UCD in replacement.
Note: Consecutive space characters will be replaced by a single space on each line. Besides, leading and trailing space characters will be ignored.
WARNING: Comment lines (i.e. starting with the character #) are not supported. This function will throw a NullPointerException in such a case, as if an empty line was provided.
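The syntax rules above can be sketched as a small stand-alone function. This is a hypothetical re-implementation written from the description only, not the Ucidy source: the class DeprecatedLineSketch, the error messages and the placeholder words old.word and new.word are assumptions, and the sketch collapses any whitespace rather than space characters only.

```java
import java.text.ParseException;

/** Hypothetical re-implementation of the documented line syntax (not the Ucidy source). */
final class DeprecatedLineSketch {

    /** Split "deprecatedWord suggestedUCD" into its two values. */
    static String[] parseLine(String line) throws ParseException {
        if (line == null)
            throw new NullPointerException("Missing line to parse!");

        // Collapse runs of whitespace, then drop leading/trailing spaces.
        String normalized = line.replaceAll("\\s+", " ").trim();

        // Empty and comment lines are rejected the same way as a null line.
        if (normalized.isEmpty() || normalized.startsWith("#"))
            throw new NullPointerException("Empty or commented line!");

        String[] values = normalized.split(" ");
        if (values.length != 2)
            throw new ParseException("Expected exactly 2 values separated by a space, found " + values.length, 0);
        return values;
    }

    public static void main(String[] args) throws ParseException {
        String[] values = parseLine("  old.word    new.word  ");
        System.out.println(values[0] + " -> " + values[1]);
    }
}
```

Because comment lines raise an exception here, a caller reading a whole file would have to filter them out before calling such a function.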
Parameters:
line - A non-empty and not commented line of a file listing the deprecated UCD words.
Throws:
java.lang.NullPointerException - If the given line is null, an empty string or a comment.
java.text.ParseException - If the given line has no space character or has too many.
public static DeprecatedUCDWordList parseDeprecatedWordList(java.io.Reader reader, UCDWordList lstWords) throws java.lang.NullPointerException, java.io.IOException
Create a DeprecatedUCDWordList with all deprecated UCD words declared inside the specified input.
The expected file MUST contain exactly 2 columns, separated by at least one space character: the deprecated UCD word, followed by the suggested UCD replacement.
If the syntax of a line is incorrect, an error message will be displayed in the standard error output. If more than 10 consecutive errors are raised, the parsing of the file stops immediately with a new error message.
A line is considered incorrect in the following cases:
If a deprecated UCD word is declared more than once, only the first occurrence will stay in the list. An error will be displayed only if the suggested UCD replacement is different. Otherwise just a warning is displayed.
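The two rules above (stop after more than 10 consecutive errors, keep only the first occurrence of a duplicated word) can be sketched as a reading loop. This is a hypothetical illustration, not the Ucidy source: the class DeprecatedListSketch, the Map used in place of DeprecatedUCDWordList, the messages, and the choice to silently skip blank and comment lines are all assumptions of this sketch. It also does not count a duplicate as one of the consecutive errors.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of the documented reading loop (not the Ucidy source). */
final class DeprecatedListSketch {
    /** Threshold taken from the "more than 10 consecutive errors" rule. */
    static final int MAX_CONSECUTIVE_ERRORS = 10;

    static Map<String, String> read(Reader reader) throws IOException {
        Map<String, String> deprecated = new LinkedHashMap<>();
        int consecutiveErrors = 0;
        BufferedReader in = new BufferedReader(reader);
        String line;
        while ((line = in.readLine()) != null) {
            String trimmed = line.trim();
            // Assumption of this sketch: blank and comment lines are skipped.
            if (trimmed.isEmpty() || trimmed.startsWith("#"))
                continue;

            String[] cols = trimmed.split("\\s+");
            if (cols.length != 2) {
                System.err.println("Incorrect line: \"" + line + "\"");
                if (++consecutiveErrors > MAX_CONSECUTIVE_ERRORS) {
                    System.err.println("Too many consecutive errors: parsing stopped.");
                    break;
                }
                continue;
            }
            consecutiveErrors = 0; // a correct line resets the counter

            // The first occurrence wins; a later one is only reported.
            String previous = deprecated.putIfAbsent(cols[0], cols[1]);
            if (previous != null) {
                String level = previous.equals(cols[1]) ? "WARNING" : "ERROR";
                System.err.println(level + ": \"" + cols[0] + "\" is declared more than once.");
            }
        }
        return deprecated;
    }

    public static void main(String[] args) throws IOException {
        String input = "old.a new.a\nold.a new.b\nbroken-line\n";
        System.out.println(read(new StringReader(input)));
    }
}
```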
Few additional notes about the parsing:
Parameters:
reader - Reader whose content must be parsed.
lstWords - The list of all known and still correct UCD words.
Throws:
java.lang.NullPointerException - If the given reader is null.
java.io.IOException - If an error occurred while reading the specified input.
See also: parseDeprecatedWordList(Reader, UCDWordList, DeprecatedUCDWordList)
public static int parseDeprecatedWordList(java.io.Reader reader, UCDWordList lstWords, DeprecatedUCDWordList lstDeprecatedWords) throws java.lang.NullPointerException, java.io.IOException
Add inside the given DeprecatedUCDWordList all deprecated UCD words declared inside the specified input.
The expected file MUST contain exactly 2 columns, separated by at least one space character: the deprecated UCD word, followed by the suggested UCD replacement.
If the syntax of a line is incorrect, an error message will be displayed in the standard error output. If more than 10 consecutive errors are raised, the parsing of the file stops immediately with a new error message.
A line is considered incorrect in the following cases:
If a deprecated UCD word is declared more than once, only the first occurrence will stay in the list. An error will be displayed only if the suggested UCD replacement is different. Otherwise just a warning is displayed.
Few additional notes about the parsing:
Parameters:
reader - Reader whose content must be parsed.
lstWords - The list of all known and still correct UCD words.
lstDeprecatedWords - The DeprecatedUCDWordList to complete with the deprecated UCD words extracted from the given input.
Throws:
java.lang.NullPointerException - If the given reader is null.
java.io.IOException - If an error occurred while reading the specified input.
See also: parseDeprecatedFileLine(String)
public static UCDWord parsePSVLine(java.lang.String psvLine, boolean recommended) throws java.lang.NullPointerException, java.text.ParseException
Parse a line of a PSV (Pipe Separated Value) file as the definition of a UCD word.
Parameters:
psvLine - A non-empty line of a PSV file listing the allowed UCD words.
recommended - true if the described UCD word is UCDWord.recommended by the IVOA standard, false otherwise.
Throws:
java.lang.NullPointerException - If the given PSV line is null or an empty string.
java.text.ParseException - If the syntax of the given PSV line is incorrect (at least 3 pipe-separated columns are expected), or if the syntax code is too long or unknown.
See also: UCDSyntax.get(char), UCDSyntax.allowedSyntaxCodes
public static UCD parseUCD(java.lang.String ucdStr)
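Parse the given UCD and try to resolve each word as a known UCD word among the IVOA official list.
Unresolved words will still be in the returned UCD exactly as provided, but won't be flagged as recognised. Consequently, the final UCD is not fully valid.
Parameters:
ucdStr - The string serializing a UCD.
Returns:
null if the given string is null or empty.
See also: parse(String)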
public static UCDWordList parseWordList(java.io.Reader reader, boolean recommended) throws java.lang.NullPointerException, java.io.IOException
Create a UCDWordList with all UCD words declared using the PSV (Pipe-Separated-Value) format inside the specified input.
The expected PSV file MUST contain at least 3 columns, each separated by a pipe character (|). See UCDSyntax for more details.
Note: If more columns are provided, they will be considered as part of the description.
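The handling of extra columns can be sketched with a limited split. This is a hypothetical illustration, not the Ucidy source: the class PsvLineSketch and its messages are assumptions, and so is the sample line in main, which is only illustrative. The sketch assumes the description is the last of the 3 expected columns and gives no meaning to the first two.

```java
import java.text.ParseException;

/** Hypothetical sketch of a PSV line split (not the Ucidy source). */
final class PsvLineSketch {

    /**
     * Split a PSV line into exactly 3 trimmed columns.
     * The split limit of 3 keeps any extra '|' inside the last column.
     */
    static String[] splitColumns(String psvLine) throws ParseException {
        if (psvLine == null || psvLine.trim().isEmpty())
            throw new NullPointerException("Missing PSV line!");

        String[] cols = psvLine.split("\\|", 3);
        if (cols.length < 3)
            throw new ParseException("At least 3 pipe-separated columns are expected!", 0);

        for (int i = 0; i < cols.length; i++)
            cols[i] = cols[i].trim();
        return cols;
    }

    public static void main(String[] args) throws ParseException {
        String[] cols = splitColumns(" Q | pos.eq.ra | Right ascension | in equatorial coordinates ");
        System.out.println(cols[2]);
    }
}
```

Using a split limit rather than re-joining the extra columns afterwards keeps the original text of the description untouched, including its inner pipe characters.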
If the syntax of a PSV line is incorrect, an error message will be displayed in the standard error output. If more than 10 consecutive errors are raised, the parsing of the file stops immediately with a new error message.
A PSV line is considered incorrect in the following cases:
Few additional notes about the parsing:
... UCDWord.description will be set to null.
... UCDWord as NOT recommended.
... UCDWordList as NOT valid and so automatically as NOT recommended.
Parameters:
reader - Reader whose content must be parsed.
recommended - true to flag all imported UCD words as recommended, false otherwise.
Throws:
java.lang.NullPointerException - If the given reader is null.
java.io.IOException - If an error occurred while reading the specified input.
public static int parseWordList(java.io.Reader reader, boolean recommended, UCDWordList words) throws java.lang.NullPointerException, java.io.IOException
Add inside the given UCDWordList all UCD words declared using the PSV (Pipe-Separated-Value) format inside the specified input.
The expected PSV file MUST contain at least 3 columns, each separated by a pipe character (|). See UCDSyntax for more details.
Note: If more columns are provided, they will be considered as part of the description.
If the syntax of a PSV line is incorrect, an error message will be displayed in the standard error output. If more than 10 consecutive errors are raised, the parsing of the file stops immediately with a new error message.
A PSV line is considered incorrect in the following cases:
Few additional notes about the parsing:
... UCDWord.description will be set to null.
... UCDWord as NOT recommended.
... UCDWordList as NOT valid and so automatically as NOT recommended.
Parameters:
reader - Reader whose content must be parsed.
recommended - true to flag all imported UCD words as recommended, false otherwise.
words - The UCDWordList to complete with the UCD words extracted from the given input.
Throws:
java.lang.NullPointerException - If the given reader is null.
java.io.IOException - If an error occurred while reading the specified input.