Open SiteSearch 4.1.1 Final |
java.lang.Object
  |
  +--ORG.oclc.resources.html.IdentifyCopyright
Takes an HTML page as a String and locates the most likely copyright statement. Assuming the most commonly observed pattern of Copyright DateRange CopyrightOwner, the routine assigns the date (if present) and the publisher (assumed to be the copyright owner). The routine could be more sophisticated, for example by checking whether the date is reasonable or by handling non-standard characters, but it works well enough for capturing text that a human will then check. Accessor methods return the cleaned date, publisher, and copyright (date + publisher).
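The Copyright DateRange CopyrightOwner pattern described above can be illustrated with a minimal sketch. This is not the IdentifyCopyright source: the class name `CopyrightPatternSketch`, the method `parse`, and the regular expression are all assumptions made for illustration, and the actual class may locate and split the statement quite differently.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of the "Copyright DateRange CopyrightOwner" pattern.
// None of these names come from the IdentifyCopyright class itself.
public class CopyrightPatternSketch {

    // A copyright marker, an optional year or year range, then the owner text
    // up to the end of the line.
    private static final Pattern STATEMENT = Pattern.compile(
        "(?:copyright|\\(c\\)|\u00A9)[\\s:,]*"                          // marker
        + "(?:(?:\\(c\\)|\u00A9)[\\s:,]*)?"                             // e.g. "Copyright (c)"
        + "((?:19|20)\\d{2}(?:\\s*[-,]\\s*(?:19|20)?\\d{2})*)?"         // date or date range
        + "[\\s,.:;-]*"                                                 // separators
        + "([^\\r\\n]*)",                                               // owner
        Pattern.CASE_INSENSITIVE);

    /**
     * Returns {date, publisher} for the first statement found, or null if
     * no copyright marker is present. The date is "" when none is given.
     */
    public static String[] parse(String visibleText) {
        Matcher m = STATEMENT.matcher(visibleText);
        if (!m.find()) {
            return null;
        }
        String date = m.group(1) == null ? "" : m.group(1).trim();
        String publisher = m.group(2).trim();
        return new String[] { date, publisher };
    }

    public static void main(String[] args) {
        String[] parts = parse("Copyright 1998-2001 OCLC Online Computer Library Center, Inc.");
        System.out.println("date      = " + parts[0]);
        System.out.println("publisher = " + parts[1]);
    }
}
```

The sketch expects text with the HTML tagging already removed and takes only the first match, so, as the description notes for the real routine, the result is best treated as a candidate for human review.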
Field Summary | |
String |
copyright
|
String |
date
|
String |
publisher
|
Constructor Summary | |
IdentifyCopyright(String text)
Constructor based on text. |
Method Summary | |
static String |
cleanTags(String s)
Removes all HTML tagging, leaving only visible text |
String |
getCopyright()
|
String |
getDate()
Accessor method for date |
String |
getPublisher()
Accessor method for publisher |
int |
indexOfAlphanum(String text)
Looks for the first AlphaNumeric Character (should prob. |
String |
removeBracketed(String s)
Removes HTML tagging but keeps spacing |
static String |
trimNonCharOrDigit(String s)
Trims non-alphanumeric characters from the end |
int |
YearBreak(String text)
|
Methods inherited from class java.lang.Object |
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
public String date
public String publisher
public String copyright
Constructor Detail |
public IdentifyCopyright(String text)
Method Detail |
public String getPublisher()
public String getDate()
public String getCopyright()
public int YearBreak(String text)
public String removeBracketed(String s)
public static String cleanTags(String s)
public static String trimNonCharOrDigit(String s)
public int indexOfAlphanum(String text)
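The summaries for cleanTags and trimNonCharOrDigit describe two small text-cleaning steps: dropping HTML tagging so only visible text remains, and trimming non-alphanumeric characters from the end of a string. The sketch below shows one way such helpers could be written. The class `TagStripSketch` and both method names are hypothetical, and the behavior (for example, replacing each tag with a space and collapsing whitespace) is an assumption rather than a description of the real methods.

```java
// Hypothetical helpers illustrating the cleaning steps described in the
// method summaries; they are not the IdentifyCopyright implementations.
public class TagStripSketch {

    /** Drops everything between '<' and '>' and collapses runs of whitespace. */
    public static String stripTags(String s) {
        StringBuilder out = new StringBuilder(s.length());
        boolean inTag = false;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '<') {
                inTag = true;
            } else if (c == '>' && inTag) {
                inTag = false;
                out.append(' ');   // assumption: a tag separates words
            } else if (!inTag) {
                out.append(c);
            }
        }
        return out.toString().replaceAll("\\s+", " ").trim();
    }

    /** Removes trailing characters that are neither letters nor digits. */
    public static String trimTrailingNonAlphanumeric(String s) {
        int end = s.length();
        while (end > 0 && !Character.isLetterOrDigit(s.charAt(end - 1))) {
            end--;
        }
        return s.substring(0, end);
    }

    public static void main(String[] args) {
        System.out.println(stripTags("<p>Copyright <b>2001</b> OCLC</p>"));
        System.out.println(trimTrailingNonAlphanumeric("OCLC, Inc. -- "));
    }
}
```

A character scan like this does not handle comments, scripts, or character entities; it is only meant to show the shape of the cleaning step before a copyright statement is searched for.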