Ubuntu 10.04, JRE 1.6.0_26, Saxon-HE 9.3.0.5
I have a very simple script that extracts the text content from a valid HTML file:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:html="http://www.w3.org/1999/html"
version="2.0">
<xsl:output method="text" encoding="utf-8"/>
</xsl:stylesheet>
Running this script in Oxygen produces the expected output.
Running it on the same PC from the command line, using the same versions of Java and Saxon, like so:
java -jar lib/saxonHE-9.3.0.5.jar -o:build/etemp/html_1.txt -s:build/ebook/epub_sh-tei.html -xsl:xslt/htm2text.xsl
outputs the following error:
Error
java.net.SocketException: Unexpected end of file from server
Transformation failed: Run-time errors were reported
Below is the verbose output from Java:
[Loaded net.sf.saxon.tinytree.TinyProcInstImpl from file:/home/scott/workspace/books_changes2/lib/saxonHE-9.3.0.5.jar]
[Loaded net.sf.saxon.tinytree.LargeStringBuffer from file:/home/scott/workspace/books_changes2/lib/saxonHE-9.3.0.5.jar]
[Loaded java.lang.ArrayIndexOutOfBoundsException from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded com.sun.org.apache.xerces.internal.impl.io.ASCIIReader from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded com.sun.org.apache.xerces.internal.impl.validation.EntityState from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded com.sun.org.apache.xerces.internal.xni.grammars.Grammar from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded com.sun.org.apache.xerces.internal.impl.dtd.DTDGrammar from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded com.sun.org.apache.xerces.internal.impl.dtd.models.ContentModelValidator from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded com.sun.org.apache.xerces.internal.impl.dtd.DTDGrammar$QNameHashtable from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded com.sun.org.apache.xerces.internal.impl.dtd.XMLContentSpec from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$DTDDriver from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded com.sun.xml.internal.stream.StaxXMLInputSource from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded sun.net.www.protocol.http.Handler from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded sun.net.www.protocol.http.HttpURLConnection from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded java.util.logging.Logger from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded java.util.logging.Handler from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded java.util.logging.Level from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded java.util.logging.LogManager from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded java.util.logging.LogManager$1 from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded java.beans.PropertyChangeSupport from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded java.util.logging.LogManager$LogNode from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded java.util.logging.LoggingPermission from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded java.util.logging.LogManager$Cleaner from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded java.lang.ApplicationShutdownHooks from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded java.lang.ApplicationShutdownHooks$1 from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded java.lang.Shutdown from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded java.lang.Shutdown$Lock from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded java.util.IdentityHashMap from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded java.util.logging.LogManager$RootLogger from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded java.util.logging.LogManager$2 from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded java.util.Hashtable$Enumerator from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded java.util.EventObject from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded java.beans.PropertyChangeEvent from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded java.util.logging.LogManager$3 from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded sun.security.action.GetIntegerAction from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded sun.net.www.protocol.http.AuthCacheValue from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded sun.net.www.protocol.http.AuthenticationInfo from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded sun.net.www.protocol.http.NTLMAuthentication from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded sun.net.www.protocol.http.AuthCache from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded sun.net.www.protocol.http.AuthCacheImpl from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded sun.net.www.protocol.http.NTLMAuthenticationCallback from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded sun.net.www.protocol.http.NTLMAuthenticationCallback$DefaultNTLMAuthenticationCallback from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded sun.net.www.protocol.http.NTLMAuthentication$1 from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded sun.net.www.protocol.http.HttpURLConnection$TunnelState from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded sun.net.www.protocol.http.HttpURLConnection$2 from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded java.net.CookieHandler from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded sun.net.www.protocol.http.HttpURLConnection$3 from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded java.net.ResponseCache from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded com.sun.org.apache.xerces.internal.util.HTTPInputSource from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded sun.net.www.protocol.http.HttpURLConnection$5 from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded java.net.ProxySelector from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded sun.net.spi.DefaultProxySelector from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded sun.net.spi.DefaultProxySelector$1 from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded sun.net.NetProperties from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded sun.net.NetProperties$1 from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded sun.net.spi.DefaultProxySelector$NonProxyInfo from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded sun.net.spi.DefaultProxySelector$2 from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded sun.net.spi.DefaultProxySelector$3 from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded java.net.Proxy from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded java.net.Proxy$Type from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded sun.net.NetworkClient from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded sun.net.www.http.HttpClient from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded sun.net.NetworkClient$1 from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded sun.net.www.http.KeepAliveCache from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded sun.net.www.http.HttpClient$1 from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded sun.net.www.http.HttpClient$2 from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded sun.net.www.http.KeepAliveKey from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded sun.net.www.http.HttpClient$3 from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded java.net.Socket from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded java.net.SocksConsts from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded java.net.SocketOptions from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded java.net.SocketImpl from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded java.net.PlainSocketImpl from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded java.net.SocksSocketImpl from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded sun.security.action.LoadLibraryAction from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded java.net.SocketAddress from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded java.net.InetSocketAddress from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded java.net.InetAddress from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded java.net.InetAddress$Cache from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded java.net.InetAddress$Cache$Type from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded java.net.InetAddressImplFactory from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded java.net.InetAddressImpl from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded java.net.Inet6AddressImpl from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded sun.net.spi.nameservice.NameService from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded java.net.InetAddress$1 from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded java.net.Inet4AddressImpl from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded java.net.Inet4Address from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded sun.net.InetAddressCachePolicy from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded sun.net.InetAddressCachePolicy$1 from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded sun.net.InetAddressCachePolicy$2 from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded java.util.Queue from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded java.util.Deque from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded java.util.AbstractSequentialList from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded java.util.LinkedList from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded java.util.LinkedList$Entry from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded java.util.LinkedHashMap$KeyIterator from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded java.util.LinkedList$ListItr from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded java.net.InetAddress$CacheEntry from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded java.net.Inet6Address from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded java.net.SocketException from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded java.net.SocksSocketImpl$5 from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded java.net.Socket$3 from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded java.net.SocketOutputStream from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded java.net.Socket$2 from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded java.net.SocketInputStream from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded java.io.InterruptedIOException from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded java.net.SocketTimeoutException from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded java.io.EOFException from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded com.sun.org.apache.xerces.internal.xni.parser.XMLParseException from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
Error
java.net.SocketException: Unexpected end of file from server
Transformation failed: Run-time errors were reported
[Loaded java.util.IdentityHashMap$KeySet from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded java.util.IdentityHashMap$IdentityHashMapIterator from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
[Loaded java.util.IdentityHashMap$KeyIterator from /usr/lib/jvm/java-6-sun-1.6.0.26/jre/lib/rt.jar]
Please do not ask the same question on every forum you can think of. Choose the most appropriate, ask there, and only turn elsewhere if you get no response. Asking in multiple places is wasting the time of people who volunteer to help you, because they can't see that the question has already been answered elsewhere. Downvoting the question for this reason.
For those looking for an answer:
The answer to this question was that the parser was trying to fetch the HTML DTD from http://www.w3.org/1999/html, as stated in the stylesheet:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:html="http://www.w3.org/1999/html"
version="2.0">
<xsl:output method="text" encoding="utf-8"/>
</xsl:stylesheet>
The W3C server was ignoring the request, and Java threw an exception. The parser gave no indication that it was requesting the DTD, nor that it had failed to receive it, leaving the user in the outer darkness as to why the transformation failed. Very bad behavior for the parser!
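For readers unfamiliar with the mechanism: an XML parser typically goes to the network when the source document carries a DOCTYPE with a remote system identifier. A hypothetical input file might begin like this (the XHTML 1.1 DOCTYPE is an assumption, suggested only by the lib/xhtml11/dtd path in the Ant script below; the real input file is not shown in this thread):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical input: the DOCTYPE's system identifier is a remote URL,
     so a parser without a catalog will try to download the DTD from it. -->
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
    "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Example</title></head>
  <body><p>Some text.</p></body>
</html>
```

The DTD-related and HTTP-related classes in the verbose log above are consistent with this kind of fetch.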
The solution was to tell the parser to use a local DTD via a catalog, like so (this is from an Ant script):
<macrodef name="saxon_use_catalog">
  <attribute name="input"/>
  <attribute name="stylesheet"/>
  <attribute name="output"/>
  <sequential>
    <echo>Transforming @{input} to @{output} using @{stylesheet}</echo>
    <java classname="net.sf.saxon.Transform" fork="true" failonerror="true">
      <jvmarg value="-Dxml.catalog.files=lib/xhtml11/dtd/xhtmlcatalog.xml" />
      <classpath>
        <pathelement location="lib/saxonHE-9.3.0.5.jar"/>
        <pathelement location="lib/resolver.jar"/>
        <pathelement location="lib/xhtml11/dtd/xhtmlcatalog.xml"/>
      </classpath>
      <arg value="-r:org.apache.xml.resolver.tools.CatalogResolver"/>
      <arg value="-x:org.apache.xml.resolver.tools.ResolvingXMLReader"/>
      <arg value="-y:org.apache.xml.resolver.tools.ResolvingXMLReader"/>
      <arg value="-s:@{input}"/>
      <arg value="-xsl:@{stylesheet}"/>
      <arg value="-o:@{output}"/>
    </java>
  </sequential>
</macrodef>
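A call to this macro might look like the following, reusing the paths from the command line in the question. The target name html2text is made up for illustration:

```xml
<!-- Hypothetical target; the saxon_use_catalog macrodef must be defined
     in the same build file before this target runs. -->
<target name="html2text">
  <saxon_use_catalog
      input="build/ebook/epub_sh-tei.html"
      stylesheet="xslt/htm2text.xsl"
      output="build/etemp/html_1.txt"/>
</target>
```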
The resolver jar and catalog I used were packaged with Oxygen.
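The catalog file itself is not reproduced here. As a rough sketch only, an OASIS XML catalog that maps the XHTML 1.1 identifiers to a local copy could look like the following. The identifiers and the relative file name xhtml11.dtd are assumptions, and the catalog shipped with Oxygen may well differ:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
  <!-- Assumed mapping: resolve the public and system identifiers of the
       XHTML 1.1 DTD to a local file stored next to this catalog. -->
  <public publicId="-//W3C//DTD XHTML 1.1//EN" uri="xhtml11.dtd"/>
  <system systemId="http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd" uri="xhtml11.dtd"/>
</catalog>
```

Relative uri values are resolved against the location of the catalog file. The XHTML 1.1 DTD pulls in further module files, so a complete catalog needs entries for those as well.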
Scott