Convert URL to normal Windows filename (Java)

Exception or error:

Is there a way to convert this:

/C:/Users/David/Dropbox/My%20Programs/Java/Test/bin/myJar.jar

into this?:

C:\Users\David\Dropbox\My Programs\Java\Test\bin\myJar.jar

I am using the following code, which will return the full path of the .JAR archive, or the /bin directory.

fullPath = new String(MainInterface.class.getProtectionDomain()
            .getCodeSource().getLocation().getPath());

The problem is that getLocation() returns a URL, and I need a normal Windows filename.
I have tried adding the following after getLocation():

toString() and toExternalForm() both return:

file:/C:/Users/David/Dropbox/My%20Programs/Java/Test/bin/

getPath() returns:

/C:/Users/David/Dropbox/My%20Programs/Java/Test/bin/

Note the %20, which should be converted to a space.

Is there a quick and easy way of doing this?

How to solve:

The current recommendation (with JDK 1.7+) is to convert URL → URI → Path. So to convert a URL to File, you would say Paths.get(url.toURI()).toFile(). If you can’t use JDK 1.7 yet, I would recommend new File(URI.getSchemeSpecificPart()).
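As a minimal sketch of the URL → URI → Path route described above: the class name UrlToFileDemo, the helper toFile, and the temporary directory named "My Programs" are illustrative assumptions, not part of the original question. The example creates a directory whose name contains a space, so that the URL carries a %20 on any platform.

```java
import java.io.File;
import java.io.IOException;
import java.net.URISyntaxException;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

final class UrlToFileDemo {

    /** Converts a file: URL to a File by going URL -> URI -> Path -> File. */
    static File toFile(URL url) {
        try {
            return Paths.get(url.toURI()).toFile();
        } catch (URISyntaxException e) {
            // url.toURI() rejects URLs that are not valid URIs (for example, a raw space).
            throw new IllegalArgumentException("Not a valid URI: " + url, e);
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical directory whose name contains a space.
        Path dir = Files.createTempDirectory("My Programs").toAbsolutePath();
        URL url = dir.toUri().toURL();
        System.out.println(url);          // the space appears as %20 here
        System.out.println(toFile(url));  // native path with the space restored
        Files.delete(dir);
    }
}
```

For the code in the question, the same helper could be applied to the URL returned by getCodeSource().getLocation().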

Converting file → URI: First I’ll show you some examples of what URIs you are likely to get in Java.

                          -classpath URLClassLoader File.toURI()                Path.toUri()
C:\Program Files          file:/C:/Program%20Files/ file:/C:/Program%20Files/   file:///C:/Program%20Files/
C:\main.c++               file:/C:/main.c++         file:/C:/main.c++           file:///C:/main.c++
\\VBOXSVR\Downloads       file://VBOXSVR/Downloads/ file:////VBOXSVR/Downloads/ file://VBOXSVR/Downloads/
C:\Résume.txt             file:/C:/R%c3%a9sume.txt  file:/C:/Résume.txt         file:///C:/Résume.txt
\\?\C:\Windows (non-path) file://%3f/C:/Windows/    file:////%3F/C:/Windows/    InvalidPathException

Some observations about these URIs:

  • The URI specifications are RFC 1738: URL, superseded by RFC 2396: URI, superseded by RFC 3986: URI. (The WHATWG also has a URL spec, but it does not specify how file URIs should be interpreted.) Any reserved characters within the path are percent-quoted, and non-ASCII characters in a URI are percent-quoted when you call URI.toASCIIString().
  • File.toURI() is worse than Path.toUri() because File.toURI() returns an unusual non-RFC 1738 URI (gives file:/ instead of file:///) and does not format URIs for UNC paths according to Microsoft’s preferred format. None of these UNC URIs work in Firefox though (Firefox requires file://///).
  • Path is more strict than File; you cannot construct an invalid Path from the “\\.\” prefix. “These prefixes are not used as part of the path itself,” but they can be passed to Win32 APIs.
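The percent-quoting mentioned in the first observation can be seen with java.net.URI alone. This is a small sketch; the class name UriQuotingDemo and the helper fileUri are illustrative, and the paths are the ones from the table above (the letter é is written as \u00e9 so the source stays pure ASCII).

```java
import java.net.URI;
import java.net.URISyntaxException;

final class UriQuotingDemo {

    /** Builds a file: URI from a slash-separated path; illegal characters such as spaces get percent-quoted. */
    static URI fileUri(String path) {
        try {
            // Multi-argument constructor: scheme, host, path, fragment.
            return new URI("file", null, path, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        URI spaced = fileUri("/C:/Program Files/");
        System.out.println(spaced);                    // file:/C:/Program%20Files/

        URI accented = fileUri("/C:/R\u00e9sume.txt");
        System.out.println(accented);                  // the non-ASCII letter is kept as-is
        System.out.println(accented.toASCIIString());  // file:/C:/R%C3%A9sume.txt
    }
}
```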

Converting URI → file: Let’s try converting the preceding examples to files:

                            new File(URI)            Paths.get(URI)           new File(URI.getSchemeSpecificPart())
file:///C:/Program%20Files  C:\Program Files         C:\Program Files         C:\Program Files
file:/C:/Program%20Files    C:\Program Files         C:\Program Files         C:\Program Files
file:///C:/main.c++         C:\main.c++              C:\main.c++              C:\main.c++
file://VBOXSVR/Downloads/   IllegalArgumentException \\VBOXSVR\Downloads\     \\VBOXSVR\Downloads
file:////VBOXSVR/Downloads/ \\VBOXSVR\Downloads      \\VBOXSVR\Downloads\     \\VBOXSVR\Downloads
file://///VBOXSVR/Downloads \\VBOXSVR\Downloads      \\VBOXSVR\Downloads\     \\VBOXSVR\Downloads
file://%3f/C:/Windows/      IllegalArgumentException IllegalArgumentException \\?\C:\Windows
file:////%3F/C:/Windows/    \\?\C:\Windows           InvalidPathException     \\?\C:\Windows

Again, using Paths.get(URI) is preferred over new File(URI), because Path is able to handle the UNC URI and reject invalid paths with the \\?\ prefix. But if you can’t use Java 1.7, say new File(URI.getSchemeSpecificPart()) instead.
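A rough sketch of the pre-Java-7 fallback, using two URIs from the table above. The class name SchemeSpecificPartDemo and the helper toFile are illustrative. getSchemeSpecificPart() returns the decoded text after "file:", and the File constructor then normalizes separators for whatever platform the code runs on, so the last two lines print differently on Windows and elsewhere.

```java
import java.io.File;
import java.net.URI;

final class SchemeSpecificPartDemo {

    /** Fallback for older JDKs: build the File from the decoded scheme-specific part. */
    static File toFile(URI uri) {
        return new File(uri.getSchemeSpecificPart());
    }

    public static void main(String[] args) {
        URI local = URI.create("file:/C:/Program%20Files");
        URI unc = URI.create("file://VBOXSVR/Downloads/");

        System.out.println(local.getSchemeSpecificPart()); // /C:/Program Files
        System.out.println(unc.getSchemeSpecificPart());   // //VBOXSVR/Downloads/

        // Platform-dependent output: separators follow the local file system.
        System.out.println(toFile(local));
        System.out.println(toFile(unc));
    }
}
```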

By the way, do not use URLDecoder to decode a file URL. For files containing “+” such as “file:///C:/main.c++”, URLDecoder will turn it into “C:\main.c  ”! URLDecoder is only for parsing application/x-www-form-urlencoded HTML form submissions within a URI’s query (param=value&param=value), not for unquoting a URI’s path.
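The “+” problem is easy to reproduce. The sketch below (the class name PlusSignDemo and both helper names are illustrative) decodes the same path once with URLDecoder and once through java.net.URI; the brackets in the output make the trailing spaces visible.

```java
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URLDecoder;

final class PlusSignDemo {

    /** Form decoding: treats '+' as an encoded space, which is wrong for URI paths. */
    static String formDecode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new AssertionError("UTF-8 is always supported", e);
        }
    }

    /** URI path decoding: only percent-escapes are decoded; '+' is left alone. */
    static String uriPath(String uri) {
        return URI.create(uri).getPath();
    }

    public static void main(String[] args) {
        System.out.println("[" + formDecode("/C:/main.c++") + "]");     // [/C:/main.c  ]
        System.out.println("[" + uriPath("file:///C:/main.c++") + "]"); // [/C:/main.c++]
    }
}
```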

2014-09: edited to add examples.

Answer:

String path = "/c:/foo%20bar/baz.jpg";
path = URLDecoder.decode(path, "utf-8");
path = new File(path).getPath();
System.out.println(path); // prints: c:\foo bar\baz.jpg

Answer:

The current answers seem fishy to me.

java.net.URL.getFile

turns a file URL such as this

java.net.URL = file:/C:/some/resource.txt

into this

java.lang.String = /C:/some/resource.txt

so you can use this constructor

new File(url.getFile())

to give you the Windows path

java.io.File = C:\some\resource.txt
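One caveat worth checking before relying on this approach: URL.getFile() does not decode percent-escapes, so it only gives a usable path when the URL contains none. The sketch below illustrates this; the class name GetFileDemo, the helper url, and the second sample URL are made up for the example.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;

final class GetFileDemo {

    /** Parses a URL string, rethrowing the checked exception as an unchecked one. */
    static URL url(String spec) {
        try {
            return URI.create(spec).toURL();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        URL plain = url("file:/C:/some/resource.txt");
        URL spaced = url("file:/C:/My%20Programs/resource.txt");

        System.out.println(plain.getFile());  // /C:/some/resource.txt
        System.out.println(spaced.getFile()); // /C:/My%20Programs/resource.txt (still encoded)
    }
}
```

With the path from the question, new File(url.getFile()) would therefore point at a directory literally named "My%20Programs".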

Answer:

As was mentioned, getLocation() returns a URL. File can easily convert a URI to a path, so for me the simplest way is just to use:

File fullPath = new File(MainInterface.class.getProtectionDomain().
    getCodeSource().getLocation().toURI());

Of course, if you really need a String, just modify it to:

String fullPath = new File(MainInterface.class.getProtectionDomain().
    getCodeSource().getLocation().toURI()).toString();

You don’t need URLDecoder at all.

Answer:

The following code is what you need:

String path = URLDecoder.decode("/C:/Users/David/Dropbox/My%20Programs/Java/Test/bin/", "UTF-8");
System.out.println(new File(path).getPath());

Answer:

Hello confused people from the future. There is a nuance to the file path configuration here. The path you are setting for TESSDATA_PREFIX is used internally by the C++ tesseract program, not by the Java wrapper. This means that if you’re using Windows you will need to remove the leading slash and replace all other forward slashes with backslashes. A very hacky workaround looks like this:

URL pathUrl = this.getClass().getResource(TESS_DATA_PATH);
String pathStr = pathUrl.getPath();

// hack to get around windows using \ instead of /
if (SystemUtils.IS_OS_WINDOWS) {
  pathStr = pathStr.substring(1);
  pathStr = pathStr.replaceAll("/", "\\\\");
}
