Java Regex: how to determine a URL with a file extension

How can I create a regex to determine whether the "String url" contains a file extension (.pdf, .jpeg, .asp, .cfm, ...)?

Valid (without extensions):

Invalid (with extensions):

Thanks, Celso

5 answers

In Java, you are better off using String.endsWith(). It is faster and easier to read. Example:

"file.jpg".endsWith(".jpg") == true 
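The same idea can be extended to several extensions at once. The following is only a minimal sketch: the class name ExtensionCheck, the method hasKnownExtension and the extension list are hypothetical (the list is taken from the question), and it assumes the URL carries no query string or fragment after the file name.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

class ExtensionCheck {
    // Hypothetical list based on the extensions named in the question.
    private static final List<String> EXTENSIONS =
            Arrays.asList(".pdf", ".jpeg", ".asp", ".cfm");

    // Returns true if the string ends with one of the listed extensions,
    // ignoring case. Assumes nothing follows the file name.
    static boolean hasKnownExtension(String url) {
        String lower = url.toLowerCase(Locale.ROOT);
        for (String ext : EXTENSIONS) {
            if (lower.endsWith(ext)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(hasKnownExtension("http://www.example.com/report.PDF"));
        System.out.println(hasKnownExtension("http://www.example.com/report"));
    }
}
```

Note that a URL such as "http://www.example.com/report.pdf?p=1" is not detected by this sketch, because the string no longer ends with the extension.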

How about this?

    // assuming the file extension is either 3 or 4 characters long
    public boolean hasFileExtension(String s) {
        return s.matches("^[\\w\\d\\:\\/\\.]+\\.\\w{3,4}(\\?[\\w\\W]*)?$");
    }

    @Test
    public void testHasFileExtension() {
        assertTrue("3-character extension", hasFileExtension("http://www.yahoo.com/a.pdf"));
        assertTrue("3-character extension", hasFileExtension("http://www.yahoo.com/a.htm"));
        assertTrue("4-character extension", hasFileExtension("http://www.yahoo.com/a.html"));
        assertTrue("3-character extension with param", hasFileExtension("http://www.yahoo.com/a.pdf?p=1"));
        assertTrue("4-character extension with param", hasFileExtension("http://www.yahoo.com/a.html?p=1&p=2"));
        assertFalse("2-character extension", hasFileExtension("http://www.yahoo.com/a.co"));
        assertFalse("2-character extension with param", hasFileExtension("http://www.yahoo.com/a.co?p=1&p=2"));
        assertFalse("no extension", hasFileExtension("http://www.yahoo.com/hello"));
        assertFalse("no extension with param", hasFileExtension("http://www.yahoo.com/hello?p=1&p=2"));
        assertFalse("no extension with param ends with .htm", hasFileExtension("http://www.yahoo.com/hello?p=1&p=a.htm"));
    }

Alternate version without regex but using the URI class:

    import java.net.*;

    class IsFile {
        public static void main( String ... args ) throws Exception {
            URI u = new URI( args[0] );
            for( String ext : new String[] {".png", ".pdf", ".jpg", ".html" } ) {
                if( u.getPath().endsWith( ext ) ) {
                    System.out.println("Yeap");
                    break;
                }
            }
        }
    }

Run it with:

 java IsFile "http://download.oracle.com/javase/6/docs/api/java/net/URI.html#getPath()" 

I'm not a Java developer anymore, but the following regex matches what you are looking for:

 "/\.(pdf|jpe{0,1}g|asp|docx{0,1}|xlsx{0,1}|cfm)$/i" 

I don't know what the corresponding Java function looks like.
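For what it's worth, one possible Java rendering of that regex is sketched below. The /.../i delimiters are not Java syntax, so the sketch assumes the pattern body is passed to Pattern.compile with the CASE_INSENSITIVE flag, and it uses find() because the pattern is only anchored at the end of the string. The class and method names are hypothetical.

```java
import java.util.regex.Pattern;

class ExtensionRegex {
    // Same pattern as above, with the /.../i wrapper replaced by a flag.
    private static final Pattern EXTENSION = Pattern.compile(
            "\\.(pdf|jpe{0,1}g|asp|docx{0,1}|xlsx{0,1}|cfm)$",
            Pattern.CASE_INSENSITIVE);

    // find() searches for the pattern anywhere in the input; the trailing $
    // restricts the hit to the end of the string.
    static boolean endsWithKnownExtension(String url) {
        return EXTENSION.matcher(url).find();
    }

    public static void main(String[] args) {
        System.out.println(endsWithKnownExtension("http://www.yahoo.com/a.JPG"));
        System.out.println(endsWithKnownExtension("http://www.yahoo.com/hello"));
    }
}
```

String.matches would not work here directly, since it requires the whole input to match the pattern rather than just its tail.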


If the following expression returns true, the URL ends with a file extension:

 urlString.matches("\\p{Graph}+\\.\\p{Alpha}{2,4}$"); 

This assumes the file extension is a dot followed by 2, 3, or 4 alphabetic characters.
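One caveat: applied to the whole URL, this pattern also accepts a bare host name such as "http://www.yahoo.com", because ".com" looks like an extension. A possible workaround, sketched here under the assumption that the input parses as a java.net.URI, is to apply the same pattern to the path component only. The class and method names are hypothetical.

```java
import java.net.URI;
import java.net.URISyntaxException;

class PathExtension {
    // Applies the pattern from the answer to the URI path only, so the host
    // name, query string and fragment are ignored.
    static boolean pathHasExtension(String urlString) {
        try {
            String path = new URI(urlString).getPath();
            // getPath() returns null for opaque URIs such as mailto: links.
            return path != null && path.matches("\\p{Graph}+\\.\\p{Alpha}{2,4}");
        } catch (URISyntaxException e) {
            // Treat strings that are not valid URIs as having no extension.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(pathHasExtension("http://www.yahoo.com/a.pdf?p=1"));
        System.out.println(pathHasExtension("http://www.yahoo.com"));
    }
}
```

The trailing $ is dropped in this sketch because String.matches already requires the entire input to match.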


Source: https://habr.com/ru/post/1342197/

