As suggested elsewhere, this is usually not what you want to do. It is usually better to create a temporary file with a safe method such as File.createTempFile().
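A minimal sketch of the temporary-file approach, assuming the caller only needs a safe place to write and does not care what the file is called. The class name, the "upload-" prefix, and the ".tmp" suffix are illustrative and not part of the original answer.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

public class TempFileExample {
    /** Writes content to a freshly created temporary file and returns that file. */
    public static File writeToTempFile(String content) {
        try {
            // The JVM picks a unique name in the default temp directory, so no
            // user-supplied text ever becomes part of the path.
            File temp = File.createTempFile("upload-", ".tmp");
            temp.deleteOnExit();
            Files.write(temp.toPath(), content.getBytes(StandardCharsets.UTF_8));
            return temp;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        File temp = writeToTempFile("hello");
        System.out.println("Wrote " + temp.length() + " bytes to " + temp.getName());
    }
}
```

If the user's original name still matters, it can be stored separately (for example in a database column) rather than used on disk.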
You should not do this with a whitelist that keeps only the “good” characters. If the file name consists only of Chinese characters, a whitelist will strip all of it. For that reason a whitelist is not an option; we have to use a blacklist.
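To make the whitelist problem concrete, here is a small sketch. The whitelist pattern (ASCII letters, digits, dot, underscore, hyphen) is a hypothetical example of what people typically write, and the blacklist pattern is my own regex rendering of the characters Windows rejects. The Chinese name is written with Unicode escapes so the source compiles regardless of file encoding.

```java
public class WhitelistVsBlacklist {
    // Hypothetical whitelist: keep only ASCII letters, digits, '.', '_' and '-'.
    public static String whitelist(String name) {
        return name.replaceAll("[^A-Za-z0-9._-]", "");
    }

    // Blacklist: remove control characters 0-31 and " < > | : * ? \ /
    public static String blacklist(String name) {
        return name.replaceAll("[\\x00-\\x1F\"<>|:*?\\\\/]", "");
    }

    public static void main(String[] args) {
        String name = "\u6587\u4EF6\u540D.txt"; // three Chinese characters + ".txt"
        System.out.println(whitelist(name));              // prints ".txt"
        System.out.println(blacklist(name).equals(name)); // prints "true"
    }
}
```

The whitelist leaves nothing but the extension, while the blacklist leaves the name untouched.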
Linux allows pretty much anything, which can be a real pain. I would just restrict Linux to the same list you use for Windows, to save yourself headaches in the future.
Using this C# snippet on Windows, I produced the list of characters that are not valid in Windows file names. There are more characters in this list than you might expect (41), so I would not recommend building your own list by hand.
    // Requires: using System; using System.IO;
    foreach (char c in Path.GetInvalidFileNameChars())
    {
        Console.Write((int)c);
        Console.Write(",");
    }
Here is a simple Java class that cleans a file name by removing those characters.
    import java.util.Arrays;

    public class FileNameCleaner {
        // Character codes reported as invalid by Path.GetInvalidFileNameChars() on Windows.
        final static int[] illegalChars = {34, 60, 62, 124, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 58, 42, 63, 92, 47};

        static {
            // Arrays.binarySearch requires a sorted array.
            Arrays.sort(illegalChars);
        }

        public static String cleanFileName(String badFileName) {
            StringBuilder cleanName = new StringBuilder();
            for (int i = 0; i < badFileName.length(); i++) {
                int c = (int) badFileName.charAt(i);
                // A negative result means c is not in the blacklist, so keep it.
                if (Arrays.binarySearch(illegalChars, c) < 0) {
                    cleanName.append((char) c);
                }
            }
            return cleanName.toString();
        }
    }
EDIT: As Stephen suggested, you should probably also make sure that these file accesses happen only inside the directory you allow.
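One common way to enforce that is to resolve the name against the allowed directory, normalize the result, and check that it still lies under that directory. The sketch below assumes java.nio.file is available (Java 7+); the class name, method name, and the "uploads" directory are hypothetical.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class DirectoryGuard {
    /** Resolves fileName against baseDir and rejects any result outside baseDir. */
    public static Path resolveInside(Path baseDir, String fileName) {
        Path base = baseDir.toAbsolutePath().normalize();
        // normalize() collapses "." and ".." segments before the check.
        Path resolved = base.resolve(fileName).normalize();
        // Path.startsWith compares whole path elements, so a sibling such as
        // "uploads-evil" does not count as being inside "uploads".
        if (!resolved.startsWith(base)) {
            throw new SecurityException("Path escapes the allowed directory: " + fileName);
        }
        return resolved;
    }

    public static void main(String[] args) {
        Path base = Paths.get("uploads"); // hypothetical allowed directory
        System.out.println(resolveInside(base, "report.txt"));
        try {
            resolveInside(base, "../secret.txt");
        } catch (SecurityException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

This check is purely lexical and does not follow symbolic links. If links inside the directory are a concern, comparing Path.toRealPath() results on existing files may be a better fit.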
The following answer contains sample code for creating a custom security context in Java and then running code inside that sandbox:
How to create a secure sandbox isolated JEXL (scripts)?