Getting meaningful text from java.io.Reader

I am writing a program that uses another company's library to download reports from their website. I want to analyze these reports before writing them to a file, because I want to ignore any that meet certain criteria.

The problem is that their download() method returns a java.io.Reader. The only method available to me is

int read(char[] cbuf); 

Printing the array that this fills gives me meaningless characters. I want to be able to determine which character set I am working with, or convert to an array of bytes, but I cannot figure out how to do this. I tried

    // retrievedFile is my Reader object
    char[] cbuf = new char[2048];
    int numChars = retrievedFile.read(cbuf);
    // I've tried other character sets, too
    new String(cbuf).getBytes("UTF-8");

and I'm afraid to cast to a more useful Reader, because I have no way of knowing for sure whether that will work. Any suggestions?

EDIT

When I say that it prints "meaningless characters", I do not mean anything like the example given by Jon Skeet. It is hard to describe because I am not at my machine right now, but I think it is an encoding problem. The characters seem to have padding and structure similar to the layout of the reports. I will try these suggestions as soon as I get back on Tuesday (I am only an intern, so I did not bother setting up a remote account or anything).

+4
5 answers

Try the following:

    BufferedReader in = new BufferedReader(retrievedFile);
    String line = null;
    StringBuilder rslt = new StringBuilder();
    while ((line = in.readLine()) != null) {
        // readLine() strips the line terminator, so add it back
        rslt.append(line).append('\n');
    }
    System.out.println(rslt.toString());

Do not cast the Reader to any particular class, because you do not know its real type. Wrap it in a BufferedReader instead. BufferedReader accepts any subclass of java.io.Reader as its constructor argument, so this is safe.

+14

Printing a char[] will probably give you something like:

    [C@1c8825a5

This is just the normal output of calling toString on a char array in Java. It looks like you want to convert it to a String, which you can do with the String(char[]) constructor. Here is some sample code:

    public class Test {
        public static void main(String[] args) {
            char[] chars = "hello".toCharArray();
            System.out.println((Object) chars); // prints something like [C@1c8825a5
            String text = new String(chars);
            System.out.println(text);           // prints hello
        }
    }

On the other hand, java.io.Reader does not have a read method that returns char[]. It has methods that either return one character at a time or (more usefully) take a char[] to fill with data and return the number of characters read. That is actually what your sample code shows. You just need to use the char array and the number of characters read to create a new String. For instance:

    char[] buffer = new char[4096];
    int charsRead = reader.read(buffer);
    String text = new String(buffer, 0, charsRead);
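
One caveat with a single read like the one above: read returns -1 at end of stream, and passing -1 as the count to the String constructor throws. A minimal sketch of a guarded single read follows; the class name SingleRead and the helper readChunk are hypothetical, and a StringReader stands in for the library's Reader.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

class SingleRead {
    // Hypothetical helper: performs ONE read call and returns whatever it got.
    // It does not try to drain the reader.
    static String readChunk(Reader reader, int bufferSize) throws IOException {
        char[] buffer = new char[bufferSize];
        int charsRead = reader.read(buffer);
        if (charsRead < 0) {
            // -1 means end of stream: nothing was read
            return "";
        }
        return new String(buffer, 0, charsRead);
    }

    public static void main(String[] args) throws IOException {
        // StringReader is only a stand-in for the Reader returned by download()
        System.out.println(readChunk(new StringReader("hello"), 4096));
        System.out.println(readChunk(new StringReader(""), 4096).isEmpty());
    }
}
```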

However, note that a single call may not return all the data in one go. You can read line by line using BufferedReader, or loop until everything has been read. Guava has useful code for this in its CharStreams class. For example:

 String allText = CharStreams.toString(reader); 

or

 List<String> lines = CharStreams.readLines(reader); 
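
If Guava is not an option, the JDK can do something similar on Java 10 or later, where Reader gained a transferTo(Writer) method. The following is only a sketch under that version assumption; ReadAllText and readAll are hypothetical names, and the StringReader stands in for the real Reader.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;

class ReadAllText {
    // Hypothetical helper: drains the reader into a String using only the JDK.
    static String readAll(Reader reader) throws IOException {
        StringWriter out = new StringWriter();
        reader.transferTo(out); // Reader.transferTo exists since Java 10
        return out.toString();
    }

    public static void main(String[] args) throws IOException {
        String text = readAll(new StringReader("line one\nline two\n"));
        System.out.print(text);
    }
}
```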
+4

What meaningless characters does it give? Probably null characters, because you read at most 2048 characters from the reader rather than all of them, and you ignore the return value of the read method (which tells you how many characters were actually read).
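
The padding effect can be reproduced in isolation. This is an illustrative sketch only: PaddingDemo and its two methods are hypothetical names, and a short StringReader plays the role of the downloaded report.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

class PaddingDemo {
    // Converts the WHOLE buffer, ignoring how many chars were actually read.
    static String ignoringCount(Reader reader) throws IOException {
        char[] cbuf = new char[2048];
        reader.read(cbuf);
        return new String(cbuf);
    }

    // Converts only the chars that read() reported.
    static String usingCount(Reader reader) throws IOException {
        char[] cbuf = new char[2048];
        int numChars = reader.read(cbuf);
        return numChars < 0 ? "" : new String(cbuf, 0, numChars);
    }

    public static void main(String[] args) throws IOException {
        String padded = ignoringCount(new StringReader("report"));
        String exact = usingCount(new StringReader("report"));
        System.out.println(padded.length());              // 2048
        System.out.println(exact.length());               // 6
        System.out.println(padded.charAt(6) == '\u0000'); // true
    }
}
```

The unused tail of the buffer keeps its default value, the null character, which many consoles render as blanks or odd glyphs.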

If you want to read everything into a String, you have to loop until the return value is negative, appending the characters read at each iteration (from 0 to numChars) to a StringBuilder.

    char[] cbuf = new char[2048];
    StringBuilder builder = new StringBuilder();
    int numChars;
    // read() returns -1 once the end of the stream is reached
    while ((numChars = reader.read(cbuf)) >= 0) {
        builder.append(cbuf, 0, numChars);
    }
    String s = builder.toString();
+1

Wrap it in something more useful, like BufferedReader (StringReader will not work here, since it reads from a String rather than wrapping another Reader):

http://docs.oracle.com/javase/6/docs/api/

0

Since the file is a text file, create a BufferedReader from your Reader and read it one line at a time. That should help.
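
As a rough sketch of how that could serve the original goal (inspect a report before deciding whether to write it to a file): the class ReportFilter, both method names, and the "STATUS: EMPTY" marker are all hypothetical, since the real skip criteria are not given in the question.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

class ReportFilter {
    // Reads every line from the reader; BufferedReader accepts any Reader.
    static List<String> readLines(Reader reader) throws IOException {
        List<String> lines = new ArrayList<>();
        BufferedReader in = new BufferedReader(reader);
        String line;
        while ((line = in.readLine()) != null) {
            lines.add(line);
        }
        return lines;
    }

    // Hypothetical criterion: ignore a report if any line starts with the marker.
    static boolean shouldIgnore(List<String> lines, String marker) {
        for (String line : lines) {
            if (line.startsWith(marker)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) throws IOException {
        // StringReader stands in for the Reader returned by download()
        List<String> lines = readLines(new StringReader("STATUS: EMPTY\nno rows\n"));
        if (shouldIgnore(lines, "STATUS: EMPTY")) {
            System.out.println("ignored");
        } else {
            System.out.println("would write " + lines.size() + " lines to a file");
        }
    }
}
```

Holding the lines in memory first means the decision is made before anything touches the disk, at the cost of buffering the whole report.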

0

Source: https://habr.com/ru/post/1388665/

