Here is the Java 8 code using threads:

    Set<String> getFields( Path xml ) {
       final Set<String> fields = new HashSet<>();
       for( ... ) {
          ...
          fields.add( ... );
          ...
       }
       return fields;
    }

    void scan() {
       final SortedSet<Path> files = new TreeSet<>();
       final Path root = new File( "....." ).toPath();
       final BiPredicate<Path, BasicFileAttributes> pred =
          (p,a) -> p.toString().toLowerCase().endsWith( ".xml" );
       Files.find( root, 1, pred ).forEach( files::add );
       final SortedSet<String> fields = new TreeSet<>();
       files
          .stream()
          .parallel()
          .map( this::getFields )
          .forEach( s -> fields.addAll( s ));
       // Do something with fields...
    }
I want to combine the output of map( this::getFields ), i.e. a Stream<Set<String>>, into a single Set<String>, and I'm not sure whether forEach is used correctly here: the lambda calls addAll on a plain TreeSet from a parallel stream.
EDIT after Jon Skeet's answer, summarizing the comments, with code that compiles:

    Stream<String> getFields( Path xml ) {
       final Set<String> fields = new HashSet<>();
       for( ... ) {
          ...
          fields.add( ... );
          ...
       }
       return fields.stream(); // returns a stream to ease integration
    }

    void scan() {
       final Path root = new File( "....." ).toPath();
       final BiPredicate<Path, BasicFileAttributes> pred =
          (p,a) -> p.toString().toLowerCase().endsWith( ".xml" );
       final SortedSet<Path> files =
          Files
             .find( root, 1, pred )
             .collect( Collectors.toCollection( TreeSet::new ));
       final SortedSet<String> fields =
          files
             .stream()
             .parallel()
             .flatMap( this::getFields )
             .collect( Collectors.toCollection( TreeSet::new ));
       // Do something with fields...
    }
The two streams could be merged into one, but files is reused later, so it is collected separately.
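For completeness, here is a rough sketch of what the merged variant might look like when the intermediate set of files is not needed afterwards. The class name MergedScan, the root parameter, and the body of getFields (which simply treats every non-blank line as a field name) are placeholders for illustration, not the real code. The try-with-resources is there because the stream returned by Files.find holds an open directory handle.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.SortedSet;
import java.util.TreeSet;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class MergedScan {

    // Hypothetical stand-in for the real getFields(): every non-blank line
    // of the file is treated as one field name.
    static Stream<String> getFields(Path xml) {
        try {
            return Files.readAllLines(xml).stream()
                    .map(String::trim)
                    .filter(line -> !line.isEmpty());
        } catch (IOException e) {
            // Lambdas used in a stream cannot throw checked exceptions.
            throw new UncheckedIOException(e);
        }
    }

    // Single pipeline: find the *.xml files, extract their fields in parallel,
    // and let collect() build the sorted set instead of mutating one in forEach.
    static SortedSet<String> scan(Path root) throws IOException {
        try (Stream<Path> files = Files.find(root, 1,
                (p, a) -> p.toString().toLowerCase().endsWith(".xml"))) {
            return files
                    .parallel()
                    .flatMap(MergedScan::getFields)
                    .collect(Collectors.toCollection(TreeSet::new));
        }
    }

    public static void main(String[] args) throws IOException {
        // Demo data in a temporary directory (assumed layout, for illustration only).
        Path root = Files.createTempDirectory("scan-demo");
        Files.write(root.resolve("a.xml"), Arrays.asList("name", "id"));
        Files.write(root.resolve("b.XML"), Arrays.asList("id", "date"));
        Files.write(root.resolve("c.txt"), Arrays.asList("ignored"));
        System.out.println(scan(root)); // prints [date, id, name]
    }
}
```

Because collect() gives each worker thread its own TreeSet and merges them at the end, no shared collection is modified concurrently, unlike the forEach/addAll version at the top.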
Aubin