Goliath is a web framework, so I assume you plan to ingest these documents over HTTP? Each request runs in its own Ruby fiber, but the server effectively runs in a single reactor.
So, to answer your question: Nokogiri is thread-safe as far as I know, but it doesn't really matter here. What you have to watch out for: while a document is being parsed, the CPU is pegged and Goliath cannot accept any new requests. So you will need to implement the right logic for your specific case (for example, you could stream-parse chunks of data as they arrive from the socket, or load balance across several Goliath servers, or both... :-))
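To make the blocking concern concrete, here is a toy model in plain Ruby. It is only a sketch: it does not use Goliath, EventMachine, or Nokogiri, and `run_reactor` and the request names are made up for illustration. Each "request" is a fiber that handles one chunk and then yields, which is roughly what chunk-by-chunk parsing buys you in a single reactor.

```ruby
# Toy single-threaded "reactor" built on plain Ruby fibers.
# NOT Goliath's API: run_reactor and the request names are hypothetical.
def run_reactor(requests)
  log = []

  # One fiber per request. Each fiber handles a single chunk, then yields
  # so the loop below can give the other requests a turn.
  fibers = requests.map do |name, chunks|
    Fiber.new do
      chunks.each do |chunk|
        log << "#{name}:#{chunk}" # stand-in for parsing one chunk
        Fiber.yield               # hand control back to the reactor loop
      end
      nil
    end
  end

  # Round-robin over the fibers until all of them have finished.
  until fibers.empty?
    fibers.each(&:resume)
    fibers.select!(&:alive?)
  end

  log
end

order = run_reactor("big" => %w[c1 c2 c3], "small" => %w[c1])
puts order.inspect
# → ["big:c1", "small:c1", "big:c2", "big:c3"]
```

The small request gets served after the first chunk of the big one. If the big document were parsed in one go, without yielding, the small request would wait until all of it was done, which is the situation described above.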