Writing Apache Storm Spouts and Bolts Using Python and Pandas

I want to turn an existing Python-based implementation of a data analysis tool that works on an event stream into a Storm topology.

During the research phase, my team and I used Python and pandas to develop a prototype of our tool and found it very useful in terms of programmer productivity. Now we want to create a Storm topology that does the same thing, and we aim to reuse our existing Python modules as bolts, or at least make an informed decision about whether this is feasible and a good idea.
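To make the idea concrete, here is a rough sketch of what I imagine such a bolt would look like. It assumes my reading of Storm's multi-lang protocol is right: JSON messages over stdin/stdout, each terminated by a line containing `end`, with `emit` and `ack` commands sent back. The initial handshake (PID file and setup message) is left out, and `analyze` is a hypothetical stand-in for our existing pandas-based module, not real code from our tool.

```python
import json


def read_message(stream):
    """Read one message: JSON text terminated by a line containing 'end'."""
    lines = []
    for line in stream:
        if line.strip() == "end":
            break
        lines.append(line)
    return json.loads("".join(lines))


def send_message(stream, message):
    """Write one message as JSON followed by the 'end' terminator line."""
    stream.write(json.dumps(message) + "\nend\n")
    stream.flush()


def analyze(values):
    # Hypothetical placeholder: this is where our existing pandas code
    # would be called. Here it just returns the mean of the values.
    return [sum(values) / len(values)]


def handle_tuple(msg, out):
    """Process one incoming tuple: emit an anchored result, then ack the input."""
    result = analyze(msg["tuple"])
    send_message(out, {"command": "emit", "anchors": [msg["id"]], "tuple": result})
    send_message(out, {"command": "ack", "id": msg["id"]})
```

In a real bolt the loop would read from `sys.stdin` and write to `sys.stdout`; every tuple crosses the process boundary as JSON in both directions, which is part of the overhead I am asking about.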

Are there any restrictions on using a Python script that depends on external libraries as a Storm bolt on a cluster? Also, does anyone have a sense of what the performance penalty would be for using an interpreted, non-JVM language like Python instead of Java for our bolts? The pandas library itself is designed for high performance.

thanks


Source: https://habr.com/ru/post/957034/

