I have standalone data loggers (not connected to a computer) that collect data in the field. The data is stored as text files, and I manually combine and organize the files. The current format is one CSV file per year per logger. Each file is about 4,000,000 lines x 7 loggers x 5 years = a lot of data. Some of the columns are categorical (item_type, item_class, item_dimension_class), and the rest are more unique per record, such as item_weight, item_color, date_collected, etc.
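For reference, here's a minimal sketch of how I picture the table and the bulk load. The table name, column names, and connection string are just my working assumptions based on the CSV layout above:

```python
import glob
import psycopg2

# Placeholder DSN; table and column names are assumptions based on
# the CSV layout described above.
conn = psycopg2.connect("dbname=fielddata user=me")
cur = conn.cursor()

cur.execute("""
    CREATE TABLE IF NOT EXISTS readings (
        item_type            text,
        item_class           text,
        item_dimension_class text,
        item_weight          numeric,
        item_color           text,
        date_collected       timestamp
    )
""")

# Bulk-load one CSV per logger per year; COPY is far faster than
# row-by-row INSERTs at ~4M lines per file.
for path in glob.glob("logger_*_20??.csv"):
    with open(path) as f:
        cur.copy_expert(
            "COPY readings FROM STDIN WITH (FORMAT csv, HEADER true)",
            f,
        )

conn.commit()
cur.close()
conn.close()
```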
I currently do statistical analysis of the data with a Python/NumPy/matplotlib program I wrote. It works great, but the problem is that I'm the only one who can use it, since both the program and the data live on my computer.
I would like to publish the data on the web using a PostgreSQL database; however, I need to find or implement a statistical tool that can take a large Postgres table and return statistical results in a reasonable amount of time. I'm not familiar with Python for the web, but I do know PHP on the website side and Python on the standalone side.
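To make the goal concrete, here is the kind of endpoint I imagine, sketched in Flask only because it's compact; I haven't done Python on the web, so treat every name here (route, parameters, DSN) as illustrative:

```python
import psycopg2
from flask import Flask, jsonify, request

app = Flask(__name__)

# Illustrative only: summary statistics for one color over a week range.
@app.route("/stats")
def stats():
    color = request.args.get("color", "blue")
    week_from = int(request.args.get("week_from", 1))
    week_to = int(request.args.get("week_to", 52))

    conn = psycopg2.connect("dbname=fielddata user=me")  # placeholder DSN
    cur = conn.cursor()
    cur.execute(
        """
        SELECT count(*), avg(item_weight), stddev(item_weight)
        FROM readings
        WHERE item_color = %s
          AND date_part('week', date_collected) BETWEEN %s AND %s
        """,
        (color, week_from, week_to),
    )
    n, mean, sd = cur.fetchone()
    cur.close()
    conn.close()
    return jsonify(count=n, mean=float(mean or 0), stddev=float(sd or 0))

if __name__ == "__main__":
    app.run()
```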
Users should be able to build their own histograms and data analyses. For example, one user might search for all blue items sent between week x and week y, while another might want all items broken down by weight, by hour, across the whole year (see the sketch below for what I mean).
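A minimal sketch of those two example queries against the assumed schema above; the column names, the weight range for the histogram buckets, and the year are all my guesses:

```python
import psycopg2

conn = psycopg2.connect("dbname=fielddata user=me")  # placeholder DSN
cur = conn.cursor()

# User 1: histogram of weights for blue items between weeks x and y.
# width_bucket() bins item_weight into 20 equal buckets over [0, 100);
# the range is an assumption about the data.
cur.execute(
    """
    SELECT width_bucket(item_weight, 0, 100, 20) AS bucket, count(*)
    FROM readings
    WHERE item_color = 'blue'
      AND date_part('week', date_collected) BETWEEN %s AND %s
    GROUP BY bucket
    ORDER BY bucket
    """,
    (10, 20),  # weeks x and y
)
histogram = cur.fetchall()

# User 2: average weight of all items, by hour of day, over one year
# (2012 is an arbitrary example year).
cur.execute(
    """
    SELECT date_part('hour', date_collected) AS hour, avg(item_weight)
    FROM readings
    WHERE date_part('year', date_collected) = 2012
    GROUP BY hour
    ORDER BY hour
    """
)
by_hour = cur.fetchall()

cur.close()
conn.close()
```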
I thought about writing my own statistical tools and pre-computing or indexing results to cover the most common queries, or automating the process in some other way, but that seemed inefficient.
I look forward to your ideas.
thanks