I am trying to resample a pandas data frame with a timestamp index to an hourly frequency. I want the most common value for a column that holds strings. However, the built-in resampling functions for time series do not include the mode among the default methods (as they do "mean" and "count"). I tried defining my own function and passing it in, but that does not work. I also tried np.bincount, but that does not work either, since I am working with strings.
Here's what my data looks like:
                     station_arrived action     lat1      lon1
date_removed
2012-01-01 13:12:00               56      A  19.4171 -99.16561
2012-01-01 13:12:00               56      A  19.4271 -99.16361
2012-01-01 15:41:00               56      A  19.4171 -99.16561
2012-01-02 08:41:00               56      C  19.4271 -99.16561
2012-01-02 11:36:00               56      C  19.2171 -99.16561
This is my code:
def mode1(algo):
    common = [ite for ite, it in Counter(algo).most_common(1)]  # Returns all unique items and their counts
    return common

hourlycount2 = travels2012.resample('H', how={'station_arrived': 'count',
                                              'action': mode(travels2012['action']),
                                              'lat1': 'count',
                                              'lon1': 'count'})
hourlycount2.head()
I see the following error:
Traceback (most recent call last):
  File "<stdin>", line 3, in <module>
  File "C:\Program Files\Anaconda\lib\site-packages\pandas\core\generic.py", line 2836, in resample
    return sampler.resample(self).__finalize__(self)
  File "C:\Program Files\Anaconda\lib\site-packages\pandas\tseries\resample.py", line 83, in resample
    rs = self._resample_timestamps()
  File "C:\Program Files\Anaconda\lib\site-packages\pandas\tseries\resample.py", line 277, in _resample_timestamps
    result = grouped.aggregate(self._agg_method)
  File "C:\Program Files\Anaconda\lib\site-packages\pandas\core\groupby.py", line 2404, in aggregate
    result[col] = colg.aggregate(agg_how)
  File "C:\Program Files\Anaconda\lib\site-packages\pandas\core\groupby.py", line 2076, in aggregate
    ret = self._aggregate_multiple_funcs(func_or_funcs)
  File "C:\Program Files\Anaconda\lib\site-packages\pandas\core\groupby.py", line 2125, in _aggregate_multiple_funcs
    results[name] = self.aggregate(func)
  File "C:\Program Files\Anaconda\lib\site-packages\pandas\core\groupby.py", line 2073, in aggregate
    return getattr(self, func_or_funcs)(*args, **kwargs)
  File "C:\Program Files\Anaconda\lib\site-packages\pandas\core\groupby.py", line 486, in __getattr__
    (type(self).__name__, attr))
AttributeError: 'SeriesGroupBy' object has no attribute 'A '
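For reference, here is a minimal sketch of what I am trying to achieve, rebuilt from the sample rows above. It rests on some assumptions: it targets a recent pandas release, where the `how=` argument is no longer available and `.resample(...).agg(...)` is used instead; the helper name `most_common` is made up for this illustration; and `"60min"` is used only as a spelling of hourly bins that should be accepted across versions. The traceback suggests that the dictionary held the result of calling the function on the whole column, which pandas then seems to have looked up as a method name, so the sketch passes the function object itself.

```python
from collections import Counter

import pandas as pd


def most_common(values):
    """Return the most frequent value in a group, or None if the group is empty."""
    counts = Counter(values.dropna())
    return counts.most_common(1)[0][0] if counts else None


# Sample rows copied from the data shown above.
index = pd.to_datetime([
    "2012-01-01 13:12:00",
    "2012-01-01 13:12:00",
    "2012-01-01 15:41:00",
    "2012-01-02 08:41:00",
    "2012-01-02 11:36:00",
])
travels2012 = pd.DataFrame(
    {
        "station_arrived": [56, 56, 56, 56, 56],
        "action": ["A", "A", "A", "C", "C"],
        "lat1": [19.4171, 19.4271, 19.4171, 19.4271, 19.2171],
        "lon1": [-99.16561, -99.16361, -99.16561, -99.16561, -99.16561],
    },
    index=index,
)
travels2012.index.name = "date_removed"

# Pass the function object (no parentheses) so it is applied to each hourly bin.
hourlycount2 = travels2012.resample("60min").agg(
    {
        "station_arrived": "count",
        "action": most_common,
        "lat1": "count",
        "lon1": "count",
    }
)
print(hourlycount2.head())
```

Hours with no rows get a count of 0 and a missing value for `action`, because the helper returns None for an empty group.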