I'm not sure why you want to use `pipe` for this operation. `pipe` is intended to simplify the syntax for passing a DataFrame through a chain of functions, each of which modifies the incoming DataFrame (see docs).
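As a rough sketch of the kind of chaining `pipe` is meant for: the two helper functions and the small frame below are made up for illustration and are not part of the question.

```python
import pandas as pd

# Hypothetical helpers, named only for this illustration.
def add_quarter(frame):
    """Return a copy with a 'Quarter' column derived from the DatetimeIndex."""
    return frame.assign(Quarter=frame.index.quarter)

def keep_columns(frame, columns):
    """Return only the requested columns."""
    return frame[columns]

demo = pd.DataFrame(
    {'region': ['USA', 'Non-USA', 'USA']},
    index=pd.to_datetime(['2014-01-31', '2014-04-30', '2014-07-31']),
)

# Nested calls read inside-out ...
nested = keep_columns(add_quarter(demo), ['Quarter'])

# ... while pipe reads left to right; extra arguments are forwarded to the function.
piped = demo.pipe(add_quarter).pipe(keep_columns, ['Quarter'])

print(piped.equals(nested))
```

Both spellings call the same functions in the same order; `pipe` only changes how the chain reads.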
What you are trying to do is filter a DataFrame on several conditions (boolean masks).
To illustrate why using `pipe` for this operation is somewhat cumbersome, first generate some sample data:
    import numpy as np
    import pandas as pd

    # pd.np was removed from pandas; use numpy directly
    np.random.seed(123)

    # Generate some data ('M' is month-end; pandas >= 2.2 prefers 'ME')
    dates = pd.date_range('2014-01-01', '2015-12-31', freq='M')
    df = pd.DataFrame({'region': np.random.choice(['USA', 'Non-USA'], len(dates))},
                      index=dates)
    df['Month'] = df.index.month
    print(df.head())

                 region  Month
    2014-01-31      USA      1
    2014-02-28  Non-USA      2
    2014-03-31      USA      3
    2014-04-30      USA      4
    2014-05-31      USA      5
Your original filter gives:
    # Drop 2014, then keep March-May rows for the USA region
    df_a = df[df.index.year != 2014]
    df_b = df_a[(df_a['Month'].isin([3, 4, 5])) & (df_a['region'] == 'USA')]
    print(df_b)

               region  Month
    2015-03-31    USA      3
    2015-05-31    USA      5
Here is how you could use `pipe` to get the same output:
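    def masker(df, mask):
        return df[mask]

    # All three masks are computed on the full df. mask2 and mask3 are Series,
    # so pandas aligns them to the already-filtered frame (it may warn about
    # reindexing the boolean key).
    mask1 = df.index.year != 2014
    mask2 = df['Month'].isin([3, 4, 5])
    mask3 = df['region'] == 'USA'
    print(df.pipe(masker, mask1).pipe(masker, mask2).pipe(masker, mask3))

               region  Month
    2015-03-31    USA      3
    2015-05-31    USA      5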
However, in this particular case pandas can handle the filtering far more simply, by combining the masks with `&`:
    print(df[mask1 & mask2 & mask3])

               region  Month
    2015-03-31    USA      3
    2015-05-31    USA      5
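If the number of masks is not fixed in advance, the same `&` combination can be folded over a list. This is only a sketch under assumptions: the small `demo` frame is made up for illustration, and `functools.reduce` with `operator.and_` is one way to do the fold, not something taken from the question.

```python
from functools import reduce
import operator

import pandas as pd

# Small hand-written frame, used here instead of the random data above.
demo = pd.DataFrame(
    {'region': ['USA', 'USA', 'Non-USA', 'USA'],
     'Month': [3, 3, 4, 7]},
    index=pd.to_datetime(['2014-03-31', '2015-03-31', '2015-04-30', '2015-07-31']),
)

# Every mask is built from the same frame, so all of them line up row for row.
masks = [
    demo.index.year != 2014,
    demo['Month'].isin([3, 4, 5]),
    demo['region'] == 'USA',
]

# Fold the list with &, equivalent to masks[0] & masks[1] & masks[2].
combined = reduce(operator.and_, masks)
result = demo[combined]
print(result)
```

Because the masks are combined before indexing, the frame is filtered once and no mask has to be realigned to an intermediate result.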