How to filter rows of a Julia Array based on the value in a specified column?

I have data like this in a text file:

CLASS     col2    col3    ...
1         ...     ...     ...
1         ...     ...     ...
2         ...     ...     ...
2         ...     ...     ...
2         ...     ...     ...

I load it using the following code:

data = readdlm("file.txt")[2:end, :] # without header line

Now I would like to get an array containing only the rows from class 1.

(The data may be loaded with some other function, if that helps.)

2 answers

Logical indexing is the straightforward way to filter an array:

data[data[:,1] .== 1, :]
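
To make the one-liner above easier to follow, here is a minimal sketch that splits it into two steps. The small matrix is a hypothetical stand-in for the parsed file (column 1 plays the role of CLASS); it is not the asker's actual data.

```julia
# Hypothetical stand-in for the parsed file: column 1 is CLASS.
data = [1 "a" "x";
        1 "b" "y";
        2 "c" "z";
        2 "d" "w";
        2 "e" "v"]

# Element-wise comparison of the first column yields one Bool per row.
mask = data[:, 1] .== 1

# Indexing with the mask keeps only the rows where it is true.
class1 = data[mask, :]
```

The mask has one entry per row, so it can be reused to select the matching entries from any other array with the same number of rows.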

If, however, you read your file in as a data frame, you will have more options available, and it will keep track of your headers:

julia> using DataFrames
julia> df = readtable("file.txt", separator=' ')
5×4 DataFrames.DataFrame
│ Row │ CLASS │ col2  │ col3  │ _     │
├─────┼───────┼───────┼───────┼───────┤
│ 1   │ 1     │ "..." │ "..." │ "..." │
│ 2   │ 1     │ "..." │ "..." │ "..." │
│ 3   │ 2     │ "..." │ "..." │ "..." │
│ 4   │ 2     │ "..." │ "..." │ "..." │
│ 5   │ 2     │ "..." │ "..." │ "..." │

julia> df[df[:CLASS] .== 1, :] # Refer to the column by its header name
2×4 DataFrames.DataFrame
│ Row │ CLASS │ col2  │ col3  │ _     │
├─────┼───────┼───────┼───────┼───────┤
│ 1   │ 1     │ "..." │ "..." │ "..." │
│ 2   │ 1     │ "..." │ "..." │ "..." │
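
The readtable and df[:CLASS] forms above come from older DataFrames releases. As a rough sketch, assuming a recent DataFrames.jl (1.x) is installed, the same row filter might look like the following. The data frame is built in memory with hypothetical values so the example does not depend on file.txt.

```julia
using DataFrames

# Hypothetical in-memory stand-in for the parsed file.
df = DataFrame(CLASS = [1, 1, 2, 2, 2],
               col2  = ["a", "b", "c", "d", "e"])

# Boolean mask on the CLASS column, the same idea as for a plain array.
by_mask = df[df.CLASS .== 1, :]

# filter with a row predicate returns a new data frame with matching rows.
by_filter = filter(row -> row.CLASS == 1, df)
```

Both forms return a new data frame and leave df unchanged.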

The DataFramesMeta package offers further conveniences. Its @where macro gives SQL-style filtering:

julia> using DataFramesMeta
julia> @where(df, :CLASS .== 1)
2×4 DataFrames.DataFrame
│ Row │ CLASS │ col2  │ col3  │ _     │
├─────┼───────┼───────┼───────┼───────┤
│ 1   │ 1     │ "..." │ "..." │ "..." │
│ 2   │ 1     │ "..." │ "..." │ "..." │

Another option is to select the matching row indices with find:

data[find(x -> data[x,1] == 1, 1:size(data)[1]),:]
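
In Julia 1.0 and later, find was replaced by findall, and readdlm lives in the DelimitedFiles standard library (so `using DelimitedFiles` is needed before calling it). A minimal sketch of the index-based approach under those assumptions, using a hypothetical sample matrix instead of the file:

```julia
# Hypothetical sample standing in for readdlm("file.txt")[2:end, :]
data = [1 "a";
        1 "b";
        2 "c"]

# Indices of the rows whose first column equals 1.
rows = findall(x -> x == 1, data[:, 1])

# Select those rows, keeping all columns.
class1 = data[rows, :]
```

Compared with a Boolean mask, this form gives you the row numbers themselves, which can be handy if you also need to know where the matches were.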

Source: https://habr.com/ru/post/1657158/

