Filter Pandas DataFrame by Column Values – All Possible Ways | BrontoWise

Imagine this: You’re analyzing customer data for an e-commerce store, and your boss asks you, “Can you pull out all orders from customers in the USA?”

If you’re working with pandas in Python, filtering a DataFrame by column values is something you’ll do all the time. But did you know there are multiple ways to do it?

Today, we’ll break it down step by step—so the next time you need to filter data, you’ll know exactly which method to use! 🔥


1️⃣ The Basics – Filtering with Boolean Indexing

The simplest way to filter a DataFrame is by using Boolean indexing.

Example:

Let’s say we have an e-commerce order dataset:

import pandas as pd  

# Sample DataFrame  
data = {
    'OrderID': [101, 102, 103, 104, 105],
    'Customer': ['Alice', 'Bob', 'Charlie', 'David', 'Eve'],
    'Country': ['USA', 'Canada', 'USA', 'UK', 'India'],
    'Amount': [200, 150, 450, 300, 500]
}

df = pd.DataFrame(data)  
print(df)

Now, let’s filter all orders from customers in the USA:

df_usa = df[df['Country'] == 'USA']
print(df_usa)

🎯 When to use?

  • Best for simple, quick filters
  • Works for exact matches

Limitations:

  • Doesn’t handle multiple conditions efficiently
  • Can’t filter by multiple values directly

2️⃣ Using .query() – The SQL-Like Way

If you love SQL, you’ll love query(). It makes filtering feel like writing a SQL WHERE clause.

df_usa_query = df.query("Country == 'USA'")
print(df_usa_query)

🎯 When to use?

  • If you prefer SQL-like syntax
  • If you have long conditions

3️⃣ Filtering with .isin() – When You Need Multiple Values

Let’s say your boss now asks, “Give me all orders from the USA and Canada.”

Here’s how you do it:

df_is_in = df[df['Country'].isin(['USA', 'Canada'])]
print(df_is_in)

🎯 When to use?

  • When filtering by multiple values
  • Super efficient for large datasets

4️⃣ Using .loc[] – The Explicit & Readable Approach

Another way is using loc[], which makes things explicit and easy to read:

df_usa_loc = df.loc[df['Country'] == 'USA']
print(df_usa_loc)

🎯 When to use?

  • When you need clear, readable code
  • If you’re working with row and column selections together

5️⃣ Filtering with Regular Expressions (str.contains())

Your boss says, “Give me all orders from countries that contain the letter ‘a’.”

🔍 Here’s how you use regex filtering:

df_a_countries = df[df['Country'].str.contains('nd', case=False, na=False)]
print(df_a_countries)

🎯 When to use?

  • When filtering based on patterns or partial matches
  • Works great for text-heavy datasets

6️⃣ Filtering with Lambda Functions (For Complex Conditions)

Now, your boss asks: “Find all customers whose names start with ‘A’ or ‘B’.”

Here’s a cool trick using lambda functions:

df_ab_customers = df[df['Customer'].apply(lambda x: x.startswith(('A', 'B')))]
print(df_ab_customers)

🎯 When to use?

  • When filtering requires custom logic
  • If built-in methods don’t work

7️⃣ Using .apply() for Advanced Filtering

Let’s say you want to filter all high-value orders (above $300) in the USA.

df_high_value_usa = df[df.apply(lambda row: row['Amount'] > 300 and row['Country'] == 'USA', axis=1)]
print(df_high_value_usa)

🎯 When to use?

  • When multiple columns affect filtering
  • When you need complex conditions

🚀 Conclusion: Which One Should You Use?

MethodBest Use Case
df[df['col'] == value]Simple filtering
query("col == 'value'")SQL-like syntax
df[df['col'].isin([...])]Multiple values
df.loc[df['col'] == value]Explicit, readable
df[df['col'].str.contains('pattern')]Text-based filtering
df[df['col'].apply(lambda x: condition)]Custom conditions

💡 Pro Tip: For large datasets, isin() and Boolean indexing are the fastest. 🚀


Now that you know all possible ways to filter pandas DataFrames, which method do you use the most? 🤔 Drop a comment below and let’s discuss! 👇


Stay tuned for more Python tutorials at BrontoWise! ✨ 🚀

Advertisements

Leave a comment

Website Powered by WordPress.com.

Up ↑

Discover more from BrontoWise

Subscribe now to keep reading and get access to the full archive.

Continue reading