Ever had multiple columns in your DataFrame and thought, “Hmm, wouldn’t it be great if I could just mash these into one clean column?” Whether you’re cleaning names, constructing addresses, or stitching strings together for a custom key — concatenating values in a DataFrame is a go-to move.
Let’s walk through all the nifty ways to do this in Pandas — with real-world style examples (and yes, with a pinch of fun too 😄).
🔹 Scenario 1: Joining Strings from Multiple Columns
Let’s say you have a DataFrame with first and last names.
import pandas as pd
df = pd.DataFrame({
    'first_name': ['Tony', 'Bruce', 'Natasha'],
    'last_name': ['Stark', 'Banner', 'Romanoff']
})
You want Tony Stark, Bruce Banner, etc. Here’s how:
df['full_name'] = df['first_name'] + ' ' + df['last_name']
Output:
| first_name | last_name | full_name |
|---|---|---|
| Tony | Stark | Tony Stark |
| Bruce | Banner | Bruce Banner |
| Natasha | Romanoff | Natasha Romanoff |
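If you'd rather spell out the separator than sprinkle `' '` between `+` signs, `Series.str.cat` is one alternative. Here's a minimal sketch using the same names as above; it should give the same `full_name` column:

```python
import pandas as pd

df = pd.DataFrame({
    'first_name': ['Tony', 'Bruce', 'Natasha'],
    'last_name': ['Stark', 'Banner', 'Romanoff'],
})

# str.cat joins the two columns element-wise, placing `sep` between the values.
df['full_name'] = df['first_name'].str.cat(df['last_name'], sep=' ')
print(df['full_name'].tolist())
```

Both approaches work; `str.cat` mostly pays off when the separator matters or when you're chaining several string operations.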
🔹 Scenario 2: Dealing with Non-String Columns 🧠
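Say your columns hold numbers or contain nulls. Pandas throws a tantrum (a TypeError) if you add a string column to a numeric one without converting it first. Nulls are sneakier: they raise no error, they just turn the result for that row into NaN.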
df = pd.DataFrame({
    'product': ['Widget', 'Gadget'],
    'code': [101, 202]
})
df['product_id'] = df['product'] + '-' + df['code'].astype(str)
✔️ Always remember: Convert non-strings using .astype(str) before concatenating.
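Nulls deserve their own quick look. The sketch below assumes a `product` column with one missing value (made up for illustration) and uses `'unknown'` as an arbitrary placeholder; swap in whatever fits your data:

```python
import pandas as pd

df = pd.DataFrame({
    'product': ['Widget', 'Gadget', None],
    'code': [101, 202, 303],
})

# Plain `+` propagates the missing value: the third row ends up as NaN.
naive = df['product'] + '-' + df['code'].astype(str)

# Filling the gap first keeps every row a string.
# 'unknown' is just a placeholder chosen for this example.
filled = df['product'].fillna('unknown') + '-' + df['code'].astype(str)

print(naive.isna().tolist())
print(filled.tolist())
```

Whether you fill, drop, or leave the NaN in place depends on what the combined column is for, so treat `fillna` as one option rather than the rule.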
🔹 Scenario 3: Using agg() for Flexible Joins
When you’ve got more than 2 columns or want to add a delimiter smartly:
# Raises TypeError: '-'.join only accepts strings, and 'code' holds integers
df['full_id'] = df[['product', 'code']].agg('-'.join, axis=1)
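Wanna keep it readable? Call .astype(str) on the selected columns before agg(), since '-'.join only accepts strings. For NULLs, pass a custom function instead: depending on your pandas version, .astype(str) can turn missing values into the literal text 'nan'.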
df['full_id'] = df[['product', 'code']].astype(str).agg('-'.join, axis=1)
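For the custom-function route, here's a rough sketch. The `region` column and the `join_present` helper are made up for illustration; the idea is simply to skip missing values instead of gluing them into the ID:

```python
import pandas as pd

df = pd.DataFrame({
    'product': ['Widget', 'Gadget'],
    'code': [101, 202],
    'region': ['EU', None],  # hypothetical extra column with a gap
})

def join_present(row, sep='-'):
    # Keep only non-missing values, convert each to str, then join them.
    return sep.join(str(v) for v in row if pd.notna(v))

df['full_id'] = df[['product', 'code', 'region']].agg(join_present, axis=1)
print(df['full_id'].tolist())
```

The second row has no region, so it comes out without a trailing delimiter or a stray 'nan'.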
🔹 Scenario 4: Using apply() for Complex Logic 💡
Want to customize per row?
df['custom'] = df.apply(lambda row: f"{row['product']}_{row['code']}_v1", axis=1)
This is perfect for building file names, IDs, or even dynamic messages.
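When the logic grows past a one-liner, a named function tends to read better than a lambda. A small sketch, assuming a made-up rule where codes of 200 and above get a `v2` suffix:

```python
import pandas as pd

df = pd.DataFrame({
    'product': ['Widget', 'Gadget'],
    'code': [101, 202],
})

def build_label(row):
    # Hypothetical rule for this example: codes >= 200 are tagged 'v2'.
    version = 'v2' if row['code'] >= 200 else 'v1'
    return f"{row['product']}_{row['code']}_{version}"

df['custom'] = df.apply(build_label, axis=1)
print(df['custom'].tolist())
```

Keep in mind that apply() with axis=1 runs Python code once per row, so on large DataFrames the + operator is usually the faster choice when it can express the same thing.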
🔹 Scenario 5: Combining Across Rows (Not Columns)
Sometimes, you might want to concatenate values down a column:
combined = ', '.join(df['product'])
# Output: 'Widget, Gadget'
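The plain join above expects every value to be a string. If the column has nulls or numbers, a little prep helps. This sketch uses a made-up DataFrame with one missing product:

```python
import pandas as pd

df = pd.DataFrame({
    'product': ['Widget', None, 'Gadget'],
    'code': [101, 202, 303],
})

# Drop missing values before joining so str.join only sees strings.
products = ', '.join(df['product'].dropna())

# Numbers need converting to strings first.
codes = ', '.join(df['code'].astype(str))

# Series.str.cat with only `sep` also joins down the column,
# omitting missing values by default.
products_cat = df['product'].str.cat(sep=', ')

print(products)
print(codes)
print(products_cat)
```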
🎯 When To Use What?
| Situation | Method |
|---|---|
| Simple string columns | + operator |
| Mixed datatypes or nulls | .astype(str) with + or agg() |
| Complex logic or conditions | apply() with lambda |
| Across rows | str.join(), e.g. ', '.join(df['product']) |
✨ Final Thoughts
Data cleaning often needs stitching data together — and Pandas makes it surprisingly flexible once you get the hang of it.
Remember: If you can imagine the string, you can build it in Pandas 🧠💻